Essential 10

3. Inclusion and exclusion criteria

Inclusion and exclusion criteria define which animals and data are eligible for, or must be removed from, a study once it has commenced. To ensure scientific rigour, the criteria should be defined before the experiment starts and before any data are collected [1-4]. Inclusion criteria should not be confused with animal characteristics (see item 8 – Experimental animals) but can be related to these (e.g. body weights must be within a certain range for a particular procedure) or to other study parameters (e.g. task performance has to exceed a given threshold). In studies where selected data are re-analysed for a different purpose, inclusion and exclusion criteria should describe how the data were selected.

Exclusion criteria may result from technical or welfare issues, such as complications anticipated during surgery, or circumstances where test procedures might be compromised (e.g. development of motor impairments that could affect behavioural measurements). Criteria for excluding samples or data include failure to meet quality control standards, such as insufficient sample volumes, unacceptable levels of contaminants, or poor histological quality. Similarly, how the researcher will define and handle data outliers during the analysis should also be decided before the experiment starts (see item 3b for guidance on responsible data cleaning).

Exclusion criteria may also reflect the ethical principles of a study in line with its humane endpoints (see item 16 – Animal care and monitoring). For example, in cancer studies an animal might be dropped from the study and euthanised before the predetermined time point if the size of a subcutaneous tumour exceeds a specific volume [5]. If losses are anticipated, these should be considered when determining the number of animals to include in the study (see item 2 – Sample size). While exclusion criteria and humane endpoints are typically included in the ethical review application, reporting the criteria used to exclude animals or data points in the manuscript helps readers with the interpretation of the data and provides crucial information to other researchers wanting to adopt the model.

Best practice is to include all a priori inclusion and exclusion/outlier criteria in a pre-registered protocol (see item 19 – Protocol registration). At the very least, these criteria should be documented in a lab notebook and reported in manuscripts, explicitly stating that the criteria were defined before any data were collected.

 

References

  1. Avey MT, Moher D, Sullivan KJ, Fergusson D, Griffin G, Grimshaw JM, Hutton B, Lalu MM, Macleod M, Marshall J, Mei SHJ, Rudnicki M, Stewart DJ, Turgeon AF, McIntyre L and the Canadian Critical Care Translational Biology Group (2016). The devil is in the details: incomplete reporting in preclinical animal research. PLoS ONE. doi: 10.1371/journal.pone.0166733
  2. Vahidy F, Schäbitz W-R, Fisher M and Aronowski J (2016). Reporting standards for preclinical studies of stroke therapy. Stroke. doi: 10.1161/STROKEAHA.116.013643
  3. Rice ASC, Morland R, Huang W, Currie GL, Sena ES and Macleod MR (2013). Transparency in the reporting of in vivo pre-clinical pain research: The relevance and implications of the ARRIVE (Animal Research: Reporting In Vivo Experiments) guidelines. Scandinavian Journal of Pain. doi: 10.1016/j.sjpain.2013.02.002
  4. Salkind NJ (2010). Encyclopedia of research design. Sage. doi: 10.4135/9781412961288
  5. Workman P, Aboagye EO, Balkwill F, Balmain A, Bruder G, Chaplin DJ, Double JA, Everitt J, Farningham DAH, Glennie MJ, Kelland LR, Robinson V, Stratford IJ, Tozer GM, Watson S, Wedge SR, Eccles SA and an ad hoc committee of the National Cancer Research Institute (2010). Guidelines for the welfare and use of animals in cancer research. British Journal of Cancer. doi: 10.1038/sj.bjc.6605642

Example 1 

“The animals were included in the study if they underwent successful MCA occlusion (MCAo), defined by a 60% or greater drop in cerebral blood flow seen with laser Doppler flowmetry. The animals were excluded if insertion of the thread resulted in perforation of the vessel wall (determined by the presence of sub-arachnoid blood at the time of sacrifice), if the silicon tip of the thread became dislodged during withdrawal, or if the animal died prematurely, preventing the collection of behavioral and histological data.” [1]

 

References

  1. Sena ES, Jeffreys AL, Cox SF, Sastra SA, Churilov L, Rewell S, Batchelor PE, van der Worp HB, Macleod MR and Howells DW (2013). The benefit of hypothermia in experimental ischemic stroke is not affected by pethidine. Int J Stroke. doi: 10.1111/j.1747-4949.2012.00834.x

Animals, experimental units, or data points that are unaccounted for can lead to instances where conclusions cannot be supported by the raw data [1]. Reporting exclusions and attrition provides valuable information to investigators evaluating the results and to those who intend to repeat the experiment or test the intervention in other species. It may also provide important safety information for human trials (e.g. exclusions related to adverse effects).

There are many legitimate reasons for experimental attrition, some of which can be anticipated and controlled for in advance (see item 3a), while other losses cannot be foreseen. For example, data points may be excluded from analyses because an animal received the wrong treatment, or because of unexpected drug toxicity, infections or diseases unrelated to the experiment, sampling errors (e.g. a malfunctioning assay that produced a spurious result, inadequate calibration of equipment), or other human error (e.g. forgetting to switch on equipment for a recording).

Most statistical analysis methods are highly sensitive to outliers and missing data. In some instances it may be scientifically justifiable to remove outlying data points from an analysis, for example obvious errors in data entry or measurements with readings outside a plausible range. Inappropriate data cleaning has the potential to bias study outcomes [2]; providing the reasoning for removing data points enables the distinction to be made between responsible data cleaning and data manipulation. Missing data, common in all areas of research, can reduce the sensitivity of the study and, if values are not missing at random, lead to biased estimates, distorted power, and loss of information [3]. Analysis plans should include methods to explore why data are missing. It is also important to consider, and justify, analysis methods that account for missing data [4,5].
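
To illustrate responsible, pre-specified data cleaning, the following minimal Python sketch flags implausible values with a documented reason rather than silently dropping them, and summarises missingness per group. All column names, the plausibility range, and the data are hypothetical, invented for illustration only:

    import numpy as np
    import pandas as pd

    # Invented example data: one implausible weight and one missing value.
    df = pd.DataFrame({
        "animal_id": range(1, 9),
        "group": ["control"] * 4 + ["treated"] * 4,
        "body_weight_g": [310, 295, 4.2, 305, 300, np.nan, 290, 315],
    })

    # Plausibility range fixed before data collection (pre-registered).
    LOW_G, HIGH_G = 150, 600

    # Flag implausible values with a reason instead of silently dropping them.
    plausible = df["body_weight_g"].between(LOW_G, HIGH_G) | df["body_weight_g"].isna()
    df["exclude_reason"] = np.where(plausible, "", "outside pre-registered range")

    # Explore missingness per group before deciding how to handle it.
    print(df.groupby("group")["body_weight_g"].apply(lambda s: s.isna().sum()))
    print(df[df["exclude_reason"] != ""])  # audit trail of excluded values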

There is a movement towards greater data sharing (see item 20 – Data access), along with an increase in strategies such as code sharing to enable analysis replication. However transparent these practices are, they still need to be accompanied by a disclosure of the reasoning behind any data cleaning, and a statement of whether the methods were defined before any data were collected.

Report all animal exclusions and losses of data points, along with the rationale for each exclusion. For example, this information can be summarised as a table or a flowchart describing attrition in each treatment group. This should be accompanied by an explicit statement of whether researchers were blinded to the group allocations when data or animals were excluded (see item 5 – Blinding and [6]). Explicitly state where built-in routines in statistical software have been used to remove outliers (e.g. GraphPad Prism’s outlier test).
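
As an illustration, an attrition table of this kind can be produced programmatically; in the following minimal pandas sketch, all group names, counts, and exclusion categories are hypothetical:

    import pandas as pd

    # Invented counts: animals randomised and excluded per group, by reason.
    attrition = pd.DataFrame(
        {
            "randomised": [20, 20],
            "excluded_welfare": [1, 2],    # e.g. humane endpoint reached
            "excluded_technical": [0, 1],  # e.g. failed procedure
        },
        index=pd.Index(["control", "treated"], name="group"),
    )

    # Derive the number analysed so that every animal is accounted for.
    attrition["analysed"] = (attrition["randomised"]
                             - attrition["excluded_welfare"]
                             - attrition["excluded_technical"])
    print(attrition)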

 

References

  1. Kafkafi N, Agassi J, Chesler EJ, Crabbe JC, Crusio WE, Eilam D, Gerlai R, Golani I, Gomez-Marin A, Heller R, Iraqi F, Jaljuli I, Karp NA, Morgan H, Nicholson G, Pfaff DW, Richter SH, Stark PB, Stiedl O, Stodden V, Tarantino LM, Tucci V, Valdar W, Williams RW, Würbel H and Benjamini Y (2018). Reproducibility and replicability of rodent phenotyping in preclinical studies. Neuroscience & Biobehavioral Reviews. doi: 10.1016/j.neubiorev.2018.01.003
  2. Scott S, Kranz JE, Cole J, Lincecum JM, Thompson K, Kelly N, Bostrom A, Theodoss J, Al‐Nakhala BM, Vieira FG, Ramasubbu J and Heywood JA (2008). Design, power, and interpretation of studies in the standard murine model of ALS. Amyotrophic Lateral Sclerosis. doi: 10.1080/17482960701856300
  3. Kang H (2013). The prevention and handling of the missing data. Korean J Anesthesiol. doi: 10.4097/kjae.2013.64.5.402
  4. Allison PD (2001). Missing data. SAGE Publications. doi: 10.4135/9781412985079
  5. Jakobsen JC, Gluud C, Wetterslev J and Winkel P (2017). When and how should multiple imputation be used for handling missing data in randomised clinical trials - a practical guide with flowcharts. BMC Med Res Methodol. doi: 10.1186/s12874-017-0442-1
  6. Holman C, Piper SK, Grittner U, Diamantaras AA, Kimmelman J, Siegerink B and Dirnagl U (2016). Where have all the rodents gone? The effects of attrition in experimental research on cancer and stroke. PLoS Biol. doi: 10.1371/journal.pbio.1002331

Example 1 

“Pen was the experimental unit for all data. One entire pen (ZnAA90) was removed as an outlier from both Pre-RAC and RAC periods for poor performance caused by illness unrelated to treatment...Outliers were determined using Cook’s D statistic and removed if Cook’s D > 0.5. One steer was determined to be an outlier for day 48 liver biopsy TM and data were removed.” [1] 
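
For readers wishing to apply the same screen, Cook’s distance is available from standard regression diagnostics. The following minimal Python sketch uses statsmodels on simulated data; the variable names are hypothetical, and the 0.5 threshold simply follows the quoted example:

    import numpy as np
    import statsmodels.api as sm

    # Simulated data with one planted influential observation.
    rng = np.random.default_rng(0)
    x = rng.normal(size=30)
    y = 2.0 * x + rng.normal(scale=0.5, size=30)
    y[0] += 8.0

    # Fit an ordinary least squares model and compute Cook's distance.
    fit = sm.OLS(y, sm.add_constant(x)).fit()
    cooks_d, _ = fit.get_influence().cooks_distance

    # Flag observations exceeding the threshold used in the quoted study.
    flagged = np.flatnonzero(cooks_d > 0.5)
    print("Flagged observations:", flagged, cooks_d[flagged])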

Example 2 

“Seventy-two SHRs were randomized into the study, of which 13 did not meet our inclusion and exclusion criteria because the drop in cerebral blood flow at occlusion did not reach 60% (seven animals), postoperative death (one animal: autopsy unable to identify the cause of death), haemorrhage during thread insertion (one animal), and disconnection of the silicon tip of the thread during withdrawal, making the permanence of reperfusion uncertain (four animals). A total of 59 animals were therefore included in the analysis of infarct volume in this study. In error, three animals were sacrificed before their final assessment of neurobehavioral score: one from the normothermia/water group and two from the hypothermia/pethidine group. These errors occurred blinded to treatment group allocation. A total of 56 animals were therefore included in the analysis of neurobehavioral score.” [2]  

Example 3 

[Flow chart from [3], beginning with baseline assessment by CMR and echocardiography]

“Flow chart showing the experimental protocol with the number of animals used, died and included in the study…After baseline CMR and echocardiography, MI was induced by left anterior descending (LAD) coronary artery ligation (n = 48), as previously described. As control of surgery procedure, sham operated mice underwent thoracotomy and pericardiotomy without coronary artery ligation (n = 12).” [3]

 

References

  1. Genther-Schroeder ON, Branine ME and Hansen SL (2018). Effects of increasing supplemental dietary Zn concentration on growth performance and carcass characteristics in finishing steers fed ractopamine hydrochloride. Journal of Animal Science. doi: 10.1093/jas/sky094
  2. Sena ES, Jeffreys AL, Cox SF, Sastra SA, Churilov L, Rewell S, Batchelor PE, van der Worp HB, Macleod MR and Howells DW (2013). The benefit of hypothermia in experimental ischemic stroke is not affected by pethidine. Int J Stroke. doi: 10.1111/j.1747-4949.2012.00834.x
  3. Castiglioni L, Colazzo F, Fontana L, Colombo GI, Piacentini L, Bono E, Milano G, Paleari S, Palermo A, Guerrini U, Tremoli E and Sironi L (2015). Evaluation of Left Ventricle Function by Regional Fractional Area Change (RFAC) in a Mouse Model of Myocardial Infarction Secondary to Valsartan Treatment. PLOS ONE. doi: 10.1371/journal.pone.0135778

 

The exact number of experimental units analysed in each group (i.e. the n number) is essential information for the reader to interpret the analysis; it should be reported unambiguously. All animals and data used in the experiment should be accounted for in the data presented. Sometimes, for good reasons, animals may need to be excluded from a study (e.g. because of illness or mortality), or data points excluded from analyses (e.g. biologically implausible values). Reporting losses helps the reader to understand the experimental design, replicate the methods, and track animal numbers through the study, especially when the sample sizes in the analyses do not match the original group sizes.

For each outcome measure, indicate numbers clearly within the text or on figures, and provide absolute numbers (e.g. 10/20, not 50%). For studies where animals are measured at different time points, report explicitly which animals were measured at each time point [1].
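
One way to make such reporting unambiguous is to tabulate, for each outcome measure, how many animals were measured in each group at each time point. In the following minimal pandas sketch, all animal IDs, group labels, and time points are invented for illustration:

    import pandas as pd

    # Invented records: one row per animal per measurement occasion.
    measurements = pd.DataFrame({
        "animal_id": [1, 2, 3, 4, 1, 2, 4],  # animal 3 not measured at day 7
        "group": ["ctrl", "ctrl", "drug", "drug", "ctrl", "ctrl", "drug"],
        "day": [0, 0, 0, 0, 7, 7, 7],
    })

    # Count distinct animals measured in each group at each time point.
    n_table = measurements.pivot_table(index="group", columns="day",
                                       values="animal_id", aggfunc="nunique")
    print(n_table)  # report as absolute numbers, e.g. 1/2, not 50%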

 

References

  1. Vahidy F, Schäbitz W-R, Fisher M and Aronowski J (2016). Reporting standards for preclinical studies of stroke therapy. Stroke. doi: 10.1161/STROKEAHA.116.013643

Example 1 

“Group F contained 29 adult males and 58 adult females in 2010 (n = 87), and 32 adult males and 66 adult females in 2011 (n = 98). The increase in female numbers was due to maturation of juveniles to adults. Females belonged to three matrilines, and there were no major shifts in rank in the male hierarchy. Six mid to low ranking individuals died and were excluded from analyses, as were five mid-ranking males who emigrated from the group at the beginning of 2011.” [1] 

Example 2 

“The proportion of test time that animals spent interacting with the handler (sniffed the gloved hand or tunnel, made paw contact, climbed on, or entered the handling tunnel) was measured from DVD recordings. This was then averaged across the two mice in each cage as they were tested together and their behaviour was not independent…Mice handled with the home cage tunnel spent a much greater proportion of the test interacting with the handler (mean ± s.e.m., 39.8 ± 5.2 percent time of 60 s test, n = 8 cages) than those handled by tail (6.4 ± 2.0 percent time, n = 8 cages), while those handled by cupping showed intermediate levels of voluntary interaction (27.6 ± 7.1 percent time, n = 8 cages).” [2] 

 

References

  1. Brent LJ, Heilbronner SR, Horvath JE, Gonzalez-Martinez J, Ruiz-Lambides A, Robinson AG, Skene JH and Platt ML (2013). Genetic origins of social networks in rhesus macaques. Scientific Reports. doi: 10.1038/srep01042
  2. Gouveia K and Hurst JL (2017). Optimising reliability of mouse performance in behavioural testing: the major role of non-aversive handling. Scientific Reports. doi: 10.1038/srep44999