Essential 10

10. Results

For each experiment conducted, including independent replications, report:

Summary/descriptive statistics provide a quick and simple description of the data; they communicate quantitative results easily and facilitate visual presentation. For continuous data, these descriptors include a measure of central tendency (e.g. mean, median) and a measure of variability (e.g. quartiles, range, standard deviation) to help readers assess the precision of the data collected. Categorical data can be expressed as counts, frequencies, or proportions.

Report data for all experiments conducted. If a complete experiment is repeated on a different day, or under different conditions, report the results of all repeats, rather than selecting data from representative experiments. Report the exact number of experimental units per group so readers can gauge the reliability of the results (see item 2 – Sample size, and item 3 – Inclusion and exclusion criteria). Present data clearly as text, in tables, or in graphs, to enable information to be evaluated, or extracted for future meta-analyses [1]. Report descriptive statistics with a clearly identified measure of variability for each group. Example 1 shows data summarised as means and standard deviations and, in brackets, ranges. Boxplots are a convenient way to summarise continuous data, plotted as median and interquartile range, as shown in Example 2.



  1. Michel MC, Murphy TJ and Motulsky HJ (2020). New author guidelines for displaying data and reporting data analysis and statistical methods in experimental biology. Mol. Pharmacol. doi: 10.1124/mol.119.118927
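As an illustration of the descriptive statistics discussed above, the following is a minimal Python sketch that summarises each group by its number of experimental units, mean, standard deviation, median, quartiles, and range. The group names and body-weight values are hypothetical, and the `describe` helper is introduced here only for illustration; it is not part of the guideline.

```python
import statistics


def describe(values):
    """Return descriptive statistics for one experimental group."""
    # Quartiles computed with the "inclusive" method (data treated as the
    # full set of observations); other quartile conventions give slightly
    # different values, so state the method used when reporting.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),                 # exact number of experimental units
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),   # sample standard deviation
        "median": median,
        "q1": q1,
        "q3": q3,
        "min": min(values),
        "max": max(values),
    }


# Hypothetical body weights (g), one value per experimental unit.
groups = {
    "control": [24.1, 25.3, 23.8, 26.0, 24.7, 25.5],
    "treated": [22.0, 23.4, 21.7, 22.9, 23.8, 22.5],
}

summaries = {name: describe(values) for name, values in groups.items()}

for name, s in summaries.items():
    print(
        f"{name}: n = {s['n']}, "
        f"mean ± SD = {s['mean']:.2f} ± {s['sd']:.2f} "
        f"(range {s['min']:.1f} to {s['max']:.1f}), "
        f"median = {s['median']:.2f} (IQR {s['q1']:.2f} to {s['q3']:.2f})"
    )
```

Reporting n alongside each summary, and naming the measure of variability explicitly, lets readers tell a standard deviation from a standard error or an interquartile range.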

Example 1

Figure 5[1]

“Bioacoustic parameters of new species of miniaturised cophyline microhylids...Values are presented as mean ± standard deviation, with range in brackets. na = not applicable. *In all species except R. proportionalis calls consist of a single note according to the definition herein, and in these species call duration is therefore synonymous with note duration.” [1]

Example 2


Figure 6[2]

“Fractions of the unperturbed elements of calcium release in cardiac myocytes. MORPHOL: fractions of compact dyads estimated by morphometry from electron microscopic images…ELPHYS: fractions of the early CRF components estimated by fitting records of integral fluorescence signals …. CTR - control myocardium; IMY - injured myocardium. All collected data are shown. Box plots show the 25%, 50% and 75% percentiles; whiskers show 10% and 90% percentile. Solid squares denote the means.” [2]



  1. Scherz MD, Hutter CR, Rakotoarison A, Riemann JC, Rodel MO, Ndriantsoa SH, Glos J, Hyde Roberts S, Crottini A, Vences M and Glaw F (2019). Morphological and ecological convergence at the lower size limit for vertebrates highlighted by five new miniaturised microhylid frog species from three different Madagascan genera. PLoS One. doi: 10.1371/journal.pone.0213314
  2. Novotová M, Zahradníková A, Nichtová Z, Kováč R, Kráľová E, Stankovičová T, Zahradníková A and Zahradník I (2020). Structural variability of dyads relates to calcium release in rat ventricular myocytes. Sci. Rep. doi: 10.1038/s41598-020-64840-5

In hypothesis-testing studies using inferential statistics, investigators frequently confuse statistical significance, and small p-values, with biological or clinical importance [1]. Statistical significance is usually quantified and evaluated against a preassigned threshold, with p < 0.05 often used as a convention. However, statistical significance is heavily influenced by sample size and variation in the data (see item 2 – Sample size). Investigators must therefore consider the size of the effect that was observed and whether this is a biologically relevant change.
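To make the dependence on sample size concrete, here is a small Python sketch that holds the observed difference and the variability fixed and changes only the number of animals per group. It assumes equal group sizes, a common standard deviation, and a normal (z-test) approximation; the numbers and the `two_sided_p` helper are hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(mean_diff, sd, n_per_group):
    """Approximate two-sided p-value for a difference between two group means.

    Assumption: both groups have the same SD and the same n, and the test
    statistic is treated as standard normal (a large-sample approximation).
    """
    se = sd * sqrt(2 / n_per_group)   # standard error of the difference
    z = mean_diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same hypothetical effect (difference of 1.0 unit, SD of 2.0 units)
# evaluated at three different group sizes.
mean_diff, sd = 1.0, 2.0
p_values = {n: two_sided_p(mean_diff, sd, n) for n in (5, 20, 80)}

for n, p in p_values.items():
    print(f"n per group = {n:2d}: p = {p:.4f}")
```

Under these assumptions the identical effect falls on either side of the 0.05 threshold depending only on n, which is why a p-value alone says little about the biological relevance of a change.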

Effect sizes are often not reported in animal research, but they are relevant to both exploratory and hypothesis-testing studies. An effect size is a quantitative measure that estimates the magnitude of differences between groups, or strength of relationships between variables. It can be used to assess the patterns in the data collected and make inferences about the wider population from which the sample came. The confidence interval for the effect indicates how precisely the effect has been estimated, and tells the reader about the strength of the effect [2]. In studies where statistical power is low, and/or hypothesis-testing is inappropriate, providing the effect size and confidence interval indicates how small or large an effect might really be, so a reader can judge the biological significance of the data [3,4]. Reporting effect sizes with confidence intervals also facilitates extraction of useful data for systematic review and meta-analysis. Where multiple independent studies included in a meta-analysis show quantitatively similar effects, even if each is statistically non-significant, this provides powerful evidence that a relationship is ‘real’, although small.
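One simple effect size is the difference between two group means. The Python sketch below computes that difference with an approximate 95% confidence interval. The data are hypothetical, the `mean_difference_ci` helper is illustrative only, and the interval uses a normal approximation; for small groups a t-based interval would be wider and is generally more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_difference_ci(group_a, group_b, level=0.95):
    """Difference in means (b minus a) with an approximate confidence interval.

    Assumption: normal approximation with unpooled (Welch-style) standard
    error. With few animals per group this interval is too narrow.
    """
    diff = mean(group_b) - mean(group_a)
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for level=0.95
    return diff, diff - z * se, diff + z * se


# Hypothetical body weights (g), one value per experimental unit.
control = [24.1, 25.3, 23.8, 26.0, 24.7, 25.5]
treated = [22.0, 23.4, 21.7, 22.9, 23.8, 22.5]

diff, lower, upper = mean_difference_ci(control, treated)
print(f"Mean difference (treated - control): {diff:.2f} g "
      f"(95% CI {lower:.2f} to {upper:.2f})")
```

The width of the interval conveys the precision of the estimate, while its position shows which effect magnitudes remain compatible with the data.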

Report all analyses performed, even those providing statistically non-significant results. Report the effect size to indicate the size of the difference between groups in the study, with a confidence interval to indicate the precision of the effect size estimate.



  1. Wasserstein RL, Schirm AL and Lazar NA (2019). Moving to a World Beyond “p < 0.05”. The American Statistician. doi: 10.1080/00031305.2019.1583913
  2. Altman DG (2005). Why we need confidence intervals. World J Surg. doi: 10.1007/s00268-005-7911-0
  3. Moher D, Hopewell S, Schulz KF, Montori V, Gøtzsche PC, Devereaux PJ, Elbourne D, Egger M and Altman DG (2010). CONSORT 2010 Explanation and Elaboration: updated guidelines for reporting parallel group randomised trials. BMJ. doi: 10.1136/bmj.c869
  4. Nakagawa S and Cuthill IC (2007). Effect size, confidence interval and statistical significance: a practical guide for biologists. Biological Reviews of the Cambridge Philosophical Society. doi: 10.1111/j.1469-185X.2007.00027.x
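The earlier point that several individually non-significant studies can jointly provide strong evidence can be sketched with a fixed-effect, inverse-variance meta-analysis. The study estimates and standard errors below are hypothetical, and the `pool_fixed_effect` helper is illustrative only. The sketch assumes all studies estimate the same underlying effect and uses normal-approximation intervals; a real meta-analysis would also need to assess heterogeneity.

```python
from math import sqrt
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)   # about 1.96


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted pooled estimate with an approximate 95% CI.

    Assumption: fixed-effect model, i.e. every study estimates one common
    true effect and differs only through sampling error.
    """
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return pooled, pooled - Z_95 * pooled_se, pooled + Z_95 * pooled_se


# Hypothetical effect sizes and standard errors from four independent studies.
estimates = [0.9, 1.1, 0.8, 1.2]
std_errors = [0.6, 0.7, 0.6, 0.7]

# Each study's own interval includes zero (statistically non-significant).
study_intervals = [(e - Z_95 * se, e + Z_95 * se)
                   for e, se in zip(estimates, std_errors)]

pooled, pooled_lower, pooled_upper = pool_fixed_effect(estimates, std_errors)

for (lo, hi), e in zip(study_intervals, estimates):
    print(f"study estimate {e:.2f} (95% CI {lo:.2f} to {hi:.2f})")
print(f"pooled estimate {pooled:.2f} "
      f"(95% CI {pooled_lower:.2f} to {pooled_upper:.2f})")
```

This kind of pooling is only possible when each study reports its effect size and a measure of its precision, including for results that did not reach statistical significance.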

Example 1


Figure 7[1]

“For all traits identified as having a significant genotype effect for the Usp47tm1b(EUCOMM)Wtsi line (MGI:5605792), a comparison is presented of the standardized genotype effect with 95% confidence interval for each sex with no multiple comparisons correction. Standardization, to allow comparison across variables, was achieved by dividing the genotype estimate by the signal seen in the wildtype population. Shown in red are statistically significant estimates. RBC: red blood cells; BMC: bone mineral content; BMD: bone mineral density; WBC: white blood cells.” [1]



  1. Karp NA, Mason J, Beaudet AL, Benjamini Y, Bower L, Braun RE, Brown SDM, Chesler EJ, Dickinson ME, Flenniken AM, Fuchs H, Angelis MHd, Gao X, Guo S, Greenaway S, Heller R, Herault Y, Justice MJ, Kurbatova N, Lelliott CJ, Lloyd KCK, Mallon A-M, Mank JE, Masuya H, McKerlie C, Meehan TF, Mott RF, Murray SA, Parkinson H, Ramirez-Solis R, et al. (2017). Prevalence of sexual dimorphism in mammalian phenotypic traits. Nature Communications. doi: 10.1038/ncomms15475