Fixing the bridge between biologists and statisticians

Models are wrong... but, some are useful (G. Box)!


Multi-environment split-plot experiments

Published at September 13, 2022 ·  7 min read

Have you made a split-plot field experiment? Have you repeated such an experiment in two (or more) years/locations? Have you run into troubles, because the reviewer told you that your ANOVA model was invalid? If so, please, stop for awhile and read: this post might help you understand what was wrong with your analyses.

Let’s think of a field experiment, where 6 genotypes of faba bean were compared under two different sowing times (autumn and spring). For the ease of organisation, such an experiment was laid down as a split-plot, in a randomised complete block design; sowing times were randomly allocated to main-plots, while genotypes were randomly allocated to sub-plots. Let’s stop for a moment… does this sound strange to you? Do you need further information about split-plot designs? You can get some general information at this link and hints on how to analyse the results at this other link

...

Meta-analysis for a single study. Is it possible?

Published at July 21, 2022 ·  12 min read

We all know that the word meta-analysis encompasses a body of statistical techniques to combine quantitative evidence from several independent studies. However, I have recently discovered that meta-analytic methods can also be used to analyse the results of a single research project. That happened a few months ago, when I was reading a paper from Damesa et al. (2017), where the authors describe some interesting methods of data analyses for multi-environment genotype experiments. These authors gave a few nice examples with related SAS code, that is rooted in mixed models. As an R enthusiast, I was willing to reproduce their analyses with R, but I could not succeed, until I realised that I could make use of the package ‘metafor’ and its bunch of meta-analityc methods.

...

Should I say ''there is no difference'' or ''the difference is not significant''?

Published at June 1, 2022 ·  5 min read

In a recent manuscript we wrote a sentence similar to the following: “On average, the genotype A gave a yield of 12.4 tons per hectare, while the genotype B gave 10.6 tons per hectare and such a difference was not significant (P = 0.20)”. Perhaps I should point out that we were talking about maize yields… One of the reviewers complained that “This is an example of expression having no place in a scientific paper” and that we should write: “… no difference in yield was found between A and B (P = 0.20)”.

...

Analysing seed germination and emergence data with R (a tutorial). Part 9

Published at January 18, 2022 ·  10 min read

This is a follow-up post. If you are interested in other posts of this series, please go to: https://www.statforbiology.com/tags/drcte/. All these posts expand on a manuscript that we have recently published in the Journal ‘Weed Science’; please follow this link to the paper. In order to work throughout this post, you need to install the ‘drcte’ and ‘drcSeedGerm’ packages, by using the code provided in this page.

Quantiles from time-to-event models

We have previously shown that time-to-event models (e.g., germination or emergence models) can be used to describe the time course of germinations/emergences for a seed lot (this post) or for several seed lots, submitted to different experimental treatments (this post). We have seen that fitted models can be used to extract information of biological relevance (this post).

...

Analysing seed germination and emergence data with R (a tutorial). Part 8

Published at January 18, 2022 ·  8 min read

This is a follow-up post. If you are interested in other posts of this series, please go to: https://www.statforbiology.com/tags/drcte/. All these posts expand on a paper that we have recently published in the Journal ‘Weed Science’; please follow this link to the paper.

Predictions from a parametric time-to-event model

In previous posts we have shown that time-to-event models (e.g., germination or emergence models) can be used to describe the time course of germinations/emergences for a seed lot (this post) or for several seed lots, submitted to different experimental treatments (this post). We have seen that fitted models can be used to extract information of biological relevance (this post).

...

Analysing seed germination and emergence data with R (a tutorial). Part 7

Published at January 18, 2022 ·  4 min read

This is a follow-up post. If you are interested in other posts of this series, please go to: https://www.statforbiology.com/tags/drcte/. All these posts expand on a paper that we have recently published in the Journal ‘Weed Science’; please follow this link to the paper.

Exploring the results of a time-to-event fit: model parameters

In the previous post we have shown that time-to-event curves (e.g., germination or emergence curves) can be used to describe the time course of germinations/emergences for a seed lot (this post). We have also seen that the effects of experimental factors on seed germination can be accounted for by coding a different time-to-event curve for each factor level (this post).

...

Analysing seed germination and emergence data with R (a tutorial). Part 6

Published at January 18, 2022 ·  13 min read

This is a follow-up post. If you are interested in other posts of this series, please go to: https://www.statforbiology.com/tags/drcte/. All these posts exapand on a paper that we have recently published in the Journal ‘Weed Science’; please follow this link to the paper.

Fitting time-to-event models with environmental covariates

In the previous post we have shown that time-to-event curves (e.g., germination or emergence curves) can be used to describe the time course of germinations/emergences for a seed lot (this post). We have also seen that the effects of experimental factors on seed germination can be accounted for by coding a different time-to-event curve for each factor level (this post).

...

Analysing seed germination and emergence data with R: a tutorial. Part 5

Published at December 23, 2021 ·  14 min read

This is a follow-up post. If you are interested in other posts of this series, please go to: https://www.statforbiology.com/tags/drcte/. All these posts exapand on a paper that we have recently published in the Journal ‘Weed Science’; please follow this link to the paper.

Comparing germination/emergence for several seed lots

Very often, seed scientists need to compare the germination behavior of different seed populations, e.g., different plant species, or one single plant species submitted to different temperatures, light conditions, priming treatments and so on. How should such a comparison be performed? For example, if we have submitted several seed samples to different environmental conditions, how do we decide whether the germinative response is affected by those environmental conditions?

...

Analysing seed germination and emergence data with R: a tutorial. Part 4

Published at December 6, 2021 ·  9 min read

This is a follow-up post. If you are interested in other posts of this series, please go to: https://www.statforbiology.com/tags/drcte/. All these posts exapand on a paper that we have recently published in the Journal ‘Weed Science’; please follow this link to the paper.

Time-to-event models for seed germination/emergence

The individual seeds within a population do not germinate/emerge altogether at the same moment; this is an undisputed fact, resulting from seed-to-seed variability in germination/emergence time. Accordingly, the primary reason why we organise germination assays is to describe the progress to germination for the whole population, by using some appropriate time-to-event model.

...

Biplots are everywhere: where do they come from?

Published at November 24, 2021 ·  25 min read

Principal Component Analysis (PCA) is perhaps the most widespread multivariate technique in biology and it is used to summarise the results of experiments in a wide range of disciplines, from agronomy to botany, from entomology to plant pathology. Whenever possible, the results are presented by way of a biplot, an ubiquitous type of graph with a formidable descriptive value. Indeed, carefully drawn biplots can be used to represent, altogether, the experimental subjects, the experimental variables and their reciprocal relationships (distances and correlations).

...