Archive 2021

Analysing seed germination and emergence data with R: a tutorial. Part 5

Published at December 23, 2021 ·  14 min read

This is a follow-up post. If you are interested in the other posts of this series, please go to: https://www.statforbiology.com/tags/drcte/. All these posts expand on a paper that we have recently published in the journal ‘Weed Science’; please follow this link to the paper.

Comparing germination/emergence for several seed lots

Very often, seed scientists need to compare the germination behavior of different seed populations, e.g., different plant species, or one single plant species submitted to different temperatures, light conditions, priming treatments and so on. How should such a comparison be performed? For example, if we have submitted several seed samples to different environmental conditions, how do we decide whether the germinative response is affected by those environmental conditions?

...


Analysing seed germination and emergence data with R: a tutorial. Part 4

Published at December 6, 2021 ·  9 min read

This is a follow-up post. If you are interested in the other posts of this series, please go to: https://www.statforbiology.com/tags/drcte/. All these posts expand on a paper that we have recently published in the journal ‘Weed Science’; please follow this link to the paper.

Time-to-event models for seed germination/emergence

The individual seeds within a population do not germinate/emerge all together at the same moment; this is an undisputed fact, resulting from seed-to-seed variability in germination/emergence time. Accordingly, the primary reason why we organise germination assays is to describe the progress to germination for the whole population, by using some appropriate time-to-event model.

...


Biplots are everywhere: where do they come from?

Published at November 24, 2021 ·  25 min read

Principal Component Analysis (PCA) is perhaps the most widespread multivariate technique in biology and it is used to summarise the results of experiments in a wide range of disciplines, from agronomy to botany, from entomology to plant pathology. Whenever possible, the results are presented by way of a biplot, a ubiquitous type of graph with a formidable descriptive value. Indeed, carefully drawn biplots can be used to represent, all together, the experimental subjects, the experimental variables and their reciprocal relationships (distances and correlations).

...


Principal Component Analysis: a brief intro for biologists

Published at November 23, 2021 ·  24 min read

In this post I am revisiting the concept of Principal Component Analysis (PCA). You might say that there is no need for that, as the Internet is full of posts relating to such a rather old technique. However, I feel that, in those posts, the theoretical aspects are either too deeply rooted in maths or they are skipped altogether, so that the main emphasis is on interpreting the output of an R function. I think that both approaches may not be suitable for biologists: the first one may be too difficult to understand, while skipping the theoretical aspects altogether promotes the use of R as a black box, which is dangerous for teaching purposes. That’s why I wrote this post… I wanted to make my attempt to create a useful lesson. You will tell me whether I succeeded or not.

...


Analysing seed germination and emergence data with R: a tutorial. Part 3

Published at October 19, 2021 ·  12 min read

This is a follow-up post. If you are interested in the other posts of this series, please go to: https://www.statforbiology.com/tags/drcte/. All these posts expand on a paper that we have recently published in the journal ‘Weed Science’; please follow this link to the paper.

Reshaping time-to-event data

The first thing we should consider before working through this tutorial is the structure of germination/emergence data. In our experience, seed scientists are used to storing their datasets in several formats that may not be immediately usable with the ‘drcte’ and ‘drc’ packages, which this tutorial is built upon. The figure below shows some of the possible formats that I have often encountered in my consulting work.

...


Analysing seed germination and emergence data with R (a tutorial). Part 2

Published at October 9, 2021 ·  17 min read

This is a follow-up post: if you are interested in the other posts of this series, please go to: https://www.statforbiology.com/tags/drcte/. All these posts expand on a paper that we have recently published in the journal ‘Weed Science’; please follow this link to the paper.

Survival analysis and germination/emergence data: an overlooked connection

Seed germination and emergence data describe the time until the event of interest occurs and, therefore, they can be put together in the wide group of time-to-event data. You may wonder: what’s the matter with time-to-event data? Do they have anything special that needs our attention? The answer is, definitely, yes!

...


Analysing seed germination and emergence data with R (a tutorial). Part 1

Published at October 7, 2021 ·  4 min read

Introduction to the tutorial

Germination/emergence assays are relatively easy to perform, by following standardised procedures, as described, e.g., by the International Seed Testing Association (see here). In short, we take a sample of seeds and we put them in an appropriate container. We put the container in the right environmental conditions (e.g., relating to humidity and temperature) and we inspect the seeds according to a regular schedule (e.g., daily). At each inspection, we count the number of germinated/emerged seeds and remove them from the container; inspections are performed until no new germinations/emergences are observed for a sufficient amount of time.

...


Why are derivatives important in life? A case-study with nonlinear regression

Published at June 9, 2021 ·  7 min read

In general, undergraduate students in biology/ecology courses tend to consider derivatives as very abstract entities, with no real usefulness in everyday life. In my work as a teacher, I have often tried to fight against such an attitude, by providing convincing examples of how we can use derivatives to get a better understanding of the changes in a given system.

In this post I’ll tell you about a recent situation where I was involved with derivatives. A few weeks ago, a colleague of mine wrote to me to ask the following question (I’m changing it a little, to make it, hopefully, more interesting). He asked: “I am using a power curve to model how the size of the sampling area affects species richness. How can I quantify my knowledge gain?”. This is an interesting question, indeed, although I feel I should provide you with some background information.

...


Other useful functions for nonlinear regression: threshold models and all that

Published at May 1, 2021 ·  13 min read

In a recent post I presented several equations and just as many self-starting functions for nonlinear regression analyses in R. Today, I would like to build upon that post and present some further equations, relating to the so-called threshold models.

But, … what are threshold models? In some instances, we need to describe relationships where the response variable changes abruptly, following a small change in the predictor. A typical threshold model looks like that in the Figure below, where we see three threshold levels:

...


The R-squared and nonlinear regression: a difficult marriage?

Published at March 25, 2021 ·  4 min read

Making sure that a fitted model gives a good description of the observed data is a fundamental step of every nonlinear regression analysis. To this aim we can (and should) use several techniques, either graphical or based on formal hypothesis testing methods. However, in the end, I must admit that I often feel the need to display a simple index, based on a single and widely understood value, that reassures the readers about the goodness of fit of my models.

...


lmDiallel: a new R package to fit diallel models. Multienvironment diallel experiments

Published at March 5, 2021 ·  7 min read

In recent times, a few colleagues at my Department and I have devoted some research effort to data management for diallel mating experiments, which we have summarised in a paper (Onofri et al., 2020) and a series of five blog posts (see here). A final topic that remains to be covered relates to the frequent possibility that these diallel experiments are repeated across years and/or locations. How should the resulting dataset be analysed?

...


lmDiallel: a new R package to fit diallel models. The Gardner-Eberhart models

Published at February 22, 2021 ·  15 min read

Another post for this series about diallel mating experiments. So far, we have published a paper in Plant Breeding (Onofri et al., 2020), where we presented lmDiallel, a new R package to fit diallel models. We followed up this paper with a series of four blog posts, giving more detail about the package (see here), about Hayman’s models type 1 (see here) and type 2 (see here) and about Griffing’s family of models (see here).

...


Split-plot designs: the transition to mixed models for a dinosaur

Published at February 11, 2021 ·  15 min read

Those who long ago took courses in ‘analysis of variance’ or ‘experimental design’ … would have learned methods … based on observed and expected mean squares and methods of testing based on ‘error strata’ (if you weren’t forced to learn this, consider yourself lucky). (Douglas Bates, 2006).


In a previous post, I already mentioned that, due to my age, I see myself as a dinosaur within the R-users community. I already mentioned how difficult it is, for a dinosaur, to adjust to new concepts and paradigms in data analysis, after having done things differently for a long time (see this post here). Today, I decided to sit and write a second post, relating to data analyses for split-plot designs. Some years ago, when switching to R, this topic required some adjustments to my usual workflow, which gave me a few headaches.

...


Pairwise comparisons in nonlinear regression

Published at January 19, 2021 ·  6 min read

Pairwise comparisons are one of the most debated topics in agricultural research: they are very often used and, sometimes, abused, in the literature. I have nothing against the appropriate use of this very useful technique and, for those who are interested, some colleagues and I have given a bunch of (hopefully) useful suggestions in a paper, a few years ago (follow this link here).

Pairwise comparisons usually follow the application of some sort of linear or generalised linear model; in this setting, the ‘emmeans’ package (Lenth, 2020) is very handy, as it uses a very logical approach. However, we may find ourselves needing to make pairwise comparisons between the elements of a vector, which does not come as the result of linear model fitting.

...


lmDiallel: a new R package to fit diallel models. The Griffing's models (1956)

Published at January 12, 2021 ·  10 min read

Diallel mating designs are often used by plant breeders to compare the possible crosses between a set of genotypes. In spite of such widespread usage, the process of data analysis in R is not yet straightforward and it is not clear which tool should be routinely used. We recently gave a small contribution by publishing a paper in Plant Breeding (Onofri et al., 2020), where we advocated the idea that models for diallel crosses are just a class of general linear models that should be fitted by Ordinary Least Squares (OLS) or REstricted Maximum Likelihood methods (REML).

...


lmDiallel: a new R package to fit diallel models. The Hayman's model (type 2)

Published at January 5, 2021 ·  9 min read

This post follows two other previously published posts, where we presented our new ‘lmDiallel’ package (see here) and showed how we can use it to fit Hayman’s model type 1, as proposed in Hayman (1954) (see here). In this post, we will give a further example relating to another very widespread model from the same author, Hayman’s model type 2. We apologise for some overlap with previous posts: we think this is necessary so that each post can be read on its own.

...