Beyond ANOVA – the Layers and Loops of Analysis

If there is one thing statisticians can claim expertise in, it should be the analysis of data. For data coming from designed experiments, the situation is very clear – or so we and many other scientists are taught! The right way to analyse the data is determined by the design. That is why we worry about whether those replicates were really replicates, and whether the layout was actually a split-plot or not. As a student I remember being very impressed with, and at the same time mystified by, the equivalence of an analysis based on linear models (the way everyone is taught) and one based entirely on the randomisation of the design. It seemed to imply that there is indeed only one way to analyse data from a given experiment.

Now that we work with farmers doing large-N trials, the situation looks much less clear. Yes, we can still do the standard analysis – an analysis of variance, some F-tests, tables of mean treatment effects and so on – but it typically reveals only a little of the story to be told from the data. What’s going on here…?

Diagram showing the 3 main groups involved in a farming experiment and their methods of processing results

Think of a trial in which several hundred farmers each compare a set of treatments on their farm, a type of design which is commonly used. Typically, farmers are organised into groups of some sort. Data collection is by farmers or agents attached to each group. There is a process of aggregation that eventually results in a data-set reaching the researchers. The analysis is done on at least three levels:

These insights and explanations should lead to revisions in the way researchers look at the data. At the same time, the statistical analysis of the whole data-set can reveal patterns that farmers find useful. For example, knowing that early planting had a similar benefit wherever it was tried helps increase farmers' confidence in their own results.
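To make the idea of layers a little more concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the number of groups and farms, the yields, and the size of the early-planting benefit are assumptions, and the function names are hypothetical. It simulates a trial in which every farm compares early with late planting, then summarises the same comparison farm by farm, group by group, and across the whole data-set.

```python
import random
import statistics


def simulate_trial(n_groups=5, farms_per_group=20, true_benefit=0.5, seed=1):
    """Simulate one record per farm. All numbers here are invented."""
    rng = random.Random(seed)
    records = []
    for group in range(n_groups):
        # Groups differ in their baseline yield (hypothetical t/ha).
        group_level = rng.gauss(3.0, 0.5)
        for farm in range(farms_per_group):
            # Farms differ within a group too.
            farm_level = group_level + rng.gauss(0, 0.4)
            late = farm_level + rng.gauss(0, 0.2)
            early = farm_level + true_benefit + rng.gauss(0, 0.2)
            records.append(
                {"group": group, "farm": farm, "late": late, "early": early}
            )
    return records


def farm_differences(records):
    """Farm layer: each farmer's own early-minus-late comparison."""
    return [(r["group"], r["early"] - r["late"]) for r in records]


def group_means(diffs):
    """Group layer: mean benefit among the farms in each group."""
    by_group = {}
    for group, diff in diffs:
        by_group.setdefault(group, []).append(diff)
    return {group: statistics.mean(values) for group, values in by_group.items()}


def overall_summary(diffs):
    """Whole data-set layer: mean benefit and a naive standard error.

    The standard error treats farms as independent, which holds in this
    simulation but would need checking with real grouped data.
    """
    values = [diff for _, diff in diffs]
    mean = statistics.mean(values)
    se = statistics.stdev(values) / len(values) ** 0.5
    return mean, se


if __name__ == "__main__":
    diffs = farm_differences(simulate_trial())
    for group, mean_diff in sorted(group_means(diffs).items()):
        print(f"group {group}: mean benefit of early planting {mean_diff:.2f}")
    mean, se = overall_summary(diffs)
    print(f"all farms: mean benefit {mean:.2f} (naive s.e. {se:.2f})")
```

If the group means sit close to the overall mean, that is the kind of "similar benefit wherever it was tried" pattern that only the pooled data can show. A real analysis would of course also ask why some farms or groups depart from it, which is where the farmers' own explanations come in.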

If the information channels are working well, then the initial tentative conclusions from each layer of analysis will be updated. Maybe a few more iterations will be needed to make the most of the data and experience of the experiment before next steps are negotiated by all those concerned.

All this is much richer and more complex to manage than the 'analysis of designed experiments' that we teach trainee statisticians. It is a good example of what needs to change in statistical practice if we are to remain relevant to current applied research.

What are your views on teaching trainee statisticians? Do you feel that it’s relevant and uses enough applied research? Perhaps you’re a trainee statistician yourself? We’re keen to hear your views, so do please jot down a comment or two in the box below :)

Author: Ric Coe

Ric’s main focus is on improving the quality and effectiveness of research for development using the application of statistical principles and ideas. He is particularly interested in research design, including the design of complex integrative research projects.