Why so few experiments in the real world?

…very few studies were judged to be attempts at a controlled experiment where livestock numbers or systems were intentionally manipulated, and even many of these lacked elements such as adequate replication.

Schieltz, J. M., & Rubenstein, D. I. (2016). Evidence based review: positive versus negative effects of livestock grazing on wildlife. What do we really know?. Environmental Research Letters, 11(11), 113003.

This is a ubiquitous lament.

In our exploitation of natural resources, including soil, we rarely use the scientific method, at least in its pure controlled experiment form. This means that robust evidence of the sort that can be accepted readily, even when it conflicts with values that different people hold, is limited in its extent and applicability.

This means we are short on inference.

Most environmental issues lack experimental evidence from a robust application of the scientific method and easily become contentious.

The example above is from a review summarizing evidence on the effects of livestock grazing on wildlife.

Here is another one referring to the control of vertebrate pests – principally wild dogs, pigs, goats, rabbits, camels and deer – in Australian agricultural landscapes…

We review the design of 1,915 pest control actions conducted with the aim of protecting native biodiversity in Australia during 1990–2003. Most (67.5%) pest control actions consisted of a single treatment area without monitoring of either the pest or biodiversity. Only 2.4% of pest control actions had one or more treatment and non-treatment areas, and very few treatment and non-treatment areas (0.3%) were randomly assigned. Replication of treatment and non-treatment areas occurred in only 1.0% of pest control actions.

Reddiex, B., & Forsyth, D. M. (2007). Control of pest mammals for biodiversity protection in Australia. II. Reliability of knowledge. Wildlife Research, 33(8), 711-717.

That is nearly 2,000 pest control actions (operations such as laying poison baits to control wild dogs, not the termites-in-the-basement kind of pest control) where virtually no evidence is available to decide how effective the actions were at either controlling the pest or mitigating the disturbances the pest induces.

This is not good.

There is no way of knowing if the control method worked so no way of knowing if it was worth the money or the effort or if we should stop or keep going. The only information to make these important financial and resourcing decisions is anecdotal. Maybe we saw fewer wild dogs this season, maybe not.

When it comes to real-world practicalities, we don’t use the scientific method.

Why is this?

The scientific method is tried and tested. It has given us most of our technology and understanding of nature so it seems odd that we don’t follow it in coming to terms with using nature for maximum value whatever that value might be.

It is especially important when the objective is to balance competing values such as production and conservation, utility and persistence. It is even more important to know the details when the prime value is an emotional one, such as not wanting to wake up and find two calves in the paddock that died a painful death from bites.

Here are three reasons why science in the environment is hard…

  • Replication is a challenge
  • Manipulation is difficult to achieve
  • Controls are hard to find

These are the technical difficulties of using the scientific method on hills and fields and countryside dotted with woodland remnants and craggy moorland.

It is very hard to replicate, manipulate and control any of the critical variables.

These are the fundamentals of the scientific method in its deductive form. Here is what the method looks like

The most important step is the experiment.

This is where one critical variable, let’s say the number of wild dogs, is changed from what it is normally to more or less or both in a series of treatments and the consequences are recorded through measurement variables, let’s say number of lambs lost.

The significance of any change in the measurement variable in the treatments is compared with what happens in the controls where nothing is changed, in this example the dog numbers are left alone.
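As a minimal sketch of that comparison, here is an exact permutation test on made-up numbers. The lamb losses, the five fields per group and the function name are all hypothetical, and a permutation test is only one reasonable way to ask whether the gap between treatments and controls is bigger than chance alone would produce.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treatment, control):
    """Exact one-sided permutation test.

    Returns the share of all possible relabellings of the fields in which
    the control-minus-treatment difference in mean losses is at least as
    large as the difference actually observed.
    """
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)
    n_ctrl = len(control)
    pooled_sum = sum(pooled)
    observed = mean(control) - mean(treatment)

    as_extreme = 0
    relabellings = 0
    for chosen in combinations(range(len(pooled)), n_treat):
        treat_sum = sum(pooled[i] for i in chosen)
        diff = (pooled_sum - treat_sum) / n_ctrl - treat_sum / n_treat
        # Small tolerance so floating point noise does not drop exact ties.
        if diff >= observed - 1e-12:
            as_extreme += 1
        relabellings += 1
    return as_extreme / relabellings


if __name__ == "__main__":
    # Hypothetical lambs lost per field over one season.
    baited = [2, 3, 1, 4, 2]      # treatment: dog numbers reduced
    unbaited = [5, 7, 6, 9, 8]    # control: dog numbers left alone
    print(f"p = {permutation_p_value(baited, unbaited):.4f}")
```

A small p-value says the observed gap would rarely arise if baiting made no difference, which is exactly the inference the controls are there to support.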

Logical and simple enough in theory.

In practice, there are many specific challenges.

What should the replicate be in this case? Is it a field, a farm or a district? It clearly cannot be a test tube or a pot in a greenhouse or a plot marked out in a flat field on an agricultural research station.

If we choose fields as the replicates, how can we make them the same? It is not really possible. Even if we knew precisely how wild dogs behave and how many there were in each district, we would still need the dogs to visit each field the same number of times for the fields to be realistic replicates.

The replicates will always be a little different from each other, nothing like 100 ml test tubes, so they are not true replicates at all.

The solution is to have many replicates after first deciding from an understanding of the ecology of wild dogs how they use a landscape and so what size the unit of replication should be.

Let’s say we settle on fields of between 5 and 10 hectares in size where sheep graze for at least 100 days per year at similar stocking rates and where lambs are raised.

The fields must have at least one other field between them and yet must also be in the same district.

The real problem is that we need plenty of replicates. Three or four is nowhere near enough to bound the natural variation in the treatment and measurement variables. More like forty replicates of each are needed. This is unlikely to be possible due to logistics and cost, even assuming there is a district with enough fields that meet the criteria.
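To get a feel for where a number like forty comes from, here is a rough sketch using the textbook normal-approximation formula for comparing two group means. It is a shortcut, not a full power analysis, and the inputs (a reduction of 2 lambs lost per field, a between-field standard deviation of 3 lambs, 5% significance, 80% power) are assumptions invented for illustration.

```python
from math import ceil
from statistics import NormalDist


def replicates_per_group(effect, sd, alpha=0.05, power=0.8):
    """Approximate fields needed in EACH group (treatment and control).

    Uses n = 2 * ((z_alpha/2 + z_beta) * sd / effect) ** 2, the usual
    normal approximation for a two-sided, two-sample comparison of means.
    """
    z = NormalDist()                      # standard normal
    z_alpha = z.inv_cdf(1 - alpha / 2)    # two-sided significance level
    z_beta = z.inv_cdf(power)             # chance of detecting a real effect
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


if __name__ == "__main__":
    # Hypothetical: baiting saves 2 lambs per field, fields vary with SD 3.
    print(replicates_per_group(effect=2, sd=3))
```

With these made-up inputs the formula asks for 36 fields per group, and the number climbs quickly as the expected effect shrinks or the fields become more variable.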

There will also need to be plenty of controls and unlike in a traditional experiment with, let’s say, fertilizer application, the control will have to be in a district where the wild dogs are not controlled. This means they are not strictly formal controls.

Then only one treatment is realistic, reduction in wild dog numbers. Not too many farmers would be up for a treatment where dog numbers are increased.

Not surprising then that

Only 2.4% of pest control actions had one or more treatment and non-treatment areas, and very few treatment and non-treatment areas (0.3%) were randomly assigned.

The first glance at the statistics sends you to a criticism of the wildlife ecologists and pest managers for not using the scientific method and delivering robust evidence.

The reality is that the method is very difficult to implement. And that fundamental challenge is not the fault of the pest managers or the wildlife biologists.

An alternative form of the scientific method

It should be possible to complete some before-after-control-impact studies, sometimes called BACI analyses.

The idea here is to compare levels of both the treatment and the measurement variable before and after an intervention, such as a baiting program in the wild dog example, at sites that receive the intervention and at comparable sites that do not.

There are formal statistical procedures to interpret data of this kind and as long as there are sufficient instances of the comparison some robust evidence can be gathered. Not as rigorous as formal experiments but close enough given the constraints of the real world.

This approach takes time and coordination but it can be done, even retrospectively from standard observations and reports of dog sightings and livestock losses.
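The core of a BACI comparison can be sketched in a few lines. The version below is a bare-bones difference of differences on invented numbers, assuming seasonal lamb-loss records exist for both a baited and an unbaited district. The data and function name are hypothetical, and a real analysis would add a formal statistical model on top.

```python
from statistics import mean


def baci_effect(control_before, control_after, impact_before, impact_after):
    """Change at the impact sites minus the change at the control sites.

    A negative value means losses fell further where the intervention
    happened than they did where nothing was done.
    """
    control_change = mean(control_after) - mean(control_before)
    impact_change = mean(impact_after) - mean(impact_before)
    return impact_change - control_change


if __name__ == "__main__":
    # Hypothetical lambs lost per season, three seasons either side of baiting.
    unbaited_before = [10, 12, 11]
    unbaited_after = [9, 11, 10]
    baited_before = [12, 10, 14]
    baited_after = [6, 7, 8]
    effect = baci_effect(unbaited_before, unbaited_after,
                         baited_before, baited_after)
    print(f"BACI effect: {effect} lambs per season")
```

Subtracting the change at the unbaited sites is what stops a generally good or bad season from being mistaken for an effect of the baiting.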

The science purists will tell you that inference is weak without formal experiments. This is true.

But for this type of expensive intervention even weak inference can help decisions on the necessity or scale of the control programs.

Why so few experiments?

There are few experiments on the use of natural resources because they are hard, sometimes impossible, and always challenging to design and deliver.

Few institutions have the funds to take them on given the level of replication and the logistics involved.

Few scientists have the time to wait for the results. It is a career limiting step to wait a decade for your experimental results when publishing at least three peer-reviewed papers a year is essential.

Few scientists have the smarts to design experiments on this scale without resorting to pseudoreplication or other shortcuts.

The value sets that the experiments explore are so strongly held that they are hard to shift even if the inference is definitive.

What is the solution?

Be pragmatic about all of this.

Make sure that there is at least some information and apply the various rules to determine how useful it is to the questions at hand.

Help people to see that there is likely an emotional response at the root of their opinions and whilst this is legitimate it will make for tougher than necessary compromises.

Use big data. There is a new source of information that combines remotely sensed data, artificial intelligence algorithms that can see complex patterns, and modelling, particularly the modelling of ecological processes. Together these can make the inference much stronger than before.

We are much better off than we think when it comes to environmental information.


What makes a genius and do we need some

1. exceptional intellectual or creative power or other natural ability.
2. an exceptionally intelligent person or one with exceptional skill in a particular area of activity.

Do we need some geniuses or is it genii?

The definition of genius as ‘exceptional’ is a place to start.

Of course, we all have a little bit of above average in us, that is a rule of nature when it comes to human beings. There are no below-average drivers, photographers or singers in the world. Anything a person spends time on will automatically make them better than the norm. So says our ego.

When it comes to genius we maybe need a little more objectivity; to know what we mean by an exceptional intellect, creativity or intelligence.

What level or amount of these qualities and attributes do we say is necessary before the holder is exceptional or out of the ordinary?

It could cover the one in twenty occurrences of an intelligent person, or perhaps the one in a hundred, or one in a thousand individuals. These are the standard probability levels for statistical significance, the P-values that did your head in at school.

Let’s just for the sake of discussion say that genius-level starts at one in 1,000 individuals, that is P<0.001.

This would make the genius uncommon, not so frequent as to be among the passengers on a crowded bus, although a commuter train has one or two of them, on average, assuming you are not getting off at the convention centre the week of the National High-IQ Convention.

One person in a thousand is a genius.

The normal distribution

There is research that suggests that intelligence is normally distributed.

That would mean that most of us are around average intelligence and the smarter folks become fewer in frequency as the level of smarts increases. Same for the ‘not so clever’ in the other direction on the variable of smartness.

Here is the normal distribution, the classic ‘bell-shaped’ curve.

This pattern means that of the 50 people on the bus, we can expect 34 of them to be within one standard deviation of the mean (average) intelligence however that might be measured, IQ-score for example.

Only 2 of the passengers would be at the tails of the distribution, being either smarter or a little short of a picnic. These are the folk who are more than 2 standard deviations above or below the mean. The standard deviation is a measure of the spread of the distribution, in this case of intelligence.

Go as far as three standard deviations from the mean and just three people in 1,000 will be really smart or really dumb, about 0.27% of the distribution.

That’s close enough to 1 in a 1,000 for our genius estimation.

Alright, so not very many.

This calculation applies to the normal distribution whatever the measurement and whatever the actual mean and variance of the sample. The proportions are a property of the distribution and not the data. Whatever the measurement, in a thousand observations only about 3 would be expected by chance to fall more than three standard deviations above or below the mean.
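These proportions are easy to check with the standard normal distribution in Python's standard library. The sketch below assumes nothing beyond normality, and the bus of 50 passengers is the only number borrowed from the text.

```python
from statistics import NormalDist


def share_within(sds):
    """Share of a normal distribution within `sds` standard deviations of the mean."""
    z = NormalDist()  # mean 0, standard deviation 1
    return z.cdf(sds) - z.cdf(-sds)


if __name__ == "__main__":
    bus = 50  # passengers on the bus in the example
    for sds in (1, 2, 3):
        inside = share_within(sds)
        outside = 1 - inside
        print(f"{sds} SD: {inside:.2%} inside, {outside:.2%} in the two tails, "
              f"about {outside * bus:.1f} of {bus} passengers outside")
```

Because the shares depend only on the number of standard deviations, the same output applies to IQ scores, lamb losses or anything else that is roughly normal.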

How many geniuses in Australia?

Let’s assume that the normal distribution of intelligence holds for such an eclectic population as that in Australia and that the group we are looking for is the 0.14% in the above-average tail of the distribution.

Given that the adult population was roughly 20 million in 2020, there are around 29,000 adult geniuses in the country over the age of 20 and another 6,700 young geniuses learning and growing into adulthood.

Roughly 35,000 in total.

This should be enough to have a genius or two in the higher echelons of each major walk of life.

Or is it?

As of June 30 2019, there were 2,375,753 actively trading businesses in the Australian economy. This means one genius for every 67 businesses.

Not all of these businesses employed people, but there were 4,271 businesses with more than 3,200 employees, so they could at least get a genius each.

It would be good to have a genius on staff at each high school (9,393) and maybe a couple at each university (43) or perhaps each faculty (250+).

Then we need some in government departments (200+), hospitals (1,350) and even a few in the military where there are 58,650 active personnel and 21,700 reservists, so a couple of hundred there at least.

Source: Australian Bureau of Statistics

After a few of these needs are met, the 35,000 begins to get allocated pretty quickly. And if we do use them all up then there are allocation and distribution issues to consider. What sectors should get their full genius allocation?

Do we need any more geniuses?

Australia has plenty of genius-level capacity.

There should be enough smarts to figure out solutions to our problems among the 35,000 plus people who are more than 3 standard deviations above average intelligence.

There is a problem of where they are, who they work for and who listens to them. It also assumes that being a genius will be enough to solve problems and that may not always be true.

Then there is another statistical problem.

Suppose that the average intelligence, 𝞵 = 100

While the average person scores 100 on the intelligence test what the genius will score depends on the variability in test scores, 𝝈

If the variability is high ( 𝝈 = 15) a genius needs a score of over 145 to get into that upper 0.14% of the population. If the variability is lower ( 𝝈 = 5) then a genius need only score over 115 on the test.

The number of geniuses stays the same, scores differ.
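A short sketch makes the point concrete, assuming normally distributed scores and a genius cut-off of three standard deviations above the mean. The function names are made up for illustration.

```python
from statistics import NormalDist


def genius_threshold(mu, sigma, sds=3):
    """Score needed to sit `sds` standard deviations above the mean."""
    return mu + sds * sigma


def share_above(score, mu, sigma):
    """Share of a normal population scoring above `score`."""
    return 1 - NormalDist(mu, sigma).cdf(score)


if __name__ == "__main__":
    mu = 100
    for sigma in (15, 5):
        cutoff = genius_threshold(mu, sigma)
        print(f"sigma = {sigma}: cut-off {cutoff}, "
              f"share above {share_above(cutoff, mu, sigma):.3%}")
```

The cut-off score moves with the spread of the scores, but the share of people above it does not.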

If a specific level of genius score is required of a true genius, say 150, and the variability in intelligence in the population is low, then very few geniuses would be present in the population.

With 𝝈 = 5, a score of 125 is 5 standard deviations above the mean. On average there would be one person scoring above 125 in every 3.49 million in the population.

If this were the new genius level there would be only about seven in the country.
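For anyone who wants to check the far tail, here is a minimal sketch, again assuming normally distributed scores. The population of 25.5 million is an assumed round figure for the whole country, not a number taken from the statistics quoted above.

```python
from statistics import NormalDist


def one_in(sds):
    """How many people, on average, for each one more than `sds` SDs above the mean."""
    upper_tail = NormalDist().cdf(-sds)  # by symmetry equals P(Z > sds)
    return 1 / upper_tail


def expected_count(population, sds):
    """Expected number of people more than `sds` SDs above the mean."""
    return population / one_in(sds)


if __name__ == "__main__":
    population = 25_500_000  # assumed total population, for illustration only
    for sds in (3, 5):
        print(f"{sds} SD: one in {one_in(sds):,.0f}, "
              f"about {expected_count(population, sds):,.0f} people")
```

Moving the bar from three to five standard deviations shrinks the pool from tens of thousands to a handful, which is the whole problem in one line of output.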

This really is a problem.

Definitely not enough genii to go around.

Perhaps we should gather these elite individuals together, a bit like we do for sports, put them in a convivial environment and add some folk with a modicum of common sense and management nous. Then set them some of the real problems society has to face.

Worth a try?