Tarrant update 3 – Known knowns, known unknowns and unknown unknowns.

A fascinating comment was recently added to my post The Slow Death of a Chalk Stream. Nick Walton – a hydrogeologist with 50 years’ experience – wrote: 

Given the above, said Nick, historical evidence, empirical data, local knowledge and some hydrological common sense are worth a lot and shouldn’t be dismissed. 

Those four things are exactly what the River Tarrant Protection Society report contains. The RTPS is saying that when all the evidence is taken in the round the case is strong enough to justify further, detailed and truly independent investigation. 

This shouldn’t be a debate about whether the Wessex Basin Model is more sophisticated than the CSF modelling. It clearly is. 

The issue is whether the confidence placed in the Wessex report’s conclusions is justified, given the limitations of the underlying data, the acknowledged uncertainties in conceptual understanding, and the internal inconsistencies in model performance across the Pimperne and Tarrant catchments and beyond to the edge of the Stour. 

Data from an impacted system

All models are limited by the quality of the data fed into them, and here the data are limited and impacted. The Wessex model is built on:

  • groundwater level records (largely post-1970),
  • short-term stream flow gauging with spot meters (primarily 2015–2017),
  • short-term targeted pumping and switch-off tests,
  • a system that has been subject to decades of abstraction.

In other words, the model is based on data extracted from a system that is already altered from its natural state. 

Without continuous flow records prior to the 1970s, and without direct measurements of groundwater–surface water interactions before large-scale abstraction, surely historical and qualitative evidence becomes more, not less, important? And yet it is largely excluded from the formal assessment.

Of course, Jane Dottridge wasn’t commissioned to comment on this other evidence. She nevertheless described it as “anecdotal”. I don’t think that’s fair. Anecdotal refers to an account or short narrative that is subjective, unreliable, or hearsay. Mapped Domesday mills are not anecdotal evidence.

Pimperne calibration

As to the models: Jane critiqued the one-dimensional simplicity of the CSF conceptual model. However, in spite of its attempts to capture the more complex reality, there is still uncertainty and assumption in the Wessex model, especially around the Pimperne–Tarrant interfluve. 

Jane does highlight this: “the Pimperne calibration is not very good, with a very smooth modelled recession in contrast to the marked break in slope of the observations. Some of the gauges on the middle Tarrant (Rushton, Preston Farm) also show the same feature” 

But she makes little of it. In the next paragraph Jane writes: “The conclusions appear to be justified based on the evidence presented in the report”

I don’t follow that logic. To recap, the conclusions of the report are:

  • Tarrant: only the abstraction pump in the valley (Stubhampton) is relevant to flows in the Tarrant. The stream is negligibly impacted by this abstraction “along the perennial reach” * and the ecology is not adversely impacted.
  • Pimperne: the abstraction at Black Lane does not impact flows in the Pimperne.

That is a very clear “no impact” statement given: 

  • The calibration is poor in the Pimperne and the lower Tarrant. 
  • The Black Lane abstraction is a high % of the catchment recharge. 
  • The groundwater boundary is modelled as fixed with no impact on the neighbouring Tarrant.

Surely the mismatch between the strength of the “no impact” conclusions and the poor calibration warrants a furrowed brow. 

* This is a variation on a rhetorical ploy I’ve seen before: if a stream is dry then abstraction is ipso facto not impacting the stream. It’s also evidence of my point about how the impacted state can become the new baseline. The RTPS contends that the lower river is naturally perennial.

Known knowns, known unknowns and unknown unknowns.

The Wessex report presents a conceptual model strategically refined by fieldwork that included stream-bed surveys, weekly observations and spot-flow measurements, new boreholes to investigate the interfluve, switch-off and pumping tests. 

Accordingly, the model was refined to simulate lower transmissivity beneath interfluves, higher transmissivity in valley bottoms and the introduction of “unmapped faults” in the chalk – horizontal flow barriers – to improve calibration.

Surely these iterative refinements highlight, rather than resolve, the uncertainty? The interfluve behaviour was not predicted by earlier model versions, new borehole data required significant reinterpretation of the system, and the fault-line is partly imposed through model structure: it is inferred, because the river dries, rather than directly observed in the geology.

The cornerstone conclusion regarding the Tarrant, that abstractions outside the catchment have no impact, depends on the assumption of limited cross-interfluve connectivity. And yet groundwater catchments are known to shift with hydraulic gradients, and Jane’s review confirms that groundwater boundaries can and often do vary over time and with rising and falling groundwater levels. If boundaries can shift like this, they can also shift because of abstraction pressure. 

A central element of the Wessex argument is that switch-off and pumping tests define what they call “zones of influence” of abstractions and that impacts are therefore spatially limited.

This interpretation is not supported by general hydrogeological principles. Why does it pass unchallenged?

Short-duration tests reveal immediate, local drawdown responses but do not capture longer-term system adjustment. They don’t capture the delayed propagation of pressure changes, the redistribution of groundwater flow paths, or slowly accreting capture from inter-connected water bodies. 

The absence of observed drawdown at a location during a short test simply cannot be taken as evidence of no long-term hydrological impact.
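To give a feel for the timescales involved, here is a rough Python sketch using the textbook Cooper-Jacob radius of influence, r = 1.5 × √(T·t/S), rearranged to give the time for measurable drawdown to reach a given distance. This is an idealisation for a uniform, confined aquifer, which the Chalk is not, and it says nothing about the slower adjustments listed above. The transmissivity and storage values are invented for illustration and are not taken from the Wessex report or any Tarrant data; the function name is mine.

```python
def days_until_drawdown_arrives(distance_m, transmissivity_m2_per_day, storage_coefficient):
    """Days for measurable drawdown to reach a given distance from a pumped well.

    Rearranged from the Cooper-Jacob radius of influence
    r = 1.5 * sqrt(T * t / S), giving t = r**2 * S / (2.25 * T).
    Idealised: uniform, confined aquifer with constant pumping.
    """
    return distance_m ** 2 * storage_coefficient / (2.25 * transmissivity_m2_per_day)


# Hypothetical aquifer properties, chosen only to show the scaling.
T = 500.0  # transmissivity in m2/day (assumed)
S = 0.01   # storage coefficient, dimensionless (assumed)

for distance in (500.0, 1000.0, 3000.0):
    days = days_until_drawdown_arrives(distance, T, S)
    print(f"{distance:.0f} m: about {days:.0f} days")
```

With these made-up numbers the first response arrives after roughly 2, 9 and 80 days respectively. The point is the scaling, not the figures: the time grows with the square of the distance, so a test lasting days or weeks can easily end before anything registers a few kilometres away.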

Pick’n’mix

There is also evidence of expedient selectivity in the Wessex report and even Jane points this out. Where the Wessex model performs reasonably well – the Tarrant – it is used to support conclusions. Where it performs poorly – the Pimperne – then alternative methods are used: pump tests and empirical observations.

This pick’n’mix kind of undermines confidence in the whole thing, surely? The analytical method is not consistent across the whole piece. 

John’s CSF model may be pilloried for its simplicity, but at least it treats the whole study area in the same way. The Wessex Water approach ought to weaken the Environment Agency’s confidence in the system-wide conclusions, particularly those relating to the cross-catchment impacts we insist are plausible but which Wessex Water hotly denies.

What about 2017?

I’ve already underlined the coincidence between the long-term shutdown of the Black Lane pump in the neighbouring Pimperne valley, from 2016 to 2017, and the fact that the summer of 2017 was the one year in the past ten that the lower Tarrant did not dry. This is good evidence that the Black Lane pump, or the Black Lane and Shapwick pumps in tandem, may well be having an impact on the Tarrant, especially when one remembers that the spring of 2017 was bad for chalk streams. That was the year I took photographs of drying streams all round London: the Ver, Chess, Misbourne, Beane, Rib, Ash and others.

The year the Chess looked like this, the River Tarrant kept flowing.

Wessex Water has an answer: they claim that late summer rain prevented the Tarrant from drying when it was otherwise on course to. I put this to John Lawson and he went away to look at the rainfall figures over a longer time-series, to see if this late summer rain was an anomaly that plausibly did make the difference.

As you can see, the summers of 2015, 2021 and 2023 were as wet as or wetter than 2017, but the river still dried. Meanwhile the preceding winter, October 2016 to March 2017, was unusually dry, and winter rainfall is what usually determines flows in the following summer.

In summary

There is a mismatch between the confidence in the conclusions and what underpins them: a limited range of data (no consistent, long-term flow gauging), an incomplete understanding of the aquifer and poor calibration in the modelling.

The purpose of highlighting these issues is not to suggest that “we are right and Wessex Water and the Environment Agency are wrong”. Instead it is to demonstrate that:

  • alternative models produce plausible results which do suggest an abstraction impact,
  • key assumptions (e.g. fixed catchment boundaries, limited zones of influence) are not definitively proven,
  • the current evidence base does not support a strong “no impact” conclusion.

Given all the above, surely it would be prudent to treat the current findings as provisional rather than definitive, and to look for a more robust, truly independent investigation, with scope not limited to model comparison?

Let’s not forget, this stream is used for spawning by Atlantic salmon. The stream may not be as protected as the Bourne and Wylye, but the salmon is. These fish are genetically unique to chalk streams and the Stour’s population of these fish must be the most endangered stock of all.

Oh and just one more thing …

Underlining the mismatch between what we know and confidence in conclusions, it is worth adding that recent research into the Chalk aquifer by Andy Farrant and others at the BGS has highlighted the greater-than-previously-recognised role of karstic dissolution features and preferential flow pathways in chalk. These can provide localised areas of enhanced permeability that are not necessarily captured in regional groundwater models. Hydraulic connectivity may well occur along pathways that are not predicted by averaged aquifer properties or detected by limited observation boreholes. Surely this is relevant where abstraction alters hydraulic gradients, potentially activating or enhancing flow along such pathways?

Sure, this does not demonstrate that such connections exist between the Pimperne and Tarrant catchments, but it does underline the uncertainty associated with assuming that lower-transmissivity interfluves act as hard hydraulic boundaries.

Just saying …

Tarrant update 2 – In defence of simplicity

In my last post I questioned why the Environment Agency confined its review of the River Tarrant Protection Society (RTPS) report to a comparison between two modelling approaches.

I argued that the Chalk Streams First (CSF) model—a simple, lumped parameter model—was never intended to replace the more complex 3-D model used by Wessex Water, but rather to highlight uncertainty. Several hydrogeologists, including the independent reviewer, have previously suggested that such approaches can be used in a complementary, tiered way, with monitoring data providing essential context.

In that light, it makes little sense to treat this as a modelling contest in which the limitations of one approach invalidate its findings. Model outputs should be interpreted alongside other lines of evidence.

The independent review compared:

  • the Wessex Water Middle Stour report (the official position), and
  • the RTPS report on low flows and drying

with a focus on hydrogeological data and modelling.

In this post I consider that comparison in the light of Jane Dottridge’s review (attached to my previous post), focusing specifically on the conceptual and methodological validity of the CSF model.

Assessment versus indicator

Jane was asked to comment on the validity of the RTPS findings on abstraction impacts, and to consider the Wessex report by comparison. She concluded that the CSF model does not “provide a more reliable assessment of abstraction impacts than the Wessex model”.

However, the RTPS report did not claim to provide a more reliable assessment, but rather a more reliable indicator. That distinction matters. An assessment implies a definitive evaluation; an indicator signals a relationship or pattern without claiming certainty.

The CSF model was presented as part of a broader evidential framework. Its outputs, taken together with other observations, were used to question the certainty of Wessex Water’s conclusions. Judging it as if it were intended to deliver a standalone assessment risks setting up a straw-man comparison.

The conceptual model

Jane states that the CSF model is highly simplified and suggests first of all that it has no conceptual basis, then later that it lacks a sound conceptual basis. There is some ambiguity here: whether no conceptual model exists, or whether the one used is considered inadequate.

In practice, the CSF model is based on a clearly defined—if simple—conceptual model. It assumes:

  • a fixed groundwater catchment based on topography
  • uniform transmissivity
  • a broadly synchronous rise and fall in groundwater levels
  • a distributed pattern of spring discharge across the valley

These are simplifications of a complex system. In reality, groundwater catchments shift, transmissivity varies, and flow processes are spatially heterogeneous. But the question is not whether the model captures every detail—it does not—but whether it is appropriate for its intended purpose.

There is ample precedent in groundwater science for simplified conceptual models, particularly where the aim is to identify dominant controls or test the plausibility of observed relationships. 

Model complexity should be proportionate to the question being asked.

Empirical relationship between groundwater and flow

The CSF approach is grounded in an empirical observation: that groundwater level and streamflow are closely correlated in chalk streams.

John Lawson has shown – using historical data – that, within relatively tight bounds, when groundwater levels are at a given elevation, streamflows fall within a given range. This close relationship appears to hold across long time series and across multiple different chalk stream catchments. John has looked in detail at the Rivers Kennet, Og, Misbourne, Chess, Ver, Mimram, Beane, Ivel and Darent, with some examples shown below.

Notes: 1. Baseflows are derived from gauged flows using baseflow separation software. 2. Plotted baseflows usually lead GWLs by 2–3 weeks.

And, of course, the River Tarrant.

The implication is that groundwater level is the dominant control on flow, with abstraction largely affecting flows indirectly by lowering groundwater levels relative to their natural state.

This is not a theoretical construct imposed on the system, but a pattern observed in the data and then represented mathematically.

The CSF equation and non-linearity

The CSF model expresses this relationship in the form:

Q = a(GWL – b)^c

where the constants a, b and c are calibrated to fit observed data. Q is flow and (GWL – b) is the height (h) of the groundwater at the observation point above the stream bed at the discharge point.

As shown in the above plots for the Rivers Chess, Misbourne, Mimram and Ver, the relationship between GWLs and baseflows is very strong for “pure” chalk streams with baseflow indices over 90%, like the Chess and Misbourne. In rivers like the Darent, with mixed geology including some Tertiary deposits, the baseflow indices are below 80% and the relationships show more scatter, but are still plain to see.

A key feature of the relationship is that it is non-linear: increases in groundwater level produce disproportionately larger increases in flow. The model captures this behaviour through the exponent (c), which typically lies between 2 and 2.5, as seen on the plots above.

This non-linearity can be understood heuristically. As groundwater levels rise:

  • the area of saturated ground contributing to spring flow increases, and
  • the hydraulic response of the system becomes more pronounced

Together these effects produce a more-than-linear increase in discharge. While the precise physical mechanisms are debated (candidates range from valley geometry to fracture density), non-linear behaviour is widely observed in the data.

The CSF model does not claim to resolve all underlying processes, but it does provide a consistent way of representing this empirical relationship.
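To make the non-linearity concrete, here is a minimal Python sketch of the equation Q = a(GWL – b)^c. The parameter values are invented purely for illustration and are not calibrated values for any river. The function name and the choice to return zero flow when the groundwater level sits at or below b are my assumptions, not something taken from the RTPS report.

```python
def csf_flow(gwl, a, b, c):
    """Flow Q = a * (GWL - b) ** c.

    Returns zero when GWL is at or below b; that branch is an
    assumption made for this sketch.
    """
    head = gwl - b
    if head <= 0:
        return 0.0
    return a * head ** c


# Hypothetical constants, chosen only to illustrate the shape of the curve.
A, B = 0.01, 50.0

for c in (2.0, 2.5):
    q_low = csf_flow(B + 2.0, A, B, c)   # groundwater 2 m above b
    q_high = csf_flow(B + 4.0, A, B, c)  # groundwater 4 m above b
    print(f"c = {c}: doubling the head multiplies flow by {q_high / q_low:.2f}")
```

Doubling the head multiplies the flow by 2^c: four times for c = 2 and about 5.7 times for c = 2.5. The same arithmetic works in reverse, which is why a modest lowering of groundwater level can take a disproportionate bite out of flow.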

Calibration and transparency

Jane raises concerns about how model parameters, such as subsurface flow and specific yield, are derived.

In the CSF model, these parameters are obtained through calibration: the constants are adjusted until the model reproduces the observed relationship between groundwater levels and streamflows over historic records.
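For readers who want to see what calibration of this kind can look like, here is a sketch of one simple approach: treat b as already known and fit a and c by least squares on the logarithms, since log Q = log a + c·log(GWL – b) is a straight line. This is not necessarily the procedure John uses. The function name is mine, and the “observations” are synthetic, generated from known constants so that the fit can be checked against them.

```python
import math


def fit_power_law(gwl, flow, b):
    """Least-squares fit of log(Q) = log(a) + c * log(GWL - b).

    Assumes b is already known, every GWL is above b and every flow
    is positive. Returns the fitted (a, c).
    """
    x = [math.log(g - b) for g in gwl]
    y = [math.log(q) for q in flow]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    c = sxy / sxx                      # slope of the log-log line
    a = math.exp(y_mean - c * x_mean)  # intercept, converted back from logs
    return a, c


# Synthetic data generated from known constants (not real measurements).
TRUE_A, TRUE_B, TRUE_C = 0.02, 50.0, 2.2
levels = [51.0, 52.0, 53.5, 55.0, 57.0]
flows = [TRUE_A * (g - TRUE_B) ** TRUE_C for g in levels]

a_fit, c_fit = fit_power_law(levels, flows, TRUE_B)
print(f"a = {a_fit:.4f}, c = {c_fit:.2f}")
```

Because the synthetic data contain no noise, the fit recovers the constants it was generated from. Real records scatter around the curve, and b would usually need to be estimated as well, so treat this as an illustration of the principle only.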

This is a standard empirical approach. The parameters effectively encapsulate the combined influence of aquifer properties such as permeability, transmissivity and storage (a), and of valley shape combined with other components of the non-linearity, such as fracture density rising with altitude (c).

The method is described in the RTPS report (page 22), including the treatment of throughflow and specific yield. While simple, it is transparent: the model is designed to reproduce observed system behaviour rather than simulate all underlying processes explicitly.

The key question is therefore not how the parameters are derived in isolation, but whether the calibrated model reproduces reality with sufficient fidelity. On that measure, the fits to historic data are strong.
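“Sufficient fidelity” can be put into numbers. One score widely used in hydrology is the Nash-Sutcliffe efficiency, where 1 is a perfect fit and 0 means the model does no better than simply predicting the average observed flow. I am not claiming this is the measure used in the RTPS report or by the reviewers; the sketch below only shows how such a score is calculated, and the flows in it are invented.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency.

    1.0 is a perfect fit; 0.0 means the model is no better than
    predicting the mean of the observations.
    """
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Invented numbers for illustration only (not measured flows).
observed = [0.10, 0.25, 0.60, 1.10, 1.80]
close_fit = [0.12, 0.24, 0.58, 1.15, 1.75]
mean_only = [sum(observed) / len(observed)] * len(observed)

print(round(nash_sutcliffe(observed, close_fit), 3))  # close to 1
print(round(nash_sutcliffe(observed, mean_only), 3))  # zero
```

A score like this allows a simple model and a complex one to be compared on the same observations, on equal terms.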

Is simplicity a weakness?

Prior to the Affinity Water conference in 2022, the CSF model was reviewed by several hydrogeologists. While they noted its simplicity and raised questions about parameter estimation, they did not dismiss the approach. On the contrary, they regarded the results as promising and worthy of further consideration.

Andy Binley wrote: “I must say that the modelling results and analysis of historic data appear convincing to me. You have modelled a substantial set of historic records using a simple lumped approach – the fits to data are impressive and appear to outperform the EA model.”

Jonathan Paul wrote: “The reports showcase an interesting, if highly simplified, analytical relationship between groundwater level and river discharge. Initial results look very promising, but greater clarity in how your exponents a and b were obtained would be welcome.”

Jane herself noted in earlier correspondence that the model was “a neat little model” and more satisfactory than some alternatives, albeit highly simplified.

This highlights a tension in the review. The same simplicity that was previously seen as acceptable — within a defined scope — is later treated as a fundamental weakness.

Yet simplified models have a recognised role. They are often used in early-stage assessment, to identify key controls and sense-check more complex analyses. If they can reproduce observed behaviour reliably, they can provide a valuable benchmark against which more elaborate models can be tested.

Conclusion

The CSF model is not a replacement for detailed 3-D modelling, nor does it claim to be. It is a simplified, empirically calibrated tool designed to capture the dominant relationship between groundwater levels and streamflow.

Its conceptual basis is explicit, if simplified. Its parameters are derived transparently through calibration. And its outputs align closely with observed data across multiple catchments.

In that context, the key issue is not whether the model is simple, but whether it is useful. If it consistently reproduces observed behaviour, then it has a legitimate role — particularly in testing the robustness of conclusions drawn from more complex models.

To dismiss it on the basis of its simplicity alone risks overlooking precisely the kind of evidence that can help identify uncertainty in groundwater impact assessments.