# Overview of estimation methodology

The following outlines the methods used within Qube for estimating natural (and influenced) time series of daily mean flow.

- **Step 1:** Creation of a pool of donor catchments.
- **Step 2:** Selecting the most appropriate TSEP Donor.
- **Step 3:** Estimating the flow time series.

The premise of the Qube time series estimation is to use the best available data to estimate a time series for your catchment of interest. At a gauging station with local data, combining the naturalised and gauged flow duration curves (FDCs) with the gauged time series yields the naturalised and gauged time series. As you move away from the gauging station, Qube increasingly relies on modelled FDCs and time series.

The method is described below, with further details available in a supporting technical note.

# Step 1: Creation of a pool of donor catchments

Qube uses a pool of ~11,000 donor catchments across Great Britain. Donor Time Series of Exceedance Probabilities (TSEP) sites consist of a time series of monthly exceedance percentile data from either gauging station records or CERF generalised continuous simulation model runs.

## Identifying Potential Gauged Donors

The suitability of all NRFA gauging stations has been classified for use as potential donor TSEP sites. The classification process assumes that the time series information (the rank order) in a gauged flow record is not sensitive to systematic hydrometric biases in the record (for example, high flows being underestimated or low flows being overestimated) or to temporally invariant influences, but will be affected by large, temporally varying influences. Gauging stations that have unstable hydrometric records (e.g. large changes to the measurement structure) or are affected by large temporally varying influences were considered unsuitable and excluded.

Approximately **1,000 gauging stations** have been selected as potential donor TSEP sites across Great Britain. These have been classified as:

- Y - Natural or low net influence, or
- P - Significant but time invariant net influence (but no impounding reservoirs)

Where the gauged record is not complete, it is infilled with CERF simulated monthly exceedance percentile data.

## Generation of CERF Donors

Approximately **10,000 CERF** simulations have been generated across Great Britain for potential donor TSEP sites.

## Assessing Donor Suitability

The suitability of the time series for use varies from catchment to catchment. The core method for selecting an appropriate TSEP site is based upon maximising the likelihood of replicating the time series information that would be observed if that location were gauged. Similarity is measured by Spearman Rank Correlation (SRC), which is calculated between two time series. A correlation of 1 represents a perfect correlation of ranks.
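As an illustration of the similarity measure, the sketch below computes a Spearman rank correlation in plain Python by ranking both series and taking the Pearson correlation of the ranks. The function names and flow values are hypothetical, and giving tied values their average rank is a common convention assumed here rather than something stated for Qube.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rank_correlation(x, y):
    """Pearson correlation of the ranks of x and y (undefined for constant series)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length (>= 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical daily flows (m³/s) at two sites: different magnitudes, same ordering.
site_a = [1.2, 0.8, 3.5, 2.1, 0.5]
site_b = [10.0, 9.0, 40.0, 12.0, 4.0]
src = spearman_rank_correlation(site_a, site_b)
```

Because only the ordering of the flows matters, the two hypothetical sites correlate perfectly despite their very different magnitudes, which is why the measure suits the transfer of rank information between catchments.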

All TSEP sites have been assigned an at site SRC (*SITE_SRC*). Gauged time series data are more highly ranked than modelled time series data, as follows:

- a complete gauged record has a *SITE_SRC* of 1, as the time series is measured.
- a CERF simulation has a *SITE_SRC* of 0.8 (the average value of rank correlation between CERF and gauged flows across the catchments within the Qube ROI pool).
- an incomplete gauged record, infilled by a CERF simulation, has a weighted *SITE_SRC*. The weighting is between 1, for the proportion of gauged values, and, for the proportion of CERF simulated values, the SRC calculated between the gauged flows and the corresponding CERF flows on the days for which there are gauged flows.
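The weighting for an infilled record can be read as a simple proportional average. The sketch below assumes a linear weighting by the number of gauged and CERF-infilled days; the function name, arguments and example counts are hypothetical, and the exact weighting used in Qube may differ.

```python
def infilled_site_src(n_gauged_days, n_cerf_days, src_gauged_vs_cerf):
    """Weighted at-site SRC for a gauged record infilled with CERF values.

    Assumption: gauged days carry an SRC of 1 and CERF-infilled days carry the
    SRC computed between gauged and CERF flows on the days with gauged data.
    """
    total = n_gauged_days + n_cerf_days
    if total <= 0:
        raise ValueError("record must contain at least one day")
    gauged_fraction = n_gauged_days / total
    return gauged_fraction * 1.0 + (1.0 - gauged_fraction) * src_gauged_vs_cerf


# Hypothetical record: 300 gauged days, 100 infilled days, gauged-vs-CERF SRC of 0.8.
example_site_src = infilled_site_src(300, 100, 0.8)  # approximately 0.95
```

Under this reading the result always lies between the CERF-versus-gauged SRC and 1, approaching 1 as the gauged record becomes more complete.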

# Step 2: Selecting a TSEP Donor

Close to a gauged TSEP site, Qube will utilise the gauged time series. As the site of interest moves away from a gauging station, dependent on the density of the gauged network, the method may consider that the TSEPs from available gauging stations are less representative of your catchment than the TSEP derived from a local application of the CERF generalised continuous simulation model.

A pool of candidate donors is identified based on area and distance constraints and the differentiation between nested and non-nested catchments. The method then considers how well the time series is represented at the location of the TSEP site (the at site SRC), and subsequently how representative that candidate donor time series is for the target catchment (the SIMINDEX).

*Note: The SIMINDEX was derived using multi-variate regression analysis to explain the variation in SRC between pairs of gauged catchments drawn from across the UK and the dependency of this variation on differences in key catchment descriptors. The key descriptors are the difference in distance, runoff and BFIHOST.*
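The area and distance constraints on the candidate pool can be sketched as a simple filter. The 50 km limit and the area factor of 4 are the values this section quotes for candidate selection; reading the area factor as "neither catchment more than four times the area of the other" is an assumption, as are the class, function and site names. How distance is measured is not specified, so it is taken as a given input.

```python
from dataclasses import dataclass

MAX_DISTANCE_KM = 50.0  # distance limit quoted for candidate donors
MAX_AREA_FACTOR = 4.0   # area factor quoted for candidate donors


@dataclass
class Candidate:
    name: str
    distance_km: float  # distance from the target (measurement basis assumed given)
    area_km2: float


def within_pool_constraints(candidate, target_area_km2):
    """True if the candidate meets the assumed distance and area-ratio limits."""
    ratio = candidate.area_km2 / target_area_km2
    # Assumption: the area factor applies in both directions (1/4 to 4).
    return (candidate.distance_km <= MAX_DISTANCE_KM
            and 1.0 / MAX_AREA_FACTOR <= ratio <= MAX_AREA_FACTOR)


# Hypothetical candidates around a 100 km² target catchment.
candidates = [
    Candidate("hypothetical_gauge_A", 12.0, 150.0),  # passes both limits
    Candidate("hypothetical_gauge_B", 65.0, 90.0),   # too far away
    Candidate("hypothetical_cerf_C", 8.0, 900.0),    # area ratio too large
]
pool = [c.name for c in candidates if within_pool_constraints(c, 100.0)]
```

This covers only the pool constraints; the hierarchy by TSEP type and nesting, and the stricter rules for adjacent sites and small catchments, are not represented.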

Candidate TSEP donors are selected within 50 km and an area factor of 4, based on a hierarchy classified by TSEP type and whether they are nested or adjacent to the target. More stringent rules are applied to adjacent sites (limiting adjacent gauged TSEP donors to those classified as Y - natural or low net influence), as well as for small catchments (less than 20 km²). The donor is then selected using the following method:

# Step 3: Estimating the flow time series

The time series are generated using the Qube estimated monthly FDCs for the catchment of interest as 'index' FDCs to assign the flows to the time series. These are the influenced FDCs if the donor is a nested gauged TSEP and the natural FDCs if the donor is a CERF TSEP (or an adjacent gauged TSEP). For each of the monthly index FDCs there are 'paired' natural and influenced percentile and flow values. For example, a natural June Q99.9 of 0.6 m³/s may be negatively influenced to become 0.45 m³/s. The 0.45 m³/s flow represents the influenced June Q91, whereas the influenced June Q99.9 is 0.13 m³/s. The natural June Q99.9 of 0.6 m³/s and influenced June Q91 of 0.45 m³/s remain 'paired', with the natural or influenced index percentile used to look up the paired natural and influenced flow.

## Estimating a natural flow time series

Natural flow time series are generated by sampling the Qube at site estimate of the natural monthly FDCs using the donor time series of monthly percentiles.
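A minimal sketch of this sampling step is given below. It assumes each monthly FDC is stored as (exceedance percentile, flow) pairs and that flows between tabulated percentiles are found by linear interpolation; the interpolation scheme, data layout, names and values are all hypothetical and chosen only to illustrate the idea.

```python
from bisect import bisect_left


def flow_at_percentile(fdc, percentile):
    """Interpolate a flow from (percentile, flow) pairs sorted by ascending percentile.

    Percentiles outside the tabulated range are clamped to the end values.
    """
    percentiles = [p for p, _ in fdc]
    if percentile <= percentiles[0]:
        return fdc[0][1]
    if percentile >= percentiles[-1]:
        return fdc[-1][1]
    hi = bisect_left(percentiles, percentile)  # first tabulated point >= percentile
    (p0, q0), (p1, q1) = fdc[hi - 1], fdc[hi]
    return q0 + (q1 - q0) * (percentile - p0) / (p1 - p0)


# Hypothetical natural monthly FDCs for the target catchment (m³/s), keyed by month.
natural_fdcs = {
    6: [(5.0, 4.0), (50.0, 1.5), (95.0, 0.7)],
    7: [(5.0, 3.0), (50.0, 1.0), (95.0, 0.5)],
}

# Hypothetical donor TSEP: (month, exceedance percentile) for consecutive days.
donor_tsep = [(6, 50.0), (6, 72.5), (7, 5.0), (7, 27.5)]

# Each day takes the flow at the donor's percentile on the target's FDC for that month.
natural_series = [flow_at_percentile(natural_fdcs[m], p) for m, p in donor_tsep]
```

The donor supplies only the timing (which percentile occurs on which day), while the magnitudes come entirely from the target catchment's own FDC estimates.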

Two approaches are used when applying flows to the percentile time series dependent on the Donor TSEP type (gauged or CERF) and whether the Donor TSEP and target catchment are nested or adjacent:

- Method A is used for nested gauging station sites. It is assumed that the influences affecting the donor TSEP flows are common to both the gauged and target catchments (noting this may not always be the case). As influences are only visible to regulator users (due to data rights), the method assumes for other users that the rank order of gauged flows within a given month is the same as the rank order of the naturalised flows.
- Method B is predominantly used for CERF TSEP sites, thus the donor is assumed to be natural. This is also used for adjacent gauged sites, where the time series is assumed to be natural (noting this may not always be the case) as the influences affecting the TSEP flows would not be affecting the target catchment.
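The choice between the two methods, together with the index FDC described under Step 3, can be summarised as a small decision function. This is one reading of the text rather than Qube's actual logic, and the function name and string labels are hypothetical.

```python
def select_method(donor_type, nested):
    """Return (method, index_fdc) for a donor TSEP.

    donor_type: "gauged" or "cerf"; nested: True if donor and target are nested.
    """
    if donor_type not in ("gauged", "cerf"):
        raise ValueError("donor_type must be 'gauged' or 'cerf'")
    if donor_type == "gauged" and nested:
        # Nested gauged donor: the gauged record is indexed against influenced FDCs.
        return ("A", "influenced")
    # CERF donors and adjacent gauged donors are treated as natural.
    return ("B", "natural")


nested_gauge = select_method("gauged", nested=True)
adjacent_gauge = select_method("gauged", nested=False)
cerf_donor = select_method("cerf", nested=True)
```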

## Estimating an influenced flow time series

*Please note that artificial influences, and thus influenced flow estimates, are only visible to regulator users (due to data rights).*

As described above, when estimating the influenced monthly FDCs Qube maintains the pairing of natural and influenced flows and flow percentiles. Using the example given above, a natural June Q99.9 of 0.6 m³/s may be negatively influenced to become 0.45 m³/s. The 0.45 m³/s flow represents the influenced June Q91, whereas the influenced June Q99.9 is 0.13 m³/s. Within the Qube influenced time series, on a June day when the natural flow is estimated to be the natural Q99.9, the influenced flow on that day would be 0.45 m³/s (influenced Q91).
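The pairing can be pictured as a lookup table keyed by the natural percentile. The sketch below uses the June values from the worked example plus one hypothetical extra row; the table layout and names are assumptions, and exact-match lookup stands in for whatever interpolation a real implementation would use.

```python
# Paired June values keyed by natural exceedance percentile:
# natural percentile -> (natural flow, influenced percentile, influenced flow), flows in m³/s.
june_pairs = {
    99.9: (0.60, 91.0, 0.45),  # worked example from the text
    50.0: (1.50, 47.0, 1.40),  # hypothetical additional row
}


def influenced_flow(pairs, natural_percentile):
    """Return the influenced flow paired with a natural index percentile."""
    _natural_flow, _influenced_percentile, flow = pairs[natural_percentile]
    return flow


# Hypothetical run of June days expressed as natural percentiles.
june_natural_percentiles = [50.0, 99.9, 99.9]
june_influenced_series = [influenced_flow(june_pairs, p) for p in june_natural_percentiles]
```

On the days at the natural Q99.9 the series takes 0.45 m³/s, the paired influenced flow, rather than the 0.13 m³/s that sits at the influenced Q99.9.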