Self Training Examples

   

       
           

Example 3:  Fitting a probability model to gauged data           

Launch Flike           

This example demonstrates undertaking a flood frequency analysis using the procedures described in Australian Rainfall and Runoff Book 3: Peak Discharge Estimation (http://arr.ga.gov.au/arr-guideline). Specifically, this example covers the fitting of a Log Pearson Type 3 (LP3) distribution to an annual maximum series for the Hunter River at Singleton. The analysis will be undertaken using Flike, which has been developed to undertake flood frequency analysis as described in ARR; that is, it has the ability to fit a range of statistical distributions using a Bayesian Inference method.

Once Flike has been obtained and installed, launch Flike and the screen in Figure 1 will appear.                 

Figure 1: Flike Splash Screen

Create new .fld file           

The first step will be to create the .fld file which contains information about the project.  To create a new .fld file, select New from the File dropdown menu. This will open a new window called Open as shown in Figure 2.           

         

Figure 2: Create New .fld File           

Create and save a new .fld file in an appropriate location, such as in a folder under the job directory, and give it a logical name, in this case Example_3.fld. A message will appear asking if you want to create the file; select Yes. Note that the window is titled Open, but it works for creating new files as well. Once the .fld file has been saved, the Flike Editor window will open, which will be used in the next step.

Configure the project details           

The .fld file is used to store the project data and configuration. Once the .fld has been created the Flike Editor window will open automatically (see Figure 3) and the project will be configured here. The first piece of information to be completed is the project name, which is entered in the Title text box. The project title can extend over two lines.

         

Figure 3: Flike Editor Screen           

Import the data           

The next step is to import the flood series to analyse. To do this select the Observed values tab in the Flike Editor as shown in Figure 4. In this tab the flood series to be investigated will be imported.           

         

Figure 4: Observed Values Screen           

To import the flood series select the Import button and the Import gauged values window opens as shown in Figure 5. Now select the Browse button and navigate to the Singleton flood series. The example data are included in the Flike download, with a copy installed in the data folder of the Flike install location; by default this is C:\TUFLOW Flike\data\singletonGaugedFlows.csv. The data also appear at the end of this example.

       

Figure 5: Import Gauged Values Screen          

Once the data file has been selected, the program will return to the Import gauged values window. As the input data format is flexible, Flike needs to be told how to interpret the data file. To view the format of the data, select the View button and the data will open in your default text editor (see Figure 6). In the example data the first line is a header and the data follow it; the flow values are in the first column and the year is in the fourth column. Having taken note of the data structure, close the text editor and return to the Import gauged values window. It is good practice to check the data in a text editor to ensure that the format is known and that the file has not been corrupted or does not include a large number of trailing commas or whitespace. This last issue commonly occurs when deleting information from Excel files, but it is easy to fix: simply delete any trailing commas or whitespace in a text editor.
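If preferred, this check can also be scripted. The sketch below is a convenience only and is not part of Flike; it assumes the default install path and the layout described above (one header line, flows in column 1, years in column 4) and flags blank or comma-only records.

    # Sanity-check the gauged-flow file outside Flike (assumed layout:
    # one header line, flows in column 1, years in column 4).
    import csv

    path = r"C:\TUFLOW Flike\data\singletonGaugedFlows.csv"

    flows, years = [], []
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)                       # skip the header line
        for row in reader:
            cells = [c.strip() for c in row]
            if len(cells) < 4 or not any(cells):
                print("Warning: blank/short record - check trailing commas")
                continue
            flows.append(float(cells[0]))  # column 1: gauged flow (m3/s)
            years.append(int(cells[3]))    # column 4: year

    print(f"{len(flows)} records read, {min(years)}-{max(years)}")
    print(f"Largest flow: {max(flows):.2f} m3/s")  # expect 12,525.66 here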

         

Figure 6: View Gauged Values in Text Editor           

The next step is to configure the import of the data. As the example data has a header, the first line needs to be skipped: enter 1 into the Skip first __ records and then text field. Ensure that the Read to the end-of-file option is selected (this is the default). Occasionally there may be a need to specify how many records to read, in which case this can be achieved by selecting the Read next __ records option and entering the desired number of records to read. Next, specify which column the flood data are in by filling the Gauged values are in column __ text box; for the example data this is column 1. Next, select the Years available in column __ text box and specify the column that this data is in (column 4). Finally, select OK to import the data. The Import gauged values window should look similar to Figure 5.

The Value and Year columns in the Observed values tab will now be filled with the data in the order that they were in the data file as shown in Figure 7. The data can be sorted by value and year using the Rank button. Selecting this button will open a new window (Figure 8) where there are five choices to rank by, these are:       

  • Descending flow: Ranks the data by value from largest to smallest
  • Ascending flow: Ranks the data by value from smallest to largest
  • Descending year: Ranks the data by year from latest to earliest
  • Ascending year: Ranks the data by year from earliest to latest
  • Leave unchanged: Leaves both the values and years unchanged

It is always a good idea to initially rank your data in descending order so you can check the largest flows. For this data series the largest value is 12,525.66 m3/s. Leave the data ranked in descending order for this example.

Note that the value name and units can be specified by entering values in the Value name and Unit text boxes. These titles do not affect the computations in any way; they do, however, assist in reviewing the results, particularly when presenting results to external audiences.

         

Figure 7: Observed Values screen with Imported Data           

         

Figure 8: Rank Data Screen           

Configure the distribution and fit method           

Now that the data has been imported the statistical distribution can be fitted to the data. To do this, select the General tab. As noted above, for this example the Log Pearson Type III distribution will be fitted using the Bayesian Inference method.           

Before configuring the model it is worthwhile checking that Flike has interpreted the data correctly. The number of observed data is reported in the Number of observed data text box. In this case the number of observations or length of the data series is 31 as shown in Figure 9. Before continuing, check that this is the case.           

Next, select the probability model; the Log Pearson Type III. To do this ensure that the radio button next to the text Log Pearson Type III (LP3) is selected (this is the default) as in Figure 9.           

The final task is to choose the fitting method. In this example the Bayesian Inference method will be used. To do this, ensure that the radio button next to Bayesian with is selected and the radio button next to No prior information is selected as shown in Figure 9. Again, both of these are the defaults.           

         

Figure 9: General Screen - After Data Import

Running Flike and accessing Results

Flike presents the results in two ways:

  • As a plot; and   
  • In a report file.    

Both of these will be explored in this example and both should be consulted when undertaking a Flood Frequency Analysis. Before proceeding with this example, the extent of the x-axis in the plot needs to be specified; that is, the lowest probability (rarest event) to be displayed. It is recommended to always enter a value greater than the 1-in-Y AEP event that you are interested in. This is specified in the Maximum AEP 1-in-Y in probability plot ___ years text box. In this example, enter the 1-in-200 year AEP event as shown in Figure 9. By default the plot window automatically launches when a distribution is fitted.

In addition to the plot window a report file can also be automatically launched in a text editor. This can be quite helpful when you are developing a model, as it allows you to more readily compare the results. To do this select the appropriate radio button next to Always display report file as shown in Figure 9.           

Run Flike           

Now that the data has been imported, the distribution selected, the fit method configured and the output configured Flike is ready to run. To fit the model select OK on the General tab and this will return you to the Flike window, which will look quite empty as in Figure 10. In this window, select the Option dropdown menu and choose Fit model. This will run Flike and present you with a Probability Plot as well as opening the Report File in a text editor.           

         

Figure 10: Blank Flike Screen

Reviewing the results           

When Flike has finished fitting the distribution to the input data, a plot screen will appear similar to Figure 11 and the results file will be shown in the default text editor as in Figure 12.

         

Figure 11: Probability Plot

         

Figure 12: Results File

When fitting a flood series to a probability distribution it is essential that the results are viewed and reviewed. This is most easily achieved by first viewing the results in the Probability Plot. If the Probability Plot window has been closed, it can be reopened by selecting the Option dropdown menu and then View plot. The plot contains information about the fit as well as the quantile values and confidence limits. Within the plot window the y-axis contains information on discharge (or log discharge depending on the Plot scale selected) and the x-axis displays the Annual Exceedance Probability (AEP) in terms of 1 in Y years. The plot displays the:

  • Log normal probability plot of the gauged flows with plotting position determined using the Cunnane plotting position, shown as blue triangles;               
  • X% AEP quantile curve (derived using the posterior mean parameters), shown as a black line;               
  • 90% quantile confidence limits shown as dashed pink lines; and               
  • The expected probability quantile, shown as a red line.   
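The Cunnane plotting position used for the gauged flows is a standard formula and can be reproduced outside Flike. A minimal sketch, using the general form of the formula rather than anything specific to Flike:

    # Cunnane plotting position: for flows ranked in descending order,
    # rank i = 1 is the largest flood and p is its AEP.
    def cunnane_aep(rank, n, a=0.4):
        """Plotting position (i - a) / (n + 1 - 2a); a = 0.4 is Cunnane."""
        return (rank - a) / (n + 1 - 2 * a)

    n = 31                                # Singleton record length
    for rank in (1, 2, 31):
        aep = cunnane_aep(rank, n)
        print(f"rank {rank:2d}: AEP = {aep:.3f} (about 1 in {1 / aep:.0f})")
    # The largest flood (12,525.66 m3/s in 1955) plots at about 1 in 52.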

For the data contained in this example the resulting plot displays a good fit to the gauged data and appears to have tight confidence limits, with all gauged data points falling within the 90% confidence limits; by default the figure plots the logarithm of the flood peaks. The plot can be rescaled to remove the log from the flow values: select the Plot scale button and choose one of the non-log options, that is, either Gumbel or Exponential, and the uncertainty changes as in Figure 13. This presents a more sobering perspective on the model fit, with the confidence limits appearing much larger for rarer flood quantiles. This can be confirmed by reviewing the results in the report file. Table 1 presents a subset of the results found in the report file: selected X% AEP quantiles qY and their 90% confidence limits. For example, for the 1% AEP flood, the 5% and 95% confidence limits are respectively 37% and 546% of the quantile qY! The 0.2% AEP confidence limits are so wide as to render estimation meaningless. Note the expected AEP for the quantile qY consistently exceeds the nominal X% AEP; for example, the 1% (1 in 100) AEP quantile of 19,572 m3/s has an expected AEP of 1.36% (1 in 74).

         

Figure 13: Probability Plot using Gumbel Scale           

Table 1: Selected Results

AEP (%)   Quantile estimate qY   5% confidence limit   95% confidence limit   Expected AEP (%) for qY
10%       3,929                  2,229                 8,408                  10.1%
2%        12,786                 5,502                 51,010                 2.32%
1%        19,572                 7,188                 107,122                1.36%
0.2%      47,034                 11,507                570,635                0.48%
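The confidence-limit percentages quoted in the text follow directly from Table 1 and can be verified with a few lines of arithmetic:

    # Check the quoted confidence-limit ratios against Table 1.
    table1 = {            # AEP %: (quantile qY, 5% limit, 95% limit)
        1.0: (19_572, 7_188, 107_122),
        0.2: (47_034, 11_507, 570_635),
    }
    for aep, (q, lo, hi) in table1.items():
        print(f"{aep}% AEP: limits are {lo / q:.0%} and {hi / q:.0%} of qY")
    # 1% AEP: limits are 37% and 547% of qY
    # (547% vs the quoted 546% is rounding of the tabulated values)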

Table 2: Gauged flows on the Hunter River at Singleton

1938 76.26 1946 1374.42 1954 1391.43 1962 2125.4
1939 171.87 1947 280.18 1955 12525.66 1963 966.35
1940 218.21 1948 202.62 1956 1099.54 1964 2751.68
1941 668.79 1949 4052.42 1957 447.75 1965 49.03
1942 1374.42 1950 2323.77 1958 478.92 1966 76.51
1943 124.12 1951 2536.31 1959 180.52 1967 912.5
1944 276.3 1952 3315.62 1960 164.36 1968 926.67  
1945 895.5 1953 1232.73 1961 229.54    
         
           

Example 4: Use of binomial censored historical data           

This example is a continuation of Example 3 and it examines the benefit of using historical flood information. In the previous example the gauged record spanned the period 1938 to 1968. The biggest flood in that record occurred in 1955 with a discharge of 12,526 m3/s. An examination of historic records indicates that during the ungauged period 1820 to 1937 there was only one flood that exceeded the 1955 flood, and that this flood occurred in 1820. The information for the 1820 flood is not from a stream gauge; rather, it comes from a variety of sources including newspaper articles. This information is valuable, perhaps the most valuable of all, even though the magnitude of the 1820 flood is not reliably known, and it can be incorporated into a Bayesian approach. The way this is done in Flike is through censoring data.

From the information about the flood history at Singleton we can make the following conclusions:

  • Over the ungauged period 1820 to 1937 there was:
    • One flood above the 1955 flood; and
    • 117 floods below the 1955 flood.

Note that the ungauged record length is 118 years, that is, all years from 1820 to 1937 are included as it is assumed each year has an event. Also, note that the ungauged period cannot overlap with the gauged period.

           

Launch Flike           

As in Example 3 launch Flike; however, this time open the .fld file previously created: Example_3.fld. This file will be used as it contains the data that is needed for this example.  To do this choose the File dropdown menu and then Open.  Navigate to the Example_3.fld in the next dialogue box and open the file.  The Flike Editor window will appear containing all the information from Example 3.           

Save Example_4.fld           

The next step is to save the Example_4.fld file as a new file.  It is best to do this immediately to ensure that no data is overwritten.  To do this, select OK from the Flike Editor window and control will return to the main Flike window.  Select File again and then Save as.  Save the file as Example_4.fld in a new folder called Example 4.           

Enter Historical Flood Information           

In this step the historical flood information is entered. To edit the Example_4.fld data from the Flike window select Options and then Edit data. This reopens the Flike Editor window. Now select the Censoring of observed values tab and this will open a window similar to Figure 1 with no data.           

         

Figure 1: Censoring observed values tab           

The historical data needs to be entered into the Censoring of observed values tab; that is, we need to let Flike know that there has been one flood greater than the 1955 flood between 1820 and 1937. So:

  • The Threshold value is 12,526 m3/s, the size of the 1955 flood;
  • The Years greater than the threshold (Yrs > threshold) is one (1), being the 1820 flood;
  • The Years less than or equal to the threshold (Yrs <= threshold) is 117, as there were 117 years between 1820 and 1937 with floods less than or equal to the 1955 flood;
  • The Start Year is 1820; and
  • The End Year is 1937.
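For readers interested in the mechanics, the historic period enters the Bayesian likelihood as a binomial term: one year out of 118 exceeded the threshold and 117 did not. The sketch below illustrates that contribution for an LP3 model, treating loge flows as Pearson III; it is an illustration of the idea only, not Flike's implementation, and the parameter values in the last line are placeholders.

    # Illustration (not Flike's code) of the binomial-censored contribution
    # of an ungauged historic period to the LP3 log-likelihood.
    import numpy as np
    from scipy.stats import pearson3

    def censored_loglik(threshold, yrs_above, yrs_below, m, s, g):
        """F = P(annual max <= threshold) under LP3 with mean m, std dev s
        and skew g of loge flows; binomial term for the historic period."""
        F = pearson3.cdf(np.log(threshold), skew=g, loc=m, scale=s)
        return yrs_above * np.log(1.0 - F) + yrs_below * np.log(F)

    # Singleton historic information: one exceedance of the 1955 flood
    # (12,526 m3/s) in the 118 ungauged years 1820 to 1937. The LP3
    # parameter values below are placeholders, not fitted results.
    print(censored_loglik(12_526, 1, 117, m=6.4, s=1.2, g=0.0))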

Once the data has been entered, select OK, which will return you to the main Flike window. Flike performs some checks of the data to ensure that it has been entered correctly. However, these are only checks and it is up to the user to ensure they have correctly configured the historic censoring.

Return to the General tab by selecting Options and then Edit data and it should appear as in Figure 2. Note the Number of censoring thresholds text field has been populated with the number 1, so Flike has recognised that censoring has been configured.

As with the previous example, check that the Always display report file radio button has been selected.

         

Figure 2: Configured Flike Editor           

Run Flike with Historic Censoring Data           

On the General tab select OK and return to the Flike window. As in the previous exercise select Option and then Fit model. This will run Flike and, when the engine has finished, the Probability Plot will open together with the Report File.

Results           

Table 1 presents the posterior mean, standard deviation and correlation for the Log Pearson Type 3 parameters m, loges and g, which are respectively the mean, standard deviation and skewness of loge(q), taken from the Report File. Comparison with Example 3 reveals the censored data have reduced the uncertainty in the skewness (g) parameter by almost 17%. This parameter controls the shape of the distribution, particularly in the tail region where the floods of interest are.

Table 1: Posterior mean, standard deviation and correlation for the LP3

LP3 parameter   Mean     Std Dev   Correlation (m, loges, g)
m                6.365   0.237      1.000
loges            0.303   0.120     -0.236   1.000
g               -0.004   0.405     -0.227  -0.409   1.000

The resulting Probability plot is shown in Figure 3. This figure displays on a log normal probability plot the gauged flows, the X% AEP quantile curve (derived using the posterior mean parameters), the 90% quantile confidence limits and the expected probability curve. Compared with Example 3 the tightening of the confidence limits is noticeable.           

         

Figure 3: Probability plot of the Singleton data with historic information           

The following table (Table 2) of selected 1 in Y AEP quantiles qY and their 90% confidence limits illustrates the benefit of the information contained in the historic data. For example, for the 1% AEP flood the 5% and 95% confidence limits are respectively 58% and 205% of the quantile qY! This represents a major reduction in quantile uncertainty compared with Example 3, which yielded limits of 37% and 546%. This is illustrated graphically in Figure 4.

Table 2: Comparison of selected quantiles with 90% confidence limits

AEP (%)   Quantile estimate qY   5% confidence limit   95% confidence limit   Expected AEP (%) for qY
10%       3,294                  2,181                 4,947                  10.37%
2%        9,350                  5,778                 16,511                 2.09%
1%        13,511                 7,785                 27,678                 1.08%
0.2%      28,452                 12,966                85,583                 0.28%

Note that the Report File presents the Expected AEP as 1 in Y years, whereas Table 2 presents the Expected AEP as a percentage.

This example highlights the significant reductions in uncertainty that historical data can offer. However, care must be exercised to ensure the integrity of the historic information – see Section 2.3.8 of ARR for more details.           

         

Figure 4: Probability plot of the Singleton data with historic information

Example 5:  Use of regional information           

In this example the use of regional parameter information is explored, building on Example 3. As was shown in Example 3, there was significant uncertainty in the skewness parameter: the posterior mean of the skewness was estimated to be 0.131 with a posterior standard deviation of 0.479. This led to significant uncertainty in the quantile estimates; for instance, the 5% and 95% confidence limits for the 1% AEP quantile were 37% and 546% respectively of the 1% AEP quantile. This example shows how the use of regional information can reduce, sometimes significantly, the uncertainty of quantile estimates. Details on the use of regional information can be found in Sections 2.3.10 and 2.6.3.5 of ARR Book 3.

In this hypothetical example, a regional analysis of skewness has been conducted and the expected regional skew was found to be 0.00 with a standard deviation of 0.30. This information can be incorporated into the Bayesian analysis undertaken by Flike as shown in this example.           

Obtain Prior Information   

If the Log Pearson 3 distribution has been selected in Flike, the option to import the prior information from the ARR Regional Flood Frequency Estimation method is available (http://rffe.arr.org.au/).

There is no need to source data from http://rffe.arr.org.au/ for this example. Manually entered data will be applied. For future reference however, Flike does include an import function for the RFFE data (refer to Figure 2).

Launch Flike           

The Singleton data from the previous examples will be used in this example, so as in Example 4 launch Flike and open the .fld file created in Example 3. Save the opened .fld as, say, Example_5.fld.    

Enter Prior Information           

The next step will be to enter the prior information, that is the regional information on skew. To do this, select Edit data from the Options menu. As before, this opens the Flike Editor. To enter the prior regional information, check the Gaussian prior distributions radio button and then click on the Edit button as shown in Figure 1. This will open the Prior for Log-Pearson 3 window as shown in Figure 2.           

The regional skewness (0.00) is entered into the Mean Skew of log Q text box and the standard deviation of the regional skew (0.300) is entered into the Standard Deviation Skew of log Q text box, as shown in Figure 2. Note that in practice careful attention to the units being used is required.

Very large prior standard deviations are assigned to the Mean of log Q and Standard deviation of log Q parameters to ensure there is no prior information about these parameters.
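In Bayesian terms this amounts to a Gaussian prior on the skew, centred on 0.00 with standard deviation 0.30, alongside effectively flat (very wide Gaussian) priors on the other two parameters. A conceptual sketch, not Flike's code:

    # Conceptual sketch of the Gaussian priors used in this example.
    from scipy.stats import norm

    def log_prior(m, s_log, g):
        """Log prior for the LP3 parameters: regional skew ~ N(0.00, 0.30),
        effectively no prior information on the other two parameters."""
        return (norm.logpdf(m, loc=0.0, scale=1e6)        # mean log Q: vague
                + norm.logpdf(s_log, loc=0.0, scale=1e6)  # std dev log Q: vague
                + norm.logpdf(g, loc=0.00, scale=0.30))   # regional skew

    # log posterior = log likelihood of the gauged data + log_prior(m, s_log, g)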

Select OK to return to the Flike editor window.           

         

Figure 1: Gaussian prior distributions           

         

Figure 2: Prior for Log-Pearson 3 window           

Run Flike with Regional Information           

As in the previous examples select OK from the Flike Editor window to return to the main Flike window, and select Fit model from the Option menu to run Flike. This should result in a Probability plot as shown in Figure 3.

         

Figure 3: Probability plot with prior regional information

Figure 3 presents the probability plot for the LP3 model fitted to the gauged data with prior information on the skewness. Comparison of the result from this example with the results from Example 3 (see Figure 4) reveals substantially reduced uncertainty in the right hand tail.           

         

Figure 4:  Comparison between the results from Example 3 and Example 5           

Table 1 presents the posterior moments for the LP3 distribution fitted with and without prior information on skewness. The posterior uncertainty of the skewness (std dev = 0.261) is about 87% of the prior standard deviation (0.300), indicating the gauged data are not very informative about the shape parameter of the flood distribution (represented by the skew of loge Q; see Section 2.4.2.2 of ARR Book 3).

Table 1: Comparison of LP3 parameters with and without prior information           

                No Prior Information      With Prior Information
LP3 parameter   Mean      Std Dev         Mean      Std Dev
m               6.433     0.262           6.421     0.251
loges           0.353     0.144           0.320     0.131
g               0.131     0.479           0.019     0.261

Table 2 presents selected AEP quantiles qY and their 90% confidence limits. This table further illustrates the benefit of incorporating regional information. For example, for the 1% AEP flood the 5% and 95% confidence limits are respectively 37% and 546% of the quantile q1% when no prior information is used. These limits are reduced to 46% and 292% respectively when prior regional information is used.

Table 2: Comparison of quantiles with and without prior information

AEP (%)   No Prior Information                     With Prior Information
          Quantile qY   5% limit    95% limit      Quantile qY   5% limit    95% limit
10%       3,929         2,229       8,408          3,598         2,172       6,702
2%        12,786        5,502       51,010         10,535        5,310       26,633
1%        19,572        7,188       107,122        15,413        7,093       45,087
0.2%      47,034        11,507      570,635        33,365        12,244      134,107
       
       
           

Example 6: Censoring PILFs using multiple Grubbs-Beck test

In many Australian watercourses there are often years in which there are no floods. The annual maxima from those years are not representative of the population of floods and can unduly influence the fit of the distribution, as discussed in Section 2.6.3.9 of ARR Book 3. These flow values are referred to as Potentially Influential Low Flows (PILFs). It is recommended that in all flood frequency analyses the removal of these flows is investigated using the multiple Grubbs-Beck test to identify PILFs. The following example is taken from Pedruco et al. (2013) using data provided by the Wimmera Catchment Management Authority. The table at the end of this example lists 56 years of Annual Maximum discharges for the Wimmera River at Glynwylln. This data is included in the Flike download and was installed in the data folder in the install location of Flike. This location will be something similar to C:\TUFLOW Flike\data\wimmeraGaugedFlows.csv.

This example will examine the influence of PILFs and demonstrate how to use the multiple Grubbs-Beck test to safely remove them from the flood frequency analysis.           

Launch Flike and Import Data           

As in Example 3 launch Flike and create a new .fld file. Save the opened .fld as, say, Example_6.fld. Import the Wimmera River data in the same way that the Singleton data was imported, ensuring that the structure of the data has been checked using the View button. The records start in the second row (skip the first), the Years are in column 1 and the Gauged values are in column 2. Configure the import options and import the data.

Once this has been done and the Gauged values have been ranked in descending order the Flike Editor window should look like Figure 1.           

         

Figure 1: Flike editor window with Wimmera data.           

Fit Distribution           

The Wimmera data will be fitted to a Generalised Extreme Value (GEV) distribution. To do this, return to the Flike Editor General tab and ensure that the following settings have been chosen:              

  • Bayesian inference method with No prior information;               
  • The GEV probability model; and               
  • The Maximum AEP is set to 200 years.

Once these settings have been selected, select OK and run Flike in the usual way.

Initial Results           

When Flike has run, a new probability plot window will open. The plot will not initially look like Figure 2: to get a better view of the distribution's fit, change the plot scale from a Gumbel to a Gumbel-log plot scale using the Plot Scale button, and rescale the y-axis using the Rescale button to a minimum of 0.0 and a maximum of 4.0.

In Figure 2, the fit to the right-hand tail is not satisfactory. The expected quantiles are significantly greater than the gauged data; further, the three largest data points fall outside the lower 90% confidence limit.

         

Figure 2: Initial probability plot for Wimmera data with GEV           

Multiple Grubbs-Beck test           

The fit of the distribution can be improved by removing PILFs. In Flike this can be done using the multiple Grubbs-Beck test: return to the Flike Editor window and select the Censor button. Flike will run the multiple Grubbs-Beck test on the Wimmera data and, when finished, will display a window similar to the one shown in Figure 3. The multiple Grubbs-Beck test has detected 27 possible PILFs; select Yes to censor them.

         

Figure 3: Results of the multiple Grubbs-Beck test           
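The test statistic itself is not displayed by Flike, but the flavour of the screening can be illustrated. The sketch below repeatedly applies the classical single Grubbs-Beck low-outlier test (the Bulletin 17B critical value at the 10% significance level); the multiple Grubbs-Beck test that Flike uses examines each low order statistic individually, so this is an approximation of the idea, not a reimplementation.

    # Illustrative Grubbs-Beck style PILF screen (not Flike's multiple
    # Grubbs-Beck implementation).
    import numpy as np

    def grubbs_beck_bound(flows):
        """Classical one-sided 10%-significance low-outlier bound."""
        y = np.log10(np.sort(np.asarray(flows, dtype=float)))
        n = len(y)
        k10 = -0.9043 + 3.345 * np.sqrt(np.log10(n)) - 0.4046 * np.log10(n)
        return 10 ** (y.mean() - k10 * y.std(ddof=1))

    def screen_pilfs(flows):
        """Drop flows below the bound and retest until none remain below."""
        flows = np.sort(np.asarray(flows, dtype=float))
        while True:
            bound = grubbs_beck_bound(flows)
            keep = flows >= bound
            if keep.all():
                return flows, bound
            flows = flows[keep]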

On agreeing to censor these flows, Flike automatically makes two changes to the inference setup:

  1. The 27 lowest discharges are excluded from the calibration.
  2. A censoring threshold is added and populated with the information that there are 27 Annual Maximum discharges that lie below the threshold of 54.396 m3/s, which corresponds to the 28th smallest discharge.

These are further explained below.           

Excluded Data           

The exclusion of the lowest 27 discharges can be seen in the Observed values tab of the Flike Editor, as shown in Figure 4. In this tab all the values below the threshold have the Exclude check box crossed; this can be seen by scrolling down the window or by re-ranking the data in ascending order. If you have re-ranked the data in ascending order, re-rank it back into descending order.

         

Figure 4: Excluded gauged values           

Censoring Threshold           

The addition of the censoring threshold appears in the Censoring of observed values tab of the Flike Editor, as shown in Figure 5. The Threshold value (54.396 m3/s) has been automatically populated, together with the number of years greater than the threshold (0). The number of years less than the threshold (27) has also been populated. This tells Flike that the 27 years of discharges less than the threshold are being censored; that is, the gauged values are not considered, but their frequency is. The Start year and End year are also populated with dummy year ranges beginning 1000BC. This is done to satisfy an automatic check in Flike designed to assist in the entry of historic data.

         

Figure 5: Censoring of observed values           

Results using multiple Grubbs-Beck test           

Return to the main Flike window and run Flike by selecting Fit model. As usual, a Probability plot window will automatically appear; as for the initial results, change the plot scale to Gumbel-log and rescale the y-axis to a minimum of 0.0 and a maximum of 4.0. The resulting plot will look like Figure 6.

A comparison of Figure 2 and Figure 6 shows the improved fit: in Figure 6 all of the gauged data points fall within the 90% confidence limits. Further, censoring the PILFs using the multiple Grubbs-Beck test has significantly altered the quantile estimates and reduced the confidence limits, as shown in Table 1. For instance, the quantile q1% when PILFs are excluded is around 21% of the initial estimate. The lower and upper confidence limits have been considerably reduced: initially they were 30% and 500% of the quantile q1%, and following the removal of PILFs they became 68% and 220% of the quantile q1%.

         

Figure 6:  GEV fit - 56 years AM of gauged discharge - Using multiple Grubbs-Beck test           

Table 1: Comparison of Flood Quantiles with and without PILFs

AEP (%)   No removal of PILFs                      Removal of PILFs
          Quantile qY   5% limit    95% limit      Quantile qY   5% limit    95% limit
10%       286           172         578            227           177         311
2%        1,315         493         4,975          423           304         785
1%        2,481         737         12,398         521           354         1,145
0.2%      10,696        1,802       101,034        789           448         2,813
           

Table 2: Annual Maximum data for the Wimmera River at Glynwylln

464.35 167.72 119.63 71.4 32.18 14.16 8.52
395.65 155.22 110.56 69.67 25.91 12.64 3.22
285.92 147.00 102.62 67.49 24.83 11.90 2.28
278.01 143.99 97.32 61.64 23.95 11.79 2.13
235.22 143.62 96.78 54.40 22.76 11.41 1.90
211.91 142.66 87.98 38.62 19.04 10.80 1.43
173.79 134.36 79.15 36.62 17.37 10.31 1.16
170.13 123.8 77.03 34.07 14.87 10.08 0.01
       
       
           

Example 7: Improving poor fits using censoring of low discharge data           

The standard probability models such as GEV and LP3 may not adequately fit flood data for a variety of reasons, for example Potentially Influential Low Flows (PILFs). In this example censoring is used to remove low discharge data and improve the fit of the distribution to the data.

Often the poor fit of a distribution is associated with a sigmoidal probability plot, as illustrated in Figure 1. In such cases a four- or five-parameter distribution, which has sufficient degrees of freedom to track the data in both the upper and lower tails of the sigmoidal curve, can be used. Alternatively, a calibration approach that gives less weight to smaller floods can be adopted; the second approach is used in this example.

         

Figure 1: Bayesian fit to all gauged data Gumbel probability plot           

Launch Flike and Import Data           

As in previous examples launch Flike, create a new .fld file and import the Albert River at Bromfleet data (albertRvGaugedFlows.txt), which was included in the Flike install in the data directory. Note the structure of this file and configure the Import gauged values window accordingly. The Albert River at Bromfleet data is included at the end of this example.

Fit GEV Distribution           

To recreate Figure 1, fit a GEV distribution to the Albert River data, accepting the defaults in the General tab of the Flike Editor. The plot in Figure 1 can be recreated by changing the plot scale to Gumbel and rescaling the y-axis to 0 and 4,000.

Figure 1 displays the GEV Bayesian fit on a Gumbel probability plot. Although the observed floods are largely contained within the 90% confidence limits, the fit, nonetheless, is poor – the data exhibit a sigmoidal trend with reverse curvature developing for floods with an AEP greater than 50%. It appears that the confidence limits have been inflated because the GEV fit represents a poor compromise.           

Use multiple Grubbs Beck test to improve fit           

The first step in improving the poor fit of this data is to use the multiple Grubbs-Beck test to remove PILFs. Repeat the procedure outlined in the previous example. This will result in the censoring of 5 data points with a threshold of 36.509 m3/s.

Now run Flike and fit the model. Changing the plot scale and rescaling the y-axis as above will result in Figure 2.

Figure 2 displays the fit after censoring the 5 low outliers identified by the multiple Grubbs-Beck test. The improvement in fit is marginal at best over Figure 1.           

         

Figure 2: Bayesian fit with 5 low outliers censored after application of multiple Grubbs-Beck test           

Trial and error approach           

To deal with this poor fit, a trial-and-error approach to selecting the threshold discharge for censoring low flows can be used to obtain a fit that favours the right-hand tail of the distribution. This involves testing different threshold values until an acceptable fit is produced; a helper sketch for this bookkeeping follows the list below. Figure 3 illustrates one such fit. To de-emphasise the left-hand tail, the floods below a threshold of 250 m3/s were censored. This means the GEV distribution was fitted to:

  • A gauged record consisting of the 27 floods above 250m3/s; and               
  • A censored record consisting of 23 floods below the threshold of 250m3/s and 0 floods above this threshold.   
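A small helper can take the tedium out of the threshold search by reporting, for each candidate, how many floods stay in the gauged record and how many are censored. This is bookkeeping only, not part of Flike; with the 50 Albert River annual maxima, a 250 m3/s threshold gives the 27/23 split described above.

    # Bookkeeping sketch for the trial-and-error threshold search.
    def threshold_split(flows, thresholds):
        for t in thresholds:
            above = sum(q >= t for q in flows)
            print(f"threshold {t:>6.1f} m3/s: "
                  f"{above} gauged, {len(flows) - above} censored")

    # threshold_split(albert_flows, [100, 250, 400]) shows that 250 m3/s
    # keeps 27 floods and censors 23, as used in this example.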

To do this in Flike there are two steps, as in Example 6, these are:              

  • Exclude the flows below 250m3/s               
  • Create a censoring threshold   

This is essentially the same process that was undertaken to exclude flows in Example 6, except it needs to be done manually. This is outlined below.

Excluded Data           

The flows below 250 m3/s need to be excluded from the analysis. To do this select the Observed values tab of the Flike Editor and choose the Block exclude button. Enter 250 into the Value below which values are to be excluded text box and select OK. This will exclude all values below 250 m3/s, which can be confirmed by scrolling down the table in the Observed values tab.

Censoring Threshold           

As in the previous example a censoring threshold needs to be entered into the Censoring of observed values tab.  Populate the tab with the following information:            

  • Threshold value: 250               
  • Years greater than threshold (Yrs > threshold): 0               
  • Years less than or equal to threshold (Yrs <= threshold): 23               
  • Start year: 1000               
  • End year: 1022       

Results of the trial and error approach           

Run Flike in the usual way and a Probability plot similar to Figure 3 will be obtained.           

The censored record provides an anchor point for the GEV distribution: it ensures that the chance of an Annual Maximum flood being less than 250 m3/s is about 23/50, without forcing the GEV to fit the peaks below the 250 m3/s threshold. The fit effectively disregards floods with a greater than 50% AEP and provides a good fit to the upper tail. Another benefit is the substantially reduced 90% confidence limits, which can be reviewed by examining the results file.

         

Figure 3:  Bayesian fit with floods below 250 m3/s threshold treated as censored observations           

Table 1: Annual Maximum data for the Albert River at Bromfleet

1765.92 1689.51 1652.72 1468.77 1364.06 1341.42 1327.27 1273.5
1214.07 1185.77 1177.28 1086.72 865.98 863.15 860.32 761.27
761.27 752.78 676.37 466.95 461.29 384.88 362.24 305.64
302.81 285.83 271.68 249.61 249.61 220.74 210.55 190.74
156.5 156.22 131.03 124.52 116.88 113.77 99.9 95.65
88.3 87.73 78.11 72.73 36.51 22.36 16.7 15.85
15.57 13.02              
                       

Example 9: L moments fit to gauged data           

This example illustrates fitting a GEV distribution to gauged data using L moments. L moments are a special case of LH moments where there is no shift (H=0). The procedure to use L moments to fit a distribution is set out in Section 2.6.4 of ARR Book 3. In this example Annual Maximum flood data for the Styx River at Jeogla will be fitted using L moments. The flood data are listed at the end of this example.           

The procedure for fitting distributions by L moments can be completed by hand, and also using Flike. Both of these techniques will be outlined in this example.         

L moments by Hand           

The first four L moments can be estimated by equations 3.2.41 to 3.2.44 of ARR Book 3; the first three are reported in Table 1. The GEV parameter estimates can be calculated by substituting the L moment estimates into the equations in Table 3.2.3 of ARR Book 3 to estimate τ, κ and α. The standard deviation and correlation were derived from 5,000 bootstrapped samples following the procedure described in Section 2.6.4.6 of ARR Book 3 and Box 7 (Section 2.9.7); note that the standard deviation and correlation cannot practically be calculated by hand.

Table 1: L moment and GEV parameter estimates

L moment   Estimate   GEV parameter   Estimate   Std dev   Correlation (τ, α, κ)
λ1         189.238    τ               100.660    17.550     1.000
λ2         92.476     α               104.157    15.534     0.584   1.000
λ3         29.264     κ               -0.219     0.132      0.341   0.269   1.000
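The hand calculation can also be scripted. The sketch below computes the sample L moments from probability-weighted moments and converts them to GEV parameters using Hosking's rational approximation for κ; it is a generic implementation, not Flike's code, and applied to the 47 Jeogla flows it reproduces the estimates in Table 1 to within rounding.

    # L moments via probability-weighted moments, then GEV parameters
    # via Hosking's rational approximation.
    import math

    def sample_l_moments(data):
        """First three sample L moments (lambda 1, 2, 3)."""
        x = sorted(data)                                  # ascending order
        n = len(x)
        b0 = sum(x) / n
        b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
        b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
        return b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0

    def gev_from_l_moments(lam1, lam2, lam3):
        """GEV (tau, alpha, kappa) from L moments, Hosking approximation."""
        t3 = lam3 / lam2                                  # L skewness
        c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
        kappa = 7.8590 * c + 2.9554 * c ** 2
        alpha = lam2 * kappa / ((1.0 - 2.0 ** -kappa) * math.gamma(1.0 + kappa))
        tau = lam1 - alpha * (1.0 - math.gamma(1.0 + kappa)) / kappa
        return tau, alpha, kappa

    # With the 47 Jeogla annual maxima this gives lambda ~ (189.2, 92.5, 29.3)
    # and (tau, alpha, kappa) ~ (101, 104, -0.22), matching Table 1.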
           

L moments using Flike           

L moments and the distribution parameters can be estimated in Flike. To do this, create a new .fld file and import the Styx River at Jeogla data set. Return to the Flike Editor General tab. Now set the inference method to LH moments fit to observed values with and check the H=0 radio button. This last option sets the shift to 0 (i.e. L moments). The Flike Editor window should look like Figure 1. Run Flike and examine the results file for the L moments and GEV parameters.

         

Figure 1: Flike Editor configured for L moments           

The following table lists 47 ranked flows for the Styx River at Jeogla.

878 541 521 513 436 411 405 315
309 300 294 258 255 235 221 220
206 196 194 190 186 177 164 126
117 111 108 105 92.2 88.6 79.9 74
71.9 62.6 61.2 60.3 58 53.5 39.1 26.7
26.1 23.8 22.4 22.1 18.6 13.0 8.18  
       
       
           

Example 10: Improving poor fits using LH moments           

In Example 7 the fit of the distribution to the Albert River flood series was improved by censoring low flows. In this example, LH moments are used instead of censoring to improve the fit of the GEV distribution to the flood data.           

Launch Flike           

This example uses the same data as Example 7 for the Albert River at Bromfleet, so the previous Example_7.fld file can be used. To do this, launch Flike, open Example_7.fld and save the opened .fld as Example_10.fld. Open the Flike Editor to configure the LH moments fitting method. Note that the Example_7.fld file was configured with a Bayesian inference method.

Configure Inference Method           

In Example 7, a Bayesian inference method was used with censored low flows, so a number of changes are required to Example_7.fld before the LH moments inference method can be used. As low flows were censored in the previous example, these need to be included back into the analysis by:               

  • Removing the censoring threshold; and               
  • Including all the flood data.           

Ensure that the Bayesian with radio button is still checked; if the LH moments fit to observed values with radio button is checked, the Censoring of observed values tab cannot be accessed.

To remove the censoring threshold, select the Censoring of observed values tab and select the Clear all button.           

To include all the flood data, select the Observed values tab and select the Include all button. Scroll through the data to ensure that all the crosses (x) in the Exclude column have been removed.           

Fit L Moments           

To configure Flike to fit distributions using the LH moments inference method, return to the General tab and check the LH moments fit to observed values with radio button. In the first instance, select the H=0 radio button. This will fit a distribution using L moments, that is, LH moments with no shift.

Flike will only fit LH moments with H >= 1 for the GEV distribution; however, it will fit L moments (H = 0) for all distributions. Ensure that the GEV probability model has been selected.
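For reference, the effect of the shift H can be illustrated with direct sample estimators of the first two LH moments in the form given by Wang (1997); the weights below are assumed from that paper rather than taken from Flike, and H = 0 recovers the ordinary L moments.

    # Sample LH moments (Wang, 1997): as H increases, the binomial weights
    # vanish for the smallest order statistics, so the fit is driven by the
    # larger floods; H = 0 recovers the ordinary L moments.
    from math import comb

    def lh_moments(data, H):
        x = sorted(data)                                  # ascending order
        n = len(x)
        lam1 = sum(comb(i, H) * x[i] for i in range(n)) / comb(n, H + 1)
        lam2 = sum((comb(i, H + 1) - comb(i, H) * (n - 1 - i)) * x[i]
                   for i in range(n)) / (2 * comb(n, H + 2))
        return lam1, lam2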

The configured Flike Editor should look like Figure 1. Select OK and run Flike. As usual, a probability plot will appear together with the report file. Rescale the plot so it looks like Figure 2.           

         

Figure 1: Configured Flike Editor           

         

Figure 2: L moment fit - Albert River at Broomfleet           

Figure 2 displays the GEV L moment fit on a Gumbel probability plot. Although the observed floods are largely contained within the 90% confidence limits, the fit, nonetheless, is poor, with systematic departures from the data, which exhibit reverse curvature.

Fit LH Moments           

To deal with this poor fit, an LH moment search was conducted to find the optimal shift parameter using the procedure described in Section 2.6.4.5 of ARR. To do this in Flike, check the Optimized H radio button and run Flike. The results file reveals that the optimal shift was found to be 4. Figure 3 presents the LH moment fit with shift equal to 4. The fit effectively disregards floods more frequent than the 50% AEP (around 350 m3/s) and provides a very good fit to the upper tail.

         

Figure 3: LH moment fit with shift H=4           

The very significant reduction in the quantile confidence intervals is largely due to the shape parameter κ changing from -0.17 to 0.50. The L moment fit in Figure 2 was a compromise; most of the small and medium-sized floods suggested an upward curvature in the probability plot, which resulted in a negative GEV shape parameter (to enable upward curvature). In contrast, the LH moment fit favoured the large-sized floods, which exhibit a downward curvature, resulting in a positive shape parameter. For positive κ the GEV has an upper bound; in this case the upper bound is about 2,070 m3/s, which is only 17% greater than the largest observed flood.

A comparison of the quantiles derived from the Bayesian inference method with censoring of PILFs and those determined using optimised LH moments is presented in Table 1. The two inference methods produce similar results in terms of the calculated quantiles; however, the confidence limits are smaller using the Bayesian framework. This highlights how LH moment results could be used to inform the selection of the censoring threshold for PILFs in the Bayesian framework.

Table 1: Comparison of Quantiles using a Bayesian and LH Moments inference methods.

AEP (%)   Bayesian with removal of PILFs           Optimised LH Moments
          Quantile qY   5% limit    95% limit      Quantile qY   5% limit    95% limit
10%       1,400         1,249       1,590          1,406         1,133       1,634
2%        1,720         1,605       1,931          1,782         1,492       2,021
1%        1,782         1,675       2,003          1,868         1,546       2,168
0.2%      1,854         1,757       2,111          1,982         1,599       2,482