Within the ss3diags
package, there are 2 main diagnostic
functions useful for residual analysis:
ss3diags::SSplotRunstest()
and
ss3diags::SSplotJABBAres()
. It is suggested to implement
these diagnostics early and often in the model development process to
gain a better understanding of how your model is fitting to your
data.
Runs Test
The runs test is a nonparametric hypothesis test for randomness in a data sequence that calculates the 2-sided p-value to estimate the number of runs (i.e., sequences of values of the same sign) above and below a reference value. The runs test can diagnose model misspecification using residuals from fits to index and composition data (Carvalho et al. 2017) by testing if there are non-random patterns in the residuals. In addition, the three-sigma limits can be considered useful for identifying potential outliers as any data point further than three standard deviations away from the expected residual process average would be unlikely given a random process error in the observed residual distribution.
For this example we will be using the “simple_small” SS model
included within the r4ss
package. We can read in the report
file from the model output using the r4ss::SS_output()
function as shown below.
library(r4ss)
files_path <- system.file("extdata", "simple_small", package = "r4ss")
report <- SS_output(dir = files_path, verbose = FALSE, printstats = FALSE)
sspar(mfrow = c(1, 2))
SSplotRunstest(report, add = TRUE)
#> Running Runs Test Diagnostics w/ plots forIndex
#> Plotting Residual Runs Tests
#> Residual Runs Test (/w plot) stats by Index:
#> Index runs.p test sigma3.lo sigma3.hi type
#> 1 SURVEY1 0.159 Passed -0.7301441 0.7301441 cpue
#> 2 SURVEY2 0.887 Passed -1.1052361 1.1052361 cpue
The output for SSplotRunstest()
includes a plot of the
residuals by fleet and a table with the results from the runs test and
‘three-sigma limit’ values. In the plots above, the shaded area
represents the ‘three-sigma limit’, or three residual standard
deviations from zero. If any of the individual residual points fall
outside of the three-sigma limit, they are colored red. Green shaded
area indicates the residuals are randomly distributed (p-value >=
0.05) whereas red shaded area indicates the residuals are not randomly
distributed. Failing the runs test (p-value < 0.05) can be indicative
of some misspecification or conflict with the indices or composition
data.
Customizing the Plot
Runs test plots can be customized as needed. Some common features that you may want to specify are:
- plotting other data types (default is Index of Abundance)
- plotting specific fleet(s)
- adjusting the y-axis range
Examples of each of these changes are shown below, incrementally
making each adjustment. To plot other data types, they can be specified
with the subplots
argument, and the options include “cpue”,
“len”, “age”, “size”, or “con”. “con” is for conditional size-at-age
data and “size” is for generalized size composition data. Fleets can be
specified using the indexselect()
function, which takes a
vector of the fleet numbers to plot.
sspar(mfrow = c(2, 2))
SSplotRunstest(report, subplots = "len", indexselect = 1, add = TRUE)
#> Running Runs Test Diagnostics w/ plots forMean length
#> Plotting Residual Runs Tests
#> Residual Runs Test (/w plot) stats by Mean length:
#> Index runs.p test sigma3.lo sigma3.hi type
#> 1 FISHERY 0.126 Passed -0.0613176 0.0613176 len
SSplotRunstest(report, subplots = "age", indexselect = 2, add = TRUE)
#> Running Runs Test Diagnostics w/ plots forMean age
#> Plotting Residual Runs Tests
#> Residual Runs Test (/w plot) stats by Mean age:
#> Index runs.p test sigma3.lo sigma3.hi type
#> 1 SURVEY1 0.11 Passed -0.2743816 0.2743816 age
SSplotRunstest(report, subplots = "age", indexselect = 2, add = TRUE, ylim = c(-0.5, 0.5))
#> Running Runs Test Diagnostics w/ plots forMean age
#> Plotting Residual Runs Tests
#> Residual Runs Test (/w plot) stats by Mean age:
#> Index runs.p test sigma3.lo sigma3.hi type
#> 1 SURVEY1 0.11 Passed -0.2743816 0.2743816 age
SSplotRunstest(report, subplots = "age", indexselect = 2, add = TRUE, ylim = c(-0.5, 0.5), ylimAdj = 1)
#> Running Runs Test Diagnostics w/ plots forMean age
#> Plotting Residual Runs Tests
#> Residual Runs Test (/w plot) stats by Mean age:
#> Index runs.p test sigma3.lo sigma3.hi type
#> 1 SURVEY1 0.11 Passed -0.2743816 0.2743816 age
Note that a small adjustment is automatically added to the ylim to
provide extra white space. This is why in the third plot, even though we
set the y-axis to range from -0.5 to 0.5, the y-axis goes from -0.6 to
0.6. If you do not want the extra space, set ylimAdj = 1
as
done in the 4th plot.
Summary Table
In addition to the residual plots, SSplotRunstest()
produces a summary table of the runs test output values, including:
- p-value for the runs test
- if the test passed or failed (indicated by green or red shading in the plot)
- upper and lower limits for the 3-sigma interval
- type of data tested (CPUE, length-composition, age-composition, or conditional age-at-length)
To only produce the summary table and skip the plot, e.g. to
faciliate automated processing, use SSrunstest()
.
rcpue <- SSrunstest(report, quants = "cpue")
#> Running Runs Test Diagnosics for Index
#> Computing Residual Runs Tests
#> Residual Runs Test stats by Index:
rlen <- SSrunstest(report, quants = "len")
#> Running Runs Test Diagnosics for Mean length
#> Computing Residual Runs Tests
#> Residual Runs Test stats by Mean length:
rbind(rcpue, rlen)
#> Index runs.p test sigma3.lo sigma3.hi type
#> 1 SURVEY1 0.159 Passed -0.7301441 0.7301441 cpue
#> 2 SURVEY2 0.887 Passed -1.1052361 1.1052361 cpue
#> 3 FISHERY 0.126 Passed -0.0613176 0.0613176 len
#> 4 SURVEY1 0.841 Passed -0.1725225 0.1725225 len
RMSE
RMSE or root mean square error is useful for evaluating how far
predictions fall from the observed data. The
ss3diags::SSplotJABBAres()
function allows you to visualize
the full time series of residuals for all fleets of the indicated data
(index of abundance or composition). In the example below, we plot the
residuals for index of abundance, mean age (age-composition), and mean
length (length composition) for both fleets.
sspar(mfrow = c(2, 2))
SSplotJABBAres(report, subplots = "cpue", add = TRUE)
#> Plotting JABBA residual plot.
#> is plot TRUE? TRUE
#> drawing plot at
#> Plot exists:
#> RMSE stats by Index:
#> # A tibble: 3 × 3
#> Fleet RMSE.perc Nobs
#> <chr> <dbl> <int>
#> 1 SURVEY1 21.6 4
#> 2 SURVEY2 27.7 12
#> 3 Combined 26.3 16
SSplotJABBAres(report, subplots = "age", add = TRUE)
#> Plotting JABBA residual plot.
#> is plot TRUE? TRUE
#> drawing plot at
#> Plot exists:
#> RMSE stats by Index:
#> # A tibble: 3 × 3
#> Fleet RMSE.perc Nobs
#> <chr> <dbl> <int>
#> 1 FISHERY 9 12
#> 2 SURVEY1 8.1 4
#> 3 Combined 8.8 16
SSplotJABBAres(report, subplots = "len", add = TRUE, ylim = c(-0.2, 0.2))
#> Plotting JABBA residual plot.
#> is plot TRUE? TRUE
#> drawing plot at
#> Plot exists:
#> RMSE stats by Index:
#> # A tibble: 3 × 3
#> Fleet RMSE.perc Nobs
#> <chr> <dbl> <int>
#> 1 FISHERY 2.1 12
#> 2 SURVEY1 3.4 4
#> 3 Combined 2.5 16
In the plots above, each point represents the difference between the observed and expected value (residual) and is color-coded to correspond to a specific fleet. The solid, color-coded vertical lines from the horizontal dashed line (at y = 0) show the distance between 0 and the residual value. Box plots summarize the spread of residual values across all fleets within each year and the thick solid black line is a loess smoother fit through all residual points. Total RMSE is displayed in the top right corner. When assessing this plot, it is important to see if there are any patterns or trends in the residuals, how large the overall RMSE is, and whether the data from different fleets generally agrees or conflicts with each other.
To retrieve just RMSE and residual values without generating the
plot, SSrmse()
produces a named list with 2 objects,
RMSE
and residuals
. These can be accessed as
shown below. This function is useful for automating analyses.
rmse <- SSrmse(report, quants = "cpue")$RMSE
residuals <- SSrmse(report, quants = "cpue")$residuals
rmse
#> # A tibble: 3 × 3
#> Fleet RMSE.perc Nobs
#> <chr> <dbl> <int>
#> 1 SURVEY1 21.6 4
#> 2 SURVEY2 27.7 12
#> 3 Combined 26.3 16
head(residuals)
#> Fleet Fleet_name Area Yr Seas Subseas Month Time Vuln_bio Obs
#> 926 2 SURVEY1 1 2013 1 2 7 2013.5 62947.50 144745.0000
#> 927 2 SURVEY1 1 2016 1 2 7 2016.5 49335.70 63760.3000
#> 928 2 SURVEY1 1 2019 1 2 7 2019.5 38132.00 59242.9000
#> 929 2 SURVEY1 1 2022 1 2 7 2022.5 34045.80 49649.7000
#> 930 3 SURVEY2 1 2011 1 2 7 2011.5 6306.83 11.5668
#> 931 3 SURVEY2 1 2012 1 2 7 2012.5 6544.96 13.9955
#> Exp Calc_Q Eff_Q SE SE_input Dev Like
#> 926 101399.0000 1.61084000 1.61084000 0.3 0.3 0.35591400 7.03749e-01
#> 927 79472.2000 1.61084000 1.61084000 0.3 0.3 -0.22027600 2.69564e-01
#> 928 61424.7000 1.61084000 1.61084000 0.3 0.3 -0.03616590 7.26653e-03
#> 929 54842.4000 1.61084000 1.61084000 0.3 0.3 -0.09947210 5.49706e-02
#> 930 11.5342 0.00182885 0.00182885 0.7 0.7 0.00281832 8.10502e-06
#> 931 11.9698 0.00182885 0.00182885 0.7 0.7 0.15635300 2.49451e-02
#> Like+log(s) SuprPer Use residuals
#> 926 -0.500223 NA 1 0.355910344
#> 927 -0.934409 NA 1 -0.220276535
#> 928 -1.196710 NA 1 -0.036166093
#> 929 -1.149000 NA 1 -0.099471269
#> 930 -0.356667 NA 1 0.002822391
#> 931 -0.331730 NA 1 0.156349038