Skip to contents

Likelihood profiling is a key model diagnostic that helps identify the influence of information sources on model estimates. The number of recruits in an unfished population (R0) is a commonly profiled parameter since it dictates the absolute size of a fish population (serves as global scaling parameter). To conduct a profile, values of the parameter over a given range are fixed and the model re-run and then changes in total likelihood and data-component likelihoods are examined. It is recommended to run this throughout the development process, particularly after incorporating new data sets to understand how informative each component is on the estimation of R0 and if there is conflict between data sources.

Model inputs

To run a Stock Synthesis model, four input files are required: starter.ss, forecast.ss, control.ss, and data.ss. The input files for the example model can be found within the ss3diags package and accessed as shown below. Also, if you do not have the r4ss package installed, you will need to install it for this tutorial.

install.packages("pak")
pak::pkg_install("r4ss/r4ss")
library(r4ss)

files_path <- system.file("extdata", package = "ss3diags")
dir_tmp <- tempdir(check = TRUE)
dir_profile <- file.path(dir_tmp, "profile")
dir.create(dir_profile, showWarnings = FALSE, recursive = TRUE)
list.files(files_path)
#> [1] "control.ss"  "data.ss"     "forecast.ss" "starter.ss"
file.copy(from = list.files(files_path, full.names = TRUE), to = dir_tmp)
#> [1] TRUE TRUE TRUE TRUE

You will need to make sure you have the SS executable file either in your path or in the directory you are running the profile from (in this case dir_profile). An easy way to get the latest release of stock synthesis is to use the r4ss function get_ss3_exe().

r4ss::get_ss3_exe(dir = dir_tmp, version = "v3.30.21")
#> The stock synthesis executable for Linux v3.30.21 was downloaded to: /tmp/Rtmp5xX4CR/ss3

Running a Profile on R0

Once you have the four input files and SS executable file, you can run a likelihood profile as shown below. The first step is to run the model you would like to do a profile for. We will do this in dir_tmp and then copy the necessary files to dir_profile. It’s best to run the profile in a subdirectory of your main model run to keep the output files separate from other diagnostic tests you may run.

r4ss::run(dir = dir_tmp, exe = "ss3", verbose = FALSE)
#> [1] "ran model"
files <- c("data.ss", "control.ss_new", "starter.ss", "forecast.ss", "ss.par", "ss3")
file.copy(from = file.path(dir_tmp, files), to = dir_profile)
#> [1] TRUE TRUE TRUE TRUE TRUE TRUE

Once you have the input files in the dir_profile directory, you will need to create a vector of values to profile across. The range and increments to choose depend on your model and the resolution at which you want to analyze the likelihood profile. For this example we will use a fairly coarse resolution to speed up total run time.

CTL <- SS_readctl_3.30(file = file.path(dir_profile, "control.ss_new"), datlist = file.path(dir_profile, "data.ss"))

CTL$SR_parms
#>               LO HI   INIT PRIOR PR_SD PR_type PHASE env_var&link dev_link
#> SR_LN(R0)    6.0 12 8.9274  10.3  10.0       0     1            0        0
#> SR_BH_steep  0.2  1 0.8000   0.8   1.0       0    -4            0        0
#> SR_sigmaR    0.0  2 0.6000   0.8   0.8       0    -4            0        0
#> SR_regime   -5.0  5 0.0000   0.0   1.0       0    -4            0        0
#> SR_autocorr  0.0  0 0.0000   0.0   0.0       0   -99            0        0
#>             dev_minyr dev_maxyr dev_PH Block Block_Fxn
#> SR_LN(R0)           0         0      0     0         0
#> SR_BH_steep         0         0      0     0         0
#> SR_sigmaR           0         0      0     0         0
#> SR_regime           0         0      0     0         0
#> SR_autocorr         0         0      0     0         0
# getting the estimated r0 value
r0 <- CTL$SR_parms$INIT[1]
# creating a vector that is +/- 1 unit away from the estimated value in increments of 0.2
r0_vec <- seq(r0 - 1, r0 + 1, by = 0.2)
r0_vec
#>  [1] 7.9274 8.1274 8.3274 8.5274 8.7274 8.9274 9.1274 9.3274 9.5274 9.7274
#> [11] 9.9274

We also need to modify the starter file by changing the name of the control file that it will read from and making sure the likelihood is calculated for non-estimated quantities.

START <- SS_readstarter(file = file.path(dir_profile, "starter.ss"), verbose = FALSE)
START$prior_like <- 1
START$ctlfile <- "control_modified.ss"
SS_writestarter(START, dir = dir_profile, overwrite = TRUE, verbose = F)

To run the profile, use r4ss::profile() and you will need to specify a partial string of the name of the parameter you are profiling over (in this case “SR_LN” will match with “SR_LN(R0)”), and the vector of values to profile across. The newctlfile is the control file that will be adjusted with values from the profilevec and can be named anything you prefer, it just needs to match what you put in the starter file for “ctlfile”. Full documentation of the profile() function can be found on the r4ss website.

profile(
  dir = dir_profile,
  newctlfile = "control_modified.ss",
  string = "SR_LN",
  profilevec = r0_vec,
  exe = "ss3",
  verbose = FALSE
)
#> running profile i=1/11
#> Parameter names in control file matching input vector 
#> 'strings' (n=1): SR_LN(R0)
#> line numbers in control file (n=1): 120
#> Wrote new file to control_modified.ss with the following changes:
#>   oldvals newvals oldphase newphase oldlos newlos oldhis newhis oldprior
#> 1  8.9274  7.9274        1       -1      6      6     12     12     10.3
#>   newprior oldprsd newprsd oldprtype newprtype     comment
#> 1     10.3      10      10         0         0 # SR_LN(R0)
#> running profile i=2/11
#> Parameter names in control file matching input vector 
#> 'strings' (n=1): SR_LN(R0)
#> line numbers in control file (n=1): 120
#> Wrote new file to control_modified.ss with the following changes:
#>   oldvals newvals oldphase newphase oldlos newlos oldhis newhis oldprior
#> 1  7.9274  8.1274       -1       -1      6      6     12     12     10.3
#>   newprior oldprsd newprsd oldprtype newprtype     comment
#> 1     10.3      10      10         0         0 # SR_LN(R0)
#> running profile i=3/11
#> Parameter names in control file matching input vector 
#> 'strings' (n=1): SR_LN(R0)
#> line numbers in control file (n=1): 120
#> Wrote new file to control_modified.ss with the following changes:
#>   oldvals newvals oldphase newphase oldlos newlos oldhis newhis oldprior
#> 1  8.1274  8.3274       -1       -1      6      6     12     12     10.3
#>   newprior oldprsd newprsd oldprtype newprtype     comment
#> 1     10.3      10      10         0         0 # SR_LN(R0)
#> running profile i=4/11
#> Parameter names in control file matching input vector 
#> 'strings' (n=1): SR_LN(R0)
#> line numbers in control file (n=1): 120
#> Wrote new file to control_modified.ss with the following changes:
#>   oldvals newvals oldphase newphase oldlos newlos oldhis newhis oldprior
#> 1  8.3274  8.5274       -1       -1      6      6     12     12     10.3
#>   newprior oldprsd newprsd oldprtype newprtype     comment
#> 1     10.3      10      10         0         0 # SR_LN(R0)
#> running profile i=5/11
#> Parameter names in control file matching input vector 
#> 'strings' (n=1): SR_LN(R0)
#> line numbers in control file (n=1): 120
#> Wrote new file to control_modified.ss with the following changes:
#>   oldvals newvals oldphase newphase oldlos newlos oldhis newhis oldprior
#> 1  8.5274  8.7274       -1       -1      6      6     12     12     10.3
#>   newprior oldprsd newprsd oldprtype newprtype     comment
#> 1     10.3      10      10         0         0 # SR_LN(R0)
#> running profile i=6/11
#> Parameter names in control file matching input vector 
#> 'strings' (n=1): SR_LN(R0)
#> line numbers in control file (n=1): 120
#> Wrote new file to control_modified.ss with the following changes:
#>   oldvals newvals oldphase newphase oldlos newlos oldhis newhis oldprior
#> 1  8.7274  8.9274       -1       -1      6      6     12     12     10.3
#>   newprior oldprsd newprsd oldprtype newprtype     comment
#> 1     10.3      10      10         0         0 # SR_LN(R0)
#> running profile i=7/11
#> Parameter names in control file matching input vector 
#> 'strings' (n=1): SR_LN(R0)
#> line numbers in control file (n=1): 120
#> Wrote new file to control_modified.ss with the following changes:
#>   oldvals newvals oldphase newphase oldlos newlos oldhis newhis oldprior
#> 1  8.9274  9.1274       -1       -1      6      6     12     12     10.3
#>   newprior oldprsd newprsd oldprtype newprtype     comment
#> 1     10.3      10      10         0         0 # SR_LN(R0)
#> running profile i=8/11
#> Parameter names in control file matching input vector 
#> 'strings' (n=1): SR_LN(R0)
#> line numbers in control file (n=1): 120
#> Wrote new file to control_modified.ss with the following changes:
#>   oldvals newvals oldphase newphase oldlos newlos oldhis newhis oldprior
#> 1  9.1274  9.3274       -1       -1      6      6     12     12     10.3
#>   newprior oldprsd newprsd oldprtype newprtype     comment
#> 1     10.3      10      10         0         0 # SR_LN(R0)
#> running profile i=9/11
#> Parameter names in control file matching input vector 
#> 'strings' (n=1): SR_LN(R0)
#> line numbers in control file (n=1): 120
#> Wrote new file to control_modified.ss with the following changes:
#>   oldvals newvals oldphase newphase oldlos newlos oldhis newhis oldprior
#> 1  9.3274  9.5274       -1       -1      6      6     12     12     10.3
#>   newprior oldprsd newprsd oldprtype newprtype     comment
#> 1     10.3      10      10         0         0 # SR_LN(R0)
#> running profile i=10/11
#> Parameter names in control file matching input vector 
#> 'strings' (n=1): SR_LN(R0)
#> line numbers in control file (n=1): 120
#> Wrote new file to control_modified.ss with the following changes:
#>   oldvals newvals oldphase newphase oldlos newlos oldhis newhis oldprior
#> 1  9.5274  9.7274       -1       -1      6      6     12     12     10.3
#>   newprior oldprsd newprsd oldprtype newprtype     comment
#> 1     10.3      10      10         0         0 # SR_LN(R0)
#> running profile i=11/11
#> Parameter names in control file matching input vector 
#> 'strings' (n=1): SR_LN(R0)
#> line numbers in control file (n=1): 120
#> Wrote new file to control_modified.ss with the following changes:
#>   oldvals newvals oldphase newphase oldlos newlos oldhis newhis oldprior
#> 1  9.7274  9.9274       -1       -1      6      6     12     12     10.3
#>   newprior oldprsd newprsd oldprtype newprtype     comment
#> 1     10.3      10      10         0         0 # SR_LN(R0)
#>     Value converged[whichruns]    TOTAL       Catch Equil_catch    Survey
#> 1  7.9274                 TRUE 1067.570 4.40715e-01   16.620400  6.065520
#> 2  8.1274                 TRUE  940.077 2.29735e-01   12.985200  1.965820
#> 3  8.3274                 TRUE  844.236 1.08857e-01    9.799480 -2.132340
#> 4  8.5274                 TRUE  790.513 9.17170e-06    3.930790 -5.726270
#> 5  8.7274                 TRUE  762.562 6.48948e-08    0.271032 -7.140910
#> 6  8.9274                 TRUE  754.302 1.33458e-10    0.174869 -7.118110
#> 7  9.1274                 TRUE  758.927 2.10816e-13    1.337060 -6.106700
#> 8  9.3274                 TRUE  767.708 1.90347e-13    3.424990 -4.917570
#> 9  9.5274                 TRUE  777.243 1.15331e-13    6.433440 -3.857200
#> 10 9.7274                 TRUE  787.189 6.73658e-14   10.382600 -2.563580
#> 11 9.9274                 TRUE  798.152 4.05051e-14   15.279300 -0.380771
#>    Length_comp Age_comp Size_at_age Recruitment InitEQ_Regime
#> 1      453.198  341.340     230.715   19.181500   0.00000e+00
#> 2      418.750  291.595     201.557   12.989400   0.00000e+00
#> 3      393.384  257.968     179.068    6.036600   0.00000e+00
#> 4      379.721  243.456     167.416    1.713560   1.54502e-30
#> 5      370.420  234.970     163.525    0.515980   1.54502e-30
#> 6      366.756  231.845     162.399    0.244060   1.54502e-30
#> 7      367.497  233.210     162.675    0.312441   0.00000e+00
#> 8      369.543  235.957     163.170    0.529082   0.00000e+00
#> 9      371.605  238.622     163.598    0.839498   1.54502e-30
#> 10     373.376  240.872     163.942    1.178030   0.00000e+00
#> 11     374.815  242.718     164.208    1.510550   0.00000e+00
#>    Forecast_Recruitment Parm_priors Parm_softbounds Parm_devs Crash_Pen
#> 1                     0           0      0.00756177         0         0
#> 2                     0           0      0.00433602         0         0
#> 3                     0           0      0.00337783         0         0
#> 4                     0           0      0.00197985         0         0
#> 5                     0           0      0.00143663         0         0
#> 6                     0           0      0.00141332         0         0
#> 7                     0           0      0.00150262         0         0
#> 8                     0           0      0.00159065         0         0
#> 9                     0           0      0.00166199         0         0
#> 10                    0           0      0.00171844         0         0
#> 11                    0           0      0.00176282         0         0

Visualizing Output

Once the profile is finished running, we can visualize the results to determine if there is conflict between the data sources. If all data sources reach a negative log-likelihood (NLL) minimum at the same R0 value, this indicates good agreement between them. However, more likely, one or more will reach this NLL minimum at different R0 values from the global R0 value. This is a sign of conflict between your data sources and may require you to consider data weighting.

profile_mods <- SSgetoutput(dirvec = dir_profile, keyvec = 1:length(r0_vec), verbose = FALSE)
profile_mods_sum <- SSsummarize(profile_mods, verbose = FALSE)
SSplotProfile(profile_mods_sum,
  profile.string = "SR_LN",
  profile.label = "SR_LN(R0)"
)
#> Parameter matching profile.string=SR_LN: SR_LN(R0)
#> Parameter values (after subsetting based on input 'models'): 7.9274, 8.1274, 8.3274, 8.5274, 8.7274, 8.9274, 9.1274, 9.3274, 9.5274, 9.7274, 9.9274
#> Likelihood components showing max change as fraction of total change.
#> To change which components are included, change input 'minfraction'.
#>                      frac_change include                            label
#> TOTAL                     1.0000    TRUE                            Total
#> Catch                     0.0014   FALSE                            Catch
#> Equil_catch               0.0525    TRUE                Equilibrium catch
#> Survey                    0.0422    TRUE                       Index data
#> Length_comp               0.2759    TRUE                      Length data
#> Age_comp                  0.3495    TRUE                         Age data
#> Size_at_age               0.2181    TRUE                 Size-at-age data
#> Recruitment               0.0605    TRUE                      Recruitment
#> InitEQ_Regime             0.0000   FALSE Initital equilibrium recruitment
#> Forecast_Recruitment      0.0000   FALSE             Forecast recruitment
#> Parm_priors               0.0000   FALSE                           Priors
#> Parm_softbounds           0.0000   FALSE                      Soft bounds
#> Parm_devs                 0.0000   FALSE             Parameter deviations
#> Crash_Pen                 0.0000   FALSE                    Crash penalty

The profile plot above shows the changes in log-likelihood across the selected range of values for each of the contributing data components. There is an adjustable minimum contribution threshold that a component must meet to appear in this figure (if there is a data source that is in your model but does not show up in the plot, its contribution to the total likelihood may not be large enough). The steepness of each trajectory indicates how informative (or not) a data source is. For example, the age data in the plot above is much steeper on the left side of the minimum R0 value than the index data, which suggests that age composition data is more informative in the model.

You can also plot data-type and fleet-specific profiles using r4ss::PinerPlot(). Below we are plotting the profile for the length composition data by fleet and the likelihood of survey data by fleet. This will allow us to see if there are conflicts and what sources are the main drivers.

sspar(mfrow = c(1, 2))
PinerPlot(profile_mods_sum,
  component = "Length_like",
  main = "Length"
)
#> Parameter matching profile.string = 'R0': 'SR_LN(R0)
#> Parameter values (after subsetting based on input 'models'): 7.9274, 8.1274, 8.3274, 8.5274, 8.7274, 8.9274, 9.1274, 9.3274, 9.5274, 9.7274, 9.9274,
#> Fleet-specific likelihoods showing max change as fraction of total change.
#> To change which components are included, change input 'minfraction'.
#>         frac_change include
#> FISHERY      0.6825    TRUE
#> SURVEY1      0.3208    TRUE
#> SURVEY2      0.0000   FALSE
PinerPlot(profile_mods_sum,
  component = "Surv_like",
  main = "Survey"
)
#> Parameter matching profile.string = 'R0': 'SR_LN(R0)
#> Parameter values (after subsetting based on input 'models'): 7.9274, 8.1274, 8.3274, 8.5274, 8.7274, 8.9274, 9.1274, 9.3274, 9.5274, 9.7274, 9.9274,
#> Fleet-specific likelihoods showing max change as fraction of total change.
#> To change which components are included, change input 'minfraction'.
#>         frac_change include
#> FISHERY      0.0000   FALSE
#> SURVEY1      0.3218    TRUE
#> SURVEY2      0.9202    TRUE