loo v2.7.0
Major changes
- New sample size specific diagnostic threshold for Pareto k.
The pre-2022 version of the PSIS paper recommended diagnostic thresholds of
k < 0.5 "good"
0.5 <= k < 0.7 "ok"
0.7 <= k < 1 "bad"
k >= 1 "very bad"
The 2022 revision of the PSIS paper now recommends
k < min(1 - 1/log10(S), 0.7) "good"
min(1 - 1/log10(S), 0.7) <= k < 1 "bad"
k >= 1 "very bad"
where S is the sample size.
There is now one fewer diagnostic threshold ("ok" has been removed), and the
most important threshold now depends on the sample size S. With sample sizes
100, 320, 1000, 2200, 10000 the sample size specific part 1 - 1/log10(S)
corresponds to thresholds of 0.5, 0.6, 0.67, 0.7, 0.75.
Even if the sample size grows, the bias in the PSIS estimate dominates if
0.7 <= k < 1, and thus the diagnostic threshold for "good" is capped at 0.7
(if k >= 1, the mean does not exist and bias is not a valid measure).
The new recommended thresholds come from a more careful bias-variance analysis
of PSIS using truncated Pareto sums theory. For those who use the Stan
default 4000 posterior draws, the 0.7 threshold will be roughly the same, but
there will be fewer warnings as there will be no diagnostic message for
0.5 <= k < 0.7.
Those who use smaller sample sizes may see diagnostic messages with a
threshold less than 0.7, and they can simply increase the sample size to about
2200 to get the threshold to 0.7.
- No more warnings if the r_eff argument is not provided, and the
default is now r_eff = 1. The summary print output showing MCSE and ESS now
shows diagnostic information on the range of r_eff. The change was made to
reduce unnecessary warnings. The use of r_eff does not change the expected
value of elpd_loo, p_loo, and Pareto k, and is needed only to estimate
MCSE and ESS. Thus it is better to show the diagnostic information about r_eff
only when MCSE and ESS values are shown.
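The sample size specific threshold described in the first item above can be checked with a few lines of base R. This is only a sketch of the formula as stated in these notes, not the package's internal code; the function names k_threshold_sketch and k_label_sketch are made up for illustration.

```r
# Sample size specific Pareto k threshold as stated in the notes:
# min(1 - 1/log10(S), 0.7). Function names are hypothetical.
k_threshold_sketch <- function(S) {
  pmin(1 - 1 / log10(S), 0.7)
}

# Label k values with the revised "good" / "bad" / "very bad" categories.
k_label_sketch <- function(k, S) {
  threshold <- k_threshold_sketch(S)
  ifelse(k < threshold, "good", ifelse(k < 1, "bad", "very bad"))
}

S <- c(100, 320, 1000, 2200, 10000)

# Uncapped, sample size specific part, rounded to two decimals.
uncapped <- round(1 - 1 / log10(S), 2)
print(uncapped)

# Threshold actually used for "good", capped at 0.7.
print(k_threshold_sketch(S))

# The same k can be "bad" with 100 draws and "good" with 4000 draws.
labels <- k_label_sketch(c(0.55, 0.55, 0.8, 1.2), S = c(100, 4000, 4000, 4000))
print(labels)
```

With 4000 draws the uncapped value is already above 0.7, so the cap is what applies, which is why the 0.7 threshold stays roughly the same for Stan's default settings.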
Other changes
- Make Pareto k Inf if it is NA by @topipa in #224
- Fix bug in E_loo() when type is variance by @jgabry in #22
- E_loo() now allows type="sd" by @jgabry in #226
- include cc-by 4.0 license for documentation by @jgabry in #216
- Add order statistic warning by @yannmclatchie in #230
- pointwise() convenience function for extracting pointwise estimates by @jgabry in #241
- use new k threshold by @avehtari in #235
- simplify mcse_elpd using log-normal approximation by @avehtari in #246
- show NA for n_eff/ESS if k > k_threshold by @avehtari in #248
- improved E_loo() Pareto-k diagnostics by @avehtari in #247
- Doc improvement in loo_subsample.R by @avehtari in #238
- Fix typo and deprecations in LFO vignette by @jgabry in #244
- update array syntax in vignettes by @jgabry in #229
- Fix unbalanced knitr backticks by @jgabry in #232
- Register internal S3 methods by @jgabry in #239
- Avoid R cmd check NOTEs about some internal functions by @jgabry in #240
- fix R cmd check note due to importance_sampling roxygen template by @jgabry in #233
- fix R cmd check notes by @jgabry in #242
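A few of the user-facing changes listed above (the r_eff default, pointwise(), and type="sd" in E_loo()) can be tried out with a sketch like the following. It assumes loo 2.7.0 or later is installed. The toy model, the simulated draws, and the object names (ll, fit, psis_obj, mu_sd) are made up for illustration and do not come from the release notes.

```r
library(loo)  # assumes loo >= 2.7.0

set.seed(1)
S <- 1000  # number of simulated posterior draws (made-up)
N <- 20    # number of observations (made-up)

# Toy model y ~ normal(mu, 1) with simulated "posterior" draws of mu.
y <- rnorm(N)
mu <- rnorm(S, mean = mean(y), sd = 1 / sqrt(N))

# S x N matrix of pointwise log-likelihood values.
ll <- sapply(y, function(yi) dnorm(yi, mean = mu, sd = 1, log = TRUE))

# r_eff is not supplied, so the default described above is used.
fit <- loo(ll)
print(fit)

# Extract one column of the pointwise estimates by name.
elpd_i <- pointwise(fit, "elpd_loo")

# Leave-one-out posterior sd of mu for each held-out observation.
psis_obj <- psis(-ll)
x <- matrix(mu, nrow = S, ncol = N)
mu_sd <- E_loo(x, psis_obj, type = "sd", log_ratios = -ll)
str(mu_sd$value)
```

Because the draws here are independent, leaving r_eff at its default is reasonable. For real MCMC output, relative_eff() can still be used to supply r_eff when accurate MCSE and ESS values matter.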
New Contributors
- @yannmclatchie made their first contribution in #230
Full Changelog: v2.6.0...v2.7.0