Prior Estimation
Empirikos.EBayesMethod
— TypeAbstract type representing empirical Bayes estimation methods.
Nonparametric estimation
The typical call for estimating the prior $G$ based on empirical Bayes samples Zs
is the following,
StatsBase.fit(method, Zs)
Above, method
is a type that specifies both the assumptions made on $G$ (say, the convex prior class $\mathcal{G}$ in which $G$ lies), as well as details concerning the computation (typically a JuMP.jl compatible convex programming solver).
Nonparametric Maximum Likelihood estimation (NPMLE)
For example, let us consider the nonparametric maximum likelihood estimator:
Empirikos.NPMLE
— TypeNPMLE(convexclass, solver) <: Empirikos.EBayesMethod
Given $n$ independent samples $Z_i$ from the empirical Bayes problem with prior $G$ known to lie in the convexclass
$\mathcal{G}$, estimate $G$ by Nonparametric Maximum Likelihood (NPMLE)
\[\widehat{G}_n \in \operatorname{argmax}_{G \in \mathcal{G}}\left\{\sum_{i=1}^n \log( f_{i,G}(Z_i)) \right\},\]
where $f_{i,G}(z) = \int p_i(z \mid \mu) dG(\mu)$ is the marginal density of the $i$-th sample. The optimization is conducted by a JuMP compatible solver
.
Suppose we have Poisson samples Zs
, each with a different mean $\mu_i$ drawn from $G=U[1,5]$:
using Distributions
n = 1000
μs = rand(Uniform(1,5), n)
Zs = PoissonSample.(rand.(Poisson.(μs)))
We can then estimate $G$ as follows using Mosek:
using MosekTools
g_hat = fit(NPMLE(DiscretePriorClass(), Mosek.Optimizer), Zs)
Or we can use the open-source Hypatia.jl solver:
using Hypatia
g_hat = fit(NPMLE(DiscretePriorClass(), Hypatia.Optimizer), Zs)
Other available nonparametric methods
Empirikos.KolmogorovSmirnovMinimumDistance
— TypeKolmogorovSmirnovMinimumDistance(convexclass, solver) <: Empirikos.EBayesMethod
Given $n$ i.i.d. samples from the empirical Bayes problem with prior $G$ known to lie in the convexclass
$\mathcal{G}$ , estimate $G$ as follows:
\[\widehat{G}_n \in \operatorname{argmin}_{G \in \mathcal{G}}\{\sup_{t \in \mathbb R}\lvert F_G(t) - \widehat{F}_n(t)\rvert\},\]
where $\widehat{F}_n$ is the ECDF of the samples. The optimization is conducted by a JuMP compatible solver
.