Prior Estimation

`EBayesMethod`

:::{.callout appearance="minimal"}

Abstract type representing empirical Bayes estimation methods.

:::

Nonparametric estimation

The typical call for estimating the prior \(G\) based on empirical Bayes samples Zs is the following,

StatsBase.fit(method, Zs)

Above, method is a type that specifies both the assumptions made on \(G\) (say, the convex prior class \(\mathcal{G}\) in which \(G\) lies), as well as details concerning the computation (typically a JuMP.jl compatible convex programming solver).

Nonparametric Maximum Likelihood estimation (NPMLE)

For example, let us consider the nonparametric maximum likelihood estimator:

`NPMLE`

:::{.callout appearance="minimal"}

NPMLE(convexclass, solver) <: Empirikos.EBayesMethod

Given $n$ independent samples $Z_i$ from the empirical Bayes problem with prior $G$ known to lie in the convexclass $\mathcal{G}$, estimate $G$ by Nonparametric Maximum Likelihood (NPMLE)

$\widehat{G}_n \in \operatorname{argmax}_{G \in \mathcal{G}}\left\{\sum_{i=1}^n \log( f_{i,G}(Z_i)) \right\},$

where $f_{i,G}(z) = \int p_i(z \mid \mu) dG(\mu)$ is the marginal density of the $i$-th sample. The optimization is conducted by a JuMP compatible solver.

:::

Suppose we have Poisson samples Zs, each with a different mean \(\mu_i\) drawn from \(G=U[1,5]\):

using Distributions
n = 1000
μs = rand(Uniform(1,5), n)
Zs = PoissonSample.(rand.(Poisson.(μs)))

We can then estimate \(G\) as follows using Mosek:

using MosekTools
g_hat = fit(NPMLE(DiscretePriorClass(), Mosek.Optimizer), Zs)

Or we can use the open-source Hypatia.jl solver:

using Hypatia
g_hat = fit(NPMLE(DiscretePriorClass(), Hypatia.Optimizer), Zs)

Other available nonparametric methods

`KolmogorovSmirnovMinimumDistance`

:::{.callout appearance="minimal"}

KolmogorovSmirnovMinimumDistance(convexclass, solver) <: Empirikos.EBayesMethod

Given $n$ i.i.d. samples from the empirical Bayes problem with prior $G$ known to lie in the convexclass $\mathcal{G}$ , estimate $G$ as follows:

$\widehat{G}_n \in \operatorname{argmin}_{G \in \mathcal{G}}\{\sup_{t \in \mathbb R}\lvert F_G(t) - \widehat{F}_n(t)\rvert\},$

where $\widehat{F}_n$ is the ECDF of the samples. The optimization is conducted by a JuMP compatible solver.

:::