Empirical Bayes samples

The central design choice of this package is that each sample is wrapped in a type that represents its likelihood. This works well because, in the empirical Bayes problem, we typically impose simple assumptions on the distribution of $Z_i \mid \mu_i$, while complexity emerges from making compound or nonparametric assumptions on the $\mu_i$ and sharing information across $i$. The main advantage is that new likelihoods are easy to add and automatically integrate with the rest of the package (say, the nonparametric maximum likelihood estimator) through Julia's multiple dispatch.

The abstract type is

Empirikos.EBayesSample — Type

EBayesSample{T}

Abstract type representing empirical Bayes samples with realizations of type T.


Example: StandardNormalSample

We explain the interface in the most well-studied empirical Bayes setting, namely the Gaussian compound decision problem wherein $Z_i \mid \mu_i \sim \mathcal{N}(\mu_i,1)$. Such a sample is represented through the StandardNormalSample type:

Empirikos.StandardNormalSample — Type

StandardNormalSample(Z)

An observed sample $Z$ drawn from a Normal distribution with known variance $\sigma^2 =1$.

\[Z \sim \mathcal{N}(\mu, 1)\]

$\mu$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\mu$.

julia> StandardNormalSample(0.5)          #Z=0.5
Z=     0.5 | σ=1.0


The type can be used in three ways. First, say we observe $Z_i=1.0$; then we represent that as Z = StandardNormalSample(1.0). Second, StandardNormalSample(missing) represents the random variable $Z_i$ without its realization having been observed yet. Third, StandardNormalSample(Interval(0.0,1.0)) represents a $Z_i$ whose realization lies in $[0,1]$; this is useful for conducting rigorous discretizations (which can speed up many estimation algorithms). Open, closed, and unbounded intervals are all allowed, cf. the intervals in the Intervals.jl package.
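The three constructions can be put side by side in a short sketch. It assumes that `Interval` is the type from Intervals.jl mentioned in the text and that loading Intervals.jl next to Empirikos is harmless; the variable names are made up for illustration.

```julia
using Empirikos
using Distributions
using Intervals  # assumption: `Interval` is the Intervals.jl type named in the text

Z_observed   = StandardNormalSample(1.0)                  # realization Z_i = 1.0
Z_unobserved = StandardNormalSample(missing)              # Z_i not yet observed
Z_binned     = StandardNormalSample(Interval(0.0, 1.0))   # Z_i known only to lie in [0, 1]

# The marginal distribution of Z_i does not depend on the realization,
# so the unobserved sample is enough to compute it.
marginal = marginalize(Z_unobserved, Normal(2.0, sqrt(3)))
```

Here `marginal` should agree with the result obtained from an observed sample in the `marginalize` example of the Interface section.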

Interface

The main interface functions are the following:

Empirikos.likelihood_distribution — Function

likelihood_distribution(Z::EBayesSample, μ::Number)

Returns the distribution $p(\cdot \mid \mu)$ of $Z \mid \mu$ (the return type being a Distributions.jl Distribution).

Examples

julia> likelihood_distribution(StandardNormalSample(1.0), 2.0)
Normal{Float64}(μ=2.0, σ=1.0)
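Since the interface dispatches on the sample type, a new likelihood can in principle be added by subtyping EBayesSample and defining a method of likelihood_distribution. The following is a minimal sketch under stated assumptions: `LaplaceSample` is a hypothetical type that is not part of Empirikos, and other parts of the package may require further methods that are not shown here.

```julia
using Empirikos
using Distributions

# Hypothetical sample type for illustration only (not part of Empirikos):
# Z | μ ~ Laplace(μ, 1), i.e. location μ and known scale 1.
struct LaplaceSample{T} <: Empirikos.EBayesSample{T}
    Z::T
end

# Add a method of the documented interface function for the new type.
function Empirikos.likelihood_distribution(Z::LaplaceSample, μ::Number)
    return Laplace(μ, 1.0)
end

d = Empirikos.likelihood_distribution(LaplaceSample(0.5), 2.0)
```

With this definition, `d` is the Distributions.jl object `Laplace(2.0, 1.0)`, in the same way that the StandardNormalSample call above returns a Normal.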


StatsAPI.response — Method

response(Z::EBayesSample{T})

Returns the concrete realization of Z as type T, thus dropping the information about the likelihood.

Examples

julia> response(StandardNormalSample(1.0))
1.0


Empirikos.marginalize — Function

marginalize(Z::EBayesSample, prior::Distribution)

Given a prior distribution $G$ and EBayesSample $Z$, return the marginal distribution of $Z$. Works for EBayesSample{Missing}, i.e., no realization is needed.

Examples

julia> marginalize(StandardNormalSample(1.0), Normal(2.0, sqrt(3)))
Normal{Float64}(μ=2.0, σ=1.9999999999999998)
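The result can be checked by hand. With a Normal prior of mean 2 and variance 3 and a standard Normal likelihood, the prior and noise variances add:

\[
Z = \mu + \varepsilon, \quad \mu \sim \mathcal{N}(2, 3), \quad \varepsilon \sim \mathcal{N}(0, 1) \text{ independent of } \mu
\;\Longrightarrow\; Z \sim \mathcal{N}(2,\, 3 + 1) = \mathcal{N}(2, 4).
\]

The marginal standard deviation is therefore $\sqrt{4} = 2$; the printed value of σ differs from 2 only through floating-point rounding, since sqrt(3) is not exactly representable.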


Distributions.pdf — Method

pdf(prior::Distribution, Z::EBayesSample)

Given a prior $G$ and EBayesSample $Z$, compute the marginal density of $Z$ at response(Z).

Examples

julia> Z = StandardNormalSample(1.0)
Z=     1.0 | σ=1.0
julia> prior = Normal(2.0, sqrt(3))
Normal{Float64}(μ=2.0, σ=1.7320508075688772)
julia> pdf(prior, Z)
0.17603266338214976
julia> pdf(Normal(2.0, 2.0), 1.0)
0.17603266338214976
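The marginal density is the likelihood averaged over the prior, $\int p(z \mid \mu)\, dG(\mu)$. As an independent check of the value above, the integral can be approximated numerically with Distributions.jl alone. This is only a sketch: the grid bounds and size are arbitrary choices, and it says nothing about how the package itself computes the density.

```julia
using Distributions

prior = Normal(2.0, sqrt(3))
z = 1.0

# Riemann sum of likelihood × prior density over a wide grid of μ values.
μs = range(-20.0, 24.0; length = 20_001)
h = step(μs)
approx = h * sum(pdf(Normal(μ, 1.0), z) * pdf(prior, μ) for μ in μs)

# Closed-form marginal N(2, 4), evaluated at the same point.
exact = pdf(Normal(2.0, 2.0), z)

agree = isapprox(approx, exact; atol = 1e-8)
```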


Distributions.cdf — Method

cdf(prior::Distribution, Z::EBayesSample)

Given a prior $G$ and EBayesSample $Z$, evaluate the CDF of the marginal distribution of $Z$ at response(Z).


Distributions.ccdf — Method

ccdf(prior::Distribution, Z::EBayesSample)

Given a prior $G$ and EBayesSample $Z$, evaluate the complementary CDF of the marginal distribution of $Z$ at response(Z).
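By analogy with the pdf example, cdf and ccdf can be compared against the closed-form marginal $\mathcal{N}(2, 4)$. The sketch below assumes the two methods behave exactly as described in their docstrings; the reference values are computed with Distributions.jl directly.

```julia
using Empirikos
using Distributions

Z = StandardNormalSample(1.0)
prior = Normal(2.0, sqrt(3))

F = cdf(prior, Z)    # marginal probability that Z_i <= 1.0
S = ccdf(prior, Z)   # marginal probability that Z_i > 1.0

# Reference values from the closed-form marginal N(2, 4).
F_ref = cdf(Normal(2.0, 2.0), 1.0)
S_ref = ccdf(Normal(2.0, 2.0), 1.0)
```

For a continuous marginal such as this one, `F + S` should equal 1 up to rounding.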


Other implemented EBayesSample types

Currently, the following samples have been implemented.

Empirikos.NormalSample — Type

NormalSample(Z,σ)

An observed sample $Z$ drawn from a Normal distribution with known variance $\sigma^2 > 0$.

\[Z \sim \mathcal{N}(\mu, \sigma^2)\]

$\mu$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\mu$.

julia> NormalSample(0.5, 1.0)          #Z=0.5, σ=1
Z=     0.5 | σ=1.0


Empirikos.BinomialSample — Type

BinomialSample(Z, n)

An observed sample $Z$ drawn from a Binomial distribution with $n$ trials.

\[Z \sim \text{Binomial}(n, p)\]

$p$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $p$.

julia> BinomialSample(2, 10)          # 2 out of 10 trials successful
Z=2  | n=10


Empirikos.PoissonSample — Type

PoissonSample(Z, E)

An observed sample $Z$ drawn from a Poisson distribution,

\[Z \sim \text{Poisson}(\mu \cdot E).\]

The multiplying intensity $E$ is assumed to be known (and equal to 1.0 by default), while $\mu$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\mu$.

julia> PoissonSample(3)
Z=3  | E=1.0
julia> PoissonSample(3, 1.5)
Z=3  | E=1.5
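The sample types in this section should plug into the same interface as StandardNormalSample. The sketch below assumes that likelihood_distribution has methods for each of them and follows the parameterizations displayed in the docstrings; the parameter values are arbitrary.

```julia
using Empirikos
using Distributions

# Z | μ ~ N(μ, σ²) with σ = 2.0, evaluated at μ = 1.0
d_normal = likelihood_distribution(NormalSample(0.5, 2.0), 1.0)

# Z | p ~ Binomial(10, p), evaluated at p = 0.3
d_binom = likelihood_distribution(BinomialSample(2, 10), 0.3)

# Z | μ ~ Poisson(μ * E) with E = 1.5, evaluated at μ = 2.0
d_pois = likelihood_distribution(PoissonSample(3, 1.5), 2.0)
```

Under these assumptions the three distributions have means 1.0, 3.0 and 3.0, respectively.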
