Empirical Bayes samples

The design choice of this package, is that each sample is wrapped in a type that represents its likelihood. This works well, since in the empirical Bayes problem, we typically impose (simple) assumptions on the distribution of \(Z_i \mid \mu_i\) and complexity emerges from making compound or nonparametric assumptions on the \(\mu_i\) and sharing information across \(i\). The main advantage is that it then makes it easy to add new likelihoods and have it automatically integrate with the rest of the package (say the nonparametric maximum likelihood estimator) through Julia’s multiple dispatch.

The abstract type is

`EBayesSample`

:::{.callout appearance="minimal"}

EBayesSample{T}

Abstract type representing empirical Bayes samples with realizations of type T.

:::

Example: StandardNormalSample

We explain the interface in the most well-studied empirical Bayes setting, namely the Gaussian compound decision problem wherein \(Z_i \mid \mu_i \sim \mathcal{N}(\mu_i,1)\). Such a sample is represented through the StandardNormalSample type:

`StandardNormalSample`

:::{.callout appearance="minimal"}

StandardNormalSample(Z)

An observed sample $Z$ drawn from a Normal distribution with known variance $\sigma^2 =1$.

$Z \sim \mathcal{N}(\mu, 1)$

$\mu$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\mu$.

julia> StandardNormalSample(0.5)          #Z=0.5
N(0.5; μ, σ=1.0)

:::

The type can be used in three ways. First, say we observe \(Z_i=1.0\), then we reprent that as Z = StandardNormalSample(1.0). A more advanced functionality consists of StandardNormalSample(missing), which represents the random variable \(Z_i\) without having observed its realization yet.

Interface

The main interface functions are the following:

`likelihood_distribution`

:::{.callout appearance="minimal"}

likelihood_distribution(Z::EBayesSample, μ::Number)

Returns the distribution $p(\cdot \mid \mu)$ of $Z \mid \mu$ (the return type being a Distributions.jl Distribution).

Examples {.unnumbered}

julia> likelihood_distribution(StandardNormalSample(1.0), 2.0)
Normal{Float64}(μ=2.0, σ=1.0)
likelihood_distribution(Z::FoldedNormalSample, μ)

Return the folded Normal distribution fold(Normal(μ, σ)) for a given mean μ.

:::

`response`

:::{.callout appearance="minimal"}

response(model::RegressionModel)

Return the model response (a.k.a. the dependent variable).

response(Z::EBayesSample{T})

Returns the concrete realization of Z as type T, thus dropping the information about the likelihood.

Examples {.unnumbered}

julia> response(StandardNormalSample(1.0))
1.0

:::

`marginalize`

:::{.callout appearance="minimal"}

marginalize(Z::EBayesSample, prior::Distribution)

Given a prior distribution $G$ and EBayesSample $Z$, return that marginal distribution of $Z$. Works for EBayesSample{Missing}`, i.e., no realization is needed.

Examples {.unnumbered}

jldoctest julia> marginalize(StandardNormalSample(1.0), Normal(2.0, sqrt(3))) Normal{Float64}(μ=2.0, σ=1.9999999999999998)`

marginalize(Z::FoldedNormalSample, prior::Normal) -> Folded{Normal}

Compute marginal distribution for folded normal observation with normal prior.

  1. Unfold the observation to normal sample

  2. Compute marginal distribution of the normal sample with the normal prior

  3. Fold the resulting distribution

marginalize(Z::FoldedNormalSample, prior::Uniform) -> Folded{UniformNormal}

Compute the marginal distribution for a folded normal observation under a uniform prior.

Arguments {.unnumbered}

  • Z::FoldedNormalSample: A folded normal observation.

  • prior::Uniform: Uniform prior distribution.

Returns {.unnumbered}

  • Folded{UniformNormal}: Folded UniformNormal distribution representing the marginal distribution.

:::

`pdf`

:::{.callout appearance="minimal"}

pdf(d::Distribution{ArrayLikeVariate{N}}, x::AbstractArray{<:Real,N}) where {N}

Evaluate the probability density function of d at x.

This function checks if the size of x is compatible with distribution d. This check can be disabled by using @inbounds.

Implementation {.unnumbered}

Instead of pdf one should implement _pdf(d, x) which does not have to check the size of x. However, since the default definition of pdf(d, x) falls back to logpdf(d, x) usually it is sufficient to implement logpdf.

See also: logpdf.

pdf(d::Distribution{ArrayLikeVariate{N}}, x) where {N}

Evaluate the probability density function of d at every element in a collection x.

This function checks for every element of x if its size is compatible with distribution d. This check can be disabled by using @inbounds.

Here, x can be

  • an array of dimension > N with size(x)[1:N] == size(d), or

  • an array of arrays xi of dimension N with size(xi) == size(d).

pdf(d::UnivariateDistribution, x::Real)

Evaluate the probability density (mass) at x.

See also: logpdf.

pdf(d::Union{UnivariateMixture, MultivariateMixture}, x)

Evaluate the (mixed) probability density function over x. Here, x can be a single sample or an array of multiple samples.

pdf(prior::Distribution, Z::EBayesSample)

Given a prior $G$ and EBayesSample $Z$, compute the marginal density of Z.

Examples {.unnumbered}

julia> Z = StandardNormalSample(1.0)
N(1.0; μ, σ=1.0)
julia> prior = Normal(2.0, sqrt(3))
Normal{Float64}(μ=2.0, σ=1.7320508075688772)
julia> pdf(prior, Z)
0.17603266338214976
julia> pdf(Normal(2.0, 2.0), 1.0)
0.17603266338214976

:::

`cdf`

:::{.callout appearance="minimal"}

cdf(d::UnivariateDistribution, x::Real)

Evaluate the cumulative probability at x.

See also ccdf, logcdf, and logccdf.

cdf(prior::Distribution, Z::EBayesSample)

Given a prior $G$ and EBayesSample $Z$, evaluate the CDF of the marginal distribution of $Z$ at response(Z).

:::

Other implemented EBayesSample types

Currently, the following samples have been implemented.

`NormalSample`

:::{.callout appearance="minimal"}

NormalSample(Z,σ)

An observed sample $Z$ drawn from a Normal distribution with known variance $\sigma^2 > 0$.

$Z \sim \mathcal{N}(\mu, \sigma^2)$

$\mu$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\mu$.

julia> NormalSample(0.5, 1.0)          #Z=0.5, σ=1
N(0.5; μ, σ=1.0)
NormalSample(Z::FoldedNormalSample; positive_sign = true)

Convert a FoldedNormalSample back to a NormalSample by assigning a sign (default: positive).

:::

`BinomialSample`

:::{.callout appearance="minimal"}

BinomialSample(Z, n)

An observed sample $Z$ drawn from a Binomial distribution with n trials.

$Z \sim \text{Binomial}(n, p)$

$p$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $p$.

julia> BinomialSample(2, 10)          # 2 out of 10 trials successful
ℬ𝒾𝓃(2; p, n=10)

:::

`PoissonSample`

:::{.callout appearance="minimal"}

PoissonSample(Z, E)

An observed sample $Z$ drawn from a Poisson distribution,

$Z \sim \text{Poisson}(\mu \cdot E).$

The multiplying intensity $E$ is assumed to be known (and equal to 1.0 by default), while $\mu$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\mu$.

julia> PoissonSample(3)
𝒫ℴ𝒾(3; μ)
julia> PoissonSample(3, 1.5)
𝒫ℴ𝒾(3; μ⋅1.5)

:::