Let us load the dataset from Lee (2008). We will reproduce analyses from Imbens and Kalyanaraman (2012).
using DataFrames, RegressionDiscontinuity, Plots
lee08 = load_rdd_data(:lee08) |> DataFrame
first(lee08, 3)
3×3 DataFrame
│ Row │ Ys      │ Ws   │ Zs      │
│     │ Float64 │ Bool │ Float64 │
├─────┼─────────┼──────┼─────────┤
│ 1   │ 0.0     │ 0    │ -1.0    │
│ 2   │ 0.0     │ 0    │ -1.0    │
│ 3   │ 0.0     │ 0    │ -1.0    │
running_var = RunningVariable(lee08.Zs, cutoff=0.0, treated=:≧);
Let us first plot the histogram of the running variable:
plot(running_var; ylim=(0,600), bins=40, background_color="#f3f6f9", size=(700,400))
Next, we plot the regressogram (also known as a scatterbin plot) of the response:
regressogram = plot(running_var, lee08.Ys; bins=40, background_color="#f3f6f9", size=(700,400), legend=:bottomright)
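To make the idea of a regressogram concrete, here is a minimal sketch that averages the response within equal-width bins of the running variable. The helper `binned_means` and the synthetic data are hypothetical and only for illustration; this is not how RegressionDiscontinuity.jl computes its bins, which may, for example, align bin edges with the cutoff.

```julia
using Statistics

# Hypothetical helper (not part of RegressionDiscontinuity.jl): average the
# response within equal-width bins of the running variable.
function binned_means(z, y; bins=4)
    lo, hi = extrema(z)
    width = (hi - lo) / bins
    # Bin index of each observation; the maximum is clamped into the last bin.
    idx = clamp.(floor.(Int, (z .- lo) ./ width) .+ 1, 1, bins)
    centers = [lo + (k - 0.5) * width for k in 1:bins]
    # Empty bins get NaN so that they are visible rather than silently dropped.
    means = [any(idx .== k) ? mean(y[idx .== k]) : NaN for k in 1:bins]
    return centers, means
end

# Synthetic data with a jump of 1.0 at the cutoff z = 0.
z = [-0.9, -0.6, -0.4, -0.1, 0.1, 0.4, 0.6, 0.9]
y = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
centers, means = binned_means(z, y; bins=4)
```

Plotting `means` against `centers` gives the step-like picture that the regressogram above shows for the Lee data.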
We observe a jump at the discontinuity, which we can estimate, e.g., with local linear regression. Here we use a rectangular kernel and choose the bandwidth with the Imbens-Kalyanaraman bandwidth selector:
rect_ll_rd = fit(NaiveLocalLinearRD(kernel=Rectangular(), bandwidth=ImbensKalyanaraman()),
running_var, lee08.Ys)
Local linear regression for regression discontinuity design
⋅⋅⋅⋅ Naive inference (not accounting for bias)
⋅⋅⋅⋅ Rectangular kernel (U[-0.5,0.5])
⋅⋅⋅⋅ Imbens Kalyanaraman bandwidth
⋅⋅⋅⋅ Eicker White Huber variance
────────────────────────────────────────────────────────────
                          h        τ̂         se         bias
────────────────────────────────────────────────────────────
Sharp RD estimand  0.462024  0.08077  0.0087317  unaccounted
────────────────────────────────────────────────────────────
plot!(regressogram, rect_ll_rd; show_local_support=true)
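The estimator used above can be sketched in a few lines: with a rectangular kernel, local linear regression amounts to an ordinary least squares line fitted on each side of the cutoff, using only observations within the bandwidth, and the estimate is the difference of the two intercepts at the cutoff. The function `local_linear_rd` below is a hypothetical illustration, not the package's implementation. It assumes the bandwidth `h` is given (no Imbens-Kalyanaraman selection) and computes no standard error.

```julia
using LinearAlgebra

# Hypothetical helper: sharp RD point estimate from two local linear fits
# with a rectangular (uniform) kernel of bandwidth h.
function local_linear_rd(z, y; cutoff=0.0, h=1.0)
    x = z .- cutoff
    left = (x .< 0) .& (x .>= -h)    # control side within the bandwidth
    right = (x .>= 0) .& (x .<= h)   # treated side (treated when z >= cutoff)
    # OLS of y on [1, x]; the intercept is the fitted value at the cutoff.
    intercept(mask) = ([ones(count(mask)) x[mask]] \ y[mask])[1]
    return intercept(right) - intercept(left)
end

# Synthetic data: linear trend with slope 0.2 plus a jump of 0.3 at z = 0.
z = [-0.4, -0.3, -0.2, -0.1, 0.1, 0.2, 0.3, 0.4]
y = [0.5 + 0.2 * zi + (zi >= 0 ? 0.3 : 0.0) for zi in z]
tau_hat = local_linear_rd(z, y; h=0.5)
```

Because the synthetic data are exactly linear on each side, `tau_hat` recovers the jump of 0.3 up to floating-point error.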
Let us zoom in on the support of the local kernel and use a more refined regressogram:
local_regressogram = plot(rect_ll_rd.data_subset; bins=40, background_color="#f3f6f9", size=(700,400), legend=:bottomright)
plot!(rect_ll_rd)
Finally, we could repeat the above analysis with another kernel, e.g., the triangular kernel:
triang_ll_rd = fit(NaiveLocalLinearRD(kernel=SymTriangularDist(), bandwidth=ImbensKalyanaraman()),
running_var, lee08.Ys)
Local linear regression for regression discontinuity design
⋅⋅⋅⋅ Naive inference (not accounting for bias)
⋅⋅⋅⋅ Triangular kernel
⋅⋅⋅⋅ Imbens Kalyanaraman bandwidth
⋅⋅⋅⋅ Eicker White Huber variance
───────────────────────────────────────────────────────────────
                          h          τ̂          se         bias
───────────────────────────────────────────────────────────────
Sharp RD estimand  0.293907  0.0799218  0.00834476  unaccounted
───────────────────────────────────────────────────────────────
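Switching to a triangular kernel replaces the unweighted fits by weighted least squares, with weights that decay linearly from 1 at the cutoff to 0 at distance `h`. The sketch below illustrates this under the same assumptions as before: all function names are hypothetical, the bandwidth is supplied by hand, and no variance estimate is computed. It is not the package's implementation.

```julia
using LinearAlgebra

# Triangular kernel weight: 1 at the cutoff, decaying linearly to 0 at |x| = h.
triangular_weight(x, h) = max(0.0, 1 - abs(x) / h)

# Weighted least squares intercept, obtained by rescaling rows by sqrt(weight).
function weighted_intercept(x, y, w)
    X = [ones(length(x)) x]
    sw = sqrt.(w)
    return ((sw .* X) \ (sw .* y))[1]
end

# Hypothetical helper: sharp RD point estimate with triangular kernel weights.
function triangular_rd(z, y; cutoff=0.0, h=1.0)
    x = z .- cutoff
    w = triangular_weight.(x, h)
    left = (x .< 0) .& (w .> 0)
    right = (x .>= 0) .& (w .> 0)
    return weighted_intercept(x[right], y[right], w[right]) -
           weighted_intercept(x[left], y[left], w[left])
end

# Synthetic data: linear trend with slope 0.2 plus a jump of 0.3 at z = 0.
z = [-0.4, -0.3, -0.2, -0.1, 0.1, 0.2, 0.3, 0.4]
y = [0.5 + 0.2 * zi + (zi >= 0 ? 0.3 : 0.0) for zi in z]
tau_hat_tri = triangular_rd(z, y; h=0.5)
```

On exactly linear data the choice of kernel does not change the point estimate. On real data such as the Lee dataset, the kernel changes both the selected bandwidth and the estimate, as the two fits above show.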
Publications
Imbens, Guido, and Karthik Kalyanaraman. "Optimal bandwidth choice for the regression discontinuity estimator." The Review of Economic Studies 79.3 (2012): 933-959.
Lee, David S. "Randomized experiments from non-random selection in US House elections." Journal of Econometrics 142.2 (2008): 675-697.
Related Julia packages
GeoRDD.jl: Package for spatial regression discontinuity designs.
Related R packages