Learning Portal

Learning portal - Regression discontinuity design

Regression discontinuity design (RDD) is an evaluation method used mainly in programme assessments. It exploits a cut-off point in the eligibility criteria to compare units just above and just below the threshold. Because these units are similar, the comparison gives an estimate of programme effects with little selection bias.

Basics

In a nutshell

The regression discontinuity design (RDD) is a quasi-experimental evaluation method.

The technique can be used to assess the effects of programmes or measures that have a continuous eligibility index with a clearly defined cut-off value determining which farms, enterprises, holdings or communities are eligible and which are not. The main idea behind this design is that units in the target population just below the cut-off (not benefitting from an intervention) are good comparisons for those just above the cut-off (exposed to an intervention). In this setting, the analyst can evaluate the outcome of an intervention by comparing the average value of the outcome indicator for the recipients just above the cut-off with the corresponding value for non-recipients just below it. Under certain comparability conditions, the assignment near the cut-off can be seen as almost random. The RDD method assumes that individual units on both sides of the eligibility cut-off point are similar, so the selection bias should be minimal.
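
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than a full estimator: the function name `local_mean_difference`, the window width and the data are made up for the example, and it assumes that units at or above the cut-off receive the intervention.

```python
from statistics import mean


def local_mean_difference(index, outcome, cutoff, bandwidth):
    """Mean outcome just above the cut-off minus mean outcome just below it.

    Assumption for this sketch: units with index >= cutoff are treated.
    Only units within `bandwidth` of the cut-off are compared.
    """
    above = [y for x, y in zip(index, outcome) if cutoff <= x <= cutoff + bandwidth]
    below = [y for x, y in zip(index, outcome) if cutoff - bandwidth <= x < cutoff]
    if not above or not below:
        raise ValueError("no observations on one side of the cut-off")
    return mean(above) - mean(below)


# Made-up data: eligibility scores and an outcome indicator for eight units.
index = [42, 45, 48, 49, 50, 51, 53, 58]
outcome = [10.0, 11.0, 12.0, 12.0, 15.0, 15.0, 16.0, 19.0]

effect = local_mean_difference(index, outcome, cutoff=50, bandwidth=5)
print(round(effect, 2))
```

A plain difference in means ignores any slope of the outcome in the eligibility index inside the window, which is why regression-based estimates are normally preferred in practice.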

Pros and cons

Advantages

  • Can be used for programmes with a continuous eligibility index and clearly defined cut-off scores to determine who is eligible and who is not.
  • Similar to an experiment near the cut-off for selection.
  • Identifies causal effects without arbitrary assumptions on the selection process (without excluding any eligible population).

Disadvantages

  • Estimated impacts are only valid around the eligibility cut-off score. They say little about the effect of the programme on units located far away from the cut-off point (in either direction).
  • Limited in considering other factors that might influence selection bias.
  • Determination of the bandwidth around the cut-off is rather arbitrary.
  • Not suitable for estimating the average treatment effect on the treated (ATT) for all programme beneficiaries.
  • Limited external validity (produces local average treatment effects that cannot be generalised).

When to use?

RDD can be used when a programme sets a distinct eligibility threshold (separating who is eligible from who is not) on the basis of one or more criteria. To apply the RDD, two main conditions are needed:

  1. a continuous eligibility index on which the population of interest can be ranked; and
  2. a clearly defined cut-off point on the index, above or below which the population is classified as eligible for a programme or measure.

Depending on the nature of the eligibility rule (whether it changed over time), panel or cross-sectional data should be used. The main steps in implementing RDD methods should include: a) specifying the eligibility index and the cut-off value; b) choosing the size of the bandwidth around the cut-off value; c) checking the type of discontinuity; d) estimating programme effects; e) discussing the external validity of the obtained results.
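
Checking the type of discontinuity (step c) can be sketched as follows. The terms "sharp" (the share of treated units jumps from 0 to 1 at the cut-off) and "fuzzy" (the share jumps by less than 1) are standard RDD terminology rather than definitions taken from this page, and the function names, window and data below are hypothetical.

```python
def treatment_share_jump(index, treated, cutoff, bandwidth):
    """Share of treated units just below and just above the cut-off, and the jump."""
    above = [t for x, t in zip(index, treated) if cutoff <= x <= cutoff + bandwidth]
    below = [t for x, t in zip(index, treated) if cutoff - bandwidth <= x < cutoff]
    if not above or not below:
        raise ValueError("no observations on one side of the cut-off")
    share_above = sum(above) / len(above)
    share_below = sum(below) / len(below)
    return share_below, share_above, share_above - share_below


def discontinuity_type(jump, tol=1e-9):
    """Label the design by the size of the jump in the treated share."""
    if abs(abs(jump) - 1.0) <= tol:
        return "sharp"
    if abs(jump) <= tol:
        return "none"
    return "fuzzy"


# Made-up data: 1 = received support, 0 = did not.
index = [44, 46, 48, 49, 50, 52, 54, 56]
treated = [0, 0, 0, 0, 1, 1, 0, 1]

below, above, jump = treatment_share_jump(index, treated, cutoff=50, bandwidth=10)
print(below, above, discontinuity_type(jump))
```

In this made-up sample one eligible unit did not take up support, so the share of treated units rises by less than 1 at the cut-off and the design is labelled fuzzy.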

Preconditions

  • Available dataset containing the eligibility index and observations on eligible and non-eligible units.
  • Time series of cross-sectional data.

The technique can be applied to assess the effect of CAP support on the evolution of the impact indicators listed in the following table.

| RDP impact indicator | CAP Strategic Plan impact indicator |
| I.01 - Agricultural entrepreneurial income | I.2 - Evolution of agricultural income compared to the general economy |
| I.02 - Agricultural factor income | I.3 - Evolution of agricultural income |
| | I.5 - Evolution of agricultural income in areas with natural constraints (compared to the average) |
| I.03 - Total factor productivity in agriculture | I.6 - Total factor productivity in agriculture |
| I.07 - Emissions from agriculture | I.10 - Greenhouse gas emissions from agriculture; I.14 - Ammonia emissions from agriculture |
| I.08 - Farmland bird index | I.19 - Farmland bird index |
| I.09 - High nature value (HNV) farming | |
| I.10 - Water abstraction in agriculture | I.17 - Water Exploitation Index Plus (WEI+) |
| I.11 - Water quality | I.15 - Gross nutrient balance on agricultural land; I.16 - Nitrates in ground water |
| I.13 - Soil erosion by water | I.13 - Percentage of agricultural land in moderate and severe soil erosion |

Step-by-step

  • Step 1 – Ensure that treatment is assigned exclusively on the basis of a cut-off value on an eligibility index.
  • Step 2 – RDD analysis should begin with a graphical presentation in which the value of the outcome indicator for each data point is plotted on the vertical axis, and the corresponding value of the eligibility index is plotted on the horizontal axis. The graphical presentation provides a powerful visualisation of discontinuity (‘jump’) in the outcome indicator value at the cut-off point.
    • It is recommended to proceed with the graphical analysis in four steps:
      • Divide the eligibility index into a number of equal-sized intervals, which are often referred to as ‘bins’;
      • Calculate the average value of the outcome indicator and the midpoint value of the eligibility index for each bin and count the number of observations in each bin;
      • Plot the average outcome indicator values for each bin on the Y-axis against the midpoint values of the eligibility index for each bin on the X-axis, using the number of observations in each bin as the weight. The size of a plotted data point should reflect the number of observations contained in the corresponding bin;
      • To help readers better visualise whatever patterns exist in the data, the evaluation can superimpose flexible regression lines on top of the plotted data. This also provides a visual sense of the amount of noise in the data.
    • A challenge of the graphical assessment is to determine the bin width. There are some recommended formal tests which can be found in Jacob et al. (2012).
  • Step 3 – Next, the evaluator can formally estimate the treatment effects using an RDD. There are two types of strategies for correctly specifying the functional form between an eligibility index and outcome indicator:
    • Discontinuity at the cut-off point: This parametric strategy uses every observation in the sample to model the value of the outcome indicator as a function of the eligibility index and treatment status. This method considers all available observations, including those far from the cut-off value, to estimate the average value of the outcome indicator for the observations near the cut-off value. To minimise bias, different functional forms for the eligibility index (linear, quadratic, cubic), as well as interactions with treatment, are tested by inspecting the residuals and conducting F-tests on interaction terms.
    • Local randomisation: This nonparametric/local strategy adopts local randomisation for the estimation of treatment effects and limits the analysis to observations that lie within the bandwidth of the cut-off value, where the functional form is more likely to be linear. The main challenge here is selecting the right bandwidth. Once the bandwidth is selected, a linear (or polynomial) regression is estimated, using observations within one bandwidth on either side of the cut-off value.
  • Step 4 – Assess the internal validity of RDD impact estimates. If the cut-off value is chosen using information about the observed eligibility index, evaluators can set this value in a way that includes or excludes specific observations. Conversely, if the eligibility index of each observation is determined with knowledge of the cut-off value, the index can be manipulated to include or exclude specific observations. The methods that researchers can use to determine whether the eligibility indexes or the cut-off values could have been manipulated (that is, whether or not an RDD is internally valid) include:
    • Examination of the implementation process;
    • Plotting the probability of receiving treatment as a function of the eligibility index. For a valid RDD, there should be a discontinuity (or ‘jump’) at the cut-off value in the probability of receiving treatment;
    • Plotting the relationship between non-outcome variables and an eligibility index. Non-outcome variables here refer mainly to potential covariates that, according to the theory of action, should not be affected by a treatment.
  • Step 5 – Assess the precision of the estimates obtained from an RDD. This is something that is particularly relevant when using an existing data set. The precision of estimated treatment effects is typically expressed in terms of a minimum detectable effect (MDE) or a minimum detectable effect size.
  • Step 6 – The results of RDD should be accompanied by a critical discussion of the obtained evidence including triangulation with other quantitative and qualitative findings.
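
The binning described in Step 2 can be sketched as below. The function name `bin_outcomes` and the data are hypothetical, and aligning the bins so that the cut-off is always a bin edge (no bin mixes units from both sides) is an assumption of this sketch rather than something the steps prescribe.

```python
import math
from collections import defaultdict


def bin_outcomes(index, outcome, cutoff, bin_width):
    """Return one (midpoint, mean outcome, count) tuple per non-empty bin.

    Bins are equal-sized and aligned on the cut-off (assumption of this sketch).
    """
    groups = defaultdict(list)
    for x, y in zip(index, outcome):
        bin_number = math.floor((x - cutoff) / bin_width)
        groups[bin_number].append(y)

    rows = []
    for bin_number in sorted(groups):
        values = groups[bin_number]
        midpoint = cutoff + (bin_number + 0.5) * bin_width
        rows.append((midpoint, sum(values) / len(values), len(values)))
    return rows


# Made-up data: eligibility scores and an outcome indicator.
index = [41, 43, 46, 49, 50, 52, 57, 58]
outcome = [10.0, 12.0, 11.0, 13.0, 16.0, 18.0, 19.0, 21.0]

rows = bin_outcomes(index, outcome, cutoff=50, bin_width=5)
for midpoint, average, count in rows:
    print(midpoint, average, count)
```

Each tuple supplies one plotted point: the midpoint on the X-axis, the bin average on the Y-axis, and the count for the marker size.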
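
The local strategy in Step 3 can be sketched as two straight-line fits, one on each side of the cut-off, using only observations within the bandwidth. The estimated effect is the gap between the two fitted lines at the cut-off. This is a minimal sketch under stated assumptions: the function names and data are hypothetical, all observations inside the bandwidth get equal weight, units at or above the cut-off are treated, and the bandwidth is simply given rather than selected by a formal procedure.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("index values must vary within the window")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def local_linear_rdd(index, outcome, cutoff, bandwidth):
    """Difference between the right and left fitted values at the cut-off."""
    # Centre the index so that each intercept is the fitted value at the cut-off.
    centred = [(x - cutoff, y) for x, y in zip(index, outcome)]
    left = [(d, y) for d, y in centred if -bandwidth <= d < 0]
    right = [(d, y) for d, y in centred if 0 <= d <= bandwidth]
    left_intercept, _ = fit_line([d for d, _ in left], [y for _, y in left])
    right_intercept, _ = fit_line([d for d, _ in right], [y for _, y in right])
    return right_intercept - left_intercept


# Made-up data: linear near the cut-off with a jump of 3, plus two distant
# observations (30 and 70) that fall outside the bandwidth and are ignored.
index = [30, 46, 47, 48, 49, 50, 51, 52, 53, 70]
outcome = [40.0, 0.0, 0.5, 1.0, 1.5, 5.0, 5.5, 6.0, 6.5, 0.0]

effect = local_linear_rdd(index, outcome, cutoff=50, bandwidth=4)
print(round(effect, 2))
```

Fitting separate lines on each side is equivalent to a single regression with a treatment dummy and an interaction between treatment and the centred index. The parametric strategy would use all observations instead and test higher-order terms.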
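
For Step 5, a commonly used approximation expresses the minimum detectable effect as a multiple of the standard error of the impact estimate, with the multiplier set by the significance level and the desired power. The sketch below assumes a two-sided test and a normal approximation. The default values (5% significance, 80% power), the function names and the example figures are illustrative assumptions, not values given on this page.

```python
from statistics import NormalDist


def minimum_detectable_effect(standard_error, alpha=0.05, power=0.80):
    """MDE = (z for 1 - alpha/2 plus z for power) * standard error of the estimate."""
    standard_normal = NormalDist()
    multiplier = standard_normal.inv_cdf(1 - alpha / 2) + standard_normal.inv_cdf(power)
    return multiplier * standard_error


def minimum_detectable_effect_size(standard_error, outcome_sd, alpha=0.05, power=0.80):
    """MDE expressed in standard deviations of the outcome indicator."""
    return minimum_detectable_effect(standard_error, alpha, power) / outcome_sd


# Made-up figures: standard error of the RDD estimate and outcome standard deviation.
mde = minimum_detectable_effect(standard_error=0.5)
mdes = minimum_detectable_effect_size(standard_error=0.5, outcome_sd=4.0)
print(round(mde, 3), round(mdes, 3))
```

With these defaults the multiplier is about 2.8, so an estimate with a standard error of 0.5 can only reliably detect effects of roughly 1.4 or larger.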

Main takeaway points

  • RDD is distinct due to its use of eligibility cut-offs for programme evaluation.
  • By comparing units just above and below the cut-off, RDD minimises selection bias.
  • Essential conditions for RDD include a continuous eligibility index and a clear cut-off point.
  • RDD differs from other methods by focusing on localised impacts rather than generalising effects.
  • It offers valuable insights into the causal effects of specific interventions or programmes.

Further reading

Publication - Guidelines and tools

Assessing RDP Achievements and Impacts in 2019