Learning portal - Propensity score matching

Propensity score matching (PSM) is a quasi-experimental evaluation method whose main feature is that it minimises the selection bias caused by observable characteristics in programme evaluations. It supports accurate and reliable comparisons between programme participants and non-participants, which enhances the credibility of the results.

Basics

In a nutshell

Advanced tool

Matching methods are part of the quasi-experimental evaluation family of methods. The propensity score matching (PSM) technique is currently one of the most advanced and effective tools applied in the evaluation of various programmes. One can distinguish two types of PSM approaches:

  1. standard, conventional or binary PSM; and
  2. generalised PSM.

Minimising the bias

PSM is a powerful quasi-experimental approach that can minimise selection bias arising from the observable characteristics of participants and non-participants. Selection bias occurs when participants in a programme are systematically different from non-participants. These systematic differences may be due to specific characteristics of participants and non-participants, which can be observed using variables available in the corresponding data sources (observables). For example, farms in certain sectors or with a specific economic or physical size may be more motivated to participate in a programme. This bias can be minimised by matching participants with non-participants who, based on their observable characteristics, had the same probability of participating in a programme but decided not to do so. This probability is measured by the so-called propensity score, which is calculated for each unit based on a set of observable characteristics not affected by a programme.
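As an illustration of how a propensity score can be calculated, the sketch below fits a logistic regression of participation on a single pre-programme covariate. The logistic model, the synthetic data and all function names are assumptions made for this example; the text above does not prescribe a particular model, and in practice a statistical package would normally be used.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_propensity_model(X, treated, lr=0.1, epochs=2000):
    """Fit a logistic regression by batch gradient descent.

    X       : list of covariate rows (pre-programme characteristics)
    treated : list of 0/1 participation flags
    Returns the weights, with the intercept first.
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, t in zip(X, treated):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))
            err = p - t
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity_scores(X, w):
    """Estimated probability of participation for each unit."""
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))
            for row in X]


# Hypothetical data: one standardised covariate (e.g. farm size), where
# larger farms are made more likely to participate.
random.seed(0)
X = [[random.gauss(0, 1)] for _ in range(400)]
treated = [1 if random.random() < sigmoid(-0.5 + 1.2 * x[0]) else 0 for x in X]

w = fit_propensity_model(X, treated)
scores = propensity_scores(X, w)  # one score per unit, each between 0 and 1
```

Only covariates measured before the programme should enter `X`, in line with the requirement that the characteristics used are not affected by the programme.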

Comparing the similar

In the case of binary PSM, the idea is to find, within a group of non-participants, units that are observationally similar to programme participants in terms of pre-programme characteristics. Each participant is matched with one or more observationally similar non-participants based on their corresponding propensity scores. Then the average outcome difference across the two groups is calculated to estimate the programme’s average treatment effect. In practice, different techniques or algorithms can be used to match participants and non-participants on the basis of the propensity score. They include nearest neighbour (NN) matching, calliper and radius matching, stratification and interval matching, kernel matching and local linear matching (LLM).
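A minimal sketch of one of these algorithms, one-to-one nearest neighbour matching with replacement and a calliper, is given below. The function name, the calliper width of 0.05 and the toy figures are hypothetical and serve only to show the mechanics.

```python
def nearest_neighbour_att(scores, treated, outcomes, calliper=0.05):
    """One-to-one nearest neighbour matching with replacement.

    Each participant is paired with the non-participant whose propensity
    score is closest; pairs further apart than `calliper` are dropped.
    Returns (average outcome difference over matched pairs, pairs kept).
    """
    controls = [(s, y) for s, t, y in zip(scores, treated, outcomes) if t == 0]
    if not controls:
        raise ValueError("no non-participants available for matching")
    diffs = []
    for s, t, y in zip(scores, treated, outcomes):
        if t != 1:
            continue
        c_score, c_outcome = min(controls, key=lambda c: abs(c[0] - s))
        if abs(c_score - s) <= calliper:
            diffs.append(y - c_outcome)
    if not diffs:
        raise ValueError("no participant could be matched within the calliper")
    return sum(diffs) / len(diffs), len(diffs)


# Hypothetical toy data: three participants, three non-participants.
scores = [0.80, 0.60, 0.30, 0.78, 0.62, 0.10]
treated = [1, 1, 1, 0, 0, 0]
outcomes = [12.0, 10.0, 9.0, 9.0, 8.0, 5.0]

att, n_matched = nearest_neighbour_att(scores, treated, outcomes)
# att == 2.5, n_matched == 2: the participant with score 0.30 has no
# non-participant within the calliper and is left out of the estimate.
```

Matching with replacement is used here only to keep the example short; matching without replacement, or with several neighbours, follows the same logic.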

Pros and cons

Advantages

  • Effective in finding appropriate control groups (counterfactuals).
  • Works well when programme participation is affected solely by a unit’s observable characteristics (e.g. farm, person, community, region).
  • If a selection bias from unobserved characteristics is likely negligible, then PSM provides a good comparison with randomised estimates.
  • Allows for straightforward computation of indices like ATT (average treatment effect on treated), ATE (average treatment effect) and ATNT (average treatment effect on non-treated).

Disadvantages

  • Relies heavily on the assumption that observable characteristics used to estimate the propensity score explain all differences between the supported units and the comparison group prior to programme implementation.
  • Cannot generate plausible results when there are omissions of observable characteristics explaining differences in performance.
  • PSM is a data-intensive method.
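The ATT, ATE and ATNT indices named in the list above can be written in the usual potential-outcomes notation. The symbols are assumptions introduced here and do not appear in the original text: $Y(1)$ and $Y(0)$ are a unit's outcomes with and without support, and $D$ equals 1 for participants and 0 for non-participants.

```latex
\begin{aligned}
\mathrm{ATE}  &= E\left[\,Y(1) - Y(0)\,\right] \\
\mathrm{ATT}  &= E\left[\,Y(1) - Y(0) \mid D = 1\,\right] \\
\mathrm{ATNT} &= E\left[\,Y(1) - Y(0) \mid D = 0\,\right] \\
\mathrm{ATE}  &= P(D = 1)\,\mathrm{ATT} + P(D = 0)\,\mathrm{ATNT}
\end{aligned}
```

The last line follows from the law of total expectation: the overall effect is the participation-weighted average of the effects on participants and on non-participants.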

When to use?

This technique can be used when sufficient data are available before the implementation of a programme to match participants with non-participants. It assumes that the probability of participating in a programme is affected mainly by characteristics that can be observed and described using corresponding variables available for both groups. Moreover, this technique can only be used when it is possible to calculate the value of the impact indicator(s) at the time of the evaluation, and not before the implementation of a programme.

When applying a binary PSM method to identify a valid control group, one must be sure there are no systematic differences in unobserved characteristics between the units supported by a programme and the matched comparison units that could influence the outcome. The method will also not generate reasonable results if important observable characteristics that explain the differences are left out of the model.

Preconditions

  • Good understanding of conditions determining the probability of participating in a programme.
  • Abundant data on programme participants and non-participants before the programme’s implementation, which allows for the observation of the main characteristics that affect the probability of participation in a programme.
  • Ability to calculate the value of impact indicators for the matched groups of participants and non-participants during an evaluation.
  • High quantitative skills of the evaluator.

The technique can be applied to assess the effect of CAP support on the evolution of the values of the impact indicators listed in the following table.

RDP impact indicator | CAP Strategic Plan impact indicator
I.01 - Agricultural entrepreneurial income | I.2 - Evolution of agricultural income compared to the general economy
I.02 - Agricultural factor income | I.3 - Evolution of agricultural income
 | I.4 - Evolution of agricultural income level by type of farming (compared to the average in agriculture)
 | I.5 - Evolution of agricultural income in areas with natural constraints (compared to the average)
I.03 - Total factor productivity in agriculture | I.6 - Total factor productivity in agriculture
I.07 - Emissions from agriculture | I.10 - Greenhouse gas emissions from agriculture; I.14 - Ammonia emissions from agriculture
I.08 - Farmland bird index | I.19 - Farmland bird index
I.09 - High nature value (HNV) farming |
I.10 - Water abstraction in agriculture | I.17 - Water Exploitation Index Plus (WEI+)
I.11 - Water quality | I.15 - Gross nutrient balance on agricultural land; I.16 - Nitrates in groundwater
I.13 - Soil erosion by water | I.13 - Percentage of agricultural land in moderate and severe soil erosion
I.14 - Rural employment rate | I.24 - Evolution of the employment rate in rural areas, including a gender breakdown
I.15 - Degree of rural poverty | I.27 - Evolution of poverty index in rural areas
I.16 - Rural GDP per capita | I.25 - Evolution of gross domestic product (GDP) per capita in rural areas

Specifically for the Farmland Bird Index, an assessment approach conducted at the field/plot scale level (micro level) can be realised using the Common Birds Monitoring Programme, if sufficient data is available. This can be achieved by applying the PSM technique to match beneficiaries and non-beneficiaries and then comparing the average CAP support effect on biodiversity in each group. At the macro level, PSM is also recommended to net out the CAP support effect on biodiversity at the level of quadrants used to observe the farmland birds’ populations under the Common Birds Monitoring Programme. The quadrants can be used as functional units for the Farmland Bird Index and later calculated by bio-geographical areas (different agricultural habitats) or at the regional level on the basis of geo-referenced data.

Regarding indicators measuring employment, GDP per capita and poverty rate in rural areas, beneficiary and control groups can be constructed based on supported and non-supported geographical regions, ideally LAU 2, as defined by the Eurostat urban-rural typology. In this case, access to a comprehensive source of statistical data and information about characteristics of geographical regions before the implementation of the programme is required.

Step-by-step

  • Step 1 – Find a sample of beneficiaries (e.g. farms, farmers, non-farming enterprises, communities, areas, regions, etc.) in an available database (e.g. FADN) and use the monitoring and evaluation electronic system (e.g. the electronic information system of Article 70, Regulation (EU) 1305/2013, or the one of Article 130, Regulation (EU) 2021/2115) as a reference point.
  • Step 2 – In the same database (e.g. FADN), select all relevant units that did not receive support or received an arbitrarily low level of support in the same period (non-beneficiaries). In the latter case, it may be better to define the low level not in absolute terms but as support normalised by another variable, such as the utilised agricultural area of the farm or the area or population of a geographical region.
  • Step 3 – In a group of non-beneficiaries, identify those units which could not fulfil the programme eligibility conditions (due to high income, size, location, etc.) and remove them from the analysis.
  • Step 4 – Collect data for all units in both groups (beneficiaries and non-beneficiaries) on their major characteristics (variables) at the start of an implementation period. Note that the variables included in the analysis should affect both the selection of a unit as well as the indicators computed at the micro-level (e.g. impact indicators). Some of the proposed variables (used as important control variables) can be:
    • the level of support received by a given unit during previous programming periods; and/or
    • the level of support received by a given unit from other public sources (e.g. EU structural funds, Pillar I) in the analysed period.
  • Step 5 – Apply appropriate techniques to identify a suitable control group from the sample of non-beneficiaries (see Steps 2-3), the members of which have the same propensity to participate in a programme (some of the non-beneficiaries and/or beneficiaries will be dropped from the analysis due to a lack of adequate control units).
  • Step 6 – Check statistically the similarity of both groups prior to receiving support from a programme (e.g. by performing statistical tests on covariates included in the analysis). The average value of each covariate in the beneficiary group should not differ significantly from its average value in the control group. Once the beneficiaries’ group and the control group have been constructed, the net effect of the support can be estimated through the subsequent steps.
  • Step 7 – Compute the average value of common or additional impact indicator(s) both for the group of beneficiaries and the control group when the evaluation is carried out.
  • Step 8 – Calculate the net effect as the difference in the average value of the common or additional impact indicator(s) between the beneficiaries’ group and the control group (average treatment effect on treated, ATT).
  • Step 9 – Perform a sensitivity analysis (e.g. Rosenbaum bounding approach) to assess the possible effects of unobservables on obtained results.
  • Step 10 – Aggregate the findings and calculate the CAP support effects on the analysed impact indicators at the macro and programme area level. In this step, the evaluator should calculate net direct effects of the CAP support on the impact indicators at the programme area level by applying extrapolation techniques (i.e. multiplying average results computed at the micro level by the number of beneficiaries/non-beneficiaries).
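Steps 6, 8 and 10 can be sketched as follows for groups that have already been matched. The standardised mean difference is used here as one possible balance check, and the 0.1 threshold is a common rule of thumb, not a requirement stated above. All figures, variable names and the beneficiary count of 1 200 are hypothetical.

```python
import math
import statistics


def standardised_mean_difference(beneficiaries, controls):
    """Step 6: difference in covariate means over the pooled standard deviation."""
    pooled_var = (statistics.variance(beneficiaries)
                  + statistics.variance(controls)) / 2
    return ((statistics.mean(beneficiaries) - statistics.mean(controls))
            / math.sqrt(pooled_var))


def net_effect(beneficiary_outcomes, control_outcomes):
    """Step 8: difference in the average indicator value (ATT)."""
    return statistics.mean(beneficiary_outcomes) - statistics.mean(control_outcomes)


def extrapolate(att, n_beneficiaries):
    """Step 10: scale the average micro-level effect to the programme area."""
    return att * n_beneficiaries


# Hypothetical matched groups: farm size (ha) before the programme and
# income (thousand EUR) at the time of the evaluation.
size_b = [50.0, 60.0, 70.0, 80.0]
size_c = [52.0, 61.0, 69.0, 82.0]
income_b = [30.0, 34.0, 38.0, 42.0]
income_c = [28.0, 31.0, 35.0, 38.0]

smd = standardised_mean_difference(size_b, size_c)
balanced = abs(smd) < 0.1                # assumed rule-of-thumb threshold

att = net_effect(income_b, income_c)     # 3.0 thousand EUR per farm
total = extrapolate(att, 1200)           # 3600.0 thousand EUR
```

A full application would repeat the balance check for every covariate in the propensity score model and would complement the point estimate with the sensitivity analysis of Step 9, which is not covered by this sketch.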

Main takeaway points

  • PSM stands out for its ability to reduce selection bias by comparing participants with non-participants based on observable characteristics.
  • Binary PSM focuses on finding non-participants similar to participants in pre-programme traits, while generalised PSM extends this comparison.
  • PSM effectively controls selection bias by matching groups based on their propensity scores, calculated from observable characteristics.
  • PSM is highly applicable in evaluating the effectiveness of various programmes, particularly in contexts where pre-programme data is abundant.
  • ATT, ATE and ATNT indices are computed to measure the programme’s impact on participants and the overall population.

Further reading