Navigating the quantitative methods for CAP evaluation
This page provides guidance for non-experts on the quantitative methods used in CAP evaluation. It offers practical hints to support Managing Authorities (MAs), evaluators, policy analysts and stakeholders in navigating the many different quantitative methods used in CAP evaluation.
It is important to emphasise that:
- This section is intended for non-experts. Method selection, implementation and results interpretation require professional evaluators with econometric backgrounds.
- Unit of analysis: examples primarily use farms as the unit affected by policy, but principles apply to other beneficiaries (e.g. Producer Organisations).
- Data requirements: advanced methods require large, high-quality datasets. Data availability often constrains methodological choices.
This guide focuses only on ex post evaluation, i.e. evaluation based on historical data observed after the policy under analysis has been implemented (not on data generated by simulation models).
Basics
Foundational concepts
Evaluation methods can be classified according to several key dimensions, including:
- Ex ante vs. ex post evaluation
- Quantitative vs. qualitative methods
- Cross-section vs. panel data
- Correlation vs. causation
Each dimension is briefly presented below.
Ex ante vs. ex post evaluation
Policy evaluation operates along two complementary timelines. Ex ante analysis forecasts potential impacts before implementation, thereby supporting evidence-based policy design, while ex post evaluation validates whether policies achieved their intended effects in practice by examining actual outcomes after implementation. Together, they create an essential feedback loop for policy development, adjustment and redesign.
Ex ante evaluation: this prospective approach forecasts policy impacts before implementation using modelling approaches, simulations, and economic projections. Ex ante evaluations support policy design and inform choices among alternative interventions.
Ex post evaluation: this retrospective approach analyses actual outcomes after implementation has occurred, comparing real-world empirical data against counterfactual scenarios (what would have happened without the intervention). Ex post evaluations provide evidence on whether intended impacts materialised and to what extent.
Example – Consider the introduction of a new eco-scheme under the CAP:
- Ex ante evaluation: before launch, agricultural and economic models predict how farmers might change their crop choices, production practices and income levels in response to the scheme.
- Ex post evaluation: after the scheme has operated for several years, evaluators examine actual farm data from beneficiaries and comparable non-beneficiaries to determine whether predicted impacts occurred and at what magnitude.
Quantitative vs. qualitative methods
In CAP evaluation, quantitative and qualitative methods offer distinct but complementary approaches to understanding policy impacts. Quantitative methods focus on measuring the size and statistical significance of changes, providing clear answers to ‘how much’ a policy has affected outcomes. Qualitative methods, meanwhile, delve into the underlying reasons, exploring ‘why’ and ‘how’ policies succeed or face challenges in different contexts. Together, these approaches provide a fuller picture: quantitative analysis reveals what happened and to what extent, while qualitative analysis explains the mechanisms and conditions behind those results.
Quantitative methods employ numerical data, statistical analysis and mathematical models to assess relationships among variables and estimate policy impacts with statistical precision. They are used to determine whether observed effects are statistically significant and to measure the magnitude of changes, such as shifts in yields or income. By relying on large datasets and rigorous statistical techniques, quantitative approaches provide objective, generalisable findings that are essential for evaluating the effectiveness of policies at scale.
Qualitative methods utilise interviews, surveys, focus groups, case studies, document analysis and textual interpretation to understand the mechanisms of implementation, contextual factors and stakeholder perspectives. These approaches are invaluable for uncovering barriers, enabling conditions, and the reasons behind observed outcomes – insights that numbers alone cannot capture. Principal qualitative methods in CAP evaluation include actor-network analysis, contribution analysis, cost-benefit and cost-effectiveness analysis in the context of CAP evaluation, the innovation capacity scoring tool, knowledge mapping, method for impact assessment of programmes and projects (MAPP), most significant change, outcome mapping, rapid appraisal of agricultural innovation systems (RAAIS), social network analysis, stakeholder mapping, theory-based approaches and visualised AKIS mapping, among others.
Example – Suppose an agricultural policy has introduced new subsidies for eco-friendly farming practices.
- A quantitative evaluation might use data from farm records to calculate how these subsidies affected crop yields or farm income across a region, determining the average change and its statistical significance.
- A qualitative evaluation would involve interviewing farmers and local officials to explore why some farmers adopted new practices quickly, which factors slowed adoption for others and how local traditions or community networks influenced outcomes.
Cross-section vs. panel data
The key distinction between cross-sectional and panel data lies in what they allow us to compare. Cross-sectional data provides a snapshot, enabling comparisons between different units (such as farms) at a single point in time. In contrast, panel data follows the same units over multiple time periods, allowing us to observe changes within those units and better understand the effects of policies or interventions over time. Together, these data types offer complementary perspectives: cross-sectional data highlights differences across units at a single point in time, while panel data reveals how outcomes evolve for the same units.
Cross-sectional data captures information from multiple units (e.g. farms) at a single point in time. This approach answers questions like, ‘How are participants different from non-participants right now?’ It helps identify patterns and differences across units, but is limited in its ability to establish causal relationships or account for pre-existing differences. Because it provides only a snapshot, cross-sectional data cannot distinguish whether observed differences are due to a policy or to other factors, such as location or managerial ability.
Panel data tracks the same units repeatedly over several time periods (for example, from 2018 to 2023). This method allows researchers to observe changes within units that occur after a policy is introduced. This approach answers questions like, ‘How did the farms’ outcomes change after the policy was introduced?’ Panel data is especially valuable because it controls for stable, unchanging characteristics (such as soil quality, location or inherent managerial ability), helping to isolate the actual effect of a policy or intervention.
Example – Suppose we want to evaluate whether a new CAP eco-scheme (a voluntary programme launched in 2023) increases farm income.
- Using cross-sectional data: an evaluator collects data from 5 000 farms in 2023 and compares the average income of the 2 500 farms that joined the eco-scheme to the 2 500 that did not. While this shows differences in income (e.g. participating farms have higher incomes), it cannot tell us whether the eco-scheme caused the difference or whether more innovative, profitable, or better-located farms were more likely to sign up in the first place. The ‘snapshot’ cannot separate the policy’s effect from these pre-existing advantages.
- Using panel data: an evaluator would collect data from the same 5 000 farms every year from 2020 to 2024. By tracking each farm’s income before and after the eco-scheme was introduced, they can compare changes in income for participating and non-participating farms. This approach helps control for stable, unchanging farm characteristics (such as soil quality or a farmer’s innate skill) by comparing each farm against its own past performance. Hence, it provides a clearer picture of the eco-scheme’s actual impact.
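To make the contrast concrete, the minimal Python sketch below computes both comparisons on a hypothetical panel dataset; the file name and the columns (farm_id, year, income, participant) are illustrative assumptions, not a reference to any official data source.

```python
import pandas as pd

# Hypothetical long-format panel: one row per farm per year, with columns
# farm_id, year, income, participant (1 if the farm joined the eco-scheme).
df = pd.read_csv("farm_panel.csv")

# Cross-sectional comparison: participants vs. non-participants in 2023 only.
snapshot = df[df["year"] == 2023]
print(snapshot.groupby("participant")["income"].mean())
# The raw gap may simply reflect pre-existing differences between the groups.

# Panel comparison: each farm's own change from pre (2020-2022) to post (2023-2024).
df["period"] = df["year"].map(lambda y: "post" if y >= 2023 else "pre")
within = (
    df.pivot_table(index=["farm_id", "participant"],
                   columns="period", values="income", aggfunc="mean")
      .assign(change=lambda t: t["post"] - t["pre"])
)
print(within.groupby("participant")["change"].mean())
# Within-farm changes net out stable characteristics such as soil quality.
```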
Correlation vs. causation
Understanding the difference between correlation and causation is essential in policy evaluation. Correlation describes a statistical association between two variables – when one changes, the other tends to change as well. However, correlation alone does not prove that one variable causes the other to change. Causation, in contrast, means that changes in one variable directly produce changes in another, establishing a specific direction of effect. While correlation is necessary for causation, it is not sufficient; establishing causation requires ruling out alternative explanations and demonstrating a direct link.
Correlation indicates that two variables move together in a predictable way. This relationship can be positive (both increase or decrease together), negative (one increases as the other decreases) or statistically indistinguishable from zero. Correlation helps identify patterns and potential relationships, but it does not explain why the relationship exists or whether one variable influences the other.
Causation means that changes in one variable directly result in changes in another. Establishing causation requires more rigorous analysis, including ruling out alternative explanations (such as confounding factors or spurious relationships) and demonstrating that the effect flows in a specific direction. Causal inference often relies on experimental or quasi-experimental methods to isolate the actual effect of a policy or intervention.
Example – Suppose farms receiving higher payments under the basic income support for sustainability (BISS) also show higher average incomes.
This correlation could arise for several reasons:
- Larger farms receive larger payments and are independently more profitable (spurious correlation).
- Payments genuinely cause higher incomes (actual causal effect).
- Both income and payment levels depend on unobserved farm characteristics (confounding).
Rigorous causal methods are needed to systematically isolate the actual policy effect from these alternative explanations.
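The simulation sketch below (in Python, with made-up numbers) illustrates the confounding case: farm size drives both payments and income, so a naive regression of income on payments overstates the payment effect, while controlling for the confounder recovers an estimate close to the true value.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 5_000
size = rng.lognormal(mean=3.0, sigma=0.5, size=n)       # hectares (confounder)
payment = 250 * size + rng.normal(0, 500, n)            # payments scale with size
income = 40 * size + 0.2 * payment + rng.normal(0, 2_000, n)  # true effect: 0.2

# Naive regression (omits the confounder): the coefficient is biased upwards.
naive = sm.OLS(income, sm.add_constant(payment)).fit()

# Controlling for the confounder recovers an estimate close to 0.2.
controlled = sm.OLS(income, sm.add_constant(np.column_stack([payment, size]))).fit()

print(f"naive estimate:      {naive.params[1]:.3f}")
print(f"controlled estimate: {controlled.params[1]:.3f}")
```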
To credibly claim that a CAP policy caused a result, evaluators must avoid falling into one of these three ‘traps’ (the technical terms for these are given in parentheses).
The ‘missing factor’ trap (confounding)
- What it is: sometimes two things seem connected because a third, hidden factor influences both. This ‘hidden factor’ gives the false impression that one thing directly causes the other.
- Why is it tricky? It is easy to mistake coincidence for causation. Ignoring hidden factors may lead us to wrongly attribute policy effects to the intervention, even when something else is actually responsible for the observed changes.
- Example – Suppose the CAP pays higher subsidies to organic farms, and organic farms tend to have higher profits. Did the subsidy make organic farms profitable? Maybe not. Organic farms might be larger or located in better regions – advantages that make them both more likely to apply for subsidies and to earn higher profits. In this case, the real cause of profitability is not the subsidy, but these hidden, unmeasured advantages.

The ‘chicken and egg’ trap (reverse causality)
- What it is: this trap occurs when we assume a policy came first and caused a result, but the opposite is true: the result existed before the policy was implemented.
- Why is it tricky? It confuses the sequence of events, leading us to believe the policy changed something, when in fact it did not.
- Example – Imagine a programme that supports farm investment. If supported farms are those with the highest investments, it may look like the programme caused the investment. However, these farms could have planned to invest before applying for support; their existing willingness to invest led them to seek support. Under these circumstances, the support did not cause the investment – the investment intention came first.

The ‘cherry-picking’ trap (selection bias)
- What it is: this occurs when we compare groups that chose to participate in a policy with those that did not, without accounting for pre-existing differences between the groups. The simple, direct comparison suggests the policy had an effect, when in reality the effect could be due to differences between the groups.
- Why is it tricky? It gives credit to the policy for results that may actually reflect the qualities of those who opted in, not the policy itself.
- Example – Consider a CAP programme to encourage young farmers. If we compare young farmers (who joined the programme) to older farmers (who did not), young farmers may already be more motivated, better educated or more willing to try new techniques. Their better results – higher yields or profits – could be due to these characteristics, not the policy itself.
Classification of quantitative methods
This section provides a brief description of the most commonly used methods. As discussed in the previous section, methods can be classified along several key dimensions. Here, quantitative methods are classified by their capacity to establish correlational vs. causal relationships. Readers interested in further details can consult the portal pages covering each of these approaches, or econometric texts such as Angrist & Pischke (2008, 2015), Cerulli (2022), Cunningham (2021), Greene (2012) and Wooldridge (2010).
Methods that assess correlation
These methods identify statistical associations but do not establish causation.
When to use? Correlation methods are valuable for exploratory analysis, policy monitoring, and when causal methods are not feasible.
Main correlation methods are:
- Ordinary least squares (OLS): a regression method that estimates linear relationships by choosing coefficients that minimise the sum of squared residuals; mainly descriptive if key causal assumptions are not justified.
- Fixed effects (FE) models: panel‑data regression that controls for all time‑invariant characteristics of each unit by using only within‑unit variation over time.
- Difference-GMM (Diff-GMM) and system-GMM (SYS-GMM): dynamic panel estimators that use internal instruments (lags of variables) to address endogeneity in models with lagged dependent variables and long panels.
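As an illustration of the first two methods, the sketch below runs a pooled OLS and a fixed effects regression (via the within transformation) on a hypothetical farm panel; the file and column names are assumptions for illustration. Diff-/Sys-GMM estimation requires specialised routines (e.g. xtabond2 in Stata or the plm package in R) and is not shown.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical farm-year panel with columns farm_id, year, income, payment.
df = pd.read_csv("farm_panel.csv")

# Pooled OLS: a purely descriptive association between payments and income.
ols = smf.ols("income ~ payment", data=df).fit()

# Fixed effects via the within transformation: demean income and payment by
# farm, so only within-farm variation over time drives the estimate.
demeaned = df[["income", "payment"]] - df.groupby("farm_id")[
    ["income", "payment"]].transform("mean")
fe = smf.ols("income ~ payment - 1", data=demeaned).fit()
# Note: this sketch ignores the degrees-of-freedom correction that a dedicated
# panel package would apply to the FE standard errors.

print(ols.params["payment"], fe.params["payment"])
```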
Methods that establish causality
In policy evaluation, assessing causality is far more informative than simply identifying correlations. To rigorously establish causal relationships, advanced statistical methods are required. These methods require both strong econometric skills and access to extensive, high-quality data, as discussed later.
Causal analysis often relies on counterfactual impact evaluation (CIE) approaches, which construct a counterfactual – an estimate of what would have happened in the absence of the intervention. This helps address selection bias and isolate the true effect of a policy or programme.
CIE methods typically compare two groups:
- Treatment group: units (e.g. farms) that are exposed to the policy, programme or intervention being studied.
- Control group: comparable units that are not exposed to the intervention, serving as a benchmark for what would have happened to the treated units without the policy.
By comparing outcomes between these groups, evaluators can more confidently attribute observed changes to the intervention itself, rather than to other factors.
A list of the main methods that establish causality is provided below.
- Randomised controlled trials (RCTs): an experimental design where eligible units are randomly assigned to either a treatment group or a control group. Randomisation ensures that both groups are statistically comparable on observed and unobserved characteristics, allowing the difference in outcomes to be attributed solely to the intervention. RCTs are considered the ‘gold standard’ for causal inference.
- Propensity score matching (PSM): a method that matches treated and control units with similar probabilities of receiving treatment (propensity scores) based on observed covariates, to approximate a randomised experiment.
- Difference-in-differences (DiD): a causal method that compares changes in outcomes over time between treated and control groups, relying on the assumption of parallel trends in the absence of treatment.
- PSM-DiD: a hybrid approach combining PSM and DiD to improve causal estimates. First, PSM is used to select a comparable control group based on observable characteristics; then, DiD is applied to this matched sample to control for time-invariant unobserved heterogeneity, making the parallel trends assumption more plausible and reducing bias more effectively than either method alone.
- Regression discontinuity design (RDD): causal design that exploits a sharp cutoff in an assignment variable (e.g. eligibility threshold) and compares units just above and just below the threshold.
- Synthetic control method (SCM): causal approach for cases with one or a few treated units, constructing a weighted combination of untreated units to mimic the treated unit’s pre‑intervention trajectory.
- Instrumental variables (IV): method used when treatment assignment is endogenous (correlated with the error term). It employs an external variable (instrument) that is correlated with the treatment but affects the outcome only through the treatment (exclusion restriction), allowing for the isolation of exogenous variation to estimate causal effects.
- Quantile methods
- Quantile dose-response function (QDRF): estimates the treatment effect across the entire distribution of the outcome variable for continuous treatments, showing how different levels (doses) of treatment affect specific quantiles (e.g. lower vs. upper tail) of the outcome.
- Quantile conditional treatment effect (QCTE): measures the causal effect of a binary treatment on different quantiles of the outcome distribution, conditional on covariates, allowing researchers to understand impacts beyond the simple average effect (e.g. effects on high-performers vs. low-performers).
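To give a flavour of how one of these methods looks in practice, the sketch below estimates a basic DiD specification on the hypothetical farm panel used earlier; the 2023 start year and the file and column names are illustrative assumptions.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical farm-year panel with columns farm_id, year, income, treated.
df = pd.read_csv("farm_panel.csv")
df["post"] = (df["year"] >= 2023).astype(int)  # scheme assumed to start in 2023

# The coefficient on treated:post is the DiD estimate of the policy effect,
# valid only under the parallel trends assumption.
did = smf.ols("income ~ treated + post + treated:post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["farm_id"]}  # cluster SEs by farm
)
print(did.params["treated:post"])
```

The interaction coefficient is only credible if the parallel trends assumption holds, which should always be checked (for example, by inspecting pre-policy trends, as sketched at the end of this page).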
How to select the most appropriate method(s)?
Choosing the right evaluation method involves considering several key aspects and is best done with the support of experts who have a strong background in econometrics applied to policy evaluation. However, this section offers practical guidance for non-experts navigating this complex decision.
Key factors to consider when selecting evaluation methods include:
- Policy characteristics and participation:
- Is the intervention completely new (a clear break from previous policy), or is it a continuation with only minor adjustments?
- Can you clearly distinguish between farms (or units) that participate in the policy and those that do not? Is there only one group, two groups or more?
- Advantages and limitations of each method (pros/cons)
- Data characteristics and availability:
- Is your data cross-sectional (a snapshot at one point in time) or panel data (tracking the same units over multiple periods)?
Because evaluation is often limited by data availability and quality, this section focuses primarily on data considerations, especially whether panel or cross-sectional data are available. It also outlines the key steps for selecting an appropriate evaluation approach and highlights which methods are suitable for different evaluation contexts.
The figure below presents a simplified decision tree to help guide the selection of appropriate methods, with a focus on data availability. Please interpret this tool with caution: choosing the right method is complex, requires consideration of multiple factors, and should ideally be done with expert support.
Start by asking ‘what data do we have?’ and then move through the choices.
Level 1: Data structure
- Panel data available?
  - YES → panel data (multiple observations per unit over time)
  - NO → only cross-section data (single observation per unit)

If panel data is available:
- Do you have observations before and after the policy?
  - YES → pre- and post-policy data available (can exploit policy timing)
  - NO → only post-policy data available

If pre- and post-policy data are available:
- What is the objective?
  - CAUSAL (estimate treatment effect):
    - Synthetic control: few treated units, long pre-period
    - DiD / event study: multiple comparison groups available
    - RDD: sharp threshold for treatment eligibility
  - DESCRIPTIVE (characterise patterns), depending on panel length:
    - Fixed effects (FE): 3-5 years of data
    - Diff-GMM / Sys-GMM: more than 5 years of data

If only post-policy panel data are available:
- Is rich treatment information available?
  - YES:
    - PSM: propensity score matching (with rich covariates)
    - RDD: regression discontinuity (with clear eligibility cutoff)

If only cross-section data are available:
- What is the objective?
  - CAUSAL (estimate treatment effect):
    - PSM: matching on observed covariates
    - RDD: eligibility cutoff (regression discontinuity)
  - DESCRIPTIVE (characterise relationships):
    - OLS: ordinary least squares regression
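As a purely illustrative aid, the hypothetical Python helper below encodes the decision tree’s logic; it mirrors the branches above and is a navigation aid only, not a substitute for expert judgement or any official tool.

```python
# Hypothetical helper encoding the decision tree above (illustrative only).
def suggest_methods(panel: bool, pre_and_post: bool = False,
                    causal: bool = True, years: int = 0,
                    rich_covariates: bool = False) -> list[str]:
    if not panel:  # cross-section data only
        return (["PSM (matching on observed covariates)",
                 "RDD (if a sharp eligibility cutoff exists)"]
                if causal else ["OLS (descriptive)"])
    if pre_and_post:  # panel with pre- and post-policy observations
        if causal:
            return ["Synthetic control (few treated units, long pre-period)",
                    "DiD / event study (comparison groups available)",
                    "RDD (sharp eligibility threshold)"]
        return (["Diff-GMM / Sys-GMM (>5 years)"] if years > 5
                else ["Fixed effects (3-5 years)"])
    if rich_covariates:  # only post-policy panel data
        return ["PSM (rich covariates)", "RDD (clear eligibility cutoff)"]
    return ["Consult an expert: options are limited without pre-policy data"]

# Example: panel data, pre- and post-policy observations, causal objective.
print(suggest_methods(panel=True, pre_and_post=True, causal=True))
```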
Quick reference: Method selection summary
The table below provides a quick reference guide for selecting appropriate evaluation methods based on data structure and policy exposure. It summarises which methods are suitable for different combinations of data types (cross-sectional or panel), the presence or absence of control groups, and the nature of policy implementation. For each scenario, the table indicates recommended methods and clarifies whether the resulting analysis supports correlation or causal inference.
This overview is intended to help evaluators quickly identify the most relevant methodological options for their specific evaluation context.
| Data structure | Policy exposure/control group | Suitable methods | Type of relationship |
|---|---|---|---|
| Cross‑section | All farms affected (no control group) | OLS | Correlation |
| Cross‑section | Some farms treated, others not; rich covariates | Propensity score matching (PSM) | Causality |
| Cross‑section | Treatment assigned by sharp eligibility cutoff | Regression discontinuity design (RDD) | Causality |
| Panel | All farms affected; medium‑length panel (≈3-5 years) | Fixed effects (FE) | Correlation (within‑farm) |
| Panel | All farms affected; long panel (>5 years, dynamic effects) | Difference‑GMM / System‑GMM | Correlation (dynamic) |
| Panel | Some farms treated, others not; pre‑ and post‑policy data | Difference‑in‑differences (DiD); event study | Causality |
| Panel | Few treated units; long pre‑policy time series | Synthetic control | Causality |
| Panel | Only post‑policy panel; some treated, others not; rich covariates | PSM (panel); RDD (eligibility cutoff, if rule generates treatment in panel) | Causality |
Key principles
- Data structure (panel vs. cross-section) determines what methods are available.
- Temporal structure (pre/post availability) enables causal identification strategies.
- Your research objective (causal vs. descriptive) shapes method choice.
- Rich covariate information increases options for causal inference.
- Always validate assumptions of your chosen method.
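For example, before trusting a DiD estimate, an evaluator can inspect pre-policy trends for treated and control farms; the short sketch below does this on the hypothetical panel used earlier (the file and column names are assumptions).

```python
import pandas as pd

# Hypothetical farm-year panel with columns year, treated, income.
df = pd.read_csv("farm_panel.csv")

# Mean income per year for treated vs. control farms in the pre-policy period.
pre = df[df["year"] < 2023]
print(pre.groupby(["year", "treated"])["income"].mean().unstack("treated"))
# Roughly parallel pre-policy lines lend support to the parallel trends
# assumption; diverging lines are a warning sign for DiD.
```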
Further reading
- European Evaluation Network for Rural Development for the CAP (2014), Capturing the success of your RDP: Guidelines for the ex post evaluation of 2007-2013 RDPs
- Angrist, J. D., & Pischke, J.-S. (2008), Mostly harmless econometrics: An empiricist’s companion
- Angrist, J. D., & Pischke, J.-S. (2015), Mastering ’Metrics: The Path from Cause to Effect
- Cerulli, G. (2022), Econometric Evaluation of Socio-Economic Programs: Theory and Applications
- Cunningham, S. (2021), Causal Inference: The Mixtape
- Greene, W. H. (2012), Econometric Analysis
- Opatrny, M. (2021), Evaluating Economic Policy Using the Synthetic Control Method
- Wooldridge, J. M. (2010), Econometric Analysis of Cross Section and Panel Data
- Wooldridge, J. M. (2019), Introductory Econometrics: A Modern Approach (7th ed.)