Follow Africa’s lead in meticulous evaluation of P4P schemes for healthcare

By Mylene Lagarde

Working with researchers to evaluate the introduction of financial incentives in developed healthcare economies would yield vital knowledge, explains Mylene Lagarde

The jury is very much out on pay-for-performance (P4P) schemes in healthcare – at least as far as the research community is concerned. Many questions remain unanswered about their effectiveness and hidden costs, their unintended consequences, and their merit relative to other approaches. Yet many policy makers seem to have made their minds up already. These schemes, which link financial rewards to healthcare performance, make intuitive sense, and they are being introduced widely.

This disconnection between the research and policy-making worlds means that we are almost certainly not getting the best out of P4P initiatives. Perhaps more worrying, there is a danger that the tree will hide the forest – that the attractive, sometimes faddish, simplicity of pay-for-performance may obscure other, perhaps more complicated but possibly more cost-effective, ways to improve healthcare. As health systems struggle to adapt to modern demographics and disease profiles while harnessing the latest technologies, we need to know what works best to reshape behaviours.

There are three key issues that weaken the case for P4P in healthcare, as we set out in the PIRU report “Challenges of payment-for-performance in health care and other public services – design, implementation and evaluation”. These concern a lack of evidence on their costs, on their effectiveness, and on which particular P4P designs work better than others.

First, costs. P4P schemes are complex to design. They usually involve lots of preliminary meetings between the many participants. Yet studies have largely ignored these transaction costs and frequently also fail to track and record carefully the considerable costs of monitoring performance.

Second, the effectiveness of P4P is often impossible to assess with enough certainty. Typically, the introduction of a new scheme does not include a control group. For example, if a scheme incentivises one hospital to reduce length of stay or emergency admissions, it may be difficult to find a comparable hospital to serve as a counterfactual. That makes it harder to attribute a particular change to P4P – maybe it would have happened anyway.
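To see why a counterfactual matters, consider a minimal, purely hypothetical sketch of one common evaluation approach, a difference-in-differences comparison (the hospitals and figures below are invented for illustration, not taken from the report): subtracting the change observed in a comparable hospital outside the scheme separates the incentive’s contribution from a background trend that would have happened anyway.

```python
# Hypothetical illustration: why a control hospital helps attribute change to P4P.
# All numbers are made up; this is a sketch of a difference-in-differences logic,
# not a real evaluation.

# Average length of stay (days), before and after the scheme starts.
incentivised = {"before": 6.0, "after": 5.0}   # hospital under the P4P scheme
control      = {"before": 6.1, "after": 5.7}   # comparable hospital, no scheme

# Naive view: the incentivised hospital improved by 1.0 day.
naive_change = incentivised["before"] - incentivised["after"]

# The control hospital also improved, by 0.4 days, without any incentive,
# suggesting a background trend that would have occurred anyway.
background_trend = control["before"] - control["after"]

# Difference-in-differences: the change plausibly attributable to the scheme itself.
attributable = naive_change - background_trend

print(f"Naive improvement:       {naive_change:.1f} days")
print(f"Background trend:        {background_trend:.1f} days")
print(f"Attributable to P4P:     {attributable:.1f} days")
```

Without the control hospital, the full 1.0-day improvement would have been credited to the scheme; with it, only 0.6 days can plausibly be attributed to the incentive. When no comparable hospital exists, even this simple separation is unavailable.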

Furthermore, P4P schemes usually monitor only a small set of outcomes, so evaluators may be left with a narrow, and thus weak, selection of effects. For example, reductions in hospital lengths of stay may be identified, but these may coincide with poorer outcomes elsewhere in the system, such as increased admissions to nursing homes. These unintended effects, which may reflect a shift rather than a reduction in costs and problems, are often not captured by the programme. That makes whole-system analysis difficult.

Third, P4P is not a single, uni-dimensional intervention. It is a family of interventions. All are based on the premise that financial incentives can support change, but there are many variables: the size of the reward; how frequently it is offered; whether it is focussed on relative or absolute targets; whether it is linked to competition between providers or awarded universally. Very often, one design is used where another might equally well have been employed. Each variation can produce different results, yet we still know little about the relative performance of alternative designs for these incentive schemes.

Researchers are not completely in the dark about P4P in healthcare. We are beginning to understand factors that characterise successful schemes. These typically involve a long lead-in time to plan, test and reflect carefully on the different elements of a programme. However, we must strengthen evaluation.

The first step would be to involve researchers at an early stage of the programme design. That’s the moment to spot where in the system you might need data to be collected. It’s also the time to identify control groups so that the causal impacts of these programmes can eventually be attributed more confidently.

Good evaluation requires political willingness to evaluate, which is sometimes lacking. When an initiative has a political breeze behind it, policy makers worry that researchers will let the wind out of the sails. But some low- and middle-income countries are taking the risk. A large number of randomised controlled trials have been run in African countries over the last few years, looking at the effects of P4P schemes. Most are ongoing but, so far, the evidence is promising. Rwanda was one of the first African countries to evaluate these financial incentives, mainly for increasing uptake of primary healthcare. Its programme is now being scaled up.

Why is Africa leading the way in setting high standards for P4P evaluation? Because the funders of these schemes, typically external donors (e.g. the World Bank, DfID, USAID), are well placed to demand meticulous evaluation by the receiving governmental authorities as a condition for the cash. Researchers, particularly in developed countries, rarely enjoy such firm leverage over national policy makers. And national policy makers in these countries do not apply to themselves the degree of scrutiny they exercise over international aid recipients. Yet, if we are to get the best out of P4P – and not attach potentially false hopes to this healthcare innovation – we need more of the disciplined approach currently being used in Africa.

Dr Mylene Lagarde is a Senior Lecturer in Health Economics at the London School of Hygiene and Tropical Medicine. “Challenges of payment-for-performance in health care and other public services – design, implementation and evaluation” by Mylene Lagarde, Michael Wright, Julie Nossiter and Nicholas Mays is published by PIRU.