Let’s be honest that pilots are not just about testing: they’re also about engineering the politics of change

By Stefanie Ettelt

There is more to policy piloting than evaluation – piloting is a policy tool in itself, not only a means for conducting research, says Stefanie Ettelt


Pilot evaluation tends to frustrate and disappoint some or all of its stakeholders, be they policy-makers, local implementers or evaluators, according to a study I have been working on for PIRU. Policy makers typically want robust, defensible proofs of success, ideally peppered with useful tips to avoid roll-out embarrassments. But they are distinctly uncomfortable with potentially negative or politically damaging conclusions that can also spring from rigorous evaluation.

Meanwhile, implementers of pilots at a local level don’t welcome the ambivalence that evaluation suggests, particularly when randomised controlled trials (RCTs) are used, given the associated assumption of uncertain outcome (equipoise). Implementers understandably worry that all their hard work putting change into action might turn out to have been a waste of time, producing insufficient improvement and leading to a programme being scrapped.

The evaluators may prefer a more nuanced approach than either of these groups wants, in order to capture the complex results and uncertainties of change. But this approach might find little favour with those commissioning the work. Evaluators are often dissatisfied with the narrow time frames and limited sets of questions that are allowed for their investigations. They may feel tasked with gathering what they consider to be over-simplistic measures of success, and may be disappointed to discover that a roll-out has begun regardless of – or even in advance of – their findings.

Keeping all of these stakeholders happy is a big ask. It’s probably impossible, not least because satisfying any one group may rule out satisfying the others. Why do we find ourselves in such a difficult situation?

Why is it so hard to satisfy everyone about pilots?

Perhaps this tricky issue is linked to the particular way in which British policy-making is institutionalised. These days, policy-making in the UK seems to be less ideologically driven – or at least less supported by ideology – than it was in the past. With this loss of some ideological defences have also gone some of the perceived – albeit sometimes flawed – certainties that may once have protected policies from criticism. As a result, there are sometimes overblown expectations of research evidence in the UK and sometimes illusory beliefs that evidence can create new certainties.

The institutional design of the Westminster system perhaps invites excessive expectations that policy can be highly informed by evidence, because political centralisation means that there seem to be fewer actors who can veto decisions than in some other countries, such as Germany. There are more regional players in Germany’s federal system who can veto, obstruct or influence a decision. Relatively minor coalition partners in Berlin also have a long-standing tradition of providing strong checks and balances on the larger governing party. So, in Germany, there is more need for consensus and agreement at the initial policy-making stage. This participative process tends to reduce expectations of what a policy can deliver and also, perhaps, the importance of evidence in legitimising that policy.

Britain compared with Germany

In contrast, the comparatively centralised Westminster system seems more prone to making exaggerated claims for policy development and more in need of other sources of legitimacy. Piloting may, thus, at times become a proxy for consensus policy-making and a means of securing credibility for decisions. It might help to reduce expectations, and thus avoid frustration, if policy makers were clearer about their rationale for piloting. So, for example, they might explain whether a pilot is designed to promote policy or to question if the policy is actually a good way forward. If the core purpose is to promote policy, then some forms of evaluation such as RCTs may be inappropriate.

Evaluators understandably find it difficult to accept that the purpose of piloting and evaluation might first and foremost be ‘symbolic’: a way for policy-makers to demonstrate good policy practice and to confirm prior judgements. But there should be recognition that piloting sometimes does have such a political nature, which is genuinely distinct from it having a purely evaluative role.

Of course, such a distinction is not made any easier by policy makers who tend to use rhetoric such as ‘what works’ and ‘policy trials/experiments’ when they already know that the purpose of the exercise is simply to affirm what they are doing. If policy makers – including politicians and civil servants – use such language, they really are inviting, and should be prepared to accept, robust evaluation and acknowledge that sometimes the findings will be negative and uncomfortable for them.

Improving piloting and evaluation

There are ways in which we can improve evaluation methods to make them more acceptable to all concerned. More attention could be given to identifying the purpose of piloting to avoid disappointment and manage the expectations of evaluators, policy-makers and local implementers. If the intention is to promote local and national policy learning, more participation from local implementers in setting the objectives and design of pilot evaluations would be desirable, so that these stakeholders might feel less worried by the process. Evaluators might also be more satisfied with more extensive use of ‘realist evaluation’. This approach particularly explores how context influences the outcomes of an intervention or policy, which is useful information for roll-out.

I would like to see local stakeholders more directly involved in policy-making and their role more institutionalised, so that their involvement would be ongoing and not abandoned if a different incoming government considered it unhelpful. These are roles that need time to grow and become embedded, and for skills to develop. Such a change would enhance the localism agenda. It would also acknowledge that local implementers are already key contributors to national policy learning through all the local trial and error that they employ.

Dr Stefanie Ettelt is a Lecturer in Health Policy at the London School of Hygiene and Tropical Medicine. She contributes to PIRU through her work on piloting and through participating in the evaluation of the Direct Payments in Residential Care Trailblazers. She also currently explores the role of evidence in health policy, comparing England and Germany, as part of the “Getting evidence into policy” project at the LSHTM.