
Policy process for implementing individual budgets highlights some of the tensions in public policy evaluation

by Gerald Wistow

A high profile initiative to transform social care delivery demonstrates how the demand for rigorous evaluation can be difficult to fulfil alongside enthusiastic policy advocacy, explains former government advisor, Gerald Wistow

Over 40 years ago, the eminent social psychologist, Donald T Campbell, complained that excessive commitment to policies had prevented proper evaluation of Lyndon Johnson’s ‘Great Society’ reforms. Campbell urged social scientists to engage with policy makers to ensure that they appreciated the value of evaluation and did not allow its political risks to preclude its thorough application. His comments are just as relevant today.

I am grateful to Stefanie Ettelt for drawing my attention to a quote from Campbell’s 1969 paper, ‘Reforms as experiments’. In it, he declares: ‘If the political and administrative system has committed itself in advance to the correctness or efficacy of its reforms, it cannot tolerate learning of failure. To be truly scientific we must be able to experiment. We must be able to advocate without that excess of commitment that blinds us to reality testing.’ 

These sentiments spring to mind when reflecting on the piloting of individual budgets for adult social care that took place from 2005. This process highlights the risk that powerful advocacy within government can still lead to what, from the perspective of evaluation, might be considered excessive commitment and so obscure the ‘reality testing’ that evaluation is supposed to provide.

From 2005, I was a scientific advisor to the individual budgets policy team at the Department of Health, providing advice and support through all stages of the evaluation. At the time, policy processes were being modernised and made more professional. The New Labour mantra, ‘what matters is what works’, meant policy makers were supposed to favour analysis over ideology, not least through experimentation and evaluation in advance of universal national roll-out. The Modernising Government White Paper (1999) emphasised that evaluation should have a clearly defined purpose, with criteria for success established from the outset; that evaluation methods should be built into the policy-making process from the beginning; and that learning from pilots should be interpreted and applied.

A key starting point for the formal introduction of individual budgets was the implementation of the ‘Valuing People’ White Paper (2001) which established the central importance of people with learning disabilities being treated as full citizens rather than being excluded from living normally in society. Its four key principles were rights, choice, independence and inclusion.

The Department of Health established a ‘Valuing People Support Team’ to help local authorities and the NHS to implement these principles. In 2003, the Team formed a partnership with Mencap, known as ‘In Control’, to implement a process of ‘self-directed support’, which was piloted with limited evaluation in six local authorities. The pilots were designed to enable people with learning disabilities to assess their own needs, write their own care plans and organise their own support. The background to this initiative was the need for people with learning disabilities to have greater opportunities to secure more flexible and individualised services, given the low take-up of direct payments (one per cent of all community care packages in around 2003). At the time, some 75 per cent of all spending on learning disabilities still went on three traditional, institutional services – residential care, nursing home care and day care.

In Control quickly became an organised movement which penetrated national and local government (almost every local authority in the country soon signed up to its programme). By 2005, it had also allied with the physical disability movement, which had been working with the Cabinet Office to develop a national strategy that included proposals for a programme of individual budgets. The concept envisaged that individuals would be able to combine into a single budget all the different funding streams to which they might be entitled – such as social security, housing, access to employment and social care – and to use that budget, on the basis of their assessed needs, to purchase the services they thought most suited those needs. This fitted with the principles of improving social care services, scoring highly on choice, control and independent living.

So by 2005, proposals for individual budgets were coming from the heart of government: from the Prime Minister’s Strategy Unit, the Department of Health and the Department for Work and Pensions. The policy featured in the 2005 Labour Party manifesto and, during the General Election itself, Downing Street wrote a scoping paper on implementation. All of these champions envisaged that a process of piloting and evaluation would be necessary and appropriate. In January 2005, the Cabinet Office had described individual budgets as a radical initiative which would take time to get right, but which would be progressively implemented and, subject to evaluation and resource availability, rolled out nationally by 2012. By March, however, the DWP was saying it would be rolled out nationally by 2010.

There remained in these narratives the possibility of failure – everything was subject to evidence that it worked. Evaluation was part of the Government’s risk management – the risk of introducing a radical change that some people strongly supported but whose workings remained unclear. The approach also appealed to sceptics by saying, ‘Let’s do it progressively, let’s evaluate, let’s make sure that it works’.

The Treasury also had considerable interest in what the programme would cost to introduce, its outcomes and cost effectiveness compared with conventional approaches to service delivery. This last requirement drove the evaluation design so that its core element was a randomised controlled trial. There was also a process evaluation of factors that facilitated and inhibited implementation but the central focus at the outset was to evaluate how the costs and outcomes of individual budget pilots would compare with standard service delivery arrangements.

Although RCTs were widely regarded in DH as the gold standard for evaluation methodologies, especially for clinical interventions, other government departments were less comfortable with the idea that trials were appropriate in the context of individual budgets. The DH implementation support team, and some local staff, shared these concerns and particularly questioned the ethics of denying some participants in the trial access to individual budgets in order to provide comparisons with those who received such budgets.

Meanwhile, the evaluators soon realised, as is often the case, that the intervention to be evaluated was poorly specified. With the policy team, they had to ask: What is an individual budget? How is it allocated? What’s the operating system? How is need to be assessed? How would an assessment of need be converted into a financial sum that someone had available to spend on their care and support? Fortunately, from one point of view, ‘In Control’ had developed a model in their earlier six pilots that not only filled the vacuum but effectively became the intervention to be piloted and evaluated.

Then, in 2006, a new Minister moved the goalposts and announced that, in his view, the inherent value of individual budgets was not in doubt and that he had decided the initiative should be rolled out nationally from 2010. The evaluation still had an important role, but it would now advise on how best to implement that decision rather than provide evidence to inform whether such a decision should in fact be made. So the RCT continued, but it was undermined. Sites became more reluctant to identify participants in the study who would not receive a service that had now been ministerially endorsed. Recruitment to the study was slow and, with systems change lagging behind the evaluation timetable, some participants had not received services for the full follow-up period before the pilots ended.

The evaluation reported on time and found that people in receipt of budgets, and their carers, reported greater independence and control over how care was provided. Individual budgets were slightly more cost-effective for some (but not all) groups of people. In addition, the implementation of individual budgets had important implications for staff roles, training and the management of funding streams.

In practice, the evaluation was conducted at the intersection of politics, policy-making and implementation. Ministers wanted to prove they could deliver change during their frequently short periods in a particular post. They were also greatly influenced by their own informal networks, including, in the case of the second minister, his own previous experience of social care services and knowledge of the ‘In Control’ model.

The Department of Health implementation support team, which was helping the local sites to implement individual budgets, was also closely associated with ‘In Control’ and its operating model for individual budgets.

The experience of implementing the individual budget pilots demonstrated how the value base of health and social care competed with arguments about technical rationality underlying the modernising government and public sector reform agendas. The former emphasised the rights of older people and people with disabilities to have greater control over their lives; the latter required evidence to demonstrate the benefits of such control – or at least the costs and effectiveness of an intervention that more anecdotal evidence already appeared to support, in advance of results being available from the independent evaluation commissioned by the DH.

As Russell and colleagues (2008) argue – and the individual budgets example supports – policy-making in practice is more a ‘formal struggle over ideas and values’ than a systematically structured search to find and apply the best evidence of what works. As the same authors also underline, there is no single ‘right answer’ to be identified in the messy world of policy-making but only ‘more-or-less good reasons to arrive at more-or-less plausible conclusions’ (Russell et al 2008).

It is sometimes argued that policy makers need a better understanding of evaluation, but it is perhaps no less true that evaluators need a better understanding of policy-making and political processes. There are, for example, some givens in public policy which inevitably and necessarily impact on the conduct and interpretation of evaluation. These givens include electoral and financial cycles, as well as electoral and bureaucratic politics. There are also multiple actors and stakeholders, some of whose actions and influence within policy processes are less apparent than others. For example, for policy researchers there are fascinating questions about how the radical concept of individual budgets was developed and rolled out universally within less than a decade. How a small and newly established organisation such as ‘In Control’ was able to achieve the transformation of national social care policy and service delivery guidelines so rapidly, and subsequently begin to extend its model into the NHS, is in itself an evaluation topic of great interest and relevance.

As for social policy evaluators, these reflections underline the advice of Donald Campbell cited above from another era of social policy transformation. Moreover, in an inherently political clash between values and evidence, the roles of evaluators can perhaps usefully be summarised as being to provide challenge which is both rigorous and sustained; to serve as professional sceptics where others are the professional advocates of change; and, finally, to suspend belief in the absence of independent analysis.

Gerald Wistow is Visiting Professor in Social Policy at the London School of Economics. This piece is based on a presentation that Professor Wistow gave at the meeting ‘Evaluation – making it timely, useful, independent and rigorous’ on 4 July 2014, organised by PIRU at the London School of Hygiene and Tropical Medicine, in association with the NIHR School for Public Health Research and the Public Health Research Consortium (PHRC).

Modelling lets evaluators test-drive change safely and cheaply, using a diversity of non-RCT evidence

by Sally Brailsford

Enhanced decision-making, blue-skies thinking and quick trials of hypotheses are all much easier if modelling is in your evaluation tool kit, explains Sally Brailsford

Everyone thinks that they know what a model is. But we all have different conceptions. I like the definition from my colleague Mike Pidd, from Exeter University. He sees a model as ‘an external and explicit representation of a part of reality’.  People use it ‘to understand, to change, to manage, and to control that part of reality’.

We tend to acknowledge the limitations that models have, but fail to fully appreciate their potential.  ‘All models are wrong,’ as George Box said, ‘but some are useful’.

I work in Operational Research. It’s a tool kit discipline. In one part, we make use of statistics, mathematics and highly complex algorithmic models. In another, we draw pictures and play games. I use these elements to create simulations: I build a model in a computer that replicates a real system, and then we can play ‘what if’ with it.

Models inform decision-making

I use models mainly for informing decision-making. Sometimes, they don’t actually need much data to be very useful. For example, there is a famous model about optimal hospital bed occupancy, created by Adrian Bagust and colleagues at Liverpool University’s Centre for Health Economics.  It includes some numbers but they are not based on any specific hospitals. It shows that if a hospital tried to keep all its beds fully occupied, then some patients would inevitably have to be turned away.

The model varies patient arrivals and shows how often the hospital has to turn emergency patients away as occupancy rises. It demonstrates that hospitals deemed inefficient, because they occasionally have empty beds, are actually operating effectively: once a hospital reaches about 85 per cent occupancy, it becomes increasingly likely to have to turn emergency patients away. It is a simple model, and it did not involve long-running, expensive randomised controlled trials. Yet it provided vital evidence and was powerful in influencing occupancy targets.
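
To show how little machinery a model like this needs, here is a minimal sketch of the same idea in Python. It is not Bagust’s actual model – the parameters (200 beds, a five-day average stay, exponential arrivals and stays) are invented for illustration – but it captures the mechanism: patients arrive at random, occupy a bed for a random time and are turned away if every bed is full.

import heapq
import random

def turn_away_rate(beds=200, mean_stay=5.0, occupancy=0.85,
                   horizon_days=365.0, seed=1):
    # Toy continuous-time simulation of emergency admissions. Arrivals come
    # at random (exponential gaps) at a rate chosen so that expected
    # occupancy (arrival rate x mean stay / beds) hits the target; lengths
    # of stay are exponential. An arrival who finds every bed occupied is
    # turned away.
    rng = random.Random(seed)
    arrival_rate = occupancy * beds / mean_stay   # patients per day
    discharges = []                               # heap of discharge times
    t, arrivals, refused = 0.0, 0, 0
    while t < horizon_days:
        t += rng.expovariate(arrival_rate)        # time of next arrival
        while discharges and discharges[0] <= t:  # beds freed since last arrival
            heapq.heappop(discharges)
        arrivals += 1
        if len(discharges) < beds:                # a bed is free: admit
            heapq.heappush(discharges, t + rng.expovariate(1.0 / mean_stay))
        else:                                     # hospital full: turn away
            refused += 1
    return refused / arrivals

for occ in (0.75, 0.85, 0.90, 0.95):
    print(f"{occ:.0%} target occupancy: "
          f"{turn_away_rate(occupancy=occ):.1%} of emergency arrivals turned away")

Running the sketch at different target occupancies reproduces the qualitative pattern: the proportion of emergencies turned away climbs steeply as occupancy approaches full capacity.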

A 30-year clinical trial in five minutes

In another model, we looked at patients with diabetes at risk of developing retinopathy. Everyone agreed that it was a good idea to screen patients with diabetes so that retinopathy could be detected before it leads to blindness. However, there was a whole range of screening practices. We used data from all over the place, from the US and from the UK. The model followed patients with diabetes through the life course and through different stages of disease progression.

We had to draw data from very early studies because it would be unethical to conduct a clinical trial that did not treat people according to best practice. We then adapted the model for different populations, with varying ethnic mixes and rates of diabetes. We superimposed on the model a range of different screening policies to see which was most cost-effective. In effect, once we felt confident that the model was valid, we could run a clinical trial on a computer in five minutes rather than running a real clinical trial for 30 years. The findings proved really valuable.
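
The structure of such a cohort model can be sketched very simply. The code below is only an illustration of the approach, assuming invented states, transition probabilities, screening intervals and attendance rates rather than the data the study actually used; it shows how a ‘30-year trial’ of competing screening policies can be run in seconds.

import random

# Illustrative annual progression probabilities (invented for this sketch;
# the real model drew its transition rates from published studies).
PROGRESSION = {"no_retinopathy": ("background", 0.06),
               "background": ("sight_threatening", 0.08),
               "sight_threatening": ("blind", 0.10)}

def patient_goes_blind(screen_interval, attendance, years=30, rng=random):
    # Follow one patient through annual cycles. Screening that detects
    # sight-threatening disease leads to treatment, which in this toy
    # model halts further progression.
    state, treated = "no_retinopathy", False
    for year in range(years):
        if not treated and state in PROGRESSION:
            next_state, p = PROGRESSION[state]
            if rng.random() < p:
                state = next_state
        screened = screen_interval and year % screen_interval == 0
        if screened and rng.random() < attendance and state == "sight_threatening":
            treated = True
    return state == "blind"

def blindness_rate(screen_interval, attendance, n=50_000, seed=42):
    rng = random.Random(seed)
    return sum(patient_goes_blind(screen_interval, attendance, rng=rng)
               for _ in range(n)) / n

# Compare hypothetical policies: no screening, two-yearly and annual
# screening, and annual screening with higher attendance.
for interval, attendance in [(None, 0.0), (2, 0.6), (1, 0.6), (1, 0.9)]:
    label = f"screening every {interval} year(s)" if interval else "no screening"
    print(f"{label}, attendance {attendance:.0%}: "
          f"{blindness_rate(interval, attendance):.1%} blind after 30 years")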

The differences in benefit between the various techniques and screening programmes proved to be minor compared with the large impact of more people being screened. We realised that raising attendance, perhaps by social marketing, offered much better value than buying expensive equipment.

Guiding design of hypothetical systems

The next model is even more hypothetical. Three engineers had an exciting, blue skies idea for patients with bipolar disorder. What if, they asked, different sensors tracked a person’s behavioural patterns and, having established an individual’s ‘activity signature’, could spot small signs of a developing episode that would trigger a message that the person might need help?

We expected, rightly, that success depended on what monitoring individuals could tolerate – perhaps a bedside touch-sensor mat, a light sensor in their sitting room, sound sensors or GPS. We built these different possibilities into the model. We could also check how accurate the algorithms would have to be, if this technology were developed. So we were guiding the design of a hypothetical system.
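
The arithmetic behind that question fits in a few lines. The sketch below is not the project’s model – the episode rates, monitoring assumptions, sensitivities and specificities are all invented – but it illustrates what we were asking: how accurate would a hypothetical detection algorithm have to be before its alerts were useful rather than a nuisance?

# All figures below are invented for illustration.
EPISODES_PER_YEAR = 2     # assumed relapse-warning periods per patient per year
PRODROMAL_DAYS = 7        # assumed days of early-warning signs per episode
MONITORED_DAYS = 365

def alerts_per_year(sensitivity, specificity):
    # Expected true and false alerts per patient-year, assuming the
    # algorithm classifies each monitored day independently.
    episode_days = EPISODES_PER_YEAR * PRODROMAL_DAYS
    normal_days = MONITORED_DAYS - episode_days
    true_alerts = sensitivity * episode_days
    false_alerts = (1.0 - specificity) * normal_days
    return true_alerts, false_alerts

for sens, spec in [(0.7, 0.95), (0.9, 0.95), (0.9, 0.99), (0.95, 0.999)]:
    true_a, false_a = alerts_per_year(sens, spec)
    print(f"sensitivity {sens:.0%}, specificity {spec:.1%}: "
          f"{true_a:.1f} true vs {false_a:.1f} false alerts per patient-year")

Under these invented numbers, genuine early-warning days are so rare that even 95 per cent specificity produces more false alerts than true ones – exactly the kind of requirement a model can expose before any hardware is built.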

Many, particularly those from clinical backgrounds, find it hard to accept that modelling can provide evidence upon which to make a major decision. People often expect the same kind of statistical evidence as from randomised controlled trials. Modelling does not claim to provide that level of certainty. It is a decision-support tool, helping you understand what might happen if you do something.

Appreciate modelling advantages

We should recognise the advantages of models. They are quick and cheap – you can run a clinical trial that could last decades in a matter of minutes. If you lack statistical confidence in your model, there are solutions: expert opinion and judgement can help fill the gaps. A model allows people to talk about issues in a policy setting and to articulate their assumptions. Quite often, the conversations along the way are more important than the eventual model; the model is just a means to that end.

As in the bipolar project, you can model innovations that don’t even exist. I often use modelling with hospitals that are redesigning a system or a service. The new service does not exist yet, so there are no data – you must gather all the available evidence you can and build it into your model. Modelling lets you explore more than traditional methods do, because your assumptions can be more flexible.

Collecting primary data is hugely expensive, sometimes impossible.  You can consider all sorts of options that it would be unethical to explore in reality. As the bed occupancy model shows, the findings can be powerful and influential.

There is a saying that, if all you have is a hammer, then every problem is a nail. As researchers, we should avoid being confined by preferred methods, whatever our discipline. Modelling can be a valuable research tool.

Sally Brailsford is Professor of Management Science at the University of Southampton. Her blog is based on her presentation on 4 July 2014 at PIRU’s Conference: ‘Evaluation – making it timely, useful, independent and rigorous’.