A Conceptual and Technical Revolution in Program/Policy Design: The Critical Missing Tools

Chris Callaghan
Feb 17, 2021

Updated: Apr 1, 2021

Part One Summary

In part one of this article, we propose that many organizations are largely guessing about how to obtain better results, and that they fail to leverage scientific thinking (evidence) in the quest for more effective and efficient outcomes. We maintain that an increase in the application of scientific thinking to public policy and social science problems is being hindered by the perception that science is synonymous with randomized controlled trials, and that RCTs are often impractical in fast-paced environments. We also touch on how modes of governance are evolving. The business-centric New Public Management of the 80s and 90s is giving way, with Europe in the lead, to the natural evolution of a collaborative governance style (New Public Governance) that emphasizes trust, social capital, and cooperation as core features. But while these ideas (using evidence and engaging in collaborative policymaking) are gaining ground in the literature, and in more forward-thinking countries and organizations, there are still many organizations around the world that could, but currently do not, benefit from engaging in deep evidence-informed policymaking underpinned by open collaboration and deliberative dialogue. We further maintain that barriers to rapid progress in these areas are linked to a critical shortage of pivotal tools and concepts available to practitioners, decision-makers and stakeholders/rightsholders. In part two, below, we describe the tools that PolicySpark has developed to overcome these barriers, and how these tools work.

Foundations

The way policies and programs are designed has far-reaching impacts on results, decision-making, and delivery. Think of the old cliché about building a house on a shaky foundation. We know intuitively that this is not a great idea because it sets us up for difficulty down the road. It involves a lack of foresight. The same is true for policies and programs. When they are built on guesswork instead of pooled wisdom and the best available evidence, there’s an increased likelihood that sub-par results will follow, along with persistently poor outcomes over time. In addition to a lack of appropriate use of evidence, policy and program interventions produce weak results, or fail, because stakeholders often don’t believe that their voices have been heard. People are reluctant to accept policies which have been imposed on them when they are not provided with an opportunity to have a meaningful say in how those policies are designed and implemented.

In this article we describe a fresh approach to addressing these fundamental issues. Leveraging modern data science and human-centered artificial intelligence, PolicySpark has developed an easy-to-understand, visual and intuitive approach to bridging and reconciling the differing policy, research, and stakeholder perspectives that are invariably found around all intervention systems. As we explain in the following sections, our approach comprises a crucial missing tool set that unifies evidence-based modeling, stakeholder voice and scientific thinking into a single analytical platform specifically designed to find out what matters and what works. These tools have important applications in program design, strategic review, implementation, ongoing management, performance measurement, and evaluation/assessment.

Using Evidence in Design and Decision-making

What can we do to significantly improve how we design and assess policies and programs without making it too complicated and without drastically increasing the input of time, effort, and resources? One important thing we can do is to find smart ways to increase the use of evidence. There’s an ever-widening consensus that to be effective and efficient, interventions must be informed by good evidence. National governments (Canada, the United Kingdom, the United States, the European Union, Sweden, Norway and many others), multilateral organizations (the United Nations, OECD, World Bank Group), non-governmental and donor-based organizations, major philanthropic foundations, and shareholder-based interests in the private sector, all increasingly point to the need to use more evidence in design and decision-making.

What does it mean to use evidence? There’s no rigid definition for evidence-informed decision-making, but the Public Health Agency of Canada provides one description of how evidence might be used in the public health field: “evidence-informed decision-making is the process of distilling and disseminating the best available evidence from research, practice and experience and using that evidence to inform and improve public health policy and practice. Put simply, it means finding, using and sharing what works in public health.” In this article, we describe a practical approach to doing exactly this using data science and engagement tools that can be applied not only to public health problems, but to a wide variety of complex social and business problems.

The Benefits of Using Evidence

Why are people interested in using evidence for intervention design and decision-making? There are several significant benefits. At the top of the list is that interventions that use evidence are more likely to achieve desired outcomes. It makes sense that minimizing guesswork and seat-of-the-pants thinking will increase the chances of getting good results. Another good reason to use evidence is that focusing on activities that are supported by evidence reduces the likelihood that resources will be spent on activities that have little or no impact. We might think of this as intelligent resource allocation. Yet another compelling reason to use evidence is that it builds trust, buy-in, and communication with stakeholders and benefactors. This is especially true when stakeholders are invited to participate directly in the evidence-gathering process (i.e., their knowledge is a valued primary source of evidence). Increased use of evidence also leads to enhanced perceptions of institutional/organizational credibility and accountability. From a management perspective, the correct use of evidence strengthens a number of important activities, including program design, strategic review, implementation, ongoing management, performance measurement, and evaluation/assessment.

Interventions that use evidence are more likely to achieve desired outcomes, are more likely to dedicate resources to highly impactful activities, build trust, buy-in and communication with stakeholders, and enhance perceptions of institutional credibility. Evidence also plays a critical role in strengthening rational program design.

Types of Evidence

Before going on to look at the unique tools we’ve developed to gather and use pooled evidence, let’s examine some of the different types of evidence that may be available to managers and decision-makers. The most obvious type is research evidence. This can take the form of published scientific literature, which provides insight into what's been studied and learned in the sphere of activity that's being targeted by the intervention. A second type of evidence can come directly from people who have experience working with or living in the intervention system. We might call these people knowledge holders. There will inevitably be a number of different groups of knowledge holders involved in any given system, all possessing different lived perspectives on how the system works. While there’s little doubt that good empirical information provides us with a primary source of credible evidence, the lived experiences of stakeholders can constitute highly valuable and insightful experiential evidence.

To illustrate the value of experiential evidence, let’s consider the problem of diabetes in Indigenous communities. Western, scientific views of treating diabetes generally focus on body weight and physical activity, and emphasize individual lifestyle factors such as diet, exercise and weight control. But when Giles et al. (2007) set out to ask Indigenous people about their perceptions of what causes diabetes, they found that other key factors were important – factors that are absent from Western medical models. The Indigenous people who were interviewed pointed to concepts such as relationship to the land, relationships to others, and relationships to the sacred as being crucial to their state of health. What this tells us is that different kinds of evidence may be vital to drawing a complete picture of what drives outcomes. Based on what was found in this study, it seems likely that a government or other intervention aimed at reducing diabetes in Indigenous settings that relies solely on a Western scientific mindset would be ineffective because it would be missing important spiritual and cultural factors that mediate Indigenous health outcomes. These factors only surfaced through a deliberate and systematic attempt to draw out stakeholder views and experiences. Granted, not all stakeholder views will be objective as seen through a Western scientific lens, but these perspectives reflect valuable insights into parts of reality that are not readily encoded by reductionist thinking.
Further, stakeholder perspectives are reflections of how people think and what they believe, and since social interventions are largely concerned with influencing how people behave in order to realize desired outcomes, gaining detailed insight into thinking and beliefs can provide program designers and decision-makers with crucial strategic information that can be used, along with other evidence, to make better-informed decisions about where to intervene and why (see for example World Bank, 2015).

A third type of evidence, expert opinion, can be seen as a hybrid between research evidence and experiential evidence. Expert opinion can be gathered in person (or virtually) from researchers and other professionals who have direct contact with the intervention context. Finally, evidence may be extracted from existing data sets. This includes big data from, for example, government or private sector sources, or open data, which are often well-structured data sets that are available for use by the public at large. As will become clear below, evidence from virtually any source that expresses how factors in a system affect one another can be used to gain key insights into how intervention systems work.

Bringing Evidence Together Using a Common Language

A significant challenge to using evidence effectively involves finding a practical method to systematically gather disparate sources of information (e.g., research evidence or evidence composed of lived stakeholder experiences) in ways that allow them to be combined synergistically to triangulate clearer views of reality. One possible approach to this problem is to identify a unifying conceptual structure that bridges sources of evidence using a common language. One such structure or language is causality. Causality simply means characterizing and describing how one thing affects another. We can use this to bridge different sources of evidence, even if they look superficially dissimilar, because every source of evidence will contain within it a set of (hypothetical) cause and effect structures. Characterizing evidence through this lens not only permits comparisons among evidence types, it also provides a powerful foundation upon which to understand which particular factors in a system are most influential in driving outcomes.
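To make the idea of a common causal language concrete, here is a minimal sketch in Python. It assumes each evidence source can be written down as a set of (cause, effect) links with a signed strength, and it merges sources by simple averaging. The averaging rule, the function name merge_sources, and all of the numbers are illustrative assumptions, not a description of PolicySpark's method or of any study's findings.

```python
from collections import defaultdict

# Each evidence source uses the same "common language":
# (cause, effect) -> signed strength in [-1, 1].
# All strengths below are invented purely for illustration.
research = {
    ("diet", "health"): 0.6,
    ("exercise", "health"): 0.5,
}
stakeholders = {
    ("diet", "health"): 0.4,
    ("relationship to the land", "health"): 0.7,
}


def merge_sources(sources):
    """Average the strength of each causal link across the sources that mention it."""
    collected = defaultdict(list)
    for source in sources:
        for link, strength in source.items():
            collected[link].append(strength)
    return {link: sum(vals) / len(vals) for link, vals in collected.items()}


merged = merge_sources([research, stakeholders])
for (cause, effect), strength in sorted(merged.items()):
    print(f"{cause} -> {effect}: {strength:+.2f}")
```

Because both sources are expressed as cause-and-effect links, a factor that appears in only one of them (here, relationship to the land) survives the merge instead of being lost. A real application would likely weight sources differently or keep them side by side for comparison.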

Using causality as a central integrating construct to merge evidence in this way aligns well with the emphasis that has been placed on causality by experts in the field of program design and evaluation. Coryn et al. (2011) state that the core construct of a program is causation, whereby a theory or model describes the cause-and-effect sequence through which actions are presumed to produce long-term outcomes or benefits. In their textbook on program theory, Funnell and Rogers (2011) state that, in essence, a program theory is an explicit theory or model of how an intervention, such as a project, program, strategy, initiative, or policy, contributes to a chain of intermediate results and finally to the intended or observed outcomes. Carol Weiss, a pioneer in the exploration of program theory, stated in 1997 that program theories are popular largely because of their potential to explain how interventions work (Rogers and Weiss, 2007). These ideas and assertions are still very much with us today. As was noted in a recent special issue of the Canadian Journal of Program Evaluation centered on modern program theory, “interest in describing and understanding the underlying logic of social programs is pervasive and persistent” (Whynot et al., 2019).

Better Tools

At the heart of all this is the idea that uncovering the details of how things work, causally, can provide us with a powerful way to design, implement and assess policies and programs. But while there’s been a longstanding recognition that examining causation holds great potential for deepening our understanding, and despite many attempts to use models to do this, tools and approaches have remained seriously limited. Why? This is likely because modeling techniques have either been overly simplistic or too complicated, expensive and unwieldy. Logic models, for example, while used widely, are often far too simplistic to be truly useful. They are frequently cobbled together quickly and in the absence of any real evidence. On the other hand, models that may come closer to accounting for real-world complexity, for example structural equation modeling or multivariate analysis, are often difficult for people to grasp easily and require extensive and expensive data input. Similarly, and as we mentioned in part one of this article, experiments that take the form of randomized controlled trials, while sometimes used very effectively, are also difficult and expensive to set up, implement and interpret. What seems to be missing so far, in terms of realizing the full potential for models and theories to significantly catalyze better outcomes, is an intuitive, rapid, inexpensive, but nonetheless rigorous way to systematically gather and visualize evidence, and to use that evidence to make sense of complex intervention structures.

What would better tools look like? For starters, they would be easy to understand, easy to apply, and wouldn’t take too much time and effort to use. Practical tools would take advantage of what evidence has to offer, hitting the right balance between intuitiveness and rigor, without drastically changing how planning and assessment are done. To be effective, this kind of tool set would provide the means to systematically gather, compare and combine different forms of evidence, leverage powerful analytical tools to understand what the evidence tells us, and outline clear ways to test what we come up with so that we can be confident that the resulting intervention blueprints are reasonably accurate.

Conceptually, the best way to do all this would be to follow the same approach that scientists use to investigate phenomena in many fields: (1) muster an intelligent, informed guess about how something might work (generate hypotheses), (2) collect precise information about how the system behaves in the real world (observation and measurement), and (3) compare the original guess to the observations. If you do this, and the model turns out to be reasonably good at predicting what happens ‘out there’, then this model will provide a powerful way to plan and adjust the intervention at hand. This is essentially a road map to formally applying the basic scientific method to the problem of understanding the inner workings of policies, programs and other complex social interventions. As we explain below, what’s new and powerful about PolicySpark’s approach is that we leverage customized technology to untangle real-world complexity to generate deep, rich hypotheses about how interventions work based on multiple sources of evidence, including evidence derived from deliberative stakeholder dialogue.
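The third step, comparing the original guess to the observations, can be sketched in a few lines of Python. This example assumes that a model yields a predicted change for each indicator and that the same indicators are later measured. The function name score_model, the two scores it returns, and all of the numbers are hypothetical choices made for illustration.

```python
def score_model(predicted, observed):
    """Compare predicted and observed indicator changes.

    Returns (direction_agreement, mean_absolute_error) over shared indicators.
    """
    shared = sorted(set(predicted) & set(observed))
    if not shared:
        raise ValueError("no indicators in common")
    # Share of indicators that moved in the predicted direction.
    agree = sum((predicted[k] > 0) == (observed[k] > 0) for k in shared) / len(shared)
    # Average size of the gap between prediction and observation.
    mae = sum(abs(predicted[k] - observed[k]) for k in shared) / len(shared)
    return agree, mae


# Hypothetical predicted and observed changes (arbitrary units).
predicted = {"trust": 0.5, "participation": 0.25, "wait time": -0.5}
observed = {"trust": 0.25, "participation": 0.5, "wait time": 0.25}

agreement, error = score_model(predicted, observed)
print(f"direction agreement: {agreement:.2f}, mean absolute error: {error:.2f}")
```

When several competing models exist, the same scoring can be applied to each one, and the model whose predictions sit closest to the observations is the stronger hypothesis.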

Leveraging Strategic Intelligence

PolicySpark’s main purpose is to fill the methodological gaps described above by developing, deploying and refining accessible and easy-to-use tools to produce the strategic intelligence needed to find out what matters and what works. The cornerstone of our approach is an intuitive, hands-on method that uses research evidence and stakeholder evidence (collected in participatory fora) to build visual models that are easy to interpret. The models help identify what we call system “leverage points.” These can be thought of as the small number of key factors that exert the highest levels of influence on chosen outcomes. This information can be used to highlight the best places to intervene in program systems and to construct full program theories. All of this can be used to engage in more efficient design, planning, implementation and assessment. Underlying our approach are powerful, purpose-built data science algorithms that keep humans in control, which we describe below. Similar analytical tools are driving innovation in many areas, and we feel the time has come to apply these techniques to improving policies, programs and other interventions.
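One way to picture a leverage point is as a factor whose influence on the outcome, added up over every causal route, is large. The Python sketch below assumes the model is a weighted directed graph and scores each factor by summing the products of link weights along all simple paths to the outcome. This scoring rule, the function names, and the example factors and weights are assumptions chosen for illustration; the article does not specify the algorithms PolicySpark actually uses.

```python
def path_influence(graph, node, outcome, visited=frozenset()):
    """Sum, over all simple paths from node to outcome, the product of link weights."""
    if node == outcome:
        return 1.0
    seen = visited | {node}  # never revisit a factor on the same path
    total = 0.0
    for target, weight in graph.get(node, {}).items():
        if target not in seen:
            total += weight * path_influence(graph, target, outcome, seen)
    return total


def leverage_points(graph, outcome, top_n=2):
    """Rank factors by the absolute size of their total influence on the outcome."""
    factors = (set(graph) | {t for targets in graph.values() for t in targets}) - {outcome}
    scores = {f: path_influence(graph, f, outcome) for f in factors}
    ranked = sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:top_n]


# Hypothetical causal map: cause -> {effect: signed strength}.
graph = {
    "outreach": {"community trust": 0.75},
    "staff training": {"service quality": 0.25},
    "community trust": {"service quality": 0.5, "program uptake": 0.75},
    "service quality": {"program uptake": 0.5},
    "paperwork burden": {"program uptake": -0.25},
}

for factor, score in leverage_points(graph, "program uptake"):
    print(f"{factor}: {score:+.2f}")
```

In this toy map, community trust ranks first because it reaches the outcome both directly and through service quality. Enumerating every path is only practical for small models; larger ones would need a more scalable measure.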

How the Tools Work

To provide a clearer idea of what these tools look like and how they work, we’ve summarized our process in the following series of steps. What we describe here is a general approach that can be used to investigate and better understand almost any complex social or business problem.

1. Identify key outcomes and sources of evidence. The first step is to get a clear idea of the problem to be investigated. This is done by identifying one or two top-level system outcomes. Once this is clear, we identify all sources of available evidence that describe the problem. Typical evidence sources include research literature, stakeholder knowledge, expert opinion, and available existing datasets (e.g., big data/open data). Evidence need not be restricted to these sources though, and any source of evidence that describes cause and effect can be used.

2. Build evidence-based models. Once all the sources of evidence are lined up, we then build a model for each source. A series of models can be built by the various groups of people around the intervention system, including decision-makers, distinct stakeholder groups, and experts. Research evidence, for example existing evidence extracted from academic literature, is incorporated into a separate model by PolicySpark, often collaboratively with clients. To construct these models, we’ve developed a technique for mapping out all the important causes and effects in a system. We map and connect all the factors together using an intuitive, visual approach that anyone can use and understand. The connections in the models are weighted by attributes such as strength, difficulty and time, a customized approach we’ve developed specifically to better understand complex social and business problems. Our hands-on, participatory approach to capturing stakeholder knowledge builds enthusiasm, interest and trust among stakeholders. This is because people are able to meaningfully and directly participate in developing the programs of which they are a part. Using a variety of evidence helps to build strong ideas (hypotheses) about how systems work. The modeling approach that we use is based on a branch of mathematics called graph theory. Graph theory algorithms power many of the artificial intelligence technologies and business applications that are rapidly expanding in the modern world.

3. Make sense of the models. The raw models that we build with the evidence reflect a fair amount of the complexity that we intuitively know is out there. Really, these models look like tangled spaghetti before we process them. This is actually good, because it tells us that we’ve been thorough in terms of capturing many of the things that could be affecting outcomes. The heart of our approach involves distilling the complex models to remove the noise so that we can see what really matters. This is where the data science and algorithms come in. Running the original complicated-looking (but information dense) models through our algorithmic platform enables us to obtain results that point to a small set of factors (hypothetical causes) that have the largest influences on the outcomes of interest. We call these the system “leverage points.” The leverage points represent truly valuable strategic intelligence. They are essentially very good, evidence-based ideas about where to intervene in the system in order to produce the desired effects. Once we have the leverage points, we can then elaborate on the involved results pathways, identify required behaviour change, and determine more precisely what should go into design, planning, implementation, measurement, and assessment. We can also use the models to carry out simulation, which is the ability to ask “what-if” questions. By manipulating factors in the model, it’s possible to see how the other factors will change, and where the system is predicted to go overall. This can be a powerful way for managers and decision-makers to visualize different hypothetical management scenarios. As mentioned above, it’s important to remember that although our platform uses an algorithmic approach to untangle complexity, we do this in a way that keeps humans in control. We see artificial intelligence as a technology that should augment human decision-making, not replace it. Our tools do not make autonomous decisions. 
Instead, they are built to assist us as we attempt to untangle the causal factors driving complex systems. One of the terms we sometimes use to describe our tools – a term that might be thought of as a refinement on the concept of artificial intelligence – is extended intelligence.

4. Measure outcomes. In part one, and earlier in this article, we outlined the importance of taking a scientific approach to figuring out what matters and what works. In the step above, we essentially use evidence to take the first step along this path, which is to muster a high-quality guess at what’s driving results. We make sense of complexity by identifying system leverage points and then hypothesizing that these leverage points will affect outcomes in particular ways. Measuring outcomes is like taking the second step in the scientific method, which involves collecting information about how the system behaves in the real world (observation and measurement). For the kinds of interventions that we’re talking about here, measurement and observation are usually carried out through the use of indicators. Performance measurement and indicator work sometimes gets confusing and unnecessarily expensive. This is because measurement is often not properly (or not at all) tied to an underlying model. People get confused about what to measure and why and end up measuring a lot of things that don’t need to be measured (and missing important things that should be measured). This comes back to poor or absent underlying models. With the evidence-based, leverage points approach, we are able to generate strong foundational models, which can and should be used to ground performance measurement activity. This makes performance measurement much more rational, straightforward and efficient. The purpose of indicators, which is often misunderstood, is to gather the information needed to test the strength of the original ideas about how the intervention might work (hypotheses). Having good evidence-based models permits us to define a performance measurement framework that identifies and precisely measures a small, but highly strategic set of indicators that will yield the key information necessary (no more and no less) to test the verity of the original set of hypotheses.

5. Test the models. Which brings us to the final step. After creating a set of high-quality models, which support design processes and the creation of a theory of change comprising specific results pathways, and using these to define exactly what should be measured, we’re in a good position to implement (or adjust) the intervention and begin collecting performance information. It’s at this stage that it becomes important to devise a concrete way to test how well things are working. We need to do this because without proper testing, we don’t actually know how good a model is. The real measure of a model is how well it predicts what happens. By using evidence to create the model in the first place, we definitely set ourselves up to win, because it’s more likely that a good evidence-based model will make accurate predictions compared to a model that’s based on guesswork. Our approach is already a significant improvement over current practices, which often neglect evidence and fail to leverage explicit stakeholder participation. But while these tools represent a leap forward in evidence gathering and analysis, we still need to test models to see how strong they really are. The prize that we’re reaching for is a model that does a good job of predicting the key outputs and activities that will produce the desired results. Such a model is extremely valuable. It can guide everything from design to resource allocation to implementation, helping to ensure that time, money and effort are not misspent doing unnecessary and ineffective things. An evidence-based model that stands up to scrutiny functions like a guiding compass for management and decision-making. 
In keeping with the scientific approach to understanding interventions and taking full advantage of the strong foundational models that we develop using the evidence-technology-collaboration alliance, we are able to leverage testing to its fullest extent by not only defining and measuring the right indicators, but by designing and implementing appropriate scientific experiments to probe the strength of the models. The detailed causal hypotheses (foundational models) that we generate using our unique approach make such experimentation not only possible, but highly accessible, practical and inexpensive. In part one of this article, we speculated that one important factor standing in the way of more rational (scientific) policymaking is the perception that doing science on complex interventions necessarily involves the conduct of randomized controlled trials (RCTs), and that RCTs are hard, expensive, and impractical. Our approach circumvents this barrier by taking a different tack. By using evidence and technology to design deep intervention architectures, we’re able to generate rich sets of competing causal hypotheses that can be tested using on-the-fly performance information, which can be collected in a highly targeted and efficient manner. This, combined with appropriate, simplified experimental designs (e.g., ‘N-of-1’ experiments), enables us to provide our clients with heretofore unobtainable insight into how their interventions work.
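The weighted links described in step 2 and the “what-if” simulation described in step 3 can be sketched together in Python. The example assumes a fuzzy-cognitive-map style update, the kind of model used by Giles et al. (2007), in which each factor's next value is a squashed sum of its weighted causes. Whether PolicySpark's platform works this way is not stated in this article, so the Link fields, the simulate function, and every number below should be read as hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    cause: str
    effect: str
    strength: float    # signed influence in [-1, 1]
    difficulty: float  # hypothetical 0..1 rating of how hard the link is to act on
    time: float        # hypothetical delay in months


def simulate(links, clamped, steps=50):
    """Run a fuzzy-cognitive-map style update and return the final activations.

    `clamped` maps factor names to values that are held fixed (the "what-if" levers).
    Only `strength` drives this simple update; difficulty and time are carried along.
    """
    factors = sorted({link.cause for link in links} | {link.effect for link in links})
    state = {f: clamped.get(f, 0.0) for f in factors}
    for _ in range(steps):
        incoming = {f: 0.0 for f in factors}
        for link in links:
            incoming[link.effect] += link.strength * state[link.cause]
        # Clamped factors keep their value; the others are squashed into (-1, 1).
        state = {f: clamped[f] if f in clamped else math.tanh(incoming[f]) for f in factors}
    return state


# Hypothetical model with invented weights.
links = [
    Link("outreach", "community trust", 0.8, 0.3, 6),
    Link("community trust", "program uptake", 0.7, 0.5, 12),
    Link("paperwork burden", "program uptake", -0.6, 0.2, 3),
]

baseline = simulate(links, {"outreach": 0.2, "paperwork burden": 0.5})
what_if = simulate(links, {"outreach": 0.9, "paperwork burden": 0.5})
print(f"uptake at baseline: {baseline['program uptake']:+.2f}")
print(f"uptake with more outreach: {what_if['program uptake']:+.2f}")
```

Holding paperwork burden fixed and raising outreach shows how a change in one lever propagates through community trust to the outcome. The difficulty and time attributes could then be used to compare levers that have similar effects but different costs.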

PolicySpark’s integrated platform models real-world complexity and uses graph theory algorithms to discover the intervention leverage points that drive outcomes.

Empowering People and Organizations

We would portray what we’ve laid out in this article as a crucial missing tool set, grounded in fresh methodological thinking. The main purpose of the tool set is to synergistically integrate and operationalize key ideas that are already recognized as useful in the pursuit of understanding interventions, and to do so in a way that merges the ideas into a product that is more than the sum of its parts. The ideas – building good evidence-based models, leveraging stakeholder knowledge, and using scientific thinking to find out what matters and what works – are certainly not new. In fact, there is growing demand to implement these ideas. What is new is the creation of a clear, explicit, and practical way to bring these ideas together into a unified analytical platform, and specifically, into a platform that is rooted in modern data science, which as we know, is proliferating rapidly in multiple fields and spheres of activity.

Getting a better idea of how results come about has always been important. Whether the resources that are being applied to solve problems come from our own pockets (taxes), from the coffers of collective, multilateral entities, from private donors, or in the case of the private sector, from shareholders, it seems necessary, even ethical, to do our best to spend those resources wisely and responsibly. In many cases, the quality of people’s lives depends on how effectively policies and programs shape the conditions required for the emergence of positive change. These responsibilities become even more relevant as the resource base shrinks under the pressure of the global pandemic, and under other unpredictable stressors that will no doubt arise in the future. To support positive change, our vision for these tools, which we believe outlines a long-overdue shift in how to elicit better results, is to empower people and organizations by helping them produce the precise strategic intelligence that they need to design and implement better interventions. We are dedicated to this vision, and we continue to work with interested people, organizations and institutions to improve outcomes in specific contexts, and to promote the development of new communities of practice centered on participatory, evidence-based problem-solving.

References

Coryn, C. L., Noakes, L. A., Westine, C. D., & Schröter, D. C. (2011). A systematic review of theory-driven evaluation practice from 1990 to 2009. American Journal of Evaluation, 32(2), 199–226.

Funnell, S. C., & Rogers, P. J. (2011). Purposeful Program Theory: Effective Use of Theories of Change and Logic Models. John Wiley & Sons.

Giles, B. G., Findlay, C. S., Haas, G., LaFrance, B., Laughing, W., & Pembleton, S. (2007). Integrating conventional science and aboriginal perspectives on diabetes using fuzzy cognitive maps. Social Science & Medicine, 64(3), 562–576.

Rogers, P. J., & Weiss, C. H. (2007). Theory-based evaluation: Reflections ten years on: Theory-based evaluation: Past, present, and future. New Directions for Evaluation, 2007(114), 63–81.

Whynot, J., Lemire, S., & Montague, S. (2019). The Current Landscape of Program Theorizing. Canadian Journal of Program Evaluation, 33(3).

World Bank. (2015). World Development Report 2015: Mind, Society, and Behavior. Chapter 3: Thinking with Mental Models. Washington, DC: World Bank.