Program evaluation, how to determine whether a program is optimal

Does the discount you offer increase sales of the products you sell? Does a bonus in the form of a 25% increase in product weight increase public interest in buying your products? To answer these questions, an evaluation of the program being implemented is a must. What is a program evaluation, how do you conduct a program evaluation, and how do you evaluate whether a program is effective or not? On this occasion, I will discuss these points.

Program evaluation is a systematic effort that includes data collection and analysis aimed at assessing the effectiveness, efficiency, and impact of a program. Its main purpose is to provide information to stakeholders to improve the program and ensure that its objectives are achieved.

How to conduct a program evaluation?

Evaluators commonly use four methods to assess programs: randomized controlled experiments, natural experiments, nonequivalent controls, and difference-in-differences. Each of these methods has its own advantages and disadvantages, which I will discuss here.

Randomized, controlled experiment

The first program evaluation method, and the one generally considered the most reliable, is the randomized controlled experiment. In this method, researchers randomly divide a group into two and give each part a different treatment: typically one part receives the program and the other does not. There are two main challenges in implementing this method.

The first challenge is that many experiments cannot be conducted directly on humans. In practice, we can only run experiments on people when the expected outcome is positive. For example, we cannot assign couples to divorce in order to study the effects of divorce, and we cannot simply test an unproven new drug directly on humans. Applying this method to people therefore requires a specific strategy.

The second challenge is that humans vary far more than laboratory rats. People differ in height, gender, and race, and even in habits: teachers, for example, have different habits from entrepreneurs. How can we ensure that all these differences do not interfere with the study? The best method is to divide the group into two at random. The strength of randomization is that, when the group is large enough, this variation ends up distributed roughly evenly between the two groups.
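To make the idea of random assignment concrete, here is a minimal sketch in Python. The population, the `height` and `job` fields, and the `randomize` helper are all hypothetical and simulated for illustration; they do not come from any real study.

```python
import random
from statistics import mean

def randomize(participants, seed=0):
    """Shuffle participants and split them into treatment and control halves."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the original order is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def share_teachers(group):
    """Fraction of a group whose job is 'teacher'."""
    return sum(p["job"] == "teacher" for p in group) / len(group)

# Hypothetical, simulated population: height in cm and an occupation.
rng = random.Random(42)
population = [
    {"height": rng.gauss(165, 10), "job": rng.choice(["teacher", "entrepreneur"])}
    for _ in range(10_000)
]

treatment, control = randomize(population, seed=1)

# With random assignment, both gaps should be close to zero.
height_gap = mean(p["height"] for p in treatment) - mean(p["height"] for p in control)
teacher_gap = share_teachers(treatment) - share_teachers(control)
print(f"height gap: {height_gap:.2f} cm, teacher share gap: {teacher_gap:.3f}")
```

With a group this large, the two halves should look very similar on both characteristics, even though nobody matched them by hand. With a small group, the gaps can be much larger, which is why sample size matters.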

In the field of medicine, randomized controlled trials are the most commonly used evaluation method. The aim is to determine whether the treatment or medication given by doctors has a real effect or is merely a placebo effect (the patient feels better after receiving a dummy treatment). One application of this method is a study in which researchers wanted to know whether surgery to reduce leg pain has a significant effect. The first group underwent real surgery on their legs, while the second group underwent “sham surgery,” in which the doctor made incisions on their legs and acted as if he were performing surgery. The results showed that the real surgery did not reduce leg pain significantly more than the sham surgery did.

Natural experiment

The next program evaluation method is the natural experiment. Sometimes researchers are unable to conduct randomized controlled experiments due to limited funds and resources. One alternative is to use a natural experiment. A natural experiment is a study in which researchers do not directly control a variable. Instead, a natural event, a policy change, or another external factor affects a group, so that the group can be compared before and after the event to look for a cause-and-effect relationship.

An example of a natural experiment close to home in Indonesia is compulsory education: in the past it lasted 9 years, but now it lasts 12 years. Researchers then looked for the relationship between length of education and income level. The results showed that people with 12 years of education had higher incomes than those with 9 years of education. The explanation offered is that people with 12 years of education have more skills and access to better jobs, so at the same age they earn more.

Nonequivalent control

The next program evaluation method is nonequivalent control. There are some conditions where researchers cannot assign participants to groups at random. In that case, they compare existing groups and hope that the two groups are not too different from each other apart from the treatment. The good news is that we can still use treatment and control groups. The bad news is that non-random selection of the groups we are studying can create a very large bias.

One example of its application is a researcher who wants to study “The Effect of Morning Exercise Programs on the Physical Fitness Levels of Elderly People in Nursing Homes.” The researcher conducted observations at two different nursing homes, Nursing Home A and Nursing Home B. Before the study, the researcher tested the health of all elderly participants. Nursing Home A held exercise sessions three times a week, while the residents of Nursing Home B did not exercise at all. After three months, the results showed that the residents of Nursing Home A experienced greater health improvements than those in Nursing Home B.
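Because the groups are not randomized, the baseline test is what tells us how comparable the two nursing homes were to begin with. Here is a minimal sketch of one way to check this, using a standardized mean difference. The fitness scores are made up for illustration, and the choice of this particular statistic is my assumption, not part of the example above.

```python
from statistics import mean, stdev

# Hypothetical baseline fitness scores (0-100), measured before the program.
home_a_baseline = [52, 61, 58, 49, 66, 55, 60, 57]
home_b_baseline = [50, 59, 54, 47, 63, 56, 58, 53]

def standardized_mean_difference(a, b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = ((stdev(a) ** 2 + stdev(b) ** 2) / 2) ** 0.5
    return (mean(a) - mean(b)) / pooled_sd

smd = standardized_mean_difference(home_a_baseline, home_b_baseline)
print(f"baseline SMD: {smd:.2f}")
```

The further this value is from zero, the more the two homes differed before any exercise took place, and the more cautious we should be about crediting the exercise program for the later gap.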

Difference in differences

One of the best ways to learn something is to try it and see what happens, and then we can learn from it. Like a little kid who learns that if he cries, he’ll get candy. We sometimes do this too, as adults. What if I only drank water for a week? What if I offered a discount every Friday and observed the impact? Would there be a decrease in weight after a week of drinking only water? Would sales increase after offering the discount?

The difference-in-differences (DiD) method works by comparing the change from before to after in a group that received the treatment (the treatment group) with the change over the same period in a group that did not (the control group). The formula for difference-in-differences can be written as follows:

$$\mathrm{DiD} = (Y_{\text{treatment, after}} - Y_{\text{treatment, before}}) - (Y_{\text{control, after}} - Y_{\text{control, before}})$$

Let us look at a case example of difference-in-differences: the effect of raising the minimum wage on the unemployment rate. Suppose there is a policy to raise the minimum wage in City A (treatment), while in City B there is no increase in the minimum wage (control). We want to know the impact on the unemployment rate:

  • Before the policy
    • City A : 5%
    • City B : 6%
  • After the policy
    • City A : 7%
    • City B : 6.5%

If we only look at the change in City A, the unemployment rate rose by 2 percentage points. However, DiD calculates the difference between the change in City A and the change in City B.

$$\mathrm{DiD} = (7 - 5) - (6.5 - 6) = 2 - 0.5 = 1.5 \text{ percentage points}$$

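This means that the estimated net impact of the policy on City A is a 1.5 percentage point increase in unemployment. In other words, had the policy not been implemented, unemployment in City A would likely still have risen by 0.5 percentage points, following the same underlying trend seen in City B. This reading only holds if the two cities would have moved in parallel without the policy.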
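The same calculation can be sketched in a few lines of Python. The function name `did` and its parameter names are my own, chosen for illustration; the numbers are the City A and City B rates from the example.

```python
def did(treat_before, treat_after, control_before, control_after):
    """Difference-in-differences: change in treatment minus change in control."""
    return (treat_after - treat_before) - (control_after - control_before)

# Unemployment rates in percent, from the minimum wage example.
city_a_before, city_a_after = 5.0, 7.0   # treatment: minimum wage raised
city_b_before, city_b_after = 6.0, 6.5   # control: no change in minimum wage

effect = did(city_a_before, city_a_after, city_b_before, city_b_after)

# Counterfactual: where City A would be if it had only followed City B's trend.
counterfactual = city_a_before + (city_b_after - city_b_before)

print(effect)          # 1.5 (percentage points)
print(counterfactual)  # 5.5 (percent)
```

The counterfactual line makes the logic explicit: City A ended at 7%, while City B's trend alone would have put it at 5.5%, and the gap between the two is the 1.5 point estimate.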
