Quasi-experimentation: a critical evaluation
Originally written for a Psychology Research Methods Course in 1997.
Introduction
Quasi-experimentation is a method that developed out of the realities of investigating cause in the social world. Underlying the procedures of quasi-experimentation are certain assumptions regarding the reality and knowledge of causation. I will argue that quasi-experimentation is not only a method for investigating cause in the complex social world, but also a reflection of that social world. Firstly, exploring causality in the social world is complicated and difficult, so certain assumptions and practices are adopted, some of which are problematic. Secondly, the experimenter is both a product of and a participant in this social world. Thirdly, both the practice of method and the interpretation of results are undertaken in the context of a community of study and a wider social system. The usefulness of quasi-experimentation as a method is ultimately dependent on the individual’s judgement and the critique of the community of study.
I will discuss how the social world has created the need for quasi-experimentation. The role of the individual and the group in the concept and process of plausibility will be outlined. The concept of validity, and internal validity in particular, will be examined. Quasi-experimentation’s basis in falsification and manipulability will be explained and critiqued. Alternative methods for exploring causation in the field will be briefly discussed before finally inspecting quasi-experimentation as a critical method.
Cook and Campbell, the major authorities on quasi-experimental method, define quasi-experiments as “experiments that have treatments, outcome measures, and experimental units, but do not use random assignment to create the comparisons from which treatment-caused change is inferred” (Cook and Campbell, 1979: 6). Campbell and Stanley (1966) and Cook and Campbell (1979) set out to establish the “threats to validity” in conducting research without randomisation, where validity is “the best available approximation to the truth … of propositions” (Cook and Campbell, 1979: 37). The main focus of these works was to identify possible threats and the designs that eliminate particular threats. Cook and Campbell divide validity into four categories. The first is internal validity, which is concerned with establishing causality at the level of sample and setting (Cook, 1991: 128). The second, statistical conclusion validity, is concerned with reaching the correct conclusion regarding the source of covariation (Cook and Campbell, 1979: 37). The third is external validity, that is, the validity of generalisations “to and across alternate measures of the cause and effect and across different types of persons, settings and times” (Cook and Campbell, 1979: 37). The fourth, construct validity, is concerned with generalising across and to constructs of both cause and effect (Cook and Campbell, 1979: 38).
Why is there a need for quasi-experimentation?
The complexity of the social world creates the need for quasi-experimentation and consequently shapes its form. Moving from the controlled environment of the laboratory to the “field setting” introduces numerous alternative explanations of an observed phenomenon, so we have immediate problems in isolating the cause. Ensuring the comparability of groups is one way to reduce the number of alternative explanations. If we want to investigate causation in the social world, however, we are constrained in our ability to establish comparable groups. The most ‘legitimate’ way to establish comparability is randomisation, which creates comparability on average (Cook and Campbell, 1979).
Quasi-experimentation developed specifically out of difficulties associated with randomisation. Randomisation is not always possible: its feasibility depends on ethics, politics, and logistics (Cook, 1986: 76). For instance, if we wanted to investigate some effect of violence on an individual, it would be unethical to actually deliver an unfavourable treatment. Instead we must look at groups that have already experienced or are currently experiencing violence. Similarly, if we wanted to explore the effects of rare events, for instance natural disasters, it would be impossible to simulate the treatment. Connected with the difficulty of establishing a randomised experiment is the tendency for a randomised experiment to break down. Randomisation only affects the “comparability” of groups at the outset of the experiment, and so any “treatment-correlated-attrition” will have a negative influence on the research (Cook, 1991: 117). Cook and Campbell identify two further potential problems with randomisation. Firstly, they suggest that “inter-unit” communication contributes to the breakdown of randomisation, as recognising inequities of treatment may itself have effects. This is more likely the “lower the unit of aggregation studied”, such as the individual student versus a whole class. However, the higher the unit of aggregation studied, the more costly and difficult it is to achieve comparable groups. Secondly, they note that randomisation is not the normative form of allocation and this may impact research (Cook and Campbell, 1986: 148-9).
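The two claims above, that randomisation creates comparability only on average and that treatment-correlated attrition can undo that comparability, can be illustrated with a small simulation. The following Python sketch is not drawn from Cook and Campbell; the baseline covariate, the sample sizes, and the attrition rule are all hypothetical choices made purely for illustration.

```python
import random
import statistics


def randomise(units, rng):
    """Randomly split units into treatment and control groups of equal size."""
    shuffled = units[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_difference(treatment, control):
    """Difference between group means on the baseline covariate."""
    return statistics.mean(treatment) - statistics.mean(control)


def simulate(n_units=200, n_replications=2000, seed=1):
    rng = random.Random(seed)
    diffs_at_outset = []
    diffs_after_attrition = []
    for _ in range(n_replications):
        # Hypothetical pre-treatment covariate, e.g. a baseline test score.
        units = [rng.gauss(50, 10) for _ in range(n_units)]
        treatment, control = randomise(units, rng)
        diffs_at_outset.append(mean_difference(treatment, control))
        # Assumed attrition rule: treated units with a baseline below 45
        # drop out; the control group loses nobody.
        remaining = [score for score in treatment if score >= 45]
        diffs_after_attrition.append(mean_difference(remaining, control))
    return {
        "mean_diff_at_outset": statistics.mean(diffs_at_outset),
        "sd_diff_at_outset": statistics.stdev(diffs_at_outset),
        "mean_diff_after_attrition": statistics.mean(diffs_after_attrition),
    }


results = simulate()
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the baseline difference between groups averages close to zero across many randomisations, yet its spread shows that any single randomisation can still leave the groups noticeably unequal. Once attrition depends on the treatment, the surviving treatment group differs systematically from the control group, even though the original allocation was random.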
In a closed system such as the laboratory, most extraneous forces can be assumed away due to the controls on the environment (Cook and Campbell, 1979: 7-8). Cook and Campbell state some concern with conducting experiments in a closed system: “While the laboratory could make many forces independent that are normally correlated, the fear was that in so doing human phenomena would be removed from the very contextual and social arrangements that give them human meaning” (Cook and Campbell, 1986: 150). They identify three main points of concern regarding the laboratory. The first is the possible uniqueness of the laboratory’s “social culture”. The second is the limit the laboratory places on the “range of human phenomena that can be studied”. For example, the length of time that can be practically studied in the laboratory is limited. The third is that there is no reason to assume we can generalise to nonlaboratory settings and there is no available mechanism to make this generalisation (Cook and Campbell, 1986: 149-150). This concern with generalising beyond the limits of the study is part of the external validity of the research, and it is one strength of quasi-experimentation over experimentation in laboratory settings. The point is made well by Cook: “Quasi-experimentalists seek to conduct research in settings to which extrapolation is desired” (1983: 77).
Plausibility
The plausibility of alternative hypotheses is a recurrent theme throughout the literature on quasi-experimentation. Presented with the results of a quasi-experiment, our task is to decide which threats to validity have been ruled out by the particular design and then examine the plausibility of other possible threats to validity. These are our alternative hypotheses. Although plausibility is not unique to quasi-experimentation, it is clear that quasi-experimentation “is more dependent on criticism than are other forms of experimentation” (Cook, 1983: 90). The fallibility of human judgement implies the potential fallibility of plausibility. We can consider plausibility at two levels of analysis: firstly at the level of the individual and secondly at the level of the group. Cook and Campbell make this clear: “Ultimately, it is researchers’ and their critics understanding of the subject matter under study at the time which determine whether a threat should be treated as plausible” (Cook and Campbell, 1986: 155).
At the level of the individual there are a number of important factors. Cordray (1986) points out that the role of judgement has been neglected because the focus has been on more theoretical considerations. Cordray acknowledges some of the limitations of human cognition and perception in identifying cause. Shadish briefly mentions other problems at the individual level, such as incomplete knowledge and difficulties in perceiving bias (1986: 30). Cook has also referred to the individual level, identifying common sense as another important factor (Cook, 1983: 92). Campbell is especially harsh in his evaluation of the individual: “scientists are thoroughly human beings: greedily ambitious, competitive, unscrupulous, self-interested, clique-partisan, biased by tradition and cultural memberships, given to mutual backscratching and the like” (Campbell, 1984: 31).
Quasi-experimentation’s theorists have recognised quasi-experimentation’s reliance on judgement and have sought to increase “intersubjective reliability”, an approximate objectivity, by including the judgements of many individuals (Cook, 1983: 83). Thus, following on from the level of the individual is the importance of the group. Cook and Campbell note that at the level of collectivity we have our best “substitute for ‘proof’” (1986: 146). But, as Cook notes, plausibility reflects the “social constructions of a particular time and place”, and therefore collective judgement will not always guarantee valid inferences (Cook, 1991: 118). For example, the positions of the stars and planets were held as a plausible explanation of human behaviour for a long time. Because plausibility is rooted in a particular time, establishing it is a process that continues not only through a particular procedure’s implementation and initial analysis but also past the lifetime of the community of study.
The individual and group levels interact. What is considered plausible by an individual is a function of both learning from the group and group processes that tend towards conformity. The researcher is a product of a particular community of study and a continuing participant in a community of study. The individual’s standing in the group will partly determine the attention given to the research and the legitimacy of their findings.
Validity
We have already briefly discussed the conception and categorisation of validity. The threats to validity have been developed out of researchers’ experience and it is recognised throughout the literature that the list is not exhaustive. A concern of quasi-experimentation’s critics has been the importance it gives to internal validity, primarily because this emphasis neglects generalisation. This, according to Cook, is due to different “priorities in what is worth knowing” (Cook, 1983: 87). The disagreement is especially important in applied social science, where there are those stressing ‘applied’ and others stressing ‘science’. External validity is not neglected (in fact Campbell invented the term), but the importance given to internal validity is due both to the high costs of being wrong about it and to the recognition that generalisation is meaningless if causal inferences are wrong (Cook, 1983: 87). An overemphasis on internal validity over external validity could restrict possibilities of triangulation, as we may have no basis on which to compare different studies.
For the purposes of phenomena detection, internal validity can be seen as crucial. This is the level at which phenomena are detected. Internal validity is concerned with the “relationship between the research operations irrespective of what they theoretically represent” (Cook and Campbell, 1979: 38). Emphasis on internal validity is consistent with an abductive theory of scientific method where the concern is first with establishing the existence of a phenomenon, and from there generating a relevant theory (Haig, 1996: 192). Cook and Campbell advocate a concern with establishing a “factual base” before “elaborate theorizing”, citing some areas of psychology as an example to the contrary (Cook and Campbell, 1979: 25).
Popper and Falsification
The debt to Popper’s conception of falsification is made explicit throughout the literature on quasi-experimentation. Popper’s view is that the only logically possible knowledge is deductive (Cook and Campbell, 1979: 20). This means that we cannot prove a theory, only disprove it. The route to “establishing a scientific theory is one of eliminating rival hypotheses”. Cook and Campbell also stress that alternative theories must coincide with disconfirming observations for a theory to be falsified (Cook and Campbell, 1979: 23). The theories we are concerned with falsifying are those regarding rival threats to validity. Briefly stated, the position held in the field of quasi-experimentation is that: “At best, one can know what has not been ruled out as false” (Cook and Campbell, 1979: 37).
In practice there are limits to falsification. Firstly, falsification is contrary to “the dogma that one cannot prove the null hypothesis” (Cook, 1983: 86). The conventions of statistics in the social and behavioural sciences dictate that we can never accept the null, which would be a requirement to falsify a theory. Secondly, falsification of a theory relies on the accuracy of observation (“theory-laden” or otherwise), of measurement, and of the implementation of method. To falsify a theory requires that we assume that the theories of measurement have been proven. Another consideration is that, when dealing with probabilistic explanations, as favoured in the ontology of quasi-experimentation (Cook, 1983), there is weak justification for falsifying on the basis of a sample.
The postpositivists have attacked falsification on the basis of its assumption of the comparability of theories, an assumption necessary for falsification to hold. The postpositivists have posited the “theory-ladenness of facts”, that is, that observations are “impregnated with the theory or paradigm under which they were collected” (Cook and Campbell, 1979: 23). The “single-theory-ladenness of facts” was rejected by Cook and Campbell (1979: 24) and Cook (1983: 83), but their explanation of “multiple-theory-ladenness” on its own does not discount the criticisms of the postpositivists. They assume a symmetrical distribution of theories that will tend to even out the bias inherent in observations. They ignore the possibility of a skewed distribution of theories. “Intersubjective reliability” (Cook, 1983: 83), their approximation of objectivity, is conditional on the community of critique not being biased towards theories that are important to the question under consideration and not being blind to its own bias.
Recognising the fallibility of perception and judgement of human beings necessitates a qualification to the strict observance of falsification even when accompanied by theory. Thus we have a tentative falsification that must allow for imperfect or biased observation, imperfect measurement and implementation of the design, and also limitations on statistical analysis.
Manipulability
Cook and Campbell’s (1979) conception of causation tends towards one of manipulability. This follows from the “everyday” connotation of cause (Cook and Campbell, 1979: 25) and practical considerations of conducting experimentation, which make it easier to infer causation if we are actually varying some treatment (Cook and Campbell, 1979). Cook has stated: “Quasi-experimentation … probes individual manipulanda that are presumed to have immediate implications for modifying the external world. The testing of causal laws, the generation of causal understanding, are not its strong points” (1983: 80). Therefore, although manipulability is limited in the phenomena detected, it is practical in the sense that we identify phenomena that we can have some influence on.
Manipulability is problematic in that we still must analyse the treatment to identify which components are producing the impact. It is here, according to Chen and Rossi, that “a priori knowledge and theory” are important (1984: 348). Cook and Campbell discuss other critiques of manipulability. One that is important for the practice of research is the difficulty in delivering a particular treatment in the social sciences (Cook and Campbell, 1986: 172).
A consequence of Cook and Campbell’s emphasis on manipulation is a concern with a particular level of analysis. This is evident in the first of their statements outlining their position on causation: “Causal assertions are meaningful at the molar level even when the ultimate micromediation is not known”. They define molar as “causal laws stated in terms of large and often complex objects” and micromediation as “the specification of causal connections at a level of smaller particles than make up the molar objects and on a finer time scale” (Cook and Campbell, 1979: 32). Mark (1986) criticises the lack of emphasis on the causal process in quasi-experimentation. He claims that knowledge of the “mediating linkages” implies fuller knowledge of the requisite conditions (1986: 57). He argues that a more thorough understanding of the process strengthens construct validity and is also important for understanding how far we can generalise.
Alternatives
What are the alternatives to investigating cause in an open system? Firstly, although often impossible or at least difficult to perform, randomised experiments are presented as the ideal consistently throughout the literature on quasi-experimentation. Cook has said that Campbell “ceaselessly advocated random experiments” (1991: 137). However, randomisation is not as ideal as indicated. Cook and Campbell (1979), besides indicating the difficulties associated with the possibilities of implementation and preventing the breakdown of the experiment, make it clear that randomisation does not control for the threats to external validity and only controls for some threats to internal validity.
The claim that is made to justify randomisation is that it overcomes the threats to internal validity by establishing equivalence of groups at the outset of the experiment. However, Urbach (1985) has indicated a major problem that is not considered in the literature on quasi-experimentation. As groups are only randomised at the outset, we cannot assume that they are treated the same over the course of the experiment. For example, if the treatments are applied to each group at different time periods, there may be some explanatory factor associated with those particular points in time. According to Urbach, to compensate for the infinite number of sources of error we must perform an infinite number of randomisations (Urbach, 1985: 263). Urbach points towards plausibility as a criterion to overcome these problems with randomisation, stating that we should “control for those factors judged to be significant” (Urbach, 1985: 267).
Qualitative methods are another major alternative practised throughout the social sciences. Campbell has noted that qualitative methods share with quasi-experimentation a “plausible-rival-hypothesis approach” (1984: 31), the aim being to eliminate rival hypotheses. He claims that interpreting the data from a quantitative approach requires “situation-specific wisdom” in common with qualitative analysis (1984: 34). However, Cook (citing Campbell) highlights the difficulty in ruling out “all threats to internal validity” (Cook, 1991: 139). In the case of qualitative research the judgements of the researcher and the community of study become even more important than in the case of quasi-experimentation.
Causal modeling is another technique used in place of quasi-experimentation which, according to Cook and Campbell, is best suited to studying “nonmanipulable causal forces” in certain circumstances (Cook and Campbell, 1986: 174). Furthermore, they claim that good causal modeling is more theory dependent than quasi-experimentation, as many possible models can be fitted to the data (Cook and Campbell, 1986: 175).
A critical method?
We can see in the development of quasi-experimentation a concern with critical methodology. This is evident in that many of the criticisms have been identified and acted on by Cook and Campbell, who are the main authorities on quasi-experimentation. One example is the strong criticism by Campbell of Suchman’s advocacy of Campbell and Stanley as “the Bible for evaluation” (Cook, 1991: 136). Their self-criticism follows from the ontological and epistemological positions that they have outlined in the literature (Cook, 1983; Cook and Campbell, 1986; Cook, 1991).
An important change has occurred in quasi-experimentation concerning the degree of complexity and specificity of the techniques used. Cook describes this as the outworking of the addition of verificationist elements. He further states that: “the more quantitatively specific or complex verified causal predictions are the more difficult it will usually be to generate alternative interpretations matching the same data pattern” (Cook, 1991: 121). This is perhaps more relevant with regard to deductive reasoning: the actual testing of theory.
The reliance on differing forms of triangulation has also become more important. This is made most explicit with Cook’s “critical multiplism”, attempting to overcome the limits of individual assessments of plausibility and theory-ladenness of observations by exposing the research to a wider and larger audience of critique and investigating the same problem with different designs (Cook and Campbell, 1986: 173; Shadish, Cook, and Houts, 1986). Cook (1983) further discusses the triangulation of measurement and data analysis.
Part of the power of quasi-experimentation lies in its ability to take on elements of different techniques of analysis. Cook and Campbell (1986) have discussed integrating quasi-experimental designs and causal modeling. They have also discussed its consistency with meta-analysis.
The critical development of a method is only one important consideration. A critical method also involves critical practice. Cordray laments the poor practice of quasi-experimentation in the field of evaluation (1986: 9). The danger in applying quasi-experimentation’s ‘threats to validity’ as a checklist for sound inference has been identified. Cook discusses the fear that this can lead to an “uncritical” emphasis on techniques over qualitative knowledge (Cook, 1983: 77). Cook and Campbell also express concern that a checklist approach may mean that researchers miss threats not on the list and threats that are expressed in unconventional ways in the field (Cook and Campbell, 1986: 154-5).
Conclusion
It is clear that exploring causation in the social world creates many problems for the researcher. Quasi-experimentation has developed in response to these problems. A recognition of the role of the individual and of the social nature of quasi-experimentation is an important step in conducting quasi-experiments. We should further recognise quasi-experimentation as a social process, involving not only the physical process of experimentation, but also the process of inference within and beyond a community of study. The strength of internal validity is something we must pay attention to as we explore quasi-experimentation’s role in phenomena detection. In discussing the concepts of falsification and manipulability some practical problems have been identified. However, in examining quasi-experimentation along with its alternatives we can see that it does present itself as the most satisfactory under certain circumstances. The concern with self-criticism by quasi-experimentation’s theorists is admirable, although we must distinguish between critical theory and practice. Quasi-experimentation’s consistency with various techniques of analysis and its adaptability to criticism justify its usefulness in studying the social world.
Works Cited
Campbell, Donald T., and Stanley, J. C. 1966. Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally.
Campbell, Donald T. 1984. “Can We Be Scientific in Applied Social Science?” Evaluation Studies Review Annual. Ed. R. F. Conner. Vol. 9. Beverly Hills, Calif.: Sage.
Chen, H., and Rossi, P. H. 1984. “Evaluating with Sense: The Theory-Driven Approach”. Evaluation Studies Review Annual. Ed. R. F. Conner. Vol. 9. Beverly Hills, Calif.: Sage.
Cook, Thomas D. 1983. “Quasi-Experimentation: Its Ontology, Epistemology, and Methodology”. Beyond Method. Ed. Gareth Morgan. London: Sage. 74-94.
Cook, Thomas D. 1991. “Clarifying the Warrant for Generalized Causal Inferences in Quasi-Experimentation”. Evaluation and Education: At Quarter Century. Ninetieth NSSE Yearbook, Pt. 2. Chicago: University of Chicago Press. 115-144.
Cook, Thomas D., and Campbell, D. T. 1979. Quasi-Experimentation: Designs and Analysis for Field Settings. Chicago: Rand McNally.
Cook, Thomas D., and Campbell, D. T. 1986. “The Causal Assumptions of Quasi-Experimental Practice”. Synthese, 68: 141-180.
Cordray, David S. 1986. “Quasi-Experimental Analysis: A Mixture of Methods and Judgement”. Advances in Quasi-Experimental Design and Analysis. Ed. William M. K. Trochim. New Directions for Program Evaluation, 31. San Francisco: Jossey-Bass. 9-27.
Haig, Brian D. 1996. “Statistical methods in education and psychology: A critical perspective”. Australian Journal of Education, 40: 190-209.
Mark, Melvin M. 1986. “Validity Typologies and the Logic of Quasi-Experimentation”. Advances in Quasi-Experimental Design and Analysis. Ed. William M. K. Trochim. New Directions for Program Evaluation, 31. San Francisco: Jossey-Bass. 47-66.
Shadish, William R., Jr., Cook, Thomas D., and Houts, Arthur C. 1986. “Quasi-Experimentation in a Critical Multiplist Mode”. Advances in Quasi-Experimental Design and Analysis. Ed. William M. K. Trochim. New Directions for Program Evaluation, 31. San Francisco: Jossey-Bass. 29-46.
Urbach, Peter. 1985. “Randomization and the Design of Experiments”. Philosophy of Science, 52: 256-273.