
Online Reading: Electronic Textbook: Limitations and Delimitations

As I mentioned earlier, I realize that, strictly speaking, these are not part of our "bare-bones, 1st-shot" rough outline of the doctoral research proposal. However, I have noticed many students struggling with the meaning of these terms in Dissertation Seminar, when they are trying to package a prospectus. Thus, I want to give you a head start on what these terms mean, and how they are distinctly different.
  1. Limitations

    These relate to a concept called the internal validity of your study design.

    Read about Internal Validity by Bill Trochim of Cornell




    It is unfortunate, by the way, that research terminology has used the same base term, "validity," in two distinctly different ways.

    1. Our Intro partners already know (and the rest of you will, too, soon!) that when referring to instrumentation construction, validity means "did I measure what I thought I did?" If I say it's a test of "depression," is that what I'm truly picking up -- or could my responses accidentally be contaminated or confounded by, say, "anxiety," which I'm told by ed. psych. friends is actually a surprisingly close correlate of "depression"?

    2. When referring to the design methodology, and when prefaced as internal validity, validity has a closely related meaning. It refers to the credibility or believability of the findings and results. In other words, "did things happen for the reasons that you say they did? or could those findings and results, and therefore your conclusions, have become accidentally contaminated by some other variable(s) or factor(s) that you were unable to control, randomize, match subjects on, etc.?"

    If you've read through Introduction to Research Module 4, you have probably concluded that the experimental family of designs has the highest internal validity. This is due to their relatively "tight control" and/or randomization of a wide variety of potential contaminating factors or variables. Through these procedures, we can be "more sure" that 'the treatment caused the outcome(s)' we observe. Another way of looking at this is: we can safely attribute the results to that treatment. This control and randomization of as many other potential contaminants or causes as possible is what gives our belief its high credibility -- i.e., its internal validity.
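
    To make the logic of randomization concrete, here is a minimal sketch in Python. (The subject pool, the 'IQ' confound, and all of the numbers are hypothetical, invented purely for illustration -- they are not from the reading.)

        import random

        random.seed(1)  # fixed seed so the demo is reproducible

        # 20 hypothetical subjects, each carrying a potential confound (IQ)
        subjects = [{"id": i, "iq": random.gauss(100, 15)} for i in range(20)]

        # Random assignment: shuffle the pool, then split it in half
        random.shuffle(subjects)
        treatment, control = subjects[:10], subjects[10:]

        def mean_iq(group):
            return sum(s["iq"] for s in group) / len(group)

        print("mean IQ, treatment group:", round(mean_iq(treatment), 1))
        print("mean IQ, control group:  ", round(mean_iq(control), 1))

    Because every subject had an equal chance of landing in either group, IQ (and any other pre-existing factor, measured or not!) can differ between the groups only by chance -- which is precisely why randomization lets us attribute the outcome to the treatment.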

    In a general sense, in "Limitations" you'd cast a critical eye on your design -- which, by the way, is not to imply that the design is "bad" if it isn't experimental in nature! In fact, we'll soon see that we have to 'pay a price' for the 'high internal validity' of experimental designs! And for your "Limitations," you would "thoroughly brainstorm" a listing of as many such potential "causes," "contaminants," "uncontrolled factors," etc., as possible that could very well have 'crept into' your study design and produced the results that you obtained.

    To get you started on such brainstorming, let me share with you a listing of some commonly accepted "families" of "Limitations," or "threats to internal validity." These happen to be from William Wiersma, Research Methods in Education, 6th ed. (1994), Simon & Schuster. Virtually any introductory and/or intermediate research design book will list and define these terms similarly, however: they're 'that universal!'


    Table 1.
    Threats to Internal Validity
    (adapted from Wiersma, Research Methods in Education)

    1. History: unanticipated events occurring while the study is in progress that affect the dependent variable(s).
       Example: During a relatively short instructional experiment, one group of subjects misses some instruction because of a sudden power failure at the school.

    2. Maturation: processes and changes occurring within the subjects simply as a function of the passing of time, rather than anything "done" or "not done" by the researcher.
       Example: In a learning experiment, subject performance begins decreasing after about 50 minutes simply because of fatigue.

    3. Testing: the effect of taking one test upon the scores of a subsequent test.
       Example: In a study in which performance on a logical reasoning test is the dependent variable, the content of the pretest 'cues' the subjects about what is likely to appear on the posttest.

    4. Instrumentation: an effect due to inconsistent use of the measuring instrument(s).
       Example: Two assistants in an instructional study administered the posttest with slightly different instructions and procedures (e.g., the maximum time allowed for the students to complete one section before proceeding to the following section).

    5. Statistical regression: an effect caused by a tendency for subjects to 'regress' from extreme high or low initial scores back toward a more 'moderate' or 'average' level of performance on subsequent tests. (See the simulated sketch following this table.)
       Example: In a study involving reading instruction, subjects initially grouped on the factor of 'poor' pretest reading scores show considerably greater gains than the average readers. It is important to keep in mind that this is a quantitative artifact that is bound to happen anyway, regardless of the quality or impact of the 'treatment' (e.g., the reading instruction), simply due to the 'poor' initial grouping.

    6. Differential selection of subjects: an effect due to the groups of subjects not being randomly assigned or selected (NOTE: PLEASE SEE INTRO #5 ON THIS!); rather, a selection factor is operating such that the groups are not equivalent.
       Example: The 'experimental' group in an instructional experiment happens to consist of a 'high-IQ' class, while the control group happens to be an 'average-IQ' class. The question arises: wouldn't the first group have been likely to do better anyway, regardless of the impact or quality of the 'treatment'?

    7. Experimental mortality, or differential loss of subjects: an effect due to subjects dropping out of the study on a non-random basis.
       Example: In a health experiment designed to determine the effects of various exercises, those subjects who find exercise to be most difficult also stop participating.

    8. Selection-maturation interaction: an effect of maturation not being consistent across the groups due to some selection factor.
       Example: In a problem-solving experiment, intact groups of junior high school students and senior high school students are involved. The junior high subjects happen to tire of the task sooner than the older, senior high subjects.
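
    Statistical regression (threat 5, above) is easy to demonstrate by simulation. The Python sketch below is a minimal illustration with invented numbers: a stable 'true ability' of roughly N(100, 15), independent measurement noise on each testing occasion, and no treatment whatsoever.

        import random
        from statistics import mean

        random.seed(2)

        # 1,000 hypothetical subjects: a stable true ability, plus fresh,
        # independent measurement noise on the pretest and the posttest
        ability = [random.gauss(100, 15) for _ in range(1000)]
        pretest = [a + random.gauss(0, 10) for a in ability]
        posttest = [a + random.gauss(0, 10) for a in ability]  # NO treatment!

        # Group subjects on 'poor' pretest scores, as in the reading example
        poor = [i for i, score in enumerate(pretest) if score < 85]

        print("poor group, mean pretest: ", round(mean(pretest[i] for i in poor), 1))
        print("poor group, mean posttest:", round(mean(posttest[i] for i in poor), 1))

    The 'poor' group's posttest mean drifts back toward 100 even though nothing at all was done to the subjects: some of them landed in that group only because of unlucky pretest noise, and the bad luck does not repeat on the posttest. Any 'gain' a researcher claimed for this group would be pure artifact.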



    You would, again, list-enumerate (some chairs prefer the 1., 2., 3., indented format) any and all such "threats to internal validity" of your study back under the Chapter One "Limitations" subsection. (By the way, it seems a bit illogical to have these back in Chapter One, when they pertain so closely to the Chapter Three design methodology terms, doesn't it?! Ah, such is life ...! And it's why this "user-friendly methods chair/person" on the dissertation committee finds herself saying to her beloved advisees: "Once we've identified any and all relevant design buzzwords for your Chapter Three methodology subsection, let's take one last look back at your Chapter One Limitations, just to make sure that you've got 'em all!" And more often than not, the dissertation writer finds him/herself adding to the Chapter One Limitations at this point -- to "cover" any additional design buzzwords that he/she may not initially have thought of, but now realizes pertain to his/her study.)

    IMPORTANT! From the "don't-be-too-insecure-&-sell-yourself-short" dept.! While you may think of these limitations along the lines of 'disclaimers,' please don't accidentally cross the line and become "overly apologetic" about them! Along those lines: please also indicate if you are planning to do something that will reduce or eliminate a given limitation. This is known as mitigating a threat to validity. The best example of this is a multimethod design. The goal of a well-planned multimethod study is to employ two or more research designs and/or procedures that have different threats to validity. Thus, if the findings from the different paths end up 'pointing in the same direction' regarding your hypothesis, there is increased evidence of a 'real' result: it can't be due to the threats alone if those threats were different for the different procedures. They have, in effect, cancelled one another out. Thus, a multimethod design enhances the internal validity of your study -- the credibility of your stated findings and results. Another way of saying this is that by choosing to apply multimethod designs and/or procedures, you are mitigating some of the threats to internal validity. This should also be briefly explained -- i.e., 'shown off!' -- when you identify your limitations!


  2. Delimitations

    In contrast to "Limitations," the Delimitations deal with issues of external validity, or generalizability.



    Under the Chapter One "Delimitations," then, you'd briefly identify or describe for the reader the issues of WITH WHOM/WHERE/WHEN that in essence form the boundaries of your study.

    For instance: if you analyzed the test-score data of Arizona public school 4th graders between 1990 and 1994, this is what you'd remind the reader of under the Delimitations. Please note that this is not an 'inherent weakness' or even a 'disclaimer.' It would be impossible to design a single study that takes into account all persons, places, and time periods to whom/which you hope the findings will generalize!

    Some additional 'threats to external validity' are as follows -- again from Wiersma, but the terminology is standard in virtually any introductory or intermediate research text. These may or may not apply to your particular study: in fact, the first, second, and fourth apply primarily to the experimental 'family' of design methodologies.


    Table 2.
    Threats to External Validity
    (adapted from Wiersma, Research Methods in Education)

    1. Interaction effect of testing: pretesting interacts with the experimental treatment and causes some effect(s) such that the results may not necessarily generalize to an unpretested population.
       Example: In a physical performance experiment, the pretest cues the subjects to respond in a certain way to the experimental treatment that would not have been the case had there been no pretest.

    2. Interaction effects of selection biases and the experimental treatment: an effect of some selection factor of intact groups (REMINDER: PLEASE REVIEW Module #2, Intro to Research Materials! THIS WOULD IMPLY A QUASI-EXPERIMENTAL STUDY DESIGN!) that would not be the case if the groups had been randomly formed.
       Example: The results of an experiment in which teaching method is the experimental treatment, used with classes of 'low achievers,' may not necessarily generalize to classes consisting of more heterogeneous ability levels.

    3. Reactive effects of experimental arrangements: an effect due simply to the fact that subjects know they are participating in a study, and thus react primarily to the 'novelty' of it rather than to any 'treatment' per se. Also known as the Hawthorne effect.
       Example: An experiment in remedial reading instruction has an effect that does not occur when the remedial reading program (i.e., the experimental treatment) is implemented in the regular program.

    4. Multiple-treatment interference: when the same subjects receive two or more treatments (i.e., repeated measures), there may be a carry-over effect between treatments such that the results may not necessarily be generalizable to single treatments. (See the sketch following this table.)
       Example: In a drug experiment, the same animals are administered four different drug doses in some predetermined sequence. The effects of the 2nd through 4th doses cannot be separated from the possible (delayed, 'spillover,' time-lagged) effects of the preceding doses.
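
    Multiple-treatment interference (threat 4, above) can be illustrated with a toy calculation. In this minimal Python sketch, the four dose effects and the 50%-per-period carryover are invented numbers purely for illustration, not from Wiersma:

        # Hypothetical true effect of each dose, administered in sequence
        true_effect = {"A": 2.0, "B": 4.0, "C": 6.0, "D": 8.0}

        carryover = 0.0
        for dose in "ABCD":
            observed = true_effect[dose] + carryover
            print(f"dose {dose}: true effect {true_effect[dose]:.1f}, "
                  f"observed effect {observed:.1f}")
            # assume half of everything still 'active' spills into the next period
            carryover = 0.5 * (true_effect[dose] + carryover)

    Only the first dose is measured cleanly; every later observation mixes the current dose with residue from the preceding ones. The observed effects therefore say little about what any single dose would do on its own -- which is exactly the generalizability problem this threat describes.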

    One final comment - earlier on, I mentioned that there was a 'price to be paid' regarding the 'high internal validity' typically possessed by experimental "family" design methodologies.

    True -- we can control and randomize in a tight laboratory setting to make the resultant cause-effect relationships highly credible. Thus, the study is said to possess high internal validity. The dependent variable(s) changed for the reason(s) we think - the experimental treatment(s), or the independent variable(s). We can reasonably rule out most other factors, if we have controlled and/or randomized on them.

    But -- the tradeoff is external validity! Life is not like a tightly controlled, predictable laboratory - as if we didn't know...!

    The more naturalistic, qualitative-type case studies, on the other hand, are more "true to real life" and thus tend to possess higher external validity, or generalizability.

    View some examples of how case studies may be used in research.


    The Application of Case Study Evaluations (ERIC)

    But because real-world settings involve "messy tangles of many variables," often with little or no control, the researcher may be hard pressed to specify cause-effect relationships with any degree of certainty. In cases of little or no control, often the best he/she can do is either describe what occurs, or perhaps go into correlational-type designs -- i.e., what seems to go with what, but with no 'guarantees' that one factor necessarily caused the other. (As you may have learned when studying the concept of 'correlation' in an introduction to statistics course, such observed 'surface' relationships between two or more variables may very well be fueled by a third, 'hidden iceberg' variable -- thereby leading us to call the surface relationship a 'spurious correlation.')
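
    Here is a minimal simulated sketch of such a spurious correlation in Python, assuming an invented hidden variable z that drives two surface variables x and y (all distributions and numbers are hypothetical):

        import random
        from math import sqrt

        random.seed(3)

        def pearson(xs, ys):
            # plain Pearson correlation coefficient
            n = len(xs)
            mx, my = sum(xs) / n, sum(ys) / n
            cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            vx = sum((a - mx) ** 2 for a in xs)
            vy = sum((b - my) ** 2 for b in ys)
            return cov / sqrt(vx * vy)

        # the hidden 'iceberg' variable drives BOTH surface variables;
        # x and y never influence one another
        z = [random.gauss(0, 1) for _ in range(2000)]
        x = [v + random.gauss(0, 0.5) for v in z]
        y = [v + random.gauss(0, 0.5) for v in z]

        r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
        print("surface correlation r(x, y):", round(r_xy, 2))  # sizable

        # partial correlation of x and y, 'controlling for' z
        partial = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))
        print("r(x, y) controlling for z: ", round(partial, 2))  # near zero

    The sizable surface correlation evaporates once the hidden third variable is statistically controlled -- exactly the 'spurious correlation' trap that low-control, real-world designs must watch for when hinting at cause and effect.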

    In a nutshell, then, here is the "internal vs. external validity tradeoff" we face! (Please see Figure 2, below)


    Figure 2.
    The Design/Validity Tradeoff
    (figure not reproduced in this text version)
    It is a delicate balance that researchers need to be aware of. The ultimate objectives of the study must be kept firmly in mind when deciding which 'extreme' to 'maximize,' if any. When working with an extensive, well-validated theory, your ultimate goal may be to add to that theory (body of knowledge) by expanding its network of cause-effect relationships. If that is the case, then you will want to go for the higher internal validity offered by experimental-type methodologies. On the other hand, for applied and/or exploratory research, your goal may be to shoot for higher external validity instead. A careful balance may be attained through a combination of well-thought-out design and sampling strategies, as well as by mitigating threats with multimethod approaches.

    Next time around, dear research design partners, we'll start out a multi-part look at research instrumentation, the avenue of gathering our data! Till then, may all of the mechanical and technological forces in your life treat you kindly ... !




