Between-Subjects Designs

Between-Subjects Experimental Designs

Because between-subjects experimental designs are an experimental approach, they have the four characteristics of an experiment that were described in chapter 7.

  • An independent variable is manipulated to create at least two treatment conditions
  • A dependent variable is measured within each treatment condition
  • The scores measured in each condition are compared
  • All other (extraneous) variables are controlled

An experimental design in which one group of individuals in one treatment condition is compared to another group of individuals in a different treatment condition is called a between-subjects experimental design.

 

(image of two groups of people)

Characteristics of between-subjects designs (p. 228)

Between-subjects designs are used in both experimental and non-experimental research designs. Keep in mind that chapter 8 is focused only on between-subjects experimental designs.

This type of design is used when a researcher wants to compare two or more treatment conditions.

Each treatment condition, or level of the independent variable, includes a different group of individuals.

For example, one group might receive an experimental drug, while another group of participants receives a placebo control.

Or one group is exposed to one weight loss program and a second group is exposed to another program.

Or one group is exposed to one textbook and a second group is exposed to another.

A between-subjects experimental design is sometimes called an independent measures experimental design because there is only one score for each individual.

Advantages

The major advantage of the between-subjects design is that the researcher can be confident that any differences between the groups are due to the differing treatments rather than to factors that can arise when the same individual is measured more than once. That is, there are no practice, carryover, or contrast effects from exposure to multiple treatment conditions.

Another advantage is that it is almost always possible to assign an individual to one of several treatment conditions, whereas it is not always possible to expose the same individual to multiple treatment conditions.

Note that many of these advantages of the between-subjects design (discussed on p. 230-231) are advantages in contrast to within-subjects designs.

Disadvantages

One disadvantage of the design is that many participants are needed, since each is exposed to only one level of the independent variable.

The major disadvantage is that, since each group includes different individuals, we may not always be certain whether observed differences are due to the independent variable or to differences between the participants in each group.

What are 'Individual Differences' and why do they matter?

Individual differences include any characteristic of a person that differs from one person to another, such as age, gender, height, education, intelligence, extraversion, athleticism, weight, temperament, and so on.

Sometimes individual differences are the independent variable. For instance, we might compare how a group of boys versus a group of girls respond to a teaching technique to answer a question such as "do boys or girls benefit more from treatment X?"

More often, individual differences are considered extraneous variables. We know they are there and try to minimize their effects. Individual differences are most problematic when they become confounding variables.

Individual Differences as confounding variables (p. 232)

When one experimental group differs from the other in a consistent or systematic way, then we may not be able to determine whether the observed outcome is due to the independent variable or to the confounding individual difference.

For instance, imagine that group #1 has less heart disease than group #2 following a dietary intervention. We would like to conclude that the intervention 'worked' - that making dietary changes reduces the risk of heart disease. However, if one group is significantly different from the other with respect to age, gender, ethnicity, or weight, then we cannot confidently conclude that the diet rather than one of the other factors is responsible for the observed group differences in heart disease.

Note that individual differences can also mask outcomes. For instance, in the above example, age differences might make it appear that an ineffective diet reduced heart disease when the younger group receives the experimental diet. However, if the older group receives the diet and no group differences are observed, we might mistakenly conclude that the diet was ineffective when it actually prevented a group of older individuals from getting heart disease. The effect would go unnoticed because they were being compared to younger people who are less likely to get heart disease.

Assignment bias occurs when the experimental groups differ with respect to individual differences.

Environmental variables can also be confounding. For instance, even when two groups are similar with respect to individual characteristics, if they are treated in different environments, then we cannot be sure whether the treatment or the environment is responsible for the observed group differences.

We control these confounds by creating equivalent groups.

Equivalent Groups (p. 234)

1. created equally - the same (or as similar as possible) process is used for creating both groups. For instance, in an infamous study on homosexuality conducted in the 1950s, heterosexual participants were recruited from the community, and gay and lesbian participants were recruited from an inpatient psychiatric hospital. The study found that homosexuals were less well adjusted than heterosexuals. However, the recruitment procedures created non-equivalent groups, so the researchers could not say whether differences in adjustment were due to sexual orientation or to whether or not a participant was a psychiatric inpatient.

2. treated equally - except for the independent variable, treatment conditions should be very similar for both groups. For instance, if someone were studying the effects of a teaching technique on learning and one group was held in a dark, drab, windowless room and was taught by a very old man, and the other group was in a brightly lit room with windows and was taught by a very young woman, it would be difficult to know what variables might account for any group differences.

3. composed of equivalent individuals - the procedures discussed below can help ensure that the individuals in each group are similar with respect to individual differences.

Limiting Confounding by individual differences

Several strategies for limiting individual differences were mentioned in chapter 7 and are worth repeating here.

Randomization (p. 235)

Randomization is the most common method of limiting confounding by individual differences.

You should recall that sampling procedures were discussed in chapter 5. When we have a procedure for randomizing group assignment, AND a large enough sample, we should have fairly equivalent groups. That is, if we recruit 100 participants and randomly assign them to one of two groups, we should be reasonably confident that each group will have about - but not exactly - the same proportion of men and women, about the same average age, and about the same average IQ.

If the groups are fairly equivalent, then we can be confident that observed differences in the dependent variable are due to manipulation of the independent variable rather than to between-group differences.

The major advantage of randomization is that it is much simpler and less costly than matching and holding variables constant.

The major disadvantage is that randomization does not absolutely guarantee that individual differences are well-matched across groups. This is especially true when the sample size is small. For example, if you had only 8 participants in each group they could differ significantly with respect to gender, ethnicity, age, intelligence, weight, or any number of characteristics. As sample size gets larger, this becomes less of a problem.
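To see why sample size matters for randomization, here is a minimal Python sketch. Everything in it is made up for illustration and is not from the textbook: the `random_assign` and `average_age_gap` helpers, the simulated ages (drawn evenly between 18 and 65), and the number of simulated studies.

```python
import random
import statistics

def random_assign(participants, rng):
    """Shuffle participants and split them into two equal-sized groups."""
    shuffled = participants[:]      # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def average_age_gap(n_per_group, n_trials, rng):
    """Average absolute difference in mean age between two randomly formed groups."""
    gaps = []
    for _ in range(n_trials):
        # Hypothetical participant pool: ages spread evenly from 18 to 65.
        ages = [rng.randint(18, 65) for _ in range(2 * n_per_group)]
        group_a, group_b = random_assign(ages, rng)
        gaps.append(abs(statistics.mean(group_a) - statistics.mean(group_b)))
    return statistics.mean(gaps)

rng = random.Random(42)  # fixed seed so the run is repeatable
small_gap = average_age_gap(n_per_group=8, n_trials=2000, rng=rng)
large_gap = average_age_gap(n_per_group=50, n_trials=2000, rng=rng)
print(f"Average age gap with 8 per group:  {small_gap:.1f} years")
print(f"Average age gap with 50 per group: {large_gap:.1f} years")
```

Across many simulated studies, the two groups should end up further apart in average age when there are only 8 participants per group than when there are 50, even though the assignment procedure is identical.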

Matching (p. 236)

Perfectly matched participants would be the same with respect to all possible individual differences. In other words, your best matched participant would be a clone of you!

(image of one person - perhaps a clone?! - duplicated and filling a lecture hall)

The closest we can get to clones is twins and other 'multiples', and in fact, twins are often studied because they are 'identical' with respect to genetic variables. However, even twins, or clones, would have different experiences, education levels, diets, exercise habits, friends, etc.

(image of a mouse in a maze)

In animal research we can match subjects with respect to many more variables. For example, strains of mice (and rats, frogs, fruit flies and other species) are genetically engineered so that they have similar genetic characteristics and are reared in the same environments so as to be matched with respect to their experiences as well as genetics.

For most research with humans, twin studies are only a small part of the research.

  • Q: What makes it difficult or impractical to use twins for all or even most research? Answer below...

More commonly, participants are matched with respect to one or more variables that the researcher believes might be a potential confound. For example, if there was concern that intelligence might affect the outcome of a study comparing two strategies for improving memory, then the researcher might match participants with respect to IQ, and for each participant in one group find a 'match' with the same IQ for the second group.

There are three steps in the matching process:

  1. Identify the variable or variables to be matched
  2. Measure the variable for each participant
  3. Use a restricted random assignment procedure to make sure the groups are balanced with respect to the matched variable.
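The three steps above can be sketched in a few lines of Python. This is only one way to do restricted random assignment (rank participants on the matched variable, pair up neighbors, then flip a coin within each pair). The `matched_assignment` helper, the participant labels, and the IQ scores are all hypothetical.

```python
import random

def matched_assignment(iq_by_participant, rng):
    """Pair participants with similar IQ, then randomly send one member
    of each pair to each group (restricted random assignment)."""
    # Steps 1 and 2: the matching variable (IQ) has been identified and measured.
    ranked = sorted(iq_by_participant, key=iq_by_participant.get)
    group_1, group_2 = [], []
    # Step 3: walk down the ranking two at a time and split each pair at random.
    # With an odd number of participants, the last one is left unassigned.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)
        group_1.append(pair[0])
        group_2.append(pair[1])
    return group_1, group_2

# Hypothetical participants and IQ scores.
iq_scores = {"P1": 98, "P2": 121, "P3": 104, "P4": 99,
             "P5": 119, "P6": 106, "P7": 111, "P8": 112}
rng = random.Random(7)  # fixed seed so the run is repeatable
group_1, group_2 = matched_assignment(iq_scores, rng)
mean_1 = sum(iq_scores[p] for p in group_1) / len(group_1)
mean_2 = sum(iq_scores[p] for p in group_2) / len(group_2)
print(group_1, round(mean_1, 1))
print(group_2, round(mean_2, 1))
```

Because each pair differs by only a point or two, the two group means can never drift far apart, no matter how the coin flips come out. Plain randomization offers no such guarantee with only 8 participants.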

The major disadvantage of matching is that it is often costly, in terms of both time and the expense of measurement, to measure some variables and make sure that the groups are balanced. For this reason matching is only done when the researcher has a very good reason to believe that the matched variable could be a potential confound.

  • Answer to question about twin studies: the major limitation is that it takes a long time to find a large enough sample of twins that have the desired characteristics. Only about 3% of births in the United States are to twins, and since twins are more common today than in the past, for older participants the rate is an even lower 1 to 2%. And both twins have to be accessible and willing to be participants in the study - a tall order indeed.

Holding Variables Constant (p. 237)

Holding a variable constant or restricting its range is another way to reduce variability due to individual differences.

When we hold a variable constant or restrict its range we can be confident that the variable will not affect the outcome of our study. For instance, if all the participants are boys, then sex can't be a confound; if all the participants have IQs between 100 and 110, then intelligence can't be a confound.

The main advantage of holding a variable constant is that the procedure eliminates its potential to be a confound.

However, this advantage comes with some big disadvantages.

1. It reduces generalizability/external validity

If all the participants are girls, gender is not a confound, but we can't generalize the results to boys, and if all the participants have similar IQs, then we can't generalize our findings to people with different IQs.

2. It is often impractical

This disadvantage is similar to the disadvantage for matching; if we want all participants to have the same IQ, then we have to measure IQ, and it will be expensive and will take a long time to find a large group of people with the exact same IQ...AND, we won't be able to generalize the results to people with other IQs.

An important note about convenience sampling

Don't confuse "holding a variable constant" with using a convenience sample. For instance, much research is done with college students, and college student samples have restricted ranges of variability with respect to age and education, so the results of studies with college student samples have limited generalizability. However, college student and other convenience samples are not usually used because the researcher believes those variables (age, education, etc.) will be confounds. They are usually used only because they are convenient; that is, it is easy for a researcher at a university to recruit college students.

The discussion in chapter 8 is focused on holding a variable constant because the researcher believes the variable might be a confound, not because it is convenient!

Individual Differences and variability (p. 239)

This discussion is a little complex, so make sure you read it carefully and understand what the authors are getting at.

In brief, while previous sections look at how individual differences can be confounds and how we can minimize the potential for individual differences to be confounds, this section is more about variability in scores on the dependent variable.

If I gave a randomly selected group of high school students the SAT, there would be a lot of variability in their scores. However, if I gave the same test only to high school students in a program for gifted students, then there would be less variability in their scores. Or if I gave a fitness test to a group of 100 randomly selected adults, then there would be more variability than if I gave the same fitness test to 100 marines, or 100 morbidly obese people, or 100 55-year-old women.

When variability is high, it is more difficult to see the effect of the independent variable. Your textbook uses the analogy of variability as being similar to 'interference' on a phone call or radio signal: it makes it harder to hear what you want to hear, or, in research, to detect the results you are trying to detect.
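A small numeric illustration of the 'interference' idea, written as a Python sketch. The scores are invented, and the `difference_relative_to_spread` helper is just a convenient yardstick chosen for this example (the mean difference divided by the pooled standard deviation), not a procedure from the textbook.

```python
import statistics

def difference_relative_to_spread(group_a, group_b):
    """Mean difference divided by the pooled standard deviation.
    Assumes the two groups are the same size."""
    mean_diff = statistics.mean(group_b) - statistics.mean(group_a)
    pooled_var = (statistics.variance(group_a) + statistics.variance(group_b)) / 2
    return mean_diff / pooled_var ** 0.5

# Hypothetical scores: both comparisons have the same 5-point mean difference.
low_var_control = [48, 49, 50, 51, 52]
low_var_treatment = [53, 54, 55, 56, 57]
high_var_control = [30, 40, 50, 60, 70]
high_var_treatment = [35, 45, 55, 65, 75]

clear = difference_relative_to_spread(low_var_control, low_var_treatment)
noisy = difference_relative_to_spread(high_var_control, high_var_treatment)
print(f"Low variability:  difference is {clear:.2f} standard deviations")
print(f"High variability: difference is {noisy:.2f} standard deviations")
```

The treatment 'signal' is 5 points in both cases. What changes is how large that signal looks against the background variability: when scores are tightly clustered the groups barely overlap, and when scores are widely spread the same 5 points are easy to miss.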

People are different, and there is not much we can do about that without reducing generalizability/external validity. However, even while we can't do much about individual differences, we can control variance within and between treatments in order to make it more likely that, if there is a difference between treatments, we will be able to detect it.

 Minimizing Variance Within Treatments (p. 241)

Even when we may not be able to control individual differences, there are some things we can do to reduce score variance by making sure there is as little variability as possible in the treatment procedures and setting.

Standardize procedures  

We standardize procedures to make sure that all participants are treated exactly the same, except with respect to the independent variable.

Researchers create detailed written protocols to make sure that each participant receives exactly the same instructions. If more than one researcher delivers the treatment conditions, then the researchers might counterbalance them to make sure that participants in all conditions are equally likely to be in contact with any researcher or research assistant. That is, if Smith and Jones are doing a study with two groups, they might make sure that each of them delivers each treatment condition to half of the participants, because if Smith only manages one treatment condition and Jones the other, then we cannot be sure if group differences are due to the independent variable or to the effects of Smith and Jones!

We know that a research protocol is well standardized when another researcher can replicate the study and get about the same results.

Standardize treatment setting

We can also minimize variance by making sure that the research setting is the same for all participants; e.g., that sessions are held in the same or similar rooms, that the time of day or year is balanced across individuals, and that the noise level, temperature, etc. are similar for both groups.

The importance of sample size

A large sample size will help minimize the effect of group variability; however, once a sample size is fairly large there are diminishing returns for increasing it further. Most researchers in the behavioral sciences select a sample size that will provide them with enough statistical power to detect a difference between groups. While important, further discussion of statistical power is beyond the scope of this module. Suffice it to say that for a very basic two-group, one-dependent-variable research design with the independent variable likely to have a moderate effect on the dependent variable, a sample size of at least 37 participants per group is required.

Other threats to internal validity of between subject designs (p. 243)

The threats to internal validity that have been discussed so far are threats in many sorts of research designs. The threats discussed in this section are specific to between-subjects designs.

Differential Attrition

Attrition occurs when a participant withdraws from a study before it is completed. Some attrition is common; it becomes a threat to internal validity when there is a high rate of attrition, and especially when the rates of attrition are different in each group. If participants are withdrawing from one group, then it is possible that two groups that were initially similar are not similar by the end of the study. This is especially problematic when the attrition is related to the independent or dependent variable. For instance, imagine a researcher comparing the effects of two smoking cessation programs. If participants drop out of one at a high rate, it is not unreasonable to suspect that participants are dropping out because they have begun smoking and no longer want to participate. Those left in the study would be those with the highest motivation to quit, so we could not be sure if the treatments or the motivation of the participants is responsible for the observed outcome, and differential attrition becomes a plausible explanation for the outcome of the study.

Communication between groups

When participants in one group communicate with participants in another group there is a potential for outcomes to be determined by factors other than the effect of the independent variable. There are several different ways in which between-group communication can be problematic:

  • Diffusion occurs when the treatment effects spread from a treatment group to a control group. For example, people receiving no treatment may learn details of the treatment from those in the experimental condition.
  • Compensatory equalization occurs when one group learns about the treatment another is receiving and demands the same treatment. This is especially a risk when there is a no-treatment control group, but it can also occur when two or more treatments are being compared.
  • Compensatory rivalry occurs when participants behave in a manner different from their usual behavior when they learn that another group is receiving treatment that is perceived to be special or better. They may be trying to show the researcher that they can improve even without treatment.
  • Resentful demoralization is sort of the opposite of compensatory rivalry. It occurs when participants give up, becoming less productive and less motivated, upon learning that they are in a no-treatment or placebo control group.

Statistical Analysis and Research Designs  (p. 248)

We made it all the way to chapter 8, and at last it is time to start talking about statistics and the statistical analysis of between-subjects designs. Make sure you know which statistical tests are used for different types of designs.

Single-factor two-group designs

The single-factor two-group design is the most basic between-subjects design, where a researcher compares two groups and manipulates one independent variable with two levels.

When the dependent variable is numeric we use an independent-measures t-test to determine if there is a difference between the group means.
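For readers who like to see the arithmetic, here is a minimal Python sketch of the independent-measures t statistic using the usual pooled-variance formula. The `independent_t` helper and the memory scores are hypothetical; in practice you would normally let a statistics package do this and report the p-value as well.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (t, df) for an independent-measures t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Sum of squared deviations (SS) within each group.
    ss_a = statistics.variance(group_a) * (n_a - 1)
    ss_b = statistics.variance(group_b) * (n_b - 1)
    pooled_var = (ss_a + ss_b) / df
    # Standard error of the difference between the two group means.
    se = math.sqrt(pooled_var / n_a + pooled_var / n_b)
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, df

# Hypothetical memory scores for two groups of different participants.
strategy_a = [12, 15, 11, 14, 13]
strategy_b = [9, 10, 8, 12, 11]

t, df = independent_t(strategy_a, strategy_b)
print(f"t({df}) = {t:.2f}")
```

With these made-up scores the means are 13 and 10 and the result is t(8) = 3.00, which is larger than the two-tailed critical value of 2.306 for 8 degrees of freedom at the .05 level, so we would conclude that the group means differ.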

When the dependent variable is not numeric, that is, when it is a nominal variable such as gender, we use a chi-square test. This will be discussed further below.

The advantage of the two-group design is that it is simple (which is why your lab for this module is focused on this type of design).

However, the disadvantage of the two-group design is that it is simple! That is, the world and relationships between variables are often complex, and a single-factor two-group design doesn't tell us much about the relationship between the independent and dependent variable since there are only two levels of the independent variable. Your textbook, on p. 246, gives a great example of situations where more complex relationships between variables would be completely misunderstood if only two levels of the independent variable are measured.

Single-factor multiple-group designs

This research design is common and is used when we want to evaluate the effect of multiple levels of the independent variable, such as comparing four dosages of a drug, three different teaching strategies, the effect of messages from four different political parties, the effect of several blood-alcohol levels on driving skills, etc.

When the dependent variable is numeric, e.g., score on a measure of depression, rating of a political ad on a scale from 0 - 100, number of errors on a road test, etc., then the appropriate statistical test is a single-factor analysis of variance, or ANOVA. If the ANOVA test indicates a significant difference, then a post-hoc test, or posttest, is used to determine exactly which groups are significantly different from each other. This will be discussed in greater detail in chapter 15.
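As a rough illustration of what the ANOVA is computing, the Python sketch below works out the F-ratio for three groups by hand. The `one_way_anova` helper, the group names, and the scores are all hypothetical, and the sketch stops at the F-ratio (no p-value and no post-hoc tests).

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of individual scores around their own group mean.
    ss_within = sum((score - statistics.mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical quiz scores under three teaching strategies.
lecture = [4, 5, 6]
discussion = [6, 7, 8]
hands_on = [8, 9, 10]

f_ratio, df_between, df_within = one_way_anova([lecture, discussion, hands_on])
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

For these made-up scores the result is F(2, 6) = 12.00, which exceeds the critical value of 5.14 at the .05 level. Note that a significant F only says that at least one group differs from the others; it does not say which one, which is why the post-hoc tests mentioned above are needed.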

(image of caution sign)

While comparing multiple groups may seem like a great idea (after all, it does allow you to compare means at several levels of the independent variable), a word of caution is in order.

When comparing multiple groups it is important to make sure that you do not have so many levels that you can't detect between-group effects. For instance, if you wanted to study the effects of consuming alcohol on behavior and measured group differences at blood alcohol levels of .01, .015, .02, .025, .03, .035, .04, .045, .05, .055, .06, .065, .07, .075, .08, .085, .09, .095, and .10, you may not see significant differences between neighboring groups. If instead you measured at .0, .04, .08, and .12, then you would be more likely to see between-group differences in behavior. (BTW, in the United States the legal level for intoxication is .08 or above, but driving and other activities are impaired below that level.)

Comparing Proportions for two or more groups

When the dependent variable is not numeric we cannot compute a mean. For instance, if the dependent variable is pass or fail (a nominal variable) or a grade earned, A, B, C, D, E (an ordinal variable), then we cannot compute a mean and cannot use a t-test or ANOVA.

In these instances we compute the proportions of each value of the dependent variable for each treatment condition and use a chi-square test for independence (which will also be discussed in more detail in chapter 15). Your textbook, on p. 249, provides a good example of this from a study on the effect of questions asked on eye-witness testimony.
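Here is a minimal Python sketch of the chi-square calculation for a pass/fail outcome. The counts are invented for illustration (they are not the eye-witness data from the textbook), and `chi_square_independence` is a hypothetical helper that returns only the statistic and its degrees of freedom.

```python
def chi_square_independence(table):
    """Return (chi_square, df) for a table of observed counts.
    Rows are treatment conditions; columns are outcome categories."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Count expected in this cell if outcome were unrelated to condition.
            expected = row_totals[i] * col_totals[j] / total
            chi_sq += (observed_count - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df

# Hypothetical counts:   pass  fail
observed = [[30, 20],   # new teaching method
            [20, 30]]   # standard method

chi_sq, df = chi_square_independence(observed)
print(f"chi-square({df}) = {chi_sq:.2f}")
```

With these made-up counts, 60% of the first group passed compared with 40% of the second, and the statistic works out to 4.00 with 1 degree of freedom. That is just above the critical value of 3.84 at the .05 level, so the proportion passing would be judged to differ between conditions.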