Results 1

The Limitations of the Method Used to Determine Harm

Sample Bias: The Use of College Students
Potential exclusion of those most severely affected
Exclusion of relevant outcomes
Comparison to national samples

Independent Measures: Lack of Operational Definition of CSA

Dependent Measures: Omission of Relevant Measures

[Page 716 continued]

The article by Rind et al. (1998) provided a coherent research protocol that outlined their data collection and statistical methods. Rind et al. also did a thorough job in tracking down published and unpublished studies of college students that addressed their research question and ensured that the studies were independent (i.e., did not use overlapping samples). Despite these strengths, we identified a number of important limitations in their study's ability to answer the authors' research questions concerning the prevalence and intensity of harm associated with CSA. Some of these limitations were intrinsic to the primary data set, whereas others originated in the Rind et al. study's design. We briefly review each of these problems and describe how they may affect or qualify the findings of the meta-analysis.

Sample Bias: The Use of College Students

Rind et al. (1998) dismissed the conclusions of previous critical reviews, such as Kendall-Tackett et al. (1993) and Beitchman et al. (1992), along with previous meta-analytic reviews, such as Jumper (1995) and Neumann, Houskamp, Pollock, and Briere (1996), with the argument that these reviews were based largely on clinical and legal samples not representative of the general population. However, rather than examining the general population, Rind et al. (1998) examined college students: a young, well-functioning portion of the population. Although the use of college students offers advantages in terms of the availability of a large number of studies using a wide variety of outcome measures, it also limits the generalizability of the study's findings. We identified several major problems associated with relying exclusively on college students to determine whether CSA causes intense or pervasive long-term harm in the general population.

Potential exclusion of those most severely affected.

As Haugaard and Emery (1989) noted,

"if child sexual abuse is debilitating, fewer victims may be found among college students and these victims may have made a better adjustment to their abuse, since colleges often require above average academic and social performance from entering students" (pp. 89-90).

Empirical support for this concern can be found in numerous studies that have identified a relationship between CSA and academic difficulties (e.g., Boney-McCoy & Finkelhor, 1995; Chandy, Blum, & Resnick, 1996; Erickson & Rapkin, 1991; Kendall-Tackett et al., 1993; Lisak & Luster, 1994), failing to finish high school (Edgardh & Ormstad, 2000), and failing to remain in college (Duncan, 2000). These data suggest that by including only studies of college students, Rind et al. may have excluded some of the individuals most severely affected.

[Page 717]

Exclusion of relevant outcomes.

The use of college samples also limited the scope of dependent measures available for study. Most studies of CSA-adjustment relations in college students have used generic measures of internalizing behaviors such as depression, anxiety, and eating disorders. Unfortunately, few investigators have studied nonclinical populations using measures specifically designed to detect symptoms of posttraumatic stress. As a result, the meta-analysis by Rind et al. (1998) did not include any measures of posttraumatic stress disorder (PTSD), one of the most commonly reported aftereffects associated with sexual abuse in children and adolescents (Cuffe et al., 1998; McLeer, Deblinger, Henry, & Orvaschel, 1992; Merry & Andrews, 1994).

The relationship between CSA and externalizing behaviors has also been neglected in college samples. The absence of such measures may have caused Rind et al. (1998) to underestimate the adverse effects associated with CSA, as numerous nonclinical studies of high school students have reported that CSA is associated with a wide variety of high-risk behaviors, including

	antisocial behavior,
	conduct disorders,
	self-destructive behavior,
	substance abuse,
	younger age at first coitus,
	more frequent and risky sexual activity,
	not using condoms or birth control,
	sexually transmitted diseases,
	increased HIV risk, and
	teen pregnancy

(Bensley, Spieker, Van Eenwyk, & Schoder, 1999; Bensley, Van Eenwyk, Spieker, & Schoder, 1999; Fiscella, Kitzman, Cole, Sidora, & Olds, 1998; Harrison, Fulkerson, & Beebe, 1997; Hibbard, Brack, Rauch, & Orr, 1988; Hibbard, Ingersoll, & Orr, 1990; Nagy, DiClemente, & Adcock, 1995; Stock, Bell, Boyer, & Connell, 1997).

Moreover, studies of high school students have reported that sexual abuse has a particularly negative impact on the behavior of adolescent males (e.g., Chandy et al., 1996; Garnefski & Arends, 1998; Hibbard et al., 1990). Chandy et al. (1996) examined gender-specific outcomes for 370 abused boys and 2,681 abused girls who were identified in a study of over 36,000 students in Grades 7 to 12. Compared with their female counterparts, male adolescents who acknowledged experiencing CSA were at significantly higher risk for poor school performance, delinquent activities, sexual risk taking, and dropping out of high school. These results suggest that by restricting their analysis to college samples, Rind et al. may have missed some of the most harmful effects associated with CSA, particularly those found in boys.

Comparison to national samples.

Despite the problems with age restriction and the use of a healthy population, Rind et al. (1998) argued that their results should be considered generalizable to the population as a whole. Their argument was based in large part on their assertion that the

	(a) prevalence,
	(b) severity, and
	(c) effects

of CSA in their college sample were similar to those found in three studies using nationally representative samples:

Baker and Duncan (1985; Great Britain); Laumann, Gagnon, Michael, and Michaels (1994; United States); and López, Carpintero, Hernandez, and Fuertes (1995; Spain) (see Rind et al., 1998, Table 1).

We examined each of these studies and found that meaningful comparisons were limited by the fact that CSA definitions, outcome measures, and data collection methods differed across the studies. Moreover, when meaningful comparisons were possible we found that Rind et al. (1998) often presented the data in a misleading manner.

For example, whereas the college studies categorized victims by the highest category of abuse severity they had experienced, Rind et al. averaged simple frequency counts to obtain numbers for the national samples.

For instance, 28% of college students reported exhibition to be the most severe form of abuse they had experienced. Because López et al. (1995) was the only national study that reported exhibition, Rind et al. noted that the numbers they reported for the prevalence of exhibition in national samples (33% for male and female samples combined) came from this study (see Rind et al., Table 1).

When we translated López et al. [*1], we found that the authors had reported abuse severity using two methods:

	(a) they provided a simple frequency count of each type of abuse experienced (the sum of which is more than 200%) and
	(b) they categorized respondents by the highest category of abuse severity experienced (see López et al., 1995, Table 2).

[*1] Antonio Cepeda-Benito, whose native language is Spanish, translated this article.

Therefore, whereas López et al. reported that 33% of their combined sample had experienced exhibition, only 16% of Spanish nationals had reported exhibition as the most severe form of abuse they underwent.

Rather than reporting the data from López et al. that were comparable to those reported for students (16%), Rind et al. reported data from the simple frequency count (33%). This method of data reporting made it appear that national samples and college samples had reported a similar abuse severity, while masking the fact that Spanish nationals had reported a higher prevalence of contact CSA than college samples.
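The difference between the two tabulation methods can be illustrated with a minimal sketch. The four respondents below are hypothetical and are not data from López et al. (1995) or any other study; the function names are likewise illustrative. The sketch only shows why a simple frequency count and a highest-category count yield different percentages for the same people.

```python
from collections import Counter

# Severity categories ordered from least to most severe
# (labels follow the categories named in the text).
SEVERITY_ORDER = ["exhibition", "fondling", "oral sex", "intercourse"]

# HYPOTHETICAL respondents: each set lists every type one person reported.
respondents = [
    {"exhibition"},
    {"exhibition", "fondling"},
    {"exhibition", "fondling", "intercourse"},
    {"fondling"},
]


def simple_frequency(people):
    """Percentage reporting each type; a person is counted once per type,
    so the percentages can sum to more than 100."""
    counts = Counter(kind for person in people for kind in person)
    return {kind: 100 * counts[kind] / len(people) for kind in SEVERITY_ORDER}


def highest_category(people):
    """Percentage whose MOST severe reported type is each category;
    a person is counted exactly once, so the percentages sum to 100."""
    counts = Counter(max(person, key=SEVERITY_ORDER.index) for person in people)
    return {kind: 100 * counts[kind] / len(people) for kind in SEVERITY_ORDER}


print(simple_frequency(respondents))
print(highest_category(respondents))
```

In this toy sample, exhibition appears in 75% of the simple frequency count but is the most severe experience for only 25% of respondents, so placing one kind of figure next to the other overstates the share of non-contact cases.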

To allow readers to more fully appreciate how college and national studies differed, we compare prevalence rates, as well as levels of CSA severity (i.e., exhibitionism, fondling, oral sex, and intercourse) and number of occurrences for college samples and the three national surveys cited by Rind et al. (1998) in our Table 1.

This table shows that the severity of abuse reported by U.S. college students differs from that reported in Spain. The other two national samples cited by Rind et al. either did not provide information on abuse severity (i.e., Baker & Duncan, 1985) or did not provide data that could be broken down and categorized in a comparable format (i.e., Laumann et al., 1994). As a whole, these studies offered little information appropriate to determine whether the severity of abuse experienced by U.S. college students is similar to that experienced by the U.S. population as a whole.

A study more relevant to the U.S. general population (which was included in Rind and Tromovitch's, 1997, study, but which Rind et al., 1998, failed to cite) is Finkelhor, Hotaling, Lewis, and Smith (1990). Finkelhor et al. reported the results of a Los Angeles Times national telephone survey of 2,626 American men and women 18 years of age or older. CSA was reported by 16% of the men and 27% of the women, which corresponds well with the prevalence rates of 14% for men and 27% for women that Rind et al. reported for college samples. However, Finkelhor et al. reported that 62% of men and 49% of women reported experiencing actual or attempted intercourse. These numbers are more than double those reported for college students (33% of men and 13% of women) and do not support Rind et al.'s claim that "SA [sexually abused] students experienced as much intercourse ... as persons in the general population" (p. 44).

[Page 718]

Rind et al. (1998) also compared effect sizes for college samples and national samples and reported that

"the magnitudes of CSA-adjustment relations in the college samples and in the national samples meta-analyzed by Rind and Tromovitch (1997) were identical, r_u = .07 for men, and r_u = .10 for women" (p. 42).

We evaluated this claim by examining the three national samples Rind et al. referenced in their article, along with two surveys of the U.S. population (Finkelhor et al., 1990; Boney-McCoy & Finkelhor, 1995) also included in Rind and Tromovitch's analysis. Effect sizes were computed for the three national surveys that provided appropriate outcome data: Laumann et al. (1994), López et al. (1995), and Boney-McCoy and Finkelhor (1995).

Our results, which are displayed in Table 2, show that the effect sizes that Rind et al. reported for college students (after appropriate correction for base-rate differences) were similar to those we estimated for Laumann et al., who studied sexual dysfunction in the U.S. population, but lower than those estimated for Boney-McCoy and Finkelhor, who studied PTSD in a national survey of adolescents, or for López et al., who asked participants about emotional and behavioral problems.
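The text does not state which base-rate correction was applied. As a minimal sketch of one common approach, and purely as an assumption, a point-biserial r observed at a given CSA base rate can be converted to a standardized mean difference d and then back to the r that would be expected under an even 50/50 split. The function names and the example values below are hypothetical.

```python
import math


def r_to_d(r, p):
    """Convert a point-biserial r to a standardized mean difference d,
    where p is the proportion of the sample in the CSA group."""
    return r / math.sqrt(p * (1 - p) * (1 - r * r))


def d_to_r(d, p=0.5):
    """Convert d back to a point-biserial r for a group proportion p."""
    return d / math.sqrt(d * d + 1 / (p * (1 - p)))


def correct_for_base_rate(r, p):
    """ASSUMED correction: re-express r as if the two groups were equal
    in size, so samples with different base rates become comparable."""
    return d_to_r(r_to_d(r, p), 0.5)


# Illustrative values only: r = .10 observed with a 10% base rate.
print(round(correct_for_base_rate(0.10, 0.10), 3))
```

Under this sketch an r observed with an uneven split is adjusted upward, and an r observed with a 50/50 split is returned unchanged, which is why effect sizes from samples with different prevalence rates need a common footing before they are compared.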

In summary, prevalence, severity, and effect sizes varied considerably across college and national samples. These results contradict Rind et al.'s (1998) contention that "the college data were completely consistent with data from national samples" (p. 22) and provide little support for the claim that their findings should be considered generalizable to the population as a whole.

Independent Measures: Lack of Operational Definition of CSA

It is obvious that how CSA is defined is critical for studying its effects and that varying definitions cause difficulties when combining data for study. In the current meta-analysis, the use of studies that lacked a common definition was unavoidable, as there are nearly as many different definitions of abuse as there are studies (e.g., see Appendix from Rind et al., 1998, pp. 52-53).

Definitions of CSA ranged from whatever the victim thought the term "sexually molested" meant (e.g., Fritz, Stoll, & Wagner, 1981) to more structured definitions (e.g., Finkelhor, 1984). Structured definitions differed widely on the upper age limit for the victim to be classified as a child; some set the age at puberty, others picked an age somewhere between 12 and 17.

This resulted in similarly abused adolescents being placed in the study group in some studies and in the non-abused control group in others. Studies also varied as to what constituted a perpetrator, whether the sexual experience had to be forced, and whether any physical contact between the perpetrator and the child was required. As might be expected, studies that used broader definitions of CSA reported higher prevalence rates. The variability in definitions of CSA is reflected in prevalence rates in the primary data set that varied from 3% to 71%.

Rind et al. (1998) compounded this lack of standardization in the literature by failing to delineate any inclusion or exclusion criteria and analyzing all studies that were even remotely relevant to their research questions. Thus, in contrast to what we consider to be an under-inclusive selection of populations to be studied, Rind et al. tended to be over-inclusive in the studies they used in their primary data set.

[Page 719]

For example, Rind et al. included a number of studies in their primary data set that did not even purport to examine the effects of CSA. Landis (1956), for example, examined experiences with "sexual deviants" at any age, Schultz and Jones (1983) looked at all types of "sexual acts" before age 12, and Sedney and Brooks (1984) [*2] examined all types of "sexual experiences" during childhood. Some studies even included sexual experiences that occurred after age 17 (e.g., Greenwald, 1994; Landis, 1956; Sarbo, 1985).

[*2] For example, Neumann et al. (1996) rejected this study from their meta-analysis of long-term sequelae of CSA, noting that the study reported data "which may not meet the criteria for sexual abuse" (p. 8).

At the same time, Rind et al. (1998) excluded from analysis two studies that examined the effects of incest: Jackson, Calhoun, Angelynne, Maddever, and Habif (1990) and Roland, Zelhart, and Dubes (1989). These studies were eliminated because Rind et al. claimed that their relatively large effect sizes (r = .36 and r = .40, respectively) were outliers in the higher direction.

In our opinion, their elimination, given a close reading of Rind et al., is quite baffling. In a footnote, Rind et al. noted that these two studies

"may capture more accurately the essence of abuse in a scientific sense [italics added] - that is, more persuasive evidence of harm combined with the likely contextual factors of being unwanted and perceived negatively" (p. 46).

This statement raises an obvious question: If it were these samples that, in a scientific sense, more accurately captured the essence of abuse and are clearly more consistent with what the public associates with CSA, why did Rind et al. (1998) choose to exclude these studies while including studies (such as those including consensual peer experiences) that did not capture the essence of abuse in any way?

Upon further investigation, we found that these studies were only outliers because Rind et al. erroneously coded a third study, Silliman (1993), in the lower direction. Had Silliman not been miscalculated, the entire distribution would have shifted slightly in terms of effect sizes, and Jackson et al. (1990) and Roland et al. (1989) would no longer have been statistical outliers. [*3]

[*3] Silliman (1993) was a one-page report with irreconcilable problems with what little data were provided. However, rather than clarify the problems with the author, Rind et al. (1998) assumed that the abused group scored much higher on self-esteem than the non-abused group and calculated an effect size of approximately -.49 (when averaged with the effect size for the other dependent variable in the study, one arrives at the effect size of -.25).

To clarify the statistics, we contacted the author of the original study and found that Rind et al.'s inference was incorrect. The mean values for the abused and control groups were 327 and 337, respectively (M. Riposo [formerly M. E. Silliman], personal communication, July 31, 1999). This mean difference corresponds to a point-biserial r of .16 and an overall effect for self-esteem of approximately .08. Had the Silliman report not been miscalculated and eliminated, the entire distribution would have shifted slightly in terms of effect sizes, and Jackson et al. (1990) and Roland et al. (1989) would no longer have been statistical outliers. We reanalyzed their data (using the r values reported by Rind et al. in their appendix) after coding Silliman correctly and without eliminating any outliers, and the results suggested a homogeneous sample, χ²(52, N = 15,872) = 67.85, p = .07.
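A homogeneity test of this kind can be sketched as follows. This is a minimal illustration that assumes the conventional approach of Fisher z-transforming each correlation and weighting by n - 3; the text does not specify the exact procedure used, and the function name and sample values are hypothetical. The statistic is referred to a chi-square distribution with k - 1 degrees of freedom, so the 52 degrees of freedom reported above correspond to 53 effect sizes.

```python
import math


def homogeneity_q(rs, ns):
    """Return (Q, df) for k independent correlations.

    Each r is Fisher z-transformed and weighted by n - 3 (ASSUMED
    weighting). Q is compared with a chi-square distribution on
    k - 1 degrees of freedom; a nonsignificant Q is consistent with
    a homogeneous set of effect sizes.
    """
    zs = [math.atanh(r) for r in rs]
    ws = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return q, len(rs) - 1


# HYPOTHETICAL effect sizes and sample sizes, for illustration only.
q, df = homogeneity_q([0.05, 0.10, 0.12], [200, 350, 150])
print(round(q, 3), df)
```

Because Q grows with the weighted spread of the effect sizes around their mean, removing the largest values as outliers mechanically lowers Q, which is why the coding of a single study can change whether other studies appear to be outliers.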

It should be noted that inclusion of these outliers would not have substantively changed the overall findings of Rind et al.'s meta-analysis. However, the inclusion of a large number of studies that looked at milder forms of abuse or non-CSA experiences can be expected to minimize the likelihood of identifying any significant consequences experienced by the smaller number of people who underwent more severe forms of abuse. As Rind et al. examined this problem in moderator analysis, we address this issue in more depth in our review of this section of their article.

Dependent Measures: Omission of Relevant Measures

In reviewing the primary data set, we found a number of outcomes we consider relevant to the question of harm that Rind et al. (1998) failed to report.

For example, a number of studies asked about illegal drug use, re-victimization, and whether respondents had been treated for emotional problems. In each case, those acknowledging a history of CSA indicated more problems in these areas than their peers. For instance, re-victimization, defined as sexual assault or rape occurring after age 18, was a dependent measure that was reported in at least five of the studies included in the authors' primary data set (i.e., Alexander & Lupfer, 1987; Fromuth, 1986; Gidycz, Coble, Latham, & Layman, 1993; Wisniewski, 1990; Zetzer, 1991).

To assess the significance of the exclusion of data on re-victimization, we meta-analyzed the data from these studies along with data from three more recent studies that used college samples (Humphrey & White, 2000; Johnsen & Harlow, 1996; Schaaf & 1998).

The association between CSA history and re-victimization was consistent across studies and robust (see Table 3).

[Page 720]

Moreover, several studies compared rates of re-victimization for varying levels of CSA severity and found a linear relationship, with more invasive experiences during childhood being associated with higher rates of re-victimization during adulthood (Gidycz et al., 1993; Humphrey & White, 2000).

Fromuth (1986) reported that the relationship between CSA and re-victimization remained strong even after controlling for family environment. These results show that Rind et al. may have underestimated some of the adverse effects associated with CSA in the college populations by not including all available dependent measures in their analyses.