Publications: Preventing Subsequent Births

The Evaluators Reply

Michael Camasso, School of Social Work and Center for Urban Policy Research, Rutgers University
Carol Harvey, Center for State Health Policy, Rutgers University
Radha Jagannathan, Woodrow Wilson School of Public and International Affairs, Princeton University
Mark R. Killingsworth, Department of Economics, Rutgers University

Peter Rossi (see chapter X) points to three main “deficiencies” that, according to him, are “serious enough to cast strong doubts on the validity of the findings” in our work on New Jersey’s Family Development Program (FDP). In our view, however, we do indeed know what happened as a result of FDP.

Are the Statistical Models Inappropriate?

For both our experimental–control and pre–post analyses, we presented numerous tables of logit and probit estimates. The final results highlighted in our reports are based on probit models. Rossi nevertheless finds it “surprising that the researchers present OLS results and appear to regard them as valid.” Our conclusions are invariant to the statistical methods used. This should not be surprising to any experienced analyst: OLS and other simple estimators often are surprisingly robust, much more so than Rossi appears to realize. Rossi would have the reader believe that our results are highly sensitive to the statistical technique used, and he presents a comparison of our birth and abortion estimates derived from OLS, logit, and probit models. The numbers are different, as would be expected, but do they amount to a big difference? For example, the difference between our highest and lowest estimates of the family cap’s effect on abortions (2,064 vs. 1,329) is about 1.8 percent of all actual abortions occurring during the treatment period (approximately 41,000). Differences of this size should not surprise (or disturb) anyone who understands that even statistically significant point estimates come with a standard error.
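The size of that gap can be checked directly from the figures quoted above. The following is a minimal sketch, not part of the original analysis: it uses the rounded total of 41,000 abortions, so the result is itself approximate, and the variable names are ours for illustration.

```python
# Figures quoted in the text; the total is given only approximately.
highest_estimate = 2064   # highest estimated effect on abortions
lowest_estimate = 1329    # lowest estimated effect on abortions
total_abortions = 41_000  # "approximately 41,000" during the treatment period

# Absolute gap between the two estimates, and that gap as a share of the total.
gap = highest_estimate - lowest_estimate
share_pct = 100 * gap / total_abortions

print(f"Gap between estimates: {gap}")
print(f"Share of all abortions: {share_pct:.1f} percent")
```

With these inputs the gap is 735 abortions, which is under 2 percent of the total.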

Rossi also argues that because we used longitudinal data in which the same people appear at numerous dates, the observations may be serially dependent. Because of this, he contends, even the logit and probit procedures we have used are invalid. He also conjectures that use of a more “valid” (to him) estimator could yield different results.

We have performed further analyses of the experimental–control and pre–post data using the statistical methodology that (at least at this point) Rossi seems to prefer: probit and logit with robust (Huber-sandwich) standard errors. Table 1 compares estimated treatment coefficients and standard errors for our experimental design models for two outcomes (own births and abortions)[1] using two estimation methods: probit and Huber-adjusted probit. Table 2 provides analogous information from our pre–post analysis.

The two estimation methods do not yield appreciably different standard errors for the coefficients and therefore do not warrant any change in our inferences regarding the impact of FDP on births or abortions. There is simply no empirical support for Rossi’s speculations about the possible impact of serial dependence of the observations.
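As an illustrative check (not part of the original analysis), the published entries in Tables 1 and 2 can be compared side by side. The sketch below transcribes each coefficient and standard error, then reports the gap between the conventional and Huber-adjusted standard errors and the ratio of each coefficient to its standard error. The row labels and function names are ours, and the “NC” cells of Table 1 are omitted.

```python
# Each row: (label, probit coef, probit SE, Huber coef, Huber SE),
# transcribed from Tables 1 and 2. The "NC" cells of Table 1 are omitted.
ROWS = [
    ("T1 births, ongoing, Treatment",      0.029, 0.053,  0.029, 0.052),
    ("T1 births, ongoing, Time*Status",   -0.011, 0.006, -0.011, 0.005),
    ("T1 births, new, Treatment",         -0.098, 0.035, -0.098, 0.040),
    ("T1 abortions, ongoing, Treatment",  -0.037, 0.046, -0.037, 0.052),
    ("T1 abortions, ongoing, Time*Status", 0.007, 0.005,  0.007, 0.005),
    ("T1 abortions, new, Treatment",       0.170, 0.065,  0.170, 0.070),
    ("T1 abortions, new, Time*Status",    -0.010, 0.008, -0.010, 0.008),
    ("T2 births, Middle",                  0.202, 0.054,  0.201, 0.054),
    ("T2 births, Post",                   -0.041, 0.019, -0.041, 0.018),
    ("T2 births, Time*Middle",            -0.027, 0.006, -0.027, 0.006),
    ("T2 births, Time*Post",              -0.011, 0.003, -0.011, 0.003),
    ("T2 abortions, Middle",              -0.067, 0.050, -0.065, 0.048),
    ("T2 abortions, Post",                 0.040, 0.018,  0.039, 0.018),
    ("T2 abortions, Time*Middle",          0.009, 0.006,  0.009, 0.006),
    ("T2 abortions, Time*Post",           -0.001, 0.003, -0.001, 0.003),
]


def z_ratio(coef, se):
    """Coefficient divided by its standard error."""
    return coef / se


def summarize(rows):
    """Return per-row SE gaps and z-ratios under both sets of standard errors."""
    out = []
    for label, p_coef, p_se, h_coef, h_se in rows:
        out.append({
            "label": label,
            "se_gap": abs(h_se - p_se),
            "z_probit": z_ratio(p_coef, p_se),
            "z_huber": z_ratio(h_coef, h_se),
        })
    return out


summary = summarize(ROWS)
max_se_gap = max(r["se_gap"] for r in summary)
same_sign = all((r["z_probit"] > 0) == (r["z_huber"] > 0) for r in summary)

for r in summary:
    print(f'{r["label"]:<36} z(probit)={r["z_probit"]:6.2f}  z(Huber)={r["z_huber"]:6.2f}')
print(f"Largest SE gap: {max_se_gap:.3f}")
print(f"Signs agree in every row: {same_sign}")
```

Because the published values are rounded to three decimals, the ratios are approximate; the sketch only summarizes the tables and does not re-estimate any model. On these inputs the largest gap between the two sets of standard errors is 0.006.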

Rather than withdraw his speculation, Rossi now resorts to even more speculation, asserting that the “other deficiencies” in our work “are sufficiently serious” that even estimates based on his own preferred statistical method, Huber-corrected probit, “are not credible.” This remarkable claim, which Rossi did not make in any of the previous drafts of his critique, raises an interesting question: If even the Huber-corrected estimates are “not credible,” why did Rossi argue so strenuously for computing them? 

Does the Pre–Post Analysis Suffer from Omitted-Variables Bias?
Rossi’s second major criticism is that our pre–post analysis “cannot take into account the effects of time or other events that might affect outcomes.” However, Rossi conspicuously neglects to identify any variables that we should have included but did not. He does note that the period from 1991 to 1996 saw increases in the proportion of Aid to Families with Dependent Children (AFDC) cases with either an ineligible adult caretaker or a never-married woman, and he speculates that “the effects estimated by the pre–post analysis are likely confounded with those changes.” However, he apparently does not realize that we included variables that explicitly identify the presence of ineligible adult caretakers and never-married women. Are there other variables that we should have included but did not? Rossi says nothing about what these other variables might be.

Was the Experiment Flawed in Its Implementation?
The experimental design was developed by the Department of Human Services, State of New Jersey, and the Administration for Children and Families, U.S. Department of Health and Human Services. The evaluators were not involved in either the development or the implementation of this experiment. We did, however, monitor the experimental sample to detect any evidence of experimental–control crossover or contamination.

Rossi offers two main reasons to support his conclusion that implementation of FDP was flawed. First, he argues that “the integrity of the control group was problematic.” He notes that “about 20” control-group recipients did not receive benefit increases appropriate to their status during the first year or so of FDP. He then assumes that “only a small proportion of recipients will have children in that period of time.” He therefore asserts that this error must mean that “a much larger number of women were mistakenly treated as if they were under FDP rules.” Of course, Rossi does not actually know that this was the case; it is merely speculation. We believe that there is no support in the data for such speculation. Although we uncovered several cases of questionable Medicaid extension, our monitoring of more than a dozen other FDP treatment blocks revealed no additional incidence of contamination. The 21 cases we identified as having been erroneously denied benefit increases never grew in number. Of these, 16 cases were handled by a single field office within a large urban county. Rossi’s speculation is appropriate only if it can be assumed that such benefit denials were random and widespread, rather than limited and idiosyncratic, as we strongly believe was the case.

Rossi also argues that many experimental and control subjects did not know which rules applied to them. For several reasons, we are perplexed by this argument. First, although Rossi claims that no conclusions can be drawn from our other analyses because of their methodological flaws, he bases this conclusion on the responses to what he concedes is one “poorly worded question” in a 275-question survey, which had a 41-percent response rate. Second, although Rossi now feels able to draw conclusions from this survey, two years ago he argued that the “serious methodological flaws” in this survey are so severe that “no one may ever know whether New Jersey’s ‘family cap’ had any impact on the birth rates of mothers on welfare” (see Besharov, Germanis, and Rossi 1997, 20–21). Of course, Rossi may have changed his mind and may now truly feel that the survey provides credible evidence. If so, however, we are surprised that he ignored results in this survey showing, for example, that experimental cases were more likely than control cases to have decided to put off having more children, advised a friend to have an abortion, begun to use contraception methods more consistently, begun to use different contraception methods, sought family-planning advice, received birth-control or abortion counseling, and tried harder to get off welfare. We would nevertheless suggest that this client survey is much more “problematic” (to use Rossi’s term) than Rossi seems to realize. For example, Jagannathan (1999) found that only 29 percent of actual abortions were reported in the survey. In addition, she found that many women tended to report an actual abortion as a birth.

Rossi also says that the FDP evaluation should have done follow-up studies of births and abortions for women (in both the experimental and the control groups) after leaving AFDC. We endorse this idea, particularly because the evaluation contract did not provide for follow-up data collection of this nature and because, so far as we are aware, no such follow-up analysis has ever been funded. We would also note that, by its very nature, any such follow-up analysis of post-AFDC behavior in New Jersey would have to rely on respondents’ self-reporting (which can be quite unreliable) instead of on administrative records and might well suffer from attrition. Thus, such an analysis might well be more “problematic” than Rossi seems to realize.

The evaluation of New Jersey’s FDP is the only completed evaluation of a family-cap policy that includes analysis of births, abortions, family-planning visits, and contraception use. Many of Rossi’s criticisms are based solely on speculation; the rest are resoundingly rejected by the empirical evidence. Perhaps most important, Rossi never explains why two distinctly different evaluation designs have led to similar results. To accept Rossi’s conjectures, one must be willing to believe that this confluence of findings is a mere artifact of design flaws and differences in estimation methods.

We can argue whether New Jersey’s family cap was a good policy. We can discuss whether one can generalize from New Jersey to other states. We cannot, however, avoid the conclusion that women on welfare in New Jersey responded to a family cap in an entirely predictable way by reducing pregnancies and births and (temporarily) increasing abortions.

 

References

Besharov, D. J.; Germanis, P.; and Rossi, P. H. 1997. Evaluating welfare reform: A guide for scholars and practitioners. College Park: University of Maryland School of Public Affairs.

Jagannathan, R. 1999. Who tells the truth when reporting abortions: A study of women on AFDC. Princeton, NJ: Princeton University, Office of Population Research.


Table 1. Treatment-Effect Coefficients: Experimental Design

                                   ONGOING CASES                           NEW CASES
                          Treatment (SE)    Time * Status (SE)   Treatment (SE)    Time * Status (SE)

Own Births
  Probit                  0.029 (0.053)     -0.011* (0.006)      -0.098* (0.035)   NC
  Huber-adjusted probit   0.029 (0.052)     -0.011* (0.005)      -0.098* (0.040)   NC

Abortions
  Probit                  -0.037 (0.046)    0.007 (0.005)        0.170* (0.065)    -0.010 (0.008)
  Huber-adjusted probit   -0.037 (0.052)    0.007 (0.005)        0.170* (0.070)    -0.010 (0.008)

SE = Standard error

* p ≤ .05

 

Table 2. Treatment-Effect Coefficients: Pre–Post Analysis

                          Middle (SE)       Post (SE)            Time * Middle (SE)   Time * Post (SE)

Own Births
  Probit                  0.202* (0.054)    -0.041* (0.019)      -0.027* (0.006)      -0.011* (0.003)
  Huber-adjusted probit   0.201* (0.054)    -0.041* (0.018)      -0.027* (0.006)      -0.011* (0.003)

Abortions
  Probit                  -0.067 (0.050)    0.040* (0.018)       0.009 (0.006)        -0.001 (0.003)
  Huber-adjusted probit   -0.065 (0.048)    0.039* (0.018)       0.009 (0.006)        -0.001 (0.003)

SE = Standard error

* p ≤ .05



[1] Space limitations prevent us from presenting estimated coefficients for contraception use, family planning, and sterilization. Our conclusions regarding the impact of FDP on these outcomes are likewise not affected by estimation method.

