Administration for Children and Families US Department of Health and Human Services

1999 Report Home | Table of Contents | Previous Chapter | Next Chapter

Evaluating Head Start:
A Recommended Framework for Studying the Impact of the Head Start Program

Chapter 4



Recommended Framework for Studying the Impact of Head Start

The remainder of this report presents the recommendations developed by the Advisory Committee on Head Start Research and Evaluation in response to the charge of the Congress, the programmatic history and context offered above, the research experience summarized in the previous chapter, and the extensive experience of individual members with research and evaluation in early childhood programs and across broader social policy issues. This chapter sets forth the framework that the Committee recommends to the Department as it embarks on an impact study or set of studies of Head Start. The framework offers the Committee's best thinking on appropriate research questions, criteria, outcomes and related measurement issues, and the overall research design.

Following this chapter on recommendations for a research framework, Chapter V provides a fuller account of the rationale for these recommendations. It outlines the challenges that the Committee debated (challenges that any research design would have to address) and the strategies that the Committee considered for meeting them. The final chapter completes the discussion of recommendations by providing a set of specific next steps that the Committee offers to the Department, the research community, and the Head Start community to ensure the successful implementation of this research effort.

Research Questions

Recommendation 1

The foundation for any research design is clarity about the questions to be answered. Therefore, based on members' review of the Head Start Amendments of 1998, their understanding of the Head Start program, and their experience studying human service interventions, the Advisory Committee recommends that two critical research questions be investigated as part of an impact study or set of studies. These questions will be operationalized further during the development of the detailed research design.

  • What difference does Head Start make to key outcomes of development and learning (and in particular, the multiple domains of school readiness) for low-income children? What difference does Head Start make to parental practices that contribute to children's school readiness?

  • Under what circumstances does Head Start achieve the greatest impact? What works for which children? What Head Start services are most related to impact?

The first of these two questions is highlighted in the statute in several places and reflects the Congress's interest in learning "if, overall, the Head Start programs have impacts consistent with their primary goal of increasing the social competence of children" (Head Start Amendments of 1998; see footnote 9). The second question was also central to the recommendations of the 1990 Advisory Panel for the Head Start Evaluation Design, because understanding what works best for whom is important to the work of policymakers and program operators in supporting the continuous improvement of the Head Start program and other early childhood efforts.

In answering this second question, any feasible study or activities will only address some of the many possible sources of variation in impact. Therefore, the Committee spent considerable time discussing the kinds of variation that are most important to be able to explore. As noted below, the Committee is particularly interested in variation that relates to the diverse characteristics of children and communities served, the region of the country, and the quality of programs. To the extent possible, the Committee also believes it is important to look at the variation in impact according to the design and auspices of the Head Start programs (e.g., two-year vs. one-year programs, part-day vs. full-day programs, programs operated by nonprofits vs. programs operated by public schools), and the variable nature of the settings in which control group children are served.

In addition to addressing these two top priority research questions, the Committee believes it is also important for those researchers carrying out the impact study or studies to make every effort feasible to communicate information learned as part of the study to participating Head Start programs. For instance, it may be possible to provide descriptive as well as impact information on individual programs to program managers and staff for decisionmaking. Sharing information openly demonstrates respect for the programs and helps them receive a direct and immediate benefit for their involvement in the effort.

Finally, the Committee discussed additional research questions, such as determining the impact of Head Start on communities. While believing that all of these additional questions raise important issues and, if answered, would be helpful to policymakers, the Committee concluded that developing the most feasible design for a study or set of studies required limiting the research questions to the top priorities identified above.

Criteria

Recommendation 2

The Committee believes that an acceptable research design must answer the priority research questions identified above and must satisfy two key criteria:

 

  • An acceptable research design must be scientifically valid and widely credible. It must provide evidence that is scientifically convincing and persuasive to a variety of audiences, such as the Congress, the research community, program staff, and parents.

  • An acceptable design must be feasible. It must be capable of being well implemented in the real world by Head Start programs and researchers.

Much of the Committee's deliberation focused on the potential tension between these two criteria.

The Committee also reached consensus on two other criteria that are critical to the study's ability to answer the research questions rigorously and credibly. The Committee believes that a credible study must:

  • Collect information on the quality of services provided to the Head Start children; and

  • Collect the same or comparable information on children in Head Start and control group or comparison children (e.g., services received; quality and intensity of the intervention; and cost, descriptive, and contextual information). The Committee members thought that while there might be exceptions in practice (e.g., control group children in the care of a relative who would not allow observation or children enrolled in a non-Head Start program that is unwilling to be studied), this comparable-information principle was nonetheless extremely important.

Members of the Committee believe these criteria to be important for several reasons. First, based on a variety of existing research, members believe that quality is likely to be a key moderator of the impact of Head Start, and that conclusions about impact will be much less credible and much less useful if this key intervening variable is not carefully measured. Second, the ability to address the second research question, the variation of impact according to the characteristics of the Head Start program, depends centrally on the careful measurement of quality. Third, careful documentation of the experiences of both Head Start and control group children is one important way that the final design addresses the serious concerns raised by members of the Committee regarding the effect Head Start has on the other child care programs available to low-income children in a community, and the potential that this "contamination" of the control group experience could endanger the credibility of the research. That is, as discussed more fully in the next chapter, some members of the Committee noted that if Head Start programs fulfill their mission of collaboration within the community, they can potentially have a major effect on the quality of other child care programs. In turn, this means that a good Head Start program improves the experiences of children in the control group or comparison group, making it much harder for the research to isolate and provide accurate estimates of the impact of Head Start. Documenting in detail the experiences of Head Start and control group children does not eliminate this problem, but it does provide the researchers with information that is helpful in identifying and assessing the extent of the problem when interpreting the impact findings.

As with the discussion of research questions, members of the Committee suggested additional criteria that they believed essential to a successful design. For example, some members of the Committee thought that the design should:

  • Address a limited number of questions, with program impact as the primary question;

  • Examine a limited number of pre-specified child and family outcomes most likely to show the greatest effect as a result of Head Start participation, with the multiple domains of school readiness as the primary child outcomes;

  • Have multiple measures in independent domains;

  • Address racial, cultural, and linguistic differences;

  • Minimize selection, participation, attrition, and measurement bias;

  • Capture how program variation relates to outcomes;

  • Provide information about outcomes as they relate to various quantities of service;

  • Provide for longitudinal evaluation of the children; and

  • Provide information that will be useful for continuously improving Head Start.

The Advisory Committee concluded that no single design or set of designs can meet all of these important criteria. However, the elements from this list of secondary criteria that were considered most important in guiding the design of the study were debated extensively in arriving at recommendations about preferred design options.

Outcomes and Related Measurement Issues

Recommendation 3

The Committee recommends that outcome measurement in the study should focus on the multiple domains important to school readiness of children and the parental practices that contribute to the school readiness of children. The Committee also proposes principles that the Department should consider in its detailed design regarding the domains of school readiness to focus on, the nature of the measures that should be used, the need to improve measurement for children of diverse cultural backgrounds and those for whom English is a second language, and the timing of assessments over the course of the research effort.

Rather than identifying specific outcomes and measures for the impact research, the Committee suggested key principles for determining appropriate outcomes that the Department should consider as the operationalization of an impact study or studies continues.

First, consistent with the Head Start Amendments of 1998, the Committee recommends that the multiple domains of school readiness be the central outcomes evaluated as part of the impact study of Head Start. The Committee believes that the broad framework for school readiness defined by the National Education Goals Panel, Goal One Technical Planning Group, and the Head Start Performance Measures provides the right overall framework for the study. As articulated through the Goal One effort and as exhibited in the Performance Measures, readiness must not be perceived narrowly or unidimensionally as there are multiple dimensions that contribute to the overall outcome of school readiness.

Second, the Committee believes that it is important to balance a broad framework and approach to school readiness with a focus on key measures, so that those measures can be studied carefully over a period of time. The Committee further concluded that it is important to select measures that are linked by empirical evidence to school readiness and to known Head Start effects. These areas include emergent literacy and literacy; social behavior (both positive and negative behavior); health status viewed comprehensively including physical, mental, dental, and nutritional health; and parent variables, including but not limited to childrearing practices and school involvement, that are particularly associated with school readiness.

Third, the Committee recommends building on existing measures (such as the measures used in the FACES study) while at the same time focusing on improving measures in select priority areas. The Committee discussed improving existing measures and developing new measures that are developmentally, culturally, and linguistically appropriate. For example, with increasing diversity among the children served by Head Start programs nationally, it is important to develop measures for assessing the multiple domains of school readiness that are appropriate for children of diverse cultures and for those for whom English is a second language. Finally, attention should be paid to selecting or developing measures that provide longitudinal assessments of children from preschool into the early grades.

Fourth, measurement of outcomes should incorporate multiple modes of assessment for treatment and control group children to the greatest extent feasible (e.g., direct assessment, ratings by parents and teachers, and direct observations of children's behavior). This helps to ensure that determinations about outcomes will be based on more than one method of assessment, thus decreasing potential measurement bias.

The Committee recommends a pre-test and post-test during the Head Start year and follow-up in kindergarten and first grade for both the treatment and control or comparison group children. The Committee also recommends baseline measurement of parent, child, and community variables that are closely associated with child outcomes.

Overall Research Design

The Committee considered a wide variety of design options, described in full in Appendix B, in seeking a strategy for addressing the critical research questions, satisfying the key criteria, and operationalizing the principles regarding outcomes and measurement. The Committee chose to recommend a set of core principles and to identify options for implementing those principles in the actual design. The Committee makes the following recommendations about a research design that can effectively balance the tradeoff between credibility and feasibility.

Recommendation 4

The Committee believes that the research design should include random assignment of children and families to Head Start and non-Head Start groups, at a diverse group of sites located across the country that represent the variation in Head Start programs. The Committee spent a considerable portion of its deliberations discussing the feasibility, credibility, and ethics of random assignment designs and concluded, despite considerable concerns and challenges that are outlined fully in the next chapter, that random assignment within the framework described here offers the greatest potential to credibly answer the two key research questions and therefore must be an element of the design for assessing impact.

Within the Committee, some members believe that the challenges to the feasibility of random assignment in a Head Start context are modest, while others believe they are grave. However, all members concluded after reviewing the evidence (including evidence from feasibility studies involving random assignment conducted by the Quality Research Centers through partnerships with local Head Start programs) that a credible impact study or set of studies needs to include random assignment of children to Head Start and non-Head Start groups in order to respond to the criteria established by Congress. The group came to this conclusion because of the methodological power of random assignment in answering causal questions such as the two research questions; the difficulty, after a careful and extensive review, of identifying effective alternative designs; and the other features of the recommended design that addressed some of the concerns of Committee members regarding random assignment. Key design features that contributed to this consensus were the criteria for site selection and exclusion described below, the commitment described below to the use of existing information to supplement the random assignment design and to nesting this study in a full and rich overall Head Start research agenda, and the commitment described above to the collection of comparable data for experimental and control group children.

Recommendation 5

Every effort should be made to ensure that the sites selected are representative of Head Start sites nationally. Diversity should be sought on key criteria (e.g., region of the country and poverty level of the community). Sites should reflect the range of Head Start quality across the country.

The Committee identified four core variables on which the sample Head Start sites used in the research must be diverse, in order to reflect the range of Head Start programs across the nation. During the development of the detailed design, the Department will be able to determine whether the sample needs to be stratified on these variables, or whether other variables should be used. The core variables are:

  • Region of the country;

  • Race/ethnicity/language status;

  • Urban/rural; and

  • Depth of poverty in communities.

Equally important is variation on the dimension of quality, so that the programs studied reflect the existing range of Head Start quality. While quality is not likely to be feasible as an initial stratification variable, because it cannot be easily measured in advance of site selection, it is extremely important to measure carefully during the impact study or studies. Quality should be measured across multiple dimensions, with special emphasis on those aspects of quality that link to the outcomes being measured.

Finally, the Committee identified sources of variation across sites that will be useful to consider in analyzing the impact data. These sources of variation include:

  • Design of program as a one-year or two-year experience for children;

  • Program options (e.g., center-based, home-based, part-day, full-day);

  • Auspice (e.g., Community Action Agency, public school, nonprofit organization);

  • Community-level resources;

  • Alternative child care options for low-income children; and

  • The nature of the child care market and the labor market in the community studied.

Committee members also believe it is important to address selection factors in any evaluation of Head Start, whether experimental or quasi-experimental. Unmeasured characteristics of families may influence the choice of Head Start versus other care arrangements and therefore can bias estimates of Head Start's impacts. Similarly, unmeasured characteristics of programs may influence the probability of agreeing to participate in an impact study or studies. Econometric methodologies (such as sample selection models and instrumental variables estimation) may be helpful in modeling such selection processes. These methods often require the collection of data on geographic factors (e.g., for the family example, factors that might influence child care choices, such as families' geographic proximity to Head Start centers or community-level availability of child care slots; for the program example, program factors that might influence decisions to participate in a random assignment study).

Recommendation 6

To ensure that random assignment is feasible, and that the experiment, while randomly assigning Head Start services among eligible families, does not reduce services at any site (an ethical concern to many members of the Committee), sites where Head Start saturates the community (i.e., where there are not enough unserved children to maintain both full program service and a control group) should be excluded from the study. The Committee also recommends that the relatively small number of sites that are out of compliance with Head Start standards or are extremely new to the program be excluded.

In discussing the experiences of the Quality Research Centers, the Committee noted that a key challenge for programs involved in the study was maintaining full enrollment while also maintaining a control group. As noted earlier in the report, the information currently available shows a national level of service of 48 percent of eligible 4-year-olds and 27 percent of eligible 3-year-olds, but this does not take into account state prekindergarten enrollment and is not sufficiently detailed to show in which or how many sites there is local saturation. This is an area where the Committee suggests the collection of additional information as part of the development of a detailed research design.

Committee members also recommend the exclusion of programs that fail to reflect a minimal level of functioning as Head Start sites. This would include exclusion of programs if they are very new to Head Start or if they are deemed non-compliant based on a Head Start monitoring review. The sense of the Committee was that programs that are not yet providing Head Start services at their typical level of quality should not be part of the evaluation. But at the same time, the Committee does not want this exclusion to be so broad that it prevents evaluation of the typical array of Head Start programs.
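The exclusion rules and the assignment step described under this recommendation can be illustrated with a minimal sketch. It is not the Committee's procedure: the field names (`eligible_applicants`, `funded_slots`, `compliant`, `years_operating`) and the thresholds (`min_control`, `min_years`) are hypothetical placeholders that a detailed research design would have to define.

```python
import random

def is_eligible_site(site, min_control, min_years=2):
    """Hypothetical screen: drop saturated, non-compliant, or very new sites.

    A site counts as saturated here when the number of unserved applicants
    is too small to form a control group without leaving slots empty.
    """
    surplus = site["eligible_applicants"] - site["funded_slots"]
    return (
        surplus >= min_control
        and site["compliant"]
        and site["years_operating"] >= min_years
    )

def assign_children(child_ids, funded_slots, rng):
    """Fill every funded slot at random; remaining applicants form the control group."""
    order = list(child_ids)
    rng.shuffle(order)
    return order[:funded_slots], order[funded_slots:]

# Illustrative, made-up site record (not data from the report).
site = {"eligible_applicants": 60, "funded_slots": 40,
        "compliant": True, "years_operating": 5}

if is_eligible_site(site, min_control=15):
    head_start, control = assign_children(
        range(site["eligible_applicants"]), site["funded_slots"], random.Random(1)
    )
```

Because every funded slot is filled before the control group is formed, the site keeps full enrollment, which is the condition the Committee places on including a site in the experiment.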

Recommendation 7

The Committee believes that the Department should consider carefully, in consultation with the Head Start community, what incentives for parents and for sites would be most helpful to secure participation in the study or studies, consistent with the research methodology. The Committee strongly encourages the use of appropriate incentives.

In addition, the Committee believes that the Head Start community should be involved from the beginning in the design and conduct of the research proposed in this report. Building relationships of trust between programs and researchers requires, above all, that programs have the maximum information, involvement, and respect from the research community.

For the research to be successful, Head Start programs must be committed to participating to the maximum extent possible. Because of Head Start's long tradition of involvement in research and demonstration programs, and the program's commitment to continuous improvement, the Committee trusts that the research can be carried out successfully.

Nonetheless, the experience of past research efforts in a wide variety of social programs, as well as the experience of Head Start in particular, suggests that there are many obstacles to program participation and that a variety of incentives may be needed to reduce the obstacles. The Committee believes that the Department should ask programs what they need and should demonstrate its own commitment to the research by attempting to provide incentives to the maximum extent practicable and consistent with high quality research.

Possible incentives for programs could include compensation for additional staff time required to cooperate with the impact study research, funding for a new classroom (possibly to be funded the year after the research cohort is enrolled), or the provision of additional resources to enable programs to conduct new activities, such as remodeling a classroom, expanding to a new service area, securing vehicles for transportation, or purchasing materials or professional training related to program quality. Another incentive could be professional recognition of the programs' involvement with the impact study.

The Committee discussed the particular advantages and disadvantages of offering as an incentive the resources for programs to serve additional children. Under this kind of approach, as part of the overall expansion of Head Start, programs that actively participated in the research would have a special opportunity to expand in a later year. Some members saw two advantages to this approach: (1) that it helps to identify those programs which are not experiencing saturation (because they are more likely to be interested in expansion resources) and (2) that it addresses some of the ethical concerns that programs and researchers may have with random assignment by ensuring that research is tied to expanding the number of children with the opportunity to receive Head Start. However, other members believed that this incentive might not be effective. In addition, because of a concern that the additional classroom might eliminate the potential control group, some members of the Committee proposed that this incentive should only be offered for the year after the cohort that is being studied completes the program. Other incentives that would potentially impact program quality should also be granted after the research cohort completes the program, in order to ensure that the research is measuring the existing range of program quality.

The Committee also discussed incentives that might be appropriate to offer families in exchange for their participation in the research. The Committee believes that this issue deserves more attention and deliberation. The most straightforward incentive for families assigned to the treatment and control groups is a stipend for their participation in each interview and observation. Some members suggested consideration of research designs that would guarantee control group families other services, such as receipt of subsidized child care, partial Head Start services such as health services or social service referrals, or books that they can read to their children. However, other members believe that these designs would reduce the ability of the research to answer the impact question by changing the experience of the control group families to be more like Head Start.

Recommendation 8

The Committee discussed at least three options for selecting sites to be part of the randomized experiment. Each strategy has advantages and disadvantages, which should be fully assessed and reviewed by the Department during development of the detailed research design. The three options are:

  • Stratified national random sample. Sites could be selected by taking a nationally representative sample of all Head Start programs, stratified on the variables identified above. Sites that were selected would then be contacted. All those that met the criteria and were able to participate would do so; a quasi-experimental study could possibly be conducted at the sites that did not participate.

  • Stratified national sample with replacement. As above, sites could be selected by taking a nationally representative sample of all Head Start programs, stratified on the core variables. If a selected site could not participate, another program with the same characteristics would be randomly selected as a replacement.

  • Purposive sample selected for national diversity. Sites could be invited to demonstrate their interest if they believe that they have a sufficient number of unserved children to be capable of maintaining a control group during the time of the experiment. Sites that fit into the stratification cells could be selected from those that demonstrate this capacity.
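As a rough illustration only, the second option above (a stratified sample in which a site that cannot participate is replaced by another randomly chosen site from the same stratum) could be sketched as follows. The site records, the two stratification keys, and the `can_participate` flag are hypothetical; the report leaves the actual strata, sample sizes, and procedure to sampling statisticians.

```python
import random
from collections import defaultdict

def stratified_sample_with_replacement(sites, per_stratum, rng):
    """Draw up to per_stratum participating sites from each stratum.

    Sites are visited in random order within a stratum; a site that cannot
    participate is skipped and the next randomly ordered site replaces it.
    """
    strata = defaultdict(list)
    for site in sites:
        # Hypothetical strata: region crossed with urban/rural status.
        strata[(site["region"], site["urban"])].append(site)

    selected = []
    for key in sorted(strata):
        pool = strata[key][:]
        rng.shuffle(pool)
        chosen = []
        for site in pool:
            if len(chosen) == per_stratum:
                break
            if site["can_participate"]:
                chosen.append(site)
        selected.extend(chosen)
    return selected

# Made-up sampling frame: 4 strata with 6 sites each; every third site declines.
cells = [(r, u) for r in ("Northeast", "South") for u in (True, False) for _ in range(6)]
sites = [
    {"id": i, "region": r, "urban": u, "can_participate": i % 3 != 0}
    for i, (r, u) in enumerate(cells)
]
sample = stratified_sample_with_replacement(sites, per_stratum=2, rng=random.Random(0))
```

The first option differs only in that a declining site is not replaced, so some strata may end up short; the third option would start from sites that volunteer rather than from a random draw.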

The Committee recommends that the Department, in the development of the detailed research design, consult with sampling statisticians to gather additional information such as the number of sites that should be in the study or studies and the specifics of various sampling approaches.

Recommendation 9

The Committee discussed the option of using quasi-experimental studies to supplement the information from the random assignment study. This option should be more fully developed and reviewed by the Department during development of the detailed research design.

While the Committee does not believe that the research design should rely solely on a quasi-experimental study because of its limitations in answering the impact questions, some members of the Committee believe such a study should be carried out as a complement to the randomized study. Committee members discussed the potential of a quasi-experimental design to enhance the goal of evaluating the national impact of Head Start, particularly if the sample of sites where random assignment of children was implemented turned out to be unrepresentative. Quasi-experimental designs do not require randomly assigning subjects to control and experimental groups and instead study differences in outcomes for naturally occurring treatment and non-treatment groups. Even though quasi-experimental designs may be necessary, the Committee urges the Department to allocate as large a share of the funds as possible to the experimental study or studies to ensure rigor by increasing the number of participating sites and families.

As the Department develops these options further, the Committee urges the Department to consider the most effective ways to link the impact research with ongoing efforts, such as the ECLS-B, ECLS-K, or FACES studies. There may be opportunities in sites where randomization takes place to include a control group consisting of randomly assigned children and a second control group of children who would participate in a quasi-experimental component of the research. The two types of control groups within the same sites would provide an opportunity to calibrate the results of the quasi-experiment against the randomized experiment.
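One way to picture the calibration idea is the sketch below, which compares the impact estimate obtained with the randomized control group against the estimate obtained with a quasi-experimental comparison group at the same site. The scores are invented for illustration, and the simple difference in means is an assumption made here for brevity; an actual analysis would likely adjust for baseline measures and sampling design.

```python
from statistics import mean

def impact_estimate(treated, comparison):
    """Unadjusted difference in mean outcome between two groups."""
    return mean(treated) - mean(comparison)

def calibration_gap(treated, randomized_control, quasi_comparison):
    """Return the experimental estimate, the quasi-experimental estimate,
    and the gap between them (quasi minus experimental)."""
    experimental = impact_estimate(treated, randomized_control)
    quasi = impact_estimate(treated, quasi_comparison)
    return experimental, quasi, quasi - experimental

# Made-up outcome scores for one hypothetical site (not study data).
head_start_scores = [12, 14, 16]
randomized_control_scores = [10, 11, 12]
quasi_comparison_scores = [8, 9, 10]

experimental, quasi, gap = calibration_gap(
    head_start_scores, randomized_control_scores, quasi_comparison_scores
)
```

A gap near zero would suggest that the quasi-experimental comparison group behaves much like the randomized control group at that site; a large gap would signal selection differences that the quasi-experimental estimates elsewhere may share.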

Recommendation 10

The Committee believes that it is critical to draw on information from Head Start's extensive existing research agenda to complement the information gained from the random assignment impact study or studies. Thus, the Committee believes that the impact research proposed here should be a part of a rich and active Head Start research agenda, not a substitute for it. As such, the Department should ensure that the research and findings from the impact study or studies are used in combination with the rest of the Head Start research effort to improve the effectiveness of Head Start programs for children and families.

Members emphasized that many other parts of Head Start's ongoing research agenda are critical to improving the quality of Head Start and other early childhood programs and ensuring better outcomes for children. As the Department allocates resources, the Committee believes that the Department should ensure that the impact research is complemented by a rich array of other studies that focus on quality improvement and results measurement, program variation, and the needs of particular populations of children. This overall agenda should provide information to local Head Start programs, policymakers, researchers, and the early childhood field about how early childhood programs, and in particular Head Start, can most effectively support the development of young children. As noted in the review of ongoing Head Start research efforts in Chapter III, a number of national data collection efforts could contribute to this comprehensive approach to assessing the impact and improving the quality of Head Start. For example, the kindergarten and birth cohorts of the ECLS and efforts to continue the FACES research strategy are important potential resources to consider to inform key questions related to the impact of Head Start. Thus, the framework for impact research as outlined above is presented by the Committee with the condition that other continuing and new research be supported that will provide information about the link between quality and outcomes; the relative value of program enhancements (e.g., expanded literacy efforts, two years of Head Start, full-day services); and information about services for special populations.

In addition, some members were particularly concerned that the impact research envisioned in this framework would not provide sufficient opportunity to compare different options within Head Start, particularly options that are becoming an increasing part of Head Start's programmatic repertoire. These members asked that the Department pay particular attention in designing its research agenda to the Option II design described in Appendix B as "Random Assignment of Sites to Traditional Head Start and an Enhanced Head Start." This option would allow for the study of planned variation of program features or strategies in different Head Start locations, so that different program approaches could be compared directly. It is possible that oversampling as part of an experimental study or set of studies could also enable researchers to compare the different programmatic options within Head Start.

Finally, other members noted the importance of research that would address the costs and benefits of Head Start and other early childhood programs. These members urged the Department to begin a planning effort in this area as part of the continuing research agenda.

Overall, however, the Committee members emphasized the need to use the information gathered as a result of the impact study or studies to inform the field so that the Head Start program can continuously improve its practices to provide an effective, high quality early childhood experience for children from low-income families.

Recommendation 11

Based on the key parameters of its recommended design, the Committee notes that it will not be possible to meet the expected deadline of September 30, 2003, for a final report.

Because the statute and the Committee recommendations stipulate the collection and analysis of data on children through the end of first grade, all of the design options considered by the Committee would lead to a final report no earlier than the year 2006. The Committee urges the Department to make every effort to ensure that the report is completed by no later than 2006. In view of this expected schedule for reporting on the new impact study or studies, it is particularly important for the Department to report findings from other ongoing research efforts, as discussed in Recommendation 10, in formats and at times that are most useful to policymakers.

 

 

9. The definition of social competence used by Head Start encompasses multiple domains of development and is comparable to Goal One (the "readiness" Goal) of the National Education Goals. The second question is also addressed in the statute, which directs that the Secretary, "to the extent practicable, consider addressing possible sources of variation in the impact of Head Start programs" (Head Start Amendments of 1998, Section 649(g)(6)).

 
