The remainder of this report presents the recommendations developed
by the Advisory Committee on Head Start Research and Evaluation
in response to the charge of the Congress, the programmatic history
and context offered above, the research experience summarized
in the previous chapter, and the extensive experience of individual
members with research and evaluation in early childhood programs
and across broader social policy issues. This chapter sets forth
the framework that the Committee recommends to the Department
as it embarks on an impact study or set of studies of Head Start.
The framework offers the Committee's best thinking on appropriate
research questions, criteria, outcomes and related measurement
issues, and the overall research design.
Following this chapter on recommendations for a research framework,
Chapter V provides a fuller account of the rationale for these
recommendations. Thus Chapter V outlines the challenges that the
Committee debated (challenges that any research design would have
to address) and the strategies that the Committee considered for
meeting those challenges. The final chapter completes the discussion
of recommendations by providing a set of specific next steps that
the Committee offers to the Department, the research community,
and the Head Start community, in order to ensure the successful
implementation of this research effort.
Research Questions
Recommendation 1
The foundation for any research design is clarity about the
questions to be answered. Therefore, based on members' review
of the Head Start Amendments of 1998, their understanding of the
Head Start program, and their experience studying human service
interventions, the Advisory Committee recommends that two critical
research questions be investigated as part of an impact study
or set of studies. These questions will be operationalized further
during the development of the detailed research design.
-
What difference does Head Start make to key outcomes of development
and learning (and in particular, the multiple domains of school
readiness) for low-income children? What difference does Head
Start make to parental practices that contribute to children's
school readiness?
-
Under what circumstances does Head Start achieve the greatest
impact? What works for which children? What Head Start services
are most related to impact?
The first of these two questions is highlighted in the statute
in several places and reflects the Congress's interest in learning
"if, overall, the Head Start programs have impacts consistent
with their primary goal of increasing the social competence of
children." (Head Start Amendments of 1998)9.
The second question was also central to the recommendations of
the 1990 Advisory Panel for the Head Start Evaluation Design,
because understanding what works best for whom is important to
the work of policymakers and program operators in supporting the
continuous improvement of the Head Start program and other early
childhood efforts.
In answering this second question, any feasible study or activities
will only address some of the many possible sources of variation
in impact. Therefore, the Committee spent considerable time discussing
the kinds of variation that are most important to be able to explore.
As noted below, the Committee is particularly interested in variation
that relates to the diverse characteristics of children and communities
served, the region of the country, and the quality of programs.
To the extent possible, the Committee also believes it is important
to look at the variation in impact according to the design and
auspices of the Head Start programs (e.g., two-year vs. one-year
programs, part-day vs. full-day programs, programs operated by
nonprofits vs. programs operated by public schools), and the variable
nature of the settings in which control group children are served.
In addition to addressing these two top priority research questions,
the Committee believes it is also important for those researchers
carrying out the impact study or studies to make every effort
feasible to communicate information learned as part of the study
to participating Head Start programs. For instance, it may be
possible to provide descriptive as well as impact information
on individual programs to program managers and staff for decisionmaking.
Sharing information openly demonstrates respect for the programs
and helps them receive a direct and immediate benefit for their
involvement in the effort.
Finally, the Committee discussed additional research questions,
such as determining the impact of Head Start on communities. While
believing that all of these additional questions raise important
issues and, if answered, would be helpful to policymakers, the
Committee concluded that developing the most feasible design for
a study or set of studies required limiting the research questions
to the top priorities identified above.
Criteria
Recommendation 2
The Committee believes that an acceptable research design must answer
the priority research questions identified above and must satisfy
two key criteria:
-
An acceptable research design must be scientifically valid
and widely credible. It must provide evidence that is scientifically
convincing and persuasive to a variety of audiences, such
as the Congress, the research community, program staff, and
parents.
-
An acceptable design must be feasible. It must be capable
of being well implemented in the real world by Head Start
programs and researchers.
Much of the Committee's deliberation focused on the potential
tension between these two criteria.
The Committee also reached consensus on two other criteria that
are critical to the study's ability to answer the research questions
rigorously and credibly. The Committee believes that a credible
study must:
-
Collect information on the quality of services provided to
the Head Start children; and
-
Collect the same or comparable information on children in
Head Start and control group or comparison children (e.g.,
services received; quality and intensity of the intervention;
and cost, descriptive, and contextual information). The Committee
members thought that while there might be exceptions in practice
(e.g., control group children in the care of a relative who
would not allow observation or children enrolled in a non-Head
Start program that is unwilling to be studied), this comparable-information
principle was nonetheless extremely important.
Members of the Committee believe these criteria to be important
for several reasons. First, based on a variety of existing research,
members believe that quality is likely to be a key moderator of
the impact of Head Start, and that conclusions about impact will
be much less credible and much less useful if this key intervening
variable is not carefully measured. Second, the ability to address
the second research question, the variation of impact according
to the characteristics of the Head Start program, depends centrally
on the careful measurement of quality. Third, careful documentation
of the experiences of both Head Start and control group children
is one important way that the final design addresses the serious
concerns raised by members of the Committee regarding the effect
Head Start has on the other child care programs available to low-income
children in a community, and the potential that this "contamination"
of the control group experience could endanger the credibility
of the research. That is, as discussed more fully in the next
chapter, some members of the Committee noted that if Head Start
programs fulfill their mission of collaboration within the community,
they can potentially have a major effect on the quality of other
child care programs. In turn, this means that a good Head Start
program improves the experiences of children in the control group
or comparison group, making it much harder for the research to
isolate and provide accurate estimates of the impact of Head Start.
Documenting in detail the experiences of Head Start and control
group children does not eliminate this problem, but it does provide
the researchers with information that is helpful in identifying
and assessing the extent of the problem when interpreting the
impact findings.
As with the discussion of research questions, members of the
Committee suggested additional criteria that they believed essential
to a successful design. For example, some members of the Committee
thought that the design should:
-
Address a limited number of questions, with program impact
as the primary question;
-
Examine a limited number of pre-specified child and family
outcomes most likely to show the greatest effect as a result
of Head Start participation, with the multiple domains of
school readiness as the primary child outcomes;
-
Have multiple measures in independent domains;
-
Address racial, cultural, and linguistic differences;
-
Minimize selection, participation, attrition, and measurement
bias;
-
Capture how program variation relates to outcomes;
-
Provide information about outcomes as they relate to various
quantities of service;
-
Provide for longitudinal evaluation of the children; and
-
Provide information that will be useful for continuously
improving Head Start.
The Advisory Committee concluded that no single design or set
of designs can meet all these important criteria. However, elements
from the list of secondary criteria that are considered most important
in guiding the design of the study were debated extensively in
arriving at recommendations about preferred design options.
Outcomes and Related Measurement Issues
Recommendation 3
The Committee recommends that outcome measurement in the study
should focus on the multiple domains important to school readiness
of children and the parental practices that contribute to the
school readiness of children. The Committee also proposes principles
that the Department should consider in its detailed design regarding
the domains of school readiness to focus on, the nature of the
measures that should be used, the need to improve measurement
for children of diverse cultural backgrounds and those for whom
English is a second language, and the timing of assessments over
the course of the research effort.
Rather than identifying specific outcomes and measures for the
impact research, the Committee suggested key principles for determining
appropriate outcomes that the Department should consider as the
operationalization of an impact study or studies continues.
First, consistent with the Head Start Amendments of 1998, the
Committee recommends that the multiple domains of school readiness
be the central outcomes evaluated as part of the impact study
of Head Start. The Committee believes that the broad framework
for school readiness defined by the National Education Goals Panel,
Goal One Technical Planning Group, and the Head Start Performance
Measures provides the right overall framework for the study. As
articulated through the Goal One effort and as exhibited in the
Performance Measures, readiness must not be perceived narrowly
or unidimensionally as there are multiple dimensions that contribute
to the overall outcome of school readiness.
Second, the Committee believes that it is important to balance
a broad framework and approach to school readiness with a focus
on key measures, so that those measures can be studied carefully
over a period of time. The Committee further concluded that it
is important to select measures that are linked by empirical evidence
to school readiness and to known Head Start effects. These areas
include emergent literacy and literacy; social behavior (both
positive and negative behavior); health status viewed comprehensively
including physical, mental, dental, and nutritional health; and
parent variables, including but not limited to childrearing practices
and school involvement, that are particularly associated with
school readiness.
Third, the Committee recommends building on existing measures
(such as the measures used in the FACES study) while at the same
time focusing on improving measures in select priority areas.
The Committee discussed improving existing measures and developing
new measures that are developmentally, culturally, and linguistically
appropriate. For example, with increasing diversity among the
children served by Head Start programs nationally, it is important
to develop measures for assessing the multiple domains of school
readiness that are appropriate for children of diverse cultures
and those for whom English is a second language. Finally,
attention should be paid to selecting or developing measures that
provide longitudinal assessments of children from preschool into
the early grades.
Fourth, measurement of outcomes should incorporate multiple
modes of assessment for treatment and control group children to
the greatest extent feasible (e.g., direct assessment, ratings
by parents and teachers, and direct observations of children's
behavior). This helps to ensure that determinations about outcomes
will be based on more than one method of assessment, thus decreasing
potential measurement bias.
The Committee recommends a pre-test and post-test during the
Head Start year and follow-up in kindergarten and first grade
for both the treatment and control or comparison group children.
The Committee also recommends baseline measurement of parent,
child, and community variables that are closely associated with
child outcomes.
Overall Research Design
The Committee considered a wide variety of design options, described
in full in Appendix B, in seeking a strategy for addressing the
critical research questions, satisfying the key criteria, and
operationalizing the principles regarding outcomes and measurement.
The Committee chose to recommend a set of core principles and
to identify options for implementing those principles in the actual
design. The Committee makes the following recommendations about
a research design that can effectively balance the tradeoff between
credibility and feasibility.
Recommendation 4
The Committee believes that the research design should include
random assignment of children and families to Head Start and non-Head
Start groups, at a diverse group of sites located across the country
that represent the variation in Head Start programs. The Committee
spent a considerable portion of its deliberations discussing the
feasibility, credibility, and ethics of random assignment designs
and concluded, despite considerable concerns and challenges that
are outlined fully in the next chapter, that random assignment
within the framework described here offers the greatest
potential to credibly answer the two key research questions and
therefore must be an element of the design for assessing impact.
Within the Committee, some members believe that the challenges
to the feasibility of random assignment in a Head Start context
are modest, while others believe they are grave. However, all
members concluded after reviewing the evidence (including evidence
from feasibility studies involving random assignment conducted
by the Quality Research Centers through partnerships with local
Head Start programs) that a credible impact study or set of studies
needs to include random assignment of children to Head Start and
non-Head Start groups in order to respond to the criteria established
by Congress. The group came to this conclusion because of the
methodological power of random assignment in answering causal
questions such as the two research questions; the difficulty,
after a careful and extensive review, of identifying effective
alternative designs; and the other features of the recommended
design that addressed some of the concerns of Committee members
regarding random assignment. Key design features that contributed
to this consensus were the criteria for site selection and exclusion
described below, the commitment described below to the use of
existing information to supplement the random assignment design
and to nesting this study in a full and rich overall Head Start
research agenda, and the commitment described above to the collection
of comparable data for experimental and control group children.
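The within-site random assignment recommended here can be illustrated with a minimal sketch. It assumes a site has more eligible applicants than funded Head Start slots; the function name, the argument names, and the use of a recorded seed are hypothetical choices made for illustration and are not part of the Committee's recommendation.

```python
import random


def assign_within_site(applicant_ids, head_start_slots, seed):
    """Randomly split one site's eligible applicants into two groups.

    Hypothetical helper: every funded slot is filled, and the remaining
    applicants form the control group.
    """
    if head_start_slots >= len(applicant_ids):
        # No applicants would be left over, so no control group is possible.
        raise ValueError("no applicants left over to form a control group")
    rng = random.Random(seed)        # seeded so the lottery can be rerun and audited
    shuffled = list(applicant_ids)   # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    return {
        "head_start": sorted(shuffled[:head_start_slots]),
        "control": sorted(shuffled[head_start_slots:]),
    }
```

Because the split depends only on the shuffle, every applicant at the site has the same chance of receiving a slot, and the program continues to serve its full funded enrollment.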
Recommendation 5
Every effort should be made to ensure that the sites selected
are representative of Head Start sites nationally. Diversity should
be sought on key criteria (e.g., region of the country and poverty
level of the community). Sites should reflect the range of Head
Start quality across the country.
The Committee identified four core variables on which the sample
Head Start sites used in the research must be diverse, in order
to reflect the range of Head Start programs across the nation.
During the development of the detailed design, the Department
will be able to determine whether the sample needs to be stratified
on these variables, or whether other variables should be used.
The core variables are:
Equally important is variation on the dimension of quality, so
that the programs studied reflect the existing range of Head Start
quality. While quality is not likely to be feasible as an initial
stratification variable, because it cannot be easily measured
in advance of site selection, it is extremely important to measure
carefully during the impact study or studies. Quality should be
measured across multiple dimensions, with special emphasis on
those aspects of quality that link to the outcomes being measured.
Finally, the Committee identified sources of variation across
sites that will be useful to consider in analyzing the impact
data. These sources of variation include:
-
Design of program as a one-year or two-year experience for
children;
-
Program options (e.g., center-based, home-based, part-day,
full-day);
-
Auspice (e.g., Community Action Agency, public school, nonprofit
organization);
-
Community-level resources;
-
Alternative child care options for low-income children; and
-
The nature of the child care market and the labor market
in the community studied.
Committee members also believe it is important to address selection
factors in any evaluation of Head Start, whether experimental
or quasi-experimental. Unmeasured characteristics of families
may influence the choice of Head Start versus other care arrangements
and therefore can bias estimates of Head Start's impacts. Similarly,
unmeasured characteristics of programs may influence the probability
of agreeing to participate in an impact study or studies. Econometric
methodologies (such as sample selection models and instrumental variables
estimation) may be helpful in modeling such selection processes.
These methods often require the collection of data on geographic
factors (e.g., for the family example, factors which might influence
child care choices, such as families' geographic proximity to
Head Start centers or community-level availability of child care
slots; for the program example, program factors which might influence
decisions to participate in a random assignment study).
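As one illustration of instrumental variables estimation, the sketch below computes the simplest such estimator (the Wald estimator) with a binary instrument. It assumes, purely for illustration, that living near a Head Start center affects child outcomes only through its effect on enrollment; that assumption would have to be defended in any real analysis. The record fields `near_center`, `enrolled`, and `outcome` are hypothetical.

```python
from statistics import mean


def wald_iv_estimate(records):
    """Wald (binary-instrument) IV estimate of the effect of enrollment.

    Each record is a dict with hypothetical keys:
      near_center: True/False, the instrument (proximity to a center)
      enrolled:    1 or 0, whether the child attended Head Start
      outcome:     a numeric child outcome score
    Both proximity groups must be non-empty.
    """
    near = [r for r in records if r["near_center"]]
    far = [r for r in records if not r["near_center"]]
    # Difference in average outcomes between the two proximity groups.
    outcome_gap = mean(r["outcome"] for r in near) - mean(r["outcome"] for r in far)
    # Difference in enrollment rates between the two proximity groups.
    enrollment_gap = mean(r["enrolled"] for r in near) - mean(r["enrolled"] for r in far)
    if enrollment_gap == 0:
        raise ValueError("proximity does not shift enrollment; estimate undefined")
    return outcome_gap / enrollment_gap
```

The ratio scales the outcome difference by how much proximity actually changes enrollment, which is why data on factors such as distance to the nearest center would need to be collected.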
Recommendation 6
To ensure that random assignment is feasible, and to ensure that
the experiment, while randomly assigning Head Start services among
eligible families, does not lead to a reduction of services in
any site (an ethical concern to many members of the Committee),
sites where Head Start saturates the community (i.e., where there
are not enough unserved children to maintain full program service
and a control group) should be excluded from the study. The Committee
also recommends that the relatively small number of sites that
are out of compliance with Head Start standards or are extremely
new to the program also be excluded.
In discussing the experiences of the Quality Research Centers,
the Committee noted that a key challenge for programs involved
in the study was maintaining full enrollment while also maintaining
a control group. As noted earlier in the report, the information
currently available shows a national level of service of 48 percent
of eligible 4-year-olds and 27 percent of eligible 3-year-olds,
but this does not take into account state prekindergarten enrollment
and is not sufficiently detailed to show in which or how many
sites there is local saturation. This is an area where the Committee
suggests the collection of additional information as part of the
development of a detailed research design.
Committee members also recommend the exclusion of programs that
fail to reflect a minimal level of functioning as Head Start sites.
This would include exclusion of programs if they are very new
to Head Start or if they are deemed non-compliant based on a Head
Start monitoring review. The sense of the Committee was that programs
that are not yet providing Head Start services at their typical
level of quality should not be part of the evaluation. But at
the same time, the Committee does not want this exclusion to be
so broad that it prevents evaluation of the typical array of Head
Start programs.
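The exclusion rules in this recommendation can be expressed as a simple screen applied to each candidate site. The sketch below is illustrative only: the field names and the thresholds for control group size and years of operation are hypothetical placeholders that the detailed research design would need to set.

```python
def site_is_eligible(site, min_control_children=20, min_years_operating=2):
    """Apply the three exclusion rules to one candidate site.

    `site` is a dict with hypothetical keys:
      eligible_children: eligible children seeking services in the community
      funded_slots:      Head Start enrollment the program must keep filled
      compliant:         result of the most recent monitoring review
      years_operating:   years the grantee has run Head Start
    The default thresholds are placeholders, not Committee figures.
    """
    # Children who would remain unserved once every funded slot is filled.
    unserved = site["eligible_children"] - site["funded_slots"]
    saturated = unserved < min_control_children
    too_new = site["years_operating"] < min_years_operating
    return site["compliant"] and not saturated and not too_new
```

Keeping the thresholds as explicit parameters makes it easy to check how many sites each choice would exclude, which bears on the Committee's concern that the exclusion not become too broad.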
Recommendation 7
The Committee believes that the Department should consider carefully,
in consultation with the Head Start community, what incentives
for parents and for sites would be most helpful to secure participation
in the study or studies, consistent with the research methodology.
The Committee strongly encourages the use of appropriate incentives.
In addition, the Committee believes that the Head Start community
should be involved from the beginning in the design and conduct
of the research proposed in this report. Building relationships
of trust between programs and researchers requires, above all,
that programs have the maximum information, involvement, and respect
from the research community.
For the research to be successful, Head Start programs must be
committed to participating to the maximum extent possible. Because
of Head Start's long tradition of involvement in research and
demonstration programs, and the program's commitment to continuous
improvement, the Committee trusts that the research can be carried
out successfully.
Nonetheless, the experience of past research efforts in a wide
variety of social programs, as well as the experience of Head
Start in particular, suggests that there are many obstacles to
program participation and that a variety of incentives may be
needed to reduce the obstacles. The Committee believes that the
Department should ask programs what they need and should demonstrate
its own commitment to the research by attempting to provide incentives
to the maximum extent practicable and consistent with high quality
research.
Possible incentives for programs could include compensation for
additional staff time required to cooperate with the impact study
research, funding for a new classroom (possibly to be funded the
year after the research cohort is enrolled), or the provision
of additional resources to enable programs to conduct new activities,
such as remodeling a classroom, expanding to a new service area,
securing vehicles for transportation, or purchasing materials
or professional training related to program quality. Another incentive
could be professional recognition of the programs' involvement
with the impact study.
The Committee discussed the particular advantages and disadvantages
of offering as an incentive the resources for programs to serve
additional children. Under this kind of approach, as part of the
overall expansion of Head Start, programs that actively participated
in the research would have a special opportunity to expand in
a later year. Some members saw two advantages to this approach:
(1) that it helps to identify those programs which are not experiencing
saturation (because they are more likely to be interested in expansion
resources) and (2) that it addresses some of the ethical concerns
that programs and researchers may have with random assignment
by ensuring that research is tied to expanding the number of children
with the opportunity to receive Head Start. However, other members
believed that this incentive might not be effective. In addition,
because of a concern that the additional classroom might eliminate
the potential control group, some members of the Committee proposed
that this incentive should only be offered for the year after
the cohort that is being studied completes the program. Other
incentives that would potentially impact program quality should
also be granted after the research cohort completes the program,
in order to ensure that the research is measuring the existing
range of program quality.
The Committee also discussed incentives that might be appropriate
to offer families in exchange for their participation in the research.
The Committee believes that this issue deserves more attention
and deliberation. The most straightforward incentive for families
assigned to the treatment and control groups is a stipend for
their participation in each interview and observation. Some members
suggested consideration of research designs that would guarantee
control group families other services, such as receipt of subsidized
child care, partial Head Start services such as health services
or social service referrals, or books that they can read to their
children. However, other members believe that these designs would
reduce the ability of the research to answer the impact question
by changing the experience of the control group families to be
more like Head Start.
Recommendation 8
The Committee discussed at least three options for selecting
sites to be part of the randomized experiment. Each strategy has
advantages and disadvantages, which should be fully assessed and
reviewed by the Department during development of the detailed
research design. The three options are:
-
Stratified national random sample. Sites could be selected
by taking a nationally representative sample of all Head Start
programs, stratified on the variables identified above. Sites
that were selected would then be contacted. All those that
met the criteria and were able to participate would do so;
a quasi-experimental study could possibly be conducted at
the sites that did not participate.
-
Stratified national sample with replacement. As above, sites
could be selected by taking a nationally representative sample
of all Head Start programs, stratified on the core variables.
If, once selected, a site could not participate, another program
with the same characteristics would be randomly selected as
a replacement.
-
Purposive sample selected for national diversity. Sites could
be invited to demonstrate their interest if they believe that
they have a sufficient number of unserved children to be capable
of maintaining a control group during the time of the experiment.
Sites that fit into the stratification cells could be selected
from those that demonstrate this capacity.
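The second option, a stratified sample with replacement of sites that cannot participate, can be sketched as follows. This is a minimal illustration under stated assumptions, not a sampling plan: the `stratum` field, the `can_participate` callback, and the equal allocation per stratum are all hypothetical. Here "replacement" means substituting another site from the same stratum, not drawing the same site twice.

```python
import random
from collections import defaultdict


def draw_with_replacement_sites(sites, per_stratum, can_participate, seed):
    """Pick up to `per_stratum` participating sites from each stratum.

    Walking a randomly ordered list of a stratum's sites and skipping those
    that cannot participate is equivalent to drawing a random sample and
    then randomly replacing each site that drops out.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for site in sites:
        by_stratum[site["stratum"]].append(site)

    selected = []
    for stratum in sorted(by_stratum):
        pool = list(by_stratum[stratum])
        rng.shuffle(pool)  # random draw order within the stratum
        accepted = [s for s in pool if can_participate(s)]
        selected.extend(accepted[:per_stratum])
    return selected
```

If sites that decline differ systematically from those that accept, the resulting sample is no longer strictly representative, which is one of the tradeoffs among the three options that sampling statisticians would need to weigh.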
The Committee recommends that the Department, in the development
of the detailed research design, consult with sampling statisticians
to gather additional information such as the number of sites that
should be in the study or studies and the specifics of various
sampling approaches.
Recommendation 9
The Committee discussed the option of using quasi-experimental
studies to supplement the information from the random assignment
study. This option should be more fully developed and reviewed
by the Department during development of the detailed research
design.
While the Committee does not believe that the research design
should rely solely on a quasi-experimental study because of its
limitations in answering the impact questions, some members of
the Committee believe such a study should be carried out as a
complement to the randomized study. Committee members discussed
the potential of a quasi-experimental design to enhance the goal
of evaluating the national impact of Head Start, particularly
if the sample of sites where random assignment of children was
implemented proved unrepresentative. Quasi-experimental
designs do not require randomly assigning subjects to control
and experimental groups and instead study differences in outcomes
for naturally-occurring treatment and non-treatment groups. Even
though quasi-experimental designs may be necessary, the Committee
urges the Department to allocate as large a share of the funds
as possible to the experimental study or studies to ensure rigor
by increasing the number of participating sites and families.
As the Department develops these options further, the Committee
urges the Department to consider the most effective ways to link
the impact research with ongoing efforts, such as the ECLS-B,
ECLS-K, or FACES studies. There may be opportunities in sites
where randomization takes place to include a control group consisting
of children randomly assigned and a second control group of children
who would participate in a quasi-experimental component of the
research. The two types of control groups within the same sites
would provide an opportunity to calibrate the results of the quasi-experiment
against the randomized experiment.
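A minimal sketch of such a calibration appears below. It assumes a simple difference in mean outcome scores as the impact estimate for both designs; a real analysis would adjust for covariates and report uncertainty. The function names are hypothetical.

```python
from statistics import mean


def impact_estimate(head_start_scores, comparison_scores):
    """Difference in mean outcome between Head Start and a comparison group."""
    return mean(head_start_scores) - mean(comparison_scores)


def calibration_gap(head_start, randomized_control, quasi_comparison):
    """Compare the quasi-experimental estimate with the experimental one.

    The gap is the amount by which the quasi-experimental comparison
    departs from the randomized benchmark at the same site.
    """
    experimental = impact_estimate(head_start, randomized_control)
    quasi = impact_estimate(head_start, quasi_comparison)
    return {
        "experimental": experimental,
        "quasi_experimental": quasi,
        "gap": quasi - experimental,
    }
```

A gap near zero across sites would lend support to using the quasi-experimental approach at sites where randomization was not feasible, while a large or variable gap would argue for caution.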
Recommendation 10
The Committee believes that it is critical to draw on information
from Head Start's extensive existing research agenda to complement
the information gained from the random assignment impact study
or studies. Thus, the Committee believes that the impact research
proposed here should be a part of a rich and active Head Start
research agenda, not a substitute for it. As such, the Department
should ensure that the research and findings from the impact study
or studies are used in combination with the rest of the Head Start
research effort to improve the effectiveness of Head Start programs
for children and families.
Members emphasized that many other parts of Head Start's ongoing
research agenda are critical to improving the quality of Head
Start and other early childhood programs and ensuring better outcomes
for children. As the Department allocates resources, the Committee
believes that the Department should ensure that the impact research
is complemented by a rich array of other studies that focus on
quality improvement and results measurement, program variation,
and the needs of particular populations of children. This overall
agenda should provide information to local Head Start programs,
policymakers, researchers, and the early childhood field about
how early childhood programs, and in particular Head Start, can
most effectively support the development of young children. As
noted in the review of ongoing Head Start research efforts in
Chapter III, a number of national data collection efforts could
contribute to this comprehensive approach to assessing the impact
and improving the quality of Head Start. For example, the kindergarten
and birth cohorts of the ECLS and efforts to continue the FACES
research strategy are important potential resources to consider
to inform key questions related to the impact of Head Start. Thus,
the framework for impact research as outlined above is presented
by the Committee with the condition that other continuing and
new research be supported that will provide information about
the link between quality and outcomes; the relative value of program
enhancements (e.g., expanded literacy efforts, two years of Head
Start, full-day services); and information about services for
special populations.
In addition, some members were particularly concerned that the
impact research envisioned in this framework would not provide
sufficient opportunity to compare different options within Head
Start, particularly options that are becoming an increasing part
of Head Start's programmatic repertoire. These members asked that
the Department pay particular attention in designing its research
agenda to the Option II design described in Appendix B as "Random
Assignment of Sites to Traditional Head Start and an Enhanced
Head Start." This option would allow for the study of planned
variation of program features or strategies in different Head
Start locations, so that different program approaches could be
compared directly. It is possible that oversampling as part of
an experimental study or set of studies could also enable researchers
to compare the different programmatic options within Head Start.
Finally, other members noted the importance of research that
would address the costs and benefits of Head Start and other early
childhood programs. These members urged the Department to begin
a planning effort in this area as part of the continuing research
agenda.
Overall, however, the Committee members emphasized the need to
use the information gathered as a result of the impact study or
studies to inform the field so that the Head Start program can
continuously improve its practices to provide an effective, high
quality early childhood experience for children from low-income
families.
Recommendation 11
Based on the key parameters of its recommended design, the Committee
notes that it will not be possible to meet the expected deadline
of September 30, 2003, for a final report.
Because the statute and the Committee recommendations stipulate
the collection and analysis of data on children through the end
of first grade, all of the design options considered by the Committee
would lead to a final report no earlier than the year 2006. The
Committee urges the Department to make every effort to ensure
that the report is completed by no later than 2006. In view of
this expected schedule for reporting on the new impact study or
studies, it is particularly important for the Department to report
findings from other ongoing research efforts, as discussed in
Recommendation 10, in formats and at times that are most useful
to policymakers.