Is it feasible to design an impact study (or set of studies)
where the programs the control group children attend and the services
they receive are not affected by Head Start?
The assumption behind an experimental design is that the control
group is not influenced by the treatment. If the control group is
affected, the study results will be "contaminated" and will likely
underestimate the impact of the program. In other words, contamination
exists when children in the control group receive all or part of
the Head Start program services. Committee members discussed this
issue extensively, drawing on the limited available evidence about
the programs and services available to low-income children not in
Head Start. Given the limits of the evidence, Committee members
reached different conclusions about the gravity of the threat to
internal validity and credibility of the research posed by this
issue. As noted in Recommendation 2 in Chapter IV, the Committee
believes that careful documentation of the experiences of control
group children is critical to assessing the extent of this problem
and interpreting the research data in light of it.
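The way contamination shrinks the measured program-control difference can be illustrated with a small arithmetic sketch. The function name, the linear contamination model, and all numbers below are hypothetical assumptions chosen for illustration; they are not drawn from the Committee's evidence.

```python
def measured_impact(true_effect, share_contaminated, fraction_of_benefit):
    """Program-control difference under a simple linear contamination model.

    true_effect: gain of a Head Start child over a completely unaffected child.
    share_contaminated: share of control children receiving some services.
    fraction_of_benefit: portion of the full benefit those children receive.
    All three inputs are illustrative assumptions, not estimates.
    """
    if not (0 <= share_contaminated <= 1 and 0 <= fraction_of_benefit <= 1):
        raise ValueError("shares must be between 0 and 1")
    # Average gain that leaks into the control group.
    control_gain = true_effect * share_contaminated * fraction_of_benefit
    # The study observes only the difference between the two groups.
    return true_effect - control_gain


# Hypothetical case: a true effect of 10 points, with 40 percent of the
# control group receiving half of the program's benefit.
print(measured_impact(10.0, 0.4, 0.5))
```

Under these assumptions the study would report less than the true effect, and the shortfall grows with either the share of control children affected or the portion of the benefit they receive.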
During the course of this discussion, Committee members identified
ways in which Head Start programs potentially influence the care
of children not enrolled in Head Start.
-
Head Start is not only an individual- or family-level intervention,
but a community-level intervention as well. The Head Start Performance
Standards call for programs to orchestrate community partnerships
that reach non-Head Start children and families. As such, the
program often influences the services and supports provided
by other local child development programs. For example, some
Head Start programs extend their training sessions to local
child care providers. Thus, the child care settings of comparison
children could be of high quality because of their partnerships
with Head Start or because their teachers were trained by, or
were past employees of, Head Start.
-
Head Start also seeks to support parents. If Head Start programs
are working effectively, parents will become more effective
advocates for their children in the community, thus potentially
improving the quality of services for other children as well.
-
In some places, early childhood programs that are aspiring
to excellence are adopting or adapting the Head Start Performance
Standards in addition to other accreditation systems or best
practice guidelines.
-
Within families, effective Head Start programs will also help
parents develop practices that support their role as their child's
first teacher. These practices not only benefit the current
child enrolled in the program, but they also benefit other children
in the family, so that if control group children have siblings
who have been in Head Start, their own experience may be affected.
Some members of the Committee believe that these influences, along
with the congruity in program design between Head Start and many
state prekindergarten programs, have a widespread effect on the
experiences of control group children. Others believe that the effects
are much more limited. Members in the first group also point out
that the better and more effective a Head Start program is, the
greater the contamination could be and the smaller the impact measured
by the research (the program-control group difference) is likely
to be. Members in the second group point out that, despite these
influences, existing research suggests that the child care settings
experienced by low-income children not in Head Start can be of lower
quality, and that a body of research evidence points to the difficulty
of disseminating innovations.
However, despite the difficulty of determining from the available
information how serious the concern is, all members of the Committee
recommend several strategies that are incorporated in the design
framework described in Chapter IV and that would minimize so far
as possible the threats to the research from this source. These
strategies include the following:
-
Assess the quality of both Head Start and control group settings
in order to understand control group settings and how they may
have been influenced; and
-
Avoid situations where there is the greatest degree of contamination
of the control group setting (specifically, child care programs
that are in partnership with Head Start, blend funds, and/or
have adopted Head Start Program Performance Standards).
A related, but not identical issue that received more limited attention
from the Committee is the issue of care received outside the Head
Start program by children in the treatment group. In particular,
children may be in other child care settings for parts of the day
or year when they are not in Head Start, which further complicates
the determination of effects specifically related to the Head Start
portion of their care. The Committee would address this issue through
the fullest possible documentation of the nature and quality of
the services received by both Head Start and control group children.
What are the challenges for the impact study (or studies) to
gather detailed information on the control/comparison group children,
including the type, intensity, and quality of care these children
receive?
As noted above, in order to address potential threats to the quality
and credibility of the research, the Committee believes it is extremely
important that the same data be collected on children whether assigned
to Head Start or control or comparison groups. It is critically
important to understand the type, intensity, and quality of care
that the control group or comparison children receive in order to
draw accurate findings about the impact of Head Start versus other
child care and education options. It is also important to understand
the quality of the care settings that Head Start children are in
when they are not in Head Start.
But the Committee members recognize that collecting information
on these non-Head Start settings, both for the Head Start children
and the control or comparison group, is a challenge and will require
significant planning and coordination to ensure that as many local
programs and providers as possible are willing to participate in
the study or studies. There may be substantial barriers to securing the
agreement of non-Head Start providers (including child care centers, family
home providers such as neighbors and friends, and children's relatives) to
have their practices and care environments described and documented.
Thus, any design or set of designs selected for studying impact
must pay careful attention to how researchers will gain entry to
alternative care settings and what types of data will need to be
collected in these settings. Researchers should expect to oversample
the control group in order to account for higher rates of refusal
in these alternative care settings.
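The oversampling arithmetic can be sketched as follows. The function name and the example figures are hypothetical; the actual refusal rate would have to be estimated during the detailed design.

```python
import math


def initial_sample_size(target_completed, expected_refusal_rate):
    """Number of control group settings to approach so that, after refusals,
    roughly `target_completed` settings remain in the study.

    expected_refusal_rate is an assumed planning value between 0 and 1.
    """
    if not 0 <= expected_refusal_rate < 1:
        raise ValueError("refusal rate must be at least 0 and below 1")
    # Divide by the expected cooperation rate and round up to a whole setting.
    return math.ceil(target_completed / (1 - expected_refusal_rate))


# Hypothetical planning case: 300 observed settings are wanted and one in
# four providers is expected to refuse.
print(initial_sample_size(300, 0.25))
```

The same calculation can be repeated separately for each type of care setting if refusal rates are expected to differ among centers, family home providers, and relatives.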
One possibility raised for the Department to consider in the detailed
design is that an initial general survey could be conducted of all
parents and caregivers of control group children. In addition, the
settings of a smaller random subsample of the control group children
would then be observed with more intensive measures like those being
used to study the settings of the children served by Head Start.
This would allow testing of the validity of the more general survey
responses against the more intensive measures.
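A minimal sketch of this two-stage approach is given below: a random subsample is drawn from the surveyed control group, and survey-based quality scores are compared with observed scores for the same children. The function names, the quality scale, and the data are all hypothetical, and a simple correlation is only one of several ways agreement could be assessed.

```python
import math
import random


def draw_validation_subsample(child_ids, size, seed=0):
    """Pick a simple random subsample of control group children to observe."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return rng.sample(list(child_ids), size)


def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical survey-based quality scores, keyed by child identifier.
survey_quality = {1: 3.0, 2: 4.5, 3: 2.0, 4: 5.0, 5: 3.5, 6: 4.0}
subsample = draw_validation_subsample(survey_quality, size=3)

# Observed ratings would come from site visits. Here they are made up as the
# survey score minus a constant, so agreement is perfect by construction.
observed_quality = {cid: survey_quality[cid] - 0.5 for cid in subsample}

agreement = pearson_correlation(
    [survey_quality[cid] for cid in subsample],
    [observed_quality[cid] for cid in subsample],
)
print(round(agreement, 3))
```

A low correlation in real data would suggest that the general survey is a weak proxy for observed quality and that its results should be interpreted with caution.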
What are the strengths and weaknesses of alternative strategies
for establishing causality, other than random assignment of children?
In response to the concerns described above, the Committee considered
two primary alternatives to random assignment of children within
a site:
-
A design option which randomly assigned sites to Head Start
as it is now or to Head Start enhancements. As described more
fully under Option II in Appendix B, this option was offered
as an approach to solving the problem of contamination through
a rigorous experimental design to compare the effects of the
basic Head Start model with program options such as an added
focus on literacy services, implementation of various curriculum
models, or full-day versus part-day program options. A sequence
of such studies would create information about the relative
effects of these different forms of Head Start services.
-
Design options which used quasi-experimental strategies to
compare children receiving Head Start with naturally-occurring
comparison groups (based on existing patterns of parental choice
and access to Head Start, other early childhood programs, or
no formal early childhood or child care participation). These
options were offered to address the ethical and feasibility
problems with random assignment.
The Committee concluded after extensive deliberation that the first
alternative option offers important information to policymakers
and is an attractive part of a full research agenda for Head Start.
However, the Committee would not recommend it as the design for
the impact study directly required by Congress because it answers
a somewhat different (though extremely important) set of research
questions.
The Committee concluded that the quasi-experimental options when
used alone do not permit rigorous enough causal inference to answer
the Congress's question about impact, but the Committee does recommend
that the Department consider whether to use quasi-experimental research
to supplement the overall impact study. Some members believe that,
if executed as planned, experimental research is preferable to quasi-experimental
research. Other members contend it is highly likely that the experimental
research will not be done as planned, especially given low control
group compliance rates such as those experienced in the Quality Research
Centers randomized trials. In that event, quasi-experimental research
may become preferable.
Challenges Related to Generalizing Findings to the National Head Start Program
The second major area where the Committee focused its deliberations
was on the issue of how to generalize from specific study sites
to determine the impact of Head Start as required by the Congress.
For the study to answer the key research question about the impact
of Head Start, the individual sites where the research is carried
out must represent the typical impact of Head Start with the families
it typically serves. If the study is based on only a special or
biased set of programs, conclusions will not be generalizable to
the entire Head Start population.
Two broad approaches, with variants of each, were discussed at
length: seeking to understand impact through a nationally representative
stratified random sample of sites, or seeking to understand impact
through replication of findings at a group of sites that are chosen
to represent the total universe of Head Start programs (the typical
medical research model for establishing impact). In the end, given
the limits of available information, the Committee chose to recommend
a set of criteria that the research sites must meet and several
acceptable options for selecting a set of sites that meet those
criteria. The Committee urges the Department to draw on all available
expertise to further develop and select among these options during
the detailed design and feasibility stages of the project.
The Committee considered specific issues in this area:
-
What do we know about the feasibility of randomly selecting
sites to participate in the random assignment design? What share
of sites is likely to be unable to participate because of saturation
of services within communities or for some other reason? What
are the advantages and disadvantages of alternative approaches
to selecting sites for random assignment experiments?
-
What role should quasi-experimental studies play in assuring
reasonable national representativeness?
-
What are the challenges to addressing questions about the impacts
of variations among Head Start programs within the impact study
or studies?
-
What are the challenges posed by seeking a design that will
be relevant for the future evolution of Head Start? To what
degree should the sampling process include variants of Head
Start that may now be in the minority but that reflect "Head
Start of the future"?
Most of these issues would be categorized by researchers as affecting
the external validity of the research effort (i.e., the extent to
which the findings of the individual research sites reflect the
reality of Head Start across the nation).
What do we know about the feasibility of selecting sites at
random to participate in the random assignment design? What share
of sites is likely to be unable to participate because of saturation
or for some other reason? What are the advantages and disadvantages
of alternative approaches to selecting sites for random assignment
experiments?
In assessing the feasibility of an impact study that provides information
that is generalizable to all Head Start programs, the Committee
discussed two alternative approaches to assessing national impact
that are suggested by the research literature and were proposed
by Committee members.
-
A nationally representative sample of Head Start programs requires
a national stratified random sample of sites. If successfully
achieved, this will lead to estimating the average national
impact of the program. As such, it will be the best possible
estimate of national impact. However, some Head Start programs
cannot assign at random because their communities are already
saturated or for other reasons. If these programs differ from participating
programs in measured ways related to school readiness, then
the national estimate will be biased and must be
adjusted within the limits of prevailing statistical methodologies.
The sampled and unsampled programs may also differ in unmeasured
ways whose effects on national estimates cannot be fully known.
-
Alternatively, a different model of causal generalization, widely
used in the experimental sciences and in quantitative review methods
like meta-analysis, seeks not so much a single national
estimate as an assessment of the robustness of Head Start effects across
a heterogeneous sample of locations. While this procedure
does not guarantee a single unbiased national impact estimate,
it will provide a test of effectiveness across a diverse range
of Head Start programs.
The Committee recommends exploring all options for providing a
national analysis of the impact of Head Start.
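The first approach, a stratified random sample of sites, can be sketched as below. The function name, the stratifying variable, the sampling fraction, and the site list are hypothetical; the actual strata and allocation would be set during the detailed design.

```python
import random
from collections import defaultdict


def stratified_site_sample(sites, stratum_of, fraction, seed=0):
    """Draw the same fraction of sites at random from every stratum.

    sites: list of site records.
    stratum_of: function returning the stratum label for a site.
    fraction: share of each stratum to sample (an assumed planning value).
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    by_stratum = defaultdict(list)
    for site in sites:
        by_stratum[stratum_of(site)].append(site)

    sample = []
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        # Take at least one site per stratum so none is left unrepresented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical universe: 10 urban and 20 rural sites, sampled at 20 percent.
sites = [(f"U{i}", "urban") for i in range(10)]
sites += [(f"R{i}", "rural") for i in range(20)]
chosen = stratified_site_sample(sites, stratum_of=lambda s: s[1], fraction=0.2)
print(len(chosen))
```

Sites that are drawn but cannot participate, for example because their communities are saturated, would need to be tracked separately, since their absence is the source of the potential bias discussed above.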
To assess the feasibility of a nationally representative, stratified
random sample of sites, the Committee spent considerable time in
its deliberations reviewing the existing evidence on the ability
of programs to participate in the study or studies. In particular,
the Committee reviewed:
-
The experience of the Quality Research Centers feasibility
studies in securing participation from local Head Start program
partners;
-
The experience of other national evaluations of social service
programs in identifying local sites that were capable of carrying
out rigorous random assignment research; and
-
The limited evidence available on the extent to which local
communities are saturated (i.e., do not have enough unserved
children to maintain a control group).
Based on this evidence, the Committee discussed several possible
reasons why a nationally representative design could be a challenge.
Because of the limitations of the available evidence, the Committee
did not form a conclusion about the number or percentage of sites
that would be unable to participate. Some members of the Committee
see the inability of sites to participate as a grave concern that
limits the usefulness of a national random sample strategy, while
others believe it is a concern that could be handled within such
a strategy. The Committee identified the following specific issues
from the available evidence:
-
As discussed above, the unserved children who would form the
potential control group may be distributed unequally across
geographical areas and individual Head Start service areas.
Therefore, some locations might not be able to participate due
to lack of sufficient numbers of eligible, unserved children.
If the design were to exclude communities where Head Start eligible
children are largely served in preschool programs that have
been heavily influenced by Head Start or use the Head Start
Performance Standards, this problem could be accentuated.
-
The experience of the Quality Research Centers in evaluating
local Head Start programs suggests that there is considerable
variation in the ability of service delivery sites to participate
effectively in rigorous research designs, and that it is necessary
to select sites that have both the capacity and interest to
do so.
-
Committee members reported similar findings from the experience
of other national evaluations of social service programs.
In addition, the Committee discussed whether a 70 percent participation
rate by sampled sites was an appropriate criterion to use in assessing
the feasibility of a sampling strategy. This criterion was proposed
as one that has general assent as a best practice in the field.
However, some members of the Committee believe that a randomly selected
sample of programs could be the best way to select participants
even if the participation rate in the end is considerably less than
70 percent, because it would lead to the most representative possible
sample. Others believe that this is not the case, because they believe
that participation at lower rates would likely indicate bias in
participation or because they believe there are major time and resource
costs in seeking participation from programs that are unlikely to
be able to participate in the end.
Members of the Committee who argued for a stratified national sample,
randomly selected, believe that this approach will yield the most
representative group of programs, and are hopeful that it will be
possible to improve on the prior record of program participation
through clear national commitment and leadership in the design of
this study, along with appropriate incentives for participation.
Members of the Committee who argued for the alternative, medical research
model pointed to the fact that this has been the standard both in
the medical literature and in past evaluations of social policy
at the national level (such as the national evaluations for Even
Start, JOBS, JTPA, and others). They argued that a focus on replication
in diverse sites allows for strategies to reach the sites that are
not saturated and are capable of participating at much lower cost
in terms of time and financial resources. Some members in both groups
proposed quasi-experimental strategies to fill gaps in the study
resulting from non-participation.
Committee deliberations on these challenges and options led to
recommendations in Chapter IV related to criteria for site selection
(including exclusions), approaches to improving the participation
rate of suitable sites through cooperative national leadership and
the identification and use of appropriate incentives, and the three
potential options for site selection in Recommendation 8.
What role should quasi-experimental efforts play in gaining
national representativeness?
The Committee discussed but did not resolve the role of quasi-experimental
strategies in supplementing the experimental sites in order to improve
the ability of the impact research effort to reflect the whole nation.
At least three different quasi-experimental strategies were discussed:
-
Conducting quasi-experiments at a nationally representative
sample of sites, for example by identifying a community comparison
group in the same location as a national sample of Head Start
programs and studying quality and outcomes for both populations
of children.
-
Conducting quasi-experiments in the sites that are selected
for experimental participation but are unable to participate
(under the nationally representative, stratified random sample
strategy above).
-
Conducting quasi-experiments in sites that are not invited
or do not volunteer for experimental participation (under the
purposive sample strategy above), in order to learn about possible
differences between volunteer and non-volunteer sites.
The Committee was not able to reach a conclusion regarding the
use of quasi-experimental strategies to supplement the experimental
sites. Some Committee members believed that having quasi-experimental
studies as part of the impact research agenda would be important,
especially if acceptable compliance rates for the experimental research
are not achieved. Thus, the Committee recommends that the Department
consider the use of quasi-experimental strategies carefully during
the development of a detailed design. The Committee also recommends
that, if the Department chooses to implement a quasi-experimental
component of the design, the component be modest in cost.
Among the most important issues to consider in making this design
decision are the relationship of the quasi-experimental strategy
to ongoing research such as FACES; the best strategies for using
information from the quasi-experiment to complement information
from the experimental sites; the cost of quasi-experiments in relation
to their benefit; and the available strategies for identifying community
comparison groups (particularly in saturated locations).
What are the challenges to addressing questions about the impacts
of variations among Head Start programs within the impact study?
All Head Start programs share a common philosophy, provide core
services, and are required to meet Performance Standards, as explained
earlier. Beyond this, local programs are free to vary their practices
and approaches. For example, some programs serve only 4-year-olds
and others serve 3- and 4-year-olds; some programs operate part-day,
part-year and others operate full-day, full-year; and programs are
operated by a range of community-based organizations including but
not limited to Community Action Agencies and public school systems.
Committee members highlighted this program variability as a key
methodological challenge and believe that these dimensions could
be related to important variations in impact. Therefore, the Committee
urges the Department to develop a research design that documents
fully the variability that exists in the sample and takes that variability
into account in the analysis as fully as possible. Some members
of the Committee also believe, as noted below, that the Department
should consider whether additional studies are needed to assess
the variations in impact of different program designs within Head
Start.
What are the challenges posed by seeking a design that will
produce findings with maximum relevance for the future of Head Start
and other early childhood programs? To what degree should the program
selection process include forms of Head Start that may increase
in quantity and significance as Head Start programs continue to
evolve to meet family and community needs?
Part of the criticism of early studies of Head Start was that by
the time the information became available, the findings were no
longer relevant to the program. Thus, the Committee discussed whether
the impact study or studies should include analysis of the newest
versions of the program (e.g., father involvement initiatives; two
years versus one year of services; full-day, full-year services)
in order to determine how these permutations differentially influence
outcomes for children and families. Such an approach could provide
insight into how Head Start and other early childhood programs implement
variations on the Head Start model that are responsive to the needs
of children and families as changing demographics, work requirements,
and other social and economic factors alter the resources and social
supports available in communities. The Committee also discussed
whether the "Head Start of the Future" approach could lead to an
impact study or studies which went beyond a focus on Head Start
alone, to look at the combination of different forms or levels of
investment across Head Start, child care, and prekindergarten. Understanding
how communities are able to blend these programs would be useful
information for policymakers and administrators.
In the end, the Committee concluded that given limited resources
and the framework for the research questions identified by the Congress,
these other important questions will be only very partially addressed
by the impact research. The first issue, the effect of emerging
program designs within Head Start, can only be addressed to the
extent that there is oversampling of programs with those characteristics.
The second issue, the impact of community-wide strategies, will
not be addressed by this design, although the documentation of the
experiences of control group children as well as Head Start children
may provide useful background information for future study designs.
Because the first research question, in particular, is so important
to the future of Head Start, the Committee recommends that the
impact study or studies be embedded in a rich overall research
agenda for Head Start, including attention to program variation.
Some members of the Committee would specifically urge the Department
to pay particular attention in designing its research agenda to
Option II in Appendix B, which would allow for a systematic approach
to studying program variation in Head Start, so that different program
strategies could be compared directly.