Introduction
In May 2002, J.E. Stone, Ed.D., of the College of Education at
East Tennessee State University, produced a seven-page paper entitled
"The Value-Added Achievement Gains of NBPTS-Certified Teachers
in Tennessee: A Brief Report." The Education Commission of
the States asked four scholars to review the study. This synthesis
summarizes the comments of Dominic Brewer, Susan Fuhrman, Robert
Linn and Ana Maria Villegas.
Research Problem
The Stone study addresses the question of whether Tennessee's
teachers certified by the National Board for Professional Teaching
Standards (NBPTS or Board) "are exceptionally successful
in improving the achievement scores of their students." The
author uses the scores of teachers in the Tennessee Value-Added
Assessment System (TVAAS) database and defines "exceptional"
teaching as that which "brings about an improvement in student
achievement equal to 115% of one year's academic growth in the
local school system." That standard is used by the state
of Tennessee and is the same standard used to identify "high
performing" teachers in a new Chattanooga, Tennessee, incentive
program. Stone finds no NBPTS teachers meeting the standard in
all the required subjects and over the three years required by
the Chattanooga system and therefore questions the effectiveness
of NBPTS' system.
The reviewers agree that the problem Stone addresses, the relationship
between student achievement gains and teacher certification
by the National Board for Professional Teaching Standards, is
important. As one reviewer noted, "the paper is one
of the very few that actually attempts to link student achievement
to National Board Certification." Previous studies have shown
that teachers value the certification process as an important
professional development experience (Kelley and Gardner 2002)
and that certification has been related to quality teaching. For
example, Bond et al. (2000) found that the students of Board-certified
teachers showed deeper understanding of teacher-designed units
than the students of teachers who sought certification but did
not get it (as cited in Borman 2002). But these studies have not
focused on the link to student achievement. At least one reviewer
also noted that these previous studies have not been the subject
of the kind of scrutiny being applied to the Stone study.
In part, the absence of studies focusing on the consequences
of NBPTS for student learning relates to the newness of Board
certification; only recently are there sufficient numbers of certified
teachers to support such research. In part, the absence of such
studies relates to the Board's own approach in identifying excellent
teachers: examining their practices rather than the learning
of their students. Reviewers believe that, regardless of whether
Board certification takes student achievement into account in
its own processes, policymakers are supporting Board activities
in the hope of improved student performance, and they want to
know whether certification is related to student learning.
Subjects and Sample
The study sample consists of 16 of the 40 teachers in Tennessee
who have received NBPTS certification; it is not clear how the
16 teachers were selected. Reviewers presume that the other 24
teachers are not in grades 3-8, where students are tested annually
under the Tennessee Value-Added Assessment System.
Stone's failure to explain how the 16 teachers were selected
from the total "n" of 40 is just one example of the
absence of any descriptive data about the subjects. Reviewers
would have liked the following information about the sample teachers:
1. Demographic information
2. Educational background
3. Experience
4. Year and type of Board certification
5. Demographic and educational characteristics of their students
6. How they compare (with respect to items 1-5) to the total number
of Tennessee Board Certified Teachers (as just mentioned)
7. How they compare (with respect to items 1-5) to teachers who
sought but did not receive Board Certification
8. How they compare to other teachers (with respect to items 1-3
and 5 if applicable) in their school systems, their schools, grades
and the state overall.
Without such descriptive information, readers cannot judge how
representative the 16 teachers are of all Board Certified Teachers,
of teachers in their school systems or of teachers in the state.
One cannot look for patterns between the teachers' performance
and their preparation/credentials or between their performance
and their students' characteristics. As one reviewer says, "by
not describing the sample adequately, Dr. Stone provides no
way for readers to explore alternative explanations for the reported
findings," and "the non-random nature of the selections
process restricts the generalizability of the findings to the
teachers studied. Technically speaking, this is a descriptive
study of 16 teachers."
Instruments
Stone uses teacher effects data from the Terra Nova test, as
included in Tennessee's value-added analysis, the TVAAS system.
Reviewers understand that Terra Nova is a commonly used commercial
assessment and probably a reasonable choice for measuring student
achievement. However, readers are given no information about the
extent to which the assessment is aligned to Tennessee's academic
standards and thus about whether it is a valid measure of student
learning in that state. Further, some teachers increase student
scores on multiple-choice tests like Terra Nova by narrowly focusing
on the specific knowledge and skills it covers. If teachers recognized
by the NBPTS do not focus so narrowly while other teachers do,
their students may not perform as well as the students of other
teachers.
Reviewers appreciate the sophistication of the value-added analyses
conducted by William Sanders that comprise the TVAAS system. However,
readers are given no explanation of how these scores are derived,
and the Sanders system has been criticized for its secrecy and
absence of outside scrutiny. The model controls for students'
prior achievement, but this may not be enough to assure that all
background factors are irrelevant and that teachers' scores represent
only the value they add. As one reviewer says, "it is not
entirely clear that taking prior achievement into account is all
that is needed to level the playing field for teachers who teach
students who come from different backgrounds and who receive different
amounts of academic support from home during the school year."
Further, teachers' scores or percent gains are calculated relative
to other teachers in their systems. It is not clear that it is
appropriate to compare teachers from different systems.
Procedures
Stone's approach is to examine the percent gains for the NBPTS
teachers in the various tested subjects they teach and for the
various years in which testing data is available for these teachers.
Reviewers note several problems in Stone's approach. First, a
sample of 16 teachers is much too small to support any generalizations.
Furthermore, because teacher scores from year to year are extremely
volatile (as is evident from an examination of Appendix A), the TVAAS
system only considers teachers' reports "official" when
three years of data are available. Such data are not available
for 10 of the teachers included in Stone's study. Therefore, the
true sample includes only 6 teachers, an obviously inadequate
number.
Second, the volatility of teacher scores raises additional issues.
Not only are there significant year-to-year differences, but teacher
scores vary significantly by subject and by school system. "Large
variability suggests that the results are quite unreliable,"
according to one reviewer. Consider the differences among school
systems. Because gains are expressed relative to the local system,
a teacher with a relatively small effect could be cited for a large
percentage gain if the systemwide effect was also relatively small.
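The arithmetic behind this concern can be sketched as follows. The
snippet assumes, as a simplification, that a teacher's percent gain is
the teacher's average student gain divided by the average gain in the
local system. The Stone report does not document the actual TVAAS
computation, so the function name and every number below are
hypothetical and serve only to illustrate the reviewers' point.

```python
# Illustrative only: NOT the documented TVAAS formula.
# Assumption: percent gain = teacher's average gain / system's average gain.

EXCEPTIONAL_THRESHOLD = 115.0  # the 115% standard cited in the Stone study


def percent_of_system_gain(teacher_gain, system_gain):
    """Express a teacher's gain as a percentage of the system's gain."""
    if system_gain <= 0:
        raise ValueError("system_gain must be positive")
    return 100.0 * teacher_gain / system_gain


# Hypothetical scale-score gains for two teachers in different systems.
# Teacher A adds few points, but the local system adds even fewer.
teacher_a = percent_of_system_gain(teacher_gain=6.0, system_gain=5.0)
# Teacher B adds many more points, exactly matching a high-gain system.
teacher_b = percent_of_system_gain(teacher_gain=20.0, system_gain=20.0)

print(f"Teacher A: {teacher_a:.0f}% of system gain")
print(f"Teacher B: {teacher_b:.0f}% of system gain")
```

Under these made-up figures, Teacher A clears the 115% threshold with a
6-point gain while Teacher B falls short with a 20-point gain, which is
why comparing percent gains across systems may mislead.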
Third, examining the scores in different ways leads to different
conclusions. For example, examination of medians and percentile
scores, as opposed to means, shows that the scores at the 75th percentile
come close in language arts and mathematics to the 115% gain
chosen by Stone to represent exemplary performance.
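A small sketch shows how the choice of summary statistic can change
the picture. The percent gains below are invented for illustration and
are not taken from Appendix A of the Stone report; the variable names
are likewise hypothetical. The example uses only Python's standard
statistics module.

```python
import statistics

# Hypothetical percent gains for eight teachers (NOT Stone's data).
percent_gains = [60, 75, 85, 90, 100, 105, 112, 118]

mean_gain = statistics.mean(percent_gains)
median_gain = statistics.median(percent_gains)
# quantiles(n=4) returns the three quartile cut points; the last one
# is the 75th percentile.
q3_gain = statistics.quantiles(percent_gains, n=4)[2]

print(f"mean:            {mean_gain:.1f}%")
print(f"median:          {median_gain:.1f}%")
print(f"75th percentile: {q3_gain:.1f}%")
```

In this invented sample the mean sits well below the 115% standard
while the 75th percentile approaches it, so a reader looking only at
means would draw a bleaker conclusion than one looking at the upper
quartile.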
Finally, because, as noted previously, nothing is known about
teachers in the sample, there is no way to interpret the scores
presented in Appendix A. We don't know if the teacher scores in
the table were gained after Board certification or represent
both pre- and post-certification measures. We don't know the fields of certification
and have no way of interpreting different subject or grade-level
scores. As one reviewer notes, "In other words there is no
way using the methods presented one could make any valid inferences
about cause and effect."
Results and Conclusions
The reviewers are unanimous in asserting that the conclusions
reached by Stone, that "the findings of this study present
a serious challenge to NBPTS' claims" and that "they suggest
that public expenditures on NBPTS certification be suspended,"
are completely unsupported by the study.
These conclusions severely overreach, considering the methodological
limitations identified by reviewers. It would be hazardous enough
to base recommendations about the whole NBPTS system on a study
of teachers in only one state. Relying on the Stone study, which
provides no data about sample teachers that would enable interpretation
and includes an extraordinarily small sample, is untenable.
Stone anticipates some criticism in his own brief report. About
small sample size, he says that other studies of NBPTS with different
findings also have small samples. Reviewers find this response
and Stone's other attempts to rationalize the deficiencies of
the sample inadequate. Other studies do not, as this one does,
focus on the link between NBPTS status and student achievement.
Given the importance of this topic, it is imperative that studies
have adequate sample size and include sufficient information to
interpret findings.