Paper Summary

Exploring Instruments Used in Teacher Education Contexts

Fri, April 14, 8:00 to 9:30am CDT, Hyatt Regency Chicago, Floor: West Tower - Ballroom Level, Regency C

Abstract

Purpose
The purpose of this study is to describe quantitative instruments related to mathematics teacher behavior and affect, both to better understand what currently exists in the field and to identify the validity and reliability evidence that has been published for such instruments. Two research questions guide the study:
1. How many and what types of instruments measuring mathematics teacher behavior and affect exist?
2. What types of validity and reliability evidence are published for these instruments?
Theoretical Framework
The Standards for Educational and Psychological Testing (AERA et al., 1999, 2014) provide clear guidelines regarding measurement validity and reliability. They describe five sources of validity evidence that should be addressed to some degree within a validation argument (see Table 1). At a minimum, sufficient evidence for the sources of validity, along with evidence of reliability, should be collected and shared broadly (AERA et al., 1999, 2014). Unfortunately, “evidence of instrument validity and [emphasis added] reliability is woefully lacking” (Ziebarth et al., 2014, p. 115) in the literature.
Furthermore, the social consequences of use are important to evaluate for validity (Messick, 1995). Examining the relations between social consequences of use and score interpretation provides an opportunity to discuss what is considered valid, and for whom. Thus, identifying sources of validity and reliability evidence of contemporary quantitative instruments is an initial step for interrogating how equity emerges in the research and measurement landscape.
Methods
In our search of the literature, we modified Thunder & Berry’s (2016) steps for a qualitative review: (1) determine a research question, (2) determine search terms, (3) search databases, (4) select relevant studies, (5) assess quality of selected studies, (6) synthesize findings, and (7) report findings (see Table 2 for how we modified these steps).
Results
From the 2,286 articles returned from the initial search, we found 271 different measures of teacher behavior or affect published in the 24 mathematics education journals we reviewed. Of these, the most common were surveys (n = 83) and observation protocols (n = 22) (see Table 3). We found that authors most frequently reported evidence related to test content (n = 40) and internal structure (n = 33). Most authors also reported reliability (n = 80). Few studies reported evidence related to consequences of testing and bias (n = 4) or response processes (n = 6; see Table 4).
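To put the reported counts on a common scale, a minimal sketch follows. It assumes, which the abstract does not state, that each count can be read against the 271 measures; if the counts instead refer to studies or articles, the denominator would differ. The names `TOTAL_MEASURES`, `reported_counts`, and `share_of_measures` are illustrative only.

```python
# Counts copied from the Results paragraph above.
# ASSUMPTION: every count is read against the 271 measures found;
# the abstract does not confirm this denominator.
TOTAL_MEASURES = 271

reported_counts = {
    "surveys": 83,
    "observation protocols": 22,
    "test content evidence": 40,
    "internal structure evidence": 33,
    "reliability evidence": 80,
    "consequences of testing and bias evidence": 4,
    "response processes evidence": 6,
}


def share_of_measures(count, total=TOTAL_MEASURES):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentage share for each reported category, under the assumption above.
shares = {label: share_of_measures(n) for label, n in reported_counts.items()}

# Print categories from largest to smallest share.
for label, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label}: {pct}%")
```

Read this way, the gap between the most and least frequently reported forms of evidence is easier to compare at a glance, but the percentages hold only under the stated assumption about the denominator.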
Discussion
Cultivating equitable education systems in the long term will require policy changes at the local and national levels. Given the continued focus on randomized controlled trials and large quantitative studies as the “gold standard,” mathematics education researchers hoping to influence policy must engage in rigorous quantitative research. Our study indicates that, as a field, we need to work together to create measures that have validity and reliability evidence. Additionally, we found a distressing lack of studies that collected evidence of consequences of testing and bias. To consider equity issues and the social consequences related to use of a measure (Messick, 1995), researchers must investigate this form of validity evidence as well as test content and response processes.

Authors