TMI

Thinking About Mathematics Instruction


TMI is a project in EDC's Center
for the Development of Teaching

Funded by the
National Science Foundation
Grant EHR-0335384

 

© Education Development Center, 2006-2008

Thinking About Mathematics Instruction
Leadership Content Knowledge
Elementary and Middle School Principals’ Survey

Table of Contents

  1. Introduction
  2. Overview and comparison of the Pre- and Post-surveys
  3. Pre-Survey
  4. Post-Survey
  5. Assembling Your Own Mathematics Content Knowledge Section
  6. Coding Schemes and Sample Responses
  7. Validity and Reliability Considerations

Validity and Reliability Considerations

Validity

The following description of how we established validity for our survey is excerpted from A Preliminary Investigation of Elementary and Middle School Principals’ Leadership Content Knowledge for Mathematics*, by Barbara Scott Nelson et al., presented at AERA in Montreal in April 2005.

The validity and reliability of the Mathematics Content Knowledge instrument in our survey had been established by the SII group, but no prior validity work had been conducted for the mathematics epistemology measures. A consultant to our project, engaged to help us think through a variety of validity issues, recommended that we conduct a small number of cognitive interviews to assess the validity of these instruments.

We therefore conducted a small study, using cognitive interviews to understand how respondents formulated responses to our survey questions. In particular, we wanted to explore whether respondents interpreted and understood the survey items as we had intended. The cognitive interviews also allowed us to identify awkward or ambiguous wording of survey items. We used this knowledge to revise survey questions and to better understand the thinking behind participants’ responses. Of particular interest here are the results for the open response section of the survey, as this is the one piece of qualitative data we have in the survey to help us diagnose principals’ beliefs about learning and teaching mathematics.


Sample

Twenty-one principals participated in this mini-study. Since the purpose of the cognitive interview was to see if principals were interpreting items on the survey in the way we had intended, we sought principals who did not have prior knowledge of our work and would not, therefore, impute meaning to the questions based on their prior knowledge of our work.

The participants were not part of the MSPs with which we are working. We obtained our sample through a snowball method, contacting principals we knew (but who had participated only marginally in prior research) and asking them to suggest other colleagues who might be willing to participate. We also contacted several principals’ organizations and other colleagues who work with principals and asked them to suggest possible participants. In this way, we obtained the names of 40 principals who were located in Massachusetts, New York state, and other parts of New England. We contacted all 40. In several cases our email was undeliverable; some principals did not call us back; in other cases the principal was too busy or couldn’t participate within our time frame. Twenty-one principals (53%) agreed to participate. Travel and scheduling difficulties, and occasional technical problems with taping, left us with 13 participants for whom we had both a written response and a recording of the cognitive interview.

The demographic distribution of the principals was not significantly different from that of our Cohort 1 principals. Five were principals in urban districts, three in suburban districts, two worked in small cities, and three led schools in small-town or rural areas. Seven were female, six male. Three had some acquaintance with the ideas we are investigating (having taken one or two sessions of a Lenses on Learning course). Because these early sessions are largely introductory, we did not feel that these principals would have had enough experience with the course to anticipate how we intended the items on the survey to be read. To the best of our knowledge, the remaining ten were not familiar with any aspect of our work.


Data Collection

Cognitive interviews were conducted with principals at their schools, at their convenience. A standard interview protocol was used for all interviews. (For an excerpt from the cognitive interview protocol see Appendix B.) The protocol used two interviewing techniques: concurrent and retrospective methods (DeMaio & Rothgeb, 1996). Concurrent methods are used to understand the response process while respondents are answering a question. Before they began filling out the survey, respondents were instructed to “think out loud” and describe their thoughts while answering the survey questions. The interviewer guided them during this process by reminding them to “tell me what you are thinking” and “say more about that.” We used retrospective methods to probe the response process after a respondent had answered a question. The interviewer asked questions such as “Please paraphrase the instructions or the question,” “What did you think this question meant?” and “How did you come up with this answer?” (See Appendix B for excerpts from the interview protocol for the open response survey questions.) Respondents were asked to complete the epistemology portions of the survey, “thinking aloud” as they did so, and later were asked to respond to particular questions. Interviews were audio-taped and subsequently transcribed. While we conducted the cognitive interviews on the whole of the mathematics epistemology instrument, the analysis reported here is only for the Mathematics Instruction in Context (written open response) section.


Analysis

To assess whether principals’ written responses captured the same level of leadership content knowledge (LCK) as the cognitive interviews did, every respondent’s written and interview responses were read and coded with the Beliefs scoring scheme by two different staff members: for any given respondent, one coded the written response and another coded the interview data. (Some of the coders had also conducted the cognitive interviews; coding assignments were made so that no one coded an interview that she had conducted.) All coding was done blind to the identities of the principals.

After both written and transcript data were coded, we compared respondents’ responses, looking for any differences between the two formats in terms of how respondents articulated their beliefs about mathematics teaching and learning. We specifically looked for evidence that would disconfirm the scoring of the written response and any “value-added” information provided by the interviews.


Results

The open response section asks respondents to read a short scenario describing part of a 4th grade mathematics lesson and respond to three underlined sentences, in each case commenting on what the teacher was doing, whether they thought it was good teaching or not, and why. In the lesson depicted in the scenario, the class was working on division problems. Jason, one of the students, misread a division problem in the textbook, inverting the divisor and the dividend. This prompted the teacher to deviate from the planned lesson and discuss with the class whether it is possible to divide a smaller number by a larger one. There is a good deal of student discussion and student thinking shown in the scenario.

Table 2 presents Beliefs scores for respondents’ written responses. Only one of the 13 respondents received different scores on the cognitive interview (“high 3”) and the written response (“2/3”). For the remaining 12, the interviews added more detail but did not result in changed scores. The detail was primarily an elaboration of points already made. Ten of the 13 respondents also made “value added” remarks that involved new ideas. In all cases these were consistent with the overall perspective on teaching and learning that principals had written about. The consistency seemed to hold regardless of the scoring level. The 13 also clustered toward the middle and lower end of the scoring scheme, which is consistent with earlier findings (Nelson, Benson, and Reed, 2004).

Table 2
TMI Survey scoring levels

Scoring level    1    2    3    4    5
n                0    5    6    2    0
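
The figures reported in this section can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the original analysis: it takes the counts from Table 2 and the single changed score mentioned in the text, and computes the share of respondents whose written and interview scores matched. The variable names are ours.

```python
# Counts of written-response Beliefs scores by scoring level, from Table 2.
table2_counts = {1: 0, 2: 5, 3: 6, 4: 2, 5: 0}

# Total number of respondents with both a written response and an interview.
n_respondents = sum(table2_counts.values())

# The text reports that one respondent scored differently on the interview.
n_changed = 1
agreement_rate = (n_respondents - n_changed) / n_respondents

# Share of respondents at each scoring level.
shares = {level: count / n_respondents for level, count in table2_counts.items()}

print(f"Respondents: {n_respondents}")
print(f"Written/interview agreement: {agreement_rate:.1%}")
for level, share in shares.items():
    print(f"Level {level}: {share:.1%}")
```

With 12 of 13 scores unchanged, the agreement rate is about 92%, and all respondents fall in levels 2 through 4.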


Reliability

Our experience has been that substantial training is necessary to obtain inter-rater reliability when a group of researchers is coding a large body of data. Researchers using the instruments on this site should be sure that they have sufficient inter-rater reliability for their coded results, especially if they plan to compare their results with ours.
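
One common way to quantify inter-rater reliability between two coders is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The source does not say which statistic the project used, so the sketch below is an assumption on our part, and the ratings in it are hypothetical, not project data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per category.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical scores on the 1-5 scale for eight responses (illustration only).
coder_1 = [2, 2, 3, 3, 3, 4, 2, 3]
coder_2 = [2, 3, 3, 3, 3, 4, 2, 2]
kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.3f}")
```

Because the scoring levels are ordered, a weighted kappa that penalizes near misses less than distant ones may be a better fit in practice; the unweighted form is shown here for brevity.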

Our scoring process for the open item, A Classroom Reflection, was the same for both Pedagogy and Math-in-Use (MIU) coding, and the bulk of the open responses were coded simultaneously for these two scales. A group of 10 staff members met for a day to calibrate our use of the Pedagogy and MIU rubrics, using A Classroom Reflection data from our pilot work with principals. Together we examined data that exemplified responses from each category for both rubrics, paying close attention to the features that define each category. Following this, we checked for shared understandings among the coders by independently scoring a different set of responses that reflected a range of categories. We compared scores and discussed data that were scored differently until we reached consensus about them.

Once we were calibrated, staff members were assigned to coding triplets that rotated membership. To the extent possible, pre- and post-test responses were intermixed, and raters were blind to condition (“treatment” or “control”). To stay calibrated throughout the coding period, we held several additional calibration sessions.

After rating the responses independently, coding groups met to compare scores and discuss disagreements. We recorded each group member’s ratings on a coding sheet and noted whether there was agreement on this first round. If not, the group members discussed the reasons for their codes. In most cases, the group reached consensus on the second round. If consensus was not achieved on the second round, the score of the response was decided by an arbitration group. Senior staff members took turns arbitrating scores in groups of 2 or 3.
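
The three-stage decision process described above (first-round agreement, second-round consensus, arbitration) can be sketched as a small function. This is an illustrative outline only: the function name and stage labels are hypothetical, and it assumes that "agreement" and "consensus" both mean that every member of the coding group assigns the same score.

```python
def resolve_score(first_round, second_round=None, arbitrated=None):
    """Return (final_score, stage) for one response.

    first_round  -- independent ratings from the coding group
    second_round -- ratings after discussion, if a second round was needed
    arbitrated   -- score set by the arbitration group, if it was needed
    """
    # Stage 1: all independent ratings already match.
    if len(set(first_round)) == 1:
        return first_round[0], "first-round agreement"
    # Stage 2: the group converged on one score after discussion.
    if second_round is not None and len(set(second_round)) == 1:
        return second_round[0], "second-round consensus"
    # Stage 3: an arbitration group decides the score.
    if arbitrated is not None:
        return arbitrated, "arbitration"
    raise ValueError("ratings disagree and no arbitration score was supplied")


# Hypothetical ratings from one coding triplet (illustration only).
print(resolve_score([3, 3, 3]))
print(resolve_score([2, 3, 3], second_round=[3, 3, 3]))
print(resolve_score([2, 3, 4], second_round=[2, 3, 3], arbitrated=3))
```

Recording the stage alongside the final score makes it possible to report how often each stage was needed, which mirrors the coding sheet described in the text.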

* “Thinking about Mathematics Instruction: A Preliminary Investigation of Elementary and Middle School Principals’ Leadership Content Knowledge for Mathematics.” Barbara Scott Nelson, Lynn T. Goldsmith, Greta Johnson, Kristen E. Reed (Education Development Center, Inc.) and Apriel Hodari (The CNA Corporation). Unpublished paper presented at the Annual Meeting of the American Educational Research Association, April 2005, Montreal.
