Vol. 1, No. 2 - January 1997

Determining Alignment of Expectations and Assessments in Mathematics and Science Education 1

By Norman L. Webb

Many states and school districts are making concerted efforts to boost student achievement in mathematics and science. These are not simple face lifts, but attempts to develop deep, lasting changes in how students learn these critical subjects.

This Brief is intended for those who seek to improve student learning by creating coherent systems of expectations and assessments in states and districts. Other potential audiences are those who study reform, make decisions about reform, and are affected by reform. The intention of this Brief is to help people think more clearly about the concept of alignment, and to help them examine what is required for expectations and assessments to be in alignment.



Why Alignment Is Important

Educators, notably through efforts spearheaded by national professional associations, increasingly recognize the need for major reform in K-12 mathematics and science curricula, and are embracing a vision of ambitious content for all students. Making this vision a reality means encouraging "a far deeper and more dynamic level of instructional decision making" (Baker, Freeman, & Clayton, 1991), something that cannot be done simply by mandating new accountability measures. At the heart of these efforts to make deep changes in instruction is the concept of "alignment." The major elements of an education system must work together to help students achieve higher levels of mathematical and scientific understanding.

Educators increasingly recognize that, if policy elements are not aligned, the system will be fragmented, send mixed messages, and be less effective (CPRE, 1991; Newmann, 1993). 2 For example, the Systemic Initiatives program of the National Science Foundation seeks to help states, districts, and regions establish policies based, in part, on assessments aligned with those goals. Other examples: The U.S. Department of Education’s explanations of the Goals 2000: Educate America Act and the Improving America’s Schools Act (which includes Title I) both say that alignment of curriculum, instruction, professional development, and assessments is a key performance indicator for states, districts, and schools that are striving to meet challenging standards.

As more and more attention is paid to the accountability of education systems, alignment between assessments and expectations for learning becomes not only important, but essential. Just as a schooner’s speed increases when its sails are set properly, alignment among an education system’s policy elements will strengthen that system and improve what the system is able to attain. Alignment is critical to helping an education system articulate and maintain its desired course and intensity. An aligned system is better able to focus its resources and thereby strengthen its capacity for making deep, meaningful changes in instructional decision making and practice. Alignment also serves to keep local policy efforts in sync with larger-scale initiatives. 3

This Brief focuses on alignment between two major elements of education policy:

  • Expectations of what students should know about mathematics and science and what they should be capable of doing with that knowledge. Expectations can be communicated in different ways. Educators can, for example, craft sets of standards or frameworks, ranging from broad vision statements to precise indications of expected performance and recommended instructional practices.
  • Assessments that accurately gauge student achievement in science and mathematics and indicate whether expectations are being achieved. Assessments can be used to formulate policy, monitor policy effects, enforce compliance with policies, demonstrate accountability, make comparisons, monitor progress toward goals, and/or make judgments about the effectiveness of particular programs (Blank, Pechman, & Goldstein, 1996).

There are, of course, many other important elements in any education system, including professional development, instructional materials, college entrance requirements, teacher certification, resource allocations, and state mandates. But this Brief focuses on expectations and assessments because those elements are now of great concern among educators and policymakers, and because those are the elements at the center of most thinking about alignment to date.


Methods of Alignment

Determining alignment between expectations and assessments is difficult for several reasons. To begin with, both expectations and assessments frequently are expressed in several pieces or documents, making it difficult to assemble a complete picture. Also, it is difficult to establish a common language for describing different elements of policy. The same term may have very different meanings when used to define a goal and when used to describe something measured by assessment. Further, the policy environment in an education system can be constantly changing. New goals can be mandated, for example, while old forms of assessment are still in place. Ever-expanding content areas, expanding technology, and a growing body of research on learning also can contribute to the complexity of identifying expectations and assessments.

A review of current practice and relevant literature identifies three major approaches to assuring alignment. These are not the only approaches, however, nor should they be seen as items on a menu to be chosen and then applied in pure form. In most situations, some combination of these approaches is appropriate.

Sequential Development. Policy elements, such as expectations and assessments, are aligned by design. A set of standards, for instance, might be converted directly into specifications for developing an assessment. Once one policy element is established, it becomes the blueprint for subsequent elements. For example, the South Carolina Department of Education (1996) approved standards in a content area that are used to develop academic achievement standards (measurable outcomes), which are then used to develop assessment instruments.

One disadvantage to this approach is the amount of time needed to put a sequentially developed program in place. This approach also ignores a synergism among policy elements: The development of assessments, for example, can provide useful information for thinking about instruction and what students can be expected to learn. Another disadvantage to this approach is that it frequently does not reflect reality: In many states, the process for developing expectations and assessments is not linear or sequential, but more dynamic.

Expert Review. A panel of experts reviews the policy elements and makes some judgment on their alignment. For example, the Oregon Department of Education convened a national panel to look at various issues related to its standards (Roeber, 1996). A subpanel looked at the alignment of the planned assessments and the standards.

The format and formality of this process can vary. In many states the process is an open one, seeking input from committees and community forums of teachers, administrators, parents, and others. Whatever format is used, however, it must include input from content-area specialists, because the comparisons to be made are complex and require sophisticated knowledge about how students learn.

Document Analysis. Alignment can be measured by coding and analyzing the documents that convey the expectations and assessments. A coding system must be developed that specifies the distinctions to be made in describing each document. The documents are then divided into blocks, such as individual standards, which can be described separately using the coding system’s categories. Coders must be trained to independently and validly describe the documents by using the coding categories on the blocks of document information (Schmidt & McKnight, 1995; Porter, 1995). For example, the Third International Mathematics and Science Study successfully trained national teams to perform document analyses comparing curriculum materials with assessments used in the study (McKnight, Britton, Valverde, & Schmidt, 1992).
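One way to check that trained coders are describing the same blocks consistently is to compute an agreement statistic over their codes. The sketch below is only illustrative and is not the procedure used in the studies cited above: the choice of Cohen's kappa, the function name cohens_kappa, and the category labels are all assumptions made for the example.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders over the same blocks."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of blocks")
    n = len(codes_a)
    # Share of blocks on which the two coders chose the same category.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical category for every block.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same six standards.
coder_1 = ["algebra", "geometry", "statistics", "algebra", "geometry", "statistics"]
coder_2 = ["algebra", "geometry", "statistics", "algebra", "statistics", "statistics"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))
```

A value near 1 suggests the coders are applying the coding categories in the same way; a low value suggests the categories or the training need another look before the coded documents are compared.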

These approaches and their interactions raise questions about quality control. Sequential development, for example, frequently is controlled within an agency and therefore is less likely to include any external review. While such reviews add authority, they can’t always be done within the short time lines required by legislative mandates or administrative pressures. The quality of expert review, on the other hand, depends on how qualified the reviewers are, and whether they have the opportunity to interact and build consensus. And the quality of document analysis depends on the validity of the scoring rubric being used, the quality of training, and the reliability of the coders.

Most likely, these approaches will be used in conjunction with each other. One approach will be used to verify another, or two or three approaches will be used together. An expert panel, for example, may use document analysis to judge alignment.



Specific Criteria

These approaches to judging alignment are strengthened by using specific criteria to assure agreement among expectations and assessments. The following criteria were identified through a review of national and state standards and alignment studies. They were adjusted after review by a panel of assessment experts from the National Institute for Science Education and the Council of Chief State School Officers, state curriculum supervisors, and others. It is expected that this set of criteria will evolve as it is used.

The five categories are intended to be a comprehensive set for judging the alignment between expectations and assessments. Each general category and all subcategories are important in ascertaining the coherence of a system, meaning the degree to which assessments and expectations converge to direct and measure student learning. In practice, reaching full agreement between expectations and assessments on all criteria is extremely difficult. Tradeoffs must be made because real constraints exist on any education system, including resources, finances, time, and legal authority. Decision makers must consider potential consequences when deciding what tradeoffs to make among these criteria, or what level of compliance will be acceptable.

Such decisions will hinge on a number of factors. Assessing the depth of content knowledge, for example, can conflict with assessing the breadth of knowledge (these concepts are explained in greater detail below). Given finite resources, it may be difficult to fully explore both. Decision makers will need to choose which criteria are considered more important within a particular context and why, and how those decisions affect the pursuit of alignment.

Because resources are finite, decision makers also will need to think broadly about expectations and assessments. It may be far more reasonable and cost-efficient, for example, to give teachers the responsibility of assessing students’ abilities at reasoning and problem-solving, instead of trying to measure them through new systemwide tests. Whether assessments are carried out at the classroom level, locally, or systemwide, however, the focus must be the same: achieving a high degree of match between what students are expected to know and what information is gathered on their knowledge.

The following criteria 4 are ordered to consider content first, then students, instruction, and finally system concerns.

  1. Content Focus. Expectations and assessments should focus consistently on developing students’ knowledge of mathematics and science. This consistency will be present to the extent expectations and assessments share the following attributes:
    1. Categorical Concurrence. The same categories of content, such as subject headings and their subheadings, appear in each. The level of detail, however, may vary: Standards and frameworks can be statements of general expectations, or they can be more refined descriptions of content. Assessment documents may be still more specific. To be in categorical concurrence with the National Science Education Standards developed by the National Research Council (1996), for example, an assessment would need to represent each of the eight content topics in those standards. Alignment would be even greater if the assessment results were reported by those eight content topics.
    2. Depth of Knowledge Consistency. This can vary on a number of dimensions, including the level of cognitive complexity of the information students should be expected to know, how well they should be able to transfer this knowledge to different contexts, and how much prerequisite knowledge they must have in order to grasp more sophisticated ideas. Expectations and assessments are aligned if they reflect similar requirements on these dimensions. For example, the Curriculum and Evaluation Standards for School Mathematics published by the National Council of Teachers of Mathematics (1989) states that students in grades 9 through 12 should study data analysis and statistics, so that all students can "design a statistical experiment to study a problem, conduct the experiment, and interpret and communicate the outcomes." An assessment system requiring students only to interpret an existing set of data would not be aligned with the depth of knowledge specified in this standard.
    3. Range of Knowledge Correspondence. Expectations and assessments cover a comparable span of topics and ideas within categories. For example, standards published by the Virginia Board of Education (1995) say students should be able to read four different types of maps: bathymetric, geologic, topographic and weather. For an assessment to correspond to that range of knowledge, it would need to measure how well students can interpret information using all four map types.
    4. Structure of Knowledge Comparability. The underlying concepts of science and mathematics, and what it means to "know" these concepts, are in agreement. For example, if standards indicate that students should see mathematics "as an integrated whole" (NCTM, 1989) or "science as inquiry" (NRC, 1996), then the assessment activities should be directed toward those same ends. Both expectations and assessments should embody similar requirements for how students are to draw connections among ideas. Assessment of knowledge only as isolated skills, for example, would not be in full alignment with the national standards.
    5. Balance of Representation. Similar emphasis is given to different content topics, instructional activities, and tasks. The expectations and assessments give comparable emphasis to what students are expected to know, what they should be able to do, and in what contexts they are expected to demonstrate their proficiency. For example, the National Science Education Standards emphasizes different skills at different grade levels: Students in kindergarten through grade 4 should focus on developing observation and description skills, while students in higher grades should work on constructing models that explain visual and physical relationships. An aligned assessment system would need to reflect a similar shift in emphasis. It also would need to include enough different tasks or activities to reflect the same priorities and intentions.
    6. Dispositional Consonance. When expectations include more than learning concepts, procedures, and their applications — such as molding student attitudes and beliefs about science and mathematics — assessments also should support that broader vision. For example, the National Science Education Standards underscores the importance of students becoming self-directed learners. The ability to self-assess understanding is an essential tool for this process. Assessment practices aligned with this goal will include opportunities for students to critique their own work and to explain how work samples provide evidence of understanding. Teachers need to give students opportunities to reflect on their scientific understanding and abilities, so they can begin to internalize the expectation that they can learn science.
  2. Articulation Across Grades and Ages. Students’ knowledge of mathematics and science grows over time. Expectations and assessments should be rooted in a common view of how students develop and how best to help them learn at different developmental stages. This common view should be based on:
    1. Cognitive Soundness Determined by Best Research and Understanding. There has been considerable research on the learning of mathematics and science, which has produced extensive knowledge of how students mature in their understanding of these content areas (Romberg & Carpenter, 1986; Stein, Grover, & Henningsen, 1996). Expectations and assessments should build on this knowledge to develop a sound learning program, and they should do so in ways that are aligned.
    2. Cumulative Growth in Knowledge During Students’ Schooling. Expectations and assessments should be linked by an underlying rationale of mathematics and science as content areas. Although the learning of mathematical and scientific concepts over time doesn’t follow a strict order of steps, students often need to grasp certain concepts and ideas in order to address more advanced ideas. For students to take part in scientific inquiry, for example, they first need to learn to identify questions and concepts that guide scientific investigations, how to design and conduct such investigations, how to use technology and mathematics, how to formulate and revise scientific explanations, and how to recognize and evaluate alternative explanations. Aligned expectations and assessments describe and represent, in complementary fashion, the underlying structure of knowledge students need to develop and how their instructional experiences should be organized.
  3. Equity and Fairness. When expectations are that all students can learn to high standards, aligned assessments must give every student a reasonable opportunity to demonstrate attainment of what is expected. Expectations and assessments that are aligned will demand equally high learning standards for all students, while providing fair means for all students to demonstrate the expected level of learning. The knowledge a student will demonstrate on an assessment can vary by the form of assessment (Baxter, Shavelson, Herman, Brown, & Valadez, 1993). Even a slight variation in the wording of a question can alter performance. Rarely will one form of assessment be capable of producing valid evidence for all students.

    A student’s ability to perform well on an assessment depends on a number of factors in addition to the level of knowledge, including culture, social background, and experiences. Therefore, expectations and assessments will be better aligned, and more equitable, if multiple forms of assessment are used. The challenge becomes developing and maintaining an aligned system with a variety of means of assessment, which function together to reflect more accurately what students know and can do.

    It may be difficult to gauge the alignment of expectations and assessments on these criteria for some time. Consistently low scores on an assessment of a particular learning goal may be the result of many factors, including misplaced expectations, rather than poor instruction or lack of effort by students. Students may be developmentally unprepared to attain a particular expectation, for example, or the structure of the curriculum may keep them from attaining sufficient experiences to learn what is expected. It takes time for patterns to form and be recognized.

  4. Pedagogical Implications. Classroom practice greatly influences what students learn. Expectations and assessments can and should have a strong impact on these practices and should send consistent messages to teachers about appropriate pedagogy.

    Judging the pedagogical implications of expectations and assessments requires more than simple content analysis. Any review must attempt to gauge the likely implications for classroom practice. Meaningful analyses have been done, for example, by directly asking teachers how they interpret expectations and assessments and how their classroom practices fit with them (Romberg, Zarinnia, & Williams, 1990; Cohen, 1990).

    Of course, the true test is what happens in the classroom. For example, educators are now paying increased attention to the importance of involving students in scientific inquiry, hands-on learning, and more "authentic" instruction (Newmann, Secada, & Wehlage, 1995). Assessments that reflect a more passive type of instruction would be less aligned with those expectations. Likewise, expectations that students should perform scientific inquiry through actively constructing ideas and explanations (NRC, 1996) will lack full alignment with assessments that are based solely on an assumption that students have memorized canonical ideas and explanations. Alignment is achieved when the instructional practices and materials implied by expectations, and those implied by assessments, are consistent.

    Critical elements to be considered in judging alignment and its influence on pedagogy include:

    1. Engagement of Students and Effective Classroom Practices. Traditional forms of student assessment, and the constraints imposed by limits on time and other resources, may place an inordinate emphasis on the superficial acquisition of skills and facts. In this way, education systems can gravitate toward readily measured outcomes, instead of more complex but also more desirable outcomes, such as students being able to investigate, create models, or otherwise demonstrate deeper content knowledge.

      Expectations and assessments need to work together to provide consistent messages to teachers, administrators, and others about the goals of learning activities. For example, a preliminary draft of statewide academic standards for Illinois indicates that students should learn and contribute productively both as individuals and as members of groups (Illinois Academic Standards Project, 1996). This is defined in the draft as an important skill, one that will greatly determine the success of students later in life. But if no part of the assessment system produces evidence of whether students are contributing productively as members of groups, then teachers would receive conflicting messages about how much classroom time should be spent having students work in teams.

    2. Use of Technology, Materials, and Tools. The use of technology, materials, and tools is vital to knowing and "doing" mathematics and science today. Students should develop skill and confidence using tools such as calculators and computers in their everyday lives (National Council of Teachers of Mathematics, 1991). Expectations and assessments should send students consistent messages about technology and how it is related to what they are expected to learn. If standards indicate that students should learn to use calculators or computers routinely, for example, then the curriculum should provide adequate opportunity for students to use them in this manner. To be aligned, assessments would allow students to use calculators and computers effectively to derive correct answers.
  5. System Applicability. Although expectations and assessments should seek to encourage high student performance, they also need to form the basis for a program that is realistic and manageable in the real world. The policy elements must be in a form that can be used by teachers and administrators in a day-to-day setting. Also, the public must feel that these elements are credible, and that they are aimed at getting students to learn things about mathematics and science that are important and useful in society.
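Some of the content-focus criteria above, notably categorical concurrence and range of knowledge correspondence, come down to comparing the topics named in the expectations with the topics an assessment actually measures. The following is a minimal sketch of that comparison, using the four map types from the Virginia example. The function name coverage and the list of assessed map types are hypothetical, and a tally like this is no substitute for the expert judgment the criteria call for.

```python
def coverage(expected_topics, assessed_topics):
    """Return the share of expected topics that are assessed, and those missing."""
    expected = set(expected_topics)
    assessed = set(assessed_topics)
    if not expected:
        raise ValueError("at least one expected topic is required")
    share = len(expected & assessed) / len(expected)
    missing = sorted(expected - assessed)
    return share, missing


# Map types named in the standard, as described in the text.
expected_maps = ["bathymetric", "geologic", "topographic", "weather"]
# Hypothetical assessment that only includes tasks on two of the map types.
assessed_maps = ["topographic", "weather"]

share, missing = coverage(expected_maps, assessed_maps)
print(share, missing)
```

Here the hypothetical assessment measures half of the expected range and leaves out bathymetric and geologic maps, so it would not correspond to the range of knowledge in the standard.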

Conclusions

Above all else, when using these criteria to judge the alignment of expectations and assessments in a system, a sense of reality needs to be maintained. Available resources and time, legislative mandates, and other factors will influence how well alignment can be determined and how practical it is to make such determinations.

The alignment of expectations and assessments is a key underlying principle of systemic and standards-based reform. Establishing alignment among policy elements is an early step that improves the potential for realizing significant reform. Those working to build aligned systems should not think too narrowly about the task. The criteria presented here demonstrate that a number of factors can be considered in judging alignment among policy elements. These can be studied in several alternative and potentially complementary ways. 5

In approaching reform, the consideration of alignment cannot come too soon. And just as educators need to remain vigilant to assure that expectations, assessments, and instructional practices are current, they also will need to review the alignment among these major policy elements as new policies are instituted, new administrative rules are imposed, and system needs change.


Norman L. Webb leads the Strategies for Evaluating Systemic Reform project of the National Institute for Science Education. He is a senior research scientist for the Wisconsin Center for Education Research, where he has directed a number of evaluations and assessment development projects.

Very thoughtful reviews of earlier versions of this Brief were provided by Andrew Porter, NISE Co-Director; Robert Linn, University of Colorado; Joan Herman, UCLA National Center for Research on Evaluation, Standards, and Student Testing; and Cathy Seeley of the University of Texas at Austin.


1 This Brief is the result of a collaboration between the National Institute for Science Education and the Council of Chief State School Officers. The CCSSO effort is supported by Grant #9554462 from the National Science Foundation.

2 The type of alignment referred to here is "horizontal alignment," meaning the degree to which standards, frameworks, and assessments work together within an education system. This is different from another critical factor, "vertical alignment," which is the degree to which the elements of an education system are aligned with other forces, such as national standards, public opinion, work force needs, textbook content, classroom instruction, and student outcomes.

3 Alignment is intimately related to the "validity" of tests, but distinctions can be drawn between the two concepts. Alignment refers to how well all policy elements in a system work together to guide instruction and, ultimately, student learning. Validity, on the other hand, refers to the appropriateness of inferences made from information produced by an assessment. For example, the degree to which a test is aligned with a curriculum framework may affect the test’s validity for a single purpose, such as making decisions on the curriculum’s effectiveness. But a test and a curriculum framework that are in alignment will work together to communicate a common understanding of what students are to learn, to provide consistent implications for instruction, and to represent fairness for all students, and will be based on sound principles of cognitive development.

4 A more complete discussion of these criteria, and how they can be used, is available in Webb, N. L., Criteria for Alignment of Frameworks, Standards and Student Assessments for Mathematics and Science Education. This paper is a joint publication by the National Institute for Science Education and the Council of Chief State School Officers. For more information, contact NISE at (608)263-1028 or via the NISE World Wide Web site: http://www.wcer.wisc.edu/nise.

5 The complete paper includes a more detailed description of procedures and scales useful for judging attainment of these criteria.


FOR FURTHER READING

Baker, E. L., Freeman, M., & Clayton, S. (1991). Cognitive assessment of history for large-scale testing. In M. C. Wittrock & E. L. Baker (Eds.), Testing and cognition (pp. 131-153). Englewood Cliffs, NJ: Prentice-Hall.

Baxter, G. P., Shavelson, R. J., Herman, S. J., Brown, K. A., & Valadez, J. R. (1993). Mathematics performance assessment: Technical quality and diverse student impact. Journal for Research in Mathematics Education, 24(3), 190-216.

Blank, R. K., Pechman, E. M., & Goldstein, D. (1996). State mathematics and science standards, frameworks, and student assessments: What is the status of development in the 50 states? Washington, DC: Council of Chief State School Officers.

Cohen, D. K. (1990). A revolution in one classroom: The case of Mrs. Oublier. Educational Evaluation and Policy Analysis, 12(3), 327-345.

Consortium for Policy Research in Education. (1991). Putting the pieces together: Systemic school reform. (CPRE Policy Briefs). New Brunswick, NJ: Eagleton Institute of Politics, Rutgers, The State University of New Jersey.

Illinois Academic Standards Project. (1996). Preliminary draft: Illinois academic standards for public review and comment, English language arts and mathematics, Volume 1, State goals 1-10. Springfield, IL: Author.

McKnight, C., Britton, E. D., Valverde, G. A., & Schmidt, W. H. (1992). Survey of mathematics and science opportunities: Document analysis manual (Research report series No. 42). East Lansing, MI: Third International Mathematics and Science Study, Michigan State University.

National Council of Teachers of Mathematics. (1989). Curriculum and evaluation standards for school mathematics. Reston, VA: Author.

National Council of Teachers of Mathematics. (1991). Professional standards for teaching mathematics. Reston, VA: Author.

National Research Council. (1996). National science education standards. Washington, DC: National Academy Press.

Newmann, F. M. (1993). Beyond common sense in educational restructuring: The issues of content and linkage. Educational Researcher, 22(2), 4-13, 22.

Newmann, F. M., Secada, W. G., & Wehlage, G. G. (1995). A guide to authentic instruction and assessment: Vision, standards, and scoring. Madison, WI: Center on Organization and Restructuring of Schools.

Porter, A. C. (1995). Developing opportunity-to-learn indicators of the content of instruction: Progress report. Madison, WI: Wisconsin Center for Education Research.

Roeber, E. D. (1996). Review of the Oregon content and performance standards. A report of the National Standards Review Team prepared for the Oregon Department of Education. Salem, OR: Oregon Department of Education.

Romberg, T. A., & Carpenter, T. P. (1986). Research on teaching and learning mathematics: Two disciplines of scientific inquiry. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 850-873). New York: Macmillan.

Romberg, T. A., Zarinnia, E. A., & Williams, S. (1990). Mandated school mathematics testing in the United States: A survey of state mathematics supervisors. Madison, WI: National Center for Research in Mathematical Sciences Education.

Schmidt, W. H., & McKnight, C. (1995, Fall). Surveying educational opportunity in mathematics and science: An international perspective. Educational Evaluation and Policy Analysis, 17(3), 337-353.

South Carolina Department of Education. (1996). South Carolina science academic achievement standards (draft). Columbia, SC: Author.

Stein, M. K., Grover, B. W., & Henningsen, M. (1996). Building student capacity for mathematical thinking and reasoning: An analysis of mathematical tasks used in reform classrooms. American Educational Research Journal, 33(2), 455-488.

Virginia Board of Education. (1995). Standards of learning for Virginia public schools. Richmond, VA: Author.


NISE Brief Staff

Co-Directors

Andrew Porter
Terrence Millar
Project Manager Paula White
Editor Leon Lynn
Editorial Consultant Deborah Stewart
Graphic Designer Rhonda Dix


This Brief was supported by a cooperative agreement between the National Science Foundation and the University of Wisconsin-Madison (Cooperative Agreement No. RED-9452971). At UW-Madison, the National Institute for Science Education is housed in the Wisconsin Center for Education Research and is a collaborative effort of the College of Agricultural and Life Sciences, the School of Education, the College of Engineering, and the College of Letters and Science. The collaborative effort also is joined by the National Center for Improving Science Education in Washington, DC. Any opinions, findings or conclusions herein are those of the author(s) and do not necessarily reflect the views of the supporting agencies.

No copyright is claimed on the contents of the NISE Brief. In reproducing articles, please use the following credit: "Reprinted with permission from the NISE Brief, published by the National Institute for Science Education, UW–Madison." If you reprint, please send a copy of the reprint to the NISE.