Sound assessments at any level from the classroom to the boardroom¹:
"Student learning targets" are also called many other things: content standards, benchmarks, learning objectives, outcomes, learning goals, essential academic learning requirements. They all attempt to define clear and appropriate achievement targets for students. The role of assessment then is to align with the targets.
These keys are also shown in Figure 1.1 below. The rest of this section will add a little more detail to each key. Chapter 3 will expand on these ideas with respect to alternative assessment.
| Reference Box:
For further reading on keys to quality assessment at the classroom level, see: Rick Stiggins, 1997, Student Centered Classroom Assessment, pp. 14-17. Prentice-Hall,
For more information on keys to quality assessment at the large-scale level see: Joan Herman, 1996, "Technical Quality Matters," in Robert Blum and Judy Arter (Eds.), Student Performance Assessment in an Era of Restructuring, Section I, Article #7. Association for Supervision and Curriculum Development (ASCD), (800) 933-2723. ISBN 0-87120-267-0. |


Key 1: Clear and Appropriate Learning Targets

The first key to quality with respect to assessment is to have clear and appropriate learning targets. One can't assess something without knowing what one is trying to assess.
For Key 1, we'll tackle three related topics. The first is "content standards." Standards-setting activities have blitzed the country over the past eight years and are intended to define the "appropriate" part of the "clear and appropriate student learning targets" equation. So, educators need to know about them. Secondly, we'll address the "clear" part of the equation. Then, finally, we'll present a couple of ways to categorize learning targets, and discuss why we might want to classify them.
Appropriate Targets/Content Standards. As we've mentioned previously in this chapter, content standards are statements of what should be taught; they specify the "what" of what students should know and be able to do. Content standards come by many names—benchmarks, outcomes, essential academic learning requirements, skills standards, competencies, common curriculum goals, and academic student expectations. Here are some examples:
"Nothing new," we hear you thinking, and in a very real sense you're right. The idea of focusing instruction and assessment on that which is most important and enduring is not new. The hard part is coming to agreement on what is important and enduring. Have you ever had disagreements with your colleagues on the most important things to emphasize in instruction? Expand this a thousand-fold as a nationwide effort, and you get the picture. Some of the efforts to set standards read like soap operas. But this only underscores the necessity of doing it. How can anyone hold students responsible for outcomes whose meaning educators themselves disagree on? We all owe it to our students and ourselves to be crystal clear on our goals and expectations: no surprises and no excuses.
Clear Targets. Learning targets for students not only need to be appropriate, they need to be clear. It's easy to agree on a target like "communicates well." But, what does this mean? What type of communication, in what contexts, for which purposes? The key to effective student learning targets, be they at the national, state, district, or classroom level, is that they are specific enough to enable everyone to share the same understanding of what students need to know and be able to do. When targets are ambiguous, instruction can take students to vastly different places and assessments can be vastly different. The goal here is not to standardize instruction; rather, the goal is to aim at the same learning target even if teachers have different instructional designs.
Learning targets also need to be clear enough so that the persons who find or write assessment items and tasks have the same interpretation of what should be covered as the persons who wrote the target statements. Will future teachers interpret them the same way? Are the interpretations clear enough so that, when the assessment results are used to profile achievement strengths and weaknesses, users will know what to do about it? For example, how would one design an assessment for "communicates well"? The assessment could be anything from writing an essay to observing students on the playground as they informally communicate with their peers.
Targets can be unclear for lots of different reasons. In our experience when trying to design assessments to match ambiguous content standards, we have noted the following sources of confusion:
| Caution...striking a balance between detail and over-restrictiveness can be tricky. We want targets clear enough that we can all agree on what student success looks like, but not so detailed that "these 10 things" are all we mean. Or, even worse, that "these 300 things" are exactly what we mean. |
Rule of thumb: Is the target clear enough that a group of teachers would agree on the range of knowledge, skills and performance implied by the target? Would they agree on what to teach and what to assess?
Tricky? You bet. In fact, Joan Herman and her colleagues state that available evidence suggests that many states' standards currently are not strong enough to support rigorous assessment development. But, we believe that it's the attempt to clarify targets, as much as having final clear targets in place, that makes a difference. In groups we've worked with, the general consensus is that everyone who makes the effort ends up with a much more in-depth understanding of what they are trying to accomplish with students.
Types of Learning Targets. There are a million (well, actually maybe a hundred) different ways to categorize the types of learning targets (achievement goals, outcomes) we've seen for students. But, hold on, you're saying, why would one even want to "categorize" them? Well, this isn't just an outgrowth of compulsiveness on the part of number-crunchers. The process of categorizing helps to do three things. First, it helps folks to thoroughly think through what they want students to know and be able to do (in other words, clarify targets). Second, it helps folks determine if they have a good 'mix' of learning targets. And, finally, it will help folks, later, to choose the appropriate assessment method.
Here are two different "takes" on how to categorize learning targets for students. The first thing to remember about these (or any other) categorization schemes is that they are conveniences made up by someone in order to help people discuss things that are complex. There is no "truth" out there in the universe that "there are five kinds of student outcomes" or that "there are two basic kinds of student outcomes, each of which has two variations." Each scheme has its strengths, weaknesses and interesting aspects. Neither is perfect.
Take One: Bob Marzano. Marzano divides student outcomes into two types: process skills and content/declarative knowledge skills. Both process skills and content knowledge can be simple or complex. Simple process skills are short routines that are applied fairly consistently across situations, such as long division. Complex process skills are those that require many decisions and the integration of many simple process skills, such as writing or critical thinking. Likewise, simple content knowledge relates to things like recall of facts, while complex content knowledge relates to understanding concepts and making generalizations. Figure 1.2 shows examples of each of these types of student outcomes.
| Reference Box:
For more information on Bob Marzano's categorization scheme see: Robert J. Marzano, 1996, "Understanding the Complexities of Setting Performance Standards," in Robert Blum and Judy Arter (Eds.), Student Performance Assessment in an Era of Restructuring, Section I, Article #6. Association for Supervision and Curriculum Development (ASCD), (800) 933-2723. ISBN 0-87120-267-0. |
| Type of Learning Target | Examples |
|---|---|
| Process Skills: Simple | Long division Punctuation, grammar Decoding words |
| Process Skills: Complex | Problem solving Writing Setting up an experiment Critical thinking Group cooperation Lifelong learning Dance |
| Content/Declarative Knowledge: Simple | Recall facts—e.g., dates, places, events |
| Content/Declarative Knowledge: Complex | Concepts—e.g., democracy Generalizations—e.g., "power corrupts" |
Take Two: Rick Stiggins. Rick Stiggins finds that classifying student learning targets into five categories helps teachers find a good mix in instruction and assessment:
| Reference Box:
For further reading on types of targets à la Rick Stiggins, see: Rick Stiggins, 1997, Student Centered Classroom Assessment, Chapter 3. Prentice-Hall, (201) 236-7000. |
| Related Toolkit98 Chapters and Activities:
Activity 1.2—Clear Targets—What Types Are These? Asking people to describe what kind of target they are looking at (knowledge, reasoning, etc.) is an excellent way to begin to tease out differences in what is meant by target statements. Activity 1.6—A Comparison of Multiple-Choice and Alternative Assessment. Comparing different ways to assess a content area provides a means of exploring what it means to know and understand a content area. Activity 1.10—Clear Targets and Appropriate Method—The View From the Classroom. In this activity, participants are asked to self-evaluate the clarity of their learning targets using a rating form. Activity 2.1—Sorting Student Work. Analyzing what makes student work effective is an excellent way of opening up the discussion of what it means, for example, for a student to write well. |
| Caution
Affective targets can be a red flag in some communities. If so, the user can delete any references to the affective domain in this chapter and stick just to the cognitive domain. |
Key 2: Focused and Appropriate Purpose

We'll talk about two things in this section. First, we'll address why educators assess—purposes. In other words, who are the users and uses of assessment results? This will underscore the importance of doing a good job of assessing students. Then, we'll look at "the rest of the story"—how purpose actually affects the way an assessment is designed.
So, why do folks assess student achievement? Who uses the results and what do they use the results for? Well, just about everybody for just about everything, and assessment activity seems to get more intense every day. For example:
Looking at this list reminds one that these are pretty important uses for assessment results and that we all had better be darned sure that our assessments are of good quality.
What would happen, for example, if an assessment gave an inaccurate picture of student achievement? What would happen if what was actually assessed was not really what was thought to be assessed? Or worse yet, what if users were unsure as to what they were assessing, so they didn't know what the results really meant? Is everyone positive that they are accurately assessing the most enduring outcomes for students so that the decisions made can really serve to guide learning?
Looking at the list of users and uses also reminds one of the crucial importance of not only large-scale assessments—those that occur in roughly the same way at roughly the same time across classrooms—but also classroom assessments. After all, which assessments—day-to-day classroom assessments or once-a-year large-scale assessment—most affect the kinds of decisions made by teachers, parents, and students? (We would choose classroom assessment; hopefully readers did too.) What happens if classroom assessments are not well thought-out and executed?
The point here is not to suggest that teachers should be blamed for lapses in their knowledge about classroom assessment. (After all, most teachers never had the opportunity to learn about assessment because most states don't even require an assessment class for certification. And, even in places where an assessment course is required, there is an evolving understanding of what teachers really need to know and be able to do to be good classroom assessors.)
The point is that student assessment is of crucial importance. That's why there is activity on all fronts to improve it, from clearly defining valued student learning targets at the state level and rethinking how to best assess them in large-scale assessment, to changing coursework for pre-service teachers, to assisting teachers to align important learning targets to instruction and assessment in the classroom. One of the efforts to assist teachers to fine-tune classroom assessment practice is this Toolkit98!
Now for the second part of this section—How does purpose affect how educators assess? It's probably obvious that an assessment good for one purpose—for example, providing detailed diagnosis of a student's strengths and weaknesses—is not necessarily best for other purposes—for example, determining the strengths and weaknesses of the school's overall curriculum or whether most students attained the school's grade-level goals for student performance. Thus, a first major decision to make is deciding one's purpose for assessment.
Just consider the differences in uses of information from large-scale and classroom assessments. In general, large-scale assessments require more rigorous evidence of technical quality than do classroom assessments, primarily because important decisions are likely to be based on them and because each usually consists of only a single testing episode. In contrast, for classroom purposes, a teacher has lots of formal and informal evidence upon which to base decisions, and so the results of any single, faulty assessment are not likely to be given undue weight.
| Related Toolkit98 Chapters and Activities:
Activity 1.8—Sam's Story asks participants to judge the assessments they'd trust to give good information for a particular purpose—determining proficiency in math for instructional planning. At the end, it asks whether other purposes (for example, whether a student is working up to potential), might require different assessments. Activity 1.9—Going to School asks participants to think about different designs for performance criteria and which might be most useful for different purposes. Chapter 2 addresses various purposes for classroom assessment and posits design implications. |
As other examples of how purpose can affect assessment design, consider these:
Key 3: Appropriate Methods (Target-Method Match)

The third Key to Quality with respect to assessment is to match targets and purposes to methods. We are not of the opinion that the only good assessment is a performance assessment. Rather, there are times and places for all different forms of assessment. We'll again take two looks at "target-method-match," one from Bob Marzano and one from Rick Stiggins.
Take One: Bob Marzano. Figure 1.3 shows Bob Marzano's scheme for matching targets to methods.
| Type of Skills | Multiple-Choice, Short Response | Performance Assessment |
|---|---|---|
| Process Skills: Simple | Long division Punctuation, grammar Decoding words | |
| Process Skills: Complex | Problem solving Writing Setting up an experiment Critical thinking Group cooperation Lifelong learning Dance | |
| Content/Declarative Knowledge: Simple | Recall facts—e.g., dates, places, events | |
| Content/Declarative Knowledge: Complex | Concepts—e.g., democracy Generalizations—e.g., "power corrupts" | |
The basic premise is "simple target, simple assessment; complex target, complex assessment." Multiple-choice, matching, true/false and short answer are perfectly fine to assess simple processes and simple declarative knowledge. To assess complex procedural skills, Marzano would set up a task that requires students to use the skills in question and develop performance criteria to measure different levels of successful performance.
To assess complex declarative knowledge Bob Marzano recommends:
Take Two: Rick Stiggins. Now, for "target-method match" according to Rick Stiggins. He maintains that, although you can assess most types of student learning targets by most methods, there are some more and less efficient ways to do it. For example, if all you want to know is whether students know their multiplication facts, why design performance assessments? Figure 1.4 shows his recommendations for matching targets to methods. X's denote a good match; O's denote a partial match.
| | Selected Response | Essay | Performance Assessment | Personal/Oral Communication |
|---|---|---|---|---|
| Knowledge Mastery | X | X | O | |
| Reasoning Proficiency | O | X | X | X |
| Skills | X | X | | |
| Products | O | X | X | |
| Dispositions | X | O | O | X |
| Related Toolkit98 Chapters and Activities:
Activity 1.6—A Comparison of Multiple-Choice and Alternative Assessment provides participants an opportunity to compare a traditional multiple-choice assessment with a performance assessment and discuss when each should be used. Activity 1.7—Target-Method Match introduces assessment methods and gives participants practice in matching methods to learning targets. Activity 1.10—Clear Targets and Appropriate Method—The View From the Classroom asks participants to self-evaluate the extent to which they successfully match assessment methods to targets. Chapter 3 covers design options for alternative assessment and includes additional discussions of target-method match. Activity 3.2—Spectrum of Assessment Activity looks at how to "open up" traditional assessment tasks in order to measure more complex outcomes. Activity 3.4—Assessing Learning: The Student's Toolbox demonstrates the relationship between assessment tasks and the student learning being assessed. |
It's simply not efficient to use performance assessment or personal oral communication to assess every knowledge outcome educators have for students. For example, using performance assessments to see whether students know all their multiplication facts could take years. But we could assess instances of ability to multiply in the context of a problem solving performance task.
While it is possible to assess some kinds of student reasoning skills in, say, multiple-choice format, to really see reasoning in action one needs a more complex assessment format. For example, most standardized, norm-referenced tests have questions about such things as fact versus opinion and "what is most likely to happen next." But these are usually assessed out of context as discrete skills. One would need a performance assessment to see how students can use all their reasoning skills together to address an issue, or to see if they know when, for example, they need to identify an opinion.
Knowledge about what it takes to perform skillfully or to produce a product can be assessed in multiple-choice format, but to actually see if a student can do it, one needs a performance assessment. (For example, as we discussed earlier in this chapter, one can assess student knowledge about how to give a good oral presentation through an essay, but if one wants to see if students can apply this knowledge, one has to have students give an oral presentation.)
Selected response questionnaires can tap student dispositions, but so can open-ended questions (essays) and personal communication with students.

Assume that targets are perfectly clear and appropriate, purposes have universal agreement, and the absolutely best way to assess each target has been picked. Is this all? Unfortunately, no. It's still possible, even easy, to execute the plan poorly.
In short, things can go wrong in assessment. Have you ever tried to engage students in an instructional activity and not gotten at all what you expected to? The instructions weren't clear, or there wasn't enough time, or students didn't have all the prerequisite skills, or the activity didn't allow students with different learning styles to do their best.
Well, the same thing can happen in assessment. These "things that go wrong" are called sources of mismeasurement, bias and distortion, or invalidity. The result is that the information from the assessment doesn't mean what we think it means. What happens if the ability to read the instructions interferes with a student's ability to demonstrate math skills? Or, the necessity to write a response interferes with how well a student can demonstrate ability to set up a scientific experiment?
Then these assessments could really measure reading or writing rather than math or science—these are serious potential sources of bias and distortion. Or, what happens if student writing competence is judged from only a single piece of writing? Then the judgment of student writing ability may be biased because not enough samples of student writing were obtained across audiences, purposes, and content to really determine, in general, writing competence.
| Related Toolkit98 Chapters and Activities:
Eliminating potential sources of bias and distortion is discussed in many ways in many places in Toolkit98. Chapter 3 contains an in-depth discussion of various alternative assessment designs and the relative merits of each approach. Much of this discussion relates to issues of potential sources of bias and distortion. Activities that stress the importance of quality include: Activities 1.5—Clapping Hands (potential sources of bias and distortion in performance assessments); 1.8—Sam's Story (the most valid pieces of information for a particular purpose); 1.11—Assessment Standards; 1.12—Assessment Principles (equity); 3.1—Performance Tasks, Keys to Success (characteristics of quality tasks); 3.3—Performance Criteria, Keys to Success (characteristics of quality criteria); 3.6—How to Critique an Assessment (practice critiquing on all aspects of quality); 3.7—Chickens and Pigs (equity); 3.8—Questions About Culture and Assessment (equity); and 3.9—Tagalog Math Problem (equity). |
If these assessments then form the basis for a grade or for certifying competence on a state graduation test, the result would be unfortunate. A grade or certification of competence is only as good as the assessment upon which it is based.
This, then, is the fourth Key to Quality—attending to what might go wrong and fixing it. Since this topic is handled so completely elsewhere in Toolkit98, we'll not cover that ground again here.
¹ These ideas are adapted from the two authors listed in the reference box.