Toolkit98

Prerequisites might include (a) an activity on the rationale for alternative assessment (e.g., Activities 1.1—Changing Assessment Practices..., 1.6—Comparing Multiple-Choice and Alternative Assessment, or 1.12—Assessment Principles); (b) activities that build knowledge of the role of performance criteria in a performance assessment, how to use criteria to score student work, and how to use criteria as instructional tools in the classroom such as Chapter 2 text and Activities 2.1—Sorting Student Work, and 2.5—How Knowledge of Performance Criteria Affect Performance; and (c) one of the activities providing a gentle introduction to quality issues, such as 1.5—Clapping Hands or 1.11—Assessment Principles.
The activity also models the process for teaching criteria to students. When teachers teach criteria they need to define each trait, give some examples, have students practice identifying strong and weak examples of products/performances using the criteria, and have students practice making weak products/performances stronger. While students might be working on, say, criteria for quality oral presentations, participants in this professional development activity are working on criteria for criteria. Thus, in this activity, we define criteria for criteria and identify strong and weak examples of criteria. You can even make this into a little assessment joke because the criteria for criteria have to adhere to their own standards—content/coverage, clarity, and generality.
In addition, choose one of the following:
Option 1: Mathematics
Option 2: Writing
A. (2 minutes) Introduction. Use Overhead A3.3,O1—Performance Criteria Keys Purposes to introduce the activity. The facilitator's notes include two running examples—math and writing. You will likely only be using one of the two in any single setting. Directions that relate to both activities are not boxed. Boxed instructions mean that you need to choose either the math or writing version.
B. (20 minutes) Key 1: Coverage
Ask participants to think of examples from their own experience about how criteria communicate what counts. Think: driver's test, Olympics, what usually gets rewarded in student writing, etc.
|
Math: Show the strong student performance (A3.3,O3—Number of Teeth). Ask, How well did this student communicate what he or she did? Most teachers agree that the student did a good job on this trait. Now put up the Skimpy Criteria (A3.3,O5) and point out the section on communication. Then cover up the trait of communication and ask, What message would be sent if communication was not scored? How might instruction be affected? How might student work be affected? |
Writing: Show the strong student performance (A3.3,O9—Fox) and give participants the Skimpy Writing Criteria (A3.3,O10). Ask, How well did this student use detail to add clarity? Most teachers agree that the student did a good job on this trait. Then cover up the trait of detail and ask, What message would be sent if detail was not scored? How might instruction be affected? How might student work be affected? |
(What other groups have said: There should be a trait for "content." Under delivery one might add hand gestures and posture. Students also need to understand that an oral presentation is organized and presented differently than a written one: sentences are shorter, there is more repetition, and fewer points can be made in the same amount of time. Where should multi-media and visuals go? Where should personal style go?)
Now rate A3.3,O13—Speaking Criteria to Critique on a scale of 1-5 where 1 denotes "lots of holes; coverage is weak" and 5 denotes "not many holes; coverage is good." (These speaking criteria usually get a 3-4 rating on the trait of coverage.)
Participants usually feel that counting grammatical errors is not a very good criterion for good writing. It leaves out a lot that is important. Therefore, the coverage is rated very low. (It anchors the "1" end of the scale.)
C. (20 minutes) Key 2: Detail
|
Math: Refer to the Skimpy Criteria (A3.3,O5). Have participants score the strong and/or weak student performance (A3.3,O3 or A3.3,O4) on the trait of communication. Chances are that rater agreement will be pretty good. Now, pose the dilemma: What if you give feedback to a student using this scale and the student says, "I know I communicated well because my score is high, but I don't know why. Why did I get a good score on communication?" |
Writing: Refer to the Skimpy Criteria (A3.3,O10). Ask participants to score the strong and/or weak student performances (A3.3,O8 or A3.3,O9) on the trait of "personal expression." Chances are that rater agreement will be pretty good. (Fox is very high and Redwoods is very low.) Now pose the dilemma: What if you give feedback to a student using this scale and the student says, "I know I exhibited a sense of personal expression because my score is high, but I don't know why. Why did I get a good score on personal expression?" |
The point to make is that students have a harder time generating another strong response if they don't know what made the previous response strong. Or if the previous response is weak, they have a harder time making the next one better if they don't know what made the current one weak. The point of having clarity and detail in criteria is not necessarily to ensure that two different raters would give the student work the same score (rater agreement) although detail will help here too; rather, detail helps to clearly communicate to students what they need to do to produce quality work.
Also note that criteria can have good content/coverage and still be weak in clarity. For example, either of the Skimpy Criteria covers the important dimensions of performance; it is just not clear what many of the terms mean, or what the difference is between, for example, a '1' and a '2.'
Now ask participants to rate the detailed analytical trait criteria selected from the Appendix A—Sampler on the trait of 'clarity' where '1' is low and '5' is high. (These rubrics get a higher score, usually around '4' for math and '5' for writing.) Ask participants to outline what they would do to make the criteria even clearer.
D. (20 minutes) Key 3: General
|
Math: Illustrate task-specific criteria with A3.3,O6—Task Specific Criteria, Version 1: Name the Graph, and A3.3,O7—Task Specific Criteria, Version 2: Name the Graph. Ask participants whether they could use these criteria to judge student performance on another task (take any other math task from the Sampler). (The answer is "no." This is the essence of task-specific scoring; it applies only to a single task.) Then ask whether the general criteria (selected from the Appendix A—Sampler for this purpose) could be used to judge student performance on Name the Graph. (The answer is "yes." This is the essence of generalized criteria; they apply across tasks.) |
Writing: Illustrate task-specific criteria with A3.3,O12—Task Specific Writing Criteria. Ask participants whether they could use these criteria to judge student performance on another task (say, "write a persuasive essay on school uniforms"). (The answer is "no." This is the essence of task-specific scoring; it applies only to a single task.) Then ask whether the general criteria (selected from the Appendix A—Sampler for this purpose) could be used to judge student performance on Describe a Favorite Place. (The answer is "yes." This is the essence of generalized criteria; they apply across tasks.) |
Advantages of task-specific criteria:
a. Quicker to train raters.
b. High rater agreement right from the start.
c. Therefore, often used in large-scale assessment.
Disadvantages of task-specific criteria:
a. Have to develop new criteria for each task.
b. Makes no sense to show them to students ahead of time because they "give away" the answer.
c. Can't use them for judging the work in portfolios because the content of each student's portfolio can be very different. (This is especially true at the large-scale level.)
d. What happens if a student comes up with a perfectly reasonable strategy or solution, but it isn't one included in the task-specific criteria? This happens frequently in large-scale assessment when raters are going fast and not really thinking about what they are doing. In fact, a personal communication from a major test publisher indicated that this was a major reason they were moving away from task-specific scoring.
e. Related to (d) above: task-specific criteria do not make the rater think, because the thinking has already been done for the rater. The developer of the rubric already thought through, for example, how good problem solving would look on this problem. Likewise, such criteria don't make students think. If a major reason for developing criteria is to use them as a tool for learning in the classroom (e.g., students learn standards of quality for work), then generalized criteria do a better job because we want students to think: to be able to generalize what they learned on one task to make performance on the next task better.
(A sample script for the last point might be: The value of generalized criteria is that they help us to define what "good" looks like so that we can begin to generalize across performances and bring information from past experiences to bear on new experiences. This is especially important for the "hard to define" or "disagreement on what this means" goals such as critical thinking, problem solving, collaborative working, communication skills, etc. If people already agree on the definitions, then you can cut corners. But, people often don't agree on what some skills mean and what it looks like when students do "it" well. We can't hold students accountable for different visions of the same target. It is our moral obligation to define precisely what we mean.)
Advantages of generalized criteria:
a. They help students generalize what quality looks like from one task to the next.
b. It is very beneficial to ask students to help develop them.
Disadvantages of generalized criteria:
a. They take longer to learn (because the rater has to learn how to apply them to a variety of tasks).
b. Rater agreement, at least at first, is lower.
A compromise might be to "mix and match" generalized criteria and task-specific criteria. For example, one might, on a task, overlay generalized criteria for group collaboration and critical thinking, and also have task-specific criteria for specific substeps. Another compromise might be to have students consciously develop task-specific criteria from a generalized rubric. That way they practice the ability to generalize.
E. (20 minutes) Key 4: Analytical Trait Criteria
If the purpose for criteria is using them with students to promote learning, the fourth key to success (A3.3,O2) is analytical trait rather than holistic criteria. Hopefully you have already done activities that illustrate the difference and why it is important (such as Activity 2.1—Sorting Student Work). In a pinch, you might use the following script. (But, telling is never as effective as showing.) You might say, Analytical trait means that more than one dimension of performance is assessed. Holistic means that you make one overall judgment of the quality of the response.
The bottom line is that analytical trait systems are not worth the effort in the classroom if all they are to be used for is putting grades on student papers. If, however, they are used as an instructional methodology—to focus instruction, communicate with students, allow for student self-evaluation, and direct instruction on traits—they are very powerful. In short, purpose dictates design.
F. (10 minutes) Wrap-Up. Make a game out of repeating the mantra for quality criteria—coverage, clarity, generality (and maybe analytical trait if the purpose is instructional). Ask participants to repeat this mantra to each other.
Ask participants to explain to each other why these features of criteria are important for the purpose of classroom use in instruction. (And, if the purpose were different, say, large-scale screening of students, what rubric features would be important?)
Then ask participants to write in their own words what it means to have quality criteria. If a colleague were to come up to them and ask, "I'm thinking about using these criteria in my classroom, what do you think?" what would they look for?
© 2001 Northwest Regional Educational Laboratory