Assessment, ‘Rich Knowledge’, and Philosophy

“Teachers learn to maximise pupil performances considered desirable by examiners regardless of whether such performances manifest the understanding needed for the use and application of knowledge in contexts other than test conditions.”

That’s Andrew Davis (Durham University), who specializes in philosophy of education, in an interview at 3:16 with Richard Marshall. He calls the kind of understanding needed for the use and application of knowledge “rich understanding.”

[detail of student-made art at Avonworth High School, Pittsburgh]

Marshall asks:

You have argued that there’s a fundamental conceptual difficulty with the use of assessment to gauge the effectiveness of schools and teachers on the grounds that no criterion-based assessment system can achieve both reliability and validity. What’s your thinking here?

Davis replies:

It needs, among other things, test questions and tasks amenable to consistent descriptions and consistent verdicts from large numbers of assessors, teachers and examiners. The related achievement descriptions (criteria) should be unambiguous, so not open to interpretation. Such consistency requires closely detailed marking guidelines, lest examiners’ verdicts diverge. Hence, tests with high levels of reliability cannot assess any ‘constructs’ relating to rich knowledge and understanding. For tight prescription of responses only allows for one type of knowledge manifestation, yet ‘richly’ understood knowledge is manifestable in unlimited ways.

It is this ‘rich’ knowledge that is needed by adults, even if we confine our educational aims to the strictly utilitarian kind relating to future employment. So validity and achievement descriptions relating to rich knowledge is what we need, yet they cannot be obtained in an assessment system demanding high levels of reliability.

Moreover, achievements relating to the arts, and indeed any subject where interpretation is at the heart of legitimate verdicts about them, are sidelined. For the consistency of interpretations cannot be forced. There is no one legitimate interpretation of how the First World War came about or of the poem The Waste Land even if there are plenty of silly ones. There is no one informed and valuable interpretation of Mozart’s Magic Flute or Stravinsky’s Rite of Spring even if asserting that the latter expresses the humour of a stand-up performance is a little astray.

Davis’s remarks concern assessment in pre-college settings, but the problem he identifies can also arise in university-level assessments.

Philosophy courses sometimes fulfill undergraduate curricular requirements, such as those tied to learning outcomes concerning “values” or “critical reasoning”, and occasionally (usually when accreditation time comes around) universities have to assess whether students taking the courses are in fact achieving these learning outcomes. This means having students complete assignments on which their performance clearly indicates, to members of (sometimes interdisciplinary) assessment committees, the degree to which they’ve achieved the learning outcomes.

Multiple-choice tests might give us the “tight prescription of responses” needed for “high reliability” but do not seem to be the right instrument for testing the relevant learning outcomes for many philosophy courses, which are often about “rich knowledge” and subjects and skills for which “interpretation is at the heart of legitimate verdicts”. Essay exams are better at capturing rich knowledge and understanding, but are less amenable to “consistent descriptions and consistent verdicts from large numbers of assessors, teachers and examiners”.

Philosophers, how is such assessment done at your school, and how does it fare with the tradeoff Davis describes?

1 Comment
Daniel Bonevac
7 months ago

Three thoughts:

1. How valid and reliable do evaluative measures have to be? The usual correlation among human graders of philosophy essays ranges from .3 to .9. Without knowing the target, it’s hard to know whether it’s possible to hit it.

2. It might be possible to devise valid and reliable tests for necessary conditions of rich understanding, even if not for sufficient conditions. (It would be hard to have a rich understanding of Kant’s thesis that there are synthetic a priori truths without knowing what ‘synthetic’ and ‘a priori’ mean.)

3. The level of the course seems relevant to the question of appropriate measures. I use multiple-choice quizzes, together with short papers, in large introductory courses to encourage students to keep up with the reading and to assess their level of understanding. I assign term papers in junior- and senior-level courses. I think that’s fairly typical at my university, though I probably make more use of automated multiple-choice quizzes than most. So, it’s important to know whether one is seeking to assess basic, introductory-level knowledge or the rich understanding one might expect of a philosophy major.