Len Holmes, The Business School, University of North London (at time of presentation)
Prepared for Ethnomethodology: A Critical Celebration conference, University of Essex, March 2002
The study of educational assessment is generally treated in one of two ways: psychological or sociological. Psychological treatments are essentially internally oriented: they focus mainly on the validity and reliability of assessment regimes, and on the various aspects of those two key concepts. In line with the traditional and dominant empirical-realist orientation of scientific psychology, the emphasis is on discovering facts about students: as Rowntree (1977) puts it, to assess someone is
"[to] some extent or other [ ] an attempt to know that person." (emphasis in original)
Sociological studies tend to be concerned with the role of assessment (and certification) in the reproduction of societal structures. Again, this perspective tends to be one of fact-discovery: in this case, not facts about individual persons but about relationships between the education system and wider society. The aim of this paper is to explore assessment as fact-production, particularly as a distributed process extending across what might be termed the education-assessment-selection nexus.
The focus here will be on higher-level education and employment, the arenas of higher education in relation to graduate employment and of professional and managerial education and development. The work of Hager and Butler (1996), on two models of assessment, as scientific-measurement and as judgement, will be examined from the perspective of ethnomethodology. Whilst their rejection of assessment as scientific-measurement will be shown to be supported from an ethnomethodological standpoint, the paper will argue that their alternative model is unsustainable. It will be argued that assessment should not be viewed in terms of the activities of individual assessors as 'lone agents' discovering 'facts' about students, but as a distributed process of 'fact production' concerned with issues of 'emergent identity'.
The fact-discovery perspective on assessment is evident in what Hager and Butler (1996) term the 'scientific measurement model' which, they assert, has been characteristic of traditional assessment practices. In contrast, they argue for a 'judgement model' which has emerged in the context of, and is more compatible with, recent educational innovations and initiatives, particularly those which are concerned with preparation for professional practice. Hager and Butler refer explicitly to problem-based learning, education for capability, and portfolio-based performance assessment. Table 1 shows Hager and Butler's comparison of the assessment practices of the scientific-measurement model and the judgemental model. These are clearly recognisable in the practices advocated and adopted within the competence movement for vocational qualifications and in the 'key skills' agenda in higher education (Jessup 1991).
Hager and Butler (op. cit.) assert that the scientific-measurement model underpins almost all assessment instruments at the level which has dominated in professional education. Here the curriculum is constructed as an orderly sequence of knowledge and cognitive, technical and interpersonal skills. However, whilst such assessment may be viewed as necessary, it is rejected by them as insufficient because it lacks congruence with the 'real world of practice'. Two further levels of assessment are considered: performance in simulated or practice domains; and personal competence in the practice domain. The first of these is seen in the use of assessment checklists for a set of procedures, 'that operate at a level to integrate knowledge areas and skill regimes' (p. 371). The final level, 'personal competence', according to Hager and Butler, 'is a characteristic which can only be displayed in the practice setting'. They define competence as 'the ability of a person to fulfil a role effectively':
"It is an ability that encompasses the entire range of demands that make up a complicated role. A competent practitioner is expected to be flexible and versatile, a reflective practitioner, a manager of change who is willing to innovate, and a person who has the attitudes and motivations to act skilfully and ethically. Inherent in this definition is the expectation that the competent practitioner demonstrates the ability to practice over an extended period of time within a wide range of diverse contexts that include uncommon occurrences and contingencies."
(Hager and Butler 1996).
Such a definition and elaboration is compatible with those promulgated by proponents of NVQs, and by advocates of more extensive use of work experience, assessed and accredited, within higher education.
Assessment at this level of competence in the practice domain requires, argue Hager and Butler, the application of the judgemental model. It involves 'the gathering of information which must be interpreted in terms of competencies and standards' (op. cit., p. 372). They argue that although this raises the issue of subjectivity on the part of the assessors, this is not a problem: subjectivity can never be eliminated from the assessment process, nor is it desirable to do so. Whilst judgements of competence depend upon tacit knowledge and the expertise of assessors, this indicates not bias but rather
"the learned, relative standards of the assessor being used as the basis for assessment judgements. Moreover, if standards can be learned they can also be shared and brought into some relationship of uniformity with the subjective standards of others with equivalent knowledge and experience."(ibid.)
On this point, Hager and Butler diverge from the model of assessment which is adopted within the NVQ system. In the next section, we shall consider how an ethnomethodological exploration would support Hager and Butler, against the NVQ approach. However, the judgemental model still carries difficulties; we shall explore those in the subsequent section.
'New' approaches to assessment: the NVQ model
Within the National Vocational Qualifications system, as originally devised, assessment of competence is based on the consideration by the assessor of a 'portfolio' of evidence of performance by the candidate. That evidence is judged against a framework of 'elements of competence', and the set of 'performance criteria' for each element. A single occasion of performance is deemed insufficient to provide evidence of competence. Rather, performance is considered across several occasions, varying in type according to a set of 'range indicators'. For any particular occupation, the framework of elements of competence, grouped into units of competence, and the performance criteria and range indicators for each element, is referred to as the '(occupational) standards'; a single element, with accompanying performance criteria and range indicators, is 'a standard'. So, for the occupation of first-line management, the original framework of standards consisted of 9 units, 26 elements, 163 performance criteria, and 126 range indicators. A candidate for the award of an NVQ at level 4 in management would be required to provide a portfolio of evidence for every element, and every performance criterion for each element, across the range of situations specified. The assessor would be required to
"conclude whether an individual is competent in activities specified in the standard - that is able to perform the stated activity across the range of instances described in the standard and satisfying all of the performance criteria
If doubt remains, further evidence may be required."
(Management Charter Initiative 1991)
Table 2 gives an example of one such 'standard'.
Table 2: Example 'Standard': NVQ level 4 in Management, Element 3.1
The developers of such frameworks of 'standards' are clearly far from Hager and Butler's vision of assessors as engaged in judgement based on their tacit knowledge and expertise. The standards are pre-specified rather than learned, purporting to be categorical rather than relative. They are developed by the application of a method referred to as 'functional analysis', which
"starts with the identification of a key purpose statement which attempts to describe the unique contribution which an industry or occupation makes within the economy "
(Mansfield and Mitchell 1996)
From this 'key purpose statement',
" a number of substatements are generated, by a process of analysis, which represent outcomes which must be achieved to account fully for the key purpose"
(op. cit., 138)
The literature on functional analysis, and on competence and its assessment, which emanated from the key agencies and individuals engaged in the development of NVQs in general, and the management competence framework in particular, is replete with the language of 'research', 'analysis', 'identification', 'evidence', 'inference' which betrays its scientific-measurement perspective. The impressive (?) amount of verbiage produced suggests a deliberate attempt to 'pin down' the criteria upon which valid inference of competence may be made. That is, the evidence may be compared with the criteria so that 'no doubt' remains, otherwise further evidence is required - presumably until no doubt remains.
Garfinkel's 'demonstrations' of common understandings suggest that such an attempt is bound to fail. He describes how students were asked to write down, on the left hand side of a sheet of paper, the actual words spoken by two parties to a conversation, then to write, on the right side, what each party understood they were talking about. The students found the second part difficult, and increasingly so as Garfinkel requested further elaboration, progressively imposing accuracy, clarity and distinctness (Garfinkel 1967 p.26).
If we examine the example of a standard shown in Table 2, we may note the use of the words 'valid' in the first performance criterion, 'appropriate' in the second criterion, 'clearly' in the third, and 'clearly and concisely' in the fourth. But what constitutes 'valid', 'appropriate' etc? Such relative and equivocal terms can only be determined in situ; they are, as Garfinkel emphasises, indexical. This would suggest that Hager and Butler are correct in their view that judgements of competence are made on the basis of the assessors' tacit knowledge and expertise, their 'common understanding' in Garfinkel's terms. This is supported by the findings of the research undertaken by Eraut et al., which concluded that the assumption that detailed specifications and trained assessors would ensure both validity and reliability is untenable (Eraut, Steadman et al. 1996).
The judgemental model proposed by Hager and Butler avoids the problems of the scientific-measurement model underpinning the NVQ approach to assessment. However, it remains problematic in terms of how it construes the nature of performance on which judgements of competence are made. The implicit assumption is that any particular instance of performance is itself a stable datum, about which a judgement has to be made. This assumption is also made within the NVQ approach, so the critique here will apply there also. The assumption is challenged by the longstanding philosophical principle that human behaviour is not amenable to unequivocal determination: the same instance of behaviour may be viewed as 'mere' movement or as intentional action. Moreover, intentional action may be undertaken to perform acts, with social meaning; but there is no one-to-one correspondence between actions and acts (Harré and Secord 1972). An instance of socially salient performance must be interpreted, or construed, as intentional action in the performance of a particular act, for it to be amenable to any form of assessment, whether in educational contexts or in everyday, mundane interaction.
Drawing upon the analysis by Harré and Secord, we may identify two critical standing conditions for such interpretation of activity as performance-of-a-kind. There must be
Figure 1 attempts to illustrate this process of interpretation/ construal of activity as performance-of-a-kind.
The significance of identity in the interpretation/ construal of activity as performance is indicated by Garfinkel's 'breaching' studies, in which he sets his students the task of acting as if they were boarders in their own homes. In a number of cases, other members of the family interpreted the student's behaviour as something different to that which the student attempted to perform. Thus, the 'polite' request (by a boarder to a host) to take a snack from the refrigerator was taken by a student's mother to be an instance of disrespect and insubordination (by a daughter to a parent) (Garfinkel 1967 p.48). In a further experiment, the randomly selected 'yes' and 'no' answers to questions presented by students to what was purported to be a student counsellor were taken by the students to be meaningful and appropriate responses (Garfinkel 1967 p.79-94).
The term 'identity' is, of course, fraught with difficulties and is dealt with in various different ways within social science literature. Here, it is being used as a sensitising concept, to give expression to a range of issues which arise in the study of social ordering. To distinguish the particular focus and orientation of the analysis here, we may use the term 'emergent identity' (Holmes, 2000) to emphasise the relational and negotiated character of identifying as the ongoing process by which individual persons seek to present themselves to others and by which significant others view them. Drawing from a range of perspectives, Jenkins argues that
"an understanding emerges of the 'self' as an ongoing, and in practice simultaneous, synthesis of (internal) self-definition and the (external) definitions of oneself offered by others."
(Jenkins 1996 p.20)
Taking this perspective, we may view emergent identity as the manifestation of the outcome of the claim (or disclaim) by a person on a socially salient identity (ie a particular position within some social arena) and the affirmation or disaffirmation by significant others in that social arena. Such an outcome should be viewed as essentially negotiated and (re)negotiable, fragile and temporary, although it may in practice become relatively stabilised (but never stable). Figure 2 attempts to express the modalities of emergent identity, ie the various possible socially salient outcomes of the claim/disclaim-affirmation/disaffirmation dialectic.
The model shown in Figure 2 may be used as a heuristic for exploring educational, and particularly assessment, processes in terms of trajectories of emergent identity. So an entrant to a process of education would be positioned in zone 1, that of 'indeterminate identity', aspiring to the position shown as zone 4, 'achieved identity'. Along the way, to continue the metaphor of travel implicit in the notion of trajectory, the entrant moves through what is shown as zone X, that of 'under-determined identity'. However, there is the possibility of moving into zone 2, 'failed identity', whereby the claim made on the desired identity is disaffirmed by others. The remaining position, zone 3 ('imposed identity'), is of less concern here but is of relevance in respect of those who are ascribed an identity which they would contest.
A key aspect of this model for the current discussion is that it draws our attention to the process by which any particular modality is jointly accomplished. This may be considered in terms of warranting (Toulmin 1958; Draper 1988; Gergen 1989), the use by interactors of particular modes of explaining and seeking to justify particular actions which are intended to be socially consequential. Warranting may be regarded as a broader, more inclusive term than 'accounting', as used by Scott and Lyman (1968) who base their analysis on Austin's discussion of 'excuses and justifications' (Austin 1961). The expression of claims (and disclaims), and of affirmations (and disaffirmations) may be viewed as forms of warranting; these may take a linguistic form or be expressed in other modes. Not all warrants will be successful in achieving the identity claims by individuals, and not all warrants will be successful in accomplishing the affirmations or disaffirmations by significant others. The specific warrants deployed in any particular situation are a matter for empirical investigation, but certain types of warrants may be found to be of significant effect.
The arena of assessment may, on this analysis, be viewed as a site of identity claim and affirmation/ disaffirmation warranting. This takes us beyond the judgemental model proposed by Hager and Butler (op. cit.): it draws into the field of analysis the active engagement of the persons being assessed, the 'candidates', in the joint production (with the assessors) of what are taken to be facts about those persons. In this respect it is no different from the banal, everyday processes by which, in making sense of the actions of each other, we engage in mundane reasoning (Pollner 1987). It may be explored in the same manner as ethnomethodological studies of other aspects of jointly-accomplished social ordering. We might, for example, explore assessment-related elements of the interactions between teaching staff and students, to examine how assessment, or more fully, the education-assessment nexus, is made understandable in the sense of the patterning of reflexive and accountable actions of those engaged. To take a simple example, students emerging from an examination room typically engage in conversation with each other on what a particular question 'meant' and what would constitute a 'good' answer. This may be explored in a similar fashion to Garfinkel's 'false counsellor' experiment, particularly the use by the candidates of the documentary method of sensemaking.
Once the candidates are brought into the analysis and theorisation of assessment as co-producers of what are taken as facts about them, the assessors themselves are transformed from 'judges', separated from the world in which students engage in the performance which is to be judged and with mysterious powers to 'infer' some metaphysical property (competence) of the student. They become 'ordinary' human beings engaged in fact production, just like the jurors and suicide prevention staff in Garfinkel's studies, like Zimmerman's public welfare caseworkers (Zimmerman 1969), like Cicourel's police officers (Cicourel 1968). Moreover, just as in the just-mentioned settings, the fact production process implicates a range of other actors which, as the actor network literature has shown us, are not restricted to human beings. Viewing fact-production as a distributed process allows us to see assessment as a distributed process whereby emergent identity is constructed.
We may determine a number of elements in this distributed process. A reasonably complete exploration of assessment cannot be limited to the site of specific assessment, by particular individual assessors of particular individual candidates. Even within that site there are other candidates, who may be in a position to compare their assessment results. The possibility of candidates challenging assessors, on the 'fairness' of such results, may give rise to interactions ('conversations') between assessors and candidates in which assessors will be expected to warrant their gradings on jointly understood grounds. Where, as is often the case in institutions of higher education, the assessment regulations exclude the right of appeal by students against 'academic judgement', those regulations themselves form a significant element in the procedures deemed necessary for valid assessment. That is to say, assessment regulations are 'actants' in assessment fact-production.
We must also recognise that, although specified in absolute terms, the prohibition on appeal against 'academic judgement' exists within a framework of procedures and regulations which may be deployed to sustain or undermine the identity of an individual as a 'competent' assessor. Assessment processes extend beyond the candidate-assessor interaction, implicating a range of procedures and regulations which require situated interpretative enactment. The image of the assessor as 'lone agent' implied by much of the literature on assessment ignores the complex web of actants, which include administrative procedures and associated artefacts (results lists, reports, minutes of examination boards, and so on), and the physical architecture of the locations for assessment activity (exam rooms, seating arrangements, schedules etc). In particular, we may note that individual assessors engage with other people who are directly involved with grading decisions (external examiners, moderators, verifiers and so on), and others who have indirect involvement eg in formally declaring results and presenting awards (academic registrars, university secretaries, vice-chancellors etc). In so far as an institution or other awarding body has formal (ie legally endorsed) authority to issue qualifications in its own name, such an internal web of actants can to a certain extent construct a sustained and largely regularised process of fact-production: a graduate of university X is someone who has been so declared through the processes and procedures in place. However, such construction of 'graduateness' would have little social consequence if the matter rested there: the wider social arenas into which graduates proceed, particularly that of employment, bring into play other actants.
Progress of a graduate (from a first degree or from postgraduate management and professional programmes) into employment depends upon the judgements and decisions of recruiters; progress within employment depends upon the judgements and decisions of the graduate's line manager and others (eg human resource personnel). These too are implicated in the fact-production process, determining whether a particular applicant who has been awarded a relevant qualification is to be regarded as 'employable' within the particular organisation. Various procedures may be used in the process of recruitment, eg the use of psychometric tests, interviews and, more extensively, assessment centres. These may be explored in a similar fashion to that of accounting for suicides examined by Garfinkel (1967: 11ff). Recruiters have to warrant their decisions, to demonstrate that they have made a 'correct selection decision' (Silverman and Jones 1973). If, in the face of the 'fact' of a 'correct selection decision', a graduate is seen to be under-performing, the line manager may be required to account for this, possibly by challenging the 'correctness' of the decision. In this respect, we might consider the use of various forms of documentation (certificates, test scores, interview reports, etc) more as 'letters of provenance' in much the same way as these are used in the world of antiques and fine arts.
The issue of graduate employability is of increasing importance in higher education policy, as the participation rate has increased, and continues to do so, whilst the number of 'traditional' graduate jobs has not increased proportionately. Many institutions are thus engaged in developing assessment processes which help to sustain the claim that the award of a degree is based on 'facts' about those to whom the degree is awarded. This now includes new vocabularies of capability and skill, new assessment approaches (portfolios, practical projects, work-based assignments etc). These may be explored from an ethnographic perspective not as 'improvements' to the methods of discovering 'facts' about students but as new processes for fact-production.
On this analysis, assessment may be viewed as a distributed process of fact-production, within the complex education-assessment-employment nexus. This distributed process has social consequences for all parties but particularly for the students who are subject to assessment. The 'facts' so produced concern the emergent identity of the students/graduates, rather than some purported properties they possess (knowledge, skills, competence). Studies of assessment which construe it as fact-discovery, whether in the scientific-measurement model or the judgemental model, are abstracted from the ongoing accomplishment of that arena of the social order which they purport to explain.
The aim of this paper has been to argue for the value of taking an ethnomethodological approach to the exploration of assessment. Taken as a distributed process of fact-production, concerned with emergent identity, such an approach opens up for conceptualisation and empirical analysis within the same frame matters which tend to be separated within conventional renderings. This suggests key areas for empirical studies of how the various parties to the distributed process 'do assessment work'. Perhaps of more importance for those who are 'on the receiving end', the students/ graduates themselves, the approach suggests areas for developing educational practices which enable them to engage more effectively, in respect of warranting their claim on the 'graduate identity'.
Austin, J. L. (1961). Philosophical Papers. London, Oxford University Press.
Cicourel, A. (1968). The Social Organization of Juvenile Justice. New York, Wiley.
Draper, S. (1988). "What's going on in everyday explanation?", in Analysing Everyday Explanation: A Casebook of Methods. C. Antaki. London, Sage.
Eraut, M., S. Steadman, et al. (1996). The Assessment of NVQs. Brighton, University of Sussex Institute of Education.
Garfinkel, H. (1967). Studies in Ethnomethodology. Englewood Cliffs, NJ: Prentice-Hall.
Gergen, K. (1989). "Warranting Voice and the Elaboration of the Self", in Texts of Identity. J. Shotter and K. Gergen. London, Sage: 70-81.
Hager, P. and J. Butler (1996). "Two Models of Educational Assessment.", in Assessment & Evaluation in Higher Education 21(4): 367-378.
Harré, R. and P. Secord (1972). The Explanation of Social Behaviour. Oxford, Blackwell.
Holmes, L. (2000). "What can performance tell us about learning? Explicating a troubled concept.", in European Journal of Work and Organizational Psychology 9(2): 253-266.
Holmes, L. and G. Robinson (1999). "The Making of Black Managers: Unspoken Issues of Identity Formation", presented at International Conference on Critical Management Studies, UMIST, available at http://www.re-skill.org.uk/papers/mbm.htm.
Jenkins, R. (1996). Social Identity. London, Routledge.
Jessup, G. (1991). Outcomes: NVQs and the Emerging Model of Education and Training. London, Falmer Press.
Latour, B. (1987). Science in Action. Milton Keynes, Open University Press.
Management Charter Initiative (1991). Assessment Guidelines. London, Management Charter Initiative.
Mansfield, B. and L. Mitchell (1996). Towards a Competent Workforce. Aldershot, Gower.
Pollner, M. (1987). Mundane Reason: Reality in Everyday and Sociological Reason. Cambridge, Cambridge University Press.
Rowntree, D. (1977). Assessing Students: How Shall We Know Them? London, Harper & Row.
Scott, M. and S. Lyman (1968). "Accounts.", in American Sociological Review 33(December): 46-62.
Silverman, D. and J. Jones (1973). "Getting In: the Managed Accomplishment of 'Correct' Selection Outcomes", in Man and Organization. J. Child. London, George Allen and Unwin.
Toulmin, S. (1958). The Uses of Argument. Cambridge, Cambridge University Press.
Wittgenstein, L. (1953). Philosophical Investigations. Oxford, Blackwell.
Zimmerman, D. (1969). "Record-keeping and the intake process in a welfare agency", in On Record: Files and Dossiers in American Life, S. Wheeler, Russell Sage Foundation: 319-354.