


CHAPTER ONE PERFORMANCE INDICATORS: CONCEPTUAL FRAMEWORK


1.1 LESSONS FROM PREVIOUS ATTEMPTS TO USE PERFORMANCE INDICATORS
1.2 DEFINING AND DEVELOPING PERFORMANCE INDICATORS
1.3. USES AND ABUSES OF PERFORMANCE MEASUREMENT
1.4. EXPERIENCES OF OTHER AGENCIES/COUNTRIES WITH SYSTEMS OF PERFORMANCE INDICATORS
1.5. SPECIFICITY OF EDUCATIONAL SYSTEMS IN DEVELOPING COUNTRIES
1.6. POSSIBLE FRAMEWORKS FOR PERFORMANCE INDICATORS
ANNEX 1A: World Bank (1996) Performance Monitoring Indicators: A Handbook for Task Managers (Operational Policy Department)
ANNEX 1B : Problems of Measurement at the Sectoral Level: Examples of Indicators and their Associated Problems
ANNEX 1C: Collecting Data for Individual Performance Indicators

CHAPTER ONE PERFORMANCE INDICATORS: THE CONCEPTUAL FRAMEWORK

This chapter is organised into six sections. Section one presents a schematic history of performance indicators in developed countries. The definition and development of performance indicators are discussed in section two. Section three examines the uses and abuses of performance indicators. Lessons gained from the experiences of development agencies with educational performance indicators are reviewed in section four. Section five looks at the current educational context in developing countries in relation to DFID's position. Section six proposes an appropriate framework for developing performance indicators.

1.1 LESSONS FROM PREVIOUS ATTEMPTS TO USE PERFORMANCE INDICATORS


1.1.1 The Revised Code in the Nineteenth Century
1.1.2 Earlier International Attempts
1.1.3 The New Managerialism
1.1.4 The Resurgence of Performance Indicators in Education Systems

This section selects examples of educational monitoring from the past 150 years to illustrate crucial points for later development of the theme.

1.1.1 The Revised Code in the Nineteenth Century

In response to Benthamite/utilitarian ideas of efficiency, early Victorian England discovered examinations. From 1846, pupil teachers had been paid following the results of an annual examination. The first Code for elementary schools was introduced in 1860: to obtain a government grant, the school needed to abide by the Code. In 1861 the Newcastle Commission reported on the state of elementary education in England and Wales saying "They leave school, they go to work, and in the course of a year they know nothing at all". From 1862, under the Revised Code, more popularly known as 'payment by results', the bulk of the nation's elementary school pupils were subjected to an annual examination. The schools were paid four shillings a year for the satisfactory attendance of each pupil between the ages of six and 12, and eight shillings for each pupil, dependent upon the results of an annual examination, not restricted by age but by standard, carried out by Her Majesty's Inspectors (HMI).

The Code had some positive achievements: a reduction in public expenditure on education; attendance at elementary schools certainly improved and basic literacy was extended. Elementary teachers, however, became increasingly dependent on market forces and their success rate in the annual examination (Hogg, 1990: 9). One HMI reported:

The tendency of the new Code is to cause managers and teachers to regard simply the pecuniary grants, and all that does not tend to produce an increased result as to these is hardly taken into account... The expression on a child's failure to pass any subject is not to regret at his ignorance so much as indignation at his stupidity and the consequent loss. (Committee of Privy Council on Education Report, 1864: 114)
Reviewing this history, Hogg suggests that the issue is not whether Performance Indicators are good or bad, but what questions are being asked to which performance indicators might provide answers. When using Performance Indicators to assess schools, for example, the questions underlying the exercise might be: What is education for? What difficulties in the education system is it thought performance monitoring might improve? How far will monitoring and assessment be successful in meeting present and new needs? (Hogg, 1990: 12).

Teachers' reactions to the Code were virulent:

A young teacher was dying of consumption, but on hearing of the death of the inspector he started up, a wild light struck out of his eyes, like fire from steel, and he said with a hideous broken scream: "By God, I hope he's in hell!" (Runciman, 1887: 29)

...sooner than teach in an elementary school, under any one of a score of inspectors I could name, I would go before the mast in a collier, or break stones on a casual ward - or better die! (Runciman, 1887: 29)

These quotes highlight how an inappropriately designed set of performance indicators can have profound effects on those directly affected by them. This suggests that it would be wise to check the likely impact of a set of Performance Indicators with those most directly affected by them before full implementation.

During the late nineteenth century and early twentieth century, more scientific methods of testing school children began to be developed in the United States, France and the United Kingdom; and were introduced in the 1920s. While there were critics, particularly among teachers, testing was widespread between the two World Wars, although confidence in assessment began to wane from the mid 1930s.

Looking at more recent history and the growth of performance indicators in the educational field, Hogg (1990: 7) suggests that it was the launching of the Soviet Sputnik in 1957 and the American concern to 'catch up', as demonstrated by the American National Defense Education Act, that led to the revival of interest in the use of performance indicators in the 1960s. This was then absorbed into the more general 'social indicator movement' attempting to monitor the impact of technological change (à la Bauer, 1966). With the oil crisis and the retrenchment of the public sector (including central statistical agencies) in Western capitalist countries, there was a hiatus until the revival in the United States with the publication of A Nation at Risk (National Commission, 1983). This coincided with other tendencies (see section 1.4) to generate the current level of world-wide interest. Is it perhaps appropriate to talk of long waves of performance indicators?

1.1.2 Earlier International Attempts

International Association for the Evaluation of Educational Achievement

The International Association for the Evaluation of Educational Achievement (IEA) has been in existence for over thirty years; and has been carrying through a testing programme since the beginning of the 1970s. Their first efforts were limited to testing in mathematics, because it was presumed at the time that the test material in that subject area was the least likely to suffer from cultural variation.1 Since then there has been testing of language, mathematics and science in over 25 countries, and in several countries more than once; although the issue of cultural and systemic variation between countries (Cummings and Riddell 1992) is still unanswered.

The school focus of the IEA, however, is insufficient in countries where not everyone in an age group goes to school. This is acknowledged by IEA: "Surveys of formal schooling alone are not sufficient to assess education in third world countries, if one wants to judge and compare an entire age or grade population cohort." (Plomp and Loxley, 1992) The particularities of developing performance indicators in the area(s) of non-formal education are considered further in section 1.5.

Organisation for Economic Co-operation and Development (OECD): A Quarter Century of Educational Performance Indicators

A very early publication was Indicators of Performance of Education Systems (Carr-Hill and Magnussen, 1973) which suggested the following framework for thinking about the goals of education systems for which indicators needed to be developed.

· transmission of knowledge and skills;
· education and economy;
· equality and educational opportunity;
· provision of educational services for individual requirements; and
· education and the quality of life.
Carr-Hill and Magnussen argued that, since no clear set of educational goals (from which indicators could be derived) is available or can be imposed, differing goals should be treated separately. For example, the different outlooks of the pedagogue (the learning process), of the economist (education as an investment and supplying qualified manpower), of the sociologist (concerned with access to education and the effect on social class stratification), of the 'consumer' (estimates of private demand for education) and of the philosopher (questions of the impact of education on the quality of life) should all be considered separately. Although their recommendations did not lead to a regular publication of Performance Indicators in education, a set of 45 indicators covering some aspects of this framework was adopted in principle (see A Framework for Educational Indicators to Guide Government Decisions, OECD, 1974).

Most subsequent authors have implicitly agreed with the liberal eclecticism of this approach; but literally dozens of different frameworks have been proposed (van Herpen 1992). The type of system actually proposed by the OECD in its second incarnation is of greater interest to developing countries - if only because of its relative simplicity. Following on from two international conferences - Washington (1988) and Poitiers (1989) - the system was initially organised around three areas:

· the economic, social and demographic context of education;
· cost, resources and processes; and
· the results of education.
In later editions of Education at a Glance (various years since 1993), four subdivisions are used:
· the demographic, social and economic context of education;
· the costs of education and human and financial resources;
· participation in the educational process; and
· the outcomes of education.
The section on participation in the educational process is further subdivided into:
· access to education, participation and progression; and
· school environment and school/classroom processes.
The section on the outcomes of education is further subdivided into:
· graduate output of educational institutions;
· student achievement and adult literacy; and
· labour market outcomes of education.
A number of working groups were formed to develop appropriate indicators for each of these sub-sections: educational outcomes; student destinations; school features and processes; and expectations and attitudes to education. They continue to meet under the aegis of OECD.

1.1.3 The New Managerialism

While we have demonstrated that management by objectives in the form of payment by results is not new, undoubtedly there has been a greater emphasis in recent years on outcomes. It is important to recognise that this is part of an overall trend, especially in the public sector, associated with the move away from direct state provision, and probably with the globalisation of competition consequent upon the alleged triumph of Western capitalism. While it is not suggested that this cryptic analysis should be explored in depth, the examination of the reasons for this greater emphasis might give us some clues as to what works and what does not work.

The Recent Growth of Performance Indicators as a Management Tool2

There have been performance indicators in the UK public sector for some time, but they have tended to focus on the managerial process, the principal objective (Carter, 1989) being to enable closer control of devolved management by central government. The introduction of outcome-related performance indicators (ORPIs) was intended to enhance accountability to external interested parties - service users, taxpayers, or auditors acting on their behalf. Anthony and Young (1984: 649) argue that, in non-profit organisations, ORPIs are addressing the most important stimulus to improved management control, i.e. 'more active interest in the effective and efficient functioning of the organisation by its governing board'. Increased sensitivity of representatives to popular preference will then permeate the organisation and - as in the private corporate sector - will have a profound impact on internal control mechanisms.

Thus, throughout the private sector, control has effectively been exercised through the financial accounts - what we could call 'input accountability'. The presumption in the private sector has been that, in competitive product markets, consumers can observe directly the merits of competing goods and services - so the private sector tends3 to pay little attention in accounting terms to the quality of the output, although of course they may advertise the quality of their products to attract the consumer. But most public services - even after decentralisation and/or quasi-privatisation - are effective local monopolies, so the citizen cannot directly experience the services provided (or value-for-money) in different localities; and many citizens may not directly experience, for example, the police and fire services even though they value them. One role of ORPIs, therefore, is to act as a proxy for the direct experience of services provided by alternative jurisdictions - i.e. to address a concern for geographical and social equity.

Assuming also that taxpayers not only wish to see tax revenues being used well but also that they believe they can exert some influence, performance indicator systems are one means of communicating relevant information. Here, the analogy with the shareholder in the private sector is closer as the concern is with efficiency and effectiveness. The use of performance indicators for input accountability is based on the familiar principal-agent model of management (Baiman, 1990). Public sector managers, however, are responsible in varying degrees to a much wider variety of constituencies than the single 'principal' envisioned in this model.

Essentially, therefore, the growth of performance indicators is linked to a demand for accountability and equity, and demands to demonstrate value for money in respect of activities, in the context of evident differences between providers as a consequence of decentralisation. Moreover, "on the assumption that well-informed electors will not tolerate manifestly inefficient management teams, it can be argued that performance systems will also encourage managerial efficiency in the use of resources" (Smith, 1993: 137). Equally, from the cybernetic point of view (Hofstede, 1981), an ORPI system should enhance the political control of the public sector by offering individual citizens timely and meaningful feedback on the effect of public sector activity, and should influence the design of the organisation's internal control system.

Communicating Performance Indicators

Performance indicator systems are not easy to scrutinise as the reader/user may have to disentangle at least four causes of variability in terms of:

· the objectives being pursued by different organisations;
· the social and economic environments;
· the accounting methods used; and
· the levels of (managerial) efficiency.
This has led to the argument for an expert intermediary; hence the role in the UK of the National Audit Office (with responsibility for auditing government departments including DFID) and the Audit Commission (with reference to Health and Local Authorities). They both have the remit to publish relatively easy-to-read reports on the performance/value-for-money of activities of the corresponding organisations. In the education system, the Inspectorate fulfils the 'traditional' role of monitoring educational quality from the centre and highlighting problem areas, while the Office for Standards in Education (OFSTED) has a more public face. The latter represents a form of 'recentralisation' at least in terms of publishing league tables of school performance along a number of dimensions (not only in terms of examination results).

In developing countries, the local inspectorate is usually inadequately resourced and often dysfunctional; indeed, the improvement of the system of school supervision is returning to donor programmes (see Khaniya, 1997). Donors, of course, audit their own programmes but that usually tends to be internal. With the increase in joint funding and sector programmes, the mechanisms have become slightly more public: for example, the several donors involved with the District Primary Education Programme (DPEP) in India combine in a Joint Supervision Mission to assess progress. Such missions fulfil a role halfway between the inspectorates and public accountability.

The Importance of Evidence - and of Judgement

Excessive reliance on performance measures in a complex society where the public sector is under pressure can have unintended and often dysfunctional consequences (Merchant, 1990) mirroring those of the mid-nineteenth century commented on by Hogg (1990). To a greater or lesser extent, most data reported within a performance indicator system can be controlled by operational managers. Unless the system is a perfect reflection of all the intended inputs, processes and outputs of an organisation from the point of view of those operational managers (which is very unlikely), there is potential for distortion (see section 1.3.2). The problem - a major theme throughout this report - is that we need to be able to assess the impact of performance indicator systems upon the performance of the individuals, institutions or organisations (or units within organisations) whose performance is being monitored. While there are many anecdotal accounts, systematic evidence of that kind is sorely lacking.

No one denies the importance of evidence of performance: it is a sine qua non of evaluating any professional practice. Part of the professional role, however, is to make judgements in complicated situations. While those who promote performance indicator systems emphasise the importance of very careful interpretation, much less attention is paid to institutionalising the role of such interpretation and judgements.4

1.1.4 The Resurgence of Performance Indicators in Education Systems

The previous section focused on the growth in the use of performance indicators in the public sector generally: what about the particular case of education?

Ruby (1989), reviewing the Australian experience, suggests a number of reasons for renewed policy interest in indicators:

(a) a concern to improve the country's international economic competitiveness by a variety of means but particularly by increasing the general level of education of the workforce;

(b) demands by decision-makers for better information about outcomes and performance to improve policy-making about education - the 'what works' syndrome;

(c) demands for information to guide and monitor the implementation of reforms, particularly structural reforms involving the devolution of authority, and to evaluate the outcome of those reforms;

(d) political commitment to equity such as equality of outcomes for minority groups;

(e) a belief that better information about effective strategies and performance will bring about qualitative improvements in teaching and learning;

(f) enhancing accountability measures in the public sector by gathering data on performance and outcomes; and

(g) a commitment to improving the information available to the public about the performance of public authorities.

Educational Indicators and Accountability

Wyatt (1994) agrees that the concept of educational indicators as summary statistics on the status of education systems is not new. Whenever there are perceptions of falling levels of achievement, the traditional response has been a call for greater accountability and the imposition of higher 'standards'.

He cites Carley's (1981) explanation for the decline of the social indicators movement of the 1960s: expectations were too high, given the time needed for their elaboration and the (in)adequacy of social theory; eagerness to supply social indicator information often led to the provision of poor quality data, thereby undermining confidence; and there were insufficient attempts to relate indicators explicitly to policy objectives. While his commentary is rather naive (see chapter three), these constitute important warnings about appropriate expectations, the quality of data, and the clear specification of the relationship between indicators and policy objectives.

Wyatt (1994) suggests that the recent pressure towards educational indicators is due to calls for accountability, and to the requirement of central government for a means of monitoring the process of devolution of responsibility to the school. The latter led to calls for: school accountability in respect of centrally determined criteria; school self-evaluation, emphasising the use of locally determined indicators in the school management process; and the use of indicators to monitor specific policy objectives in schools. Note that both the first and third are forms of counterbalancing 'recentralisation', following decentralisation to schools.

Commenting upon the potential use of performance indicators, Wilcox says:

First, performance indicators are seen as an essential element in the greater accountability which will be demanded of schools as a consequence of financial delegation...[second] there is a concerted attempt...to develop appropriate [performance indicators] but also to model and interpret them (Wilcox, 1990: 32).
Recent Trends

Scheerens (1992) identifies three recent trends:

· a transition from descriptive statistics (largely input and resource measures) to measurement of performance outcomes;

· a movement towards more comprehensive systems and a growing interest in manipulable characteristics;

· a concern to measure data at more than one aggregation level.

He also shows how different indicators are appropriate according to the type, level and mode of decision making:

Types of Decision Making - whether we are interested in: the organisation of instruction; the planning of education and establishing the structures within which it is delivered; personnel management; or resource allocation and use.

Levels of Decision Making - whether at the level of the school; lower intermediate authority (e.g. districts); upper intermediate authority (e.g. provinces or regions); central authority.

Modes of Decision Making - varying from full autonomy to collaborative; to independent but within a framework (although the extent to which the latter is different depends on how tight the frame is).

The moves toward Sector Wide Approaches (SWAPs) have raised the profile of performance indicators: thus "the shift towards strengthening primary education is a notable and welcome development but there are significant difficulties attached to SWAPs. In particular evaluation is not significantly integrated into design of projects" (DFID Evaluation of Primary Schooling, Synthesis Study). One could add that another significant difficulty is that the context of a sector programme includes not only the educational system but also other social sectors.

Most evaluations at least pay lip-service to some logical framework - or equivalent management tool - constructed at the beginning of the project or programme, with specific indicators being taken as good proxies for the attainment of one objective rather than another. An example, given in Figure 1 and taken from the log frame proposed for Kenya's SPRED programme, shows clearly how different indicators are seen as relevant to the attainment of different specific objectives; a short illustrative calculation of two of its quantitative indicators follows the framework.

The agencies most concerned with 'performance indicators' are, unsurprisingly, the World Bank and USAID, with the EC/EU not far behind (see Figure 2): the World Bank rates projects on five areas; USAID proposes very broad-brush indicators; and the EC/EU also retains some straightforward indicators. There is little awareness of the difficulty of collecting reliable data for these indicators.

Figure 2: Types of Performance Indicators Used by Different Agencies


(A) World Bank
Evaluation carried out by the Operations Evaluation Department. All projects rated according to three results-oriented criteria - outcomes, sustainability and institutional development - and two process-oriented criteria - Bank performance and borrower performance.

(B) USAID

· Education's share of national budget
· Primary education's share of education budget (for recurrent and capital expenditure); and
· Share of primary recurrent, non-salary expenditure of primary budget.
As an indication of effective schools, the use of a fundamental quality and equity level (FQEL) index, which measures the number of schools meeting minimum criteria in services and coverage... a means of capturing the united elements that go into making an effective school and the idea of "access with quality" (USAID 1998: 41).

(C) EC/EU
Education indicators are well known. Some of them, like those now being chosen to support structural adjustment in Burkina Faso, may be used in the context of the new conditionality approach: school-attendance rates (boys/girls); first-year primary attendance rates (boys/girls); success rates in end-of-primary exams (boys/girls); number of books per pupil; level of satisfaction among users; cost and rate of use of informal education by adults. A gender breakdown of indicators is essential here.


Figure 1: SPRED II LOGICAL FRAMEWORK (part only)

GOAL

Narrative Summary:
1. Increased demand for and utilisation of high quality primary education.

Measurable Indicators (with Means of Verification):
1.1 Reduced wastage rates, especially for girls, from the current 56% to less than 30% by 2005. (Verification: MoE Planning Unit statistics)
1.2 Improved student performance, raising the KCPE average by 20 points and stabilising the repetition rate at 15% through to 2005. (Verification: KCPE examination results)
1.3 Increase current GER of 79% to 85% by 2005. (Verification: MoE Planning Unit statistics)
1.4 Increased pupil and parental satisfaction. (Verification: MoE/project surveys)

Important Assumptions (Goal to Supergoal, ODA Aim 2 Statement):
1. GoK and parents continue to maintain the present level of commitment to education.

PURPOSE

Narrative Summary:
1. To improve the quality and cost-effectiveness of teaching and learning in primary schools on an equitable basis.

Measurable Indicators (with Means of Verification):
1.1 Teaching and learning environment improved in all districts by 1999, through all teachers using new skills that promote active learning and through use of textbooks provided under the project. (Verification: impact assessments; Inspectorate and TAC Tutor reports)
1.2 Improved professional support and inspection service to schools nationwide through an upgraded and diversified Teacher Advisory Centre (TAC) system and upgraded inspectorate, by 2000. (Verification: District Education Board reports, Inspectorate reports, project reports)
1.3 Strengthened community participation in school management and delivery: i) effective school committees established in all schools by 1997; ii) community education programmes operational and sustained in at least 5 disadvantaged districts by 2001; iii) information, education and communication programmes operational nationwide (with special emphasis in all areas of high wastage and low enrolment in the first 3 years of the project) by 2001. (Verification: impact assessments, consultants' reports, District Education Board reports, District Education Officer reports)
1.4 MoE uses operational research to improve resource allocation and planning: i) ensuring all severely disadvantaged schools receive free textbooks by 2001; ii) teacher recruitment and deployment systems improved, based on reduced student:staff ratios, and piloting of multi-grade teaching by 2001; iii) rationalised cost-sharing system in place in all districts by 2000, with a reduction of at least 10% in parental expenditure per child in schools in receipt of project textbooks. (Verification: MoE statistics, planning and budgeting reports, District Education Officer reports, consultants' reports)

Important Assumptions (Purpose to Goal):
1. Effective co-ordination with the PRISM project.
2. Non-ODA funded components achieve targets.
3. Effective targeting of project resources, through operational research, to districts where wastage and non-enrolment are greatest.
4. Consistency of KCPE examinations.
5. No increase in poverty in targeted districts.

OUTPUTS: Component 1

Narrative Summary:
1. Improved teacher training through the School-based In-service Teacher Development Programme (STD).

Measurable Indicators (with Means of Verification):
1.1 STD operational in all target districts by 1999 and a minimum of 90% (14,000 schools) coverage nationwide by 2001. (Verification: project impact study, Inspectorate reports, consultants' reports)
1.2 Teachers' guides prepared for circulation to 175,000 teachers by mid 1998. (Verification: project reports, consultants' reports)
1.3 School advisers' guides prepared by mid 1997; 1,300 TAC tutors trained in management of STD and 16,000 mentor teachers trained by 2000. (Verification: project reports, consultants' reports, TAC tutor reports, Inspectorate surveys)
1.4 Testing and measurement booklets prepared by KCPE examiners for circulation to 175,000 teachers by the end of 1997. (Verification: project reports, consultants' reports)

Important Assumptions (Output to Purpose):
1. Orientation of headteachers to STD through the PRISM project.
2. Politicians and unions accept the desirability of improved student:teacher ratios and the concept of multi-grade teaching in remote rural areas.
3. Institutional strengthening components receive broad-based support.
4. Policy commitment to increasing non-salary components of the primary budget is retained.
5. Teachers are able to apply new teaching methodologies.

Why Have Performance Indicators Now Become Acceptable?

The devolution of governance to districts and to schools has meant that they have become interested in comparing their performance with that of others. In turn this has stimulated discussion over what should count as performance. This was not the case in the 1960s and 1970s when the first systematic systems were proposed (e.g. OECD 1973). Unless a central authority has the power (as in nineteenth century England), it is very unlikely that a performance indicator would be acceptable, let alone implemented, without convergence on what is meant by good and bad performance.

Bottani and Delfau (1992), introducing the OECD project, argue that the influences on the development of an indicator system are: policy considerations; research knowledge; technical considerations; practical considerations; and the (usually political) 'choosers'. These principles interact and sometimes conflict, although policy relevance is likely to be the major driver. Fewer indicators rather than more are likely to be preferred.

For DFID, however, the arguments of Scheerens about the kinds of indicators required for different levels, modes and types are important: what types of decision (choosing new projects, monitoring on-going projects)?; at what level (head office, country, project)?; in what mode (in-house, with the recipient country, for the general public)? To this one might add distinguishing between the kinds of data required for different educational outcomes. This is developed in the final section where we discuss a possible framework.

Finally, we must not forget that the basis for any indicator system is the quality of the basic data (whether qualitative or quantitative) that are collected. This depends on the extent to which the field officers (in this case teachers) accept the system and can collect the appropriate data with the willing participation of the community.

1.2 DEFINING AND DEVELOPING PERFORMANCE INDICATORS


1.2.1 Approaches to Definitions
1.2.2 Developing Useful Indicators
1.2.3 Who Should Choose and Design the System?

1.2.1 Approaches to Definitions

There are as many different 'concepts' of educational performance indicators cited in the literature as there are different dissections that can be made of an educational system. The lists compiled by Ashenden (1987a) and Sing (1990), for example, cite a range of indicators of effectiveness, equity, productivity, process and quality, among many others.

The sources of data for performance indicators are eclectic: the data provided by the institutions which are part of administrative information systems; the data based on client and provider perceptions collected by questionnaire; or information collected through direct observation of the workings of the institutions (Wilcox, 1990: 37).

The most frequently cited definition of performance indicators in the educational context is that of Oakes (1986: 1-2). He argues that these indicators must provide at least one of the following kinds of information:

· a description of performance in achieving desired educational conditions and outcomes;
· features known through research to be linked with desired outcomes;
· a description of central features of the system in order to understand its functioning;
· information which is problem oriented; and
· policy-relevant information.
Of course, Oakes is writing mainly about school or college-based education in developed countries, contexts to which his approach can be applied relatively easily. It is much more difficult, however, to describe context and performance for out-of-school provision, or for the recruitment into school of the disadvantaged, in developing countries (see section 1.5).

He goes on to argue that indicators should be:

· ubiquitous features of schooling found in some form throughout the systems/settings being compared;

· enduring features of the systems so that trends over time can be analysed;

· readily understandable by a broad audience;

· feasible in terms of time, cost and expertise;

· generally accepted as valid and reliable statistics.

Again, in the context of developing countries, that is not always easy.

For the objectives of this report, we propose the most generic definition of performance indicators possible: information that is useful for understanding levels and variations in performance, in order to assess the impact of interventions and ultimately inform decision-making.

As Wyatt says, there is a certain consensus that performance indicators are "statistics which reveal something about the health or performance of education, describe its core features and are useful for decision-making" (Wyatt, 1992: 106). The problem at that level - performance indicators for an education system - is the necessity to build a coherent set that provides a valid representation of the condition of the education system. The problem here - for the assessment of DFID education projects and sector performance in those countries where DFID is involved - is the necessity to build consensus with project partners before defining the indicators (see DFID 1997, White Paper). Hence our reluctance to define those indicators which should not be defined a priori, but through a process of agreement.

1.2.2 Developing Useful Indicators

The shared basis of the approaches is an input-process-output model (van Herpen, 1992). The apparent reduction of the complexity of educational systems and their interaction with other societal systems is, however, illusory. Not only can an output of one activity easily be an input to another, an output from the perspective of one person or group can be seen as an input by another. Equally, the environment is sometimes seen as explanatory (having influence on the outcomes of the educational process), while some focus on the impact education has on the economic and social environment (the outcomes). Even if these problems are ignored, there are still problems of moving from a set of agreed (verbal) values to agreed indicators, identifying the target audience for a particular set of indicators, and then choosing the most appropriate data.

Problems of Moving from Agreed Values to Agreed Indicators

There are fundamental differences in the cultural significance and the social interpretation placed on agreed values, and in the willingness to identify and quantify a policy problem. Ruby (1992) argues that the choice of indicators should reflect either - or preferably both - common policy problems or the enduring values that underpin education systems (Nuttall, 1992).

Socio-cultural issues, however, not only mediate the demand for the collection of specific data series, they may also affect the way in which indicator systems are developed overall. For example, the nature of institutional linkages, the absence of political will, shortcomings in technical capacity, and different time frames will mediate between the basic values and interests shaping national or system policies and the priorities that are reflected in the data systems that are developed.5

These socio-cultural differences between institutions or societies can be very important. The obvious example is the disjunction between the time frames of data collection and policy. Data systems (as distinct from data collection) can show change over time; but whilst policy concerns emerge quickly and demand immediate answers, data systems are costly to change. Pragmatic agreement over a (sub) set of indicators is a possible way of relieving this tension. Account needs to be taken, however, of the different philosophies about the hierarchies of policy and action (see chapter two for a discussion about this in context of South Africa).

Identifying the Target Audience

Equally important, different professions and disciplines have different perspectives that come to bear on the agreement process. For example, in caricature, researchers belonging to the effective school movement focus on process indicators; economists on the relationship between input and output; political scientists on the possibilities of steering the system; sociologists focus on the environment; and teachers on survival.

For policymakers, good information is simple, comparable and timely. To technicians, however, this often means something different. For example, technical comparability might emphasise common definitions and collection times; whereas, for the policymaker, the rationale for comparability is in order to trade off or exclude some options by reference to experience elsewhere or previously (without there necessarily being full comparability in a technical sense).

Along similar lines, politicians want information that is accessible, direct and public (Riley and Nuttall, 1994). For example: the choice of schools has been a stimulus to demands for information; but, previously, this may not have been an issue because, from the point of view of the parents, there was no easy way to assess the difference between schools and hence no reason to choose.

One response to multiple audiences and ambiguity of purpose is to collect more information (Stern and Hall, 1987); but this is not necessarily a solution. On the one hand there is the danger of 'information overload' (see the case study of Andhra Pradesh in chapter two); and, on the other, of the demand for pseudo-scientific indices to reduce the confusion (see chapter three, section three).

What Data to Collect?

In technical terms, a good indicator is relevant, reliable, understandable, and can be updated. These requirements are not always easy to fulfil. Moreover, the different uses to which performance indicators are put reflect different data gathering requirements (a sketch of the indicator metadata this implies follows the list below):

· Some provide a benchmark for measuring change over time; others are focused on differences across geographic areas or institutions at a point in time.

· Some reflect a policy issue, or an aspect of education that might be altered by a policy decision; others information relevant to managerial processes.

· Some are macro and quantitative, reflecting broad-brush decisions and others are micro and qualitative as part of a change process.
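The criteria and distinctions above amount to a small body of metadata that needs to be recorded for every indicator before data collection starts. The sketch below (Python, purely illustrative; the field names are assumptions rather than any DFID or OECD specification) shows one way of holding that metadata; the sample entry reuses the GER indicator and its means of verification from the SPRED log frame in Figure 1.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Indicator:
    name: str
    category: str              # input, context, process or output
    level: str                 # school, district or national
    purpose: str               # e.g. benchmark over time, cross-area comparison, policy monitoring
    source: str                # means of verification, e.g. EMIS returns, surveys, inspection reports
    frequency: str             # e.g. annual, termly
    baseline: Optional[float] = None
    target: Optional[float] = None

example = Indicator(
    name="Gross enrolment ratio, primary",
    category="output",
    level="national",
    purpose="benchmark change over time",
    source="MoE Planning Unit statistics",
    frequency="annual",
    baseline=79.0,
    target=85.0,
)

Whether such a record is kept in a spreadsheet or a database matters less than the discipline of deciding, for each indicator, who supplies the data, how often, and against what baseline and target.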

1.2.3 Who Should Choose and Design the System?

The issues of how indicators are chosen and by whom has itself generated a large literature focusing, in particular, on policy, technical and practical considerations. Broad agreement exists on the need for valid, reliable, timely, comparable, feasible, reasonably costed, policy relevant and comprehensible indicators (Nuttall 1992: 93). There remain differences, however, about "the number, the need for redundancy, and the extent to which the indicators should be comprehensive and organised by and into a framework that reflects the functioning of the education system with...known causal links." (Nuttall 1992: 93)

The Policy Agenda

Choices are inevitably influenced by value systems of those making the choice. McDonnell suggests that:

The policy context then plays two distinct roles in the design of a system of indicators. First, it provides the major rationale for developing and operating such a system. Second, the policy context constitutes a key component of any educational indicator system, because specific policies can change the major domains of schooling in ways which affect educational outcomes (1989: 241-242).
Nuttall (1994), however, emphasises the importance of creating indicators that are independent of the current policy agenda, otherwise it will be difficult to maintain a stable statistical system.

A Model of the System

Several authors argue that a set of Performance Indicators must reflect scientific understanding of how the education system functions, but it also needs to reflect the interests of the policy making community, including consumers and data producers. Again Nuttall cautions:

...the present understanding of the educational process is insufficient for the postulation of a [precise] model, but that it is possible to create a framework that embodies the available limited knowledge of empirical relationships and that begins to relate malleable variables to desirable outcomes without promising too much. (Nuttall, 1994: 85)
Criteria for Choosing, Developing and Evaluating Indicators

Nuttall suggests a series of important practical lessons that have been learnt from the various attempts to develop performance indicators in developed countries:

· indicators are diagnostic and suggestive of alternatives rather than judgements;

· any implicit model [however partial] must be made explicit and acknowledged;

· criteria for selection must be clear and related to an underlying model;

· individual indicators should be valid, reliable and useful, etc.;

· comparisons must be done fairly and in a variety of different ways; and

· the various consumers of information have to be educated about what the indicators mean, how they are to be interpreted and what consequences they might have.

For any proposed system of monitoring education programmes or projects in developing countries, the objective should not be to provide a comprehensive, causally specified framework. Instead, it would be useful to have small sets of indicators organised roughly into the categories of inputs, context, processes and outputs (perhaps distinguishing between levels and modes of decision-making and kinds and types of data). At the same time, the danger of keeping the indicator set too small, and of corrupting the behaviour of those whose performance is being monitored, needs to be recognised (Darling-Hammond, 1992).

1.3. USES AND ABUSES OF PERFORMANCE MEASUREMENT


1.3.1 Uses
1.3.2 Abuses or perverse uses
1.3.3 The (Limited) Value of Performance Measurement

Despite a reasonably thorough search of the literature, and although much has been written on both the advantages and disadvantages of performance6 indicators, it has been difficult to find anything relevant to the public sector other than theoretical accounts.

1.3.1 Uses

Sizer, Bormans and Spee (1992) identify five core uses of performance indicators in government institutional relationships: monitoring, evaluation, dialogue, rationalisation, and resource allocation. Where performance indicators become more controversial is where the emphasis shifts from their use as one of the many inputs into effective decision-making, to using them as a ranking device to allocate esteem and funding differentially (Meek, 1995). For Oakes (1986), indicators should be used to:

· report the status of schooling;
· monitor changes over time;
· explain the causes of various conditions and changes;
· predict likely changes in the future;
· profile the strengths and weaknesses of the system;
· inform policy makers of the most effective way to improve the system;
· inform decision making and management; and
· define educational objectives.
In the context of development, the World Bank (1996) suggests the following range of uses:
· clarification of the objectives and logic underlying the strategic plan;
· promotion of efficient use of resources via performance accounting;
· forecasting and early warning during program implementation;
· measuring programme results for accountability, programme marketing and public relations;
· benchmarking in order to learn from success; and
· measuring customer (beneficiary) satisfaction for quality management.
In the DFID context, the most important uses are probably for accountability, marketing and public relations.

Audience and purpose

The problem is: do policymakers, politicians and the public want the same information for the same reasons? Table 1 below shows the different uses of performance indicators from the perspective of different audiences. Unless there is agreement over the objectives and uses of performance indicators, PI reports will never be accepted.

Table 1: Different Purposes of Performance Indicators

Audience        Purpose
Institutional   Internal; comparisons with others; marketing; evaluation of teachers
Government      Accountability; policy and planning; allocation; overall level of funding; value of investment; manpower planning
Public          Accountability
Student         Choice
Teachers        Self-assessment
Industry        Research funding; graduate employment

Source: Davis (1996), report for the Commonwealth Secretariat

Concerns about the Development and Use of Performance Indicators

Davis (1996) summarises the concerns about the use of performance indicators as follows:

· costs of additional data;
· emphasis on one aspect;
· inappropriate when institutions have different objectives;
· their use in isolation;
· loss of diversity;
· imposition of central control;
· limited value for quality; and
· effectiveness and efficiency emphasised more than quality.
Compare this with the list of problems enumerated previously in an internal document by ODA (ODA 1995):
· indicators in isolation may mislead;
· simple models mask reality;
· collection and analysis of indicator data can be difficult; and
· indicators may be massaged for other purposes.
In the latter list, the issues about costs and diversity are not mentioned at all; and the limitations in terms of monitoring quality are not registered. Without this kind of understanding, any indicator system that is developed is likely to have over-ambitious aims.

Moreover, there is a salutary warning from the management science literature reflecting on the way most of the performance indicator systems used in the UK have been imposed by central government with little consultation. As Jones (1986) notes, the predisposition amongst operational managers to indulge in dysfunctional behaviour is likely to be heightened if they perceive that a control mechanism is being imposed on them against their will. This increases the possibility that any private gains to be made from distorting behaviour will be exploited by at least some managers. Evidence from the private sector indicates that the style with which control schemes are implemented may have a profound impact on their effectiveness (Hopwood, 1972; Otley, 1978; Kenis, 1979).

DFID should clearly take great care in the way in which the systems are developed. Whilst the argument for bottom-up, appropriate participatory approaches is a separate and, of course, very important issue in development, the crucial importance of involving country partners is central to the above discussion (see DFID 1997, White Paper).

1.3.2 Abuses or perverse uses

For every performance indicator, questions must be asked about the implied message, the behavioural implications. In other words, knowing that certain indicators are being collected and monitored, what implications do people draw? (Fitz-Gibbon, 1990: 2)
Smith (1993) enumerates seven ways in which excessive use of outcome-related performance indicators might influence public sector managerial behaviour:

Tunnel vision: Concentration on areas included in the outcome-related performance indicator scheme to the exclusion of other important areas.

Suboptimization: The pursuit by managers of their own narrow objectives, at the expense of strategic co-ordination.

Myopia: Concentration on short-term issues to the exclusion of long-term criteria, which may only show up in outcome-related performance indicators in many years' time.

Convergence: An emphasis on not being exposed as an outlier on any outcome-related performance indicator, rather than a desire to be outstanding.

Ossification: A disinclination to experiment with new and innovative methods.

Gaming: Altering behaviour so as to obtain strategic advantage.

Misrepresentation: Including 'creative' accounting and fraud.

On the basis of a small number of interviews with managers in the health sector, Smith (1993) suggests that all of these are possible effects, but the most likely problems were with 'gaming' and 'misrepresentation'.

The opportunity for 'gaming' in the public sector is exaggerated by the poor understanding of most production functions, illustrating the importance of choosing outcome-related performance indicators (ORPIs) with great caution, and of ensuring that incentives are compatible with organisational objectives.

Moreover, because outcome-related performance indicators are thought to require 'expert' interpretation, which tends to be provided by the manager responsible, there is considerable scope for interpretative 'misrepresentation'. Whether such misrepresentation occurs to any great extent is of course a matter of conjecture. As Smith's interviews suggested an acute awareness that other units might indulge in creative interpretation, one must conclude that there is a strong possibility that it exists. This is compounded when, as in the public sector, there are very few outside 'experts' able to give reasonably dispassionate commentary on performance measures.

Abuse of Educational Performance Indicators

In the education sector, one can see the opportunity for each and every one of the seven processes described above.

Tunnel vision: If head-teachers are rewarded only for those pupils examined, then there will be a tendency to focus teaching on the children to be examined, to the neglect of others.

Suboptimization: If head-teachers are rewarded according to the number of children in the school, there will be a tendency to devote energies to 'capturing' new pupils even though this does not improve attainment.

Myopia: An example of short-termism is where in-service training of teachers currently in post is neglected for cost reasons.

Convergence: The fear of being an 'outlier' might assume particular significance in a corporatist environment where either the state or the unions are powerful.

Ossification: New curricula and innovative teaching methods are likely to absorb staff time and be disliked because of that.

Gaming: Although unlikely to be a problem in developing countries, the reaction of head-teachers to league tables is a good example.

Misrepresentation: Where head-teachers are rewarded by numbers of children on the school register, there will be a strong temptation to falsify registers.

Smith (1993) concludes that most public sector performance indicator schemes have been developed on the assumption that they are neutral reporting devices, and too little attention has been given to the organisational context in which they will be used. The management control literature suggests that such a cavalier attitude to context threatens the objectives of the scheme.

1.3.3 The (Limited) Value of Performance Measurement

Whilst such abuses or perverse uses may be rare, the limitations have to be understood.

Limitations of Performance Measurement

There are constraints in transforming theoretical concepts of outcomes into practicable measurement procedures, thus:

"The gap between our academic aims and available measures is important because, to the extent that educational indicators have direct consequences attached to them, as in the case of performance indicators, these limited measures begin to reform classroom practice in their image. There is an assumption that policy action based on indicators will produce a desired result. Indicators are intended to advance constructive action, but such action is contextually embedded. Variations in culture and basic understanding about the inter-relationships of individuals, family, school and society are features of educational systems. The imminent danger is that the indicator model will frame the subsequent discussion in essence becoming the implicit model for schooling everywhere." (Bryk and Hermanson, 1993)
These arguments are well understood in terms of 'teaching to the test': "Performance indicators provide a useful focus on achievement, but top-down approaches aimed at using testing to bring about change are limited unless linked to support for school improvement" (Selden, 1994: 45). The argument, however, is more general. Flamholtz (1983: 157) notes that 'an accounting system cannot be viewed as a control system per se; rather [it] must be seen as a part of a carefully designed total system of organisational control'. As Hofstede (1981: 200) notes: 'the more formalised a control system, the greater the risk of obtaining pseudo-control rather than control'. Any measurement of performance needs to be introduced for a system and not just for individuals, or individual organisations within that system (Walsh, 1994); otherwise, inevitably, there are distortions.

Performance Indicators are, at most, Useful at Different Levels

The distinctions made by Scheerens (1993) between levels, modes and types of decision-making have already been discussed; these and other distinctions made above have implications for the type of indicators that will be useful at the different levels. Both the case study of developing a monitoring and evaluation system in Andhra Pradesh and the relevance of different indicators at different levels of the new South African system (see chapter two) illustrate this well.7

Moreover, it needs to be emphasised that most systems restrict themselves to the 'school system' and there is relatively little attention given to adult and out-of-school education (see Carr-Hill, 1989). Van Herpen (1992) suggests that these should be built into a comprehensive system. But the information requirements of out-of-school education are very different from the formal schooling system (Carron and Carr-Hill, 1991, see section 1.5).

Care in Elaborating Performance Indicators

The lessons for DFID from these observations are that there are a number of crucial questions to ask of any proposed performance indicators:

· Is the performance indicator about a significant aspect of the education system or of the impact of education?

· Can it be readily understood by everyone involved both in-country as well as external parties?

· Will the data be reliable and not subject to significant modification as a result of response error, or changes in the personnel generating it?

· To what extent are the data being reported under the control of operational managers and therefore subject to potential distortion?

1.4. EXPERIENCES OF OTHER AGENCIES/COUNTRIES WITH SYSTEMS OF PERFORMANCE INDICATORS


1.4.1 Australia
1.4.2 Sweden
1.4.3 Commonwealth Secretariat
1.4.4 World Bank

This section reviews some of the lessons learnt by countries attempting to institutionalise school-based performance indicator systems, and relevant innovations. Hence the apparently curious choice of agencies/countries.

1.4.1 Australia

Ruby (1994) highlights six lessons that emerged from only the first twelve months of institutionalising a performance indicator system:

1. It is difficult to communicate accurately and economically about ways of assessing performance given:
· the technically and theoretically complex nature of indicators; and

· the fact that any external assessment potentially challenges concepts of professionalism and traditional notions of autonomy, and raises questions about the nature of accountability.

2. Technical problems of outcome measurement do not predominate; instead the focus is on problems of interpretation and the influence of contextual and process variables.

3. The importance of stressing fundamental questions of why there is a demand for performance indicators, and in what context and for what purposes they are useful, rather than on the technical and practical questions of constructing indicators.

4. Indicators are essentially normative and goal oriented, directly linked to policymaking and the political process, and only useful when linked to a model of the education system.

5. The importance of drawing on as many paradigms and perspectives as possible, to involve people working in science policy and public policy as well as education.

6. The benefits of exploring new ideas using a co-operative and relatively open process. Keeping the process of analysis transparent to those affected by the outcome establishes credibility for the outcome of the process and of those involved.

Commentary for DFID

While it is not feasible to incorporate all these lessons (e.g. to have an agreed model of every education system as a basis for a framework), issues of communication, interpretation, pluralism and transparency are important to bear in mind when designing sets of performance indicators which will be acceptable within different country contexts.

1.4.2 Sweden

The Swedish National Agency on Education set up an evaluative project in 1992 to examine the non-cognitive development of pupils in Swedish schools. The project took account of pupils' own views as 'connoisseurs of their own schools'. It examined pupils' development on four core variables which reflected strong national purposes: independence, self-confidence, participation in decision-making and solidarity with others. This was based on the view that:

Individuals that hold a critical mind and are used to act in independent ways are seen as important parts of the assurances that the Swedish society have taken towards fascism. (Ekholm and Karang, 1993: 13)
In other words, it was important to measure self-confidence as this was seen as a prerequisite for successful learning, and involvement in decision making as essential to sustaining democracy. Tolerance and understanding of others were also seen as essential to democracy (Ekholm and Karang, 1993: 14).

Commentary for DFID

The purpose of supporting primary education in developing countries is, in part, support of democratisation (Western style) - see ODA (1993) and DFID (1997). Thus, while simple counts of participation or registration in primary school - and, eventually, attendance at primary education - may have been sufficient at one stage, discussions about measures of outcome or success are usually in terms of achievements in literacy, numeracy and science. But, where democratisation is a central tenet of a donor's strategy (as with the DFID), appropriate indicators of aid effectiveness are required; at a minimum, measures of school effectiveness have to include non-cognitive achievements.

1.4.3 Commonwealth Secretariat

Davis (1996) set out to compare the progress made in developing performance indicators at the higher education level in Commonwealth countries. In the preface to her study, Fielden writes:

"Surely, we thought, at the very least, we can obtain staff-student ratios from various different jurisdictions" (Fielden, preface Commonwealth Higher Education Management Service, 1996.
Despite the rhetoric about performance indicators, however, little was actually obtained, and so:
"Our aim is to show how very little has been achieved and how, despite the massive industry of researchers working on performance Indicators, comparatively few are in use nationally." (Fielden, op cit)
Commentary for DFID

Part of the problem was that performance indicators are used for different purposes in different contexts (see section 1.3 above). This study graphically demonstrates the gulf between the identification of a possible indicator - and even ways of collecting, relatively cheaply, the corresponding data - and the use of an indicator to inform policy.

1.4.4 World Bank

Prior to 1987, a basic data sheet giving country economic, social and sectoral data was required for all Bank-financed education projects. This requirement was subsequently abandoned, as the data collected were often of poor quality and not seen as relevant to individual projects.

General experience of indicator use

The World Bank's Operations Department reports that there are:

· disparate educational information systems between and within countries;
· differences in educational systems;
· differences in classification and terminology; and
· imbalances in the collection of data.
They identify possible reasons as: complexity of education, lack of resources, lack of capacity to carry out educational research, the political nature of educational data and lack of standardisation of education system components (McRae, 1990).

The World Bank's recent approaches (or at least the rhetoric about their approaches) are increasingly characterised by a focus on developing local capacity for project benefit monitoring and evaluation. This shift of focus has occurred in response to the Wapenhans Report (1990) which points to:

· too much emphasis on the mechanics of project implementation;
· poor identification of risks and factors influencing project outcomes;
· lack of objective criteria, transparency and consistency across units; and
· ratings which tend to be overly optimistic.
For the purposes of the Bank and its clients, the most significant benefits of performance indicators accrue in project design, project supervision and monitoring and project evaluation (World Bank, 1996).

The approach is spelt out in a paper by Sigurdsson and Schweitzer (1994). This paper discusses three types of data: basic data (providing socio-economic background and context), education sector data (useful in project identification and evaluation), and project performance data (to mark the progress of project components towards specific targets).

The appropriate types of indicators related to the project cycle are: input indicators; process indicators (to monitor stages of project implementation); output indicators (the immediate project targets identified as project components to be completed); and impact indicators (derived from sectoral data). (See Annex 1A for further details.)

The World Bank suggests that: "Policy related indicators can be used to identify risk and enabling factors during preparation and appraisal for projects and systems." (Sigurdsson and Schweitzer, Executive Summary, 1994). It is recognised, however, that to be meaningful, education indicators must be analysed in the context of system needs and available financing. The danger that funding decisions based on indicator performance may encourage skewed or falsified data recording is also acknowledged (1994: 3).

Sigurdsson and Schweitzer conclude with the following recommendations:

· that project performance indicators be project specific;

· that a uniform approach to economic justification be applied;

· that consistent attention be paid to process indicators; and

· that the World Bank should assist UNESCO in its ongoing work to establish agreed definitions for MIS systems in Bank-financed projects.

Commentary for DFID

Many of the lessons from the World Bank's experience, and some of its recommendations, should be adopted. Disregarding the Bank's obsession with a uniform economic justification of projects, the emphasis on project-specific performance indicators, the importance of process as well as outcome indicators, and the need to pay careful attention to the source data are all sensible points.

1.5. SPECIFICITY OF EDUCATIONAL SYSTEMS IN DEVELOPING COUNTRIES


1.5.1 The Jomtien Agenda
1.5.2 Covering the Diversified Field of Education

1.5.1 The Jomtien Agenda

In some ways, the Jomtien Conference can be read as yet another attempt to implement universal primary education. While the Conference acknowledged the potential role of alternatives (at least rhetorically), the predominant view was that appropriate indicators should be based on enrolments. From the point of view of developing any indicators that go beyond simple numbers of children in schools (the simplicity is in the statement not in the counting!), the emphasis on quality is the most important.

The focus of basic education must...be on actual learning acquisition. It is therefore necessary to define acceptable levels of learning acquisition for educational programmes and to improve and apply systems of assessing learning achievement. (Education for All Conference, March 1990)

The Jomtien agenda focused on:

· strategies for improved training of teachers and education managers;
· alternative methods of improving access;
· increasing production and dissemination of teaching learning materials; and
· efforts at strengthening education administration, planning and management.
Potentially, of course, such a list provides the opportunity for considerable divergence over what should be done to improve quality, and therefore generates difficulties when different stakeholders try to agree on the appropriate performance indicators. As suggested in the previous section, in order for performance indicators to be developed, there has to be clarity over objectives and therefore convergence (if not consistency) between the different interested parties. According to Hoppers (1994), however, the context in which the Jomtien Agenda is being implemented has provided a fertile ground for the development of performance indicators. He points to the almost universal phenomena of:
· stagnating enrolments and an apparent deteriorating quality;
· the need for greater economies in the development aid budget;
· the internationalisation in technologies of curriculum development; and
· intensive interaction amongst policy makers promoting the same views.
The similarity of the problems and proposed solutions to which Hoppers points would suggest that there could be agreement. Ruby (1994) argues, however, that apparently similar values and policy concerns do not necessarily mean that the same indicators are relevant (see also Blunt, 1995). Given that DFID prefers developing sets of indicators in conjunction with the developing countries themselves (White Paper, 1997), then it will be important to allow for potentially divergent frameworks; and indeed to have a mechanism for identifying - at least internally - disagreement and non-consensus.

1.5.2 Covering the Diversified Field of Education

The analyses discussed above - because they are based on experience in (post-)industrialised countries - have all focused on schools as the vehicle for education. In many poorer countries this does not apply: either the quality of schooling is highly variable, or the main vehicle for education lies outside the school. The former highlights the importance of developing sensitive performance indicators; the latter is the subject of this section.

Non-formal programmes have been distinguished from formal programmes along a number of dimensions. While the original distinction made by Coombs et al (1973) between formal, non-formal and informal education, based on the degree of hierarchy and so on, has been shown to be inadequate, no other single set of dimensions is any more successful. As Carron and Carr-Hill (1991) show, this is because it is important to understand:

· their aims and objectives;
· the kinds of clientele they serve;
· the organising agency; and
· the relationship with the formal educational system.
They go on to distinguish four types of programme:
· para-formal or parallel educational programmes;
· professional and vocational education;
· personal development with no specific professional intent; and
· popular education.
These types are described briefly below together with the kinds of indicators that might be appropriate.

Para-formal education: The set of programmes designed for educational equivalencies to officially recognised primary, secondary or higher educational diplomas. Case studies (e.g. Bibeau 1989; Gallart 1989) have demonstrated that there has been a progressive tendency for the formal educational system to absorb 'innovations' from the non-formal education sector as part of the standard curriculum.

In addition to these second-chance para-formal education programmes, there has been a rapid expansion of the private tutoring of regular formal school students. It has grown with the massification of formal education, as elite middle-class parents, who perceive their previously privileged position to be disappearing, have sought ways of retaining a competitive edge for their children. At the same time, for formal school teachers in many developing countries, who have seen their salaries eroded over recent decades, the private tutoring system has been a welcome opportunity to increase their income. Obviously, from the point of view of the parents and the tutors there are clear criteria of success; but one might also want to assess the impact of this private tutoring system upon the formal schooling system, and this is more difficult.

Professional and Vocational Education: Although this is an obvious grouping, there are problems in defining the outcomes both for the individual and for society. This is not the place to rehearse the well-known arguments about the difficulty of interpreting rate-of-return analyses (see Hough 1991), but the point is that some vocational qualifications are clearly used for screening, and some vocational education is intended to socialise people for the general 'world of work'. Moreover, within vocational education and training it seems sensible to distinguish between general vocational education, which transmits skills, knowledge and behavioural traits broadly relevant to performance in all or a considerable number of occupational roles ('learning to learn'), and specific instruction, which is concerned with the performance of a single task (or set of tasks) within a single job or occupation in a single institutional locale ('learning to do') and is limited in scope and non-portable in application.

While these distinctions are sensible, only the latter (specific instruction) provides a basis for elaborating appropriate performance indicators from the point of view of the company in terms of improved productivity on the specific job. In principle, there will also be enhanced income and/or security for the individual so long as they remain in that job; however, the medium or long-term outcomes for the individuals may not be so successful. The arguments about the rate of return in general mean that it is not possible to define indicators for general vocational education that would be generally accepted.

Personal Development: The rapid expansion of personal development activities is one of the most significant common trends in the diversification of the educational field. This is mostly a phenomenon in the North and so may not be very relevant here. Appropriate indicators should be reasonably clear, however, given that the purpose of this type of non-formal education is to fulfil individual wants. Data collection on client satisfaction would, therefore, be appropriate.

Popular Education: Finally, another separately identifiable example is the type of education used as a means of consciousness-raising, practised for example by the Catholic communities in Latin America during the 1980s. This model of collective promotion appears to have weakened in favour of the spectacular emergence of personal development activities but there are still situations where a liberating form of education is seen as an essential vehicle for political and social movement.

In terms of the focus of this report, however, the point is that, in this mode, education is being used as a vehicle for a totally different perspective on society. Performance here is measured in terms of revolutionary not evolutionary outcomes.

In principle, therefore, we have been able, in most cases, to identify performance indicators that would be appropriate for the different types of non-formal education. At the same time, we have to be realistic about the possibilities of collecting data. A major stumbling block is simply the 'countability' of the different kinds of educational activity, as non-formal education programmes do not always record enrolment. From this perspective one could separate the various strands of non-formal education into two groups:

· those where the providing institution would hold enrolment and/or registration data, and which could therefore be captured, in principle, through a census survey of institutions; and

· those where the most practicable way of obtaining estimates would be through a sample population survey.

These various possibilities are summarised in Table 2 below.

Table 2: Collecting Data about Non Formal Education

Level

Name

Standard

Non-standard

Suggested Classification



State

Collectable from Institutions

Collectable via Household Surveys


1

Pre-primary

Nursery

Playgroups

Child Care

Para-formal

2

Primary

Primary

Evening classes

Street children

Para-formal

3

Secondary

Secondary

Evening classes

Youth groups

Para-formal

4


Further Education Colleges

Industry

Youth groups, back-street colleges

TVET General/specific

5

Tertiary

University

'Open' Universities

Auditing

Personal Development

6

Post-Doctoral

University

'Open' Universities

Auditing

Personal Development

1.6. POSSIBLE FRAMEWORKS FOR PERFORMANCE INDICATORS


1.6.1 The DFID Context
1.6.2 A Skeleton for Frameworks
1.6.3 Developing Indicators at Different Levels and for Different Stages

The breadth and range of the above definitions and approaches means that it is probably not possible to develop a comprehensive set of indicators reflecting a definitive theoretical framework. It is possible, however, to lay out the skeleton of overlapping frameworks, which will need to be completed as appropriate.

1.6.1 The DFID Context

1. Aims of ODA Education Aid

The details of the current DFID educational strategy are being drafted; the aims are set out in "Learning Opportunities for All" (DfID, 1999).

Priority is given to meeting the International Development Goals of:

· Universal Primary Education (UPE) in all countries by 2015.

· Demonstrated progress towards gender equality and the empowerment of women by eliminating gender disparity in primary and secondary education by 2005.

And, in a minor key, DfID will also help to promote adult literacy, lifelong learning and the acquisition of practical skills for development, for women and for men.

Obviously these could be the basis for a generalised indicator framework, given that, at least within DfID, there can be consensus over what is meant by UPE and what would count as gender equality (although measuring levels of adult literacy and skill acquisition is more difficult).

The more detailed 'Framework for Action' is less clear cut. DfID will support the efforts of people and governments committed to:

Effective and Equitable UPE

· overcoming barriers to access and retention
· supporting children to complete a basic cycle of education
· improving the quality of schooling
· equity for all children
· placing UPE within the wider education sector
Gender equality in school education

Literacy and Skills Development

Knowledge and Skills for Development in a Global World

Sustainable, Well Managed Education Institutions Systems and Partnerships

It is however less clear how to measure these components and sub-components. For example, considering Effective and Equitable UPE:

· whether access should be measured in terms of attendance or completion, and how disadvantaged areas should therefore be identified;

· what is meant by improvement and by quality; and

· the most appropriate way of placing UPE within the wider education sector.

The other components also pose a number of difficulties for measurement. There have been many debates over what gender equality means, there are disputes over different academic literacies, the relationship between local and global economies is contentious, and qualifying educational institutions as 'well managed' begs the question. On the other hand, it might well be possible to generate indicators if these rather general statements were better specified.

2. Planning Education Projects

The ODA guide for planning education projects (ODA 1991a, cited in Hirani Ratcliffe 1994: 4) highlights several key questions:

· What is the evidence of demand for the education service proposed?
· What will be the benefits to the country and individual?
· Can these improvements be measured?
· Will there be cost economies resulting?
· Is the proposed strategy seen as the most cost-effective?
· Are the recurrent cost implications manageable?
Again, in principle, each of these questions could be used as the basis for developing a (small) set of indicators; but even then there will probably be too many. It is not clear that they could be combined into a comprehensive performance indicator framework.

3. What makes a project successful?

The conclusions of an ODA review (ODA 1993) as to what were the essential prerequisites for a successful project included:

· a conducive policy environment;
· joint commitment to project goals and outcomes by donors and government;
· effective project design;
· local ownership including participatory appraisal;
· local financial and institutional capacity for implementation and sustainability;
· effective management and administration of donor inputs; and
· effective monitoring and reporting.
These are much vaguer: it is difficult to get everyone to agree on what counts as a conducive policy environment, a joint commitment, local ownership, local capacity, effective management, and effective monitoring. Without further specification, it is hard to see how these could be the basis for a framework which would be consensually agreed even within DFID, let alone with country partners.

1.6.2 A Skeleton for Frameworks

The Basic Axis of the Frameworks

There needs to be a framework of what to consider for each such group of performance indicators. The best starting point is probably the OECD/INES project suggestion of context, input, process and output. Given the focus here on projects and programmes, this should be extended to include aims and outcomes. It will, of course, not always be appropriate to consider each and every one of these components.

Stakeholders

In this context, who are likely to be affected by sets of performance indicators?

· DFID Education Advisors;
· programme participants;
· recipient governments;
· consultancy organisations;
· political authorities.
The concern to involve recipient governments (as well as direct beneficiaries) implies that, although there is no intention to develop a comprehensive framework, it is unlikely that a small set of 'key' indicators would be sufficient. The indicator set would have to reflect multiple goals with multiple indicators measured by multiple methods (McEwan and Hau Chow, 1991).

For example, an ODA survey of ELT projects (1994) identified 105 projects with an English focus or significant ELT component in 52 countries. DFID sees English within a wider development context of national language policy; indeed, the survey suggests that there is a distinction between English as a medium for educational development, for international communication, and for economic development. In contrast, the Department of Trade and Industry and the Foreign and Commonwealth Office see English as an export. The survey concludes that at least three possible sets of indicators for English programmes/projects could be developed.

Relating Indicators to the Decision Making Context

Scheerens' distinction between types, levels and modes of decision-making is also important. What type of decision is being made (choosing new projects, monitoring ongoing projects)? At what level (head office, country, project)? In what mode (in-house, with the country, for the general public)? The last issue, that of 'mode', is crucial in considering aid programmes and projects because of the range of 'stakeholders', but we should not forget the distinction between types of projects and levels of decision-making. The development of the skeleton frameworks suggested below for sets of performance indicators will obviously have to be appropriate to the specific task and the educational outcomes (cognitive/non-cognitive) considered. The corresponding indicators will vary depending upon the entry point (type, level or mode of decision-making).

Nevertheless, based on these principles, it should be possible to elaborate each of the lists in the previous section (one based on DFID strategy, the next on planning, the third on project implementation, and the last on economic appraisal) in collaboration with the appropriate personnel, field agencies etc. into an integrated and overlapping framework. The problem remains, however, that although an 'intuitive' reading of these different lists would find no contradiction, we have shown how the literature is replete with examples of what happens if one ignores the apparently small nuances between definitions of objectives and 'outcomes' when performance indicators are instantiated in the field.

Assuming these can be resolved, the crucial issue at each level is what to include, what to omit and why. Attention should be paid not only to the technical criteria for good indicators, but also to:

· issues of relevance to the particular system;

· propinquity to the phenomena being monitored without interfering in the operations of the system (either through an over-heavy burden of data collection or through giving the wrong incentives); and

· the potential need for multiple indicators because of multiple stakeholders.

1.6.3 Developing Indicators at Different Levels and for Different Stages

Essentially, for DFID, we can envisage three different sets of indicators: those at the sectoral level; those at the planning and pre-planning stage; and those which would be used for monitoring and evaluation.

Sectoral Level

If the focus is the sector (and reflecting on the OECD experience), there is only a limited choice, given the types of data that can be collected. The main problems are the mechanisms for collecting the data and its quality. The technology for carrying out sample surveys, however, is now well developed and, with country agreement, this is probably the best way forward.
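
By way of illustration only, the following minimal sketch (in Python, with invented field names and figures) shows the kind of calculation involved in estimating a single sectoral indicator, here a net attendance ratio, from weighted household-survey records. It says nothing about the design of any particular survey instrument, and every name and number in it is hypothetical.

# Illustrative sketch only: a weighted estimate of a net attendance ratio
# from hypothetical household-survey records. All field names and figures
# are invented; a real survey would follow the agreed country definitions
# of 'attendance' and of the official primary-school age band.

records = [
    {"age": 7,  "attends_primary": True,  "weight": 120.0},
    {"age": 9,  "attends_primary": False, "weight": 95.0},
    {"age": 11, "attends_primary": True,  "weight": 110.0},
    {"age": 13, "attends_primary": True,  "weight": 80.0},   # over the official age
]

OFFICIAL_AGE = range(6, 12)  # assumed official primary-school ages 6-11

def net_attendance_ratio(records):
    """Weighted share of official-age children currently attending primary school."""
    in_age = [r for r in records if r["age"] in OFFICIAL_AGE]
    denom = sum(r["weight"] for r in in_age)
    numer = sum(r["weight"] for r in in_age if r["attends_primary"])
    return 100.0 * numer / denom if denom else float("nan")

print(f"Estimated net attendance ratio: {net_attendance_ratio(records):.1f}%")

The arithmetic is trivial; what matters, as argued throughout this chapter, is country agreement on the definition of 'attendance', the official age band and the sampling weights.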

Here the second axis of the table could be either the level or mode of decision-making. We have chosen to take the level as the crucial dimension, as this usually determines the nature of possible participation. Here the entries in the cells are more in the nature of values relative to presumed targets for the sector (without falling into the trap of setting targets which will lead to falsification of data).

The table would have to be differentiated according to whether the focus of performance indicators was:

· the formal or informal system (CONTEXT);
· qualitative improvement or equity of access (AIMS);
· types of INPUTS and preferred PROCESSES;
· nature and timing of OUTPUTS and OUTCOMES.
This would, therefore, involve at least four tables according to whether the formal or non-formal system was being considered and whether the overall aim was qualitative improvement or equity of access. The variation between types of inputs and processes, and between kinds of outputs and outcomes (the latter closely connected with the AIMS) may not, however, be sufficient to generate substantially different tables.

Table 3: Planning Process and the Need for Performance Indicators at Different Levels


            | Central | Regional | District | Village | Household
Context     |         |          |          |         |
Aims        |         |          |          |         |
Inputs      |         |          |          |         |
Processes   |         |          |          |         |
Outputs     |         |          |          |         |
Outcomes    |         |          |          |         |

For the non-formal system, the distinctions made above in section 1.5 between different types of non-formal education and the corresponding types of performance indicators have to be taken into account, as well as the distinctions made in Table 3. Lockheed and Levin (1991) suggest that context should be specified in terms of facilitating conditions such as community involvement, school-based professionalism, flexibility, and the will to act as reflected in vision and decentralised solutions. They do not specify exactly how one is meant to measure any of these, and such measurement only seems feasible at the most local level. Their list of inputs, however, is probably as good a starting point as any: curriculum, instructional materials, time for learning, teaching practices (Lockheed and Levin, 1991).

Planning and Pre-Planning of Projects

The approach here will probably have to be rather different. Consider the list cited in section 1.6.1(2) above for the planning of projects:

What is the evidence of demand for the education service proposed? What will be the benefits to the country and individual? Can these improvements be measured and where possible quantified and qualified? Will there be cost economies resulting? Is the proposed strategy seen as the most cost effective? Are the recurrent cost implications manageable?

Recommendations: Each of these questions is likely to generate its own set of indicators - almost certainly too many for consistent judgements to be made - and each might well require a different set of data, which would be costly to collect. Inasmuch as the answers to these questions are agreed to be important criteria for choosing projects, perhaps the best approach is to propose what detailed specifications of indicators would have looked like for a number of projects - some funded, some not funded - in order to assess whether or not those criteria are actually taken seriously in deciding upon funding.

Project-Specific Performance Indicators

If we focus on the project, then it is appropriate to consider project-specific performance indicators (PSPIs). These are, in principle, straightforward measures of the extent to which a project completes the defined tasks: in the terminology of the logical framework, these are the 'purpose-level' indicators.

The issue is whether there is any scope for consistency in terms of the sets of performance indicators used in the logical frameworks for different types of projects (similar to the approach adopted by the Health and Population Division). This should be feasible for a large proportion of projects but the main problem is to assess exactly how these are used. At worst they may generate perverse incentives (as explained) or they may simply be ignored. In all cases, the importance of involving the 'beneficiaries' is crucial.

If we are considering a project then we need to be able to develop indicators for the context, aims, inputs, processes, outputs and outcomes (the same kind of list as in the logical framework) at different stages of the project cycle: distinguishing (at least) between pre-planning, start-up, mid-term evaluation and follow up. On this basis, an appropriate framework could be as follows:

Table 4: Performance Indicators at Different Stages of the Planning Process and of the Project Cycle


            | Preplanning | Startup | Mid-term | Evaluation | Follow-up
Context     |             |         |          |            |
Aims        |             |         |          |            |
Inputs      |             |         |          |            |
Processes   |             |         |          |            |
Outputs     |             |         |          |            |
Outcomes    |             |         |          |            |

Essentially, the approach here is an extension of the project framework methodology which, although it is likely to miss crucial process characteristics, is used to assess the whole project process from start-up through to evaluation. That methodology was never intended to be used at the pre-planning phase; and although it is recommended for use as a basis for follow-up evaluation, the context may have changed dramatically, so that it may not be appropriate. Moreover, not all of the basic components would be appropriate at each stage.

In any practical application, of course, one would need to explode each of the cells, in terms of specifying the level at which indicators are required and, eventually, the extent of participation in the process.

ANNEX 1A: World Bank (1996) Performance Monitoring Indicators: A Handbook for Task Managers (Operational Policy Department)


The handbook specifies the potential uses of performance indicators for:

STRATEGIC PLANNING. For any program or activity, from a development project to a sales plan, incorporating performance measurement into the design forces greater consideration of the critical assumptions that underlie the program's relationships and causal paths. Performance indicators thus help clarify the objectives and logic of the programme.

PERFORMANCE ACCOUNTING. Performance indicators can help inform resource allocation decisions if they are used to direct resources to the most successful activities and thereby promote the most efficient use of resources.

FORECASTING AND EARLY WARNING DURING PROGRAM IMPLEMENTATION. Measuring progress against indicators may point toward future performance, providing feedback that can be used for planning, identifying areas needing improvement, and suggesting what can be done.

MEASURING PROGRAM RESULTS. Good performance indicators measure what a program has achieved relative to its objectives, not just what it has completed; thus they promote accountability.

PROGRAM MARKETING AND PUBLIC RELATIONS. Performance indicators can be used to demonstrate program results to satisfy an external audience. Performance data can be used to communicate the value of program or project to elected officials and the public.

BENCHMARKING. Performance indicators can generate data against which to measure other projects or programs. They also provide a way to improve programs by learning from successes, identifying good performers, and learning from their experience to improve the performance of others.

QUALITY MANAGEMENT. Performance indicators can be used to measure customer (beneficiary) satisfaction, and thereby assess whether and how the program is improving their lives.

The handbook recognises that the performance indicators must be based on the unique objectives of individual projects; but also that they should be based on an underlying logical framework that links project objectives with project components and respective inputs, activities and outputs at different stages.

They then discuss a number of advantages and limitations of the Logical Framework and suggest a number of general principles for selecting indicators:

· relevance
· selectivity
· practicality of indicators, borrower ownership and data collection
· distinction between intermediate and leading indicators
· quantitative and qualitative indicators
There is then a description of the PMIs affecting the Bank's work at project identification, preparation/pre-appraisal, preparation/appraisal, implementation/supervision, supervision/completion, and after completion.

The following box illustrates the distinction between the performance information needs of differing levels of project management:


IMPLEMENTERS IN THE FIELD NEED

· input indicators
· output indicators
· [efficiency indicators]
· risk indicators
· some outcome and impact indicators
THE IMPLEMENTATION UNIT NEEDS
· summary input and output indicators, including site-comparative indicators as appropriate
· outcome indicators, including site-comparative indicators as appropriate
· [effectiveness indicators]
· risk indicators
· impact indicators
THE BORROWER AND THE BANK NEED
· summary input indicators
· summary output indicators
· risk indicators
· key outcome, impact [and relevance] indicators
· [sustainability indicators]

Note: Indicators in brackets are not a required part of Bank monitoring or project supervision.

But are they actually used in these ways?

ANNEX 1B : Problems of Measurement at the Sectoral Level: Examples of Indicators and their Associated Problems


The intention of this section is not to provide an exhaustive overview of what needs to be measured at the sectoral level. We have already explained that the detail of performance indicators has to be developed in conjunction with in-country representatives. Instead, the purpose is simply to draw attention to some problems of definition and data where the performance indicators are to be based on data from the entire system.8

Enrolment Ratios

The problem here is 'simple': the quality of the data. There are several aspects to both the numerator and the denominator (the conventional ratios are sketched after the list below):

· what is actually meant by enrolment: registration of the child (for whatever reason), appearance in class at the beginning of the school year (if that can be identified), regular (rather than sporadic) attendance, or inscription for (or sitting) the annual examination;

· the relevant population: the 'decennial' censuses in many developing countries are unreliable for a variety of reasons so that the estimated size of the relevant age group has to be treated with extreme caution (Murray, 1988).
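
As a point of reference only (the exact numerator and denominator depend on the choices just listed), the conventional gross and net enrolment ratios for a given level of schooling are:

\[
\mathrm{GER} = \frac{\text{pupils enrolled at the level, regardless of age}}{\text{population of the official age group for the level}} \times 100
\qquad
\mathrm{NER} = \frac{\text{enrolled pupils of official age for the level}}{\text{population of the official age group for the level}} \times 100
\]

Both ratios inherit any error in the census-based denominator, and the gross ratio can exceed 100 per cent where over-age and repeating pupils appear in the numerator.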

Attendance

No one suggests that this will be easy to monitor in developing countries. Yet, after enrolment, it is the next most important statistic because, without that data, we cannot sensibly assess what the enrolment figures mean in terms of childhood exposure to school. It might give some perverse comfort to know that this has also been a problem in developed countries. Ruby (1992) explains the difficulties of operationalising attendance as part of the OECD project set of indicators.

Measuring Quality

This is probably the most contentious area. Most of the heat has been generated around school effectiveness research because of the growth of achievement monitoring in order to identify "improved environments and educational aids which lead to detectable gains in knowledge, skills and values acquired by students" (Ross and Mahlck 1989).

In addition there are arguments over what is meant by 'quality' (e.g. Cheng, 1994) and over who should decide what is meant by quality (e.g. Hoppers, 1994; Stephens, 1991). Together, these would suggest that it is foolhardy to propose a system intended to be valid across all countries and systems.

Instead, it might be appropriate to consider the more cautious approach of the USAID research project on Improving Educational Quality (IEQ) (with offices in Mali, Ghana, Guatemala, South Africa and Uganda) with the following objectives:

· to understand the processes through which classroom interventions in different countries influence student performance;

· to demonstrate a process whereby classroom research on improving educational quality is integrated into the educational system;

· to create opportunities for dialogue and partnership among researchers and educators who are seeking to improve educational quality at local, regional, national and international levels.

In this way, the intention, presumably, is to develop sets of quality indicators that are consensually agreed at the country level. Whether this is feasible, and whether the conclusions of such groups actually do generate national consensus is unclear.

Disadvantaged groups

Both the Jomtien agenda and the ODA Education Strategy Paper (1993) highlight the importance of monitoring the situation of disadvantaged groups. Possible indicators are:

· participation and success rate of ethnic, religious or language minority students;

· number and status of teachers and administrators in the educational system from those groups;

· appropriate curriculum and textbook content;

· provision of teachers familiar with non-mainstream cultures; and

· linguistic information on teachers and students.

This is very obviously a case where no hard and fast suggestions can be made. It depends on the specific situations in each country or region. Introducing a term or concept from one country or region to another may lead to entirely inappropriate conclusions.

What Happens After School?

The ODA Education Strategy Paper (1993) points to the difficulties here of institutionalising verification and accountability mechanisms for the wider objectives and longer term outputs of projects9:

· on exit from education system, 'no one is responsible';
· the management and design of tracer studies is rarely specified clearly in project design;
· interest in longer term outputs/outcomes often diminishes.
From the societal point of view, this is the most important outcome; yet the 1999 Policy Framework Document is much vaguer, talking about strengthening capacity (p40) and rights and responsibilities (p33).

Decentralisation and Devolution

Indicators of the devolution of financial responsibility can include: number of distinct school systems; proportion of key education decisions that are made locally; existence of school boards, their methods of selection and financial mandates; percent of locally generated revenue that stays local.

In the health sector of Scandinavian countries, Mills suggests assessing the extent of decentralisation in terms of the following two sets of indicators:

· revenue raising in devolved systems (thus: percentage of public health care centrally funded; local authority tax powers; controls on local taxes; central sanctions if expenditure is exceeded; and the local right to take out loans); and

· planning controls in devolved systems (thus: the existence of a planning process linking levels; the initiating level; whether it is compulsory; and whether government approval is required).

Democratisation/Beneficiary Participation

The World Bank (1996) now argues for monitoring beneficiary participation to increase client investment in project success. Developing joint monitoring and evaluation systems works towards the Bank's goal of teaching new skills, but one must note the Bank's own caveat that such systems require: continuity of personnel from both government and donor agency; a network of supportive government personnel; avoidance of partisan politics; community leadership; and a sense of community and investment in project goals (Uphoff, 1992). Clearly, not an easy task.

ANNEX 1C: Collecting Data for Individual Performance Indicators


It is not clear exactly what should count as the final outcomes for individual pupils or for a group of pupils. As we have already emphasised several times, it depends on the original objectives of education in the first place. Schematically, one could distinguish between those who emphasise individual (educational) attainment, those who emphasise the kind of job or income obtained (and therefore the opportunity for social mobility), and those who emphasise the quality of life that people lead.

Conceptual Problems

The appropriate indicators of outcome or performance would, of course, be different in each case.

· income/jobs/social mobility: In principle, the indicators are based on the relation between years of schooling and estimated lifetime earnings or different job statuses (a schematic earnings function of this kind is sketched after this list). Putting to one side for the moment the well-known difficulties of collecting the data (see below), there is also the problem that such analyses assume both that there is an income or a job to go to, and that we can count the amount of education in terms of the number of years of schooling. The former problem is one of many concerned with interpreting such analyses, which have been dealt with thoroughly elsewhere (e.g. Hough 1990); but the latter has often been ignored (or assumed away, which amounts to the same thing). Yet we all know that a year of schooling in Sweden is something very different from a year of schooling in Zambia; we can account for that crudely by taking the cost of a year of schooling as the measure of resource input (rather than counting years), and the same procedure could be followed for non-formal education programmes. But, within a system, the quality of a year of schooling can vary enormously with only minor - or no discernible - variation in resource input, and there is no obvious way of adjusting for this apart from using another outcome measure (such as attainment levels), which would mean we could no longer calculate a ratio of labour market outcomes to resource costs. Basically, it means that data have to be collected on the quality as well as the quantity of schooling.

· quality of life: Although one has the same problem of counting years of schooling, the corresponding difficulty of the appropriate measure of outcomes is rather different and less attention has been paid to it in the literature. Part of the answer is a systematic attempt to measure the quality of life as illustrated in Part III.

· individual attainment: The indicators appear 'obvious' here, although the problems of interpretation are often underestimated. Even assuming agreement on the range of curriculum topics/subjects which should be the subject of measurement, the difficulty is in separating out background and school influences (the value-added problem). This is, of course, a contentious issue in developing countries; but in some respects separating out home and school effects is a simpler problem there, because the backgrounds of pupils/students are so similar. On the other hand, there is very little data.
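
As a purely schematic illustration of the income/jobs bullet above (the equation is the standard form used in the rate-of-return literature, not one drawn from this report), such analyses typically estimate an earnings function of the form

\[
\ln w_i = \alpha + \beta S_i + \gamma_1 X_i + \gamma_2 X_i^2 + \varepsilon_i
\]

where \(w_i\) is earnings, \(S_i\) years of schooling and \(X_i\) labour-market experience, with \(\beta\) read as the average private return to a year of schooling. The objection raised above applies directly: \(S_i\) treats every year of schooling as an equivalent unit, so either a cost-based measure of input or separate data on the quality of schooling is needed.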

Collecting the Data

In addition to these 'conceptual' problems, there are also considerable difficulties in collecting the data in developing countries. There are two main choices: following a cohort of pupils/students through from school to the labour market (sometimes called 'tracer' studies); or collecting retrospective data on a random sample of adults in the labour market (what could be called 'reverse tracer' studies). Each poses a different set of problems.

· tracer studies. The obvious problems here are the difficulty and expense of following up people for any length of time: keeping track of people's movements is complicated. However, this has the obvious advantage of collecting data at the time it happens (whether one is talking about the quality and quantity of schooling OR about the income/job).

· reverse tracer studies. These studies are, in principle, much cheaper and easier to organise, as they simply involve a questionnaire to adults about their school experiences. But, inasmuch as one believes that the quality of schooling is an important variable, it is obviously inadequate to rely on adult recall of their schooling experiences. This therefore entails identifying the schools that the adults attended and then retrieving the files on them (if they exist and can be found). This exercise is also tedious and time consuming, although not usually as much as the prospective approach. However, it is unlikely to be 100% successful because the files might have been mislaid or erratically filed or simply because they are incomplete. One cannot therefore rely on this procedure to provide good data on quality.

Footnotes
1. Although that was already being debated - see Pateman (1968). Subsequently the whole sub-discipline of ethno-mathematics has developed (see, for example, Bishop 1997).

2. Some of the arguments in this sub-section rely heavily on Smith (1993).

3. The increased interest in environmental and social responsibility accounting may signal a change in the private sector; see Hopkins 1999.

4. Perhaps the clearest example of the attempt to recognise the interplay between evidence and judgement is in the criminal law. Both prosecution and defence try to build up a convincing picture based on the presentation of evidence - to place before the jury. But this does not mean that lawyers or juries ignore the evidence: indeed it would be seen as rather silly to crusade for the use of evidence in criminal trials: the issue is how it is used.

5. Aiach and Carr-Hill (1989: 29) provide a concrete example of how countries vary in relating data to policy in respect of the debates over inequalities in health. They point to: the extent to which the political regime in power is prepared to recognise the problem; the extent to which the problem can be documented; prevailing views about causation; the particular form of system via which services are delivered; the economic and historical context; and the relative position and power of disadvantaged groups.

6. As Smith (1993) demonstrates, within the private sector the task is much easier because of the focus on financial inputs and outputs.

7. See, also, the last chapter of Carron and Ngoc Chau (1996).

8. Many of the problems described in this section can be overcome when a statistically representative sample is taken, because much more effort can be put into securing the quality of the data. The move towards decentralisation, however, implies collecting data on all those in a smaller unit.

9. Indeed, according to a DFID Internal Document (December 1995, para. 23): "Experience suggests that it is not worth spending too much time trying to identify indicators at the level of wider objectives, as these are unlikely to be very relevant to the project itself. If necessary, the wider objectives box for indicators can be left blank!" (author's exclamation mark).

