2.2.1 School achievement studies
2.2.2 Recent methodological developments
2.2.3 Some results from the IEA science studies
2.2.4 A note on the effective schools literature
There are now a substantial number of studies on the factors that affect school achievement. The earliest examples were conducted in developed countries and established the importance of socio-economic background as a determinant of the performance of students in different types of school (Plowden 1967, Coleman et al 1966, Jencks 1972) and began to analyse the reasons for this. These studies were accompanied by a developing literature which offered a critique of the methods used and which also became entangled in the debates about nature and nurture in the development of intelligence, school achievement and subsequent success in the labour market (Bowles and Gintis 1976, Little 1975). Much of the concern was to explore to what extent meritocracies functioned as such and to what extent educational achievement behaved as an intervening variable explaining why, in these societies, the socio-economic status of children continued to be linked closely to their parents' socio-economic status (Halsey 1977). These studies tended to show that school factors were less important determinants of scholastic success than home background factors. It was, however, misleading to draw the conclusion, as some popularisers did, that this implied that not much of importance went on in schools. It was differences that were being studied, not absolute effects - as a weary commentator observed, "students don't imagine algebra". Neither do most of them independently establish Newton's laws of motion.
Subsequently studies have appeared which extend the analysis and demonstrate that within-school factors may be more important than previously supposed (e.g. Rutter et al 1983, Mortimore et al 1988). The studies also began to be applied to data from developing countries and suggested that school effects might be even more important there than in developed countries. Thus Heyneman and Loxley's (1983) study of science achievement in 16 developing and 13 industrialised countries examined a range of school variables and regressed science achievement scores against them. This study found relatively little variance explained by school factors in the industrialised countries but much larger amounts explained in the developing countries (27% of the variance in achievement explained by school quality among Indian children and only 3% by social class; 25% by school quality in Thailand and only 6% by social class). However, the total variance explained in the cases studied was typically around the 20-30% level, leaving much that was not explained. School effects seemed more important in subjects like science where systematic study is generally only possible in schools.
The literature that now exists on school effectiveness is difficult to summarise. More than 50 multivariate or experimental school effect studies relating to developing countries had been undertaken by 1987 and were reviewed by Fuller (1987) (see Appendix for a summary table). The studies are methodologically diverse, vary in terms of the specification of the dependent and independent variables, use a range of sampling techniques, and have been undertaken in very differently structured education systems. The problems include the cross-cultural transferability of notions like social class; the difficulties in specifying the dependent variable, achievement; the realisation that often large parts of the total variance remain unexplained after the effects of independent variables have been accounted for; and the rarity of studies that are capable of controlling for the entry characteristics of students. Not surprisingly a universally applicable set of conclusions is elusive. Synthetic reviews like Fuller's (1987) and that by Schiefelbein and Simmons (1981) have difficulties of aggregation that make it difficult to decide what importance to give to findings that appear true in some systems and not in others. What can we conclude from the 11 analyses cited of school expenditure and achievement, six of which confirm a positive relationship and five of which do not? Or from Hanushek's review of more than 150 studies which concluded that there was no systematic relationship between expenditures and student achievement; or that attitudes and drop-out rates, reduced class sizes and more trained teachers were also unlikely to make much difference to achievement (Hanushek 1986)? Common sense suggests that, in the limiting cases, levels of expenditure must have some relationship to achievement; nevertheless expenditure is unlikely to be a sufficient condition alone.
One of the earlier studies (Thias and Carnoy 1973) concluded that there was no relationship between expenditure per pupil and achievement at primary level but that there was at the secondary level. The latter was such that they claimed raising national examination scores by 5% would require a 50% increase in expenditures per pupil. This illustrates one of the limitations of this kind of analysis. There are many ways of increasing achievement and each will have a different cost structure. Simply redistributing existing resources towards the least favoured schools (which would have little or no direct cost) is likely to have a much bigger effect on achievement in those underperforming schools than would distributing increased resources evenly to include those which already enjoy surpluses of qualified teachers, textbooks and other learning materials. The incremental rate of return on investment to raise achievement in schools which have no books or facilities will be much greater than that on similar inputs to well-funded institutions.
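The arithmetic behind the Thias and Carnoy claim can be made explicit. The sketch below is illustrative only: it takes the two percentages quoted above and computes the implied ratio of score gain to spending increase. The constant-ratio extrapolation, the function and all variable names are assumptions introduced here and are not part of the original study.

```python
# Figures quoted in the text (Thias and Carnoy 1973, secondary level).
score_gain_pct = 5.0       # rise in national examination scores
spending_rise_pct = 50.0   # rise in expenditure per pupil said to be required

# Implied responsiveness: % score gain per 1% rise in spending per pupil.
implied_ratio = score_gain_pct / spending_rise_pct


def projected_gain(spending_rise, ratio=implied_ratio):
    """Projected score gain (%), ASSUMING the ratio stays constant.

    The constant ratio is a strong simplifying assumption made for
    illustration; it ignores where the extra money is spent.
    """
    return ratio * spending_rise


print(f"Implied ratio: {implied_ratio:.2f}")
print(f"Projected gain from a 10% spending rise: {projected_gain(10.0):.1f}%")
```

A single ratio of this kind says nothing about which schools receive the additional resources, which is exactly the limitation described in the paragraph above.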
Another example of the difficulties that arise in taking a macro view of school effectiveness studies can be illustrated by the well-publicised literature on the effect of textbooks on school achievement. Fuller's review indicates that 16 out of 24 studies show positive effects of the availability of texts and reading materials on achievement. Though the variable of interest was textbooks per student, the other independent variables that were controlled for in the studies varied. But perhaps more important is the lack of insight into whether the studies related to the first pieces of reading material available or to additions to an existing stock (though one study (Heyneman, Jamison and Montenegro 1983) does show no gains resulting from a change in the pupil/book ratio from 2:1 to 1:1). Neither is the qualitative relationship explored between the types of reading material and the demands of the tests used to measure achievement - it is tempting to ask whether comics have the same effect as academic reading materials. The improvement in achievement attributable to book provision in the Philippines in one of these studies (Heyneman, Jamison and Montenegro 1983) is argued to be twice the impact that would be gained by lowering class size from 40 to 10 students. But this finding uses evidence from an experimental study in the Philippines and data on class size effects from the U.S.A., which presumes that the range of variation in class size considered and the teaching methods are indistinguishable between the two systems.
Forewarned of some of the analytical pitfalls, it is worth turning to some more of the findings. Fuller and Heyneman (1989) have attempted to identify effective and ineffective factors that influence school achievement, reducing their earlier list of 27 factors to a more manageable 9. These turn out to be:
Effective Parameters | % of Studies Showing Positive Effects |
Length of Instructional Programme | 86 |
Pupil Feeding Programmes | 83 |
School Library Activity | 83 |
Years of Teacher Training | 71 |
Textbooks and Instructional Materials | 67 |
Ineffective Parameters | |
Pupil Grade Repetition | 20 |
Reduced Class Size | 24 |
Teachers' Salaries | 36 |
Science Laboratories | 36 |
Some comment on each is in order. Repetition appears not to improve achievement in most of these studies. Since in most systems repetition implies just that - repeating the same material, often with the same teacher and without any special treatment, and thus repeating an experience that led to failure before - this is perhaps not surprising, if discouraging. In many systems repeaters have a disproportionate tendency to drop out. Paradoxically, if repetition were reduced it might be expected to reduce average achievement levels if it meant that larger proportions of lower ability children proceeded to higher grades. Reducing repetition where it is high (repetition rates in Sub-Saharan Africa are often 15% or more at primary level; World Bank 1988:136) is really a priority for other reasons. High repetition rates represent a serious source of inefficiency which increases the unit costs of graduates from a particular cycle and fills places with repeaters that might otherwise be occupied by those currently unenrolled. Though average levels of achievement might deteriorate as a result of enrolling more lower ability students, across the age group as a whole (including those currently unenrolled) achievement would rise and unit costs would fall.
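The effect of repetition on the unit cost of a graduate can be sketched with a simple calculation. The example below is a rough illustration rather than an established costing model: it assumes the same repetition rate in every grade, unlimited repeats and no drop-out, so that the expected time spent in one grade is 1 / (1 - repetition rate). The six-grade cycle and the function name are hypothetical.

```python
def pupil_years_per_graduate(grades, repetition_rate):
    """Expected pupil-years needed for one pupil to complete a cycle.

    ASSUMPTIONS (illustrative only): the same repetition rate applies in
    every grade, pupils may repeat any number of times, and nobody drops
    out. The expected time in one grade is then 1 / (1 - repetition_rate).
    """
    if not 0 <= repetition_rate < 1:
        raise ValueError("repetition_rate must be in [0, 1)")
    return grades / (1.0 - repetition_rate)


GRADES = 6  # hypothetical six-grade primary cycle

for rate in (0.0, 0.15, 0.30):
    years = pupil_years_per_graduate(GRADES, rate)
    extra_cost = years / GRADES - 1.0  # proportional rise in cost per graduate
    print(f"repetition {rate:.0%}: {years:.2f} pupil-years per graduate, "
          f"unit cost up {extra_cost:.0%}")
```

Under these assumptions a 15% repetition rate adds roughly one extra pupil-year to a six-grade cycle; allowing for drop-out among repeaters would raise the cost per graduate further.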
It may be that within a wide band achievement is not related to class size, but this does not mean there are no limits. Physical constraints (classroom size) and accepted traditions place limits on what is acceptable. In most countries classrooms are not built to accommodate more than about 50 students in comfort, and class sizes in excess of this result in overspill onto verandas without appropriate furniture. One of the highest scoring IEA science study countries (Korea) has large class sizes averaging over 60. But this is in a relatively well resourced system and one where learning motivation is high and disciplined study is part of the cultural heritage. Intuitively the significant factor is the point at which teaching practices change: practical group work is difficult when class sizes exceed 40, though it may not be practised below this number. A lecture is likely to be as effective with 80 students as with 20 if the space is available. And class size probably does interact with other variables - if textbooks per pupil are correlated with achievement, large class sizes in situations where there is a fixed stock of books (a realistic assumption in a rapidly expanding system) will probably diminish the correlation.
Teachers' salaries, at the level of individual teachers, are unlikely to be directly related to achievement for the simple reason that achievement is unlikely to be the result of the teaching of a single teacher - students will experience several teachers over their careers in school. Moreover this finding cannot lead to the conclusion that paying teachers better is unlikely to have an effect on achievement - it may be that the most effective teachers do not get the highest rewards; it may be that all teachers are paid so poorly that whatever variation exists is not reflected in performance; it may also be that incomes are so low they fail to provide motivation to all but the most dedicated. Given the fairly universal belief that incomes should be related to effectiveness, the challenge is to change the reward structure so that they are.
The apparent ineffectiveness of the level of laboratory provision may well reflect the nature of science achievement tests. If these do not test the skills developed in laboratories (which frequently they do not) it should surprise no-one that laboratory provision does not have a large impact on achievement measured through pencil and paper tests. These often emphasise recall and the abstract application of principles. The reasons for incorporating practical work in science have been thoroughly explored by Haddad and Za'rour (1986), who argue its benefits whilst recognising the difficulty of measuring its impact. The First IEA Science Study noted positive effects of reported laboratory use in three of the four developing countries in its sample, but Heyneman and Loxley's 1983 study found no such effect. Lockheed, Fonacier and Bianchi (1989) did find positive effects arising from teaching primary science in laboratories in the Philippines but note that the magnitude of this was much less than the effect of frequent group work and of frequent testing. The most recent IEA study (Postlethwaite and Wiley 1992) is complex to interpret on the subject of practical work and achievement; it does suggest that where students' views of teaching indicate that more practical work probably takes place, there is a positive relationship with achievement in five out of nine cases. The weight of opinion seems to lie with those who are sceptical about the measurable benefits of laboratory science for achievement as conventionally measured, and who stress its high costs (Walberg 1991).
On the positive side, length of time spent on instruction is reported widely as having an impact on achievement. Heyneman and Loxley (1983) note this in relation to general science in India, Thailand and Iran. Fuller (1987) counts 12 out of 14 analyses supporting this proposition. There are wide variations between countries in the amount of time allocated to teaching. Science instructional time varies by a factor of more than two in the IEA Second International Science Study, as does the length of the school year. And actual variations will be greater still. In some countries many of the official teaching days are not utilised for their intended purposes as a result of teacher absenteeism, school functions, excessive examination practice, natural events and casual holidays. The more time allocated to instruction the more is likely to be learned, but there is no reason to suppose that the relationship is linear.
Feeding programmes are an established way of enhancing enrolment and increasing retention. They are, however, often very expensive and may reduce teaching time if teachers are involved in the preparation of food. School libraries also appear to be associated with improved achievement, though there is very little data on patterns of use. Since libraries tend to be found in better resourced schools with more favourable learning conditions and better qualified staff, it may be that library resources alone are not strongly associated with achievement.
Pre-service teacher qualifications and training do show up in many studies as positively related to achievement. The magnitudes of the effects are often moderate, however. The effects are difficult to measure - since children experience different teachers, should recent training be given the same weight as training ten years ago? - and the number of years of schooling completed before training may be at least as important as the training itself. One recent study (Lockheed et al 1986) suggests that other inputs, i.e. textbooks, can be substituted for additional training since textbook use and training did not interact in their data and the effects of textbooks were greater. Very little evidence exists on the effectiveness of in-service training. Those studies that do exist are generally positive but often have no means of separating the effects of training from the traits of the teachers who choose to take advantage of it. Those involved are generally the more motivated and skilled in the first place. Of course it is also likely to matter what teachers are being trained to do and what kind of students they are teaching, though this too is largely unresearched.
There is little doubt on the margin that textbooks do have a major impact on achievement in most subjects, and probably the more so in science and other subjects which depend on special school based resources. Unfortunately, beyond the level of the existence of textbooks in reasonable quantities there is little research to indicate at what point additional written material ceases to have an effect (the Philippines study mentioned above is an exception), or what the relative impact is of different types of material: teachers' guides, student texts, worksheets, reference books. And every teacher has opinions, often well founded, about "good" and "bad" textbooks. Textbooks are not all the same - some have inappropriate reading levels, some are poorly structured, some contain factual errors, some are produced with poor quality and uninteresting design, and some contain heavy gender stereotyping.
Riddell (1989) has recently offered a critique of the school achievement studies literature. This argues that the first wave of studies in the 1960s in developed countries made considerable use of production function like models used by economists in specifying variables. The second wave placed more emphasis on process variables - e.g. teaching styles - and the educational rather than statistical significance of findings. A third wave is now developing which uses multi-level modelling techniques that can accommodate the self-evidently hierarchical nature of data on school systems (students learn in classes which are part of schools which are part of districts and national systems). Most school achievement studies in developing countries, she argues, have been undertaken using the methods of the first wave. These have limitations, not least a hazardous reliance on a particular statistical procedure to define the proportion of variance associated with different variables, which leaves much variance unexplained and is unable to account adequately for variance arising from different levels or, for example, for the effects of selection. As a result it may be that the differences attributed to school rather than individual and home background factors have been exaggerated when comparisons are made between developed and developing countries. Heyneman (1989) defends the use of the analytical techniques of the 1970s (predominantly ordinary least squares) since multi-level modelling was not available at the time. He also argues that the refinements are welcome but are unlikely to change the nature of the challenge of raising the availability and quality of school inputs and distributing them more fairly.
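Why the hierarchical structure of the data matters can be illustrated with a simple variance decomposition. The sketch below is not a multi-level model and is not drawn from any of the studies cited; it uses a one-way analysis of variance on made-up scores for three hypothetical schools to estimate how much of the variation in scores lies between schools rather than between students within them (the intraclass correlation). A balanced design is assumed.

```python
from statistics import mean

# Made-up test scores for three hypothetical schools (equal group sizes).
scores_by_school = {
    "School A": [40, 50, 60],
    "School B": [60, 70, 80],
    "School C": [50, 60, 70],
}


def intraclass_correlation(groups):
    """One-way ANOVA estimate of the share of variance lying between groups.

    Assumes a balanced design (every group has the same number of students).
    """
    data = list(groups.values())
    n_groups = len(data)
    n_per_group = len(data[0])
    if any(len(g) != n_per_group for g in data):
        raise ValueError("balanced design required")

    grand_mean = mean([x for g in data for x in g])
    group_means = [mean(g) for g in data]

    # Sums of squares between and within groups.
    ss_between = n_per_group * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum((x - m) ** 2 for g, m in zip(data, group_means) for x in g)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_groups * (n_per_group - 1))

    # Between-group variance component, truncated at zero if negative.
    var_between = max((ms_between - ms_within) / n_per_group, 0.0)
    return var_between / (var_between + ms_within)


icc = intraclass_correlation(scores_by_school)
print(f"Share of score variance between schools: {icc:.2f}")
```

When this share is appreciable, students in the same school cannot be treated as independent observations, which is one reason single-level regression can misattribute variance between school and individual factors.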
The Second International Science Study of the IEA justifies a brief review since it addresses a learning area frequently recognised as central to human resource development policy. This study includes data from 24 countries of which China, Ghana, Nigeria, Papua New Guinea, Thailand, the Philippines, and Zimbabwe can be clearly located as developing countries. Hong Kong, Singapore, and Korea might also be classified in this way albeit that their economic development has reached a different level. The IEA studies have been conducted on three populations, broadly speaking 10 year olds (Population 1), 14 year olds (Population 2) and those in the last year of schooling before university entrance (Population 3).
The interpretation of the IEA findings is very complex and it is only possible to draw attention to some of the main findings here. These are tentative since variations in the data sets are important for any comparison between countries and all of the overall findings need contextualising in more detail than can be provided. Moreover what may be true in the lowest scoring developing countries as a group is often not true in the other developing countries. With all these caveats some of the main findings are described below.
Ghana, Nigeria, the Philippines and Zimbabwe have the lowest total scores on the science tests. This is true in aggregate and in different subject areas. Other developing countries - e.g. Papua New Guinea, Thailand and China - have means that are comparable with industrialised countries like England and the U.S.A. Hungary and Japan score consistently well above most other countries. In general there is a high inter-correlation between scores at the population 2 level and those for population 1, suggesting that low performance is compounded through the system. There are considerable changes in the ranking of mean scores by country at the population 3 level, which are heavily influenced by the selection practices of different countries which, in some cases, concentrate resources on the most able science students.
In general the proportion of schools scoring below the lowest school in the highest scoring country (Hungary) was high in the low scoring developing countries in the population 2 sample (Ghana 64%, Nigeria 88%, Philippines 87%, Zimbabwe 80%). In these countries the performance of the lowest 20% of students tested indicates that they have learned very little science. This is particularly worrying when it is realised that the Nigerian students were from a higher grade than in other countries, and the Ghanaian students were from selective and elite schools. The IEA data suggests that the bottom 20% of students in Ghana, Italy (Grade 8), Nigeria, the Philippines and Zimbabwe are "scientifically illiterate". Interestingly England, Hong Kong, Singapore and the USA are borderline cases. Indeed the USA has a higher proportion of schools scoring below the worst school in Hungary than does Thailand.
Of particular interest is the finding that the teaching group or class that pupils are in is of considerable significance to the scores that they achieve in some countries but not others. This effect is particularly prominent in Ghana, the Philippines, Italy and the Netherlands. By contrast in Japan and the Nordic countries the effect is very low indeed at the population 2 level. This changes dramatically in Japan at the population 3 level, probably because of the increase in the proportion of private institutions at this level. One of the implications of this appears to be that in some countries differences between schools are considerable and it does matter a great deal, in terms of their achievement, in which school or class students study science. In other countries, school and class effects are much smaller and have much less influence on achievement. This is not simply a function of resource levels; rather it seems to depend more on selection and streaming practices and organisational features of education systems.
The IEA authors have developed a yield coefficient that modifies the distribution of scores by the proportion of the age group in school. This is intended to indicate how many children know how much science. It highlights differences between countries and shows that yield coefficients tend to be much lower in those countries with the lowest proportions in school, which are mainly the developing countries at population 2 level. This raises a dilemma for countries with low yields wishing to improve them: should numbers enrolled be increased, or should low levels of achievement be improved first?
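The dilemma can be made concrete with a toy calculation. The IEA formula is not given here, so the sketch below substitutes a deliberately simple stand-in, mean score multiplied by the proportion of the age group in school; the function, the scores and the enrolment figures are all hypothetical and chosen only to show how the two policy options trade off.

```python
def yield_index(mean_score, proportion_in_school):
    """Hypothetical yield measure: mean score weighted by enrolment share.

    This is NOT the IEA yield coefficient, whose formula is not given in
    the text; it is a stand-in for 'how many children know how much'.
    """
    if not 0 <= proportion_in_school <= 1:
        raise ValueError("proportion_in_school must be in [0, 1]")
    return mean_score * proportion_in_school


# Invented figures for a low-enrolment system.
baseline = yield_index(mean_score=40.0, proportion_in_school=0.30)

# Option 1: enrol more of the age group, mean score assumed unchanged.
more_enrolled = yield_index(mean_score=40.0, proportion_in_school=0.45)

# Option 2: raise achievement among those already enrolled.
higher_scores = yield_index(mean_score=50.0, proportion_in_school=0.30)

print(f"baseline yield:      {baseline:.1f}")
print(f"wider enrolment:     {more_enrolled:.1f}")
print(f"higher achievement:  {higher_scores:.1f}")
```

In practice wider enrolment may lower the mean score of those in school, so the first option is likely to be flattered by holding the mean constant; the point of the sketch is only that both routes raise a yield measure of this kind and that their relative costs, not shown here, decide between them.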
At population 3 level in the IEA data inter-country comparisons are even more hazardous than they are at population 2 level. There are wide disparities in the percentage of the age group studying at this level (from 1% in Ghana and Papua New Guinea to 89% in Japan). The average age of this population spans 23 months. There was a 2 year difference in the grade level at which the tests were applied. The average number of subjects studied varied from 3 to 9 or more, with concomitant variations in the time spent on science.
Generally England, Singapore, Hong Kong and Hungary have the highest scores in population 3, with some variations between subjects. These countries also have small numbers enrolled and highly specialised curricula. In general the IEA found no relationship between the proportion studying science and the achievement of elite students, defined as the top 3% of the age group. There was no significant tendency for the number of subjects studied to influence science achievement except in Chemistry. Positive age effects were noted, with older students scoring better.
The IEA study demonstrates that sex differences greatly favoured boys in the countries with the lowest overall scores (a category including many developing countries) in terms of the performance of both the bottom 20% and the top 20%. Though in Hungary sex differences were minimal, in Japan, the other high scoring country, boys outperformed girls consistently at all levels of ability. Typically sex differences in performance are greatest in physics and least in chemistry.
The most recent collection of studies on effective schools is that by Levin and Lockheed (1991). This work includes case studies on effective schools and reforms that have promoted their development. It also reviews data from recent studies using the multilevel statistical techniques referred to earlier. The overview offered identifies necessary inputs and facilitating conditions that seem to be related to effective schools.
On the input side four critical aspects are identified. First, curriculum relevance, content and sequencing are seen as essential but often not adequately provided. Second, the availability of instructional materials is stressed as central to effective learning. Successful schools almost invariably seem to provide sufficient instructional materials for students, and high achievement usually correlates with textbook availability. Third, the time available for learning is identified as significant. Successful schools appear to ensure that greater proportions of the time allocated to learning are occupied with learning activity, and increases in learning time generally bring learning gains. Fourth, they argue there is some evidence that the more learners are actively involved in the learning process the more likely it is that this learning is successful.
Facilitating conditions are delineated as including first a level of community involvement which may take many forms from additional resources supplied by the community, to contributions the school makes to the life of the community and direct parental involvement. Second the professionalism of schools is identified as important. This is associated with effective leadership, teacher commitment and competence and adequate accountability. Third, flexible approaches to organisation and teaching and learning are identified. This includes the ability to adjust curricula and organisational arrangements to reflect local conditions and adapt teaching methods to suit different groups of children.
The individual case studies draw attention to what can be learnt from experience with projects in a range of countries that include Thailand, Nepal, India, Colombia, Brazil, Sri Lanka and Burundi. In some cases, for example the Sri Lanka case study, there are illustrations of how relatively small inputs into quality improvement programmes appear to result in large gains in achievement and participation. Others argue the importance of ensuring political will exists to improve school conditions and performance and that the benefits of improved educational access and quality must be expressed in terms which offer gains to those in power as well as those on the margins if they are to be reflected in quality improvement programmes.
Though it is sometimes difficult to untangle those findings that have general utility and those that are specific to particular circumstances the school achievement literature provides a lot of food for thought about the relative importance of different types of intervention to improve school quality. It encourages clearer definitions of the attributes of "good" schools and the kind of achievement that is valued. It also focuses attention towards those inputs and processes that are manipulable through education policy and those whose locus of control lies elsewhere. Finally it can illustrate gaps between policy intentions and actual outcomes, thus drawing attention to implementation problems.
What this kind of analysis cannot do should not be required of it. It cannot generate policy prescriptions across widely differing countries and education systems that do more than point the way towards worthwhile possibilities that need exploration and validation at the intra country level. It is here that studies can provide the most reliable guidance for assistance targeted on areas where it will have the most impact. At this level one of the central questions of the school achievement literature invites inversion. It is not so much a question of what makes a good school - good schools are self evidently not the problem. It is more a question of why are some schools, often with apparently similar resource endowments, judged inferior to others and how can their performance be improved at replicable levels of cost?