I was struck by a recent post by Greg Ashman about a plan to use computers to mark literature papers. I fully agree with his assessment of the situation, but in the context of this post by Michael Fordham (about the unteachability of skills) and this post by Ben Newmark (about the purpose of education for children not simply being about becoming successful adults) it sparked in me a little thought experiment about the use of computers and AI which I believe can shed light on the nature of education.
My argument here attempts to elucidate the limitations of computers in marking, and then to hold these limitations up as a mirror to better understand the nature of literature and education. My conclusions further explicate the non-instrumental nature of education and the ideas of educational fideism (explained here). Furthermore, they lead to an expression of an ultimate ethical principle (which solves the problem posed here).
III. Rule-based computing
Basic rule-based computing can be imagined as follows:
An input is fed through a rule (or algorithm) and an output comes out. This can be imagined as a mechanical operation.
In terms of marking, this is analogous to marking according to a ‘rubric’, or a checklist. The problem with this is, as Greg Ashman points out, that these rules will never fully be able to describe all of the nuances that make up the judgment of what counts as an ‘A’, a ‘B’ and so on. It also leads to teachers teaching to the rubric, as opposed to the notion of ‘good literature’.
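To make the mechanical picture concrete, here is a minimal sketch in Python of marking by checklist. The three rubric rules, the marks attached to them and the `mark` function are hypothetical, invented purely for illustration; no real mark scheme is this crude, but the structure (input, rule, output) is the same.

```python
import re

# Hypothetical rubric: each rule is (description, check, marks awarded).
RUBRIC = [
    ("mentions a theme", lambda text: "theme" in text.lower(), 2),
    ("uses a quotation", lambda text: '"' in text, 2),
    ("has at least three sentences",
     lambda text: len(re.findall(r"[.!?]", text)) >= 3, 1),
]

def mark(text):
    """Feed the input through every rule and add up the marks awarded."""
    return sum(marks for _, check, marks in RUBRIC if check(text))

# An answer written to the checklist, and one that is not.
gamed = 'Theme. "Quote". Theme.'
thoughtful = "Lear learns too late that love cannot be measured in words"

print(mark(gamed), mark(thoughtful))
```

The answer written to the checklist collects every available mark, while the more interesting sentence collects none, which is the ‘teaching to the rubric’ problem in miniature.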
Even if one were to program in all of the rules of grammar, including all of the exceptions, it is inevitable that such a program will be incomplete (for reasons that we shall soon see).
The next stage of computer programs is those that ‘learn’. These work in the following way:
The computer is fed with numerous examples of work that are ordered (or weighted) from ‘better’ to ‘worse’. The computer then extracts correlations between those pieces of work that it can then use to create new ‘rules’ that then enable it to grade other pieces of work.
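As a rough sketch of what ‘extracting correlations’ from graded examples might look like, the Python below gives each word the average grade of the scripts it appeared in, then grades a new script by averaging the weights of its known words. This is a deliberately naive stand-in for a real learning algorithm; the function names, the three sample scripts and their grades are all assumptions made up for the example.

```python
from collections import defaultdict

def learn_weights(graded_examples):
    """graded_examples: list of (text, grade) pairs.
    Returns a dict mapping each word to the mean grade of the texts containing it."""
    totals, counts = defaultdict(float), defaultdict(int)
    for text, grade in graded_examples:
        for word in set(text.lower().split()):
            totals[word] += grade
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}

def predict(weights, text, default=0.0):
    """Grade new work as the mean weight of the words the program has seen before."""
    known = [weights[w] for w in set(text.lower().split()) if w in weights]
    return sum(known) / len(known) if known else default

# Hypothetical sample, ordered from 'better' to 'worse' by a human marker.
sample = [
    ("the storm mirrors his madness", 9),
    ("the storm is big", 3),
    ("he is sad", 2),
]
weights = learn_weights(sample)

print(predict(weights, "madness mirrors the storm"))  # 7.5
print(predict(weights, "he is big"))                  # 2.5
```

The ‘rules’ here are nothing more than the `weights` table, and every entry in it comes from the human grades in the sample; a script using only words the program has never seen falls back to the default.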
Deep-neural-networks are the next stage up from that. These networks have a hierarchy of information, from more detailed to more general.
This method was used for AlphaGo, a program written by Google’s DeepMind to play the extraordinarily complicated game of Go. Given the complexity of Go, and the volume of possible playing positions, the computing power needed to build correlations between playing positions and ‘better’ (or winning) positions would have been inordinate, if not impossible. Thus, many thousands of examples of games were fed into the computer. The computer then used a hierarchy of networks: at the bottom were those networks which looked at smaller, specific areas of the board; these were then fed into networks which looked at larger areas, and so on. From there it established correlations which then allowed it to establish which moves were more likely to ‘win’.
In the marking world, the use of neural networks, and deep-neural-networks is analogous to comparative marking. In comparative marking, the decision is simply ‘which piece of work is better’. In the future, a company like NoMoreMarking will have enough stored examples to feed into a deep-neural-network to enable automatic grading. It would (I would have thought) need to be a deep network because of the billions of possibilities of correct answers for literature papers.
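Judgements of the form ‘which piece of work is better’ can be turned into a ranking in several ways. The sketch below uses a Bradley-Terry style model fitted with a simple iterative update. That choice is an assumption made for illustration; it is not a description of how NoMoreMarking works, nor of a neural network. The script labels and the `bradley_terry` function are hypothetical.

```python
def bradley_terry(judgements, iterations=100):
    """judgements: list of (winner, loser) pairs from human comparisons.
    Returns a dict mapping each script to a strength; strengths sum to 1."""
    wins, pair_counts, items = {}, {}, set()
    for winner, loser in judgements:
        wins[winner] = wins.get(winner, 0) + 1
        pair = frozenset((winner, loser))
        pair_counts[pair] = pair_counts.get(pair, 0) + 1
        items.update((winner, loser))

    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            # Only pairs that were actually compared contribute to the update.
            denom = sum(
                pair_counts[frozenset((i, j))] / (strength[i] + strength[j])
                for j in items
                if j != i and frozenset((i, j)) in pair_counts
            )
            updated[i] = wins.get(i, 0) / denom if denom else strength[i]
        total = sum(updated.values())
        strength = {item: value / total for item, value in updated.items()}
    return strength

# Hypothetical judgements: each pair of scripts was compared three times.
judgements = [
    ("A", "B"), ("A", "B"), ("B", "A"),
    ("B", "C"), ("B", "C"), ("C", "B"),
    ("A", "C"), ("A", "C"), ("C", "A"),
]
strength = bradley_terry(judgements)
ranking = sorted(strength, key=strength.get, reverse=True)
print(ranking)  # ['A', 'B', 'C']
```

Note that the only input is a pile of human decisions about which script ‘wins’; the arithmetic merely makes those decisions consistent with one another.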
The great advantage of using computers here is that they are able to recognise and enumerate rules that we may follow, but be unconscious of.
Would this be a viable method of computer marking? It would certainly be more successful than simple ‘rule-based programming’. However, there is one concern: any such neural network is only as good as the information that has been fed into it, i.e. a neural network ‘learns’ these correlations from the sample that it is given – the sample provides it with its definition of ‘better/worse’. One could imagine a time, for example, when the works of great literature are fed into such a marking program with each awarded a ‘weight’ of some kind. The program could even take into account the more ‘jazzy’ writings, for example of James Joyce, which break many of the rules that one would program into a rubric-following marking program. The major issue with this is that the computers then simply repeat the biases of the sample – whether it be towards white men, or towards one particular canon. To guard against such stagnation and repetition of bias, it would be necessary for a computer to work from an entirely ‘blank slate’ or ‘tabula rasa’.
V. ‘Tabula Rasa’ computing.
This week saw the news of a new kind of Go-playing computer – AlphaGo Zero. AlphaGo Zero started from an entirely blank slate (with only the rules of the game programmed in) and taught itself by playing against itself. Within a terrifyingly short time, it had outsmarted its predecessors. Some of the methods that it uses have been described as completely ‘alien’ to Go players – i.e. the logic of its moves is beyond human understanding.
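The idea of learning from a blank slate by self-play can be sketched on a toy game. The Python below is emphatically not AlphaGo Zero’s method (which combines deep networks with tree search); it is a simple table of running averages, and the game (take one, two or three stones, whoever takes the last stone wins), the function names and the parameter values are all assumptions chosen for illustration.

```python
import random

def legal_moves(pile):
    # The only knowledge programmed in: the rules of the toy game.
    return [n for n in (1, 2, 3) if n <= pile]

def self_play(episodes=2000, start_pile=10, explore=0.2, seed=0):
    """Play the game against itself; return (pile, move) -> average outcome."""
    rng = random.Random(seed)
    value, count = {}, {}
    for _ in range(episodes):
        pile, history = start_pile, []
        while pile > 0:
            moves = legal_moves(pile)
            if rng.random() < explore:
                move = rng.choice(moves)  # occasionally try something else
            else:
                move = max(moves, key=lambda m: value.get((pile, m), 0.0))
            history.append((pile, move))
            pile -= move
        # Whoever took the last stone has won: score +1 for their moves
        # and -1 for the opponent's, alternating back through the game.
        outcome = 1.0
        for key in reversed(history):
            count[key] = count.get(key, 0) + 1
            old = value.get(key, 0.0)
            value[key] = old + (outcome - old) / count[key]
            outcome = -outcome
    return value

def best_move(value, pile):
    return max(legal_moves(pile), key=lambda m: value.get((pile, m), 0.0))

learned = self_play()
print({pile: best_move(learned, pile) for pile in range(1, 11)})
```

Everything the program ‘learns’ flows from the single line that decides who has won. Remove the win condition and there is nothing left for the table to record.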
Putting aside the unsettling nature of this advance, I feel compelled to ask whether it would be possible to create a ‘tabula rasa’ computer program that could mark literature papers? And the comforting answer is no.
VI. The problem of defining ‘better’
The reason why such a program is ab initio impossible is that literature cannot be ‘won’. Go, exams etc. are all games that can be won or lost, or performed better or worse at. One can imagine, for example, a deep-neural-network that had been fed with great literature, with King Lear (for example – don’t shoot me for the suggestion!) weighted as the best thing ever, and millions of other writings graded downwards from there. (Google Books and Amazon probably already have this information – they could use the reviews to weight the work and then even produce writings accordingly.) The students’ work could then be graded according to where it comes on the deep-neural-network’s grading system. The key thing here, however, is that the human weighting and input is necessary: someone has to decide which literature ‘wins’. And there’s the rub: literature is clearly not a game that is ever ‘won’.
VII. Finite and Infinite Games
James P. Carse wrote a wonderfully profound book called Finite and Infinite Games. He describes finite games as games with fixed rules, the aim of which is to win. In contrast is the infinite game. In the infinite game, the rules change (though there are certainly rules) and they change because the purpose of the game is not to win or lose, but merely to keep playing. Exams and Go are clearly finite games, but literature is clearly infinite. In an infinite game, the players do not try to ‘win’ because that would involve someone ‘losing’ and the end of the game. In such a game, the players are like small children asking, ‘will you play with me?’
So what would AI developers need to do to create a program that could get involved in such an infinite game?
Firstly, the fundamental algorithm would have to work according to the desire to simply ensure that play continues. This provides us with a beautiful picture of the ethical obligation.
Secondly, in the infinite game, the rules necessarily change if it means that someone else can join, or that it prevents someone else from losing. The unpredictability of the world means that it must also evolve. If players did not agree to changes in the rules, changes in the world could result in someone winning. This leads to a difficulty for computer algorithms, for the aspect which enables the game to evolve to suit changing conditions is the ‘randomness’ of human activity. From the outside, randomness is logically akin to ‘choice’ – and randomness is something that a computer can only simulate. (And perhaps it will someday make sense for us to say that humans only simulate randomness too – though this is another philosophical difficulty entirely.)
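The sense in which a computer only simulates randomness can be shown in a few lines of Python. A pseudo-random generator started from the same seed produces exactly the same sequence of ‘choices’ every time. The seed value, the list of options and the `simulated_choices` function are hypothetical, chosen only for this illustration.

```python
import random

def simulated_choices(seed, options, n=5):
    """Make n 'random' choices using a generator started from a fixed seed."""
    rng = random.Random(seed)
    return [rng.choice(options) for _ in range(n)]

options = ["change the rule", "keep the rule", "invite a new player"]
first = simulated_choices(2017, options)
second = simulated_choices(2017, options)

print(first == second)  # True: the same seed gives the same 'choices'
```

From the outside each run looks like a series of free choices, yet the whole sequence was fixed the moment the seed was chosen.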
We have already begun to see the effects of the stagnation caused by non-random computer programs on the world – the stagnation of social networks and the creation of echo-chambers; the increasing abilities of single individuals or companies to wield power through networks of bots and fake news. Social networks encourage us to look at more of the same rather than randomly different things, and thus stunt our world-view.
VIII. Implications for education
Given that the rules of this game change, surprise is a key feature of it, and preparing for this surprise is the role of education. Education is the initiation into this infinite game, and part of the ethical obligation to enable the continuation of play. Thus, as I wrote here, education is the refinement of our expectations, of our understanding of the (evolving) rules of the game, by apprenticeship with cases.
The implications of the ethical obligation to enable the continuation of play depend upon what one has witnessed. The specific obligation is written into the context we observe: is a child being excluded from playing because they cannot read? Is a child being excluded from playing because their education is being disrupted? Etc… Given that it is logically impossible to hold a God’s eye view and pay attention to all things at the same time, our attention must inevitably be focussed and I suggest that this is our first choice: to what do we pay attention? Our second choice is whether or not we want to fulfil the obligations that are then embedded in what we pay attention to.
Using this analogy of computer programming as a mirror, we have been able to better see of what education consists. I am not discarding exams or even Go as being worthless for being finite games; they can be fun and useful as part of the infinite game, but they are not ultimately or absolutely so. Artificial Intelligence and the rise of computer programs hold great promise for us in education and the wider world, but we must also be wary of the problems, of the stagnation or even devolution that they may cause. We can see that the key to ‘ethical’ AI (if we could call it that) is twofold: that the standard or end-goal of the program must be to simply keep the game going, and that this requires some degree of randomness. So Google and DeepMind, now you know…
In writing this post, I began to appreciate a bit more the profound impact that Gottfried Leibniz has had upon computing: his principles of identity and contradiction, of sufficient reason, of predicate in notion and, most importantly, his principle of ‘the best’ (which can be seen in the requirement for a standard). The image here is of his ‘stepped reckoner’, one of the earliest computers.