Disputas is an EdTech and AI company with the dual aim of enabling active learning in qualitative subjects and scaling data collection for complex concepts in natural language processing. Over the past two years we have built Ponder, a platform for creating and completing practice assignments involving text analysis. By the fall of 2022 we will have the beginnings of a student-annotated corpus with labels from law and informal logic. In 2023 we want to assess these datasets before scaling up. We would appreciate suggestions on how to test the data in a research context.
Ponder encapsulates two innovative ideas. Firstly, it enables the kind of learning that the research literature shows to be most effective and engaging for students: active learning in the form of deliberate practice with rapid formative assessment informed by learning analytics. If successful, it will give students in humanities and social science subjects the opportunity to practise what they learn before exams, much as students practise in quantitative subjects like mathematics or programming. Secondly, Ponder is innovative because it is designed to elicit labelled annotations from students and teachers through text analysis exercises, producing anonymized linguistic data that can be used for machine learning. If successful, Ponder diverts the efforts expended in the educational sector towards training machines in the same skills we foster in students.
In 2020, Anders Næss Evensen was training a machine learning model (BERT) to detect informal arguments in text for his master's thesis in Natural Language Processing at the University of Oslo. The field of machine learning dedicated to informal logic, argument mining, suffers from data scarcity. Anders therefore spent a great deal of time annotating texts himself to create the data he needed.
With the advent of the transformer model, the frontiers of natural language processing are rapidly expanding. One of the bottlenecks halting the advance of machine intelligence into fields such as law, medicine and journalism is a lack of good training data. In such fields, there is little agreement among researchers on what the data should look like. Moreover, the available benchmarks are dominated by narrow, domain-specific tasks. Creating new datasets is expensive, because annotation in these fields requires deep subject matter expertise.
Paal met Anders at the language technology department at the University of Oslo. He had just enrolled in the language technology programme after completing a master's degree in philosophy at the University of Bergen. Some months earlier, Paal had teamed up with Andreas Netteland to develop an educational platform for active learning. Andreas had recently dropped out of the robotics and cybernetics programme at NTNU to work as a full-stack engineer at WAYS. The duo had just won the Inven2start competition, receiving 60 kNOK and pro bono consulting services to investigate IP, develop a viable business plan, and set up Disputas AS.
When the three of us exchanged information about our projects, it dawned on us that students and researchers engage in the same activity for different purposes: they analyse texts. In fact, students and teachers pay to use software for text analysis that is structurally identical to the annotation software used by researchers. Students take on loans to spend time analysing texts for the purpose of learning, while researchers spend large portions of their research budgets on salaries for expert annotators. On the basis of this realisation, we decided to join forces and design Ponder so that the efforts of students would give rise to data useful to AI research.
The first thing we did was discuss the idea with some of the most competent experts and researchers on the relevant topics in Norway. We received positive feedback and good advice. We then wrote the idea up in an application to the Norwegian Research Council and were granted 1 MNOK to kickstart the development of Ponder.
The cognitive and pedagogical research literature shows, perhaps unsurprisingly, that practice and feedback are important to learning. After surveying more than 900 meta-analyses, the leading quantitative researcher in the field of pedagogy, John Hattie (2012), finds that deliberate practice and rapid formative assessment are the most effective interventions teachers can use to improve student learning and engagement. These results are echoed in experimental comparative studies (Yeh 2011; Ericsson & Pool 2016; Leahy & Wiliam 2009), and resonate with mainline theoretical views in cognitive science (Willingham 2009) and the philosophy of education (Dewey 1997; Scriven 1967).
Teachers know and appreciate the importance of practice and feedback. In quantitative subjects like programming and mathematics, it is common for students to do practice assignments to test their understanding of the concepts introduced in lectures. In humanities and social science subjects, however, practice and feedback are less common, because good instruments for practice and assessment are lacking.
Practice in qualitative subjects typically consists of multiple choice quizzes or essay assignments. The multiple choice quiz can be assessed automatically, which lessens the burden of assessment on the teacher, but it gives little insight into student understanding. Often, the most important facet of assessment concerns the justification or explanation underlying an answer, not the answer itself. The essay accommodates this, which is perhaps the main reason essay assignments are common in qualitative subjects. However, essays create a lot of work for teachers: it takes time to read, assess and give feedback on their contents. As a result, it is uncommon for students to practise writing more than a few essays in a class.
With Ponder, students answer practice assignments by writing sentences in text boxes and connecting them in diagrams. With these simplified diagram representations, the student can justify or explain the thinking behind an answer to a question. The restrictions imposed by the diagram structure make it possible to automate assessment and feedback, which allows teachers to give more practice assignments. Through diagrams, Ponder retains the benefits of both essay and multiple choice assignments, without incurring the costs.
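To illustrate why a diagram structure makes automated assessment tractable, here is a minimal sketch (our own illustration, not Ponder's actual implementation): a diagram is modelled as free-text claims connected by support/attack edges, and a student's diagram is scored against a teacher's reference by edge-set overlap (F1). The names `Edge`, `Diagram` and `edge_f1` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Edge:
    """A directed relation between two claims in a diagram."""
    source: int    # index of the claim doing the supporting/attacking
    target: int    # index of the claim being supported/attacked
    relation: str  # "support" or "attack"

@dataclass
class Diagram:
    claims: list[str]  # free-text sentences written in the text boxes
    edges: set[Edge] = field(default_factory=set)

def edge_f1(student: Diagram, reference: Diagram) -> float:
    """Score a student diagram against a reference by edge-set overlap (F1)."""
    overlap = len(student.edges & reference.edges)
    if overlap == 0:
        return 0.0
    precision = overlap / len(student.edges)
    recall = overlap / len(reference.edges)
    return 2 * precision * recall / (precision + recall)

# Example: a classic syllogism where the student finds one of two support links.
reference = Diagram(
    claims=["All men are mortal", "Socrates is a man", "Socrates is mortal"],
    edges={Edge(0, 2, "support"), Edge(1, 2, "support")},
)
student = Diagram(
    claims=["All men are mortal", "Socrates is a man", "Socrates is mortal"],
    edges={Edge(0, 2, "support")},
)
score = edge_f1(student, reference)  # precision 1.0, recall 0.5 -> F1 = 2/3
```

In practice the hard part is matching a student's free-text claims to the reference claims before edges can be compared at all; the fixed indices above assume that alignment is already given, which is precisely the kind of constraint the diagram format supplies.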
There is a growing literature on the use of diagrams for deliberate practice and rapid formative assessment. The 50+ studies on the subject indicate that students typically learn three to four times more in classes using this approach. I give an overview and interpretation of this research in a paper forthcoming in the peer-reviewed journal Norsk Filosofisk Tidsskrift. Although other software applications for creating diagrams exist, Ponder will be the first of its kind to fully automate assessment and feedback, and to elicit anonymized data from interactions to build corpora for research and development in AI.
Current state of affairs
Ponder has been developed with, and through the feedback of, Norwegian teachers and researchers of informal logic and law at several institutions of higher education over a period of two years. The main components of the system are now in place, and we have run several promising tests. In the fall of 2022 we will pilot Ponder in ex. phil. and law classes across Norway.
Until now we have focused exclusively on use cases in informal logic and law, in particular argument analysis and legal judgement analysis. We see very clear applications for the Ponder datasets in argument mining and in machine learning projects in legaltech. We are eager to test the quality of these datasets, and would love to get in touch with any researchers interested in data collection in natural language processing, argument mining, or legaltech.
In a design-driven innovation project sponsored by DOGA, we have also been investigating other use cases for Ponder. We are especially interested in subjects where students learn through practice assignments involving text analysis, and where a field of natural language processing is devoted to identifying the same patterns students look for in those assignments. We would appreciate any advice on subjects where this structure applies.
Dewey, J. (1997). Experience & Education. Simon & Schuster.
Ericsson, K. A., & Pool, R. (2016). Peak. Eamon Dolan/Houghton Mifflin Harcourt.
van Gelder, T. J., Bissett, M., & Cumming, G. (2004). "Enhancing expertise in informal reasoning". Canadian Journal of Experimental Psychology, 58, 142-152. https://doi.org/10.1037/h0085794.
Hattie, J. (2012). Visible learning for teachers: Maximizing impact on learning. Routledge/Taylor & Francis Group.
Leahy, S., & Wiliam, D. (2009). Embedding assessment for learning: A professional development pack. London: Specialist Schools and Academies Trust.
Scriven, M. (1967). "The methodology of evaluation". In R. E. Stake (Ed.), Curriculum evaluation (pp. 119-144). Chicago: Rand McNally.
Willingham, D. T. (2009). Why don't students like school? A cognitive scientist answers questions about how the mind works and what it means for the classroom. Jossey-Bass.
Yeh, S. S. (2011). The cost-effectiveness of 22 approaches for raising student achievement. Charlotte, NC: Information Age.