Machine Learning For Automated Generation Of Multiple Choice Test Items

Ramachandran, Sowmya; Ludwig, Jeremy

Effective assessments are essential for training and personnel development, but developing quality assessments is labor-intensive. The resulting shortage of assessments could significantly impair the Army’s ability to develop its personnel and a Soldier’s ability to plan career growth. There is a unique opportunity to harness data science and artificial intelligence to address this problem. Recent breakthroughs in neural network technologies have led to striking successes in natural language processing (NLP). Neural network architectures like the Transformer, together with the availability of vast amounts of digital natural language data, have led to the creation of highly effective, pre-trained, task-independent language models. Once trained, these models can be adapted to a wide range of NLP tasks. This raises an interesting question: are these approaches sufficiently rich to automate the task of generating assessments? As a starting point, language models seem particularly well suited to generating multiple-choice items that test recall and comprehension. This paper describes an investigation into this question. Specifically, we use pre-trained language models like BERT (Bidirectional Encoder Representations from Transformers) that have been fine-tuned for tasks like question generation, question answering, and summarization. We combine these models with traditional approaches like rule-based pattern extraction to generate a range of question types. If successful, this approach will offer benefits such as a tool that enables assessment creators to draw from a high-quality, automatically generated question bank. We examine the feasibility and effectiveness of this approach and identify gaps and challenges. We describe the models and algorithms that we have implemented, analyze their effectiveness, and report the results of formative evaluations of the quality of the generated content.

Keywords
AI, ASSESSMENT, DEEP LEARNING, NATURAL LANGUAGE PROCESSING

Track: Education

Related Research