AI in education and Classics
I am sure that all educators are aware of the challenges posed by Gen AI. At the very least, they have expressed concerns regarding academic integrity and plagiarism (Perkins, Reference Perkins2023). Numerous professionals have also noted claims that the use of Gen AI diminishes critical thinking and research practices and hampers the development of foundational skills (Kelly et al., Reference Kelly, Sullivan and Strampel2023). Furthermore, they have raised issues regarding the removal of the social element within teaching and learning. There are also flaws with the technology itself, such as AI spreading misinformation and generating biased content in its outputs (Mollick, Reference Mollick2024; Shah, Reference Shah2023; Aktay, Reference Aktay2022). These issues, combined with its newly emergent nature and wide accessibility, have created a tool and learning climate that has not been fully explored, often leaving educators without sufficient training and understanding of its applications both at a school level and within their individual learning areas (Michel-Villarreal et al., Reference Michel-Villarreal, Vilalta-Perdomo, Salinas-Navarro, Thierry-Aguilera and Gerardou2023; Son et al., Reference Son, Ružić and Philpott2023). This uncertainty was widely felt at my own school, which strives to integrate the outcomes of both NESA and the International Baccalaureate programs. It was especially problematic as the two programs are at cross purposes regarding their approaches towards the use of Gen AI in teaching and learning (Department of Education, 2023; Duffy, Reference Duffy2023; International Baccalaureate, 2023). In response to these conflicting philosophies, several colleagues and I have been researching and engaging with Gen AI to see whether it can be legitimately utilised in our teaching across Key Learning Areas while demonstrating its responsible and ethical use to our students and the wider school community. This is especially pertinent as Gen AI is often promoted and advocated as an agent of language learning and acquisition (Son et al., Reference Son, Ružić and Philpott2023).
Despite these concerns, there is a growing number of examples of Classicists utilising these technologies. To name but a few: ChatGPT alone recognises an estimated 339 million Latin-related ‘tokens’, and it can not only recognise Latin material but also generate its own texts. Many AI chatbots offer some parsing capabilities for both Latin and ancient Greek, providing countless examples of Latin texts (Burns, Reference Burns2023; Ross & Baines, Reference Ross and Baines2024; Ross, Reference Ross2023). Gen AI is also being applied to the dating of Greek inscriptions and papyri (Locaputo, Reference Locaputo, Portelli, Magnani, Colombi, Serra, Moral-Andrés, Merino-Gómez and Reviriego2024). AI software such as Google’s Fabricius assists with the decipherment of Middle Egyptian hieroglyphics, while acting as a legitimate academic language learning tool (Criddle, Reference Criddle2020). Reconstructed spoken audio of ancient languages can be used for communicative teaching approaches, and large language models (LLMs) are being employed by researchers to reinterpret classical literature (Haristiani, Reference Haristiani2019; Kim et al., Reference Kim, Shim and Shim2023; Díaz-Sánchez & Chapinal-Heras, Reference Díaz-Sanchez and Chapinal-Heras2024).
Alongside these innovations, however, there are unique challenges in applying Gen AI to Classical Languages learning and pedagogy. There can be linguistic irregularities, as Gen AI intermingles ancient languages with modern dialects, a problem especially apparent when engaging with ancient Greek and Hebrew (Ross, Reference Ross2023). The software can lack confidence in describing grammatical concepts and metalanguage, owing to impoverished training data (Bendel & N’diaye, Reference Bendel and N’diaye2023). Gen AI also has difficulty recognising linguistic variation, as it fails to distinguish how similar morphological forms can have multiple meanings and translations, a common occurrence in Latin and Greek (Ross & Baines, Reference Ross and Baines2024). There is also the likelihood that Gen AI confuses languages with one another, especially those with a Latinate alphabet or direct Latin derivatives, increasing the chance of AI hallucination (Bistafa, Reference Bistafa2023). Finally, there are complications with necessary content restrictions, which can cause AI to refuse to display some material relevant to classical areas of study, such as gender roles, sexuality, warfare or slavery (Ross, Reference Ross2023).
Reflecting on these concerns, the following article explores several case studies which examine how I have integrated Gen AI into my teaching practices with Stage 4 and 5 Latin classes. The cohorts involve students of mixed academic abilities and diverse learning needs across Year 7, Year 8 and Year 9 units of work. Each presented its own challenges, but together they allowed me to examine whether and how the use of AI legitimately enhanced student learning. Furthermore, they allowed me to assess the practical difficulties of its use in the classroom setting (Miller, Reference Miller2024) and to model responsible AI literacy with students when they use Gen AI in their own learning (Chan & Hu, Reference Chan and Hu2023).
AI chatbots in the Classics classroom
The first activity involved the use of a conversational chatbot within a Year 7 Latin class (students aged 12-13 years). It formed part of a learning cycle to reinforce their understanding of direct questions and interrogative pronouns, and coincided with a cultural unit introducing students to notable Roman personalities in mythological and historical contexts. Students were instructed to choose from a selection of these figures, including Julius Caesar, Romulus, Camilla, Augustus, Cicero and others. Students were provided with a scaffolded workplan with supporting resources to research and record biographical details about their chosen personality. Based on this research, students were asked to prepare a series of 15-20 questions in Latin ‘to ask’ their chosen personality (Shah, Reference Shah2023). After this, students were given the PARTS scaffold, a tool to aid in designing and crafting AI inputs or ‘prompts’, which they used in ChatGPT 3.5 (Google, 2024; OpenAI, 2016, 2023) (see Figure 1). For this purpose, I modelled and demonstrated the scaffold through a prompt modified for this task (Furze, Reference Furze2024); a sketch of what such a prompt might look like is given after Figure 1.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250201174629890-0543:S2058631024001363:S2058631024001363_fig1.png?pub-status=live)
Figure 1. A summarised version of the PARTS prompting model, adapted from Google (2024).
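For readers curious about what such a prompt might look like in practice, the sketch below is an illustrative reconstruction only: the class worked entirely in the ChatGPT web interface, and the wording, chosen personality and model name here are my own assumptions rather than the exact prompt modelled with students. It simply shows a PARTS-style personality prompt being sent programmatically via the OpenAI Python library.

```python
# A minimal sketch only: the class used the ChatGPT web interface, not the API.
# The prompt wording below is an illustrative reconstruction of a PARTS-style prompt
# (see Figure 1), not the exact prompt modelled with students.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A system prompt sketched along the lines of the PARTS scaffold summarised in Figure 1
persona_prompt = (
    "You are Gaius Julius Caesar, speaking in the first person. "
    "Your audience is a Year 7 Latin class (students aged 12-13). "
    "Answer their questions in simple classical Latin, using vocabulary and grammar "
    "suitable for beginners, and keep each reply to two or three short sentences. "
    "Stay historically plausible and in character at all times."
)

student_question = "Salve, Caesar! Cur Rubiconem transiisti?"  # an example student question

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": persona_prompt},
        {"role": "user", "content": student_question},
    ],
)
print(response.choices[0].message.content)
```

In the classroom itself, students simply pasted an equivalent PARTS-structured prompt into ChatGPT 3.5 and continued the conversation there.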
Students then generated a chatbot of their Roman personality and asked their initial questions, using the prompt. Students were encouraged to critique the chatbot’s responses for accuracy and to develop further questions to sustain a short conversation (Furze, Reference Furze2024). After collating their chatbot conversations, students uploaded them to the school’s online learning platform. The students then took part in a structured reflection on the activity, in which they discussed their engagement with the chatbot, along with their insights, concerns and any issues they had in utilising it. This approach was designed to promote student agency and awareness (Ross & Baines, Reference Ross and Baines2024).
With this task, the most noteworthy observation was the increase in student engagement. Although I am still a novice with active communicative methodologies in my teaching practice, this activity facilitated greater practical communication in Latin between pairs of students, through roleplay or through interacting with the audio function of the chatbot itself, with a few students even commenting on the pronunciation conventions provided by ChatGPT (Shah, Reference Shah2023; Hargrave et al., Reference Hargrave, Fisher and Frey2024; Zhang & Huang, Reference Zhang and Huang2024). On the whole, students seemed much more concerned than usual to express themselves correctly and asked for regular feedback to sustain longer conversations and to refine their expression (Hunt, Reference Hunt2022; Urbanski, Reference Urbanski, Lloyd and Hunt2021; Zhang & Huang, Reference Zhang and Huang2024). Furthermore, even weaker students within the cohort were more willing to extend themselves, exploring how they could craft more sophisticated enquiries, despite this often being beyond their capabilities (Hunt, Reference Hunt2022). Students gained more detailed responses when they combined some of the information that they had already researched with what they gained from the chatbots.
One pair of students explored Caesar’s divine ancestry, while another investigated his relationship with Brutus in the lead-up to the assassination. Another group, focusing on Cicero, discussed his literary career, especially his ties with Atticus. In the process, however, students became very aware of the limitations of this technology. Even with their own basic understanding of the Latin, several students identified unfamiliar vocabulary and queried word order, sentence structure or unexpected grammatical forms. Students also speculated on whether their own questions and queries might have influenced these Gen AI outputs (Michel-Villarreal et al., Reference Michel-Villarreal, Vilalta-Perdomo, Salinas-Navarro, Thierry-Aguilera and Gerardou2023). More significantly, students showed concern regarding the historical credibility of their chatbots, as they expressed unexpected or anachronistic sensibilities (Ingram, Reference Ingram2023; Wu, Reference Wu2023). Two notable examples of this were Caesar supposedly being remorseful after his conquest of Gaul and Augustus expressing sympathy regarding the deaths of Arminius and Cleopatra. This led to a class discussion about the inaccuracy of Gen AI, the mechanics of Gen AI reasoning systems such as neural networks, and the problematic aspects of relying on chatbots for exploring historical perspectives, a common criticism levelled at similar platforms and programs (Paul, Reference Paul2023; Ingram, Reference Ingram2023).
AI and Latin prose composition
The other Stage 4 learning activity was intended for my Year 8 class (ages 13-14). Rather than directly utilising chatbots, this task was based around the writing, editing and expansion of Latin prose passages of around 30 to 50 words in length that had been generated by ChatGPT 3.5. This activity was designed with greater complexity, as the cohort had previously engaged with AI in various capacities, and was intended to explore more sophisticated prompt engineering and content generation. Students were advised to use vocabulary and syntax encompassing the current scope of their learning, derived from Stages 1-17 of the Cambridge Latin Course (Cambridge School Classics Project, 2022, 2023). They were also encouraged to base their passages on contexts presented in the Cambridge Latin Course, such as agricultural settings, Togidubnus’ palace at Fishbourne, or the cities of Roman Alexandria and Athens. The activity also involved the modelling of appropriate research-backed approaches to prompt engineering. This included the PARTS scaffold, but I also allowed the option of the 5S approach, a similar model which more actively promotes the reassessment of Gen AI outputs (Google, 2024; Distol, Reference Distol2024) (see Figure 2); a sketch of this iterative pattern follows Figure 2.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20250201174629890-0543:S2058631024001363:S2058631024001363_fig2.png?pub-status=live)
Figure 2. A simplified version of the 5S approach as outlined by Distol (Reference Distol2024).
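To illustrate the ‘reassess and refine’ habit that the 5S approach encourages, the following sketch shows a two-step exchange expressed through the OpenAI Python library. It is a hypothetical reconstruction under my own assumptions about wording and model choice; students themselves worked through the ChatGPT web interface rather than code.

```python
# A sketch of the iterative 'generate, then reassess and refine' pattern encouraged by
# the 5S approach (Figure 2). The prompts, model name and stage references below are
# illustrative assumptions only, not the exact classroom exchanges.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

history = [
    {
        "role": "user",
        "content": (
            "Write a Latin passage of about 40 words set in Togidubnus' palace at "
            "Fishbourne, using only vocabulary and grammar found in Stages 1-17 of "
            "the Cambridge Latin Course."
        ),
    }
]

first_draft = client.chat.completions.create(model="gpt-3.5-turbo", messages=history)
history.append({"role": "assistant", "content": first_draft.choices[0].message.content})

# Reassessment step: the user reviews the draft and sends a more specific follow-up
# prompt instead of accepting the first output.
history.append(
    {
        "role": "user",
        "content": (
            "Some of that vocabulary is unfamiliar to me. Rewrite the passage using "
            "simpler words, keep it in the perfect and imperfect tenses only, and "
            "include at least one relative clause."
        ),
    }
)

revision = client.chat.completions.create(model="gpt-3.5-turbo", messages=history)
print(revision.choices[0].message.content)
```

The point of the second prompt is that the first output is treated as a draft to be interrogated rather than a finished product, which mirrors how students were asked to handle their generated passages.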
These scaffolds, in conjunction with other skills such as Feynman and Socratic dialogue techniques with the Gen AI, were used to edit the generated Latin passages further (University of Sydney, 2023). Students collaboratively designed multiple-choice, comprehension, grammatical knowledge or translation questions on these passages, either through further refinement with Gen AI or purely by their own design. They then exchanged their passages and questions with other groups, who attempted these activities themselves. After this collaborative work, students provided peer marking, commentary, feedback and reflection based on their own creations and the tasks which the other groups had designed.
One of the most pertinent observations I noted with this activity was that the majority of students needed additional scaffolding beyond what they had initially provided to generate their prose. Despite the use of appropriate prompts, several students became frustrated as the passages that were generated were either too simple for their purposes or filled with unfamiliar vocabulary and syntax. Upon inspection, however, students’ initial prompts needed further refinement, as they were often very open-ended, resembling ‘zero-shot’ or ‘one-shot’ prompting (Furze, Reference Furze2024). After bringing this to their attention, most students understood that repeated engagement with the Gen AI requires critical judgement and more nuanced input prompts. After this, they achieved much better results when they broke the text down further, actively reviewed their outputs or revised the prompts. I saw most students utilising the 5S scaffold rather than the PARTS approach. Some students had similar problems relying on ChatGPT when creating the comprehension-based questions, as these were considered too simple or contained elements that were not always present within the text. In some cases, the questions and answers deviated from each other. This led several students to prefer crafting their own questions rather than being fully dependent on Gen AI. In a way, these students were using Gen AI to inspire their own efforts.
Despite these challenges, after collaborative feedback about how the tasks were attempted, students managed to refine their chosen learning materials to an appropriate standard. They took greater effort in editing and modifying their work before exchanging it and marking other groups’ work, and they clarified and redesigned questions, and even the original prose composition, in response to its perceived difficulty or errors. Overall, this not only familiarised the students with effective use of Gen AI, but also reinforced some of its perceived limitations. The need for critical thinking, and the importance of the user’s own agency if Gen AI is to be used meaningfully, were some of the prevailing themes explored within my students’ own reflections (Ross & Baines, Reference Ross and Baines2024).
AI image generation in the Classics classroom
My Stage 5 task was based on another aspect of Gen AI, namely image creation. Using Copilot Designer, I prepared a range of images based on aspects of Roman life, including scenes in the forum, travel, leisure activities, military contexts and mythological imagery (Pesce, Reference Pesce2023). Students in groups selected one of these visual stimuli and brainstormed different Latin vocabulary and phrases to describe what was presented in the image, a well-established visual learning strategy (Furze, Reference Furze2024; Gruber-Miller, Reference Gruber-Miller and Gruber-Miller2006). Using this as a base, students created their own short Latin free composition, based on their current understanding of vocabulary and syntax. The only restrictions were the inclusion of adverbial subordinate clauses, which we had recently been learning, and a fixed word limit. Furthermore, students were expected to note anything unusual or unexpected in the stimulus picture, and to comment on the whole writing process, as part of a written reflection (Furze, Reference Furze2024).
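The stimulus images themselves were produced through Copilot Designer’s web interface, but for teachers who prefer a scripted workflow, a roughly comparable request can be made through the OpenAI images API, as in the hedged sketch below. The prompt wording, model name and image size are illustrative assumptions, not the prompts I actually used.

```python
# A hedged sketch only: the stimulus images were produced in Copilot Designer's web
# interface, not through code. This shows a roughly comparable request via the OpenAI
# images API; the prompt wording, model and size are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

result = client.images.generate(
    model="dall-e-3",
    prompt=(
        "A busy Roman forum scene in the first century AD: market stalls, togate "
        "citizens, merchants and public buildings, painted in a realistic style"
    ),
    size="1024x1024",
    n=1,
)
print(result.data[0].url)  # link to the generated stimulus image
```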
It was this activity which produced the most eclectic findings. Students engaged well with their images, expressing a greater sense of voice in their learning, as the differentiation in artistic style and content of the stimuli added wider appeal (Strangman et al., Reference Strangman, Meyer, Hall and Proctor2005; Hill, Reference Hill2006; Kormos & Smith, Reference Kormos and Smith2023; Zeff, Reference Zeff2007). This allowed students to be more confident when drafting and writing their own compositions; some asked how they could convey complex ideas while drafting, while others were willing to investigate specialised language for these contexts themselves (Wei, Reference Wei2023). This ranged from the ranks of Roman soldiers and occupations in religious contexts to terminology for parts of boats, idiomatic expressions and metaphors (Wang et al., Reference Wang, Lund, Marengo, Pagano, Mannuru, Teel and Pange2023). In this process most of my students accessed other resources in the classroom, such as my Loeb collection, other Latin readers, the Perseus website and The Latin Library, taking inspiration from authentic Latin texts not only to convey meaning but, in some cases, also tone, register and style. The students provided multiple insights regarding the artistic licence used in different images, linking these observations back to their own understanding of Roman culture. Students noted that the Cerberus stimulus conformed to his ‘pop culture’ image as a ‘hellhound’ while omitting other aspects of his mythos, such as his serpentine features (Ross & Baines, Reference Ross and Baines2024; Ross, Reference Ross2023; Nicolette & Bass, Reference Nicolette and Bass2023; Furze, Reference Furze2024). Those students using the travel scenes noted that the Greco-Roman boats were more akin to Viking longships than anything that would have sailed the ancient Mediterranean. In the forum and urban scenes, many students noted the presence of anachronistic goods and foodstuffs, such as baguette-style bread loaves rather than the familiar circular panis loaves found in Pompeii, as well as modern or neo-classical architectural elements on various buildings.
Many of the students were sceptical of the monolithic view of society that was depicted: ‘Romans’ who appeared to be middle-aged Caucasian males, rather than scenes with greater ethnic and gender diversity (Ross & Baines, Reference Ross and Baines2024; Ross, Reference Ross2023; Nicolette & Bass, Reference Nicolette and Bass2023). This initiated a student-led discussion regarding the bias of AI-generated images, a current and topical issue, as well as wider implications such as the collection of undocumented or copyrighted material to produce AI multimedia representing the ancient world, and how this could impact upon proper historical understanding (Ross & Baines, Reference Ross and Baines2024; Ross, Reference Ross2023; Nicolette & Bass, Reference Nicolette and Bass2023; Ure Museum, 2024).
Findings and observations
Since conducting these activities, several trends have become noticeable across these year groups. Besides a general decrease in the submission of student work containing unreferenced AI in numerous formative tasks, several of my colleagues beyond the initial sample have considered ways to integrate similar learning experiences into their own programs. For example, the Humanities faculty actively used Google’s Fabricius when introducing Ancient Egypt within their Stage 4 and Stage 6 History curriculum (Criddle, Reference Criddle2020). More significantly, several students began to experiment with Gen AI programs to enhance their Latin learning outside the classroom. Utilising basic prompting, students began generating their own vocabulary flashcards, constructing cloze passages, creating tasks based on recognising and recalling the principal parts of verbs, and producing short Latin sentences for translation practice of individual grammatical concepts.
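To give a concrete sense of the ‘basic prompting’ involved, the snippet below reconstructs the sort of cloze-passage request a student might make; the Latin sentence and the wording are hypothetical examples of my own, not prompts taken from students’ work.

```python
# Illustrative reconstruction only: the sort of basic prompt a student might paste
# into a Gen AI chatbot (e.g. ChatGPT or Gemini) to build a self-marking cloze exercise.
cloze_prompt = (
    "Take this Latin sentence: 'mercator togam feminae vendidit.' "
    "Turn it into a cloze exercise by removing the verb, offer three plausible "
    "options for the gap (only one of them correct), and give the answer at the end."
)
print(cloze_prompt)  # copy the printed prompt into the chatbot of your choice
```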
More impressive was the fact that these students became more discerning when accessing Gen AI. Despite investigating new Gen AI opportunities, these cohorts actively noted its problematic aspects while recognising that these issues present holistic learning opportunities in themselves (Kic-Drgas & Kilickaya, Reference Kic-Drgas and Kılıçkaya2024; Kim et al., Reference Kim, Shim and Shim2023; Hargrave et al., Reference Hargrave, Fisher and Frey2024). For instance, when one of my students was creating flashcards with Google’s Bard, she noticed that the AI consistently generated incorrect forms of the present infinitive. Though frustrated, she expressed that the process of editing these materials herself allowed her to consolidate her understanding of verb morphology, an area she openly admitted she needed to work on. In my Year 9 and 10 cohorts, students used AI-based image generation themselves to be more creative, demonstrating their understanding of Latin by creating summarised comic book panels or storyboards, while actively evaluating these images’ effectiveness at conveying the original tone and intent of the texts (Aktay, Reference Aktay2022; Zou et al., Reference Zou, Reinders, Thomas and Barr2023).
There is clear potential for Gen AI in my own practice: the supplementation and differentiation of learning content. I can leverage Gen AI to support students by developing scaffolded resources, by facilitating greater comprehensible input or by using Universal Design for Learning frameworks. This has allowed me to concentrate my time on more individualised instruction across year levels by reshaping pedagogical materials and redesigning the manner in which I conduct formative assessment (Fryer et al., Reference Fryer, Coniam, Carpenter and Lăpușneanu2020; Haristiani, Reference Haristiani2019; Zhang & Aslan, Reference Zhang and Aslan2021). For example, I have constructed tiered Latin readers for students of different abilities, restructured pedagogical Latin texts into new formats, and used Gen AI to brainstorm thematic extracts of Latin texts for unseen translation or to curate extended response stimuli, which are new for the IB Diploma Latin syllabus (International Baccalaureate, 2022; Shah, Reference Shah2023).
In my experience, I have found clear benefits in using Gen AI software in my classroom. My students have become more adept at using Gen AI efficiently to support their own learning, while being aware of its potential complications and issues. This means they are fostering not only AI literacy but also a wider sense of digital citizenship, a set of skills and knowledge that we should be encouraging across our curriculum. This process has also allowed me to evaluate my own practice, considering new ways to present and restructure content. More importantly, integrating Gen AI has permitted me, as an educator, to have an honest dialogue with my students, in which I have been able to discuss my own concerns, considerations and speculations on the future directions of Gen AI within education and the Classics as a whole.