Assessment and AI
AI in assessment has often been framed as a threat, and it does pose substantial challenges to traditional forms of assessment. Because AI chatbots have been trained largely on text, forms of assessment that rely on text, and particularly online forms of assessment, are susceptible to unethical AI use.
In this post, I explore multiple sides of the assessment conundrum, focusing on how AI tools can support learners, instructors, and the university in enhancing our roles and capabilities.
Fortunately, we have substantial research on what constitutes effective assessment practice. Knight (2002) outlines the distinction between formative and summative assessment, with an emphasis on the value of formative practices, since summative practices often fail to provide timely, actionable feedback that supports student learning and improvement. In contrast, formative assessment is integrated into the learning process and aims to foster development rather than deliver final judgments. Similarly, Black and Wiliam (1998) argue that a diversity of tasks, opportunities for self-assessment, and the quality of feedback play an important role in enhancing student learning, as they enable learners to understand their current performance, identify gaps in understanding, and take informed steps toward improvement. Sadler (1989) shows that clear criteria are an important way to bridge the gap between current and desired performance, helping students recognize quality in their work and take responsibility for their own learning, and Brookhart (2017) emphasizes how effective rubrics provide specific, descriptive feedback aligned with learning goals, enabling students to understand expectations and make targeted improvements. Wiggins (1998) outlines how demonstrating real-world competence enhances the perception of authenticity, which is especially significant for Gen Z, and deepens student learning and skill development. Timely, relevant, and detailed feedback all contribute to effective assessment (Gibbs & Simpson, 2004).
While much attention has focused on how AI can undermine assessment, there is a need to consider how it can support both instructors and learners in making assessment more meaningful and effective. Distinguishing between assessment for institutional validation, for instructional purposes, and for learner feedback reveals differing priorities. Bayrak (2022) emphasizes these distinctions, highlighting how self-assessment with online tools can play a key role in promoting deeper learning.
Student assessment
Summative assessment is particularly susceptible to generative AI because it focuses on final products, offering clear templates that generative technologies can exploit to reproduce similar outputs. Many of the practices that have traditionally produced good assessments, such as clear rubrics, well-written exam questions, and predictable task structures, are exactly what now make those assessments vulnerable to AI.
- Most vulnerable to AI: essays, research papers, annotated bibliographies, literature reviews, policy briefs, multiple-choice exams, short-answer exams, concept maps, reflective journals (Cotton et al., 2023).
- Moderately vulnerable to AI: presentations (if content is AI-generated), debates (if scripted or pre-written by AI), group projects (if content creation is AI-assisted), simulations (if responses can be generated by AI), peer reviews (Selwyn, 2024).
- Less vulnerable to AI: lab reports (data collection and interpretation require direct involvement), oral examinations (require spontaneous, unscripted responses), role plays (require real-time interaction and improvisation) (Evangelista, 2025).
Responses to AI therefore reinforce what we already understand about assessment. As Rudolph, Tan, and Tan (2023) argue, creativity, critical thinking, and intrinsic motivation, reinforced through authentic and realistic assessment, push us toward the higher end of Bloom's taxonomy.
Additionally, we can recognize the limitations inherent in common responses to generative technology. Going back to closed-book, pen-and-paper exams reinforces cramming; AI detection tools can easily be defeated (sometimes by adding a single space); and out-designing AI becomes an arms race with a several-hundred-billion-dollar industry. Instead, as recommended by AI at Western, we need to clearly outline the role of AI in the classroom while emphasizing integrity, accountability, and critical skills.
AI tools that can complete assessments undermine the learning process, but only when misused in unethical ways. High-quality assessment principles that emphasize clear criteria, formative practices, and a diverse set of tasks still apply and are more crucial than ever. Assessment that emphasizes process over product, through scaffolding and multi-stage assignments, decreases the weight placed on final summative tasks and reinforces that the process is as important as the output. While oral exams, logical argumentation, live discussion, and real-world application are less susceptible to AI, an emphasis on transparency of AI use allows learners and instructors to understand the strengths and limitations of the tools. Reactive solutions, such as surveillance or paper-only tests, can reduce the pedagogical value of assignments and erode trust and authenticity, reinforcing reliance on AI tools. AI-aware design needs to consider learner-centric, institution-centric, and formative goals when redesigning assessment to highlight critical thinking, increase motivation, and emphasize learning.
Instructor use of AI
While much of the discussion of AI and assessment focuses on student use, there are also important ways that AI can assist in assessment design. Repetitive tasks, such as generating exam questions, aligning rubrics, or formatting LMS content, can be streamlined using AI. Tools like syllabus chatbots offer scalable student support, while human oversight remains essential to assessment practices.
The syllabus is an important tool for clearly defining AI use for instructors as well as students. As Queen's University has recently outlined, assessment must remain human-centred, with instructors making all final evaluative decisions and using AI only as a support tool. Generative AI should not be used for summative assessment due to risks such as bias, lack of transparency, and unresolved ethical concerns. Instructors must protect student privacy and intellectual property by using only approved, secure AI tools in accordance with Western's guidelines. Students must be informed and given the option to opt out of AI-assisted assessment. For example, the following is a general set of guidelines for thinking about AI use by instructors:
Generative technology may be used to assist in assessment during the course. If you prefer not to have generative tools involved in the assessment of your work, please notify me. We will go over this in detail in Week 1. Possible uses of generative technology include:

Generative technology will not be used to assign grades. All grading decisions will be made by the instructor in line with course requirements. No personal identifiers will be shared with any generative system.
A clear framework for AI use should define its role, clarify its limitations, and distinguish acceptable from unacceptable applications. This helps ensure transparency, manage risk, and uphold academic integrity while supporting informed student consent.
Ways that AI can assist in assessment (for a useful framework for AI literacy, see Hibbert et al., 2024):
- Using course materials (notes, transcripts, slide decks, readings) to develop group work, role-playing exercises, examination questions, and quizzes aligned with learning outcomes.
- Generating diverse question types (e.g., multiple-choice, short answer, application-based).
- Providing draft feedback aligned with rubrics to support formative assessment (a minimal sketch of this workflow follows this list).
- Suggesting improvements or restructuring for clarity and coherence in student submissions.
- Identifying patterns in student performance for targeted instructor intervention.
- Flagging factual inconsistencies or citation gaps for instructor review.
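To make the rubric-aligned feedback item above concrete, here is a minimal sketch of what such a drafting step might look like. It assumes the OpenAI Python SDK (openai>=1.0) with an API key in the environment; the model name, rubric text, and prompt wording are illustrative placeholders rather than recommended settings, and any drafted comments would still go to the instructor for review, never directly to students.

```python
# A minimal sketch of rubric-aligned draft feedback. Assumes the OpenAI
# Python SDK (openai>=1.0) and an OPENAI_API_KEY in the environment.
# The model name and rubric below are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

RUBRIC = """Criterion 1: Clear thesis (0-5)
Criterion 2: Use of evidence (0-5)
Criterion 3: Organization and clarity (0-5)"""

def draft_feedback(submission: str) -> str:
    """Ask the model for formative, criterion-by-criterion comments.

    This drafts feedback for instructor review only; it does not
    assign grades, in line with the guidelines above.
    """
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; use your institution's approved model
        messages=[
            {"role": "system",
             "content": "You draft formative feedback for an instructor to review. "
                        "Comment on each rubric criterion. Do NOT assign a grade."},
            {"role": "user",
             "content": f"Rubric:\n{RUBRIC}\n\nStudent submission:\n{submission}"},
        ],
    )
    return response.choices[0].message.content

print(draft_feedback("Sample student paragraph goes here..."))
```

Keeping the model's role to drafting, with the instructor editing and approving every comment, preserves the human-centred requirement outlined above.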
If you find yourself performing the same task repeatedly, it is a candidate for automation using AI tools, coding, or APIs. For example, lecture transcripts can be processed to auto-generate exam questions, making the assessment process more transparent, scalable, and consistent for instructors while offering students clearer alignment between course content, rubrics, and evaluation. Once you develop effective AI templates, you can scale them to generate group work, class discussions, class debates, and other participatory forms of engagement across different classrooms and different content. Moreover, if you are producing content for an LMS (such as Brightspace at Western), you can use the existing templates and have generative AI code your content directly into them for uploading to your course sites. Additionally, course syllabi can be combined with a prompt to create syllabus chatbots, which can field student inquiries 24/7 and provide natural-language responses to student questions (this is an open-source link to one that I use, which can be copied).

If you want to demonstrate the problem of letting AI assess assignments directly, you can give students your course syllabus or an assignment's rubric and let them practice submissions against your criteria. The reason we should always avoid allowing AI systems to grade assignments is that they are wildly susceptible to adversarial attacks; in my experience, simply putting 'good' or 'bad' anywhere in a submission produces radically different results. These examples show how AI can enhance assessment design and delivery, but they also underscore the need for human oversight to preserve fairness, accuracy, and pedagogical intent.
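For instructors who want to stage that adversarial-grading demonstration, below is a minimal sketch under the same assumptions as the previous example (OpenAI Python SDK, placeholder model name); the essay text is invented for illustration. The point is not the specific scores but how easily an appended token can move them.

```python
# A minimal sketch of the adversarial-grading problem described above:
# the same submission is "graded" twice, once with the word "good"
# appended. This is a demonstration of why AI should not grade,
# not a grading tool.
from openai import OpenAI

client = OpenAI()

def ai_grade(submission: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Grade this submission out of 10. Reply with the number only."},
            {"role": "user", "content": submission},
        ],
    )
    return response.choices[0].message.content

essay = "The causes of the First World War were complex and interrelated..."
print("Plain submission:    ", ai_grade(essay))
print("With 'good' appended:", ai_grade(essay + "\n\ngood"))
# If the two numbers diverge, the grader is reacting to an adversarial
# token rather than to the quality of the work.
```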
Conclusion
AI in learning and instruction should enhance—not replace—core assessment principles grounded in well-established pedagogy. Human judgment and oversight are essential in all aspects of assessment and must be reasserted for both learners and instructors. AI can enable new forms of design efficiency, feedback tools, and student support, but it should not replace instructor evaluation. The syllabus is a key site for defining the use of AI by instructors and learners, setting expectations, and ensuring informed consent. When deployed in summative assessments, AI undermines core learning principles and remains susceptible to prompt manipulation and bias. Scalable, responsible integration of AI into assessment can support transparency, fairness, and effective learning outcomes.
The Challenge
This series of short exercises explores how generative AI intersects with academic assessment—not just as a risk to integrity, but as a tool that challenges how we design, deliver, and evaluate student work. Moving beyond detection, these activities invite instructors to rethink fairness, feedback, and transparency in light of AI’s capabilities and limitations. Each prompt highlights a key issue in assessment—question design, feedback clarity, citation accuracy—while encouraging critical engagement with the role of AI in shaping both student learning and instructor practice.
Click here to access the exercises in a Qualtrics Form.
References
- Bayrak, F. (2022). Investigation of the web-based self-assessment system based on assessment analytics in terms of perceived self-intervention. Technology, Knowledge and Learning, 27(3), 642–659. https://doi.org/10.1007/s10758-021-09511-8
- Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5(1), 7–74. https://doi.org/10.1080/0969595980050102
- Brookhart, S. M. (2017). How to create and use rubrics for formative assessment and grading. ASCD.
- Cotton, D. R. E., Cotton, P. A., & Shipway, J. R. (2023). Chatting and cheating: Ensuring academic integrity in the era of ChatGPT. Innovations in Education and Teaching International. https://doi.org/10.1080/14703297.2023.2190148
- Evangelista, E. D. L. (2025). Ensuring academic integrity in the age of ChatGPT: Rethinking exam design, assessment strategies, and ethical AI policies in higher education. Contemporary Educational Technology, 17(1), Article ep559. https://doi.org/10.30935/cedtech/15775
- Gibbs, G., & Simpson, C. (2004). Conditions under which assessment supports students' learning. Learning and Teaching in Higher Education, 1(1), 3–31.
- Hibbert, M., Altman, E., Shippen, T., & Wright, M. (2024, June 3). A framework for AI literacy. EDUCAUSE Review. https://er.educause.edu/articles/2024/6/a-framework-for-ai-literacy
- Knight, P. T. (2002). Summative assessment in higher education: Practices in disarray. Studies in Higher Education, 27(3), 275–286. https://doi.org/10.1080/03075070220000662
- Rudolph, J., Tan, S., & Tan, S. (2023). War of the chatbots: Bard, Bing Chat, ChatGPT, Ernie and beyond. The new AI gold rush and its impact on higher education. Journal of Applied Learning and Teaching, 6(1), 380–397. https://doi.org/10.37074/jalt.2023.6.1.23
- Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18(2), 119–144. https://doi.org/10.1007/BF00117714
- Selwyn, N. (2024). On the limits of artificial intelligence (AI) in education. Nordisk Tidsskrift for Pedagogikk og Kritikk, 10(1), 3–14. https://doi.org/10.23865/ntpk.v10.6062
- Wiggins, G. (1998). Educative assessment: Designing assessments to inform and improve student performance. Jossey-Bass.
Disclosure
NOTE: AI was used for brainstorming, research, improving grammar and flow, and generating PowerPoint templates used to create the images in this post. It also assisted with formatting the references. All statements remain attributable to the author.