Keeping Humans in the Loop: Guiding GenAI with Human Oversight

Subscribe to the weekly posts via email here

Generative AI (GenAI) systems are rapidly transforming education and industry, bringing both opportunities and new challenges. One clear principle has emerged across expert guidelines: we must keep a human in the loop as we integrate GenAI. Human-in-the-loop refers to involving human oversight or input at key stages of a project’s lifecycle, from training an AI model to reviewing and refining its outputs, all with the goal of maintaining accuracy, ethics, and alignment with human-centered values. To implement this well, thoughtful design should happen in collaboration with users, on an ongoing basis, and with intention, rather than as an afterthought.

As a way of including human-in-the-loop processes, many companies now hire AI trainers, prompt engineers, data annotators, and AI risk/ethics specialists — people whose job is to curate training data, review AI decisions, and ensure the AI is aligned with policy and values. In fact, 81% of business leaders expect new entry-level roles like data curators, AI ethics specialists, and algorithm trainers to become increasingly common (Hoffman, 2025). Entry-level jobs in many fields are shifting from performing manual tasks to overseeing AI that performs those tasks. I first noticed this in my LinkedIn feed, where several ‘biology AI trainer wanted’ positions were posted, seeking new graduates to train LLMs. Other fields, like the technology sector, are hiring communication specialists, such as journalists, as prompt engineers, relying on their writing and creative skills (Deck, 2025). Tech companies have also expanded AI trust and safety teams to ensure AI-generated content follows community guidelines and policy standards. In the health care field, hospitals, medical research centers, and digital health startups are bringing on not only clinical AI specialists but also AI ethicists and fairness auditors to vet algorithms for bias and safety. Public health agencies have also recommended creating regulatory bodies that include AI experts, healthcare professionals, and ethicists to guide the responsible use of AI in health settings. Similarly, governments are creating AI advisory positions with the intention of providing human guidance at a higher level (Pham, 2025). In 2024, Canada launched an Artificial Intelligence Safety Institute and convened new advisory councils to guide safe AI adoption in line with Canadian values (Government of Canada, 2025). This means AI fluency and ethical oversight skills are becoming core competencies for many new positions, and having a human-in-the-loop is seen as necessary for ethical project development.

How, as an institute of learning and research, can we support our students for work readiness in new AI-driven roles and help them develop human-in-the-loop skills? UNESCO has proposed new guidance for using GenAI in education and research that calls for such things as “Support[ing] higher education and research institutions to enhance programmes to develop local AI talent” while also calling for building the capacity of teachers and researchers to make proper use of GenAI. It is important to note that these are skills that even our faculty are developing, and UNESCO suggests that building capacity should include protecting the rights of teachers and researchers, such as compensating time spent in training and learning about AI, or respecting how, when, and if GenAI will be used (UNESCO, 2023). As teachers and researchers, we can work towards empowering students as effective humans-in-the-loop, focusing on the process of using GenAI while applying transferable skills that students already develop in university, such as critical thinking and ethical reasoning. By guiding students in how to prompt, fact-check, and refine AI outputs, educators are effectively training the next generation to keep a hand on the wheel when using AI tools (U.S. Department of Education, Office of Educational Technology, 2023).

Integrating human oversight is not just a procedural matter; it’s also about people and their skills, mindset, and trust in working with AI. A key question for any community of practice adopting GenAI is: What does it mean to be a human-in-the-loop, especially if you’re not an expert on the task the AI is doing? Novices and students often face a dilemma – at what point do you trust the GenAI to know better than you, and when do you rely on your own knowledge, intuition, or experience? From an industry perspective, this can bring up additional questions such as “Do we still need skilled experts to oversee the skilled labour being outsourced to GenAI? And if yes, then how do we develop that skill in a world where humans aren’t doing the work anymore?” (D. Dilkes, personal communication, September 2025). There’s no simple answer, but cultivating the right approach involves education, self-awareness, and experience.

To truly empower the next generation of practitioners, educators and leaders can incorporate deliberate practice of human-in-the-loop scenarios. For example, assignments where students must improve a flawed AI-generated essay, or hackathons where teams audit an AI model for errors and biases, can build hands-on proficiency. Asking students to first develop a human-in-the-loop plan of action before they begin to critically assess GenAI output emphasizes thoughtful oversight rather than reactionary oversight. Modeling a human-in-the-loop approach is also an important consideration, one we are working on implementing with our undergraduate students in the Physiology and Pharmacology department, in the Schulich Faculty of Medicine and Dentistry at Western. In the PhysPharm 3000 labs, students get hands-on experience developing and testing a hypothesis, focusing on either a physiological or a pharmacological concept (the course manager is Dr. Birceanu). Each group consists of about 25 students and a faculty advisor, and over the course of 4–5 weeks, students and advisor develop a research question and methods for testing it, collect and analyze data, and then write up or present their findings and conclusions. The advisor’s role is to consult on the development of the hypothesis, methodology, and analysis — all opportunities to model a human-in-the-loop approach while working with GenAI and students.

Figure 1: Opportunities for the use of GenAI in Physiology & Pharmacology 3rd Year Labs, showing the steps of the research process and examples of use (Bell & Birceanu, 2025). Access PDF version of the graphic here (opens in new tab).

The hypothesis can be partly developed through iterative prompts using GenAI, where the faculty member engages with students to critically assess the outputs, identifying novel suggestions the GenAI may produce and dismissing suggestions that are not novel, or perhaps not feasible given the lab and time constraints. Faculty can demonstrate how to check whether references are real and relevant to the project, not hallucinations. Faculty can also advise on how to develop a methodology that meets research and industry standards, and then reiterate the importance of human judgement by assessing the validity and reliability of the primary references as well (Bell & Birceanu, 2025).

We developed this process to help students become more work-ready and comfortable with when, where, and how to use GenAI appropriately. Here is a summary of how we used it in the lab.

We began by asking students to brainstorm possible research questions. Their ideas were collected on a whiteboard so everyone could see them. This created a shared starting point, rooted in student interest. 

Students then entered their questions into ChatGPT. For example: 
“What kind of experiments can I do with the neuromuscular junction (where nerves meet muscles and send signals for contractions) and exercise?” 

GenAI produced a list of possible research questions, ranging from exercise intensity to the effects of protein intake. At this stage, the AI served as a brainstorming partner, generating more options than students alone might think of. This is where the human-in-the-loop is critical. Many GenAI outputs sounded interesting but weren’t feasible in a third-year lab class. For example, testing age-related decline at the neuromuscular junction wasn’t possible with only student participants. 

The faculty expert guided the class by asking questions like: 

  • “Who would actually be willing to ingest more protein for this study?” 

  • “What challenges might compliance create?” 

Through these conversations, students learned how to screen AI-generated ideas for practicality, ethics, and scope. 

Students then crafted a more detailed prompt, adding constraints like “in a third-year undergraduate course with human subjects and a nerve stimulation machine.” This produced more specific, testable questions (e.g., the effects of caffeine or cooling on neuromuscular transmission). Again, the faculty role was to step in and highlight which options were doable but not novel, encouraging students to push further. 

Finally, by combining and refining ideas iteratively, students began to generate questions that were both feasible and more innovative (e.g., the effects of temperature on neuromuscular fatigue). Here, the faculty modeled how researchers validate or dismiss AI suggestions, drawing on expertise and creativity beyond what the AI could provide. 

By working through this process, students didn’t just see GenAI as a tool; they saw how human judgment makes the difference between an interesting idea and a solid, ethical research question. This human-in-the-loop approach shows students that while AI can generate possibilities, it’s human insight that ensures quality, safety, and creativity. Such practices offer a roadmap for developing in our students the skills and competencies needed to assess GenAI output.

Over time, our community of practice should aim for a culture where using GenAI always comes with the question, “Where is the human?” — meaning who is accountable for checking and refining the AI’s work at each step, and who will ultimately be held responsible.

I personally applied this approach to writing this article. I knew the direction I wanted to take with this post and asked GenAI to do a deep research search with the following terms: “Human-in-the-loop”, “Student Education”, “Higher Education”, “Work Readiness”, “Trusting Judgement”. I wanted to see what types of conversations were being had about these issues. I used ChatGPT Pro for my search. What I found was that the report that came back was primarily centered on one main source, mostly just rearranged, and that all other sources were US-focused and not peer reviewed. Going into this process, I knew at what stages I would need to act as the human-in-the-loop and evaluate which points to keep and which needed to be redeveloped or edited. All references needed to be assessed for appropriateness and to make sure plagiarism was not occurring. The flow of the report needed to be critically evaluated for tangential ideas or vague and repetitive information. And last, it still needed to accurately reflect my views. From the initial output, I removed quite a bit of repetition, some tangential ideas about what human-in-the-loop actually is (using the general population to help train the LLM doesn’t count as human-in-the-loop with respect to GenAI oversight, but the original output to my prompt had a whole section on it), and reorganized the sections to tell a coherent story. Ultimately, I ended up removing a lot of it because it was, simply put, dull. The finished post still retains some elements of that initial GenAI report, but the majority of this post was developed by me.

When incorporating a human-in-the-loop approach to using GenAI, we must also consider when in a project’s life cycle the human-in-the-loop plan needs to be re-evaluated and revised. This may include building in pauses during a project to reassess how well the human-in-the-loop is doing, whether more skills or training are required, and whether the oversight provided is at the correct level. For example, I reached out to Dani Dilkes and asked for her critical help with this post. I needed another human-in-the-loop to help refine the paper, and GenAI was not going to provide the type of feedback I recognized I required.

Another consideration is how to protect the human who is providing the human-in-the-loop oversight. Depending on the type of work a human-in-the-loop does when overseeing AI outputs, they may encounter challenging, even disturbing, GenAI outputs and need to be supported and protected when this occurs (Deck, 2025). This is why a human-in-the-loop approach needs to be planned intentionally from the beginning, focused truly on supporting the humans in any project, rather than retrofitted to a process that has already begun. By considering these practices, hopefully, we can avoid an over-reliance on GenAI output or a loss of trust in the process.

The Challenge

As you incorporate GenAI into your own practice, consider these reflective questions:  

What does it mean for you to have a human in the loop when you yourself are that human?  

Ask yourself “At what point would you decide that the GenAI might actually know better than you?” Conversely, when do you trust your own knowledge or intuition enough to override the AI?  

Thinking about the answers to these questions, consider your students and peers who are also working towards developing AI literacy and competencies.

  • When developing assessments for students, will you include reflections on how they can develop skills to be the human-in-the-loop?
  • Where are you already modeling these skills, and what new opportunities could be developed?

Activity:

Map out one way you use a human-in-the-loop approach that you can share with your students, providing them with a thoughtful roadmap for the responsible use of GenAI. Consider what skills and knowledge you have acquired that give you the confidence to use outputs from GenAI and accept responsibility for those outputs. How often will you revisit this human-in-the-loop framework during a given project?



References

Disclosure

I used ChatGPT 5 Pro for the initial research question. I also used it to locate sources on which fields are actively seeking GenAI specialists, and to summarize the steps involved in using a HITL approach to generating a research question, validated methods, and analysis in the PhysPharm 3000 lab.

Your Challenger: Christine Bell