When we think about major disruptors to assessment processes in the credentialing arena, internet-based testing and remote proctoring stand out as innovations that expanded the testing landscape. These days, all eyes are on generative artificial intelligence (GenAI) as credentialing professionals focus on opportunities for its use across the assessment life cycle.
Before we explore its opportunities, challenges, and limitations, it’s important to revisit the basics of this technology. GenAI is a subset of artificial intelligence that produces new content based on patterns learned from large datasets through machine learning. It is particularly good at interpreting and generating human-like content such as text, which makes it suitable for a wide range of applications in the credentialing assessment ecosystem.
That said, humans must be in the loop for all assessment-related tasks to ensure accuracy and integrity. The current state of GenAI technology isn’t reliable enough yet for its output to be trusted without thorough human review. Additionally, the regulatory and legal landscape around the use of GenAI is evolving rapidly, meaning those in the credentialing arena must stay informed and cautious when using GenAI for any associated tasks. These considerations will be further discussed in a later section of this article.
Applications for GenAI in Test Development and Psychometrics
Assessment professionals have employed GenAI in several test development and psychometric activities, including job task analyses, competency development, item reclassifications, item writing and reviews, as well as scoring and reporting. Each of these steps in the assessment cycle plays a crucial role in ensuring the validity, reliability, and fairness of assessments, and each involves labor-intensive work where GenAI can enhance efficiency and effectiveness.
Let’s take a closer look at these test development and psychometric applications.
Job Task Analyses and Competency Development
Job task analysis (JTA) is a systematic process that identifies and documents the tasks and required knowledge, skills, and abilities for specific job roles. Competency development, on the other hand, involves defining and validating the essential skills and behaviors needed for job performance.
Benefits of using GenAI include:
- Gaining deeper data insights since GenAI can scan and process large volumes of text to identify relevant information such as job descriptions, educational requirements, and industry standards
- Providing a solid foundation for the JTA process with data-driven insights and initial drafts
- Increasing efficiency through automated data collection and analysis
Challenges associated with using GenAI include:
- Tools may struggle to understand the nuanced context of job roles, leading to inaccuracies.
- GenAI may miss human insights and tacit knowledge.
- GenAI models rely heavily on training data, which includes millions of initial data points as well as user inputs and prompts. If the training data is flawed or biased, the output will reflect those flaws and result in biased outcomes.
Reclassification Activities
These activities involve reviewing and updating the classification of requisite knowledge and job task descriptors, competencies, and test items to ensure they accurately reflect current industry standards, job roles, and competencies.
Benefits of using GenAI include:
- Enhanced accuracy with GenAI analyzing vast amounts of data to identify shifts and trends in job roles and industry standards, leading to more precise reclassification of tasks and competencies
- Increased efficiency through automation of labor-intensive processes, including reviewing and updating classifications
Challenges associated with using GenAI include:
- The accuracy of reclassification depends on the quality and representativeness of the input data.
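One way to keep humans in the loop during reclassification is to treat the model's output as a suggestion and route only the disagreements to subject-matter experts. The following is a minimal sketch under stated assumptions: the item IDs, the domain labels, and the `suggested` mapping (standing in for the output of whichever GenAI tool is used) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewFlag:
    """A proposed classification change awaiting SME review."""
    item_id: str
    current: str
    suggested: str


def flag_for_sme_review(current, suggested):
    """Return items whose suggested classification differs from the current one.

    Items without a suggestion are skipped, and nothing is reclassified
    automatically: the function only builds a review queue.
    """
    flags = []
    for item_id, current_label in sorted(current.items()):
        new_label = suggested.get(item_id)
        if new_label is not None and new_label != current_label:
            flags.append(ReviewFlag(item_id, current_label, new_label))
    return flags


# Hypothetical item bank and hypothetical model suggestions.
current = {
    "ITEM-001": "Patient Assessment",
    "ITEM-002": "Pharmacology",
    "ITEM-003": "Ethics",
}
suggested = {
    "ITEM-001": "Patient Assessment",
    "ITEM-002": "Medication Safety",
}

for flag in flag_for_sme_review(current, suggested):
    print(f"{flag.item_id}: {flag.current} -> {flag.suggested} (needs SME review)")
```

Because the function only builds a queue, the final classification decision stays with the reviewers.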
Item Writing and Review
These processes involve creating and evaluating test items to ensure they accurately measure the intended knowledge, skills, and abilities while maintaining validity, reliability, and fairness.
Benefits of using GenAI include:
- Improved time and cost savings by reducing the amount of input required from psychometricians and SMEs
- Increased efficiency through AI-generated item drafts, reviews of SME-written questions, rationale writing, and reference research
Challenges associated with using GenAI include:
- AI-generated items may lack the creativity and nuanced understanding of human writers.
- Bias in the training data could result in biased items that disadvantage certain groups. The quality of the training data depends on the choice of GenAI model being used: open models offer vast datasets but less control, walled models use open-source material without adding new inputs, and closed models rely solely on internal data, potentially perpetuating existing biases.
- The legality of using and citing GenAI in item development and review is still being questioned, requiring careful consideration and adherence to evolving regulations.
- Human oversight is necessary to ensure quality and relevance.
Scoring and Reporting
These processes involve calculating test scores and generating detailed reports to provide insights into candidate performance and assessment outcomes.
Benefits of using GenAI include:
- Significant time and cost savings
- Less rater bias in performance-based scoring, since automated scoring is less susceptible to well-known human rater phenomena such as the halo effect, regression to the mean, and rater drift
- Enhanced efficiency through automation of some aspects of the reporting process
- Improved data analysis and visualization by generating drafts, tables, and graphs based on templates
Challenges associated with using GenAI include:
- AI-generated reports may lack the human touch and contextual understanding that psychometricians can provide, potentially leading to reports that are less insightful or actionable.
- Other ethical and legal concerns center around transparency and accountability as well as validity and reliability, particularly in high-stakes situations.
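One common way to examine the reliability concern is to compare AI-assigned scores with human ratings of the same responses. The sketch below computes exact agreement and Cohen's kappa, two standard agreement statistics. This article does not prescribe either one, the rubric scores shown are made up for illustration, and what counts as acceptable agreement is a policy decision for the credentialing body.

```python
from collections import Counter


def exact_agreement(human, ai):
    """Proportion of responses where both scorers assigned the same score."""
    if not human or len(human) != len(ai):
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(h == a for h, a in zip(human, ai)) / len(human)


def cohens_kappa(human, ai):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(human)
    observed = exact_agreement(human, ai)
    human_counts, ai_counts = Counter(human), Counter(ai)
    # Chance agreement: probability both scorers pick the same category
    # if each scored independently at their own observed rates.
    expected = sum(human_counts[c] * ai_counts[c] for c in human_counts) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both scorers use one identical category")
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (1-3 scale) for the same eight responses.
human_scores = [1, 2, 3, 3, 2, 1, 2, 3]
ai_scores = [1, 2, 3, 2, 2, 1, 3, 3]

print(f"exact agreement: {exact_agreement(human_scores, ai_scores):.2f}")
print(f"Cohen's kappa:   {cohens_kappa(human_scores, ai_scores):.2f}")
```

Kappa is reported alongside exact agreement because raw agreement can look high simply because most responses fall into one score category.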
Guiding Principles and Key Considerations for GenAI Use
In addition to the applications outlined above, we anticipate GenAI will be used to support activities in automated test assembly, statistical analysis, and standard setting in the near future.
Through the gradual deployment of GenAI technology, several guiding principles and key considerations for ethical use have emerged. Some of these include:
- Human subject-matter experts still need to carefully review and approve all AI-generated content prior to publication to ensure validity. We’d go so far as to say that this is a requirement, not a best practice.
- If implemented improperly, especially through open-source models, GenAI technology introduces risks to test content security.
- To date, content generated entirely by GenAI, without meaningful human authorship, cannot be copyrighted in many countries, including the United States. Consequently, credentialing bodies may not be able to copyright test content that is generated solely by this technology.
- GenAI does not organically remove test content bias.
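The first principle above, mandatory expert review before publication, can be enforced in software rather than left to convention. This is a minimal sketch of such a gate, not a description of any real item-banking system: `DraftItem`, `approve`, `publish`, and `UnreviewedContentError` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


class UnreviewedContentError(Exception):
    """Raised when AI-generated content reaches publication without SME sign-off."""


@dataclass(frozen=True)
class DraftItem:
    item_id: str
    stem: str
    ai_generated: bool
    approved_by: Optional[str] = None  # name of the SME who signed off, if any


def approve(item, reviewer):
    """Record the SME who reviewed and approved the item."""
    return replace(item, approved_by=reviewer)


def publish(item):
    """Publish an item, refusing AI-generated content that lacks SME approval."""
    if item.ai_generated and item.approved_by is None:
        raise UnreviewedContentError(f"{item.item_id} has not been approved by an SME")
    return f"published {item.item_id}"


# Hypothetical AI-drafted item.
draft = DraftItem("ITEM-101", "Which action should be taken first?", ai_generated=True)

try:
    publish(draft)
except UnreviewedContentError as err:
    print(f"blocked: {err}")

print(publish(approve(draft, reviewer="SME reviewer")))
```

Making the check a hard failure rather than a warning mirrors the point that human review is a requirement, not a best practice.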
GenAI technology and its use are in constant motion, so users must stay vigilant in upholding best practices. Guidelines and recommendations for using GenAI in assessment may be overtaken by new developments almost as quickly as they are written, requiring fresh views on the technology’s utility. Consequently, every assessment professional must monitor these developments on a continuous basis.
Looking for More?
As part of my work leading an AI Task Force on our Assessment Development and Psychometric Services team, I’m currently developing a comprehensive report about the implications of using GenAI in credentialing assessment processes.
About the Author
Aurelie Lecocq
Director, Business Strategy and Growth, Meazure Learning