Evaluating Bias and Inclusiveness of Large Language Models on Social, Political, and Historical Education
DOI: https://doi.org/10.56028/aetr.15.1.1758.2025
Keywords: Large Language Model, Machine learning, AI ethics, Political bias, Educational technology.
Abstract
Educational institutions have been affected by the rapid shift in information-gathering mediums and by student perspectives shaped by generative artificial intelligence in social, political, and historical education. Through a comparative analysis of four distinct large language models (LLMs): ChatGPT, Gemini, DeepSeek, and Qwen, and a series of educational inquiries and classroom-applicable scenarios, this study evaluates bias, accuracy, inclusiveness, and potentially misleading content in LLM-generated responses using a five-band benchmark designed around the Common Core State Standards (CCSS). Results reveal significant differences in consistency and inclusivity across models; although these differences showed little relation to the political stance associated with each model's country of origin, the analysis still highlights how generative AI responses could influence educators and learners, chiefly through their range of information and phrasing logic.