Psychometrics and Large Language Models: What Can Be Learned and What Can Go Wrong?
Our natural response to language is to engage Theory of Mind to understand the inner state of our conversational partners, and this leads us to ascribe human attributes to generative large language models (GLLMs). It is proposed that methods developed to discern individual difference characteristics such as personality and emotion from the semantic structure of language can be usefully applied to GLLMs to understand how the meanings expressed by GLLMs are similar to or different from the meanings we assume in human conversation. An example from personality research is applied to ChatGPT, and the semantic structure describing personality is found to have both similarities to and differences from that of humans. This matters because when GLLMs rate text streams, such differences in meaning may be misinterpreted by humans and thereby show up as systematic biases.
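As one illustration of the kind of procedure the abstract alludes to (a minimal sketch, not the author's actual method), one might elicit a GLLM's pairwise similarity judgments for personality trait adjectives, factor the resulting matrix, and compare the recovered dimensions to a human-derived structure. The `rate_similarity` function below is a hypothetical placeholder for a real API call, and the trait list is illustrative only.

```python
# Sketch: probing the semantic structure of personality in a GLLM by
# factoring the model's pairwise similarity judgments for trait words.
import numpy as np

TRAITS = ["talkative", "reserved", "kind", "critical",
          "organized", "careless", "calm", "anxious"]

def rate_similarity(a: str, b: str) -> float:
    """Placeholder: prompt the GLLM to rate how semantically similar two
    trait words are on a 0-1 scale and parse the numeric reply. Replace
    with an actual model call in practice."""
    return 1.0 if a == b else 0.5  # dummy value so the sketch runs

def similarity_matrix(traits):
    """Assemble a symmetric matrix of the model's similarity judgments."""
    n = len(traits)
    s = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            s[i, j] = s[j, i] = rate_similarity(traits[i], traits[j])
    return s

def principal_loadings(s, k=2):
    """Eigendecompose the similarity matrix and return the first k
    components scaled by the square root of their eigenvalues."""
    vals, vecs = np.linalg.eigh(s)
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0, None))

def tucker_congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors;
    values near 1 indicate the same semantic dimension."""
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))

llm_loadings = principal_loadings(similarity_matrix(TRAITS))
# Compare each column against loadings estimated from human raters on
# the same adjectives (not shown here) using tucker_congruence; low
# congruence would flag dimensions where the GLLM's semantic structure
# departs from the human one.
```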
Steven Boker is Professor of Quantitative Psychology and Data Science at the University of Virginia, where he directs the Human Dynamics Laboratory and the LIFE Academy. He is an internationally recognized expert in modeling longitudinal data using dynamical systems analysis and is one of the core developers of the OpenMx Structural Equation Modeling software. He is currently applying psychometric methods to understand the behavior of generative large language models from a psychological perspective.