I have begun diving into the lives of people who were pioneers in psychometrics. As much as I love measurement theory, psychological testing has a problematic history, which arguably has an effect on testing today.
What is psychometrics?
Psychometrics is the psychological field of study concerned with how we measure and investigate psychological phenomena. Basically.
You might hear about the “metrics” of a social media post. The metrics are quantified measures of people’s behavior when interacting with your social media. Views. Engagement. Reach. Things like that. Each measurement has an operational definition. You have to check the platform to find out how that platform defines “engagement.” Each platform could have a different definition or a different weighting for the component behaviors.
Metrics and Social Media
I’m totally making this up, because the definitions of the metrics keep changing. But I still need to illustrate what I am talking about. So. Let’s say that there is a platform that defines “engagement” as an emoji response and a comment. When the computer at the platform is collecting the data, it is collecting unique users clicking on the emoji and making a comment. Let’s say that this platform will give a 1 to clicking on the emoji and a 5 for making a comment. If a user clicks on the emoji and makes a comment, then the score for that user’s engagement would be 6. If just an emoji, then it would be 1. If the user commented, but didn’t click on the emoji, then 5. Let’s say that the platform also includes clicking on the profile and adds a 10 to the score. As you can see, each behavior is part of the overall score for the user. Each component behavior (emoji, comment, or profile) is weighted differently. Okay, seems straightforward.
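For anyone who likes to see arithmetic as code, here is a minimal Python sketch of that made-up scoring scheme. The weights (1, 5, and 10) come from the example above. The names `ENGAGEMENT_WEIGHTS`, `engagement_score`, and `profile_click` are invented for illustration, and counting each behavior only once per user is an assumption. None of this reflects how any real platform computes engagement.

```python
# Hypothetical weights from the made-up platform in this post.
# These are NOT any real platform's definition of engagement.
ENGAGEMENT_WEIGHTS = {
    "emoji": 1,
    "comment": 5,
    "profile_click": 10,
}


def engagement_score(behaviors):
    """Add up the weight of each distinct behavior one user performed.

    Assumption: a behavior counts once per user, no matter how many
    times the user repeats it.
    """
    return sum(ENGAGEMENT_WEIGHTS[behavior] for behavior in set(behaviors))


print(engagement_score(["emoji", "comment"]))  # emoji + comment
print(engagement_score(["emoji"]))             # emoji only
print(engagement_score(["comment"]))           # comment only
print(engagement_score(["emoji", "comment", "profile_click"]))  # all three
```

The dictionary of weights is the operational definition here. Change the dictionary and the same behavior produces a different score.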
What’s not straightforward is that each platform could have different behaviors that it counts as “engagement.” Some may only count comments. Some may also include people subscribing. It all just depends. The other annoying thing about all of this is that platforms have a fetish for changing their minds about the operational definitions. They will announce it in the changes to terms of service. Oftentimes, these changes are buried. The really annoying thing about this is that your engagement for one post may be based on a different definition than your engagement for another post, even on the same platform. So you could be comparing apples to oranges in your metric history.
Beyond platforms being annoying, this also illustrates how you can measure a behavior. You have a definition and you assign numbers accordingly.
Psychometrics is the study of how to create those definitions, assign the numbers, and ensure that what you’re doing is reliable and valid.
Measurement Reliability and Validity
In a nutshell, reliability means that every time you use a measuring device on the same thing, you get the same answer. If you’re using a ruler to measure wood planks and the plank is an inch long, your ruler will give you an inch every time you use it on that inch-long plank. Validity means that you’re measuring what you think you’re measuring. In the case of the ruler, you think you’re measuring linear magnitude, but you could also be measuring user error, if the person using the ruler isn’t consistent.
My mom used to go to a periodontist who would measure her gums. The periodontist said that she took three measurements each time, because there was a potential for user error. The gums were tiny and the lines for the measurement were tiny as well. Eyesight and holding the ruler the same exact way could have an impact on the measurement taken.
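Taking several readings and summarizing them is a common way to deal with that kind of error. Here is a rough Python sketch of the idea. The readings are made-up numbers in millimeters, and `summarize_readings` is a hypothetical helper, not anything the periodontist actually used. The average gives a steadier estimate than any single reading, and the standard deviation shows how much the readings disagree with each other.

```python
from statistics import mean, stdev

# Made-up readings (in millimeters) of the same spot, taken three times.
readings_mm = [3.0, 2.5, 3.5]


def summarize_readings(readings):
    """Return the average reading and the spread (sample standard deviation)."""
    return mean(readings), stdev(readings)


average, spread = summarize_readings(readings_mm)
print(f"average: {average} mm, spread: {spread} mm")
```

A small spread suggests the measuring is consistent from one attempt to the next. It says nothing about whether the ruler is measuring the right thing, which is the validity question.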
The Problematic Origins of the Field
So you’d think that it’s good to make sure how you’re measuring is reliable and valid. I agree. But the people who were the early pioneers of this field in modern times were doing it for problematic reasons.
I’m just going to say it. Eugenics. One of the reasons that early psychometricians were developing measures was to justify eugenics [1]. This is problematic. The early intelligence tests were designed for the purpose of finding individual differences to support a philosophy that encouraged eugenics [1]. Early psychometrics was also used to assign military jobs during the World Wars and in the development of the US educational system. Either one of those sounds okay, but in the educational system, there is potential for problems. Some may argue that the problematic use of testing was then, and this is now. Others have pointed out that these problems have continued [2].
What have I learned so far?
Well, so far, I’ve not learned anything that is new to me. The problematic history of psychometric testing was part of my learning about it in graduate school. What I personally lack is an in-depth and detailed history of the people who were problematic, which I will happily be sharing. It’s also been a hot minute, as they say, since I was reading about the conversation on how this history is still affecting testing today.
If the history of psychometrics is problematic, why do you like psychometrics?
Well, that’s an interesting and complicated question. I like thinking about how we measure things, especially in my area of interest, which is psychology. I like questioning if the measurement is reliable and valid. I like arguing about how you know if it is. I like the maths of it all. I also really like having fodder for rants, which a problematic history of anything lends itself well to.
As I dive back into this conversation, I will be sharing what I find and my thoughts about it.
If this topic is of interest, buckle in. If you have any questions, let me know. Rabbit hole diving and explaining what I find is my love language.
Citations
Disclaimer:
The content of this blog post is for informational purposes only and does not constitute professional advice.