One of my professors once made the startling statement that, “one cannot measure speed.” This came as something of a surprise to those of us who had speedometers in our cars. Yet, the professor had made a profound point. In science, it is vitally important to know exactly what phenomenon one actually measures. Especially in the arena of public policy, we often act as if we have accurate measurements of phenomena when we actually do not. I think the problem is especially bad when it comes to the question of mental health.
Let’s start with speed. We talk as if we measure speed directly all the time. Indeed, we have laws based on the idea. However, we don’t actually measure a single phenomenon called “speed.” We actually take separate measurements of distance and time and mathematically combine them to create a synthetic measurement called “speed.” The speedometer in a car doesn’t even measure distance directly. It actually measures the number of rotations of the axle during a set period of time. Change the diameter of the tires and suddenly the speedometer does not measure the “correct” speed. Prop the car up on the rollers of an engine test bed and the speedometer will report that the car is whizzing down the road at 80 mph while it sits perfectly still. Speed does not exist. Only space and time exist and we measure those.
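To make the point concrete, here is a minimal sketch of the arithmetic a speedometer performs. It is an illustration only, not how any particular instrument is built: the function names, the 26 and 28 inch tire diameters, and the assumption that one axle rotation equals one wheel rotation are all mine.

```python
import math

INCHES_PER_MILE = 63360
SECONDS_PER_HOUR = 3600


def indicated_speed_mph(rotations, seconds, assumed_tire_diameter_in):
    """Speed the instrument reports.

    It never sees the road. It only has a rotation count, a time window,
    and a tire diameter that was assumed when it was calibrated.
    """
    miles_per_rotation = math.pi * assumed_tire_diameter_in / INCHES_PER_MILE
    hours = seconds / SECONDS_PER_HOUR
    return rotations * miles_per_rotation / hours


def actual_speed_mph(rotations, seconds, actual_tire_diameter_in, on_rollers=False):
    """Speed over the ground (hypothetical helper for comparison)."""
    if on_rollers:
        # The wheels spin but the car goes nowhere.
        return 0.0
    return indicated_speed_mph(rotations, seconds, actual_tire_diameter_in)


if __name__ == "__main__":
    rotations, seconds = 14, 1.0  # illustrative numbers only
    shown = indicated_speed_mph(rotations, seconds, assumed_tire_diameter_in=26)
    real = actual_speed_mph(rotations, seconds, actual_tire_diameter_in=28)
    print(f"speedometer shows {shown:.1f} mph, car is doing {real:.1f} mph")
    print("on rollers:", actual_speed_mph(rotations, seconds, 28, on_rollers=True))
```

The instrument's answer is only as good as the calibration assumption baked into it. Swap the tires and the reading is off by exactly the ratio of the two diameters.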
Scientists routinely use such proxy or synthetic measurements but doing so can backfire:
Historical measurements of ozone, for example, are expressed in Dobson units which do not measure ozone directly but rather use a spectrometer to measure the ratio of the abundance of different wavelengths of ultraviolet light that create or destroy ozone. But the long-term accuracy of this method is based on the assumption that the distribution of wavelengths of ultraviolet coming from the sun remains constant over time. As long as that remains true, scientists can reasonably infer that changes in the ratio measured on the ground result from changes in the atmosphere. However, recent solar observations suggest that the solar UV spectrum changes significantly in sync with solar magnetosphere (sunspot) activity. That means that the lion’s share of ozone data collected over the last 80+ years may be highly inaccurate. Whoops.
Tree ring data is often used as a proxy measurement of seasonal temperature changes under the assumption that warmer seasons produce thicker rings. Yet, temperature isn’t the only factor that controls ring growth. Moisture plays a significant role as well. Trees grow fastest in a warm, wet season and slowest in a cool, dry one; but a warm, dry season produces the same growth as a cool, wet one. Telling a warm, dry season from a cool, wet one using ring width alone is very difficult and seldom done.
The most commonly used medical tests for pathogens rely on detecting the presence of antibodies for the pathogens, but if the patient’s immune system hasn’t yet produced an antibody for a pathogen the test will generate a false negative. An individual can be jam-packed with microbes but the test won’t measure them. Even the older incubation test will fail. We still can’t identify diseases which we cannot reliably incubate in the lab. Many yeasts and protozoa, and most viruses, escape our scrutiny because we can’t study them outside the body. Most of the viral illnesses we have names for are those that produce a highly visible rash. Otherwise, when the doctor says, “you have a virus,” he has no clue what virus you have.
In the social sciences, the measurement problems just get worse.
Economics fails as a predictive science precisely because we cannot actually measure economic activity. Many of the supposed measurements we use in our political discourse don’t look very meaningful when examined close up. Take gross national product. What actually gets measured in compiling that statistic? The answer would fill a shelf of binders and that is just for one country. Every country or region has its own standard. Then you have the entire question of what currency to measure in. Using the concept of purchasing power parity, China has the second-largest economy in the world. Expressed in US dollars, China has an economy of a size between the economies of Italy and France. The definition of unemployment grows super fuzzy the closer one looks at it, and again the definitions change over time and space.
In fact, virtually no supposed statistic in the social sciences rests on any firm physical measurement. Crime? Different jurisdictions define the same act as different types of crime or may disagree whether a crime occurred at all. Poverty? Do you define it by income or assets? The actual conditions in which a “poor” person lives change dramatically from era to era. Is a person with better housing, medical care, etc. just as “poor” as someone with the same relative income from 50 years earlier? Domestic violence? Is it domestic if you beat up someone you just hooked up with? How about someone you just live with, or do you have to be legally married? Is it domestic if you do it in your living room but not if you do it in a bar? Homelessness? I was once classified as homeless under one federal study’s standards because I spent a couple of months sleeping on a friend’s couch. (If I had been having sex with him I wouldn’t have been classified as homeless. Go figure.) Most homelessness studies use widely varying definitions of “homeless.” That is why “estimates” of homelessness range from a few hundred thousand to tens of millions.
Even statistics that would seem glaringly obvious grow fuzzy upon close examination. Take infant mortality. You would think that a dead baby is a dead baby, yet global infant mortality statistics use such widely differing definitions of a live birth that comparing region to region becomes meaningless. In most of America, a baby of any term that takes a breath outside the womb counts as a live birth. In most of Europe, a baby must survive several hours, usually 24, and be of 7+ months term to count as a live birth. Since babies are most likely to die immediately after birth, requiring hours of survival immediately improves infant mortality rates. Most of the apparent difference between American and European infant mortality rates disappears when these differences in the definition of live births are taken into account.
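A toy calculation shows how much the definition alone can move the number. Everything in this sketch is made up for illustration: the cohort of 1,000 births, the counts in each group, and the two rule functions are my own simplifications of the "any breath" and "24 hours and 7+ months" standards described above, not real data or any agency's actual rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Birth:
    gestation_months: float
    # Hours the baby lived; None means the baby survived the first year.
    hours_survived: Optional[float]


def any_breath_rule(b: Birth) -> bool:
    """Every baby that breathes outside the womb counts as a live birth."""
    return True


def strict_rule(b: Birth) -> bool:
    """Counts only babies of 7+ months term that survive at least 24 hours."""
    survived_first_day = b.hours_survived is None or b.hours_survived >= 24
    return b.gestation_months >= 7 and survived_first_day


def infant_mortality_per_1000(births: List[Birth],
                              counts_as_live_birth: Callable[[Birth], bool]) -> float:
    """Deaths per 1,000 live births, under whichever definition is passed in."""
    counted = [b for b in births if counts_as_live_birth(b)]
    deaths = sum(1 for b in counted if b.hours_survived is not None)
    return 1000 * deaths / len(counted)


# Hypothetical cohort of 1,000 births (invented numbers).
cohort = (
    [Birth(9, None)] * 990    # full term, survived the year
    + [Birth(9, 1)] * 4       # full term, died within the first day
    + [Birth(6, 48)] * 2      # under 7 months, died after two days
    + [Birth(9, 2000)] * 4    # full term, died later in the year
)

if __name__ == "__main__":
    print("any-breath rule:", infant_mortality_per_1000(cohort, any_breath_rule))
    print("strict rule:    ", round(infant_mortality_per_1000(cohort, strict_rule), 2))
```

The same thousand babies produce two very different rates, because the strict rule removes the earliest deaths from both the numerator and the denominator.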
Mental illness also has little to no physical measurement. With the exception of mental illness caused by gross infection or endocrine disease, no clinical test exists for mental illness of any kind or degree. We functionally define mental illness based on consensus-defined, subjective observational diagnoses. We train mental health professionals to diagnose mental illness by showing them a large number of patients with different degrees of apparent impairment and then saying, “we’ve agreed that this guy is crazy but this guy is not. If you see someone acting like the first guy, classify him as suffering from disease ‘X’ of degree ‘Y’.” Even attempts to use tests like the MMPI only correlate scores on the test with observational diagnoses of other people who took the test. Studies have shown that diagnoses both of disease and severity fluctuate significantly depending on both professional fads and available resources. Provide benefits for treating mental illness and suddenly you have a lot more of it.
We don’t at this time have any choice but to use observational diagnoses for mental health, but this does present problems when debating public policy. For example, are we locking up large numbers of the mentally ill in prisons or do we define mental illness differently? If someone in prison is depressed, is that normal or not? After all, they are in prison. If a region provides more mental health care, it will have more mental illness diagnoses. That means more people will have mental health records before incarceration. Does that mean more mentally ill people are locked up or that we moved the goalposts? Have we changed the way we train the people making the diagnoses such that they see problems that people before did not?
A great deal of table pounding and moral outrage gets expressed, supposedly in reaction to this or that statistic that purports to show some huge new problem about which something-must-be-done! Yet upon detailed examination we find that the statistic rests on quicksand. Change a minor assumption here or there and the “problem” disappears. We so desperately want to believe that we can measure, understand and correct problems that we do not ask the really hard questions about what we know and how accurately and precisely we know it.
You can’t measure speed and you can’t measure crazy.