Statistical Artifacts

When we have good graphs and statistical aids, thinking statistically can feel straightforward and intuitive. Clear charts can help us tell a story, can help us visualize trends and relationships, and can help us better conceptualize risk and probability. However, understanding data is hard, especially if the way that data is collected creates statistical artifacts.

 

Yesterday’s post was about extreme outcomes, and how it is the smallest counties in the United States where we see both the highest per capita instances of cancer and the lowest per capita instances of cancer. Small populations allow for large fluctuations in per capita cancer diagnoses, and thus extreme outcomes in cancer rates. We could graph the per capita rates, model them on a map of the United States, or present the data in unique ways, but all we would really be doing is creating a visual aid influenced by statistical artifacts from the samples we used. As Daniel Kahneman explains in his book Thinking Fast and Slow, “the differences between dense and rural counties do not really count as facts: they are what scientists call artifacts, observations that are produced entirely by some aspect of the method of research – in this case, by differences in sample size.”

 

Counties in the United States vary dramatically. Some counties are geographically huge, while others are pretty small – Nevada is a large state with over 110,000 square miles of land but only 17 counties, while West Virginia has under 25,000 square miles of land and 55 counties. Across the US, some counties are exclusively within metropolitan areas, some are completely within suburbs, some are entirely rural with only a few hundred people, and some manage to incorporate major metros, expansive suburbs, and vast rural stretches (shoutout to Clark County, NV). Counties are convenient for collecting data, but can cause problems when analyzing population trends across the country. The variations in size and other factors create the possibility for the extreme outcomes we see in things like cancer rates across counties. When smoothed out over larger populations, the disparities in cancer rates disappear.

 

Most of us are not collecting lots of important data for analysis each day. Most of us probably don’t have to worry too much on a day-to-day basis about some important statistical sampling problem. But we should at least be aware of how complex information is, and how difficult it can be to display and share information in an accurate manner. We should turn to people like Tim Harford for help interpreting and understanding complex statistics when we can, and we should try to look for factors that might interfere with a convenient conclusion before we simply believe what we would like to believe about a set of data. Statistical artifacts can play a huge role in shaping the way we understand a particular phenomenon, and we shouldn’t jump to extreme conclusions based on poor data.
Extreme Outcomes

Large sample sizes are important. At this moment, the world is racing as quickly as possible toward a vaccine to allow us to move forward from the COVID-19 Pandemic. People across the globe are anxious for a way to resume normal life and to reduce the risk of death from the new virus and disease. One thing standing in the way of the super quick solution that everyone wants is basic statistics. For any vaccine or treatment, we need a large sample size to be certain of the effects of anything we offer to people as a cure or for prevention of COVID-19. We want to make sure we don’t make decisions based on extreme outcomes, and that what we produce is safe and effective.
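As a rough illustration of why trial size matters (a minimal sketch with hypothetical numbers, not a description of any actual trial), consider how the margin of error around an observed rate shrinks as the number of participants grows:

```python
import math

# Hypothetical example: approximate 95% margin of error for an observed
# proportion p_hat (say, the infection rate in a treated group) under a
# simple binomial model with n participants.
def margin_of_error(p_hat: float, n: int) -> float:
    return 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)

for n in (50, 500, 5_000, 50_000):
    print(f"n = {n:>6}: observed 5% rate is 5% +/- "
          f"{margin_of_error(0.05, n):.1%}")
```

With 50 participants, chance alone can swamp the observed effect; with tens of thousands of participants, the estimate becomes trustworthy. That is why large trials are unavoidable, no matter how urgent the pandemic feels.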

 

Statistics and probability are frequent parts of our lives, and many of us probably feel as though we have a basic and sufficient grasp of both. The reality, however, is that we are often terrible at thinking statistically. We are much better at thinking in narrative, and often we substitute a narrative interpretation for a statistical interpretation of the world without even recognizing it. It is easy to change our behavior based on anecdote and narrative, but not always so easy to change our behavior based on statistics. This is why we have the saying often attributed to Stalin: one death is a tragedy, a million deaths is a statistic.

 

The danger with anecdotal and narrative interpretations of the world is that they are drawn from small sample sizes. Daniel Kahneman explains the danger of small sample sizes in his book Thinking Fast and Slow: “extreme outcomes (both high and low) are more likely to be found in small than in large samples. This explanation is not causal.”

 

In his book, Kahneman explains that when you look at the counties in the United States with the highest rates of cancer, you find that some of the smallest counties in the nation have the highest rates. However, if you look at which counties have the lowest rates of cancer, you will also find that it is the smallest counties in the nation that have the lowest rates. While you could drive across the nation looking for explanations for the high and low cancer rates in rural and small counties, you likely wouldn’t find a compelling causal explanation. You might be able to string a narrative together, and if you try really hard you might start to see a causal chain, but your interpretation is likely to be biased and based on flimsy evidence. The fact that our small counties are the ones with both the highest and lowest rates of cancer is an artifact of small sample sizes. When you have small sample sizes, as Kahneman explains, you are likely to see more extreme outcomes. A few random chance events can dramatically change the rate of cancer per thousand residents when a county only has a few thousand residents. In larger, more populated counties, you find a reversion to the mean, and a few extreme chance outcomes are far less likely to influence the overall statistics.
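To make the mechanism concrete, here is a minimal simulation sketch in Python (all numbers are hypothetical, chosen only for illustration): every simulated county shares exactly the same true cancer rate, and only population size varies. The smallest counties still land at both the top and the bottom of the observed per capita rates, purely by chance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: every county has the SAME true cancer rate;
# only population size differs between small and large counties.
TRUE_RATE = 0.004
pops = np.array([1_000] * 500 + [1_000_000] * 500)

cases = rng.binomial(pops, TRUE_RATE)   # chance diagnoses per county
rates = cases / pops                    # observed per capita rates

order = np.argsort(rates)
extremes = np.concatenate([order[:10], order[-10:]])
print("Populations of the 10 lowest- and 10 highest-rate counties:")
print(pops[extremes])                   # nearly all will be 1,000
```

Because every simulated county shares the same underlying rate, any pattern in which counties occupy the extremes is produced entirely by sample size – exactly the kind of artifact Kahneman describes.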

 

To prevent our decision-making from being overly influenced by extreme outcomes we have to move past our narrative and anecdotal thinking. To ensure that a vaccine for the coronavirus or a cure for COVID-19 is safe and effective, we must allow the statistics to play out. We have to have large sample sizes, so that we are not influenced by extreme outcomes, either positive or negative, that we see when a few patients are treated successfully. We need the data to ensure that the outcomes we see are statistically sound, and not an artifact of chance within a small sample.
Affect Heuristics

I studied public policy at the University of Nevada, Reno, and one of the things I had to accept early on in my studies was that humans are not as rational as we like to believe. We tell ourselves that we are making objective and unbiased judgments about the world to reach the conclusions we find. We tell ourselves that we are listening to smart people who truly understand the issues, the policies, and the technical details of policy and science, but studies of voting, of policy preference, and of individual knowledge show that this is not the case.

 

We are nearing November, and in the United States we will be voting for president and other elected officials. Few of us will spend much time investigating the candidates on the ballot in a thorough and rigorous way. Few of us will seek out in-depth and nuanced information about the policies our political leaders support or about referendum questions on the ballot. But many of us, perhaps the vast majority of us, will have strong views on policies ranging from tech company monopolies to tariffs to public health measures. We will reach unshakable conclusions and find a few snippets of facts to support our views. But this doesn’t mean that we will truly understand any of the issues in a deep and complex manner.

 

Daniel Kahneman, in his book Thinking Fast and Slow, helps us understand what is happening with our voting, and reveals what I didn’t want to believe but was confronted with over and over in academic studies. He writes, “The dominance of conclusions over arguments is most pronounced where emotions are involved. The psychologist Paul Slovic has proposed an affect heuristic in which people let their likes and dislikes determine their beliefs about the world.”

 

Very few of us have a deep understanding of economics, international relations, or public health, but we are good at recognizing what is in our immediate self-interest and who represents the identities that are core to who we are. We know that having someone who reflects our identities and praises those identities will help improve the social standing of our group, and ultimately improve our own social status. By recognizing who our leader is and what is in our individual self-interest to support, we can learn which policy beliefs we should adopt. We look to our leaders, learn what they believe and support, and follow their lead. We memorize a few basic facts, and use that as justification for the beliefs we hold, rather than admit that our beliefs simply follow our emotional desire to align with a leader that we believe will boost our social standing.

 

It is this affect heuristic that drives much of our political decision making. It helps explain how we can support some policies which don’t seem to immediately benefit us: by looking at the larger group we want to be a part of and trying to increase the social standing of that group, even at a personal cost. The affect heuristic shows that we want a conclusion to be true because we would benefit from it, and we use motivated reasoning to adopt beliefs that conveniently support our self-interest. There doesn’t need to be any truth to the beliefs; they just need to satisfy our emotional valence and give us a shortcut to making decisions on complex topics.
Evaluating Happiness

If you ask college students how many dates they have had in the last month and then ask them how happy they are overall, you will find that those who had more dates will rate themselves as generally happier than those who had fewer dates. However, if you ask college students how happy they are overall, and only after they evaluate their happiness ask them how many dates they have had, you won’t see a big difference in overall happiness based on the number of dates that students had in the last month.

 

Daniel Kahneman looks at the results of studies like this in his book Thinking Fast and Slow and draws the following conclusion: “The explanation is straightforward, and it is a good example of substitution. Happiness these days is not a natural or an easy assessment. A good answer requires a fair amount of thinking. However, the students who had just been asked about their dating did not need to think hard because they already had in their mind an answer to a related question: how happy were they with their love life?”

 

This example is interesting because we are often placed in situations where we have to make a quick assessment of a large and complex state of being. When we buy a new car or house we rarely have a chance to live with the car or house for six months to determine if we really like it and if it is actually a good fit for us. We have a test drive or two, a couple of walk-throughs, and then we are asked to make an assessment of whether we would like to own the thing and whether it would be a good fit for our lives. We face the same challenges with voting for president, choosing a college or major, hiring a new employee or taking a new job, and buying a mattress. Evaluating happiness and predicting happiness is complex and difficult, and often without noticing it, we switch the question to something that is easier for us to answer. We narrow down our overall assessment to a few factors that are easier to evaluate and hold in our head. More dates last month means I’m happier.

 

“The present state of mind looms very large when people evaluate their happiness,” writes Kahneman.

 

We often judge the president based on the economy in the last months or weeks leading up to an election. We may choose to buy a home or car based on how friendly our agent or salesperson was and whether they did a good job of making us feel smart. Simple factors that might influence our mood in the moment can alter our perceived level of happiness and directly shape the decisions we make. We rarely pause to think about how happy we are on an overall level, and if we do, it is hard to untangle the things that are influencing our current mood from our perception of our general life happiness. It is important to recognize how much the current moment can shape our overall happiness so that we can pause and adjust our behaviors and attitudes to better reflect our reality. Having a minor inconvenience should not throw off our entire mood and outlook on life. Similarly, if we are in positions we dislike and find unbearable, we should not put up with the status quo just because someone flatters us but makes no real changes to improve our situation. Ultimately, it is important for us to be able to recognize what is happening in our minds and to be able to recognize when our minds are likely to be influenced by small and rather meaningless things.
Biased in Predictable Ways

“A judgment that is based on substitution will inevitably be biased in predictable ways,” writes Daniel Kahneman in his book Thinking Fast and Slow. Kahneman uses an optical illusion to show how our minds can be tricked in specific ways to lead us to an incorrect conclusion. The key take-away is that we can understand and predict our biases and how those biases will lead to specific patterns of thinking. The human mind is complex and varied, but the errors it makes can be studied, understood, and predicted.

 

We don’t like to admit that our minds are biased, and even if we are willing to admit a bias in our thinking, we are often even less willing to accept a negative conclusion about ourselves or our behavior resulting from such a bias. However, as Kahneman’s work shows, our biases are predictable and follow patterns. We know that we hold biases, and we know that certain biases can arise or be induced in certain settings. If we are going to accept these biases, then we must accept what they tell us about our brains and about the consequences of these biases, regardless of whether they are trivial or have major implications in our lives and societies.

 

In a lot of ways, I think this describes the conflicts we are seeing in American society today. There are many situations where we are willing to admit that biases occur, but to admit and accept a bias implicates larger social phenomena. Admitting a bias can make it hard to deny that larger social and societal changes may be necessary, and the costs of change can be too high for some to accept. This puts us in situations where many deny that bias exists, or live in contradiction, where a bias is accepted but a remedy to rectify its consequences is not. A bias can be accepted while the recognition that biases are predictable and understandable is rejected, despite the mental contradictions that arise.

 

As we have better understood how we behave and react to each other, we have studied more forms of bias in certain settings. We know that we are quick to form in-groups and out-groups. We know that we see some people as more threatening than others, and that we are likely to have very small reactions that we might not consciously be aware of, but that can nevertheless be perceived by others. Accepting and understanding these biases with an intention to change is difficult. It requires not just that one person adapt their behavior, but that many people change some aspect of their lives, often giving up material goods and resources or status. The reason there is so much anger and division in the United States today is because there are many people who are ready to accept these biases, to accept the science that Kahneman shows, and to make changes, while many others are not. Accepting the science of how the brain works and the biases that can be produced in the brain challenges our sense of self, reveals things about us that we would rather leave in the shadows, and might call for change that many of us don’t want to make, especially when a fiction that denies such biases helps propel our status.
Probability Judgments

Julia Marcus, an epidemiologist at Harvard Medical School, was on a recent episode of The Ezra Klein Show to discuss thinking about personal risk during the COVID-19 Pandemic. Klein and Marcus talked about the ways in which the United States government has failed to help provide people with structures for thinking about risk, and how this has pushed risk decisions onto individuals. They talked about how this creates pressure on each of us to determine what activities are worthwhile, what is too risky for us, and how we can know if there is a high probability of infection in one setting relative to another.

 

On the podcast they acknowledged what Daniel Kahneman writes about in his book Thinking Fast and Slow – humans are not very good at making probability judgments. Risk is all about probability. It is fraught with uncertainty, with small likelihoods of very bad outcomes, and with conflicting opinions and desires. Our minds, especially in our normal operating mode of quick associations and judgments, don’t have the capacity to think statistically in the way that is necessary to make good probability judgments.

 

When we try to think statistically, we often turn to substitutions, as Kahneman explains in his book. “We asked ourselves how people manage to make judgments of probability without knowing precisely what probability is. We concluded that people must somehow simplify that impossible task and we set out to find how they do it. Our answer was that when called upon to judge probability, people actually judge something else and believe they have judged probability.”

 

This is very important when we think about our actions, and the actions of others, during this pandemic. We know it is risky to have family dinners with our loved ones, and we ask ourselves if it is too risky to get together with our parents, whether siblings with health conditions are safe to visit, and whether we should be in the same room as a family member who is a practicing medical professional. But in the end, we answer a different question. We ask how much we miss our parents, whether we think it is important to be close to our family, and whether we really, really want some of mom’s famous pecan pie.

 

As Klein and Marcus say during the podcast, it is a lot easier to be angry at people at a beach than to make probability judgments about a small family dinner. When governments, public health officials, and employers fail to establish systems to help us navigate the risk, we place the responsibility back onto individuals, so that we can have someone to blame, some sense of control, and an outlet for the frustrations that arise when our mind can’t process probability. We distort probability judgments and ask more symbolic questions about social cohesion, family love, and isolation. The answer to our challenges would be better and more responsive institutions and structures to manage risk and mediate probability judgments. The individual human mind can only substitute easier questions for complex probability judgments, and it needs visual aids, better structures, and guidance to help think through risk and probability in an accurate and reasonable manner.
Substitution Heuristics

I think heuristics are underrated. We should discuss heuristics as a society far more than we do. We barely acknowledge heuristics, but if we look closely, they are at the heart of many of our decisions, beliefs, and assumptions. They save us a lot of work and help us move through the world pretty smoothly, yet they are rarely discussed directly or even consciously recognized.

 

In Thinking Fast and Slow, Daniel Kahneman highlights heuristics in the sense of substitution and explains their role as:

 

“The target question is the assessment you intended to produce.
The heuristic question is the simpler question that you answered instead.”

 

I have already written about our brain substituting easier questions for harder questions, but the idea of heuristics gives the process a deeper dimension. Kahneman defines a heuristic, writing, “The technical definition of heuristic is a simple procedure that helps find adequate, though often imperfect, answers to difficult questions.”

 

In my own life, and I imagine I am a relatively average case, I have relied on heuristics to help me make a huge number of decisions. I don’t know the best possible investment strategies for my future retirement, but as a heuristic, I know that working with an investment advisor to manage mutual funds and IRAs can be an adequate (even if not perfect) way to ensure I save for the future. I don’t know the healthiest possible foods to eat and what food combinations will maximize my nutrient intake, but as a heuristic I can ensure that I have a colorful plate with varied veggies and not too many sweets to ensure I get enough of the vitamins and nutrients that I need.

 

We have to make a lot of difficult decisions in our lives. Most of us don’t have the time or the ability to compile all the information we need on a given subject to make a fully informed decision, and even if we try, most of us don’t have a reasonable way to sort through contrasting and competing information to determine what is true and what the best course of action would be. Instead, we make substitutions and use heuristics to figure out what we should do. Instead of recognizing that we are using heuristics, however, we ascribe a higher level of confidence and certainty to our decisions than is warranted. What we do, how we live, and what we believe become part of our identity, and we fail to recognize that we are adopting a heuristic to achieve some version of what we believe to be a good life. When pressed to think about it, our mind creates a justification for our decision that doesn’t acknowledge the heuristics in play.

 

In a world where we were quicker to recognize heuristics, we might be able to live with a little distance between ourselves, our decisions, and our beliefs. We could acknowledge that heuristics are driving us, and be more open to change and more willing to be flexible with others. Acknowledging that we don’t have all the answers (that we don’t even have all the necessary information) and are operating on substitution heuristics for complex questions, might help us be less polarized and better connected within our society.
Rarely Stumped

Daniel Kahneman starts one of the chapters in his book Thinking Fast and Slow by writing, “A remarkable aspect of your mental life is that you are rarely stumped. True, you occasionally face a question such as 17 × 24 = ? to which no answer comes immediately to mind, but these dumbfounded moments are rare. The normal state of your mind is that you have intuitive feelings and opinions about almost everything that comes your way.”

 

When I read this quote I am reminded of Gus, the father, in My Big Fat Greek Wedding. He is always ready to show how every word comes from a Greek root, even a Japanese word like kimono. He is sure of his intellect, sure that his heritage is perfect and is the foundation of all that is good in the world. He trusts his instincts and intuitions to a hilarious extent, even when he is clearly wrong and even when his decisions are gift-wrapped and planted in his mind in an almost Inception style.

 

His character is part caricature, but it is revealing of what Kahneman explains in the quote above. Our minds are good at finding intuitive answers that make sense of the world around us, even if we really don’t have any idea what is going on. We laugh at Gus and don’t consider ourselves guilty of behaving like him, but the only difference between most of us and Gus is that Gus is an exaggeration of the intuitive dogmatism and self-assurance that we all live with.

 

We scroll through social media, and trust that our initial judgment of a headline or post is the right frame for how to think about the issue. We are certain that our home remedy for tackling bug bites, cleaning windows, or curing a headache is based on sound science, even if it does nothing more than produce a placebo effect. We find a way to fit every aspect of our lives into a comprehensive framework where our decisions appear rational and justified, with us being the hero (or innocent victim if needed) of the story.

 

We should remember that we have a propensity to believe that we are always correct, that we are never stumped. We should pause, ask more questions, think about what is important to know before making a decision, and then deeply interrogate our thoughts to decide if we really have obtained meaningful information to inform our opinions, or if we are just acting on instinct, heuristics, self-interest, or groupthink. We cannot continue believing we are right and pushing baseless beliefs onto others when we have no real knowledge of an issue. We shouldn’t assume things are true just because they happen to align with the story we want to believe about ourselves and the world. When it comes to crucial issues and our interactions and relationships with others, we need to think more critically, and recognize when we are assuming we are right. If we can pause at those times and think more deeply, gather more information, and ask more questions of ourselves, we can have more accurate and honest interactions and relationships. Hopefully this will help us lead more meaningful lives and build the connected communities we all need in order to thrive.
Judging Faces

One of the successes of System 1, the name Daniel Kahneman uses to describe our quick, intuitive part of the brain in his book Thinking Fast and Slow, is recognizing emotions in people’s faces. We don’t need much time to study someone’s face to recognize that they are happy, scared, or angry. We don’t even need to see someone’s face for a full second to get an accurate sense of their emotional state, and to adjust our behavior to interact accordingly with them.

 

The human mind is great at intuiting emotions from people’s faces. I can’t remember where, but I came across something that suggested the whites of our eyes evolved to help us see where each other’s eyes are looking, and to help us better read each other’s emotions. Our ability to quickly and intuitively read each other’s faces helps us build social cohesion and connections. However, as adept as we are, it can still go wrong.

 

Kahneman explains that biases and baseless assumptions can be built into System 1’s assessment of faces. We are quick to notice faces that share features similar to our own. We are also quick to judge people as nice, competent, or strong based on features of their faces. This is demonstrated in Thinking Fast and Slow with experiments conducted by Alex Todorov. He showed potential voters the faces of candidates, sometimes for only a fraction of a second, and found that faces influenced votes. Kahneman writes, “As expected, the effect of facial competence on voting is about three times larger for information-poor and TV-prone voters than for others who are better informed and watch less television.”

 

I’m not here to hate on information-poor and TV-prone voters, but instead to help us see that we can easily be influenced by people’s faces and the traits we have associated with facial characteristics, even if we don’t consciously know those associations exist. For all of us, there will be situations where we are information-poor and ignorant of issues or important factors for our decision (the equivalent of being TV-prone in electoral voting). We might trust what a mechanic or investment banker says if they have a square jaw and high cheekbones. We might trust the advice of a nurse simply because she has facial features that make her seem caring and sympathetic. Perhaps in both situations the person is qualified and competent to be giving us advice, but even if they were not, we might trust them based on little more than appearance. System 1, which is so good at telling us about people’s emotions, can jump ahead and make judgments about many characteristics of people based solely on their faces, and while it may be correct sometimes, it can also be wrong. System 2 will probably construct a coherent narrative to justify the quick decision made by System 1, but that narrative likely won’t have much to do with the person’s experience and qualifications. We may find that we end up in situations where, deep down, we are making judgments of someone based on little more than what they look like, and what System 1 thought of their face.
What You See Is All There Is

In Thinking Fast and Slow, Daniel Kahneman gives us the somewhat unwieldy acronym WYSIATI – what you see is all there is. The acronym describes a phenomenon that stems from how our brains work. System 1, the name that Kahneman gives to the part of our brain which is automatic, quick, and associative, can only take in so much information. It makes quick inferences about the world around it, and establishes a simple picture of the world for System 2, the thoughtful calculating part of our brain, to work with.

 

What you see is all there is means that we are limited by the observations and information that System 1 can take in. It doesn’t matter how good System 2 is at processing and making deep insights about the world if System 1 is passing along poor information. Garbage in, garbage out, as the computer science majors like to say.

 

Daniel Kahneman explains what this means for our day to day lives in detail in his book. He writes, “As the WYSIATI rule implies, neither the quantity nor the quality of the evidence counts for much in subjective confidence. The confidence that individuals have in their beliefs depends mostly on the quality of the story they can tell about what they see, even if they see little.”

 

System 2 doesn’t recognize that System 1 hands it incomplete and limited information. It chugs along believing that the information handed off by System 1 is everything that it needs to know. It doesn’t ask for more information; it just accepts that it has been handed a complete data set and begins to work. System 2 creates a solid narrative out of whatever information System 1 gives it, and only momentarily pauses if it notices an inconsistency in the story it is stitching together about the world. If it can make a coherent narrative, then it is happy and doesn’t find a need to look for additional information. What you see is all there is; there isn’t anything missing.

 

But we know that we only take in a limited slice of the world. We can’t sense the Earth’s magnetic pull, we can’t see in ultraviolet or infrared, and we have no way of knowing what is really happening in another person’s mind. When we read a long paper or finish a college course, we will remember some things, but not everything. Our mind can only hold so much information, and System 2 is limited to what can be observed and held. This should be a huge problem for our brains: we should recognize enormous blind spots and be paralyzed into inaction by a lack of information. But this isn’t what happens. We don’t even notice the blind spots; instead we make a story from the information we collect, building a complete world that makes sense of that information, no matter how limited it is. What you see is all there is: we make the world work, but we do so with only a portion of what is really out there, and we don’t even notice that we do so.