Science and Facts

Science helps us understand the world and answer questions about how and why things are the way they are. But this doesn’t mean science always gives us the most accurate answers possible. Quite often science only seems to suggest an answer, sometimes the answer we get doesn’t really answer the question we wanted to ask, and sometimes there is just too much noise to gain any real understanding. The inability to perfectly answer every question, especially when we present science as providing clear facts when teaching it to young children, is a source of confusion and dismissal among those who don’t want to believe the answers that science gives us.
In Spook: Science Tackles the Afterlife, Mary Roach writes, “Of course, science doesn’t dependably deliver truths. It is as fallible as the men and women who undertake it. Science has the answer to every question that can be asked. However, science reserves the right to change that answer should additional data become available.” The science of the afterlife (really the science of life, living, death, and dying), Roach explains, has been a science of revision. What we believe, how we conduct experiments, and how we interpret scientific results have shifted as our technology and scientific methods have progressed. The science of life and death has given us many different answers over the years as our own biases have shifted and as our data and computer processing have evolved.
The reality is that all of our scientific fields of study are incomplete. There are questions we still don’t have great answers to, and as we seek those answers, we have to reconsider older answers and beliefs. We have to study contradictions and try to understand what might be wrong with the way we have interpreted the world. What we bring to science impacts what we find, and that means that sometimes we don’t find truths, but conveniently packaged answers that reinforce what we always wanted to be true. Over time, however, the people doing the science change, the background knowledge brought to science changes, and the way we understand the answers from science changes. It can be frustrating to those of us on the outside who want clear answers and don’t want to be abused by people who wish to deliberately mislead based on incomplete scientific knowledge. But over time science revises itself to become more accurate and to better describe the world around us.
Technological Uncertainty & Fear

New technologies scare people. When a new technology comes along, we react to the uncertainties of what the technology will mean. We predict worst-case scenarios, fear that some sort of physiological change that we cannot control may take place, and worry that the new technology could destroy some part of social life. We can look back at many of these technological changes and laugh at the worries and concerns of people at the time, but the truth is that we see this occur over and over in response to technological change, and we are guilty, or capable of being guilty, of the same fear.
Technological fear is tied to uncertainty. Thinking about putting computer chips directly into our brains to interface with the internet or some type of computer hardware and software is a good example of such a fear today. What will happen if our brains can be hacked? What will happen to media, information, and social connections if we all have chips in our brains? Will we still be human (whatever that means) if we merge our brains with silicon chips?
I am currently reading about the Industrial Revolution in the 1800s and early 1900s, and while people were not afraid of computer chips in their brains, they were afraid of new technology and what it would do to people and society. In a previous book I read, Packing for Mars, Mary Roach explains that this same fear and uncertainty took place when people thought about space travel and zero gravity. Space travel required immense speeds, and we didn’t know if the body and mind could handle such speeds. On top of that, no one knew what would happen to the human body in zero gravity. Would normal body functions still work without Earth’s gravitational pull?
Regarding our technological uncertainty and fear, specifically with ever-increasing transportation speeds, Roach writes, “over the course of history, the same sort of anxiety has appeared every time a newer, faster form of transport has come along.” Scientists feared that trains would be too fast for people, that airplanes would be too foreign to any experience the body had evolved to handle, and that all kinds of other technologies and forms of transport would zoom and shake the body into jelly. When we are uncertain about a new technology, fear can take over, and we worry about a range of impacts that could occur. Humans have been doing this since at least the Industrial Revolution, and with robots, computer chip implants, and other changes on the horizon, we are not likely to stop any time soon.
Political and Scientific Numbers

I am currently reading a book about the beginnings of the Industrial Revolution, and the author has recently been comparing the development of textile mills, steam engines, and chemical production in Britain in the 1800s to the same developments on the European continent. It is clear that within Britain the development of new technologies and the adoption of larger factories to produce more material were much quicker than on the continent, but exactly how much quicker is hard to determine. One of the biggest challenges is finding reliable and accurate information to compare the number of textile factories, the horsepower of steam engines, or how many chemical products were exported in a given decade. In the 1850s, getting good data and preserving that data for historians to sift through and analyze a couple of hundred years later was not an easy task. Many of the numbers that the author has referenced are generalized estimates and ranges, not well-defined statistical figures. Nevertheless, this doesn’t mean the data are not useful and cannot help us understand general trends of the Industrial Revolution in Britain and on the European continent.
Our ability to obtain and store numbers, information, and data is much better today than in the 1800s, but that doesn’t mean that all of our numbers are now perfect and that we have everything figured out. Sometimes our data comes from pretty reliable sources, like the GPS map data on Strava that gives us an idea of where lots of people like to exercise and where very few people exercise. Other data is pulled from surveys which can be unreliable or influenced by word choice and response order. Some data comes from observational studies that might be flawed in one way or another. Other data may just be incomplete, from small sample sizes, or simply messy and hard to understand. Getting good information out of such data is almost impossible. As the saying goes, garbage in – garbage out.
Consequently we end up with political numbers and scientific numbers. Christopher Jencks wrote about the role that both have played in how we understand and think about homelessness in his book The Homeless. He writes, “one needs to distinguish between scientific and political numbers. This distinction has nothing to do with accuracy. Scientific numbers are often wrong, and political numbers are often right. But scientific numbers are accompanied by enough documentation so you can tell who counted what, whereas political numbers are not.”
It is interesting to think about the accuracy (or perhaps inaccuracy) of the numbers we use to understand our world. Jencks explains that censuses of homeless individuals need to be conducted early in the morning or late at night to capture the full number of people sleeping in parks or leaving from/returning to overnight shelters. He also notes the difficulty of contacting people to confirm their homeless status and the challenges of simply surveying people by asking if they have a home. People use different definitions of having a home, being homeless, or having a fixed address and those differences can influence the count of how many homeless people live within a city or state. The numbers are backed by a scientific process, but they may be inaccurate and not representative of reality. By contrast, political numbers could be based on a random advocate’s average count of meals provided at a homeless shelter or by other estimates. These estimates may end up being just as accurate, or more so, than the scientific numbers used, but how the numbers are used and understood can be very different.
Advocacy groups, politicians, and concerned citizens can use non-scientific numbers to advance their cause or their point of view. They can rely on general estimates to demonstrate that something is or is not a problem. But they can’t necessarily drive actual action by governments, charities, or private organizations with only political numbers. Decisions look bad when made based on rough guesses and estimates. They look much better when they are backed by scientific numbers, even if those numbers are flawed. When it is time to actually vote, when policies have to be written and enacted, and when a check needs to be signed, having some sort of scientific backing to a number is crucial for self-defense and for (at least an attempt at) rational thinking.
Today we are a long way off from the pen and paper (quill and scroll?) days of the 1800s. We have the ability to collect far more data than we could have ever imagined, but the numbers we end up with are not always that much better than rough estimates and guesses. We may use the data in a way that shows that we trust the science and numbers, but the information may ultimately be useless. These are some of the frustrations that so many people have today with the ways we talk about politics and policy. Political numbers may suggest we live in one reality, but scientific numbers may suggest another reality. Figuring out which is correct and which we should trust is almost impossible, and the end result is confusion and frustration. We will probably solve this with time, but it will be a hard problem that will hang around and may worsen as misinformation spreads online.
Superhero Cadavers

[Author Note: This begins a short three post break in writing regarding homelessness for a few quotes and thoughts on books by Mary Roach. More to come from Roach after finishing some additional writing on homelessness and poverty.]
Mary Roach’s book Stiff: The Curious Lives of Human Cadavers is an exploration into what happens to bodies donated for scientific research. In the book she meets with scientists, researchers, and academics who are working with human cadavers to make life better for those of us who are still living. It is a witty, humorous, yet altogether respectful exploration of the ways in which the human body has helped propel our species forward, even after the human life within the body has expired.
Regarding cadavers and what they have unlocked through sometimes gory (though today as considerate and respectful as possible) experiments, Roach writes the following:
“Cadavers are our superheroes: They brave fire without flinching, withstand falls from tall buildings and head-on car crashes into walls. You can fire a gun at them or run a speedboat over their legs, and it will not faze them. Their heads can be removed with no deleterious effect. They can be in six places at once. I take the Superman point of view: What a shame to waste these powers, to not use them for the betterment of humankind.”
The scientific study of cadavers can be off-putting, but it has been incredibly valuable for humanity across the globe. Cadavers have helped us understand basic anatomy, design safer cars, and ensure the safety of astronauts. Without cadavers, many more people would have died in ill-devised medical experiments and car crashes, and numerous live animals would have suffered as alternative test subjects. Cadavers perform miraculous jobs that living humans cannot perform, and for their service and sacrifices, we should all be grateful.
Stories from Big Data

Dictionary.com describes datum (the singular of data) as “a single piece of information; any fact assumed to be a matter of direct observation.” So when we think about big data, we are thinking about massive amounts of individual pieces of information or individual facts from direct observation. Data simply are what they are, facts and individual observations in isolation.
On the other hand, Dictionary.com defines information as “knowledge communicated or received concerning a particular fact or circumstance.” Information is the knowledge, story, and ideas we have about the data. These two definitions are important for thinking about big data. We never talk about big information, but the reality is that big data is less important than the knowledge we generate from the data, and that knowledge isn’t as objective as the individual datum.
In The Book of Why, Judea Pearl writes, “a generation ago, a marine biologist might have spent months doing a census of his or her favorite species. Now the same biologist has immediate access online to millions of data points on fish, eggs, stomach contents, or anything else he or she wants. Instead of just doing a census, the biologist can tell a story.” Science has become contentious and polarizing recently, and part of the reason has to do with the stories that we are generating based on the big data we are collecting. We can see new patterns, new associations, new correlations, and new trends in data from across the globe. As we have collected this information, our impact on the planet, our understanding of reality, and how we think about ourselves in the universe have changed. Science is not simply facts; that is to say, it is not just data. Science is information: knowledge and stories that have continued to challenge the narratives we have held onto as a species for thousands of years.
Judea Pearl thinks it is important to recognize the story aspect of big data. He thinks it is crucial that we understand the difference between data and information, because without doing so we turn to the data blindly and can generate an inaccurate story based on what we see. He writes,
“In certain circles there is an almost religious faith that we can find the answers to … questions in the data itself, if only we are sufficiently clever at data mining. However, readers of this book will know that this hype is likely to be misguided. The questions I have just asked are all causal, and causal questions can never be answered from data alone.”
Big data presents us with huge numbers of observations and facts, but those facts alone don’t represent causal structures or deeper interactions within reality. We have to generate information from the data and combine that new knowledge with existing knowledge and causal hypotheses to truly learn something new from big data. If we don’t, then we will simply be identifying meaningless correlations without truly understanding what they mean or imply.
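To make the point about correlations concrete, here is a minimal Python sketch. It is not from Pearl's book. The variables x, y, and z and their numbers are invented for illustration, and z is assumed to be a hidden common cause of both x and y. The data alone show a perfect correlation between x and y, even though changing x would do nothing to y.

```python
# Hypothetical data: z is an assumed common cause of both x and y.
z = [1, 2, 3, 4, 5, 6]
x = [2 * v + 1 for v in z]   # x depends only on z
y = [3 * v - 2 for v in z]   # y depends only on z, never on x


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def y_if_x_forced(forced_x, z_values):
    """Value of y if we intervene and set x by hand.

    The mechanism that produces y ignores x entirely, so forcing x
    to any value leaves y exactly where it was.
    """
    return [3 * v - 2 for v in z_values]


print(pearson(x, y))               # correlation seen in the data
print(y_if_x_forced(100, z) == y)  # does intervening on x change y?
```

Without the causal assumption that z drives both variables, nothing in the numbers themselves tells us that the correlation between x and y is not a causal link.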
Words and Formulas

Scientific journal articles today are all about formulas, and in The Book of Why, Judea Pearl suggests that there is a clear reason why formulas have come to dominate the world of academic studies. In his book he writes, “to a mathematician, or a person who is adequately trained in the mathematical way of thinking …. a formula reveals everything: it leaves nothing to doubt or ambiguity. When reading a scientific article, I often catch myself jumping from formula to formula, skipping the words altogether. To me, a formula is a baked idea. Words are ideas in the oven.”
Formulas are scary and hard to sort out. They use Greek letters and even in fields like education, political science, or hospitality management formulas make their way into academic study. Nevertheless, if you can understand what a formula is saying, then you can understand the model that the researcher is trying to demonstrate. If you can understand the numbers that come out of a formula, you can understand something about the relationship between the variables measured in the study.
Once you write a formula, you are defining the factors that you are going to use in an analysis. You are expressing your hypothesis in concrete terms, and establishing specific values that can be analyzed in the forms of percentages, totals, ratios, or statistical coefficients.
Words, on the other hand, can be fuzzy. We can debate all day long about specific words, their definitions, registers, and implications in ways that we cannot argue over a formula. The data that goes into a formula and information that comes out is less subjective than the language and words we use to describe the data and the conclusions we draw from the information.
I like the metaphor that Pearl uses, comparing formulas to baked ideas and words to ideas within an oven. Words allow us to work our way through what we know, to tease apart small factors and attempt to attach significance to each factor. A formula requires that we cut through the potentialities and possibilities to make specific definitions that can be proven false. Words help us work our way toward a specific idea and a formula either repudiates that idea or lets it live on to face another more specific and nuanced formula in the future, with our ideas becoming more crisp over time.
Complex Causation Continued

Our brains are good at interpreting and detecting causal structures, but often, the real causal structures at play are more complicated than what we can easily see. A causal chain may include a mediator, such as citrus fruit providing vitamin C to prevent scurvy. A causal chain may have a complex mediator interaction, as in the example of my last post where a drug leads to the body creating an enzyme that then works with the drug to be effective. Additionally, causal chains can be long-term affairs.
In The Book of Why, Judea Pearl discusses long-term causal chains, writing, “how can you sort out the causal effect of treatment when it may occur in many stages and the intermediate variables (which you might want to use as controls) depend on earlier stages of treatment?”
This is an important question within medicine and occupational safety. Pearl writes about the fact that factory workers are often exposed to chemicals over a long period, not just in a single instance. If it was repeated exposure to chemicals that caused cancer or another disease, how do you pin that on the individual exposures themselves? Was the individual safe with 50 exposures but as soon as a 51st exposure occurred the individual developed a cancer? Long-term exposure to chemicals and an increased cancer risk seems pretty obvious to us, but the actual causal mechanism in this situation is a bit hazy.
The same can apply in the other direction within the field of medicine. Some cancer drugs or immune therapy treatments work for a long time, stop working, or require changes in combinations based on how disease has progressed or how other side effects have manifested. Additionally, as we have all learned over the past year with vaccines, some medical combinations work better with boosters or time delayed components. Thinking about causality in these kinds of situations is difficult because the differing time scopes and combinations make it hard to understand exactly what is affecting what and when. I don’t have any deep answers or insights into these questions, but simply highlight them to again demonstrate complex causation and how much work our minds must do to fully understand a causal chain.
Complex Causation

In linear causal models the total effect of an action is equal to the sum of the direct effect of that action and its indirect effect. We can think of an oversimplified anti-tobacco public health campaign to conceptualize this equation. A campaign could be developed to use famous celebrities in advertisements against smoking. This approach may have a direct effect on teen smoking rates if teens see the advertisements and decide not to smoke as a result of the influential messaging from their favorite celebrity. This approach may also have indirect effects. Imagine a teen who didn’t see the advertising, but whose best friend did see it. If their best friend was influenced, then they may adopt their friend’s anti-smoking stance. This would be an indirect effect of the advertising campaign in the positive direction. The total effect of the campaign would then be the kids who were directly deterred from smoking combined with those who didn’t smoke because their friends were deterred.
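The equation for linear models can be sketched in a few lines of Python. This is a toy illustration under assumed numbers, not a model of any real campaign. The coefficients A, B, and C and the function names are hypothetical. Here x stands for exposure to the campaign, m for the mediator (the friend's influence), and y for how strongly a teen is deterred.

```python
# Assumed coefficients for a hypothetical linear model.
A = 0.5  # effect of the campaign (x) on the mediator (m)
B = 0.4  # effect of the mediator (m) on the outcome (y)
C = 0.3  # direct effect of the campaign (x) on the outcome (y)


def mediator(x):
    """Mediator value produced by campaign exposure x."""
    return A * x


def outcome(x, m):
    """Outcome as a linear function of exposure and mediator."""
    return C * x + B * m


def total_effect():
    """Change in outcome when x goes 0 -> 1 and the mediator follows."""
    return outcome(1, mediator(1)) - outcome(0, mediator(0))


def direct_effect():
    """Change x from 0 to 1 while holding the mediator at its x=0 value."""
    return outcome(1, mediator(0)) - outcome(0, mediator(0))


def indirect_effect():
    """Hold x at 0 but let the mediator take the value it would have at x=1."""
    return outcome(0, mediator(1)) - outcome(0, mediator(0))


print(total_effect(), direct_effect(), indirect_effect())
```

With these assumed coefficients the direct effect equals C, the indirect effect equals A * B, and the two add up to the total effect, which is what the linear equation states.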
However, linear causal models don’t capture all of the complexity that can exist within causal models. As Judea Pearl explains in The Book of Why, there can be complex causal models where the equation that I started this post with doesn’t hold. Pearl uses a drug used to treat a disease as an example of a situation where the direct effect and indirect effect of a drug don’t add up to the total effect. He says that in situations where a drug causes the body to release an enzyme that then combines with the drug to treat a disease, we have to think beyond the equation above. In this case he writes, “the total effect is positive but the direct and indirect effects are zero.”
The drug itself doesn’t do anything to combat the disease. It stimulates the release of an enzyme, and without that enzyme the drug is ineffective against the disease. The enzyme also doesn’t have a direct effect on the disease. The enzyme is only useful when combined with the drug, so there is no indirect effect that can be measured as a result of the original drug being introduced. The effect arises only from the interaction of the drug and the enzyme together. In the model Pearl shows us, there is only this interaction through the mediator, not a direct or indirect effect.
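The drug and enzyme case can be written as a small Python sketch. The functional forms are assumptions chosen to match the description: the enzyme is present only when the drug is taken, and recovery requires both together. The function names are hypothetical.

```python
def enzyme(drug):
    """Assumed mechanism: the body releases the enzyme only if the drug is taken."""
    return drug


def recovery(drug, enz):
    """Assumed mechanism: the treatment works only when drug and enzyme are both present."""
    return drug * enz


def total_effect():
    """Give the drug and let the enzyme respond naturally."""
    return recovery(1, enzyme(1)) - recovery(0, enzyme(0))


def direct_effect():
    """Give the drug but hold the enzyme at its no-drug level."""
    return recovery(1, enzyme(0)) - recovery(0, enzyme(0))


def indirect_effect():
    """Withhold the drug but set the enzyme to its with-drug level."""
    return recovery(0, enzyme(1)) - recovery(0, enzyme(0))


print(total_effect(), direct_effect(), indirect_effect())  # prints: 1 0 0
```

Under these assumptions the total effect is 1 while the direct and indirect effects are both 0, so adding the two pieces no longer recovers the total. The whole effect lives in the interaction term `drug * enz`.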
This model helps us see just how complicated ideas and conceptions of causation are. Most of the time we think about direct effects, and we don’t always get to thinking about indirect effects combined with direct effects. Good scientific studies are able to capture the direct and indirect effects, but to truly understand causation today, we have to be able to include mediating effects in complex causation models like the one Pearl describes.
Mediating Variables

Mediating variables stand in the middle of the actions and the outcomes that we can observe. They are often tied together and hard to separate from the action and the outcome, making their direct impact hard to pull apart from other factors. They play an important role in determining causal structures, and ultimately in shaping discourse and public policy about good and bad actions.
Judea Pearl writes about mediating variables in The Book of Why. He uses cigarette smoking, tar, and lung cancer as an example of the confounding nature of mediating variables. He writes, “if smoking causes lung cancer only through the formation of tar deposits, then we could eliminate the excess cancer risk by giving smokers tar-free cigarettes, such as e-cigarettes. On the other hand, if smoking causes cancer directly or through a different mediator, then e-cigarettes might not solve the problem.”
The mediator problem of tar still has not been fully disentangled and fully understood, but it is an excellent example of the importance, challenges, and public health consequences of mediating variables. Mediators can contribute directly to the final outcome we observe (lung cancer), but they may not be the only variable at play. In this instance, other aspects of smoking may directly cause lung cancer. An experiment between cigarette and e-cigarette smokers can help us get closer, but we won’t be able to say there isn’t a self-selection effect between traditional and e-cigarette smokers that plays into cancer development. However, closely studying both groups will help us start to better understand the direct role of tar in the causal chain.
Mediating variables like this pop up when we talk about the effectiveness of schools, the role of democratic norms, and the pros or cons of traditional gender roles. Often, mediating variables are driving the concerns we have for larger actions and behaviors. We want all children to go to school, but argue about the many mediating variables within the educational environment that may or may not directly contribute to specific outcomes that we want to see. It is hard to say which specific piece is the most important, because there are so many mediating variables all contributing directly or possibly indirectly to the education outcomes we see and imagine.
Counterfactuals

I have written a lot lately about the incredible human ability to imagine worlds that don’t exist. An important way that we understand the world is by imagining what would happen if we did something that we have not yet done or if we imagine what would have happened had we done something different in the past. We are able to use our experiences about the world and our intuition on causality to imagine a different state of affairs from what currently exists. Innovation, scientific advancements, and social cooperation all depend on our ability to imagine different worlds and intuit causal chains between our current world and the imagined reality we desire.
In The Book of Why, Judea Pearl writes, “counterfactuals are an essential part of how humans learn about the world and how our actions affect it. While we can never walk down both the paths that diverge in a wood, in a great many cases we can know, with some degree of confidence, what lies down each.”
A criticism of modern science and statistics is the reliance on randomized controlled trials (RCTs) and the fact that we cannot run an RCT on many of the things we study. We cannot run RCTs on our planet to determine the role of meteor impacts or lightning strikes on the emergence of life. We cannot run RCTs on the toxicity of snake venoms in human subjects. We cannot run RCTs on giving stimulus checks to Americans during the COVID-19 pandemic. Due to physical limitations and ethical considerations, RCTs are not always possible. Nevertheless, we can still study the world and use counterfactuals to think about the role of specific interventions.
If we forced ourselves to only accept knowledge based on RCTs then we would not be able to study the areas I mentioned above. We cannot go down both paths in randomized experiments with those choices. We either ethically cannot administer an RCT or we are stuck with the way history played out. We can, however, employ counterfactuals, imagining different worlds in our heads to think about what would have happened had we gone down another path. In this process we might make errors, but we can continually learn and improve our mental models. We can study what did happen, think about what we can observe based on causal structures, and better understand what would have happened had we done something different. This is how much of human progress has moved forward, without RCTs and with counterfactuals, imagining how the world could be different, how people, places, societies, and molecules could have reacted differently with different actions and conditions.