Superhero Cadavers

[Author Note: This begins a short three-post break from writing about homelessness, for a few quotes and thoughts on books by Mary Roach. More to come from Roach after I finish some additional writing on homelessness and poverty.]
Mary Roach’s book Stiff: The Curious Lives of Human Cadavers is an exploration into what happens to bodies donated for scientific research. In the book she meets with scientists, researchers, and academics who are working with human cadavers to make life better for those of us who are still living. It is a witty, humorous, yet altogether respectful exploration of the ways in which the human body has helped propel our species forward, even after the human life within the body has expired.
Regarding cadavers and what they have unlocked through sometimes gory experiments (though today conducted as considerately and respectfully as possible), Roach writes the following:
“Cadavers are our superheroes: They brave fire without flinching, withstand falls from tall buildings and head-on car crashes into walls. You can fire a gun at them or run a speedboat over their legs, and it will not faze them. Their heads can be removed with no deleterious effect. They can be in six places at once. I take the Superman point of view: What a shame to waste these powers, to not use them for the betterment of humankind.”
The scientific study of cadavers can be off-putting, but it has been incredibly valuable for humanity across the globe. Cadavers have helped us understand basic anatomy, design safer cars, and ensure the safety of astronauts. Without cadavers, many more people would have died in ill-devised medical experiments and car crashes, and numerous live animals would have suffered as alternative test subjects. Cadavers perform miraculous jobs that living humans cannot, and for their service and sacrifices, we should all be grateful.
Stories from Big Data

Dictionary.com defines datum (the singular of data) as “a single piece of information; any fact assumed to be a matter of direct observation.” So when we think about big data, we are thinking about massive numbers of individual pieces of information or individual facts from direct observation. Data simply are what they are: facts and individual observations in isolation.
On the other hand, Dictionary.com defines information as “knowledge communicated or received concerning a particular fact or circumstance.” Information is the knowledge, the story, and the ideas we have about the data. These two definitions are important for thinking about big data. We never talk about big information, but the reality is that big data is less important than the knowledge we generate from the data, and that knowledge isn’t as objective as the individual datum.
In The Book of Why Judea Pearl writes, “a generation ago, a marine biologist might have spent months doing a census of his or her favorite species. Now the same biologist has immediate access online to millions of data points on fish, eggs, stomach contents, or anything else he or she wants. Instead of just doing a census, the biologist can tell a story.” Science has become contentious and polarizing recently, and part of the reason has to do with the stories we are generating based on the big data we are collecting. We can see new patterns, new associations, new correlations, and new trends in data from across the globe. As we have collected this information, our impact on the planet, our understanding of reality, and how we think about ourselves in the universe have all changed. Science is not simply facts; that is to say, it is not just data. Science is information; it is knowledge and stories that have continued to challenge the narratives we have held onto as a species for thousands of years.
Judea Pearl thinks it is important to recognize the story aspect of big data. He thinks it is crucial that we understand the difference between data and information, because without doing so we turn to the data blindly and can generate an inaccurate story based on what we see. He writes,
“In certain circles there is an almost religious faith that we can find the answers to … questions in the data itself, if only we are sufficiently clever at data mining. However, readers of this book will know that this hype is likely to be misguided. The questions I have just asked are all causal, and causal questions can never be answered from data alone.”
Big data presents us with huge numbers of observations and facts, but those facts alone don’t represent causal structures or deeper interactions within reality. We have to generate information from the data and combine that new knowledge with existing knowledge and causal hypotheses to truly learn something new from big data. If we don’t, then we will simply be identifying meaningless correlations without truly understanding what they mean or imply.
Words and Formulas

Scientific journal articles today are all about formulas, and in The Book of Why Judea Pearl suggests that there is a clear reason why formulas have come to dominate the world of academic studies. In his book he writes, “to a mathematician, or a person who is adequately trained in the mathematical way of thinking … a formula reveals everything: it leaves nothing to doubt or ambiguity. When reading a scientific article, I often catch myself jumping from formula to formula, skipping the words altogether. To me, a formula is a baked idea. Words are ideas in the oven.”
Formulas are scary and hard to sort out. They use Greek letters, and they make their way into academic studies even in fields like education, political science, and hospitality management. Nevertheless, if you can understand what a formula is saying, then you can understand the model that the researcher is trying to demonstrate. If you can understand the numbers that come out of a formula, you can understand something about the relationship between the variables measured in the study.
Once you write a formula, you are defining the factors that you are going to use in an analysis. You are expressing your hypothesis in concrete terms and establishing specific values that can be analyzed in the form of percentages, totals, ratios, or statistical coefficients.
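As a purely hypothetical illustration, a simple linear regression formula makes these commitments explicit. The outcome Y, the factors X1 and X2, and the error term below are placeholders of my own, not from any study Pearl discusses:

```latex
% A hypothetical example of a formula pinning down a hypothesis:
% a simple linear regression of an outcome Y on two measured factors.
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon
% \beta_1 and \beta_2 are the statistical coefficients the analysis estimates;
% \epsilon is the error term capturing everything the factors leave out.
```

Once a line like this is written, the hypothesis lives or dies with the estimated values of the coefficients; there is no room left to argue about what the claim means.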
Words, on the other hand, can be fuzzy. We can debate all day long about specific words, their definitions, registers, and implications in ways that we cannot argue over a formula. The data that go into a formula and the information that comes out are less subjective than the language we use to describe the data and the conclusions we draw from the information.
I like the metaphor that Pearl uses, comparing formulas to baked ideas and words to ideas still in the oven. Words allow us to work our way through what we know, to tease apart small factors and attempt to attach significance to each one. A formula requires that we cut through the potentialities and possibilities to make specific claims that can be proven false. Words help us work our way toward a specific idea; a formula either repudiates that idea or lets it live on to face another, more specific and nuanced formula in the future, with our ideas becoming crisper over time.
Complex Causation Continued

Our brains are good at interpreting and detecting causal structures, but often, the real causal structures at play are more complicated than what we can easily see. A causal chain may include a mediator, such as citrus fruit providing vitamin C to prevent scurvy. A causal chain may have a complex mediator interaction, as in the example of my last post where a drug leads to the body creating an enzyme that then works with the drug to be effective. Additionally, causal chains can be long-term affairs.
In The Book of Why Judea Pearl discusses long-term causal chains writing, “how can you sort out the causal effect of treatment when it may occur in many stages and the intermediate variables (which you might want to use as controls) depend on earlier stages of treatment?”
This is an important question within medicine and occupational safety. Pearl writes about the fact that factory workers are often exposed to chemicals over a long period, not just in a single instance. If it was repeated exposure to chemicals that caused cancer or another disease, how do you pin that on the individual exposures themselves? Was the individual safe through 50 exposures, only to develop cancer upon the 51st? The link between long-term chemical exposure and increased cancer risk seems obvious to us, but the actual causal mechanism in this situation is a bit hazy.
The same can apply in the other direction within the field of medicine. Some cancer drugs or immune therapy treatments work for a long time and then stop working, or require changes in combinations based on how the disease has progressed or how side effects have manifested. Additionally, as we have all learned over the past year with vaccines, some medical combinations work better with boosters or time-delayed components. Thinking about causality in these kinds of situations is difficult because the differing time scales and combinations make it hard to understand exactly what is affecting what, and when. I don’t have any deep answers or insights into these questions, but simply highlight them to again demonstrate complex causation and how much work our minds must do to fully understand a causal chain.
Complex Causation

In linear causal models the total effect of an action is equal to the sum of its direct effect and its indirect effect. We can think of an oversimplified anti-tobacco public health campaign to conceptualize this equation. A campaign could be developed to use famous celebrities in advertisements against smoking. This approach may have a direct effect on teen smoking rates if teens see the advertisements and decide not to smoke as a result of the influential messaging from their favorite celebrity. This approach may also have indirect effects. Imagine a teen who didn’t see the advertising, but whose best friend did. If the best friend was influenced, then the teen may adopt their friend’s anti-smoking stance. This would be an indirect effect of the advertising campaign in the positive direction. The total effect of the campaign would then be the kids who were directly deterred from smoking combined with those who didn’t smoke because their friends were deterred.
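Written out, the linear relationship looks like this (the numbers in the comment are made up purely for illustration):

```latex
% The accounting identity that holds in linear causal models:
\text{Total Effect} = \text{Direct Effect} + \text{Indirect Effect}
% Hypothetical numbers: if the ads directly deter 1,000 teens and
% indirectly deter another 400 through friends, the total effect is 1,400.
```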
However, linear causal models don’t capture all of the complexity that can exist within causal structures. As Judea Pearl explains in The Book of Why, there can be complex causal models where the equation that I started this post with doesn’t hold. Pearl uses a drug used to treat a disease as an example of a situation where the direct effect and indirect effect of the drug don’t equal the total effect. He says that in situations where a drug causes the body to release an enzyme that then combines with the drug to treat a disease, we have to think beyond the equation above. In this case he writes, “the total effect is positive but the direct and indirect effects are zero.”
The drug itself doesn’t do anything to combat the disease. It stimulates the release of an enzyme and without that enzyme the drug is ineffective against the disease. The enzyme also doesn’t have a direct effect on the disease. The enzyme is only useful when combined with the drug, so there is no indirect effect that can be measured as a result of the original drug being introduced. The effect is mediated between the interaction of both the drug and enzyme together. In the model Pearl shows us, there is only the mediating effect, not a direct or indirect effect.
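A minimal sketch in Python (my own toy model, not code from the book) shows how this arithmetic plays out. The enzyme and recovery functions below are hypothetical stand-ins for the biology Pearl describes:

```python
# A toy model of Pearl's drug-enzyme example (my own sketch, not his code).
# Assumptions: the enzyme is released only when the drug is taken, and
# recovery occurs only when the drug AND the enzyme are both present.

def enzyme(drug: int) -> int:
    """Mediator: the body releases the enzyme only in response to the drug."""
    return drug

def recovery(drug: int, enz: int) -> int:
    """Outcome: the drug treats the disease only in combination with the enzyme."""
    return drug * enz  # 1 only when both are present

# Total effect: take the drug and let the enzyme respond naturally.
total = recovery(1, enzyme(1)) - recovery(0, enzyme(0))

# Natural direct effect: take the drug, but hold the enzyme at the level
# it would have had without the drug (i.e., none).
direct = recovery(1, enzyme(0)) - recovery(0, enzyme(0))

# Natural indirect effect: withhold the drug, but set the enzyme to the
# level it would have had with the drug.
indirect = recovery(0, enzyme(1)) - recovery(0, enzyme(0))

print(total, direct, indirect)  # 1 0 0
```

The total effect is positive, yet the direct and indirect effects are both zero, exactly the situation Pearl describes: the linear equation breaks down because the effect lives entirely in the interaction between drug and enzyme.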
This model helps us see just how complicated our ideas and conceptions of causation are. Most of the time we think about direct effects, and we don’t always get around to considering indirect effects alongside them. Good scientific studies are able to capture both direct and indirect effects, but to truly understand causation today, we have to be able to include mediating effects in complex causation models like the one Pearl describes.
Mediating Variables

Mediating variables stand between the actions we take and the outcomes we can observe. They are often entangled with both the action and the outcome, making their direct impact hard to isolate from other factors. They play an important role in determining causal structures, and ultimately in shaping discourse and public policy about good and bad actions.
Judea Pearl writes about mediating variables in The Book of Why. He uses cigarette smoking, tar, and lung cancer as an example of the confounding nature of mediating variables. He writes, “if smoking causes lung cancer only through the formation of tar deposits, then we could eliminate the excess cancer risk by giving smokers tar-free cigarettes, such as e-cigarettes. On the other hand, if smoking causes cancer directly or through a different mediator, then e-cigarettes might not solve the problem.”
The mediator problem of tar has still not been fully disentangled and understood, but it is an excellent example of the importance, challenges, and public health consequences of mediating variables. Mediators can contribute directly to the final outcome we observe (lung cancer), but they may not be the only variable at play. In this instance, other aspects of smoking may directly cause lung cancer. An experiment comparing cigarette and e-cigarette smokers can help us get closer, but we won’t be able to rule out a self-selection effect between traditional and e-cigarette smokers that plays into cancer development. However, closely studying both groups will help us start to better understand the direct role of tar in the causal chain.
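A toy simulation, with entirely made-up risk numbers, can make the contrast concrete. The two scenarios below correspond to Pearl’s two possibilities: one where tar is the only mediator, and one where smoking also has a direct pathway to cancer:

```python
# A toy sketch with hypothetical numbers (not data from the book).
# Scenario A: all of smoking's cancer risk flows through tar (the mediator).
# Scenario B: smoking also has a direct pathway that bypasses tar.

BASELINE_RISK = 0.01  # hypothetical background cancer risk

def cancer_risk(smokes: bool, tar: bool, direct_pathway: bool) -> float:
    risk = BASELINE_RISK
    if tar:
        risk += 0.05  # hypothetical risk contributed via tar deposits
    if direct_pathway and smokes:
        risk += 0.03  # hypothetical risk from smoking itself, bypassing tar
    return risk

for direct_pathway in (False, True):
    label = "B (direct pathway exists)" if direct_pathway else "A (tar is the only mediator)"
    cigarette = cancer_risk(smokes=True, tar=True, direct_pathway=direct_pathway)
    e_cig = cancer_risk(smokes=True, tar=False, direct_pathway=direct_pathway)
    print(f"Scenario {label}: cigarette risk {cigarette:.2f}, e-cigarette risk {e_cig:.2f}")
```

In scenario A the tar-free cigarette brings risk back to the hypothetical baseline; in scenario B it does not, which is why disentangling the mediator matters so much for public health.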
Mediating variables like this pop up when we talk about the effectiveness of schools, the role of democratic norms, and the pros and cons of traditional gender roles. Often, mediating variables are driving the concerns we have about larger actions and behaviors. We want all children to go to school, but we argue about the many mediating variables within the educational environment that may or may not directly contribute to the specific outcomes we want to see. It is hard to say which specific piece is the most important, because there are so many mediating variables all contributing, directly or indirectly, to the education outcomes we see and imagine.
Counterfactuals

I have written a lot lately about the incredible human ability to imagine worlds that don’t exist. An important way that we understand the world is by imagining what would happen if we did something that we have not yet done or if we imagine what would have happened had we done something different in the past. We are able to use our experiences about the world and our intuition on causality to imagine a different state of affairs from what currently exists. Innovation, scientific advancements, and social cooperation all depend on our ability to imagine different worlds and intuit causal chains between our current world and the imagined reality we desire.
In The Book of Why Judea Pearl writes, “counterfactuals are an essential part of how humans learn about the world and how our actions affect it. While we can never walk down both the paths that diverge in a wood, in a great many cases we can know, with some degree of confidence, what lies down each.”
A criticism of modern science and statistics is the reliance on randomized controlled trials and the fact that we cannot run an RCT on many of the things we study. We cannot run RCTs on our planet to determine the role of meteor impacts or lightning strikes in the emergence of life. We cannot run RCTs on the toxicity of snake venoms in human subjects. We cannot run RCTs on giving stimulus checks to Americans during the COVID-19 Pandemic. Due to physical limitations and ethical considerations, RCTs are not always possible. Nevertheless, we can still study the world and use counterfactuals to think about the role of specific interventions.
If we forced ourselves to only accept knowledge based on RCTs, then we would not be able to study the areas I mentioned above. We cannot go down both paths in randomized experiments with those choices; we either ethically cannot administer an RCT or we are stuck with the way history played out. We can, however, employ counterfactuals, imagining different worlds in our heads to think about what would have happened had we gone down another path. In this process we might make errors, but we can continually learn and improve our mental models. We can study what did happen, think about what we can observe based on causal structures, and better understand what would have happened had we done something different. This is how much of human progress has moved forward: without RCTs and with counterfactuals, imagining how the world could be different, and how people, places, societies, and molecules could have reacted differently under different actions and conditions.
Ignorability

The idea of ignorability helps us in science by playing an important role in randomized trials. In the real world, there are too many potential variables to comprehensively predict exactly how a given intervention will play out in every case. We almost always have outliers with wildly different outcomes compared to what we would have predicted. Quite often some strange factor that could not be controlled or predicted caused the individual case to differ dramatically from the norm.
Thanks to the concept of ignorability, we don’t have to spend too much time worrying about the causal structures that created a single outlier. In The Book of Why Judea Pearl does his best to provide a definition of ignorability for those who need to assess whether it holds in a given situation. He writes, “the assignment of patients to either treatment or control is ignorable if patients who would have one potential outcome are just as likely to be in the treatment or control group as the patients who would have a different potential outcome.”
What Pearl means is that ignorability applies when there is not a determining factor that makes people with any given outcome more likely to be in a control or treatment group. When people are randomized into control versus treatment, then there is not likely to be a commonality among people in either group that makes them more or less likely to have a given reaction. So a random outlier in one group can be expected to be offset by a random outlier in the other group (not literally a direct opposite, but we shouldn’t see a trend of specific outliers all in either treatment or control).
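In the standard potential-outcomes notation (my gloss on the idea, not Pearl’s own formula), this condition is usually written as:

```latex
% Ignorability: treatment assignment T is independent of the potential
% outcomes. Y(1) is the outcome a patient would have under treatment,
% Y(0) the outcome the same patient would have under control.
(Y(1),\, Y(0)) \perp T
```

Randomization makes this hold by construction: a coin flip cannot depend on outcomes that have not happened yet.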
Ignorability does not apply in situations where there is a self-selection effect for control or treatment. In the world of the COVID-19 Pandemic, this applies in situations like human challenge trials. It is unlikely that people who know they are at risk of bad reactions to a vaccine would self-select into a human challenge trial. This same sort of thing happens with corporate health benefits initiatives, smartphone beta-testers, and general inadvertent errors in scientific studies. Outliers may not be outliers we can ignore if there is a self-selection effect, and the outcomes we observe may reflect something other than what we are studying, meaning that we can’t apply ignorability in a way that allows us to draw a conclusion specifically about our intervention.
Alternative, Nonexistent Worlds

Judea Pearl’s The Book of Why hinges on a unique ability that human animals have. Our ability to imagine alternative, nonexistent worlds is what has set us on new pathways and allowed us to dominate the planet. We can think of what would happen if we acted in a certain manner, used a tool in a new way, or let two objects collide. We can visualize future outcomes of our actions and of the actions of other bodies, and predict what can be done to create desired future outcomes.
In the book he writes, “our ability to conceive of alternative, nonexistent worlds separated us from our protohuman ancestors and indeed from any other creature on the planet. Every other creature can see what is. Our gift, which may sometimes be a curse, is that we can see what might have been.”
Pearl argues that our ability to see different possibilities, to imagine new worlds, and to be able to predict actions and behaviors that would realize that imagined world is not something we should ignore. He argues that this ability allows us to move beyond correlations, beyond statistical regressions, and into a world where our causal thinking helps drive our advancement toward the worlds we want.
It is important to note that he is not advocating for holding a belief and setting out to prove it with data and science, but rather that we use data and science combined with our ability to think causally to better understand the world. We do not have to be stuck in a state where we understand statistical techniques but deny plausible causal pathways. We can identify and define causal pathways, even if we cannot fully define causal mechanisms. Our ability to reason through alternative, nonexistent worlds is what allows us to think causally and apply this causal reasoning to statistical relationships. Doing so, Pearl argues, will save lives, help propel technological innovation, and push science to new frontiers to improve life on our planet.
Laboratory Proof

“If the standard of laboratory proof had been applied to scurvy,” writes Judea Pearl in The Book of Why, “then sailors would have continued dying right up until the 1930s, because until the discovery of vitamin C, there was no laboratory proof that citrus fruits prevented scurvy.” Pearl’s quote shows that high scientific standards for definitive and exact causality are not always for the greater good. Sometimes modern science will spurn clear statistical relationships and evidence because statistical relationships alone cannot be counted on as concrete causal evidence. A clear answer will not be given because some marginal unknowns may still exist, and this caution can have its own costs.
Sailors did not know why or how citrus fruits prevented scurvy, but observation demonstrated that they did. There was no clear understanding of what scurvy was or why citrus fruits were helpful, but it was commonly understood that a causal relationship existed. People acted on these observations, and lives were saved.
In two episodes, the Don’t Panic Geocast has discussed journal articles in the British Medical Journal that make the same point as Pearl. As a critique of the insistence on randomized controlled trials, the two journal articles highlight the troubling reality that there have not been any randomized controlled trials on the effectiveness of parachute usage when jumping from airplanes. The articles are hilarious and clearly satirical, but ultimately arrive at the same point that Pearl makes with the quote above – laboratory proof is not always necessary, practical, or reasonable when lives are on the line.
Pearl argues that we can rely on our ability to identify causality even without laboratory proof when we have sufficient statistical analysis and an understanding of relationships. Statisticians always tell us that correlation is not causation and that observational studies are not sufficient to determine causality, yet the citrus fruit and parachute examples show that this mindset is not always appropriate. Sometimes a more realistic and common-sense understanding of causation – even if supported by just correlational relationships and statistics – is more important than laboratory proof.