In Praise of Doing Your Own Research

2022-Feb-04

Among the many things that people in America (and her sphere of cultural influence, which certainly engulfs Canada) are divided about as a result of the two years of this pandemic is how we should find reliable information and how we should make sense of the world around us.

One group of people believes the best course of action is to trust the authorities and get through this by following policies; another group, among which I count myself, believes that you gotta dig into the raw data more deeply and decide for yourself what makes sense.

The latter approach elicits much disdain from the former group. A common genre of memes on social media these days pokes fun at those foolish enough to try to "do their own research". These memes basically imply any combination of the following:

  • Only those with adequate credentials can do research.
  • Only those with adequate credentials who conform to mainstream narratives are credible.
  • Googling things on your own is a waste of time that can produce no knowledge.
  • Those who lack credentials but think they can do their own research are simply indulging their confirmation bias and/or suffering from the Dunning-Kruger effect.

Underlying these assertions is a deep sense of trust in science, scientific institutions, and policy-setting government bureaucracies. I don't share this faith in any of these actors. For one thing, there has never been more evidence that the practice of scientific inquiry has been corrupted. And when it comes to policy, it is foolish to think that science, rather than political calculations, is at the core of the enterprise.

Corruption of Science

There is a famous adage for those who want to make it as scientists: publish or perish. Researchers, peer-reviewed journals, funders, and research institutions all have a vested interest in publishing high-impact scientific findings. Often, their survival as organizations or their careers as individuals depend on this.

The problem is that when a signal for competence (e.g. number of papers, number of citations, etc.) becomes a target, it starts being gamed. This is known as Goodhart's Law: when a measure becomes a target, it ceases to be a good measure. This explains much of the trouble with science today.

As Stuart Ritchie shows in his book, Science Fictions, this incentive structure has led to a wide array of misbehaviour by scientists.

Fraud

What happens when you're in a cut-throat competitive environment and your survival depends on producing the most impressive research finding you can? For some scientists, the answer is simple: you cheat.

The most egregious example that Ritchie recounts in his book is that of Paolo Macchiarini. As a surgeon, Macchiarini published papers on a breakthrough technology allowing the creation of synthetic tracheas that could be transplanted in patients without risk of rejection by the immune system. To summarize a long, dramatic, and frightening story: Macchiarini performed this surgery on patients for several years before it was discovered that all of his patients were dying. Some of the patients didn't even have a life-threatening condition before the surgery. Meanwhile, Macchiarini was publishing papers on the success of his procedures the whole time. Sometimes, the patients who were the subject of the paper had died before the paper had even been accepted for publication, yet Macchiarini conveniently left these details out of his report.

You see, by the time he got to performing these surgeries, Macchiarini was already considered a celebrity in his field. He was hired by Sweden's Karolinska Institute – the home of the Nobel Prize in Physiology or Medicine! When the frontline healthcare workers at Karolinska who were responsible for caring for Macchiarini's patients tried to blow the whistle on his fraud, the university silenced them and threatened their careers. It was only after Macchiarini's chronic lying got him in trouble with a mistress that his house of cards came crashing down.

While this is a uniquely outrageous example, there are countless others. They range from scientists who sat at their dinner table and "produced" their own data, to those who published papers with photoshopped microscope imagery, to (my favourite example) a dermatologist who claimed to have solved the skin transplant rejection problem in mice when he had simply taken white mice and coloured them black with a felt pen!

In one anonymous survey, 2% of scientists admitted to committing fraud at least once. The real number is certainly much higher.

Bias

It takes a particularly corrupt individual to fake data and lie outright. But it takes a lot less to simply... nudge things – consciously or subconsciously – in one direction or another.

I mentioned above that the participants in the scientific process – the researchers, the journal editors, the institutions, and the funders – are all incentivized to pursue as many high-impact scientific findings as possible. This means scientists are inclined to pursue "groundbreaking" results. And as Ritchie notes in his book, if you spend a lot of time breaking new ground, you end up with a lot of holes. Much value is to be derived from careful, dare I say boring, learning that incrementally builds on past work. But this kind of work, while essential to the scientific process, is strongly disincentivized.

One way that scientists nudge their results to look like they found a real effect is called 'p-hacking'. Without getting into the weeds, this is a statistical trick used to make an otherwise uninteresting data set look like it shows a real effect. Ritchie recounts numerous cases of p-hacking in his book. Most notable is the case of Brian Wansink. This former researcher in consumer behaviour had a number of his papers retracted after he published a blog post recounting that he had instructed his students to "salvage" data from a study that had, in fact, found no significant results. Wansink's public admission highlights something important: many scientists don't realize they're p-hacking or why it's wrong, so much so that they're willing to publicly tell the world about their methods. The cases recounted in Ritchie's book make one wonder if these scientists ever took a course in statistics.

P-hacking makes it look like scientists have made a real discovery when what they're actually seeing is just random noise. In one poll of 2000 psychologists, covering a range of p-hacking practices, approximately 65% admitted to engaging in them.
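For those who do want to get into the weeds a little, here is a minimal sketch of one common form of p-hacking: measuring many outcomes and reporting whichever one crosses the significance threshold. Everything in it is my own illustrative assumption rather than anything taken from Ritchie's book: the function names, the group size of 30, the ten outcomes, and the simplified z-test with a known variance of 1. Both groups are drawn from the same distribution, so there is no real effect to find.

```python
import random
from statistics import NormalDist, mean


def p_value(group_a, group_b):
    """Two-sided z-test for a difference in means, assuming unit variance."""
    n_a, n_b = len(group_a), len(group_b)
    z = (mean(group_a) - mean(group_b)) / ((1 / n_a + 1 / n_b) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def null_study(rng, outcomes, n=30):
    """Simulate one study with NO real effect; return one p-value per outcome."""
    return [
        p_value([rng.gauss(0, 1) for _ in range(n)],
                [rng.gauss(0, 1) for _ in range(n)])
        for _ in range(outcomes)
    ]


def false_positive_rates(studies=2000, outcomes=10, alpha=0.05, seed=0):
    """Share of null studies that report a 'significant' finding."""
    rng = random.Random(seed)
    honest = hacked = 0
    for _ in range(studies):
        ps = null_study(rng, outcomes)
        honest += ps[0] < alpha    # only the single pre-specified outcome counts
        hacked += min(ps) < alpha  # report whichever outcome happened to "work"
    return honest / studies, hacked / studies


honest_rate, hacked_rate = false_positive_rates()
print(f"honest: {honest_rate:.1%}, p-hacked: {hacked_rate:.1%}")
```

Under these assumptions the honest analysis should flag a false "discovery" in roughly 5% of studies, while the p-hacked one should do so in roughly 40%, since the chance that at least one of ten independent tests dips below 0.05 is 1 - 0.95^10 ≈ 0.40. The exact figures will vary with the random seed.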

The bias toward publishing high-impact results has another casualty: the publication of null findings and corrections. Because every actor in the scientific enterprise is motivated to publish groundbreaking results, when an experiment shows no such finding, it is often shelved, depriving the scientific community of an important lesson (knowing where there aren't any effects to be found is just as important to science as finding real effects).

Similarly, when scientists run replication attempts of previously published findings and fail to replicate them, journals are not motivated to publish this result. Ritchie opens his book with the story of Daryl Bem, a psychologist who claimed to have found evidence for parapsychological phenomena, where some individuals could predict the future. Ritchie and his team attempted to replicate Bem's findings. When they couldn't, the journal that had published the original paper to much publicity refused to publish the failed replication.

There's yet another bias that corrupts the practice of science and is a function of innate human behaviour: when a respected group of scientists agrees on something, it can create a groupthink dynamic that rejects opposing findings even in the face of empirical evidence. Science writer Sharon Begley has outlined how a cabal of scientists has stunted research toward a cure for Alzheimer's disease by essentially agreeing by fiat about the root cause of the disease (a certain plaque formation in the brain called the 'amyloid plaque'), even though years of attempts at addressing this root cause have failed to create a cure. Ritchie notes: "The dissenting researchers described how proponents of the amyloid hypothesis, many of whom are powerful, well-established professors, act as a ‘cabal’, shooting down papers that question the hypothesis with nasty peer reviews and torpedoing the attempts of heterodox researchers to get funding and tenure".

Negligence

Sometimes, people just make mistakes! And just because something is "science", doesn't make it error-free, nor does it warrant blind acceptance. In 2010, two economists published a paper showing that economic austerity is needed to keep the debt-to-GDP ratio less than 90%, otherwise the economy will shrink.

The Reinhart-Rogoff paper was the subject of much media coverage and was quoted everywhere from the US Senate to the British cabinet. Except, it was soon discovered, the Excel spreadsheet that the economists used contained an error in one of its formulas. When the error was fixed, the effect they thought they had found disappeared.

In 2016, one Dutch scientist developed an automated method of checking scientific papers for some statistical errors. Out of 30,000 papers analyzed, nearly half contained errors (mostly minor), and around 13% had conclusion-changing mistakes.

Negligence and bias go hand in hand: errors that make papers look like they have more impressive findings tend to go unchecked, whereas those that make the findings less impressive tend to get scrutinized and eliminated by researchers.

Other similar techniques have been used to analyze papers in an automated fashion, with concerning results. But these techniques generally rely on picking up signals of mistakes or fraud from the published results. What's just as concerning is that papers often don't have enough information about their data or their study design. In one replication attempt, cancer researchers tried to replicate 51 important studies. They couldn't even start because these studies lacked enough data for replication. This is in direct contrast to the scientific norms of communality and universalism. One wonders how many additional errors would be found if scientists were more transparent. Additionally, some errors simply cannot be found in an automated fashion and require human judgement.

Hype

Now, let's throw another industry with perverse incentives into the mix: media and journalism.

As we've seen, the practice of science, even when it's supposedly following a rigorous process, is full of holes to begin with. But what if you could earn your prestige and career advancement without having to deal with that pesky rigour?

Some scientists and science organizations choose the route of science by press release instead. Here, even the few checks and balances that may prevent the progress of bad findings are nonexistent. For some, generating media hype around questionable findings can be a profitable business. Not only do the media present less friction toward publication, they also amplify findings and create buzz, which the researchers can use to secure future grants or get book deals or give lucrative talks.

Over the last two decades you may have heard of the benefits of having a 'growth mindset' instead of a 'fixed mindset'. This was based on the research of Stanford psychologist Carol Dweck, who published popular books on the subject and gave TED Talks. Her assertions were incorporated into school curriculums and corporate training materials. In contrast to Dweck's bombastic claims, a meta-analysis of the research on the correlation of growth mindset with school performance found a 1% correlation.

Sometimes, hype around research can be built by journalists who, themselves, are incentivized to publish eye-catching content and who may not have the ability to correctly interpret data. In one conversation with a friend, where I was supporting the notion of doing one's own research, the Dunning-Kruger effect was brought up, and I found myself looking at the infamous Mt. Stupid chart. This chart is what comes up if you do an image search for the Dunning-Kruger effect.

It turns out, though, that this chart has nothing to do with research done by Mr. Dunning & Mr. Kruger, who never measured perceived competence over time. Additionally, a part of their now-famous paper measured "competence" by gauging how much participants agreed with a panel of experts on whether certain jokes were funny or not, after members of this panel that didn't fit the consensus were removed!

Like a game of Chinese whispers, the findings of the Dunning-Kruger paper have gone from a psychology paper to a cultural icon that is fully detached from its original, already-questionable grounding.


The scientific enterprise suffers from perverse incentives that encourage fraud, bias, negligence, and hype, and that discourage diligent replication and non-buzz-generating findings. Contrary to the Mertonian norms of openness and disinterestedness that should govern the practice of science, many papers don't expose enough data for replication to even take place.

The result is, at best, slow progress, and at worst, false findings that make their way into critical decisions. Over the last decade, the dam has broken on this reality. The field that led the way was psychology, where it was discovered that a large number of key findings could not be replicated. In one large-scale replication attempt, only 39% of the results could be replicated. In microeconomics, one replication survey had a 61% success rate, and a 2018 study of neuroscience papers found that about 10% of brain imaging studies were affected by a software bug that compromised the results. One comprehensive study of biomedicine papers found that out of 268 papers, only one reported its full protocol, such that replication could even be attempted. In other words, 267 of these papers could never have gone through the full rigour of the scientific process.

In some fields, attempts at replication are as low as 0.1% of the total publications. The errors we've seen above are not being looked for in the vast majority of scientific output.

So when it comes to relying on science for making critical decisions, we have to be very careful. It's also important to acknowledge that science never dictates action in the real world. Every scientific finding comes with caveats. It's important to understand the context in which the finding holds and the factors which may render it inapplicable. To decide on a course of action, science is only one input. Other inputs include values, judgement, and – if you are a government agency deciding on policy – political considerations.

A Question of Policy

It's March 2020. A new disease is circulating in the world. Not much is known about it, but it does seem to be somewhat fatal. What do we do?

Here's a scenario in which relying on science is hopeless. Science cannot answer this question. There haven't been enough studies done on this disease. Those that have been done may be incomplete or imperfect. Over time, we will know more, but we have to act now. What should we do?

We use inductive reasoning. We might know, for example, that the disease is caused by a virus. We may use what we know of other viruses to guide our course of action. We know that many viruses are airborne, so it might be that wearing masks or not gathering in the same space will reduce spread. We also know that some viruses can be transmitted on surfaces. Then we might opt to sanitize the surfaces that we touch. Is this science? Certainly there are no randomized, controlled trials. In this scenario, we are using our best judgement to decide on a course of action. Could we be wrong? Absolutely. There is no law of nature that says all viruses should transmit through the air or on surfaces. Perhaps there is another means we're ignorant of. But we act regardless.

While the decision to mask up or sanitize surfaces in the face of an unknown virus may seem obvious, in many contexts, such decisions would be derided with remarks like "there is no evidence that sanitizing surfaces helps prevent this infection". And that is both true and useless. While there is no scientific evidence in the form of rigorous, double-blind studies, there can still be good judgement when it comes to choosing what to do.

Judgement is a core part of deciding on a course of action and it's completely outside the purview of science. Another important consideration when it comes to policy is, of course, political calculations. What is most notable about the last two years is that, more than ever before, people in positions of leadership have been transparent about the fact that their policy decisions are not, in fact, based on science. For reasons that are beyond my comprehension, this transparency seems to have only strengthened some people's faith in policy-makers.

Consider this quote from the man once so loved for his leadership of the US effort against the pandemic, Anthony Fauci, on how he chose to communicate the minimum necessary vaccination threshold to the public:

When polls said only about half of all Americans would take a vaccine, I was saying herd immunity would take 70 to 75 percent. Then, when newer surveys said 60 percent or more would take it, I thought, “I can nudge this up a bit,” so I went to 80, 85.

Or consider this quote from the former head of the FDA on the 6-ft social distancing rule:

Nobody knows where it came from. Most people assume that the ... the recommendation for keeping six feet apart, comes out of some old studies related to flu, where droplets don't travel more than six feet... the original recommendation that the CDC brought to the White House was ten feet and a political appointee in the White House said "we can't recommend ten feet, nobody can measure ten feet. It's inoperable. Society will shut down." So the compromise was around 6 feet.

Is it still your impression that science is the leading factor in setting policy?


In my discussions with friends about vaccine hesitancy and pandemic response policy, a common theme that has come up is their unshakeable faith in science. "How can people be vaccine hesitant!? Science is not questionable," one friend told me. "How could I go about doing my job if I don't believe in science?" another exclaimed.

But science is a domain for which faith is wholly inappropriate. You may trust the scientific process, but trusting scientists and the bureaucracies within which they operate is a different matter entirely. The desire to have blind trust in science is nothing other than the religious impulse in new robes, and it offers the same convenience: the luxury of not having to think and risk being confused or bewildered.

On a good day, scientific findings are prone to fraud, bias, negligence, and hype, and public policy is at the mercy of poor judgement or bad political incentives. And this is to say nothing about the potential for corruption. When it comes to drugs and therapeutics, for example, we've created a set of perverse incentives whereby the people who are supposed to oversee the drug companies are regularly employed by them. This has been clear for decades. In a 2004 book titled The Truth About the Drug Companies: How They Deceive Us and What to Do About It, the author showed that "many members of the FDA advisory committees were paid consultants for drug companies. Although they were supposed to excuse themselves from decisions when they have a financial connection with the company that makes the drug in question, that rule is regularly waived."

I say we need to encourage more people to do their own research, not fewer. Of course, it is impossible to do this for everything and at all times. Human society relies on a division of labour. But for things that directly impact our lives, it would be foolish to rely on a group of actors whose incentives may not be fully aligned with ours.

Doing your own research does not mean going into the field and collecting data, and it also doesn't mean discarding experts. But it does mean thinking critically about the systems within which those experts may be operating and about the level of trust that you should put in one expert versus another. The challenge for those who use "follow the science" to mean that government policy should be followed uncritically is to explain why the science differs so widely across different jurisdictions. Did one jurisdiction get the right experts, the right data? How do you know?

The truth is that we rely on experts all the time without totally abdicating our critical faculties. Whenever you take your car to a mechanic or your pet to a vet or yourself to a doctor, you are implicitly doing your own research. Firstly, you had to pick which of many available experts to rely on. Secondly, you assess their performance based on your feeling of their honesty and their past records, and you may, at your discretion, decide to get a second opinion. For many people who have had a serious health condition and needed to interact with the health care system, it is quite common to take their treatment into their own hands, connect with others who have gone through similar situations, seek specific experts that they trust, and make key decisions based on the tradeoffs of different options. This is not a new phenomenon at all. Let's not normalize unthinking conformity and obedience.