Doing Good Science: More Than Good Methods

Epistemic status: initial thoughts, probably incomplete. I prefer sharing incomplete but useful thoughts over not sharing them at all. I think these thoughts roughly generalize to non-scientific research (e.g. industry R&D or philosophy).

Recently somebody at a dinner asked me what I think effective science is (that is, science that is maximally good for the world). I had some initial thoughts, but the question kept simmering in my mind. Here I set out an initial model of what I think good science is. Many people have thought about this question before me, and I am probably ignoring a lot of their work. However, I hope this still adds something useful. The question matters for doing good science, evaluating good science, and funding good science.

I think this problem is important, neglected, and tractable. It is important because scientific developments compound: they will affect the trajectory of the future and obtaining new insights sooner improves decision making. I think it is neglected as most granting institutions seem to adopt a morally partial view (i.e. they favour a specific group such as Dutch citizens, or a specific topic such as history of philosophy). Impartial granting and evaluation is done, but only exceptionally. I think the problem is relatively tractable, although it is one of the most complex ways in which one can do good. Science has a long distance to actual impact (in comparison to e.g. planting trees or making policy), so there are many paths it can take towards impact. However, my impression is that we can evaluate the value of research much better than ‘total ignorance’, even if we cannot single out the best possible research.

I want to define ‘Good Research’ as good questions with good answers that are used to improve the world. In formula form, this looks as follows:

Value of research =
quality of the question * quality of the answer * quality of use

Note that the value of research is a product, not a sum. This means that scoring low on only one dimension (e.g. use) drastically reduces the total value of the research. This principle (also known as the Anna Karenina Principle) normally gives rise to a heavily right-skewed distribution of total value, even if values are normally distributed over the three factors. (This is often described as a power law; strictly speaking, a product of independent positive factors tends towards a log-normal distribution, which is similarly heavy-tailed.) This can be shown in two different ways, so I’ll just draw both. The point here is to get all relevant factors right.
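To make the product-versus-sum point concrete, here is a minimal simulation sketch. The assumptions are mine and not part of the model itself: each factor is drawn independently from a normal distribution with mean 1 and standard deviation 0.3, clipped at zero, and the helper names (`research_value`, `top_share`, `simulate`) are hypothetical.

```python
import random
import statistics


def research_value(question, answer, use):
    """Multiplicative model from the text: question * answer * use."""
    return question * answer * use


def top_share(values, fraction=0.1):
    """Share of the total value held by the top `fraction` of projects."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


def simulate(n=20000, mean=1.0, sd=0.3, seed=0):
    """Draw n hypothetical projects; return their product and sum scores.

    Assumption: factors are independent and normally distributed,
    clipped at zero because negative values are left out of the model.
    """
    rng = random.Random(seed)

    def draw():
        return max(0.0, rng.gauss(mean, sd))

    products, sums = [], []
    for _ in range(n):
        q, a, u = draw(), draw(), draw()
        products.append(research_value(q, a, u))
        sums.append(q + a + u)
    return products, sums


if __name__ == "__main__":
    products, sums = simulate()
    print("product: mean", statistics.mean(products),
          "median", statistics.median(products))
    print("top 10% share, product:", top_share(products))
    print("top 10% share, sum:    ", top_share(sums))
    # One factor at zero wipes out the value, however good the rest is.
    print("unused research:", research_value(0.9, 0.9, 0.0))
```

Under these assumptions the product's mean sits above its median, and the top 10% of projects hold a larger share of total value than they would under a sum. With only three factors the skew is moderate; it grows as more factors are multiplied together.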

I should also note that each factor can take a negative value, but I don’t include that in my model for now. It does mean that even seemingly good research can have negative consequences. Responsible scientists need to realize this and take the necessary steps to avoid scoring negatively on any factor wherever possible.

I also believe that the limited attention that has gone into evaluating good science has gone mostly into the quality of answers: making sure they are reliable (e.g. they replicate) and valid (i.e. they actually capture the phenomenon they are meant to capture). Some attention has also gone into the implementation of science, e.g. through collaborations with industry, government, and civil society. I suspect the least attention has gone into making sure we ask the right questions. I will now go into each factor, flagging some initial considerations and breaking the factor up further.

Quality of the question

Richard Hamming is (in)famous for asking many scientists “what is the most important problem in your field?”. After getting an answer, he asked “and why are you not working on it?” Unsurprisingly, not everyone liked Hamming. However, his observation is on point: many scientists do not work on the most important problems in their field. Hamming acknowledged that asking the right questions is a skill that needs years of cultivation (see this talk of his), and requires being connected to one’s own field, other fields, and to society. Without (the right) feedback one shouldn’t expect to figure out the most important questions to ask. (This is also why I think being involved in the effective altruism community is so important for me.) However, where Hamming focused on problems that were scientifically important, I (and I suppose many readers with me) care about more than scientific progress: I want it to benefit the lives of humans and other sentient beings.

This brings values into the picture. I regard one of the main contributions of philosophy of science in the 20th century to be the insight that science is value-laden, not value-neutral. Consequently, scientists need to engage in practical ethics to decide what is most important: is it the wellbeing of humans, of all sentient beings, reducing global inequality, reducing existential risk, promoting biodiversity, or something else? Given that some moral views (especially total utilitarianism, which I find very plausible) claim that there are orders of magnitude of difference in value between these things, it is important to get them right. If you think that we already have our academic priorities aligned with our moral priorities, I present you with this graph from Nick Bostrom (2013):

Another component of question quality might be scientific importance: working out the details of one of many theories does not move the field forward as much as questioning the assumptions of a paradigm, which has huge scientific implications (e.g. Kahneman & Tversky questioning the assumptions of classical economics). I have heard others say that some people are especially good at this: newcomers and scientists from other fields. Interdisciplinary research also seems neglected: applying insights to new areas is something not many people are able to do. There are possibly other factors I have not identified here.

Quality of the question =
moral importance * scientific importance * ..?

Quality of the answer

This is a factor I am much less qualified to talk about than other scientists and philosophers of science. Answer quality includes factors such as reliability and validity. It can be improved by individual creativity and rigor, by systems such as preregistration (to avoid p-hacking and the file drawer effect), and by other methods that avoid false positives. The formula might look as follows, but I’m significantly uncertain about it:

Quality of the answer =
reliability * validity * generalizability * simplicity

Quality of use

Research is worthless if it always stays within academia. It needs to positively affect the world. Insights will generally be used outside of academia by government, industry, or civil society. However, science can also influence the world indirectly by influencing other science. I suspect this is how most science impacts the world. To connect insights to actors, the insights need to be accessible. Initiatives to improve accessibility include open science, science outreach to the general public, and other forms of translation such as literature reviews.

Moreover, quality of use is also not value-neutral. A policy can be implemented very rigorously and efficiently to achieve a worthless, or even an abominable, goal. Effective scientists therefore need to ensure their research is used by competent actors with the right values. An approximation of the quality of use is as follows:

Quality of use =
accessibility of answer * values of user * quality of implementation
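
The three sub-formulas can be nested into the overall product. The following is only a rough sketch under assumptions of my own: every sub-factor is scored on a 0 to 1 scale (the model does not specify a scale), only the sub-factors named above are included (the open "..?" term is left out), and the class and field names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Question:
    moral_importance: float
    scientific_importance: float

    def quality(self):
        return self.moral_importance * self.scientific_importance


@dataclass
class Answer:
    reliability: float
    validity: float
    generalizability: float
    simplicity: float

    def quality(self):
        return math.prod([self.reliability, self.validity,
                          self.generalizability, self.simplicity])


@dataclass
class Use:
    accessibility: float
    user_values: float
    implementation: float

    def quality(self):
        return self.accessibility * self.user_values * self.implementation


def research_value(question, answer, use):
    """Value of research = question quality * answer quality * use quality."""
    return question.quality() * answer.quality() * use.quality()


if __name__ == "__main__":
    # Illustrative scores only: excellent research that stays inaccessible...
    strong_but_unused = research_value(
        Question(0.9, 0.9), Answer(0.9, 0.9, 0.9, 0.9), Use(0.1, 0.9, 0.9))
    # ...versus research that is merely good on every sub-factor.
    balanced = research_value(
        Question(0.8, 0.8), Answer(0.8, 0.8, 0.8, 0.8), Use(0.8, 0.8, 0.8))
    print("strong but unused:", strong_but_unused)
    print("balanced:         ", balanced)
```

With these made-up scores the balanced project comes out ahead, because a single weak sub-factor (accessibility) outweighs being excellent everywhere else.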


Doing good with science is complex and neglected, but tractable. I have set out a very rough, incomplete, and partially incorrect model to make sense of this. I encourage others to do further work on this and would be happy to contribute further.

My preliminary model of impactful science looks like this, and is subject to change:

One comment on “Doing Good Science: More Than Good Methods”

  1. Good and interesting post, though I just skimmed it. The ‘Anna Karenina principle’ (which I had never heard of, except from the book by Tolstoy, which I haven’t read either) is ‘spot on’.

    ‘Your chain is as strong as its weakest link’ is another version.

    The Wikipedia article takes you to V. I. Arnold’s formulation (I think he’s the A in the KAM theorem of statistical physics). He wrote a quite critical article about math education in France long ago. (Russian math and science is also quite complicated; see the Luzin affair.)

    Everything is ‘multiplicative’, including utility, and it’s nonlinear. This has been my basic critique of EA, though I have a sort of simplistic or what I call a common-sense view. It’s also similar to ‘the parable of Solomon’ (if I recall): the quality of life of two halves of a baby is not the same as that of a whole baby. It’s not additive.

    In my view the problem with science now is that there is too much of it. It’s often redundant; some people work on trivial issues, do not read the literature, write very long ‘obfuscationist’ papers (make a molehill into a mountain), and are ‘opportunity hoarders’ (take all the research funding for their trivial studies and spend it on conference travel); and it is often biased. It’s also often ‘publish or perish’ driven, and there are ‘from lab to market’ issues (corporate-funded science: they want scientists to make something they can sell). This is one reason the USA has an opiate epidemic. I knew scientists who worked in Big Pharma; in one town in the USA with 400 people and 2 pharmacies they sold 3 million OxyContin pills in 2 years.
