# Lost in Stats: Pitfalls of the Scientific Method

[Organizers: David Gross and Felipe Montealegre-Mora; Wednesdays **4:15pm**, Seminar Room 0.03, New Theory Building; digital participation possible at https://uni-koeln.zoom.us/j/6233837377]

Science employs a collection of methods that are designed to increase our knowledge about the real world. However, it is increasingly appreciated that the scientific method as it is actually used will sometimes lead to systematically biased results. This phenomenon is best understood in the quantitative social sciences, which have faced a *replication crisis* in recent years -- i.e. a situation where widely accepted effects have turned out to be spurious. We now have a fairly good theoretical understanding of how a field employing state-of-the-art inferential methods can end up accepting non-existent effects as real.

This debate has not been as prominent in physics. There are reasons to believe that physics is less affected (access to more data, quantitative theories, restriction to less complex situations). But there are also signs that we should be more concerned and would benefit from absorbing the lessons learned elsewhere.

In this seminar, we will explore the systematic failure modes of the scientific method and scientific institutions, with particular attention to their applicability to physics.

## Organization

The first meeting is on **Wednesday, the 13th of October, 4pm**. If you are interested in participating, just show up or write an email to Felipe (fmonteal@thp.uni-koeln.de). Registration via KLIPS is not currently possible. Most topics will be covered in one-hour talks by participants; some fundamentals may be covered by the organizers in lecture style. **Update:** we are fully booked! Here is the list of talks and resources:

#### Talk list, resources, slides

- 03.11: *Frequentist statistical methods*, Felipe
  - Freedman, Pisani, Purves, *Statistics*, Part VIII.
  - Devore, *Probability and statistics for engineering and the sciences*, Chaps. 6, 7 and 8.
  - Alexander, *The control group is out of control*.
- 10.11: *Bayesian statistical methods*, Roberto
  - Gelman, Carlin, Stern, Dunson, Vehtari, Rubin, *Bayesian data analysis*, Chaps. 1 and 2.
  - Lyons, *Discovery or fluke? Statistics in particle physics*.
  - (Optional/extra) Jaynes, *Probability theory: the logic of science*, Chaps. 4 and 6.
- 17.11: *Simpson's paradox and causal calculus toolbox*, Mark
  - Pearl, *Understanding Simpson's paradox*.
  - Nielsen, *If correlation does not imply causation, what does?*
- 24.11: *Observational causal inference*, Mariami
- 01.12: No talk
- 08.12: *How to detect false positives*, Joey
  - Ioannidis, *Why most published research findings are false*.
  - Funnel plots: https://www.statisticshowto.com/funnel-plot/
- 15.12: *Effects of underpowered studies*, Laura
  - Ioannidis, *Why most discovered true associations are inflated*.
  - Timm, *An introduction to type S and M errors in hypothesis testing*.
- 22.12: *p-hacking, multiple testing, and forking paths*, Rajat
  - Gelman, Loken, *The statistical crisis in science: data-dependent analysis -- a "garden of forking paths" -- explains why many statistically significant comparisons don't hold up*.
  - Gelman, Loken, *The garden of forking paths: Why multiple comparisons can be a problem, even when there is no "fishing expedition" or "p-hacking" and the research hypothesis was posited ahead of time*.
  - Anderson, *p-hacking and the problem of multiple comparisons*.
- 12.01: *Historical expectation bias*, Denny
  - Jeng, *Bandwagon effects and error bars in particle physics*.
  - Jeng, *A selected history of expectation bias in physics*.
  - Feynman, *Cargo cult science*.
- 19.01: *Error bars in quantum estimation / systematic biases in Bell tests*, Nikkin
- 02.02: *Closing talk (title TBC)*, Matthias Kleinmann
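Several of the topics above can be explored with a few lines of simulation. As a flavour of the 17.11 session, here is a minimal sketch of Simpson's paradox in Python, using the classic invented kidney-stone-style numbers (all figures and treatment names are hypothetical, not from any of the listed references):

```python
# Toy illustration of Simpson's paradox (all numbers invented):
# a treatment can look worse overall yet better within every subgroup,
# because subgroup membership is confounded with treatment choice.

# (successes, trials) per subgroup, for two hypothetical treatments
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, trials):
    return successes / trials

for size in ("small", "large"):
    a = rate(*data["A"][size])
    b = rate(*data["B"][size])
    assert a > b  # A wins within each subgroup
    print(f"{size}: A={a:.2f} vs B={b:.2f}")

# Aggregating over subgroups reverses the comparison:
tot = {t: tuple(map(sum, zip(*data[t].values()))) for t in data}
print("overall:", {t: round(rate(*tot[t]), 2) for t in tot})
# A's overall success rate is lower than B's, despite A winning
# in both subgroups -- the aggregated comparison is misleading.
```

The reversal happens because treatment A is disproportionately applied to the harder ("large") cases; conditioning on the confounder restores the correct comparison, which is exactly the kind of reasoning the causal-calculus toolbox formalizes.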

## Preliminary List of Topics

- Review of inferential methods
  - Orthodox & Bayesian statistics, null-hypothesis significance testing, p-values, measures of effect size, uncertainty quantification
  - Causation vs. correlation: causal models, randomized designs, observational causal inference, Simpson's and Berkson's paradoxes
- Pitfalls of empirical science
  - Technical accounts: *Why most published research findings are false*, *Why most discovered true associations are inflated*, ...
  - Fun reads: *The control group is out of control*, *Unicorns vs. Bigfoot*
- The discussion in physics
  - Expectation bias in physics. More specifically: bandwagon effects and error bars in particle physics. Less seriously: cargo cult science.
  - Systematic biases in quantum information experiments: error bars in tomography, systematic errors in Bell experiments
  - Biased measurements in astronomy
  - ...
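To see why null-hypothesis significance testing goes wrong under multiple comparisons (a central mechanism behind "most published research findings are false"), a minimal simulation suffices. The sketch below uses the standard fact that under the null a well-calibrated p-value is uniform on [0, 1], rather than simulating full tests; all parameter values are invented for illustration:

```python
# Minimal sketch (invented parameters) of multiple-comparisons inflation:
# test 100 true-null "effects" at alpha = 0.05 per simulated study, and
# count how often at least one comes out "significant" by chance alone.
import random

random.seed(0)
ALPHA = 0.05
N_TESTS = 100      # hypotheses tested per "study"
N_STUDIES = 1000   # simulated studies

def null_p_value():
    # Under the null hypothesis, a well-calibrated p-value
    # is uniformly distributed on [0, 1].
    return random.random()

false_alarm_studies = 0
for _ in range(N_STUDIES):
    pvals = [null_p_value() for _ in range(N_TESTS)]
    if min(pvals) < ALPHA:   # any "significant" finding at all?
        false_alarm_studies += 1

rate = false_alarm_studies / N_STUDIES
print(f"P(at least one 'significant' result) ~ {rate:.2f}")
# Theory: 1 - (1 - 0.05)**100 ~ 0.994, so nearly every such study
# "finds" an effect even though every null hypothesis is true.
```

The same simulation, with the explicit multiple testing replaced by data-dependent analysis choices, is essentially the "garden of forking paths" argument: the inflation occurs even when only one comparison is ever reported.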

## Non-topics

- We are only interested in *systematic* biases of the scientific method. That is distinct from hypotheses/theories/programs failing to live up to their hopes. Failure is an expected part of exploration; systematic biases are not.
- Pseudoscience ("quantum homeopathy...") or crackpots. (Though the typology of crackpots *is* a fascinating topic, as is its relation to physics.)
- Conspiracy theories, nihilism, unfocused cultural criticism
- Fraud. (Fraud is boring.)
- "Culture wars" and hot-button political issues
- Activism. Here, we aim to understand, not to reform.