THE EXAGGERATED PROMISE OF SO-CALLED UNBIASED DATA MINING

Submitted by michael on Mon, 01/14/2019 - 11:50
Excerpt

The Feynman trap—ransacking data for patterns without any preconceived idea of what one is looking for—is the Achilles heel of studies based on data mining. Finding something unusual or surprising after it has already occurred is neither unusual nor surprising. Patterns are sure to be found, and are likely to be misleading, absurd, or worse.

The next time someone says, "I'll collect all the data and then the patterns will be apparent," I'll remind them of this.
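
The claim that patterns are sure to be found can be checked with a small simulation. What follows is only a sketch, not anything taken from the excerpt: it fills a made-up dataset with pure random noise, then searches every pair of series for a correlation. The names `pearson` and `mine_noise`, the sizes (50 series of 20 points), and the seed are all assumptions chosen for illustration. The threshold 0.444 is approximately the two-sided 5% critical value of the correlation coefficient for 20 observations.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mine_noise(n_series=50, n_points=20, threshold=0.444, seed=0):
    """Search pure noise for 'patterns': pairs of series with |r| above threshold.

    All parameters are illustrative assumptions, not values from the excerpt.
    """
    rng = random.Random(seed)
    # Every series is independent Gaussian noise, so no real relationship exists.
    series = [[rng.gauss(0, 1) for _ in range(n_points)] for _ in range(n_series)]
    results = []
    for i, j in itertools.combinations(range(n_series), 2):
        results.append((i, j, pearson(series[i], series[j])))
    hits = [t for t in results if abs(t[2]) > threshold]
    return results, hits


results, hits = mine_noise()
strongest = max(results, key=lambda t: abs(t[2]))
print(f"pairs searched: {len(results)}")
print(f"pairs with |r| > 0.444: {len(hits)}")
print(f"strongest 'pattern': series {strongest[0]} vs {strongest[1]}, r = {strongest[2]:.2f}")
```

By construction none of these series has anything to do with any other, yet about 5% of the 1,225 pairs (roughly 60) would be expected to clear the threshold by chance alone. Anyone who searched this data with no prior hypothesis would come away with a list of "discoveries".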