“Like “pivot” and “cloud computing,” “big data” is one of those startup buzzwords that gets thrown around indiscriminately–partly because it means different things depending on the intel you’re trying to unearth and partly because it sounds like the kind of futuristic jargon that opens doors. Using machine learning to analyze big data? We can practically see the pitch deck already!”
I’m going to bite here and say “sort of”. Silver’s analysis is on the very low end of what qualifies as ‘Big Data’, but that’s beside the point I want to make here. As Ezra Klein said, “The greatest trick Nate Silver ever pulled was becoming the face of aggregated polling”. Klein is mostly right: while Silver masterfully aggregated a lot of data, he did an equally good job of promoting his work and, in some ways, is the product of being in the right place at the right time. As The Atlantic pointed out, he wasn’t the only person who was accurately predicting the outcome of the election over an extended period of time.
He reminds me a little bit of the psychologist Steven Pinker. Like Silver, Pinker is uber-smart. But the ideas and methodology these guys have are not entirely new. Pinker writes pop-sci books that essentially aggregate existing research. That said, what makes people like Pinker and Silver special is that they take something seemingly mysterious and demystify it. That’s a talent very few people have. And you can sense a little bit of bitterness in Klein’s tweets, as though he feels he could have done what Silver did. And he probably could have, except he didn’t.
So yes, we should absolutely, 100% celebrate Silver’s accomplishment, but more because it represents something that reaches beyond just one dude. It’s that data has completed its near-total integration into our lives, which is too often misconstrued as a bad thing. And more to the point, it’s that everyday fools like you and me can actually see how it works and know that data is our friend. Even in the face of so much bullshit punditry over the last four years, the data could not be denied. Yes, @fivethirtyeight is a game changer. But Big Data has yet to come of age and I suspect that’s where the revolution lies. And that’s what I think about that.
Quick, someone flash the Hadoop signal!
Image source: REUTERS/Chris Helgren
“It’s early yet but, by my count, Twitter has already been responsible for more athletes being ousted from the Olympics than have performance enhancing drugs.”
Dilbert cartoon on Big Data. Awesome.
"The Petabyte Age is different because more is different. Kilobytes were stored on floppy disks. Megabytes were stored on hard disks. Terabytes were stored in disk arrays. Petabytes are stored in the cloud. As we moved along that progression, we went from the folder analogy to the file cabinet analogy to the library analogy to — well, at petabytes we ran out of organizational analogies.
At the petabyte scale, information is not a matter of simple three- and four-dimensional taxonomy and order but of dimensionally agnostic statistics. It calls for an entirely different approach, one that requires us to lose the tether of data as something that can be visualized in its totality. It forces us to view data mathematically first and establish a context for it later. For instance, Google conquered the advertising world with nothing more than applied mathematics. It didn’t pretend to know anything about the culture and conventions of advertising — it just assumed that better data, with better analytical tools, would win the day. And Google was right.”
“It’s difficult to see the data race subsiding. In fact, the economic incentives to harvest and monetize vast amounts of data are only growing. A 2011 McKinsey study (that is quickly becoming the most often cited source of the economic potential of Big Data) pegged the value of Big Data in just the US healthcare system alone at US$300 billion. McKinsey also estimated a need for a million and a half “data savvy managers” in the US simply to take advantage of the economic opportunities of commercial Big Data. The report notes that there are big potential wins for the public at large as well: a more efficient US healthcare system that leans more heavily on Big Data to anticipate public health trends could lower health care costs for all taxpayers. But such win-win scenarios are not always as obvious in other sectors.”