Book Review: The Numbers Game

Alan Schwarz’s The Numbers Game is an indispensable look at how the numbers that have come to define the game of baseball came to be. The book is less about the hallowed numbers that even casual fans can identify: Aaron’s 755 home runs, DiMaggio’s 56-game hitting streak, Nolan Ryan’s 5,714 strikeouts, Cy Young’s 511 wins, Pete Rose’s [...]

Book Review: Proofiness

Charles Seife’s Proofiness is an accessible and entertaining look at the many ways numbers can be used (more to the point, abused) in order to win an argument.  Seife spends the early part of the book outlining his typology for numerical abuses.  For instance, “disestimation” is the act of taking a number too literally, understating [...]

“Statistics is the New Grammar”

In the latest issue of WIRED, Clive Thompson pens a great piece which echoes a sentiment I’ve touched on before: in a data-driven world it is critical that all citizens have at least a basic literacy in statistics. Now and in the future, we will have unprecedented access to voluminous amounts of data.  The analysis of this [...]

Analytical Shortcuts: Knowing “What” Instead of “Why”

Correlation doesn’t always equal causation, but often correlation can serve as a signal. The collection and analysis of data in some areas of the world are messy and slow. Oftentimes this means the data can only tell us what happened in the past. What we would ideally like is a snapshot of events and [...]

Methodology Lessons: DOE’s Natural-gas Overstatement

The Wall Street Journal reported yesterday that the U.S. Department of Energy is set to restate the data it collects on U.S. natural-gas production. The reason? The Department has learned that its methodology is seriously flawed: The monthly gas-production data, known as the 914 report, is used by the industry and analysts as a guide for [...]

Think like a methodologist

Nathan at Flowing Data puts words to an idea I’ve had for a while, but could never figure out how to communicate.  He writes, “[T]he most important things I’ve learned [in statistics courses] are less formal, but have proven extremely useful when working/playing with data.”  Some of the lessons learned include: [T]rends and patterns are [...]

The First Sabermetric Cy Young?

That’s one way to interpret Zack Greinke’s claiming of the award for 2009: It was not surprising that Greinke won, since his earned run average, 2.16, was the lowest in the American League since 2000. But his decisive margin of victory over Seattle’s Felix Hernandez was a sign that voters overlooked his deficiency in another [...]

Visualizing War

Two topics that are right up my alley: international conflict and data visualization. Put the two together, and you have a truly thought-provoking piece of work. David McCandless is a “visual journalist” who specializes in visualizing data across numerous subjects. In his latest work for The Guardian’s Data Blog, David visualizes a ton of [...]

Cloud Analytics from Big Blue

Music to analytically driven ears: [...] IBM is unveiling a new internal analytics product that the company is touting as the “largest private cloud computing environment for business analytics in the world,” which launches internally with more than a petabyte of information. Along with this internal product, IBM will launch a companion product for clients to [...]

Did the market for offensive talent correct after Moneyball?

For those who follow the debate around Michael Lewis’ Moneyball and its effect on front office strategy, there is a fantastic article over at The Hardball Times. For the uninitiated, Moneyball follows Oakland A’s General Manager Billy Beane during the summer of 2002 as the team attempted to implement a different strategy to make his [...]
