If you get caught stealing signs in baseball—the original baseball data theft—your baseball punishment is a fastball to the ribcage.
If you steal second base when your team is up ten runs, you also get a fastball to the ribcage.
If you steal another player’s roster spot by taking PEDs, then get caught, it may take a while (the guy you beat out has to make it to the big leagues first) but, you guessed it, a fastball to the ribcage.
Clearly, baseball understands the concept of corporal punishment, maybe too well, as its collective wisdom seems to insist that Astros General Manager Jeff Luhnow, who just caught a St. Louis Cardinals hacker snooping around his comprehensive player database, codename Ground Control, kick open the door to the Cardinals’ analytics department and drill some random data nerd with his best heater in retribution.
Baseball executives beaning baseball executives: it’s an amusing image. So amusing that many baseball media moguls, myself included, have gotten a lot of mileage out of playing up the baseball justice angle.
But it’s missing the point.
In hacking into the Astros database, the St. Louis Cardinals did something unprecedented in baseball. Not only did they potentially cause more damage to more players and careers than any individual player cheater ever did or could (yes, even A-Rod), they also made one of the strongest cases for players to pay attention to the power of advanced analytics since Brad Pitt got cast to play the lead in Moneyball.
Advanced data analytics has had a tough go of it. Not among front office officials, who are well versed in decision-making via analytical samples, but among the players who generate the data being analyzed. That’s because baseball players have a certain order of operations by which they instinctively understand the game. It’s often rooted in clichés and encapsulated in slogans like “playing the game the right way.” It expands and contracts with ego, tries feverishly to keep things simple/stupid, and only uses advanced data when it’s distilled down into not-so-advanced, bite-size chunks connected to immediate, gratifying results. Microwavable analytics, if you will.
To get to that point, however, a lot of boring and not-so-newsworthy work has to be done. Data has to be collected, refined, processed, and turned into an actionable outcome. And there is soooooo much data in baseball, mountains and mountains of it. Getting data is rarely the issue in making a baseball decision. Cleaning and filtering it, extracting meaning and causality: that’s when you’ve created something truly valuable, maybe even game-changing.
What the Astros have in Ground Control, and what every baseball organization worth its salt now has in its employ, is the culmination of all-star-level analytical talent: complex tools, reports, and algorithms that help front office officials draft all-star-level athletic talent. These tools represent an investment, one each respective front office will contend is making its organization better. Surely it has played a part in the evolution of the current Astros.
Even so, I expect most players to shrug at the lost work of their off-field counterparts. I expect most baseball fans to shrug at it as well. To many, despite all the big data baseball breakthroughs and movies, stats are still numbers recorded after baseball players act, with the good numbers going to good players while the bad numbers go to the Hayhursts… err… bad players. And you can’t steal someone’s stats like you can a base, duh.
But this loses sight of how today’s stats are the raw building blocks of the models that, in turn, build future championships. Those models are intellectual capital. And stealing intellectual capital is a crime… Just not one for which you can sink a fastball into the guts of the person who committed it as punishment.
Maybe it’s for the best, as a fastball beaning, even one performed by Randy Johnson, would be getting off easy. If FBI investigators can prove a malicious data breach occurred, or that the Astros’ proprietary data tools were taken, jail time will be a result. Moreover, if a link can be made between the Cardinals’ success and any illegally accessed information, expect MLB to have a long, hard, and possibly damning chat about stripping the Cardinals of their previous and/or present success, which could have untold fallout.
Ironically, there is one potential victor here: the analytics community. Spotting trends is hard work. Getting players to buy in is harder still. In fact, sometimes explaining the complex nuances of baseball big data to a player is counterproductive, as more than a few players feel resentful, insisting data analysts don’t have a lick of athletic potential and could never understand the difficulty of the athlete’s struggle for greatness by way of standard regression. It’s like asking the player to look at himself or herself as a number-generating machine and not a person. They’ll resist. They’ll say you don’t understand. They’ll say you’ve never played the game.
But that game has changed, and if data and the trends it shows are now worth stealing, there is at least some intrinsic validation that data does understand, shouldn’t be resisted, and has come into existence through its own struggle.
Maybe the data even shows a little common ground.
Baseball players understand corporal punishment, that much we know. There is another topic they understand very well: cheating. Data teams may not compete on the field like their player counterparts do, or come into the game the same way, but they still compete to get an edge for their club, and no one likes to have their edge taken away without justice being served, baseball or otherwise.