"What does that even mean?" That question can be as complex as some advanced stats themselves, in that one could be asking: how is it counted/measured, what makes up the stat, what is a good (or bad) value, or what is that supposed to tell me about player X? An equally valid question being asked since the dawn of complex statistics in sports is, so what?
It is that last question that has given rise to the debate(s) surrounding advanced statistics, which have permeated virtually every team sport played beyond the neighborhood sandlot. There are plenty of basic stats – accumulated stats like points scored and home runs hit, and rate stats like free throw percentage and batting average – that may be interesting and impressive (or horrifying) on their own, but not necessarily informative when it comes to the big picture of an individual’s overall contribution to the team’s record. And then there are the “advanced” stats that either combine two or more basic stats into a more revelatory and/or informative form of data or analyze performance across a complex set of criteria.
Nate Silver recently published a piece on FiveThirtyEight.com in which he presented an improved statistic for measuring the value of a relief pitcher, particularly a closer. His premise is that the save is a wildly overrated stat that does little more than count a pitcher’s accumulated saves, which, I happen to agree, are based on a rather arbitrary set of situational conditions. One of the best parts of this article, other than perhaps fueling the fire within Goose Gossage, is that it is a fantastic how-to guide for developing an advanced statistic. Consider this the 21st-century stat geek version of Schoolhouse Rock’s How a Bill Becomes a Law. For anyone interested in turning his or her hypothesis into the next VORP, Silver provides a fine process blueprint.
Silver walks through all of the rationale behind developing the Goose Egg. He meticulously breaks down the situations in which pitchers are rewarded or penalized in the algorithm and provides the context around why each element was (or was not) included in the calculation, all the while mathematically relating the performance to the outcome of the game – which is, at the end of the day, the ultimate relevant team stat. Silver then recalculates the career performances of some of the game’s best relievers using the Goose Egg criteria. He then validates the premise that this stat is more germane than the save by comparing it to the already accepted Win Probability Added statistic. (Spoiler alert: it is, significantly.)
Silver’s intention was not to demonstrate how an advanced statistic is born; I am serving it up as that. Once he has the Goose Egg sorted, he digs even deeper, further refining the statistic to “correct for” ballpark and league. In doing so, he layers on the concept of WAR (wins above replacement). This concept is one of my all-time favorites in the world of performance analysis, and has been since the ’90s, when Keith Woolner first introduced us to VORP.
At its introduction, Value Over Replacement Player was a concept many people were skeptical could be accurately calculated; there were so many variables! In fact, the first keynote speaker at the first MIT Sloan Sports Analytics Conference (2007) was then Toronto Blue Jays GM J.P. Ricciardi, who said during his remarks that he supposed he should start paying closer attention to VORP – if only he could figure out what it was telling him. At the end of Ricciardi’s remarks, Keith Woolner himself was called on to start the Q&A session and offered to walk Ricciardi through the stat. For those who want to brush up on it, Baseball Prospectus published an overview of VORP by Derek Jacques in 2007.
The reason VORP, or WAR, or whatever the specific version of this stat is in a given sport or for a given position, is so valuable is just that: done correctly, the calculation factors in all of the components of scoring for one’s team, adjusts for position and venue where applicable, and allows for offense and defense to be included in sports where players play both ways. So, thank you, Keith Woolner, for winding up this particular statistical top; may it spin on and on.
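To make the "over replacement" idea concrete, here is a minimal sketch in Python. It is illustrative only, not Woolner's actual VORP formula: the inputs, the runs-per-out framing, and the assumption that replacement level sits at some fraction of the league-average production rate are all simplifications chosen for this example.

```python
def value_over_replacement(player_runs_created: float,
                           player_outs: int,
                           league_runs_per_out: float,
                           replacement_fraction: float = 0.8) -> float:
    """Illustrative value-over-replacement calculation (not Woolner's
    exact VORP formula).

    Replacement level is modeled here as a fraction of the league-average
    run-production rate; value is the player's runs created above what a
    replacement-level player would have produced in the same playing time.
    """
    replacement_rate = league_runs_per_out * replacement_fraction
    return player_runs_created - replacement_rate * player_outs

# Hypothetical player: 90 runs created over 400 outs, in a league
# averaging 0.18 runs per out.
print(round(value_over_replacement(90.0, 400, 0.18), 1))
```

The real stat layers on the adjustments described above (position, park, league), but the skeleton is the same: rate above a defined replacement level, multiplied by playing time.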
Another revolution in the performance analysis landscape is the grading system used by Pro Football Focus, whose grades appear in NFL on NBC broadcasts and on countless fantasy football platforms. Understanding the philosophy behind the grading system is as critical as understanding the algorithm itself. In a nutshell, PFF grades each player on how well he executed his portion of the play, without regard to the actual outcome of the play. That is, if a play broke down because of factors out of a given player’s control, his score on that play is not impacted. With eleven players on a side, each with a specific assignment, it’s probable that on any given play quite a few players will execute their roles satisfactorily and their team will come away with an unsatisfactory result. For example, if a quarterback stands in the pocket, executes a good pass under pressure, and the receiver drops the pass, the quarterback is not penalized for that incompletion in the PFF grading system, nor are the linemen. Even though this process requires a subjective review of each snap, the system itself has become so robust, with enough built-in checks and balances, that PFF introduced in-game scoring during the 2016 NFL season.
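The philosophy can be sketched in a few lines of Python. To be clear, this is a toy model of execution-based grading, not PFF's published methodology: the per-play scale and the mapping to a 0–100 grade are assumptions made for illustration.

```python
def normalize_grades(play_grades: list[float]) -> float:
    """Map an average per-play execution grade on an assumed -2..+2
    scale onto a 0-100 grade. Both the scale and the linear mapping
    are illustrative assumptions, not PFF's actual algorithm.
    """
    avg = sum(play_grades) / len(play_grades)
    return (avg + 2) / 4 * 100

# A quarterback is graded on execution only: per the example in the
# text, a well-thrown ball that the receiver drops is graded on the
# throw, not on the resulting incompletion.
print(round(normalize_grades([1.0, 0.5, -0.5, 1.5, 0.0]), 1))
```

The key design point survives even in a toy version: the input is a judgment of each player's execution on each snap, so the outcome of the play never enters the calculation.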
As Nate Silver proved with the Goose Egg, and as the integration of PFF grades into mainstream platforms has indicated, there is a ton of white space in the world of advanced statistics. There is probably as much white space as anywhere in the realm of global football (soccer, to me). There are many reasons why the world’s most popular sport lags behind so many others when it comes to dissecting it by the numbers. To name a few: the rate of scoring is about the lowest of any mainstream team sport, the types of shots on goal are widely disparate (a header from 10 feet, a ball booted from 40 feet, everything in between), and there are limited “basic” stats upon which to build advanced metrics.
For example, it was popular for a while in some soccer circles to adopt PDO from ice hockey, where it had gained wide acceptance and had been further modified to accommodate power plays and other game-specific situations. (PDO is 10 times the sum of a team’s shooting percentage and its save percentage, or 10(Sh%+Sv%).) For soccer, however, it drew significant criticism and never really caught on. Among the criticisms was that the equation suggests that, on the soccer pitch, offensive performance and defensive performance are related; since this has not been proven to be the case, PDO essentially combines two unrelated numbers in an attempt to draw a conclusion. It is not an entirely useless stat in soccer, as Richard Whittall pointed out in 2014 for 21st Club. However, “in the same way you can do horrific things with a nail gun (happy Halloween!), so too can you wreak havoc with a simplistic, one-size-fits-all approach to PDO.” (Incidentally, Hull City was relegated after the 2014-15 season.)
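The formula in the parenthetical above is simple enough to compute directly. A quick Python sketch, using hypothetical percentages (the example figures are mine, not from any real team):

```python
def pdo(shooting_pct: float, save_pct: float) -> float:
    """PDO as defined above: 10 * (Sh% + Sv%), with both
    percentages expressed on a 0-100 scale."""
    return 10 * (shooting_pct + save_pct)

# Hypothetical hockey-style inputs: a team shooting 9.5% whose
# goaltenders save 90.5% of shots faced.
print(pdo(9.5, 90.5))  # 1000.0
```

Because every shot that is not saved is a goal, league-wide shooting and save percentages sum to 100, which is why PDO centers on 1000 and is often read as a regression-to-the-mean indicator in hockey – and why simply transplanting it to soccer invites the criticism described above.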
I, for one, am looking forward to the inevitable onslaught of soccer analytics sure to permeate our statistical lexicon any day now – analytics that will, much like VORP once did, account for multiple performance variables such as distance from goal, angle of attack, shot/attempt type (head/foot), fast break, time of game, and so on. With the wearable technology teams are utilizing and the precision data being collected from the pitch, the biggest question will be whether the teams will be willing to go public with their revelations.