Why Gold Glove Voting is Meaningless
Within the sabermetric community, the notion that Gold Glove awards are overrated or meaningless isn’t much of a revelation. The awards keep going to players whose statistics do not support the results, which makes this clear every year. But I’m less interested in the results themselves than I am in the process that leads to the erroneous decisions. In other words, why, exactly, do we get it wrong? The answer goes much deeper than “not everyone knows about UZR” and the somewhat valid but nevertheless incomplete complaint that the same players just win year after year.
Managers vote for the Gold Glove awards. They vote for players within their league and cannot vote for anyone on their own team. The implicit assumption is that managers, by virtue of watching the other teams in their league play, have a good handle on various players’ defensive abilities. But many issues undermine this process.

1) We don’t see (perceive) everything we think we do.
The first and most important point that must be made regarding visual observations of MLB fielders is that, quite simply, people do not perceive everything around them that they think they do. Our society presumes that people can see and encode everything that their eyes scan around them, but psychological study after psychological study indicates that this is false. The book The Invisible Gorilla: And Other Ways Our Intuitions Deceive Us pretty much settles the point. The book, authored by two cognitive psychologists with doctorate degrees from Ivy League schools, explains the concept of ‘inattentional blindness,’ whereby people frequently do not see things within their frame of vision when they are looking at/for something else.
Well, they see things, on some level–but they don’t perceive them, making that essentially irrelevant. In psychological experiments, eye trackers have documented the precise location on which participants’ eyes were focusing–and have noted that the eyes were sometimes looking directly at things that the participants claimed afterwards not to see. The information, essentially, was never encoded in the brain, because it was focusing on something else.
As the authors (Christopher Chabris and Daniel Simons) note, “directing our eyes at something does not guarantee that we will consciously see it.” Don’t believe this? They open the book with reference to their now-famous experiment showing that half of the people asked to closely watch a group of people pass a basketball around do not perceive a person in a gorilla suit walk into the frame, pound his chest for about 10 seconds, and stroll off.
For someone without a psychological background, this might seem a staggering piece of information; our society does not have much understanding of or patience for the notion that our senses are not as foolproof as we assume they are. What’s more, Chabris and Simons noted that inattentional blindness becomes exacerbated a) when one is looking for something else and b) when one has a limited amount of time to perceive something.
So let’s bring this back to baseball–don’t these conditions apply precisely to the circumstances of a manager watching a batted ball turn into a defensive play? The gorilla went unseen for so many people largely because they had been told to observe and note something else (the number of passes of the basketball certain people in the video made) and because there was limited time to make their observations.
And that’s precisely why managers cannot be expected to see and encode the proper visual information about a fielder’s capabilities. When a batted ball is in play, the manager is looking for numerous other things, namely, what his own team, his own base-runners, are doing. He’s not primarily concerned with the fielding prowess of his opponents. He’s also watching the entire scene unfold, having to pay attention to the alignment of auxiliary fielders in addition to the one about to make a play. The human brain simply cannot take all of that in–it has to satisfice, to cut things out. As Robert Burton says in his book On Being Certain: Believing You Are Right Even When You’re Not regarding a visual search, “to carry this out with maximal efficiency, an implicit second instruction [is] sent to the unconscious–to downplay or ignore irrelevant visual inputs…The unconscious has free rein as to what should or should not be seen.”
Yet, with Gold Glove voting, we ask managers to be able to encode the performance of one specific fielder making a play on a ball–while they’re paying attention to the positioning and advancement of their own team’s base-runners and the other team’s fielders in motion at once. And we ask all of this to occur within a span of a few seconds.
It simply can’t be done. That’s why anyone who says he can judge fielding prowess–or any set of skills from any athlete–purely, or even primarily, on the basis of his observations has insufficient awareness of his perceptual limitations.

2) We don’t remember everything we perceive.
Let’s say that managers were able to overcome all the aforementioned perceptual limitations–it still wouldn’t matter, because our memories are too fragile and fallible to retain and retrieve all of that information, particularly after a long period of time. To quote Dr. Daniel Schacter’s book The Seven Sins of Memory: How the Mind Forgets and Remembers, “with the passing of time, the particulars [of experiences] fade and opportunities multiply for interference–generated by later, similar experiences–to blur our recollections.” Hmm, “later, similar experiences”–doesn’t that sound like yet another ground ball to shortstop, yet another gapper? These plays produce retroactive interference in the observer, making it very difficult to keep track of everything.
Again, the brain cuts corners; it would be inefficient for it to remember every detail that crossed its path. That’s why eyewitness testimony is so horribly unreliable, a notion that psychologists have been pushing for years but which our society is only slowly accepting. That’s also why our memories of dramatic events (the “What were you doing when the Challenger exploded?” kind) are much less accurate than we believe; people claim such memories are vivid, yet study after study shows that they a) change with time (so some of them must be wrong) and/or b) are inconsistent with the memories of other people who were in the same place at the same time.
What helps us remember? Expectations, for one thing. Confirmation bias affects even perception more than most realize; as Chabris and Simons write, “your moment-to-moment expectations, more than the visual distinctiveness of the object, determine what you see–and what you miss.” With confirmation bias, people stubbornly seek out and remember information that adheres to already-held views while ignoring or forgetting contradictory information. In fact, brain studies have found that “the reasoning areas of the brain virtually shut down when participants were confronted with dissonant information” (see the excellent book Mistakes Were Made (But Not by Me) by Carol Tavris and Elliot Aronson). In other words, confirmation bias has a neurological component; our brains simply do not want to think logically about information that contradicts established beliefs.
(I think that’s the biggest reason that the same players win the award over and over again–once people think a player is a good fielder (whether or not said belief is accurate), they remember his good plays and forget or ignore the bad ones. There might be a component of cognitive dissonance in the repeated winning as well; if a player who wins one year doesn’t win the next, that can look like an admission that the earlier vote was wrong, which produces dissonance in anyone who voted for the player who got replaced. But I digress.)
Bottom line: if, over time, we can’t accurately recall details of what happened on September 11th, why do we expect people to recall the details of a game from May 17th on October 10th when they’re submitting their ballots for Gold Gloves?

3) The sample size of observations is simply too small.
Let’s say that managers were able to overcome all the aforementioned perceptual and memory limitations–it still wouldn’t matter, because the sample size of observations is minuscule. Teams within a division play each other 18 times a year. Teams play the rest of their league brethren between six and nine times a year; regardless, it’s an absurdly small sample size from which to generate conclusions about baseball skill, even if our perception and memory were flawless. It’d be like a guy who knows little about baseball catching a handful of Ryan Zimmerman’s at-bats throughout the year while his roommate watches the games; in those at-bats, Ryan might happen to go 3 for 30, leading the guy to conclude that Zimmerman is no good at this batting thing.
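To put a rough number on the 3-for-30 example, here is a minimal Python sketch that treats each at-bat as an independent coin flip with a fixed hit probability. That independence model and the .290 "true" average are my assumptions for illustration only (it is not Zimmerman's actual figure), and the function names are made up for this sketch.

```python
from math import comb, sqrt


def prob_at_most(hits: int, at_bats: int, true_avg: float) -> float:
    """P(X <= hits) when X ~ Binomial(at_bats, true_avg)."""
    return sum(
        comb(at_bats, k) * true_avg**k * (1 - true_avg) ** (at_bats - k)
        for k in range(hits + 1)
    )


def avg_std_error(at_bats: int, true_avg: float) -> float:
    """Standard deviation of the observed average over at_bats trials."""
    return sqrt(true_avg * (1 - true_avg) / at_bats)


# Assumption: a hypothetical hitter whose "true" average is .290.
TRUE_AVG = 0.290
SAMPLE_AT_BATS = 30

p_slump = prob_at_most(3, SAMPLE_AT_BATS, TRUE_AVG)
spread = avg_std_error(SAMPLE_AT_BATS, TRUE_AVG)

print(f"P(3 or fewer hits in {SAMPLE_AT_BATS} AB): {p_slump:.4f}")
print(f"Std. dev. of observed average over {SAMPLE_AT_BATS} AB: {spread:.3f}")
```

Under those assumptions, the observed average over 30 at-bats swings by roughly 80 points in either direction from chance alone, so a casual viewer could easily walk away with a badly wrong impression of a good hitter. The same logic applies to a manager who sees a fielder handle only a few dozen chances.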
There are obviously other partial explanations for curious voting results. The human brain is wired to remember the spectacular and unusual–i.e., Web Gems–but we have little interest in the more mundane: the fact that your Web Gem guy keeps letting balls roll past him to the wall.
All of that is why advanced fielding statistics, where every single ball that’s hit in every single game is tracked, with no fear of rotten perception or memory, are essential for assessing this skill.
Could such a small sample tell the truth about a player, in theory? Sure, though it would probably still be less specific. And the managers could also happen to vote for the right guys, although it’d be hard to say exactly why they did. Maybe some of them consult with each other to get more viewpoints. Maybe some of them look at advanced fielding statistics and prize those above what their eyes saw, which would be commendable but would also undercut the foundation of the voting process, which is based on the notion that managers are better able to judge this than anyone with access to a computer. Are they?