Figure skating fans love to talk about judging: everyone has an opinion on it! But talking about the judges’ work is quite different from actually getting in there and doing it yourself, as I recently found out.
This spring, the Naked Ice website launched a judging project (“We’ll Be the Judge of That!”) in which fans would actually judge an entire competition, issuing a full set of TES/PCS marks for each skater, just like real judges. Not only that, Naked Ice upped the stakes by choosing the 2014 Sochi Olympics ladies’ event—one of the most controversial of the IJS era—as the competition to be judged. The results of the project are available tonight on the Naked Ice site: “We’ll Be the Judge of That–2014 Sochi Olympics Results.”
I participated in this project as one of the 7 judges. It was really interesting and a lot of fun! It was also a lot of work, and very educational. To be honest, I think this is one of the most important fan projects I’ve ever seen done in the skating community. As much as we all talk about judging, very few of us have actually done the job and really understand the challenges. This project gave us a window into what judges really experience when they’re using IJS. And I have to say: Judging under this system is not easy.
Here’s how the project was set up. Each judge had to watch each short program and long program for the top 9 ladies at the Sochi Olympics. Then, we had to enter our marks into online score sheets (see example below). We had to provide a GOE mark (-3 to +3) for each technical element. We also had to give a brief explanation for each GOE mark. Then, moving on to PCS (the program components), we had to provide a mark in each of the five PCS categories, plus an explanation.
So, overall, we had to enter 12 separate marks (7 TES/5 PCS) for each short program and 17 separate marks (12 TES/5 PCS) for each long program (29 marks in all per competitor). Taking both programs into account, for all 9 skaters, we had to award a total of 261 separate marks! Can I just say, that’s a lot of marks to determine and enter? And we weren’t even judging the entire competition. (The real-life judges in Sochi had to score 24 ladies, not just 9!)
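For anyone who wants to check my math, the mark counts above work out like this (a trivial sketch; the element and skater counts are just the ones described in this post):

```python
# Sanity-checking the mark counts described above.
SP_ELEMENTS = 7     # short program technical elements (one GOE mark each)
LP_ELEMENTS = 12    # long program technical elements
PCS_COMPONENTS = 5  # program components, marked in both programs

marks_per_sp = SP_ELEMENTS + PCS_COMPONENTS      # 12 marks per short program
marks_per_lp = LP_ELEMENTS + PCS_COMPONENTS      # 17 marks per long program
marks_per_skater = marks_per_sp + marks_per_lp   # 29 marks per competitor

skaters = 9
total_marks = marks_per_skater * skaters
print(total_marks)  # 261
```

And that total would have been 696 marks for the full field of 24 ladies the real Sochi judges faced.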
Actual Judging Process
The concept behind this project was to score skaters’ performances exactly like real judges would. However, in practice, our actual experience judging the performances was inevitably different from the real judges’.
First, we watched the performances on video, not live in the arena. Having been to several live competitions now, I didn’t find it as difficult to judge skating on video as you might expect. When you watch skating live, you do get a somewhat better sense of ice coverage, presence/performance impact, and perhaps speed. But in other ways, I actually find it easier to evaluate performances on video. Watching live, there are a lot of external stimuli that can be quite distracting (crowd noise, crowd movement, activity around your seat, comments from people nearby, lighting oddities, activity in the kiss-n-cry). Watching on video eliminates those distractions and allows you to really focus more closely on the performance. So I actually didn’t feel that judging via video, rather than live, had a significant negative effect.
What did make this project very different from real-life judging? The ability to rewatch programs. In real life, judges get only one shot to watch each program. But when judging at home, you can readily rewatch the programs—and I did. In fact, I watched each program 3 times before entering my marks. Why? I wanted to get my marks “right.” Or as right as they could be. And I just found that trying to evaluate all the information necessary to provide 12 or 17 marks per program was nearly impossible if I watched the program only once. There was just too much to look at, too much to take in. So, instead, I watched each program 3 times, took notes, then entered my scores.
I found myself constantly referring to two ISU documents: The Program Components Chart and the Single & Pair Skating Scale of Values, Levels of Difficulty, and Guidelines for Marking Grade of Execution. Yes, as dry as it sounds, I was flipping pages and counting bullets for pretty much every single score. Because I figured that if I was supposed to use IJS, then I needed to try to use it accurately, by the guidelines. Otherwise, I’d just be making up my own scoring system.
It was quite a long process to watch all those programs and enter all those scores! The short programs went by fairly quickly and were entertaining. But the long programs took a while, and the process started to seem a bit more academic.
This project was nothing if not educational. I felt like I noticed and learned a ton of stuff about IJS and judging. Here, in no particular order, are some random observations/thoughts coming out of this project.
- As judges, we gave each lady 29 marks across both programs. Yet, at no point did we give an overall ranking mark! I never fully grasped before that ISU judges are essentially working blind in this system, in regard to overall placement. You’re scoring the skaters on so many different elements and qualities, yet you never give a clear ranking mark, saying which skater you think is best. Your ranking of each skater is just the sum of those 29 individual marks. And unless you’re experienced and have a darn good memory (and mental math capabilities), it’s hard to know exactly where you’ve ranked a skater overall.
- Before this project, I always thought the Sochi podium should have been: 1 Yuna, 2 Carolina, 3 Adelina. Now, at the end of the project, I can honestly say: I don’t know who I put in first place with my scores. I am waiting to find out. From the “judge’s” perspective, I see now that it was a very close competition, really without a clear-cut, obvious winner. While rewatching the programs, I noticed many different things about the skaters’ individual performances, such as: Carolina was sublime, but her spins were just okay and not well-centered; Yuna’s jump GOE in the LP turned out to be not as high as I expected, in part because she lacked transitions into the jumps; Adelina’s spins were terrific, but her body line/control was worse than I remembered; Mao’s LP was even more amazing than I recalled, especially her step sequence; Gracie’s spins were outstanding and her PE mark should have been very high; Ashley really sold her LP, as always. I noticed a lot of individual things, and I tried to show that through the individual marks. But I still don’t know where I placed the skaters overall.
- The very nature of judging under IJS discourages appreciation for the program as a whole. Because you’re giving marks for so many different things, the program becomes less a cohesive entity than a collection of elements. This is particularly true in the long program, which has almost twice as many technical elements to mark.
- The GOE criteria for jumps need some serious rewriting, in my opinion. There are so many criteria that help you get positive GOE on jumps: Doing steps into the jump, setting the jump on a musical highlight, throwing up a half-straight arm a la Medvedeva, having a “creative” entry or exit. In fact, it’s actually kind of hard not to get positive GOE on jumps if you just throw in a couple of these features (having just two features produces +1). My issue with this? The quality of the jump itself (height, flow, ease, proper landing) is somewhat lost. You still get bullet points for those qualities, but you can also get really good GOE on a really small jump or a jump with a poor landing, as long as you add in the arm or the steps or the musical highlight. I would like to see more value placed on the quality of the jump itself.
- The GOE guidelines, to me, put too much emphasis on proper entry to jumps and not enough on good exits (i.e., smooth, sure, on an outside edge, flow/control out of landing). Similarly, the GOE guidelines for spins put too much emphasis on speed and rotations, not enough on centering.
- With jumps, I tried to pay attention to wrong-edge takeoffs and underrotations, because those errors are supposed to incur negative GOE. Sometimes I replayed a questionable jump several times to try to see if there was an error. In most cases, I could not see an error clearly enough to actually mark the skater down for it. I really believe the only certain way to confirm a wrong-edge takeoff is with slow motion. Underrotations are a bit easier to spot in real time, but in most cases, I’d still want the slow-motion confirmation before giving negative GOE. For fans who assert with full confidence that they can tell a wrong-edge takeoff on a flip or Lutz without slow motion: Not buying it. My benefit of the doubt always goes to the skater. For this competition, I mostly gave negative GOE for underrotations or wrong edges only when there was quite clear slow-motion video evidence.
- We all know this, but judges’ hands really are half-tied in determining the proper value of jumps. If the technical panel calls the jump incorrectly, the judges can give as much negative or positive GOE as allowable, but the jump will still be incorrectly valued numerically.
- I found myself giving out many more +3s in GOE than I expected. Before starting the project, I thought that +3 basically connoted perfection—so it should be rare. However, when you sit down and actually look at the guidelines for getting +3, you see that the +3 is quite attainable, according to the bullets. The +3 does not indicate perfection, but just a really well-done element.
- After using the five PCS components (and trying to be careful about it), all I can say is: The across-the-board consistent PCS scores we often see in real life (as in, all PCS scores 9.0 to 9.5 for one skater) are just flat-out wrong, in my opinion. Some of the PCS components really, truly have very little to do with each other—both theoretically and according to the actual guidelines. Skating Skills is almost completely separate from Interpretation. We should see much greater variance in PCS components. Fans have been saying this for years; actual use of the components just confirms it.
- I understand now why judges watch practices. I never used to understand that; it seemed like it could only lead to unfairly prejudging the skaters. However, having done some actual judging now, I realize that watching the program more than once is almost essential to evaluate it fully. It’s especially useful in assessing the Choreography or Transitions marks. Watching the program more than once allows you to better recognize the ice-coverage patterns and general number/quality of transitions in the program. It gives some idea of where to start with those scores. It’s really hard to seriously assess the choreographic structure of a program while also looking at jump takeoffs and landings.
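The “working blind” point above can be made concrete with a toy sketch. The numbers here are entirely made up, and this ignores base values, PCS factoring, and panel averaging; the point is only the mechanism: a judge never writes down a ranking, and placement simply falls out of summing many individual marks.

```python
# Hypothetical marks for one short program: 7 element GOEs plus 5 PCS
# components per skater. (Made-up numbers; real IJS also applies base
# values, PCS factors, and combines marks across a whole panel.)
scores = {
    "Skater A": [3, 2, 2, 1, 3, 2, 2, 9.0, 8.75, 9.0, 9.25, 9.0],
    "Skater B": [2, 3, 1, 2, 2, 3, 3, 9.25, 8.5, 8.75, 9.0, 9.25],
}

# The judge never issues an ordinal; the ranking emerges from the sums.
totals = {name: sum(marks) for name, marks in scores.items()}
ranking = sorted(totals, key=totals.get, reverse=True)
print(totals, ranking)
```

Notice that with 12 marks per skater (let alone 29 across two programs), it would take serious mental arithmetic mid-event to know which skater you have actually placed first.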
This was an amazing project, and I’m so glad that Naked Ice made it happen!! Thanks!! I hope that similar projects will be launched in the future, so that more fans get the opportunity to actually use IJS and experience what it’s like to judge within this system.
I may never completely stop criticizing the judging of competitions; I think it’s just the nature of being an observer and fan of this sport. But now that I’ve walked in the judges’ shoes by actually using IJS to “judge” a competition, I understand more about the challenges the judges face and the difficulty of their job. So, if I criticize, hopefully at least it will now come from a place of greater understanding! 🙂