119072282JH003_UFC_58_St_Pi

Quantifying the Career of Georges St-Pierre

I recently decided to improve my graph-theory-based MMA ranking system, which models fighters as vertices and wins as edges on a directed graph. Previously the system generated a single large list of fighter rankings, evaluating all fights on record. The problem with that approach is that it does not show how rankings have changed over time – for instance, how has the ranking of Georges St-Pierre changed over the last decade? His recent retirement has made me curious.

I decided that instead of running this analysis for a single day in the present and looking back at all fights in the past, why not run the analysis for an arbitrary day, limiting the fights included to only those occurring within a certain period of time before that day? This analysis could then be run and re-run, iterating day by day, to show how rankings changed over time for a given lookback period.
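In Python, the idea looks something like the minimal sketch below – `rank_fighters` is a stand-in for the actual graph-based ranking routine, which isn't shown here.

```python
from datetime import timedelta

def ranking_history(fights, rank_fighters, start, end, lookback_years):
    """Re-run the ranking for every day from `start` to `end`, considering
    only fights that occurred within `lookback_years` before that day.

    `fights` is a list of dicts with a "date" key; `rank_fighters` is the
    graph-based ranking routine (a stand-in here) returning {fighter: score}.
    """
    window = timedelta(days=int(365.25 * lookback_years))
    history = {}
    day = start
    while day <= end:
        in_window = [f for f in fights if day - window <= f["date"] <= day]
        history[day] = rank_fighters(in_window)
        day += timedelta(days=1)
    return history
```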

I ran this analysis from January 1, 2004 (roughly one month before Georges St-Pierre’s first fight in the Octagon) to the most recent UFC event at the time of this post (UFC on FOX 12, July 26, 2014), for lookback periods ranging from two to sixteen years. To provide context I also ran the analysis for every other UFC welterweight champion during this timeframe (Matt Hughes, Matt Serra, and Johny Hendricks). A visualization and brief summary of this data can be found below. The colors of the graphs range from green (shortest lookback period of two years) to yellow (longest lookback period of sixteen years).

varlb-hendricks
Johny Hendricks: Arriving in the UFC in mid-2009, Hendricks demonstrated strong performance in the shorter lookback periods but simply hasn’t had enough time to climb up the rankings in the longer term.
varlb-hughes
Matt Hughes: A legend of the old guard. His shorter-lookback rankings started to slip in 2007, but he still ranks highly over the long term – evidence of years of dominance.
varlb-serra
Matt Serra: Saw big gains in 2007 with his defeat of Georges St-Pierre but couldn’t hold on to them. Note the dropoff in scores into negative territory once the lookback period no longer includes the St-Pierre win (i.e., early 2009 for the two-year lookback, early 2011 for the four-year lookback, and so on).
varlb-gsp
Georges St-Pierre: A steady gainer who got a big pop in the mid- to long-term rankings in late 2006 after defeating BJ Penn. The shorter lookback periods show weaker performance post-2010 due to injury and a slower fight schedule, but his long-term rankings remain strong.

So how are we supposed to compare performance between fighters? Thirty-two time series across eight separate lookback periods are a mess to visualize effectively, so we need to find the best lookback period or periods to compare. Shorter lookback periods respond quickly to recent wins and losses, but are fickle and ignore long-term consistency. Longer lookback periods are more stable and reflect consistent performance, but take years to register what may be rapid changes in a fighter’s form. After evaluating many different approaches, I decided to average the results from the 2-, 4-, 8-, and 16-year lookback periods. Each of these periods is sufficiently different from the others to provide new information, and the combination strikes a good balance between short- and long-term viewpoints.
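The combination itself is nothing fancy – just an unweighted mean of the four normalized series. A sketch with dummy values:

```python
import numpy as np

# One normalized daily score series per lookback period, for a single fighter.
# The values here are dummies; the real series come from the day-by-day
# re-ranking described earlier.
lookbacks = {
    2:  np.array([0.10, 0.15, 0.30, 0.25]),
    4:  np.array([0.12, 0.14, 0.22, 0.28]),
    8:  np.array([0.08, 0.10, 0.18, 0.24]),
    16: np.array([0.05, 0.07, 0.12, 0.20]),
}

combined = np.mean(list(lookbacks.values()), axis=0)  # unweighted average
print(combined)
```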

gsp-career-mmagraph

The resulting chart makes qualitative sense: it shows the early dominance and slow fade of Matt Hughes, the rise of Georges St-Pierre, Matt Serra’s spike when he captured the welterweight championship, and the new guard in Johny Hendricks quickly approaching. We can also look at the day-to-day changes in Georges St-Pierre’s ranking to identify the events that influenced it most heavily.


Biggest One-Day Gains
  1. March 4, 2006 (+0.232): Georges St-Pierre wins by split decision against BJ Penn in a close battle at UFC 58.
  2. November 18, 2006 (+0.204): Georges St-Pierre defeats Matt Hughes by TKO in the second round at UFC 65 to become the new UFC Welterweight Champion.
  3. August 9, 2008 (+0.203): Georges St-Pierre defeats Jon Fitch at UFC 87.

Biggest One-Day Losses
  1. April 7, 2007 (-0.119): Matt Serra defeats Georges St-Pierre in an infamous upset at UFC 69.
  2. September 23, 2006 (-0.096): Matt Hughes defeats BJ Penn at UFC 63. Note that this occurred shortly after GSP defeated BJ Penn for a massive gain in the rankings. BJ Penn’s drop in the rankings as a result of this loss also caused GSP’s ranking to slip as a second-order effect.
  3. June 7, 2008 (-0.067): Thiago Alves defeats Matt Hughes by TKO at UFC 85. A large portion of GSP’s ranking relied on his win over Matt Hughes; as Hughes’ ranking slipped due to losses, GSP’s did as well through the same second-order effect.

Most of Georges St-Pierre’s big gains occurred early in his career, which makes sense – it’s hard to move up when you’re already the champion. His worst days were similarly predictable: the loss to Matt Serra, and the days when the opponents behind his most prominent wins (BJ Penn, Matt Hughes) suffered losses of their own.

And when was Georges St-Pierre’s peak? By the numbers, it was the week shortly after his defeat of Dan Hardy at UFC 111, when he reached a career high of ~0.727 – roughly when the controversy over his inability to finish fights began. A year and a half later he would injure his ACL, which reduced the frequency of his fights to such a degree that his ranking eventually slid below that of a young upstart named Johny Hendricks. The two finally fought on November 16, 2013 – and Georges St-Pierre won, for a one-day gain of 0.120, his seventh best on record.

After all, if you’re going to retire, you might as well go out on top.

Summer Cleaning

It’s been a while since I’ve posted here – I’m currently in the process of cleaning up the site and updating the look and feel. I’ve decided not to delete any old content, despite how awkward some of it may appear now. It’s a bit crazy to realize that this site has been online, in one form or another, for over nine years.

Please let me know if the changes break any functionality. In the meantime, here are some Lorenz attractor renders I completed when I got my new computer. Plenty more points than the old renders… gotta use that horsepower for something.




The Theis Equation and Flow

Mathematics is remarkably effective in describing the physical world in part due to isomorphisms, relationships between concepts that reveal a similar underlying structure. In 1935 Charles Vernon Theis was working on groundwater flow, a subject with little mathematical treatment at the time. He thought that perhaps a well tapping a confined aquifer could be described using the same mathematics as a thin wire drawing heat from a large plate, as that work was better established. With a little help from C. I. Lubin, and by mapping the parameters describing underground water flow onto those describing heat flow in solid materials, he developed the Theis equation, which is used to this day to model the response of a confined aquifer to pumping over time.
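For reference, in its standard textbook form (notation as usually presented in hydrogeology, not taken from Theis’s original paper), the drawdown s at radial distance r and time t from a well pumping at constant rate Q in an aquifer with transmissivity T and storativity S is

```latex
s(r, t) = \frac{Q}{4 \pi T} \, W(u),
\qquad u = \frac{r^{2} S}{4 T t},
\qquad W(u) = \int_{u}^{\infty} \frac{e^{-x}}{x} \, dx .
```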

I developed a small program in Processing that visualizes the potentiometric surface of a confined aquifer subject to pumping. This particular example uses aquifer and pumping parameters from a Geo-Slope whitepaper.

The source code may be downloaded here. All values – aquifer, pumping, visualization, and numerical parameters – may be varied to apply to a wide variety of situations. The exponential integral (or “well function”) is calculated using a numerical approximation accurate to at least one part in 10,000,000.
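One common way to approximate the well function is its series expansion, W(u) = -γ - ln u + Σ (-1)^(n+1) uⁿ/(n·n!). The sketch below uses that series; it is one standard approximation, not necessarily the exact one used in the Processing program.

```python
import math

EULER_GAMMA = 0.5772156649015329

def well_function(u, tol=1e-8, max_terms=100):
    """Theis well function W(u) via its series expansion:
    W(u) = -gamma - ln(u) + sum_{n>=1} (-1)^(n+1) * u^n / (n * n!).
    A sketch of one standard approximation (converges quickly for small u)."""
    total = -EULER_GAMMA - math.log(u)
    for n in range(1, max_terms + 1):
        term = (-1.0) ** (n + 1) * u ** n / (n * math.factorial(n))
        total += term
        if abs(term) < tol:
            break
    return total

def theis_drawdown(Q, T, S, r, t):
    """Drawdown s = Q / (4*pi*T) * W(u), with u = r^2 * S / (4*T*t)."""
    u = (r * r * S) / (4.0 * T * t)
    return Q / (4.0 * math.pi * T) * well_function(u)
```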

lorenz-attractor-1600-04

Convergence in the Lorenz Attractor

Most visualizations of the Lorenz attractor are of a long history of a single point after convergence to the attractor has occurred. I was interested in what the surrounding space looked like, so I randomly selected 20,000 starting points from a three dimensional Gaussian distribution with a standard deviation of 100. Each point was iterated, and a short history displayed as a trail.

Interestingly enough, the points do not simply fall in from arbitrary directions as they would in a simple gravity field; instead they display structure, swirling along a clear path up the z axis.

mma-directed-graph

Graph Theory, Algorithmic Consensus, and MMA Ranking

I’ve been working on an objective ranking system lately that could be applied to groups with large numbers of individual competitors, like the sport of mixed martial arts (MMA). The biggest issue compared to typical ranking systems is that there are so many participants that they cannot all compete against each other in a round-robin tournament or similar within a reasonable time frame. In order to calculate a global ranking, all of the players must be compared against each other through a sort of “six degrees of separation” style comparison, which is vulnerable to bias and calculation error.

This problem has already been solved in the chess world with the Elo rating system, a statistical approach that requires frequent competition in order to generate statistically significant results. Unfortunately, competitors in sports like mixed martial arts or boxing do not compete nearly as frequently as chess players (for obvious reasons), and this approach drowns in a sea of statistical noise. Combat sports rankings are typically done by hand by a knowledgeable observer, through the consensus of many observers, or by models with a large number of tunable parameters. It is very interesting that humans appear to be able to easily determine who should be ranked highly, and that many algorithmic approaches largely match these evaluations yet still make some seemingly obvious mistakes. My goal was to find an approach that produced rankings that seemed sensible to a human observer, with a minimum of tunable parameters (preferably none).

Data Structure

The initial step is to structure our data in a sensible way. We have a large number of participants, connected by individual competition which can either result in a win or a loss. One way of structuring this data would be in a directed graph, where competitors are represented by nodes and matches as edges with direction defined by who wins or loses. We seem to be focused on losses (or win/loss ratio) as the biggest factor – a competitor with 40 wins and zero losses is typically regarded as better than a competitor with 60 wins and 20 losses. Let’s set the direction of the edge from the losing competitor to the winning competitor. A “good” competitor’s node will therefore have many incoming edges and few outgoing edges, and tend to be at the center of force-directed graph layouts.
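As a minimal sketch in Python with NetworkX (the handful of fights listed are purely illustrative):

```python
import networkx as nx

# A few real results, for illustration only, as (winner, loser) pairs.
fights = [
    ("Georges St-Pierre", "Matt Hughes"),
    ("Matt Serra", "Georges St-Pierre"),
    ("Matt Hughes", "Matt Serra"),
]

G = nx.DiGraph()
for winner, loser in fights:
    G.add_edge(loser, winner)  # edge points from the loser to the winner

# A "good" competitor accumulates incoming edges and has few outgoing ones.
print(G.in_degree("Georges St-Pierre"), G.out_degree("Georges St-Pierre"))
```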

Evaluation Algorithms

There are many possible evaluation algorithms which will produce a ranking from this data structure. After many trials, two appeared to stand above the rest.

  1. The first is recommended in the journal article Ranking from unbalanced paired-comparison data by H.A. David published in Volume 74, Issue 2 of Biometrika in 1987.
  2. David also discussed the Kendall-Wei algorithm in his paper, of which Google’s PageRank algorithm is a special case. PageRank ranks webpages, represented as a directed graph, based on the concept of network flow, and may also be applied to other directed graphs, including ours. The PageRank algorithm contains one tunable parameter, a damping factor, which is currently left at the default of 0.85.

The two algorithms turned out to emphasize different aspects of MMA ranking. David’s unbalanced paired-comparison approach took a grittier, statistics-based view, highlighting fighters such as Anderson Silva, Rashad Evans, and Jon Fitch. Google’s PageRank took a more social approach, emphasizing fighters with a wide range of quality opponents, like Georges St-Pierre, Matt Hughes, and Forrest Griffin. It was striking how one algorithm appeared to capture the “hardcore MMA fan” perspective, while the other seemed to be pulled straight from the UFC head office.

It was decided that both would be calculated, their scores normalized, and the results used in combination to generate a consensus ranking, similar to the consensus rankings generated from human experts. This was inspired by IBM’s Watson, which uses a consensus of multiple algorithms to evaluate answers to trivia questions. A rough sketch of this combination follows the list below. Two possible improvements are hypothesized but not yet tested:

  1. Perhaps incorporating additional independent ranking algorithms into the consensus would improve accuracy. The big issue is finding algorithms that are truly “independent” – ones that do not simply restate the work of the others – and, among those, ones whose ranking behavior is useful for our application.
  2. Unlike Watson, confidence levels are not used. These would be a useful addition in situations like extreme upsets. A newer beta version of this ranking system checks whether highly ranked fighters coincide with high centrality metrics in an attempt to implement this, but it is not complete at the time of this post.
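As promised above, a rough sketch of the consensus step. David’s unbalanced paired-comparison scores are not available in NetworkX, so they appear here only as a placeholder argument:

```python
import networkx as nx

def normalize(scores):
    """Scale a {fighter: score} dict to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def consensus_ranking(G, david_scores):
    """Combine PageRank with the unbalanced paired-comparison scores.

    `G` is the loser-to-winner DiGraph; `david_scores` is a placeholder for
    the second algorithm's {fighter: score} output.
    """
    pagerank = nx.pagerank(G, alpha=0.85)  # damping factor left at the default
    pr, dv = normalize(pagerank), normalize(david_scores)
    combined = {f: 0.5 * (pr[f] + dv[f]) for f in G}
    return sorted(combined, key=combined.get, reverse=True)
```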

Results

The ranking system was run on every UFC event from UFC 1 (November 12, 1993) to Fight for the Troops 2 (January 22, 2011). Both algorithms are shown ranked alone for comparison, and their scores were equally weighted to produce the final results.

Lightweight (155lbs)
Overall Rank PageRank Unbalanced Pair
1. Gray Maynard 1. B.J. Penn 1. Gray Maynard
2. B.J. Penn 2. Gray Maynard 2. George Sotiropoulos
3. Frankie Edgar 3. Frankie Edgar 3. Frankie Edgar
4. George Sotiropoulos 4. Kenny Florian 4. Jim Miller
5. Jim Miller 5. Joe Lauzon 5. Nik Lentz

First up are the lightweights – and the results aren’t too shabby. No one seems to want to admit it due to his sometimes snooze-inducing style, but Gray Maynard is a beast who would likely cause B.J. Penn significant issues if they ever fought. Frankie Edgar deserves to be right up there, but not at number one, and the chronically underrated George Sotiropoulos and Jim Miller round out the pack.

Welterweight (170lbs)
Overall Rank PageRank Unbalanced Pair
1. Georges St-Pierre 1. Georges St-Pierre 1. Matt Hughes
2. Matt Hughes 2. Matt Hughes 2. Josh Koscheck
3. Josh Koscheck 3. Matt Serra 3. Georges St-Pierre
4. Martin Kampmann 4. Dennis Hallman 4. Martin Kampmann
5. Dennis Hallman 5. Martin Kampmann 5. Rick Story

Georges St-Pierre is the obvious frontrunner at 170. Matt Hughes at number two is a bit more debatable, but a long title reign and consistently high-quality opposition provide a reasonable rationale. Josh Koscheck, perpetually the bridesmaid and never the bride, sits at third, while Martin Kampmann and Dennis Hallman round out a somewhat thin division.

Middleweight (185lbs)
Overall Rank PageRank Unbalanced Pair
1. Anderson Silva 1. Anderson Silva 1. Anderson Silva
2. Jon Fitch 2. Jon Fitch 2. Jon Fitch
3. Yushin Okami 3. Vitor Belfort 3. Yushin Okami
4. Michael Bisping 4. Nate Marquardt 4. Michael Bisping
5. Nate Marquardt 5. Yushin Okami 5. Demian Maia

Anderson Silva provides another easy choice for number one at 185lbs. Both Jon Fitch and Yushin Okami deserve their spots with a consistent if slightly dull record. Michael Bisping has slowly been grinding his way up the charts, and Nate Marquardt rounds out the top five.

Light Heavyweight (205lbs)
Overall Rank PageRank Unbalanced Pair
1. Rashad Evans 1. Forrest Griffin 1. Rashad Evans
2. Lyoto Machida 2. Lyoto Machida 2. Jon Jones
3. Forrest Griffin 3. Rashad Evans 3. Ryan Bader
4. Quinton Jackson 4. Quinton Jackson 4. Lyoto Machida
5. Mauricio Rua 5. Mauricio Rua 5. Thiago Silva

Rashad Evans appears to have made a sensible call waiting for his title shot at UFC 128. The hypercompetitive light heavyweight division is always a tough one to call. A split in the consensus between the two algorithms produces a top five that seems to emphasize number of fights in the Octagon, with champion Mauricio “Shogun” Rua a surprising fifth. Too early to call Evans over Rua? Only time will tell.

Heavyweight (265lbs)
Overall Rank PageRank Unbalanced Pair
1. Frank Mir 1. Frank Mir 1. Frank Mir
2. Cain Velasquez 2. Brock Lesnar 2. Junior Dos Santos
3. Junior Dos Santos 3. Cain Velasquez 3. Cain Velasquez
4. Brock Lesnar 4. Antonio Rodrigo Nogueira 4. Cheick Kongo
5. Shane Carwin 5. Shane Carwin 5. Brendan Schaub

I initially disagreed with Frank Mir as number one here – Cain Velasquez seems to be the obvious choice. But the ranking process seems to trust number of fights over new hype, and the rest of the top five is bang on what I would choose. You can’t win them all – or perhaps I’m just being unfair to Frank Mir.

Conclusions

The approach produced excellent rankings from UFC-only data, largely coinciding with established and more complete authorities such as FightMatrix. Two ranking algorithms were run over a directed graph of fight results; their scores were normalized and summed, and the totals sorted to produce the final rankings. One tunable parameter (the PageRank damping factor) exists in the model, but it was left at the default value of 0.85. Further work will focus on additional ranking algorithms that may be incorporated into the consensus, parametric analysis of the PageRank damping factor, and determining confidence scores.

Acknowledgements

The NetworkX package for Python was critical for this project. I came across Mixed Martial Arts vs Graph Theory shortly after starting work on this. Many thanks to Jack Rusher for highlighting interesting algorithms and providing a respite from entering most of the data by hand.

mmagraph-ufc003

Directed Graphs and MMA

There is underlying mathematical structure everywhere – it’s just a matter of finding the best way to unfold it from the data. I’ve been working on a project to objectively rank mixed martial arts fighters. It’s nowhere near done yet, but I’ve collected a fair bit of data. Here’s what nearly 2,000 MMA fights in the UFC and Pride FC over the last 15 years look like when expressed as a directed graph with a force-directed layout.

You can see the fights clumped into the UFC (top) and Pride (bottom) organizations, with a subset of fighters acting as ambassadors between the two.

butterflyblue

Lorenz and the Butterfly Effect

In 1962, Edward Lorenz was studying a simplified model of convection flow in the atmosphere. He imagined a closed chamber of air with a temperature difference between the bottom and the top, modeled using the Navier-Stokes equations of fluid flow. If the difference in temperature was slight, the heated air would slowly rise to the top in a predictable manner. But if the difference in temperature was higher, other situations would arise including convection and turbulence. Turbulence is notoriously difficult to model, so Lorenz focused on the formation of convection, somewhat more predictable circular flows.

Focusing only on convection allowed Lorenz to simplify the model sufficiently that he could run calculations on the primitive (by today’s standards) computer available to him at MIT. And something wonderful happened – instead of settling down into a predictable pattern, Lorenz’s model oscillated within a certain range but never seemed to do exactly the same thing twice. It turned out to be an excellent approximation of the real problem, exhibiting the same kind of semi-regular behavior we see in actual atmospheric dynamics. It made quite a stir at MIT at the time, with other faculty betting on just how this strange mathematical object would behave from iteration to iteration.
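The resulting system of three ordinary differential equations – now known as the Lorenz system – is, in its standard form (Lorenz’s classic parameter choices were σ = 10, ρ = 28, β = 8/3):

```latex
\begin{aligned}
\frac{dx}{dt} &= \sigma (y - x), \\
\frac{dy}{dt} &= x (\rho - z) - y, \\
\frac{dz}{dt} &= x y - \beta z .
\end{aligned}
```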

But Lorenz was about to discover something even more astounding about this model. While running a particularly interesting simulation, he was forced to stop his calculations prematurely as the computer he was using was shared by many people. He was prepared for this, and wrote down three values describing the current state of the system at a certain time. Later when the computer was available again, he entered the three values and let the model run once more. But something very strange happened. It initially appeared to follow the same path as before, but soon it started to diverge from the original slightly, and quickly these small differences became huge variances. What had happened to cause this?

Lorenz soon found that his error was a seemingly trivial one. He had rounded the numbers he wrote down to three decimal places, while the computer stored values to six decimal places. Simply rounding to the nearest thousandth had created a tiny error that propagated throughout the system, creating drastic differences in the final result. This is the origin of the term “butterfly effect”, where infinitesimally small changes in initial conditions can produce enormous differences in the outcome. Lorenz had discovered the first chaotic dynamical system.

We can see this effect in action here, where a red and a blue path have initial values truncated in a similar manner. Instead of graphing X, Y, and Z values separately, we’ll graph them in 3D space.

Initial Values (Red)   Initial Values (Blue)  
X 5.766 5.766450
Y -2.456 -2.456460
Z 32.817 32.817251

We can see that at first the two paths closely track each other. The error is magnified as time marches on, and the paths slowly diverge until they are completely different – all as a result of truncating to three decimal places. This is in contrast to the sensitivity to error we are typically used to, where small errors are not magnified uncontrollably as a calculation progresses. Imagine if using a measuring tape that was off by a tenth of a percent caused your new deck to collapse rather than stand strong…
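Here is a minimal sketch of the experiment, using the classic parameters σ = 10, ρ = 28, β = 8/3 and a simple Euler integration step (the actual renders were produced in Processing, not with this code):

```python
# Integrate the red and blue paths from the table above and watch them diverge.
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # classic Lorenz parameters (assumed)
DT = 0.005                                 # Euler time step

def step(state):
    x, y, z = state
    dx = SIGMA * (y - x)
    dy = x * (RHO - z) - y
    dz = x * y - BETA * z
    return (x + dx * DT, y + dy * DT, z + dz * DT)

red  = (5.766, -2.456, 32.817)             # truncated initial values
blue = (5.766450, -2.456460, 32.817251)    # full-precision initial values

for i in range(1, 10001):
    red, blue = step(red), step(blue)
    if i % 2000 == 0:
        sep = sum((a - b) ** 2 for a, b in zip(red, blue)) ** 0.5
        print(f"step {i:5d}: separation {sep:8.4f}")
```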

There is also a hint of a more complex underlying structure, a folded three dimensional surface that the paths move along. If a longer path is drawn, the semi-regular line graphs of before are revealed to be the projections of a structure of undeniable beauty.

This is a rendering of the long journey of a single point in Lorenz’s creation, a path consisting of three million iterations developed in Processing that shows the region of allowable values in the model more clearly. Older values are darker, and the visualization is animated to show the strange dance of the current values of the system as they sweep from arm to arm of this dual-eyed mathematical monster, order generating chaos.

river-crossing

River Crossing Problems and Discrete State Spaces

A brain teaser goes as follows: a farmer is returning from market, where he has bought a wolf, a goat, and a cabbage. On his way home he must cross a river by boat, but the boat is only large enough to carry the farmer and one additional item. The farmer realizes he must shuttle the items across one by one. He must be careful however, as the wolf cannot be left alone on the same side of the river as the goat (since the goat will be eaten), and the goat cannot be left alone on the same side of the river as the cabbage (since the goat will eat the cabbage).

How can the farmer arrange his trips to move the wolf, the goat, and the cabbage across the river while ensuring that no one eats the other while he’s away? The problem usually yields to an analysis by trial and error, where one of the possible solutions is found by iterating and checking the possible decisions of the farmer.

But is there a more rigorous approach that will allow us to easily find all possible solutions without having to resort to guessing and checking?

A Discrete State Space

Let’s create a three-dimensional vector that represents the state of the system. The components of this vector will be 0 or 1 depending on which side of the river the wolf, the goat, and the cabbage are on. Let’s list off all possible combinations without worrying about whether something is going to get eaten – we’ll get to that later.

Vector Location
(0,0,0) Our initial state, with all three on the starting side of the river.
(1,0,0) The wolf has crossed the river, but not the goat or cabbage.
(0,1,0) The goat has crossed the river, but not the wolf or cabbage.
(0,0,1) The cabbage has crossed the river, but not the wolf or goat.
(1,1,0) The wolf and the goat have crossed the river, but not the cabbage.
(0,1,1) The goat and the cabbage have crossed the river, but not the wolf.
(1,0,1) The wolf and cabbage have crossed the river, but not the goat.
(1,1,1) Our desired state, where all three have crossed the river.

Now a giant list like this isn’t much help. We’ve listed off all possible states, but we can structure this in a more sensible manner which will make solving the problem much easier. Let’s allow the vectors to represent the corners of a cube, with the edges of the cube representing trips across the river. Movement along the x axis represents the wolf moving, movement along the y axis represents the goat moving, and movement along the z axis represents the cabbage moving.

Alright, that seems a bit better. What we need to do now is find an allowable path along the edges of the cube from (0,0,0) to (1,1,1), and this will represent the solution to our puzzle. First we need to remove the edges where something gets eaten.

Path Removed Rationale
(0,0,0) to (1,0,0) This moves the wolf across, leaving the goat with the cabbage.
(0,0,0) to (0,0,1) This moves the cabbage across, leaving the wolf with the goat.
(0,1,1) to (1,1,1) This moves the wolf across, leaving the goat with the cabbage.
(1,1,0) to (1,1,1) This moves the cabbage across, leaving the wolf with the goat.

Our allowable state space now looks like this.

The problem has simplified itself drastically. We want to go from (0,0,0) to (1,1,1) by travelling along the black (allowable) edges of this cube. We can see that if we ignore solutions where the farmer loops pointlessly, there are two allowable solutions to this puzzle.

Solution 1:
  1. Move goat to other side.
  2. Move cabbage to other side.
  3. Move goat back.
  4. Move wolf to other side.
  5. Move goat to other side.

Solution 2:
  1. Move goat to other side.
  2. Move wolf to other side.
  3. Move goat back.
  4. Move cabbage to other side.
  5. Move goat to other side.
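The search over the trimmed cube can also be done mechanically. Here is a minimal NetworkX sketch that builds the cube, removes the four forbidden edges, and enumerates every loop-free route from (0,0,0) to (1,1,1) – it finds exactly the two solutions above (as in the cube, the farmer’s position isn’t modeled):

```python
from itertools import product
import networkx as nx

# Corners of the cube: (wolf, goat, cabbage) positions, 0 = start side, 1 = far side.
states = list(product((0, 1), repeat=3))

# Cube edges connect states that differ in exactly one coordinate (one trip).
G = nx.Graph()
for a in states:
    for b in states:
        if sum(x != y for x, y in zip(a, b)) == 1:
            G.add_edge(a, b)

# Remove the edges where something gets eaten, per the table above.
forbidden = [
    ((0, 0, 0), (1, 0, 0)),  # wolf crosses, goat left with cabbage
    ((0, 0, 0), (0, 0, 1)),  # cabbage crosses, wolf left with goat
    ((0, 1, 1), (1, 1, 1)),  # wolf crosses, goat left with cabbage
    ((1, 1, 0), (1, 1, 1)),  # cabbage crosses, wolf left with goat
]
G.remove_edges_from(forbidden)

# Every simple (loop-free) path from the initial state to the goal state.
for path in nx.all_simple_paths(G, (0, 0, 0), (1, 1, 1)):
    print(path)
```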

This is a beautiful example of the simple structures that underlie many of these classic riddles.