mma-dirgraph

UFC 2014 Year in Review

I’ve been working on a program that allows measurement of the performance of MMA fighters over time, without tunable parameters or the need for human judgement. I used it to calculate the current top five fighters in each weight class in the UFC, then looked at how their scores varied over the last year. Tweaking the code took a little longer than expected, so this is actually 2014 and a little bit of 2015 in review as the report ranges from January 1, 2014 to January 19, 2015, ending immediately after UFC Fight Night: McGregor vs. Siver.

The program is based on two different algorithms across five different histories of 1, 2, 4, 8, and 16 years. The results are then evenly weighted to produce a single confidence measure. The closer a score is to 100%, the more confident the model is that a fighter has won frequently against other winners. A score of 0% means the model has no information – a debuting fighter starts at this. The closer a score is to -100%, the more confident the model is that a fighter has lost frequently to other losers.

It’s best to think of these scores as a report card of actual performance, rather than a theoretical ranking of potential performance. A solid contender fighting frequently and consistently winning might outrank a champion who only fights once a year, and you can check out the lightweights for a shining example of this. It’s not about some abstract measure of skill – it’s about who’s winning the most fights against the best opponents right now.

Flyweight

FlyweightPlot-20140119

Demetrious “Mighty Mouse” Johnson ranks head and shoulders above his division with two successful title defenses in June (against Ali Bagautinov at UFC 174) and September (against Chris Cariaso at UFC 178). Joseph Benavidez broke out of the pack to take second with a win over Dustin Ortiz in late November at UFC Fight Night 57, but still has a long way to go. John Dodson defeated John Moraga in June at UFC Fight Night 42, but by the end of the year both were closely matched. Jussier Formiga’s win against Zach Makovsky in August at UFC Fight Night 47 brought him into the top five.

Bantamweight

BantamweightPlot-20140119

This closely matched division has Raphael Assuncao, Renan Barao, Urijah Faber, and TJ Dillashaw all vying for the top spot. TJ Dillashaw catapulted himself into the elite with a win over Renan Barao in May at UFC 173, but needs a win over a strong opponent to really cement his status as champion. Raphael Assuncao had a strong year with wins against Pedro Munhoz in February at UFC 170 and Bryan Caraway in October at UFC Fight Night 54. Renan Barao stayed in contention after pulling out a must-win fight against Mitch Gagnon in December at UFC Fight Night 58. Urijah Faber is perpetually a contender, but a recent title shot loss to Barao in February at UFC 169 hurts his near-term chances. Dominick Cruz got a big bump in September at UFC 178 with a win against Takeya Mizugaki, but recurring knee injury issues will likely keep him out of competition (and the top of the division) in 2015.

Women’s Bantamweight

WomensBantamweightPlot-20140119

Alexis Davis and Ronda Rousey were closely matched until Rousey’s victory in July at UFC 175 cemented her at the top of the division. Cat Zingano got a big bump in late September at UFC 178 with a win against Amanda Nunes, and will challenge Rousey for the title in February 2015 at UFC 184. Bethe Correia and Jessica Andrade round out the top five.

Featherweight

FeatherweightPlot-20140119

Jose Aldo can deservedly call himself the king of this division, but Conor McGregor is climbing fast. McGregor’s ranking was aided by an aggressively paced fight schedule with victories over Diego Brandao, Dustin Poirier, and most recently Dennis Siver all in the last six months. Aldo and McGregor are being positioned for a title fight at UFC 187 in May 2015. Ricardo Lamas defeated Dennis Bermudez via submission in November at UFC 180 and these two fighters fill out spots three and four. Cub Swanson had a big win over Jeremy Stephens in June at UFC Fight Night 44, but is struggling to maintain momentum after a loss to Frankie Edgar later in the year.

Lightweight

LightweightPlot-20140119

The interesting rankings in this division reflect activity (in the case of Donald Cerrone) and lack of it (in the case of champ Anthony Pettis). Cerrone has been on a tear, and his fast-paced schedule against strong opponents has sent his ranking skyrocketing. Pettis has been the opposite, with only a single fight in 2014, and his ranking has slowly trended downward as a result. The biggest shakeup in the rankings came in August at UFC Fight Night 49 when Rafael dos Anjos knocked out Benson Henderson, demonstrating he belongs near the top of the division. Benson Henderson has been on a slow and steady slide since then, erasing all his gains and then some from a major win against Rustam Khabilov at UFC Fight Night 42 in June. Michael Johnson rounds out the top five.

Welterweight

WelterweightPlot-20140119

This division is a bit of a mess, despite it being quite clear who the top three are – Johny Hendricks, Robbie Lawler, and Rory MacDonald. Lawler became champion in December at UFC 181, but ranks just below Hendricks. Rory MacDonald defeated Tarec Saffiedine in October at UFC Fight Night 54 to cement his contender status, but a talked about title shot never materialized. Rick Story defeated Gunnar Nelson in October at UFC Fight Night 53 to rise to number four. Matt Brown has been consistent across 2014 and rounds out fifth.

Middleweight

MiddleweightPlot-20140119

Chris Weidman defended his title with a win over Lyoto Machida in July at UFC 175, but his injuries and fight delays have led to a somewhat sleepy title picture in 2014. Ronaldo “Jacare” Souza was a strong contender after defeating Francis Carmont in February at UFC Fight Night 36, but his ranking has steadily slipped since then and he is now sitting in fourth. A planned bout with second place Yoel Romero has been delayed, and the winner of this will likely fight for the title. Thales Leites’ biggest win of the year was against Francis Carmont in August at UFC Fight Night 49, pushing him to third, while Tim Boetsch rounds out the top five.

Light Heavyweight

LightHeavyweightPlot-20140119

No debate here, Jon Jones is the clear champion at 205 after a decision win over Daniel Cormier at the start of the new year. The real question is who’s up next for the title. Ryan Bader’s victory over Ovince Saint Preux in August at UFC Fight Night 47 pushed him into second place, while Phil Davis defeated Glover Teixeira in October at UFC 179 to take third. Anthony Johnson had a strong return to the UFC with a victory over Phil Davis in April at UFC 172, and only his lack of UFC activity in 2012-2013 prevents him from being ranked higher than fourth. Ovince Saint Preux bounced back after his loss to Ryan Bader with a win over Mauricio “Shogun” Rua in November at UFC Fight Night 56 to retain his status in the top five.

Heavyweight

HeavyweightPlot-20140119

Probably the most egregious example of an inactive champion, Cain Velasquez hasn’t fought since October 2013 when he defeated Junior dos Santos for the second time at UFC 166, and his ranking has been on a steady slide for all of 2014. Junior dos Santos picked up a win against Stipe Miocic in December at UFC on Fox 13 to rise to first in the rankings, but after two brutal losses to Velasquez his future in the division is unclear. Fabricio Werdum snagged a critical win against Travis Browne in April at UFC on Fox 11, then defeated Mark Hunt in November at UFC 180 to win the interim heavyweight championship. Werdum and Velasquez are tentatively scheduled to meet in June 2015 at UFC 188. Travis Browne defeated Brendan Schaub in December at UFC 181, and Matt Mitrione defeated Gabriel Gonzaga in December at UFC on Fox 13 – both are hungry for their shot but are a fight or two away from the title.

Caveats

The model is strongly sensitive to activity – the more frequently someone fights (and more importantly, wins) the higher they will rank in general. If a fighter hasn’t fought in the last year, they automatically lose 20% of their total score as two out of the ten scores will be zero (the one year periods). It also only uses UFC and Pride fight data, which leads to certain active fighters with strong careers in other organizations being underrated and not being considered for the top five, like Luke Rockhold at middleweight. Adding in fight data from other organizations is definitely on my to-do list.
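To make the 20% figure concrete, here is a minimal Python sketch of the even weighting. It is an illustration only: the algorithm names and the 0.5 scores are placeholders I made up, and the real program may store its scores differently.

```python
from statistics import mean

ALGORITHMS = ("algorithm_a", "algorithm_b")  # placeholder names, not from the program
HISTORIES = (1, 2, 4, 8, 16)                 # lookback histories in years


def confidence(scores):
    """Evenly weight all ten (algorithm, history) scores.

    `scores` maps (algorithm, history) -> value in [-1, 1]. A missing
    entry means "no information" and counts as 0.
    """
    values = [scores.get((a, h), 0.0) for a in ALGORITHMS for h in HISTORIES]
    return mean(values)


# Hypothetical fighter who scores 0.5 on every algorithm and history.
active = {(a, h): 0.5 for a in ALGORITHMS for h in HISTORIES}
# The same fighter after a year off: both one year scores fall to zero.
inactive = {k: (0.0 if k[1] == 1 else v) for k, v in active.items()}

print(confidence({}), confidence(active), confidence(inactive))
```

A debuting fighter (empty dictionary) lands on zero, and zeroing the two one year entries removes two tenths of the total.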

Warby Parker Theo in Whiskey Tortoise

Warby Parker Canada Review

warby-storefront

I tend to order everything I can online. My current pair of glasses was looking pretty beat up and my benefits plan at work had just reset, so I decided to take the plunge and order glasses from online retailer Warby Parker. They’ve had a strong online presence for US-based shoppers in the last few years and have recently started shipping to Canada.

Selection

The grid-style view of frames displayed on the website is at first mildly overwhelming. Each frame is different in some small way from the others, yet all share a very similar look. It’s very easy to go back and forth and agonize over what precise frame would look best as you’re unable to physically try them on. Warby Parker in the US solves this by offering a home try-on program where you pick five of your favourite frames, they’re shipped to you without lenses, and you can try them on to see what you like. Unfortunately, this program is unavailable in Canada, leaving you a bit in the dark.

I ended up going with Theo in Whiskey Tortoise. I typed in my prescription, and then used Warby Parker’s PD calculator to determine my pupillary distance. It’s a rather clever bit of software that uses a credit card as a reference to determine the distance between your pupils, a necessary measurement for the manufacture of glasses. Once all my information was in, I paid with a credit card and the order process began. The total cost was a very reasonable $130, $120 for the glasses and $10 for shipping with duties and taxes included.

Shipping

It took 11 days (7 business days) from when I first submitted my order to when my glasses were delivered. A few days after I ordered, Warby Parker’s Canadian site was upgraded, and somehow the information in my online account was erased. No saved prescriptions anymore, no record of my order, and my name was now stored as FiftyOne Milburn – presumably an artifact from the fact that Warby Parker uses FiftyOne (now known as BorderFree) as a third party international shipper. A bit concerning, but after a phone call to Warby Parker offices I was reassured that my order was still moving along fine and that this was simply a software issue due to the website upgrade.

Order received | Nov 20, 2014
Order confirmed | Nov 20, 2014
Left Warby Parker for third party shipper (Borderfree) | Nov 26, 2014 | 4 business days to make glasses
Left third party shipper (Borderfree), got tracking number | Nov 28, 2014 | 2 business days to ship to third party shipper
Successfully delivered | Dec 1, 2014 | 1 business day to ship internationally to Canada

Shipping from the US to Canada was particularly fast (sent Friday, delivered late Monday). There’s a bit of room for improvement at the start of the order process, however. I understand that prescription glasses take time to make, unlike a retail good that can be pulled off a warehouse shelf, but there was a six day period at the start of the order where I had no information about what was happening with my order. A simple page with basic information (“Your prescription is being reviewed”, “Your glasses are being made”, “Your glasses are being prepared for shipping”, etc.) would be sufficient.

Unboxing

The glasses arrive packaged in a clean looking box with a well constructed case and branded microfiber bag. The only problem I have is that the glasses case is just a tiny bit too small for the frames – the ends of the arms peek out just beyond the edges. The case can still be closed, but it seems like a small detail that was neglected.

I was quite impressed with the glasses themselves. They’re well built plastic frames, with the temples mating cleanly to the front of the frame with no gaps or misalignment. The nose pads are comfortable and the details (including a WP at the end of the left temple) are a nice touch. The hinges are simple but sturdy screw hinges, and do not offer any spring or give when open fully. This became obvious when first wearing the glasses, as apparently my face is just a bit too wide for the frames and there was some pressure above my ears.

Wear

I’ve had these glasses for about a month or so now. The pressure the arms were applying above my ears has subsided with a bit of gentle bending of the frames and time, although there’s a bit of annoyance when eating. They’re beautifully clear. I haven’t worn glasses in years and was surprised by the clarity versus my contact lenses, particularly when reading electronic screens and books. I now typically wear the glasses on most evenings and weekends.

Overall

Pros: Well constructed and reasonably priced glasses.
Cons: Some hiccups with cross-border logistics.

Strongly recommended if you’re comfortable with online shopping and enjoy the style of acetate framed glasses.

119072282JH003_UFC_58_St_Pi

Quantifying the Career of Georges St-Pierre

I recently decided to improve my graph theory based MMA ranking system which modeled fighters as vertices and wins as edges on a directed graph. Previously the system generated a single large list of fighter rankings, evaluating all fights on record to generate data. The only problem is that this does not show how these rankings have changed over time – for instance, how has the ranking of Georges St-Pierre changed over the last decade? His recent retirement had made me curious.

I decided that instead of running this analysis for a single day in the present and looking back at all fights in the past – why not run the analysis for a given arbitrary day, and limit the fights which are included in the analysis to only those fights occurring within a certain period of time before that day? This analysis could then be run and re-run as I iterated day by day to obtain data demonstrating how rankings changed over time for a certain lookback period.
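The windowing idea can be sketched in a few lines of Python. This is an illustration under my own assumptions rather than the actual code: the function names are hypothetical, a year is approximated as 365.25 days, and the three fights are just a tiny sample for demonstration.

```python
from datetime import date, timedelta

# (date, winner, loser) for three fights, used here only as sample data.
FIGHTS = [
    (date(2006, 3, 4), "Georges St-Pierre", "BJ Penn"),
    (date(2006, 11, 18), "Georges St-Pierre", "Matt Hughes"),
    (date(2007, 4, 7), "Matt Serra", "Georges St-Pierre"),
]


def fights_in_window(fights, as_of, lookback_years):
    """Keep only fights within `lookback_years` before `as_of` (inclusive of `as_of`)."""
    # Assumption: a year is approximated as 365.25 days.
    start = as_of - timedelta(days=round(365.25 * lookback_years))
    return [f for f in fights if start < f[0] <= as_of]


def daily_dates(first, last):
    """Yield every calendar day from `first` to `last`, inclusive."""
    day = first
    while day <= last:
        yield day
        day += timedelta(days=1)


# Re-run the (here trivial) analysis for each day in a short range.
for day in daily_dates(date(2009, 1, 1), date(2009, 1, 3)):
    window = fights_in_window(FIGHTS, day, lookback_years=2)
    print(day, [winner for _, winner, _ in window])
```

In a full run, the ranking algorithm would be applied to each day's window instead of just printing the winners.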

I ran this analysis from January 1, 2004 (roughly one month before Georges St-Pierre’s first fight in the Octagon) to the most recent UFC event at the time of this post (UFC on FOX 12, July 26, 2014) for lookback periods ranging from two to sixteen years in the past. To provide context I also ran this analysis for every UFC welterweight champion during this timeframe (Matt Hughes, Matt Serra, and Johny Hendricks). A visualization and brief summary of this data can be found in the table below. The color of the graphs ranges from green (shortest lookback period of two years) to yellow (longest lookback period of sixteen years).

varlb-hendricks
Johny Hendricks: Arriving in the UFC in mid-2009, Hendricks demonstrated strong performance in the shorter lookback periods but simply hasn’t had enough time to climb up the rankings in the longer term.
varlb-hughes
Matt Hughes: A legend of the old guard, his shorter lookback period rankings started to slip in 2007 but he still ranks highly in the long term, evidence of years of dominance.
varlb-serra
Matt Serra: Saw big gains in 2007 with his defeat of Georges St-Pierre but couldn’t hold on to them. Note the dropoff in scores into negative territory when the lookback period no longer considers his Georges St-Pierre win (i.e. early 2009 in the two year lookback period, early 2011 in the four year lookback, etc.).
varlb-gsp
Georges St-Pierre: A steady gainer who got a big pop in the mid to long term rankings in late 2006 after defeating BJ Penn. Short term lookback periods show weaker performance post-2010 due to injury and slower scheduling, but long term rankings are strong.

So how are we supposed to compare performance between fighters? Thirty-two time series with eight separate lookback periods is a mess to visualize effectively, so we need to find the best lookback period or periods to compare. Shorter lookback periods respond quickly to recent wins and losses, but are fickle, ignoring long term consistency. Longer lookback periods are more stable and reflect consistent performance, but take years to reflect possibly rapid changes in fighter performance. After evaluating many different approaches, I decided to average the results from the 2, 4, 8, and 16 year lookback periods. Each lookback period is sufficiently different from the others and provides new information, and the combination provides a good balance between short and long term viewpoints.
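The averaging step itself is simple. A minimal Python sketch, assuming each fighter's score on a given day is available per lookback period (the numbers below are made up for illustration):

```python
LOOKBACK_YEARS = (2, 4, 8, 16)


def combined_score(score_by_lookback):
    """Unweighted mean of the scores for the 2, 4, 8 and 16 year lookbacks."""
    return sum(score_by_lookback[y] for y in LOOKBACK_YEARS) / len(LOOKBACK_YEARS)


# Hypothetical scores for one fighter on one day.
example = {2: 0.2, 4: 0.4, 8: 0.6, 16: 0.8}
print(combined_score(example))
```

Repeating this for every day in the range gives one combined time series per fighter instead of one per lookback period.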

gsp-career-mmagraph

The resulting chart makes qualitative sense and shows the early dominance and slow fade of Matt Hughes, the rise of Georges St-Pierre, Matt Serra’s spike when he captured the welterweight championship, and the new guard of Johny Hendricks quickly approaching. We can also look at the day to day changes in Georges St-Pierre’s ranking to capture some events that influenced the changes in his ranking most heavily.


Biggest One-Day Gains
March 4, 2006 | +0.232 | Georges St-Pierre wins by split decision against BJ Penn in a close battle at UFC 58.
November 18, 2006 | +0.204 | Georges St-Pierre defeats Matt Hughes by TKO in the second round at UFC 65 to become the new UFC Welterweight Champion.
August 9, 2008 | +0.203 | Georges St-Pierre defeats Jon Fitch at UFC 87.

Biggest One-Day Losses
April 7, 2007 | -0.119 | Matt Serra defeats Georges St-Pierre in an infamous upset at UFC 69.
September 23, 2006 | -0.096 | Matt Hughes defeats BJ Penn at UFC 63. Note that this occurred shortly after GSP defeated BJ Penn for a massive gain in the rankings. BJ Penn’s drop in the rankings as a result of this loss also caused GSP’s ranking to slip as a second-order effect.
June 7, 2008 | -0.067 | Thiago Alves defeats Matt Hughes by TKO at UFC 85. A large portion of GSP’s rankings relied on beating Matt Hughes. As Hughes’ ranking slipped due to losses, GSP’s did as well due to second-order effects.

Most of Georges St-Pierre’s big gains occurred earlier in his career, which makes sense – it’s hard to move up when you’re already the champion. His worst days were also the predictable loss to Matt Serra, and the days when his most prominent prior wins (BJ Penn, Matt Hughes) ended up losing themselves.
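Finding these days is just a matter of differencing the daily series and sorting. A small Python sketch with hypothetical function names and made-up scores:

```python
def one_day_changes(series):
    """series: list of (day, score) in date order, one entry per day.

    Returns a list of (day, change from the previous day).
    """
    return [(d1, s1 - s0) for (_, s0), (d1, s1) in zip(series, series[1:])]


def biggest_moves(series, n=3):
    """Return the n largest one-day gains and the n largest one-day losses."""
    changes = one_day_changes(series)
    gains = sorted(changes, key=lambda c: c[1], reverse=True)[:n]
    losses = sorted(changes, key=lambda c: c[1])[:n]
    return gains, losses


# Made-up scores for four consecutive days.
sample = [("day 0", 0.10), ("day 1", 0.30), ("day 2", 0.25), ("day 3", 0.26)]
gains, losses = biggest_moves(sample, n=1)
print(gains, losses)
```

Because a fighter's score depends on the whole graph, a change can show up on a day when the fighter did not compete at all, which is how the second-order effects appear.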

And when was Georges St-Pierre’s peak? Well, by the numbers, it was the week shortly after his defeat of Dan Hardy at UFC 111 when he reached a career high of ~0.727, at roughly the start of the controversy about him finishing fights. A year and a half later, he would injure his ACL, which reduced the frequency of his fights to such a degree that eventually his ranking slid below that of a young upstart named Johny Hendricks. The two eventually fought on November 16, 2013 – and Georges St-Pierre won, for a one-day gain of 0.120, his seventh best on record.

After all, if you’re going to retire, you might as well go out on top.

Summer Cleaning

It’s been a while since I’ve posted on here – I’m currently in the process of cleaning up the site and updating the look and feel. I’ve made the decision not to delete any old content despite how awkward it may appear now. It’s a bit crazy to realize that this site has been online, in one form or another, for over nine years.

Please let me know if the changes break any functionality. In the meantime, here are some Lorenz attractor renders I completed when I got my new computer. Plenty more points than the old renders… gotta use that horsepower for something.



FIGURE-3

The Theis Equation and Flow

Mathematics is remarkably effective in describing the physical world in part due to isomorphisms, relationships between concepts that reveal a similar underlying structure. In 1935 Charles Vernon Theis was working on groundwater flow, a subject with little mathematical treatment at the time. He thought that perhaps a well tapping a confined aquifer could be described using the same mathematics as the heat flow of a thin wire drawing heat from a large plate, as this work was better established. With a little bit of help from C. I. Lubin and considering how parameters describing underground water flow could be compared to those describing heat flow in solid materials, he developed the Theis equation which is used to this day to model the response of a confined aquifer to pumping over time.

I developed a small program which allows visualization of the potentiometric surface of a confined aquifer subject to pumping using Processing. This particular example uses aquifer and pumping parameters from a Geo-Slope whitepaper.

The source code may be downloaded here. All values including aquifer, pumping, visualization, and numerical parameters may be varied to apply to a wide variety of situations. The exponential integral (or “well function”) is calculated using a numerical approximation accurate to at least 1 part in 10,000,000.
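For readers who want to try the calculation without Processing, here is a rough Python sketch of the Theis drawdown, s = Q / (4πT) · W(u) with u = r²S / (4Tt). It is not the downloadable program: the well function here uses the plain power series for the exponential integral (which I would only trust for small to moderate u, roughly u < 5) rather than the approximation mentioned above, and the aquifer values in the example are invented, not the Geo-Slope ones.

```python
import math

EULER_GAMMA = 0.5772156649015329


def well_function(u, terms=60):
    """W(u) = E1(u) from its power series:
    -gamma - ln(u) + sum_{n>=1} (-1)^(n+1) u^n / (n * n!).
    """
    if u <= 0:
        raise ValueError("u must be positive")
    total = -EULER_GAMMA - math.log(u)
    term = 1.0  # holds u**n / n! as the loop advances
    for n in range(1, terms + 1):
        term *= u / n
        total += (-1) ** (n + 1) * term / n
    return total


def theis_drawdown(Q, T, S, r, t):
    """Drawdown at radius r and time t for pumping rate Q,
    transmissivity T and storativity S (consistent units assumed)."""
    u = r * r * S / (4.0 * T * t)
    return Q / (4.0 * math.pi * T) * well_function(u)


# Invented example values: Q in m^3/s, T in m^2/s, r in m, t in s.
for r in (10.0, 50.0):
    print(r, theis_drawdown(Q=0.01, T=0.001, S=1e-4, r=r, t=3600.0))
```

Evaluating the drawdown on a grid of radii at successive times gives the cone of depression that the visualization animates.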

lorenz-attractor-1600-04

Convergence in the Lorenz Attractor

Most visualizations of the Lorenz attractor are of a long history of a single point after convergence to the attractor has occurred. I was interested in what the surrounding space looked like, so I randomly selected 20,000 starting points from a three dimensional Gaussian distribution with a standard deviation of 100. Each point was iterated, and a short history displayed as a trail.

Interestingly enough the points do not simply fall in from arbitrary directions like a gravity field, but display structure by instead swirling along a clear path up the z axis.
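A scaled-down version of this experiment can be sketched in plain Python. Several things here are my assumptions for illustration: the classic parameter values (σ = 10, ρ = 28, β = 8/3), a fourth-order Runge-Kutta step, a small time step to cope with the far-away starting points, and a much smaller swarm than 20,000 points so it runs quickly.

```python
import random

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # classic Lorenz parameters (assumed)


def lorenz(p):
    """Right-hand side of the Lorenz equations at point p = (x, y, z)."""
    x, y, z = p
    return (SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z)


def rk4_step(p, dt):
    """Advance one point by one fourth-order Runge-Kutta step."""
    def shifted(base, k, scale):
        return tuple(b + scale * ki for b, ki in zip(base, k))

    k1 = lorenz(p)
    k2 = lorenz(shifted(p, k1, dt / 2))
    k3 = lorenz(shifted(p, k2, dt / 2))
    k4 = lorenz(shifted(p, k3, dt))
    return tuple(
        pi + dt / 6 * (a + 2 * b + 2 * c + d)
        for pi, a, b, c, d in zip(p, k1, k2, k3, k4)
    )


def sample_starts(n, std=100.0, seed=0):
    """Draw n starting points from a 3D Gaussian with the given standard deviation."""
    rng = random.Random(seed)
    return [tuple(rng.gauss(0.0, std) for _ in range(3)) for _ in range(n)]


def trails(points, steps, dt=0.001, trail_length=50):
    """Iterate every point, keeping only its last `trail_length` positions."""
    histories = [[p] for p in points]
    for _ in range(steps):
        for h in histories:
            h.append(rk4_step(h[-1], dt))
            if len(h) > trail_length:
                del h[0]
    return histories


# Small demonstration swarm; each history is a short trail ready to draw.
swarm = trails(sample_starts(20), steps=500)
print(len(swarm), len(swarm[0]))
```

Each entry of `swarm` is the short recent history of one point, which is what gets drawn as a trail.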

mma-directed-graph

Graph Theory, Algorithmic Consensus, and MMA Ranking

I’ve been working on an objective ranking system lately that could be applied to groups with large numbers of individual competitors, like the sport of mixed martial arts (MMA). The biggest issue compared to typical ranking systems is that there are so many participants that they cannot all compete against each other in a round-robin tournament or similar within a reasonable time frame. In order to calculate a global ranking, all of the players must be compared against each other through a sort of “six degrees of separation” style comparison, which is vulnerable to bias and calculation error.

This problem has already been solved in the chess world with the Elo rating system, a statistical approach that requires frequent competition in order to generate statistically significant results. Unfortunately competitors in sports like mixed martial arts or boxing do not compete nearly as frequently as chess players (for obvious reasons) and this approach drowns in a sea of statistical noise. Typically combat sport rankings are done by a knowledgeable observer by hand, through consensus of many observers, or by models with a large number of tunable parameters. It is very interesting to consider that humans appear to be able to easily determine who should be ranked highly, and that many algorithmic approaches largely match these evaluations but make some seemingly obvious mistakes. My goal was to find an approach that produced rankings that seemed sensible to a human observer with a minimum of tunable parameters (preferably none).

Data Structure

The initial step is to structure our data in a sensible way. We have a large number of participants, connected by individual competition which can either result in a win or a loss. One way of structuring this data would be in a directed graph, where competitors are represented by nodes and matches as edges with direction defined by who wins or loses. We seem to be focused on losses (or win/loss ratio) as the biggest factor – a competitor with 40 wins and zero losses is typically regarded as better than a competitor with 60 wins and 20 losses. Let’s set the direction of the edge from the losing competitor to the winning competitor. A “good” competitor’s node will therefore have many incoming edges and few outgoing edges, and tend to be at the center of force-directed graph layouts.
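As a minimal sketch of this structure in Python, using a plain dictionary as the adjacency list (the function names are hypothetical and the fights are a tiny sample, not the full data set):

```python
from collections import defaultdict

# (winner, loser) pairs; a small sample for illustration.
FIGHTS = [
    ("Georges St-Pierre", "BJ Penn"),
    ("Matt Hughes", "BJ Penn"),
    ("Georges St-Pierre", "Matt Hughes"),
    ("Matt Serra", "Georges St-Pierre"),
    ("Thiago Alves", "Matt Hughes"),
]


def build_graph(fights):
    """Return (nodes, edges) where edges[loser] lists the winners,
    so every edge points from the losing fighter to the winning one."""
    edges = defaultdict(list)
    nodes = set()
    for winner, loser in fights:
        edges[loser].append(winner)
        nodes.update((winner, loser))
    return nodes, dict(edges)


def degrees(nodes, edges):
    """In-degree counts wins, out-degree counts losses."""
    in_deg = {n: 0 for n in nodes}
    out_deg = {n: len(edges.get(n, [])) for n in nodes}
    for winners in edges.values():
        for w in winners:
            in_deg[w] += 1
    return in_deg, out_deg


nodes, edges = build_graph(FIGHTS)
in_deg, out_deg = degrees(nodes, edges)
print(in_deg["Georges St-Pierre"], out_deg["Georges St-Pierre"])
```

With the edge direction set this way, a fighter's in-degree is their win count and their out-degree is their loss count.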

Evaluation Algorithms

There are many possible evaluation algorithms which will produce a ranking from this data structure. After many trials, two appeared to stand above the rest.

  1. The first is recommended in the journal article Ranking from unbalanced paired-comparison data by H.A. David published in Volume 74, Issue 2 of Biometrika in 1987.
  2. David also discussed the Kendall-Wei algorithm in his paper, of which Google’s PageRank algorithm is a special case. PageRank is used to rank webpages which are represented as a directed graph based on the concept of network flow, and may also be applied to other directed graphs including our case. The PageRank algorithm contains one tunable parameter, a damping factor which is currently set to the default 0.85.
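To give a feel for the second algorithm, here is a bare-bones PageRank power iteration in Python on a loser-to-winner graph. It is a textbook-style sketch rather than the code behind these rankings: the handling of undefeated fighters (their rank is spread evenly over everyone) and the fixed iteration count are my assumptions, and the five-fight graph is a toy example.

```python
# edges[loser] lists the fighters who beat them (toy example).
EDGES = {
    "BJ Penn": ["Georges St-Pierre", "Matt Hughes"],
    "Matt Hughes": ["Georges St-Pierre", "Thiago Alves"],
    "Georges St-Pierre": ["Matt Serra"],
}
NODES = {"BJ Penn", "Matt Hughes", "Georges St-Pierre", "Matt Serra", "Thiago Alves"}


def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power iteration: each fighter passes their rank to those who beat them."""
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by fighters with no losses (no outgoing edges)
        # is redistributed evenly so the total stays at 1.
        dangling = sum(rank[v] for v in nodes if not edges.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = edges.get(v, [])
            for t in targets:
                new[t] += damping * rank[v] / len(targets)
        rank = new
    return rank


ranks = pagerank(NODES, EDGES)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{name}: {ranks[name]:.3f}")
```

Rank flows along the edges, so beating a highly ranked opponent is worth far more than beating a winless one.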

It was found that both algorithms seemed to emphasize different aspects important to MMA ranking. David’s “Unbalanced Pair Comparison” emphasized a grittier statistics-based approach, highlighting fighters such as Anderson Silva, Rashad Evans, and Jon Fitch. Google’s PageRank seemed to take a more social approach emphasizing fighters with a wide range of quality opponents, like Georges St-Pierre, Matt Hughes, and Forrest Griffin. It was very interesting how one algorithm appeared to highlight the “hardcore MMA fan” perspective, while the other seemed to be pulled straight from the UFC head office.

It was decided that both would be calculated, scores normalized, and used in combination to generate a consensus ranking similar to consensus rankings generated from human experts. This was inspired by IBM’s Watson which uses a consensus of multiple algorithms to evaluate answers to trivia questions. Two possible improvements are hypothesized but undertested:

  1. Perhaps additional independent ranking algorithms incorporated in this consensus would improve accuracy. The big issue appears to be “independent” algorithms which do not simply restate the work of other algorithms, and of those, finding algorithms which display ranking behavior useful for our application.
  2. Unlike Watson, confidence levels are not used. This would be a useful addition given situations like extreme upsets. A newer beta version of this ranking system determines if highly ranked fighters coincide with centrality metrics in an attempt to implement this, but is not complete at the time of this post.

Results

The ranking system was run on every UFC event from UFC 1 (November 12, 1993) to Fight for the Troops 2 (January 22, 2011). Both algorithms are shown ranked alone for comparison, and their scores were equally weighted to produce the final results.
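The combination step can be sketched in Python as follows. The post does not pin down the normalization, so min-max scaling to the range 0 to 1 is an assumption on my part, and the fighters and scores are placeholders.

```python
def normalize(scores):
    """Min-max scale a {fighter: score} mapping to the range [0, 1] (assumed scheme)."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: ((v - lo) / span if span else 0.0) for k, v in scores.items()}


def consensus(scores_a, scores_b):
    """Equally weight two normalized score sets covering the same fighters."""
    na, nb = normalize(scores_a), normalize(scores_b)
    return {k: 0.5 * na[k] + 0.5 * nb[k] for k in na}


def ranking(scores):
    """Fighters sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Placeholder scores from two algorithms on different scales.
algo_one = {"Fighter A": 3.0, "Fighter B": 2.0, "Fighter C": 1.0}
algo_two = {"Fighter A": 0.1, "Fighter B": 0.5, "Fighter C": 0.3}

print(ranking(algo_one), ranking(algo_two), ranking(consensus(algo_one, algo_two)))
```

Normalizing first matters because the two algorithms produce scores on different scales, and without it one of them would dominate the sum.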

Lightweight (155lbs)
Overall Rank | PageRank | Unbalanced Pair
1. Gray Maynard | 1. B.J. Penn | 1. Gray Maynard
2. B.J. Penn | 2. Gray Maynard | 2. George Sotiropoulos
3. Frankie Edgar | 3. Frankie Edgar | 3. Frankie Edgar
4. George Sotiropoulos | 4. Kenny Florian | 4. Jim Miller
5. Jim Miller | 5. Joe Lauzon | 5. Nik Lentz

First up are the lightweights – and the results aren’t too shabby. No one seems to want to admit it due to his sometimes snooze-inducing style, but Gray Maynard is a beast who is likely to cause B.J. Penn significant issues if they ever fought. Frankie Edgar deserves to be right up there but not number one, and chronically underrated George Sotiropoulos and Jim Miller round out the pack.

Welterweight (170lbs)
Overall Rank | PageRank | Unbalanced Pair
1. Georges St-Pierre | 1. Georges St-Pierre | 1. Matt Hughes
2. Matt Hughes | 2. Matt Hughes | 2. Josh Koscheck
3. Josh Koscheck | 3. Matt Serra | 3. Georges St-Pierre
4. Martin Kampmann | 4. Dennis Hallman | 4. Martin Kampmann
5. Dennis Hallman | 5. Martin Kampmann | 5. Rick Story

Georges St-Pierre is the obvious frontrunner at 170. Matt Hughes at number two is a bit more debatable, but a long title reign and consistent quality opposition provide a reasonable rationale. Josh Koscheck is always the bridesmaid, never the bride at third, and Martin Kampmann and Dennis Hallman round out a somewhat thin division.

Middleweight (185lbs)
Overall Rank | PageRank | Unbalanced Pair
1. Anderson Silva | 1. Anderson Silva | 1. Anderson Silva
2. Jon Fitch | 2. Jon Fitch | 2. Jon Fitch
3. Yushin Okami | 3. Vitor Belfort | 3. Yushin Okami
4. Michael Bisping | 4. Nate Marquardt | 4. Michael Bisping
5. Nate Marquardt | 5. Yushin Okami | 5. Demian Maia

Anderson Silva provides another easy choice for number one at 185lbs. Both Jon Fitch and Yushin Okami deserve their spots with a consistent if slightly dull record. Michael Bisping has slowly been grinding his way up the charts, and Nate Marquardt rounds out the top five.

Light Heavyweight (205lbs)
Overall Rank | PageRank | Unbalanced Pair
1. Rashad Evans | 1. Forrest Griffin | 1. Rashad Evans
2. Lyoto Machida | 2. Lyoto Machida | 2. Jon Jones
3. Forrest Griffin | 3. Rashad Evans | 3. Ryan Bader
4. Quinton Jackson | 4. Quinton Jackson | 4. Lyoto Machida
5. Mauricio Rua | 5. Mauricio Rua | 5. Thiago Silva

Rashad Evans appears to have made a sensible call waiting for his title shot at UFC 128. The hypercompetitive light heavyweight division is always a tough one to call. A split in the consensus between the two algorithms produces a top five that seems to emphasize number of fights in the Octagon, with champion Mauricio “Shogun” Rua a surprising fifth. Too early to call Evans over Rua? Only time will tell.

Heavyweight (265lbs)
Overall Rank | PageRank | Unbalanced Pair
1. Frank Mir | 1. Frank Mir | 1. Frank Mir
2. Cain Velasquez | 2. Brock Lesnar | 2. Junior Dos Santos
3. Junior Dos Santos | 3. Cain Velasquez | 3. Cain Velasquez
4. Brock Lesnar | 4. Antonio Rodrigo Nogueira | 4. Cheick Kongo
5. Shane Carwin | 5. Shane Carwin | 5. Brendan Schaub

I initially disagreed with Frank Mir as number one here – Cain Velasquez seems to be the obvious choice. But the ranking process seems to trust number of fights over new hype, and the rest of the top five is bang on what I would choose. You can’t win them all – or perhaps I’m just being unfair to Frank Mir.

Conclusions

The approach produced excellent rankings from UFC-only data, largely coinciding with established and more complete authorities like FightMatrix. The approach used two ranking algorithms which traversed a directed graph, and produced scores which were normalized and added to produce a final score which was sorted to produce final rankings. One tunable parameter (PageRank damping factor) exists in the model, but was left at the default value of 0.85. Further work will focus on additional ranking algorithms which may be incorporated into the consensus, parametric analysis of the PageRank damping factor, and determining confidence scores.

mmagraph-ufc003

Directed Graphs and MMA

There is underlying mathematical structure everywhere – it’s just a matter of finding the best way to unfold it from the data. I’ve been working on a project to objectively rank mixed martial arts fighters. It’s not anywhere near done yet, but I’ve collected a fair bit of data. Here’s what nearly 2000 MMA fights in the UFC and Pride FC over the last 15 years look like expressed as a directed graph with a force-directed layout.

You can see fights clumped into UFC (top) and Pride (bottom) organizations, with a subset of fighters acting as ambassadors between both.

lorenz-blue

Lorenz and the Butterfly Effect

In 1962, Edward Lorenz was studying a simplified model of convection flow in the atmosphere. He imagined a closed chamber of air with a temperature difference between the bottom and the top, modeled using the Navier-Stokes equations of fluid flow. If the difference in temperature was slight, the heated air would slowly rise to the top in a predictable manner. But if the difference in temperature was higher, other situations would arise including convection and turbulence. Turbulence is notoriously difficult to model, so Lorenz focused on the formation of convection, somewhat more predictable circular flows.

Focusing only on convection allowed Lorenz to simplify the model sufficiently so that he could run calculations on the primitive (by today’s standards) computer available to him at MIT. And something wonderful happened – instead of settling down into a predictable pattern, Lorenz’s model oscillated within a certain range but never seemed to do the exact same thing twice. It was an excellent approach to the problem, exhibiting the same kind of semi-regular behavior we see in actual atmospheric dynamics. It made quite a stir at MIT at the time, with other faculty betting on just how this strange mathematical object would behave from iteration to iteration.

But Lorenz was about to discover something even more astounding about this model. While running a particularly interesting simulation, he was forced to stop his calculations prematurely as the computer he was using was shared by many people. He was prepared for this, and wrote down three values describing the current state of the system at a certain time. Later when the computer was available again, he entered the three values and let the model run once more. But something very strange happened. It initially appeared to follow the same path as before, but soon it started to diverge from the original slightly, and quickly these small differences became huge variances. What had happened to cause this?

Lorenz soon found that his error was a seemingly trivial one. He had rounded the numbers he wrote down to three decimal places, while the computer stored values to six decimal places. Simply rounding to the nearest thousandth had created a tiny error that propagated throughout the system, creating drastic differences in the final result. This is the origin of the term “butterfly effect”, where infinitesimal initial inputs can cause huge results. Lorenz had discovered the first chaotic dynamical system.

We can see this effect in action here, where a red and a blue path have initial values truncated in a similar manner. Instead of graphing X, Y, and Z values separately, we’ll graph them in 3D space.

Coordinate | Initial Values (Red) | Initial Values (Blue)
X | 5.766 | 5.766450
Y | -2.456 | -2.456460
Z | 32.817 | 32.817251

We can see at first that the two paths closely track each other. The error is magnified as time marches on, and the paths slowly diverge until they are completely different – all as a result of truncating to three decimal places. This is a contrast to the sensitivity to error we are typically used to, where small errors are not magnified uncontrollably as calculation progresses. Imagine if using a measuring tape that was off by a tenth of a percent caused your new deck to collapse rather than stand strong…
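The divergence is easy to reproduce numerically. The Python sketch below integrates the red and blue starting points side by side and tracks the distance between them. The parameter values (σ = 10, ρ = 28, β = 8/3), the Runge-Kutta integrator, and the step size are my assumptions for illustration and may not match the settings used for the renders.

```python
import math

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # classic Lorenz parameters (assumed)

RED = (5.766, -2.456, 32.817)            # truncated to three decimal places
BLUE = (5.766450, -2.456460, 32.817251)  # six decimal places


def lorenz(p):
    """Right-hand side of the Lorenz equations at point p = (x, y, z)."""
    x, y, z = p
    return (SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z)


def rk4_step(p, dt):
    """Advance one point by one fourth-order Runge-Kutta step."""
    def shifted(base, k, scale):
        return tuple(b + scale * ki for b, ki in zip(base, k))

    k1 = lorenz(p)
    k2 = lorenz(shifted(p, k1, dt / 2))
    k3 = lorenz(shifted(p, k2, dt / 2))
    k4 = lorenz(shifted(p, k3, dt))
    return tuple(
        pi + dt / 6 * (a + 2 * b + 2 * c + d)
        for pi, a, b, c, d in zip(p, k1, k2, k3, k4)
    )


def separation_history(a, b, steps, dt=0.01):
    """Distance between two trajectories before the first step and after each step."""
    seps = [math.dist(a, b)]
    for _ in range(steps):
        a, b = rk4_step(a, dt), rk4_step(b, dt)
        seps.append(math.dist(a, b))
    return seps


seps = separation_history(RED, BLUE, steps=3000)
for i in range(0, 3001, 500):
    print(f"t = {i * 0.01:5.1f}  separation = {seps[i]:.6f}")
```

The printed separations start at well under a thousandth of a unit and end up comparable to the size of the attractor itself.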

There is also a hint of a more complex underlying structure, a folded three dimensional surface that the paths move along. If a longer path is drawn, the semi-regular line graphs of before are revealed to be the projections of a structure of undeniable beauty.

This is a rendering of the long journey of a single point in Lorenz’s creation, a path consisting of three million iterations developed in Processing that shows the region of allowable values in the model more clearly. Older values are darker, and the visualization is animated to show the strange dance of the current values of the system as they sweep from arm to arm of this dual-eyed mathematical monster, order generating chaos.