Archive for May 21st, 2010

Google, the Melbourne Demons, Port Adelaide Power and that game in Darwin…

Posted by Laura on Friday, 21 May, 2010

This weekend, the Melbourne Demons are playing the Port Adelaide Power in Darwin. This game is one of two AFL games being held in Darwin this season. I’m rather keen on geographic patterns in fan communities. Where are they located? How many people are there? What is the size and interest level in a particular place? Given that there isn’t an AFL team based in Darwin and the nearest team is over 3,000 km (1,800+ miles) away, it is hard to guess what team allegiances there would be based on. (The Canberra game with the Swans had a large number of people who barracked for the Sydney-based team. Canberra’s distance from Sydney and the Swans’ support of AFL Canberra are probably the major reasons for that.) I wanted to explore what those loyalties would be in the Northern Territory to the exclusion of other states.

There really is no good way of getting numbers for the Northern Territory without picking up everyone else across the country. And even when that isn’t the case, people frequently list themselves as residing in or belonging to the next biggest city even if they don’t live there. This is highly problematic when you’re looking for pockets of team support in the suburbs and rural areas, because city affiliation matters more to people addressing a wider, more international audience that may not have heard of Fremantle but may have heard of Perth, or that may not have heard of Geelong but does know where Melbourne is. There are ways to tease those patterns out by removing the major cities, like Melbourne, where the core is very tiny. But I’m digressing, because even when you can do that, it is still rather hard to get data off the major networks about a person’s interest by city while excluding other states. I can’t do that on Facebook, LiveJournal and its clones, bebo, blogger, orkut, 43things, LinkedIn, Twitter, care2… the list goes on and on. There is no easy solution other than getting everyone and then, after the data is collected, filtering it down by state.

While I have a lot of data of that sort already, not many people live in the Northern Territory. (For the Adelaide Crows, across six networks and with 75 fans, only one is from the Northern Territory.) It is really hard to get regional patterns inside the Northern Territory. My solution was to go to Google and put in the team’s name and the city. (I got the list of cities I used from a list of postal codes for the Northern Territory on Wikipedia. I was logged out of my Google account. I did not use the API.) My list of cities was 114 long after I removed cities with multiple postal codes. City names, when they included more than one word, were put in quotes. Team names were put in quotes. An example search would be “Melbourne Demons” “Alice Springs”.
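The quoting rules above can be sketched in a few lines of Python. This is only an illustration of how the search phrases could be assembled, not the way the searches were actually run (they were typed into Google by hand). The helper names and the three-city subset are hypothetical, and the sketch uses straight quotes rather than curly ones.

```python
def quote_if_multiword(name: str) -> str:
    """Wrap a place name in double quotes only if it has more than one word."""
    return f'"{name}"' if " " in name.strip() else name


def build_query(team: str, city: str) -> str:
    """Team names are always quoted; city names only when multi-word."""
    return f'"{team}" {quote_if_multiword(city)}'


teams = ["Melbourne Demons", "Port Adelaide Power"]
cities = ["Alice Springs", "Darwin", "Parap"]  # illustrative subset, not the full 114

# One search phrase per team and city pair.
queries = [build_query(team, city) for team in teams for city in cities]

for query in queries:
    print(query)
```

With the full list of 114 cities and two teams, this would produce 228 search phrases.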

This is all fine and dandy. You can easily repeat the results. You should be able to get regional patterns on a large scale that you can’t get from sites like bebo. Everything theoretically should work to get a somewhat accurate picture of the interest level by city in the Northern Territory for both teams. Except, well, no. Midstream, the methodology begins to change. Things I had not necessarily thought of come into play. First, there are duplicate city names. This is an issue for Palmerston, which is a city in New Zealand, a city in the Northern Territory and a suburb in the Australian Capital Territory. Second, some cities have common names or share names with people. This is the case for Gray, Northern Territory. It is also the case for another city that shares its name with a player for a different AFL team. This issue might be correctable by adding “Northern Territory” or NT to the search phrase. I did this for Palmerston. I just didn’t do it consistently, because Google did not always realize NT meant “Northern Territory” and the three forms gave wildly different search results in some cases. It becomes easier to ignore the issue and accept that the search results are going to be faulty. The third major issue was Google’s spelling correction. This issue is not obvious unless you actually look at the results. Moil is a city in the Northern Territory. Google helpfully wanted to correct my spelling by pulling up results featuring the word Mobile. Moil and Mobile are not the same thing. Karama and Karma are also not the same thing. Google, if you don’t specifically tell it that these are not the same thing, treats them as if they are. When I found this, I corrected the results number by putting a + in front of the city name to force Google to only pull up results with that exact spelling. Outside those two examples, I did this for Katherine, Elliott, Farrar, Gray, Gunn, Malak, Millner, Mitchell, The Gardens, and The Narrows. This helped ensure slightly more relevance and avoided the problem of deciding the preferred way to indicate that a city is in the Northern Territory.
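As a rough sketch, the corrections described above could be folded into the search phrase like this. The city lists come from this post, but several details are assumptions: the function names are hypothetical, the + is placed before the opening quote for multi-word names, and Palmerston is given the NT form (the post only confirms “Driver NT” and +Mitchell as written searches). The + prefix reflects how Google behaved when these searches were run and may not behave the same way today.

```python
# Cities searched with a leading + to force the exact spelling (from the post).
EXACT_SPELLING = {
    "Moil", "Karama", "Katherine", "Elliott", "Farrar", "Gray",
    "Gunn", "Malak", "Millner", "Mitchell", "The Gardens", "The Narrows",
}

# Cities searched with a trailing NT. "Driver NT" appears in the post;
# using the same form for Palmerston is an assumption.
NT_SUFFIX = {"Palmerston", "Driver"}


def city_term(city: str) -> str:
    """Return the city part of the search phrase with any corrections applied."""
    term = f'"{city}"' if " " in city else city
    if city in EXACT_SPELLING:
        term = "+" + term  # assumed placement for quoted multi-word names
    if city in NT_SUFFIX:
        term += " NT"
    return term


def build_corrected_query(team: str, city: str) -> str:
    """Quoted team name followed by the corrected city term."""
    return f'"{team}" {city_term(city)}'


print(build_corrected_query("Melbourne Demons", "Mitchell"))
print(build_corrected_query("Port Adelaide Power", "Driver"))
```

Keeping the corrections in two explicit lists makes it easy to see, after the fact, which cities were searched differently from the rest.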

The methodology problems out of the way, it is time for the results. I couldn’t get a good visualization tool. (The ones I tend to use aren’t really good with the Northern Territory. I’ll find a fix for that in the future.) Therefore, the easiest way to see the results is to download the xls file or the csv file. The results, to me without the aid of a map, are pretty boring when compared to the methodology but still interesting. On the whole, it looks like there is more interest in the Melbourne Demons than there is in the Port Adelaide Power. If I give each team a point for every city where it is the more popular of the two, the Demons easily win the day with 93 to the Power’s 15, with six cities tied. If I add up the search results for every city (a number that has little relationship to the total pages about the Northern Territory, because many pages reference both teams or multiple cities in the Northern Territory), the Demons also win with 114,368 total pages compared to the Power’s 64,191. The two measures are not particularly close to each other: the Power are more popular in 13% of cities but account for about 36% of total pages.
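For anyone who wants to check the arithmetic, here is a small Python sketch of the two measures. The scoring function and the three-city sample are hypothetical and exist only to show the one-point-per-city rule; the totals underneath are the ones reported in this post.

```python
def score_cities(counts):
    """counts maps city -> (demons_pages, power_pages).

    Returns (demons_points, power_points, ties), giving one point to
    whichever team has more results for a city.
    """
    demons = sum(1 for d, p in counts.values() if d > p)
    power = sum(1 for d, p in counts.values() if p > d)
    ties = sum(1 for d, p in counts.values() if d == p)
    return demons, power, ties


# Made-up counts, only to exercise the function.
sample = {"City A": (120, 40), "City B": (10, 55), "City C": (7, 7)}
print(score_cities(sample))  # (1, 1, 1)

# Totals reported in the post.
demons_cities, power_cities, tied_cities = 93, 15, 6
demons_pages, power_pages = 114_368, 64_191

power_city_share = power_cities / (demons_cities + power_cities + tied_cities)
power_page_share = power_pages / (demons_pages + power_pages)

print(f"Power lead in {power_city_share:.1%} of cities")  # 13.2%
print(f"Power share of pages: {power_page_share:.1%}")    # 35.9%
```

The gap between the two percentages is the point: the Power lead in few cities but pick up a much larger share of the total page count.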

The top city for the Port Adelaide Power is represented by the following search: “Port Adelaide Power” Driver NT. Driver is a common word, so it is highly probable that this is not accurate, even with the attempt to correct for the Northern Territory by adding NT to the search. The next city that “prefers” the Power based on total search results is Parap, with 839 results. For the Melbourne Demons, “Melbourne Demons” +Mitchell is the top city. That’s another problematic place, as Mitchell is a common surname. The next most popular city based on total search results for the Melbourne Demons is Yuendumu with 12,200 page results. What is interesting here is that Darwin and Alice Springs do not appear at the top of the list, even when we exclude Driver and Mitchell. When the Demons and Power lists are combined and sorted descending by pages per city, Darwin doesn’t appear until the 12th spot for the Demons and the 18th spot for the Power. Alice Springs doesn’t appear until the 32nd position for the Demons and the 39th position for the Power. The biggest population bases in the territory are not generating the most references for either team.

I’m not entirely certain why “big” cities don’t rank higher. Do all the cities ahead of them have problematic names that I did not take steps to correct for? Or is it possible that more rural fans are reliant on the Internet to express their fannishness for a team? Are there players from these rural communities playing in the AFL, so that local news sources give them additional attention they would not get in more urban areas? It is possible. The real reason is probably rather complex.

So if you’re going to the game in Darwin this weekend, you’ll probably see more people barracking for the Demons.


1. I could theoretically get data from Facebook’s advertiser page for the number of people who list an interest and live within a certain distance of a city. There are just a few limitations. First, not every location in the Northern Territory is listed. Second, since Facebook forced users to like their interests, things have been in a state of flux, and I’ve found zeros where there should not be zeros, judging by the number of people who like the fan page that Facebook uses as its default for a search of that interest.

2.  There are other ways I might have gone about doing this besides Google, including searching local newspapers for references to a team. There are limitations there in that not every location has its own newspaper, and it excludes a lot of fan-created references on sites like bebo and blogger, where the audience may be different from the one that newspapers market to. I might also have tried a geolocation-based search. I just haven’t found a good one yet that is based in Australia. And even the ones I have seen tend to focus on Twitter and Foursquare. AFL fandom is found in more places than just those.

3.  These methodology problems recur in any sort of social media or web-based research that aims to create data sets. It is why I’m generally deeply skeptical of any numbers I see unless someone clearly states their methodology, explains the problems and provides their data to give benchmarks. This methodology issue also probably explains why much of the research done on social media involves case studies and qualitative-style research: the data is just so problematic to obtain.

Edited to add: Visualization of this data. It isn’t perfect. There are a number of erroneous data points. (Anything outside of the Northern Territory is incorrectly placed on the map.) That said, it begins to give an idea of these patterns going on… though looking at the map, I don’t really see what I would consider overwhelming patterns. One of the islands is all Melbourne Demons. I had some data for about 15 cities for the North Melbourne Kangaroos that I overlaid to give this a bit more perspective. At some point, I should do every city in the Northern Territory, corrected as much as possible for the problems discussed above, with every team on the map.
