Twitter #hashtags and the Australian election

This entry was posted by Laura on Thursday, 19 August, 2010

Can you learn about an election from its Twitter #hashtags? I’m curious to see whether #hashtag use on Twitter matches what people see as the major issues covered by the newspapers and news media. (I’m also interested in seeing which #hashtags people ReTweeted, but that will likely have to wait for another post.)

There are a couple of challenges when doing something like this… and be wary of anyone not spelling out their methodology because if you don’t know their methods, you can’t fairly evaluate their results and subsequent conclusions. Social media research is very much creative research that takes place in the moment. While you may not ever be able to get the exact same results, when you repeat a person’s methodology, deviations should be explainable.

Anyway, the challenges and assumptions with doing a #hashtag analysis of the Australian elections include:

  • Incomplete tweet set for keywords: I only get what is available from Searchtastic.
  • Incomplete tweet set based on keyword limitations: Even supposing I could get all the tweets related to a keyword, I can’t find every reference to the Australian elections, as there are too many possible words that could reference the elections and all the candidates running nationwide.
  • Incomplete tweet set because of time: Some keywords were searched for earlier and some later. Not all keywords were looked at over the same period.
  • Irrelevant tweets: Keywords like #greens may refer to the Green Party in the United States. Liberals may refer to American or British liberals. Unless every tweet is examined, irrelevant tweets will remain in the data set.

To offset some of these problems, a large data set was acquired. (A lot of people are tweeting about the elections. I see way more election tweets than footy tweets.) These tweets were acquired using Searchtastic and the following keyword set: #Abbott, #alp, #alp vote, #arbib, #aus2010, #AusLabor, #ausvotes, #ausvotes abbott, #ausvotes gillard, #boatphone, #GILLARDTINED, #greens, #laborfail, #masterchef abbott, #masterchef debate, #masterchef election, #masterchef gillard, #myliberal, #nocleanfeed, #NPC, #ozvote, #qanda, #spill, #tcot, #workchoices, @AustralianLabor, @LiberalAus, @TonyAbbottMHR, abbott math, Abbott-Gillard, asylum boat, australia vote, canberra election, debate abbott, debate gillard, debate winner, Following: @JuliaGillard, Following: @SenatorBobBrown, Following: @TonyAbbottMHR, Gillard, greens brown, Greens Canberra, greens election, Gruen Report, Gruen transfer, hockey liberals, immigration australia, Julia Gillard, JuliaGillard, Krudd Labor, KRudd Liberals, Labor darwin, labor leaks, Labor Liberals, Labor Tasmania, Liberals Tasmania, libs abbott, libs darwin, Libs Howard, marginal electorate, marginal seats, NSWLabor, perth vote, preferences vote, Rudd Liberals, Sex Party, stimulus liberals, sydney elections, Tony Abbott, tony boat phone, tonyabbottmhr, Truss LNP, Truss nationals, Warren Truss, Work Choices Australia, WorkChoices.

These keywords represent the major parties and candidates, some of the major issues, #hashtags that I saw on my Twitter feed, and different geographic areas around the country. Searches were run between July 19 and August 19. A total of 57,977 tweets were collected.

The elections were called on July 17. To make sure that the collected tweets pertain to the elections, all tweets made before July 1 were removed from the data set. By July 1, everyone knew the elections had to be called and conversation regarding them had started. (Methodology: Sort tweets by date in Excel. Remove the tweets that fall outside the date range.) This takes the total tweets in the data set down to 21,071. That’s still a fairly large collection of tweets to work from.

The next step is to remove duplicate tweets from the data set. (In Excel, Data -> Filter -> Advanced Filter -> Unique records only.) This brings the total tweets down to 18,462. That’s still a lot of tweets.
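For anyone who would rather script the date filter and the duplicate removal than do them in Excel, here is a minimal Python sketch. It is not what I actually ran. The sample rows, the (date, text) layout and the function names are all made up for illustration, and it treats a duplicate as a row that is identical in every field, which is my reading of how the "Unique records only" filter behaves.

```python
from datetime import date

# Hypothetical sample rows: (tweet date, tweet text). The real data came from
# Searchtastic and was handled in Excel; this layout is an assumption.
tweets = [
    (date(2010, 6, 28), "Old tweet about #spill"),
    (date(2010, 7, 20), "Election called! #ausvotes"),
    (date(2010, 7, 20), "Election called! #ausvotes"),  # exact duplicate
    (date(2010, 8, 15), "Watching #qanda tonight #ausvotes"),
]

START = date(2010, 7, 1)   # cutoff date used in the post
END = date(2010, 8, 19)    # last day searches were run


def filter_by_date(rows, start, end):
    """Keep only rows whose date falls between start and end, inclusive."""
    return [row for row in rows if start <= row[0] <= end]


def drop_duplicates(rows):
    """Keep the first occurrence of each identical row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


in_range = filter_by_date(tweets, START, END)
unique_tweets = drop_duplicates(in_range)
print(len(tweets), len(in_range), len(unique_tweets))
```

Printing the size of the data set after each step makes it easy to report how many tweets each step removed.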

The next step is to extract #hashtags. To do this, I copied and pasted all the tweets into Notepad. I ran a find and replace for [space]# and replaced it with [tab]#. I copied and pasted the result back into Excel and removed all cells that did not start with a #. After this was done, the data set was copied and pasted back into Notepad. Another find and replace was done; this time, [space] was replaced with [tab]. This was pasted back into Excel and all cells that did not start with # were deleted. When this was done, 33,857 total hashtags were found. Symbols like , ! . – ? were removed from those #hashtags. This was done to make sure that #labor. and #labor were treated the same for counting purposes.
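The Notepad and Excel round trip above boils down to "split each tweet on spaces, keep the pieces that start with #, then strip the punctuation". A rough Python equivalent, as a sketch only: the function name and the sample tweet are hypothetical, and the symbol list is limited to the five symbols named above.

```python
import re

# The symbols named in the post: , ! . – ?
PUNCTUATION = re.compile(r"[,!.–?]")


def extract_hashtags(text):
    """Return the whitespace-separated tokens that start with '#', cleaned."""
    tags = []
    for token in text.split():
        if token.startswith("#"):
            cleaned = PUNCTUATION.sub("", token)
            if len(cleaned) > 1:  # skip a bare '#' with nothing after it
                tags.append(cleaned)
    return tags


# Hypothetical sample tweet for illustration.
sample = "Go #labor. or go #labor? Either way #ausvotes, mate!"
print(extract_hashtags(sample))
```

Like the manual method, this only picks up a # that starts a word, so something like abc#def is not counted as a hashtag.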

A list of unique hashtags was then obtained, of which there were 2,678. The following table includes all #hashtags that appeared 25 or more times on the list:

Hashtag Count
#tcot 5157
#ausvotes 2449
#tlot 1816
#p2 1761
#teaparty 1704
#GOP 1172
#ocra 950
#Libertarian 920
#News 867
#ucot 705
#politics 687
#Israel 536
#sgp 536
#iamthemob 367
#jcot 308
#roft 304
#qanda 303
#cdnpoli 293
#Twisters 261
#USA 209
#Obama 194
#MyLiberal 187
#debate 183
#flotilla 171
#Gaza 155
#masterchef 154
#energy 149
#hhrs 147
#green 134
#rpn 121
#912 111
#aus2010 111
#NPC 108
#rootyq 101
#AUSlabor 95
#oilspill 94
#topprog 94
#nocleanfeed 92
#fb 88
#rightriot 80
#laborfail 79
#spill 76
#US 76
#jews 75
#ALP 74
#terrorism 74
#Greens 72
#cspj 70
#tpp 70
#BP 69
#antisemitism 68
#Gillard 67
#p21 67
#openinternet 66
#FF 65
#FollowFriday 65
#oil 65
#Muslim 64
#ireland 63
#Abbott 62
#UK 61
#Australia 58
#Iran 58
#Palin 57
#Blog 55
#Europe 54
#quote 54
#dnc 52
#fail 51
#iranelection 51
#jlot 51
#hcr 47
#MentalHealth 47
#justsayin 46
#rs 46
#eco 44
#boatphone 43
#glennbeck 43
#islam 42
#zionism 42
#NBN 41
#palestine 41
#jobs 40
#environment 39
#Autism 36
#Hamas 36
#Health 36
#military 36
#tiot 36
#dem 35
#AZ 34
#Lebanon 34
#vote2010 34
#dems 33
#ausdebate 32
#bonjovi 32
#climate 32
#judaism 32
#lateline 32
#MoFo 32
#videos 32
#foxnews 31
#theview 31
#730report 30
#hasbara 30
#nz 30
#travel 30
#tweetcongress 30
#YWC 30
#conservative 29
#Gulf 29
#patriottweets 29
#AFRICA 28
#ampat 28
#dublin 28
#property 28
#WORLDCUP 28
#beck 27
#Free 27
#IDF 27
#ronpaul 27
#acon 26
#jewish 26
#twibbon 26
#bds 25
#CNN 25
#Obamacare 25
#rush 25
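
The counting step behind a table like the one above can be sketched in a few lines of Python. This is an illustration under assumptions, not the exact procedure: the function name and the sample tags are hypothetical, and I am assuming that upper and lower case versions of a hashtag should be counted together.

```python
from collections import Counter


def count_hashtags(tags, minimum=1):
    """Count hashtags case-insensitively and keep those seen at least `minimum` times."""
    counts = Counter(tag.lower() for tag in tags)
    return [(tag, n) for tag, n in counts.most_common() if n >= minimum]


# Hypothetical extracted hashtags for illustration.
sample_tags = ["#ausvotes", "#AusVotes", "#tcot", "#ausvotes", "#qanda", "#tcot"]
print(count_hashtags(sample_tags, minimum=2))
```

The `minimum` argument plays the role of the cutoff used to decide which #hashtags make it into the table.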

Looking at this list, there are some #hashtags that are likely not Australian or not uniquely Australian. This includes #tcot, which stands for Top Conservatives On Twitter. The term was amongst those searched for because it appeared in a few tweets that also included the #ausvotes #hashtag. A Google search for #tcot #ausvotes only brings up 8,120 results, which further supports the idea that this isn’t really an Australian election term. #tlot was not deliberately searched for. If you put #tlot #ausvotes into a Google search, you get 244,000 results, which suggests heavy Australian usage. You could probably remove #teaparty, #GOP, #Libertarian, #Israel, #Twisters, #USA, #Obama, #flotilla, #Gaza, #912, #oilspill, #US, #jews, #BP, #antisemitism, #oil, #Muslim, #ireland, #UK, #Iran, #Palin, #Europe, #dnc, #cdnpoli, and #iranelection. They are unlikely to have anything to do with the Australian elections.
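A different way to check whether a #hashtag belongs to the Australian conversation, instead of counting Google results, is to look inside the data set itself and ask how often the #hashtag shows up in the same tweet as #ausvotes. I did not do this for the numbers above. The sketch below is just one possible approach, with a hypothetical function name and made-up sample data.

```python
from collections import Counter


def cooccurrence_share(tweet_tags, anchor="#ausvotes"):
    """For each hashtag, return the share of tweets using it that also use `anchor`."""
    anchor = anchor.lower()
    total = Counter()
    with_anchor = Counter()
    for tags in tweet_tags:
        unique = {tag.lower() for tag in tags}  # count each tag once per tweet
        has_anchor = anchor in unique
        for tag in unique:
            if tag == anchor:
                continue
            total[tag] += 1
            if has_anchor:
                with_anchor[tag] += 1
    return {tag: with_anchor[tag] / total[tag] for tag in total}


# Hypothetical data: one list of hashtags per tweet.
sample_tweets = [
    ["#ausvotes", "#qanda"],
    ["#tcot", "#teaparty"],
    ["#tcot", "#GOP"],
    ["#ausvotes", "#tcot"],
    ["#qanda", "#ausvotes"],
]
shares = cooccurrence_share(sample_tweets)
for tag in sorted(shares):
    print(tag, round(shares[tag], 2))
```

A low share would suggest the #hashtag mostly lives in another conversation, though where to draw the line is still a judgment call.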

If that’s agreed upon, then the top issue based on #hashtags looks like … the internet and its openness? (#nocleanfeed and #openinternet both make the list.) It doesn’t look like there was any large scale usage of #hashtags around issues. Instead, it appears that #hashtags were used to label tweets that discussed the election, specific candidates and specific parties. Issue-based discussion may have been secondary on Twitter.

And if that’s true, and going further with that idea, it could validate the choice by Labor and the Liberals to build their messaging largely around attacks on each other. People on Twitter are heavily engaged in discussing politics but not the issues. It may also justify the work of GetUp!, which strives to bring attention to specific issues in Australia.

If you want access to the Excel file with all the tweets, please comment or send me an e-mail. The file is about 24 MB so I didn’t upload it.

  • I think 140 characters is way too small to discuss or have a proper conversation about national issues; however, making silly comments or defending political parties can be done in 140 characters. That, I believe, is one of the reasons why people are using Twitter that way.

    Great research and info, Laura. It is nice to see someone putting in the time and effort to share research information.

    Valeri
  • Hi Laura,

    Nice work. I have shared your article on www.typeboard.com

    I am somehow not surprised that Australians are discussing the parties more than national issues. It's unfortunate, really.
  • Yes. There's not the conversation about nation-building which I would expect.

    Politics is at least as tribal as sport.

    What do you think?
  • I wouldn't be surprised if there were local patterns similar to the ones that I know exist for the AFL and NRL...
  • So for instance, there might be more #nocleanfeed in Victoria, because that is Stephen Conroy's seat?
  • As a random aside, I was going through my list of Australian vote related tweets to add locations. I haven't finished it (54,000 unique tweets take a while to process and locate), but I pulled out the ones I had processed so far that included #nocleanfeed (185 total tweets, 99 of which I had pegged as in Australia so far, 77 with cities, 12 total cities). They had the following results:

    City        State                          Country     Count of #nocleanfeed references
    Sydney      New South Wales                Australia   23
    Melbourne   Victoria                       Australia   18
    Blacktown   New South Wales                Australia   12
    Adelaide    South Australia                Australia   7
    Brisbane    Queensland                     Australia   6
    Geelong     Victoria                       Australia   4
    Canberra    Australian Capital Territory   Australia   2
    Hobart      Tasmania                       Australia   1
    Maroondah   Victoria                       Australia   1
    Newtown     New South Wales                Australia   1
    Perth       Western Australia              Australia   1
    Pyrmont     New South Wales                Australia   1


    I don't know about that guy's seat but that could give you an idea...
  • The National Broadband Network is getting a lot of coverage in old and new media, that's true.

    You say:

    "There are a couple of challenges when doing something like this… and be wary of anyone not spelling out their methodology because if you don’t know their methods, you can’t fairly evaluate their results and subsequent conclusions. Social media research is very much creative research that takes place in the moment. While you may not ever be able to get the exact same results, when you repeat a person’s methodology, deviations should be explainable."

    I'll say! Very true!

    (Especially the bit about creative research in the moment).