Twitter #hashtags and the Australian election
Can you learn about an election from its Twitter #hashtags? I’m curious to see whether #hashtag use on Twitter matches the major issues that the newspapers and news media covered. (I’m also interested in seeing which #hashtags people retweeted, but that will likely have to wait for another post.)
There are a couple of challenges when doing something like this, and you should be wary of anyone who does not spell out their methodology: if you don’t know their methods, you can’t fairly evaluate their results and subsequent conclusions. Social media research is very much creative research that takes place in the moment. While you may never be able to get the exact same results when you repeat a person’s methodology, any deviations should be explainable. Anyway, the challenges and assumptions with doing a #hashtag analysis of the Australian elections include:
- Incomplete tweet set for keywords: I only get what is available from Searchtastic.
- Incomplete tweet set based on keyword limitations: Even supposing I could get all the tweets related to a keyword, I can’t find every reference to the Australian elections, as there are too many possible words that could reference the elections and all the candidates running nationwide.
- Incomplete tweet set because of time: Some keywords were searched for earlier and some later. Not all keywords were looked at over the same period.
- Irrelevant tweets: Keywords like #greens may refer to the Green Party in the United States. Liberals may refer to American or British liberals. Unless every tweet is examined, irrelevant tweets will remain in the data set.
To offset some of these problems, a large data set was acquired. (A lot of people are tweeting about the elections; I see way more election tweets than footy tweets.) These tweets were acquired using Searchtastic and the following keyword set: #Abbott, #alp, #alp vote, #arbib, #aus2010, #AusLabor, #ausvotes, #ausvotes abbott, #ausvotes gillard, #boatphone, #GILLARDTINED, #greens, #laborfail, #masterchef abbott, #masterchef debate, #masterchef election, #masterchef gillard, #myliberal, #nocleanfeed, #NPC, #ozvote, #qanda, #spill, #tcot, #workchoices, @AustralianLabor, @LiberalAus, @TonyAbbottMHR, abbott math, Abbott-Gillard, asylum boat, australia vote, canberra election, debate abbott, debate gillard, debate winner, Following: @JuliaGillard, Following: @SenatorBobBrown, Following: @TonyAbbottMHR, Gillard, greens brown, Greens Canberra, greens election, Gruen Report, Gruen transfer, hockey liberals, immigration australia, Julia Gillard, JuliaGillard, Krudd Labor, KRudd Liberals, Labor darwin, labor leaks, Labor Liberals, Labor Tasmania, Liberals Tasmania, libs abbott, libs darwin, Libs Howard, marginal electorate, marginal seats, NSWLabor, perth vote, preferences vote, Rudd Liberals, Sex Party, stimulus liberals, sydney elections, Tony Abbott, tony boat phone, tonyabbottmhr, Truss LNP, Truss nationals, Warren Truss, Work Choices Australia, WorkChoices.
These keywords represent the major parties and candidates, some of the major issues, #hashtags that I saw on my Twitter feed, and different geographic areas around the country. Searches were run between July 19 and August 19. A total of 57,977 tweets were collected. The elections were called on July 17. To make sure that the collected tweets pertain to the elections, all tweets made before July 1 were removed from the data set. By July 1, everyone knew the elections had to be called and conversation regarding them had started. (Methodology: Sort tweets by date in Excel. Remove those tweets dated before July 1.) This takes the total tweets in the data set down to 21,071. That’s still a fairly large collection of tweets to work from. The next step is to remove duplicate tweets from the data set. (In Excel, Data -> Filter -> Advanced Filter -> Unique records only.) This brings the total tweets down to 18,462. That’s still a lot of tweets.
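For anyone who would rather script these two steps than do them in Excel, here is a minimal Python sketch of the date cutoff and the duplicate removal. It is not what I actually ran. The `(date, text)` tuple layout and the function name `filter_and_dedupe` are made up for illustration, the year 2010 is an assumption taken from the #aus2010 keyword, and treating two tweets as duplicates when their text matches exactly is only an approximation of what Excel's "Unique records only" does.

```python
from datetime import date


def filter_and_dedupe(tweets, cutoff):
    """Keep tweets dated on or after `cutoff`, then drop repeated texts.

    `tweets` is a list of (date, text) tuples; this layout is a
    hypothetical stand-in for the Searchtastic export.
    """
    seen_texts = set()
    kept = []
    # Sort by date first, mirroring the "sort by date" step in Excel.
    for tweet_date, text in sorted(tweets, key=lambda t: t[0]):
        if tweet_date < cutoff:
            continue  # tweet is older than the cutoff
        if text in seen_texts:
            continue  # exact duplicate of a tweet already kept
        seen_texts.add(text)
        kept.append((tweet_date, text))
    return kept


# Tiny made-up sample, not real data from the collection.
sample = [
    (date(2010, 6, 30), "old tweet #ausvotes"),
    (date(2010, 7, 20), "hello #ausvotes"),
    (date(2010, 7, 21), "hello #ausvotes"),
    (date(2010, 7, 1), "on the cutoff"),
]
cleaned = filter_and_dedupe(sample, cutoff=date(2010, 7, 1))
```

With the sample above, the June 30 tweet is dropped by the cutoff and the second "hello #ausvotes" is dropped as a duplicate, leaving two tweets.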
The next step is to extract #hashtags. To do this, I copied and pasted all the tweets to Notepad. I ran a find and replace for [space]# and replaced it with [tab]#. I copied and pasted these back to Excel and removed all cells that did not start with a #. After this was done, the data set was copied and pasted back to Notepad. Another find and replace was done; this time, [space] was replaced with [tab]. This was pasted back to Excel and all cells that did not start with # were deleted. When this was done, 33,857 total hashtags were found. Symbols like , ! . – ? were removed from those #hashtags. This was done to make sure that #labor. and #labor were treated the same for counting purposes.
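The Notepad and Excel round trip above amounts to splitting each tweet on whitespace, keeping the pieces that start with #, and stripping punctuation. A rough Python equivalent is sketched below. The function name `extract_hashtags` is made up, and the set of stripped characters is my reading of "symbols like , ! . – ?" (I have assumed both the plain hyphen and the en dash), so results may differ slightly from the manual process.

```python
# Characters to delete from a hashtag: , ! . ? plus hyphen and en dash.
# This exact set is an assumption based on the symbols listed above.
STRIP_CHARS = str.maketrans("", "", ",!.?-\u2013")


def extract_hashtags(text):
    """Return the hashtags in one tweet, with punctuation removed."""
    tags = []
    for token in text.split():  # split on any whitespace
        if not token.startswith("#"):
            continue  # same as deleting cells that do not start with #
        tag = token.translate(STRIP_CHARS)
        if len(tag) > 1:  # skip a bare "#" with nothing after it
            tags.append(tag)
    return tags


tags = extract_hashtags("Vote #labor. or #labor today! #ausvotes?")
# tags is ["#labor", "#labor", "#ausvotes"]
```

Note that this keeps the original capitalisation, so #ALP and #alp would still be counted separately unless you lowercase them first.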
A list of unique hashtags was then obtained, of which there were 2,678. The following table includes all #hashtags that appeared 25 or more times on the list:
Hashtag | Count |
#tcot | 5157 |
#ausvotes | 2449 |
#tlot | 1816 |
#p2 | 1761 |
#teaparty | 1704 |
#GOP | 1172 |
#ocra | 950 |
#Libertarian | 920 |
#News | 867 |
#ucot | 705 |
#politics | 687 |
#Israel | 536 |
#sgp | 536 |
#iamthemob | 367 |
#jcot | 308 |
#roft | 304 |
#qanda | 303 |
#cdnpoli | 293 |
#Twisters | 261 |
#USA | 209 |
#Obama | 194 |
#MyLiberal | 187 |
#debate | 183 |
#flotilla | 171 |
#Gaza | 155 |
#masterchef | 154 |
#energy | 149 |
#hhrs | 147 |
#green | 134 |
#rpn | 121 |
#912 | 111 |
#aus2010 | 111 |
#NPC | 108 |
#rootyq | 101 |
#AUSlabor | 95 |
#oilspill | 94 |
#topprog | 94 |
#nocleanfeed | 92 |
#fb | 88 |
#rightriot | 80 |
#laborfail | 79 |
#spill | 76 |
#US | 76 |
#jews | 75 |
#ALP | 74 |
#terrorism | 74 |
#Greens | 72 |
#cspj | 70 |
#tpp | 70 |
#BP | 69 |
#antisemitism | 68 |
#Gillard | 67 |
#p21 | 67 |
#openinternet | 66 |
#FF | 65 |
#FollowFriday | 65 |
#oil | 65 |
#Muslim | 64 |
#ireland | 63 |
#Abbott | 62 |
#UK | 61 |
#Australia | 58 |
#Iran | 58 |
#Palin | 57 |
#Blog | 55 |
#Europe | 54 |
#quote | 54 |
#dnc | 52 |
#fail | 51 |
#iranelection | 51 |
#jlot | 51 |
#hcr | 47 |
#MentalHealth | 47 |
#justsayin | 46 |
#rs | 46 |
#eco | 44 |
#boatphone | 43 |
#glennbeck | 43 |
#islam | 42 |
#zionism | 42 |
#NBN | 41 |
#palestine | 41 |
#jobs | 40 |
#environment | 39 |
#Autism | 36 |
#Hamas | 36 |
#Health | 36 |
#military | 36 |
#tiot | 36 |
#dem | 35 |
#AZ | 34 |
#Lebanon | 34 |
#vote2010 | 34 |
#dems | 33 |
#ausdebate | 32 |
#bonjovi | 32 |
#climate | 32 |
#judaism | 32 |
#lateline | 32 |
#MoFo | 32 |
#videos | 32 |
#foxnews | 31 |
#theview | 31 |
#730report | 30 |
#hasbara | 30 |
#nz | 30 |
#travel | 30 |
#tweetcongress | 30 |
#YWC | 30 |
#conservative | 29 |
#Gulf | 29 |
#patriottweets | 29 |
#AFRICA | 28 |
#ampat | 28 |
#dublin | 28 |
#property | 28 |
#WORLDCUP | 28 |
#beck | 27 |
#Free | 27 |
#IDF | 27 |
#ronpaul | 27 |
#acon | 26 |
#jewish | 26 |
#twibbon | 26 |
#bds | 25 |
#CNN | 25 |
#Obamacare | 25 |
#rush | 25 |
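A table like the one above can also be produced with a few lines of Python rather than by hand. This is only a sketch: the function name `hashtag_table` and the `min_count` parameter are made up for illustration, and the input is assumed to be a flat list of already cleaned hashtags.

```python
from collections import Counter


def hashtag_table(hashtags, min_count):
    """Count hashtags and return (hashtag, count) rows, largest first.

    Only hashtags seen at least `min_count` times are kept. Ties are
    broken alphabetically, ignoring case.
    """
    counts = Counter(hashtags)
    rows = [(tag, n) for tag, n in counts.items() if n >= min_count]
    rows.sort(key=lambda row: (-row[1], row[0].lower()))
    return rows


# Made-up input, just to show the shape of the result.
rows = hashtag_table(["#a", "#b", "#a", "#c", "#b", "#a"], min_count=2)
# rows is [("#a", 3), ("#b", 2)]
```

`len(Counter(hashtags))` would give the number of unique hashtags in the same pass.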
Looking at this list, there are some #hashtags that are likely not Australian or not uniquely Australian. This includes #tcot, which stands for Top Conservatives On Twitter. The term was amongst those searched for because it appeared in a few tweets that also included the #ausvotes #hashtag. A Google search for #tcot #ausvotes only brings up 8,120 results, which further supports the idea that this isn’t really an Australian election term. #tlot was not one of the keywords I searched for. If you put #tlot #ausvotes into a Google search, you get 244,000 results, which suggests heavy Australian usage. You could probably remove #teaparty, #GOP, #Libertarian, #Israel, #Twisters, #USA, #Obama, #flotilla, #Gaza, #912, #oilspill, #US, #jews, #BP, #antisemitism, #oil, #Muslim, #ireland, #UK, #Iran, #Palin, #Europe, #dnc, #cdnpoli, and #iranelection. They are unlikely to have anything to do with the Australian elections.
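If you do decide to drop those #hashtags, the filtering is easy to script. The sketch below uses the removal list from the paragraph above and a handful of rows from the table. The function name `drop_excluded` is made up, and matching without regard to capitalisation is my own assumption.

```python
# Hashtags suggested for removal in the paragraph above.
NON_AUSTRALIAN = {
    "#teaparty", "#GOP", "#Libertarian", "#Israel", "#Twisters", "#USA",
    "#Obama", "#flotilla", "#Gaza", "#912", "#oilspill", "#US", "#jews",
    "#BP", "#antisemitism", "#oil", "#Muslim", "#ireland", "#UK", "#Iran",
    "#Palin", "#Europe", "#dnc", "#cdnpoli", "#iranelection",
}


def drop_excluded(rows, excluded):
    """Remove (hashtag, count) rows whose hashtag is in `excluded`.

    Matching ignores case, which is an assumption on my part.
    """
    excluded_lower = {tag.lower() for tag in excluded}
    return [(tag, n) for tag, n in rows if tag.lower() not in excluded_lower]


# A few rows copied from the table above.
top_rows = [
    ("#tcot", 5157),
    ("#ausvotes", 2449),
    ("#teaparty", 1704),
    ("#GOP", 1172),
    ("#qanda", 303),
]
remaining = drop_excluded(top_rows, NON_AUSTRALIAN)
# remaining keeps #tcot, #ausvotes and #qanda
```

#tcot survives here only because it is not on the removal list; add it to the set if you agree it is not really an Australian election term.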
If that’s agreed upon, then the top issue based on #hashtags looks like … the internet and its openness? It doesn’t look like there was any large-scale usage of #hashtags around issues. Instead, it appears that #hashtags were used to label tweets that discussed the election, specific candidates and specific parties. Issue-based discussion may have been secondary on Twitter.
And if that’s true, and going further with that idea, it could validate the choice by Labor and the Liberals to build their messaging largely around attacks on each other. People on Twitter are heavily engaged in discussing politics but not the issues. It may also justify the work of GetUp!, which strives to bring attention to specific issues in Australia.
If you want access to the Excel file with all the tweets, please comment or send me an e-mail. The file is about 24 MB, so I didn’t upload it.