Archive for category Methodology analysis

Methodology: Draft/Free writing (part 2)

Posted on Wednesday, 18 August, 2010

This is an update of a draft of my methodology section. I finally got back to writing it after taking some time off from it and my literature review. This is very much a draft, intended for me to explore what types of online research exist as they pertain to social media. After the major methods are spelled out, the intent is to go further into population studies, the why and how of them. This should be followed up with the methodology for what I’m actually doing. That last part will ultimately be the shortest, as population studies aren’t that hard to do. The major methodology parts will actually be spelled out in the individual sections and will follow a format similar to ones that I’ve done for other posts here.

Methodology

When conducting social media research, there are ten general methods that can be used to gather and analyze data. These are:

1. Individual case studies for how a business uses social media and the web,
2. Search and traffic analytics analysis,
3. Sentiment analysis and reputation management,
4. Content analysis,
5. Usability studies,
6. Interaction and collaboration analysis,
7. Relationship analysis to try to determine how people interact and to identify key influencers,
8. Population studies,
9. Online target analysis of behavior and psychographics, and
10. Predictive analysis.

Each of these methods offers insights into various aspects of the web and its population. The type of analysis used is often specific to the purpose of the research and involves blended approaches from traditional analysis types, and different methods are often used in conjunction with each other. These methods often blend quantitative and qualitative analysis. Choosing the correct method of gathering and analyzing data can be one of the biggest hurdles to being able to measure ROI and understand how a community works.
This section will provide a brief summary of each type, explain how to conduct this type of research and give examples of research that used that methodology.

Online target analysis of behavior and psychographics

Online targeting of and marketing towards a specific audience because of their demographic characteristics is extremely common on the Internet. Psychographics is a related term: it also covers targeting a specific demographic group, but it includes the offline component as well.

Sutherland and Canwell (2004) define psychographics as a “market research and market segmentation technique used to measure lifestyles and to develop lifestyle classifications” (p. 247). Nicolas (2009) defines online behavioral analysis as a series of steps: collecting user data across several sites, organizing information about users based on the sites they visit and their behavior on those sites, inferring “demographics and interest data”, and classifying new users based on the collected data in order to deliver relevant ads and content based on their demographic profiles. Kinney, McDaniel, and DeGaris (2008) define psychographics as attitude towards something, such as a brand, or involvement with an organization.

Given the methodology involved, much of this type of research is action research in that it is done in a specific context and is based on internal models built to address specific situations.

An example of this type of research was done by Kinney, McDaniel, and DeGaris (2008) who investigated the demographic characteristics of NASCAR fans and their attitudes towards NASCAR, its sponsors and sponsor involvement with NASCAR. The research found that age, gender and education were all important variables in determining sponsor recall: Younger, more educated males had the best brand recall amongst NASCAR fans.

This type of research can be viewed as a subcomponent of a population study in that demographic information is sought about the population. In an online context, it often works in conjunction with search and traffic analytics analysis, content analysis, and interaction and collaboration analysis.

Predictive analysis

A search on 13 July 2010 on SPORTDiscus had three results for “predictive analysis.” A search on the same date on Scopus had 605 results, 275 of which were in engineering, 132 in computer science and 102 in medicine. Predictive analysis is probably one of the least used analysis methods, especially in social media and fandom.

What is predictive analysis? At its simplest, it is identifying a future event or events, monitoring select actions that precede the event and seeing if those actions can be used to predict the outcome of similar events in the future. If a predictive value is found, an organization can monitor behaviors to help make more informed decisions.
An example of this type of research is “Predicting the Future With Social Media” by Asur and Huberman (2010). Their goal was to determine if tweet volume and sentiment on Twitter prior to a movie being released could be used to predict how well a movie performs at the box office. Their methodology involved identifying movies whose wide release took place on a Friday, creating a list of keyword searches related to those movies, and using the Twitter API to collect all tweets and aggregate data that mention those keywords over a three-month time period. The authors then compared the tweet volume to box office performance. They concluded that social media “can be used to build a powerful model for predicting movie box-office revenue” (Asur & Huberman, 2010).
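The comparison step can be sketched in a few lines of Python. This is only a minimal illustration of checking whether a pre-event signal tracks an outcome, not the model Asur and Huberman actually built. The function names and all of the numbers are made up for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Hypothetical data: pre-release tweets per hour and opening weekend
# gross (in millions of dollars) for four imaginary movies.
tweets_per_hour = [120, 340, 560, 900]
opening_gross = [8.0, 21.5, 30.0, 55.0]

r = pearson(tweets_per_hour, opening_gross)
slope, intercept = fit_line(tweets_per_hour, opening_gross)

# Use the fitted line to estimate the gross for a new, unreleased movie.
estimate = slope * 700 + intercept
print(f"correlation={r:.2f}, estimated gross for 700 tweets/hour={estimate:.1f}")
```

A strong correlation on past events is what would justify monitoring the same signal for future ones. With only a handful of data points, as here, it would prove very little.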

This type of research can be used in conjunction with other methods. It can be used alongside a population study to see if certain actions will result in demographic changes.

References

Asur, S., & Huberman, B. A. (2010). Predicting the Future With Social Media. Social Computing Lab. Retrieved from http://www.hpl.hp.com/research/scl/papers/socialmedia/socialmedia.pdf

Kinney, L., McDaniel, S., & DeGaris, L. (2008). Demographic and psychographic variables predicting NASCAR sponsor brand recall. International Journal of Sports Marketing & Sponsorship, 9(3), 169-179. Retrieved from SPORTDiscus with Full Text database.

Nicolas, P. (2009, December 17). “Online audience behavior analysis and targeting.” Patrick Nicolas Official Home Page. Retrieved August 1, 2010, from http://www.pnexpert.com/Analytics.html

Sutherland, J., & Canwell, D. (2004). Key Concepts in Marketing. Palgrave Key Concepts. Hampshire, England: Palgrave Macmillan.

Methodology: Draft/Free writing (part 1)

Posted on Tuesday, 13 July, 2010

One of the ways I learn best is by talking through a problem. I compare and contrast things. I ask people their opinions. I read things to people. I get into in-depth conversations about practices. I ask questions. Right now, I’m in the process of writing my methodology. At this point, I feel like what I’m really doing is trying to outline the current practices in social media research, explaining how they are done and giving examples. Once that is done, I can justify using a population study as the analysis process to help answer my research question and then go into a bit more depth regarding that. Outlining the available methods seems important because the processes can be a bit different than traditional sociology methods. Or at least, it feels that way. This will be updated as I go along. How often that happens depends on my motivation to write.

Methodology

When conducting social media research, there are ten general methods that can be used to gather and analyze data.  These are:

1. Individual case studies for how a business uses social media and the web,
2. Search and traffic analytics analysis,
3. Sentiment analysis and reputation management,
4. Content analysis,
5. Usability studies,
6. Interaction and collaboration analysis,
7. Relationship analysis to try to determine how people interact and to identify key influencers,
8. Population studies,
9. Online target analysis of behavior and psychographics, and
10. Predictive analysis.

Each of these methods offers insights into various aspects of the web and its population. The type of analysis used is often specific to the purpose of the research and involves blended approaches from traditional analysis types, and different methods are often used in conjunction with each other. These methods often blend quantitative and qualitative analysis. Choosing the correct method of gathering and analyzing data can be one of the biggest hurdles to being able to measure ROI and understand how a community works.
This section will provide a brief summary of each type, explain how to conduct this type of research and give examples of research that used that methodology.

Predictive analysis

A search on 13 July 2010 on SPORTDiscus had three results for “predictive analysis.” A search on the same date on Scopus had 605 results, 275 of which were in engineering, 132 in computer science and 102 in medicine. Predictive analysis is probably one of the least used analysis methods, especially in social media and fandom.

What is predictive analysis? At its simplest, it is identifying a future event or events, monitoring select actions that precede the event and seeing if those actions can be used to predict the outcome of similar events in the future. If a predictive value is found, an organization can monitor behaviors to help make more informed decisions.
An example of this type of research is “Predicting the Future With Social Media” by Asur and Huberman (2010). Their goal was to determine if tweet volume and sentiment on Twitter prior to a movie being released could be used to predict how well a movie performs at the box office. Their methodology involved identifying movies whose wide release took place on a Friday, creating a list of keyword searches related to those movies, and using the Twitter API to collect all tweets and aggregate data that mention those keywords over a three-month time period. The authors then compared the tweet volume to box office performance. They concluded that social media “can be used to build a powerful model for predicting movie box-office revenue” (Asur & Huberman, 2010).

This type of research can be used in conjunction with other methods. It can be used alongside a population study to see if certain actions will result in demographic changes.

References

Asur, S., & Huberman, B. A. (2010). Predicting the Future With Social Media. Social Computing Lab. Retrieved from http://www.hpl.hp.com/research/scl/papers/socialmedia/socialmedia.pdf



Methodology for Measuring Monetary Value per Social Media Fan Interested in a Brand

Posted on Friday, 9 July, 2010

I wrote this in April. I thought it was posted here or on Fan History’s blog. After a bit of searching, nope, it isn’t. Therefore, I am posting it here. This was written at the request of a friend who works for a marketing company in Chicago to discuss the challenges of measuring ROI.


April 19, 2010

One of the most important metrics that people discuss is the value of a customer and the value of followers of brands on various social networks. What is the return on investment for generating buzz in social media? What is the per-individual value of having a fan on Facebook, of having a fan create their own video content and upload it to YouTube, or of someone belonging to or listing a brand as an interest on LiveJournal? Vitrue, a social media management company, has put the value of a Facebook fan at $3.60 per fan.1 Their methodology is suspect and their conclusions should not be read as universal across the many industries that utilize social media to promote their products and drive revenue.

Social media is composed of many networks, each catering to its own demographic and interest base. Each of these groups has its own behavioral patterns. Canadians and Brits have different usage patterns for Internet-based radio than their American counterparts. The buying power and education level of a Facebook user is different than that of a MySpace user, even if both groups are composed solely of Americans. Added to this mix, there has been a fair amount of research done that says brands themselves do not influence purchasing decisions as much as friends and family. 2

Given this reality, a standard industry-wide number is impossible to calculate. A smart company should independently develop a number for measuring the monetary potential of people interested in their brand. To do this, a company should first identify a specific network where they are aware of a community that is already interested in their product. Membership in this community can be expressed by listing the brand as an interest, as is the case for Facebook and LiveJournal, by belonging to a group dedicated to the brand on sites like Ning and Yahoo! Groups, or by uploading user generated content on sites such as YouTube. Remember: Each of these sites has a different demographic base, so you cannot arrive at a single metric across all networks unless you have the same uniform population using multiple networks.

Once you have identified the community you wish to find the individual value for, determine the demographic and geographic composition of the community on that particular network. Stick with information that is publicly available. In the case of LiveJournal, that data includes date of birth, geographic location, the type of account a user has (paid, plus, permanent). For Facebook, this information can be much deeper if you use the data provided to people interested in marketing to them. For a fan page that you do not run, you are limited to their name (for which you can attempt to determine gender) and the network that a person belongs to.

After you have this data, compare it to your known information regarding people who support your brand. If the demographic information does not match your internal information, there is something wrong with that community and you will never get meaningful data. For example, a United States based Christian bookstore may be gaming for autofollowers on Twitter and have a huge following of people from Iran, Saudi Arabia and Morocco. On the whole, that demographic is unlikely to convert into potential customers for the bookstore. Trying to go further to assess the value of one’s Twitter followers would thus not be useful. 3 If the data does match internal numbers or is a demographic base that you wish to explore, go on to the next step.

If the community demographics match with what you want to explore, create a survey for people who express an interest in your brand. It would be best to make the survey using a format that people inside the network would find easiest to use. For example, on LiveJournal, you may wish to use the service’s polling feature; on Facebook, you may wish to create a survey using FBML and make it a tab on your fan page or create a Facebook application. When conducting the survey, ask demographic and geographic questions that are represented by the data you pulled from public sources on that network. The survey should also ask what other networks the individual uses. It should also inquire as to how often the customer has made a purchase from your company, what type of purchase was made, who influenced the decision to go with your particular brand, where the people who influenced them were sharing their influence, and what percentage of the time those particular influences helped with a purchasing decision.

After the survey results are in, an analysis should be conducted to determine the monetary value of each respondent as a customer. For example, a person may be a fan of your Facebook fan page. If you are a baseball team, the person may be inclined to purchase tickets to attend games anyway. They may respond that they have bought $200 worth of tickets in total. They may also tell you that $20 of that went on tickets to games they would not have otherwise attended, bought as a result of coupons that you posted to the Facebook fan page wall. The fan may also have purchased tickets they would not have purchased otherwise because a friend on Facebook talked about how they loved the team but had never attended a game. That Facebook based conversation created a situation where the person spent another $40 on tickets. The value of this customer on Facebook thus is not $200 but $60: 30% of their total spending is a direct result of social media activity on Facebook.
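The arithmetic for a single respondent is simple enough to write down. Below is a minimal sketch in Python using the baseball figures from the example above. The function name and the labels on the two purchases are my own, and it assumes the $200 is the respondent's total spending, including the social media driven purchases.

```python
def social_media_value(total_spend, attributed_spend):
    """Return (dollars attributed to social media, share of total spend).

    attributed_spend maps a description of each social media influence
    to the dollars the respondent says they spent because of it.
    """
    attributed = sum(attributed_spend.values())
    share = attributed / total_spend if total_spend else 0.0
    return attributed, share

# One survey respondent, using the figures from the baseball example.
attributed, share = social_media_value(
    total_spend=200.0,
    attributed_spend={
        "coupons posted to the fan page wall": 20.0,
        "friend's conversation on Facebook": 40.0,
    },
)
print(f"${attributed:.2f} of $200.00 ({share:.0%})")  # $60.00 of $200.00 (30%)
```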

Once the value for each respondent has been determined, normalize this against the demographic composition of the whole population, as not every person expressing interest in a brand on a particular social network is likely to respond to a survey. Remember, different groups behave differently. Some groups are incentivized to respond for their own reasons in order to try to meet their own needs. If two thirds of the people expressing interest in your product on a particular network are women but women make up only a quarter of your survey respondents, the monetary value you calculate for your community is going to be wrong unless you correct for that gap.
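One common way to do this kind of normalization is post-stratification weighting: average the value within each demographic group, then combine the group averages using each group's share of the whole community rather than its share of the survey. The sketch below assumes that approach. The dollar values are invented, and only the two thirds and one quarter proportions come from the example above.

```python
from collections import defaultdict

def weighted_value_per_fan(responses, population_share):
    """Average value per fan, weighted by each group's share of the community.

    responses: list of (group, dollar value) pairs from the survey.
    population_share: group -> fraction of the whole community on the network.
    """
    by_group = defaultdict(list)
    for group, value in responses:
        by_group[group].append(value)
    missing = set(population_share) - set(by_group)
    if missing:
        raise ValueError(f"no survey responses for: {sorted(missing)}")
    return sum(
        share * (sum(by_group[group]) / len(by_group[group]))
        for group, share in population_share.items()
    )

# Hypothetical survey: women are a quarter of respondents but two thirds
# of the community expressing interest in the brand.
responses = [("female", 80.0), ("male", 40.0), ("male", 50.0), ("male", 30.0)]
population_share = {"female": 2 / 3, "male": 1 / 3}

unweighted = sum(value for _, value in responses) / len(responses)
weighted = weighted_value_per_fan(responses, population_share)
print(f"unweighted=${unweighted:.2f}, weighted=${weighted:.2f}")
```

With these made-up numbers the unweighted average understates the value per fan, because the group that spends more is under-represented in the survey.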

This method of determining the monetary value of individuals expressing interest in your brand has other benefits. It can help you set benchmarks for improving sales to key demographics and help you identify new demographic bases for your product that you may not have realized. It may also help you identify other networks where you can potentially attempt to drive buzz for your product where the community is less obvious. Most importantly, this method uses real numbers for real customers rather than relying on unproven hypotheticals.

1 http://mashable.com/2010/04/14/facebook-fan-valuation/
2 http://www.mediapost.com/publications/?fa=Articles.showArticle&art_aid=109574
3 If you find yourself in a situation like that, the best response is to re-evaluate your social media strategies. Something is clearly broken.

Types of social media research

Posted on Sunday, 4 July, 2010

This is more of a thinking out loud so I can delve into it later. One of the things I’ve been curious about is the type of social media research conducted by academics and marketers. The types that I can think of are:

1) Case studies for how a business uses social media and the web,
2) Content studies that look at social media research and website design of a small basket of companies,
3) Sentiment analysis and keyword mentions,
4) Relationship analysis to try to determine how people interact and to identify key influencers,
5) Population studies.

Any more that people can think of? Is there any method superior to another? Should social media researchers be valuing any over the other? Which methods are more popular and why?

Problematic gathering of Foursquare data

Posted on Monday, 28 June, 2010

One of the challenges of social media metrics is identifying what numbers matter, how to get those numbers, how to organize that data in a way that makes it quick to retrieve and useful, and how to make sure your data set is complete. The latter can be problematic as people can always create new communities, groups, hashtags, accounts, etc. If you don’t organize your data in a useful way and regularly update it, you can create such a huge mess as to make your data almost unusable.

This is a problem I’ve run into with my AFL and NRL Foursquare data. When I first started gathering this data in late April, I spent a day or two looking for all the venues. A slightly problematic issue arose in that not all venues had been created. I never went back to regularly check to see if these venues had been created. What this means is that my NRL data has several huge holes in it, because venues don’t exist or the venue that does exist was not the more popular of the ones created.

Another problem was the data was not collected in a way that I found entirely logical when I revisited it to try to create a table to show the average number of checkins at home and away matches for the NRL.  (I wanted to do the NRL first, before I tackled the AFL because I’m focusing on the AFL and trial and erroring on the NRL seemed wiser.)  I gathered the total checkins and unique visitors every Thursday through Monday night for all venues that played NRL and AFL games that I had identified.  In hindsight, this wasn’t the best way to go about this.  I should have identified everything by games being played as it would have made processing the data much, much easier and I wouldn’t have as much “garbage” data that I have to wade through.  I’ve spent most of the morning correcting this mistake by identifying games and venue locations so I can more easily and efficiently track total checkins for AFL games and some NRL games.  (Later, I can try to do this when the A-League, W-League and NBL start up.)
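For what it is worth, here is a rough sketch in Python of what organizing by game could look like. It assumes the totals recorded for a venue are cumulative, so the checkins for a game are the total after the round minus the total before it. The venue names, team names and numbers are placeholders, not my actual data.

```python
from datetime import date

# Hypothetical snapshots: (venue, date collected) -> cumulative total checkins.
snapshots = {
    ("Venue A", date(2010, 6, 24)): 1200,
    ("Venue A", date(2010, 6, 28)): 1350,
    ("Venue B", date(2010, 6, 24)): 400,  # Monday snapshot never collected
}

# One record per game, pointing at the snapshots taken before and after it.
games = [
    {"home": "Team 1", "away": "Team 2", "venue": "Venue A",
     "before": date(2010, 6, 24), "after": date(2010, 6, 28)},
    {"home": "Team 3", "away": "Team 4", "venue": "Venue B",
     "before": date(2010, 6, 24), "after": date(2010, 6, 28)},
]

def checkins_per_game(games, snapshots):
    """Attach the checkins gained during each game, or None if data is missing."""
    rows = []
    for game in games:
        before = snapshots.get((game["venue"], game["before"]))
        after = snapshots.get((game["venue"], game["after"]))
        gained = None if before is None or after is None else after - before
        rows.append({**game, "checkins": gained})
    return rows

rows = checkins_per_game(games, snapshots)
for row in rows:
    print(row["home"], "v", row["away"], "at", row["venue"], "->", row["checkins"])
```

Keying everything to a game makes the holes obvious: a game with a missing snapshot shows up as None instead of hiding in a pile of venue totals.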

Looking through existing Foursquare data though, I really don’t know if I will want to process it. I’d almost rather go through the last quarter of the season, where I know I have a complete data set, than try to piece together the data dating back to late April. I probably won’t do that but I’ll likely have to figure out what to do. It isn’t pretty and I’m really kicking myself for what could have been a lot of time wasted each day gathering data that I can’t use.

My Gowalla data faces similar issues. The big difference there is I’ve always known I haven’t had a complete data set, and it was through processing World Cup data on Gowalla that I realized the issues with my Foursquare data collection. I’d love to use Gowalla for AFL/NRL analysis but it isn’t going to happen. I mean, it really isn’t going to happen, especially as Foursquare is the bigger priority as it has greater penetration in Australia and I never had a complete venue list for Gowalla.

Issues with social media metrics: Twitter

Posted on Saturday, 5 June, 2010

I apologize in advance. This is a stream of consciousness rant about various Twitter metrics and analysis that take place in social media. It is a result of seeing one too many posts finding meaning in metrics that I see as meaningless. This may become a series where I explain problems with other metrics.

Ever see a social media person talk about measuring ROI on Twitter? The focus tends to be on two major metrics: Total follower count and total retweets. Whenever I see a consultant advocating the first as a meaningful metric, I want to tell the world that they should not hire that person. Brand awareness is important. That’s why companies pay for naming rights of stadiums, even if the ROI is questionable or hard to measure. Twitter just doesn’t work that way: Getting a follower does not translate into name recognition for your brand, website, interest or self. Okay, it might… but not if the goal is to get as many followers as possible. Here I am defining many as 1,000+. If everyone who follows you has 1,000+ people they follow, the chances of you getting your message across to them where they will see it are slim to none. (1) A system that encourages you to go out and seek likely follow backs generally relies on getting them from those people who follow 1,000+ accounts. It becomes a great big circle of following in order to build up followers. Big deal: You have 10,000 followers, who all have 10,000 followers who never read each other’s tweets. No brand recognition there. No personal connection. No traffic generation.

Instead of total number of followers, the metric you want to measure is the average number of accounts that your followers follow. Your ideal is a number between 50 and 250. This generally indicates that the person has a commitment to check Twitter regularly to see what the people they follow are doing and has people who will check to see what they are updating. It means that if you post a tweet, chances are these people will have that tweet visible on their timeline for at least ten minutes, up to possibly an hour. Fewer than 50 follows indicates a person probably isn’t checking Twitter regularly. More than 250 means much less visibility for you if they are reading their entire timeline. It also means that there is the potential that the person running the account is using a tool to manage their timeline so that they may never read you. If you want to get read on Twitter (the rationale for getting more followers), you’ve got to target those who will read you to begin with.
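If you can get hold of the number of accounts each of your followers follows, the calculation itself is short. Here is a minimal sketch in Python. The counts are invented, and how you collect them is left out entirely. Because a handful of mass-followers can drag the average way up, the sketch also reports what share of your followers sit inside the 50 to 250 range.

```python
def follow_profile(following_counts, low=50, high=250):
    """Return (average accounts followed, share of followers within low..high)."""
    n = len(following_counts)
    average = sum(following_counts) / n
    in_range = sum(low <= count <= high for count in following_counts)
    return average, in_range / n

# Hypothetical: how many accounts each of six followers follows.
counts = [40, 120, 180, 230, 1500, 10000]

average, share_in_range = follow_profile(counts)
print(f"average follows: {average:.0f}, share in ideal range: {share_in_range:.0%}")
```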

Instead of total followers, if you’re a bricks and mortar business, you want a metric of how many of your followers live in areas where your market is. This, like the average total number of follows your followers have, is not an easy number to get. If you’re a minor league team in the United States, your market is largely going to be people who live within an hour or two of your home grounds. If your team has a relationship with a team up or down the ladder, your market may extend to that area. Your market may also extend to places where players from your team originate. These people are likely to purchase tickets to your game, attend games on the road, listen to or watch games over streaming audio or video, or buy merchandise based on your team. Identify those locations or categories of followers and count them. Ignore those followers who don’t fit. Count the person who lives in your town: Do not count the fly fishing business from Canada, or the follower from Brazil who never mentions your sport and only tweets about Justin Bieber. The second two, unless you have evidence to the contrary, are not going to convert into any sort of sales or provide a relationship that can further your own goals. There is nothing wrong with having those followers (2) even if they provide you with nothing back in return. Just don’t try to get them by following large numbers of people. What’s the easiest way to count people in your market? Follow them and only them back. If you just want to follow a few people, add people in your market to a list. This makes the number really, really easy to keep track of as you just have to keep track of new followers that Twitter e-mails you about. If people didn’t make it on your list the first time, you can add them to your list or to your follows when they retweet you or @ reply you.
Total people following you in your market counts 100 times more than the person not in your market who you likely won’t convert into a potential sale, job or viewer.
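If you already have a list of followers with the location each one gives on their profile, counting the ones in your market can be sketched like this in Python. The follower names, locations and market list are all made up, and real profile locations are much messier than an exact match can handle.

```python
def followers_in_market(followers, market_locations):
    """Return the followers whose self-reported location is in the market."""
    market = {location.casefold() for location in market_locations}
    return [
        follower for follower in followers
        if follower["location"].strip().casefold() in market
    ]

# Hypothetical market for a minor league team: the home town, a nearby city
# and the town of an affiliated team.
market_locations = ["Hometown", "Nearby City", "Affiliate Town"]

followers = [
    {"name": "local_fan", "location": "Hometown"},
    {"name": "fly_fishing_shop", "location": "Canada"},
    {"name": "pop_music_fan", "location": "Brazil"},
    {"name": "affiliate_fan", "location": " affiliate town "},
]

in_market = followers_in_market(followers, market_locations)
print(f"{len(in_market)} of {len(followers)} followers are in the market")
```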

Why do people use total followers rather than average number of follows their followers have or the total number of potential people in their market?  They generally do that because the first metric is easy to get a number for.  The second and third ones are pretty hard to get at this time.  Just because a metric is easy to get doesn’t mean it is the right one to use.  In this case, try to spend the time to get the more meaningful number.  That way you know your message is actually getting out to the people who matter.

While I’m on this topic of Twitter follower counts as a metric, here is another one to consider for a special subset of people who mention their social media prowess, with all the details about how to do that available on their website.  When I say their website, I mean that thing they cross promote on Twitter and LinkedIn and on other social networks.  For them, there exists a special metric: Ratio of Twitter followers to the total unique visitors Compete says that they have to their site.  I chose Compete because it actually gives you a number and heavy social media users are more likely to have Compete installed on their browser.  (3)  Given that, the measurement for traffic to their site should actually be higher than it actually is… but I digress.  Twitter followers/Website traffic.  Important metric.  Ideally, the number should be less than one.  If it is greater than one, it says that the person running the Twitter account does not effectively promote what matters: Themselves.  People don’t want to read what that person has to say in any depth.  Twitter followers aren’t clicking on the person’s links.  Followers aren’t sharing links to that person’s content with their own Twitter followers or Facebook friends or linking to it on their blog.  If a person really matters, with a few notable exceptions (4), people will want to follow their links.   The measure of 34,000 Twitter followers/6,000 monthly visitors thus is incredibly meaningful.  It says that the person can get followers but they can’t convert that into traffic: People aren’t interested in more meaningful dialog with the account owner.
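As a quick sketch in Python, using the 34,000 followers and 6,000 monthly visitors from the example above (the function name is mine):

```python
def follower_to_visitor_ratio(followers, monthly_unique_visitors):
    """Twitter followers divided by monthly unique visitors to the site."""
    if monthly_unique_visitors <= 0:
        raise ValueError("need a positive visitor count to compute the ratio")
    return followers / monthly_unique_visitors

ratio = follower_to_visitor_ratio(34_000, 6_000)
verdict = "followers are not converting into traffic" if ratio > 1 else "fine"
print(f"{ratio:.2f}: {verdict}")  # 5.67: followers are not converting into traffic
```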

Moving on to that second Twitter metric that people like to talk up: ReTweets. There is nothing wrong with this number, and it can be useful in terms of determining how entertaining or useful people find the content you’re putting out on Twitter. (5) It is just one of those metrics that people treat as if it exists in isolation and that’s where it becomes less meaningful. First, before even beginning to look at the number, ask yourself an important question: Why do you want your content ReTweeted? If your goal is to use Twitter ReTweets to convert into sales or page views to your site, then 5,000 ReTweets which result in zero sales or zero visits means that you have failed 5,000 times. Who cares if you get 5,000 ReTweets if it doesn’t help you meet your goal? If your goal is to use ReTweets to start a conversation and no one @ replies to you or goes to your blog to have a conversation with you, then your ReTweet campaign wasn’t successful. ReTweet metrics are only useful as they pertain to helping achieve other measurable goals. A ReTweet totals metric, absent another metric, is a number about ego boosts and helping with your own self worth on Twitter.

Other metrics people like to mention for Twitter include total mentions. The more mentions you get, the more times the tag you created gets used, the better it is for your brand in gaining recognition. This is great in theory as a measurement tool. What it ignores is sentiment analysis. If you can get users to tag 50,000 of their tweets, that could be great brand recognition. At the same time, it could mean someone else hijacked your message for shits and giggles. (6) It could also have been hijacked by people who have complaints about your service or product. 50,000 tweets do you very little good if 45,000 are from an angry mob. If you’re doing a campaign involving ReTweets or mentioning a tag, some people can and do take that too far. Your audience of tag users may flood their Twitter feed with your message so often that they piss off people into unfollowing them. That hurts your reach. Another brand could hijack your tag to promote their related product. Sentiment and audience reaction ultimately matter more than sheer numbers. If you can count the total of positive, negative, neutral and hijacked posts using a tag, that number will be more helpful than total mentions.
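Once someone has gone through the tagged tweets and labelled each one, the breakdown is easy to tally. Below is a minimal sketch in Python. It assumes the labelling has already been done by hand, and the four labels and the sample tweets are only for illustration.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral", "hijacked")

def tally_tag_mentions(labelled_tweets):
    """Return label -> (count, share of all tagged tweets)."""
    counts = Counter(label for _, label in labelled_tweets)
    total = sum(counts.values())
    return {
        label: (counts[label], counts[label] / total if total else 0.0)
        for label in LABELS
    }

# Hypothetical hand-labelled tweets that used a campaign tag.
labelled_tweets = [
    ("Loved the game last night #tag", "positive"),
    ("Worst customer service ever #tag", "negative"),
    ("Still waiting on my refund #tag", "negative"),
    ("Game starts at 7 #tag", "neutral"),
    ("Buy our unrelated product #tag", "hijacked"),
]

breakdown = tally_tag_mentions(labelled_tweets)
for label, (count, share) in breakdown.items():
    print(f"{label}: {count} ({share:.0%})")
```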

The major Twitter metrics have serious flaws. They don’t tell the whole story and often provide a misleading picture. The only way for these metrics to work is for them to be broken down into smaller, harder to measure numbers that show how the metric helps you measure your market. And please encourage people to stop promoting ineffective measures. Just because it is an easy number to get doesn’t make it worth using: People shouldn’t be paying for that. Flawed data is flawed and the industry as a whole is hurt when we constantly allow bad practices to continue.


1. I follow 300 people. I’m in Australia. I can’t keep up with all my American friends when they are busy tweeting while I’m asleep and I don’t have nearly that many followers.

2. Why? Because everyone thinks that number matters and it doesn’t hurt to have followers who don’t do anything for you.

3. Alexa doesn’t give real traffic volume.  It just gives a ranking compared to other sites.  As we’re comparing Twitter followers to visitors to a site, using Alexa for comparison purposes doesn’t work.

4. Stuff My Dad Says, the BP satire updates and celebrities are all examples where this doesn’t work. Bricks and mortar businesses may also be an exception, as you can put some content on aggregate services like Facebook and Foursquare. Bricks and mortar stores can use discounts on those sites to measure the effectiveness of their campaign. Websites and content providers? Not so much. If you’re reading this, you’re probably not an exception.

5.  In that regard, it is actually a bit more useful than total followers are.  There are fewer bot/indiscriminate ReTweeters/spammers than there are bot/indiscriminate followers/spammers.

6. Yes, this does happen. And it doesn’t always include the usual suspects.
