Whose Comments Are We Interested In?

Up to this point, we have concerned ourselves with what data to analyze while ensuring that what we selected is germane to our topic. In this post, we explore how important it is to determine whose social media comments we are interested in. A few examples are as follows:

  • If we are interested in getting objective feedback on a product from a specific company, we might want to make sure that we can identify or exclude this company’s employees from the pool of content under analysis.
  • Similarly, we need to ask: Are we interested in comments from the general public, or are we interested in the comments of C-level employees (that is, chief marketing officers or chief information officers)?
  • Also, are we interested only in people who have a positive bias toward a company or those with a strong negative bias?

Looking for the Right Subset of People

At the beginning of a social analytics project, analysts spend a fair amount of time thinking about the ultimate goals of the project and the results that we expect to get at the conclusion of the project. This upfront analysis will go a long way in determining the appropriate target segment of the analysis. During the definition of a typical social media data analysis project, requesters will (or should) explicitly point out the “who” (whose opinion are they interested in?) or will give the researcher or the model builder sufficient hints or guidance. Various attributes can be used to segment or target the audience that we’re interested in. Some of them are described in the following sections. 


Do we want the opinions of employees or nonemployees? For example, if a company launches a new product or service and wants to see how the marketplace is reacting to that product or service in social media, it might prefer to exclude the comments of its own employees. In other situations, we might focus exclusively on the employee population if the intent is to learn how they are responding to a new product, service, or strategy. In a project that we worked on, IBM was interested in learning about the marketplace reaction to a brand-new product type. The marketing team specifically asked us to exclude the comments and sentiments of IBMers so that we could understand sentiment from “neutral” people and not bias the results.


Are we looking for comments from people with a positive bias or negative bias?

For example, if the object of social media analysis is to detect customer support issues, it makes sense to focus only on posts with a clear negative bias. You might argue that highlighting positive customer experiences is just as important and probably needs to be considered as well. Another common use case involves trying to compare the sentiment about a variety of products that a company is providing to the marketplace. In this situation, we may consider opinions from all ranges of demographics and keep score about the number of positive, negative, or neutral comments. Sometimes, the purpose of a project is merely to find how many people or comments mention the company’s product versus a competitor’s product. In this case, we may (initially) ignore sentiment and consider all comments without exclusions.

A few years ago, there was a civil movement called Occupy Wall Street in the United States. Numerous people congregated around specific commercial buildings to express their silent protests against what they believed to be unfair practices. During this time, as a validation of some of our analytics capabilities, we built an experimental social listening model to detect whether there was any impact on an IBM location where some key customer meetings were being conducted. In this case, we built a model that focused on snippets of information that may have negative sentiment about IBM and then specifically looked for any mentions of protests or civil actions. In many cases, sentiment is a result of an analysis phase. However, in some instances, the scope and nature of the project determine whether we should include comments only from people who have either a favorable view or an unfavorable view of our topic. In cases like these, we are able to take this information into account in the very initial phase of the project and focus only on a specific subset of people.

Location or Geography

Do we want to focus on comments from people who live in a specific location?

One of the projects that we were involved in dealt with issues around water in South Africa. In this particular project, we were clearly interested in comments from people in South Africa about the variety of issues and questions around the current and future needs and use of clean and healthy water. Sometimes we may be interested in comments from all over the world, but valuable insights can emerge when we classify the analytics by region.


Is the language of the content important to us?
Some projects require us to understand what is specifically being said about a company’s product or service in a particular local language. For example, if a company wants to do some market research around the market’s appetite for a machine translation tool in Spanish-speaking countries, it will be interested in content contributed by individuals in the Spanish language.


Is the age of content author important to the project at hand?

There is a lot of discussion in popular media about the work habits of Generation Xers. Those in Generation X (or Gen X) were born after the Western Post–World War II baby boom. As a point of reference, most consider those with birth dates ranging from the early 1960s to the early 1980s as being part of this demographic. If a company’s Human Resources department wanted to study the experience of its newly hired Gen Xers, we would have to determine a way to segment the population based on age.


Are we specifically interested in comments of men or women?
Gender also becomes an important attribute upon which we may segment audience for a particular project. If an organization is creating training and educational materials to encourage more women to pursue higher studies in science- and mathematics-related disciplines, it may choose to focus exclusively on comments and feedback from women. Similarly, if a health-care company is undertaking research about male-pattern baldness, it would be served well by segmenting its audience to include only men.

In one case, we were asked to evaluate the comments that were made in social media during the introduction of a new movie trailer. Our client was interested not only in the reaction to the trailer, and by association the movie itself, but also if certain themes resonated with either males, females, or both. Again, the goal was to determine not only likeability of the movie, but also keys in how to market it.

Profession/Expertise

Do we need opinions from anybody in general, or do we need opinions from people who are working in a specific profession (such as the IT profession) in a specific industry (such as automotive)?

For example, if IBM is interested in learning about the reaction to the cognitive computing capabilities of IBM Watson in the area of health care, it is probably interested in the opinions of corporate users as opposed to home users.

Eminence or Popularity

Are we interested in opinions only from people of certain standing in the domain of the topic area?

A major aspect of a social media campaign for companies involves identifying who might be an “influencer” in a particular topic area or industry. For performing this type of analysis, we tend to spend a lot of time in developing rules to ensure we are able to narrow the solution space to identify a small subset of individuals that a company should target its marketing messages to.


When dealing with social media analysis within a company’s intranet, are we interested in segmenting based on a specific job role?

For example, we are working on a project that computes a social scorecard for employees based on their participation in social media. There are some roles in which the job demands a lot of collaboration in social media, and then there are some people who might be working on highly specialized or highly sensitive projects in which they may not be allowed to share information in social media. Here, the type of role is very important in interpreting scores.

Specific People or Groups

Are we really interested in narrowing down our analysis to comments about or comments from a specific individual or a specific set of individuals? A couple of years ago, we were asked to build an application to capture and display sentiment in near real time about tennis players participating in the US Open. In this case, we used names of players, their nicknames, and a variety of other aliases to ensure we were targeting the right segment. In another example, we were asked to identify how people in social media were reacting to a Lance Armstrong interview with Oprah Winfrey.
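The alias approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the application we built: the player names, nicknames, and handles below are made up, and the function names are our own for this example.

```python
import re

# Hypothetical alias lists; the names, nicknames, and handles are invented.
ALIASES = {
    "Player A": ["Player A", "The Ace", "@player_a"],
    "Player B": ["Player B", "PB", "@playerb_official"],
}

def build_patterns(aliases):
    """Compile one case-insensitive pattern per person from that person's aliases."""
    patterns = {}
    for person, names in aliases.items():
        # Longest aliases first so a longer alias is preferred over a shorter one.
        ordered = sorted(names, key=len, reverse=True)
        alternatives = "|".join(re.escape(name) for name in ordered)
        # Lookarounds instead of \b so aliases starting with "@" still match
        # and short aliases do not match inside longer words.
        patterns[person] = re.compile(
            r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE
        )
    return patterns

def people_mentioned(text, patterns):
    """Return the people whose name or alias appears in the text."""
    return [person for person, pattern in patterns.items() if pattern.search(text)]

PATTERNS = build_patterns(ALIASES)
print(people_mentioned("What a serve from @player_a tonight", PATTERNS))
```

A post that matches no alias is simply left out of the segment. In practice the alias lists need regular review, since short nicknames can collide with ordinary words.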

Do We Really Want ALL the Comments?

Perhaps inappropriate is too strong of a word, but in some cases you might want to exclude the comments of your company’s employees. We tend to look at ourselves as one of the best customers of our products and services, but sometimes IBMers are also among our most vocal critics. If we are looking to understand the true concerns or thoughts of our external customers and clients, we may want to exclude the subset of IBMers from the conversation. This is an example of the employment attribute that we discussed previously. Again, the purpose isn’t to exclude because these comments aren’t valuable, but in the spirit of openness and true sentiment or feelings, it may be useful to separate the comments.

In one example, we were asked to look at the social media activity around a new product launch. The client’s concern was that while there was a tremendous amount of money and time being invested in the various marketing campaigns, the sales hadn’t picked up as much as had been anticipated. A quick analysis of the discussion around the topic showed the level of activity over a four-week period (see Figure 3.1).

This graph shows the number of mentions of the particular product over time. It’s rather clear from this simple graphic that in the beginning, there was quite a bit of hype or discussion around this product launch, but over a short period of time, the discussion continued to decline almost to zero mentions.

What was even more disturbing about this analysis was who was having the conversations. We quickly looked at the top contributors to this thread of conversation and turned up the list shown in Figure 3.2.

A manual lookup of the top 10 users in this conversation revealed that at least 9 of them were employees of the company and represented nearly half the conversation (47%).
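The lookup above was done by hand, but the same check is easy to sketch in code once a list of known employee accounts is available. The following is a minimal sketch under assumptions of our own: each post is a dictionary with an "author" field, and the set of employee handles is supplied by the analyst. Neither reflects a specific tool we used.

```python
from collections import Counter

def employee_share(posts, employee_handles, top_n=10):
    """Return the top-N authors and the fraction of all posts that were
    written by top-N authors who are known employees."""
    counts = Counter(post["author"] for post in posts)
    top = counts.most_common(top_n)
    employee_posts = sum(n for author, n in top if author in employee_handles)
    total = sum(counts.values())
    share = employee_posts / total if total else 0.0
    return top, share

# Made-up sample data: two employees and one member of the public.
sample_posts = (
    [{"author": "emp_1"}] * 3 + [{"author": "emp_2"}] * 2 + [{"author": "fan_1"}] * 5
)
top_authors, share = employee_share(sample_posts, {"emp_1", "emp_2"})
print(top_authors, f"{share:.0%}")
```

A high share among the top contributors, as in our example, suggests the conversation is being carried by insiders rather than by the public.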

The conclusion we drew was that in the various social media and news venues, the employees were chatting about the new release, but given the slope of the curve in Figure 3.1, that conversation didn’t sustain itself. After the employees stopped talking, there was virtually no conversation. Clearly, a new marketing plan was needed since what was being said wasn’t being repeated, commented on, or perhaps even resonating with the public.

Are They Happy or Unhappy?

I’ll never forget the time I [Matt] was traveling to Las Vegas to speak at a trade show. It was a long flight, but when we landed and the plane was taxiing to the gate, I simply tweeted “Viva Las Vegas” and was almost instantly greeted with a return tweet for a hotel/casino special. Someone was actually watching for conversation about the city, not just me, to send a special offer.

Watching or monitoring social media for customer issues is still a growing trend. It provides the ability to respond to issues in a timely fashion and also opens up additional business opportunities.

Consumers are using Twitter to either ask questions about product- and service-related issues or to air complaints with increasing regularity. A study by Sprout Social found that social media messages eliciting a direct response from companies had risen by 178% from 2012 to 2013 [2]. To stay competitive, companies are choosing to watch for negative terms or concepts being used around a brand and head off a potential customer satisfaction problem later.

By listening to customer feedback in Twitter, companies like JetBlue have been able to build their reputation as responsive customer service organizations. Think about this from the consumers’ perspective. Airline delays can be one of the most common causes of customer frustration. Not only do these delays happen often, but those being delayed or inconvenienced can be pretty vocal about their feelings, especially when there is nothing to do but sit in an airline terminal with their smart phones.

Acknowledging this fact, @JetBlue ensures the company is responsive to its customers because it understands the importance of continued customer loyalty. JetBlue not only engages with happy customers but also responds to and helps frustrated customers as quickly as possible.

According to an article in AdWeek [3], due to a downpouring of rain in the Northeast that grounded most of JetBlue’s planes, the company was facing a public relations storm that seemed unlikely to go away anytime soon. On this particular occasion, passengers were trapped in their planes (on the tarmac) in New York City for hours—going nowhere and growing more annoyed by the minute. In many cases, passenger delays stretched into days while over 1,000 flights were ultimately canceled.

Needless to say, customer concerns and outcries ran rampant. However, through social media channels, then-CEO David Neeleman reached out to travelers of JetBlue to personally apologize for the issues and presented the company’s plans to improve service. The use of social media outlets to enable an open atmosphere of communication, coupled with the company’s willingness to admit its mistakes publicly, went a long way toward turning a bad situation into a good one.

The lesson?

Listening to the right content (in some cases, customer dissatisfaction) can provide an added vehicle to achieving customer loyalty and goodwill. JetBlue leveraged YouTube (a popular video-sharing site) to explain the service failure and describe how it planned to improve its operations as a part of its effort to control the situation. Again, it did this by posting an apology by founder and then-CEO David Neeleman shortly after the trouble began. As a result, the company built a relationship with its customers.

This use of a social media source coupled with JetBlue’s complete openness and willingness to take responsibility helped to push it over the media reports and resume its standing as a consumer favorite. What’s important is that despite the negative news coverage and complaints by consumer advocacy groups, the airline was able to keep its place atop the J.D. Power North America Airline Satisfaction Study for low-cost carriers going on 11 years in a row [4]!

So when we think about who we want to listen to, the answer, of course, is everybody. But by segmenting the comments into those with positive sentiments and those with negative sentiments, we can quickly respond to those urgent customer issues.
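As a rough illustration of watching for negative terms around a brand, the sketch below splits incoming posts into those that probably need a quick response and everything else. It is simple keyword spotting, not real sentiment analysis, and the watch list is a hypothetical one we made up for an airline example.

```python
import re

# Hypothetical watch list of negative terms; a real list would be tuned per brand.
NEGATIVE_TERMS = {"delayed", "canceled", "stuck", "stranded", "refund", "worst"}

def needs_attention(text, terms=NEGATIVE_TERMS):
    """True if the post contains at least one term from the watch list."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return not words.isdisjoint(terms)

def triage(posts):
    """Split posts into (urgent, routine) using the watch list."""
    urgent = [p for p in posts if needs_attention(p)]
    routine = [p for p in posts if not needs_attention(p)]
    return urgent, routine

urgent, routine = triage([
    "Stuck on the tarmac for three hours",
    "Great crew on my flight today",
])
print(len(urgent), "urgent;", len(routine), "routine")
```

A keyword list like this will miss sarcasm and misspellings, so it is best treated as a first filter that routes posts to a person rather than as a final judgment.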

Location and Language

There are times when understanding the mood or the thoughts of a particular region of the world is of main importance. For example, if we are interested in understanding the social opinions or concerns of youths in India, monitoring data from the United States isn’t all that practical. Just to be complete in this thought, however, while we understand that there may be some spillover discussion in US-based traffic about conditions in India, the likelihood of finding any significant content is probably not worth the effort of having to discover it in a vast sea of other (unrelated) data. Obviously, this is a decision that needs to be made by each data scientist or organization; our intent is simply to point out where there may be value in looking only at a particular region in the world.

As an example, consider the diagram shown in Figure 3.3; it shows social media mentions for a particular bank we were working on an analysis for. The bank had recently made some announcements and was interested to see if there was an increase or decrease in social media traffic as (perhaps) a result of the media attention. Figure 3.3 shows a summary of the top 10 languages for all of the media mentions we were able to collect over the previous two days.

What we were able to see was a large amount of traffic coming not from English (US) speaking individuals, but from Turkish social media participants. Not only that, but it appeared that Portuguese and Spanish numbers were almost equally as high. What was more interesting was that the announcements were made in the United States.

One of the interesting facts to gather would obviously be the location of the individuals making the comments. In some cases, this information is easy to retrieve—for example, through the use of GPS technology on mobile devices. In the case of Twitter, the use of geolocation can allow someone to find tweets that have been sent from a specific location. This could be a country, a city, or multiple regions around the world. When a Twitter user opts in to allow location-based services on his or her Twitter account, Twitter uses geotagging to categorize each tweet by location and makes that information available to subscribers of the data. In theory, this would give users of that data the ability to track tweets sent from a specific city or country. Unfortunately, the statistics on the use of this feature aren’t promising (yet), with only about 10% of the total population enabling the feature [5]. Lacking the exact geolocation, we could make the assumption that those posting in Turkish, for example, were originating their tweets from Turkey.

It may not be a perfect one-to-one match, but lacking any other information, it’s the best we could do. In this case, the bank in question had made an announcement (in the US press) about some branch closings in Europe. From the backlash we were able to mine from social media sources, it appears that those most widely affected customers were located in Spanish-speaking countries as well as Turkey. While we don’t know exactly how the bank handled this situation (our job was simply to discover any potential issues), we do know it immediately focused customer relations on branches and banking in those regions in an effort to minimize any fallout from its announcements.
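The fallback we describe, using the exact location when it exists and otherwise guessing from the language of the post, can be sketched as follows. The field names ("geo_country", "lang") and the language-to-region table are assumptions made for this example, not the schema of any particular data feed.

```python
from collections import Counter

# Hypothetical, deliberately coarse mapping from language code to likely region.
LANGUAGE_REGION_GUESS = {
    "tr": "Turkey",
    "es": "Spanish-speaking countries",
    "pt": "Portuguese-speaking countries",
    "en": "English-speaking countries",
}

def region_for(post):
    """Prefer an explicit geotag; otherwise guess the region from the language."""
    if post.get("geo_country"):
        return post["geo_country"]
    return LANGUAGE_REGION_GUESS.get(post.get("lang"), "unknown")

def mentions_by_region(posts):
    """Count mentions per (actual or guessed) region."""
    return Counter(region_for(post) for post in posts)

# Made-up sample posts.
sample = [
    {"lang": "tr"},
    {"lang": "tr", "geo_country": "Germany"},  # geotag wins over the language guess
    {"lang": "es"},
    {"lang": "xx"},
]
print(mentions_by_region(sample).most_common())
```

Keeping geotagged and guessed locations distinguishable in the output is worth considering, since the guessed ones carry much more uncertainty.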

Age and Gender

Understanding the demographics of just who is using social media to communicate is an important step in being able to understand what is being said about a company or brand.

Some of the current data provided by the Pew Research Center [6] around social media can give us a better idea of who is generating all of the traffic (and who is listening). Let’s not make a mistake here: according to this work, approximately 74% of Internet users are engaged in some form of social media (that’s over 2.2 billion individuals). While we’ve tried to summarize some of the more simple statistics in Table 3.1 [7], some numbers should stand out:
  • In the 18–29-year-old bracket, there is 89% usage.
  • The 30–49-year-old bracket sits at 82%.
  • In the 50–64-year-old bracket, 65% are active on social media.
  • In the 65-plus bracket, 49% are using social media.

Time spent online using social media shows [8]:
  • The United States at 16 minutes of every hour
  • Australia at 14 minutes of every hour
  • The United Kingdom at 13 minutes of every hour
And while we’re at it, remember that 71% of users’ social media access comes from a mobile device [9], and women tend to dominate most of the social media platforms [10].

Ultimately, we would like to include some of this demographics information in an analysis, but the knowledge of this information is just as useful. If, for example, we were wondering what the issues were surrounding health care (or other issues) post retirement in social media, we would be hard-pressed to find much discussion by that demographic in places such as Instagram or Twitter (since the number of participants in the 65 and older demographic seems to be quite low). That’s not to say the chatter wouldn’t be out there; there could be significant discussion by the children of those users in the 30–39-year-old demographic, but again, it may come with a different perspective. Similarly, based on this table, if we were interested in the content from females, Pinterest might be a good venue to consider.

Eminence, Prestige, or Popularity

What does it mean to be eminent? There are a number of online presentations and seminars on increasing your social media eminence, or “digital footprint.” What are some attributes of eminent people? They tend to be in a position of superiority or distinction. Often they are high ranking or famous (either worldwide or within their social community or sphere of influence) and have a tremendous amount of influence over those who hear what they have to say. For example, if the president of the United States (or any world leader) makes a comment on some social or economic issue, that comment is usually picked up by the press and is on everyone’s lips by the time the evening news comes on (more so if it’s a controversial topic). These leaders are highly influential and can literally change the minds or perspectives of millions of people in a relatively short time span. On the other hand, if coauthor Avinash Kohirkar makes a public statement about the same topic, the results are vastly different. He may influence family and friends, but the net effect of his comments pales in comparison to that of comments from people viewed with a higher degree of eminence. So what do these users do to have a high degree of social media eminence? They publish high-quality articles or blog entries. Other users rush to see what they have to say (and often repeat it or are influenced by it). Highly eminent people are seen as those individuals who add value to online business discussions.

It stands to reason that we would want to know what these people are saying. We also want to know if something was said in the social media concerning our brands or products. It does make a difference if a comment was made by a simple techie (such as Avinash) or a world leader. One of the challenges in using eminence (or influence) as a metric is determining how to quantify it. There is a lot of discussion and debate in the industry about this topic, and there are lots of tools and approaches that people are using to measure influence [11]. To illustrate this point here, we are going to make some assumptions and come up with a simple formula.

In some of our work, we make the following assumptions:
  • Influential people are those who often have their comments repeated.
  • Influential people tend to have many people following them (that is, the interest in what they have to say is high).

Based on these assumptions, we defined a simple metric called “reach” that is a quantifiable way to determine how widespread someone’s message could be. Reach, to us, is simply the number of things that a person has said multiplied by the number of people listening. Is this metric perfect? No. But it is something to watch for: a person with a large reach is saying a lot and is also reaching a wide audience. Granted, someone could be blabbering about some topic on social media and posting thousands of messages, all being received by a small handful of listeners. If that’s a concern, simply look to modify the definition of influence to something like that shown in Figure 3.4.

It is possible for a company to use the concept of influencers to effectively communicate a key marketing message broadly. Consider the effect a well-known industry analyst who is constantly talking about security in financial institutions such as banks could have on the perception of various institutions. In addition, if we follow this analyst, we will come to understand the social media venues that this analyst and others like him or her participate in. As an example, let’s assume that IBM acquired a company that specializes in fraud detection for banks. Our marketing teams in IBM will be served well by posting about this event on the venues that this analyst is already quite active in. If the analyst is impressed by the acquisition and chooses to “like” it or “share” it, that message will be received by a large number of his or her followers.

How do we measure how influential someone is? Or how do we measure how effective a person’s messages are? We can look to see if that person has talked about a specific product or service and then measure the sales of that product or service to see if there is an increase (or decrease). However, that would be a difficult measurement and, quite honestly, wouldn’t represent the image or perception of the product or service, which could, at a later date, affect the sales.

Instead, we chose to look at someone’s reach, or how far and wide this person’s message could be spread. Figure 3.4 shows an example of how reach could be computed in a message system such as Twitter (although it’s equally applicable to any system where a post is made and others follow that posting).

In Figure 3.4, we show that an individual’s reach can simply be calculated in one of two ways:
  • Method 1—Multiply the number of messages sent by the number of people that could read that message. If someone sends 1,000 messages and 10 people are following that person, the combined message has a calculated score of 10,000 (see Table 3.2).
  • Method 2—Multiply the number of messages sent by the number of people that could read the message and then multiply that result by the ratio of followers to messages.
In method 2, we’ve added another factor to our equation: the ratio of the number of followers to the number of messages produced. Doing so effectively gives more weight to the person with a larger following. This produces perhaps a more meaningful score for our metric, where we might be more inclined to focus on the comments of the second user rather than those of the first.
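To make the two methods concrete, here is a minimal Python sketch of both calculations as described in the text. The function names are our own for this illustration. The first user (1,000 messages, 10 followers) comes from the example above; the second user (10 messages, 1,000 followers) is a hypothetical counterpart we made up to show the contrast.

```python
def reach_method_1(messages, followers):
    """Method 1: number of messages sent times number of people who could read them."""
    return messages * followers

def reach_method_2(messages, followers):
    """Method 2: method 1 weighted by the ratio of followers to messages."""
    if messages == 0:
        return 0.0  # no messages sent, so nothing can be spread
    return messages * followers * (followers / messages)

# User from the text: a prolific poster with a tiny audience.
chatty = {"messages": 1000, "followers": 10}
# Hypothetical second user: few posts, large audience.
followed = {"messages": 10, "followers": 1000}

for name, user in (("chatty", chatty), ("followed", followed)):
    print(name, reach_method_1(**user), reach_method_2(**user))
```

Under method 1 the two users tie at 10,000. Under method 2 the first user drops to 100 while the second rises to 1,000,000. Note that, as written, the message count cancels out of method 2, so the score reduces to the square of the follower count whenever at least one message has been sent.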


