Deep social data analytics

A deep social data analytics project is a full-fledged social media research effort that typically spans weeks to months. The amount of data processed in each iteration may be small or large, but the overall study usually runs for a long time. This type of analytics starts with raw data and a few limited goals. As the work progresses through successive iterations, the team gradually uncovers hidden insights, and the focus areas become narrower and narrower. We describe one example from our experience here.

In this case, we collected data from a conference and analyzed it to identify dominant themes. Once the themes and topics were identified, we used tools with sophisticated statistical capabilities to establish correlations between topic pairs. The resulting table is shown in Figure 6.2. The data has been modified for illustrative purposes so that no “real” conclusions can be drawn about the named entities here.

[Figure 6.2: table of correlations between topic pairs]

In many cases, a deeper analysis of a data set involves combining several specific terms under a single umbrella, or concept. For example, when mining unstructured data for terms such as computer, computing, computer networking, or cloud, we could group them to form a single concept called “Information Technology.” The higher CPU intensity attributed to deep analytics comes from this extra pattern matching and text analytics code. In the case of the relationship matrix (see Figure 6.2), every concept is compared with every other concept to compute an affinity matrix that shows whether each pair has a low or a high affinity (relationship). Because new terms relevant to a particular concept may be uncovered during the analysis, the data models may need to be modified and rerun. This, in turn, also drives the high CPU utilization (in hindsight, this could also point to another extension of the taxonomy: time to deliver results). Table 6.2 contains a summary of different types of use cases involving External Social Media.
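
The grouping of terms into concepts and the pairwise affinity computation can be sketched as follows. This is a minimal illustration, not the tooling used in the project: the term dictionary, the "Finance" concept, the function names, and the choice of Jaccard overlap as the affinity measure are all assumptions made for the example, and terms are matched as plain substrings.

```python
from collections import Counter
from itertools import combinations

# Hypothetical term-to-concept dictionary; a real project would curate this
# and extend it as new relevant terms are uncovered.
CONCEPT_TERMS = {
    "Information Technology": ["computer networking", "computing", "computer", "cloud"],
    "Finance": ["budget", "revenue"],
}

def concepts_in(text):
    """Return the set of concepts with at least one term in the text (substring match)."""
    lowered = text.lower()
    return {
        concept
        for concept, terms in CONCEPT_TERMS.items()
        if any(term in lowered for term in terms)
    }

def affinity_matrix(posts):
    """Jaccard affinity for every pair of concepts seen in the posts.

    Affinity is (posts mentioning both) / (posts mentioning either).
    This is an assumed stand-in for the statistical correlation in the text.
    """
    single = Counter()
    pair = Counter()
    for post in posts:
        found = sorted(concepts_in(post))
        single.update(found)
        pair.update(combinations(found, 2))
    result = {}
    for a, b in combinations(sorted(single), 2):
        both = pair[(a, b)]
        result[(a, b)] = both / (single[a] + single[b] - both)
    return result
```

Adding a term to the dictionary changes which posts match a concept, so the whole matrix has to be recomputed, which mirrors the rerun cost described above.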

Internal Social Media

Let’s now focus on the internal social media domain. IBM uses its own product, called IBM Connections, as its enterprise social network platform to facilitate collaboration among all the employees inside the firewall. Even in this domain, there are two broad analytics types based on whether the data is at rest or in motion. And, in each of these cases, we consider use cases for simple social metrics, ad hoc analytics, and deep analytics.

Data in Motion

With more than 500,000 profiles enabled on the IBM Connections platform, there tends to be quite a bit of activity occurring throughout the day. Activity ranges from the simple posting of a status message to notifications of new product announcements, recognition of employees, and just general awareness of things happening inside of IBM.

To watch for potential employee dissatisfaction, we have implemented a simple application that analyzes content on this platform and captures and highlights any IT support issues that could arise with a new release of an internal computing application. For example, if a new version of Lotus Notes is rolled out to a broad section of the employees, the IT support team can configure this application to watch for positive or negative sentiment words being mentioned together with the name of the product.
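
A rule of this kind, a product name appearing together with a sentiment word, can be sketched in a few lines. The word lists and the function name below are hypothetical and chosen only for illustration; the actual application presumably relies on a fuller sentiment lexicon.

```python
import re

# Hypothetical word lists; a production system would use a fuller sentiment lexicon.
POSITIVE = {"great", "love", "fast", "works"}
NEGATIVE = {"crash", "crashes", "slow", "broken", "hate"}

def flag_post(text, product):
    """Classify a post that mentions the product alongside sentiment words.

    Returns "positive", "negative", or "mixed", or None when the product is
    not mentioned or no sentiment word appears with it.
    """
    lowered = text.lower()
    if product.lower() not in lowered:
        return None
    words = set(re.findall(r"[a-z']+", lowered))
    has_positive = bool(words & POSITIVE)
    has_negative = bool(words & NEGATIVE)
    if has_positive and has_negative:
        return "mixed"
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    return None
```

A support team would call `flag_post(post, "Lotus Notes")` on each incoming post and route the "negative" and "mixed" results to a review queue.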

Machine capacity

Given the real-time nature of the data and the network bandwidth needed to keep up with incoming data, the CPU capacity required for this type of analytics is quite high. If, however, we accept a delay of, say, one to five minutes instead of processing in real time, we can significantly reduce the network and CPU requirements because our computing infrastructure no longer needs to keep pace with every incoming item. This is what we would call “near real time.”
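
One common way to trade a short delay for lower load is to collect incoming items into fixed time windows and analyze each window once. The sketch below assumes events arrive as (timestamp in seconds, payload) pairs sorted by time; the function name and the 60-second default are illustrative choices, not details from the system described here.

```python
def micro_batches(events, window_seconds=60):
    """Group time-sorted (timestamp, payload) events into fixed-length windows.

    Yields one list of payloads per non-empty window. The first window starts
    at the timestamp of the first event.
    """
    batch = []
    window_end = None
    for timestamp, payload in events:
        if window_end is None:
            window_end = timestamp + window_seconds
        if timestamp >= window_end:
            if batch:
                yield batch
            batch = []
            # Skip over any empty windows so window_end lands after this event.
            skipped = (timestamp - window_end) // window_seconds + 1
            window_end += skipped * window_seconds
        batch.append(payload)
    if batch:
        yield batch
```

Each yielded batch can then be analyzed in a single pass, so the analytics code runs once per window rather than once per post.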

Simple Social Metrics (SSM)

Similar to the Simple Social Metrics discussed previously, IBM’s research team in Haifa has created a simple pie chart that shows a breakdown of the sentiment of all the content streaming through IBM Connections into positive, neutral, and negative sentiments. As of this writing, 38% of the posts are positive, 60% are neutral, and 2% are negative. These kinds of metrics allow us to understand the general feeling within IBM; this analytics doesn’t need to be on a real-time basis, since the mood of a large workforce changes over time, not instantaneously.
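
The numbers behind such a pie chart are simple proportions. As a minimal sketch, assuming each post has already been labeled "positive", "neutral", or "negative" by some upstream classifier (the function name is hypothetical):

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")

def sentiment_breakdown(labels):
    """Return the percentage of posts per sentiment label, rounded to one decimal."""
    counts = Counter(labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: round(100 * counts[label] / total, 1) for label in LABELS}
```

Because the breakdown depends only on running counts, it can be refreshed on whatever schedule suits the audience, such as hourly or daily.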

This Social Network analytics application also continuously updates trending topics and trending words in real time because they can be predictors of things to come.
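
The text does not say how trending words are detected. One simple approach, shown here purely as an assumed illustration, is to compare how often a word appears in a recent window against a baseline period and rank words by that ratio. The function name, the whitespace tokenization, and the thresholds are all hypothetical.

```python
from collections import Counter

def trending_words(recent_posts, baseline_posts, top_n=5, min_count=2):
    """Rank words by recent frequency relative to baseline frequency.

    The score is recent_count / (baseline_count + 1); the +1 avoids division
    by zero for words that never appeared in the baseline.
    """
    recent = Counter(word for post in recent_posts for word in post.lower().split())
    baseline = Counter(word for post in baseline_posts for word in post.lower().split())
    scores = {
        word: count / (baseline[word] + 1)
        for word, count in recent.items()
        if count >= min_count
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda word: (-scores[word], word))[:top_n]
```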

Machine capacity

The network and CPU requirements are moderate in this case.

Ad Hoc analytics

The IBM Research team has built an application that continuously monitors and analyzes all of the content that is streaming through IBM’s enterprise social network: IBM Connections. This social network analytics application has an interface that enables users to specify any topic, and the application will show these users an interactive view of all the conversations (including counts) relevant to that topic over the past 30 days (see Figure 6.3). This can be a very handy tool to gauge how well a certain topic has been resonating with the employees in the past month or so.
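
The core of such a view is a count of matching conversations per day over a trailing window. A minimal sketch, assuming posts are available as (date, text) pairs and that a topic matches by simple case-insensitive substring search (the real application's matching logic is not described):

```python
from collections import Counter
from datetime import date, timedelta

def topic_daily_counts(posts, topic, today, days=30):
    """Count posts mentioning the topic on each of the last `days` days.

    posts is an iterable of (date, text) pairs. The window includes `today`.
    Days without matching posts are omitted from the result.
    """
    start = today - timedelta(days=days - 1)
    needle = topic.lower()
    counts = Counter()
    for day, text in posts:
        if start <= day <= today and needle in text.lower():
            counts[day] += 1
    return dict(sorted(counts.items()))
```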

Deep analytics

Because of the short amount of time available for processing in these types of projects, deep analytics is usually not possible.

Data at Rest

Data at rest refers to use cases in which data has already been accumulated. This can include data from the past day, week, month, or year. This also includes custom windows of time; for example, social media data analysis around a “water day” event in South Africa several months back, for a duration of one month.

Simple Social Metrics (SSM)

IBM conducts online courses on key topics for very large audiences every month. The industry refers to these as massive open online courses (MOOCs). These courses are accessible to all employees. A new course is launched on the first Friday of each month. During the course of this day, we collect comments from all IBMers about this topic and about this specific course in the IBM Connections platform. At the end of the day, we compute some simple social metrics and come up with the following: top 10 hashtags, top 10 mentions, top 10 authors, and overall sentiment.
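
The three top-10 lists can be computed with plain counting. The sketch below assumes posts are available as (author, text) pairs and that hashtags and mentions follow the usual `#tag` and `@name` conventions; the function name is hypothetical, and overall sentiment is left out because it needs a separate classifier.

```python
import re
from collections import Counter

def simple_social_metrics(posts, top_n=10):
    """Compute top hashtags, mentions, and authors from (author, text) pairs."""
    hashtags = Counter()
    mentions = Counter()
    authors = Counter()
    for author, text in posts:
        authors[author] += 1
        hashtags.update(tag.lower() for tag in re.findall(r"#(\w+)", text))
        mentions.update(name.lower() for name in re.findall(r"@(\w+)", text))
    return {
        "hashtags": hashtags.most_common(top_n),
        "mentions": mentions.most_common(top_n),
        "authors": authors.most_common(top_n),
    }
```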

Machine capacity

The network bandwidth and the CPU capacity required for this type of analytics are low.

Ad Hoc analytics

Duration of analytics—1 month

During the first month after the release of a new MOOC, we perform a social data analysis of all posts made by IBMers in IBM Connections and produce a set of reports. For a course on cloud computing, we produced the following:
  • Volume of conversations over time
  • Volume of conversations by geography of the author
  • Percentage of posts about cloud computing, in comparison to other similar words
  • Percentage of positive, neutral, and negative comments
  • Percentage of discussion by business unit and role
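
Most of the reports listed above reduce to counting posts by one attribute. As a sketch, assuming each post is a dictionary with hypothetical keys such as "date", "geo", and "sentiment" (the real data model is not described in the text):

```python
from collections import Counter

def volume_by_day(posts):
    """Number of posts per day, ordered by date."""
    return dict(sorted(Counter(post["date"] for post in posts).items()))

def share_by(posts, field):
    """Percentage of posts per value of the given field, rounded to one decimal."""
    counts = Counter(post[field] for post in posts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: round(100 * n / total, 1) for key, n in counts.items()}
```

Calling `share_by(posts, "geo")` gives the geography report and `share_by(posts, "sentiment")` gives the sentiment report; a "business_unit" or "role" key would work the same way if the data carries it.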

Machine capacity

The network bandwidth required is quite low, but the CPU capacity is typically low to moderate, depending on the amount of total data that we will be processing.



The Digital Media Strategy Blog: Deep social data analytics