Text analytics is the automated process of translating large volumes of unstructured text into quantitative data to uncover insights, trends, and patterns through statistical pattern learning. To perform text analytics you'll need to analyze text and then use data visualization tools to showcase your results.
The creator of The Simpsons, Matt Groening, once said: “I know all those words, but that sentence makes no sense to me”.
Written language has a tense relationship with the concept of data. When you hear the words ‘data analysis’ you probably think about some information-crunching practice to quantify what’s going on with your brand, product, service, you name it. Written language, on the other hand, is something we’re all familiar with. However, it’s unstructured data, that is, it’s not organized in a pre-defined way, which makes it difficult to quantify and get the insights you are looking for.
For instance, when your team needs to analyze thousands of survey responses, NPS open-ended answers, Twitter mentions, or support tickets, what does the process involve? Probably a handful of staff members manually reading and tagging text data for hours on end, wasting precious time that could be used on more important and time-sensitive tasks.
Text analytics used to be something that only experts in machine learning or linguistics could put into practice. Granted, the backend of this process is still very hard to understand, but with today’s technology, anyone can use a text analysis model to automatically analyze huge batches of text data and visualize the results in beautiful graphs and charts.
Keep reading and you’ll find out how easy it is to use text analytics within your company. We’ll go over the definitions of text analytics and text analysis, basic and advanced text analysis methods, and also share some useful tools to get started analyzing text data. Finally, we’ll go over how to translate those results into neat-looking data visualizations and share some of the most important use cases for text analytics.
Let’s get right into it!
Text analytics is the process of translating written language into data to get quantitative insights.
Much like when traditional analytics tools provide statistical information about ad campaigns (e.g. Google Adwords) or website performance (e.g. Google Analytics), text analytics does the same thing but with insights from text like support emails, product feedback from clients, social media mentions and much more.
Imagine a customer experience team wanting to tag responses from a recent customer survey, for topics like Reliability, Usability or Functionality.
How much time will it take to read every survey response and add the corresponding tags manually?
Or, imagine working for Slack’s customer insights team. You could use text analytics to convert 15,000 reviews about Slack on Capterra into meaningful visualizations, and discover how customers feel about different aspects of your product, like in this example below:
If done manually, text analytics can be a time-consuming, exhausting process. Luckily, there is a better way. We’ll take a look at how you can use machine learning to automate this process in just a minute. First, we want to clear up some doubts concerning the difference between text analytics and concepts like text mining.
If you’re new to these concepts and you’ve searched for definitions, you’ve probably found yourself reading several different (and sometimes conflicting) descriptions for text analytics, text analysis and text mining.
First things first: there’s no practical difference between text analysis and text mining. Although it is very common to hear these two practices described differently, the fact of the matter is that both use statistical pattern learning to obtain qualitative information from unstructured text.
So, text analysis is the process of quickly sifting through unstructured text, analyzing it, and providing qualitative insights. For example, automatically understanding the topic of a given text (AKA topic analysis) or how customers feel about a particular subject (AKA sentiment analysis).
Text analytics is purely quantitative. While text analysis, or text mining, extracts meaning from text, text analytics aggregates the results of the analysis of huge batches of text data to visualize trends through reports and graphs.
To perform text analytics, first, we need to do text analysis. Text analysis will provide the data, and text analytics will help us understand the underlying story it’s telling us.
Being able to crunch the numbers that tell stories behind text can help companies make better decisions.
Just a few years ago computers used to answer like Matt Groening: “I know all those words, but that sentence makes no sense to me”. So, asking a machine to perform manual tasks that involved language processing was out of the question.
With machine learning tools that don’t require you to have programming experience or a data science background, you can teach text analysis models to recognize the meaning behind those words, their relevance, and group them to get key insights for your business.
Here are some of the main benefits of text analytics:
To perform text analytics, you’ll need two things:
Let’s go over how you can perform both things.
Now, how can you apply text analysis to your data? In this section, we’ll cover popular methods used to analyze text, ranging from basic to advanced.
Word frequency analysis is pretty straightforward: how many times does a set of words or phrases appear in a particular batch of data?
With this basic analysis, you can count how many times relevant keywords appear in, say, last month’s total survey responses. These results could be visualized through charts and graphs for a clearer understanding of what these survey responses are saying.
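To make this concrete, here is a minimal sketch of word frequency analysis using only Python's standard library. The survey responses and the keyword list are made up for illustration, and the tokenizer is deliberately simple (lowercase letters and apostrophes only).

```python
import re
from collections import Counter

def word_frequencies(texts, keywords=None):
    """Count how often each word appears across a batch of texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and treat runs of letters/apostrophes as words.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    if keywords is not None:
        # Keep only the keywords we care about (missing ones count as 0).
        return {word: counts[word] for word in keywords}
    return dict(counts)

# Hypothetical survey responses, invented for this example.
responses = [
    "The support team was slow but the pricing is fair.",
    "Great pricing, great support!",
    "Pricing is too high for what the product offers.",
]

keyword_counts = word_frequencies(responses, keywords=["pricing", "support", "slow"])
print(keyword_counts)
```

The resulting counts are exactly the kind of numbers you would later feed into a bar chart or tag cloud.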
A collocation analysis identifies language constructions such as bigrams (a sequence of 2 words in text) or trigrams (3 words) associating them as a single unit.
For example, if we were evaluating product reviews from our customers, we could expect to see words that commonly co-occur like ‘customer support’, ‘user interface’ or ‘easy to use’. By using Natural Language Processing (NLP), you can quickly learn to identify these word structures called n-grams and obtain useful insights from a granular perspective.
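As a rough sketch of the idea, the snippet below counts bigrams and trigrams with a sliding window over the tokens. The reviews are invented, and ranking by raw frequency is a simplification: dedicated collocation tools usually apply statistical association measures on top of counts like these.

```python
import re
from collections import Counter

def ngram_counts(texts, n=2):
    """Count every sequence of n consecutive words in a batch of texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        # Zip the token list against shifted copies of itself to build n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts

# Hypothetical product reviews, invented for this example.
reviews = [
    "Customer support was quick and the user interface is easy to use.",
    "Easy to use, but customer support never replied.",
    "The user interface feels dated.",
]

bigrams = ngram_counts(reviews, n=2)
trigrams = ngram_counts(reviews, n=3)

print(bigrams[("customer", "support")])
print(bigrams[("user", "interface")])
print(trigrams[("easy", "to", "use")])
```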
Concordance is a neat feature that allows you to contextualize words that are of value to you.
As a result, concordance helps remove the ambiguity of text and words that have different and even antonymic meanings, depending on the context in which they are used.
Take a look at the following example:
In these four sentences, the word ‘simple’ has very different meanings. The chart above isolates the target (simple) and compares its preceding and following contexts to understand its meaning.
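A concordance view like this is often called "keyword in context". Here is a small sketch of how one could be built: for every occurrence of the target word, we keep a few words before and after it. The two sentences and the `width` parameter are assumptions made for this example, not the ones from the chart.

```python
import re

def concordance(texts, target, width=3):
    """Return (left context, target, right context) for each occurrence of target."""
    rows = []
    for text in texts:
        tokens = re.findall(r"[\w']+", text)
        for i, token in enumerate(tokens):
            if token.lower() == target.lower():
                # Keep up to `width` words on each side of the target.
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                rows.append((left, token, right))
    return rows

# Hypothetical sentences, invented for this example.
sentences = [
    "The setup was simple and fast.",
    "It is far too simple for advanced users.",
]

rows = concordance(sentences, "simple")
for left, target, right in rows:
    print(f"{left:>20} | {target} | {right}")
```

Reading the left and right columns side by side makes it easier to see whether "simple" is being used as praise or as a complaint.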
After having gone through a high-level overview of the three basic forms of text analysis, let’s dive into the two advanced practices: text classification and text extraction.
Text classification is one of the most popular Natural Language Processing (NLP) techniques. It allows you to assign tags or categories to text according to its content. This method has many applications, including sentiment analysis, topic analysis, intent detection, and language detection.
Sentiment analysis is the process that automatically detects the emotion expressed in text.
For example, imagine the following customer survey response:
“The UI is super easy to use, I love this product!”
A sentiment analysis model can automatically analyze this text and tag it as Positive.
If you’d like to analyze customer feedback by sentiment, splitting feedback about your brand into negative and positive, you’re best off using a sentiment analysis model, which can learn how to do this automatically and save you time and resources. Manually going through each response would cost you hours, even days.
If you’d like to give sentiment analysis a go and experience first-hand how it works, you can check out this pre-trained sentiment analysis model on MonkeyLearn.
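To get a feel for what a sentiment tagger returns, here is a deliberately naive sketch. Note that it is not a machine learning model: it swaps in a tiny hand-written word list (an assumption made for this example) and simply counts positive and negative words. A trained model learns these signals from examples instead.

```python
import re

# Tiny hand-picked word lists, invented for this example.
POSITIVE = {"love", "easy", "great", "fast", "helpful"}
NEGATIVE = {"hate", "slow", "confusing", "broken", "expensive"}

def tag_sentiment(text):
    """Tag a text as Positive, Negative or Neutral by counting lexicon words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

print(tag_sentiment("The UI is super easy to use, I love this product!"))
```

A word list like this breaks quickly on negation ("not easy") or sarcasm, which is one reason trained models are the better choice for real feedback.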
Topic analysis is a process that automatically understands what a piece of text is about, and applies one or more tags for each piece of text data it analyzes.
For instance, a topic classifier can be trained to read something like this...
“Although we really liked the platform’s functionality, the support staff is quite slow”
...And automatically return the tags Functionality and Customer Support. This way a product team can analyze thousands of customer reviews and get insights about which are the most recurrent topics.
Combined with sentiment analysis, the same team not only knows what their customers are talking about, but also how they’re talking about it. This is called aspect-based sentiment analysis, which you can learn more about in this post.
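The key difference from sentiment tagging is that one text can receive several tags. The sketch below illustrates that multi-label behavior with hand-written keyword rules; the `TOPIC_KEYWORDS` mapping is hypothetical, and a trained topic classifier would learn these associations from tagged examples rather than from a fixed list.

```python
import re

# Hypothetical keyword rules, invented for this example.
TOPIC_KEYWORDS = {
    "Functionality": {"functionality", "feature", "features"},
    "Customer Support": {"support", "staff", "agent"},
    "Pricing": {"price", "pricing", "expensive"},
}

def tag_topics(text):
    """Return every topic whose keywords appear in the text (possibly none)."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return [topic for topic, words in TOPIC_KEYWORDS.items() if tokens & words]

tags = tag_topics(
    "Although we really liked the platform's functionality, "
    "the support staff is quite slow"
)
print(tags)
```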
When working in outbound sales, your team probably receives hundreds of email replies per day, many of which will be leads showing interest, while others will be auto-responses, bounced emails, or replies that show no interest at all.
An intent detection model, trained with tags like Interested, Not interested or Bounced, can automatically categorize these emails according to their intent and let your sales team focus on the ones that are worth paying attention to.
You can check out a pre-trained intent detection model to test its power.
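If you are curious what "training a model with tags" can look like under the hood, here is a minimal sketch of a multinomial Naive Bayes classifier written from scratch. This is one classic technique for text classification, chosen here for illustration only; we are not claiming it is what any particular product uses. The six training emails are invented, and a real model would need far more examples.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # tag -> word -> count
        self.doc_counts = Counter()              # tag -> number of examples
        self.vocab = set()

    def train(self, examples):
        for text, tag in examples:
            tokens = tokenize(text)
            self.word_counts[tag].update(tokens)
            self.doc_counts[tag] += 1
            self.vocab.update(tokens)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_tag, best_score = None, -math.inf
        for tag in self.doc_counts:
            # Start from the log prior, then add one log likelihood per word.
            score = math.log(self.doc_counts[tag] / total_docs)
            total_words = sum(self.word_counts[tag].values())
            for token in tokenize(text):
                count = self.word_counts[tag][token] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag

# Hypothetical training emails, invented for this example.
training_emails = [
    ("Sounds great, can we schedule a demo next week?", "Interested"),
    ("Yes, please send me pricing details", "Interested"),
    ("Not interested, please remove me from your list", "Not interested"),
    ("We already use another tool, no thanks", "Not interested"),
    ("Delivery failed: the recipient address was not found", "Bounced"),
    ("Mailbox unavailable, message could not be delivered", "Bounced"),
]

model = NaiveBayes()
model.train(training_emails)
print(model.classify("Could you send a demo and pricing?"))
```

The model never sees a rule like "demo means interested". It only learns which words show up more often under each tag, which is why adding more tagged examples tends to improve it.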
Companies around the globe often analyze reviews, support tickets, and comments about their product in many different languages, thanks to the globalized and connected world we live in. Airbnb, Uber, Slack and so many more need local teams that understand many different languages to support their global operations.
Let’s imagine you receive 100 new survey responses per day in many different languages. To effectively distribute these responses to the appropriate local teams, you might be going through each response, identifying the language, and then sending it to the right team member. This process is repetitive and takes a lot of time.
With language detection powered by machine learning, you can identify the language of a given text automatically. Check out this pre-trained model, which effectively detects 49 different languages.
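As a toy illustration of the idea, the sketch below guesses a language by counting how many very common words ("stopwords") of each language appear in the text. This is a simple heuristic standing in for a trained model, and the three short word lists are assumptions made for this example; they cover only a handful of words in English, Spanish and French.

```python
import re

# Tiny, hand-picked stopword lists, invented for this example.
STOPWORDS = {
    "English": {"the", "and", "is", "to", "of", "it", "was"},
    "Spanish": {"el", "la", "y", "es", "de", "que", "muy"},
    "French": {"le", "la", "et", "est", "de", "très", "les"},
}

def detect_language(text):
    """Return the language whose stopwords occur most often, or 'Unknown'."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        language: sum(token in words for token in tokens)
        for language, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Unknown"

print(detect_language("The product is easy to use and it was cheap"))
print(detect_language("El producto es muy fácil de usar"))
print(detect_language("Le produit est très facile et rapide"))
```

Once each response has a language tag, routing it to the right local team becomes a simple lookup.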
Another popular advanced method of text analysis is text extraction, a technique that allows us to recognize and extract different elements from text such as keywords, brands, models, specs, and more. This is particularly useful, for example, to create reports showing which terms are usually associated with a brand.
Imagine being able to perform an analysis that compares the most relevant keywords from an online conversation about your brand with the ones from your competition:
Many insights can be obtained by extracting data from text, and it’s a simple process with the advent of machine learning. The following are some examples of text extraction models used by companies:
A keyword extraction model can be used to extract the most relevant words or expressions of a given text. The extracted keywords can be used to create tag clouds for understanding common themes present throughout the data.
An entity extraction model can automatically identify people’s names, company names, universities, government entities and much more, allowing us to understand what names are mentioned in any given text.
A feature extraction model can be trained to identify and extract pieces of text corresponding to the unique characteristics of a product. For instance, the following is a pre-trained extraction model that recognizes different features from laptops: models, CPU, GPU, RAM, hard drives, and more.
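To show the shape of the output an extractor produces, here is a small sketch that pulls a few laptop specs out of free text with regular expressions. The patterns and the sample description are assumptions made for this example, and they only cover a few spellings; a trained extraction model is meant to handle the variety that hand-written patterns miss.

```python
import re

# Hypothetical patterns for a few laptop specs, invented for this example.
SPEC_PATTERNS = {
    "ram": re.compile(r"\d+\s*GB\s+(?:of\s+)?RAM", re.IGNORECASE),
    "storage": re.compile(r"\d+\s*(?:GB|TB)\s+(?:SSD|HDD)", re.IGNORECASE),
    "cpu": re.compile(r"Intel Core i[3579]|AMD Ryzen [3579]", re.IGNORECASE),
}

def extract_specs(text):
    """Return the first match for each spec pattern found in the text."""
    specs = {}
    for name, pattern in SPEC_PATTERNS.items():
        match = pattern.search(text)
        if match:
            specs[name] = match.group(0)
    return specs

specs = extract_specs("Laptop with Intel Core i7, 16GB RAM and a 512 GB SSD")
print(specs)
```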
So, now that we’ve gone through the main techniques used for text analysis, we can focus on text analytics. With the results of these text analysis techniques, we’ll get the data we need to create graphs, charts, and analytics we are looking for from text.
Let’s take a look at some of the best data visualization tools, which will help us create beautiful reports showcasing our analysis. These graphs and charts allow us to get useful and quick insights – helping to easily detect trends over time.
Google’s Data Studio stands out for being not only very easy to use and intuitive but also free. It allows anyone to create or connect a dataset, and create fully customized visualizations of the data. We can choose from a broad set of preset reports, specially designed for different industries and types of data.
Integrations in Google Data Studio are one of its unique characteristics, allowing you to connect to an impressive number of services and platforms, from which you can import data. With over 120 partner integrations like Google Sheets, MySQL and uploaded CSV files, this visualization tool is one of the easiest and most versatile.
Using Google Data Studio for creating charts and graphs from text analysis results is pretty simple. Here, we’ll go over it in a few simple steps:
Head over to Google Data Studio’s homepage and log in with your Google account. You’ll be redirected to the dashboard, where you’ll be able to get an overview of the platform, including preset reports and your datasets:
To either build your own report or use one of the preset ones, first you’ll need to define your data source. For this example, we’ll use a Google Sheets file with some customer survey answers.
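If your text analysis results live in code rather than in a spreadsheet, one simple way to get them into a sheet is to write them out as CSV first. The sketch below does that with Python's standard `csv` module. The rows and column names are invented for this example, and it assumes you will then import or upload the resulting CSV yourself.

```python
import csv
import io

# Hypothetical text analysis results, invented for this example.
tagged_responses = [
    {"response": "Love the new dashboard", "topic": "Usability", "sentiment": "Positive"},
    {"response": "Too expensive for small teams", "topic": "Pricing", "sentiment": "Negative"},
    {"response": "Support answered within an hour", "topic": "Support", "sentiment": "Positive"},
]

def to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = to_csv(tagged_responses, ["response", "topic", "sentiment"])
print(csv_text)

# To save it to disk instead, something like this would work:
# with open("survey_results.csv", "w", newline="") as f:
#     f.write(csv_text)
```

Keeping one row per response, with one column per tag type, makes it easy to count and filter by topic or sentiment later on.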
To do this, click on the ‘Data Sources’ tab on the left side of your screen and select the Google Sheets connection tab from the list:
Once your data source is connected, a preview of the data will appear. It’s a good idea to check that Google Data Studio has correctly imported every value from your spreadsheet.
It’s also important to make sure that each field has been assigned the correct type of data. Types range from ‘text’ and ‘numeric’ to ‘geo’, ‘date’ and many more. This selection will affect which types of charts and graphs we will be able to use to visualize that particular data:
Now it’s time to create a report based on the data we’ve connected. To do this, we can either create one from the ‘Data Preview’ screen or the ‘Explore’ section.
The good thing about Google Data Studio is how simple it is to create charts and graphs from your data using its intuitive tools.
For instance, we can easily create a pie chart showing the percentage of survey responses tagged with a particular tag or category. Just click ‘Add a chart’ in the top bar and select the chart of your choice:
There are also a good number of themes we can use, and colors to personalize them. Here’s a basic report with a couple of pie charts showcasing our latest survey’s most relevant topics, and the sentiment used by our customers when referring to them:
Google Data Studio lets you share custom reports with other members of your team, and even updates your datasets when you change the data in your spreadsheets.
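For reference, the percentages a pie chart of tags displays are just each tag's share of the total. Here is a short sketch of that arithmetic, using a made-up list of topic tags, which can be handy for sanity-checking a chart against your data.

```python
from collections import Counter

def tag_shares(tags):
    """Return the percentage of items carrying each tag, rounded to one decimal."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: round(100 * n / total, 1) for tag, n in counts.items()}

# Hypothetical topic tags for four survey responses, invented for this example.
survey_tags = ["Usability", "Pricing", "Usability", "Support"]

shares = tag_shares(survey_tags)
print(shares)
```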
If you want to get started with Google Data Studio, check out this tutorial for a quick overview of its basic functionalities.
While targeted at more experienced data analysts, Tableau is also a great tool for data visualization. With a less intuitive interface than Google Data Studio, Tableau still offers some features that make it the go-to service for heavier tasks. For example, with data joining you can build charts showcasing different data sources at the same time, blending metrics and resulting in a complete report.
Tableau has a steeper learning curve but, when mastered, renders impressive results. Keep in mind that, although it offers a free trial and a free demonstration by one of its support specialists, Tableau isn’t free, and prices range from around $40 for the online version to around $70 for desktop.
Let’s go over a basic tutorial on how we can visualize text analysis results with Tableau.
To do this just head over to Tableau’s website and request a free trial of their service. You can also ask for a free walkthrough of the platform with one of Tableau’s support specialists.
Tableau offers many ways to integrate your data, supporting Excel, Google Sheets, JSON, and MySQL, to name a few.
If using a Google Sheets file, you’ll need to go to the left side of Tableau’s start page and click on the Google Sheets button under the ‘Connect to Server’ column:
This will trigger a browser tab that will ask you to authorize Tableau to access your Google Sheets files.
Once you have connected Tableau with Google Sheets, you’ll need to select the file with the data that you want to create visualizations for, plus the specific sheets you want to use. In our example, we’re using a file with survey responses that were analyzed with sentiment analysis and topic detection, to detect how respondents felt about the topics they were mentioning:
Once we select our sheet, Tableau will show us a preview of the data we are selecting:
This particular sheet shows the counts by sentiment for each aspect mentioned in the survey responses.
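A sheet like this is the aggregated form of the raw text analysis results. As a sketch of how such a table could be produced, the snippet below takes one (aspect, sentiment) pair per tagged opinion and counts them into a small pivot table. The pairs and aspect names are invented for this example.

```python
from collections import Counter, defaultdict

# Hypothetical (aspect, sentiment) results, invented for this example.
analyzed = [
    ("Pricing", "Negative"),
    ("Pricing", "Negative"),
    ("Usability", "Positive"),
    ("Support", "Neutral"),
    ("Usability", "Positive"),
    ("Support", "Negative"),
]

def counts_by_aspect(pairs):
    """Build a table of sentiment counts for each aspect."""
    table = defaultdict(Counter)
    for aspect, sentiment in pairs:
        table[aspect][sentiment] += 1
    return table

table = counts_by_aspect(analyzed)

# Print one row per aspect, with one column per sentiment.
print(f"{'Aspect':<12}{'Positive':>9}{'Neutral':>9}{'Negative':>9}")
for aspect in sorted(table):
    row = table[aspect]
    print(f"{aspect:<12}{row['Positive']:>9}{row['Neutral']:>9}{row['Negative']:>9}")
```

In this layout the aspect plays the role of the dimension and the three sentiment counts are the measures.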
To access the workspace and start creating graphs from your data, click on ‘Sheet 1’ in the bottom toolbar:
This workspace will offer different visualization options to the right, and available dimensions and measures to the left:
As a first step, add the dimensions (rows) you want to use for feeding data to your graphs. In our case, we’ll use ‘Aspects’. Also, select the measures (columns) you want to use for the visualization (we’ll use the sentiment counts for each aspect):
Then, click on ‘Show Me’ at the top right of the screen and Tableau will show the different graphs or charts available for the selected dimensions and measures:
Let’s select a basic horizontal bar chart. This will automatically create a visualization that reflects the number of survey responses tagged as positive, negative, and neutral (measures) for each aspect (dimension):
And that’s it! You can change the type of visualization any time you want by clicking on ‘Show Me’ and selecting another visualization type:
Tableau allows us to either build a presentation out of separate spreadsheets, much like in a PowerPoint presentation, or merge them into a single screen called “Dashboard”.
To create a new Dashboard, click on Dashboard > New Dashboard in the top menu bar:
Then, just drag the sheet you want onto your dashboard:
This way, you can drag several charts and graphs onto a single dashboard, helping you create insightful stories from your data.
If you are interested in learning more about how to get started with Tableau, these tutorials are for you. They cover how to prepare, analyze, and share your data in Tableau.
In this section, we’ll go over some examples of how text analytics can be used by marketers, product teams, support agents and other professionals to get actionable insights from text data and make data-driven decisions.
Customer feedback in open-ended responses from customer surveys, NPS, or product reviews is usually a pain to analyze, yet it usually holds the most valuable information from your customers regarding their experience with your service. Computers easily decode multiple choice and numerical answers, and with text analysis tools, they can also automatically analyze text.
Companies collect product reviews by customers on sites like Capterra, G2Crowd, and Google Play, and analyze them to identify what customers are publicly praising about a product or service, detect key aspects to improve, and get a better understanding of the customer experience.
For example, this post analyzes 1 million hotel reviews, resulting in some interesting findings. For instance, check out this graph in which it’s easy to see the sentiment expressed in reviews of hotels in Paris, London, Rio, Bangkok, Madrid, Beijing, and New York. London hotels received the most negative reviews, while Bangkok and Madrid are doing pretty well!
Cross-checking different aspects of reviews with their locations can render useful insights when analyzing feedback in a particular industry, in this case, hotels. If you were working at a hotel in London, you would quickly realize that you should start working to improve both food and comfort. Check it out:
We’ll show one last graph that reflects something we all want to know about hotels: the internet connection and wi-fi performance. Everyone wants to stay in touch with their families or even get some work done on a business trip:
There’s a clear correlation between the number of stars awarded to a hotel and its wi-fi performance, something we should expect. But the fact that 3-star hotels have the same number of positive reviews as 5-star hotels regarding their internet connection should be an alarming insight for anyone working in a customer satisfaction department at a 5-star hotel chain.
NPS surveys are a great way to get feedback from customers, but obtaining insights from the open-ended responses can be a hassle. This is where text analysis steps in, providing the necessary information to make data-driven decisions.
For example, Promoter.io extracted the most relevant keywords from a batch of NPS responses to understand what their clients complain about and which aspects of their service are highly regarded:
On the one hand, their customers have an excellent impression about Promoter’s service and convenience (with 100% promoters). On the other hand, we can see that the tags Phone and Laptop have a high percentage of negative responses, which could indicate that their UI/UX team needs to solve some performance issues.
Each morning your support team checks in and is greeted by tons of new tickets waiting to be processed. With a machine learning model working 24/7, automatically sifting through support tickets and tagging them by urgency, topic, intent, sentiment, or language, you’ll save hours.
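As a reminder of the numbers behind a breakdown like this: in the standard NPS scheme, scores of 9 or 10 count as promoters, 7 or 8 as passives, and 0 to 6 as detractors, and the NPS itself is the percentage of promoters minus the percentage of detractors. The sketch below applies that to a handful of made-up responses and computes the share of promoters for each extracted keyword. The data is invented and is not Promoter.io's.

```python
from collections import defaultdict

def net_promoter_score(scores):
    """NPS = % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(s >= 9 for s in scores)
    detractors = sum(s <= 6 for s in scores)
    return 100 * (promoters - detractors) / len(scores)

def promoter_share_by_keyword(responses):
    """For each keyword, the percentage of responses mentioning it that are promoters."""
    totals = defaultdict(int)
    promoters = defaultdict(int)
    for score, keywords in responses:
        for keyword in keywords:
            totals[keyword] += 1
            if score >= 9:
                promoters[keyword] += 1
    return {k: 100 * promoters[k] / totals[k] for k in totals}

# Hypothetical (score, extracted keywords) pairs, invented for this example.
nps_responses = [
    (10, ["service"]),
    (9, ["service", "convenience"]),
    (3, ["phone"]),
    (6, ["laptop", "phone"]),
    (8, ["pricing"]),
]

nps = net_promoter_score([score for score, _ in nps_responses])
promoter_share = promoter_share_by_keyword(nps_responses)
print(nps)
print(promoter_share)
```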
Plus, text analytics of your support tickets can lead to powerful insights, for example what’s repeatedly triggering new tickets, which can help your team take action to reduce those numbers. Thanks to text analytics, your support team can increase their resolution rates and effectively provide a better service.
For example, in this analysis of customer support interactions of different telcos on Twitter, we were able to create data visualizations from sentiment analysis and keyword extraction results. These visualizations can help quickly understand common issues and compare the interactions between Twitter accounts from different competitors:
In this graph, we can see a huge difference in the number of mentions that AT&T and Verizon get each week. This alone can help a support team visualize different engagement rates, or presence, in this particular social media platform.
Then, sentiment analysis reveals that AT&T and Verizon share a low number of positive interactions with their clients. In contrast, T-Mobile and Sprint seem to receive much more positive feedback, possibly due to better support:
Moreover, sentiment analysis helps us conclude that Sprint has the second-fewest mentions but the highest number of negative mentions:
AT&T, on the other hand, has the most mentions overall and the fewest negative mentions.
This is just a brief example of how text analytics can help a support team get a complete panoramic view of how their customers perceive their brand.
Text analytics has changed the business intelligence world forever. Refined platforms now allow everyone, regardless of programming experience, to enjoy the benefits of machine learning – saving them countless hours and giving them the opportunity to focus on more fulfilling tasks.
Customer support tickets, NPS responses, product reviews, customer surveys, social media interactions and other types of texts can be automatically analyzed by sentiment, topic, keywords, and entities thanks to machine learning. This provides the necessary raw materials to do text analytics and uncover invaluable insights in a scalable and consistent way.
If you’re interested in using text analytics for your business, check out MonkeyLearn, a platform that makes it super easy to analyze text with machine learning. You can also request a demo and our team will help you get started with text analytics.
March 6th, 2020