What Is Topic Analysis? Examples & Tools

Topic analysis is a machine learning technique that automatically assigns topics to text data. Topic analysis tools take unstructured text, such as emails and social media interactions, then process and sort it, helping businesses discover the topics their customers mention most often in relation to their product, brand, or service.

Most of the data that businesses receive today is unstructured, and sorting it manually is no longer practical. Thanks to machine learning techniques like topic analysis, businesses can sift through large amounts of data quickly and pinpoint the topics mentioned most frequently in customer feedback.

So, how exactly does topic analysis work and how can you use it to benefit your business? Keep on reading to find out!

How Does Topic Analysis Work?

There are two different approaches to topic analysis:

  • Topic modeling
  • Topic classification

The one you use will depend on the problem you need to solve.

The main difference between these two techniques is that one uses a supervised algorithm (topic classification), and the other an unsupervised algorithm (topic modeling). 

Unsupervised algorithms

You don’t need to train unsupervised algorithms. Instead, they’re able to recognize structures, patterns, and relationships in collections of uncategorized data to make connections on their own. Here’s an example of how they work:

Apple has a host of products: the Apple Watch, iPhone, Apple TV, iPad, MacBook, and so on. Not all of them include the word ‘Apple’, so a machine might not recognize the iPad as an Apple product. However, an unsupervised machine learning tool could pick up on patterns in the data, such as product names that start with a lowercase ‘i’ or that appear in the same texts as ‘Apple’, and group them accordingly.

Since you don’t need to tag texts manually with unsupervised techniques, they can be useful for tasks like discovering topics within your text. However, you will need to feed the model large amounts of text, and results won’t be clearly labeled with topics such as Customer Support or Ease of Use.

Instead, an unsupervised topic analysis model will group related texts according to certain terms and words that it used to build relationships and detect patterns. 
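To make the idea of grouping related texts more concrete, here is a minimal sketch in Python. It is not a real topic modeling algorithm (such as LDA); it is a toy that greedily groups texts by word overlap. The stopword list, the `threshold` value, and the sample feedback are all assumptions made for illustration.

```python
import re

# Hypothetical, very small stopword list for this example only.
STOPWORDS = {"the", "a", "is", "and", "to", "my", "it", "was", "for"}

def tokenize(text):
    """Lowercase a text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Share of words two token sets have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def group_texts(texts, threshold=0.2):
    """Greedily place each text in the first group whose vocabulary it overlaps enough."""
    groups = []  # each group: {"vocab": set of words, "texts": list of str}
    for text in texts:
        tokens = tokenize(text)
        for group in groups:
            if jaccard(tokens, group["vocab"]) >= threshold:
                group["texts"].append(text)
                group["vocab"] |= tokens
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append({"vocab": set(tokens), "texts": [text]})
    return groups

feedback = [
    "The support team answered my ticket quickly",
    "Support was slow to answer my ticket",
    "The price is too high for a small team",
    "Great price, good value for a small budget",
]
groups = group_texts(feedback)
for number, group in enumerate(groups, start=1):
    print(f"Group {number}: {group['texts']}")
```

Notice that the groups come back numbered, not named. It is still up to you to read them and decide that one is about support and the other about price.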

Supervised algorithms

Supervised algorithms need to be trained before they can categorize texts on their own. This involves feeding the model samples of text, and tagging them using predefined labels. The tagging process is essential because the more texts you tag, the more accurate text classifiers will be at making predictions on their own. To further improve accuracy, spend time correctly labeling your texts, and even define each label so that everyone in your team knows which label to use and when.
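As a rough illustration of what training on tagged samples means, the sketch below implements a small Naive Bayes classifier with add-one smoothing using only the Python standard library. Naive Bayes is just one common choice for text classification, not necessarily what any particular tool uses, and the class name, tags, and training texts are made up for this example.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayesTopicClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # tag -> word -> count
        self.tag_counts = Counter()              # tag -> number of training texts
        self.vocabulary = set()

    def train(self, tagged_texts):
        """Learn word frequencies per tag from (text, tag) pairs."""
        for text, tag in tagged_texts:
            self.tag_counts[tag] += 1
            for word in tokenize(text):
                self.word_counts[tag][word] += 1
                self.vocabulary.add(word)

    def classify(self, text):
        """Return the tag with the highest log-probability for the text."""
        total_texts = sum(self.tag_counts.values())
        best_tag, best_score = None, -math.inf
        for tag, count in self.tag_counts.items():
            score = math.log(count / total_texts)
            total_words = sum(self.word_counts[tag].values())
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # ignore words never seen during training
                likelihood = (self.word_counts[tag][word] + 1) / (total_words + len(self.vocabulary))
                score += math.log(likelihood)
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag

classifier = NaiveBayesTopicClassifier()
classifier.train([
    ("The subscription is too expensive", "Price"),
    ("Great value for the money", "Price"),
    ("Support never answered my email", "Customer Support"),
    ("The support agent was friendly and helpful", "Customer Support"),
])
print(classifier.classify("How much does the subscription cost per month?"))
print(classifier.classify("Nobody from support answered me"))
```

Four training texts are far too few for real use. The point is only to show that the predictions come entirely from the examples you tagged.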

So, is it better to use topic modeling or topic classification? Well, it all comes down to your objectives, how familiar you are with your data, and the resources at hand. While topic modeling is best used to discover the main topics talked about within a set of documents, topic classification is useful to speed up the process of tagging texts. Without machines, tagging text is manual, time-consuming and tedious.

Often, however, topic modeling can be used alongside topic classification.

Let’s imagine you’ve launched a new product and want to know how customers are reacting to it. You send out a survey and receive hundreds of responses, each with open-ended answers. Instead of spending hours trying to pinpoint the topics customers mention in relation to your new product (price, design, usability...), you can use topic modeling to do this for you. 

You can then use these recurring topics as your predefined tags in topic classification.
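Here is a simplified sketch of that workflow's first half: scanning survey responses for recurring terms that could become tags. Counting how many responses mention each word is only a crude stand-in for topic modeling, and the stopword list and sample responses are invented for the example.

```python
import re
from collections import Counter

# Hypothetical stopword list, just large enough for the sample responses.
STOPWORDS = {"the", "is", "but", "and", "too", "could", "be"}

def candidate_tags(responses, top_n=3):
    """Return the top_n words that appear in the most responses."""
    document_frequency = Counter()
    for response in responses:
        # Use a set so each word counts once per response.
        words = set(re.findall(r"[a-z]+", response.lower())) - STOPWORDS
        document_frequency.update(words)
    # Sort by frequency (descending), then alphabetically to break ties.
    ranked = sorted(document_frequency.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]

responses = [
    "The price is fair but the design feels cheap",
    "Love the design, and the price is right",
    "The price is too high",
    "Design is sleek, usability could be better",
    "Price is fine, usability is poor",
]
tags = candidate_tags(responses)
print(tags)
```

The terms it returns are only suggestions. You would still review them and turn them into clean tag names before training a classifier.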

Topic Analysis Examples

 

There are many ways in which you can fine-tune topic analysis models, depending on what’s important to your business and what you want to gain from your analysis. 

Topic classification models help organize texts, classifying them by your defined topics of interest. You might teach your model to classify texts that are related to price, for example, helping it associate currency symbols, numbers, related terms (affordable, pricey) and expressions (it costs a fortune) with the word price. Once it’s able to make connections on its own, your model will automatically classify unseen texts.

Take this review:

“We feel Intercom is the best value for the money, and are happy paying for it every month”

A trained text classification model would recognize the expression ‘value for the money’ and the word ‘paying’, and assign the review to the category Price.
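The kinds of signals a price classifier picks up on can be sketched with a few hand-written rules. This is not a trained model: the patterns below are assumptions chosen to mirror the currency symbols, related terms, and expressions mentioned above, whereas a real classifier would learn such associations from tagged examples.

```python
import re

# Hand-picked patterns for illustration only.
PRICE_PATTERNS = [
    r"[$€£]\s?\d",                               # currency symbol followed by a number
    r"\b(affordable|pricey|expensive|cheap)\b",  # related terms
    r"\bcosts? a fortune\b",                     # expressions
    r"\bvalue for (the )?money\b",
    r"\bpay(s|ing)?\b",
]

def mentions_price(text):
    """Return True if any price-related pattern appears in the text."""
    return any(re.search(pattern, text, flags=re.IGNORECASE) for pattern in PRICE_PATTERNS)

review = ("We feel Intercom is the best value for the money, "
          "and are happy paying for it every month")
print(mentions_price(review))
print(mentions_price("The dashboard is easy to navigate"))
```

Rules like these are quick to write but brittle, which is one reason to train a model on tagged examples instead.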

Now, you could even go one step further and discover the sentiment of each text. This review about price, for example, would be classified as positive. Combining topic analysis with sentiment analysis in this way is known as aspect-based sentiment analysis.
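A heavily simplified sketch of combining the two analyses might look like the following. The word lists are hypothetical and tiny, and the code gives one overall sentiment to the whole text, while fuller aspect-based approaches score each topic separately.

```python
import re

# Hypothetical keyword lists, for illustration only.
ASPECT_TERMS = {
    "Price": {"price", "value", "money", "paying", "cost", "expensive"},
    "Ease of Use": {"easy", "intuitive", "confusing", "usability"},
}
POSITIVE = {"best", "happy", "great", "love", "easy", "intuitive"}
NEGATIVE = {"worst", "unhappy", "expensive", "confusing", "hate"}

def analyze(text):
    """Return the topics found in a text plus a single overall sentiment."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    topics = sorted(tag for tag, terms in ASPECT_TERMS.items() if words & terms)
    balance = len(words & POSITIVE) - len(words & NEGATIVE)
    if balance > 0:
        sentiment = "positive"
    elif balance < 0:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    return {"topics": topics, "sentiment": sentiment}

review = ("We feel Intercom is the best value for the money, "
          "and are happy paying for it every month")
result = analyze(review)
print(result)
```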

Getting Started with Topic Analysis Tools

Now that you know how topic analysis works and why it’s useful, let’s take a look at how to create a topic classifier using SaaS tools like MonkeyLearn. You don’t need to know much about machine learning or how to code, and you won’t need to spend a fortune on resources to build your classifier. 

Just create a free MonkeyLearn account and follow these steps: 

1. Create your classifier

Access your dashboard, click on ‘Create a Model’ and choose ‘Classifier’.

MonkeyLearn's creation wizard, with the option to choose a classifier or extractor.

2. Choose type of classification 

To create a customized model that classifies your texts based on their topic, you need to choose ‘Topic classification’.

MonkeyLearn's creation wizard showing classification options: topic, sentiment, and intent.

3. Upload your texts

Feed your classifier with relevant texts. The more you upload and tag, the smarter your model becomes. You can import your data by uploading Excel or CSV files, or you can integrate MonkeyLearn’s models with Gmail, Twitter, Zendesk, or RSS feeds.

MonkeyLearn's creation wizard showing available sources to upload your text data.

4. Define your tags

Now it’s time for you to define the tags your classifier will use to analyze your texts. You’ll need at least two tags to start with, and you can add more tags later if needed.

Once you’ve finished training your model, it will start making predictions based on your predefined tags. Some things to consider when tagging:

  • Try not to use more than 10 tags so your model doesn’t get confused, and you get more accurate results – especially during the training period. 
  • Don’t use tags that are too niche or specific, as you will need enough data samples for each tag to train your classifier. Also, avoid overlapping topics to improve the model’s accuracy.
A user defining tags for the topic analysis model.
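If you keep your tagged samples in a spreadsheet or script before uploading them, a quick sanity check along these lines can flag the issues mentioned in the tips above. The limit of 10 tags comes from the text, while the minimum number of samples per tag is a hypothetical threshold you would adjust for your own data.

```python
from collections import Counter

MAX_TAGS = 10            # upper bound suggested in the tips above
MIN_SAMPLES_PER_TAG = 3  # hypothetical threshold, tune it for your data

def check_tags(tagged_texts):
    """Return a list of warnings about the tag set of (text, tag) pairs."""
    counts = Counter(tag for _, tag in tagged_texts)
    warnings = []
    if len(counts) < 2:
        warnings.append("Define at least two tags.")
    if len(counts) > MAX_TAGS:
        warnings.append(f"{len(counts)} tags defined; consider merging some (max {MAX_TAGS}).")
    for tag, count in sorted(counts.items()):
        if count < MIN_SAMPLES_PER_TAG:
            warnings.append(f"Tag '{tag}' has only {count} sample(s).")
    return warnings

training_data = [
    ("Too expensive for a small team", "Price"),
    ("Great value for the money", "Price"),
    ("The monthly fee keeps going up", "Price"),
    ("Support never replied to my email", "Customer Support"),
]
warnings = check_tags(training_data)
for warning in warnings:
    print(warning)
```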

5. Start tagging

Train your classifier by manually assigning the correct tag to example texts for each category.

6. Test your classifier

After training your classifier, you can test it to see how it works. You can write your own text and see how your model classifies the new data.
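Beyond trying out single texts, one common way to test any classifier is to measure its accuracy on examples it has not seen during training. The sketch below assumes you have some `classify(text)` function available. Here it is replaced by a deliberately naive stand-in rule, and the held-out examples are invented.

```python
def accuracy(classify, labeled_texts):
    """Fraction of texts for which classify(text) returns the expected tag."""
    correct = sum(1 for text, tag in labeled_texts if classify(text) == tag)
    return correct / len(labeled_texts)

def keyword_classifier(text):
    """Stand-in for a trained model: a single keyword rule."""
    return "Price" if "price" in text.lower() else "Customer Support"

# Examples kept aside for testing, never used for training.
held_out = [
    ("The price went up again", "Price"),
    ("Support replied within minutes", "Customer Support"),
    ("Too expensive for what it offers", "Price"),
    ("I could not reach an agent", "Customer Support"),
]
score = accuracy(keyword_classifier, held_out)
print(f"Accuracy: {score:.0%}")  # prints "Accuracy: 75%"
```

The stand-in misses the review that says ‘expensive’ without the word ‘price’, which is exactly the kind of gap that more tagged examples help a trained model close.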

Final Words & Closing

All in all, topic analysis will help you detect themes and topics within your data in a scalable, fast, and cost-efficient way. Whether you choose to carry out topic classification or topic modeling, or even both, you’ll be able to obtain valuable insights from your texts. And the best part is that you’ll be able to make informed decisions about your product or service.

Forget about time-consuming, tedious manual data processing, and discover the topics your texts mention in just seconds with MonkeyLearn. As demonstrated above, it’s very easy to use one of our pre-trained models or train your own classifier for more accurate results. 

Sign up to MonkeyLearn for free so you can start using our range of topic analysis tools right away. If you have any questions about topic analysis, do not hesitate to contact us. Our agents will be happy to help!

Rachel Wolff

March 5th, 2020
