Twitter Topic Analysis - How to Do It & Why You Need It

Every business knows it’s important to understand what customers are saying about their brand and their products, every minute of every day.

Twitter can provide a wealth of information about public opinion toward your brand. In fact, it’s often on social media where customers give their most honest opinions because they’re unguarded, and users simply feel compelled to speak their minds. But with Twitter users sending approximately 6,000 tweets every second, where do you even begin to make sense of all that data?

Enter Twitter topic analysis, a fast and simple, AI-powered data science solution to find out what customers are tweeting about your brand for up-to-the-minute, data-driven insights.

What Is Topic Analysis?

With the help of natural language processing (NLP), machines are able to break down human language and analyze it for powerful insights. Topic analysis (or topic detection) is a text analysis technique that uses machine learning to automatically process text and categorize it by topic or subject.

Topic analysis allows you to find Twitter mentions of your brand and products and have them automatically classified or categorized to find out the subjects and issues that are most important to your customers.

The two most common methods of topic analysis are topic modeling and topic classification.

Topic modeling uses unsupervised machine learning to find patterns in text and cluster similar words and phrases, without the need to create defined tags (categories) ahead of time.

With topic classification (or topic extraction), on the other hand, you pre-define the tags and use tagged examples to train a classification model, which then sorts your texts (tweets) into the tags you set up.
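To make the idea of pre-defined tags concrete, here is a minimal sketch of topic classification in plain Python. It assumes a tiny multinomial Naive Bayes model with add-one smoothing and a handful of made-up training tweets. The class name, the tags, and the example texts are all illustrative assumptions, not MonkeyLearn's actual model or data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTopicClassifier:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # tag -> word -> count
        self.tag_counts = Counter()              # tag -> number of training texts
        self.vocabulary = set()

    def train(self, labeled_texts):
        """Count words per tag from (text, tag) pairs."""
        for text, tag in labeled_texts:
            self.tag_counts[tag] += 1
            for word in tokenize(text):
                self.word_counts[tag][word] += 1
                self.vocabulary.add(word)

    def classify(self, text):
        """Return the tag with the highest log-probability for the text."""
        total_texts = sum(self.tag_counts.values())
        best_tag, best_score = None, float("-inf")
        for tag in self.tag_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.tag_counts[tag] / total_texts)
            total_words = sum(self.word_counts[tag].values())
            for word in tokenize(text):
                count = self.word_counts[tag][word] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag


# Made-up training examples for three pre-defined tags.
training_data = [
    ("way too expensive for what you get", "Price"),
    ("the price went up again", "Price"),
    ("the new menu layout is confusing", "UX/UI"),
    ("love the redesigned home screen", "UX/UI"),
    ("games load so slowly after the update", "Performance"),
    ("it keeps lagging and freezing", "Performance"),
]

classifier = NaiveBayesTopicClassifier()
classifier.train(training_data)
print(classifier.classify("the home screen layout looks great"))
```

With only six training texts, the model leans entirely on word overlap, which is why a real classifier needs far more tagged examples per tag.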

Imagine you just released a new product or feature and you want to find out what customers are tweeting about it right now (or over a period of time), but there are far too many tweets to read through manually. What are the most important topics? Are they discussing Price, UX/UI, Performance?

Take these tweets for example:

Tweets mentioning PlayStation 5's UX.

A trained topic classifier would tag the tweets above as “UX/UI,” and it can work through thousands of tweets in just seconds.

Other text mining techniques, like sentiment analysis and keyword extraction, can be combined with topic analysis to find the actual sentiments around the topics and the most important words customers are using to describe them.

Why Is Topic Analysis Important?

  • Easy to scale

Machine learning lets you analyze tweets (or other text) at a scale that would be impossible for humans to achieve. SaaS text analysis platforms are easy to implement and integrate with tools you already use, and you can scale up or down immediately as needed.

  • Real-time analysis

Follow product releases and marketing campaigns 24/7 and in real time, so you won’t miss a single tweet, and you’ll be able to follow your brand image as it changes over time. Combining topic analysis with other NLP techniques, like Twitter sentiment analysis (which reads text for opinion polarity: positive, negative, or neutral), allows you to understand how your customers feel about your brand (and its individual topics or aspects) at any given moment.

  • Consistent criteria

Topic analysis allows you to train models to the specific language of your business, your needs, and your criteria, so your analysis is always consistent, unlike human analysis.

Let’s jump in and see how it works.

Tutorial: How to Do Twitter Topic Analysis

SaaS tools like MonkeyLearn make topic analysis of tweets easy, with no code required. Follow along to build your own topic classifier in just a few minutes:

1. Get Twitter Data to Analyze

MonkeyLearn

With MonkeyLearn, you can search Twitter (or import from other apps) right in the dashboard. There are lots of great integrations with tools you already use. We’ll explain this in detail later, but here’s what it looks like:

The option to import data from a variety of sources.

Just enter a company, product name, keyword, Twitter handle, or whatever you’re looking for:

A search bar to enter a Twitter search query.

We’ll be searching for data on Apple AirPods. It just takes a few seconds for the data to upload directly from Twitter:

The search automatically populating the model with tweets mentioning "Apple AirPods."

Twitter API

If you’re not afraid of just a bit of code, the Twitter API is another easy option to connect directly to Twitter data and search by keyword, brand, hashtag, etc. You can search historical data, collect tweets from specific users, even gather your personal DMs.
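As a rough illustration of the keyword search described above, the sketch below builds (but does not send) a search request using only the standard library. It assumes the Twitter API v2 "recent search" endpoint, a bearer token for authentication, and the `-is:retweet` and `lang:en` query operators. Endpoints and access tiers change over time, and recent search covers only the last few days rather than full historical data, so treat every name here as an assumption to verify against the current API documentation.

```python
import urllib.parse
import urllib.request

# Assumed endpoint: Twitter API v2 "recent search". Verify it against the
# current API documentation and your access tier before relying on it.
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_search_request(keyword, bearer_token, max_results=10):
    """Build (but do not send) a keyword search request."""
    params = urllib.parse.urlencode({
        # Exact-phrase match; the operators drop retweets and keep English tweets.
        "query": f'"{keyword}" -is:retweet lang:en',
        "max_results": max_results,
    })
    return urllib.request.Request(
        f"{SEARCH_URL}?{params}",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )


# "YOUR_BEARER_TOKEN" is a placeholder, not a working credential.
request = build_search_request("Apple AirPods", "YOUR_BEARER_TOKEN")
print(request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` and parsing the JSON response would be the next step once you have a valid token.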

2. Prepare Your Twitter Data

Twitter data comes with a lot of “noise,” like repetitive text or irrelevant text and symbols that can affect your analysis results.

Things like URL links, handles, and emojis are generally unnecessary to your analysis, so it’s important that you clean your data. Running spell check and removing special characters and URL links is a good start. Or take a look at how to clean your data with code.
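The cleaning steps mentioned above can be sketched with a few regular expressions. This is a minimal example, assuming you want to drop URLs, handles, and emojis while keeping the words of hashtags. The emoji pattern covers only the most common Unicode blocks and is not exhaustive, and the handle pattern would also match the domain part of an email address.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
HANDLE_PATTERN = re.compile(r"@\w+")
# Rough emoji coverage: common emoji and symbol blocks, plus the variation
# selector and zero-width joiner used in emoji sequences.
EMOJI_PATTERN = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF\U0000FE0F\U0000200D]"
)


def clean_tweet(text):
    """Remove URLs, handles, and emojis, then normalize whitespace."""
    text = URL_PATTERN.sub("", text)
    text = HANDLE_PATTERN.sub("", text)
    text = EMOJI_PATTERN.sub("", text)
    text = text.replace("#", "")  # keep the hashtag word, drop the symbol
    return re.sub(r"\s+", " ", text).strip()


raw = "@Apple my AirPods keep disconnecting 😡 https://t.co/abc123 #AirPods"
print(clean_tweet(raw))
# → my AirPods keep disconnecting AirPods
```

Whether to keep hashtag words or drop them entirely depends on your analysis, since hashtags sometimes carry the topic and sometimes just repeat the brand name.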

MonkeyLearn also offers a number of data cleaning tools that can help prep your data automatically. The opinion units extractor, for example, can break tweets or entire pages of text into individual thoughts or statements called “opinion units.” For example, the text “I like the new update, but it seems really slow. I can’t get tech support on the phone either.” is split into three opinion units:

Tag      Value
OPINION  I like the new update,
OPINION  but it seems really slow.
OPINION  I can’t get tech support on the phone either.

That way, your analysis is run on individual statements, rather than whole tweets that could contain multiple opinions.
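To show what splitting into individual statements means in practice, here is a naive rule-based sketch. It is not MonkeyLearn's opinion units extractor, which is a trained model. This version only assumes that a new unit starts after sentence-ending punctuation or before a contrastive conjunction that follows a comma, and the conjunction list is an arbitrary choice for illustration.

```python
import re

SPLIT_PATTERN = re.compile(
    r"(?<=[.!?])\s+"  # whitespace after sentence-ending punctuation
    r"|(?<=,)\s+(?=(?:but|however|although|though)\b)",  # before a contrast word
    re.IGNORECASE,
)


def split_opinion_units(text):
    """Split text into rough 'opinion units' using simple punctuation rules."""
    return [unit.strip() for unit in SPLIT_PATTERN.split(text) if unit.strip()]


tweet = (
    "I like the new update, but it seems really slow. "
    "I can't get tech support on the phone either."
)
for unit in split_opinion_units(tweet):
    print(unit)
```

Rules like these break down quickly on informal tweets with missing punctuation, which is one reason a trained extractor is usually the better choice.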

If you’re getting your data from somewhere other than Twitter, you can use tools like the email cleaner to automatically remove email signatures, legal clauses, previous replies, etc., or the boilerplate extractor, which removes things like templates, ads, and navigation bars to extract only the relevant text from HTML.

3. Create a Custom Topic Analysis Model

Here’s the fun part – where you get to see machine learning at work.

3.1. Choose Your Model

If you haven’t already, sign up to MonkeyLearn for free. It just takes a few seconds. Then, go to the MonkeyLearn dashboard, click ‘Create Model,’ and choose ‘Classifier’:

The option to choose Classifier or Extractor.

Choose ‘Topic Classification’:

The option to choose Text Classification, Sentiment Analysis, or Intent Classification.

3.2. Import Your Twitter Data

Here are all the upload and direct integration options. You can upload a cleaned CSV or Excel file or connect to one of the apps you already use. For this tutorial, we’ll be uploading directly from Twitter. Click the Twitter icon:

The option to import data from a variety of sources.

3.3. Search for Tweets

Enter your search query. It can be keywords, a company name, Twitter handles, etc. We’re searching “Apple AirPods”:

The search automatically populating the model with tweets mentioning "Apple AirPods."

3.4. Define Your Topic Analysis Tags

The topic criteria are up to you and depend on the results you’re looking for. You can, ultimately, use as many tags as you want, but it’s best to start with ten or fewer. They should be specific enough that the categories or subjects don’t overlap. We’re using the RUF tagging criteria: “Reliability,” “Usability,” “Functionality.”

Entering tags, "Reliability," "Usability," and "Functionality."

3.5. Train Your Twitter Topic Analyzer

Start tagging tweets with the appropriate topic. After you tag a few, you’ll notice the model will begin predicting for you. Correct the predictions if they’re inaccurate.

Training the topic classifier to tag by topic.

3.6. Test Your Twitter Topic Analyzer

Once you’ve trained it with enough data, the app will ask you to name your model. Then, you can test it with new text. Cut and paste or write in new text.

If it’s still not performing to your criteria, click ‘Build’ and continue training it. The more you train your model, the smarter it will become.

The topic classifier classifying the text, "Hard to figure connectivity." as "Usability."

Also in the ‘Build’ menu, you can check stats to see how it’s performing. You can see overall stats or check by individual tags. The stats below show that our new model is only performing at 50% accuracy (because we’ve only used 12 training texts so far), so it definitely needs more training.
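For readers who want to see what an accuracy figure and per-tag stats represent, here is a minimal sketch that computes them from a list of true tags and predicted tags. The six labels below are made up for illustration, and the dashboard may compute its stats differently (for example, with cross-validation), so this only shows the basic arithmetic.

```python
from collections import Counter


def evaluate(true_tags, predicted_tags):
    """Return overall accuracy and per-tag precision and recall."""
    pairs = list(zip(true_tags, predicted_tags))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    true_positive = Counter(t for t, p in pairs if t == p)
    predicted_total = Counter(predicted_tags)
    actual_total = Counter(true_tags)

    per_tag = {}
    for tag in sorted(set(true_tags) | set(predicted_tags)):
        tp = true_positive[tag]
        per_tag[tag] = {
            # Of the texts given this tag, how many really belong to it?
            "precision": tp / predicted_total[tag] if predicted_total[tag] else 0.0,
            # Of the texts that belong to this tag, how many were found?
            "recall": tp / actual_total[tag] if actual_total[tag] else 0.0,
        }
    return accuracy, per_tag


# Made-up labels: what a human tagged vs. what the model predicted.
true_tags = ["Usability", "Reliability", "Functionality",
             "Usability", "Reliability", "Functionality"]
predicted_tags = ["Usability", "Usability", "Functionality",
                  "Usability", "Functionality", "Reliability"]

accuracy, per_tag = evaluate(true_tags, predicted_tags)
print(f"Accuracy: {accuracy:.0%}")
for tag, stats in per_tag.items():
    print(tag, stats)
```

Looking at precision and recall per tag, rather than accuracy alone, shows which tags need more training examples.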

The word cloud on the bottom right shows the most common words used in the tweets we analyzed.

The "Stats" page in the MonkeyLearn topic classifier shows how well the model is performing and features a word cloud with the most used words.

3.7. Put Your Twitter Topic Analyzer to Work

Now that you’ve built your model, you can analyze thousands of tweets in a single go. And it’s easy. Click ‘Run,’ and choose one of the below:

  • Batch Analysis: You can upload a CSV or Excel file with new tweets. The model will automatically process them and return a CSV with the results.
  • Integrations: MonkeyLearn offers a number of easy-to-use integrations, some with apps and tools you probably already use, like Gmail, Google Sheets, SurveyMonkey, Zapier, Zendesk, and more.
  • API: If you know a bit (or a lot) of coding, MonkeyLearn’s API in Python and other languages is great for a truly streamlined data analysis process.
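The batch workflow above (a CSV of tweets in, a CSV with results out) can be sketched in a few lines of Python. The `keyword_classifier` function is a hypothetical stand-in for a trained model. In practice you would replace it with a call to your own classifier or to an API client. The column names `text` and `topic` are also assumptions.

```python
import csv
import io


def classify_csv(input_file, output_file, classify, text_column="text"):
    """Read tweets from a CSV, add a 'topic' column, and write the results."""
    reader = csv.DictReader(input_file)
    fieldnames = list(reader.fieldnames) + ["topic"]
    writer = csv.DictWriter(output_file, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        row["topic"] = classify(row[text_column])
        writer.writerow(row)


def keyword_classifier(text):
    """Hypothetical stand-in for a trained model or an API call."""
    lowered = text.lower()
    if "battery" in lowered or "broke" in lowered:
        return "Reliability"
    if "connect" in lowered or "pair" in lowered:
        return "Usability"
    return "Functionality"


# In-memory files keep the example self-contained; real code would use open().
tweets_csv = io.StringIO(
    "text\nHard to figure connectivity.\nBattery died after a month.\n"
)
results_csv = io.StringIO()
classify_csv(tweets_csv, results_csv, keyword_classifier)
print(results_csv.getvalue())
```

Passing the classifier in as a function keeps the CSV handling separate from the model, so the same loop works whether predictions come from local code or a remote service.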

4. Visualize Your Topic Analysis Results

Business intelligence (BI) visualization tools, like MonkeyLearn Studio, allow you to see your results in striking detail. But, unlike most BI tools, MonkeyLearn Studio is an all-in-one data gathering, cleaning, analysis, and visualization tool. You can combine a number of text analysis tools and perform your text analysis all in a single, simple, interactive dashboard.

MonkeyLearn Studio dashboard showing results for intent classification and sentiment analysis in charts and graphs.

The above is a MonkeyLearn Studio dashboard showing aspect-based sentiment analysis of customer reviews of Zoom. Aspect-based sentiment analysis first performs topic analysis to categorize comments (Usability, Support, Reliability, etc.), then analyzes them by sentiment or “opinion polarity” (positive, negative, neutral, etc.), so that you end up finding out which aspects of your business are positive and which are negative.
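The aggregation behind an aspect-based view can be sketched as a simple count of sentiment labels per topic. The example assumes each comment has already been tagged with one topic and one sentiment by upstream models. The tagged pairs below are made up for illustration and are not taken from the Zoom dashboard.

```python
from collections import Counter, defaultdict


def summarize_by_aspect(tagged_comments):
    """Count sentiment labels per topic from (topic, sentiment) pairs."""
    summary = defaultdict(Counter)
    for topic, sentiment in tagged_comments:
        summary[topic][sentiment] += 1
    return summary


# Made-up (topic, sentiment) pairs, as upstream classifiers might produce them.
tagged_comments = [
    ("Usability", "Positive"),
    ("Usability", "Positive"),
    ("Usability", "Negative"),
    ("Support", "Negative"),
    ("Reliability", "Negative"),
    ("Reliability", "Positive"),
    ("Support", "Negative"),
]

summary = summarize_by_aspect(tagged_comments)
for topic, counts in sorted(summary.items()):
    total = sum(counts.values())
    share = counts["Positive"] / total
    print(f"{topic}: {share:.0%} positive ({total} comments)")
```

A table like this is what a dashboard chart is drawn from: one row per aspect, with the share of each sentiment.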

You can play around with the MonkeyLearn Studio public dashboard to see just how easy it is to use.

Start Analyzing Twitter Topics

Powerful machine learning topic analysis tools allow you to quickly and easily organize tweets about your brand, so you can find out how your customers feel about the topics and categories that matter. Get actionable, data-driven insights from your Twitter data in real time, or follow your brand image and marketing efforts over time.

MonkeyLearn’s suite of text analysis tools can all be combined in MonkeyLearn Studio for a seamless data mining process that, once your models are trained, requires almost no human interaction.

Check out the pricing page to see how MonkeyLearn stacks up against the competition. Or schedule a demo to learn more about topic analysis and the many other powerful text analysis techniques that MonkeyLearn has to offer.

Tobias Geisler Mesevage

October 16th, 2020