Know your followers with Machine Learning

Tell me who your followers are and I'll tell you who you are.

You have lots of followers, congrats! You're popular, maybe an influencer. But do you actually know who's following you?

If you have thousands of followers, that could be a tricky question to answer. Let's use Machine Learning to try to answer that question.

As of today (Oct 4th, 2016), MonkeyLearn's Twitter account has almost 20,000 followers. It would be great to know what kind of people follow us, find out more about their interests, who they are, and what they do, and try to figure out why they follow us. That could give us useful insights about our user base and product.

We quickly used Twitter's public API to download the bios of all of our followers. You can take a look at the Python script here. In order to use it, you'll need to get your API tokens.

We stored all the data in a simple CSV file. First column: user handle, second column: user bio. Like this:

edp3rez,"Actively seeking to become wiser every day and attract people with a similar mindset so we can find ways to build a new world. Is that you?"
FintechArticle,"Fintech Articles and News Analysis from around the web"
jahangeerbaluch,"Computer Scientist"
MFENOGLIO,"Don't forget to stop and have fun from time to time . Si luchas por lo que realmente quieres vendra a ti. A buenas el mejor"
michaelyoungMBN,"Founder/CEO @ @mbnsolutions - Managing Partner @ @mbnconsilium - Founder of @DataSciTechScot"
brolouiemd,"fashionate seasonal web developer, wordpress developer"
vexipoloxozo,"Tнe 2015 Sαle ιѕ нere! yoυ cαɴ ɴow вυy 50,000 Twιттer Followerѕ ғor oɴly $146, Try ιт ɴow! αт"
PetroSemeniuk,"Developer. Developer. Developer."
SaurabhIAm,"I am a Techno Freak, A Computer Science Geek, App. Developer, A Web Developer, A Kickass Gamer, and A Painting Artist!"
krkdev,"Developer and Debugger"
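
For illustration, here is a minimal sketch of how a file in that layout could be written. It assumes the followers were already fetched as dictionaries with `screen_name` and `description` keys (the field names used by Twitter's user objects). The function name `write_bios_csv` is hypothetical, and this is not the script mentioned above.

```python
import csv

def write_bios_csv(users, path):
    """Write one row per follower: handle first, bio second."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)  # quotes bios that contain commas
        for user in users:
            # Collapse whitespace so a multi-line bio stays on one row;
            # a missing bio (None) becomes an empty string.
            bio = " ".join((user.get("description") or "").split())
            writer.writerow([user["screen_name"], bio])
    return path
```

Letting the `csv` module do the quoting matters here, because many bios contain commas of their own.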

Then we wrote a simple Python script that reads all the bios of our users, concatenates them into a single text variable, and sends that text to our Keyword Extractor model in MonkeyLearn.
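
The reading and concatenating step could look roughly like the sketch below. The function name `load_bios` and its defaults are assumptions made for this example. The call to MonkeyLearn is left out on purpose, since its exact signature is not shown in this post.

```python
import csv

def load_bios(path, max_rows=4000, column=1):
    """Return the bios from the last `max_rows` rows, joined into one string."""
    with open(path, newline="", encoding="utf-8") as f:
        # Ignore malformed rows that do not have the bio column at all.
        rows = [row for row in csv.reader(f) if len(row) > column]
    bios = [row[column].strip() for row in rows[-max_rows:]]
    # Skip empty bios; one bio per line keeps them apart in the final text.
    return "\n".join(bio for bio in bios if bio)
```

The returned string is the "text variable" that would then be sent to the keyword extractor.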

You can execute the command like this:

python -s 4000 -c 1 -k 100 -t <<YOUR TOKEN HERE>> followers_bios.csv


The -s option sets the maximum number of rows (bios) to use. I suggest limiting it to the last 4,000 texts; too many of them would take a long time to process.

The -c option sets the column number (starting at 0) where the bios are located in the CSV file.

The -k option sets the maximum number of keywords to return.

The -t option sets your MonkeyLearn API token.

Lastly, followers_bios.csv is the CSV file where you stored the bios.
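
As a rough sketch, options like these could be declared with Python's standard `argparse` module. The flags mirror the ones described above, but the destination names and default values are assumptions for illustration, not taken from the original script.

```python
import argparse

def parse_args(argv=None):
    """Parse the -s, -c, -k and -t options plus the CSV file path."""
    parser = argparse.ArgumentParser(
        description="Extract keywords from follower bios.")
    parser.add_argument("-s", dest="max_rows", type=int, default=4000,
                        help="max number of rows (bios) to use")
    parser.add_argument("-c", dest="column", type=int, default=1,
                        help="0-based column that holds the bios")
    parser.add_argument("-k", dest="max_keywords", type=int, default=100,
                        help="max number of keywords to return")
    parser.add_argument("-t", dest="token", required=True,
                        help="MonkeyLearn API token")
    parser.add_argument("csv_file", help="CSV file with the bios")
    return parser.parse_args(argv)
```

Passing `argv=None` makes `argparse` read the real command line, while passing a list makes the function easy to try out interactively.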

The keywords returned will be sorted according to their relevance within the texts.

You can even try the same process by copying and pasting the texts into MonkeyLearn's GUI; just go to the API section. Note that this only returns the top 10 keywords.


And the results for the top 100 keywords associated with MonkeyLearn's followers are:

Developer, Python, Machine Learning, Tweets, Marketing, Web Developer, Big Data, Data Science, Designer, Data, startups, software developer, Engineer, Software, Entrepreneur, Web, Analytics, Technology, Consultant, Social Media, tech, Opinions, enthusiast, app developer, Artificial Intelligence, Business, Computer, geek, Manager, views, lover, research, innovation, Programmer, Founder, music lover, Natural Language Processing, love, Blogger, Learning, Java, Interests, graphic design, digital marketing, PHP, Science, world, JavaScript, Dev, game developer, Business Developer, fan, life, photographer, companies, student, CTO, music, Scientist, Product Manager, open source, Cloud, Speaker, CEO, Analyst, customers, Linux, app, Mobile App Developer, passion, Android, people, mobile app, Software Engineer, Project Manager, Gamer, Brand, Travel, Strategy, things, Network, Product, services, Natural language, Market Research, Data Scientist, Mobile Developer, Business Intelligence, writer, Sci, Creator, father, Author, solutions, SEO, programming, Coder, endorsements, Twitter, Expert


That's great! It's definitely what we wanted to see, and we also found some interesting insights:

  • Strong popularity among developers, with keywords like Developer, Web Developer, Software Developer, Engineer, App Developer, Programmer, Game Developer, Mobile App Developer, Software Engineer, Data Scientist, Mobile Developer, Coder.
  • Strong popularity among people in the data science and technology space: Machine Learning, Big Data, Data Science, Data, Analytics, Technology, Artificial Intelligence, Research, Innovation, Science, Natural Language Processing, Programming.
  • Beyond Developer and Data Scientist, some other interesting titles came up, all closely related to the startup world: Designer, Entrepreneur, Consultant, Manager, Founder, Blogger, Business Developer, Photographer, Student, CTO, Product Manager, Scientist, Speaker, CEO, Analyst, Project Manager, Writer, Author.
  • Some technologies: Python, Java, PHP, JavaScript, Open Source, Cloud, Linux, Android.
  • Non-tech disciplines that have been growing a lot in our community, and for which we plan to offer more tools: Marketing, Startups, Social Media, Business, Graphic Design, Digital Marketing, Music, Market Research, Business Intelligence, SEO.
  • Personal characteristics that clearly show we have very enthusiastic and geeky followers: enthusiast, geek, lover, music lover, love, fan, life, passion, creator, father, expert.
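
The grouping above was done by hand. If you want to repeat it on your own keyword list, a simple lookup table is one way to do it. The sketch below is only an illustration: the `CATEGORIES` buckets hold a small sample of the keywords listed above, and the names `CATEGORIES` and `group_keywords` are hypothetical.

```python
# Buckets copied (in part) from the hand-made grouping in the list above.
CATEGORIES = {
    "developers": {"developer", "web developer", "software engineer", "coder"},
    "data science and technology": {"machine learning", "big data", "data science"},
    "technologies": {"python", "java", "php", "javascript", "linux", "android"},
    "non-tech disciplines": {"marketing", "social media", "seo"},
}

def group_keywords(keywords):
    """Map each category name to the keywords that fall into it."""
    groups = {}
    for keyword in keywords:
        category = next(
            (name for name, members in CATEGORIES.items()
             if keyword.lower() in members),
            "other",  # anything not listed in a bucket
        )
        groups.setdefault(category, []).append(keyword)
    return groups
```

Keywords that land in "other" are the ones worth a second look, since they are the ones your buckets did not anticipate.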

I hope you enjoyed this quick post. I'd love to hear your own insights about your followers!

Raúl Garreta

October 4th, 2016
