So you’re working on a text classification problem. You’re refining your training set, and maybe you’ve even tried stuff out using Naive Bayes. But now you’re feeling confident in your dataset, and want to take it one step further. Enter Support Vector Machines (SVM): a fast and dependable classification algorithm that performs very well with a limited amount of data. […]
The simplest solutions are usually the most powerful ones, and Naive Bayes is good proof of that. Despite the great advances in machine learning in recent years, it has proven to be not only simple but also fast, accurate and reliable. It has been used successfully for many purposes, but it works particularly well with natural language processing (NLP) problems. […]
This is the final part in a series where we use machine learning and natural language processing to analyze articles published on tech news sites in order to gain insights about the state of the startup industry. […]
This is the second part in a series where we analyze thousands of articles from tech news sites in order to uncover insights and trends about startups.
Last time around, we scraped all the articles ever published on TechCrunch, VentureBeat and Recode using Scrapy. We then filtered out all the articles that weren’t about startups, […]
In this new series of posts, we will analyze hundreds of thousands of articles from TechCrunch, VentureBeat and Recode to discover cool trends and insights about startups.
What are the hottest industries for startups right now?
Do machine learning startups get more press than fintech startups?
Which startup segment has the most acquisitions?
These are the […]
Election day looms closer and closer every week. US politics is rapidly becoming the preferred conversation topic for millions of Americans and non-Americans worldwide. What are these people saying? What do they think? What are their opinions? How do they feel?
We are using machine learning to find out! For the past few months, we’ve […]
In a previous post, we learned how to train a machine learning classifier that is able to detect the different aspects mentioned in hotel reviews. With this aspect classifier, we were able to automatically tell whether a particular review was talking about cleanliness, comfort & facilities, food, Internet, location, staff and […]
Recently, we walked you through how to train a sentiment analysis classifier for hotel reviews using Scrapy and MonkeyLearn. This tutorial is a perfect example of how we can combine web-scraped data and machine learning to discover valuable insights about a particular industry.
With this model we were able to analyze millions […]
Data is everywhere. And in massive quantities. We are currently in an era of data explosion, where millions of tweets, articles, comments, reviews and the like are being published every day.
Developers are taking advantage of the abundance of data and using things like web scraping to do all kinds of cool things. Sometimes web scraping is not […]