Introducing MonkeyLearn 3.0

by Raúl Garreta

Hi everybody, we’re excited to introduce MonkeyLearn 3.0!

Over the last couple of years, we have been working with clients ranging from well-known SaaS companies to oil and gas businesses. They all share a common need: to automate manual processes that involve a lot of text, whether that means processing support tickets, analyzing feedback or reviews, or extracting information from contracts and documents.

We are now launching a new MonkeyLearn version allowing more people to build text analysis models powered by machine learning.

Here is what we are excited to share:

  • Redesigned GUI and API: a cleaner, simpler interface that serves both technical and non-technical users.
  • Custom Extractors: a major new feature that lets you train an ML model to extract custom data from texts.
  • Custom Classifiers: you can train an ML model to classify texts into tags, as with our previous version, but much more easily 🙂
  • Active Learning: tagging is quicker because tags are suggested as you train the model.

Custom Text Extraction

Text extraction involves identifying relevant data within a piece of text. Common examples of things you can extract include keywords, person and company names, emails, addresses and prices. However, many of our clients have needed to extract more specific pieces of information that are unique to their own context. This calls for custom extraction, a new kind of custom model that you can now build in MonkeyLearn.

In the following example, you can see how a series of laptop features (brand, model, display, cpu, etc.) are being extracted from product descriptions. The results are shown to the right, with each tag having its corresponding value – brand: Apple, cpu: Intel Core i7, memory: 8 GB, etc.

Extracting laptop features from unstructured product descriptions
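To make the idea of tag and value pairs concrete, here is a minimal sketch of how extracted laptop features could be collected into one structured record. The dictionary keys `tag` and `value` and the helper `to_record` are assumptions made for illustration; they are not the MonkeyLearn response format.

```python
from collections import defaultdict

# Hypothetical extractor output: one entry per extracted span.
# The "tag"/"value" keys are illustrative, not an official schema.
extractions = [
    {"tag": "brand", "value": "Apple"},
    {"tag": "cpu", "value": "Intel Core i7"},
    {"tag": "memory", "value": "8 GB"},
]


def to_record(extractions):
    """Group extracted values by tag, keeping every value found for a tag."""
    record = defaultdict(list)
    for item in extractions:
        record[item["tag"]].append(item["value"])
    return dict(record)


laptop = to_record(extractions)
```

Values are kept in lists because a single description may mention the same kind of feature more than once.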

With MonkeyLearn, you can now build a custom extractor seamlessly from our web interface. As you tag example texts, the model is trained in the background, learning from the patterns in your data and highlighting suggestions in real time (more on this below). The result is a huge boost in speed and productivity.

Gathering enough training samples is usually the painful part of a machine learning project. Now users can not only work faster, but also see how the model is performing as they go, which lets them make corrections and give feedback that improves accuracy more quickly.

The following is what the tagging process for making a custom extractor looks like. Users train the model by annotating or selecting the text (words or phrases) that corresponds to the appropriate tag.

Custom Text Classifiers

Text classification turns texts into tags. The idea is to classify texts by topic, sentiment or intent, to name a few examples. Up until now, custom classifiers have been our flagship product. We believe that the key to high accuracy in text classification is for users to build their own custom classifiers, and that is now easier than ever with MonkeyLearn. We’ve updated the look and feel and simplified the process: users define their own custom set of tags, provide training data, and let machine learning work on the fly.

In the following example, a piece of feedback for a software product is classified by tags that indicate what the feedback is about: Features, Ease of Use, Support, Pricing, etc. The resulting tags are shown on the right with the corresponding level of confidence.
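As a rough sketch of what working with tags and confidence levels can look like downstream, the snippet below keeps only the tags above a chosen threshold. The prediction structure, the confidence numbers and the `confident_tags` helper are all made up for illustration and are not taken from MonkeyLearn.

```python
def confident_tags(predictions, threshold=0.5):
    """Keep predictions at or above the threshold, most confident first."""
    kept = [p for p in predictions if p["confidence"] >= threshold]
    return sorted(kept, key=lambda p: p["confidence"], reverse=True)


# Hypothetical classifier output for one piece of product feedback.
predictions = [
    {"tag": "Pricing", "confidence": 0.12},
    {"tag": "Ease of Use", "confidence": 0.87},
    {"tag": "Features", "confidence": 0.64},
]

selected = [p["tag"] for p in confident_tags(predictions)]
```

The right threshold depends on the use case: a lower value catches more tags, while a higher value reduces false positives.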

Building and training a custom classifier is now a lot simpler: just upload your text data and define the set of tags. You can see what the training process looks like below. The text is shown on the left, and the user selects the tag (or tags) that apply on the right. As with custom extractors, the model learns from your classifications in the background and begins to make suggestions.

Making suggestions during the tagging process for both extractors and classifiers is something we are really excited about. Users will be able to actively see the model learning as they tag more samples. This is done using various active learning techniques that help maximize training impact and minimize effort. The more texts are tagged, the more accurate the model becomes, and users can see the fruits of their tagging labor right in front of them.
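For readers curious about what an active learning technique can look like, here is a minimal sketch of one common approach, uncertainty sampling: ask the user to tag the texts the current model is least sure about. This is a generic illustration, not a description of the techniques MonkeyLearn actually uses; `least_confident`, `predict_proba` and the toy scores are assumptions.

```python
def least_confident(unlabeled, predict_proba, k=2):
    """Return the k texts whose top predicted probability is lowest.

    predict_proba is any callable mapping a text to a dict of
    {tag: probability}; here it stands in for the current model.
    """
    scored = [(max(predict_proba(text).values()), text) for text in unlabeled]
    scored.sort(key=lambda pair: pair[0])  # least confident first
    return [text for _, text in scored[:k]]


# Toy stand-in for a model: precomputed probabilities per text.
toy_scores = {
    "The app crashes on login": {"Support": 0.95, "Pricing": 0.05},
    "Too expensive for what it does": {"Support": 0.10, "Pricing": 0.90},
    "Can I get an invoice from support?": {"Support": 0.55, "Pricing": 0.45},
}

queue = least_confident(list(toy_scores), toy_scores.get, k=1)
```

Tagging the most ambiguous texts first tends to give the model more information per tagged sample than tagging texts it already classifies confidently.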

Integrations, API and SDKs

We have also redesigned our API based on tons of feedback that we received from our users and customers. We did the same with our SDKs. All of this was done with the idea of maintaining backwards compatibility with our API v2. We invite you to check out the new API docs and help docs. We invested a lot of time and effort to help explain things, as these docs are part of the product itself.

In the coming months, we plan to add more integrations with third party apps and platforms.

Final Words

We feel that we have learned a lot from our users, and that this has helped point us in the right direction for where our technology needs to go. The process of building a custom extractor or custom classifier can be easy and fun! Our goal has always been to build a tool that allows everyone to train a machine learning model with a simple graphical interface. This involves not just the right algorithms and model performance, but designing the right user experience and offering the appropriate integrations.

We hope that you find this new version valuable. We invite you to try it out and we really appreciate any comments and feedback you might have that can help improve the product. Most of all, we would love to hear from you about the use cases you find for text analysis, and how custom extraction and classification can help you get things done.

Raúl Garreta

MonkeyLearn Co-Founder & CEO. Machine Learning and Natural Language Processing expert. Author of "Learning scikit-learn: Machine Learning in Python".
