After gathering the data (text samples), you’ll need to tag each sample with one of your defined categories in order to build a training set.

For example, suppose you have the following categories:

  • Entertainment & Recreation
  • Food & Drinks
  • Health & Beauty
  • Retail
  • Travel & Vacations
  • Miscellaneous

The tagged data should be saved in a CSV or Excel file with two columns, one for the text and one for its category, in the following format:

Text | Category
64% off Ogawa’s Smart Aire Plus 2D Massage Chair OG7528 – Includes 1-Year Warranty Delivery. | Retail
Maldives: 5D and 4N Stay at 5-Star Anantara Veli Maldives Resort & Spa + 2-Way Flight by Singapore Airlines. | Travel & Vacations
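As a rough illustration of the format above, the two example rows could be written to a CSV file with Python's standard csv module. This is only a sketch: the file name training_set.csv, the helper write_training_set, and the choice to include a "Text,Category" header row are assumptions made for this example, not requirements stated by MonkeyLearn.

```python
import csv

# The two tagged samples from the example above, as (text, category) pairs.
samples = [
    ("64% off Ogawa’s Smart Aire Plus 2D Massage Chair OG7528 – Includes 1-Year Warranty Delivery.",
     "Retail"),
    ("Maldives: 5D and 4N Stay at 5-Star Anantara Veli Maldives Resort & Spa + 2-Way Flight by Singapore Airlines.",
     "Travel & Vacations"),
]


def write_training_set(path, rows):
    """Write (text, category) pairs to a two-column CSV file (hypothetical helper)."""
    # newline="" lets the csv module control line endings; utf-8 keeps
    # characters such as ’ and – intact.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Text", "Category"])  # header row: an assumption
        writer.writerows(rows)


# "training_set.csv" is an arbitrary file name chosen for this sketch.
write_training_set("training_set.csv", samples)
```

The csv module quotes any field that contains a comma, so texts with commas in them still end up in a single column.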

You can take a look at the details on how to create CSV/Excel files for MonkeyLearn.

The following tools are suggested to perform the data tagging:

  • Excel / LibreOffice / Google Spreadsheets.
  • MonkeyLearn’s GUI, in the Samples section, after creating and uploading the data (see next section for details).
  • OpenRefine.
  • Any particular tool or technique that you are familiar with.
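Whichever tool you use, it can help to check the tagged rows before uploading them, since a mistyped category name would otherwise pass unnoticed. The sketch below is one possible check, not part of MonkeyLearn: the function check_tags and the third sample row (with the deliberately misspelled tag "Retial") are hypothetical and exist only to show the idea.

```python
from collections import Counter

# The categories defined earlier in this guide.
CATEGORIES = {
    "Entertainment & Recreation",
    "Food & Drinks",
    "Health & Beauty",
    "Retail",
    "Travel & Vacations",
    "Miscellaneous",
}


def check_tags(rows):
    """Count samples per known category and collect rows with unknown tags.

    rows is a list of (text, category) pairs. Returns (counts, unknown),
    where unknown is a list of (row_index, tag) for tags not in CATEGORIES.
    """
    counts = Counter()
    unknown = []
    for index, (_text, category) in enumerate(rows):
        tag = category.strip()  # ignore stray whitespace around the tag
        if tag in CATEGORIES:
            counts[tag] += 1
        else:
            unknown.append((index, tag))
    return counts, unknown


# Hypothetical rows; the last one carries a mistyped tag on purpose.
rows = [
    ("64% off Ogawa’s Smart Aire Plus 2D Massage Chair OG7528", "Retail"),
    ("Maldives: 5D and 4N Stay at 5-Star Anantara Veli Maldives Resort & Spa", "Travel & Vacations "),
    ("Placeholder text for a mistagged sample", "Retial"),
]

counts, unknown = check_tags(rows)
print(dict(counts))
print(unknown)
```

The per-category counts also give a quick view of how evenly the samples are spread across categories.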

Once you have created your training set, you are ready to add these samples to MonkeyLearn and train your classifier!

Next steps

Now that we are ready to train our model, the next step is to learn about the different classifier parameters.