Getting Started with Text Analytics in MATLAB

If you are familiar with typical data analytics, trying text analytics might be simpler than you expect. The steps in a text analytics workflow aren’t unique to text: access and explore the data, preprocess it, build a model, and share the results or the model. However, working with text data often raises some questions:

  • Where does the data come from?
  • What is it used for?
  • Can any of the processing be automated?
  • How is a model created with text data?


Text data is surprisingly accessible. Engineers and scientists generate a lot of text data as part of day-to-day operations. This data comes from internal sources such as reports, maintenance logs, work orders, and technical support cases. Text data can provide important information such as the causes of equipment failure, pain points in product and process design, and action recommendations based on historical data.


There are also many external sources of data, such as social media, news, blogs, forums, and other platforms. This data can be valuable for getting timely insights into gaps and opportunities in scientific research, market intelligence for improving product or process design, and economic or policy information for use in models that forecast product demand.


In each of these cases, text analytics with MATLAB® can automate the extraction of information from text, significantly reducing the time otherwise spent on manual processing.


This paper highlights common text analytics applications, then walks through a typical workflow and some examples to help you start exploring and building models with your own text data.