The enterprise platform to build, deliver, and govern AI
Watch the 15 minute on-demand demo to get an overview of the Domino Enterprise AI Platform.
Humans have used language for as long as we have existed, yet a complete understanding of it is a lifelong pursuit that often falls short, even for experts. Tasking computers with comprehending language, translating it and even producing original written works poses a series of problems that are still being solved.
Natural language processing (NLP) is a blend of different disciplines, ranging from computer science and computational linguistics to artificial intelligence, that are used together to analyze, extract and comprehend information derived from human language, including both text and spoken words. It goes beyond processing words as blocks of information. Rather, NLP can recognize the hierarchical structures within language, extracting ideas and discerning nuances of meaning. It involves understanding syntax, semantics, morphology and lexicons. NLP has several use cases in data science such as:
A subset of NLP, natural language generation (NLG) is a type of language technology that can write out ideas in English or other human languages. When a model is given data input, it can produce human-language text. With text-to-speech technology, it can also produce human speech. This is a three-stage process:
Natural language generation has expanded rapidly into commercial organizations, driven by new discoveries and by advances in large language models such as GPT-3 and in open-source frameworks such as PyTorch.
Another subset of NLP is natural language understanding (NLU), which determines the meaning of sentences in text or speech. While this may appear to come naturally to humans, for a machine it involves a complex series of analyses that can include:
Only after these analyses have been put together can NLU make sense of phrases like “Man-eating shark”; phrases that rely on previous sentences, like “I’d like that”; and even individual words that have multiple meanings, like the auto-antonym “oversight.”
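To make the "oversight" example concrete, here is a minimal sketch of one classic disambiguation idea, a simplified Lesk-style overlap count: choose the sense whose definition shares the most words with the surrounding sentence. The two-sense inventory, the glosses and the stopword list are hypothetical and written only for this illustration; production NLU systems rely on far richer lexicons and models.

```python
import re

# Hypothetical two-sense inventory for "oversight". The glosses were written
# for this example and do not come from a real dictionary.
SENSES = {
    "supervision": "watchful supervision or management of a process by a committee or regulator",
    "omission": "an unintentional mistake or error caused by failing to notice something",
}

# Small, hand-picked stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "or", "by", "to", "was", "is", "in", "and", "for"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def disambiguate(sentence, senses=SENSES):
    """Return the sense whose gloss shares the most words with the sentence."""
    context = tokenize(sentence)
    return max(senses, key=lambda sense: len(context & tokenize(senses[sense])))


print(disambiguate("The committee provides oversight of the budget process."))
print(disambiguate("Leaving her name off the list was an error, a simple oversight."))
```

With these glosses, the first sentence resolves to the "supervision" sense and the second to "omission". When no words overlap, the function simply returns the first sense listed, which is one reason real systems combine several analyses rather than relying on word overlap alone.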
Before you can get started in NLP, you will need access to labeled data (for supervised learning), algorithms, code and a framework. There are several different techniques you can use, including deep learning techniques, depending on your needs. Some of the most common NLP techniques include:
Popular frameworks for NLP today include NLTK, PyTorch, spaCy, TensorFlow, Stanford CoreNLP, Spark NLP and Baidu ERNIE. Each NLP framework has its pros and cons in a production environment, so data scientists often do not rely on a single framework. Kaggle offers a series of NLP tutorials for beginners with a knowledge of Python that cover the basics as well as deep learning using Google’s Word2vec. Tools include a labeled dataset of 50,000 IMDB movie reviews and the required code.
NLP is used for a variety of applications that people use on a regular basis. Google Translate, for example, was developed using TensorFlow. While its early incarnations were often mocked, it has been continuously improved using deep learning through the Google Neural Machine Translation (GNMT) model, to produce accurate and natural-sounding translations for over 100 languages.
Facebook has achieved remarkable success with its translation service as well, solving complex problems with deep learning and natural language processing, as well as language identification, text normalization and word-sense disambiguation.
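Of the steps named above, text normalization is the easiest to illustrate. The sketch below uses only Python's standard library and shows one common recipe: Unicode decomposition, removal of accent marks, case folding and whitespace cleanup. It is an assumption about what a basic normalizer might do, not a description of Facebook's pipeline, and stripping accents is a lossy choice that does not suit every language.

```python
import re
import unicodedata


def normalize(text):
    """Return a case-folded, accent-free, whitespace-collapsed copy of text."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a more aggressive lower() intended for caseless matching.
    folded = without_marks.casefold()
    # Collapse runs of whitespace into single spaces and trim the ends.
    return re.sub(r"\s+", " ", folded).strip()


print(normalize("  Café   CRÈME\tbrûlée "))  # prints: cafe creme brulee
```

Normalizing text this way lets later steps treat "Café" and "cafe" as the same token, at the cost of discarding distinctions that may matter in some languages.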
Other applications for natural language processing today include sentiment analysis, which allows applications to detect nuances in emotions and opinions and to identify such things as sarcasm or irony. Sentiment analysis is also used in text classification, which automatically processes unstructured text to determine how it should be classified. A sarcastic comment in a negative product review, for instance, can then be correctly classified rather than being misinterpreted as positive.
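As a rough illustration of sentiment-based text classification, the sketch below trains a tiny multinomial naive Bayes classifier with add-one smoothing on four invented reviews. The training data and labels are hypothetical. A word-count model like this one cannot detect sarcasm or irony; that takes the context-aware models discussed above.

```python
import math
import re
from collections import Counter

# Hypothetical toy training data, invented for this example.
TRAIN = [
    ("great phone and the battery is excellent", "pos"),
    ("love the screen and the great camera", "pos"),
    ("terrible battery and awful support", "neg"),
    ("the screen broke and support was terrible", "neg"),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per label and documents per label."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior under naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(classify("excellent camera and great screen", model))  # prints: pos
print(classify("awful phone, terrible screen", model))       # prints: neg
```

The same structure scales to a real labeled corpus, such as the IMDB reviews mentioned earlier, although a library implementation would be the more practical choice at that size.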
In addition to apps you may use online or in social media, there are numerous business applications dependent on NLP. In the insurance industry, for example, NLP models can analyze reports and applications to help determine whether the company should accept the risk requested.
Topdanmark, the second-largest insurance company in Denmark, built and deployed an NLP model using the Domino data science platform to automate 65% to 75% of its cases, and customer waiting times have been reduced from a week to mere seconds. To begin exploring the advantages of Domino’s Enterprise MLOps platform, watch a demo.

David Weedmark is a published author who has worked as a project manager, software developer and as a network security consultant.