What is medical or clinical NLP?

Natural language processing, or NLP, is a computing technique which parses and makes sense of written language. Normal NLP is used to analyze text to determine its meaning, enabling computer applications that can communicate effectively with humans. Clinical NLP is a specialization of NLP that allows computers to understand the rich meaning that lies behind a doctor’s written analysis of a patient.

Normal NLP engines use large corpora of text, usually books or other written documents, to determine how language is structured and how grammar is formed. Taking models that were learned from this kind of writing and trying to apply it as a clinical NLP solution won’t work.

Clinical NLP Requirements

There are several requirements that you should expect any clinical NLP system to have:

  • Entity extraction: to surface relevant clinical concepts from unstructured data.
  • Contextualization: to decipher the doctor’s meaning when they mention a concept. For example, when doctors deny a patient has a condition or talk about a patient’s history.
  • Knowledge graph: to understand how clinical concepts are interrelated, like the fact that both fentanyl and hydrocodone are opiates.

Entity Extraction

Doctors don’t write about patients like you would write a book. Clinical NLP engines need to be able to understand the shorthand, acronyms, and jargon that are medicine-specific. You also need to supplement clinical NLP engines with a knowledge graph, because doctors rely on the knowledge of other doctors who are reading what they write to fill in information that they don’t explicitly record.

Different words and phrases can have exactly the same meaning in medicine, for example dyspnea, SOB, breathless, breathlessness, and shortness of breath all have the same meaning.


The context of what a doctor is writing about is also very important for a clinical NLP system to understand. Up to 50% of the mention of conditions and symptoms in doctor’s writing are actually instances where they are ruling out that condition or symptom for a patient. When a doctor says “the patient is negative for diabetes” your clinical NLP system has to know that the patient does not have diabetes.

Similarly, doctors often discuss a patient’s history, their family history, or attempt to hypothesize about what might be happening to a patient, all of which needs to be detected using clinical NLP.

Knowledge Graph

A knowledge graph encodes entities, also called concepts, and their relationship to one another. All of these relationships create a web of data that can be used in computing applications to help them “think” about medicine similarly to how a human might. Lexigram’s Knowledge Graph powers all of our software and is also available directly via our APIs.