Vocabulary Construction Issues

Incorporating a controlled vocabulary into the MED involves three steps: modeling of information, creation of terms, and maintenance. First, we must understand what the terms in the vocabulary mean and how they relate to each other. Then we must formalize this understanding through explicit modeling in the MED. Once the modeling has occurred, we can add the actual terms to the MED and establish an ongoing maintenance process. A MUMPS-based vocabulary browser was adapted to serve as the MED Editor.

Adding New Terms

When adding a term, we first want to be sure that it does not already exist. If it does not, it can be added, and it needs to be added to all appropriate classes. Sometimes a new term is similar enough to an old term that we will want to create a new class to subsume them both. So the procedure when adding new terms is:

Identify redundant terms
Put new terms into existing classes
Create new classes where appropriate

Pushing a Term

Here is what "pushing" looks like. We start with a new test "Stat Glucose Test" and assign it to the Lab Test class. We also provide the semantic information by linking it to "Glucose". As we start to look down the tree, we see the class "Chemical Test" which measures chemicals. Since the system knows glucose is a chemical, it can push Stat Glucose Test down under Chemical Test. We can repeat the process as many times as necessary until the new term rests under the most specific class, in this case, Plasma Glucose Test.
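The pushing step can be sketched in a few lines of Python. This is only an illustration, not the MED Editor's actual code: the `LabClass` structure, the `SUBSTANCE_PARENT` table, and the rule "descend while exactly one child class fits" are assumptions, and the class criteria are reduced to the substance measured.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical is-a knowledge about substances: child -> parent.
SUBSTANCE_PARENT: Dict[str, str] = {"Glucose": "Chemical"}

def is_a(substance: str, ancestor: str) -> bool:
    """True if `substance` equals `ancestor` or descends from it."""
    current: Optional[str] = substance
    while current is not None:
        if current == ancestor:
            return True
        current = SUBSTANCE_PARENT.get(current)
    return False

@dataclass
class LabClass:
    name: str
    measures: str = ""  # substance (or substance class) measured by tests in this class
    children: List["LabClass"] = field(default_factory=list)
    terms: List[str] = field(default_factory=list)

def push(term: str, measured: str, root: LabClass) -> List[str]:
    """Move `term` down from `root` while exactly one child class fits.

    Returns the list of class names visited, ending at the class
    that finally holds the term.
    """
    node, path = root, [root.name]
    while True:
        fits = [c for c in node.children if is_a(measured, c.measures)]
        if len(fits) != 1:  # nothing fits, or the choice is ambiguous: stop here
            break
        node = fits[0]
        path.append(node.name)
    node.terms.append(term)
    return path

# Simplified hierarchy taken from the example in the text.
plasma_glucose = LabClass("Plasma Glucose Test", "Glucose")
chemical = LabClass("Chemical Test", "Chemical", [plasma_glucose])
lab_test = LabClass("Lab Test", children=[chemical])

path = push("Stat Glucose Test", "Glucose", lab_test)
print(" > ".join(path))
```

In this sketch an ambiguous step simply stops the descent, which leaves the final placement to the human editor.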

Creating New Classes

Sometimes we need to create a new class to group old terms with new ones. In theory, sufficiently similar terms already share a class, so all we have to do is select a class with more than two children and see whether the children can be partitioned into two nonempty sets based on their attribute similarities and differences. For example, the old lab system had the term "Core Antigen" and the new lab system had the term "HBC" (for "Hepatitis B Core Antigen"). Matching by name would not be very helpful in this case, but because both were linked to the substance measured (Hepatitis B Core Antigen), the system could recognize their similarity. When it did, it proposed a new class to include them both ("Hepatitis B Core Antigen Test", shown on the right).
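As a rough sketch of this idea, the children of a class can be grouped by a shared attribute value, and a new class proposed for any group that contains at least two children but not all of them. The function name, the use of a single attribute (substance measured), the naming rule for the proposed class, and the third sibling term are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def propose_classes(children: Dict[str, str]) -> List[Tuple[str, List[str]]]:
    """Propose new subclasses for the children of one class.

    `children` maps each term name to the substance it measures.
    A group is proposed only if it splits the children into two
    nonempty sets: the group itself (two or more terms) and the rest.
    """
    by_substance: Dict[str, List[str]] = defaultdict(list)
    for term, substance in children.items():
        by_substance[substance].append(term)

    proposals = []
    for substance, terms in by_substance.items():
        if 2 <= len(terms) < len(children):
            proposals.append((f"{substance} Test", sorted(terms)))
    return proposals

children = {
    "Core Antigen": "Hepatitis B Core Antigen",         # old lab system
    "HBC": "Hepatitis B Core Antigen",                  # new lab system
    "Surface Antigen": "Hepatitis B Surface Antigen",   # hypothetical sibling
}

proposals = propose_classes(children)
for class_name, members in proposals:
    print(class_name, members)
```

The names "Core Antigen" and "HBC" never match as strings; they are grouped only because they point at the same substance.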

Semi-Automated Maintenance

Read formulary file
Identify new drugs
Link new drug to ingredient(s)
Suggest classifying in "preparation" class
Add new drug as per human reviewer

The ongoing maintenance tasks must be customized for each ancillary system. The lab vocabulary changes every few months in small ways, so these changes can be handled by hand. The pharmacy, on the other hand, can add new drugs every day, so we have developed a semi-automated maintenance process for it. The system reads the formulary file on the pharmacy system and, when it finds a new drug, links that drug to its ingredients in the MED, based on information in the formulary file. This ingredient information then allows the system to push the new drug under an appropriate "preparation" class.
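The maintenance loop might look roughly like the following. The record layout of the formulary, the lookup of preparation classes by exact ingredient set, and the class names are all assumptions for illustration; the real formulary file format is not described here. Nothing is added automatically: the function only produces suggestions for a human reviewer.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional, Set

class Suggestion(NamedTuple):
    drug: str
    ingredients: FrozenSet[str]
    preparation_class: Optional[str]  # None: reviewer must choose or create a class

def review_formulary(
    formulary: List[dict],
    known_drugs: Set[str],
    preparation_classes: Dict[FrozenSet[str], str],
) -> List[Suggestion]:
    """Return one suggestion per drug in the formulary that is not yet known."""
    suggestions = []
    for entry in formulary:
        if entry["name"] in known_drugs:
            continue  # already in the vocabulary
        ingredients = frozenset(entry["ingredients"])
        # Look for a preparation class defined by exactly these ingredients.
        suggestions.append(
            Suggestion(entry["name"], ingredients, preparation_classes.get(ingredients))
        )
    return suggestions

# Hypothetical data; the class names are invented for this sketch.
formulary = [
    {"name": "Septra", "ingredients": ["trimethoprim", "sulfamethoxazole"]},
    {"name": "Lasix 20mg Tab", "ingredients": ["furosemide"]},
    {"name": "Zaroxolyn", "ingredients": ["metolazone"]},
]
known_drugs = {"Septra"}
preparation_classes = {frozenset({"furosemide"}): "Furosemide Preparations"}

suggestions = review_formulary(formulary, known_drugs, preparation_classes)
for s in suggestions:
    target = s.preparation_class or "(no class found; ask reviewer)"
    print(f"{s.drug}: {target}")
```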


Interactive Classification

Here is an example of the dialogue with the editor. In the first case, the system found an appropriate class for the new drug "Lasix 20mg Tab", so I said "yes". In the second case, it knew that Zaroxolyn was a diuretic but could not find a specific "preparation" class. In this instance, the MED Editor was used to create a new class to subsume the new drug.

Automated Classification

Here is an example of how the automated classification looks under the hood. The program is adding "Septra" and it knows that it has two ingredients (trimethoprim and sulfamethoxazole). It also knows there is a preparation class that has these two ingredients. It is therefore able to add the drug to the class (arrow).

There is an added benefit to this process. When Septra was added to the preparation class, it became a descendant of two allergy classes: "Trimethoprim" (allergy code 65) and "Sulfa" (allergy code S1). In this case, the pharmacy had assigned only the S1 code to Septra and neglected to assign the 65 code. The MED Editor detects such discrepancies, so we can feed this information back to the pharmacists, who can correct their database.
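The discrepancy check can be illustrated with a small sketch: collect the allergy codes a drug inherits from its ancestors and compare them with the codes the pharmacy assigned. The `PARENTS` table, the name of the preparation class, and the function names are assumptions; only the two allergy codes and the Septra example come from the text above.

```python
from typing import Dict, List, Set

# Hypothetical fragment of the hierarchy: term -> parent classes.
PARENTS: Dict[str, List[str]] = {
    "Septra": ["Trimethoprim/Sulfamethoxazole Preparations"],
    "Trimethoprim/Sulfamethoxazole Preparations": ["Trimethoprim", "Sulfa"],
}

# Allergy classes and their codes, as given in the text.
ALLERGY_CODE: Dict[str, str] = {"Trimethoprim": "65", "Sulfa": "S1"}

def inherited_codes(term: str) -> Set[str]:
    """Collect the allergy codes of every ancestor of `term`."""
    codes: Set[str] = set()
    seen: Set[str] = set()
    stack = [term]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in ALLERGY_CODE:
            codes.add(ALLERGY_CODE[node])
        stack.extend(PARENTS.get(node, []))
    return codes

def missing_codes(term: str, assigned: Set[str]) -> Set[str]:
    """Codes implied by the hierarchy that the pharmacy did not assign."""
    return inherited_codes(term) - assigned

missing = missing_codes("Septra", {"S1"})
print(missing)
```

A nonempty result is what would be reported back to the pharmacists for correction.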