The basic need I have is to convert the codes in a MedDRA dataset to CUIs (UMLS concept unique identifiers). If there were only 10 or so, I’d look them up in the Metathesaurus manually, but I have a dataset of 155 codes related to COVID-19. Once I have the CUIs, I can limit the output from MetaMapLite to only those mentions with relevant COVID-19 CUIs.
To do this programmatically, I am planning to use httpie to send the equivalents of the curl commands shown in the documentation. The documentation looks a bit sparse, but I think I can make it work.
1. Connect to the Metathesaurus home page and log in. (If you don’t have an account, you need to sign up first.)
2. To obtain the API key for your account, choose “My Profile”. (The key can be reset by going to “Edit Profile” at the bottom of the page.)


3. After activating a new virtual environment and installing httpie, I ran this command:
http --form POST https://utslogin.nlm.nih.gov/cas/v1/api-key apikey={INSERT_API_KEY_VALUE_HERE}
The response includes the TGT key (begins with TGT, ends with cas).

This is my ‘TGT’ (ticket-granting ticket), which I should generate once for each session. I then use the TGT to generate an individual ‘ST’ (service ticket) for every request. It would be rather time-consuming and error-prone to continue using httpie in this context, so let’s write a short script.
In a config.py, I’ll place my API_KEY and the LIST of target MedDRA codes I’m interested in. I’ll also store the VERSION (set to ‘current’, which is the most recent UMLS version). A future refactoring should load these from the command line.
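As a rough idea of what that command-line refactoring could look like, here is a minimal sketch using argparse from the standard library. The function name parse_args and the flag names --api-key and --version are my own assumptions for illustration; nothing in the script below depends on them yet.

```python
import argparse


def parse_args(argv=None):
    """Parse the API key, UMLS version, and MedDRA codes from the command line.

    The flag names here are hypothetical; adjust them to taste.
    """
    parser = argparse.ArgumentParser(description="Look up UMLS CUIs for MedDRA codes")
    parser.add_argument("--api-key", required=True, help="API key from My Profile")
    parser.add_argument("--version", default="current", help="UMLS version to query")
    parser.add_argument("codes", nargs="+", help="one or more MedDRA codes")
    return parser.parse_args(argv)


# Passing an explicit list stands in for a real command line here.
args = parse_args(["--api-key", "YOUR_KEY", "10084510", "10084459"])
print(args.version, args.codes)
```

With something like this in place, config.py would only be needed for defaults, and the API key would stay out of the source tree.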
Then, we can create a main.py, which I’ve based on some sample code provided by HHS (that code isn’t great, and a pull request or two would be good, but it points you in the right direction).
import requests
import json
from config import API_KEY, VERSION, LIST
from lxml.html import fromstring


def get_tgt():
    """Retrieve session-based token"""
    r = requests.post(
        'https://utslogin.nlm.nih.gov/cas/v1/api-key',
        data={'apikey': API_KEY},
        headers={'Content-type': 'application/x-www-form-urlencoded', 'Accept': 'text/plain', 'User-Agent': 'python'},
    )
    # The TGT URL is the action attribute of the form in the HTML response
    return fromstring(r.text).xpath('//form/@action')[0]


def get_ticket(tgt):
    """Retrieve request-based token"""
    r = requests.post(
        tgt,
        data={'service': 'http://umlsks.nlm.nih.gov'},
        headers={'Content-type': 'application/x-www-form-urlencoded', 'Accept': 'text/plain', 'User-Agent': 'python'},
    )
    return r.text


if __name__ == '__main__':
    tgt = get_tgt()
    cuis = []
    for string in LIST:
        # Request a fresh ticket for every search
        ticket = get_ticket(tgt)
        # NB: 'exact' must be used when searching for 'code'
        r = requests.get(
            f'https://uts-ws.nlm.nih.gov/rest/search/{VERSION}',
            params={'string': string, 'ticket': ticket, 'inputType': 'code', 'searchType': 'exact'},
        )
        r.encoding = 'utf-8'
        json_data = json.loads(r.text)
        # Collect the CUI ('ui') of every result returned for this code
        for result in json_data['result']['results']:
            cuis.append(result['ui'])
    print(cuis)
Here are a couple codes you can try out (the full list I’m using is in the spreadsheet here):
LIST = [
    10084510,
    10084459,
    10084467,
    10084382,
]
The output I get is the desired list of UMLS CUIs: ['C0206750', 'C5244047', 'C5244047', 'C5203670', 'C5203670', ...].
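Some CUIs appear more than once in that output, presumably because several codes map to the same concept. If only the distinct CUIs matter, a small helper can drop the repeats while keeping the order in which they were first seen. This is a sketch; the name unique_in_order is mine, and the sample list is just the five values shown above.

```python
def unique_in_order(cuis):
    """Drop repeated CUIs while keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(cuis))


cuis = ['C0206750', 'C5244047', 'C5244047', 'C5203670', 'C5203670']
print(unique_in_order(cuis))  # ['C0206750', 'C5244047', 'C5203670']
```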
Now, I can use the list to select the text mentions of interest, and also package this up for future use.
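For the selection step, something along these lines might work. It is only a sketch: it assumes the MetaMapLite results are in a pipe-delimited MMI-style format with the CUI in the fifth field, which you should check against your own output. The function name filter_mmi_lines and the two sample lines are made up for illustration.

```python
def filter_mmi_lines(lines, wanted_cuis):
    """Keep only the lines whose CUI field is in wanted_cuis."""
    wanted = set(wanted_cuis)
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        # Assumption: the CUI is the fifth pipe-delimited field (index 4)
        if len(fields) > 4 and fields[4] in wanted:
            kept.append(line)
    return kept


# Made-up sample lines, not real MetaMapLite output
sample = [
    "doc1|MMI|1.00|concept A|C5203670|[dsyn]",
    "doc1|MMI|1.00|concept B|C0000000|[dsyn]",
]
print(filter_mmi_lines(sample, ["C5203670"]))
```

Using a set for the wanted CUIs keeps each lookup cheap even with all 155 codes resolved.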