Python 3.11.0 was just released. In honor of this, I wanted to write a quick walkthrough of installing Python on Windows. The process is relatively straightforward and takes only a couple of minutes. In addition, I’ll provide the steps for downloading and installing Microsoft Build Tools for Visual Studio, which are sometimes required for compiling…
Dynamic Regex Queries in Pandas
I encountered an issue recently in which I wanted to dynamically retain only those rows which matched a group of regular expressions, or, in some cases, to be able to exclude rows matching a particular set of regular expressions. This is relatively straightforward should the regular expressions be known in advance. Let’s begin by setting…
Experiments in test-driven development for NLP?
Perhaps inspired by Brian Okken’s pytest book, I have been experimenting with a new approach to writing code. Most of my work consists of a long list of one-off scripts which serve a single purpose: moving data around, performing some relatively simple NLP operation, etc. While they will likely be run a few times (e.g.,…
Ignoring Tests with `pytest.param`
Testing applications is very important, but must be creatively exercised — perhaps we can follow the wearied expression of testing being more of an art than a science? Even packages like hypothesis still require some creative initialisation. What exactly should I test? How do I test that, and only that? Perhaps there are tests that…
Upgrading to Python 3.10
I just upgraded my Windows machines to Python 3.10.1. I’ve shied away from 3.X.0 releases ever since one of them broke something on Windows — I don’t recall the version, or the reason, and I’d assume release testing has improved so that it’s unlikely to recur, but I suppose I’ve become superstitious in my age….
Retrieve UMLS Data with API Key
The basic need I have is to convert the codes in a MedDRA dataset to CUIs (UMLS concept unique identifiers). If there were only 10 or so, I’d look them up on the Metathesaurus manually…but I have a dataset of 155 related to COVID-19. Once I have the CUIs, I can limit the output from…
Disable New Microsoft Office “Save As” Menu
Perhaps I’m old-fashioned, or perhaps I haven’t invested enough time learning how great Microsoft Office’s new(-ish) save dialog is. Typically, when I want to save something, I want to type/paste in the path where I want to save my file, or click through the folders as I’m accustomed to in Windows Explorer. I can’t really figure out…
Default Values for max and min
I rely pretty heavily on Python’s min and max functions when trying to take the highest or lowest values from a particular algorithm. For example, a regular expression extracts scores (these could be grades, number of pages in a book, distance run, etc.) from an input document. The document may contain multiple scores (e.g., describing…
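
As a rough illustration of the situation in the excerpt above, here is a minimal sketch of taking the highest score a regular expression finds, falling back to a default when nothing matches. The pattern and the `highest_score` helper are hypothetical names made up for this example, not taken from the post; the only real API relied on is the `default` keyword of the built-in `max`.

```python
import re

# Hypothetical pattern for illustration: matches text like "score: 42".
SCORE_PATTERN = re.compile(r"score:\s*(\d+)")


def highest_score(text, default=None):
    """Return the largest score found in `text`, or `default` if none match."""
    scores = (int(match.group(1)) for match in SCORE_PATTERN.finditer(text))
    # Without `default=`, max() raises ValueError on an empty iterable.
    return max(scores, default=default)
```

The same `default` keyword works with `min`, so a lowest-score variant would look nearly identical.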
Simplify Project Setup with Cookiecutter
Nearly every time I kick off a new Python project, I follow the same set of steps: And then I feel about ready to grab some lunch or call it a hard-worked day at the office. Each time I follow these steps, I have realized that this should be automated. In fact, the only non-automatable part…
Outlook: Improve Delay Delivery (Part 2)
A couple months ago, I wrote up an initial implementation of adding a button to Outlook that would allow me to ‘delay delivery’. I took the basics of the code from a SuperUser post and turned it into a macro. Basically, the VB code checked to see if I was sending the email at weird…

