The string module provides a number of methods and constants for manipulating strings (type: str). These work on all strings in Python which are created using quotation marks: 'single', "double", '''triple-single''', and """triple-double"""-quoted text. Strings can also be created by applying the built-in str() function to any other datatype. In Python, strings are immutable…
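As a quick illustration of the points in this excerpt, here is a minimal sketch. The variable names and sample values are my own assumptions, chosen only for demonstration.

```python
import string

# Hypothetical sample value, used only for illustration.
name = 'python'

# Strings are immutable: assigning to an index raises TypeError.
try:
    name[0] = 'P'
except TypeError as exc:
    error = str(exc)

# String methods return a new string and leave the original untouched.
capitalized = name.capitalize()

# The built-in str() converts other datatypes to strings.
number_text = str(3.11)

# The string module also exposes constants such as ascii_lowercase.
first_letters = string.ascii_lowercase[:3]
```

After this runs, `name` is still `'python'`; only `capitalized` holds the changed text.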
Category: python
How Virtual Environments Work (on Windows)
Brett Cannon made a short (and quite interesting) post on virtual environments and their context, though this focused on their application to a Unix-based OS rather than Windows. I’d like to summarize the content there and adapt it to Windows. History Why do we have virtual environments? This may be a perplexing question to someone…
pathlib
— Object-oriented filesystem paths
The pathlib module was introduced in Python 3.4 (see PEP 428). More accurately, pathlib was a third-party module which was added to the standard library (i.e., the packages that come with all installs of Python, unless excluded, e.g., to fit the library on smaller devices). It was attempting to provide a more friendly,…
Walruses and Regular Expressions
Incorporating regular expressions has been clunky. Let’s imagine that we need to search for a few regular expressions in some text and then perform a task when a term is found. In code: Unfortunately, we can’t just use the match because, if the pattern is not in text, the result is None. So, before using the…
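The code from the post is not reproduced in this excerpt, so here is a minimal sketch of the problem it describes, assuming Python 3.8+ for the walrus operator (`:=`). The patterns, sample text, and function names are hypothetical and only serve as an illustration.

```python
import re

# Hypothetical patterns and sample text, chosen only for illustration.
PATTERNS = {
    'year': re.compile(r'\b(?:19|20)\d{2}\b'),
    'version': re.compile(r'\b\d+\.\d+\.\d+\b'),
    'email': re.compile(r'\b[\w.]+@[\w.]+\b'),
}
text = 'Python 3.11.0 was released in 2022.'


def find_without_walrus(text):
    """Clunky form: assign the match first, then check it for None."""
    found = {}
    for name, pattern in PATTERNS.items():
        m = pattern.search(text)
        if m is not None:  # search() returns None when nothing matches
            found[name] = m.group()
    return found


def find_with_walrus(text):
    """Same logic, with assignment and check combined in one expression."""
    found = {}
    for name, pattern in PATTERNS.items():
        if (m := pattern.search(text)) is not None:
            found[name] = m.group()
    return found
```

Both functions return the same dictionary; the walrus version simply removes the separate assignment line before each check.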
Making Better Regular Expressions
I use a lot of regular expressions in my work. They are very powerful for extracting, replacing, or locating text strings of interest, particularly in their flexibility. Character classes, case insensitivity, etc. are very powerful. Take a simple use case: let’s find all the words (letter-only sequences) in some text: Regular expressions do have some…
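The example from the post is cut off in this excerpt. A minimal sketch of the "find all the words" use case might look like the following, where the sample text and the pattern name are my own assumptions.

```python
import re

# Hypothetical sample text, used only for illustration.
text = 'Find 3 words, quickly!'

# A character class matching letter-only sequences; re.IGNORECASE
# lets [a-z] cover uppercase letters as well.
WORD_PAT = re.compile(r'[a-z]+', re.IGNORECASE)

words = WORD_PAT.findall(text)
```

Here `findall` returns every non-overlapping match as a list of strings, so digits and punctuation are skipped.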
Installing Python on Windows
Python 3.11.0 was just released. In honor of this, I wanted to write a quick walkthrough of installing Python on Windows. The process is relatively straightforward and only takes a couple of minutes. In addition, I’ll provide the steps for downloading and installing Microsoft Build Tools for Visual Studio, which is sometimes required for compiling…
Dynamic Regex Queries in Pandas
I encountered an issue recently in which I wanted to dynamically retain only those rows which matched a group of regular expressions, or, in some cases, to be able to exclude rows matching a particular set of regular expressions. This is relatively straightforward should the regular expressions be known in advance. Let’s begin by setting…
Experiments in test-driven development for NLP?
Perhaps inspired by Brian Okken’s pytest book, I have been experimenting with a new approach to writing code. Most of my work consists of a long list of one-off scripts which serve a single purpose: moving data around, performing some relatively simple NLP operation, etc. While they will likely be run a few times (e.g.,…
Ignoring Tests with `pytest.param`
Testing applications is very important, but must be creatively exercised — perhaps we can follow the wearied expression of testing being more of an art than a science? Even packages like hypothesis still require some creative initialisation. What exactly should I test? How do I test that, and only that? Perhaps there are tests that…
Upgrading to Python 3.10
I just upgraded my Windows machines to Python 3.10.1. I’ve shied away from 3.X.0 releases ever since one of them broke something on Windows — I don’t recall the version, or the reason, and I’d assume release testing has improved so that it’s unlikely to recur, but I suppose I’ve become superstitious in my age….