In this write-up, I want to discuss a more encapsulated solution to the fencepost problem which relies on Python’s context managers. By ‘encapsulated’, I mean ‘hidden from the user’, or ‘handled by the object’ in an object-oriented programming sense. Before starting, let’s digress briefly into the fencepost problem (at least as how I was…
Category: python
zipfile — Work with ZIP archives
Python is probably not your first thought when it comes to opening zip archives or compressing directories. In fact, if you’re like me, zip means something rather different… For most zip archive tasks, your favourite shell or windowing GUI is all you need. In fact, if you want Python to emulate this…
Cannot import `EnumType` from `enum`
I was working on a project using importlib in which I needed to locate the relevant enum class within a file. In order to check if the element is an enum.Enum, I looked up the documentation and found the appropriate isinstance check: This worked brilliantly. A collaborator ended up using the code but reported an…
Building WSL Shell Scripts in Windows
At the high level I usually work at, writing a Python script that runs on both Windows and Unix-like OSes is rarely a problem. I ran into a couple of surprises, however, when trying to generate a number of shell scripts from Windows designed to run with WSL. We can set up the basic outline of a script that will write…
Building Language Rules in SpaCy
spaCy provides a number of useful methods for exploring and creating patterns after a particular text or document has been read. To see this in action, let’s use spaCy to build some rules in the more computational linguistic side of NLP. So, for those less interested in language, forgive a brief digression into Polish. In…
Using spaCy for Sentence Splitting
By default, spaCy carries around a powerful battery of pipelines and swings these mighty chainsaws at every passing tree and twig. Sometimes, however, you only want a small pruner to accomplish some smaller task. Can spaCy still work in such a use case? For example, suppose that all I want from spaCy are my documents…
spaCy: The Basics
I learned much of my natural language processing using Python’s `nltk` library which, coupled with the nltk book (https://www.nltk.org/book/), provides a great introduction to the topic. When I hit industry, however, I never really found a use for it, nor could I motivate myself to learn the intricacies of creating a corpus from my own dataset. Many…
How Python Finds Your Imports
It seems easy. I need a package, say pandas, so I run `pip install pandas`. Then I can get access to this library with a simple import at the top of my file: `import pandas as pd`. But how does Python determine where the package is located? First, Python will check if…
string — Common string operations (Part 1: methods)
The string module provides a number of methods and constants for manipulating strings (type: str). These work on all strings in Python which are created using quotation marks: 'single', "double", '''triple-single''', and """triple-double"""-quoted text. Strings can also be created by applying the built-in str() function to any other datatype. In Python, strings are immutable…
How Virtual Environments Work (on Windows)
Brett Cannon wrote a short (and quite interesting) post on virtual environments and their context, though it focused on their application to Unix-based OSes rather than Windows. I’d like to summarize the content there and adapt it to Windows. History Why do we have virtual environments? This may be a perplexing question to someone…