Python is probably not your first thought when it comes to opening zip archives or compressing directories. In fact, if you’re like me, zip means something rather different… For most zip-handling needs, your favourite shell or windowed GUI will do the job. And if you just want Python to emulate this behaviour and open a zip archive for you, the zipfile library is probably not for you: shutil will serve you better with its shutil.make_archive and shutil.unpack_archive.
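For reference, here is a minimal sketch of that shutil round trip. The temporary directory and the file contents are stand-ins I've made up so the snippet runs on its own; only make_archive and unpack_archive are the point.

```python
import shutil
import tempfile
from pathlib import Path

# Throwaway copy of a small directory tree (contents are made up for the demo).
work = Path(tempfile.mkdtemp())
src = work / 'zipme'
(src / 'inner').mkdir(parents=True)
(src / 'outer.txt').write_text('Hello,')
(src / 'inner' / 'inner.txt').write_text(' world!')

# make_archive adds the extension itself: base name in, '<base>.zip' out.
# root_dir/base_dir keep the top-level 'zipme/' folder inside the archive.
archive = shutil.make_archive(str(work / 'zipme'), 'zip',
                              root_dir=str(work), base_dir='zipme')

# unpack_archive infers the format from the file extension.
out = work / 'unpacked'
shutil.unpack_archive(archive, str(out))
print(sorted(p.relative_to(out).as_posix() for p in out.rglob('*')))
```

Two calls and you're done, which is why shutil is the better fit for the plain "zip it / unzip it" case.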
When does the zipfile module come into play? The zipfile module is useful when you want to do more than unpack or create a single zip archive. Suppose you need to work with some sort of transaction data that is stored in a directory, with each day’s transactions in its own zip archive. To do your analysis, you need to open all the zip archives and pick out the important component without wasting space by unzipping everything.

Or, perhaps, you have annotations on some text data where each ‘document’ is stored in a zip file with the original text, added tags, etc.? In these and many other cases, the zipfile module can provide a tool to skillfully extract what you need.
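To make the transaction scenario concrete, here is a rough sketch. The file names (summary.csv, raw.log), the dates and the totals are all hypothetical; the snippet builds two fake daily archives first so that it runs anywhere.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile

# Hypothetical layout: one archive per day, each holding a small
# 'summary.csv' next to a bulky 'raw.log' we never want to unpack.
data_dir = Path(tempfile.mkdtemp())
for day, total in [('2023-06-15', 120), ('2023-06-16', 95)]:
    with ZipFile(data_dir / f'{day}.zip', 'w') as zipw:
        zipw.writestr('summary.csv', f'date,total\n{day},{total}\n')
        zipw.writestr('raw.log', 'x' * 10_000)

totals = {}
for archive in sorted(data_dir.glob('*.zip')):
    with ZipFile(archive) as zipr:
        if 'summary.csv' not in zipr.namelist():
            continue  # skip archives without the member we care about
        # read() returns only this member's bytes; nothing is written to disk
        text = zipr.read('summary.csv').decode('utf8')
    date, total = text.splitlines()[1].split(',')
    totals[date] = int(total)

print(totals)
```

Only the small member is ever decompressed; the large log stays packed inside each archive.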
Useful Functions
Let’s start with a simple, nested directory structure to explore.
zipme/                <- directory
    outer.txt         <- file inside `zipme/` directory
    inner/            <- directory inside `zipme/` directory
        inner.txt
All of these directories/files could be placed inside zipme.zip:
zipme.zip/
    zipme/            <- inside `zipme.zip` archive
        outer.txt
        inner/
            inner.txt
Unpack/Build Archive
Even though I said that shutil is probably the best option for unpacking a zip file, let’s see how we would do it with zipfile. This will allow us to get acquainted.
    from zipfile import ZipFile  # import

    with ZipFile('zipme.zip') as zipr:  # context manager to open
        zipr.extractall()  # extract all elements into current directory
This code will extract the files into the current directory. If we want to specify an output directory, we can supply that as an argument: zipr.extractall(path).
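If you only want part of the archive, extract() and the members argument of extractall() let you pull out individual entries. A small sketch, using a stand-in archive and made-up output directory names (just_one, subset):

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile

work = Path(tempfile.mkdtemp())
archive = work / 'zipme.zip'

# Stand-in archive so the example runs on its own.
with ZipFile(archive, 'w') as zipw:
    zipw.writestr('zipme/outer.txt', 'Hello,')
    zipw.writestr('zipme/inner/inner.txt', ' world!')

with ZipFile(archive) as zipr:
    # extract() takes one member name and returns the path it wrote
    single = zipr.extract('zipme/outer.txt', path=work / 'just_one')
    # members= limits extractall() to the listed names
    zipr.extractall(path=work / 'subset', members=['zipme/inner/inner.txt'])

print(single)
```

Both calls recreate the member's directory structure underneath the target path.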
We can reverse this process using zipr.write(filename, arcname). In this context, filename is the location of the file on the filesystem, and arcname is the name/path to be used within the archive itself (i.e., the archive name).
    from pathlib import Path
    from zipfile import ZipFile

    path = Path('zipme')
    with ZipFile('zipme.zip', 'w') as zipr:
        zipr.write(path / 'outer.txt', 'outer.txt')
        zipr.mkdir('inner')  # create the 'inner' directory inside the archive (ZipFile.mkdir needs Python 3.11+)
        zipr.write(path / 'inner' / 'inner.txt', 'inner/inner.txt')
After opening the archive in write mode, we copy ‘outer.txt’ from the filesystem into the archive, create an ‘inner’ directory, and then copy ‘inner.txt’ from the filesystem to its cozy place within the archive. Technically, the zipr.mkdir('inner') is redundant, since writing ‘inner/inner.txt’ already implies the directory. You’d only need to use mkdir if you didn’t have anything to put inside. (Not sure why you’d do that…?)
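Writing each file by hand gets old quickly. Here is one way to add a whole directory tree, sketched with pathlib's rglob. The demo tree is recreated in a temporary directory, and ZIP_DEFLATED is my own addition to turn on compression (the default, ZIP_STORED, only stores the files).

```python
import tempfile
from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile

# Recreate the example tree in a throwaway location.
work = Path(tempfile.mkdtemp())
src = work / 'zipme'
(src / 'inner').mkdir(parents=True)
(src / 'outer.txt').write_text('Hello,')
(src / 'inner' / 'inner.txt').write_text(' world!')

with ZipFile(work / 'zipme.zip', 'w', compression=ZIP_DEFLATED) as zipw:
    for file in sorted(src.rglob('*')):
        if file.is_file():
            # arcname relative to the parent keeps the 'zipme/' prefix
            zipw.write(file, file.relative_to(work).as_posix())

with ZipFile(work / 'zipme.zip') as zipr:
    names = zipr.namelist()
print(names)
```

Changing what the arcname is relative to (work versus src) decides whether the archive has a top-level zipme/ folder or not.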
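If the archive is password protected, pass the password as bytes via the pwd argument of zipr.extractall, zipr.extract, zipr.open or zipr.read, or set a default once with zipr.setpassword(pwd). (ZipFile itself takes no pwd argument, and the module can only decrypt archives, not create encrypted ones.)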
Print Zip Contents
You can print the contents of the zip archive to stdout using zipr.printdir():
    with ZipFile('zipme.zip') as zipr:
        zipr.printdir()

    ## Output:
    # File Name                 Modified             Size
    # zipme/inner/              2023-06-16 02:00:00     0
    # zipme/inner/inner.txt     2023-06-16 02:00:00     0
    # zipme/outer.txt           2023-06-16 02:00:00     0
More usefully, we can get the paths within the archive as a list using zipr.namelist():
    with ZipFile('zipme.zip') as zipr:
        print(zipr.namelist())  # returns relative paths
        print(zipr.filelist)  # returns files as ZipInfo objects (contain some metadata)

    ## Output:
    # ['zipme/inner/', 'zipme/inner/inner.txt', 'zipme/outer.txt']
    # [<ZipInfo filename='zipme/inner/' external_attr=0x10>, <ZipInfo filename='zipme/inner/inner.txt' external_attr=0x20 file_size=0>, <ZipInfo filename='zipme/outer.txt' external_attr=0x20 file_size=0>]
The zipr.filelist attribute can also be used, and instead of strings it holds ZipInfo objects (zipr.infolist() is the documented method that returns the same list). These can be used to access additional metadata about the contained files (see below or the online docs).
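As a quick taste of that metadata, here is a sketch that prints a few ZipInfo attributes. The archive and its single member (repetitive.txt) are invented for the demo and written with compression so the two size fields differ.

```python
import tempfile
from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile

# Made-up archive with one highly compressible member.
archive = Path(tempfile.mkdtemp()) / 'meta.zip'
with ZipFile(archive, 'w', compression=ZIP_DEFLATED) as zipw:
    zipw.writestr('repetitive.txt', 'spam ' * 1000)

with ZipFile(archive) as zipr:
    for info in zipr.infolist():
        # file_size: uncompressed bytes; compress_size: bytes stored in the archive
        # date_time: (year, month, day, hour, minute, second) tuple
        print(info.filename, info.file_size, info.compress_size, info.date_time)
        sizes = (info.file_size, info.compress_size)
```

Checking file_size before reading is a cheap way to avoid pulling an unexpectedly huge member into memory.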
Read an Archived File (in memory)
Now that we can open the archive and list its contents, why not peek into the files themselves? Here, we’ll use zipr.open within a context manager to read the files. Everything will be in bytes, so we’ll need to decode the bytes as UTF-8 in order to get a string representation:
    with ZipFile('zipme.zip') as zipr:
        # the name must match an entry in zipr.namelist() exactly
        with zipr.open('outer.txt', 'r') as fh:
            print(fh.read().decode('utf8'))

    ## Output:
    # Hello,
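For larger text files you may not want to read() everything at once. One option, sketched here, is to wrap the binary handle in io.TextIOWrapper so decoding happens line by line. The archive and its notes.txt member are made up for the example.

```python
import io
import tempfile
from pathlib import Path
from zipfile import ZipFile

# Stand-in archive with a small multi-line text file.
archive = Path(tempfile.mkdtemp()) / 'lines.zip'
with ZipFile(archive, 'w') as zipw:
    zipw.writestr('notes.txt', 'first\nsecond\nthird\n')

lines = []
with ZipFile(archive) as zipr:
    with zipr.open('notes.txt') as raw:
        # TextIOWrapper decodes the byte stream lazily as we iterate
        for line in io.TextIOWrapper(raw, encoding='utf8'):
            lines.append(line.rstrip('\n'))
print(lines)
```

This also plays nicely with anything that expects a text file object, such as csv.reader.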
Using zipr.filelist or zipr.namelist(), we can collect the text from all files within the archive, since both give us the archive’s contents.
    with ZipFile('zipme.zip') as zipr:
        for file in zipr.filelist:  # list of ZipInfo, so we can check if it's a directory
            if file.is_dir():
                continue  # skip directories
            with zipr.open(file, 'r') as fh:
                print(fh.read().decode('utf8'), end='')  # don't print a newline after the file contents

    ## Output:
    # Hello, world!
Going through zipr.filelist, we work with ZipInfo objects. Their metadata includes:
- whether or not it’s a directory (ZipInfo.is_dir())
- the filename, to be able to, e.g., check the extension (ZipInfo.filename.endswith('.txt'))
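Putting those two checks together, here is a sketch that gathers only the .txt members of an archive into a dictionary. The archive layout (doc/text.txt alongside doc/tags.json) is hypothetical, loosely modelled on the annotated-document case from the introduction.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile

# Hypothetical annotated 'document': the text plus a tags file.
archive = Path(tempfile.mkdtemp()) / 'docs.zip'
with ZipFile(archive, 'w') as zipw:
    zipw.writestr('doc/text.txt', 'Hello, world!')
    zipw.writestr('doc/tags.json', '{"tags": []}')

texts = {}
with ZipFile(archive) as zipr:
    for info in zipr.infolist():
        # skip directories and anything that isn't a .txt file
        if info.is_dir() or not info.filename.endswith('.txt'):
            continue
        # read() accepts a ZipInfo as well as a name
        texts[info.filename] = zipr.read(info).decode('utf8')
print(texts)
```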
Parting Thoughts
The zipfile module is not one I commonly use, but when working with large data dumps where zip archives are frequent, or when I need to work with a large archive without wanting to (or being able to) unpack it, accessing it via zipfile.ZipFile is quick and convenient.