Log files are often useful sources of historical information about how a program was run. I have found them sitting next to datasets and used them to learn more about a dataset's provenance. Could I add a function that logs all of the parameters a program was run with? Sure, a configuration file should be located nearby, but in reality it often isn't: the settings are lost on the command line or in a PyCharm configuration file. How might we accomplish this?
Suppose we have a function build with the following signature. How could we print out all of its parameters?
import datetime
from pathlib import Path
from loguru import logger

def build(source_path: Path, dest_path: Path, testrun=False, exclusions: list=None, **kwargs):
    now = datetime.datetime.now().strftime('%Y%m%d')  # current date-stamp
    dest_path.mkdir(exist_ok=True)  # ensure output path exists
    logger.add(dest_path / f'build_{now}.log')  # create log file relic
    # function logic here

if __name__ == '__main__':
    # argparse to get parameters here
    build(**vars(parser.parse_args()))  # unpack the parsed arguments into build
For context, assume that:
- this is part of an installable package using pyproject.toml,
- there is a [project.scripts] section pointing to this file,
- we are using argparse or click to grab command line arguments and immediately passing them into the build function.
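To make these assumptions concrete, here is a minimal sketch of what such an argparse entry point might look like. The option names (--testrun, --exclusions) and the make_parser/main helpers are hypothetical, and build is reduced to a stand-in that simply returns what it received.

```python
import argparse
from pathlib import Path


def build(source_path: Path, dest_path: Path, testrun=False, exclusions=None, **kwargs):
    # stand-in for the real build logic: hand the arguments back for inspection
    return {'source_path': source_path, 'dest_path': dest_path,
            'testrun': testrun, 'exclusions': exclusions, 'kwargs': kwargs}


def make_parser() -> argparse.ArgumentParser:
    # hypothetical options; each dest must match a parameter name of build
    parser = argparse.ArgumentParser(description='Run the build')
    parser.add_argument('source_path', type=Path)
    parser.add_argument('dest_path', type=Path)
    parser.add_argument('--testrun', action='store_true')
    parser.add_argument('--exclusions', nargs='*', default=None)
    return parser


def main(argv=None):
    # vars() turns the Namespace into a dict that unpacks into build's keywords
    args = make_parser().parse_args(argv)
    return build(**vars(args))
```

Because the parsed arguments go straight into build, whatever build logs about its own parameters is a record of the command line that was used.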
Naïve Baseline
A first impulse might be to do so manually, using the f-string = shortcut (Python 3.8+), which prints both the name of an expression and its repr:
import datetime
from pathlib import Path
from loguru import logger

def build(source_path: Path, dest_path: Path, testrun=False, exclusions: list=None, **kwargs):
    now = datetime.datetime.now().strftime('%Y%m%d')  # current date-stamp
    dest_path.mkdir(exist_ok=True)  # ensure output path exists
    logger.add(dest_path / f'build_{now}.log')  # create log file relic
    logger.info('Parameters:')
    logger.info(f'{source_path=}')
    logger.info(f'{dest_path=}')
    logger.info(f'{testrun=}')
    logger.info(f'{exclusions=}')
    logger.info(f'{kwargs=}')
    # function logic here
>> build(Path('in'), Path('out'), d=4)
Parameters:
source_path=WindowsPath('in')
dest_path=WindowsPath('out')
testrun=False
exclusions=None
kwargs={'d': 4}
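As a brief aside on the shortcut itself, the sketch below shows how the = specifier behaves. The variable names are made up for illustration; the point is that = uses repr() by default, which is why the paths above appear as WindowsPath('in') rather than in.

```python
testrun = False
label = 'nightly'  # hypothetical value, only here to show quoting

lines = [
    f'{testrun=}',   # name, '=', then repr of the value
    f'{label=}',     # strings keep their quotes because repr is used
    f'{label = }',   # whitespace around '=' is preserved in the output
    f'{label=!s}',   # an explicit !s conversion switches to str()
]
```

Here lines ends up as ["testrun=False", "label='nightly'", "label = 'nightly'", "label=nightly"].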
While this works quite nicely, there is a problem: every time an argument is added (or removed/renamed), the appropriate logger.info call must be updated too. One of the beauties of programming is the ability to automate this away.
Using locals()
Another approach is to iterate through locals() and output each key with its associated value. This approach (as we will see) is not quite perfect: it will also include any local variables which we have declared since the function was called (e.g., in our example, we have declared now, which isn't a parameter).
import datetime
from pathlib import Path
from loguru import logger

def build(source_path: Path, dest_path: Path, testrun=False, exclusions: list=None, **kwargs):
    now = datetime.datetime.now().strftime('%Y%m%d')  # current date-stamp
    dest_path.mkdir(exist_ok=True)  # ensure output path exists
    logger.add(dest_path / f'build_{now}.log')  # create log file relic
    logger.info('Parameters:')
    for k, v in locals().items():
        logger.info(f'{k}={v!r}')  # !r matches the repr-style output of the f-string = shortcut
    # function logic here
>> build(Path('in'), Path('out'), d=4)
Parameters:
source_path=WindowsPath('in')
dest_path=WindowsPath('out')
testrun=False
exclusions=None
kwargs={'d': 4}
now='20240920'
In spite of the weakness of including the now variable, I rather like this approach: it's simple, straightforward, and clear. Assuming our project has multiple entrypoints/scripts, how might we convert this into a reusable function? We have to be careful, as locals() called inside a helper returns the helper's own local variables, not the caller's. Thus, the caller will always need to pass its own locals() in to the helper.
import datetime
from pathlib import Path
from loguru import logger

def loglocals(d: dict):
    """Outputs function parameters/values to log; always supply with `locals()`"""
    logger.info('Parameters:')
    for k, v in d.items():
        logger.info(f'{k}={v!r}')  # !r matches the repr-style output of the f-string = shortcut

def build(source_path: Path, dest_path: Path, testrun=False, exclusions: list=None, **kwargs):
    now = datetime.datetime.now().strftime('%Y%m%d')  # current date-stamp
    dest_path.mkdir(exist_ok=True)  # ensure output path exists
    logger.add(dest_path / f'build_{now}.log')  # create log file relic
    loglocals(locals())
    # function logic here
>> build(Path('in'), Path('out'), d=4)
Parameters:
source_path=WindowsPath('in')
dest_path=WindowsPath('out')
testrun=False
exclusions=None
kwargs={'d': 4}
now='20240920'
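The reason the helper has to receive locals() from its caller, rather than call it on its own, can be seen in a small sketch. The function names below are hypothetical and exist only to contrast the two cases.

```python
def helper_calls_locals():
    # locals() here only sees this helper's own (empty) namespace
    return dict(locals())


def helper_receives(d: dict):
    # the caller's namespace arrives as an ordinary dict
    return dict(d)


def caller(a, b=2):
    # first element: what the helper sees by itself; second: what we pass in
    return helper_calls_locals(), helper_receives(locals())
```

Calling caller(1) returns ({}, {'a': 1, 'b': 2}): the helper sees nothing of the caller unless the caller hands its locals over.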
We’re still printing our previously-declared variables — how might we avoid that?
func.__code__ Approach
Well, we need to get the parameters of the calling function. This can be accomplished by reading build.__code__.co_varnames. For example, suppose we have an add function:
def add(x, y, *args, z=1, **kwargs):
    print(add.__code__.co_varnames)
>> add(1, 2, a=3)
('x', 'y', 'z', 'args', 'kwargs')
co_varnames, however, will include all variables declared in the function (not just parameters), so we'll have to limit it to the first n names. More precisely, we'll start from add.__code__.co_varnames[:add.__code__.co_argcount + add.__code__.co_kwonlyargcount]. co_argcount and co_kwonlyargcount do not count *args and **kwargs, so we'll have to add those manually, one for each that the function declares (hence the + 2 for add). This is probably okay, since *args and **kwargs can each appear at most once and their names come right after the other parameters in co_varnames. We can then look these values up in locals():
def add(x, y, *args, z=1, **kwargs):
    print(f'All varnames: {add.__code__.co_varnames}')
    print(f'Parameters: {add.__code__.co_varnames[:add.__code__.co_argcount + add.__code__.co_kwonlyargcount + 2]}')
    for varname in add.__code__.co_varnames[:add.__code__.co_argcount + add.__code__.co_kwonlyargcount + 2]:
        print(f'{varname}={locals()[varname]}')
>> add(1, 2, a=3)
All varnames: ('x', 'y', 'z', 'args', 'kwargs', 'varname')
Parameters: ('x', 'y', 'z', 'args', 'kwargs')
x=1
y=2
z=1
args=()
kwargs={'a': 3}
An equivalent function to loglocals might look like this:
def loglocals(func, d: dict, has_args=False, has_kwargs=False):
    """Outputs function parameters to log"""
    logger.info('Parameters:')
    # the booleans count as 0/1, adding one slot each for *args and **kwargs
    n_args = func.__code__.co_argcount + func.__code__.co_kwonlyargcount + has_args + has_kwargs
    for varname in func.__code__.co_varnames[:n_args]:
        logger.info(f'{varname}={d[varname]!r}')  # !r so paths print as WindowsPath('in')

def build(source_path: Path, dest_path: Path, testrun=False, exclusions: list=None, **kwargs):
    now = datetime.datetime.now().strftime('%Y%m%d')  # current date-stamp
    dest_path.mkdir(exist_ok=True)  # ensure output path exists
    logger.add(dest_path / f'build_{now}.log')  # create log file relic
    loglocals(build, locals(), has_kwargs=True)
    # function logic here
>> build(Path('in'), Path('out'), d=4)
Parameters:
source_path=WindowsPath('in')
dest_path=WindowsPath('out')
testrun=False
exclusions=None
kwargs={'d': 4}
While more accurate, this is a lot of work and seems less readable. It also requires keeping has_args and has_kwargs in sync with whether or not the function declares *args and/or **kwargs.
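One way to reduce that maintenance burden, sketched here as a possibility rather than a recommendation, is to read the code object's co_flags, which record whether *args and **kwargs are present. The helper name param_names is hypothetical; inspect.CO_VARARGS and inspect.CO_VARKEYWORDS are the standard-library flag constants.

```python
import inspect


def param_names(func) -> tuple:
    """Return the parameter names of `func` using only its code object."""
    code = func.__code__
    n = code.co_argcount + code.co_kwonlyargcount
    # co_flags tells us whether *args / **kwargs were declared
    n += bool(code.co_flags & inspect.CO_VARARGS)
    n += bool(code.co_flags & inspect.CO_VARKEYWORDS)
    return code.co_varnames[:n]


def add(x, y, *args, z=1, **kwargs):
    total = x + y + z  # an ordinary local that must not be reported
    snapshot = locals()  # taken outside the comprehension, which may have its own scope
    return {name: snapshot[name] for name in param_names(add)}
```

With this, add(1, 2, a=3) reports x, y, z, args and kwargs but not total or snapshot, and the caller no longer has to state which variadic parameters exist.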
Inspect
We can also import the inspect module. Compared to our locals() solution, this seems a bit overkill, but it provides the best results. Note that the parameters come back in the order they appear in the signature.
import inspect

def add(x, y, *args, z=1, **kwargs):
    for param in inspect.signature(add).parameters:
        print(f'{param}={locals()[param]}')
>> add(1, 2, a=3)
x=1
y=2
args=()
z=1
kwargs={'a': 3}
For our build function, the inspect-based version of loglocals might look like this:
import inspect

def loglocals(func, d: dict):
    """Outputs function parameters/values to log"""
    logger.info('Parameters:')
    for param in inspect.signature(func).parameters:
        logger.info(f'{param}={d[param]!r}')  # !r so paths print as WindowsPath('in')

def build(source_path: Path, dest_path: Path, testrun=False, exclusions: list=None, **kwargs):
    now = datetime.datetime.now().strftime('%Y%m%d')  # current date-stamp
    dest_path.mkdir(exist_ok=True)  # ensure output path exists
    logger.add(dest_path / f'build_{now}.log')  # create log file relic
    loglocals(build, locals())
    # function logic here
>> build(Path('in'), Path('out'), d=4)
Parameters:
source_path=WindowsPath('in')
dest_path=WindowsPath('out')
testrun=False
exclusions=None
kwargs={'d': 4}
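If passing locals() around feels awkward, inspect can also bind the call's arguments directly. The following is only a sketch under some assumptions: format_call and log_params are hypothetical names, and the standard-library logging module stands in for loguru so the example is self-contained.

```python
import functools
import inspect
import logging

logger = logging.getLogger(__name__)  # stdlib stand-in for loguru's logger


def format_call(func, *args, **kwargs) -> list:
    """Return 'name=value' lines for a call to `func`, defaults included."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()  # fill in the parameters the caller omitted
    return [f'{name}={value!r}' for name, value in bound.arguments.items()]


def log_params(func):
    """Hypothetical decorator: log every argument, then run `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info('Parameters:')
        for line in format_call(func, *args, **kwargs):
            logger.info(line)
        return func(*args, **kwargs)
    return wrapper


@log_params
def add(x, y, *args, z=1, **kwargs):
    return x + y + z
```

One trade-off to keep in mind: the decorator logs before the function body runs, so for build the log file sink (which depends on dest_path) would have to be added earlier, or the messages would only reach whatever sinks already exist.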