`argparse`: Optional Argument and Flag?

I was recently modifying a program that uses argparse to collect command line options, in order to add an option enabling a ‘testrun’. The script begins by copying a large cohort to a server and then takes several steps to manipulate it. When I ran it with a new configuration file, one flag was misconfigured, but the program ran for several hours before hitting the misconfiguration. I therefore wanted a ‘testrun’ command line option that would let a user try out a configuration file on a small sample to check that everything works. If something was broken, it would take only a couple of minutes to spot and fix the configuration (or other errors). The user could then drop the ‘testrun’ option and run the entire dataset.

I wanted to be able to supply the script with three possible states for the ‘testrun’ parameter:

(no flag): not a testrun [testrun = None]
--testrun: apply the default number of records (say, 20) [testrun = 20]
--testrun=50: apply a non-default value, e.g., a larger one if many records are likely to be filtered out and the test would otherwise remain incomplete, or a smaller one (e.g., 5) to avoid a long runtime if many matches are expected [testrun = 50]

My first attempt combined a flag action (e.g., store_true) with an optional number of arguments (e.g., nargs='?'), but this caused issues. A search of the argparse documentation, however, suggested an alternative using the const parameter: https://docs.python.org/3/library/argparse.html#const. From the description, const covers exactly this case: when an option declared with nargs='?' appears on the command line without a value, the value of const is used.
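
As an aside on that first attempt, here is a minimal sketch of one way it can fail. The helper name `try_flag_with_nargs` is made up for this post. In the CPython versions I have checked, the `store_true` action does not accept an `nargs` keyword, so `add_argument` raises a `TypeError` before any parsing happens.

```python
import argparse


def try_flag_with_nargs():
    """Return the error raised when store_true is combined with nargs, if any."""
    parser = argparse.ArgumentParser()
    try:
        # store_true takes no value, so it has no notion of nargs
        parser.add_argument('--testrun', action='store_true', nargs='?')
    except TypeError as exc:
        return exc
    return None


if __name__ == '__main__':
    print(repr(try_flag_with_nargs()))
```

So a plain flag action cannot double as an option with an optional value, which is the gap that const fills.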

We can create the parameter like so:

import argparse


parser = argparse.ArgumentParser()
parser.add_argument('--testrun', nargs='?', const=20, type=int,
                    help='Test the configuration on a sample of records (default: 20)')
print(parser.parse_args([]))  # no args
#> Namespace(testrun=None)
print(parser.parse_args(['--testrun']))  # bare flag, falls back to const
#> Namespace(testrun=20)
print(parser.parse_args(['--testrun=50']))  # flag with sample size specified
#> Namespace(testrun=50)
print(parser.parse_args(['--testrun', '5']))  # flag with sample size specified
#> Namespace(testrun=5)
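
One possible refinement, sketched under my own assumptions rather than taken from the original script: since a sample size of 0 or a negative number makes little sense, the `type` callable can validate the value. The names `positive_int` and `build_parser` are hypothetical helpers for illustration.

```python
import argparse


def positive_int(value):
    """Convert a command line string to an int, rejecting values below 1."""
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f'{value!r} is not an integer')
    if number < 1:
        raise argparse.ArgumentTypeError(f'{value!r} must be at least 1')
    return number


def build_parser():
    """Build a parser whose --testrun option takes an optional positive int."""
    parser = argparse.ArgumentParser()
    # const=20 is already an int, so it is stored unchanged for a bare
    # --testrun; positive_int only sees strings given on the command line.
    parser.add_argument('--testrun', nargs='?', const=20, type=positive_int,
                        help='Test the configuration on a sample of records')
    return parser


if __name__ == '__main__':
    print(build_parser().parse_args(['--testrun=50']))
```

With this in place, something like `--testrun=0` should make argparse print a usage error and exit instead of silently running the full dataset.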

We can now handle our use case by checking whether testrun was set before limiting our corpus. For example, with the parsed value stored in a testrun variable (e.g., testrun = parser.parse_args().testrun), the pandas code might look something like this:

import pandas as pd


df = pd.read_csv(path)
if testrun:
    df = df.head(testrun)

Or, perhaps more efficiently, using an iterator to avoid loading the entire dataset:

if testrun:
    df = next(pd.read_csv(path, chunksize=testrun))  # only load `chunksize` lines from CSV
else:
    df = pd.read_csv(path)
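
If pandas is not available, the same idea can be sketched with the standard library alone. The function name `read_records` and its `limit` parameter are my own invention for this example. `itertools.islice` stops consuming the reader after `limit` rows, and a `limit` of `None` reads everything, so no separate branch is needed.

```python
import csv
import io
from itertools import islice


def read_records(handle, limit=None):
    """Read CSV rows as dicts, stopping after `limit` rows if one is given."""
    reader = csv.DictReader(handle)
    # islice(reader, None) yields every row; islice(reader, n) stops after n
    return list(islice(reader, limit))


if __name__ == '__main__':
    sample = io.StringIO('id,name\n1,a\n2,b\n3,c\n')
    print(read_records(sample, limit=2))
```

Passing the parsed testrun value straight through as `limit` then covers both the full run (None) and the test run (a number).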

`click`’s version

In click (version 8.0 or later, as far as I can tell), the same should be possible with the following:

import click


@click.command()
@click.option('--testrun', is_flag=False, flag_value=20, default=None, type=int)
def run(testrun):
    print(testrun)
    

if __name__ == '__main__':
    run()
