Argparse allows you to parse command-line arguments with ease. It also provides a --help feature.
Basic syntax:
# It's good to use argparse inside of a function to not pollute the namespace
def main():
    import argparse
    # Create the parser and add a description
    parser = argparse.ArgumentParser(
        description='This is where you would write a description. '
                    'It can span multiple lines and will look just fine on '
                    'the console.')
    # Add positional argument (required, so a default= would never be used without nargs='?')
    parser.add_argument('name_of_var', type=int,
                        help='The name of var will be accessed via args.name_of_var')
    # Variable number of arguments
    parser.add_argument('lots_of_args', type=int, nargs='+',
                        help='A variable number of args will be accepted here')
    # Named argument
    parser.add_argument('-s', '--save-dir', required=False, type=str, default='',
                        help='Save directory. Accessed via args.save_dir. Defaults to: %(default)s')
    # Boolean argument (True if present)
    parser.add_argument('-f', '--flag', action='store_true', required=False,
                        help='If argument is provided, args.flag will be True, else False')
    # Parse arguments
    args = parser.parse_args()
    print(f'name_of_var: {args.name_of_var}')
    print(f'lots_of_args: {args.lots_of_args}')
    print(f'save-dir: {args.save_dir}')
    print(f'flag: {args.flag}')

if __name__ == '__main__':
    main()
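While experimenting, it can help to hand parse_args an explicit list instead of reading sys.argv. A minimal sketch, assuming a cut-down version of the parser above (build_parser is a hypothetical helper name, not part of argparse):

```python
import argparse

def build_parser():
    # Hypothetical helper: a smaller version of the parser shown above
    parser = argparse.ArgumentParser(description='Demo parser')
    parser.add_argument('name_of_var', type=int)
    parser.add_argument('lots_of_args', type=int, nargs='+')
    parser.add_argument('-s', '--save-dir', type=str, default='')
    parser.add_argument('-f', '--flag', action='store_true')
    return parser

# Passing a list skips sys.argv, which is handy in tests or a REPL
args = build_parser().parse_args(['7', '1', '2', '3', '--save-dir', 'out', '-f'])
print(args.name_of_var)   # 7
print(args.lots_of_args)  # [1, 2, 3]
print(args.save_dir)      # out
print(args.flag)          # True
```

The first positional takes one value and the nargs='+' positional takes the rest.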
Choices example:
# Allow only select options for arguments
parser.add_argument('--animal', choices=['dog', 'cat', 'giraffe'])
Argument groups are useful for organizing help message options into groups. By default, --help displays only the groups "positional arguments" and "optional arguments" (titled "options" since Python 3.10).
animal = parser.add_argument_group('animal')
animal.add_argument('--cute', type=int, choices=range(1, 11), help='Cute Scale')
animal.add_argument('--giraffiness', type=int, help='How giraffe like is the animal?')
args = parser.parse_args()
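To see choices and groups working together, here is a small self-contained sketch. The prog name 'zoo' and the sample argument lists are made up for illustration:

```python
import argparse

parser = argparse.ArgumentParser(prog='zoo')
parser.add_argument('--animal', choices=['dog', 'cat', 'giraffe'])

animal = parser.add_argument_group('animal')
animal.add_argument('--cute', type=int, choices=range(1, 11), help='Cute Scale')

# A valid choice parses normally
args = parser.parse_args(['--animal', 'cat', '--cute', '9'])
print(args.animal, args.cute)

# A value outside the choices makes argparse print an error and exit
try:
    parser.parse_args(['--cute', '11'])
    rejected = False
except SystemExit:
    rejected = True
print(rejected)

# The group title shows up as its own section in the help text
help_text = parser.format_help()
print(help_text)
```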
from multiprocessing import Pool

num_processes = 4
args = [(1, 2), (3, 4), (5, 6)]

def async_function(a, b):
    return a + b

results = []

def callback(result):  # Callback should always only accept one argument
    # With starmap_async the callback is called once, with the full list of results
    results.extend(result)

if __name__ == '__main__':  # Needed on platforms that spawn worker processes
    with Pool(num_processes) as pool:
        async_result = pool.starmap_async(async_function, args, callback=callback)
        # The work runs in the background; get() blocks until it is finished
        same_results = async_result.get()  # get() also returns the results
from termcolor import colored
print(colored('Hello World', 'green'))
print(colored('Hello World', 'cyan', 'on_magenta', attrs=['blink', 'bold']))
More colors but harder to use
Conda is a great Python command line package and environment manager. It is similar to virtualenv.
To make (base) not appear:
$ conda config --set auto_activate_base false
$ conda activate <env name>
$ conda create --name <env name> python=3.8
$ conda env list
Cache is a way to remember all previous results of a function (functools.cache requires Python 3.9+)
from functools import cache

@cache
def factorial(n):
    return n * factorial(n-1) if n else 1
LRU cache (least recently used) is a way to remember only the most recent results of a function. Because it evicts old results once maxsize is reached, it bounds memory use; the unbounded cache above is slightly faster since it never has to evict.
from functools import lru_cache

@lru_cache(maxsize=32)  # Keep the 32 most recently used results
def lru_factorial(n):
    return n * lru_factorial(n-1) if n else 1
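The eviction behaviour is easier to see with a deliberately tiny cache. A minimal sketch, where the square function and maxsize=2 are chosen only for illustration:

```python
from functools import lru_cache

@lru_cache(maxsize=2)  # Deliberately tiny so eviction is easy to see
def square(n):
    return n * n

square(1)  # miss
square(2)  # miss
square(1)  # hit: 1 is still cached
square(3)  # miss: cache is full, so the least recently used entry (2) is evicted
square(2)  # miss again: 2 has to be recomputed

info = square.cache_info()
print(info)  # CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)
```

cache_info() is a quick way to check whether a cache is actually being hit before tuning maxsize.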
I personally use Google-style Python docstrings.
Example:
class Cat():
    '''Retains information about a particular cat and performs cat-like actions

    Note:
        Cats are cute

    Args:
        color (str): Color of cat
        legs (int): Number of legs a cat has

    Attributes:
        color (str): Color of cat
        count (int): Number of cats in existence
        legs (int): Number of legs a cat has
    '''
    count = 0

    def __init__(self, color, legs):
        self.color = color
        self.legs = legs
        Cat.count += 1  # Increment the class-level counter, not an instance copy

    def meow(self, times: int = 3):
        '''Cat meows "times" times

        A cat will meow "times" times in a row

        Args:
            times (int): Number of times to meow

        Returns:
            str: The string "meow" repeated a number of times as given. Each
                meow is separated by a space.

        Raises:
            ValueError: If 'times' is not a positive integer
        '''
        # Check the type first so a non-int never reaches the comparison
        if type(times) is not int or times < 1:
            raise ValueError('A cat must meow at least once!')
        return ' '.join(['meow'] * times)
pdoc is an API documentation auto-generator. It can generate HTML documentation pages with a single command.
To install: pip3 install pdoc3
Usage: pdoc --html <File, Directory, or Package path>
--skip-errors
The html files will be saved in html/
If you want to use a linter to find code that could be written better (and pull out your hair while doing so) pylint is for you! To make you feel extra bad, pylint even gives you a score out of 10!
To install: pip3 install pylint
Usage: pylint <Directory or File>
JSON is a great way to store text in both a human and machine readable format.
import json
# Read JSON file
with open('file.json', 'r') as f:
    x = json.load(f)
# Save JSON file (the with block closes the file for us)
with open('file.json', 'w') as f:
    json.dump(x, f, indent=4, sort_keys=True)
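A round trip makes the read/save pair concrete. This is a minimal sketch: the sample dictionary is made up, and a temporary directory stands in for wherever file.json would really live:

```python
import json
import os
import tempfile

data = {'name': 'giraffe', 'legs': 4, 'tags': ['tall', 'spotted']}

# Write to a temporary directory so the sketch leaves nothing behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'file.json')
    with open(path, 'w') as f:
        json.dump(data, f, indent=4, sort_keys=True)
    with open(path, 'r') as f:
        loaded = json.load(f)

print(loaded == data)  # True

# dumps/loads do the same conversion to and from a string instead of a file
text = json.dumps(data, sort_keys=True)
print(text)
```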
a.sort()
new_a = sorted(a)
The key argument works with both in-place sort() and sorted():
a = [('a', 1), ('c', 3), ('b', 2)]
a.sort(key=lambda x: x[1])
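A slightly fuller sketch of the same idea, showing sorted() next to sort() and adding reverse. operator.itemgetter is a standard-library alternative to the lambda; the variable names are illustrative:

```python
from operator import itemgetter

a = [('a', 1), ('c', 3), ('b', 2)]

# sorted() returns a new list and leaves `a` untouched
by_number = sorted(a, key=lambda x: x[1])
# itemgetter(0) is equivalent to lambda x: x[0]
by_letter_desc = sorted(a, key=itemgetter(0), reverse=True)

# sort() reorders the list in place and returns None
a.sort(key=itemgetter(1), reverse=True)

print(by_number)       # [('a', 1), ('b', 2), ('c', 3)]
print(by_letter_desc)  # [('c', 3), ('b', 2), ('a', 1)]
print(a)               # [('c', 3), ('b', 2), ('a', 1)]
```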
A Python module is a single Python file. The module's name is the file name without the extension.
A package is a directory of modules and subpackages that are imported under a single name.
An __init__.py file is required in every directory and sub-directory where modules/sub-packages exist for a regular package. This file is generally empty but may contain code to initialize the package.
The __pycache__ directory caches (saves) "compiled" Python code. The timestamp of the Python file is checked. If the file has been updated, the cache is updated.
Module search path:
/usr/local/lib/python/
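The actual search path differs between machines and environments, so it is usually easier to ask the interpreter. A small sketch using only the standard library (the json module is just a convenient example to look up):

```python
import importlib.util
import sys

# sys.path is the list of locations searched, in order, when importing
for entry in sys.path:
    print(entry)

# find_spec reports where a module would be loaded from
spec = importlib.util.find_spec('json')
print(spec.origin)
```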
Use imap if you want to load in the arguments as you go (for example, from a generator that will return LOTS of results). It will be slightly slower than map. Set chunksize to a number greater than 1 for lots of speedup.
Use imap_unordered when you would use imap but don't care about the order of the returned results.
Use map when you don't mind all of the arguments being loaded into memory at once.
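The difference between imap and imap_unordered can be sketched as follows. To keep the example runnable anywhere, it assumes multiprocessing.dummy.Pool, the thread-backed pool with the same interface; swap in multiprocessing.Pool for real processes. The square and numbers helpers are made up for illustration:

```python
from multiprocessing.dummy import Pool  # thread-backed; same interface as multiprocessing.Pool

def square(x):
    return x * x

def numbers(n):
    # Generator: the arguments are produced one at a time, not as one big list
    for i in range(n):
        yield i

with Pool(4) as pool:
    # imap accepts a generator and yields results lazily, in input order
    ordered = list(pool.imap(square, numbers(10), chunksize=3))
    # imap_unordered yields results as they finish, in no guaranteed order
    unordered = list(pool.imap_unordered(square, numbers(10), chunksize=3))

print(ordered)
print(sorted(unordered) == ordered)
```

chunksize=3 hands the workers three items at a time, which cuts per-item overhead when each call is cheap.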
from multiprocessing import Pool
from tqdm import tqdm

num_processes = 4
args = [(1, 2), (3, 4), (5, 6)]  # A generator also works if `total` is set another way. See `chunksize`
results = []

def func(a, b):
    return a + b

with Pool(num_processes) as pool:
    # Use `map` or `imap` instead of `starmap` if `func` only has 1 argument
    # Note: `starmap` returns only once every result is ready, so the bar fills in one jump;
    # `imap` yields results one at a time and gives a live progress bar
    for item in tqdm(pool.starmap(func, args), total=len(args)):
        results.append(item)
Another option is p_tqdm
Use starmap when a function requires multiple arguments.
Check if file is a file or directory:
import os
if os.path.isfile('./path/to/file.txt'):
    print("Is File")
if os.path.isdir("./path/to/dir/"):
    print("Is Directory")
To create a path:
import os
path = os.path.join('root-dir', 'parent-dir', 'file.txt')
To make directories recursively:
import os
if not os.path.exists('./test1/test2'):
    os.makedirs('./test1/test2')
Alternatively:
from pathlib import Path
Path('./test3/test4').mkdir(parents=True, exist_ok=True)
Split a file path into dir and file:
import os
head, tail = os.path.split('/path/to/file.txt')
print(head) # '/path/to'
print(tail) # 'file.txt'
Get the extension from path
import os
path, ext = os.path.splitext('/path/to/file.tar.gz')
print(path) # '/path/to/file.tar'
print(ext) # '.gz'
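The same splits can also be done with pathlib, which additionally exposes every suffix of a multi-part extension. A minimal sketch, using PurePosixPath so the output is identical on any operating system:

```python
from pathlib import PurePosixPath

# PurePosixPath only manipulates the string; it never touches the file system
p = PurePosixPath('/path/to/file.tar.gz')

print(p.parent)    # /path/to
print(p.name)      # file.tar.gz
print(p.suffix)    # .gz
print(p.suffixes)  # ['.tar', '.gz']
print(p.stem)      # file.tar
```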
from prettytable import PrettyTable
table = PrettyTable()
table.title = 'Title'
table.field_names = ['Col1', 'Col2']
table.add_row(['Apple Pi', 3.14159])
table.add_row(['Banana', 42])
# Print table to stdout
print(table)
# Save text table to file
with open('output.txt', 'w') as f:
    f.write(table.get_string())
$ pip install PTable
PTable is a fork of PrettyTable that allows for titles
To save a PrettyTable to a file, it must be converted to a string first:
with open('file.txt', 'w') as f:
    f.write(table.get_string())
Print to stderr:
import sys
print('Bad stuff happened', file=sys.stderr)
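To confirm that a message really goes to stderr and not stdout, the stream can be captured. A small sketch, where warn is a hypothetical helper name and contextlib.redirect_stderr does the capturing:

```python
import contextlib
import io
import sys

def warn(message):
    # Hypothetical helper: send diagnostics to stderr, keep stdout for real output
    print(message, file=sys.stderr)

# redirect_stderr temporarily swaps sys.stderr for our in-memory buffer
buffer = io.StringIO()
with contextlib.redirect_stderr(buffer):
    warn('Bad stuff happened')

captured = buffer.getvalue()
print(repr(captured))  # 'Bad stuff happened\n'
```

Keeping diagnostics on stderr means the script's stdout can still be piped into another program cleanly.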