Python

Python is an interpreted programming language with all of the bells and whistles.

Argparse

Argparse allows you to parse command line arguments with ease. It also provides a --help feature.

Basic syntax:

# It's good to use argparse inside of a function so it does not pollute the module namespace
def main():
    import argparse

    # Create the parser and add a description
    parser = argparse.ArgumentParser(description='This is where you would write a description. \
                                    It can span multiple lines and will look just fine on \
                                    the console.')
    # Add positional argument (positionals are required, so a `default` would be ignored)
    parser.add_argument('name_of_var', type=int,
                        help='The name of var will be accessed via args.name_of_var')
    # Variable number of arguments
    parser.add_argument('lots_of_args', type=int, nargs='+',
                        help='A variable number of args will be accepted here')
    # Named argument
    parser.add_argument('-s', '--save-dir', required=False, type=str, default='',
                        help='Save directory. Accessed via args.save_dir. Defaults to: %(default)s')
    # Boolean argument (True if the flag is present)
    parser.add_argument('-f', '--flag', action='store_true', required=False,
                        help='If argument is provided, args.flag will be True, else False')

    # Parse arguments
    args = parser.parse_args()

    print(f'name_of_var: {args.name_of_var}')
    print(f'lots_of_args: {args.lots_of_args}')
    print(f'save-dir: {args.save_dir}')
    print(f'flag: {args.flag}')


if __name__ == '__main__':
    main()
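
A quick way to experiment with a parser like the one above is to hand parse_args an explicit list instead of letting it read sys.argv. This is only a sketch: build_parser is a made-up helper that rebuilds a trimmed-down version of the parser, and the sample values are arbitrary.

```python
import argparse


def build_parser():
    # Hypothetical helper: a smaller version of the parser from the basic syntax example
    parser = argparse.ArgumentParser(description='Demo parser')
    parser.add_argument('name_of_var', type=int)
    parser.add_argument('lots_of_args', type=int, nargs='+')
    parser.add_argument('-s', '--save-dir', type=str, default='')
    parser.add_argument('-f', '--flag', action='store_true')
    return parser


# Passing a list makes parse_args ignore sys.argv
args = build_parser().parse_args(['7', '1', '2', '3', '--save-dir', 'out', '-f'])
print(args.name_of_var, args.lots_of_args, args.save_dir, args.flag)
```

The first value fills name_of_var and the remaining bare values are collected into the lots_of_args list.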

• Choices

Choices example:

# Allow only select options for arguments
parser.add_argument('--animal', choices=['dog', 'cat', 'giraffe'])
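
To see what happens when a value is outside the allowed choices, the sketch below parses one valid and one invalid value. The value 'horse' is just an illustration. argparse reports the error on stderr and exits, which shows up in Python as SystemExit.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--animal', choices=['dog', 'cat', 'giraffe'])

# A value from the list is accepted as usual
ok = parser.parse_args(['--animal', 'cat'])
print(ok.animal)

# Anything else prints a usage error to stderr and exits
try:
    parser.parse_args(['--animal', 'horse'])
except SystemExit as exc:
    rejected_code = exc.code
    print(f'rejected with exit code {rejected_code}')
```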

• Group

Argument groups are useful for organizing help message options into groups. By default, --help only displays the groups "positional arguments" and "optional arguments" (titled "options" since Python 3.10).

animal = parser.add_argument_group('animal')
animal.add_argument('--cute', type=int, choices=range(1, 11), help='Cute Scale')
animal.add_argument('--giraffiness', type=int, help='How giraffe like is the animal?')

args = parser.parse_args()
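
Groups only change how --help is laid out. The parsed values still land directly on args. A small sketch to check both, where the prog name 'demo' and the group description are made up:

```python
import argparse

parser = argparse.ArgumentParser(prog='demo')
animal = parser.add_argument_group('animal', 'Options describing the animal')
animal.add_argument('--cute', type=int, choices=range(1, 11), help='Cute Scale')

# format_help() returns the same text that --help prints
help_text = parser.format_help()
print(help_text)

# Grouped arguments are read like any other argument
args = parser.parse_args(['--cute', '9'])
print(args.cute)
```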

Async

• Pool

from multiprocessing import Pool

num_processes = 4
args = [(1, 2), (3, 4), (5, 6)]

def async_function(a, b):
    return a + b

results = []
def callback(result):  # Called once, with the full list of return values
    # Do something with the results if you want
    results.extend(result)

if __name__ == '__main__':  # Needed where worker processes are spawned (Windows, macOS)
    with Pool(num_processes) as pool:
        async_result = pool.starmap_async(async_function, args, callback=callback)
        # Work starts right away; get() blocks until every result is ready
        same_results = async_result.get()  # get() also returns the results

Colored

Termcolor (Simple)

from termcolor import colored
print(colored('Hello World', 'green'))
print(colored('Hello World', 'cyan', 'on_magenta', attrs=['blink', 'bold']))

Colored (Complex)

More colors, but harder to use.

Conda

Conda is a great command line package and environment manager for Python. Its environments are similar to virtualenv.

Conda cheat sheet

To make (base) not appear:

$ conda config --set auto_activate_base false

• Activate Environment

$ conda activate <env name>

• Create Environment

$ conda create --name <env name> python=3.8

• List Environments

$ conda env list

Decorators

• Cache

Cache (functools.cache, Python 3.9+) is a way to remember all previous results of a function. The cache is unbounded, so it never forgets a result.

from functools import cache
@cache
def factorial(n):
    return n * factorial(n-1) if n else 1
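
To see the effect of caching, here is a rough sketch that counts how often the function body actually runs. The fib function and the calls counter are made up for the demo.

```python
from functools import cache

calls = 0  # Counts how many times the body really executes


@cache
def fib(n):
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)


value = fib(30)
info = fib.cache_info()  # Named tuple with hits, misses, maxsize, currsize
print(value, calls, info)
```

Without the decorator the same call would run the body over a million times. With it, each n from 0 to 30 is computed once.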

• Lru Cache

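LRU-Cache (least recently used) is a way to remember only the most recently used results of a function. Once maxsize results are stored, the least recently used one is thrown away to make room, which keeps memory use bounded. The unbounded cache above is slightly faster because it never has to evict anything.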

from functools import lru_cache
@lru_cache(maxsize=32)  # Keep the 32 most recently used results
def lru_factorial(n):
    return n * lru_factorial(n-1) if n else 1
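
Eviction is easiest to see with a tiny cache. In this sketch maxsize=2 and the square function are picked only for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=2)
def square(n):
    return n * n


square(1)  # miss
square(2)  # miss
square(1)  # hit, and 1 becomes the most recently used entry
square(3)  # miss, evicts 2 (the least recently used entry)
square(2)  # miss again, because 2 was evicted
info = square.cache_info()
print(info)
```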

Documentation

• Docstrings

I personally use the Google style python docstrings.

Example:

class Cat():
    '''Retains information about a particular cat and performs cat-like actions
    
    Note:
        Cats are cute

    Args:
        color (str): Color of cat
        legs (int): Number of legs a cat has

    Attributes:
        color (str): color of cat
        count (int): Number of cats in existence
        legs (int): Number of legs a cat has
    '''
    
    count = 0

    def __init__(self, color, legs):
        self.color = color
        self.legs = legs
        Cat.count += 1  # Update the class-wide counter (self.count += 1 would only set an instance attribute)

    def meow(self, times: int = 3):
        '''Cat meows "times" times
        
        A cat will meow "times" times in a row

        Args:
            times (int): Number of times to meow

        Returns:
            str: The string "meow" repeated a number of times as given. Each 
                meow is separated by a space.

        Raises:
            ValueError: If 'times' is not a positive integer
        '''
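        # Check the type first so a non-int raises ValueError instead of TypeError
        if type(times) is not int or times < 1:
            raise ValueError('A cat must meow at least once!')
        meow = ' '.join(['meow'] * times)  # No trailing space
        return meow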
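
Docstrings are not just comments: they are stored on the object and can be read back at runtime, which is what help() and documentation generators rely on. A small sketch, with a made-up add function written in the same Google style:

```python
import inspect


def add(a: int, b: int) -> int:
    '''Add two integers

    Args:
        a (int): First number
        b (int): Second number

    Returns:
        int: The sum of a and b
    '''
    return a + b


# getdoc returns the docstring with the indentation cleaned up
doc = inspect.getdoc(add)
print(doc)
```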

• Documentation Generator

pdoc is an API documentation auto-generator. It can generate html documentation pages in a single line.

To install: pip3 install pdoc3

Usage: pdoc --html <File, Directory, or Package path>

  • To ignore errors: --skip-errors

The html files will be saved in html/

• Lint

If you want to use a linter to find code that could be written better (and pull out your hair while doing so) pylint is for you! To make you feel extra bad, pylint even gives you a score out of 10!

To install: pip3 install pylint

Usage: pylint <Directory or File>

Json

JSON is a great way to store structured data (dicts, lists, strings, numbers, booleans and null) in a format that is both human and machine readable.

import json

# Read JSON file
with open('file.json', 'r') as f:
    x = json.load(f)

# Save JSON file (the with block makes sure the file is closed)
with open('file.json', 'w') as f:
    json.dump(x, f, indent=4, sort_keys=True)
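
A round trip is a quick way to check that data survives being written and read back. This is a sketch with made-up sample data, using a temporary directory so nothing is left behind. json.dumps and json.loads are the string versions of dump and load.

```python
import json
import os
import tempfile

data = {'name': 'giraffe', 'legs': 4, 'tags': ['tall', 'spotted']}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'animal.json')
    # Write the data, then read it back from the same file
    with open(path, 'w') as f:
        json.dump(data, f, indent=4, sort_keys=True)
    with open(path) as f:
        loaded = json.load(f)

# dumps/loads work on strings instead of files
text = json.dumps(data, sort_keys=True)
print(loaded == data)
print(text)
```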

List

• Sorting

In-Place

a.sort()

New List

new_a = sorted(a)

Key Functions

A key function works with both a.sort() and sorted():

a = [('a', 1), ('c', 3), ('b', 2)]
a.sort(key=lambda x: x[1])
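
A few variations on key functions, as a sketch: operator.itemgetter does the same job as the lambda, and reverse=True flips the order. The variable names are only for illustration.

```python
from operator import itemgetter

a = [('a', 1), ('c', 3), ('b', 2)]

# itemgetter(1) is equivalent to lambda x: x[1]
by_number = sorted(a, key=itemgetter(1))

# Sort on the letter instead, largest first
by_letter_desc = sorted(a, key=itemgetter(0), reverse=True)

# sorted() returned new lists, so `a` is unchanged until sorted in place
a.sort(key=lambda x: x[1])
print(by_number, by_letter_desc, a)
```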

Matplotlib

Modules

A python module is a single python file. The module's name is the file name without the extension.

• Packages

A package is a directory of modules and subpackages that is imported under a single name.

An __init__.py file marks a directory as a regular package, so put one in every directory and sub-directory that holds modules/sub-packages. This file is generally empty but may contain code to initialize the package. (Since Python 3.3, a directory without one can still be imported as a namespace package.)
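
To make the layout concrete, this sketch builds a throwaway package in a temporary directory and imports a module from it. The names demo_pkg, sub, mod and VALUE are all made up.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # demo_pkg/__init__.py, demo_pkg/sub/__init__.py, demo_pkg/sub/mod.py
    pkg = Path(tmp) / 'demo_pkg'
    (pkg / 'sub').mkdir(parents=True)
    (pkg / '__init__.py').write_text('')
    (pkg / 'sub' / '__init__.py').write_text('')
    (pkg / 'sub' / 'mod.py').write_text('VALUE = 42\n')

    # Make the temporary directory importable for the duration of the demo
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        mod = importlib.import_module('demo_pkg.sub.mod')
        value = mod.VALUE
    finally:
        sys.path.remove(tmp)

print(mod.__name__, value)
```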

• Pycache

The __pycache__ directory caches (saves) "compiled" Python bytecode as .pyc files. On import, the timestamp of the source file is compared with the one recorded in the cache. If the source has been updated, it is recompiled and the cache is updated.

• Search Path

Module search path:

  1. Current directory / script directory
  2. PYTHONPATH
  3. Installation-dependent default path, e.g. /usr/local/lib/python/
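
The search path that results from the steps above is available at runtime as sys.path, so it can simply be printed. A minimal sketch:

```python
import os
import sys

# Entries are searched in order; an empty string means the current directory
entries = list(sys.path)
for entry in entries:
    print(entry or '(current directory)')

# PYTHONPATH, if set, is where the second group of entries comes from
pythonpath = os.environ.get('PYTHONPATH', '')
print(f'PYTHONPATH: {pythonpath!r}')
```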

Multi Processing

• Imap

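Use imap if the arguments come from a generator that yields LOTS of items. Unlike map, it does not convert the arguments to a list first, and it hands back the results one at a time, in order, as they become ready. It will be slightly slower than map. Set chunksize to a number greater than 1 for lots of speedup.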

• Imap_Unordered

Use imap_unordered when you would use imap but don't care about the order of the returned results.

• Map

Use map when it is fine for all of the arguments and results to be held in memory at once. It blocks until every result is ready and returns them as a list.
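
The three methods can be compared side by side. This is a rough sketch: square and compare are made-up helpers, and the results of imap_unordered are sorted only so they can be compared with the others.

```python
from multiprocessing import Pool


def square(n):
    return n * n


def compare(numbers, num_processes=2):
    # Hypothetical helper that runs the same work through all three methods
    with Pool(num_processes) as pool:
        # map: blocks, returns a list in input order
        mapped = pool.map(square, numbers)
        # imap: lazy iterator in input order (consume it before the pool closes)
        lazy = list(pool.imap(square, numbers, chunksize=2))
        # imap_unordered: lazy iterator, results in completion order
        unordered = sorted(pool.imap_unordered(square, numbers, chunksize=2))
    return mapped, lazy, unordered


if __name__ == '__main__':
    mapped, lazy, unordered = compare(range(6))
    print(mapped, lazy, unordered)
```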

• Progress Bar

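from multiprocessing import Pool
from tqdm import tqdm

num_processes = 4
args = [(1, 2), (3, 4), (5, 6)]  # Must be a sequence here so len(args) works

def func(a, b):
    return a + b

def func_star(pair):
    # Pool has no lazy version of `starmap`, so unpack the tuple by hand
    return func(*pair)

if __name__ == '__main__':
    results = []
    with Pool(num_processes) as pool:
        # `starmap` would block until everything is done and the bar would jump
        # straight to 100%. `imap` yields results as they finish.
        for item in tqdm(pool.imap(func_star, args), total=len(args)):
            results.append(item)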

Another option is p_tqdm

• Starmap

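Use starmap when a function requires multiple arguments. Each item of the iterable is a tuple that gets unpacked into the call, so (1, 2) becomes func(1, 2).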

Os

• Isfile

Check whether a path is a file or a directory:

import os
if os.path.isfile('./path/to/file.txt'):
    print("Is File")
if os.path.isdir("./path/to/dir/"):
    print("Is Directory")
if not os.path.exists('./test1/test2'):
    os.makedirs('./test1/test2')

• Join

To create a path:

import os
path = os.path.join('root-dir', 'parent-dir', 'file.txt')
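
Two things worth knowing about join, shown as a sketch: pathlib's / operator builds the same path, and a component that starts at the root throws away everything before it. The 'etc' component is just an example.

```python
import os
from pathlib import Path

joined = os.path.join('root-dir', 'parent-dir', 'file.txt')

# pathlib builds the same path with the / operator
same = str(Path('root-dir') / 'parent-dir' / 'file.txt')

# A rooted component discards the components before it
reset = os.path.join('root-dir', os.sep + 'etc', 'file.txt')

print(joined, same, reset)
```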

• Makedirs

To make directories recursively:

import os
os.makedirs('./test1/test2', exist_ok=True)  # No error if the directory already exists

Alternatively:

from pathlib import Path
Path('./test3/test4').mkdir(parents=True, exist_ok=True)

• Split

Split a file path into dir and file:

import os
head, tail = os.path.split('/path/to/file.txt')
print(head)  # '/path/to'
print(tail)  # 'file.txt'

• Splitext

Get the extension from a path (only the last extension is split off):

import os
path, ext = os.path.splitext('/path/to/file.tar.gz')
print(path)  # '/path/to/file.tar'
print(ext)   # '.gz'
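
For comparison, pathlib exposes the same pieces as attributes, including every extension of a name like file.tar.gz. A sketch using PurePosixPath so the result is the same on any OS:

```python
from pathlib import PurePosixPath

p = PurePosixPath('/path/to/file.tar.gz')
print(p.parent)    # /path/to
print(p.name)      # file.tar.gz
print(p.suffix)    # .gz
print(p.suffixes)  # ['.tar', '.gz']
print(p.stem)      # file.tar

# Strip every extension, not just the last one
base = p.name[:-len(''.join(p.suffixes))]
print(base)        # file
```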

Pickle

Pretty Print

Pretty Table

from prettytable import PrettyTable

table = PrettyTable()

table.title = 'Title'
table.field_names = ['Col1', 'Col2']
table.add_row(['Apple Pi', 3.14159])
table.add_row(['Banana', 42])

# Print table to stdout
print(table)

# Save text table to file
with open('output.txt', 'w') as f:
    f.write(table.get_string())

• Installation

$ pip install PTable

PTable is a fork of PrettyTable that allows for titles.

• Save

To save a PrettyTable to a file, it must be converted to a string first

with open('file.txt', 'w') as f:
    f.write(table.get_string())

Print

Print to stderr:

import sys
print('Bad stuff happened', file=sys.stderr)
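
To check what actually went to stderr, for example in a test, the stream can be captured with contextlib.redirect_stderr. This is a sketch and warn is a made-up helper:

```python
import contextlib
import io
import sys


def warn(message):
    # Hypothetical helper; sys.stderr is looked up each time it is called
    print(f'warning: {message}', file=sys.stderr)


# Temporarily point sys.stderr at an in-memory buffer
buffer = io.StringIO()
with contextlib.redirect_stderr(buffer):
    warn('Bad stuff happened')

captured = buffer.getvalue()
print(repr(captured))
```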