Effective Python
Chapter 1 Pythonic Thinking
Item 1: Know Which Version of Python You're Using
python --version
        
import sys
print(sys.version)
        
Item 2: Follow the PEP 8 Style Guide
Item 3: Know the Differences Between bytes, str, and unicode
  • In Python 3, bytes holds raw 8-bit values (for example, encoded characters)
  • In Python 3, str holds Unicode characters, which have no specific binary encoding
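A minimal sketch of converting between the two types at the edges of a program. The helper names to_str and to_bytes and the choice of UTF-8 are assumptions made for illustration; other encodings work the same way.

```python
def to_str(bytes_or_str):
    """Return a str, decoding bytes as UTF-8 if needed."""
    if isinstance(bytes_or_str, bytes):
        return bytes_or_str.decode('utf-8')
    return bytes_or_str

def to_bytes(bytes_or_str):
    """Return bytes, encoding str as UTF-8 if needed."""
    if isinstance(bytes_or_str, str):
        return bytes_or_str.encode('utf-8')
    return bytes_or_str

print(to_str(b'caf\xc3\xa9'))    # café
print(to_bytes('caf\u00e9'))     # b'caf\xc3\xa9'
```

Keeping str inside the program and converting only at the boundaries (files, sockets) avoids mixing the two types by accident.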
Item 4: Write Helper Functions Instead of Complex Expressions
  • A helper function performs part of another function's computation, in line with the DRY (Don't Repeat Yourself) principle
  • Move complex expressions into helper functions, especially if you need to use the same logic repeatedly
  • An if/else expression is a more readable alternative to chaining Boolean operators such as or and and
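The points above can be sketched with a query-string example. The helper name get_first_int and the sample query string are hypothetical, chosen only to show the repeated logic being moved into one place.

```python
from urllib.parse import parse_qs

def get_first_int(values, key, default=0):
    """Return the first value for key as an int, or default if missing or blank."""
    found = values.get(key, [''])
    if found[0]:
        return int(found[0])
    return default

# keep_blank_values=True keeps 'green=' as an empty string
my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)

red = get_first_int(my_values, 'red')
green = get_first_int(my_values, 'green')
opacity = get_first_int(my_values, 'opacity')

print(red, green, opacity)  # 5 0 0
```

Without the helper, each lookup would need its own expression along the lines of int(my_values.get('red', [''])[0] or 0), which is harder to read and easy to get subtly wrong.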
Item 5: Know How to Slice Sequences
  • Avoid being verbose: Don't supply 0 for the start index or the length of the sequence for the end index
  • # start and end indexes beyond the boundaries of the list are tolerated,
    # which makes it easy to cap the length of an input sequence
    a = ['a', 'b', 'c', 'd']
    
    print(a[:10]) # ['a', 'b', 'c', 'd']
    print(a[-10:]) # ['a', 'b', 'c', 'd']
            
    # The result of slicing is a shallow copy
    a = [[1, 2, 3], 'b', 'c', 'd']
    b = a[:3]
    
    b[0][0] = 10
    
    print(a, b) # [[10, 2, 3], 'b', 'c', 'd'] [[10, 2, 3], 'b', 'c']
            
    # a slice can be replaced by a list
    # the slice and the assigned list don't need to be the same length
    a = ['a', 'b', 'c', 'd']
    a[2:8] = [10, 20]
    
    print(a) # ['a', 'b', 10, 20]
            
    # leaving out both the start and the end indexes produces a shallow copy of the whole list
    a = [[1, 2, 3], 'b', 'c', 'd']
    b = a[:]
    
    print(a == b, a is b) # True False
    
    b[0][0] = 10
    
    print(a, b) # [[10, 2, 3], 'b', 'c', 'd'] [[10, 2, 3], 'b', 'c', 'd']
            
    # assigning a list to a slice with no start or end indexes
    # replaces the list's entire contents in place instead of allocating a new list
    a = ['a', 'b', 'c', 'd']
    
    b = a
    print(id(a), id(b)) # 140420333398144 140420333398144
    
    a[:] = [10, 20]
    
    print(a is b, id(a), id(b)) # True 140420333398144 140420333398144
    
    print(a) # [10, 20]
            
Item 6: Avoid Using start, end, and stride in a Single Slice
    # list[start:end:stride]
    # slicing creates a shallow copy
    a = ['a', 'b', 'c', 'd']
    
    print(a[::2]) # ['a', 'c']
    print(a[::-1]) # ['d', 'c', 'b', 'a']
    
    # prefer using positive stride values
    # use one assignment to stride and another to slice
    a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
    
    b = a[::2]
    c = b[1:-1]
            
Item 7: Use List Comprehensions Instead of map and filter
  • The expressions deriving one list from another are called list comprehensions
  • a = range(10)
    
    # list comprehension
    b = [e**2 for e in a]
    d = [e**2 for e in a if e%2 == 0]
    
    # map
    c = list(map(lambda x:x**2, a))
    e = list(map(lambda x:x**2, filter(lambda x: x%2 == 0, a)))
    
    d = {'name':'Lin', 'age': 43}
    
    # dictionary comprehensions
    d2 = {key:value for key, value in d.items()}
    
    # set comprehensions
    s = {value for value in d.values()}
            
Item 8: Avoid More Than Two Expressions in List Comprehensions
  • List comprehensions support multiple levels of loops and multiple conditions per loop level
  • m = [[1, 2], [3, 4]]
    
    s = [[x**2 for x in row] for row in m]
    
    print(s)
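A short sketch of the multiple-loop and multiple-condition forms mentioned above. The sample data is made up for illustration; the last comprehension is the kind that is usually clearer as a plain loop.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# two loop levels: flatten the matrix, reading the loops left to right
flat = [x for row in matrix for x in row]
print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# multiple conditions on the same loop level are an implicit "and"
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [x for x in a if x > 4 if x % 2 == 0]
print(b)  # [6, 8, 10]

# conditions at each loop level: legal, but hard to read
filtered = [[x for x in row if x % 3 == 0]
            for row in matrix if sum(row) >= 10]
print(filtered)  # [[6], [9]]
```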
            
Item 9: Consider Generator Expressions for Large Comprehensions
  • Generator expressions avoid using too much memory by yielding one item at a time from the expression
  • Generator expressions can be chained together
  • a = list(range(10))
    
    # do not materialize the output sequence
    b = (x**2 for x in a)
    
    # print
    for e in b:
        print(e, end = ' ')
    
    # prints nothing: the generator has already been consumed
    for e in b:
        print(e, end = ' ')
    
    # use a generator as the input for another generator
    a = list(range(10))
    
    b = (x**2 for x in a)
    
    c = (x*10 for x in b)
    
    # iterating the parent generator first consumes it,
    # so the child generator that wraps it yields nothing afterwards
    # prints 0 1 4 9 16 25 36 49 64 81
    for e in b:
        print(e, end = ' ')
    
    # prints nothing
    for e in c:
        print(e, end = ' ')
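For contrast, a small sketch of chaining used as intended: only the outer generator is iterated, and each step pulls exactly one item through the inner one. The variable names squares and scaled are illustrative.

```python
a = list(range(5))

squares = (x**2 for x in a)       # nothing is computed yet
scaled = (x*10 for x in squares)  # wraps the first generator

# each step of the outer generator advances the inner one by one item
print(next(scaled))  # 0
print(next(scaled))  # 10
print(list(scaled))  # [40, 90, 160]
```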
            
Item 10: Prefer enumerate Over range
    # enumerate wraps any iterator with a lazy generator
    a = enumerate(range(10))
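A brief sketch comparing the two styles the item title refers to. The list flavors is sample data invented for the example.

```python
flavors = ['vanilla', 'chocolate', 'pecan']

# range-based indexing: works, but noisy
for i in range(len(flavors)):
    print(i + 1, flavors[i])

# enumerate yields (index, item) pairs; the second argument sets the first index
for i, flavor in enumerate(flavors, 1):
    print(i, flavor)
```

Both loops print the same numbered lines; the second avoids the manual indexing and the off-by-one adjustment.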
            
Item 11: Use zip to Process Iterators in Parallel
    # zip wraps two or more iterators with a lazy generator
    a = range(10)
    b = [e**2 for e in a]
    c = [e*2 for e in a]
    
    z = zip(a, b, c)
    
    for item in z:
        print(item)
    
    # yield tuples until a wrapped iterator is exhausted
    a = range(4)
    b = [e**2 for e in range(3)]
    
    z = zip(a, b)
    
    for item in z:
        print(item)
    
    # (0, 0)
    # (1, 1)
    # (2, 4)
    
    from itertools import zip_longest
    
    z = zip_longest(a, b)
    
    for item in z:
        print(item)
    
    # (0, 0)
    # (1, 1)
    # (2, 4)
    # (3, None)
            
Item 12: Avoid else Blocks After for and while Loops
    # the else block runs after the loop finishes
    for e in range(4):
        print(e)
    else:
        print('Else block ...')
    
    # break in a loop will skip the else block
    for e in range(4):
        print(e)
        if e % 2 == 0:
            break
    else:
        print('Else block ...')
    
    # the else block runs after the loop finishes
    a = 0
    while a < 4:
        print(a)
        a += 1
    else:
        print('Else block ...')
    
    # break in a loop will skip the else block
    a = 0
    while a < 4:
        print(a)
        if a % 2 == 0:
            break
        a += 1
    else:
        print('Else block ...')
            
Item 13: Take Advantage of Each Block in try/except/else/finally
    # finally runs cleanup code even when an exception occurs
    def get_exception():
        raise Exception('Raising exception ...')
    
    def run_block():
        try:
            get_exception()
        finally:
            print('Run finally ...') # Always runs after try
    
    def calling():
        try:
            run_block()
        except Exception as e:
            print(e)
    
    calling()
    
    # Run finally ...
    # Raising exception ...
    
    # else performs additional actions only after the try block succeeds
    def run_block():
        try:
            pass
        except Exception as e:
            print(e)
        else:
            print('Additional actions ...')
        finally:
            print('Run finally ...')
    
    run_block()
    
    # Additional actions ...
    # Run finally ...
            
Chapter 2 Functions
Item 14: Prefer Exceptions to Returning None
    # None, zero, and the empty string all evaluate to False in conditional expressions
    def divide(a, b):
        try:
            return a/b
        except ZeroDivisionError:
            return None
    
    if divide(2, 0) is None:
        print('Denominator is zero ...')
    
    # a valid result of 0.0 is also falsy, so this check misfires
    if not divide(0, 2):
        print('Invalid inputs ...') # wrong: the inputs were valid
    
    # raise exceptions to indicate special situations instead of returning None
    def divide(a, b):
        try:
            return a/b
        except ZeroDivisionError as e:
            raise ValueError('Invalid inputs ...') from e
    
    try:
        result = divide(0, 2)
    except ValueError as e:
        print(e)
    else:
        print(result)
            
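Item 15: Know How Closures Interact with Variable Scope
  • A closure is a function value that references variables from outside its body
  • When resolving a name, Python traverses scopes in this order: the current function, any enclosing functions, the module (global) scope, then the built-in scope
  • nonlocal makes an assignment target a variable in the scope of an enclosing function; it never reaches the module scope
  • global makes an assignment go directly into the module scope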
  • n = 10 # global variable
    
    def f1():
        print(n) # 10, access global variable
    
    def f2():
        n = 100 # define a local variable that shadows the global
        print(n) # 100
    
    def f3():
        global n
        n = 100 # assign to the global variable
        print(n) # 100
    
    def f4():
        def f5():
            nonlocal m
            m = 1000
            print(m) # 1000
    
        m = 100
        print(m) # 100
        f5()
        print(m) # 1000
    
    # f1(), 10
    # print(n), 10
    
    # f2(), 100
    # print(n), 10
    
    # f3(), 100
    # print(n), 100
    
    # f4(), 100, 1000, 1000
            
Item 16: Consider Generators Instead of Returning Lists
  • Generators are functions that use yield expressions
  • Calling a generator function does not run its body; it immediately returns an iterator
  • With each call to the built-in next function, the iterator advances the generator to its next yield expression
  • Each value passed to yield by the generator is returned by the iterator to the caller
  • The iterator can be converted to a list with the list function
  • The returned iterators are stateful and can't be reused
  • Generators do not support slicing; use itertools.islice instead
  • # list
    def get_values(s):
        c = []
        for l in s:
            c.append(ord(l))
    
        return c
    
    # generator
    def get_values(s):
        for l in s:
            yield ord(l)
    
    r = get_values('Hello World!')
    
    from itertools import islice
    
    r2 = islice(r, 0, 3) # lazy iterator over the first three values
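A small sketch of the statefulness described in the bullets above, stepping the iterator by hand with next. The input string 'Hi!' is arbitrary sample data.

```python
def get_values(s):
    for ch in s:
        yield ord(ch)

it = get_values('Hi!')  # the function body has not started running yet

print(next(it))  # 72
print(list(it))  # [105, 33]
print(list(it))  # [] because the iterator is already exhausted
```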
            
Item 17: Be Defensive When Iterating Over Arguments
  • Beware of functions that iterate over input arguments multiple times. If these arguments are iterators, you may see strange behavior and missing values
  • a = [10, 20, 30]
    
    def get_values():
        for e in a:
            yield e
    
    def normalize(get_iter):
        total = sum(get_iter()) # New iterator
        result = []
        for value in get_iter(): # New iterator
            result.append(value/total)
    
        return result
    
    percentages = normalize(get_values)
    
    print(percentages)
    
    # iterator protocol: how Python traverses the contents of a container
    # the iter built-in calls the __iter__ method of the class
    a = [10, 20, 30]
    
    class ReadValues:
        def __init__(self, a):
            self.a = a
    
        def __iter__(self):
            for e in self.a:
                yield e
    
    def normalize(values):
        # an iterator returns itself from iter(), while a container returns a new iterator each time
        if iter(values) is iter(values):
            raise TypeError('Must be a container ...')
        total = sum(values)
        result = []
        for value in values:
            result.append(value/total)
    
        return result
    
    values = ReadValues(a)
    percentages = normalize(values)
    
    print(percentages)
            
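Item 18: Reduce Visual Noise with Variable Positional Arguments
  • Optional positional arguments are declared with star args (*args)
  • Problems: a generator unpacked with * is fully consumed into a tuple before the call, which can exhaust memory; and adding a new positional parameter later silently breaks callers that were not updated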
  • def log(m, *values):
        print(m, ': ', values)
    
    log('Numbers', 1, 2)
    
    a = [1, 2]
    log('Numbers', *a) # unpack the list into positional arguments
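One pitfall of *args worth illustrating: the arguments always arrive as a tuple, so unpacking a generator with * consumes it completely before the function runs. The names my_generator and count_args are hypothetical.

```python
def my_generator():
    for i in range(5):
        yield i

def count_args(*args):
    # args is always a tuple, even when the caller unpacked a generator
    return type(args).__name__, len(args)

it = my_generator()
print(count_args(*it))  # ('tuple', 5)
print(list(it))         # [] because * consumed the whole generator
```

With five items this is harmless; with a very large or unbounded generator the same call could exhaust memory.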
            
Item 19: Provide Optional Behavior with Keyword Arguments
  • Function arguments can be specified by position or by keyword
  • Optional keyword arguments should always be passed by keyword instead of by position
  • def info(message, value):
        print(message, value)
    
    info('Hello: ', 10) # positional arguments
    info(value = 10, message = 'Hello: ') # keyword arguments in any order
    # info(message = 'Hello: ', 10) # SyntaxError: positional argument follows keyword argument
    info('Hello: ', value = 10) # positional arguments must be specified before keyword arguments
    # info('Hello: ', message = 'World!') # TypeError: each argument can only be specified once
    
    def info(message = 'Hello: ', value = 10):
        print(message, value)
    
    info('World: ') # positional argument
    info(value = 100) # keyword argument
    info('World', value = 100) # positional arguments must be specified before keyword arguments
    
    def info(message = 'Hello: ', value = 10, **kwargs):
        print(message, value, kwargs)
    
    info(value = 100, key_1 = 10, key_2 = 20) # extra keyword arguments are collected into kwargs
            
Item 20: Use None and Docstrings to Specify Dynamic Default Arguments
  • Default argument values are evaluated only once, when the function is defined (typically when the module is loaded)
  • Mutable default values, like {} or [], are shared by all calls to the function
  • Use None as the default value for a mutable argument to avoid odd behavior, and document the actual default in the docstring
  • from datetime import datetime
    
    # static default arguments
    def info(when = datetime.now()):
        print(when)
    
    # timestamps are same, datetime.now is executed a single time when the function is defined
    # default argument value is static
    info() # 2019-09-22 11:05:17.924218
    info() # 2019-09-22 11:05:17.924218
    
    # dynamic default arguments
    def info(when = None):
        when = datetime.now() if when is None else when
        print(when)
    
    info() # 2019-09-22 11:12:04.619564
    info() # 2019-09-22 11:12:04.619604
    
    # using mutable value as default argument value
    def info(default = {}): # default is shared by all calls to info
        return default
    
    c1 = info()
    c1['Name'] = 'Lin'
    
    c2 = info()
    c2['Age'] = 43
    
    # {'Name': 'Lin', 'Age': 43} {'Name': 'Lin', 'Age': 43} 4434628928 4434628928
    print(c1, c2, id(c1), id(c2))
    
    # use None as the default argument
    def info(default = None):
        return {} if default is None else default
    
    c1 = info()
    c2 = info()
    
    print(id(c1), id(c2))
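The item title also mentions docstrings, which the examples above do not show. A minimal sketch, assuming a hypothetical log function; the return value is added only so the behavior is easy to check.

```python
from datetime import datetime

def log(message, when=None):
    """Log a message with a timestamp.

    Args:
        message: Message to print.
        when: datetime of when the message occurred.
            Defaults to the present time.
    """
    # evaluated on every call, not once at definition time
    when = datetime.now() if when is None else when
    print('%s: %s' % (when, message))
    return when

first = log('Hi there!')
second = log('Hi again!', when=datetime(2019, 9, 22))
```

The signature only shows None, so the docstring is where a caller learns what the real default is.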
            
Item 21: Enforce Clarity with Keyword-Only Arguments
  • Use the * symbol in the argument list to mark the start of keyword-only arguments
  • Where the * syntax is unavailable (Python 2), **kwargs can emulate keyword-only arguments
  • # use * to indicate keyword-only arguments
    def info(message, *, value = 10):
        print(message, value)
    
    info('Hello', value = 100) # positional argument and keyword-only argument
    info(value = 100, message = 'Hello') # keyword arguments in any order
    
    # info('Hello', 100) # TypeError: value is keyword-only and cannot be passed by position
    
    # use **kwargs to emulate keyword-only arguments
    def info(message, **kwargs):
        print(message, kwargs['value'])
    
    info('Hello', value = 100) # value is collected into kwargs
    info(value = 100, message = 'Hello') # keyword arguments in any order
    
    # info('Hello', 100) # TypeError: info() takes 1 positional argument but 2 were given
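The **kwargs version above raises KeyError when value is omitted and silently accepts misspelled keywords. A hedged sketch of a stricter emulation: pop the supported option and reject anything left over. The default of 10 mirrors the earlier signature; returning a string instead of printing is a choice made here for easy checking.

```python
def info(message, **kwargs):
    # pop the one supported option, falling back to its default
    value = kwargs.pop('value', 10)
    if kwargs:
        # anything still in kwargs was misspelled or unsupported
        raise TypeError('Unexpected keyword arguments: %r' % kwargs)
    return '%s %s' % (message, value)

print(info('Hello'))             # Hello 10
print(info('Hello', value=100))  # Hello 100

try:
    info('Hello', valeu=100)     # typo in the keyword name
except TypeError as e:
    print(e)
```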
            
Chapter 3 Classes and Inheritance
Item 22: Prefer Helper Classes Over Bookkeeping with Dictionaries and Tuples
  • Avoid making dictionaries with values that are other dictionaries or long tuples
  • Use namedtuple for lightweight, immutable data containers
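A minimal sketch of namedtuple as a lightweight, immutable container. The Grade type and its fields are hypothetical names chosen for the example.

```python
from collections import namedtuple

Grade = namedtuple('Grade', ('score', 'weight'))

g = Grade(score=90, weight=0.4)
print(g.score, g.weight)  # 90 0.4
print(g == (90, 0.4))     # True, it is still a tuple

try:
    g.score = 95          # fields cannot be reassigned
except AttributeError as e:
    print('immutable:', e)

# _replace builds a new tuple instead of mutating the old one
g2 = g._replace(score=95)
```

Named fields replace positional indexes such as grade[0], which is what makes long tuples hard to follow.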
Item 23: Accept Functions for Simple Interfaces Instead of Classes
  • Many of Python's built-in APIs allow you to customize behavior by passing in a function
  • Instead of defining classes, functions are often all you need for simple interfaces between components in Python
  • References to functions in Python can be used in expressions like any other type
  • When you need to maintain state, consider defining a class with a __call__ method instead of defining a stateful closure
  • from collections import defaultdict
    
    current = {'green':12, 'blue':3}
    
    increments = [('red', 5), ('blue', 17), ('orange', 9)]
    
    class CallMissing:
        def __init__(self):
            self.added = 0 # keep operation state on the instance
    
        # make instances callable so one can serve as the default factory
        def __call__(self):
            self.added += 1
            return 0
    
    counter = CallMissing()
    result = defaultdict(counter, current)
    
    for key, value in increments:
        result[key] += value
    
    print(counter.added, dict(result)) # 2 {'green': 12, 'blue': 20, 'red': 5, 'orange': 9}
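For the stateless case, a plain function is usually enough as the hook. A small sketch using the key parameter of list.sort; the list of names is sample data.

```python
names = ['Socrates', 'Archimedes', 'Plato', 'Aristotle']

# key accepts any callable that takes one item and returns a sort value
names.sort(key=len)

print(names)  # ['Plato', 'Socrates', 'Aristotle', 'Archimedes']
```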
            
Item 24: Use @classmethod Polymorphism to Construct Objects Generically