Python Selected#

Python selected topics from John DeNero’s lectures, who owns most all credits of this notes.

Basics#

Doctest#

Python could test simple test cases within the docstring. Suppose we have a file named identity.py with source code:

def identity(k):
    """
    >>> identity(5)
    5
    >>> identity(100)
    100
    """
    return k

Use the following command to test the simple testcases. If all tests pass, nothing is printed.

$ python3 -m doctest identity.py

Instead, if we want to interact with the functions in the file, use the command below:

$ python3 -i identity.py
>>> identity(100)
100

Higher-order functions#

Every expression is evaluated in the environment where an environment is a sequence of frames. A name evaluates to the value bound to that name in the earliest frame of the current environment in which that name is found.

Higher-order function is a function that takes a function as an argument value or returns a function as a return values. The function that creates and returns a function creates a frame and this becomes the parent frame of the inner or returned function, though the frame is inactivate after execution. Hence, the parent of a function is the frame in which it was defined, and the parent of a frame is the parent of the function called.

Function composition uses function as parameters to compose functionality.

In [1]: def square(x):
   ...:     return x * x
   ...: 

In [2]: def double(x):
   ...:     return 2 * x
   ...: 

In [3]: def compose(f, g):
   ...:     def h(x):
   ...:         return f(g(x))
   ...:     return h
   ...: 

In [4]: squible = compose(square, double)

In [5]: squible(5)
Out[5]: 100

However, the function g in the function f cannot find the parameter y since g is not defined as a nested function inside f, or the parent frame of g is the Global frame, not function f.

def f(x, y):
    return g(x)

def g(a):
    return a + y

>>> f(1, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
  File "<stdin>", line 2, in g
NameError: name 'y' is not defined

When rewriting a function, remember the expressions must be evaluated into values before calling functions.

from math import sqrt
def real_sqrt(x):
    if x >= 0:
        return sqrt(x)
    else:
        return 0

# rewrite above
def ternary(condition, t, f):
    if condition:
        return t
    else:
        return f

def real_sqrt(x):
    return ternary(x >= 0, sqrt(x), 0)

If we call the rewrite version of real_sqrt, we will encounter a ValueError since the sqrt must be evaluated into value before passing into ternary as a parameter.

>>> real_sqrt(-16)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in real_sqrt
ValueError: math domain error

Instead of passing the memo as a parameter, we could use higher-order functions to achieve memoization.

def fib(n):
    if n == 0 or n == 1:
        return n
    else:
        return fib(n-2) + fib(n-1)

def count(f):
    def counted(*args):
        counted.call_count += 1
        return f(*args)
    counted.call_count = 0
    return counted

def memo(f):
    cache = {}
    def memoized(n):
        if n not in cache:
            cache[n] = f(n)
        return cache[n]
    return memoized

>>> fib = count(fib)
>>> fib(20)
6765
>>> fib.call_count
21891

>>> counted_fib = count(fib)
>>> fib  = memo(counted_fib)
>>> fib(20)
6765
>>> counted_fib.call_count
21

Lambda expression#

If we want to bind the function to a variable, we could just use assignment.

In [6]: from operator import add

In [7]: addPtr = add

In [8]: addPtr(1, 1)
Out[8]: 2

Instead, we could use lambda function to evaluate to a function as a lambda expression. To illustrate, the function mul is a function with formal parameters x and y that returns the value of x * y, which must be a single expression.

In [9]: mul = lambda x, y: x * y

In [10]: mul(11, 11)
Out[10]: 121

The difference between a lambda function and a function defined using def is that the def statement gives the funcition an intrinsic name which is the name of the function, while the lambda function is represented by \(\lambda(x, y)\) in the frame.

With lambda function, we could transform a multi-argument function into a single-argument, higher-order function by function currying.

In [11]: from operator import add

In [12]: def curry(f):
   ....:     def g(x):
   ....:         def h(y):
   ....:             return f(x, y)
   ....:         return h
   ....:     return g
   ....: 

In [13]: curry(add)(1)(2023)
Out[13]: 2024

In [14]: curry = lambda f: lambda x: lambda y: f(x, y)

In [15]: curry(add)(1)(2023)
Out[15]: 2024

A lambda function’s parent is the current frame in which the lambda expression is evaluated. The two variable a in the the function call has value of one since their parent frame is the Global frame.

In [16]: a = 1

In [17]: def f(g):
   ....:     a = 2
   ....:     return lambda y: a * g(y)
   ....: 

In [18]: f(lambda y: a + y)(a)
Out[18]: 4

Decorator#

The decorator is a design pattern that adds additional functionality to a function without modifying its structure. Use @ following decorator function name as a decorator preceding the applied function.

# decorator function definition
In [19]: def trace(f):
   ....:     def traced(x, y):
   ....:         print("calling", f, "on arguments", x, y)
   ....:         return f(x, y)
   ....:     return traced
   ....: 

In [20]: @trace
   ....: def mul(x, y):
   ....:     return x * y
   ....: 

In [21]: mul(2, 3)
calling <function mul at 0x160578940> on arguments 2 3
Out[21]: 6

A traditional method of wrapper without a decorator is reassigning the wrapper function to the function name, using higher-order function like below.

mul = trace(mul)

Self-reference#

Self-reference refers to a particular design of HOF, where a function eventually returns itself. In particular, a self-referencing function will not return a function call, but rather the function object itself.

Tip

The trick to understand the function call is to take one pair of parenthesis at a time, and use the returned current function to take the next parameter, and so on.

In [22]: def printAll(x):
   ....:     print(x)
   ....:     return printAll
   ....: 

In [23]: printAll(1)(2)(3)
1
2
3
Out[23]: <function __main__.printAll(x)>

Another example of tracing the summation at each step.

In [24]: def traceSum(x):
   ....:     print(x)
   ....:     def nextNum(y):
   ....:         return traceSum(x + y)
   ....:     return nextNum
   ....: 

In [25]: traceSum(1)(2)(3)
1
3
6
Out[25]: <function __main__.traceSum.<locals>.nextNum(y)>

Mutation#

Tuple is an immutable and hashable sequence, in which it can be keys in the dictionary.

In [26]: {(1, 1): 1}
Out[26]: {(1, 1): 1}

However, an immutable sequence may still change if it contains a mutable value as an element.

In [27]: s = ([0, 0], 0)

In [28]: s[0][0] = 1; s
Out[28]: ([1, 0], 0)

is identity operator evaluates to True if both operands evaluate to the same object. Instead, == evaluates to True if both operands evaluate to equal values, but not the memory address.

In [29]: a = [0]

In [30]: b = [0]

In [31]: a is b
Out[31]: False

In [32]: a == b
Out[32]: True

Default value of a function parameter is dangerous since only one object will be created when the function is defined.

In [33]: def f(s=[]):
   ....:     s.append(0)
   ....:     print(s)
   ....:     return len(s)
   ....: 

In [34]: f()
[0]
Out[34]: 1

In [35]: f()
[0, 0]
Out[35]: 2

Iterators#

A container can provide an iterator that provides access to its elements in certain order. An iterator behaves like an pointer pointing to an element in the container. iter initializes a iterator and next returns the value of the current element and proceeds the pointer.

In [36]: a = [[0, 1], 2, 3, 4]

In [37]: it = iter(a)

In [38]: next(it)
Out[38]: [0, 1]

list converts a iterator into a list with elements from current position to the end of container.

In [39]: list(it)
Out[39]: [2, 3, 4]

Calling next after the list conversion causes a StopIteration error since the iterator has reach the end of container.

>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Dictionary iterators become invalid if the structure or length of the dictionary container is changed. Instead, if we change a value of a key, the iterator is still valid.

>>> d = {'zero': 0, 'one': 1}
>>> it = iter(d)
>>> next(it)
'zero'
>>> d['one'] = 11
>>> next(it)
'one'
>>> d['two'] = 2
>>> d
{'zero': 0, 'one': 11, 'two': 2}
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

Attention

As of Python 3.7, dictionaries are guaranteed insertion ordered.

A for loop can be used to process an iterable. Compared with an iterable like list, an iterator proceeds in only one direction until reach the end.

In [40]: it = iter([1, 2])

In [41]: for n in it:
   ....:     print(n)
   ....: 
1
2

Python iterator does not provide a function, like hasNext in Java, to check whether an iterator reach the end. Alternatively, a default value will be returned if we put it as a second parameter of next.

In [42]: next(it, -1)
Out[42]: -1

Python provides built-in functions for iteration, and many built-in sequence operatios return iterators that compute results lazily, i.e. producing the next result one at a time, such as map, filter, zip, reversed.

In [43]: m = map(lambda x: x * x, range(3, 10))

In [44]: f = filter(lambda x: x >= 10, m)

In [45]: next(f)
Out[45]: 16

In [46]: list(f)
Out[46]: [25, 36, 49, 64, 81]

Be careful when you think you are comparing two lists, when one is actually a iterator.

In [47]: a = [1, 2, 1]

In [48]: reversed(a) == a
Out[48]: False

In [49]: reversed(a)
Out[49]: <list_reverseiterator at 0x1605b3ee0>

In [50]: list(reversed(a)) == a
Out[50]: True

The zip function only iterates over matches and skips extras and more than two iterables can be passed to.

In [51]: list(zip([1, 2], [1, 2, 3], [1, 2]))
Out[51]: [(1, 1, 1), (2, 2, 2)]

Generator#

Generator is a special kind of iterator create automatically by calling a generator function. A generator function is a function that lazily yield values instead of return them. While a normal function returns once, a generator function can yield multiple times. When a generator function is called, it returns a generator that iterates over its yields.

In [52]: def evens(start, end):
   ....:     even = start + (start % 2)
   ....:     while even < end:
   ....:         yield even
   ....:         even += 2
   ....: 

Tip

Using yield is convenient when you want to inspect or process a few data records from a large data source. Its laziness saves memory and reduces running time.

A yield from statement yields all values from an iterator or iterable. The following two functions behave the same.

def f(a):
    for n in a:
        yield n

def f(a):
    yield from a

Suppose we want to find the partitions of an integer with a maximum component value, an common approach is:

def partitions(n, m):
    """List partitions.

    >>> for p in partitions(6, 4): print(p)
    2 + 4
    1 + 1 + 4
    3 + 3
    1 + 2 + 3
    1 + 1 + 1 + 3
    2 + 2 + 2
    1 + 1 + 2 + 2
    1 + 1 + 1 + 1 + 2
    1 + 1 + 1 + 1 + 1 + 1
    """
    if n <= 0 or m == 0:
        return []
    else:
        exact_match = []
        if n == m:
            exact_match = [str(m)]
        with_m = [p + ' + ' + str(m) for p in partitions(n-m, m)]
        without_m = partitions(n, m-1)
        return exact_match + with_m + without_m

Using yield we could simplify the structure and make it more readable.

def yield_partitions(n, m):
    """List partitions.

    >>> for p in yield_partitions(6, 4): print(p)
    2 + 4
    1 + 1 + 4
    3 + 3
    1 + 2 + 3
    1 + 1 + 1 + 3
    2 + 2 + 2
    1 + 1 + 2 + 2
    1 + 1 + 1 + 1 + 2
    1 + 1 + 1 + 1 + 1 + 1
    """
    if n > 0 and m > 0:
        if n == m:
            yield str(m)
        for p in yield_partitions(n-m, m):
            yield p + ' + ' + str(m)
        yield from yield_partitions(n, m-1)

Fibonacci#

A common iterative Fibonacci implementation:

def fib(n):
    """Compute the nth Fibonacci number with n >= 1"""
    prev, curr = 0, 1
    k = 1
    while k < n:
        prev, curr = curr, prev + curr
        k += 1

    return curr

An alternative definition to handle the case where \(n = 0\):

def fib(n):
    """Compute the nth Fibonacci number with n >= 0"""
    prev, curr = 1, 0 # swap initial values
    k = 0             # start from the 0th value
    while k < n:
        prev, curr = curr, prev + curr
        k += 1

    return curr

If we want to multiply the result of a factorial by a constant k, and put k as a parameter, recursion is used.

def fact_times(n, k):
    """Return k * n * n-1 * ... * 1.

    >>> fact_times(4, 3)
    72
    """
    if n == 0:
        return k
    else:
        return fact_times(n - 1, k * n)

fact_times(5, 2) # 240

print function#

The print function returns None and displays arguments seperated by spaces when it is called.

In [53]: print(print(1), print(2))
1
2
None None

In [54]: def delay(arg):
   ....:     print("delayed")
   ....:     def g():
   ....:         return arg
   ....:     return g
   ....: 

In [55]: delay(delay)()(6)()
delayed
delayed
Out[55]: 6

In [56]: print(delay(print)()(4))
delayed
4
None

Notice that the print function does not produce IPython output. One more example of the horse-mask brainstorm.

In [57]: def horse(mask):
   ....:     horse = mask
   ....:     def mask(horse):
   ....:         return horse
   ....:     return horse(mask)
   ....: 

In [58]: mask = lambda horse: horse(2)

In [59]: horse(mask)
Out[59]: 2

assert function#

The assert function can be used to check assumptions of the variable values. A second parameter could be added to report the representation of the error.

>>> assert [0] == [1], ([0], [1])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError: ([0], [1])

sum function#

The summation function can take a second parameter as a default initial value, which is a useful technique in recursive function.

In [60]: sum([1, 2, 3], 0)
Out[60]: 6

In [61]: sum([[1], [2]], [])
Out[61]: [1, 2]

Exceptions#

Python exceptions are raised with a raise <expression> statement. The expression must evaluate to a subclass of BaseException or an instance of one. Common errors:

TypeError: A function was passed the wrong number/type of argument
NameError: A name wasn’t found
KeyError: A key wasn’t found in a dictionary
RecursionError: Too many recursive calls
ValueError: Found a lexical error
SyntaxError: Found a syntantic error

The try-catch statement could handle the potential errors.

from operator import truediv
def reduce(f, s, initial):
    """Combine elements of s pairwise using f, starting with initial.

    >>> reduce(mul, [2, 4, 8], 1)
    64
    >>> reduce(pow, [1, 2, 3, 4], 2)
    16777216
    """
    for x in s:
        initial = f(initial, x)
    return initial

def divide_all(n, ds):
    """Divide n by every d in ds.

    >>> divide_all(1024, [2, 4, 8])
    16.0
    >>> divide_all(1024, [2, 4, 0, 8])
    inf
    """
    try:
        return reduce(truediv, ds, n)
    except ZeroDivisionError:
        return float('inf')

Object#

A class statement create a new class and binds that class to the class name in the first frame of the current environment.

In [62]: class Integer:
   ....:     zero = 0
   ....: 

In [63]: Integer.zero
Out[63]: 0

Python has built-in functions that user can override.

__init__: Method invoked automatically when an object is constructed.
__repr__: Method invoked to display an object as a Python expression.
__str__: Method invoked to display a human-readable representation.
__add__: Method invoked to add one object to another.
__radd__: Method invoked to add another object to this object.
__bool__: Method invoked to convert an object to True or False.
__float__: Method invoked to convert an object to a float number.

In [64]: one, two = 1, 2

In [65]: one + two
Out[65]: 3

In [66]: one.__add__(two)
Out[66]: 3

The built-in function isinstance returns whether an object is an instance of a class or of a subclass thereof.

>>> isinstance
<built-in function isinstance>
>>> isinstance(one, int)
True

Attributes and methods#

The dot operator automatically fill in the first argument of instance methods, which is self. Identically, we can check and look up an attribute using built-in functions hasattr and getattr.

>>> hasattr
<built-in function hasattr>
>>> getattr
<built-in function getattr>

A Account class is create for illustration.

class Account:
    interest = 0.02

    def __init__(self, account_holder):
        self.holder = account_holder
        self.balance = 0

    def deposit(self, amount):
        """Add amount to balance."""
        self.balance = self.balance + amount
        return self.balance

    def withdraw(self, amount):
        """Subtract amount from balance if funds are available."""
        if amount > self.balance:
            return 'Insufficient funds'
        self.balance = self.balance - amount
        return self.balance

>>> cara_account = Account('Cara')
>>> cara_account.holder
'Cara'
>>> cara_account.balance
0
>>> cara_account.interest
0.02
>>> cara_account.deposit(100)
100
>>> Account.deposit(cara_account, 100)
200

Python distinguishes between:

Functions, which are directly defined globally, and
Bound methods, which couple together a function and the object on which that method will be invoked.

>>> type(Account.deposit)
<class 'function'>
>>> type(cara_account.deposit)
<class 'method'>

Notice that the class attributes are shared across all instances of a class since they are attributes of the class, not the instance.

>>> Account.interest
0.02

Different from other languages, Python supports adding a new attribute to an instance by attribute assignment.

>>> cara_account.interest = 0.05
>>> cara_account.interest
0.05
>>> Account.interest
0.02

Inheritance#

Inheritance is best for representing is-a relationships. Subclass is a specialized class that shares the common features of the superclass. For example, a checking account is an account.

class CheckingAccount(Account):
    withdraw_fee = 1
    interest = 0.01

    def withdraw(self, amount):
        return Account.withdraw(self, amount + self.withdraw_fee)
        # Alternatively:
        return super().withdraw(amount + self.withdraw_fee)

>>> ca = CheckingAccount('Cara')
>>> ca.balance = 20
>>> ca.withdraw(5)
14
>>> ca.interest
0.01

In the method withdraw, we prefer self.withdraw_fee to CheckingAccount.withdraw_fee to allow for specialized accounts.

Practice the understanding of inheritance with an example and draw the environment diagram.

class A:
    z = -1
    def f(self, x):
        return B(x-1)

class B(A):
    n = 4
    def __init__(self, y):
        if y:
            self.z = self.f(y)
        else:
            self.z = C(y+1)

class C(B):
    def f(self, x):
        return x

>>> a = A()
>>> b = B(1)
>>> C(2).n
>>> C(2).z
>>> a.z == C.z
>>> a.z == b.z
>>> b.z
>>> b.z.z
>>> b.z.z.z

See solution to John’s attribute lookup practice video.

Python supports multiple inheritance using class subclass(super1, super2).

Composition#

Composition is the best for representing has-a relationship. For example, a museum holds a list of exhibits.

class Bank:
    def __init__(self):
        self.accounts = []

    def open_account(self, holder, amount, account_type=Account):
        """Open an account_type for holder and deposit amount."""
        account = account_type(holder)
        account.deposit(amount)
        self.accounts.append(account)
        return account

    def pay_interest(self):
        """Pay interest to all accounts."""
        for account in self.accounts:
            account.deposit(account.balance * account.interest)

>>> bank = Bank()
>>> john = bank.open_account('John', 10)
>>> jack = bank.open_account('Jack', 5, CheckingAccount)
>>> jack.interest
0.01
>>> john.interest = 0.06
>>> bank.pay_interest()
>>> john.balance
10.6
>>> jack.balance
5.05

String representation#

The goal of __repr__ is to be unambiguous and the goal of __str__ is to be human-readable.

The function repr returns a Python expression of str type that evaluates to an equal object, i.e. eval(repr(object)) == object, and the function str returns what print returns on an object. repr invokes a zero-argument method __repr__ on its argument, and str invokes a zero-argument method __str__ on its argument.

The behavior of repr is slightly more complicated than invoking __repr__ on its argument: an instance attribute called __repr__ is ignored and only class attributes are found.

The behavior of str is also complicated:

An instance attribute called __str__ is ignored
If no __str__ attribute is found, uses repr string

Stepwise illustration: suppose we implement neither __repr__ or __str__.

class Car:
    pass

>>> lamborghini = Car()
>>> print(lamborghini)
<__main__.Car object at 0x10494a920>
>>> print(str(lamborghini))
<__main__.Car object at 0x10494a920>
>>> print(repr(lamborghini))
<__main__.Car object at 0x10494a920>
>>> print(lamborghini.__str__())
<__main__.Car object at 0x10494a920>
>>> print(lamborghini.__repr__())
<__main__.Car object at 0x10494a920>

Now we only the __repr__ function.

class Car:
    def __repr__(self):
        return "Car()"

>>> lamborghini = Car()
>>> print(lamborghini)
Car()
>>> print(str(lamborghini))
Car()
>>> print(repr(lamborghini))
Car()
>>> print(lamborghini.__str__())
Car()
>>> print(lamborghini.__repr__())
Car()

Tip

Implement __repr__ for any class you implement. This should be the second nature. Implement __str__ for the seek of readability.

With both __repr__ and __str__ functions.

class Car:
    def __repr__(self):
        return "Car()"

    def __str__(self):
        return "A car"

>>> lamborghini = Car()
>>> print(lamborghini)
A car
>>> print(str(lamborghini))
A car
>>> print(repr(lamborghini))
Car()
>>> print(lamborghini.__str__())
A car
>>> print(lamborghini.__repr__())
Car()

Add instance methods __repr__ and __str__ bound to a object.

class Car:
    def __init__(self):
        self.__repr__ = lambda: "this Car()"
        self.__str__ = lambda: "this car"

    def __repr__(self):
        return "Car()"

    def __str__(self):
        return "A car"

>>> lamborghini = Car()
>>> print(lamborghini)
A car
>>> print(str(lamborghini))
A car
>>> print(repr(lamborghini))
Car()
>>> print(lamborghini.__str__())
this car
>>> print(lamborghini.__repr__())
this Car()

Practice#

Inheritance#

Try to draw the environment diagram and guess the outputs.

class Worker:
    greeting = 'Sir'
    def __init__(self):
        self.elf = Worker
    def work(self):
        return self.greeting + ', I work'
    def __repr__(self):
        return Bourgeoisie.greeting

class Bourgeoisie(Worker):
    greeting = 'Peon'
    def work(self):
        print(Worker.work(self))
        return 'My job is to gather wealth'

>>> jack = Worker()
>>> john = Bourgeoisie()
>>> jack.greeting = 'Maam'
>>> Worker().work()
>>> jack
>>> jack.work()
>>> john.work()
>>> john.elf.work(john)

Built-in functions#

Try to implement the function body according to the docstring and compare the solutions.

def min_abs_indices(s):
    """Indices of all elements in list s that have the smallest absolute value.

    >>> min_abs_indices([-4, -3, -2, 3, 2, 4])
    [2, 4]
    >>> min_abs_indices([1, 2, 3, 4, 5])
    [0]
    """
    min_abs = min(map(abs, s))
    return list(filter(lambda i: abs(s[i]) == min_abs, range(len(s))))
    # OR
    return [i for i in range(len(s)) if abs(s[i]) == min_abs]

def largest_adj_sum(s):
    """Largest sum of two adjacent elements in a list s.

    >>> largest_adj_sum([-4, -3, -2, 3, 2, 4])
    6
    >>> largest_adj_sum([-4, 3, -2, -3, 2, -4])
    1
    """
    return max([x + y for x, y in zip(s[:-1], s[1:])])
    # OR
    return max([s[i] + s[i + 1] for i in range(len(s) - 1)])
    # OR
    return max(map(lambda i: s[i] + s[i + 1], range(len(s) - 1)))

def digit_dict(s):
    """Map each digit d to the lists of elements in s that end with d.

    >>> digit_dict([5, 8, 13, 21, 34, 55, 89])
    {1: [21], 3: [13], 4: [34], 5: [5, 55], 8: [8], 9: [89]}
    """
    return {i: [x for x in s if x % 10 == i] for i in range(10) if any([x % 10 == i for x in s])}
    # OR
    last_digits = list(map(lambda x: x % 10, s))
    return {i: [x for x in s if x % 10 == i] for i in range(10) if i in last_digits}

def all_have_an_equal(s):
    """Does every element equal some other element in s?

    >>> all_have_an_equal([-4, -3, -2, 3, 2, 4])
    False
    >>> all_have_an_equal([4, 3, 2, 3, 2, 4])
    True
    """
    return min([sum([1 for y in s if x == y]) for x in s]) > 1
    # OR
    return all([s[i] in s[:i] + s[i+1:] for i in range(len(s))])
    # OR
    return all(map(lambda x: s.count(x) > 1, s))