Python Selected#
Python selected topics from John DeNero’s lectures, who owns most all credits of this notes.
Basics#
Doctest#
Python could test simple test cases within the docstring. Suppose we have a file named identity.py with source code:
def identity(k):
"""
>>> identity(5)
5
>>> identity(100)
100
"""
return k
Use the following command to test the simple testcases. If all tests pass, nothing is printed.
$ python3 -m doctest identity.py
Instead, if we want to interact with the functions in the file, use the command below:
$ python3 -i identity.py
>>> identity(100)
100
Higher-order functions#
Every expression is evaluated in the environment where an environment is a sequence of frames. A name evaluates to the value bound to that name in the earliest frame of the current environment in which that name is found.
Higher-order function is a function that takes a function as an argument value or returns a function as a return values. The function that creates and returns a function creates a frame and this becomes the parent frame of the inner or returned function, though the frame is inactivate after execution. Hence, the parent of a function is the frame in which it was defined, and the parent of a frame is the parent of the function called.
Function composition uses function as parameters to compose functionality.
In [1]: def square(x):
...: return x * x
...:
In [2]: def double(x):
...: return 2 * x
...:
In [3]: def compose(f, g):
...: def h(x):
...: return f(g(x))
...: return h
...:
In [4]: squible = compose(square, double)
In [5]: squible(5)
Out[5]: 100
However, the function g
in the function f
cannot find the parameter y
since g
is not defined as a nested function inside f
, or the parent frame of g
is the Global frame, not function f
.
def f(x, y):
return g(x)
def g(a):
return a + y
>>> f(1, 2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in f
File "<stdin>", line 2, in g
NameError: name 'y' is not defined
When rewriting a function, remember the expressions must be evaluated into values before calling functions.
from math import sqrt
def real_sqrt(x):
if x >= 0:
return sqrt(x)
else:
return 0
# rewrite above
def ternary(condition, t, f):
if condition:
return t
else:
return f
def real_sqrt(x):
return ternary(x >= 0, sqrt(x), 0)
If we call the rewrite version of real_sqrt
, we will encounter a ValueError
since the sqrt
must be evaluated into value before passing into ternary
as a parameter.
>>> real_sqrt(-16)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in real_sqrt
ValueError: math domain error
Instead of passing the memo as a parameter, we could use higher-order functions to achieve memoization.
def fib(n):
if n == 0 or n == 1:
return n
else:
return fib(n-2) + fib(n-1)
def count(f):
def counted(*args):
counted.call_count += 1
return f(*args)
counted.call_count = 0
return counted
def memo(f):
cache = {}
def memoized(n):
if n not in cache:
cache[n] = f(n)
return cache[n]
return memoized
>>> fib = count(fib)
>>> fib(20)
6765
>>> fib.call_count
21891
>>> counted_fib = count(fib)
>>> fib = memo(counted_fib)
>>> fib(20)
6765
>>> counted_fib.call_count
21
Lambda expression#
If we want to bind the function to a variable, we could just use assignment.
In [6]: from operator import add
In [7]: addPtr = add
In [8]: addPtr(1, 1)
Out[8]: 2
Instead, we could use lambda function to evaluate to a function as a lambda expression. To illustrate, the function mul
is a function with formal parameters x
and y
that returns the value of x * y
, which must be a single expression.
In [9]: mul = lambda x, y: x * y
In [10]: mul(11, 11)
Out[10]: 121
The difference between a lambda function and a function defined using def
is that the def
statement gives the funcition an intrinsic name which is the name of the function, while the lambda function is represented by \(\lambda(x, y)\) in the frame.
With lambda function, we could transform a multi-argument function into a single-argument, higher-order function by function currying.
In [11]: from operator import add
In [12]: def curry(f):
....: def g(x):
....: def h(y):
....: return f(x, y)
....: return h
....: return g
....:
In [13]: curry(add)(1)(2023)
Out[13]: 2024
In [14]: curry = lambda f: lambda x: lambda y: f(x, y)
In [15]: curry(add)(1)(2023)
Out[15]: 2024
A lambda function’s parent is the current frame in which the lambda expression is evaluated. The two variable a
in the the function call has value of one since their parent frame is the Global frame.
In [16]: a = 1
In [17]: def f(g):
....: a = 2
....: return lambda y: a * g(y)
....:
In [18]: f(lambda y: a + y)(a)
Out[18]: 4
Decorator#
The decorator is a design pattern that adds additional functionality to a function without modifying its structure. Use @
following decorator function name as a decorator preceding the applied function.
# decorator function definition
In [19]: def trace(f):
....: def traced(x, y):
....: print("calling", f, "on arguments", x, y)
....: return f(x, y)
....: return traced
....:
In [20]: @trace
....: def mul(x, y):
....: return x * y
....:
In [21]: mul(2, 3)
calling <function mul at 0x160578940> on arguments 2 3
Out[21]: 6
A traditional method of wrapper without a decorator is reassigning the wrapper function to the function name, using higher-order function like below.
mul = trace(mul)
Self-reference#
Self-reference refers to a particular design of HOF, where a function eventually returns itself. In particular, a self-referencing function will not return a function call, but rather the function object itself.
Tip
The trick to understand the function call is to take one pair of parenthesis at a time, and use the returned current function to take the next parameter, and so on.
In [22]: def printAll(x):
....: print(x)
....: return printAll
....:
In [23]: printAll(1)(2)(3)
1
2
3
Out[23]: <function __main__.printAll(x)>
Another example of tracing the summation at each step.
In [24]: def traceSum(x):
....: print(x)
....: def nextNum(y):
....: return traceSum(x + y)
....: return nextNum
....:
In [25]: traceSum(1)(2)(3)
1
3
6
Out[25]: <function __main__.traceSum.<locals>.nextNum(y)>
Mutation#
Tuple is an immutable and hashable sequence, in which it can be keys in the dictionary.
In [26]: {(1, 1): 1}
Out[26]: {(1, 1): 1}
However, an immutable sequence may still change if it contains a mutable value as an element.
In [27]: s = ([0, 0], 0)
In [28]: s[0][0] = 1; s
Out[28]: ([1, 0], 0)
is
identity operator evaluates to True
if both operands evaluate to the same object. Instead, ==
evaluates to True
if both operands evaluate to equal values, but not the memory address.
In [29]: a = [0]
In [30]: b = [0]
In [31]: a is b
Out[31]: False
In [32]: a == b
Out[32]: True
Default value of a function parameter is dangerous since only one object will be created when the function is defined.
In [33]: def f(s=[]):
....: s.append(0)
....: print(s)
....: return len(s)
....:
In [34]: f()
[0]
Out[34]: 1
In [35]: f()
[0, 0]
Out[35]: 2
Iterators#
A container can provide an iterator that provides access to its elements in certain order. An iterator behaves like an pointer pointing to an element in the container. iter
initializes a iterator and next
returns the value of the current element and proceeds the pointer.
In [36]: a = [[0, 1], 2, 3, 4]
In [37]: it = iter(a)
In [38]: next(it)
Out[38]: [0, 1]
list
converts a iterator into a list with elements from current position to the end of container.
In [39]: list(it)
Out[39]: [2, 3, 4]
Calling next
after the list conversion causes a StopIteration
error since the iterator has reach the end of container.
>>> next(it)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Dictionary iterators become invalid if the structure or length of the dictionary container is changed. Instead, if we change a value of a key, the iterator is still valid.
>>> d = {'zero': 0, 'one': 1}
>>> it = iter(d)
>>> next(it)
'zero'
>>> d['one'] = 11
>>> next(it)
'one'
>>> d['two'] = 2
>>> d
{'zero': 0, 'one': 11, 'two': 2}
>>> next(it)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration
Attention
As of Python 3.7, dictionaries are guaranteed insertion ordered.
A for
loop can be used to process an iterable. Compared with an iterable like list
, an iterator proceeds in only one direction until reach the end.
In [40]: it = iter([1, 2])
In [41]: for n in it:
....: print(n)
....:
1
2
Python iterator does not provide a function, like hasNext
in Java, to check whether an iterator reach the end. Alternatively, a default value will be returned if we put it as a second parameter of next
.
In [42]: next(it, -1)
Out[42]: -1
Python provides built-in functions for iteration, and many built-in sequence operatios return iterators that compute results lazily, i.e. producing the next result one at a time, such as map
, filter
, zip
, reversed
.
In [43]: m = map(lambda x: x * x, range(3, 10))
In [44]: f = filter(lambda x: x >= 10, m)
In [45]: next(f)
Out[45]: 16
In [46]: list(f)
Out[46]: [25, 36, 49, 64, 81]
Be careful when you think you are comparing two lists, when one is actually a iterator.
In [47]: a = [1, 2, 1]
In [48]: reversed(a) == a
Out[48]: False
In [49]: reversed(a)
Out[49]: <list_reverseiterator at 0x1605b3ee0>
In [50]: list(reversed(a)) == a
Out[50]: True
The zip
function only iterates over matches and skips extras and more than two iterables can be passed to.
In [51]: list(zip([1, 2], [1, 2, 3], [1, 2]))
Out[51]: [(1, 1, 1), (2, 2, 2)]
Generator#
Generator is a special kind of iterator create automatically by calling a generator function. A generator function is a function that lazily yield
values instead of return
them. While a normal function returns once, a generator function can yield
multiple times. When a generator function is called, it returns a generator that iterates over its yields.
In [52]: def evens(start, end):
....: even = start + (start % 2)
....: while even < end:
....: yield even
....: even += 2
....:
Tip
Using yield
is convenient when you want to inspect or process a few data records from a large data source. Its laziness saves memory and reduces running time.
A yield from
statement yields all values from an iterator or iterable. The following two functions behave the same.
def f(a):
for n in a:
yield n
def f(a):
yield from a
Suppose we want to find the partitions of an integer with a maximum component value, an common approach is:
def partitions(n, m):
"""List partitions.
>>> for p in partitions(6, 4): print(p)
2 + 4
1 + 1 + 4
3 + 3
1 + 2 + 3
1 + 1 + 1 + 3
2 + 2 + 2
1 + 1 + 2 + 2
1 + 1 + 1 + 1 + 2
1 + 1 + 1 + 1 + 1 + 1
"""
if n <= 0 or m == 0:
return []
else:
exact_match = []
if n == m:
exact_match = [str(m)]
with_m = [p + ' + ' + str(m) for p in partitions(n-m, m)]
without_m = partitions(n, m-1)
return exact_match + with_m + without_m
Using yield
we could simplify the structure and make it more readable.
def yield_partitions(n, m):
"""List partitions.
>>> for p in yield_partitions(6, 4): print(p)
2 + 4
1 + 1 + 4
3 + 3
1 + 2 + 3
1 + 1 + 1 + 3
2 + 2 + 2
1 + 1 + 2 + 2
1 + 1 + 1 + 1 + 2
1 + 1 + 1 + 1 + 1 + 1
"""
if n > 0 and m > 0:
if n == m:
yield str(m)
for p in yield_partitions(n-m, m):
yield p + ' + ' + str(m)
yield from yield_partitions(n, m-1)
Fibonacci#
A common iterative Fibonacci implementation:
def fib(n):
"""Compute the nth Fibonacci number with n >= 1"""
prev, curr = 0, 1
k = 1
while k < n:
prev, curr = curr, prev + curr
k += 1
return curr
An alternative definition to handle the case where \(n = 0\):
def fib(n):
"""Compute the nth Fibonacci number with n >= 0"""
prev, curr = 1, 0 # swap initial values
k = 0 # start from the 0th value
while k < n:
prev, curr = curr, prev + curr
k += 1
return curr
If we want to multiply the result of a factorial by a constant k
, and put k
as a parameter, recursion is used.
def fact_times(n, k):
"""Return k * n * n-1 * ... * 1.
>>> fact_times(4, 3)
72
"""
if n == 0:
return k
else:
return fact_times(n - 1, k * n)
fact_times(5, 2) # 240
print function#
The print
function returns None
and displays arguments seperated by spaces when it is called.
In [53]: print(print(1), print(2))
1
2
None None
In [54]: def delay(arg):
....: print("delayed")
....: def g():
....: return arg
....: return g
....:
In [55]: delay(delay)()(6)()
delayed
delayed
Out[55]: 6
In [56]: print(delay(print)()(4))
delayed
4
None
Notice that the print
function does not produce IPython output. One more example of the horse-mask brainstorm.
In [57]: def horse(mask):
....: horse = mask
....: def mask(horse):
....: return horse
....: return horse(mask)
....:
In [58]: mask = lambda horse: horse(2)
In [59]: horse(mask)
Out[59]: 2
assert function#
The assert
function can be used to check assumptions of the variable values. A second parameter could be added to report the representation of the error.
>>> assert [0] == [1], ([0], [1])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AssertionError: ([0], [1])
sum function#
The summation function can take a second parameter as a default initial value, which is a useful technique in recursive function.
In [60]: sum([1, 2, 3], 0)
Out[60]: 6
In [61]: sum([[1], [2]], [])
Out[61]: [1, 2]
Exceptions#
Python exceptions are raised with a raise <expression>
statement. The expression must evaluate to a subclass of BaseException
or an instance of one. Common errors:
TypeError
: A function was passed the wrong number/type of argumentNameError
: A name wasn’t foundKeyError
: A key wasn’t found in a dictionaryRecursionError
: Too many recursive callsValueError
: Found a lexical errorSyntaxError
: Found a syntantic error
The try-catch
statement could handle the potential errors.
from operator import truediv
def reduce(f, s, initial):
"""Combine elements of s pairwise using f, starting with initial.
>>> reduce(mul, [2, 4, 8], 1)
64
>>> reduce(pow, [1, 2, 3, 4], 2)
16777216
"""
for x in s:
initial = f(initial, x)
return initial
def divide_all(n, ds):
"""Divide n by every d in ds.
>>> divide_all(1024, [2, 4, 8])
16.0
>>> divide_all(1024, [2, 4, 0, 8])
inf
"""
try:
return reduce(truediv, ds, n)
except ZeroDivisionError:
return float('inf')
Object#
A class
statement create a new class and binds that class to the class name in the first frame of the current environment.
In [62]: class Integer:
....: zero = 0
....:
In [63]: Integer.zero
Out[63]: 0
Python has built-in functions that user can override.
__init__
: Method invoked automatically when an object is constructed.__repr__
: Method invoked to display an object as a Python expression.__str__
: Method invoked to display a human-readable representation.__add__
: Method invoked to add one object to another.__radd__
: Method invoked to add another object to this object.__bool__
: Method invoked to convert an object to True or False.__float__
: Method invoked to convert an object to a float number.
In [64]: one, two = 1, 2
In [65]: one + two
Out[65]: 3
In [66]: one.__add__(two)
Out[66]: 3
The built-in function isinstance
returns whether an object is an instance of a class or of a subclass thereof.
>>> isinstance
<built-in function isinstance>
>>> isinstance(one, int)
True
Attributes and methods#
The dot
operator automatically fill in the first argument of instance methods, which is self
. Identically, we can check and look up an attribute using built-in functions hasattr
and getattr
.
>>> hasattr
<built-in function hasattr>
>>> getattr
<built-in function getattr>
A Account class is create for illustration.
class Account:
interest = 0.02
def __init__(self, account_holder):
self.holder = account_holder
self.balance = 0
def deposit(self, amount):
"""Add amount to balance."""
self.balance = self.balance + amount
return self.balance
def withdraw(self, amount):
"""Subtract amount from balance if funds are available."""
if amount > self.balance:
return 'Insufficient funds'
self.balance = self.balance - amount
return self.balance
>>> cara_account = Account('Cara')
>>> cara_account.holder
'Cara'
>>> cara_account.balance
0
>>> cara_account.interest
0.02
>>> cara_account.deposit(100)
100
>>> Account.deposit(cara_account, 100)
200
Python distinguishes between:
Functions, which are directly defined globally, and
Bound methods, which couple together a function and the object on which that method will be invoked.
>>> type(Account.deposit)
<class 'function'>
>>> type(cara_account.deposit)
<class 'method'>
Notice that the class attributes are shared across all instances of a class since they are attributes of the class, not the instance.
>>> Account.interest
0.02
Different from other languages, Python supports adding a new attribute to an instance by attribute assignment.
>>> cara_account.interest = 0.05
>>> cara_account.interest
0.05
>>> Account.interest
0.02
Inheritance#
Inheritance is best for representing is-a relationships. Subclass is a specialized class that shares the common features of the superclass. For example, a checking account is an account.
class CheckingAccount(Account):
withdraw_fee = 1
interest = 0.01
def withdraw(self, amount):
return Account.withdraw(self, amount + self.withdraw_fee)
# Alternatively:
return super().withdraw(amount + self.withdraw_fee)
>>> ca = CheckingAccount('Cara')
>>> ca.balance = 20
>>> ca.withdraw(5)
14
>>> ca.interest
0.01
In the method withdraw
, we prefer self.withdraw_fee
to CheckingAccount.withdraw_fee
to allow for specialized accounts.
Practice the understanding of inheritance with an example and draw the environment diagram.
class A:
z = -1
def f(self, x):
return B(x-1)
class B(A):
n = 4
def __init__(self, y):
if y:
self.z = self.f(y)
else:
self.z = C(y+1)
class C(B):
def f(self, x):
return x
>>> a = A()
>>> b = B(1)
>>> C(2).n
>>> C(2).z
>>> a.z == C.z
>>> a.z == b.z
>>> b.z
>>> b.z.z
>>> b.z.z.z
See solution to John’s attribute lookup practice video.
Python supports multiple inheritance using class subclass(super1, super2)
.
Composition#
Composition is the best for representing has-a relationship. For example, a museum holds a list of exhibits.
class Bank:
def __init__(self):
self.accounts = []
def open_account(self, holder, amount, account_type=Account):
"""Open an account_type for holder and deposit amount."""
account = account_type(holder)
account.deposit(amount)
self.accounts.append(account)
return account
def pay_interest(self):
"""Pay interest to all accounts."""
for account in self.accounts:
account.deposit(account.balance * account.interest)
>>> bank = Bank()
>>> john = bank.open_account('John', 10)
>>> jack = bank.open_account('Jack', 5, CheckingAccount)
>>> jack.interest
0.01
>>> john.interest = 0.06
>>> bank.pay_interest()
>>> john.balance
10.6
>>> jack.balance
5.05
String representation#
The goal of __repr__
is to be unambiguous and the goal of __str__
is to be human-readable.
The function repr
returns a Python expression of str
type that evaluates to an equal object, i.e. eval(repr(object)) == object
, and the function str
returns what print
returns on an object. repr
invokes a zero-argument method __repr__
on its argument, and str
invokes a zero-argument method __str__
on its argument.
The behavior of repr is slightly more complicated than invoking __repr__
on its argument: an instance attribute called __repr__
is ignored and only class attributes are found.
The behavior of str is also complicated:
An instance attribute called
__str__
is ignoredIf no
__str__
attribute is found, uses repr string
Stepwise illustration: suppose we implement neither __repr__
or __str__
.
class Car:
pass
>>> lamborghini = Car()
>>> print(lamborghini)
<__main__.Car object at 0x10494a920>
>>> print(str(lamborghini))
<__main__.Car object at 0x10494a920>
>>> print(repr(lamborghini))
<__main__.Car object at 0x10494a920>
>>> print(lamborghini.__str__())
<__main__.Car object at 0x10494a920>
>>> print(lamborghini.__repr__())
<__main__.Car object at 0x10494a920>
Now we only the __repr__
function.
class Car:
def __repr__(self):
return "Car()"
>>> lamborghini = Car()
>>> print(lamborghini)
Car()
>>> print(str(lamborghini))
Car()
>>> print(repr(lamborghini))
Car()
>>> print(lamborghini.__str__())
Car()
>>> print(lamborghini.__repr__())
Car()
Tip
Implement __repr__
for any class you implement. This should be the second nature. Implement __str__
for the seek of readability.
With both __repr__
and __str__
functions.
class Car:
def __repr__(self):
return "Car()"
def __str__(self):
return "A car"
>>> lamborghini = Car()
>>> print(lamborghini)
A car
>>> print(str(lamborghini))
A car
>>> print(repr(lamborghini))
Car()
>>> print(lamborghini.__str__())
A car
>>> print(lamborghini.__repr__())
Car()
Add instance methods __repr__
and __str__
bound to a object.
class Car:
def __init__(self):
self.__repr__ = lambda: "this Car()"
self.__str__ = lambda: "this car"
def __repr__(self):
return "Car()"
def __str__(self):
return "A car"
>>> lamborghini = Car()
>>> print(lamborghini)
A car
>>> print(str(lamborghini))
A car
>>> print(repr(lamborghini))
Car()
>>> print(lamborghini.__str__())
this car
>>> print(lamborghini.__repr__())
this Car()
Practice#
Inheritance#
Try to draw the environment diagram and guess the outputs.
class Worker:
greeting = 'Sir'
def __init__(self):
self.elf = Worker
def work(self):
return self.greeting + ', I work'
def __repr__(self):
return Bourgeoisie.greeting
class Bourgeoisie(Worker):
greeting = 'Peon'
def work(self):
print(Worker.work(self))
return 'My job is to gather wealth'
>>> jack = Worker()
>>> john = Bourgeoisie()
>>> jack.greeting = 'Maam'
>>> Worker().work()
>>> jack
>>> jack.work()
>>> john.work()
>>> john.elf.work(john)
Built-in functions#
Try to implement the function body according to the docstring and compare the solutions.
def min_abs_indices(s):
"""Indices of all elements in list s that have the smallest absolute value.
>>> min_abs_indices([-4, -3, -2, 3, 2, 4])
[2, 4]
>>> min_abs_indices([1, 2, 3, 4, 5])
[0]
"""
min_abs = min(map(abs, s))
return list(filter(lambda i: abs(s[i]) == min_abs, range(len(s))))
# OR
return [i for i in range(len(s)) if abs(s[i]) == min_abs]
def largest_adj_sum(s):
"""Largest sum of two adjacent elements in a list s.
>>> largest_adj_sum([-4, -3, -2, 3, 2, 4])
6
>>> largest_adj_sum([-4, 3, -2, -3, 2, -4])
1
"""
return max([x + y for x, y in zip(s[:-1], s[1:])])
# OR
return max([s[i] + s[i + 1] for i in range(len(s) - 1)])
# OR
return max(map(lambda i: s[i] + s[i + 1], range(len(s) - 1)))
def digit_dict(s):
"""Map each digit d to the lists of elements in s that end with d.
>>> digit_dict([5, 8, 13, 21, 34, 55, 89])
{1: [21], 3: [13], 4: [34], 5: [5, 55], 8: [8], 9: [89]}
"""
return {i: [x for x in s if x % 10 == i] for i in range(10) if any([x % 10 == i for x in s])}
# OR
last_digits = list(map(lambda x: x % 10, s))
return {i: [x for x in s if x % 10 == i] for i in range(10) if i in last_digits}
def all_have_an_equal(s):
"""Does every element equal some other element in s?
>>> all_have_an_equal([-4, -3, -2, 3, 2, 4])
False
>>> all_have_an_equal([4, 3, 2, 3, 2, 4])
True
"""
return min([sum([1 for y in s if x == y]) for x in s]) > 1
# OR
return all([s[i] in s[:i] + s[i+1:] for i in range(len(s))])
# OR
return all(map(lambda x: s.count(x) > 1, s))