Numpy Guide#

This is a short introduction to NumPy, geared mainly toward new users. You can find more complex recipes in the NumPy documentation.

While Python uses help() to display docstrings, IPython uses ?, and NumPy provides np.info. For source code, IPython uses ??; NumPy releases before 2.0 also provide np.source.

Customarily, we import as follows

In [1]: import numpy as np

In [2]: import pandas as pd

Object creation#

Several methods are available to create NumPy arrays. For optimal efficiency, an array should contain elements of a single type only. The simplest way is

In [3]: a = np.array([1, 2, 3, 4, 5, 6]); a
Out[3]: array([1, 2, 3, 4, 5, 6])

In [4]: a.dtype
Out[4]: dtype('int64')

In [5]: a.shape
Out[5]: (6,)

In [6]: a.ndim
Out[6]: 1

In [7]: a.nbytes
Out[7]: 48

To build up an array efficiently, we can either

initialize a Python list and convert it to a NumPy array, or
preallocate space as a NumPy array directly

In [8]: lst = [1, 2, 3, 4.5] # a simple Python list

In [9]: np.array(lst)
Out[9]: array([1. , 2. , 3. , 4.5])

# the default dtype is float64
In [10]: np.zeros(shape=6)
Out[10]: array([0., 0., 0., 0., 0., 0.])

# the shape parameter also accepts a tuple for n-dimensional arrays
In [11]: np.ones(shape=(1,6), dtype='int64')
Out[11]: array([[1, 1, 1, 1, 1, 1]])

In [12]: np.empty(shape=6) # uninitialized memory of length six; contents are arbitrary
Out[12]: array([0., 0., 0., 0., 0., 0.])

# N is the number of rows and M the number of columns (defaults to N)
In [13]: np.eye(N=3, M=None, k=1) # ones on the diagonal shifted up by k=1
Out[13]: 
array([[0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.]])

In [14]: np.full(shape=6, fill_value=0, dtype=np.int64)
Out[14]: array([0, 0, 0, 0, 0, 0])
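
The two approaches can be compared side by side. The following is a minimal sketch, assuming an illustrative size n = 5 and a squared-index fill rule, neither of which comes from the text above:

```python
import numpy as np

n = 5  # illustrative size (assumption)

# Option 1: grow a Python list, then convert once at the end
values = []
for i in range(n):
    values.append(i ** 2)
from_list = np.array(values)

# Option 2: preallocate the array, then fill it in place
prealloc = np.empty(n, dtype=np.int64)
for i in range(n):
    prealloc[i] = i ** 2

print(from_list)  # [ 0  1  4  9 16]
print(prealloc)   # [ 0  1  4  9 16]
```

Calling np.append repeatedly inside a loop is usually best avoided, since every call copies the whole array.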

If we already have an array and prefer to imitate only the format for preallocation purposes, we could use _like functions, e.g.

In [15]: np.full_like(a, fill_value=1)
Out[15]: array([1, 1, 1, 1, 1, 1])

In addition, we could use np.arange and np.linspace for faster customized array creation

# stop is required; start defaults to 0
# and step defaults to 1
In [16]: np.arange(start=0, stop=6, step=1)
Out[16]: array([0, 1, 2, 3, 4, 5])

# must have start and stop parameters, and the
# default value for num parameter is 50
In [17]: np.linspace(start=0, stop=1, num=6, endpoint=True)
Out[17]: array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])

Note

np.linspace is more reliable than np.arange for non-integer steps, because the number of points is fixed up front and is not affected by floating-point rounding.
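
A minimal sketch of the difference, assuming illustrative endpoints 1 and 1.3 with a step of 0.1. On typical IEEE-754 platforms the np.arange call may include an extra element close to the stop value, while np.linspace always returns exactly num points:

```python
import numpy as np

# np.arange derives the length from (stop - start) / step, so rounding
# error can add an unexpected element when the step is fractional
print(np.arange(1, 1.3, 0.1))

# np.linspace fixes the number of points up front
pts = np.linspace(1, 1.3, num=3, endpoint=False)
print(pts, len(pts))
```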

Use np.ogrid to generate an open grid efficiently. While np.arange generates a 1D array, np.ogrid with two slices generates two 2D arrays with shapes (n, 1) and (1, m), which broadcast against each other.

In [18]: x, y = np.ogrid[0:10:2, 0:6]

In [19]: print(x, x.shape)
[[0]
 [2]
 [4]
 [6]
 [8]] (5, 1)

In [20]: print(y, y.shape)
[[0 1 2 3 4 5]] (1, 6)
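
The benefit of the open grid shows up once the two arrays are combined. A small sketch, reusing the same slices as above (the name grid is only illustrative):

```python
import numpy as np

x, y = np.ogrid[0:10:2, 0:6]

# (5, 1) + (1, 6) broadcasts to a full (5, 6) grid without
# building two dense coordinate matrices first
grid = x + y
print(grid.shape)  # (5, 6)
print(grid[2, 3])  # 7, i.e. x=4 plus y=3
```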

To get an array with its own independent data, use .copy(). Note that for Python lists, .copy() is only a shallow copy; use copy.deepcopy for nested lists.

In [21]: b = a.copy(); b
Out[21]: array([1, 2, 3, 4, 5, 6])
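
To see why the copy matters, here is a minimal sketch contrasting a slice with a copy (the names view and copied are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

view = a[:3]         # basic slicing returns a view on the same data
copied = a.copy()    # .copy() allocates independent data

view[0] = 100        # writes through to a
copied[1] = -1       # does not touch a

print(a)                             # [100   2   3   4   5   6]
print(np.shares_memory(a, view))     # True
print(np.shares_memory(a, copied))   # False
```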

Selection#

The common way to slice an array is a[start:stop:step], with default values 0, len(a), and 1 respectively. When step is negative, the array is read in reverse.

In [22]: a[0:6:2]
Out[22]: array([1, 3, 5])

In [23]: a[:3]
Out[23]: array([1, 2, 3])

In [24]: a[::2]
Out[24]: array([1, 3, 5])

In [25]: a[::-1] # reversed array
Out[25]: array([6, 5, 4, 3, 2, 1])

In [26]: a[[1, 4, 5]] # fancy indexing
Out[26]: array([2, 5, 6])

Logics & boolean mask#

NumPy uses & (and), | (or), ^ (xor), and ~ (not) for element-wise logical operations

In [27]: a[(a >= 1) & (a < 3)]
Out[27]: array([1, 2])

In [28]: a[(a < 1) | (a >= 3)]
Out[28]: array([3, 4, 5, 6])

In [29]: a[~(a < 3)]
Out[29]: array([3, 4, 5, 6])

Similarly, np.where performs element-wise selection when all three parameters are provided: condition, x (the values used where the condition is met), and y (the values used otherwise). Note that np.where() takes no keyword arguments. If only the condition is provided, np.nonzero is preferable for obtaining the indices of the elements that meet the condition.

In [30]: np.nonzero(a > 3) # indices
Out[30]: (array([3, 4, 5]),)

In [31]: np.where(a > 3, a, 10*a)
Out[31]: array([10, 20, 30,  4,  5,  6])

Another useful function is np.clip, which limits values to the given lower and upper bounds while leaving the values in between unchanged

In [32]: np.clip(a, a_min=1, a_max=4)
Out[32]: array([1, 2, 3, 4, 4, 4])

Boolean masking is typically the most efficient way to select a subset of a collection. A mask is a boolean array, built from some criteria, that marks which elements to select or modify.

In [33]: criteria = (a < 1) | (a >= 3)

In [34]: a[criteria]
Out[34]: array([3, 4, 5, 6])
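
A mask can also be used for counting and for assignment. A minimal sketch using the same criteria (the name masked and the fill value 0 are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
criteria = (a < 1) | (a >= 3)

print(criteria.sum())    # 4 elements satisfy the criteria

masked = a.copy()        # keep the original array intact
masked[criteria] = 0     # assign through the mask
print(masked)            # [1 2 0 0 0 0]
```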

Random system#

Several methods are available for generating random numbers from different statistical distributions. Specifically, we create a random number generator with default_rng and assign it to a variable

In [35]: rng = np.random.default_rng()

and illustrate common usages for uniform and (standard) normal distribution

# uniform distribution with half-open interval (default)
In [36]: rng.integers(low=0, high=6, size=6, dtype=np.int64) # U[0,6)
Out[36]: array([1, 4, 2, 0, 4, 0])

# closed-interval uniform distribution
In [37]: rng.integers(low=0, high=6, size=6, endpoint=True) # U[0,6]
Out[37]: array([2, 0, 6, 5, 4, 5])

In [38]: rng.random(size=5) # U[0,1)
Out[38]: array([0.7624, 0.4223, 0.8306, 0.1025, 0.4575])

In [39]: rng.uniform(low=1, high=6, size=(2, 6)) # U[1,6)
Out[39]: 
array([[2.4782, 5.983 , 2.579 , 3.5264, 4.1163, 2.0123],
       [2.7283, 5.9817, 1.6136, 2.5503, 2.9326, 4.8826]])

In [40]: rng.normal(loc=5, scale=2, size=6) # normal with mean 5 and standard deviation 2
Out[40]: array([5.4013, 5.3531, 3.3438, 6.0282, 4.6673, 1.8358])

In [41]: rng.standard_normal(size=6) # N(0,1)
Out[41]: array([ 0.3587,  0.0222, -0.6185, -1.3344, -1.3766, -0.7636])

For more information about default_rng, check out the official documentation for default_rng.
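
The outputs above differ on every run because no seed is given. A minimal sketch of reproducible draws, assuming an arbitrary seed value of 42:

```python
import numpy as np

# Two generators created with the same seed produce the same stream
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

draw_a = rng_a.integers(low=0, high=6, size=6)
draw_b = rng_b.integers(low=0, high=6, size=6)
print(np.array_equal(draw_a, draw_b))  # True
```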

Mathematics#

Element-wise operations, which follow broadcasting rules, include + or np.add, - or np.subtract, * or np.multiply, ** or np.power, / or np.divide, //, np.sqrt, np.exp, np.log, trigonometric functions such as np.sin and np.cos, np.floor, np.ceil, np.round, np.maximum, np.minimum, etc. Arrays can be transposed with np.transpose or .T.

Reductions that return a single value by default include np.max, np.min, np.sum, np.argmax, np.argmin, np.mean, np.median, np.percentile, np.std, np.var, np.all, np.any, etc. Note that an axis parameter is available for many of these.
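
A minimal sketch of broadcasting and of the axis parameter, using a small illustrative matrix m that does not appear elsewhere in this guide:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

# broadcasting: the (3,) vector is applied to every row of the (2, 3) matrix
print(m + np.array([10, 20, 30]))
# [[11 22 33]
#  [14 25 36]]

print(np.sum(m))           # 21, reduction over all elements
print(np.sum(m, axis=0))   # [5 7 9], one value per column
print(np.max(m, axis=1))   # [3 6], one value per row
```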

Use np.dot for 1D dot product and np.matmul or @ for matrix multiplication.

In [42]: np.dot([1, 1], [2, 3])
Out[42]: 5

In [43]: np.matmul(np.array([[4, 1], [2, 2]]), np.eye(2))
Out[43]: 
array([[4., 1.],
       [2., 2.]])

In [44]: np.diag(a) @ np.eye(6)
Out[44]: 
array([[1., 0., 0., 0., 0., 0.],
       [0., 2., 0., 0., 0., 0.],
       [0., 0., 3., 0., 0., 0.],
       [0., 0., 0., 4., 0., 0.],
       [0., 0., 0., 0., 5., 0.],
       [0., 0., 0., 0., 0., 6.]])

np.round uses a fast but sometimes inexact algorithm. Alternatively, Python’s built-in round function uses a more accurate but slower algorithm for 64-bit floating point values.

In [45]: np.round(10.055, decimals=2), round(10.055, 2)
Out[45]: (10.06, 10.05)

In [46]: np.round([0.396, 0.158, 0.999, 0.197, 0.261], decimals=2)
Out[46]: array([0.4 , 0.16, 1.  , 0.2 , 0.26])

In [47]: np.round([100.0585, 100.6489,  97.5847], decimals=-1)
Out[47]: array([100., 100., 100.])

Note that for values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value.

In [48]: np.round(1.5) == np.round(2.5) == 2
Out[48]: True

Use math.isclose (from the standard-library math module, which requires import math) to compare two values, and np.allclose to test whether two arrays are element-wise equal within a tolerance.

In [49]: math.isclose(0.1+0.2-0.3, 0, rel_tol=1e-09, abs_tol=1e-8)
Out[49]: True

In [50]: np.allclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)
Out[50]: True

Note that np.allclose compares values within a tolerance and expects identical (or broadcastable) shapes. To check whether two arrays have the same shape and exactly equal elements, use np.array_equal.

In [51]: np.array_equal(np.array([[1, 0], [0, 1]]), np.eye(2))
Out[51]: True
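
The difference between the two checks shows up with floating-point rounding. A minimal sketch with illustrative values:

```python
import numpy as np

x = np.array([0.1 + 0.2, 1.0])
y = np.array([0.3, 1.0])

print(np.array_equal(x, y))  # False, since 0.1 + 0.2 != 0.3 exactly
print(np.allclose(x, y))     # True, within the default tolerances
```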

Use np.linalg.norm to perform row-wise normalization.

In [52]: x = np.array([[0, 3, 4],
   ....:               [1, 6, 4]])
   ....: 

In [53]: norm = np.linalg.norm(x, axis=1, keepdims=True)

In [54]: x / norm # normalization
Out[54]: 
array([[0.    , 0.6   , 0.8   ],
       [0.1374, 0.8242, 0.5494]])

The optional keepdims parameter specifies whether the reduced axes are kept in the result as dimensions of size one.

In [55]: x = np.array([[1, 2], [3, 4]])

In [56]: np.sum(x, axis=1, keepdims=True)
Out[56]: 
array([[3],
       [7]])

In [57]: np.sum(x, axis=1, keepdims=False)
Out[57]: array([3, 7])
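
Keeping the reduced axis matters when the result is broadcast back against the original array. A minimal sketch that centers each row (the name row_means is illustrative):

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

row_means = x.mean(axis=1, keepdims=True)   # shape (2, 1)
print(x - row_means)
# [[-0.5  0.5]
#  [-0.5  0.5]]

# without keepdims the (2,) result is broadcast along the columns instead
print(x - x.mean(axis=1))
# [[-0.5 -1.5]
#  [ 1.5  0.5]]
```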

Reshaping#

Suppose we create a 1D array named a of size n; its shape is (n,). Some operations, such as transpose, have no effect on a 1D array, so we need to reshape it to an explicit 2D shape: (1, n) or (n, 1). Below, we discuss conversion among these three shapes. Specifically, we will use np.reshape or ndarray.reshape, np.expand_dims, np.squeeze, ndarray.flatten, np.ravel, and np.transpose or .T.

(n,) -> (1, n)

In [58]: a = np.array([0, 1, 2])

In [59]: a.reshape(1, -1)
Out[59]: array([[0, 1, 2]])

In [60]: a[np.newaxis, :]
Out[60]: array([[0, 1, 2]])

In [61]: a[None, :]
Out[61]: array([[0, 1, 2]])

In [62]: np.expand_dims(a, axis=0)
Out[62]: array([[0, 1, 2]])

(n,) -> (n, 1)

In [63]: a.reshape(-1, 1)
Out[63]: 
array([[0],
       [1],
       [2]])

In [64]: a[:, np.newaxis]
Out[64]: 
array([[0],
       [1],
       [2]])

In [65]: a[:, None]
Out[65]: 
array([[0],
       [1],
       [2]])

In [66]: np.expand_dims(a, axis=1)
Out[66]: 
array([[0],
       [1],
       [2]])

(n, 1) or (1, n) -> (n,)

In [67]: a.squeeze()
Out[67]: array([0, 1, 2])

In [68]: a.flatten()
Out[68]: array([0, 1, 2])

In [69]: a.reshape(-1)
Out[69]: array([0, 1, 2])

In [70]: np.ravel(a)
Out[70]: array([0, 1, 2])

Note

np.ravel returns a view of the parent array whenever possible (no data is copied), so it is memory-efficient. ndarray.flatten always returns a new array (a copy).
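
A minimal sketch that checks this with np.shares_memory, using an illustrative contiguous array m:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

r = m.ravel()     # a view here, because m is contiguous
f = m.flatten()   # always a copy

print(np.shares_memory(m, r))  # True
print(np.shares_memory(m, f))  # False

r[0] = 99         # writes through to m
print(m[0, 0])    # 99
print(f[0])       # 0
```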

Use np.transpose or .T to swap between (n, 1) and (1, n).
To add a dimension at a particular position, use np.newaxis.

In [71]: rand = rng.random((2, 3)); rand.shape
Out[71]: (2, 3)

In [72]: rand[:, np.newaxis, :].shape
Out[72]: (2, 1, 3)

Tip

np.newaxis is equivalent to None. When a dimension passed to reshape is -1, its size is inferred automatically from the remaining dimensions.
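
A minimal sketch of how reshape infers a dimension given as -1 and how np.newaxis relates to None, using an illustrative array z of 12 elements:

```python
import numpy as np

z = np.arange(12)

print(z.reshape(3, -1).shape)      # (3, 4)
print(z.reshape(-1, 2, 3).shape)   # (2, 2, 3)
print(z[None, :].shape)            # (1, 12)
print(z[np.newaxis, :].shape)      # (1, 12)
print(np.newaxis is None)          # True
```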

Manipulation#

Several methods to stack objects are available, either vertical or horizontal.

In [73]: a = np.array([[0, 1, 2]])

In [74]: b = np.array([[3, 4, 5]])

# horizontal
In [75]: np.hstack((a, b))
Out[75]: array([[0, 1, 2, 3, 4, 5]])

In [76]: np.c_[a, b]
Out[76]: array([[0, 1, 2, 3, 4, 5]])

In [77]: np.concatenate((a, b), axis=1)
Out[77]: array([[0, 1, 2, 3, 4, 5]])

# vertical
In [78]: np.vstack((a, b))
Out[78]: 
array([[0, 1, 2],
       [3, 4, 5]])

In [79]: np.r_[a, b]
Out[79]: 
array([[0, 1, 2],
       [3, 4, 5]])

In [80]: np.concatenate((a, b), axis=0)
Out[80]: 
array([[0, 1, 2],
       [3, 4, 5]])

# depth (third axis, axis=2)
In [81]: np.dstack((a, b))
Out[81]: 
array([[[0, 3],
        [1, 4],
        [2, 5]]])

Arrays can also be replicated as a whole (np.tile) or element by element (np.repeat).

In [82]: np.tile(a, (3, 1))
Out[82]: 
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])

In [83]: c = np.arange(1, 5).reshape(2, 2).repeat(3, axis=1).repeat(2, axis=0); c
Out[83]: 
array([[1, 1, 1, 2, 2, 2],
       [1, 1, 1, 2, 2, 2],
       [3, 3, 3, 4, 4, 4],
       [3, 3, 3, 4, 4, 4]])

In [84]: np.repeat(a, [1, 2, 3], axis=1)
Out[84]: array([[0, 1, 1, 2, 2, 2]])

In [85]: np.block([[0, a], [np.eye(4)]])
Out[85]: 
array([[0., 0., 1., 2.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

Conversely, we can split a 1D array with

In [86]: d = np.arange(6); d
Out[86]: array([0, 1, 2, 3, 4, 5])

In [87]: np.split(d, 3)
Out[87]: [array([0, 1]), array([2, 3]), array([4, 5])]

In [88]: np.split(d, [3, 5, 6])
Out[88]: [array([0, 1, 2]), array([3, 4]), array([5]), array([], dtype=int64)]

or a 2D array with

In [89]: np.hsplit(c, 2)
Out[89]: 
[array([[1, 1, 1],
        [1, 1, 1],
        [3, 3, 3],
        [3, 3, 3]]),
 array([[2, 2, 2],
        [2, 2, 2],
        [4, 4, 4],
        [4, 4, 4]])]

In [90]: np.hsplit(c, np.array([2])) # c[:, :2] and c[:, 2:]
Out[90]: 
[array([[1, 1],
        [1, 1],
        [3, 3],
        [3, 3]]),
 array([[1, 2, 2, 2],
        [1, 2, 2, 2],
        [3, 4, 4, 4],
        [3, 4, 4, 4]])]

In [91]: np.hsplit(c, np.array([2, 5]))
Out[91]: 
[array([[1, 1],
        [1, 1],
        [3, 3],
        [3, 3]]),
 array([[1, 2, 2],
        [1, 2, 2],
        [3, 4, 4],
        [3, 4, 4]]),
 array([[2],
        [2],
        [4],
        [4]])]

In [92]: np.vsplit(c, 2)
Out[92]: 
[array([[1, 1, 1, 2, 2, 2],
        [1, 1, 1, 2, 2, 2]]),
 array([[3, 3, 3, 4, 4, 4],
        [3, 3, 3, 4, 4, 4]])]

For the functions above, passing an integer splits the array into equal parts, and an error is raised if an equal split is not possible. According to the official documentation, np.array_split allows indices_or_sections to be an integer that does not equally divide the axis. For an array of length l that should be split into n sections, it returns l % n sub-arrays of size l//n + 1 and the rest of size l//n.

In [93]: e = np.arange(8); e
Out[93]: array([0, 1, 2, 3, 4, 5, 6, 7])

In [94]: np.array_split(e, 3)
Out[94]: [array([0, 1, 2]), array([3, 4, 5]), array([6, 7])]
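
The size rule can be checked directly. A minimal sketch for the same array, where the names l, n, sizes, and expected are illustrative:

```python
import numpy as np

e = np.arange(8)
l, n = len(e), 3

parts = np.array_split(e, n)
sizes = [len(p) for p in parts]
print(sizes)  # [3, 3, 2]

# l % n sub-arrays of size l//n + 1, followed by the rest of size l//n
expected = [l // n + 1] * (l % n) + [l // n] * (n - l % n)
print(sizes == expected)  # True
```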

In addition, we have np.insert, np.append, and np.delete, with the assumption that the input arrays have the same number of dimensions.

In [95]: f = np.array([[1, 1], [2, 2], [3, 3]]); f
Out[95]: 
array([[1, 1],
       [2, 2],
       [3, 3]])

In [96]: np.insert(f, 1, 5.5, axis=1) # type casting
Out[96]: 
array([[1, 5, 1],
       [2, 5, 2],
       [3, 5, 3]])

In [97]: np.insert(f, [1], [[1],[2],[3]], axis=1)
Out[97]: 
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

In [98]: np.array_equal(np.insert(f, 1, [1, 2, 3], axis=1),
   ....:                np.insert(f, [1], [[1],[2],[3]], axis=1))
   ....: 
Out[98]: True

Note the difference between a Python built-in slice and an index array as the insertion location: slice(2, 4) stands for indices 2 and 3, whereas [2, 4] stands for indices 2 and 4

In [99]: np.insert(f, slice(2, 4), [5, 6])
Out[99]: array([1, 1, 5, 2, 6, 2, 3, 3])

In [100]: np.insert(f, [2, 4], [5, 6])
Out[100]: array([1, 1, 5, 2, 2, 6, 3, 3])

In [101]: np.append([[1, 2, 3], [4, 5, 6]], [[7, 8, 9]], axis=0)
Out[101]: 
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [102]: np.append([[1, 2], [3, 4]], [[5, 6], [7, 8]], axis=1)
Out[102]: 
array([[1, 2, 5, 6],
       [3, 4, 7, 8]])

In [103]: g = np.arange(1, 17).reshape(4, 4); g
Out[103]: 
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [104]: np.delete(g, [1, 3], axis=0)
Out[104]: 
array([[ 1,  2,  3,  4],
       [ 9, 10, 11, 12]])

When the selection will be reused, a boolean mask is preferable; a full-shape mask selects the same elements as above, although the result is flattened

In [105]: mask = np.ones_like(g, dtype=bool)

In [106]: mask[[1, 3], :] = False

In [107]: g[mask] # flattened
Out[107]: array([ 1,  2,  3,  4,  9, 10, 11, 12])
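
If the 2D shape should be preserved, a one-dimensional mask over the rows can be used instead. A minimal sketch (the name row_mask is illustrative):

```python
import numpy as np

g = np.arange(1, 17).reshape(4, 4)

row_mask = np.ones(g.shape[0], dtype=bool)   # one flag per row
row_mask[[1, 3]] = False

kept = g[row_mask]
print(kept)
# [[ 1  2  3  4]
#  [ 9 10 11 12]]
print(np.array_equal(kept, np.delete(g, [1, 3], axis=0)))  # True
```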

Use np.pad to pad the borders of an array

#        np.pad(g, pad_width=((top, bottom), (left, right)))
In [108]: np.pad(g, pad_width=((1, 2), (1, 2)), constant_values=(0, 6))
Out[108]: 
array([[ 0,  0,  0,  0,  0,  6,  6],
       [ 0,  1,  2,  3,  4,  6,  6],
       [ 0,  5,  6,  7,  8,  6,  6],
       [ 0,  9, 10, 11, 12,  6,  6],
       [ 0, 13, 14, 15, 16,  6,  6],
       [ 0,  6,  6,  6,  6,  6,  6],
       [ 0,  6,  6,  6,  6,  6,  6]])

Sorting#

Use np.sort, np.argsort, and np.lexsort to get sorted values, sorting indices, and indices for a sort on multiple keys, respectively. While the order parameter lists fields from the highest priority to the lowest, the keys tuple passed to np.lexsort is read from back to front: the last key is the primary sort key.

In [109]: h = np.array([[1,8,2,4],[4,5,1,3]]); h
Out[109]: 
array([[1, 8, 2, 4],
       [4, 5, 1, 3]])

# NumPy structured array; a fixed-width string type such as 'U10' is
# needed, because plain str has zero width and would drop the names
In [110]: dtype = [('name', 'U10'), ('height', int), ('age', int)]

In [111]: val = [('Jason', 180, 21), ('Leo', 179, 21),('Rebecca', 166, 24)]

In [112]: k = np.array(val, dtype=dtype); k
Out[112]: 
array([('Jason', 180, 21), ('Leo', 179, 21), ('Rebecca', 166, 24)],
      dtype=[('name', '<U10'), ('height', '<i8'), ('age', '<i8')])

In [113]: np.sort(h, axis=1)
Out[113]: 
array([[1, 2, 4, 8],
       [1, 3, 4, 5]])

In [114]: np.sort(h, axis=0)
Out[114]: 
array([[1, 5, 1, 3],
       [4, 8, 2, 4]])

In [115]: np.sort(h, axis=None)
Out[115]: array([1, 1, 2, 3, 4, 4, 5, 8])

In [116]: np.sort(k, order=['age', 'height'])
Out[116]: 
array([('Leo', 179, 21), ('Jason', 180, 21), ('Rebecca', 166, 24)],
      dtype=[('name', '<U10'), ('height', '<i8'), ('age', '<i8')])

In [117]: np.argsort(h)
Out[117]: 
array([[0, 2, 3, 1],
       [2, 3, 0, 1]])

In [118]: np.argsort(k, order=['age', 'height'])
Out[118]: array([1, 0, 2])

In [119]: finance = [85, 70, 95, 80]

In [120]: math = [80, 95, 90, 85]

In [121]: total = [165, 165, 185, 165]

In [122]: np.lexsort((finance, math, total))
Out[122]: array([0, 3, 1, 2])
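
The returned indices are meant to be applied to the data. A minimal sketch with the same scores, where the key list math is renamed to maths (an illustrative choice) so that it does not shadow the standard-library module:

```python
import numpy as np

finance = np.array([85, 70, 95, 80])
maths = np.array([80, 95, 90, 85])
total = np.array([165, 165, 185, 165])

# the last key (total) is the primary key, then maths, then finance
order = np.lexsort((finance, maths, total))
print(order)          # [0 3 1 2]
print(total[order])   # [165 165 165 185]
print(maths[order])   # [80 85 95 90]
```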

Alternatively, pandas is also capable of handling sorting. For more information about pandas, check out pandas Guide.

In [123]: unsort = np.array([3, 9, 0, 8])

In [124]: pd.DataFrame(unsort).sort_values(by=0).to_numpy().flatten()
Out[124]: array([0, 3, 8, 9])

Note that Python’s built-in list.sort() and NumPy’s ndarray.sort() both sort in place, whereas np.sort returns a sorted copy.
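
A minimal sketch of the difference between sorting a copy and sorting in place (the names unsorted and result are illustrative):

```python
import numpy as np

unsorted = np.array([3, 9, 0, 8])

result = np.sort(unsorted)   # returns a sorted copy
print(result)                # [0 3 8 9]
print(unsorted)              # [3 9 0 8], unchanged

unsorted.sort()              # sorts in place and returns None
print(unsorted)              # [0 3 8 9]
```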

Warning

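Basic slicing and some reshaping operations return views (shallow copies) that share memory with the original array, so changes made through a view are reflected in the original object. Use .copy() when an independent array is needed. Fancy indexing, boolean masking, np.sort, and the stacking functions return new arrays instead. Many functions also accept an optional out parameter to write the result into an existing array.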

Numpy Exercises#

Edited Numpy exercises from GitHub with various and optimized solutions.

1. Import the numpy package under the name ``np``

hint: import … as

import numpy as np

2. Print the numpy version and the configuration

hint: np.__version__, np.show_config

print(np.__version__)
np.show_config()

3. Create a null vector of size 10

hint: np.zeros

np.zeros(10)

4. How to find the memory size of any array

hint: size, itemsize

Z = np.zeros((10,10))
print("%d bytes" % (Z.size * Z.itemsize))
# alternatively
print(Z.nbytes)

5. How to get the documentation of the numpy add function from the command line?

hint: np.info

!python -c "import numpy; numpy.info(numpy.add)"

6. Create a null vector of size 10 but the fifth value which is 1

hint: array[4]

Z = np.zeros(10)
Z[4] = 1
print(Z)

7. Create a vector with values ranging from 10 to 49

hint: arange

np.arange(10,50) # np.int64
np.linspace(10,49,40) # np.float64

8. Reverse a vector (first element becomes last)

hint: array[::-1]

Z = np.arange(50)
Z = Z[::-1]
print(Z)

9. Create a 3x3 matrix with values ranging from 0 to 8

hint: reshape

np.arange(9).reshape(3, 3)

10. Find indices of non-zero elements from [1,2,0,0,4,0]

hint: np.nonzero

np.nonzero([1,2,0,0,4,0])
np.flatnonzero([1,2,0,0,4,0])
# alternatively, select the non-zero values themselves with a boolean mask
arr = np.array([1,2,0,0,4,0])
mask = arr != 0
arr[mask]

11. Create a 3x3 identity matrix

hint: np.eye

np.eye(3)

12. Create a 3x3x3 array with random values

hint: np.random.random

np.random.random((3,3,3))
# alternatively
rng = np.random.default_rng()
rng.random((3,3,3))

13. Create a 10x10 array with random values and find the minimum and maximum values

hint: min, max

Z = np.random.random((10,10))
# alternatively
arr = rng.random((10,10))
print('min: %f, max: %f' % (arr.min(), arr.max()))

14. Create a random vector of size 30 and find the mean value

hint: mean

np.random.random(30).mean()

15. Create a 2d array with 1 on the border and 0 inside

hint: array[1:-1, 1:-1]

Z = np.ones((10,10))
Z[1:-1,1:-1] = 0
print(Z)
# alternatively
arr = np.zeros((3,3))
np.pad(arr, 1, constant_values=1)

16. How to add a border (filled with 0’s) around an existing array?

hint: np.pad

Z = np.ones((5,5))
Z = np.pad(Z, pad_width=1, mode='constant', constant_values=0)
print(Z)

# Using fancy indexing
Z[:, [0, -1]] = 0
Z[[0, -1], :] = 0
print(Z)

17. What is the result of the following expression?

0 * np.nan
np.nan == np.nan
np.inf > np.nan
np.nan - np.nan
np.nan in set([np.nan])
0.3 == 3 * 0.1

hint: NaN = not a number, inf = infinity

print(0 * np.nan)               # nan
print(np.nan == np.nan)         # False
print(np.inf > np.nan)          # False
print(np.nan - np.nan)          # nan
print(np.nan in set([np.nan]))  # True
print(0.3 == 3 * 0.1)           # False

18. Create a 5x5 matrix with values 1,2,3,4 just below the diagonal

hint: np.diag

np.diag(1 + np.arange(4), k=-1)

19. Create a 8x8 matrix and fill it with a checkerboard pattern

hint: array[::2]

Z = np.zeros((8,8),dtype=int)
Z[1::2,::2] = 1
Z[::2,1::2] = 1
print(Z)

20. Consider a (6,7,8) shape array, what is the index (x,y,z) of the 100th element?

hint: np.unravel_index

print(np.unravel_index(99, (6,7,8)))

21. Create a checkerboard 8x8 matrix using the tile function

hint: np.tile

Z = np.tile( np.array([[0,1],[1,0]]), (4,4))
print(Z)

22. Normalize a 5x5 random matrix

hint: (x -mean)/std

Z = np.random.random((5,5))
Z = (Z - np.mean(Z)) / np.std(Z)
print(Z)

23. Create a custom dtype that describes a color as four unsigned bytes (RGBA)

hint: np.dtype

color = np.dtype([("r", np.ubyte),
                  ("g", np.ubyte),
                  ("b", np.ubyte),
                  ("a", np.ubyte)])

24. Multiply a 5x3 matrix by a 3x2 matrix (real matrix product)

hint: np.dot, @

Z = np.dot(np.ones((5,3)), np.ones((3,2)))
print(Z)

# Alternative solution, in Python 3.5 and above
Z = np.ones((5,3)) @ np.ones((3,2))
print(Z)

25. Given a 1D array, negate all elements which are between 3 and 8, in place.

hint: >, <

Z = np.arange(11)
Z[(3 < Z) & (Z < 8)] *= -1
print(Z)

26. What is the output of the following script?

print(sum(range(5),-1))
from numpy import *
print(sum(range(5),-1))

hint: np.sum

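print(sum(range(5), -1))  # built-in sum: start=-1, so -1 + 0+1+2+3+4 = 9
from numpy import *
print(sum(range(5), -1))  # now np.sum: axis=-1, so 0+1+2+3+4 = 10
# explanation
import builtins
print(builtins.sum(range(5), -1))  # sum(iterable, start): 9
print(np.sum(range(5), axis=-1))   # sum along the last axis: 10
# built-in signature: sum(iterable, /, start=0)
# returns the sum of a 'start' value (default: 0) plus an iterable of numbers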

27. Consider an integer vector Z, which of these expressions are legal?

Z**Z
2 << Z >> 2
Z <- Z
1j*Z
Z/1/1
Z<Z>Z

No hints provided...

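Z**Z         # legal: element-wise power
2 << Z >> 2  # legal: element-wise bit shifts
Z <- Z       # legal: parsed as Z < (-Z)
1j*Z         # legal: result is complex
Z/1/1        # legal: result is float
Z<Z>Z        # illegal: the chained comparison raises ValueError for arrays with more than one element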

28. What are the result of the following expressions?

np.array(0) / np.array(0)
np.array(0) // np.array(0)
np.array([np.nan]).astype(int).astype(float)

No hints provided...

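print(np.array(0) / np.array(0))   # nan, with a RuntimeWarning
print(np.array(0) // np.array(0))  # 0, with a RuntimeWarning
print(np.array([np.nan]).astype(int).astype(float))  # platform-dependent: casting nan to int is undefined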

29. How to round away from zero a float array?

hint: np.random.uniform, np.copysign, np.ceil, np.abs, np.where

Z = np.random.uniform(-10,+10,10)
print(np.copysign(np.ceil(np.abs(Z)), Z))

# More readable but less efficient
print(np.where(Z>0, np.ceil(Z), np.floor(Z)))
# Intuitively
Z = np.random.uniform(-10,+10,10)
print(Z)
mask = Z < 0
Z[mask] *= -1
Z = np.ceil(Z)
Z[mask] *= -1
print(Z)

30. How to find common values between two arrays?

hint: np.intersect1d

Z1 = np.random.randint(0, 10, 10)
Z2 = np.random.randint(0, 10, 10)
print(np.intersect1d(Z1, Z2))

31. How to ignore all numpy warnings (not recommended)?

hint: np.seterr, np.errstate

# Ignore all floating-point warnings
defaults = np.seterr(all="ignore")
Z = np.ones(1) / 0

# Restore the previous settings
_ = np.seterr(**defaults)

# Equivalently with a context manager
with np.errstate(all="ignore"):
    np.arange(3) / 0

32. Is the following expression true?

np.sqrt(-1) == np.emath.sqrt(-1)

hint: imaginary number

np.sqrt(-1) == np.emath.sqrt(-1)  # False
# np.sqrt(-1) returns nan and emits a RuntimeWarning (invalid value)
# np.emath.sqrt(-1) returns 1j

33. How to get the dates of yesterday, today and tomorrow?

hint: np.datetime64, np.timedelta64

yesterday = np.datetime64('today') - np.timedelta64(1)
today     = np.datetime64('today')
tomorrow  = np.datetime64('today') + np.timedelta64(1)

34. How to get all the dates corresponding to the month of July 2016?

hint: np.arange(dtype=datetime64['D'])

Z = np.arange('2016-07', '2016-08', dtype='datetime64[D]')
print(Z)

35. How to compute ((A+B)*(-A/2)) in place (without copy)?

hint: np.add(out=), np.negative(out=), np.multiply(out=), np.divide(out=)

A = np.ones(3)*1
B = np.ones(3)*2
np.add(A, B, out=B)
np.divide(A, 2, out=A)
np.negative(A, out=A)
np.multiply(A, B, out=A)

36. Extract the integer part of a random array of positive numbers using 4 different methods

hint: %, np.floor, astype, np.trunc

Z = np.random.uniform(0, 10, 10)

print(Z - Z%1)
print(Z // 1)
print(np.floor(Z))
print(Z.astype(int))
print(np.trunc(Z))

37. Create a 5x5 matrix with row values ranging from 0 to 4

hint: np.arange

# with broadcasting
Z = np.zeros((5,5))
Z += np.arange(5)
print(Z)
# alternatively with broadcasting
arr = np.arange(5) * np.ones(5)[:,np.newaxis]
print(arr)
# without broadcasting
np.tile(np.arange(0, 5), (5,1))

38. Consider a generator function that generates 10 integers and use it to build an array

hint: np.fromiter

def generate():
    for x in range(10):
        yield x
Z = np.fromiter(generate(), dtype=float, count=-1)
print(Z)

39. Create a vector of size 10 with values ranging from 0 to 1, both excluded

hint: np.linspace

np.linspace(0,1,11,endpoint=False)[1:]

40. Create a random vector of size 10 and sort it

hint: sort

# np.sort returns a sorted copy; ndarray.sort() sorts in place and returns None
print(np.sort(np.random.random(10)))

41. How to sum a small array faster than np.sum?

hint: np.add.reduce

Z = np.arange(10)
np.add.reduce(Z)

42. Consider two random arrays A and B, check if they are equal

hint: np.allclose, np.array_equal

A = np.random.randint(0, 2, 5)
B = np.random.randint(0, 2, 5)

# Assuming identical shape of the arrays and a tolerance for the comparison of values
equal = np.allclose(A,B)
print(equal)

# Checking both the shape and the element values, no tolerance (values have to be exactly equal)
equal = np.array_equal(A,B)
print(equal)

43. Make an array immutable (read-only)

hint: flags.writeable

Z = np.zeros(10)
Z.flags.writeable = False
Z[0] = 1  # raises ValueError: assignment destination is read-only

44. Consider a random 10x2 matrix representing cartesian coordinates, convert them to polar coordinates

hint: np.sqrt, np.arctan2

Z = np.random.random((10,2))
X,Y = Z[:,0], Z[:,1]
R = np.sqrt(X**2+Y**2)
T = np.arctan2(Y,X)
print(R)
print(T)

45. Create random vector of size 10 and replace the maximum value by 0

hint: argmax

Z = np.random.random(10)
Z[Z.argmax()] = 0
print(Z)

46. Create a structured array with ``x`` and ``y`` coordinates covering the [0,1]x[0,1] area

hint: np.meshgrid

Z = np.zeros((5,5), [('x',float),('y',float)])
Z['x'], Z['y'] = np.meshgrid(np.linspace(0,1,5),
                             np.linspace(0,1,5))
print(Z)

47. Given two arrays, X and Y, construct the Cauchy matrix C (Cij =1/(xi - yj))

hint: np.subtract.outer

X = np.arange(8)
Y = X + 0.5
C = 1.0 / np.subtract.outer(X, Y)
print(np.linalg.det(C))

48. Print the minimum and maximum representable value for each numpy scalar type

hint: np.iinfo, np.finfo, eps

for dtype in [np.int8, np.int32, np.int64]:
    print(np.iinfo(dtype).min)
    print(np.iinfo(dtype).max)
for dtype in [np.float32, np.float64]:
    print(np.finfo(dtype).min)
    print(np.finfo(dtype).max)
    print(np.finfo(dtype).eps)

49. How to print all the values of an array?

hint: np.set_printoptions

np.set_printoptions(threshold=float("inf"))
Z = np.zeros((40,40))
print(Z)

50. How to find the closest value (to a given scalar) in a vector?

hint: argmin

Z = np.arange(100)
v = np.random.uniform(0,100)
index = (np.abs(Z-v)).argmin()
print(Z[index])

51. Create a structured array representing a position (x,y) and a color (r,g,b)

hint: dtype

Z = np.zeros(10, [ ('position', [ ('x', float),
                                  ('y', float)]),
                   ('color',    [ ('r', float),
                                  ('g', float),
                                  ('b', float)])])
print(Z)

52. Consider a random vector with shape (10,2) representing coordinates, find point by point distances

hint: np.atleast_2d, T, np.sqrt

Z = np.random.random((10,2))
X,Y = np.atleast_2d(Z[:,0], Z[:,1])
D = np.sqrt( (X-X.T)**2 + (Y-Y.T)**2)
print(D)

# Much faster with scipy
import scipy
# Thanks Gavin Heverly-Coulson (#issue 1)
import scipy.spatial

Z = np.random.random((10,2))
D = scipy.spatial.distance.cdist(Z,Z)
print(D)

53. How to convert a float (32 bits) array into an integer (32 bits) in place?

hint: view and [:] =

# Thanks Vikas (https://stackoverflow.com/a/10622758/5989906)
# & unutbu (https://stackoverflow.com/a/4396247/5989906)
Z = (np.random.rand(10)*100).astype(np.float32)
Y = Z.view(np.int32)
Y[:] = Z
print(Y)

54. How to read the following file?

1, 2, 3, 4, 5
6,  ,  , 7, 8
 ,  , 9,10,11

hint: np.genfromtxt

from io import StringIO

# Fake file
s = StringIO('''1, 2, 3, 4, 5
6,  ,  , 7, 8
 ,  , 9,10,11
''')
# np.int was removed from NumPy; missing integer fields are filled with -1
Z = np.genfromtxt(s, delimiter=",", dtype=int)
print(Z)

55. What is the equivalent of enumerate for numpy arrays?

hint: np.ndenumerate, np.ndindex

Z = np.arange(9).reshape(3,3)
for index, value in np.ndenumerate(Z):
    print(index, value)
for index in np.ndindex(Z.shape):
    print(index, Z[index])

56. Generate a generic 2D Gaussian-like array

hint: np.meshgrid, np.exp

X, Y = np.meshgrid(np.linspace(-1,1,10), np.linspace(-1,1,10))
D = np.sqrt(X*X+Y*Y)
sigma, mu = 1.0, 0.0
G = np.exp(-( (D-mu)**2 / ( 2.0 * sigma**2 ) ) )
print(G)

57. How to randomly place p elements in a 2D array?

hint: np.put, np.random.choice

# Author: Divakar

n = 10
p = 3
Z = np.zeros((n,n))
np.put(Z, np.random.choice(range(n*n), p, replace=False),1)
print(Z)

58. Subtract the mean of each row of a matrix

hint: mean(axis=,keepdims=)

# Author: Warren Weckesser

X = np.random.rand(5, 10)

# Recent versions of numpy
Y = X - X.mean(axis=1, keepdims=True)

# Older versions of numpy
Y = X - X.mean(axis=1).reshape(-1, 1)

print(Y)

59. How to sort an array by the nth column?

hint: argsort

# Author: Steve Tjoa

Z = np.random.randint(0,10,(3,3))
print(Z)
print(Z[Z[:,1].argsort()])

60. How to tell if a given 2D array has null columns?

hint: any, ~

# Author: Warren Weckesser

Z = np.random.randint(0,3,(3,10))
print((~Z.any(axis=0)).any())

61. Find the nearest value from a given value in an array

hint: np.abs, argmin, flat

Z = np.random.uniform(0,1,10)
z = 0.5
m = Z.flat[np.abs(Z - z).argmin()]
print(m)

62. Considering two arrays with shape (1,3) and (3,1), how to compute their sum using an iterator?

hint: np.nditer

A = np.arange(3).reshape(3,1)
B = np.arange(3).reshape(1,3)
it = np.nditer([A,B,None])
for x,y,z in it: z[...] = x + y
print(it.operands[2])

63. Create an array class that has a name attribute

hint: class method

class NamedArray(np.ndarray):
    def __new__(cls, array, name="no name"):
        obj = np.asarray(array).view(cls)
        obj.name = name
        return obj
    def __array_finalize__(self, obj):
        if obj is None: return
        # propagate the name to views and slices of a NamedArray
        self.name = getattr(obj, 'name', "no name")

Z = NamedArray(np.arange(10), "range_10")
print(Z.name)

64. Consider a given vector, how to add 1 to each element indexed by a second vector (be careful with repeated indices)?

hint: np.bincount | np.add.at

# Author: Brett Olsen

Z = np.ones(10)
I = np.random.randint(0,len(Z),20)
Z += np.bincount(I, minlength=len(Z))
print(Z)

# Another solution
# Author: Bartosz Telenczuk
np.add.at(Z, I, 1)
print(Z)

65. How to accumulate elements of a vector (X) to an array (F) based on an index list (I)?

hint: np.bincount

# Author: Alan G Isaac

X = [1,2,3,4,5,6]
I = [1,3,9,3,4,1]
F = np.bincount(I,X)
print(F)

66. Considering a (w,h,3) image of (dtype=ubyte), compute the number of unique colors

hint: np.unique

# Author: Fisher Wang

w, h = 256, 256
I = np.random.randint(0, 4, (h, w, 3)).astype(np.ubyte)
colors = np.unique(I.reshape(-1, 3), axis=0)
n = len(colors)
print(n)

# Faster version
# Author: Mark Setchell
# https://stackoverflow.com/a/59671950/2836621

w, h = 256, 256
I = np.random.randint(0,4,(h,w,3), dtype=np.uint8)

# View each pixel as a single 24-bit integer, rather than three 8-bit bytes
I24 = np.dot(I.astype(np.uint32),[1,256,65536])

# Count unique colours
n = len(np.unique(I24))
print(n)

67. Considering a four dimensions array, how to get sum over the last two axis at once?

hint: sum(axis=(-2,-1))

A = np.random.randint(0,10,(3,4,3,4))
# solution by passing a tuple of axes (introduced in numpy 1.7.0)
total = A.sum(axis=(-2,-1))  # named total to avoid shadowing the builtin sum
print(total)
# solution by flattening the last two dimensions into one
# (useful for functions that don't accept tuples for axis argument)
total = A.reshape(A.shape[:-2] + (-1,)).sum(axis=-1)
print(total)

68. Considering a one-dimensional vector D, how to compute means of subsets of D using a vector S of same size describing subset indices?

hint: np.bincount

# Author: Jaime Fernández del Río

D = np.random.uniform(0,1,100)
S = np.random.randint(0,10,100)
D_sums = np.bincount(S, weights=D)
D_counts = np.bincount(S)
D_means = D_sums / D_counts
print(D_means)

# Pandas solution as a reference due to more intuitive code
import pandas as pd
print(pd.Series(D).groupby(S).mean())

69. How to get the diagonal of a dot product?

hint: np.diag

# Author: Mathieu Blondel

A = np.random.uniform(0,1,(5,5))
B = np.random.uniform(0,1,(5,5))

# Slow version
np.diag(np.dot(A, B))

# Fast version
np.sum(A * B.T, axis=1)

# Faster version
np.einsum("ij,ji->i", A, B)
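
The three versions compute the same vector, since entry i of the diagonal is just the sum over j of A[i,j] * B[j,i]; the fast versions simply skip the off-diagonal products. A small sketch to confirm this, assuming a seeded generator for reproducibility (the names slow, fast and faster are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.uniform(0, 1, (5, 5))
B = rng.uniform(0, 1, (5, 5))

slow = np.diag(np.dot(A, B))           # full matrix product, then diagonal
fast = np.sum(A * B.T, axis=1)         # row i of A times column i of B
faster = np.einsum("ij,ji->i", A, B)   # same contraction, no temporary

print(np.allclose(slow, fast), np.allclose(slow, faster))
```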

70. Consider the vector [1, 2, 3, 4, 5], how to build a new vector with 3 consecutive zeros interleaved between each value?

hint: array[::4]

# Author: Warren Weckesser

Z = np.array([1,2,3,4,5])
nz = 3
Z0 = np.zeros(len(Z) + (len(Z)-1)*(nz))
Z0[::nz+1] = Z
print(Z0)

71. Consider an array of dimension (5,5,3), how to multiply it by an array with dimensions (5,5)?

hint: array[:, :, None]

A = np.ones((5,5,3))
B = 2*np.ones((5,5))
print(A * B[:,:,None])

72. How to swap two rows of an array?

hint: array[[]] = array[[]]

# Author: Eelco Hoogendoorn

A = np.arange(25).reshape(5,5)
A[[0,1]] = A[[1,0]]
print(A)

73. Consider a set of 10 triplets describing 10 triangles (with shared vertices), find the set of unique line segments composing all the triangles

hint: repeat, np.roll, np.sort, view, np.unique

# Author: Nicolas P. Rougier

faces = np.random.randint(0,100,(10,3))
F = np.roll(faces.repeat(2,axis=1),-1,axis=1)
F = F.reshape(len(F)*3,2)
F = np.sort(F,axis=1)
G = F.view( dtype=[('p0',F.dtype),('p1',F.dtype)] )
G = np.unique(G)
print(G)

74. Given a sorted array C that corresponds to a bincount, how to produce an array A such that np.bincount(A) == C?

hint: np.repeat

# Author: Jaime Fernández del Río

C = np.bincount([1,1,2,3,4,4,6])
A = np.repeat(np.arange(len(C)), C)
print(A)

75. How to compute averages using a sliding window over an array?

hint: np.cumsum

# Author: Jaime Fernández del Río

def moving_average(a, n=3):
    ret = np.cumsum(a, dtype=float)
    ret[n:] = ret[n:] - ret[:-n]
    return ret[n - 1:] / n
Z = np.arange(20)
print(moving_average(Z, n=3))
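
For comparison, the same averages can be obtained by convolving with a box filter. This is an alternative sketch rather than the author's method; np.convolve with mode="valid" keeps only the fully overlapping windows.

```python
import numpy as np

def moving_average(a, n=3):
    # Difference of cumulative sums gives the sum of each window of length n
    ret = np.cumsum(a, dtype=float)
    ret[n:] = ret[n:] - ret[:-n]
    return ret[n - 1:] / n

Z = np.arange(20)
n = 3
by_cumsum = moving_average(Z, n=n)
by_convolve = np.convolve(Z, np.ones(n) / n, mode="valid")

print(np.allclose(by_cumsum, by_convolve))
```

The cumsum version does a constant amount of work per element regardless of n, which is why it is usually preferred for long windows.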

76. Consider a one-dimensional array Z, build a two-dimensional array whose first row is (Z[0],Z[1],Z[2]) and each subsequent row is shifted by 1 (last row should be (Z[-3],Z[-2],Z[-1]))?

hint: from numpy.lib import stride_tricks

# Author: Joe Kington / Erik Rigtorp
from numpy.lib import stride_tricks

def rolling(a, window):
    shape = (a.size - window + 1, window)
    strides = (a.itemsize, a.itemsize)
    return stride_tricks.as_strided(a, shape=shape, strides=strides)
Z = rolling(np.arange(10), 3)
print(Z)
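
On NumPy 1.20 or later, sliding_window_view builds the same kind of view without computing strides by hand. The sketch below is an alternative to the as_strided solution, not a replacement for it; W is a name we introduce here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

Z = np.arange(10)

# Each row is a window of length 3, shifted by one element
W = sliding_window_view(Z, window_shape=3)
print(W)

# The result is a read-only view on Z, so no data is copied
print(W.flags.writeable)
```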

77. How to negate a boolean, or to change the sign of a float in place?

hint: np.logical_not, np.negative

# Author: Nathaniel J. Smith

Z = np.random.randint(0,2,100)
np.logical_not(Z, out=Z)

Z = np.random.uniform(-1.0,1.0,100)
np.negative(Z, out=Z)

78. Consider 2 sets of points P0,P1 describing lines (2d) and a point p, how to compute distance from p to each line i (P0[i],P1[i])? No hints provided...

def distance(P0, P1, p):
    T = P1 - P0
    L = (T**2).sum(axis=1)
    U = -((P0[:,0]-p[...,0])*T[:,0] + (P0[:,1]-p[...,1])*T[:,1]) / L
    U = U.reshape(len(U),1)
    D = P0 + U*T - p
    return np.sqrt((D**2).sum(axis=1))

P0 = np.random.uniform(-10,10,(10,2))
P1 = np.random.uniform(-10,10,(10,2))
p  = np.random.uniform(-10,10,( 1,2))
print(distance(P0, P1, p))
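
In the function above, U is the position of the projection of p along each line (in units of the direction vector T), so P0 + U*T is the closest point on the line and D the offset from p to it. Note that U is not clamped, so the lines are treated as infinite. A small worked check with hand-picked values (our own, chosen so the answer is obvious):

```python
import numpy as np

def distance(P0, P1, p):
    T = P1 - P0                       # direction of each line
    L = (T**2).sum(axis=1)            # squared length of each direction
    # Projection parameter of p onto each line
    U = -((P0[:,0]-p[...,0])*T[:,0] + (P0[:,1]-p[...,1])*T[:,1]) / L
    U = U.reshape(len(U),1)
    D = P0 + U*T - p                  # closest point minus p
    return np.sqrt((D**2).sum(axis=1))

# Line 0 is the x-axis, line 1 is the y-axis
P0 = np.array([[0.0, 0.0], [0.0, 0.0]])
P1 = np.array([[1.0, 0.0], [0.0, 1.0]])
p = np.array([[3.0, 4.0]])

d = distance(P0, P1, p)
print(d)  # [4. 3.]
```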

79. Consider 2 sets of points P0,P1 describing lines (2d) and a set of points P, how to compute distance from each point j (P[j]) to each line i (P0[i],P1[i])? No hints provided...

# Author: Italmassov Kuanysh

# based on distance function from previous question
P0 = np.random.uniform(-10, 10, (10,2))
P1 = np.random.uniform(-10,10,(10,2))
p = np.random.uniform(-10, 10, (10,2))
print(np.array([distance(P0,P1,p_i) for p_i in p]))

80. Consider an arbitrary array, write a function that extracts a subpart with a fixed shape and centered on a given element (pad with a fill value when necessary)

hint: minimum maximum

# Author: Nicolas Rougier

Z = np.random.randint(0,10,(10,10))
shape = (5,5)
fill  = 0
position = (1,1)

R = np.ones(shape, dtype=Z.dtype)*fill
P  = np.array(list(position)).astype(int)
Rs = np.array(list(R.shape)).astype(int)
Zs = np.array(list(Z.shape)).astype(int)

R_start = np.zeros((len(shape),)).astype(int)
R_stop  = np.array(list(shape)).astype(int)
Z_start = (P-Rs//2)
Z_stop  = (P+Rs//2)+Rs%2

R_start = (R_start - np.minimum(Z_start,0)).tolist()
Z_start = (np.maximum(Z_start,0)).tolist()
R_stop = np.maximum(R_start, (R_stop - np.maximum(Z_stop-Zs,0))).tolist()
Z_stop = (np.minimum(Z_stop,Zs)).tolist()

r = [slice(start,stop) for start,stop in zip(R_start,R_stop)]
z = [slice(start,stop) for start,stop in zip(Z_start,Z_stop)]
R[tuple(r)] = Z[tuple(z)]  # recent NumPy versions require a tuple of slices, not a list
print(Z)
print(R)
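
A shorter route to the same result is to pad the array first and then take an ordinary slice. This is a sketch of an alternative technique using np.pad, not the author's solution; it assumes position lies inside Z, and the names pad, Zp and window are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.integers(0, 10, (10, 10))
shape = (5, 5)
fill = 0
position = (1, 1)

# Pad each axis so that a window centered on any element stays in bounds
pad = [(s // 2, s - s // 2) for s in shape]
Zp = np.pad(Z, pad, mode="constant", constant_values=fill)

# In the padded array, the window centered on `position` starts at `position`
window = tuple(slice(p, p + s) for p, s in zip(position, shape))
R = Zp[window]
print(R)
```

The price is a padded copy of the whole array, so the slice-arithmetic version remains the better fit for very large inputs.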

81. Consider an array Z = [1,2,3,4,5,6,7,8,9,10,11,12,13,14], how to generate an array R = [[1,2,3,4], [2,3,4,5], [3,4,5,6], …, [11,12,13,14]]?

hint: stride_tricks.as_strided

# Author: Stefan van der Walt

Z = np.arange(1,15,dtype=np.uint32)
R = stride_tricks.as_strided(Z,(11,4),(4,4))
print(R)

82. Compute a matrix rank

hint: np.linalg.svd

# Author: Stefan van der Walt

Z = np.random.uniform(0,1,(10,10))
U, S, V = np.linalg.svd(Z) # Singular Value Decomposition
rank = np.sum(S > 1e-10)
print(rank)
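
A random matrix is almost always full rank, so the threshold only matters for a rank-deficient input. The sketch below uses a hypothetical 3x3 matrix with one dependent row and compares the SVD count with np.linalg.matrix_rank, which applies the same idea with its own default tolerance.

```python
import numpy as np

# Hypothetical example: the second row is twice the first, so the rank is 2
Z = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

S = np.linalg.svd(Z, compute_uv=False)   # singular values only
rank_svd = int(np.sum(S > 1e-10))        # count the non-negligible ones
rank_builtin = int(np.linalg.matrix_rank(Z))

print(rank_svd, rank_builtin)
```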

83. How to find the most frequent value in an array?

hint: np.bincount, argmax

Z = np.random.randint(0,10,50)
print(np.bincount(Z).argmax())

84. Extract all the contiguous 3x3 blocks from a random 10x10 matrix

hint: stride_tricks.as_strided

# Author: Chris Barker

Z = np.random.randint(0,5,(10,10))
n = 3
i = 1 + (Z.shape[0]-n)
j = 1 + (Z.shape[1]-n)
C = stride_tricks.as_strided(Z, shape=(i, j, n, n), strides=Z.strides + Z.strides)
print(C)

85. Create a 2D array subclass such that Z[i,j] == Z[j,i]

hint: class method

# Author: Eric O. Lebigot
# Note: only works for 2d array and value setting using indices

class Symmetric(np.ndarray):
    def __setitem__(self, index, value):
        i,j = index
        super(Symmetric, self).__setitem__((i,j), value)
        super(Symmetric, self).__setitem__((j,i), value)

def symmetric(Z):
    return np.asarray(Z + Z.T - np.diag(Z.diagonal())).view(Symmetric)

S = symmetric(np.random.randint(0,10,(5,5)))
S[2,3] = 42
print(S)

86. Consider a set of p matrices with shape (n,n) and a set of p vectors with shape (n,1). How to compute the sum of the p matrix products at once? (result has shape (n,1))

hint: np.tensordot

# Author: Stefan van der Walt

p, n = 10, 20
M = np.ones((p,n,n))
V = np.ones((p,n,1))
S = np.tensordot(M, V, axes=[[0, 2], [0, 1]])
print(S)

# It works because:
# M is (p,n,n)
# V is (p,n,1)
# tensordot sums over the paired axes (axis 0 of M with axis 0 of V,
# and axis 2 of M with axis 1 of V), which leaves an (n,1) array.
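
With all-ones inputs every entry of S is the same, so a check on random data is more convincing. The sketch below compares tensordot with an explicit loop and with the equivalent einsum; S_loop and S_einsum are names we introduce for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 10, 20
M = rng.uniform(0, 1, (p, n, n))
V = rng.uniform(0, 1, (p, n, 1))

S = np.tensordot(M, V, axes=[[0, 2], [0, 1]])

# Reference 1: explicit loop over the p matrix-vector products
S_loop = sum(M[i] @ V[i] for i in range(p))

# Reference 2: the same contraction spelled out with einsum
S_einsum = np.einsum("pij,pjk->ik", M, V)

print(S.shape, np.allclose(S, S_loop), np.allclose(S, S_einsum))
```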

87. Consider a 16x16 array, how to get the block-sum (block size is 4x4)?

hint: np.add.reduceat

# Author: Robert Kern

Z = np.ones((16,16))
k = 4
S = np.add.reduceat(np.add.reduceat(Z, np.arange(0, Z.shape[0], k), axis=0),
                                       np.arange(0, Z.shape[1], k), axis=1)
print(S)
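
When both dimensions are exact multiples of the block size, a reshape gives the same block sums and may be easier to read. This is an alternative sketch under that assumption, checked against the reduceat version on non-constant data.

```python
import numpy as np

Z = np.arange(256).reshape(16, 16)
k = 4

# reduceat version, as above
S = np.add.reduceat(np.add.reduceat(Z, np.arange(0, Z.shape[0], k), axis=0),
                    np.arange(0, Z.shape[1], k), axis=1)

# Split each axis into (number of blocks, block size) and sum the block axes
S_reshape = Z.reshape(Z.shape[0] // k, k, Z.shape[1] // k, k).sum(axis=(1, 3))

print(np.array_equal(S, S_reshape))
```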

88. How to implement the Game of Life using numpy arrays? No hints provided...

# Author: Nicolas Rougier

def iterate(Z):
    # Count neighbours
    N = (Z[0:-2,0:-2] + Z[0:-2,1:-1] + Z[0:-2,2:] +
         Z[1:-1,0:-2]                + Z[1:-1,2:] +
         Z[2:  ,0:-2] + Z[2:  ,1:-1] + Z[2:  ,2:])

    # Apply rules
    birth = (N==3) & (Z[1:-1,1:-1]==0)
    survive = ((N==2) | (N==3)) & (Z[1:-1,1:-1]==1)
    Z[...] = 0
    Z[1:-1,1:-1][birth | survive] = 1
    return Z

Z = np.random.randint(0,2,(50,50))
for i in range(100): Z = iterate(Z)
print(Z)
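
A random board is hard to verify by eye, so a known pattern makes a handy test. The sketch below runs the same iterate function on a blinker, which should flip between vertical and horizontal; the 5x5 grid and the names start, step1 and step2 are our own choices. Note that iterate modifies its argument in place and always clears the border cells.

```python
import numpy as np

def iterate(Z):
    # Count the eight neighbours of every interior cell
    N = (Z[0:-2, 0:-2] + Z[0:-2, 1:-1] + Z[0:-2, 2:] +
         Z[1:-1, 0:-2]                 + Z[1:-1, 2:] +
         Z[2:,   0:-2] + Z[2:,   1:-1] + Z[2:,   2:])
    birth = (N == 3) & (Z[1:-1, 1:-1] == 0)
    survive = ((N == 2) | (N == 3)) & (Z[1:-1, 1:-1] == 1)
    Z[...] = 0
    Z[1:-1, 1:-1][birth | survive] = 1
    return Z

# Vertical blinker in the middle of a 5x5 grid
start = np.zeros((5, 5), dtype=int)
start[1:4, 2] = 1

step1 = iterate(start.copy())   # becomes horizontal
step2 = iterate(step1.copy())   # back to vertical
print(step1)
```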

89. How to get the n largest values of an array?

hint: np.argsort | np.argpartition

Z = np.arange(10000)
np.random.shuffle(Z)
n = 5

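# Slow: full sort, the n values come back in ascending order
print(Z[np.argsort(Z)[-n:]])

# Fast: partial sort, the n values come back in arbitrary order
print(Z[np.argpartition(-Z,n)[:n]])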
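
Both approaches select the same values; only the ordering differs, because argpartition guarantees which elements land in the first n positions but not their order. A small sketch to confirm this, assuming a seeded permutation in place of the shuffle:

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.permutation(10000)
n = 5

by_sort = Z[np.argsort(Z)[-n:]]                # ascending order
by_partition = Z[np.argpartition(-Z, n)[:n]]   # arbitrary order

print(by_sort)
print(np.sort(by_partition))
```

If sorted output is needed, sorting just the n selected values afterwards is still much cheaper than sorting the whole array.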

90. Given an arbitrary number of vectors, build the cartesian product (every combination of every item)

hint: np.indices

# Author: Stefan Van der Walt

def cartesian(arrays):
    arrays = [np.asarray(a) for a in arrays]
    shape = (len(x) for x in arrays)

    ix = np.indices(shape, dtype=int)
    ix = ix.reshape(len(arrays), -1).T

    for n, arr in enumerate(arrays):
        ix[:, n] = arr[ix[:, n]]

    return ix

print(cartesian(([1, 2, 3], [4, 5], [6, 7])))

91. How to create a record array from a regular array?

hint: np.rec.fromarrays

Z = np.array([("Hello", 2.5, 3),
              ("World", 3.6, 2)])
# np.rec is the public alias; np.core.records is deprecated in NumPy 2.0
R = np.rec.fromarrays(Z.T,
                      names='col1, col2, col3',
                      formats = 'S8, f8, i8')
print(R)

92. Consider a large vector Z, compute Z to the power of 3 using 3 different methods

hint: np.power, *, np.einsum

# Author: Ryan G.

x = np.random.rand(int(5e7))

%timeit np.power(x,3)
%timeit x*x*x
%timeit np.einsum('i,i,i->i',x,x,x)

93. Consider two arrays A and B of shape (8,3) and (2,2). How to find rows of A that contain elements of each row of B regardless of the order of the elements in B?

hint: np.where

# Author: Gabe Schwartz

A = np.random.randint(0,5,(8,3))
B = np.random.randint(0,5,(2,2))

C = (A[..., np.newaxis, np.newaxis] == B)
rows = np.where(C.any((3,1)).all(1))[0]
print(rows)

94. Considering a 10x3 matrix, extract rows with unequal values (e.g. [2,2,3]). No hints provided...

# Author: Robert Kern

Z = np.random.randint(0,5,(10,3))
print(Z)
# solution for arrays of all dtypes (including string arrays and record arrays)
E = np.all(Z[:,1:] == Z[:,:-1], axis=1)
U = Z[~E]
print(U)
# solution for numerical arrays only, will work for any number of columns in Z
U = Z[Z.max(axis=1) != Z.min(axis=1),:]
print(U)

95. Convert a vector of ints into a matrix binary representation

hint: np.unpackbits

# Author: Warren Weckesser

I = np.array([0, 1, 2, 3, 15, 16, 32, 64, 128])
B = ((I.reshape(-1,1) & (2**np.arange(8))) != 0).astype(int)
print(B[:,::-1])

# Author: Daniel T. McDonald

I = np.array([0, 1, 2, 3, 15, 16, 32, 64, 128], dtype=np.uint8)
print(np.unpackbits(I[:, np.newaxis], axis=1))

96. Given a two dimensional array, how to extract unique rows?

hint: np.ascontiguousarray | np.unique

# Author: Jaime Fernández del Río

Z = np.random.randint(0,2,(6,3))
T = np.ascontiguousarray(Z).view(np.dtype((np.void, Z.dtype.itemsize * Z.shape[1])))
_, idx = np.unique(T, return_index=True)
uZ = Z[idx]
print(uZ)

# Author: Andreas Kouzelis
# NumPy >= 1.13
uZ = np.unique(Z, axis=0)
print(uZ)

97. Considering 2 vectors A & B, write the einsum equivalent of the inner, outer, sum, and mul functions

hint: np.einsum

# Author: Alex Riley
# Make sure to read: http://ajcr.net/Basic-guide-to-einsum/

A = np.random.uniform(0,1,10)
B = np.random.uniform(0,1,10)

np.einsum('i->', A)       # np.sum(A)
np.einsum('i,i->i', A, B) # A * B
np.einsum('i,i', A, B)    # np.inner(A, B)
np.einsum('i,j->ij', A, B)    # np.outer(A, B)
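
Each einsum expression can be checked against the function named in its comment. The sketch below collects the comparisons in a dictionary (checks is a name we introduce) and uses a seeded generator so the run is reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.uniform(0, 1, 10)
B = rng.uniform(0, 1, 10)

checks = {
    "sum":   np.allclose(np.einsum("i->", A), np.sum(A)),
    "mul":   np.allclose(np.einsum("i,i->i", A, B), A * B),
    "inner": np.allclose(np.einsum("i,i", A, B), np.inner(A, B)),
    "outer": np.allclose(np.einsum("i,j->ij", A, B), np.outer(A, B)),
}
print(checks)
```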

98. Considering a path described by two vectors (X,Y), how to sample it using equidistant samples?

hint: np.cumsum, np.interp

# Author: Bas Swinckels

phi = np.arange(0, 10*np.pi, 0.1)
a = 1
x = a*phi*np.cos(phi)
y = a*phi*np.sin(phi)

dr = (np.diff(x)**2 + np.diff(y)**2)**.5 # segment lengths
r = np.zeros_like(x)
r[1:] = np.cumsum(dr)                # integrate path
r_int = np.linspace(0, r.max(), 200) # regularly spaced path lengths
x_int = np.interp(r_int, r, x)       # interpolate x and y at those lengths
y_int = np.interp(r_int, r, y)

99. Given an integer n and a 2D array X, select from X the rows which can be interpreted as draws from a multinomial distribution with n degrees, i.e., the rows which only contain integers and which sum to n.

hint: np.logical_and.reduce, np.mod

# Author: Evgeni Burovski

X = np.asarray([[1.0, 0.0, 3.0, 8.0],
                [2.0, 0.0, 1.0, 1.0],
                [1.5, 2.5, 1.0, 0.0]])
n = 4
M = np.logical_and.reduce(np.mod(X, 1) == 0, axis=-1)
M &= (X.sum(axis=-1) == n)
print(X[M])

100. Compute bootstrapped 95% confidence intervals for the mean of a 1D array X (i.e., resample the elements of an array with replacement N times, compute the mean of each sample, and then compute percentiles over the means).

hint: np.percentile

# Author: Jessica B. Hamrick

X = np.random.randn(100) # random 1D array
N = 1000 # number of bootstrap samples
idx = np.random.randint(0, X.size, (N, X.size))
means = X[idx].mean(axis=1)
confint = np.percentile(means, [2.5, 97.5])
print(confint)

Miscellaneous#

Differentiate common usage of Numpy vs. MATLAB.