Numpy Guide#
This is a concise introduction to NumPy, geared mainly toward new users. You can find more complex recipes in the NumPy documentation.
While Python uses help() for docstring reference, IPython uses ?, and NumPy itself provides np.info. To view source code, IPython uses ??, and NumPy provides np.source.
Customarily, we import as follows:
In [1]: import numpy as np
In [2]: import pandas as pd
Object creation#
Several methods are available to create NumPy arrays. For optimal efficiency, an array should contain elements of a single (homogeneous) type. The simplest way is
In [3]: a = np.array([1, 2, 3, 4, 5, 6]); a
Out[3]: array([1, 2, 3, 4, 5, 6])
In [4]: a.dtype
Out[4]: dtype('int64')
In [5]: a.shape
Out[5]: (6,)
In [6]: a.ndim
Out[6]: 1
In [7]: a.nbytes
Out[7]: 48
To build arrays efficiently, we can either
initialize a Python list and convert it to a NumPy array, or
preallocate space directly as a NumPy array
In [8]: lst = [1, 2, 3, 4.5] # a simple Python list
In [9]: np.array(lst)
Out[9]: array([1. , 2. , 3. , 4.5])
# the default dtype is float64
In [10]: np.zeros(shape=6)
Out[10]: array([0., 0., 0., 0., 0., 0.])
# the shape parameter also accepts a tuple for n-dimensional arrays
In [11]: np.ones(shape=(1,6), dtype='int64')
Out[11]: array([[1, 1, 1, 1, 1, 1]])
In [12]: np.empty(shape=6) # uninitialized memory; the contents are arbitrary
Out[12]: array([0., 0., 0., 0., 0., 0.])
# parameters N for number of rows and M for number of columns
In [13]: np.eye(N=3, M=None, k=1) # ones on the first superdiagonal (k=1)
Out[13]:
array([[0., 1., 0.],
[0., 0., 1.],
[0., 0., 0.]])
In [14]: np.full(shape=6, fill_value=0, dtype=np.int64)
Out[14]: array([0, 0, 0, 0, 0, 0])
If we already have an array and want to reuse only its shape and dtype for preallocation purposes, we could use the _like
functions, e.g.
In [15]: np.full_like(a, fill_value=1)
Out[15]: array([1, 1, 1, 1, 1, 1])
In addition, we could use np.arange
and np.linspace
for faster customized array creation
# must have stop parameter, and the default value is 0
# for start and 1 for step parameter
In [16]: np.arange(start=0, stop=6, step=1)
Out[16]: array([0, 1, 2, 3, 4, 5])
# must have start and stop parameters, and the
# default value for num parameter is 50
In [17]: np.linspace(start=0, stop=1, num=6, endpoint=True)
Out[17]: array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
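Note
np.linspace
is usually the safer choice for fractional steps: it fixes the number of points and the endpoints, whereas np.arange with a non-integer step can return an unexpected length because of floating-point rounding.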
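As a small illustration of this note, the sketch below builds a similar grid with both functions. The values 1.0, 1.3, and 0.1 are chosen only for illustration; the length returned by np.arange with a fractional step depends on floating-point rounding, so it is printed rather than assumed.

```python
import numpy as np

# linspace fixes the number of points and hits both endpoints exactly
lin = np.linspace(start=1.0, stop=1.3, num=4)

# arange derives its length from (stop - start) / step, which is
# subject to floating-point rounding when the step is fractional
ara = np.arange(start=1.0, stop=1.3, step=0.1)

print(lin, len(lin))
print(ara, len(ara))  # the length may differ from what you expect
```

When the number of samples matters more than the exact step, np.linspace states that intent directly.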
Use np.ogrid
to generate an open (sparse) grid efficiently. While np.arange
generates a 1D array, np.ogrid
with two slices generates two 2D arrays with shapes (n, 1) and (1, m) that broadcast against each other.
In [18]: x, y = np.ogrid[0:10:2, 0:6]
In [19]: print(x, x.shape)
[[0]
[2]
[4]
[6]
[8]] (5, 1)
In [20]: print(y, y.shape)
[[0 1 2 3 4 5]] (1, 6)
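To make an independent copy of a NumPy array, use .copy(). Python lists and dicts also provide .copy(),
but it is shallow there; for nested Python objects, use copy.deepcopy from the standard library.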
In [21]: b = a.copy(); b
Out[21]: array([1, 2, 3, 4, 5, 6])
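To see why an explicit copy matters, the following sketch contrasts a slice (which is a view) with a copy. The variable names view and copy are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

view = a[1:4]          # basic slicing returns a view on the same memory
copy = a[1:4].copy()   # .copy() allocates independent memory

view[0] = 99           # writes through to a
print(a)               # [ 1 99  3  4  5  6]
print(copy)            # [2 3 4]
print(np.shares_memory(a, view), np.shares_memory(a, copy))  # True False
```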
Selection#
The common way to slice an array is a[start:stop:step]
with default values 0, len(a), and 1 respectively. When step is negative, the array is read in reverse.
In [22]: a[0:6:2]
Out[22]: array([1, 3, 5])
In [23]: a[:3]
Out[23]: array([1, 2, 3])
In [24]: a[::2]
Out[24]: array([1, 3, 5])
In [25]: a[::-1] # reverse list
Out[25]: array([6, 5, 4, 3, 2, 1])
In [26]: a[[1, 4, 5]] # fancy indexing
Out[26]: array([2, 5, 6])
Logics & boolean mask#
Numpy uses and &
, or |
, xor ^
, and not ~
for logical operations
In [27]: a[(a >= 1) & (a < 3)]
Out[27]: array([1, 2])
In [28]: a[(a < 1) | (a >= 3)]
Out[28]: array([3, 4, 5, 6])
In [29]: a[~(a < 3)]
Out[29]: array([3, 4, 5, 6])
Similarly, we use np.where
for element-wise selection when all three parameters are provided: condition, x for the value used where the condition is met, and y for the value used otherwise. Note that np.where()
takes no keyword arguments. If only the condition parameter is provided, np.nonzero
is preferable for obtaining the indices of the elements that meet the condition.
In [30]: np.nonzero(a > 3) # indices
Out[30]: (array([3, 4, 5]),)
In [31]: np.where(a > 3, a, 10*a)
Out[31]: array([10, 20, 30, 4, 5, 6])
Another useful function is np.clip
, which limits values to the given lower and upper bounds while leaving values in between unchanged
In [32]: np.clip(a, a_min=1, a_max=4)
Out[32]: array([1, 2, 3, 4, 4, 4])
Boolean masking is typically the most efficient way to select a subset of a collection. A mask is a boolean array, built from some criteria, that marks which elements to select or modify.
In [33]: criteria = (a < 1) | (a >= 3)
In [34]: a[criteria]
Out[34]: array([3, 4, 5, 6])
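A mask can also be used on the left-hand side of an assignment. The sketch below reuses the same criteria on a copy (named b here for illustration) so that the original array stays unchanged.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
criteria = (a < 1) | (a >= 3)

b = a.copy()           # work on a copy so that a stays unchanged
b[criteria] = 0        # assign to every element where the mask is True
print(b)               # [1 2 0 0 0 0]
print(criteria.sum())  # number of selected elements: 4
```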
Random system#
Several methods are available for generating random numbers from different statistical distributions. Specifically, we create a generator with default_rng
(random number generator) and store it in a variable
In [35]: rng = np.random.default_rng()
and illustrate common usages for uniform and (standard) normal distribution
# uniform distribution with half-open interval (default)
In [36]: rng.integers(low=0, high=6, size=6, dtype=np.int64) # U[0,6)
Out[36]: array([1, 4, 2, 0, 4, 0])
# closed-interval uniform distribution
In [37]: rng.integers(low=0, high=6, size=6, endpoint=True) # U[0,6]
Out[37]: array([2, 0, 6, 5, 4, 5])
In [38]: rng.random(size=5) # U[0,1)
Out[38]: array([0.7624, 0.4223, 0.8306, 0.1025, 0.4575])
In [39]: rng.uniform(low=1, high=6, size=(2, 6)) # U[1,6)
Out[39]:
array([[2.4782, 5.983 , 2.579 , 3.5264, 4.1163, 2.0123],
[2.7283, 5.9817, 1.6136, 2.5503, 2.9326, 4.8826]])
In [40]: rng.normal(loc=5, scale=2, size=6) # mean 5, standard deviation 2
Out[40]: array([5.4013, 5.3531, 3.3438, 6.0282, 4.6673, 1.8358])
In [41]: rng.standard_normal(size=6) # N(0,1)
Out[41]: array([ 0.3587, 0.0222, -0.6185, -1.3344, -1.3766, -0.7636])
For more information about default_rng
, check out the official documentation for default_rng.
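The outputs above change on every run because the generator is unseeded. A minimal sketch of reproducible sampling follows; the seed value 42 is arbitrary and chosen only for this illustration.

```python
import numpy as np

# seed=42 is an arbitrary value chosen for this illustration
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

x = rng_a.normal(loc=5, scale=2, size=3)
y = rng_b.normal(loc=5, scale=2, size=3)
print(np.array_equal(x, y))  # True: same seed, same stream
```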
Mathematics#
Element-wise operations with broadcasting concept include +
or np.add
, -
or np.subtract
, *
or np.multiply
, **
or np.power
, /
or np.divide
, //
, np.sqrt
, np.exp
, np.log
, np.[trigonometry]
, np.floor
, np.ceil
, np.round
, np.maximum
, np.minimum
, np.transpose
or .T
, etc.
Reductions, which return a single value by default, include np.max
, np.min
, np.sum
, np.argmax
, np.argmin
, np.mean
, np.median
, np.percentile
, np.std
, np.var
, np.all
, np.any
, etc. Note that an axis parameter is available for many of these.
Use np.dot
for 1D dot product and np.matmul
or @
for matrix multiplication.
In [42]: np.dot([1, 1], [2, 3])
Out[42]: 5
In [43]: np.matmul(np.array([[4, 1], [2, 2]]), np.eye(2))
Out[43]:
array([[4., 1.],
[2., 2.]])
In [44]: np.diag(a) @ np.eye(6)
Out[44]:
array([[1., 0., 0., 0., 0., 0.],
[0., 2., 0., 0., 0., 0.],
[0., 0., 3., 0., 0., 0.],
[0., 0., 0., 4., 0., 0.],
[0., 0., 0., 0., 5., 0.],
[0., 0., 0., 0., 0., 6.]])
Use np.round
for approximate rounding. Alternatively, Python’s builtin round function uses a more accurate but slower algorithm for 64-bit floating point values.
In [45]: np.round(10.055, decimals=2), round(10.055, 2)
Out[45]: (10.06, 10.05)
In [46]: np.round([0.396, 0.158, 0.999, 0.197, 0.261], decimals=2)
Out[46]: array([0.4 , 0.16, 1. , 0.2 , 0.26])
In [47]: np.round([100.0585, 100.6489, 97.5847], decimals=-1)
Out[47]: array([100., 100., 100.])
Note that for values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value.
In [48]: np.round(1.5) == np.round(2.5) == 2
Out[48]: True
Use math.isclose
to compare two values and np.allclose
to test whether two arrays are element-wise equal with a tolerance.
In [49]: import math; math.isclose(0.1+0.2-0.3, 0, rel_tol=1e-09, abs_tol=1e-8)
Out[49]: True
In [50]: np.allclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)
Out[50]: True
Note that np.allclose
assumes identical shape and a tolerance. To check whether two ndarrays have same shape and elements with no tolerance, use np.array_equal
.
In [51]: np.array_equal(np.array([[1, 0], [0, 1]]), np.eye(2))
Out[51]: True
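The difference between the two checks shows up as soon as floating-point rounding is involved. The following sketch uses the classic 0.1 + 0.2 example; the array names x and y are illustrative.

```python
import numpy as np

x = np.array([0.1 + 0.2, 1.0])
y = np.array([0.3, 1.0])

print(np.array_equal(x, y))  # False: 0.1 + 0.2 is not exactly 0.3
print(np.allclose(x, y))     # True: equal within the default tolerances
print(np.isclose(x, y))      # element-wise version of allclose
```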
Use np.linalg.norm
to perform row-wise normalization.
In [52]: x = np.array([[0, 3, 4],
....: [1, 6, 4]])
....:
In [53]: norm = np.linalg.norm(x, axis=1, keepdims=True)
In [54]: x / norm # normalization
Out[54]:
array([[0. , 0.6 , 0.8 ],
[0.1374, 0.8242, 0.5494]])
The optional parameter keepdims
specifies whether the reduced axes are kept in the result as dimensions of size one, so that the result still broadcasts against the original array.
In [55]: x = np.array([[1, 2], [3, 4]])
In [56]: np.sum(x, axis=1, keepdims=True)
Out[56]:
array([[3],
[7]])
In [57]: np.sum(x, axis=1, keepdims=False)
Out[57]: array([3, 7])
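The practical benefit of keepdims is broadcasting. As a sketch, subtracting the mean of each row works directly when the reduced axis is kept; the names row_mean and centered are illustrative.

```python
import numpy as np

x = np.array([[1.0, 2.0], [3.0, 4.0]])

row_mean = x.mean(axis=1, keepdims=True)   # shape (2, 1)
centered = x - row_mean                    # broadcasts along each row
print(row_mean.shape)                      # (2, 1)
print(centered.sum(axis=1))                # [0. 0.]

# without keepdims the shape is (2,), which broadcasts along columns instead
print(x.mean(axis=1).shape)                # (2,)
```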
Reshaping#
Suppose we create a 1D array named a of size n; its default shape is (n,). Some operations, such as transpose, have no effect on a 1D array, so we often need to reshape it into an explicit 2D shape: (1, n) or (n, 1). Below we discuss conversion among these three shapes. Specifically, we will use np.reshape
or ndarray.reshape
, np.expand_dims
, np.squeeze
, ndarray.flatten
, np.ravel
, np.transpose
or .T
.
(n,) -> (1, n)
In [58]: a = np.array([0, 1, 2])
In [59]: a.reshape(1, -1)
Out[59]: array([[0, 1, 2]])
In [60]: a[np.newaxis, :]
Out[60]: array([[0, 1, 2]])
In [61]: a[None, :]
Out[61]: array([[0, 1, 2]])
In [62]: np.expand_dims(a, axis=0)
Out[62]: array([[0, 1, 2]])
(n,) -> (n, 1)
In [63]: a.reshape(-1, 1)
Out[63]:
array([[0],
[1],
[2]])
In [64]: a[:, np.newaxis]
Out[64]:
array([[0],
[1],
[2]])
In [65]: a[:, None]
Out[65]:
array([[0],
[1],
[2]])
In [66]: np.expand_dims(a, axis=1)
Out[66]:
array([[0],
[1],
[2]])
(n, 1) or (1, n) -> (n,)
In [67]: a.squeeze()
Out[67]: array([0, 1, 2])
In [68]: a.flatten()
Out[68]: array([0, 1, 2])
In [69]: a.reshape(-1)
Out[69]: array([0, 1, 2])
In [70]: np.ravel(a)
Out[70]: array([0, 1, 2])
Note
np.ravel
returns a view of the parent ndarray whenever possible (shallow copy), hence it is memory efficient. ndarray.flatten
always creates a new array (deep copy).
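This difference can be checked with np.shares_memory. The sketch below uses a small illustrative array m; note that ravel falls back to a copy when a view is not possible, for example on a transpose.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

r = m.ravel()      # a view when the data is already contiguous
f = m.flatten()    # always a copy
print(np.shares_memory(m, r))  # True
print(np.shares_memory(m, f))  # False

# ravel falls back to a copy when a view is impossible
t = m.T.ravel()
print(np.shares_memory(m, t))  # False
```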
Use np.transpose or .T to swap between (n, 1) and (1, n).
To add a particular dimension, use np.newaxis.
In [71]: rand = rng.random((2, 3)); rand.shape
Out[71]: (2, 3)
In [72]: rand[:, np.newaxis, :].shape
Out[72]: (2, 1, 3)
Tip
np.newaxis
is equivalent to None
. When one dimension passed to reshape is -1, that dimension is inferred automatically from the array size.
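A short sketch of the -1 placeholder in reshape follows; the array z is illustrative. At most one dimension may be -1, and the remaining dimensions must divide the array size.

```python
import numpy as np

z = np.arange(12)
print(z.reshape(3, -1).shape)     # (3, 4)
print(z.reshape(-1, 6).shape)     # (2, 6)
print(z.reshape(2, 3, -1).shape)  # (2, 3, 2)
```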
Manipulation#
Several methods are available to stack arrays, either vertically or horizontally.
In [73]: a = np.array([[0, 1, 2]])
In [74]: b = np.array([[3, 4, 5]])
# horizontal
In [75]: np.hstack((a, b))
Out[75]: array([[0, 1, 2, 3, 4, 5]])
In [76]: np.c_[a, b]
Out[76]: array([[0, 1, 2, 3, 4, 5]])
In [77]: np.concatenate((a, b), axis=1)
Out[77]: array([[0, 1, 2, 3, 4, 5]])
# vertical
In [78]: np.vstack((a, b))
Out[78]:
array([[0, 1, 2],
[3, 4, 5]])
In [79]: np.r_[a, b]
Out[79]:
array([[0, 1, 2],
[3, 4, 5]])
In [80]: np.concatenate((a, b), axis=0)
Out[80]:
array([[0, 1, 2],
[3, 4, 5]])
# depth: stacks along the third axis (axis=2)
In [81]: np.dstack((a, b))
Out[81]:
array([[[0, 3],
[1, 4],
[2, 5]]])
Arrays can also be replicated as a whole (np.tile), element-wise (np.repeat), or assembled from blocks (np.block).
In [82]: np.tile(a, (3, 1))
Out[82]:
array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
In [83]: c = np.arange(1, 5).reshape(2, 2).repeat(3, axis=1).repeat(2, axis=0); c
Out[83]:
array([[1, 1, 1, 2, 2, 2],
[1, 1, 1, 2, 2, 2],
[3, 3, 3, 4, 4, 4],
[3, 3, 3, 4, 4, 4]])
In [84]: np.repeat(a, [1, 2, 3], axis=1)
Out[84]: array([[0, 1, 1, 2, 2, 2]])
In [85]: np.block([[0, a], [np.eye(4)]])
Out[85]:
array([[0., 0., 1., 2.],
[1., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 1.]])
Conversely, we split a 1D array by
In [86]: d = np.arange(6); d
Out[86]: array([0, 1, 2, 3, 4, 5])
In [87]: np.split(d, 3)
Out[87]: [array([0, 1]), array([2, 3]), array([4, 5])]
In [88]: np.split(d, [3, 5, 6])
Out[88]: [array([0, 1, 2]), array([3, 4]), array([5]), array([], dtype=int64)]
or a multidimensional array by
In [89]: np.hsplit(c, 2)
Out[89]:
[array([[1, 1, 1],
[1, 1, 1],
[3, 3, 3],
[3, 3, 3]]),
array([[2, 2, 2],
[2, 2, 2],
[4, 4, 4],
[4, 4, 4]])]
In [90]: np.hsplit(c, np.array([2])) # split at column 2: c[:, :2] and c[:, 2:]
Out[90]:
[array([[1, 1],
[1, 1],
[3, 3],
[3, 3]]),
array([[1, 2, 2, 2],
[1, 2, 2, 2],
[3, 4, 4, 4],
[3, 4, 4, 4]])]
In [91]: np.hsplit(c, np.array([2, 5]))
Out[91]:
[array([[1, 1],
[1, 1],
[3, 3],
[3, 3]]),
array([[1, 2, 2],
[1, 2, 2],
[3, 4, 4],
[3, 4, 4]]),
array([[2],
[2],
[4],
[4]])]
In [92]: np.vsplit(c, 2)
Out[92]:
[array([[1, 1, 1, 2, 2, 2],
[1, 1, 1, 2, 2, 2]]),
array([[3, 3, 3, 4, 4, 4],
[3, 3, 3, 4, 4, 4]])]
For the functions above, passing an integer as the division parameter requires that it divide the axis equally; otherwise an error is raised. According to the official documentation, np.array_split
allows indices_or_sections to be an integer that does not equally divide the axis. For an array of length l that should be split into n sections, it returns l % n sub-arrays of size l//n + 1 and the rest of size l//n.
In [93]: e = np.arange(8); e
Out[93]: array([0, 1, 2, 3, 4, 5, 6, 7])
In [94]: np.array_split(e, 3)
Out[94]: [array([0, 1, 2]), array([3, 4, 5]), array([6, 7])]
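The size rule quoted above can be checked directly. This is a minimal sketch with the same illustrative values (l = 8, n = 3); the names parts, sizes, and expected are not part of the NumPy API.

```python
import numpy as np

l, n = 8, 3                      # illustrative length and number of sections
parts = np.array_split(np.arange(l), n)
sizes = [len(p) for p in parts]
print(sizes)                     # [3, 3, 2]

# l % n sub-arrays of size l//n + 1, then the rest of size l//n
expected = [l // n + 1] * (l % n) + [l // n] * (n - l % n)
print(sizes == expected)         # True
```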
In addition, we have np.insert
, np.append
, and np.delete
with the assumption that the input ndarrays have the same number of dimensions.
In [95]: f = np.array([[1, 1], [2, 2], [3, 3]]); f
Out[95]:
array([[1, 1],
[2, 2],
[3, 3]])
In [96]: np.insert(f, 1, 5.5, axis=1) # 5.5 is cast to the integer dtype of f
Out[96]:
array([[1, 5, 1],
[2, 5, 2],
[3, 5, 3]])
In [97]: np.insert(f, [1], [[1],[2],[3]], axis=1)
Out[97]:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
In [98]: np.array_equal(np.insert(f, 1, [1, 2, 3], axis=1),
....: np.insert(f, [1], [[1],[2],[3]], axis=1))
....:
Out[98]: True
Note the difference between Python’s builtin slice
and an index array as the parameter for the insertion location
In [99]: np.insert(f, slice(2, 4), [5, 6])
Out[99]: array([1, 1, 5, 2, 6, 2, 3, 3])
In [100]: np.insert(f, [2, 4], [5, 6])
Out[100]: array([1, 1, 5, 2, 2, 6, 3, 3])
In [101]: np.append([[1, 2, 3], [4, 5, 6]], [[7, 8, 9]], axis=0)
Out[101]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
In [102]: np.append([[1, 2], [3, 4]], [[5, 6], [7, 8]], axis=1)
Out[102]:
array([[1, 2, 5, 6],
[3, 4, 7, 8]])
In [103]: g = np.arange(1, 17).reshape(4, 4); g
Out[103]:
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]])
In [104]: np.delete(g, [1, 3], axis=0)
Out[104]:
array([[ 1, 2, 3, 4],
[ 9, 10, 11, 12]])
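Alternatively, we can use a boolean mask, which is preferable when the mask is reused later. Note that indexing with a 2D mask returns a flattened 1D array, so the result below holds the same elements as above but not the same shape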
In [105]: mask = np.ones_like(g, dtype=bool)
In [106]: mask[[1, 3], :] = False
In [107]: g[mask] # flattened
Out[107]: array([ 1, 2, 3, 4, 9, 10, 11, 12])
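If the 2D shape should be preserved, a one-dimensional mask over the rows can be used instead. The sketch below is one way to do this; row_mask and kept are illustrative names.

```python
import numpy as np

g = np.arange(1, 17).reshape(4, 4)

row_mask = np.ones(g.shape[0], dtype=bool)  # one flag per row
row_mask[[1, 3]] = False                    # drop rows 1 and 3

kept = g[row_mask]                          # 2D result, shape (2, 4)
print(kept)
print(np.array_equal(kept, np.delete(g, [1, 3], axis=0)))  # True
```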
Use np.pad
to pad the borders of an array
# np.pad(g, pad_width=((top, bottom), (left, right)))
In [108]: np.pad(g, pad_width=((1, 2), (1, 2)), constant_values=(0, 6))
Out[108]:
array([[ 0, 0, 0, 0, 0, 6, 6],
[ 0, 1, 2, 3, 4, 6, 6],
[ 0, 5, 6, 7, 8, 6, 6],
[ 0, 9, 10, 11, 12, 6, 6],
[ 0, 13, 14, 15, 16, 6, 6],
[ 0, 6, 6, 6, 6, 6, 6],
[ 0, 6, 6, 6, 6, 6, 6]])
Sorting#
Use np.sort
, np.argsort
, np.lexsort
to get sorted values, sorting indices, and the indices of a sort on multiple keys. While the order
parameter lists fields from the highest sorting priority to the lowest, the keys
tuple passed to np.lexsort
is read in reverse: the last key is the primary sort key.
In [109]: h = np.array([[1,8,2,4],[4,5,1,3]]); h
Out[109]:
array([[1, 8, 2, 4],
[4, 5, 1, 3]])
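# NumPy structured array (a bare str field has zero length, so the names are
# stored as empty strings below; use a sized type such as 'U10' to keep them)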
In [110]: dtype = [('name', str), ('height', int), ('age', int)]
In [111]: val = [('Jason', 180, 21), ('Leo', 179, 21),('Rebecca', 166, 24)]
In [112]: k = np.array(val, dtype=dtype); k
Out[112]:
array([('', 180, 21), ('', 179, 21), ('', 166, 24)],
dtype=[('name', '<U'), ('height', '<i8'), ('age', '<i8')])
In [113]: np.sort(h, axis=1)
Out[113]:
array([[1, 2, 4, 8],
[1, 3, 4, 5]])
In [114]: np.sort(h, axis=0)
Out[114]:
array([[1, 5, 1, 3],
[4, 8, 2, 4]])
In [115]: np.sort(h, axis=None)
Out[115]: array([1, 1, 2, 3, 4, 4, 5, 8])
In [116]: np.sort(k, order=['age', 'height'])
Out[116]:
array([('', 179, 21), ('', 180, 21), ('', 166, 24)],
dtype=[('name', '<U'), ('height', '<i8'), ('age', '<i8')])
In [117]: np.argsort(h)
Out[117]:
array([[0, 2, 3, 1],
[2, 3, 0, 1]])
In [118]: np.argsort(k, order=['age', 'height'])
Out[118]: array([1, 0, 2])
In [119]: finance = [85, 70, 95, 80]
In [120]: math = [80, 95, 90, 85]
In [121]: total = [165, 165, 185, 165]
In [122]: np.lexsort((finance, math, total))
Out[122]: array([0, 3, 1, 2])
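To make the key order concrete, the sketch below repeats the example and compares it with Python's sorted using an explicit (total, maths, finance) key. The list is named maths here only to avoid shadowing the math module.

```python
import numpy as np

finance = [85, 70, 95, 80]
maths = [80, 95, 90, 85]      # renamed to avoid shadowing the math module
total = [165, 165, 185, 165]

# the LAST key passed to lexsort is the primary sort key
idx = np.lexsort((finance, maths, total))
print(idx)  # [0 3 1 2]

# same ordering from sorted with the priority written front to back
ref = sorted(range(4), key=lambda i: (total[i], maths[i], finance[i]))
print(ref)  # [0, 3, 1, 2]
```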
Alternatively, pandas is also capable of handling sorting. For more information about pandas, check out pandas Guide.
In [123]: unsort = np.array([3, 9, 0, 8])
In [124]: pd.DataFrame(unsort).sort_values(by=0).to_numpy().flatten()
Out[124]: array([0, 3, 8, 9])
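Note that Python’s builtin list.sort() and ndarray.sort() sort in place, whereas np.sort returns a sorted copy.
Warning
Basic slicing returns a view (shallow copy) of the original array, so changes made through the view are reflected in the original object. Avoid this by calling .copy() explicitly. Functions such as np.sort, np.insert, np.append, and np.delete return new arrays and leave their input unchanged; where a function offers the out
parameter, it can be used to write the result into an existing array.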
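To make the in-place distinction concrete, here is a minimal sketch comparing np.sort with the ndarray.sort method; the array h is illustrative.

```python
import numpy as np

h = np.array([3, 1, 2])

s = np.sort(h)      # returns a sorted copy; h is unchanged
print(h, s)         # [3 1 2] [1 2 3]

result = h.sort()   # sorts h in place and returns None
print(h, result)    # [1 2 3] None
```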
Numpy Exercises#
Edited Numpy exercises from GitHub with various and optimized solutions.
1. Import the numpy package under the name ``np``
hint: import … as
import numpy as np
2. Print the numpy version and the configuration
hint: np.__version__, np.show_config
print(np.__version__)
np.show_config()
3. Create a null vector of size 10
hint: np.zeros
np.zeros(10)
4. How to find the memory size of any array
hint: size, itemsize
Z = np.zeros((10,10))
print("%d bytes" % (Z.size * Z.itemsize))
# alternatively
print(Z.nbytes)
5. How to get the documentation of the numpy add function from the command line?
hint: np.info
# run in a shell (prefix the command with ! inside IPython)
python -c "import numpy; numpy.info(numpy.add)"
6. Create a null vector of size 10 but the fifth value which is 1
hint: array[4]
Z = np.zeros(10)
Z[4] = 1
print(Z)
7. Create a vector with values ranging from 10 to 49
hint: arange
np.arange(10,50) # np.int64
np.linspace(10,49,40) # np.float64
8. Reverse a vector (first element becomes last)
hint: array[::-1]
Z = np.arange(50)
Z = Z[::-1]
print(Z)
9. Create a 3x3 matrix with values ranging from 0 to 8
hint: reshape
np.arange(9).reshape(3, 3)
10. Find indices of non-zero elements from [1,2,0,0,4,0]
hint: np.nonzero
np.nonzero([1,2,0,0,4,0])
np.flatnonzero([1,2,0,0,4,0])
# alternatively
arr = np.array([1,2,0,0,4,0])
mask = arr != 0
arr[mask]
11. Create a 3x3 identity matrix
hint: np.eye
np.eye(3)
12. Create a 3x3x3 array with random values
hint: np.random.random
np.random.random((3,3,3))
# alternatively
rng = np.random.default_rng()
rng.random((3,3,3))
13. Create a 10x10 array with random values and find the minimum and maximum values
hint: min, max
Z = np.random.random((10,10))
# alternatively
arr = rng.random((10,10))
print('min: %f, max: %f' % (arr.min(), arr.max()))
14. Create a random vector of size 30 and find the mean value
hint: mean
np.random.random(30).mean()
15. Create a 2d array with 1 on the border and 0 inside
hint: array[1:-1, 1:-1]
Z = np.ones((10,10))
Z[1:-1,1:-1] = 0
print(Z)
# alternatively
arr = np.zeros((3,3))
np.pad(arr, 1, constant_values=1)
16. How to add a border (filled with 0’s) around an existing array?
hint: np.pad
Z = np.ones((5,5))
Z = np.pad(Z, pad_width=1, mode='constant', constant_values=0)
print(Z)
# Using fancy indexing
Z[:, [0, -1]] = 0
Z[[0, -1], :] = 0
print(Z)
17. What is the result of the following expression?
0 * np.nan
np.nan == np.nan
np.inf > np.nan
np.nan - np.nan
np.nan in set([np.nan])
0.3 == 3 * 0.1
hint: NaN = not a number, inf = infinity
print(0 * np.nan)
print(np.nan == np.nan)
print(np.inf > np.nan)
print(np.nan - np.nan)
print(np.nan in set([np.nan]))
print(0.3 == 3 * 0.1)
18. Create a 5x5 matrix with values 1,2,3,4 just below the diagonal
hint: np.diag
np.diag(1 + np.arange(4), k=-1)
19. Create a 8x8 matrix and fill it with a checkerboard pattern
hint: array[::2]
Z = np.zeros((8,8),dtype=int)
Z[1::2,::2] = 1
Z[::2,1::2] = 1
print(Z)
20. Consider a (6,7,8) shape array, what is the index (x,y,z) of the 100th element?
hint: np.unravel_index
print(np.unravel_index(99, (6,7,8)))
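The result can be cross-checked by hand and with the inverse function np.ravel_multi_index. This sketch assumes the default C (row-major) order.

```python
import numpy as np

shape = (6, 7, 8)
idx = np.unravel_index(99, shape)
print(idx)  # multi-index of the 100th element (flat index 99)

# by hand in C order: 99 = 1*(7*8) + 5*8 + 3
print(np.ravel_multi_index((1, 5, 3), shape))  # 99
```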
21. Create a checkerboard 8x8 matrix using the tile function
hint: np.tile
Z = np.tile( np.array([[0,1],[1,0]]), (4,4))
print(Z)
22. Normalize a 5x5 random matrix
hint: (x -mean)/std
Z = np.random.random((5,5))
Z = (Z - np.mean (Z)) / (np.std (Z))
print(Z)
23. Create a custom dtype that describes a color as four unsigned bytes (RGBA)
hint: np.dtype
color = np.dtype([("r", np.ubyte),
("g", np.ubyte),
("b", np.ubyte),
("a", np.ubyte)])
24. Multiply a 5x3 matrix by a 3x2 matrix (real matrix product)
hint:
Z = np.dot(np.ones((5,3)), np.ones((3,2)))
print(Z)
# Alternative solution, in Python 3.5 and above
Z = np.ones((5,3)) @ np.ones((3,2))
print(Z)
25. Given a 1D array, negate all elements which are between 3 and 8, in place.
hint: >, <
Z = np.arange(11)
Z[(3 < Z) & (Z < 8)] *= -1
print(Z)
26. What is the output of the following script?
print(sum(range(5),-1))
from numpy import *
print(sum(range(5),-1))
hint: np.sum
print(sum(range(5), -1))
from numpy import *
print(sum(range(5), -1))
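# explanation
import builtins
print(builtins.sum(range(5), -1))  # builtin sum(iterable, start): -1 + 10 = 9
print(np.sum(range(5), axis=-1))   # numpy sum(a, axis): sums the last axis = 10
# builtin signature: sum(iterable, /, start=0)
# returns the sum of a 'start' value (default: 0) plus an iterable of numbers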
27. Consider an integer vector Z, which of these expressions are legal?
Z**Z
2 << Z >> 2
Z <- Z
1j*Z
Z/1/1
Z<Z>Z
No hints provided...
Z**Z
2 << Z >> 2
Z <- Z
1j*Z
Z/1/1
Z<Z>Z
28. What are the result of the following expressions?
np.array(0) / np.array(0)
np.array(0) // np.array(0)
np.array([np.nan]).astype(int).astype(float)
No hints provided...
print(np.array(0) / np.array(0))
print(np.array(0) // np.array(0))
print(np.array([np.nan]).astype(int).astype(float))
29. How to round away from zero a float array ?
hint: np.random.uniform, np.copysign, np.ceil, np.abs, np.where
Z = np.random.uniform(-10,+10,10)
print(np.copysign(np.ceil(np.abs(Z)), Z))
# More readable but less efficient
print(np.where(Z>0, np.ceil(Z), np.floor(Z)))
# Intuitively
Z = np.random.uniform(-10,+10,10)
print(Z)
mask = Z < 0
Z[mask] *= -1
Z = np.ceil(Z)
Z[mask] *= -1
print(Z)
30. How to find common values between two arrays?
hint: np.intersect1d
Z1 = np.random.randint(0, 10, 10)
Z2 = np.random.randint(0, 10, 10)
print(np.intersect1d(Z1, Z2))
31. How to ignore all numpy warnings (not recommended)?
hint: np.seterr, np.errstate
# Suicide mode on
defaults = np.seterr(all="ignore")
Z = np.ones(1) / 0
# Back to sanity
_ = np.seterr(**defaults)
# Equivalently with a context manager
with np.errstate(all="ignore"):
    np.arange(3) / 0
32. Is the following expression true?
np.sqrt(-1) == np.emath.sqrt(-1)
hint: imaginary number
np.sqrt(-1) == np.emath.sqrt(-1)
# np.sqrt(-1) returns nan and emits a RuntimeWarning (invalid value)
# np.emath.sqrt(-1) returns 1j, so the comparison is False
33. How to get the dates of yesterday, today and tomorrow?
hint: np.datetime64, np.timedelta64
yesterday = np.datetime64('today') - np.timedelta64(1)
today = np.datetime64('today')
tomorrow = np.datetime64('today') + np.timedelta64(1)
34. How to get all the dates corresponding to the month of July 2016?
hint: np.arange(dtype=datetime64['D'])
Z = np.arange('2016-07', '2016-08', dtype='datetime64[D]')
print(Z)
35. How to compute ((A+B)*(-A/2)) in place (without copy)?
hint: np.add(out=), np.negative(out=), np.multiply(out=), np.divide(out=)
A = np.ones(3)*1
B = np.ones(3)*2
np.add(A, B, out=B)
np.divide(A, 2, out=A)
np.negative(A, out=A)
np.multiply(A, B, out=A)
36. Extract the integer part of a random array of positive numbers using 4 different methods
hint: %, np.floor, astype, np.trunc
Z = np.random.uniform(0, 10, 10)
print(Z - Z%1)
print(Z // 1)
print(np.floor(Z))
print(Z.astype(int))
print(np.trunc(Z))
37. Create a 5x5 matrix with row values ranging from 0 to 4
hint: np.arange
# with broadcasting
Z = np.zeros((5,5))
Z += np.arange(5)
print(Z)
# alternatively with broadcasting
arr = np.arange(5) * np.ones(5)[:,np.newaxis]
print(arr)
# without broadcasting
np.tile(np.arange(0, 5), (5,1))
38. Consider a generator function that generates 10 integers and use it to build an array
hint: np.fromiter
def generate():
    for x in range(10):
        yield x
Z = np.fromiter(generate(), dtype=float, count=-1)
print(Z)
39. Create a vector of size 10 with values ranging from 0 to 1, both excluded
hint: np.linspace
np.linspace(0,1,11,endpoint=False)[1:]
40. Create a random vector of size 10 and sort it
hint: sort
np.sort(np.random.random(10))  # ndarray.sort() sorts in place and returns None
41. How to sum a small array faster than np.sum?
hint: np.add.reduce
Z = np.arange(10)
np.add.reduce(Z)
42. Consider two random array A and B, check if they are equal
hint: np.allclose, np.array_equal
A = np.random.randint(0, 2, 5)
B = np.random.randint(0, 2, 5)
# Assuming identical shape of the arrays and a tolerance for the comparison of values
equal = np.allclose(A,B)
print(equal)
# Checking both the shape and the element values, no tolerance (values have to be exactly equal)
equal = np.array_equal(A,B)
print(equal)
43. Make an array immutable (read-only)
hint: flags.writeable
Z = np.zeros(10)
Z.flags.writeable = False
Z[0] = 1  # raises ValueError: assignment destination is read-only
44. Consider a random 10x2 matrix representing cartesian coordinates, convert them to polar coordinates
hint: np.sqrt, np.arctan2
Z = np.random.random((10,2))
X,Y = Z[:,0], Z[:,1]
R = np.sqrt(X**2+Y**2)
T = np.arctan2(Y,X)
print(R)
print(T)
45. Create random vector of size 10 and replace the maximum value by 0
hint: argmax
Z = np.random.random(10)
Z[Z.argmax()] = 0
print(Z)
46. Create a structured array with ``x`` and ``y`` coordinates covering the [0,1]x[0,1] area
hint: np.meshgrid
Z = np.zeros((5,5), [('x',float),('y',float)])
Z['x'], Z['y'] = np.meshgrid(np.linspace(0,1,5),
np.linspace(0,1,5))
print(Z)
47. Given two arrays, X and Y, construct the Cauchy matrix C (Cij =1/(xi - yj))
hint: np.subtract.outer
X = np.arange(8)
Y = X + 0.5
C = 1.0 / np.subtract.outer(X, Y)
print(np.linalg.det(C))
48. Print the minimum and maximum representable value for each numpy scalar type
hint: np.iinfo, np.finfo, eps
for dtype in [np.int8, np.int32, np.int64]:
    print(np.iinfo(dtype).min)
    print(np.iinfo(dtype).max)
for dtype in [np.float32, np.float64]:
    print(np.finfo(dtype).min)
    print(np.finfo(dtype).max)
    print(np.finfo(dtype).eps)
49. How to print all the values of an array?
hint: np.set_printoptions
np.set_printoptions(threshold=float("inf"))
Z = np.zeros((40,40))
print(Z)
50. How to find the closest value (to a given scalar) in a vector?
hint: argmin
Z = np.arange(100)
v = np.random.uniform(0,100)
index = (np.abs(Z-v)).argmin()
print(Z[index])
51. Create a structured array representing a position (x,y) and a color (r,g,b)
hint: dtype
# scalar fields are declared without a trailing shape of 1
Z = np.zeros(10, [('position', [('x', float),
                                ('y', float)]),
                  ('color',    [('r', float),
                                ('g', float),
                                ('b', float)])])
print(Z)
52. Consider a random vector with shape (100,2) representing coordinates, find point by point distances
hint: np.atleast_2d, T, np.sqrt
Z = np.random.random((10,2))
X,Y = np.atleast_2d(Z[:,0], Z[:,1])
D = np.sqrt( (X-X.T)**2 + (Y-Y.T)**2)
print(D)
# Much faster with scipy
import scipy
# Thanks Gavin Heverly-Coulson (#issue 1)
import scipy.spatial
Z = np.random.random((10,2))
D = scipy.spatial.distance.cdist(Z,Z)
print(D)
53. How to convert a float (32 bits) array into an integer (32 bits) in place?
hint: view and [:] =
# Thanks Vikas (https://stackoverflow.com/a/10622758/5989906)
# & unutbu (https://stackoverflow.com/a/4396247/5989906)
Z = (np.random.rand(10)*100).astype(np.float32)
Y = Z.view(np.int32)
Y[:] = Z
print(Y)
54. How to read the following file?
1, 2, 3, 4, 5
6, , , 7, 8
, , 9,10,11
hint: np.genfromtxt
from io import StringIO
# Fake file
s = StringIO('''1, 2, 3, 4, 5
6, , , 7, 8
, , 9,10,11
''')
Z = np.genfromtxt(s, delimiter=",", dtype=int)  # np.int was removed in NumPy 1.24
print(Z)
55. What is the equivalent of enumerate for numpy arrays?
hint: np.ndenumerate, np.ndindex
Z = np.arange(9).reshape(3,3)
for index, value in np.ndenumerate(Z):
    print(index, value)
for index in np.ndindex(Z.shape):
    print(index, Z[index])
56. Generate a generic 2D Gaussian-like array
hint: np.meshgrid, np.exp
X, Y = np.meshgrid(np.linspace(-1,1,10), np.linspace(-1,1,10))
D = np.sqrt(X*X+Y*Y)
sigma, mu = 1.0, 0.0
G = np.exp(-( (D-mu)**2 / ( 2.0 * sigma**2 ) ) )
print(G)
57. How to randomly place p elements in a 2D array?
hint: np.put, np.random.choice
# Author: Divakar
n = 10
p = 3
Z = np.zeros((n,n))
np.put(Z, np.random.choice(range(n*n), p, replace=False),1)
print(Z)
58. Subtract the mean of each row of a matrix
hint: mean(axis=,keepdims=)
# Author: Warren Weckesser
X = np.random.rand(5, 10)
# Recent versions of numpy
Y = X - X.mean(axis=1, keepdims=True)
# Older versions of numpy
Y = X - X.mean(axis=1).reshape(-1, 1)
print(Y)
59. How to sort an array by the nth column?
hint: argsort
# Author: Steve Tjoa
Z = np.random.randint(0,10,(3,3))
print(Z)
print(Z[Z[:,1].argsort()])
60. How to tell if a given 2D array has null columns?
hint: any, ~
# Author: Warren Weckesser
Z = np.random.randint(0,3,(3,10))
print((~Z.any(axis=0)).any())
61. Find the nearest value from a given value in an array
hint: np.abs, argmin, flat
Z = np.random.uniform(0,1,10)
z = 0.5
m = Z.flat[np.abs(Z - z).argmin()]
print(m)
62. Considering two arrays with shape (1,3) and (3,1), how to compute their sum using an iterator?
hint: np.nditer
A = np.arange(3).reshape(3,1)
B = np.arange(3).reshape(1,3)
it = np.nditer([A,B,None])
for x,y,z in it: z[...] = x + y
print(it.operands[2])
63. Create an array class that has a name attribute
hint: class method
class NamedArray(np.ndarray):
    def __new__(cls, array, name="no name"):
        obj = np.asarray(array).view(cls)
        obj.name = name
        return obj
    def __array_finalize__(self, obj):
        if obj is None: return
        # propagate the name attribute to views and slices
        self.name = getattr(obj, 'name', "no name")
Z = NamedArray(np.arange(10), "range_10")
print(Z.name)
64. Consider a given vector, how to add 1 to each element indexed by a second vector (be careful with repeated indices)?
hint: np.bincount | np.add.at
# Author: Brett Olsen
Z = np.ones(10)
I = np.random.randint(0,len(Z),20)
Z += np.bincount(I, minlength=len(Z))
print(Z)
# Another solution
# Author: Bartosz Telenczuk
np.add.at(Z, I, 1)
print(Z)
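The reason for the warning about repeated indices can be seen in a smaller case. In this sketch (arrays Z1 and Z2 and index I are illustrative), plain fancy-index addition applies each repeated index only once, while np.add.at accumulates.

```python
import numpy as np

I = np.array([0, 0, 2])       # index 0 appears twice

Z1 = np.zeros(3)
Z1[I] += 1                    # buffered: index 0 is incremented only once
print(Z1)                     # [1. 0. 1.]

Z2 = np.zeros(3)
np.add.at(Z2, I, 1)           # unbuffered: repeated indices accumulate
print(Z2)                     # [2. 0. 1.]
```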
65. How to accumulate elements of a vector (X) to an array (F) based on an index list (I)?
hint: np.bincount
# Author: Alan G Isaac
X = [1,2,3,4,5,6]
I = [1,3,9,3,4,1]
F = np.bincount(I,X)
print(F)
66. Considering a (w,h,3) image of (dtype=ubyte), compute the number of unique colors
hint: np.unique
# Author: Fisher Wang
w, h = 256, 256
I = np.random.randint(0, 4, (h, w, 3)).astype(np.ubyte)
colors = np.unique(I.reshape(-1, 3), axis=0)
n = len(colors)
print(n)
# Faster version
# Author: Mark Setchell
# https://stackoverflow.com/a/59671950/2836621
w, h = 256, 256
I = np.random.randint(0,4,(h,w,3), dtype=np.uint8)
# View each pixel as a single 24-bit integer, rather than three 8-bit bytes
I24 = np.dot(I.astype(np.uint32),[1,256,65536])
# Count unique colours
n = len(np.unique(I24))
print(n)
67. Considering a four dimensions array, how to get sum over the last two axis at once?
hint: sum(axis=(-2,-1))
A = np.random.randint(0,10,(3,4,3,4))
# solution by passing a tuple of axes (introduced in numpy 1.7.0)
sum = A.sum(axis=(-2,-1))
print(sum)
# solution by flattening the last two dimensions into one
# (useful for functions that don't accept tuples for axis argument)
sum = A.reshape(A.shape[:-2] + (-1,)).sum(axis=-1)
print(sum)
68. Considering a one-dimensional vector D, how to compute means of subsets of D using a vector S of same size describing subset indices?
hint: np.bincount
# Author: Jaime Fernández del Río
D = np.random.uniform(0,1,100)
S = np.random.randint(0,10,100)
D_sums = np.bincount(S, weights=D)
D_counts = np.bincount(S)
D_means = D_sums / D_counts
print(D_means)
# Pandas solution as a reference due to more intuitive code
import pandas as pd
print(pd.Series(D).groupby(S).mean())
69. How to get the diagonal of a dot product?
hint: np.diag
# Author: Mathieu Blondel
A = np.random.uniform(0,1,(5,5))
B = np.random.uniform(0,1,(5,5))
# Slow version
np.diag(np.dot(A, B))
# Fast version
np.sum(A * B.T, axis=1)
# Faster version
np.einsum("ij,ji->i", A, B)
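As a quick sanity check, the three versions can be compared numerically. The sketch below uses a seeded generator (the seed 0 is arbitrary) so that the run is reproducible; timing claims are not verified here.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed for reproducibility
A = rng.uniform(0, 1, (5, 5))
B = rng.uniform(0, 1, (5, 5))

d1 = np.diag(A @ B)                  # full product, then the diagonal
d2 = np.sum(A * B.T, axis=1)         # only the terms that are needed
d3 = np.einsum("ij,ji->i", A, B)     # the same contraction via einsum
print(np.allclose(d1, d2) and np.allclose(d1, d3))  # True
```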
70. Consider the vector [1, 2, 3, 4, 5], how to build a new vector with 3 consecutive zeros interleaved between each value?
hint: array[::4]
# Author: Warren Weckesser
Z = np.array([1,2,3,4,5])
nz = 3
Z0 = np.zeros(len(Z) + (len(Z)-1)*(nz))
Z0[::nz+1] = Z
print(Z0)
71. Consider an array of dimension (5,5,3), how to multiply it by an array with dimensions (5,5)?
hint: array[:, :, None]
A = np.ones((5,5,3))
B = 2*np.ones((5,5))
print(A * B[:,:,None])
72. How to swap two rows of an array?
hint: array[[]] = array[[]]
# Author: Eelco Hoogendoorn
A = np.arange(25).reshape(5,5)
A[[0,1]] = A[[1,0]]
print(A)
73. Consider a set of 10 triplets describing 10 triangles (with shared vertices), find the set of unique line segments composing all the triangles
hint: repeat, np.roll, np.sort, view, np.unique
# Author: Nicolas P. Rougier
faces = np.random.randint(0,100,(10,3))
F = np.roll(faces.repeat(2,axis=1),-1,axis=1)
F = F.reshape(len(F)*3,2)
F = np.sort(F,axis=1)
G = F.view( dtype=[('p0',F.dtype),('p1',F.dtype)] )
G = np.unique(G)
print(G)
74. Given a sorted array C that corresponds to a bincount, how to produce an array A such that np.bincount(A) == C?
hint: np.repeat
# Author: Jaime Fernández del Río
C = np.bincount([1,1,2,3,4,4,6])
A = np.repeat(np.arange(len(C)), C)
print(A)
75. How to compute averages using a sliding window over an array?
hint: np.cumsum
# Author: Jaime Fernández del Río
def moving_average(a, n=3):
    ret = np.cumsum(a, dtype=float)
    ret[n:] = ret[n:] - ret[:-n]
    return ret[n - 1:] / n
Z = np.arange(20)
print(moving_average(Z, n=3))
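As a cross-check on the cumulative-sum approach, the same sliding mean can be sketched as a convolution with a uniform kernel. This is an alternative technique rather than the author's method, and the name moving_average_convolve is hypothetical:

```python
import numpy as np

def moving_average_convolve(a, n=3):
    # Uniform kernel of length n; "valid" keeps only fully overlapping windows
    kernel = np.ones(n) / n
    return np.convolve(a, kernel, mode="valid")

Z = np.arange(20)
result = moving_average_convolve(Z, n=3)
print(result)
```

The cumulative-sum version does a constant amount of work per element regardless of n, so it is likely the better choice for large windows; the convolution form is mainly easier to read.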
76. Consider a one-dimensional array Z, build a two-dimensional array whose first row is (Z[0],Z[1],Z[2]) and each subsequent row is shifted by 1 (last row should be (Z[-3],Z[-2],Z[-1]))
hint: from numpy.lib import stride_tricks
# Author: Joe Kington / Erik Rigtorp
from numpy.lib import stride_tricks
def rolling(a, window):
    shape = (a.size - window + 1, window)
    strides = (a.itemsize, a.itemsize)
    return stride_tricks.as_strided(a, shape=shape, strides=strides)
Z = rolling(np.arange(10), 3)
print(Z)
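With NumPy 1.20 or later, the same rolling view can be obtained without computing strides by hand, using sliding_window_view. A short sketch, assuming that version is available (the name windows is illustrative):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

Z = np.arange(10)
# Each row is a length-3 window over Z, shifted by one element
windows = sliding_window_view(Z, window_shape=3)
print(windows)
```

The result is a read-only view by default, which guards against the accidental overlapping writes that as_strided allows.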
77. How to negate a boolean, or to change the sign of a float inplace?
hint: np.logical_not, np.negative
# Author: Nathaniel J. Smith
Z = np.random.randint(0,2,100)
np.logical_not(Z, out=Z)
Z = np.random.uniform(-1.0,1.0,100)
np.negative(Z, out=Z)
78. Consider 2 sets of points P0,P1 describing lines (2d) and a point
p, how to compute distance from p to each line i (P0[i],P1[i])?
No hints provided...
def distance(P0, P1, p):
    T = P1 - P0
    L = (T**2).sum(axis=1)
    U = -((P0[:,0]-p[...,0])*T[:,0] + (P0[:,1]-p[...,1])*T[:,1]) / L
    U = U.reshape(len(U),1)
    D = P0 + U*T - p
    return np.sqrt((D**2).sum(axis=1))
P0 = np.random.uniform(-10,10,(10,2))
P1 = np.random.uniform(-10,10,(10,2))
p = np.random.uniform(-10,10,( 1,2))
print(distance(P0, P1, p))
79. Consider 2 sets of points P0,P1 describing lines (2d) and a set of
points P, how to compute distance from each point j (P[j]) to each line
i (P0[i],P1[i])? No hints provided...
# Author: Italmassov Kuanysh
# based on distance function from previous question
P0 = np.random.uniform(-10, 10, (10,2))
P1 = np.random.uniform(-10,10,(10,2))
p = np.random.uniform(-10, 10, (10,2))
print(np.array([distance(P0,P1,p_i) for p_i in p]))
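The list comprehension above can also be replaced by broadcasting. The sketch below swaps in the 2D cross-product formula for the perpendicular distance to an infinite line instead of the projection used in distance; the function name distance_all is hypothetical:

```python
import numpy as np

def distance_all(P0, P1, P):
    # Perpendicular distance from every point P[j] to every infinite
    # line through P0[i] and P1[i]; result has shape (n_points, n_lines).
    T = P1 - P0                          # line directions, (n_lines, 2)
    W = P[:, None, :] - P0[None, :, :]   # point minus line origin, (n_points, n_lines, 2)
    cross = T[None, :, 0] * W[..., 1] - T[None, :, 1] * W[..., 0]
    return np.abs(cross) / np.hypot(T[:, 0], T[:, 1])

# Two axis-aligned lines and two points, small enough to check by hand
P0 = np.array([[0.0, 0.0], [0.0, 0.0]])
P1 = np.array([[1.0, 0.0], [0.0, 1.0]])
P = np.array([[0.0, 2.0], [3.0, 5.0]])
result = distance_all(P0, P1, P)
print(result)
```

Row j of the result holds the distances from point j to each line, matching the layout of the loop-based output.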
80. Consider an arbitrary array, write a function that extracts a subpart with a fixed shape and centered on a given element (pad with a ``fill`` value when necessary)
hint: minimum maximum
# Author: Nicolas Rougier
Z = np.random.randint(0,10,(10,10))
shape = (5,5)
fill = 0
position = (1,1)
R = np.ones(shape, dtype=Z.dtype)*fill
P = np.array(list(position)).astype(int)
Rs = np.array(list(R.shape)).astype(int)
Zs = np.array(list(Z.shape)).astype(int)
R_start = np.zeros((len(shape),)).astype(int)
R_stop = np.array(list(shape)).astype(int)
Z_start = (P-Rs//2)
Z_stop = (P+Rs//2)+Rs%2
R_start = (R_start - np.minimum(Z_start,0)).tolist()
Z_start = (np.maximum(Z_start,0)).tolist()
R_stop = np.maximum(R_start, (R_stop - np.maximum(Z_stop-Zs,0))).tolist()
Z_stop = (np.minimum(Z_stop,Zs)).tolist()
r = [slice(start,stop) for start,stop in zip(R_start,R_stop)]
z = [slice(start,stop) for start,stop in zip(Z_start,Z_stop)]
# index with tuples of slices; a list of slices is rejected by recent NumPy
R[tuple(r)] = Z[tuple(z)]
print(Z)
print(R)
81. Consider an array Z = [1,2,3,4,5,6,7,8,9,10,11,12,13,14], how to generate an array R = [[1,2,3,4], [2,3,4,5], [3,4,5,6], …, [11,12,13,14]]?
hint: stride_tricks.as_strided
# Author: Stefan van der Walt
Z = np.arange(1,15,dtype=np.uint32)
R = stride_tricks.as_strided(Z,(11,4),(4,4))
print(R)
82. Compute a matrix rank
hint: np.linalg.svd
# Author: Stefan van der Walt
Z = np.random.uniform(0,1,(10,10))
U, S, V = np.linalg.svd(Z) # Singular Value Decomposition
rank = np.sum(S > 1e-10)
print(rank)
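NumPy also ships np.linalg.matrix_rank, which applies the same SVD idea with a tolerance derived from the matrix instead of a fixed 1e-10. A small sketch on a matrix whose rank is known by construction (the example matrix and variable names are illustrative):

```python
import numpy as np

# Second row is twice the first, so the rank is 2
Z = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 0.0]])

rank_builtin = np.linalg.matrix_rank(Z)

# Manual version, mirroring the fixed-threshold approach above
S = np.linalg.svd(Z, compute_uv=False)
rank_svd = int(np.sum(S > 1e-10))

print(rank_builtin, rank_svd)
```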
83. How to find the most frequent value in an array?
hint: np.bincount, argmax
Z = np.random.randint(0,10,50)
print(np.bincount(Z).argmax())
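np.bincount only accepts non-negative integers. For negative or non-integer values, one possible alternative is np.unique with return_counts=True; the sample data below is made up for illustration:

```python
import numpy as np

Z = np.array([-3, 7, 7, -3, 7, 2])  # contains negatives, so np.bincount would fail
values, counts = np.unique(Z, return_counts=True)
most_frequent = values[counts.argmax()]
print(most_frequent)
```

When several values tie, argmax picks the first maximum, which here means the smallest tied value because np.unique returns sorted values.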
84. Extract all the contiguous 3x3 blocks from a random 10x10 matrix
hint: stride_tricks.as_strided
# Author: Chris Barker
Z = np.random.randint(0,5,(10,10))
n = 3
i = 1 + (Z.shape[0]-n)
j = 1 + (Z.shape[1]-n)
C = stride_tricks.as_strided(Z, shape=(i, j, n, n), strides=Z.strides + Z.strides)
print(C)
85. Create a 2D array subclass such that Z[i,j] == Z[j,i]
hint: class method
# Author: Eric O. Lebigot
# Note: only works for 2d array and value setting using indices
class Symetric(np.ndarray):
    def __setitem__(self, index, value):
        i,j = index
        super(Symetric, self).__setitem__((i,j), value)
        super(Symetric, self).__setitem__((j,i), value)

def symetric(Z):
    return np.asarray(Z + Z.T - np.diag(Z.diagonal())).view(Symetric)
S = symetric(np.random.randint(0,10,(5,5)))
S[2,3] = 42
print(S)
86. Consider a set of p matrices with shape (n,n) and a set of p vectors with shape (n,1). How to compute the sum of the p matrix products at once? (result has shape (n,1))
hint: np.tensordot
# Author: Stefan van der Walt
p, n = 10, 20
M = np.ones((p,n,n))
V = np.ones((p,n,1))
S = np.tensordot(M, V, axes=[[0, 2], [0, 1]])
print(S)
# It works because:
# M is (p,n,n)
# V is (p,n,1)
# tensordot sums over the paired axes 0 and 0 (of M and V respectively)
# and over axes 2 and 1, which leaves an (n,1) vector.
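The axis pairing may be easier to read as an einsum subscript string, where p and j are the summed indices. A sketch comparing it with tensordot and an explicit loop, using random data and illustrative names:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for repeatability
p, n = 10, 20
M = rng.uniform(0, 1, (p, n, n))
V = rng.uniform(0, 1, (p, n, 1))

S_tensordot = np.tensordot(M, V, axes=[[0, 2], [0, 1]])
# "pij,pjk->ik": multiply matching p and j entries, sum them out, keep i and k
S_einsum = np.einsum("pij,pjk->ik", M, V)
# Reference: add up the p individual matrix-vector products
S_loop = sum(M[i] @ V[i] for i in range(p))

print(np.allclose(S_tensordot, S_einsum), np.allclose(S_tensordot, S_loop))
```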
87. Consider a 16x16 array, how to get the block-sum (block size is 4x4)?
hint: np.add.reduceat
# Author: Robert Kern
Z = np.ones((16,16))
k = 4
S = np.add.reduceat(np.add.reduceat(Z, np.arange(0, Z.shape[0], k), axis=0),
np.arange(0, Z.shape[1], k), axis=1)
print(S)
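When the array sides are exact multiples of the block size, the block sum can also be sketched with a reshape that splits each axis into (block index, offset within block). This is an alternative to reduceat, shown with illustrative names:

```python
import numpy as np

Z = np.arange(16 * 16).reshape(16, 16)
k = 4

# Split each axis into (number of blocks, block size), then sum inside each block
S_reshape = Z.reshape(16 // k, k, 16 // k, k).sum(axis=(1, 3))

# The reduceat formulation, for comparison
S_reduceat = np.add.reduceat(np.add.reduceat(Z, np.arange(0, 16, k), axis=0),
                             np.arange(0, 16, k), axis=1)

print(np.array_equal(S_reshape, S_reduceat))
```

The reshape form requires the shape to divide evenly by k, whereas reduceat also handles a ragged final block.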
88. How to implement the Game of Life using numpy arrays?
No hints provided...
# Author: Nicolas Rougier
def iterate(Z):
    # Count neighbours
    N = (Z[0:-2,0:-2] + Z[0:-2,1:-1] + Z[0:-2,2:] +
         Z[1:-1,0:-2]                + Z[1:-1,2:] +
         Z[2:  ,0:-2] + Z[2:  ,1:-1] + Z[2:  ,2:])
    # Apply rules
    birth = (N==3) & (Z[1:-1,1:-1]==0)
    survive = ((N==2) | (N==3)) & (Z[1:-1,1:-1]==1)
    Z[...] = 0
    Z[1:-1,1:-1][birth | survive] = 1
    return Z
Z = np.random.randint(0,2,(50,50))
for i in range(100): Z = iterate(Z)
print(Z)
89. How to get the n largest values of an array?
hint: np.argsort | np.argpartition
Z = np.arange(10000)
np.random.shuffle(Z)
n = 5
# Slow
print (Z[np.argsort(Z)[-n:]])
# Fast
print (Z[np.argpartition(-Z,n)[:n]])
90. Given an arbitrary number of vectors, build the cartesian product (every combinations of every item)
hint: np.indices
# Author: Stefan Van der Walt
def cartesian(arrays):
    arrays = [np.asarray(a) for a in arrays]
    shape = (len(x) for x in arrays)
    ix = np.indices(shape, dtype=int)
    ix = ix.reshape(len(arrays), -1).T
    for n, arr in enumerate(arrays):
        ix[:, n] = arrays[n][ix[:, n]]
    return ix
print (cartesian(([1, 2, 3], [4, 5], [6, 7])))
91. How to create a record array from a regular array?
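hint: np.rec.fromarrays (np.core.records.fromarrays in older code)
Z = np.array([("Hello", 2.5, 3),
              ("World", 3.6, 2)])
# np.rec is the public alias; accessing np.core is deprecated in NumPy 2.x
R = np.rec.fromarrays(Z.T,
                      names='col1, col2, col3',
                      formats = 'S8, f8, i8')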
print(R)
92. Consider a large vector Z, compute Z to the power of 3 using 3 different methods
hint: np.power, *, np.einsum
# Author: Ryan G.
x = np.random.rand(int(5e7))
%timeit np.power(x,3)
%timeit x*x*x
%timeit np.einsum('i,i,i->i',x,x,x)
93. Consider two arrays A and B of shape (8,3) and (2,2). How to find rows of A that contain elements of each row of B regardless of the order of the elements in B?
hint: np.where
# Author: Gabe Schwartz
A = np.random.randint(0,5,(8,3))
B = np.random.randint(0,5,(2,2))
C = (A[..., np.newaxis, np.newaxis] == B)
rows = np.where(C.any((3,1)).all(1))[0]
print(rows)
94. Considering a 10x3 matrix, extract rows with unequal values
(e.g. [2,2,3]) No hints provided...
# Author: Robert Kern
Z = np.random.randint(0,5,(10,3))
print(Z)
# solution for arrays of all dtypes (including string arrays and record arrays)
E = np.all(Z[:,1:] == Z[:,:-1], axis=1)
U = Z[~E]
print(U)
# solution for numerical arrays only, will work for any number of columns in Z
U = Z[Z.max(axis=1) != Z.min(axis=1),:]
print(U)
95. Convert a vector of ints into a matrix binary representation
hint: np.unpackbits
# Author: Warren Weckesser
I = np.array([0, 1, 2, 3, 15, 16, 32, 64, 128])
B = ((I.reshape(-1,1) & (2**np.arange(8))) != 0).astype(int)
print(B[:,::-1])
# Author: Daniel T. McDonald
I = np.array([0, 1, 2, 3, 15, 16, 32, 64, 128], dtype=np.uint8)
print(np.unpackbits(I[:, np.newaxis], axis=1))
96. Given a two dimensional array, how to extract unique rows?
hint: np.ascontiguousarray | np.unique
# Author: Jaime Fernández del Río
Z = np.random.randint(0,2,(6,3))
T = np.ascontiguousarray(Z).view(np.dtype((np.void, Z.dtype.itemsize * Z.shape[1])))
_, idx = np.unique(T, return_index=True)
uZ = Z[idx]
print(uZ)
# Author: Andreas Kouzelis
# NumPy >= 1.13
uZ = np.unique(Z, axis=0)
print(uZ)
97. Considering 2 vectors A & B, write the einsum equivalent of inner, outer, sum, and mul function
hint: np.einsum
# Author: Alex Riley
# Make sure to read: http://ajcr.net/Basic-guide-to-einsum/
A = np.random.uniform(0,1,10)
B = np.random.uniform(0,1,10)
np.einsum('i->', A) # np.sum(A)
np.einsum('i,i->i', A, B) # A * B
np.einsum('i,i', A, B) # np.inner(A, B)
np.einsum('i,j->ij', A, B) # np.outer(A, B)
98. Considering a path described by two vectors (X,Y), how to sample it using equidistant samples?
hint: np.cumsum, np.interp
# Author: Bas Swinckels
phi = np.arange(0, 10*np.pi, 0.1)
a = 1
x = a*phi*np.cos(phi)
y = a*phi*np.sin(phi)
dr = (np.diff(x)**2 + np.diff(y)**2)**.5 # segment lengths
r = np.zeros_like(x)
r[1:] = np.cumsum(dr) # integrate path
r_int = np.linspace(0, r.max(), 200) # regular spaced path
x_int = np.interp(r_int, r, x) # interpolate x at the regularly spaced path positions
y_int = np.interp(r_int, r, y)
99. Given an integer n and a 2D array X, select from X the rows which can be interpreted as draws from a multinomial distribution with n degrees, i.e., the rows which only contain integers and which sum to n.
hint: np.logical_and.reduce, np.mod
# Author: Evgeni Burovski
X = np.asarray([[1.0, 0.0, 3.0, 8.0],
[2.0, 0.0, 1.0, 1.0],
[1.5, 2.5, 1.0, 0.0]])
n = 4
M = np.logical_and.reduce(np.mod(X, 1) == 0, axis=-1)
M &= (X.sum(axis=-1) == n)
print(X[M])
100. Compute bootstrapped 95% confidence intervals for the mean of a 1D array X (i.e., resample the elements of an array with replacement N times, compute the mean of each sample, and then compute percentiles over the means).
hint: np.percentile
# Author: Jessica B. Hamrick
X = np.random.randn(100) # random 1D array
N = 1000 # number of bootstrap samples
idx = np.random.randint(0, X.size, (N, X.size))
means = X[idx].mean(axis=1)
confint = np.percentile(means, [2.5, 97.5])
print(confint)
Miscellaneous#
Differentiate common usage of Numpy vs. MATLAB.