Saturday, April 11, 2020

Numpy Array Cookbook: Generating and Manipulating Arrays in Python


1) Array Overview

What are Arrays?

Arrays are a data structure for storing homogeneous data. That means all elements are the same type.
import numpy as np
arr = np.array([[1,2],[3,4]])
type(arr)
#=> numpy.ndarray
np.zeros((2))
#=> array([0., 0.])
np.zeros((2,2))
#=> array([[0., 0.],
#=>        [0., 0.]])
np.zeros((2,2,2))
#=> array([[[0., 0.],
#=>         [0., 0.]],
#=> 
#=>        [[0., 0.],
#=>         [0., 0.]]])
...
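
To see what "homogeneous" means in practice, here is a small sketch of how numpy picks one type for the whole array when the inputs are mixed. The variable names are only for illustration.

```python
import numpy as np

# Mixed ints and floats are upcast to a single float dtype.
mixed_numbers = np.array([1, 2.5, 3])

# Mixing numbers and strings turns every element into a string.
mixed_text = np.array([1, "a"])

print(mixed_numbers.dtype)     # float64
print(mixed_text.dtype.kind)   # U (a Unicode string dtype)
```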

Arrays vs Lists

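  • Arrays typically use less memory than lists holding the same numeric data
  • Arrays come with far more built-in numerical functionality (vectorized math, reshaping, aggregation)
  • Arrays require data to be homogeneous; lists do not
  • Arithmetic on arrays is applied element by element (matrix multiplication uses @ or np.dot)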
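
A minimal sketch of how arithmetic differs between the two. On a numpy array, `*` works element by element, and matrix multiplication is spelled `@`. The variable names are just for illustration.

```python
import numpy as np

nums = [1, 2, 3]
arr = np.array(nums)

list_times_two = nums * 2   # repeats the list: [1, 2, 3, 1, 2, 3]
arr_times_two = arr * 2     # multiplies each element: [2, 4, 6]
elementwise = arr * arr     # element by element: [1, 4, 9]
dot_product = arr @ arr     # dot product of the two vectors: 14
```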

Important Parameters

shape: a tuple giving the size of the array along each dimension. An array of shape (2,3,2) is a 2x3x2 array: 2 blocks, each with 3 rows and 2 columns, as shown below.
np.zeros((2,3,2))
#=> array([[[0., 0.],
#=>         [0., 0.],
#=>         [0., 0.]],
#=> 
#=>        [[0., 0.],
#=>         [0., 0.],
#=>         [0., 0.]]])
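
Alongside shape, a few related attributes are worth knowing. This is a short sketch, not an exhaustive list.

```python
import numpy as np

z = np.zeros((2, 3, 2))

print(z.shape)   # (2, 3, 2) -> size along each dimension
print(z.ndim)    # 3         -> number of dimensions
print(z.size)    # 12        -> total number of elements (2 * 3 * 2)
print(z.dtype)   # float64   -> element type, the default for np.zeros
```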

2) Generating Arrays

zeros

Generate an array of zeros with a specified shape.
np.zeros((2,3))
#=> array([[0., 0., 0.],
#=>        [0., 0., 0.]])

ones

Generate an array of ones with a specified shape.
np.ones((2,3))
#=> array([[1., 1., 1.],
#=>        [1., 1., 1.]])
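
Both zeros and ones return floats by default. If you want another type, the dtype parameter can be passed, as in this small sketch.

```python
import numpy as np

default_zeros = np.zeros((2, 3))         # float64 unless told otherwise
int_zeros = np.zeros((2, 3), dtype=int)  # integer zeros
bool_ones = np.ones(3, dtype=bool)       # array([ True,  True,  True])
```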

empty

np.empty() is a little different from zeros and ones: it allocates the array without setting any values, so the contents are whatever happened to be in that memory. It can be slightly faster to initialize, but the difference is usually negligible.
arr = np.empty((2,2))
arr
#=> array([[1.00000000e+000, 1.49166815e-154],
#=>        [4.44659081e-323, 0.00000000e+000]])
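
Since the initial contents are arbitrary, np.empty only makes sense when every element is about to be overwritten. A minimal sketch of that pattern, with a made-up `table` array:

```python
import numpy as np

# Allocate without initializing, then fill every row before reading anything.
table = np.empty((2, 3))
for row in range(2):
    table[row] = np.arange(3) * (row + 1)

print(table)
# [[0. 1. 2.]
#  [0. 2. 4.]]
```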

full

Initialize an array with a given value.
np.full((3,2), 10)
#=> array([[10, 10],
#=>        [10, 10],
#=>        [10, 10]])
np.full((3,2), ['a','b'])
#=> array([['a', 'b'],
#=>        ['a', 'b'],
#=>        ['a', 'b']], dtype='<U1')

array

This is probably what you’ve seen the most in real life. It initializes an array from an “array-like” object.
li = ['a','b','c']
np.array(li)
#=> array(['a', 'b', 'c'], dtype='<U1')

_like

There are several _like functions corresponding to the functions we’ve discussed: empty_like, ones_like, zeros_like and full_like. Each returns a new array with the same shape and dtype as the array passed in.
a1 = np.array([[1,2],[3,4]])
a1
#=> array([[1, 2],
#=>        [3, 4]])
np.ones_like(a1)
#=> array([[1, 1],
#=>        [1, 1]])
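
The other _like functions work the same way. Here is a short sketch showing that the shape and dtype are taken from the input unless a dtype is given explicitly.

```python
import numpy as np

a1 = np.array([[1, 2], [3, 4]])

zeros = np.zeros_like(a1)                     # same shape, same integer dtype
sevens = np.full_like(a1, 7)                  # same shape, filled with 7
float_zeros = np.zeros_like(a1, dtype=float)  # same shape, dtype overridden
```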

rand

Generate an array of the given shape with random values drawn uniformly from [0, 1).
np.random.rand(3,2)
#=> array([[0.94664048, 0.76616114],
#=>        [0.395549  , 0.84680126],
#=>        [0.42873   , 0.77736086]])
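
If you need the same random values on every run, one option is numpy's Generator API with a fixed seed. This is a sketch of that alternative rather than something the example above uses, and the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same numbers.
rng = np.random.default_rng(seed=42)
sample = rng.random((3, 2))          # uniform values in [0, 1), shape (3, 2)

rng_again = np.random.default_rng(seed=42)
same_sample = rng_again.random((3, 2))

print(np.array_equal(sample, same_sample))  # True
```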

asarray

np.asarray converts its input to an array, like np.array, but it does not copy the data when the input is already an ndarray with a matching dtype. See np.array above.
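
A small sketch of the practical difference: given an existing ndarray, np.asarray hands back the same object, while np.array makes a copy by default.

```python
import numpy as np

a = np.array([1, 2, 3])

same = np.asarray(a)    # no copy: this is the very same array object
copied = np.array(a)    # copies the data by default

same[0] = 10            # also changes a
copied[1] = 20          # does not change a

print(same is a)        # True
print(copied is a)      # False
print(a)                # [10  2  3]
```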

arange

Generates an array of evenly spaced values from a lower limit up to, but not including, an upper limit. It’s numpy’s counterpart to list(range(50,60,2)).
np.arange(50,60,2)
#=> array([50, 52, 54, 56, 58])

linspace

Generates an array of evenly spaced numbers between two endpoints, both included by default. Instead of specifying the interval directly like arange, we specify how many numbers to generate.
np.linspace(10, 20, 6)
#=> array([10., 12., 14., 16., 18., 20.])
np.linspace(0, 2, 5)
#=> array([0. , 0.5, 1. , 1.5, 2. ])
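
A short sketch contrasting the two: linspace can report the step it ended up using (retstep=True) and can leave out the upper limit (endpoint=False), while arange takes the step directly and always stops before the upper limit.

```python
import numpy as np

# Ask linspace which step size it computed.
points, step = np.linspace(10, 20, 6, retstep=True)   # step is 2.0

# Exclude the upper limit, like arange does.
half_open = np.linspace(0, 1, 5, endpoint=False)      # 0, 0.2, 0.4, 0.6, 0.8

# arange with an explicit float step.
by_step = np.arange(0, 1, 0.25)                       # 0, 0.25, 0.5, 0.75
```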

meshgrid

Generates coordinate matrices from 2 input arrays. Pairing the two outputs element by element gives every (x, y) combination on the grid.
x = np.array([1,2,3])
y = np.array([-3,-2,-1])

xcors, ycors = np.meshgrid(x, y)
xcors
#=> [[1 2 3]
#=>  [1 2 3]
#=>  [1 2 3]]
ycors
#=> [[-3 -3 -3]
#=>  [-2 -2 -2]
#=>  [-1 -1 -1]]
# Combined, the grid coordinates are:
#=> [[(1, -3), (2, -3), (3, -3)],
#=>  [(1, -2), (2, -2), (3, -2)],
#=>  [(1, -1), (2, -1), (3, -1)]]
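
A common reason to build a grid like this is to evaluate a function at every (x, y) point at once. Below is a minimal sketch; the function x**2 + y is an arbitrary example, and np.stack is used here only to make the coordinate pairs explicit.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([-3, -2, -1])
xcors, ycors = np.meshgrid(x, y)

# Evaluate x**2 + y at every grid point in one vectorized expression.
z = xcors**2 + ycors
print(z)
# [[-2  1  6]
#  [-1  2  7]
#  [ 0  3  8]]

# Pair the coordinates: pairs[i, j] holds the (x, y) of grid cell (i, j).
pairs = np.stack((xcors, ycors), axis=-1)
print(pairs.shape)   # (3, 3, 2)
print(pairs[0, 0])   # [ 1 -3]
```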

3) Manipulating Arrays

copy

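Make a copy of an existing array. Plain assignment only creates a second name for the same array, so a change made through a2 also shows up in a1.
a1 = np.array([1,2,3])
a2 = a1
a2[0] = 10
a1
#=> array([10,  2,  3])
With copy, a2 gets its own data and a1 is left untouched.
a1 = np.array([1,2,3])
a2 = a1.copy()
a2[0] = 10
a1
#=> array([1, 2, 3])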
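
The same caution applies to slices, which are views onto the original data rather than copies. A small sketch, using np.shares_memory to check which arrays are linked:

```python
import numpy as np

a1 = np.array([1, 2, 3, 4])

view = a1[:2]            # a slice is a view onto a1's data
dup = a1[:2].copy()      # an independent copy of the same slice

view[0] = 99             # writes through to a1

print(a1)                              # [99  2  3  4]
print(dup)                             # [1 2]
print(np.shares_memory(a1, view))      # True
print(np.shares_memory(a1, dup))       # False
```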

shape

Get the shape of an array.
a = np.array([[1,2],[3,4],[5,6]])
a.shape
#=> (3, 2)

reshape

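Returns the data of an array in a new shape, leaving the original unchanged. The new shape must hold the same number of elements.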
a = np.array([[1,2],[3,4],[5,6]])
a
#=> array([[1, 2],
#=>        [3, 4],
#=>        [5, 6]])
a.shape
#=> (3, 2)
a.reshape(2,3)
#=> array([[1, 2, 3],
#=>        [4, 5, 6]])
a.reshape(6)
#=> array([1, 2, 3, 4, 5, 6])
a.reshape(6,1)
#=> array([[1],
#=>        [2],
#=>        [3],
#=>        [4],
#=>        [5],
#=>        [6]])
a.reshape(2,3,1)
#=> array([[[1],
#=>         [2],
#=>         [3]],
#=> 
#=>        [[4],
#=>         [5],
#=>         [6]]])
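
Two details that come up often with reshape, sketched below: one dimension can be given as -1 so numpy works it out, and a shape that does not fit the number of elements raises a ValueError.

```python
import numpy as np

a = np.arange(1, 7)              # array([1, 2, 3, 4, 5, 6])

two_cols = a.reshape(-1, 2)      # -1 is inferred as 3 -> shape (3, 2)
two_rows = a.reshape(2, -1)      # -1 is inferred as 3 -> shape (2, 3)

# 6 elements cannot fill a shape that holds 4.
try:
    a.reshape(4)
    reshape_failed = False
except ValueError:
    reshape_failed = True
```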

resize

Similar to reshape, but it modifies the original array in place instead of returning a new one.
a = np.array([['a','b'],['c','d']])
a
#=> array([['a', 'b'],
#=>        ['c', 'd']], dtype='<U1')
a.reshape(1,4)
#=> array([['a', 'b', 'c', 'd']], dtype='<U1')
a
#=> array([['a', 'b'],
#=>        ['c', 'd']], dtype='<U1')
a.resize(1,4)
a
#=> array([['a', 'b', 'c', 'd']], dtype='<U1')
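
Unlike reshape, resize can also change the total number of elements. The sketch below shows the array method padding with zeros, and the separate np.resize function, which returns a new array and repeats the data instead. Passing refcheck=False is a choice made here so the snippet also runs in interactive sessions, where extra references to the array can otherwise make the in-place resize fail.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# In-place method: growing the array pads the new slots with zeros.
a.resize((2, 3), refcheck=False)
print(a)
# [[1 2 3]
#  [4 0 0]]

# np.resize function: returns a new array, repeating the data to fill it.
repeated = np.resize(np.array([1, 2, 3, 4]), (2, 3))
print(repeated)
# [[1 2 3]
#  [4 1 2]]
```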

transpose

Transposes an array.
a = np.array([['s','t','u'],['x','y','z']])
a
#=> array([['s', 't', 'u'],
#=>        ['x', 'y', 'z']], dtype='<U1')
a.T
#=> array([['s', 'x'],
#=>        ['t', 'y'],
#=>        ['u', 'z']], dtype='<U1')
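
For arrays with more than 2 dimensions, .T reverses the order of all axes, and the transpose method accepts an explicit axis order. A brief sketch using an arbitrary 2x3x4 array:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

reversed_axes = a.T                 # axes reversed -> shape (4, 3, 2)
swapped = a.transpose(1, 0, 2)      # swap the first two axes -> shape (3, 2, 4)

# Transposing does not copy: the result shares memory with a.
print(reversed_axes.shape)              # (4, 3, 2)
print(swapped.shape)                    # (3, 2, 4)
print(np.shares_memory(a, swapped))     # True
```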

flatten

Flattens an array into 1 dimension and returns a copy.
a = np.array([[1,2,3],['a','b','c']])
a.flatten()
#=> array(['1', '2', '3', 'a', 'b', 'c'], dtype='<U21')
a.reshape(6)
#=> array(['1', '2', '3', 'a', 'b', 'c'], dtype='<U21')

ravel

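Flattens an array-like object into 1 dimension. Similar to flatten, but it returns a view of the original data whenever possible instead of always making a copy. It is also available as the function np.ravel, whereas flatten exists only as an array method.
np.ravel([[1,2,3],[4,5,6]])
#=> array([1, 2, 3, 4, 5, 6])
np.flatten([[1,2,3],[4,5,6]])
#=> AttributeError: module 'numpy' has no attribute 'flatten'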
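
The copy-versus-view difference is easiest to see by modifying the results. A minimal sketch, using a contiguous array so that ravel can return a view:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = a.flatten()   # always a copy
flat_view = a.ravel()     # a view here, since a is contiguous in memory

flat_copy[0] = 100        # a is unaffected
flat_view[1] = -1         # writes through to a

print(a)
# [[ 1 -1  3]
#  [ 4  5  6]]
```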

hsplit

Horizontally splits an array into subarrays.
a = np.array(
    [[1,2,3],
     [4,5,6]])
a
#=> array([[1, 2, 3],
#=>        [4, 5, 6]])
np.hsplit(a,3)
#=> [array([[1],[4]]),
#=>  array([[2],[5]]),
#=>  array([[3],[6]])]
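
The second argument does not have to be a count. As a sketch, passing a list of column indices splits at those positions, and asking for a number of sections that does not divide the columns evenly raises a ValueError.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

# Split before column index 1: one column on the left, two on the right.
left, right = np.hsplit(a, [1])

# 3 columns cannot be divided into 2 equal sections.
try:
    np.hsplit(a, 2)
    uneven_failed = False
except ValueError:
    uneven_failed = True
```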

vsplit

Vertically splits an array into subarrays.
a = np.array(
    [[1,2,3],
     [4,5,6]])
a
#=> array([[1, 2, 3],
#=>        [4, 5, 6]])
np.vsplit(a,2)
#=> [array([[1, 2, 3]]),
#=>  array([[4, 5, 6]])]

stack

Joins arrays of the same shape along a new axis.
a = np.array(['a', 'b', 'c'])
b = np.array(['d', 'e', 'f'])
np.stack((a, b), axis=0)
#=> array([['a', 'b', 'c'],
#=>        ['d', 'e', 'f']], dtype='<U1')
a = np.array(['a', 'b', 'c'])
b = np.array(['d', 'e', 'f'])
np.stack((a, b), axis=1)
#=> array([['a', 'd'],
#=>        ['b', 'e'],
#=>        ['c', 'f']], dtype='<U1')
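
It can help to compare stack with its relatives. In this sketch, stack adds a new axis, while concatenate joins along an axis that already exists. vstack and column_stack are shown as shorthand for the two stack calls on 1-dimensional inputs.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stacked = np.stack((a, b), axis=0)      # new axis -> shape (2, 3)
joined = np.concatenate((a, b))         # existing axis -> shape (6,)

rows = np.vstack((a, b))                # same result as stack with axis=0
cols = np.column_stack((a, b))          # same result as stack with axis=1
```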

Conclusion

I consider this the basics of numpy. You’ll come across these functions repeatedly when reading existing code at work or doing tutorials online.
