NumPy Array Manipulations

The NumPy package provides several functions for manipulating arrays and their elements. These functions can be broadly classified as:

  • Changing Array Shapes
  • Transpose Operations
  • Changing Array Dimensions
  • Joining Arrays
  • Splitting Arrays
  • Adding and Removing Array elements

Changing Array Shapes

The NumPy library has several functions for changing the shape (number of rows and columns) of an array.

  • ndarray.flatten()
  • ravel()
  • reshape()

ndarray.flatten():

The flatten() function is used to convert the input array into one dimension. The function returns a copy of the input array collapsed into a single dimension.

Syntax:

ndarray.flatten(order='C')

The flatten() function takes an optional order parameter that describes the order in which the elements are stored in the output array. It defaults to 'C' (row-major order); 'F' selects column-major order.

import numpy as np
arr = np.array([[[1, 2, 3],[4, 5, 6],[7, 8, 9]],
                [[10, 11,12],[13, 14, 15],[16, 17, 18]]])

print(arr.flatten('C')) #row-major order
#Output:
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18]

print(arr.flatten('F')) #column-major order
#Output:
[ 1 10  4 13  7 16  2 11  5 14  8 17  3 12  6 15  9 18]

ravel():

The NumPy ravel() function returns a flattened one-dimensional array; a copy is made only if needed. Apart from that, ravel() works like ndarray.flatten(). The resulting array has the same data type as the input array.

Syntax:

ravel(a, order='C')

The ravel() function takes the argument a, which is the input array, and order, which has the same meaning as in flatten(). The a argument is mandatory and order is optional.

print(np.ravel(arr))
#Output:
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18]
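The "copy only if needed" behaviour is the main practical difference between ravel() and flatten(). The following is a minimal sketch of that difference, assuming the array is C-contiguous (as the example array is); np.shares_memory() is used here only as a checking aid and is not part of the original examples.

```python
import numpy as np

arr = np.arange(1, 19).reshape(2, 3, 3)  # same values as the example array

flat_copy = arr.flatten()   # always a new copy
flat_view = np.ravel(arr)   # a view here, because arr is C-contiguous

copy_shares = np.shares_memory(arr, flat_copy)
view_shares = np.shares_memory(arr, flat_view)
print(copy_shares)  # False
print(view_shares)  # True

# Writing through the view changes the original array
flat_view[0] = 100
print(arr[0, 0, 0])  # 100
```

If the input is not contiguous in the requested order, ravel() falls back to returning a copy, so code that must not modify the original should call flatten() or copy explicitly.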

Transpose Operations

Transpose operations interchange the axes of an array, for example the row and column axes. NumPy provides several functions for transposing an ndarray.

  • moveaxis()
  • rollaxis()
  • swapaxes()
  • transpose()
  • ndarray.T

moveaxis():

The moveaxis() function of NumPy moves axes of the input array to new positions. The axes that are not moved remain in their original order.

Syntax:

moveaxis(a, source, destination)

The a argument is the array whose axes are reordered, source is the original position of the axes to move, and destination is the final position of those axes. The entries of source and of destination must be unique, and each argument must be an integer or a sequence of integers.

print('Original array shape:', arr.shape)
print('Shape after moving axis 0 to the end:', np.moveaxis(arr, 0, -1).shape)
print('Shape after moving the last axis to the front:', np.moveaxis(arr, -1, 0).shape)

Output:

Original array shape: (2, 3, 3)
Shape after moving axis 0 to the end: (3, 3, 2)
Shape after moving the last axis to the front: (3, 2, 3)
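Since source and destination may also be sequences of integers, several axes can be moved in one call. A small sketch, using a hypothetical 4-D array x with distinct axis lengths so the result is easy to read:

```python
import numpy as np

x = np.zeros((2, 3, 4, 5))

# Move axis 0 to position 2 and axis 1 to position 3 in a single call
moved = np.moveaxis(x, [0, 1], [2, 3])
print(moved.shape)  # (4, 5, 2, 3)
```

The remaining axes (here the ones of length 4 and 5) keep their relative order and fill the free positions.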

rollaxis():

The rollaxis() function of NumPy rolls the specified axis of the given ndarray backward until it lies in a specified position.

Syntax:

rollaxis(a, axis, start=0)

The a parameter is the input array, the axis is the position of the axis to be rolled and the start is the specified destination position for the rolled axis.

When start <= axis, the axis is rolled back until it lies in the start position. When start > axis, the axis is rolled until it lies just before this position. The default start is 0, which results in a complete roll to the front.

print('Original shape:', arr.shape)
print('when start < axis:', np.rollaxis(arr, 2, 0).shape) #roll axis 2 back to position 0
print('when start = axis:', np.rollaxis(arr, 0, 0).shape) #axis 0 is already at position 0
print('when start > axis:', np.rollaxis(arr, 0, 3).shape) #roll axis 0 to position (3 - 1)

Output:

Original shape: (2, 3, 3)
when start < axis: (3, 2, 3)
when start = axis: (2, 3, 3)
when start > axis: (3, 3, 2)
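Because the example array has two axes of the same length, the effect of a roll is hard to read from its shape alone. The sketch below repeats the idea on a hypothetical array x with distinct axis lengths and compares the result with moveaxis(), which the NumPy documentation recommends over rollaxis() for new code.

```python
import numpy as np

x = np.zeros((2, 3, 4))  # distinct axis lengths make the moves easy to follow

rolled_back = np.rollaxis(x, 2, 0)  # start < axis: axis 2 lands at position 0
rolled_fwd = np.rollaxis(x, 0, 3)   # start > axis: axis 0 lands at position 3 - 1 = 2

print(rolled_back.shape)  # (4, 2, 3)
print(rolled_fwd.shape)   # (3, 4, 2)

# The same results with moveaxis, whose last argument is the final position
print(np.moveaxis(x, 2, 0).shape)  # (4, 2, 3)
print(np.moveaxis(x, 0, 2).shape)  # (3, 4, 2)
```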

swapaxes():

The swapaxes() function of NumPy interchanges two axes of an array. For NumPy >= 1.10, a view of the input is returned when a is an ndarray; otherwise a new array is created. In earlier versions, a view was returned only if the order of the axes changed; otherwise the input array itself was returned.

Syntax:

swapaxes(a, axis1, axis2)

The a parameter is the input array, as in rollaxis(), and axis1 and axis2 are the positions of the axes to be interchanged.

print(np.swapaxes(arr, 0, 1))
#Output:
[[[ 1  2  3]
  [10 11 12]]

 [[ 4  5  6]
  [13 14 15]]

 [[ 7  8  9]
  [16 17 18]]]
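Because the result is a view in current NumPy versions, writing to the swapped array also changes the original. A minimal sketch of that behaviour, rebuilding the example array so the block runs on its own:

```python
import numpy as np

arr = np.arange(1, 19).reshape(2, 3, 3)  # same values as the example array

swapped = np.swapaxes(arr, 0, 1)
print(swapped.shape)                   # (3, 2, 3)
print(np.shares_memory(arr, swapped))  # True

# swapped[i, j, k] refers to arr[j, i, k]
swapped[0, 1, 0] = -1
print(arr[1, 0, 0])  # -1
```

Call .copy() on the result when an independent array is needed.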

transpose():

The transpose() function of NumPy permutes the axes of the given array; by default it reverses their order. For an array with two axes, transpose() gives the matrix transpose. A view of the input is returned whenever possible.

Syntax:

transpose(a, axes=None)

The a parameter is the input array, as in rollaxis(), and axes is a list or tuple containing a permutation of [0, 1, ..., N-1], where N is the number of axes of the input array. The axes parameter is optional.

print(np.transpose(arr))
#Output:
[[[ 1 10]
  [ 4 13]
  [ 7 16]]

 [[ 2 11]
  [ 5 14]
  [ 8 17]]

 [[ 3 12]
  [ 6 15]
  [ 9 18]]]

With the axes parameter:

print(np.transpose(arr,(1, 0, 2)))
#Output:
[[[ 1  2  3]
  [10 11 12]]

 [[ 4  5  6]
  [13 14 15]]

 [[ 7  8  9]
  [16 17 18]]]

ndarray.T:

ndarray.T is an attribute of the ndarray class that returns the transpose of the given ndarray. It gives the same result as transpose() called without the axes argument.

arr.T #outside a notebook, use print(arr.T)
#Output:
array([[[ 1, 10],
        [ 4, 13],
        [ 7, 16]],

       [[ 2, 11],
        [ 5, 14],
        [ 8, 17]],

       [[ 3, 12],
        [ 6, 15],
        [ 9, 18]]])

Changing Array Dimensions

The NumPy package provides the following functions to change the dimensions of an ndarray.

  • atleast_1d()
  • broadcast_to()
  • broadcast_arrays()
  • expand_dims()
  • squeeze()

atleast_1d():

The atleast_1d() function converts its inputs to arrays with at least one dimension. Scalar inputs (single values) are converted to one-dimensional arrays, whilst higher-dimensional inputs are preserved.

Syntax:

atleast_1d(*arys)

The arys parameter is one or more input arrays (or array-like objects).

print(np.atleast_1d(arr))
#Output:
[[[ 1  2  3]
  [ 4  5  6]
  [ 7  8  9]]

 [[10 11 12]
  [13 14 15]
  [16 17 18]]]

For scalar inputs:

print(np.atleast_1d(1, 2, [3, 4]))
#Output:
[array([1]), array([2]), array([3, 4])]

The atleast_2d() and atleast_3d() functions work similarly to atleast_1d(). The atleast_2d() function converts lower-dimensional inputs to two-dimensional arrays and preserves arrays of higher dimensions.

Similarly, atleast_3d() preserves arrays with three or more dimensions and converts lower-dimensional arrays or scalar inputs to 3-D arrays.

print(np.atleast_2d([1, 2, 3, 4, 5]))
#Output:
[[1 2 3 4 5]]

print(np.atleast_3d([10, 2, 23])) #a 1-D input of length N becomes shape (1, N, 1)
#Output:
[[[10]
  [ 2]
  [23]]]

broadcast_to():

The NumPy broadcast_to() function is used to broadcast an ndarray to a new shape.

Syntax:

broadcast_to(array, shape, subok=False)

The array parameter is the array to broadcast. The shape parameter is the shape of the resulting array (an integer or a tuple of integers). If subok is True, sub-classes are passed through; otherwise (the default, False) the returned array is forced to be a base-class array.

x = np.arange(4)
print(np.broadcast_to(x, (3, 4)))
#Output:
[[0 1 2 3]
 [0 1 2 3]
 [0 1 2 3]]
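The array returned by broadcast_to() is a read-only view: no data is duplicated, and the repeated rows all point at the same memory. The sketch below illustrates this; the variable names view and writable are chosen for this example only.

```python
import numpy as np

x = np.arange(4)
view = np.broadcast_to(x, (3, 4))

print(view.flags.writeable)  # False
print(view.strides[0])       # 0, every row reuses the same memory

try:
    view[0, 0] = 99          # writing to the read-only view fails
except ValueError as exc:
    print("write rejected:", exc)

# Take a copy when a writable array of the broadcast shape is needed
writable = view.copy()
writable[0, 0] = 99
print(x[0])  # 0, the original is untouched
```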

broadcast_arrays():

The NumPy broadcast_arrays() function is used to broadcast any number of arrays against each other. It returns the broadcast arrays as views on the originals; these views are typically not contiguous in memory.

Syntax:

broadcast_arrays(*args, subok=False)

The args parameter is the arrays to broadcast and the subok parameter is the same as the broadcast_to() function. The subok parameter is optional.

x = np.array([[5, 6, 7]])
y = np.array([[1],[2]])
print(np.broadcast_arrays(x, y))
#Output:
[array([[5, 6, 7],
       [5, 6, 7]]), array([[1, 1, 1],
       [2, 2, 2]])]
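A short sketch of how the result is typically consumed: the returned arrays are unpacked, share a common shape, and can then be combined element-wise. The names bx, by and total are used for illustration only.

```python
import numpy as np

x = np.array([[5, 6, 7]])   # shape (1, 3)
y = np.array([[1], [2]])    # shape (2, 1)

bx, by = np.broadcast_arrays(x, y)
print(bx.shape, by.shape)       # (2, 3) (2, 3)
print(np.shares_memory(bx, x))  # True, bx is a view on x

# Element-wise arithmetic broadcasts implicitly, so both routes agree
total = bx + by
print(np.array_equal(total, x + y))  # True
```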

expand_dims():

The expand_dims() function of NumPy is used to expand the shape of an array by inserting a new axis at the specified position. The expand_dims() function returns the array with an increased number of dimensions.

Syntax:

expand_dims(a, axis)

The a parameter is the input array and the axis parameter is the position at which the new axis (or axes) is placed; it is an integer or a tuple of integers.

arr1 = np.expand_dims(arr, axis=1)
print(arr1)
#Output:
[[[[ 1  2  3]
   [ 4  5  6]
   [ 7  8  9]]]

 [[[10 11 12]
   [13 14 15]
   [16 17 18]]]]

arr1 = np.expand_dims(arr, axis=2)
print(arr1)
#Output:
[[[[ 1  2  3]]

  [[ 4  5  6]]

  [[ 7  8  9]]]

 [[[10 11 12]]

  [[13 14 15]]

  [[16 17 18]]]]
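The same effect can be obtained by indexing with np.newaxis, and a tuple of positions inserts several axes at once. A minimal sketch comparing the approaches; it assumes NumPy 1.18 or later for the tuple form.

```python
import numpy as np

arr = np.arange(1, 19).reshape(2, 3, 3)  # same values as the example array

via_function = np.expand_dims(arr, axis=1)
via_indexing = arr[:, np.newaxis, :, :]   # equivalent indexing form

print(via_function.shape)  # (2, 1, 3, 3)
print(np.array_equal(via_function, via_indexing))  # True

# A tuple inserts new axes at positions 0 and 2 of the result
multi = np.expand_dims(arr, axis=(0, 2))
print(multi.shape)  # (1, 2, 1, 3, 3)
```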

squeeze():

The NumPy squeeze() function removes axes of length one from the given array. The squeeze() function returns an ndarray with the reduced number of dimensions.

Syntax:

squeeze(a, axis=None)

The a parameter is the input array. The axis parameter selects which length-one axes to remove (an integer or a tuple of integers); with the default, axis=None, all axes of length one are removed.

x = np.array([[[2], [3], [5], [7], [11]]])
print(x.shape)
#Output:
(1, 5, 1)

print(np.squeeze(x).shape)
#Output:
(5,)
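Passing axis restricts the operation to the chosen length-one axes, and selecting an axis whose length is not one raises an error. A small sketch of both cases, reusing the array from the example above:

```python
import numpy as np

x = np.array([[[2], [3], [5], [7], [11]]])  # shape (1, 5, 1)

first = np.squeeze(x, axis=0)  # drop only the leading axis
last = np.squeeze(x, axis=2)   # drop only the trailing axis
print(first.shape)  # (5, 1)
print(last.shape)   # (1, 5)

try:
    np.squeeze(x, axis=1)      # axis 1 has length 5, so it cannot be squeezed
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```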

Joining Arrays

Joining means putting the contents of two or more arrays in a single array. The NumPy library provides several functions for joining arrays.

  • concatenate()
  • block()
  • stack()
  • hstack()
  • vstack()

concatenate():

Concatenation means joining. The NumPy concatenate() function joins two or more arrays along an existing axis; the arrays must have the same shape except along that axis. The concatenate() function returns the concatenated ndarray.

Syntax:

concatenate((a1, a2, ...), axis=0, out=None, dtype=None, casting="same_kind")

The parameters a1, a2, … are the input arrays (they must have the same shape, except in the dimension corresponding to the axis). The axis parameter is the axis along which the arrays are joined; the default is 0, and if axis is None, the arrays are flattened before use. The out parameter is the destination in which to place the result.

The dtype parameter is the data type of the resultant array, and the default is None. The casting parameter controls what kind of data casting may occur and defaults to ‘same_kind’.

a = np.array([[1, 2, 3], [4, 5, 6],[7, 8, 9]])
b = np.array([[10, 13, 16]])

print(np.concatenate((a, b), axis = 0))
#Output:
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 13 16]]

print(np.concatenate((a, b.T), axis = 1))
#Output:
[[ 1  2  3 10]
 [ 4  5  6 13]
 [ 7  8  9 16]]

print(np.concatenate((a, b), axis = None))
#Output:
[ 1  2  3  4  5  6  7  8  9 10 13 16]

block():

NumPy block() assembles an ndarray from nested lists of blocks. Blocks can be arrays, nested lists of arrays, or scalars, but not tuples. Blocks can be of any dimension, but they are not broadcast using the normal rules. Instead, leading axes of size 1 are inserted to give every block the same number of dimensions.

Syntax:

block(arrays)

The arrays parameter is an array, a scalar, or a nested list of arrays and scalars.

print(np.block([[np.ones((2, 2)), np.zeros((2, 3))],
                [np.zeros((3, 2)), np.ones((3, 3))]]))
#Output:
[[1. 1. 0. 0. 0.]
 [1. 1. 0. 0. 0.]
 [0. 0. 1. 1. 1.]
 [0. 0. 1. 1. 1.]
 [0. 0. 1. 1. 1.]]

stack():

The NumPy stack() function joins the sequence of arrays along a new axis. The stack() function returns a stacked ndarray(all contents of input arrays in one array) that has one more dimension than the input arrays.

Syntax:

stack(arrays, axis=0, out=None)

The arrays parameter is the sequence of input arrays (each array must have the same shape). The axis parameter is the index of the new axis in the dimensions of the result: with axis = 0 (the default) it is the first dimension, and with axis = -1 it is the last dimension. The out parameter is the destination in which to place the new ndarray; it is optional and defaults to None.

a = np.array([2, 3, 5])
b = np.array([7, 11, 13])

print(np.stack((a, b), axis = 0))
#Output:
[[ 2  3  5]
 [ 7 11 13]]

print(np.stack((a, b), axis = -1))
#Output:
[[ 2  7]
 [ 3 11]
 [ 5 13]]
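The difference between stack() and concatenate() is easiest to see in the result shapes: stack() adds a dimension, concatenate() extends an existing one. A minimal sketch using the same two arrays:

```python
import numpy as np

a = np.array([2, 3, 5])
b = np.array([7, 11, 13])

stacked = np.stack((a, b))       # new leading axis
joined = np.concatenate((a, b))  # existing axis grows

print(stacked.shape)  # (2, 3)
print(joined.shape)   # (6,)

# For 1-D inputs, axis=1 and axis=-1 both place the new axis last
print(np.stack((a, b), axis=1).shape)  # (3, 2)
```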

hstack():

The hstack() function of NumPy is used to join arrays horizontally (column-wise). It is equivalent to concatenate() along the second axis, except for one-dimensional arrays, which are concatenated along the first axis. This function is most suitable for arrays with up to 3 dimensions.

Syntax:

hstack(tup)

The tup parameter is a sequence of ndarrays(arrays must have the same shape along all but the second axis, except 1D arrays which can be any length).

print(np.hstack((arr, np.zeros((2, 3, 3)), np.ones((2, 3, 3)))))
#Output:
[[[ 1.  2.  3.]
  [ 4.  5.  6.]
  [ 7.  8.  9.]
  [ 0.  0.  0.]
  [ 0.  0.  0.]
  [ 0.  0.  0.]
  [ 1.  1.  1.]
  [ 1.  1.  1.]
  [ 1.  1.  1.]]

 [[10. 11. 12.]
  [13. 14. 15.]
  [16. 17. 18.]
  [ 0.  0.  0.]
  [ 0.  0.  0.]
  [ 0.  0.  0.]
  [ 1.  1.  1.]
  [ 1.  1.  1.]
  [ 1.  1.  1.]]]

vstack():

The NumPy vstack() function is used to join arrays vertically (row-wise). It is equivalent to concatenate() along the first axis after 1-D arrays of shape (N,) have been reshaped to (1, N). Like hstack(), it is most suitable for arrays with up to 3 dimensions.

Syntax:

vstack(tup)

The parameter of the vstack() function is the same as the hstack() function.

print(np.vstack((arr, np.zeros((2, 3, 3)), np.ones((2, 3, 3)))))
#Output:
[[[ 1.  2.  3.]
  [ 4.  5.  6.]
  [ 7.  8.  9.]]

 [[10. 11. 12.]
  [13. 14. 15.]
  [16. 17. 18.]]

 [[ 0.  0.  0.]
  [ 0.  0.  0.]
  [ 0.  0.  0.]]

 [[ 0.  0.  0.]
  [ 0.  0.  0.]
  [ 0.  0.  0.]]

 [[ 1.  1.  1.]
  [ 1.  1.  1.]
  [ 1.  1.  1.]]

 [[ 1.  1.  1.]
  [ 1.  1.  1.]
  [ 1.  1.  1.]]]
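With 1-D and 2-D inputs the two functions are easier to tell apart than with the 3-D example. The sketch below uses small hypothetical arrays v, w and m and checks the equivalence with concatenate() for the 2-D case.

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([4, 5, 6])

print(np.hstack((v, w)).shape)  # (6,)   1-D arrays are joined end to end
print(np.vstack((v, w)).shape)  # (2, 3) 1-D arrays become rows

m = np.ones((2, 3))
side_by_side = np.hstack((m, m))
on_top = np.vstack((m, m))
print(side_by_side.shape)  # (2, 6)
print(on_top.shape)        # (4, 3)

# For 2-D inputs these match concatenate along axis 1 and axis 0
print(np.array_equal(side_by_side, np.concatenate((m, m), axis=1)))  # True
print(np.array_equal(on_top, np.concatenate((m, m), axis=0)))        # True
```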

Splitting Arrays

Splitting is the reverse of joining. Whereas joining merges the contents of two or more arrays into one, splitting breaks one array into multiple arrays. NumPy provides the following functions for splitting:

  • split()
  • array_split()
  • hsplit()
  • vsplit()

split():

The split() function is used to divide the array into subarrays along a specified axis. The split() function returns a list of ndarrays.

Syntax:

split(ary, indices_or_sections, axis=0)

The ary parameter is the ndarray to be divided into sub-arrays, and axis is the axis along which the split is made (default is 0). If indices_or_sections is an integer N, the array is divided into N equal sub-arrays along the axis; an error is raised if such a split is not possible.

If indices_or_sections is a 1-D array of sorted integers, the entries indicate where along the axis the array is split. For example, indices_or_sections = [4, 5] with axis = 0 results in ary[ : 4], ary[4 : 5] and ary[5 : ].

a = np.arange(15)

print(np.split(a, 5))
#Output:
[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10, 11]), array([12, 13, 14])]

print(np.split(a, [5, 9]))
#Output:
[array([0, 1, 2, 3, 4]), array([5, 6, 7, 8]), array([ 9, 10, 11, 12, 13, 14])]

array_split():

The array_split() function of NumPy splits an array into multiple sub-arrays. The only difference between split() and array_split() is that array_split() allows indices_or_sections to be an integer that does not equally divide the axis.

Syntax:

array_split(ary, indices_or_sections, axis=0)

The parameters of the array_split() function are the same as those of the split() function.

print(np.array_split(a, 4))
#Output:
[array([0, 1, 2, 3]), array([4, 5, 6, 7]), array([ 8,  9, 10, 11]), array([12, 13, 14])]

print(np.array_split(a, 2))
#Output:
[array([0, 1, 2, 3, 4, 5, 6, 7]), array([ 8,  9, 10, 11, 12, 13, 14])]
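The contrast with split() can be sketched as follows: asking for 4 sections of a 15-element array fails with split() but succeeds with array_split(), which makes the leading sub-arrays one element longer. The names parts and sizes are used for this example only.

```python
import numpy as np

a = np.arange(15)

try:
    np.split(a, 4)            # 15 is not divisible by 4
    split_failed = False
except ValueError as exc:
    split_failed = True
    print("split failed:", exc)

parts = np.array_split(a, 4)  # uneven sections are allowed
sizes = [len(p) for p in parts]
print(sizes)  # [4, 4, 4, 3]
```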

hsplit():

The NumPy hsplit() function is used to split an array into multiple sub-arrays horizontally (column-wise). The hsplit() function is equivalent to split() with axis = 1: the array is always split along the second axis, except for 1-D arrays, which are split along axis = 0.

Syntax:

hsplit(ary, indices_or_sections)

The parameter of hsplit() function is the same as the split() function.

a = np.arange(24).reshape(6,4)

print(np.hsplit(a, 2))
#Output:
[array([[ 0,  1],
       [ 4,  5],
       [ 8,  9],
       [12, 13],
       [16, 17],
       [20, 21]]), array([[ 2,  3],
       [ 6,  7],
       [10, 11],
       [14, 15],
       [18, 19],
       [22, 23]])]

vsplit():

The NumPy vsplit() function is used to split an array into multiple sub-arrays vertically (row-wise). The vsplit() function is equivalent to split() with axis = 0. It is similar to hsplit(); the only difference is that vsplit() always splits the array along the first axis.

Syntax:

vsplit(ary, indices_or_sections)

The parameters of the vsplit() function are the same as those of the split() function.

print(np.vsplit(a, 3))
#Output:
[array([[0, 1, 2, 3],
       [4, 5, 6, 7]]), array([[ 8,  9, 10, 11],
       [12, 13, 14, 15]]), array([[16, 17, 18, 19],
       [20, 21, 22, 23]])]
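Since splitting reverses joining, the pieces returned by vsplit() and hsplit() can be put back together with vstack() and hstack(). A minimal round-trip sketch on the same 6 x 4 array:

```python
import numpy as np

a = np.arange(24).reshape(6, 4)

rows = np.vsplit(a, 3)   # three blocks of shape (2, 4)
cols = np.hsplit(a, 2)   # two blocks of shape (6, 2)

rebuilt_rows = np.vstack(rows)
rebuilt_cols = np.hstack(cols)

print(np.array_equal(rebuilt_rows, a))  # True
print(np.array_equal(rebuilt_cols, a))  # True
```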

Adding and Removing Array elements

The NumPy package provides functions to add and remove ndarray elements.

  • append()
  • delete()
  • insert()
  • resize()
  • trim_zeros()

append():

The append() function of NumPy appends values to the end of an array. It returns a new ndarray with the values added; the input array is not modified. When an axis is given, arr and values must have the same number of dimensions.

Syntax:

append(arr, values, axis=None)

The arr parameter is the input array (values are appended to a copy of this array), and the values parameter holds the values to be appended to arr. The axis parameter is the axis along which the values are appended. If axis is not given, both arr and values are flattened to one dimension before the append operation.

x=np.arange(4)

print(np.append(x,[6, 8, 10, 12]))
#Output:
[ 0  1  2  3  6  8 10 12]

print(np.append(x.reshape(2,2), [[4, 5],[6, 7]], axis = 0))
#Output:
[[0 1]
 [2 3]
 [4 5]
 [6 7]]
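Two points about append() are worth checking in code: the input array is left unchanged, and with an axis the values must have the same number of dimensions as arr. The following sketch illustrates both; the names extended, m and ok are chosen for this example.

```python
import numpy as np

x = np.arange(4)
extended = np.append(x, [6, 8])
print(extended)  # [0 1 2 3 6 8]
print(x)         # [0 1 2 3], the original is unchanged

m = x.reshape(2, 2)
try:
    np.append(m, [4, 5], axis=0)   # 1-D values against a 2-D array
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True

ok = np.append(m, [[4, 5]], axis=0)  # values given as a 2-D row
print(ok.shape)  # (3, 2)
```

Because every call copies the data, building a large array by appending in a loop is slow; collecting the pieces in a list and joining them once is usually preferable.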

delete():

The delete() function of NumPy removes the specified sub-arrays from the input array. It returns a new array with the selected elements deleted along an axis.

Syntax:

delete(arr, obj, axis=None)

The arr parameter is the input array and obj gives the indices of the sub-arrays to be removed (along an axis). The axis parameter is the axis along which the sub-arrays are deleted; if axis is None (the default), obj is applied to the flattened array.

a = np.arange(10, 19)

print(np.delete(a, (0, 2, 4)))
#Output:
[11 13 15 16 17 18]

print(np.delete(a.reshape(3, 3), 1, axis = 1))
#Output:
[[10 12]
 [13 15]
 [16 18]]
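The obj argument can also be a slice, and for value-based removal a boolean mask is a common alternative to delete(). A small sketch of both, using np.s_ to build the slice:

```python
import numpy as np

a = np.arange(10, 19)  # [10 11 12 13 14 15 16 17 18]

# Remove every second element, starting with index 0
without_even_idx = np.delete(a, np.s_[::2])
print(without_even_idx)  # [11 13 15 17]

# Boolean mask: keep only the odd values (same result for this array)
odd_values = a[a % 2 == 1]
print(odd_values)  # [11 13 15 17]
```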

insert():

The NumPy insert() function adds elements before the given indices. The insert() function returns a new ndarray with inserted elements along a given axis.

Syntax:

insert(arr, obj, values, axis=None)

The arr parameter is the input array, obj is the index or indices before which the new values are inserted, values are the values to be inserted, and axis is the axis along which they are inserted. If axis is None, arr is flattened first.

x = np.arange(9)

print(np.insert(x.reshape(3,3), 2, [10, 15, 20], axis = 1))
#Output:
[[ 0  1 10  2]
 [ 3  4 15  5]
 [ 6  7 20  8]]
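Two further cases, sketched under the same setup: without an axis the array is flattened before insertion, and when obj is a list, each index refers to a position in the original array.

```python
import numpy as np

m = np.arange(9).reshape(3, 3)

# No axis: m is flattened, then 99 is inserted before index 1
flat_insert = np.insert(m, 1, 99)
print(flat_insert)  # [ 0 99  1  2  3  4  5  6  7  8]

# Several indices: 10 goes before original index 1, 30 before original index 3
multi_insert = np.insert(np.arange(5), [1, 3], [10, 30])
print(multi_insert)  # [ 0 10  1  2 30  3  4]
```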

resize():

The resize() function of NumPy returns a new array with the specified shape. If the new array is larger than the original array(shape-wise), then the new array is filled with repeated copies of the original array elements.

The resize() function differs from the ndarray.resize() method: ndarray.resize() modifies the array in place and fills the new shape with zeros instead of repeated copies of the array elements.

Syntax:

resize(a, new_shape)

The a parameter is the input array and new_shape is the new shape of the array.

x = np.array([[2, 3], [5, 7], [11, 13]])

print(np.resize(x, (4, 3)))
#Output:
[[ 2  3  5]
 [ 7 11 13]
 [ 2  3  5]
 [ 7 11 13]]
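The difference between the two resize variants can be sketched side by side. The refcheck=False argument is used here so that the in-place call also works in interactive sessions that hold extra references to the array; the variable names are chosen for this example.

```python
import numpy as np

x = np.array([[2, 3], [5, 7]])
repeated = np.resize(x, (2, 3))   # new array, elements repeat cyclically
print(repeated)
# [[2 3 5]
#  [7 2 3]]

y = np.array([[2, 3], [5, 7]])
y.resize((2, 3), refcheck=False)  # in place, missing entries become 0
print(y)
# [[2 3 5]
#  [7 0 0]]
```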

trim_zeros():

The NumPy trim_zeros() function removes the leading and/or trailing zeros from a 1-D array sequence. The trim_zeros() returns a trimmed array with the datatype the same as the input array.

Syntax:

trim_zeros(filt, trim='fb')

The filt parameter is the input 1-D array. The trim parameter is a string: 'f' trims zeros from the front and 'b' trims them from the back. The default, 'fb', trims zeros from both ends of the array.

a = np.array([0, 0, 1, 0, 3, 0, 4, 5, 0, 0, 0 ])

print(np.trim_zeros(a))
#Output:
[1 0 3 0 4 5]
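The one-sided options can be sketched on the same array. Zeros in the interior are never removed, whichever option is used.

```python
import numpy as np

a = np.array([0, 0, 1, 0, 3, 0, 4, 5, 0, 0, 0])

front_only = np.trim_zeros(a, trim='f')
back_only = np.trim_zeros(a, trim='b')

print(front_only)  # [1 0 3 0 4 5 0 0 0]
print(back_only)   # [0 0 1 0 3 0 4 5]
```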