
Getting Started with NumPy

NumPy (Numerical Python) is a Python library used primarily for scientific computing. It provides fast operations on arrays, including multidimensional arrays (for example, matrices), along with functions for shape manipulation, sorting, logical operations, and many other mathematical operations.

The predecessor of NumPy, called Numeric, was developed by Jim Hugunin. Another package, Numarray, was also developed, with some additional routines and functionality.

In 2005, Travis Oliphant created the NumPy package by incorporating the features of Numarray into Numeric. Since NumPy is an open-source project, many other contributors have worked on it as well.

Why is NumPy required?

An array is a collection of elements that all have the same data type, whereas a list is a collection of items that can have different data types. Arrays are a basic yet important data structure, used across many major domains of computer science.
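The difference between a list and an array can be sketched as follows. This is a minimal illustration; the variable names are made up for the example, and the exact integer type NumPy picks depends on the platform.

```python
import numpy as np

# A Python list can mix data types freely.
mixed_list = [1, 2.5, "three"]

# A NumPy array holds a single data type for all of its elements.
int_array = np.array([1, 2, 3])
float_array = np.array([1, 2.5, 3])  # the integers are upcast to float

print(int_array.dtype)    # an integer type (platform dependent)
print(float_array.dtype)  # float64
```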

  • The main reason NumPy came into the picture is to provide array functionality, along with a wide variety of routines that support array creation, operations, manipulation, and processing.
  • The NumPy package is designed to be highly efficient computationally. It serves as the foundation for many other packages, such as SciPy and Pandas.
  • A NumPy array can handle a large amount of data with optimized memory use and fast processing, and NumPy runs on a wide range of CPUs.
  • In a NumPy array, every element occupies a fixed-size block of memory (measured in bytes), whereas the elements of a Python list do not share a uniform size. A Python list stores references to objects that are allocated dynamically and may be scattered across memory.

Accessing NumPy array elements takes very little time because the elements are stored sequentially in memory. This is a core reason why operations on NumPy arrays are much faster than equivalent operations on Python lists.
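The fixed element size and sequential layout described above can be inspected directly. The sketch below assumes an explicit 64-bit integer type so that the byte counts are the same on every platform; the variable name is illustrative.

```python
import numpy as np

# Request 64-bit integers explicitly so each element takes 8 bytes.
values = np.array([100, 20, 350, 49], dtype=np.int64)

print(values.itemsize)               # 8: bytes per element
print(values.nbytes)                 # 32: 4 elements * 8 bytes
print(values.strides)                # (8,): bytes to step to the next element
print(values.flags["C_CONTIGUOUS"])  # True: stored in one contiguous block
```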

Many kinds of operations can be performed with the NumPy library. Its routines cover mathematical operations, logical operations, linear algebra, Fourier transforms, and more.
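As a rough sketch of these categories, the snippet below uses one routine from each. The input values are arbitrary and chosen only so the results are easy to verify by hand.

```python
import numpy as np

a = np.array([[2.0, 0.0], [0.0, 4.0]])
v = np.array([1.0, 2.0])

# Mathematical operation: element-wise square root.
roots = np.sqrt(np.array([1.0, 4.0, 9.0]))  # [1. 2. 3.]

# Logical operation: element-wise AND of two conditions.
mask = np.logical_and(v > 0, v < 2)         # [ True False]

# Linear algebra: matrix-vector product and matrix inverse.
product = a @ v                             # [2. 8.]
inverse = np.linalg.inv(a)                  # [[0.5, 0.], [0., 0.25]]

# Fourier transform of a unit impulse: every coefficient equals 1.
spectrum = np.fft.fft(np.array([1.0, 0.0, 0.0, 0.0]))

print(roots, mask, product, inverse, spectrum, sep="\n")
```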

How does NumPy work?

The NumPy library is written partly in Python, but most of it is written in C/C++. Because Python is an interpreted language, routines that need to run quickly do not perform as well when written in pure Python as they do when written in a compiled language such as C or C++.

Modules and submodules that require fast numeric computation are therefore written in C/C++.

Instead of explicit loops and element-by-element indexing, NumPy applies vectorization, which makes code faster to execute and often more memory-efficient as well.

Vectorized code has several advantages: it is more readable and takes fewer lines, which in turn makes it easier to maintain and leaves less room for bugs.

Vectorized operations are fast because they run as precompiled C code. Vectorized code also closely resembles standard mathematical notation, which makes mathematical functions easier to write, understand, and execute.
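To make the idea of vectorization concrete, here is a small sketch that computes the same result twice, once with an explicit Python loop and once with a single array expression. The data and variable names are invented for the illustration.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]
quantities = [1, 2, 3]

# Loop version: multiply the pairs one at a time in Python.
totals_loop = []
for price, quantity in zip(prices, quantities):
    totals_loop.append(price * quantity)

# Vectorized version: one expression, executed element-wise by NumPy.
prices_arr = np.array(prices)
quantities_arr = np.array(quantities)
totals_vec = prices_arr * quantities_arr

print(totals_loop)  # [10.0, 40.0, 90.0]
print(totals_vec)   # [10. 40. 90.]
```

The vectorized expression reads like the formula it implements (total = price × quantity), and the element-wise work happens in compiled code rather than in the Python interpreter.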

When to use NumPy?

  • NumPy comes in handy when we have to compute over a lot of data values all at once for analysis.
  • It also works effortlessly with vectors and matrices.
  • It can be used to load large sets of numerical values directly into Python code. The data can come from images or video buffers, and NumPy works effectively with large libraries such as TensorFlow and OpenCV that are used for image processing.
  • It can be used to generate random numerical test cases in bulk, and it provides a range of functions for statistical distributions and calculations.
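The last point can be sketched with NumPy's random number generator. The seed, distribution parameters, and array sizes below are arbitrary choices made for this illustration.

```python
import numpy as np

# A seeded generator makes the random test data reproducible.
rng = np.random.default_rng(seed=42)

# Draw 10,000 samples from a normal distribution (mean 50, std. dev. 5).
samples = rng.normal(loc=50.0, scale=5.0, size=10_000)

# Summary statistics are computed over the whole array at once.
print(samples.shape)   # (10000,)
print(samples.mean())  # close to 50
print(samples.std())   # close to 5

# A 3x4 matrix of random integers from 0 to 9 (the upper bound is exclusive).
matrix = rng.integers(low=0, high=10, size=(3, 4))
print(matrix.shape)    # (3, 4)
```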

Environment Setup

The only prerequisite for NumPy is Python itself. NumPy is not part of the Python standard library, so it has to be installed separately. pip, the package installer that ships with most Python distributions, can be used to install it:

pip install numpy

The pip command can be run in Command Prompt (Windows users) or in a terminal (Mac and Linux users).

If you are using the Anaconda distribution, NumPy comes preinstalled. To upgrade to the latest version of NumPy, run the following command (it can be used by conda users as well):

pip install numpy --upgrade

If the installation command ran without errors, NumPy is installed and can be imported in code as:

import numpy as np

Here, import is the keyword used to import Python modules, libraries, and packages, numpy is the library name, and as np defines np as a short alias that can be written in place of numpy. The as np part is optional, but np is the conventional alias and keeps code concise.
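A quick way to see that the alias is only another name for the same module is the short check below, given purely as an illustration.

```python
import numpy        # plain import: the module is available as "numpy"
import numpy as np  # aliased import: the same module, also available as "np"

print(np is numpy)  # True: both names refer to one module object
```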

If NumPy is not installed properly, looking it up with pip or executing the import statement produces a message like the ones shown here.

Using the command line:

C:\Users\user>pip show numpy

Output:

WARNING: Package(s) not found: numpy

Using Python IDLE (comes with Python Distribution):

>>> import numpy as np

Output:

Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    import numpy as np
ModuleNotFoundError: No module named 'numpy'

Using Jupyter Notebook:

import numpy as np

Output:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-0aa0b027fcb6> in <module>
----> 1 import numpy as np

ModuleNotFoundError: No module named 'numpy'

If you see any of these messages, try running the installation command again. If the same issue occurs, check that Python itself was installed without errors and, on Windows, that it has been added correctly to your system's environment variables.
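The same check can also be done from within Python before importing anything. The sketch below uses the standard library's importlib.util.find_spec; the numpy_available flag and the printed messages are illustrative choices, not part of NumPy.

```python
import importlib.util

# find_spec returns None when the named package cannot be located.
spec = importlib.util.find_spec("numpy")

if spec is None:
    numpy_available = False
    print("NumPy is not installed; try: pip install numpy")
else:
    import numpy as np
    numpy_available = True
    print("NumPy version:", np.__version__)
```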

A quick example of NumPy

In this short example, we will see how to create a NumPy array, known as an ndarray (short for n-dimensional array), and look at a few attributes of NumPy arrays. Do not panic if these terms are unfamiliar; we will go through each of them in depth in the coming tutorial.

# import numpy into our code
import numpy as np
# create a Python list
py_list = [100, 20, 350, 49]
# create a NumPy array (ndarray)
# using array(), passing the Python list as the argument
numpy_list = np.array(py_list)
# print the NumPy array
print("numpy list :", numpy_list)

Output:

numpy list : [100  20 350  49]

In the first line of the code, we import the NumPy library as np, which means that instead of writing numpy we use np as its alias. We then create an ordinary Python list named py_list and initialize it with some integer values.

After that, we call the array() function from NumPy (np.array()). It takes the Python list py_list as an argument (input parameter) and returns a NumPy array.

The array() function returns an N-dimensional array (ndarray). The returned array is stored in a variable named numpy_list, which we print at the end of the code.
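A few attributes of the resulting ndarray can be inspected as sketched below. The one-dimensional array repeats the values from the example above, and the two-dimensional matrix is an extra array added only to show how the attributes change.

```python
import numpy as np

numpy_list = np.array([100, 20, 350, 49])

print(type(numpy_list))  # <class 'numpy.ndarray'>
print(numpy_list.ndim)   # 1: number of dimensions
print(numpy_list.shape)  # (4,): length along each dimension
print(numpy_list.size)   # 4: total number of elements

# A nested list produces a two-dimensional array.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.ndim)       # 2
print(matrix.shape)      # (2, 3)
```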