Making Sense of Numpy Arrays I

Hello and welcome

One key reason Python is the number one language for data scientists and machine learning engineers is the number of libraries it has. Unlike the core libraries that come packaged with Python, numpy is a third-party package that has to be installed from the command line (with the help of pip). However, if you have the Anaconda distribution installed, you have no worries: numpy is already included.

This post gives you all you need to get started with numpy arrays. Numpy is the first library one needs to understand when studying machine learning or data science with Python. It is fast and has a uniform API.

Numpy is a Python library designed for scientific and engineering computation. It's the go-to library in Python for array and tensor manipulation.

This is the first in a series of blog posts on the numpy library. The series is intended to give you a thorough explanation of the library and its capabilities. In this post, I will cover the following:

  • Concept behind Numpy and vectorization
  • Advantages of numpy arrays over Python lists and tuples
  • Creating numpy arrays from lists and tuples
  • Numpy array attributes/ properties
  • Basic arithmetic operations on numpy arrays

Concept behind Numpy and Vectorisation

A numpy array, also called an ndarray, is a collection of elements that all share one data type. Python lists and tuples, by contrast, can hold elements of different types. This homogeneity is one of the advantages numpy arrays have over lists and tuples: since every element has the same type, no effort is spent determining the type of each element at run time, as happens for a list, and that saves time. Vectorization builds on this: an operation is applied to the whole array at once instead of element by element in an explicit Python loop.
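
To make the idea concrete, here is a minimal sketch that performs the same element-wise addition twice, once with an explicit Python loop and once as a single whole-array expression. The names values, shifted_loop and shifted_vectorized, and the sample numbers, are made up for illustration.

```python
import numpy as np

values = [1.5, 2.0, 3.25, 4.0]  # hypothetical sample data

# Pure Python: visit each element one at a time.
shifted_loop = [v + 1 for v in values]

# Numpy: one expression applied to the whole array at once.
arr = np.array(values)
shifted_vectorized = arr + 1

print(shifted_loop)
print(shifted_vectorized)
```

Both versions compute the same numbers; the numpy version simply leaves the looping to the library.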

Creating Numpy arrays from List and Tuples

To work with the numpy library, it first has to be imported. Like any other Python library, it can be imported in different ways; however, the community convention is to import it with the alias np, as shown in the listing below.

    import numpy as np

Now, to convert a list or tuple to a numpy array, we use the np.array() function and pass the list or tuple as the argument, as shown in the listing below.

    import numpy as np
    x=[2,3,4,1] # This is a list
    y=(7,5,8,4.1) # This is a tuple
    z=[2,3,1.2,"3"] # This list mixes numbers and a string
    x_np=np.array(x)
    y_np=np.array(y)
    z_np=np.array(z)
    print(x_np)
    print(y_np)
    print(z_np)

# Below is the output of the program

'''
[2 3 4 1]
[7.  5.  8.  4.1]
['2' '3' '1.2' '3']
'''

Notice the data types of y_np and z_np. They are float and string respectively (simplifying a little). This is because a numpy array is meant to be a homogeneous collection, so numpy looks for a single type that can hold every element and implicitly upcasts the rest. In a collection of ints and floats, everything is cast to float, while in a collection of ints, floats and strings, everything is cast to string.
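
One way to confirm this upcasting is to inspect the dtype attribute (covered later in this post) directly. The sketch below is only an illustration; the variable names are made up, and the final astype call is one possible way to get numbers back out of an array of numeric strings.

```python
import numpy as np

ints_and_floats = np.array([2, 3, 1.2])
with_string = np.array([2, 3, 1.2, "3"])

print(ints_and_floats.dtype)   # float64
print(with_string.dtype.kind)  # U (a unicode string type)

# If every string holds a valid number, the array can be converted back.
recovered = with_string.astype(float)
print(recovered.dtype)         # float64
```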

Numpy arrays can also be created from nested lists to give 2-dimensional and higher-dimensional arrays, as shown in the listing below.

    p2=np.array([[3,1,2],[9,8,1],[1,4,3]])
    print(p2)

# The result is
'''
[[3 1 2]
 [9 8 1]
 [1 4 3]]
'''

Numpy Attributes

A numpy array is an object, hence it has the two kinds of members every object has: attributes and methods. In this section I will explain the most common attributes of the numpy array.

1. ndim

ndim is an attribute that holds the number of dimensions of a numpy array. It is an integer value, as shown below.

    x_np=np.array([2,3,4,1])
    p2=np.array([[3,1,2],[9,8,1],[1,4,3]])
    print(x_np.ndim)
    print(p2.ndim)

# The output is 
"""
1
2
"""

2. shape

shape gives the number of elements along each axis of the array, as a tuple, as shown below (x_np and p2 are the arrays defined in the previous listing).

    p3=np.array([[4,5],[5,2],[5,6]])
    print(x_np.shape)
    print(p2.shape)
    print(p3.shape)

#The output:
"""
(4,)
(3, 3)
(3, 2)
"""

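As a small extension of the listings above, the sketch below builds a 3-dimensional array from nested lists and checks that the length of shape matches ndim. The array name cube and its values are made up for illustration.

```python
import numpy as np

# Hypothetical 3-dimensional array: 2 blocks, each with 3 rows and 4 columns
cube = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]],
])

print(cube.ndim)        # 3
print(cube.shape)       # (2, 3, 4)
print(len(cube.shape))  # 3, the same as ndim
```
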
3. dtype

dtype gives the data type of the elements in the array. Numpy has many data types, including int8, int16, int32, int64, float32, float64, etc.

At the simplest level, just know that we have int and float, because our interest is in the manipulation of numbers. The default integer type depends on your platform and numpy version, so p2.dtype below may print int32 or int64.

    print(p2.dtype)
    print(y_np.dtype)

# The output:
"""
int32
float64
"""

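If a particular type is wanted, it can be requested instead of inferred. The sketch below is a minimal illustration using the dtype argument of np.array and the astype method; the variable names are made up.

```python
import numpy as np

# Ask for 8-bit integers explicitly instead of the platform default.
small_ints = np.array([3, 1, 2], dtype=np.int8)

# astype returns a new array with the requested type.
as_floats = small_ints.astype(np.float64)

print(small_ints.dtype)     # int8
print(as_floats.dtype)      # float64
print(small_ints.itemsize)  # 1 (bytes per element)
print(as_floats.itemsize)   # 8 (bytes per element)
```
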
4. size

size gives the total number of elements an array holds. In the listing below, notice that p3 is of shape (3,2), hence holds 3x2=6 items, while p2 is of shape (3,3), hence contains 9 items.

    print(p2.size)
    print(p3.size)

#The output:
"""
9
6
"""

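The relationship between shape and size can be checked directly. This sketch recreates p3 from the earlier listing and multiplies the entries of its shape with math.prod from the standard library (Python 3.8 or later); product_of_shape is a made-up name.

```python
import math
import numpy as np

p3 = np.array([[4, 5], [5, 2], [5, 6]])

# size should equal the product of the entries in shape
product_of_shape = math.prod(p3.shape)

print(p3.shape)          # (3, 2)
print(product_of_shape)  # 6
print(p3.size)           # 6
print(len(p3))           # 3, len() only counts along the first axis
```
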
Basic Array Operations

Arithmetic operations are where numpy shows its true power. Numpy lets you express, in a single statement, operations that would otherwise require loops and branches. In general, numpy helps eliminate explicit loops and if statements.

The basic arithmetic operations that can be performed on arrays include addition, subtraction, multiplication, division and exponentiation. This post covers the first two.

1. Addition

Given the lists a=[3,5,1,0.0] and b=[2,3,-1,1], if we try to add them with the addition operator, Python concatenates the lists instead of adding them element by element, as shown below. Numpy, however, makes the intended operation possible.

    a=[3,5,1,0.0]
    b=[2,3,-1,1]
    print(a+b)

# The output:
"""
[3, 5, 1, 0.0, 2, 3, -1, 1]
"""

However, when the lists are converted to numpy arrays, the result becomes

    a=np.array([3,5,1,0.0])
    b=np.array([2,3,-1,1])
    print(a+b)

# The output

"""
[5. 8. 0. 1.]
"""

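For comparison, here is roughly what element-wise addition of two plain lists takes without numpy. This is only a sketch; summed_list and summed_array are made-up names, and zip is one of several ways to pair up the elements.

```python
import numpy as np

a = [3, 5, 1, 0.0]
b = [2, 3, -1, 1]

# Plain lists need an explicit loop (here, a list comprehension over pairs).
summed_list = [x + y for x, y in zip(a, b)]

# With numpy arrays the + operator already works element by element.
summed_array = np.array(a) + np.array(b)

print(summed_list)   # [5, 8, 0, 1.0]
print(summed_array)  # [5. 8. 0. 1.]
```
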
A scalar can also be added to each element of the array (this behavior is called broadcasting, to be discussed in the next post), as shown below.

    print(8+a) # This adds 8 to each element of "a"

# the output:
"""
[11. 13.  9.  8.]
"""

2. Subtraction

The behavior seen in addition also applies to subtraction.

    print(a-b) # Subtracts each element of b from the element in the corresponding position in a

# The output is:
"""
[ 1.  2.  2. -1.]
"""

    print(a-10) # Subtracts 10 from each value in a

# The output is:
"""
[ -7.  -5.  -9. -10.]
"""

Conclusion

As can be seen, numpy makes working with arrays and collections a lot of fun.

In the next post, I shall consider other operations on arrays, including multiplication, division, broadcasting, and operations using universal functions. Do well to practice what we have covered in this post, and drop a comment if anything is confusing or needs more explanation.

Have fun!!
