NumPy Standard Deviation Function Explained (With Examples)

In this post, we are going to learn how to calculate the standard deviation with NumPy, using the NumPy standard deviation function, np.std().

If you want to learn by watching, I recommend checking out the NumPy Data Science Essential Training from LinkedIn Learning. 

Anyway, to follow this article, you will need basic knowledge of the Python programming language. If you need a little refresher on Python, then check out this guide for reference:

Before learning about the Numpy standard deviation function, you need to be familiar with Numpy arrays and how they work:

Last but not least, I recommend following this tutorial using a Jupyter notebook.

What is Standard Deviation?

I am already assuming that you have the basic statistical concepts such as mean, median, mode, and standard deviation under your belt. However, going over the definition would not hurt.


The standard deviation is a measure of how far the data points in a dataset are spread out from the mean. The more spread out the data points are, the greater the standard deviation.

Thus, if the data points lie further away from the mean value, your dataset has a higher standard deviation. If they cluster close to the mean, it has a lower one.

Here is the population standard deviation formula:

$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$
  • σ = population standard deviation
  • ∑ = sum of
  • X = each value
  • μ = population mean
  • N = number of values in the population

And here is the sample standard deviation formula:

$$s = \sqrt{\frac{\sum (X - \bar{x})^2}{n - 1}}$$
  • s = sample standard deviation
  • ∑ = sum of
  • X = each value
  • x̅ = sample mean
  • n = number of values in the sample
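
To make the two formulas concrete, here is a minimal sketch that computes both by hand in plain Python and then checks the population result against np.std(). The values list and the variable names are made up purely for illustration.

```python
import math

import numpy as np

# Hypothetical data, chosen only for illustration
values = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(values)
mean = sum(values) / n

# Sum of squared deviations from the mean
squared_deviations = sum((x - mean) ** 2 for x in values)

# The population formula divides by N, the sample formula by n - 1
population_std = math.sqrt(squared_deviations / n)
sample_std = math.sqrt(squared_deviations / (n - 1))

print(population_std)  # 2.0
print(sample_std)

# np.std() uses the population formula by default
print(np.isclose(population_std, np.std(values)))  # True
```

The sample result is always a little larger than the population result for the same data, because it divides by a smaller number.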

Going in-depth with standard deviation is beyond the scope of this article. As a result, you should check out the following resource from Khan Academy:

NumPy Standard Deviation Function

Before using the NumPy standard deviation function, let’s start by creating a simple list and sorting it:

myList = [10.5,12.2,4.2,5.5,4.2,7.7,9]
myList.sort()
print(myList)

Output:

[4.2, 4.2, 5.5, 7.7, 9, 10.5, 12.2]

Our code creates a list and then sorts it in ascending order. Sorting only makes the values easier to read. It has no effect on the standard deviation.

To find the standard deviation, we are going to use np.std(). We will pass our list as an argument to the function. Type the following:

import numpy as np
myStandardDeviation = np.std(myList)
print(myStandardDeviation)

Output:

2.9048868044966527

What do you think is happening here?

First, I imported NumPy as np. Then I used the np.std() function to find the standard deviation of our list and stored the result in the variable myStandardDeviation.

Parameters of np.std()

The Numpy standard deviation function accepts the following parameters:

  • a
  • axis
  • dtype
  • ddof
  • keepdims
  • out

a (required)

This parameter specifies the array of values for which you want to calculate the standard deviation. It is required. You can pass a NumPy array or any array-like object, such as a Python list or a nested list.

axis (optional)

The axis parameter defines the axis along which we compute our standard deviation.

I like to think of the axis as an imaginary line or direction that runs along with our array.

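A two-dimensional array has two axes: axis 0 runs vertically, down the rows, and axis 1 runs horizontally, across the columns.

When axis=0, we compute the standard deviation down each column, so we get one value per column.

Similarly, when axis=1, we compute the standard deviation across each row, so we get one value per row. If you do not pass an axis, np.std() computes a single value over the whole flattened array.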
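
A quick way to check which direction an axis runs is to look at the shape of the result. The sketch below uses a small made-up 2 x 3 array. The names data, std_per_column, and std_per_row are only illustrative.

```python
import numpy as np

# Hypothetical array with 2 rows and 3 columns
data = np.array([[1, 2, 3],
                 [10, 20, 30]])

# axis=0 collapses the rows: one standard deviation per column
std_per_column = np.std(data, axis=0)

# axis=1 collapses the columns: one standard deviation per row
std_per_row = np.std(data, axis=1)

print(std_per_column.shape)  # (3,)
print(std_per_row.shape)     # (2,)
print(std_per_column)
print(std_per_row)
```

The axis you pass is the one that disappears from the result, which is why axis=0 on a 2 x 3 array leaves three values.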

[Figure: visual representation of how NumPy axes work, from TutorialsAndYou]

dtype (optional)

You can also specify the data type for calculating the standard deviation.

For integer arrays, the default dtype is float64. For floating-point arrays, the default is the same dtype as the array.
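
As a small sketch of how dtype affects the result, the snippet below inspects the type of the value np.std() returns for a few made-up arrays. The array names are only illustrative.

```python
import numpy as np

# Hypothetical arrays holding the same values with different dtypes
int_array = np.array([1, 2, 3, 4], dtype=np.int64)
float32_array = np.array([1, 2, 3, 4], dtype=np.float32)

std_from_int = np.std(int_array)
std_from_float32 = np.std(float32_array)

# Ask for the calculation to be carried out in float64 explicitly
std_forced_float64 = np.std(float32_array, dtype=np.float64)

print(std_from_int.dtype)        # float64
print(std_from_float32.dtype)    # float32
print(std_forced_float64.dtype)  # float64
```

Passing dtype=np.float64 can be worth it for large float32 arrays, where the lower precision may otherwise make the result less accurate.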

ddof (optional)

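This parameter specifies the Delta Degrees of Freedom.

The divisor used in the calculation is N - ddof, where N is the number of values. By default, ddof is zero, which gives the population standard deviation. Setting ddof=1 divides by N - 1 instead, which gives the sample standard deviation.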
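
Here is a minimal sketch of the effect of ddof. The heights array is made-up data used only for illustration.

```python
import numpy as np

# Hypothetical measurements
heights = np.array([150, 160, 170, 180, 190])

# Default ddof=0: divide by N (population standard deviation)
population_std = np.std(heights)

# ddof=1: divide by N - 1 (sample standard deviation)
sample_std = np.std(heights, ddof=1)

print(population_std)
print(sample_std)
```

If you are used to tools that report the sample standard deviation by default, remember to pass ddof=1 to get matching numbers from np.std().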

keepdims (optional)

The keepdims parameter accepts a boolean value of either True or False.

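If you set keepdims=True, the axes that are reduced stay in the result as dimensions of size one. As a result, the output has the same number of dimensions as the input.

This is mainly useful when you want the result to stay two-dimensional so that it broadcasts correctly against the original array.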
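
The sketch below compares the result shapes with and without keepdims on a small made-up array, and then uses the kept dimension to divide each row by its own standard deviation. The variable names are only illustrative.

```python
import numpy as np

# Hypothetical 2 x 3 array
data = np.array([[1, 2, 3],
                 [4, 5, 6]])

reduced = np.std(data, axis=1)               # the row axis result is 1-D
kept = np.std(data, axis=1, keepdims=True)   # the result stays 2-D

print(reduced.shape)  # (2,)
print(kept.shape)     # (2, 1)

# Because kept has shape (2, 1), it broadcasts against data row by row
scaled = data / kept
print(scaled.shape)   # (2, 3)
```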

out (optional)

The out parameter allows you to specify an alternate array in which to place the result.

Instead of returning a newly created array, np.std() writes the result into the existing array that you pass as out.

One thing to keep in mind is that this array must have the same shape as the result that the function would otherwise return. NumPy casts the type of the calculated values if necessary.
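
As a rough sketch of how out can be used, the snippet below preallocates a result array and lets np.std() fill it. The names data and column_std are only illustrative.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# Preallocate an array whose shape matches the expected result:
# one standard deviation per column, so three values
column_std = np.empty(3, dtype=np.float64)

# np.std() writes into column_std and returns a reference to it
returned = np.std(data, axis=0, out=column_std)

print(column_std)
print(returned is column_std)  # True
```

This is mostly useful in loops or memory-sensitive code, where reusing one result array avoids allocating a new one on every call.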

Some of the parameters that the NumPy standard deviation function accepts are abstract. Thus, I highly recommend checking out the official NumPy documentation.

Examples

Let’s take a look at some examples of how to use the NumPy standard deviation function, np.std():


Example 1 – Standard deviation of 1-dimensional Numpy array.

import numpy as np

#Create and initialize our 1d array
my1dArray = np.array([67, 244, 12, 66, 44, 27,15,21,30])

#Compute standard deviation using np.std()
results = np.std(my1dArray)

#Print results
print(results)

Output:

68.3245703504987

Example 2 – Standard deviation of 2-dimensional Numpy array.

import numpy as np

#Create and initialize our 2d array
my2dArray = np.array([[31, 55], [77, 9]])

#Compute standard deviation using np.std()
results = np.std(my2dArray)

#Print results
print(results)

Output:

25.495097567963924

Example 3 – Standard deviation of columns.

First, create a 2-dimensional array from a Python list using the np.array() function.

# Creating a 2-d array from a Python list.

import numpy as np 

myList = [[1,2,3],[4,5,6],[7,8,9]]
myArray = np.array(myList)

print(myArray)

So far this is what our 2-d array looks like:

[[1 2 3]
 [4 5 6]
 [7 8 9]]

To find the standard deviation of columns, we have to pass the parameter axis and set it to 0 inside our np.std() function: 

#Compute standard deviation with axis = 0
result = np.std(myArray, axis=0)

#Print result
print(result)

Output:

[2.44948974 2.44948974 2.44948974]

When we set axis=0, the Numpy standard deviation function will compute it downward or vertically.

Example 4 – Standard deviation of rows.

Similar to our previous example, we will first create a 2-d array. We can use the previous example here.

# Creating a 2-d array from a Python list.

import numpy as np 

myList = [[1,2,3],[4,5,6],[7,8,9]]
myArray = np.array(myList)

print(myArray)

Here’s what our 2-d array should look like:

[[1 2 3]
 [4 5 6]
 [7 8 9]]

To find the standard deviation of rows, we have to pass the parameter axis and set it to 1 inside our np.std() function: 

#Compute standard deviation with axis = 1
result = np.std(myArray, axis=1)

#Print result
print(result)

Output:

[0.81649658 0.81649658 0.81649658]

Conclusion

Cool, so we are at the end of this post. I hope it gave you a good idea of how to use the NumPy standard deviation function.

These concepts can be abstract and complex, so go over the material again if you need to. It is equally important to keep practicing.


Make sure to check out the course NumPy Data Science Essential Training to learn Numpy through interactive videos. 

I highly recommend it. The best part is that you can try out the course for FREE!

Do you have questions regarding the Numpy standard deviation function? Feel free to comment below!
