Looking to create a Covariance Matrix using Python?

If so, I’ll show you how to create such a matrix using both numpy and pandas.

## Steps to Create a Covariance Matrix using Python

### Step 1: Gather the Data

To start, you’ll need to gather the data that will be used for the covariance matrix.

For example, I gathered the following data about 3 variables:

| A | B | C |
|---|---|---|
| 45 | 38 | 10 |
| 37 | 31 | 15 |
| 42 | 26 | 17 |
| 35 | 28 | 21 |
| 39 | 33 | 12 |

### Step 2: Get the Population Covariance Matrix using Python

To get the population covariance matrix (based on N), you’ll need to set the bias argument to *True* in the code below.

This is the complete Python code to derive the population covariance matrix using the numpy package:

    import numpy as np

    A = [45,37,42,35,39]
    B = [38,31,26,28,33]
    C = [10,15,17,21,12]

    Data = np.array([A,B,C])

    covMatrix = np.cov(Data,bias=True)
    print(covMatrix)

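Run the code, and you’ll get the following matrix (rows and columns are in the order A, B, C):

|   | A | B | C |
|---|---|---|---|
| A | 12.64 | 7.68 | -9.6 |
| B | 7.68 | 17.36 | -13.8 |
| C | -9.6 | -13.8 | 14.8 |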
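To make the "based on N" part concrete, here is a minimal sketch that computes the population covariance by hand and compares it with the numpy result. The helper name `population_cov` is hypothetical and introduced only for illustration. It assumes the same layout as above, with each row holding one variable.

```python
import numpy as np

A = [45, 37, 42, 35, 39]
B = [38, 31, 26, 28, 33]
C = [10, 15, 17, 21, 12]
Data = np.array([A, B, C], dtype=float)


def population_cov(data):
    """Population covariance of variables stored as rows (hypothetical helper)."""
    n = data.shape[1]  # number of observations, N
    # Subtract each variable's mean from its own row
    centered = data - data.mean(axis=1, keepdims=True)
    # Sum the products of deviations for every pair of variables, then divide by N
    return centered @ centered.T / n


manual = population_cov(Data)

# Compare the hand-rolled result with numpy's bias=True result
print(np.allclose(manual, np.cov(Data, bias=True)))  # prints True
```

The diagonal of the result holds the population variance of each variable, and each off-diagonal entry holds the covariance of a pair of variables.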

### Step 3: Get a Visual Representation of the Matrix

You can use the seaborn package in order to visually represent the covariance matrix.

Here is the complete code that you can apply in Python:

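    import numpy as np
    import seaborn as sn
    import matplotlib.pyplot as plt

    A = [45,37,42,35,39]
    B = [38,31,26,28,33]
    C = [10,15,17,21,12]

    Data = np.array([A,B,C])

    covMatrix = np.cov(Data,bias=True)

    # Draw the matrix as a heatmap and annotate each cell with its value
    sn.heatmap(covMatrix, annot=True, fmt='g')
    plt.show()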

Once you run the code, you’ll get a heatmap of the population covariance matrix, with each cell annotated with its value.

## Derive the Sample Covariance Matrix

To get the sample covariance matrix (based on N-1), you’ll need to set the bias argument to *False* in the code below.

Here is the code based on the numpy package:

    import numpy as np

    A = [45,37,42,35,39]
    B = [38,31,26,28,33]
    C = [10,15,17,21,12]

    Data = np.array([A,B,C])

    covMatrix = np.cov(Data,bias=False)
    print(covMatrix)

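And this is the matrix that you’ll get (rows and columns are in the order A, B, C):

|   | A | B | C |
|---|---|---|---|
| A | 15.8 | 9.6 | -12 |
| B | 9.6 | 21.7 | -17.25 |
| C | -12 | -17.25 | 18.5 |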
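The population and sample matrices differ only in the divisor, so one can be obtained from the other by multiplying by N/(N-1). The sketch below checks that relationship on the same data. It also uses the `ddof` argument of `np.cov`, which overrides `bias` when it is supplied. The variable names are chosen for illustration.

```python
import numpy as np

Data = np.array([
    [45, 37, 42, 35, 39],  # A
    [38, 31, 26, 28, 33],  # B
    [10, 15, 17, 21, 12],  # C
])
n = Data.shape[1]  # number of observations, N

population = np.cov(Data, bias=True)   # divides by N
sample = np.cov(Data, bias=False)      # divides by N-1

# Rescaling the population matrix by N/(N-1) gives the sample matrix
rescaled = population * n / (n - 1)
print(np.allclose(rescaled, sample))  # prints True

# ddof=1 matches bias=False, and ddof=0 matches bias=True
print(np.allclose(np.cov(Data, ddof=1), sample))      # prints True
print(np.allclose(np.cov(Data, ddof=0), population))  # prints True
```

With only 5 observations the factor is 5/4, so every entry of the sample matrix is 25% larger in magnitude than the matching population entry.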

You can also use the pandas package to get the sample covariance matrix. By default, the cov method of a DataFrame divides by N-1.

You may then apply the following code using pandas:

    from pandas import DataFrame

    Data = {'A': [45,37,42,35,39],
            'B': [38,31,26,28,33],
            'C': [10,15,17,21,12]
            }

    df = DataFrame(Data,columns=['A','B','C'])

    covMatrix = df.cov()
    print(covMatrix)

You’ll get the same sample covariance matrix as derived using numpy, this time labeled with the column names A, B and C.
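If you need the population matrix from pandas as well, one option is the `ddof` keyword of `DataFrame.cov`. The sketch below assumes a pandas version that supports it (1.1 or later). It also cross-checks the pandas result against numpy. Note that pandas stores each variable as a column, so `rowvar=False` is needed on the numpy side.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [45, 37, 42, 35, 39],
    'B': [38, 31, 26, 28, 33],
    'C': [10, 15, 17, 21, 12],
})

sample_cov = df.cov()            # default ddof=1, divides by N-1
population_cov = df.cov(ddof=0)  # assumes pandas 1.1+, divides by N

# Variables are columns here, so tell numpy not to treat rows as variables
numpy_sample = np.cov(df.to_numpy(), rowvar=False)
print(np.allclose(sample_cov.to_numpy(), numpy_sample))  # prints True

# Individual entries can be looked up by column name
print(population_cov.loc['A', 'B'])
```

Looking entries up by label with `.loc` avoids having to remember which row or column position belongs to which variable.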

Finally, you can visually represent the covariance matrix using the seaborn package:

    from pandas import DataFrame
    import seaborn as sn
    import matplotlib.pyplot as plt

    Data = {'A': [45,37,42,35,39],
            'B': [38,31,26,28,33],
            'C': [10,15,17,21,12]
            }

    df = DataFrame(Data,columns=['A','B','C'])

    covMatrix = df.cov()

    # Draw the matrix as a heatmap and annotate each cell with its value
    sn.heatmap(covMatrix, annot=True, fmt='g')
    plt.show()

Run the code, and you’ll get a heatmap of the sample covariance matrix, with both axes labeled A, B and C.

You may also want to check the following source that explains the full steps to create a Confusion Matrix using Python.