Python and Data Visualization: Matplotlib and Beyond

Matplotlib: Introduction

Matplotlib is a powerful library for creating static, animated, and interactive visualizations in Python. It provides a wide range of functions and methods for creating various types of graphs, plots, and charts.

Installation

To install Matplotlib, we can use the following command:

pip install matplotlib
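To confirm the installation worked, one quick check is to import the package and read its version string. This is only a small sketch; the variable name version is chosen here for illustration.

```python
import matplotlib

# __version__ holds the installed release as a string
version = matplotlib.__version__
print(version)
```

If the import fails with ModuleNotFoundError, the package was most likely installed into a different Python environment than the one running the script.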

Basics of Matplotlib

Let’s start by importing Matplotlib and creating a basic line graph. We will plot the sales data for a fictional company over a period of time.

import matplotlib.pyplot as plt

# Sales data
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May']
sales = [10000, 15000, 12000, 18000, 20000]

# Create a line graph
plt.plot(months, sales)

# Add labels and title
plt.xlabel('Months')
plt.ylabel('Sales')
plt.title('Monthly Sales Report')

# Display the graph
plt.show()

We begin by importing the matplotlib.pyplot module, which provides a MATLAB-like interface for creating plots. Next, we define the sales data for each month using two lists: months and sales.

To create a line graph, we use the plot() function and pass in the months list as the x-axis values and the sales list as the y-axis values. The resulting graph will have the months on the x-axis and the corresponding sales on the y-axis.

We add labels to the x-axis and y-axis using the xlabel() and ylabel() functions, respectively. We also set a title for the graph using the title() function.
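The same chart can also be built with Matplotlib's object-oriented interface, where the labels and title are set on an Axes object instead of through pyplot functions. The sketch below reuses the sales data from above; the marker='o' argument and the Agg backend line are additions for illustration, and the backend line is only an assumption that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to open a window
import matplotlib.pyplot as plt

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May']
sales = [10000, 15000, 12000, 18000, 20000]

# plt.subplots() returns a Figure and a single Axes to draw on
fig, ax = plt.subplots()

# ax.plot() returns a list of Line2D objects, one per plotted series
(line,) = ax.plot(months, sales, marker='o')

# Axes methods mirror the pyplot functions used earlier
ax.set_xlabel('Months')
ax.set_ylabel('Sales')
ax.set_title('Monthly Sales Report')
```

From here, fig.savefig('monthly_sales.png') would write the chart to disk, and plt.show() would display it when an interactive backend is active.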

Types of Graphs

Matplotlib provides various types of graphs and plots to visualize different types of data. Let’s explore a few of them:

Bar Graphs

A bar graph is a great way to represent categorical data or compare multiple categories. Let’s create a bar graph to compare the revenue generated by different product categories for a retail company.

import matplotlib.pyplot as plt

# Product categories
categories = ['Electronics', 'Clothing', 'Books', 'Home']

# Revenue data
revenue = [5000, 7000, 3000, 4000]

# Create a bar graph
plt.bar(categories, revenue)

# Add labels and title
plt.xlabel('Product Categories')
plt.ylabel('Revenue ($)')
plt.title('Revenue by Product Category')

# Display the graph
plt.show()

In this example, we have four product categories: Electronics, Clothing, Books, and Home. The corresponding revenue data is stored in the revenue list.

We use the bar() function to create a bar graph, where the x-axis represents the categories and the y-axis represents the revenue. We pass in the categories list as the x-axis values and the revenue list as the y-axis values.

Similar to the line graph example, we add labels to the x-axis and y-axis using the xlabel() and ylabel() functions, respectively. We also set a title for the graph using the title() function.

Finally, we use the show() function to display the graph.
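Bar charts are often easier to read when each bar carries its value. As a hedged sketch, the example below annotates the bars with Axes.bar_label(), which is available in Matplotlib 3.4 and later; the '$%d' format string and the Agg backend line are choices made here, not part of the original example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to open a window
import matplotlib.pyplot as plt

categories = ['Electronics', 'Clothing', 'Books', 'Home']
revenue = [5000, 7000, 3000, 4000]

fig, ax = plt.subplots()

# ax.bar() returns a container holding one rectangle per category
bars = ax.bar(categories, revenue)

# bar_label() writes each bar's height above it (requires Matplotlib >= 3.4)
labels = ax.bar_label(bars, fmt='$%d')

ax.set_xlabel('Product Categories')
ax.set_ylabel('Revenue ($)')
ax.set_title('Revenue by Product Category')
```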

Pie Charts

A pie chart is useful for representing proportions or percentages. Let’s create a pie chart to visualize the market share of different smartphone brands.

import matplotlib.pyplot as plt

# Smartphone brands
brands = ['Apple', 'Samsung', 'Huawei', 'Xiaomi', 'Others']

# Market share
market_share = [30, 25, 15, 10, 20]

# Create a pie chart
plt.pie(market_share, labels=brands, autopct='%1.1f%%')

# Add title
plt.title('Market Share of Smartphone Brands')

# Display the chart
plt.show()

In this example, we have five smartphone brands: Apple, Samsung, Huawei, Xiaomi, and Others. The market share data for each brand is stored in the market_share list.

We use the pie() function to create a pie chart, where the size of each wedge represents the market share of a brand. We pass in the market_share list as the data values and the brands list as the labels for each wedge. Matplotlib scales the values by their sum, so each wedge covers its share of the total. The autopct='%1.1f%%' argument is a format string that prints that share on each wedge as a percentage with one decimal place.

We set a title for the pie chart using the title() function.
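To see where the wedge percentages come from, the sketch below computes each brand's share by hand and compares it with the text that autopct puts on the chart. The names expected and shown are illustrative, and the Agg backend line is only an assumption that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to open a window
import matplotlib.pyplot as plt

brands = ['Apple', 'Samsung', 'Huawei', 'Xiaomi', 'Others']
market_share = [30, 25, 15, 10, 20]

fig, ax = plt.subplots()

# With autopct set, pie() also returns the Text objects drawn inside the wedges
wedges, texts, autotexts = ax.pie(market_share, labels=brands, autopct='%1.1f%%')

# Each wedge's percentage is its value divided by the sum of all values
total = sum(market_share)
expected = ['%1.1f%%' % (100 * value / total) for value in market_share]
shown = [t.get_text() for t in autotexts]

print(shown)
```

Because the percentages are relative to the sum, the input values do not need to add up to 100; raw counts would give the same chart.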

Beyond Matplotlib

While Matplotlib is a powerful library for data visualization, several other libraries offer additional functionality and styling options. Let’s explore a few of them:

Seaborn

Seaborn is built on top of Matplotlib and provides a high-level interface for creating attractive and informative statistical graphics. It simplifies many tasks by automatically applying appropriate settings and themes. It is a separate package, installed with pip install seaborn. Let’s create a box plot to visualize the distribution of student scores in different subjects.

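import matplotlib.pyplot as plt  # needed for the plt.xlabel(), plt.title() and plt.show() calls below
import seaborn as sns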

# Student scores
math_scores = [80, 95, 70, 85, 90]
science_scores = [75, 80, 85, 90, 95]
english_scores = [85, 80, 75, 90, 95]

# Create a box plot
sns.boxplot(data=[math_scores, science_scores, english_scores])

# Set labels and title
plt.xlabel('Subjects')
plt.ylabel('Scores')
plt.title('Distribution of Student Scores')

# Display the plot
plt.show()

In this example, we have three subjects: Math, Science, and English. The scores achieved by each student in these subjects are stored in separate lists: math_scores, science_scores, and english_scores.

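We use the boxplot() function from Seaborn to create a box plot. We pass in the data as a list of lists, where each list represents the scores for a particular subject. Because the lists carry no names, the boxes are labeled by position (0, 1, 2) rather than by subject.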

We set the labels for the x-axis and y-axis using the xlabel() and ylabel() functions, respectively. We also set a title for the plot using the title() function.

Finally, we use the show() function to display the plot.

Plotly

Plotly is an open-source library for creating interactive plots. It offers a wide range of charts and graphs with built-in interactivity, such as hover tooltips and zooming, as well as animation capabilities. It is a separate package, installed with pip install plotly. Let’s create an interactive scatter plot to visualize the relationship between two variables.

import plotly.express as px

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 3, 5]

# Create a scatter plot
fig = px.scatter(x=x, y=y)

# Set title
fig.update_layout(title='Scatter Plot')

# Display the plot
fig.show()

In this example, we have two variables: x and y. We define the values for these variables as lists.

We use the scatter() function from Plotly Express to create a scatter plot. We pass in the x and y values as keyword arguments, and the function returns a figure object.

We set a title for the plot using the figure's update_layout() method.

Finally, we use the show() method of the fig object to display the plot.
