
Scatter Plots with Plotly

A scatter plot is a chart type used to observe and display the relationship between two variables. Each observation is drawn as a dot.

The position of each dot along the horizontal and vertical axes gives the values of that data point; in other words, scatter plots use Cartesian coordinates to display the values of the variables in a data set.

Plotly Express is the easy-to-use, high-level interface to Plotly; it operates on a variety of data types and produces easy-to-style figures. Let's plot a simple graph using plotly.express (usually imported as px).

Simple Plotting with scatter

The scatter function accepts two array-like objects (for example, lists): one for the x coordinates and one for the y coordinates.

# x and y given as array_like objects
import plotly.express as px
fig = px.scatter(x=[0, 1, 2, 3, 4], y=[0, 1, 4, 9, 16])
fig.show()

Run this code as a script and your default browser will open with an interactive graph like this one (in a notebook, the figure is typically rendered inline instead).

Scatter Plot Plotly 01

Passing a pandas DataFrame to scatter

We can also use the scatter function to plot a pandas DataFrame. It accepts the DataFrame and the names of the columns to use for the x and y axes.

# x and y given as DataFrame columns
import plotly.express as px
df = px.data.iris() # iris is a pandas DataFrame
fig = px.scatter(df, x="sepal_width", y="sepal_length")
fig.show()

Run this code and you will get this graph in the browser.

Scatter Plot Plotly 02

Setting size and color with column names

To draw markers whose size varies with the data, pass a column name to the size argument; likewise, the color argument colors the markers by a column. Note that the color and size data are added to the hover information. You can add other columns to the hover data with the hover_data argument of px.scatter.

import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species",
                 size='petal_length', hover_data=['petal_width'])
fig.show()

Run this code and observe the difference in the size of circular marks.

Scatter Plot Plotly 03

Color can also be mapped to a continuous (numeric) column, as follows:

import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color='petal_length')
fig.show()

Now, just by looking at a marker's color and the color bar, you can estimate its petal_length value.

Scatter Plot Plotly 04

Symbol (labels)

Symbols are used to differentiate data from different categories. The symbol argument can be mapped to a column as well.

import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species", symbol="species")
fig.show()

A different symbol is assigned to each species present in the data.

Scatter Plot Plotly 05

Scatter plot and categorical axes

An axis can also be categorical: here the y axis shows the nation, while color and symbol distinguish the type of medal.

import plotly.express as px
df = px.data.medals_long()

fig = px.scatter(df, y="nation", x="count", color="medal", symbol="medal")
# Increase the size of the markers in the graph
fig.update_traces(marker_size=10)
fig.show()

Scatter Plot Plotly 06

Error bars with scatter plot

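Error bars are used to represent the uncertainty in data. The error_x and error_y arguments take the name of a column that holds the size of the error for each point.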

import plotly.express as px
df = px.data.iris()
# Add an extra column to the DataFrame to use for the error bars
df["e"] = df["sepal_width"]/100
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species",
                 error_x="e", error_y="e")
fig.show()

Scatter Plot Plotly 07

Marginal Distribution Plots

Marginal distribution plots are small subplots above or to the right of the main plot that show the distribution of the data along a single dimension.

The marginal_x and marginal_y arguments can each be 'histogram', 'rug', 'box' or 'violin'.

import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_length", y="sepal_width", marginal_x="histogram", marginal_y="rug")
fig.show()

Having the marginal plots alongside the main plot makes it easier to understand the distribution of each variable.

Scatter Plot Plotly 07 1