
Altair with STREAMLIT


Altair is a declarative statistical visualization library for Python, built on Vega and Vega-Lite. It is a popular choice for people looking for a quick and efficient way to visualize datasets.

The key idea behind Altair is that users only declare the links between data columns and visual encoding channels (e.g., x and y axes, color, size), and the library handles the rest of the visualization process.

Vega

Vega provides basic building blocks for a wide variety of visualization designs: data loading and transformation, scales, map projections, axes, legends, and graphical marks such as rectangles, lines, plotting symbols, etc.

Streamlit

Streamlit is an open-source Python framework that turns data scripts into shareable web apps in minutes.

Installation

pip install altair
pip install streamlit

Required Modules

import altair as alt
import streamlit as st
import pandas as pd

Components of Altair Chart

Chart() is the fundamental object in Altair. It takes the data, typically a Pandas DataFrame, as its main argument. The chart won’t do much on its own until we specify its components: the data, the mark, and the encoding.

Data

Altair is built around the Pandas DataFrame, which means that we can manipulate data for Altair the same way we would with any Pandas DataFrame. Data can be supplied in several ways, for example as a Pandas DataFrame or as a CSV-formatted text file.

Mark

The mark specifies how the attributes of the data set should be represented on the plot (e.g., as lines in a line chart or points in a scatter plot). Altair provides a number of basic mark properties:

Altair Mark Properties

Encoding

Once we have the data and have chosen how it is represented, we need to specify where to represent it: which column goes on which axis, and which columns drive the size or the color of the marks. This is where we use encodings.

Line Plot

Let us look at a simple line plot using the Chart(), mark_line(), and encode() methods.

Data Set
data_set = {
    'countries': ['India', 'Australia', 'Japan', 'America', 'Russia'],
    'values': [4500, 2500, 1053, 500, 3200]
}

df = pd.DataFrame(data_set)
Plot
line = alt.Chart(df).mark_line().encode(
    x = 'countries',
    y = 'values'
)
Loading plot into Streamlit application
st.altair_chart(line)
The full implementation looks like this:
import altair as alt
import streamlit as st
import pandas as pd

data_set = {
    'countries': ['India', 'Australia', 'Japan', 'America', 'Russia'],
    'values': [4500, 2500, 1053, 500, 3200]
}

df = pd.DataFrame(data_set)

line = alt.Chart(df).mark_line().encode(
    x = 'countries',
    y = 'values'
)

st.altair_chart(line)

Command to run the streamlit application

streamlit run app.py

app.py is the name of the Python file we have created so far.

The Streamlit Web App will be available at the following URL:

http://localhost:8501
Output
Line Chart Altair

Adding Properties to the Plot

Interactiveness

Calling the interactive() method on a chart object makes it interactive: we can then zoom in and out of the plot with the mouse wheel and pan by dragging.

line = alt.Chart(df).mark_line().encode(
    x = 'countries',
    y = 'values'
).interactive()

st.altair_chart(line)
Output

Height and Width

The properties() method sets the height and width of the plot.

<altair-chart-object>.properties(width=500, height=500)

Adding Title to the plot

The properties() method can also add a title to the plot.

<altair-chart-object>.properties(title = "The Line Plot")

Adding Colour

Colour can be added to the plot by passing the color argument to the mark method.

import altair as alt
import streamlit as st
import pandas as pd

data_set = {
    'countries': ['India', 'Australia', 'Japan', 'America', 'Russia'],
    'values': [4500, 2500, 1053, 500, 3200]
}

df = pd.DataFrame(data_set)

line = alt.Chart(df).mark_line(color="Yellow").encode(
    x = 'countries',
    y = 'values'
).properties(width = 650, height = 500, title = "Line Plot").interactive()

st.altair_chart(line)
Output
Coloured Line Chart Altair

Scatter Plot

For a scatter plot, the mark_point() method is used. The Chart() and encode() calls remain the same as for the line plot.

Data Set

Altair also works with the Vega sample data sets, which are handy for practice. Install the vega_datasets package using the following command:

pip install altair vega_datasets

The vega_datasets package includes data sets such as:

  • data.stocks()
  • data.movies()
  • data.iris()

You can easily find all available datasets with data.list_datasets().

Let’s use the iris data set. We can peek inside any data set using the head() function.

from vega_datasets import data

df = data.iris()
print(df.head())
Output
    sepalLength  sepalWidth  petalLength  petalWidth species
0          5.1         3.5          1.4         0.2  setosa
1          4.9         3.0          1.4         0.2  setosa
2          4.7         3.2          1.3         0.2  setosa
3          4.6         3.1          1.5         0.2  setosa
4          5.0         3.6          1.4         0.2  setosa
The Scatter Plot
scatter  = alt.Chart(df).mark_point().encode(x='sepalLength', y='petalLength').interactive()
Loading plot into Streamlit application

Streamlit provides altair_chart(), which loads Altair charts into Streamlit web apps.

st.altair_chart(scatter)
Output
Scatter Plot Altair

Customizing the Scatter Plot

We can also customize the color, size, and transparency of the points in the scatter plot using altair.Color(), altair.Size(), and altair.OpacityValue() respectively.

altair.Color() and altair.Size() each take a column name of the data set as a parameter, while altair.OpacityValue() takes a constant value between 0 and 1.

  • With altair.Size(), the bigger the “sepalWidth”, the bigger the circle.
  • altair.Color() maps the values of the column passed as an argument to a colour scale: a numeric column such as “petalLength” gets a continuous gradient, while a categorical column gets one distinct colour per category.
df = data.iris()
scatter  = alt.Chart(df).mark_point().encode(
        alt.X('sepalLength'),
        alt.Y('petalWidth'),
        alt.Color('petalLength'),
        alt.Size('sepalWidth'),
        alt.OpacityValue(0.8)).interactive()
st.altair_chart(scatter)
Output
Edge Color Size Opacity Altair

We can also fill the circles in the plot shown above by passing filled=True to the mark method.

import altair as alt
import streamlit as st
from vega_datasets import data

df = data.iris()

scatter  = alt.Chart(df).mark_point(filled=True).encode(
        alt.X('sepalLength'),
        alt.Y('petalWidth'),
        alt.Color('sepalLength'),
        alt.Size('sepalWidth'),
        alt.OpacityValue(0.8)).interactive()
st.altair_chart(scatter)
Output
Color Size Opacity Altair

Scatter Plot with Tooltips

A tooltip shows the information related to a point when we hover over that point (i.e., circle) in the scatter plot. To enable tooltips, pass a list to the tooltip parameter of the encode() method.

  • The list passed to the tooltip parameter contains the column names of the data set used to plot the chart.
df = data.cars()
scatter  = alt.Chart(df).mark_point(filled=True).encode(
        alt.X('Horsepower'),
        alt.Y('Miles_per_Gallon'),
        alt.Color('Origin'),
        tooltip = ['Name', 'Origin', 'Horsepower', 'Miles_per_Gallon']
        ).interactive()
st.altair_chart(scatter)
Output
Scatter Plot With Tooltip Altair

Bar Chart

For a bar chart, mark_bar() is used as the mark method.

df = data.iris()
bar = alt.Chart(df).mark_bar().encode(x='sepalLength', y='petalLength')
st.altair_chart(bar)
Output
Bar Plot Altair

Horizontal Bar Graph

When plotting a horizontal bar chart, everything remains the same as for a vertical bar chart, except that we swap the columns assigned to the X and Y axes.

We put the quantitative value on the X-axis and mark the column types explicitly with a suffix: Q for quantitative and O for ordinal (ordered, discrete) values.

df = data.iris()
bar = alt.Chart(df).mark_bar().encode(x='sepalLength:Q',y='petalWidth:O')
st.altair_chart(bar)
Output
Horizontal Bar Graph Altair

Adding labels to the chart

Let’s see how we can add labels at the end of the bars of the horizontal bar chart. The mark_text() method is used as the mark for adding text to the plot.

We can add labels to any chart following the same procedure.

df = data.wheat()
bar = alt.Chart(df).mark_bar().encode(x='wheat:Q', y='year:O')

text = bar.mark_text(color='white').encode(text = 'wheat:Q')
st.altair_chart(bar+text)
Output
Add Labels To Charts Altair

Stacked bar chart

A stacked bar chart is a graph that is used to break down and compare parts of a whole. Each bar in the chart represents a whole, and segments in the bar represent different parts or categories of that whole.

Stacked Bar Charts Part To Hole 1

Let’s see how to create a stacked bar chart. We again use mark_bar() as the mark, but here we apply an aggregate function (sum) to the column plotted on the X or Y axis and use the colour encoding to split each bar into segments.

We can create a stacked version of other charts following the same procedure.

df = data.cars()
bar  = alt.Chart(df).mark_bar().encode(
        alt.X('Horsepower'),
        alt.Y('sum(Miles_per_Gallon)'),
        alt.Color('Name'),  
    ).interactive()
st.altair_chart(bar)
Output
Stacked Bar Chart Altair

Box Plot

For a box plot, mark_boxplot() is used as the mark method.

df = data.iris()
box_plot = alt.Chart(df).mark_boxplot().encode(x='sepalLength', y='petalLength')
st.altair_chart(box_plot)
Box Plot Altair

Area Chart

For an area chart, mark_area() is used as the mark method.

import altair as alt
import streamlit as st
from vega_datasets import data

df = data.iris()

area = alt.Chart(df).mark_area(color="maroon").encode(x='sepalLength', y='petalLength')
st.altair_chart(area)
Output
Area Chart Altair

Heat Map

A heat map (or heatmap) is a graphical representation of data where values are depicted by colors. Heat maps make it easy to visualize complex data and understand it at a glance.

How Heat Map Looks Like

Let us see how to plot a simple heat map using Altair. For a heat map, the mark_rect() method is used as the mark.

df = data.cars()
hm  = alt.Chart(df).mark_rect().encode(
        alt.Y('Horsepower'),
        alt.X('Miles_per_Gallon'),
        alt.Color('Name'),  
        tooltip = ['Name', 'Origin', 'Horsepower', 'Miles_per_Gallon']
    ).interactive()
st.altair_chart(hm)
Output
Simple Heat Map Altair

Concatenation of Plots

Concatenating plots simply means creating subplots. Altair provides two functions, altair.hconcat() and altair.vconcat(), to place charts next to each other horizontally or vertically.

Horizontal Concatenation

df_1 = data.cars()
scatter  = alt.Chart(df_1).mark_point().encode(x='Horsepower', y='Miles_per_Gallon')

df_2 = data.iris()
area = alt.Chart(df_2).mark_area(color="maroon").encode(x='sepalLength', y='petalLength')

obj = alt.hconcat(scatter, area) #Horizontal Concatenation

st.altair_chart(obj)
Output
Horizontal Concatenation Altair

Vertical Concatenation

import altair as alt
import streamlit as st
from vega_datasets import data

df_1 = data.cars()
scatter  = alt.Chart(df_1).mark_point().encode(x='Horsepower', y='Miles_per_Gallon')

df_2 = data.iris()
area = alt.Chart(df_2).mark_area(color="maroon").encode(x='sepalLength', y='petalLength')

obj = alt.vconcat(scatter, area) #Vertical Concatenation

st.altair_chart(obj)
Output
Vertical Concatenation Altair