You can upload and download files in a Streamlit workflow.
Suppose you have an ML app for image classification or text analysis. Depending on the task, you may need to upload an image file or a CSV file. You can use streamlit.file_uploader() to do that.
Similarly, once the app has processed its input and produced results, you may want to download them. You can do that with the download attribute of the HTML anchor (<a>) tag.
Single file upload
Let’s create a simple app that has a select box giving options to upload either an image, a dataset, or a document.
import streamlit as st

def main():
    st.title("File Upload Tutorial")
    menu = ["Image","Dataset","DocumentFiles","About"]
    choice = st.sidebar.selectbox("Menu",menu)
    if choice == "Image":
        st.subheader("Image")
    elif choice == "Dataset":
        st.subheader("Dataset")
    elif choice == "DocumentFiles":
        st.subheader("DocumentFiles")

# Run the app when the script is executed
if __name__ == '__main__':
    main()
Uploading Image
To upload an image, we will import Image from PIL.
In st.file_uploader(), set the type parameter to tell the app which kinds of files are accepted. Here png, jpg, and jpeg are the accepted extensions.
st.file_uploader() returns an uploaded file object (or None if nothing has been uploaded yet) whose properties include name, type, and size, which can be displayed as shown in the code.
To preview the uploaded file, we use Image.open(<image_file>), which returns the image data. You can then view the returned image data using st.image(<image_data>).
You can set the display width of the preview, in pixels, using the width parameter of the st.image() function.
# Include PIL, load_image before main()
from PIL import Image

def load_image(image_file):
    img = Image.open(image_file)
    return img

...
    if choice == "Image":
        st.subheader("Image")
        image_file = st.file_uploader("Upload Images", type=["png","jpg","jpeg"])
        if image_file is not None:
            # To see details
            file_details = {"filename":image_file.name, "filetype":image_file.type,
                            "filesize":image_file.size}
            st.write(file_details)
            # To view uploaded image
            st.image(load_image(image_file),width=250)
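The size property is a raw byte count, which is hard to read for large images. The sketch below shows one way to build the same details dictionary with a human-readable size. The helper names human_size and describe_upload, and the FakeUpload stand-in class, are hypothetical and not part of Streamlit; the stand-in only mimics the name, type, and size properties used above so the example runs without a browser.

```python
import io


def human_size(num_bytes):
    """Format a byte count as a short human-readable string."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        # Stop at the first unit that fits, or at the largest unit we know.
        if size < 1024 or unit == "GB":
            return f"{size:.1f} {unit}"
        size /= 1024


def describe_upload(uploaded):
    """Collect the name, MIME type, and readable size of an uploaded file."""
    return {
        "filename": uploaded.name,
        "filetype": uploaded.type,
        "filesize": human_size(uploaded.size),
    }


class FakeUpload(io.BytesIO):
    """Hypothetical stand-in for the object returned by st.file_uploader()."""

    def __init__(self, data, name, mime):
        super().__init__(data)
        self.name = name
        self.type = mime
        self.size = len(data)


if __name__ == "__main__":
    print(describe_upload(FakeUpload(b"x" * 2048, "cat.png", "image/png")))
```

In the app, the result of describe_upload(image_file) could be passed to st.write() in place of the hand-built file_details dictionary.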
Uploading Dataset
You can upload a dataset using st.file_uploader(), where the type parameter is set to csv.
You can preview the file name, type, and size from the object returned by the st.file_uploader() function.
To load the uploaded dataset, simply use the pd.read_csv() function from the pandas library and view it using st.dataframe().
You can refer here to see how to view data frames and tables in Streamlit.
import pandas as pd
....
    elif choice == "Dataset":
        st.subheader("Dataset")
        data_file = st.file_uploader("Upload CSV",type=["csv"])
        if data_file is not None:
            file_details = {"filename":data_file.name, "filetype":data_file.type,
                            "filesize":data_file.size}
            st.write(file_details)
            df = pd.read_csv(data_file)
            st.dataframe(df)
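Before handing an upload to pd.read_csv(), it can be useful to check that the expected columns are present. The following is a minimal sketch that uses Python's standard csv module rather than pandas, and assumes the uploaded object behaves like a binary file (io.BytesIO is used as a stand-in here). The function name peek_csv_header is hypothetical, and the sketch assumes the header fits on the first line with no quoted line breaks.

```python
import csv
import io


def peek_csv_header(file_obj, encoding="utf-8"):
    """Return the column names of a CSV file-like object, then rewind it."""
    first_line = file_obj.readline().decode(encoding)
    # Rewind so that a later reader (for example pd.read_csv) starts at the top.
    file_obj.seek(0)
    # Let the csv module split the line so quoted commas are handled.
    return next(csv.reader([first_line]), [])


if __name__ == "__main__":
    sample = io.BytesIO(b"sepal_length,species\n5.1,setosa\n")
    print(peek_csv_header(sample))
```

Because the helper seeks back to position 0, the same object can still be passed to pd.read_csv() afterwards.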
Uploading Document Files
You can also upload document files using Streamlit. Use st.file_uploader() with the type parameter set to pdf, docx, and txt.
You can process or preview a text file by decoding its bytes to a string and viewing the raw text using st.text().
To process a PDF file, you can use a library called pdfplumber, which exposes the document as a list of pages; the text of each page can be extracted using a function called extract_text().
Similarly, a docx file can be processed using a library called docx2txt and the function docx2txt.process(<docx_file>). The raw text returned by the function can be viewed using st.write(<raw_text>).
import docx2txt
import pdfplumber
....
    elif choice == "DocumentFiles":
        st.subheader("DocumentFiles")
        docx_file = st.file_uploader("Upload Document", type=["pdf","docx","txt"])
        if st.button("Process"):
            if docx_file is not None:
                file_details = {"filename":docx_file.name, "filetype":docx_file.type,
                                "filesize":docx_file.size}
                st.write(file_details)
                if docx_file.type == "text/plain":
                    # Read as string (decode bytes to string)
                    raw_text = str(docx_file.read(),"utf-8")
                    st.text(raw_text)
                elif docx_file.type == "application/pdf":
                    try:
                        with pdfplumber.open(docx_file) as pdf:
                            # Preview only the first page
                            first_page = pdf.pages[0]
                            st.write(first_page.extract_text())
                    except Exception:
                        # The PDF could not be opened or has no pages
                        st.write("None")
                else:
                    # Anything else accepted by the uploader is a docx file
                    raw_text = docx2txt.process(docx_file)
                    st.write(raw_text)
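Decoding with a fixed "utf-8" raises UnicodeDecodeError when a text file was saved in another encoding, and calling read() a second time on the same object returns nothing because the read position is already at the end. The sketch below illustrates one way to handle both, assuming the uploaded object behaves like a binary file. The function name read_text and the list of fallback encodings are assumptions chosen for illustration.

```python
import io


def read_text(file_obj, encodings=("utf-8", "latin-1")):
    """Read a binary file-like object from the start and decode it to str."""
    file_obj.seek(0)  # make repeated calls return the full content
    raw = file_obj.read()
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep what can be decoded and mark the rest.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    upload = io.BytesIO("héllo".encode("latin-1"))
    print(read_text(upload))
    print(read_text(upload))  # same text again, thanks to seek(0)
```

In the text/plain branch above, raw_text = read_text(docx_file) could replace the str(docx_file.read(), "utf-8") call.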
Saving Uploaded Files in a Directory
Suppose your project is hosted on a server and someone uploads a file remotely. It is then useful to save the file on the host server so that it can be used later.
To save a file in a directory, we will first import the os library, which provides ways to use operating system-dependent functionality.
You can save the file in the desired directory by opening a destination file in binary write mode and writing the upload's buffer to it with f.write(image_file.getbuffer()), as shown below. The target directory (fileDir here) must already exist.
import os
...
    if choice == "Image":
        st.subheader("Image")
        image_file = st.file_uploader("Upload Images",
                                      type=["png","jpg","jpeg"])
        if image_file is not None:
            # To see details
            file_details = {"filename":image_file.name, "filetype":image_file.type,
                            "filesize":image_file.size}
            st.write(file_details)
            st.image(load_image(image_file), width=250)
            # Saving upload
            with open(os.path.join("fileDir",image_file.name),"wb") as f:
                f.write(image_file.getbuffer())
            st.success("File Saved")
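Since the file name comes from the user, a slightly more defensive version of the save step may be worth considering. The sketch below creates the target directory if it is missing and strips any directory components from the name before writing. The function name save_upload is hypothetical, and os.path.basename only recognizes the separators of the operating system the app runs on, so this is a starting point rather than a complete sanitizer.

```python
import os


def save_upload(data, filename, directory="fileDir"):
    """Write bytes (or a memoryview) to directory/filename and return the path."""
    # Keep only the final path component so "../x.png" cannot escape the directory.
    safe_name = os.path.basename(filename)
    if not safe_name:
        raise ValueError("empty file name")
    # Create the target directory on first use instead of failing.
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, safe_name)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

In the app this could be called as save_upload(image_file.getbuffer(), image_file.name) in place of the with open(...) block.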
Uploading Multiple Files
Uploading multiple files is just as easy. In the st.file_uploader() function, set the accept_multiple_files parameter to True.
The uploaded files are returned as a list of objects, which is empty until something is uploaded.
Each object can be accessed using a for loop. It can then be previewed and saved using the same methods discussed above.
    if choice == "Image":
        st.subheader("Image")
        uploaded_files = st.file_uploader("Upload Images",type=["png","jpg","jpeg"],
                                          accept_multiple_files=True)
        # The list is empty until at least one file has been uploaded
        if uploaded_files:
            for image_file in uploaded_files:
                # To see details
                file_details = {"filename":image_file.name,"filetype":image_file.type,
                                "filesize":image_file.size}
                st.write(file_details)
                st.image(load_image(image_file), width=250)
                # Saving upload
                with open(os.path.join("fileDir",image_file.name),"wb") as f:
                    f.write(image_file.getbuffer())
                st.success("File Saved")
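When several files are uploaded, two of them may share a name, and the second write would silently overwrite the first. One possible way around this is to pick a non-colliding path before saving. The helper name unique_path and the _1, _2 suffix scheme below are assumptions for illustration, not something Streamlit provides.

```python
import os


def unique_path(directory, filename):
    """Return a path inside directory that does not collide with an existing file."""
    stem, ext = os.path.splitext(os.path.basename(filename))
    candidate = os.path.join(directory, stem + ext)
    counter = 1
    # Append _1, _2, ... before the extension until the name is free.
    while os.path.exists(candidate):
        candidate = os.path.join(directory, f"{stem}_{counter}{ext}")
        counter += 1
    return candidate
```

Inside the loop, open(unique_path("fileDir", image_file.name), "wb") could then replace the plain os.path.join() call.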
Downloading Files
You can download files from a Streamlit app as well.
You first need to encode the data in base64. Because base64.b64encode() works on bytes and returns bytes, the string is encoded to bytes, base64-encoded, and the result decoded back to a string: base64.b64encode(<data>.encode()).decode().
Finally, we build the file name in the required format and create a download link using an <a href> tag that carries the base64 data, as shown below.
Here we can download both text and CSV files by defining the file extension and appending it to the file name we intend to save.
import streamlit as st
import pandas as pd

# Utils
import base64
import time
timestr = time.strftime("%Y%m%d-%H%M%S")

class FileDownloader(object):
    def __init__(self, data,filename='myfile',file_ext='txt'):
        super(FileDownloader, self).__init__()
        self.data = data
        self.filename = filename
        self.file_ext = file_ext

    def download(self):
        # str -> bytes -> base64 bytes -> str
        b64 = base64.b64encode(self.data.encode()).decode()
        new_filename = "{}_{}_.{}".format(self.filename,timestr,self.file_ext)
        st.markdown("#### Download File ###")
        href = f'<a href="data:file/{self.file_ext};base64,{b64}" download="{new_filename}">Click Here!!</a>'
        st.markdown(href,unsafe_allow_html=True)

def main():
    menu = ["Text","CSV"]
    choice = st.sidebar.selectbox("Menu",menu)
    if choice == "Text":
        st.subheader("Text")
        my_text = st.text_area("Your Message")
        if st.button("Save"):
            st.write(my_text)
            download = FileDownloader(my_text).download()
    elif choice == "CSV":
        df = pd.read_csv("iris.csv")
        st.dataframe(df)
        download = FileDownloader(df.to_csv(),file_ext='csv').download()

if __name__ == '__main__':
    main()
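The link-building step can also be isolated into a small pure function, which makes it easier to test outside Streamlit. The sketch below is one possible version: the function name make_download_link and its parameters are hypothetical, and it escapes the file name and label with the standard html module because they are placed inside raw HTML.

```python
import base64
import html


def make_download_link(text, filename, mime="text/plain", label="Download"):
    """Build an <a> tag whose href is a base64 data URI holding the text."""
    # str -> bytes -> base64 bytes -> ASCII str, as in FileDownloader.download()
    b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
    safe_name = html.escape(filename, quote=True)
    safe_label = html.escape(label)
    return f'<a href="data:{mime};base64,{b64}" download="{safe_name}">{safe_label}</a>'


if __name__ == "__main__":
    print(make_download_link("hi", "note.txt"))
```

The returned string would be rendered with st.markdown(link, unsafe_allow_html=True). Recent Streamlit releases also ship a built-in st.download_button() widget, which may be a simpler choice if your installed version provides it.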