
File Handling in Python

Python has built-in support for file handling: users can read, write, and perform many other operations on files. The built-in open() function opens a file and returns a file object that is used to manipulate it.

There are many ways to operate on files. Very large data sets are usually handled with databases or big-data tools. Files of ordinary size can be handled with a Python file object.

Python classifies files into two types: text files and binary files.

  • Text files are sequences of characters and are used to store character data
  • Binary files, such as images, videos, and audio files, store their data as raw bytes

Opening a File

Before performing any operation on a file (such as read or write), we have to open the file. For this purpose, Python provides the built-in function open().

The open(file_name, mode) function is most commonly called with two parameters:-

  • file_name is the name of the file that we need to open. If the file is in the current working directory then we can open it directly by name; otherwise we need to pass the file name along with its path
  • mode determines how the file is opened. There are various modes such as read, write, and append

Modes for opening a Text file

The different modes for opening a text file are:-

  • ‘r’
  • ‘w’
  • ‘a’
  • ‘x’
  • ‘r+’
  • ‘w+’
  • ‘a+’

Read mode (‘r’)

Opens an existing file for reading operation. The file pointer is positioned at the beginning of the file. If the specified file does not exist then we will get FileNotFoundError. This is the default mode.

f = open("file_1.txt", "r")

Write mode (‘w’)

Opens a file for writing. If the file already contains some data then the old data will be overwritten. If the specified file does not exist then this mode will create it.

f = open("file_1.txt", "w")

Append mode (‘a’)

Opens a file for appending. Existing data is not overwritten; new data is added at the end of the file. If the specified file does not exist then this mode will create a new file.

f = open("file_1.txt", "a")

Read and Write mode (‘r+’)

r+ is used to read and write data in an existing file. The file pointer is placed at the beginning of the file and the file is not emptied on opening, so writing replaces the existing data starting from the pointer position.

If the file does not exist then this mode will not create a new file. FileNotFoundError is raised instead.

f = open("file_1.txt", "r+")

Write and Read mode (‘w+’)

w+ is used to write and read data. Opening a file in this mode erases any previous data in it.

If there is no existing file then this mode will create a new file.

f = open("file_1.txt", "w+")

Append and Read mode (‘a+’)

a+ is used to append and read data. It won’t overwrite the existing data in the file. If the specified file does not exist then this mode will create a new file.

f = open("file_1.txt", "a+")

Exclusive mode (‘x’)

x is used to open a file in exclusive creation mode for the write operation. If the file already exists then we will get FileExistsError.

f = open("file_1.txt", "x")
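The behaviour of the modes described above can be checked with a small sketch. It works in a temporary directory so that nothing on disk is overwritten; the file name demo.txt and the variable names are only illustrative assumptions.

```python
import os
import tempfile

# Work in a throwaway directory; "demo.txt" is a hypothetical file name.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")

    # 'r' on a missing file raises FileNotFoundError.
    try:
        open(path, "r")
        missing_error = False
    except FileNotFoundError:
        missing_error = True

    # 'w' creates the file; reopening in 'w' discards the old content.
    f = open(path, "w")
    f.write("first")
    f.close()
    f = open(path, "w")
    f.write("second")
    f.close()

    # 'a' keeps the existing content and adds to the end.
    f = open(path, "a")
    f.write(" third")
    f.close()

    # 'x' refuses to open a file that already exists.
    try:
        open(path, "x")
        exists_error = False
    except FileExistsError:
        exists_error = True

    f = open(path, "r")
    content = f.read()
    f.close()

print(missing_error, exists_error, content)
```

Only "second third" survives in the file, because the second 'w' open discarded "first" and the 'a' open added to what was left.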

Modes for opening a binary file

The modes for opening a binary file mirror the modes used for opening a text file. Adding ‘b’ to any of the modes above opens the file in binary mode.

  • rb -> Opens an existing binary file for reading. If the specified file does not exist then we will get FileNotFoundError
  • wb -> Opens a binary file for writing. The old data is overwritten. Creates a new file if none is found
  • ab -> Opens a binary file for appending. Existing data is not overwritten. Creates a new file if none is found
  • r+b -> Used to read and write data in an existing binary file. Writing replaces existing data starting from the pointer position. If the file does not exist, no new file is created and FileNotFoundError is raised
  • w+b -> Used to write and read data. Opening the file erases the previous data. If the file does not exist, a new file is created
  • a+b -> Used to append and read data. The existing data is not overwritten. If the file does not exist, a new file is created
  • xb -> Used to open a file in exclusive creation mode for writing. If the file already exists, FileExistsError is raised

Closing a File

After performing operations on a file, it is recommended to close it, because each open file object holds system resources.

If we don’t close the file then these resources are not released, and data that has been written may remain in a buffer instead of reaching the disk.

Using the close() method we can close the file and thus release the resources.

f = open("file_1.txt", "r")
# perform operations on file

f.close()
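If an error occurs between open() and close(), the close() call is skipped. One common way to guard against this is a try/finally block. The sketch below is only an illustration and uses a scratch file created with tempfile so that it does not touch real data.

```python
import os
import tempfile

# Hypothetical scratch file so the sketch does not touch real data.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

f = open(path, "w")
try:
    f.write("some data")
finally:
    # Runs whether or not write() raised an exception.
    f.close()

closed_after = f.closed
os.remove(path)
print(closed_after)
```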

Properties of File Object

When we open a file, details about it are stored in the file object and can be accessed through its attributes and methods.

  • f.name returns the name of the file that is opened by the file object
  • f.mode returns the mode in which the file is opened, such as r, w, or a
  • f.closed returns whether the file is closed or not

f = open("file_1.txt", "r")
f.close()
print("file name :-", f.name)
print("file mode :-", f.mode)
print("is file closed :-", f.closed)

# Output
file name :- file_1.txt
file mode :- r
is file closed :- True

  • f.readable() is a method that returns a boolean value indicating whether the file can be read, based on the mode it is opened in
  • f.writable() is a method that returns a boolean value indicating whether the file can be written to, based on the mode it is opened in

f = open("file_1.txt", "r")
# Call these methods before close(); on a closed file they raise ValueError
print("file writable :-", f.writable())
print("file readable :-", f.readable())
f.close()

# Output
file writable :- False
file readable :- True
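To see how readable() and writable() depend on the mode, the following sketch opens the same file in three modes and records both results. The file name and the capabilities dictionary are illustrative assumptions. The methods are called while the file is still open, since calling them on a closed file raises ValueError.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
    open(path, "w").close()  # make sure the file exists

    capabilities = {}
    for mode in ("r", "a", "r+"):
        f = open(path, mode)
        # Query the file object before closing it.
        capabilities[mode] = (f.readable(), f.writable())
        f.close()

for mode, (can_read, can_write) in capabilities.items():
    print(mode, "readable:", can_read, "writable:", can_write)
```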

Write Character data to Text Files

We can write data into the text file by opening a file in ‘w’ mode. If the file is not available then a new file is created with a specified name. Once the file is opened using ‘w’ mode then all the data inside it is overridden with new data.

We can use two methods to write data to text files they are:-

  • write()
  • writelines()

write() method

Using the write() method we can write character data into a text file. But we have to open the file using ‘w’ mode.

Since we are opening the file in write mode, all the data inside the file will be replaced by the data we insert using the write() method.

If no file exists with the specified name then a new file is created.

f = open("file_1.txt", "w")
f.write("hello")
f.write("welcome")
f.write("Daniel")
f.close()

# Opening file_1.txt to see data
hellowelcomeDaniel

As we can see, the data that we have written inside file_1.txt is on a single line, because the write() method does not add a line break by itself. To start a new line after each write() call we need to add \n to the data.

In the below example, we are writing data using \n in 'w' mode thus the old data is overridden with new data in a new line.

f = open("file_1.txt", "w")
f.write("Python\n")
f.write("Files\n")
f.write("Concept\n")
f.close()

# Opening file_1.txt to see data
Python
Files
Concept

To not lose the data every time we write data into the file we need to open the file in append( ‘a’ ) mode.

In the below example, we are first writing the data to the file_1.txt in ‘w’ mode and then in the second phase we open the same file in ‘a’ mode, so previously written data is preserved and the new data is added to the file_1.txt

f = open("file_1.txt", "w")
f.write("hello")
f.write("welcome")
f.write("Daniel")
f.close()

f = open("file_1.txt", "a")
f.write("Python")
f.write("Files")
f.write("Concept")
f.close()

# Opening file_1.txt to see data
hellowelcomeDanielPythonFilesConcept

writelines() method

The writelines() method is used to write a sequence of strings to a file. The sequence can be a list, tuple, set, or dictionary, but every item has to be a string.

The writelines() method writes all the items of the sequence into the file one after another, without adding line breaks.

f = open("file_1.txt", "w")
my_list = ["Jimmy", "Mark", "Rocky"]
f.writelines(my_list)
f.close()

# Opening file_1.txt to see data
JimmyMarkRocky

If we want to write the data in a new line for each item then we need to use “\n”.

f = open("file_1.txt", "w")
my_list = ["Jimmy\n", "Mark\n", "Rocky\n"]
f.writelines(my_list)
f.close()

# Opening file_1.txt to see data
Jimmy
Mark
Rocky

Along with lists, we can pass set, tuple, and dictionary as sequence into writelines() method. In the below example we are passing the tuple as a sequence.

f = open("file_1.txt", "w")
my_tuple = ("Jimmy\n", "Mark\n", "Rocky\n")
f.writelines(my_tuple)
f.close()

# Opening file_1.txt to see data
Jimmy
Mark
Rocky

But if we pass set into writelines() then the order of writing the data into the file might be different. This is because the set is an unordered data type and while writing items to the file the order will be different.

f = open("file_1.txt", "w")
my_set = {"Jimmy\n", "Mark\n", "Rocky\n"}
f.writelines(my_set)
f.close()

# Opening file_1.txt to see data (the order may vary between runs)
Mark
Jimmy
Rocky

If we pass a dictionary as a sequence into writelines() then only keys of the dictionary will be added as data into the file.

f = open("file_1.txt", "w")
my_dict = {"emp1" : "Jim", "emp2" : "Scott"}
f.writelines(my_dict)
f.close()

# Opening file_1.txt to see data
emp1emp2

While passing a dictionary into writelines() we have to make sure the keys of the dictionary are strings. Otherwise we get a TypeError, since writelines() accepts only string data.

If we want to write values of the dictionary to the file then we need to pass my_dict.values() into writelines().

f = open("file_1.txt", "w")
my_dict = {"emp1" : "Jim", "emp2" : "Scott"}
f.writelines(my_dict.values())
f.close()

# Opening file_1.txt to see data
JimScott
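When the items do not already end with “\n”, the line breaks can be added while writing. The sketch below passes a generator expression to writelines(); the file name and sample names are illustrative only.

```python
import os
import tempfile

names = ["Jimmy", "Mark", "Rocky"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "names.txt")  # hypothetical file name

    f = open(path, "w")
    # writelines() adds no separators, so append "\n" to each item here.
    f.writelines(name + "\n" for name in names)
    f.close()

    f = open(path, "r")
    content = f.read()
    f.close()

print(content, end="")
```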

Writing data dynamically from keyboard to a file

We can use the input() function which accepts user_input from the keyboard. We can automate this operation by passing the input function into the loop.

We can also dynamically name the file using the input() function. The filename passed by the user has to be passed into the file open() function.

In the below example the user enters the file name, which is passed into open() in write mode. The while loop keeps iterating, collecting user_input and writing the data into the file.

When the user enters ‘no’ (or anything other than ‘yes’), the loop terminates and the file is closed using file.close().

input_file_name = input("enter file name: ")
file = open(input_file_name, "w")
while True:
    user_input = input("enter data [yes or no]:-")
    if user_input.lower() == "yes":
        input_data = input("data:- ")
        file.write(input_data + '\n')
    else:
        break
file.close()

# input
enter file name: hello.txt
enter data [yes or no]:-yes
data:- steve
enter data [yes or no]:-yes
data:- johny
enter data [yes or no]:-yes
data:- smith
enter data [yes or no]:-no

# Opening hello.txt
steve
johny
smith

Read Character data from Text file

Just as we can write data into a file, we can also read data from it. In Python, the file object has built-in methods for reading data from the file. They are:-

  • read()
  • read(n)
  • readline()
  • readlines()

There is a file named file_1.txt and let’s try to use above mentioned methods to read the data from the file.

read() function

Using the read() function we can get all the data present in the file, regardless of how much data it contains. We open the file in “r” mode.

Note:- If we do not pass any mode for opening the file then by default file will be opened in read mode

f = open("file_1.txt", "r")
data = f.read()
print(data)
f.close()

# Output
Hello
Welcome John
let's start 
studying
python

read(n) function

If we pass a parameter ‘n’ into the read() function then at most n characters from the file are returned. We have to make sure the parameter passed is an integer.

f = open("file_1.txt", "r")
data = f.read(3)
print(data)
f.close()

# Output
Hel

When the data spans multiple lines, the lines are separated by “\n” characters in the data. While reading, each “\n” is also counted as one character.

f = open("file_1.txt", "r")
data = f.read(8)
print(data)
f.close()

# Output
Hello
We

If we pass a value greater than the total number of characters present in the file then all characters are returned. For example, if file_1.txt contains 45 characters and we call f.read(200), all the available characters are returned.

If we pass a negative value into read(n), such as f.read(-4), f.read(-1), or f.read(-234), all the remaining characters are also returned, irrespective of the negative value passed.

f = open("file_1.txt", "r")
print(f.read(200))
f.seek(0)  # go back to the start; otherwise the next read() returns an empty string
print(f.read(-2))
f.seek(0)
print(f.read(-462))
f.close()

# Output
Hello
Welcome John
Hello
Welcome John
Hello
Welcome John
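A common use of read(n) is to process a file in fixed-size pieces. The sketch below reads 8 characters at a time until read() returns an empty string, which signals the end of the file. The file name, its contents, and the chunk size are illustrative assumptions.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
    f = open(path, "w")
    f.write("Hello\nWelcome John")
    f.close()

    chunks = []
    f = open(path, "r")
    while True:
        chunk = f.read(8)   # at most 8 characters per call
        if chunk == "":     # an empty string means end of file
            break
        chunks.append(chunk)
    f.close()

print(chunks)
```

The last chunk is shorter than 8 characters because only two characters were left in the file.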

readline() function

The readline() function is used to read the data in the file line by line. Each time we call readline() the next line is returned. It returns a complete line irrespective of the number of characters in the line.

f = open("file_1.txt", "r")
print(f.readline(), end = "")
print(f.readline(), end = "")
f.close()

# Output
Hello
Welcome John

readlines() function

readlines() function converts the data present in the file into a list. Each item in the list is a line from the file. Thus if we want to store each line as an item in a list, we can use readlines() function.

f = open("file_1.txt", "r")
data = f.readlines()
f.close()

We can access this list by iterating over it using a for loop.

for i in data:
    print(i, end = "")  # each item already ends with "\n"

# Output
Hello
Welcome John
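Each item returned by readlines() keeps its trailing “\n”. As a hedged sketch, the example below compares readlines() with iterating over the file object directly, which yields one line at a time, and strips the newline from each line. The file name and contents are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
    f = open(path, "w")
    f.write("Hello\nWelcome John\n")
    f.close()

    # readlines() keeps the newline at the end of every item.
    f = open(path, "r")
    raw_lines = f.readlines()
    f.close()

    # Iterating over the file object yields one line at a time.
    clean_lines = []
    f = open(path, "r")
    for line in f:
        clean_lines.append(line.rstrip("\n"))
    f.close()

print(raw_lines)
print(clean_lines)
```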

os.path.isfile()

Using os.path.isfile() we can check whether a file is available at a particular location. It returns a boolean value, True or False. If the specified path exists and is a regular file then it returns True.

If the specified path does not exist, or it is a directory, then it returns False.

import os
# Raw string (r"...") so that backslashes in the Windows path are not treated as escapes
print(os.path.isfile(r"C:\Users\Desktop\file_1.txt"))

# Output
True
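The difference between a file, a directory, and a missing path can be sketched as follows. The paths are created inside a temporary directory and the names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "file_1.txt")  # hypothetical file name
    open(path, "w").close()  # create an empty file

    file_result = os.path.isfile(path)      # an existing regular file
    dir_result = os.path.isfile(tmp_dir)    # a directory is not a file
    dir_exists = os.path.exists(tmp_dir)    # but the path itself exists
    missing_result = os.path.isfile(os.path.join(tmp_dir, "missing.txt"))

print(file_result, dir_result, dir_exists, missing_result)
```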

with statement in File Handling

The with statement is an alternative way to open a file. It groups the statements related to the file into a single block known as the with statement block.

With the with statement we don’t have to close the file explicitly; when control leaves the with block, the file is closed automatically. Thus the with statement improves code readability and reduces complexity.

The syntax of the with statement starts with the with keyword, followed by a call to open(). The opened file object can be bound to a variable using the as keyword.

with open("file_1.txt", "w") as f:

In the below example, we are opening a file and writing some data into it. Without explicitly calling f.close(), the with statement closes the file once control comes out of the with block.

We can check this using the f.closed attribute, which is True or False depending on whether the file is closed.

with open("file_1.txt", "w") as f:
    f.write("hello")
    print(f.closed)
print(f.closed)

# Output
False
True

As we can see from the above example, f.closed is False inside the with block and True outside it. Thus the with block automatically closes the file after performing operations on it.
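The with block also closes the file when an error occurs inside it. The sketch below raises a RuntimeError on purpose (a simulated failure, not part of any real workflow) and then checks that the file was closed and the data written before the error was saved.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    try:
        with open(path, "w") as f:
            f.write("hello")
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass

    # The file was closed even though the block ended with an exception.
    closed_after_error = f.closed

    with open(path, "r") as f:
        saved = f.read()

print(closed_after_error, saved)
```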

tell() and seek() methods in File Handling

The tell() and seek() methods are related to the file pointer (cursor). When a text file is opened the cursor is positioned at the first character. As we read or write data, the position of the cursor changes accordingly.

tell() method

The tell() method returns the current position of the cursor in the file. When we open a text file the cursor is placed at the first character, so tell() returns the cursor position as zero.

The index of the first character is zero and increases by 1 for each character. Thus tell() indirectly returns the index of the next character to be read by the file object.

f = open("file_1.txt", "r")
print(f.tell())
f.close()

# Output
0

In the below example, we are reading the first line of the file_1.txt, and let’s check the cursor position by using the tell() method.

f = open("file_1.txt", "r")
f.readline()
print(f.tell())
f.close()

# Output
6

After reading the five characters of the first line we might expect the cursor position to be 5, but tell() returns 6 because there is a newline character ( \n ) at the end of the first line, which makes the index of the first character in the second line 6 rather than 5.

seek() method

seek() method moves the file pointer to a defined position. It takes an integer as a parameter and moves the cursor to that location.

It’s mostly useful to read the file from a particular character if we know the index of that character. The syntax of the seek() method is given below:-

file_object.seek(n)

Where n is the number of characters to skip, counted from the first character of the file, before placing the cursor.

In the below example, we are moving the cursor past the third character of the second line, which means the number of characters to skip from the start of the file is 9 (8 characters + 1 newline character “\n”).

Using readline() we then read the rest of the characters present in the line.

f = open("file_1.txt", "r")
f.seek(9)
print("cursor position :-", f.tell())
print(f.readline())
f.close()

# Output
cursor position :- 9
come John
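seek() also accepts an optional second argument that says where the offset is measured from: os.SEEK_SET (the start, the default), os.SEEK_CUR (the current position), or os.SEEK_END (the end of the file). Seeking relative to the end with a non-zero offset is only allowed in binary mode, so the sketch below opens the file with “rb”. The file name and contents are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"Hello\nWelcome John")

    with open(path, "rb") as f:
        f.seek(6)                        # offset counted from the start
        start_of_line_two = f.read(7)    # the 7 bytes of "Welcome"
        position_after_read = f.tell()   # 6 + 7 bytes

        f.seek(-4, os.SEEK_END)          # 4 bytes before the end of the file
        last_four = f.read()

print(start_of_line_two, position_after_read, last_four)
```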

Reading and Writing Binary Files

Binary files include images, videos, audio files, etc. The data stored in a binary file is not character data, so it cannot be read as text. Thus we need to open these files in binary mode.

The modes for opening binary files are suffixed with “b”, such as “wb”, “rb”, “ab”, “r+b”, “w+b”, “a+b”, “xb”.

Reading and writing into Binary files is similar to performing the same operations on a text file. But the correct mode has to be used based on the file we are handling.

f = open("selfie.jpg", "rb")
data = f.read()  # in binary mode read() returns a bytes object
print(type(data))
f.close()

# Output
<class 'bytes'>

If we try to open these files we can only see the data in encoded format. It’s better to handle media files with other modules such as Pillow, matplotlib, OpenCV, etc.
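Writing a binary file works the same way with “wb”. As a minimal sketch, the example below writes a few arbitrary bytes to one file and copies them to another in small chunks. The file names, sample bytes, and chunk size are assumptions made for illustration.

```python
import os
import tempfile

payload = bytes([0, 255, 16, 32])  # arbitrary sample bytes

with tempfile.TemporaryDirectory() as tmp_dir:
    source = os.path.join(tmp_dir, "source.bin")  # hypothetical names
    target = os.path.join(tmp_dir, "copy.bin")

    with open(source, "wb") as f:
        f.write(payload)

    # Copy the file two bytes at a time.
    with open(source, "rb") as src, open(target, "wb") as dst:
        while True:
            chunk = src.read(2)
            if not chunk:      # empty bytes object means end of file
                break
            dst.write(chunk)

    with open(target, "rb") as f:
        copied = f.read()

print(copied == payload)
```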

Handling CSV Files

CSV (comma-separated values) files store rows of data as plain text. These files mostly use a comma as the delimiter. A delimiter is one or more characters that separate the values in a line.

These CSV files are often used to exchange data between different applications.

Python provides the csv module to handle CSV files. Using this module we can read and write data in a CSV file.

Write Data into CSV File

For writing the data into a CSV file we need to use csv.writer(file_object) which returns a CSV writer object.

Using this CSV writer object we can write a list of data as rows into a CSV file. For entering a row into a CSV file we need to use .writerow() method.

import csv
with open("data.csv", "w") as f:
    writer = csv.writer(f)
    writer.writerow(["Name", "age", "email"])
    writer.writerow(["John", "34", "[email protected]"])
    writer.writerow(["smith", "26", "[email protected]"])

But on Windows, if the file is opened with the default settings, one extra blank row is inserted for every .writerow() operation.

To avoid this blank row we need to pass one more argument while opening the file. If we pass newline = '' then the blank rows will not appear.

import csv
with open("data.csv", "w", newline = '') as f:
    writer = csv.writer(f)
    writer.writerow(["Name", "age", "email"])
    writer.writerow(["John", "34", "[email protected]"])
    writer.writerow(["smith", "26", "[email protected]"])

Note:- If we do not pass the newline argument, blank lines may be included between the rows of the CSV file. Passing newline = '' prevents these blank lines.
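Several rows can be written in one call with writerows(). The sketch below also shows that a value containing a comma is quoted automatically by the writer. The “city” column and its values are made-up sample data, and the file is written to a temporary directory.

```python
import csv
import os
import tempfile

# Hypothetical sample rows; the "city" column is made up for this sketch.
rows = [
    ["Name", "age", "city"],
    ["John", "34", "Austin, TX"],
    ["smith", "26", "Boston"],
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.csv")

    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerows(rows)  # one call writes every row

    # Read the raw text back without newline translation.
    with open(path, "r", newline="") as f:
        raw_text = f.read()

print(raw_text)
```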

Read Data from CSV File

To read the data from CSV file we can use csv.reader() which returns a CSV reader object. We can associate this reader object with other methods to read particular data from a CSV file.

Since we need to read the data of the CSV file, we have to open the file in read( ‘r’ ) mode.

import csv
with open("data.csv", "r") as f:
    read = csv.reader(f)
    print(type(read))

# Output
<class '_csv.reader'>

If we try to print the data then it will be returning a list of list objects, where each row is represented as one list and each cell is an item in these lists. All these lists are represented in a list.

import csv
with open("data.csv", "r") as f:
    read = csv.reader(f)
    data = list(read)
print(data)

# Output
[['Name', 'age', 'email'], 
 ['John', '34', '[email protected]'], 
 ['smith', '26', '[email protected]']]

As we know, the data in CSV files are stored in the form of a list of data, we can iterate over the csv.reader() object to print each row.

import csv
with open("data.csv", "r") as f:
    read = csv.reader(f)
    data = list(read)
    for row in data:
        for cell in row:
            print(cell, "\t", end = "")
        print()  # end the current row

# Output
Name  age  email
John  34   [email protected]
smith 26   [email protected]
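When the first row holds column names, csv.DictReader can be used to access each cell by name instead of by position. The sketch below first writes a small sample file and then reads it back; the file contents are illustrative and every value is read as a string.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.csv")  # hypothetical sample file

    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Name", "age"])
        writer.writerow(["John", "34"])
        writer.writerow(["smith", "26"])

    with open(path, "r", newline="") as f:
        reader = csv.DictReader(f)  # uses the first row as keys
        records = list(reader)

names = [record["Name"] for record in records]
ages = [int(record["age"]) for record in records]  # convert text to int
print(names, ages)
```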

Zip and Unzip Files

Zip is a common file format that is used to compress one or more files together into a single archive. Zipping files reduces their total size, which makes them easier to store and share.

This is because zipping compresses the contents of the files, so there is less data to store and transfer.

  • Zipping improves storage utilization by compressing the data without losing it
  • We can reduce upload, sharing, and download time by zipping the files
  • Transfers are faster because less data has to be moved

Zipping a File

Zipping a file means compressing one or more files into a single file by using specific encoding formats. Python provides modules and classes to zip a file. To perform zipping operation we need to import the zipfile module.

from zipfile import *

Once we import the zipfile module we can create a new archive using the ZipFile() class. Since we are creating a zip file, we pass write ( “w” ) as the opening mode.

We also pass a constant for the usual zip compression method, known as “ZIP_DEFLATED”. To add files to the zip object we use the write() method.

from zipfile import *
f = ZipFile("data.zip", "w", ZIP_DEFLATED)
f.write("file_1.txt")
f.write("file_2.txt")
f.write("file_3.txt")
f.close()
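ZipFile objects can also be used in a with statement, which closes the archive automatically. The sketch below zips one repetitive text file and compares its original and compressed sizes using getinfo(). The file names and contents are illustrative, and the arcname argument is used so that the archive stores a short name instead of the full temporary path.

```python
import os
import tempfile
from zipfile import ZipFile, ZIP_DEFLATED

with tempfile.TemporaryDirectory() as tmp_dir:
    text_path = os.path.join(tmp_dir, "file_1.txt")  # hypothetical file
    with open(text_path, "w") as f:
        f.write("hello " * 1000)  # repetitive text compresses well

    zip_path = os.path.join(tmp_dir, "data.zip")
    with ZipFile(zip_path, "w", ZIP_DEFLATED) as archive:
        # arcname stores the file under a short name inside the archive.
        archive.write(text_path, arcname="file_1.txt")

    with ZipFile(zip_path, "r") as archive:
        info = archive.getinfo("file_1.txt")
        original_size = info.file_size
        compressed_size = info.compress_size

print(original_size, compressed_size)
```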

Unzipping a File

Unzipping a file means extracting the files from a zipped file or similar archive. To read a zip file we open it in read ( “r” ) mode. The “ZIP_STORED” constant is the default compression argument and has no effect when reading, so it can be omitted.

from zipfile import *
f = ZipFile("data.zip", "r", ZIP_STORED)

To access the file names that are being unzipped we can use namelist() method which returns the names of the files in the zip file.

from zipfile import *
f = ZipFile("data.zip", "r", ZIP_STORED)
file_names = f.namelist()
print(file_names)
f.close()

# Output
['file_1.txt', 'file_2.txt', 'file_3.txt']
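To actually extract the files, the ZipFile object provides the extractall() method. The sketch below builds a small archive with writestr(), extracts it into a new folder, and reads one extracted file back. The archive contents and the folder name “unzipped” are assumptions made for illustration.

```python
import os
import tempfile
from zipfile import ZipFile, ZIP_DEFLATED

with tempfile.TemporaryDirectory() as tmp_dir:
    zip_path = os.path.join(tmp_dir, "data.zip")

    # Build a small sample archive directly from strings.
    with ZipFile(zip_path, "w", ZIP_DEFLATED) as archive:
        archive.writestr("file_1.txt", "hello")
        archive.writestr("file_2.txt", "world")

    out_dir = os.path.join(tmp_dir, "unzipped")  # hypothetical folder name
    with ZipFile(zip_path, "r") as archive:
        names = archive.namelist()
        archive.extractall(out_dir)  # creates out_dir if needed

    with open(os.path.join(out_dir, "file_2.txt"), "r") as f:
        extracted_text = f.read()

print(names, extracted_text)
```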