
Working with text files in Python

The configuration of networking devices is saved in the form of text files, so it is essential to know how to work with text files using Python. Text in Python 3 consists of Unicode characters. ASCII defines 128 characters and is a subset of Unicode, which defines more than 100,000 characters. When text is stored in a file, the Unicode characters are encoded as bytes using an encoding such as UTF-8, UTF-16 or UTF-32.
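
As a small illustration of how ASCII relates to Unicode and its encodings, the sketch below encodes two short strings and compares the resulting byte counts. The sample strings are made up for this example and are not part of the configuration file used later.

```python
# ASCII characters take one byte each in UTF-8; other characters need more.
ascii_text = "Hostname"
non_ascii_text = "é"  # hypothetical sample character outside the ASCII range

utf8_ascii = ascii_text.encode("utf-8")
utf8_non_ascii = non_ascii_text.encode("utf-8")
utf32_non_ascii = non_ascii_text.encode("utf-32-le")  # little-endian, no byte order mark

print(len(utf8_ascii))       # 8: one byte per ASCII character
print(len(utf8_non_ascii))   # 2: this character takes two bytes in UTF-8
print(len(utf32_non_ascii))  # 4: UTF-32 uses four bytes per character

# Decoding with the same encoding restores the original string.
round_trip = utf8_non_ascii.decode("utf-8")
print(round_trip == non_ascii_text)  # True
```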

Opening and reading text files in python

To read or write text files using Python, no additional library is needed. Python has a built-in open() function that returns a file object. The file object provides methods and attributes that give information about the opened file and allow us to work with it.

#configuration.txt file content
Hostname 172.16.16.200
Hello World
This is the text file
f=open('configuration.txt' , 'rt')
  • Syntax: f = open(filename, mode), where f is the file object (file handle)
  • Returns: a file object which can be used to read, write and modify the file
  • filename – The exact name (or path) of the file to open
  • mode – The basic modes are:
    • Read(r) – It opens the text file in read-only mode. Files opened in read mode cannot be edited. It is the default mode.
    • Write(w) – It opens the file for writing. An existing file is truncated to zero length; a new file is created if it doesn't exist.
    • Append(a) – It allows adding data to the end of the existing data. The cursor is positioned at the end of the present data in the file. A new file is created if it doesn't exist.
    • Write and Read
      • (w+) – It allows reading and writing. An existing file is truncated to zero length, so its previous content is lost; a new file is created if it doesn't exist. The cursor is positioned at the beginning of the file.
      • (r+) – It also allows reading and writing. Unlike w+, it neither deletes the content nor creates a new file. If the file doesn't exist, it raises a FileNotFoundError exception.
  • file type – In Python, files are categorized as either text files or binary files. The default file type is text.
    • Text file – A file that contains a sequence of characters organized into lines, each terminated with the special EOL (End Of Line) character, represented as "\n". It is specified as 't' along with the mode.
    • Binary file – Any file that is not a text file. These files can only be processed by applications that understand their file structure. Binary files could be photos, PDF files, executable files and so on. It is specified as 'b' along with the mode.
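
The behaviour of the modes can be checked with a short experiment. The following is a minimal sketch, assuming a throwaway file named demo.txt (a hypothetical name) inside a temporary directory so that no real file is touched.

```python
import os
import tempfile

missing_raised = False

# Work in a temporary directory; 'demo.txt' is a hypothetical file name.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")

    # 'r+' requires the file to exist already.
    try:
        open(path, "r+")
    except FileNotFoundError:
        missing_raised = True

    # 'w' creates the file (or truncates an existing one).
    f = open(path, "w")
    f.write("first\n")
    f.close()

    # 'a' keeps the existing content and writes at the end.
    f = open(path, "a")
    f.write("second\n")
    f.close()

    # 'r' is the default mode, so it can be omitted.
    f = open(path)
    after_append = f.read()
    f.close()

    # Opening with 'w' again discards the previous content.
    f = open(path, "w")
    f.close()
    f = open(path)
    after_truncate = f.read()
    f.close()

print(missing_raised)        # True
print(repr(after_append))    # 'first\nsecond\n'
print(repr(after_truncate))  # ''
```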

read() function

The read() function reads the entire text and returns it as a string stored in the variable content. The print() function prints the entire content of the text file.

  • Syntax: f.read(), f is a file pointer
  • Returns: the content of the file as a string. When a number n is passed, as in f.read(n), at most n characters are returned.
content=f.read()
print(content)

Output

Hostname 172.16.16.200
Hello World
This is the text file

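Closing a text file

The close() method is used to close the opened file. The closed attribute of the file object reports whether the file is closed.

  • Syntax: f.close(), where f is a file object
  • Returns: close() returns nothing (None). The attribute f.closed is True if the file is closed and False if the file is open.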
print(f.closed)
f.close()
print(f.closed)

The above code snippet prints the status of the file before and after closing it.

Output

False
True

tell and seek functions

The read() function also accepts the number of characters to be read from the file. Assuming the file has been opened again, the variable content stores the first 5 characters of the file, while the variable nextcontent stores the 3 characters that follow them. This is because the cursor stays at the last read position.

content=f.read(5)
print(content)
nextcontent=f.read(3)
print(nextcontent)

Output

Hostn
ame

tell() function

The tell() function is used to find out the cursor position.

  • Syntax: f.tell(), where f is a file pointer
  • Returns: current position of the cursor.
print(f.tell())

Output

8

seek() function

The seek() method is used to move to a specific position inside the file. This method takes a number as an argument, which indicates the position counted from the beginning of the file, and moves the cursor to that position.

  • Syntax: f.seek(offset), where f is a file object
  • Offset: Position to move to, counted from the beginning of the file
  • Returns: The new cursor position
f.seek(2)
print(f.read(3))

The above code snippet moves the cursor to offset 2, that is, just past the first two characters of the text file, and then prints the following 3 characters.

Output

stn
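
A common use of tell() and seek() together is to remember a position and return to it later. The sketch below assumes a temporary copy of the first two lines of configuration.txt; the variable names are chosen only for this illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Recreate part of the sample file in a temporary location.
    path = os.path.join(tmp, "configuration.txt")
    with open(path, "w") as f:
        f.write("Hostname 172.16.16.200\nHello World\n")

    f = open(path, "rt")
    f.read(9)                     # skip over 'Hostname '
    bookmark = f.tell()           # remember where the address starts
    address = f.read(13)          # read the 13 characters of the address
    f.seek(bookmark)              # jump back to the remembered position
    address_again = f.read(13)    # the same 13 characters are read again
    rewound = f.seek(0)           # seek() returns the new position
    first_word = f.read(8)
    f.close()

print(address)        # 172.16.16.200
print(address_again)  # 172.16.16.200
print(rewound)        # 0
print(first_word)     # Hostname
```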

with keyword

The with keyword introduces a block of code. A file opened using the with statement is closed automatically when the block ends. The file can no longer be read or written outside the block.

Syntax: with open(filename,mode) as file:

with open('configuration.txt','r') as file:
        file.read()
print(file.closed)

In the above code, the closed attribute is True, as the file is closed automatically outside the with block of code.

Output

True
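
To see what "closed automatically" means in practice, the following sketch tries to read from the file object after the with block has ended. It assumes a temporary file with made-up content; the flag names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "configuration.txt")
    with open(path, "w") as f:
        f.write("Hello World\n")

    with open(path) as file:
        content = file.read()      # works: the file is open inside the block

    closed_after_block = file.closed

    # Outside the block the file object still exists, but it is closed,
    # so any further read raises ValueError.
    try:
        file.read()
        read_failed = False
    except ValueError:
        read_failed = True

print(repr(content))        # 'Hello World\n'
print(closed_after_block)   # True
print(read_failed)          # True
```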

Reading files into a list

In Python, lists are among the most versatile and most commonly used data structures for file processing. The content of a file can be converted into a list in several ways. Some of them are as follows:

splitlines() method

splitlines() method splits the string(read from file) at the line breaks and returns the list of lines in the string. Each line of the file becomes an element of the list.

with open('configuration.txt') as file:
        my_list = file.read().splitlines()
        print(my_list)

Output

['Hostname 172.16.16.200','Hello World','This is the text file']

readlines() method

readlines() method reads until the End Of the File (EOF) and returns a list containing the lines of the file. With this method, each element keeps its trailing \n (newline).

with open('configuration.txt','r') as file:
        my_list = file.readlines()
        print(my_list)

Output

['Hostname 172.16.16.200\n','Hello World\n','This is the text file\n']

readline() method

readline() method will read just a line, not the entire file. To read a single line this method can be used.

with open('configuration.txt') as file:
        my_list = file.readline()
        print(my_list)

Output

Hostname 172.16.16.200

To read more than one line, iterate over the file object with a for loop. Each iteration returns the next line, just as repeated calls to readline() would.

with open('configuration.txt') as file:
        for line in file:
                print(line,end='')

Output

Hostname 172.16.16.200
Hello World
This is the text file
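
The same result can be obtained by calling readline() explicitly until it returns an empty string, which signals the end of the file. This is a minimal sketch assuming a temporary copy of the sample file; the trailing newline of each line is removed with rstrip().

```python
import os
import tempfile

lines = []

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "configuration.txt")
    with open(path, "w") as f:
        f.write("Hostname 172.16.16.200\nHello World\nThis is the text file\n")

    with open(path) as file:
        while True:
            line = file.readline()
            if line == "":          # empty string: end of file reached
                break
            lines.append(line.rstrip("\n"))

print(lines)
# ['Hostname 172.16.16.200', 'Hello World', 'This is the text file']
```

A blank line inside the file is returned as "\n", not as an empty string, so the loop does not stop early.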

Writing to text files

To write text into text files, the write() and writelines() methods are used.

write() function

To write into a text file, open the file in write mode and use the write() function to write the contents. If the file already exists, it is overwritten; otherwise a new file is created. The write() function does not add a newline; it must be given explicitly.

To append content to the existing content of the text file, open the file in append mode.

Syntax: f.write(string), where f is a file handle.

with open('mytext.txt','a') as f:
   f.write('Hello World\n')
   f.write('Welcome Home\n')

Output

#mytext.txt
Hello World
Welcome Home

When a file is opened in r+ mode, the cursor starts at the beginning of the file, and write() overwrites the existing characters from the cursor position onwards; it does not insert text. To write at a particular position, move the cursor to the desired position using the seek function and then use the write() function. The following code snippet overwrites five characters starting at position 5 in r+ mode.

with open('configuration.txt','r+') as file:
   file.seek(5)
   file.write('hello')

Output

#configuration.txt
Hostnhello72.16.16.200
Hello World
This is the text file
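
Because write() replaces characters rather than inserting them, inserting text in the middle of a file usually means reading the content, building the new string and writing it back. The following is one possible sketch, assuming a temporary copy of the sample file and a hypothetical insert position of 5.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "configuration.txt")
    with open(path, "w") as f:
        f.write("Hostname 172.16.16.200\nHello World\n")

    with open(path, "r+") as file:
        content = file.read()                 # read everything first
        position = 5                          # hypothetical insert position
        updated = content[:position] + "hello" + content[position:]
        file.seek(0)                          # go back to the beginning
        file.write(updated)                   # write the rebuilt content
        file.truncate()                       # drop leftovers if the new text were shorter

    with open(path) as file:
        result = file.read()

print(result, end="")
# Hostnhelloame 172.16.16.200
# Hello World
```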

writelines()

The writelines() function is used to write multiple strings at a time.

Syntax : f.writelines(L) for L=[string1,string2,string3]

file = open('test.txt','w')
L=["Hi\n","Welcome to ","the new home\n","Thankyou"]
file.writelines(L)
file.close()

Output

#test.txt
Hi
Welcome to the new home
Thankyou
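
Like write(), writelines() does not add line breaks on its own. When the strings do not already end with a newline, one way to add it is a generator expression, as in this sketch. The list named words and the file name are made up for the example.

```python
import os
import tempfile

words = ["Hi", "Welcome", "Thankyou"]   # hypothetical strings without newlines

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")

    # Without added newlines everything ends up on one line.
    with open(path, "w") as file:
        file.writelines(words)
    with open(path) as file:
        joined = file.read()

    # Appending '\n' to each string gives one line per item.
    with open(path, "w") as file:
        file.writelines(word + "\n" for word in words)
    with open(path) as file:
        separate = file.read()

print(repr(joined))    # 'HiWelcomeThankyou'
print(repr(separate))  # 'Hi\nWelcome\nThankyou\n'
```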

Working with CSV file in Python

Consider the CSV file airtravel.csv. This file contains the data for the monthly transatlantic air travel in thousands of passengers from the year 1958 to 1960. This file contains 4 fields: Month, 1958, 1959, and 1960. It contains 12 records from January through December.

#airtravel.csv
"Month", "1958", "1959", "1960"
"JAN", 340, 360, 417
"FEB", 318, 342, 392
"MAR", 362, 406, 419
"APR", 348, 396, 461
"MAY", 363, 420, 472
"JUN", 435, 472, 535
"JUL", 491, 548, 622
"AUG", 505, 559, 606
"SEP", 404, 463, 508
"OCT", 359, 407, 461
"NOV", 310, 362, 390
"DEC", 337, 405, 432

reader() function

To work with CSV files we have to import the built-in csv module. A CSV file is opened like a text file, using the open() function. To read the CSV file, the reader() function from the csv module is used.

The reader() function returns a reader object that iterates through all the lines in the CSV file and returns each row as a list of strings. No automatic data type conversion is performed.

  • Syntax : csv.reader(f,parameters), where f is the file handle.
  • Returns : a reader object that yields each record as a list of strings
import csv
with open ('airtravel.csv' , 'r') as f:
     reader = csv.reader(f)
     for row in reader:
        print(row)

Output

['Month', ' "1958"', ' "1959"', ' "1960"']
['JAN', ' 340', ' 360', ' 417']
['FEB', ' 318', ' 342', ' 392']
['MAR', ' 362', ' 406', ' 419']
['APR', ' 348', ' 396', ' 461']
['MAY', ' 363', ' 420', ' 472']
['JUN', ' 435', ' 472', ' 535']
['JUL', ' 491', ' 548', ' 622']
['AUG', ' 505', ' 559', ' 606']
['SEP', ' 404', ' 463', ' 508']
['OCT', ' 359', ' 407', ' 461']
['NOV', ' 310', ' 362', ' 390']
['DEC', ' 337', ' 405', ' 432']

To display only the second column, which is the records of the field 1958, we have to use the index. The index starts from 0. To print the second column, the index must be mentioned as 1.

import csv
with open ('airtravel.csv' , 'r') as f:
     reader = csv.reader(f)
     for row in reader:
        print(row[1])

Output

"1958"
340
318
362
...

To skip the header in that particular field, use the next() function.

import csv
with open ('airtravel.csv' , 'r') as f:
     reader = csv.reader(f)
     next(reader)
     for row in reader:
        print(row[1])

Output

340
318
362
...
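
Since every field arrives as a string, numeric columns have to be converted before any arithmetic. The sketch below assumes two rows in the same layout as airtravel.csv, held in memory with io.StringIO instead of a real file, and adds up the three years for each month. int() ignores surrounding whitespace, so the space after each comma does no harm.

```python
import csv
import io

# Two sample rows in the layout of airtravel.csv, kept in memory.
sample = (
    '"Month", "1958", "1959", "1960"\n'
    '"JAN", 340, 360, 417\n'
    '"FEB", 318, 342, 392\n'
)

reader = csv.reader(io.StringIO(sample))
next(reader)                                    # skip the header row

totals = {}
for row in reader:
    month = row[0]
    counts = [int(value) for value in row[1:]]  # convert the strings to integers
    totals[month] = sum(counts)

print(totals)  # {'JAN': 1117, 'FEB': 1052}
```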

The reader() function takes an optional parameter called delimiter. The delimiter can be a colon, semi-colon, tab, or any other single character; it tells the reader which character separates the fields in the file.

#sample.csv
1;John;Linux
2;Tim;Web development
3;Charan;Python

import csv
with open('sample.csv', 'r') as f:
    reader = csv.reader(f, delimiter = ';')
    for row in reader:
        print(row)

Output

['1', 'John', 'Linux']
['2', 'Tim', 'Web development']
['3', 'Charan', 'Python']
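
The effect of the delimiter parameter is easiest to see by parsing the same data twice. This sketch assumes colon-separated sample data held in memory with io.StringIO; when the delimiter does not match the file, each line comes back as a single field.

```python
import csv
import io

data = "1:John:Linux\n2:Tim:Web development\n"   # hypothetical colon-separated data

# Matching delimiter: each line is split into three fields.
with_colon = list(csv.reader(io.StringIO(data), delimiter=":"))

# Default delimiter (comma): nothing is split.
with_default = list(csv.reader(io.StringIO(data)))

print(with_colon)    # [['1', 'John', 'Linux'], ['2', 'Tim', 'Web development']]
print(with_default)  # [['1:John:Linux'], ['2:Tim:Web development']]
```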

The CSV file can contain additional spaces after the delimiter, and these spaces also appear in the output. To remove these leading spaces, the skipinitialspace parameter can be used.

#sample.csv
1, John, Linux
2, Tim, Web development
3, Charan, Python

import csv
with open('sample.csv', 'r') as f:
    reader = csv.reader(f, skipinitialspace=True)
    for row in reader:
        print(row)

Output

['1', 'John', 'Linux']
['2', 'Tim', 'Web development']
['3', 'Charan', 'Python']
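
For comparison, the sketch below parses the same kind of data with and without skipinitialspace, using io.StringIO in place of a real file. Only the spaces directly after the delimiter are removed; spaces inside a field, as in "Web development", are kept.

```python
import csv
import io

data = "1, John, Linux\n2, Tim, Web development\n"

# Default behaviour: the space after each comma stays in the field.
plain = list(csv.reader(io.StringIO(data)))

# skipinitialspace=True drops whitespace that directly follows the delimiter.
cleaned = list(csv.reader(io.StringIO(data), skipinitialspace=True))

print(plain)    # [['1', ' John', ' Linux'], ['2', ' Tim', ' Web development']]
print(cleaned)  # [['1', 'John', 'Linux'], ['2', 'Tim', 'Web development']]
```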

To find the busiest month in the year 1958 from the file airtravel.csv, dictionaries in Python can be used. The values() method returns the values stored in the dictionary. The max() function returns the maximum value in the dictionary; passing key=int makes it compare the values as numbers rather than as text.

import csv
with open ('airtravel.csv' , 'r') as f:
     reader = csv.reader(f)
     next(reader)
     year_1958 = dict()  #dictionary is created
     for row in reader:
        year_1958[row[0]]=row[1]
max_1958 = max(year_1958.values(), key=int)  #compare as numbers
for k, v in year_1958.items():
    if max_1958 == v:
       print(f'Busiest month in 1958:{k}, Flights: {v.strip()}')

The strip() method is used to remove the excess whitespace present in the CSV file.

Output

Busiest month in 1958:AUG, Flights: 505
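
An alternative sketch stores the counts as integers from the start, which avoids comparing numbers as text. It assumes three rows in the layout of airtravel.csv held in memory; the last lines show why comparing strings can give the wrong answer.

```python
import csv
import io

sample = (
    '"Month", "1958", "1959", "1960"\n'
    '"JUL", 491, 548, 622\n'
    '"AUG", 505, 559, 606\n'
    '"SEP", 404, 463, 508\n'
)

reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
next(reader)                                          # skip the header row

# Map each month to its 1958 value, converted to an integer.
year_1958 = {row[0]: int(row[1]) for row in reader}

# max() over the keys, ranked by their values, returns the busiest month.
busiest = max(year_1958, key=year_1958.get)
print(busiest, year_1958[busiest])                    # AUG 505

# Strings are compared character by character, so '99' ranks above '505'.
as_text = max(["99", "505"])
as_numbers = max(["99", "505"], key=int)
print(as_text, as_numbers)                            # 99 505
```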

Writing CSV files

writer() and writerow() function

To write to CSV files, the writer() function in the csv module is used. The writer() function returns a writer object which writes each row to the CSV file. The writer object has a method named writerow(), which takes a list or tuple. Each item in the list becomes a column in the CSV file. Let's see the code to write content into a CSV file.

  • Syntax : csv.writer(f,parameters), where f is the filehandle.
  • Returns : writer object.
  • Syntax : writerobject.writerow(list or tuple).
#people.csv
ID,Name,City
1,John,London
2,Marry,Newyork
3,Chandler,Tulsa
4,Joey,Paris

import csv
with open('people.csv', 'a', newline='') as csvfile:
   writer = csv.writer(csvfile)
   csvdata = (5,'Anne','Amsterdam')
   writer.writerow(csvdata)

Output

#people.csv
ID,Name,City
1,John,London
2,Marry,Newyork
3,Chandler,Tulsa
4,Joey,Paris
5,Anne,Amsterdam

To write more than one row into the CSV file, a for loop can be used. Let's see the following example of writing multiple rows into the CSV file.

import csv
with open('numbers.csv','w',newline='') as f:
   writer = csv.writer(f)
   writer.writerow(['x', 'x**2', 'x**3'])
   for x in range(1, 11):
        writer.writerow([x, x**2, x**3])

Passing newline='' to the open() function prevents the extra blank line that would otherwise appear between rows on some platforms. The above code writes the numbers from 1 to 10, along with their squares and cubes, into the numbers.csv file.

Output

x,x**2,x**3
1,1,1
2,4,8
3,9,27
...
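
The writer object also has a writerows() method that accepts an iterable of rows, which can replace the explicit loop. The sketch below writes a shortened table (1 to 3 instead of 1 to 10) to a temporary file and reads it back to check the result.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "numbers.csv")

    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "x**2", "x**3"])                   # header row
        writer.writerows([x, x**2, x**3] for x in range(1, 4))   # all data rows at once

    # Read the file back; every value comes back as a string.
    with open(path, "r", newline="") as f:
        rows = list(csv.reader(f))

print(rows)
# [['x', 'x**2', 'x**3'], ['1', '1', '1'], ['2', '4', '8'], ['3', '9', '27']]
```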

CSV dialects

The csv module has built-in dialects. Dialects are used to describe the properties of the CSV file such as the delimiter, quoting mechanism, line terminator, escape character, etc. To print the existing dialects we use the following code.

print(csv.list_dialects())

The three built-in dialects are excel, excel-tab, and unix. It is also possible to build our own custom dialects. To build a custom dialect we have to register the dialect with the needed attributes using the register_dialect() function. The registered dialect can then be used in the reader() function and the writer() function through the dialect parameter.

  • Syntax : csv.register_dialect('dialectname', parameters)
#items.csv
item#quantity#price
pens#3#9.1
plates#12#7.5
cups#20#1.1
bottles#8#3.5

import csv
csv.register_dialect('hashes',delimiter='#',quoting=csv.QUOTE_NONE,lineterminator='\n')
with open('items.csv','r') as csvfile:
   reader = csv.reader(csvfile,dialect='hashes')
   for row in reader:
       print(row)

Output

['item','quantity','price']
['pens','3','9.1']
['plates','12','7.5']
['cups','20','1.1']
['bottles','8','3.5']
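
The same dialect can be used with the writer() function. Here the file is opened in append mode so that the new row is added after the existing rows.

import csv
csv.register_dialect('hashes',delimiter='#',quoting=csv.QUOTE_NONE,lineterminator='\n')
with open('items.csv','a',newline='') as csvfile:
   writer = csv.writer(csvfile,dialect='hashes')
   writer.writerow(('spoon',3,1))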

Output

#items.csv
item#quantity#price
pens#3#9.1
plates#12#7.5
cups#20#1.1
bottles#8#3.5
spoon#3#1
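
As a quick check that a registered dialect works in both directions, the following sketch writes two rows to an in-memory buffer with the 'hashes' dialect and reads them back with the same dialect. io.StringIO stands in for items.csv here, so no file is modified.

```python
import csv
import io

csv.register_dialect("hashes", delimiter="#", quoting=csv.QUOTE_NONE, lineterminator="\n")

# The custom dialect now appears next to the built-in ones.
registered = "hashes" in csv.list_dialects()

# Write two rows using the dialect.
buffer = io.StringIO()
writer = csv.writer(buffer, dialect="hashes")
writer.writerow(["item", "quantity", "price"])
writer.writerow(["spoon", 3, 1])
written = buffer.getvalue()

# Read the same text back using the same dialect.
rows = list(csv.reader(io.StringIO(written), dialect="hashes"))

print(registered)     # True
print(repr(written))  # 'item#quantity#price\nspoon#3#1\n'
print(rows)           # [['item', 'quantity', 'price'], ['spoon', '3', '1']]
```

With quoting=csv.QUOTE_NONE and no escape character set, the writer raises an error if a field itself contains the delimiter, so this dialect suits data that never contains '#'.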