FILE HANDLING IN PYTHON


Advanced File Handling



What are files anyways?

Files are documents stored long-term on devices such as hard drives, SSDs, and pen drives. Examples include the text files on your desktop, images, videos, those pesky .dll files that are invariably missing, Word documents, Excel sheets, web pages, and executable software. Put simply, any information that remains accessible on a long-term basis is a file.

There are two types of files, namely: text and binary files.

Examples of text files:

  • text file
  • source code
  • scripts
  • HTML page

Examples of binary files:

  • database files
  • executable programs
  • media files such as:
    • image
    • audio
    • video


What is File Handling, and why is it required?

File handling is an integral part of almost any real-world application. Examples include a website that stores user documents (such as a cloud backup service), desktop applications such as image and video editors that read and save media files, and mobile applications that store user photos, contacts, and documents. File handling allows a program to store details for later use.

Variables, which we have used so far, live in memory (RAM) only for the duration of the program. Files, on the flip side, are stored on long-term storage devices such as hard drives and SSDs.

Python provides several methods for reading, creating, updating, and removing files.


File Modes

Python has several file modes to perform various tasks such as reading, writing, appending, or creating files.

Modes are case-sensitive, meaning "r" is not the same as "R". Passing "R" to open() raises a ValueError.

Mode Description
r Opens a file for reading only. This is the default mode.
rb Opens a file for reading only, in binary mode.
r+ Opens a file for both reading and writing. The file must already exist.
rb+ Opens a file for both reading and writing, in binary mode. The file must already exist.
w Opens a file for writing. Creates the file if it does not exist and overwrites any existing file.
wb Opens a file for writing, in binary mode. Creates the file if it does not exist and overwrites any existing file.
w+ Opens a file for both writing and reading. Creates the file if it does not exist and overwrites any existing file.
wb+ Opens a file for both writing and reading, in binary mode. Creates the file if it does not exist and overwrites any existing file.
a Opens a file for appending. If the file does not exist, a new file is created.
ab Opens a file for appending, in binary mode. If the file does not exist, a new file is created.
a+ Opens a file for appending and reading. If the file does not exist, a new file is created.
ab+ Opens a file for appending and reading, in binary mode. If the file does not exist, a new file is created.
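
To make the table more concrete, here is a small sketch that exercises a few of the modes against a scratch file. The temporary directory and the file name "modes_demo.txt" are assumptions made purely for illustration.

```python
import os
import tempfile

# Work in a throwaway directory so that no real file is touched.
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, "modes_demo.txt")  # hypothetical file name

# "r" on a file that does not exist raises FileNotFoundError.
try:
    open(path, "r")
    missing_file_raised = False
except FileNotFoundError:
    missing_file_raised = True

# "w" creates the file, or empties it if it already exists.
with open(path, "w") as handle:
    handle.write("first\n")

# "a" keeps the existing content and writes at the end.
with open(path, "a") as handle:
    handle.write("second\n")

# "r+" allows reading and writing without emptying the file.
with open(path, "r+") as handle:
    before = handle.read()
    handle.seek(0, os.SEEK_END)  # make sure we write at the very end
    handle.write("third\n")

with open(path, "r") as handle:
    after = handle.read()

print(missing_file_raised)
print(repr(before))
print(repr(after))
```

The sketch uses the with statement, covered later in this chapter, only to keep the example short.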


Establishing connection to a File using a File Handle

To establish a connection with a file, Python provides the built-in open() function. When a file is opened for reading, open() raises a FileNotFoundError exception if no file exists at the specified path.

The open() function is most commonly called with two parameters:

  • File name (including the path to the file)
  • Mode (in which to open the file)

By default, the mode is "r".

All the available file modes are described above.

Example:
Create a file named "my_text_document.txt" in the current directory.

file_name = "my_text_document.txt"

# Opening a handle to the file in text mode
handle = open(file_name, "r")

The above two statements can also be combined into one as follows:

handle = open("my_text_document.txt", "r")

If the file is successfully found, the open() function returns the filehandle object of class "_io.TextIOWrapper". Let us verify this as shown in the below example:

print(type(handle))

<class '_io.TextIOWrapper'>


Closing a file

Whenever we open a file resource, the operating system acknowledges this. It marks the file as being in use until the program informs the operating system that the file is no longer needed and is available for other programs to access. To notify the operating system (OS) that the file resource is no longer required, Python provides a file handle method aptly named close().

Here is the syntax:

<file_handle_instance>.close()

Example:

# Opening and closing a file
file_handle = None
try:
    file_handle = open("my_text_file.txt", "r")
except FileNotFoundError:
    # The file could not be opened, so there is nothing to close
    pass
finally:
    if file_handle:
        file_handle.close()


Why implement inside try-except-finally block and close the handle in the finally block?

Good question! An exception handling mechanism lets a program recover from unexpected situations such as errors and still carry out the necessary cleanup. The finally block is especially useful for code that deals with external resources, such as files.

As discussed earlier in this tutorial, the finally block executes irrespective of whether an exception occurred. This ensures that resources such as file handles are closed properly even when errors or exceptions occur. The example below first shows what can go wrong when the file is closed inside the try block:

# Store the following in the text file "my_text_file.txt"

"""
Python is a high-level programming Language.
I love to program in Python.
Bye! :)
"""

try:
    file_handle = open("my_text_file.txt", "r")
    characters_to_read = int(input("How many characters do you want to read?\n\n"))
    characters = file_handle.read(characters_to_read)
    print(characters)

    # Close the file handle here
    file_handle.close()
except FileNotFoundError as e:
    print("Could not locate file.")
except ValueError as e:
    print("Could not convert invalid characters into integers.")

How many characters do you want to read?

20
Python is a high-lev

Now let us break down every statement in the above example.

#1:

file_handle = open("my_text_file.txt", "r")
tries to establish connection to the file.

#2:

characters_to_read = int(input("How many characters do you want to read?\n\n"))
tries to prompt the user to enter the number of characters to read.

#3:

print(characters)
tries to print the characters read.

#4:

file_handle.close()
tries to close the established file connection.

However, statement #2 accepts user input. What if the user enters characters that cannot be interpreted as an integer? Python raises a ValueError exception, control jumps straight to the matching except block, and the statement responsible for closing the file handle never executes. As a result, our file handle is opened but never closed, which goes against best programming practices.

Let us rewrite the same program using a finally block, as demonstrated below:

# Closing the file in finally block
file_handle = None
try:
    file_handle = open("my_text_file.txt", "r")
    characters_to_read = int(input("How many characters do you want to read?\n\n"))
    characters = file_handle.read(characters_to_read)
    print(characters)
except FileNotFoundError as e:
    print("Could not locate file.")
except ValueError as e:
    print("Could not convert invalid characters into integers.")
finally:
    # Ensure that the file is closed
    # An open file handle is interpreted as True in a conditional statement,
    # whereas None (no file was opened) is interpreted as False
    if file_handle:
        file_handle.close()

How many characters do you want to read?

23
Python is a high-level

Let us break down the code in the finally block:

#1:

if file_handle
evaluates to True only if the connection to the file was established in the first place. If it was not, then statement #2
file_handle.close()
does not execute, because there is no need to close a file handle that does not exist.

Remember that attempting to close a file handle that does not exist will result in an exception.

Leaving a file open can lead to lost or corrupted data, for example when several programs try to overwrite the same file.
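
The guard in the finally block only works if the variable exists even when open() fails. One way to arrange that, sketched here with a hypothetical helper named read_prefix() and a made-up scratch file, is to set the handle to None before the try block:

```python
import os
import tempfile

def read_prefix(path, count):
    """Return the first `count` characters of a file, or None if it is missing."""
    file_handle = None  # defined up front so the finally block can always test it
    try:
        file_handle = open(path, "r")
        return file_handle.read(count)
    except FileNotFoundError:
        return None
    finally:
        # Runs on success, on a handled error, and on an unhandled error alike.
        if file_handle is not None:
            file_handle.close()

# Usage with a scratch file (the name is made up for this example).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, "w") as out:
    out.write("Python is a high-level programming Language.\n")

prefix = read_prefix(demo_path, 6)
missing = read_prefix(demo_path + ".missing", 6)
print(prefix)
print(missing)
```

Without the initial assignment, a failed open() would leave the name undefined, and the check in the finally block would itself raise a NameError.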


Reading Files

Reading of files involves a structure as follows:

  • open a filehandle
  • read from the file
  • perform operations, if any
  • close the filehandle


Reading a file in Text Mode.

As mentioned earlier, file handling is an integral part of most practical applications. One of the most common operations a program performs is reading from and writing to files. In this section, we will cover the reading of files.

Since Python is a high-level language, it provides several methods that abstract away such complicated operations. This allows developers to focus on solving the business problem rather than dealing with the way computers work.

As a general rule, most programs manipulate textual data. To read an entire text file into memory at once, Python provides the read() method.

The read() method reads until the end of the file (EOF) is reached.

Here is an example:

Create a file named "my_text_file.txt" in the current directory and place the following text in it.

Posuere. volutpat ullamcorper nunc quam. fringilla sed, neque blandit volutpat, orci et suspendisse porttitor. nisi tempus quis luctus, metus a sem maximus etiam pharetra. eu felis dignissim venenatis suspendisse arcu. varius efficitur et, sapien ac commodo orci, mauris donec nisi. ultricies in, urna porttitor sodales, lectus non integer tortor. vulputate quis ligula, vehicula eget nullam dapibus. convallis elit in nibh hendrerit vestibulum convallis. diam iaculis nec aliquam, nibh ultrices diam varius etiam imperdiet. elementum turpis id nunc at duis nisl. et vestibulum consequat, fermentum vitae porttitor dui, lectus nam ex. eu ut luctus interdum nulla in elit ut aenean amet. sit dolor elit, adipiscing consectetur ipsum lorem.

Copy the below code in a file named: "read.py" in the current directory.

file_handle = None
try:
    file_handle = open("my_text_file.txt", "r")
    content = file_handle.read()
    print(content)
except FileNotFoundError as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

Avoid reading huge files this way, as they can exhaust the available memory, possibly leading to a system crash.

Reading the entire contents of the file.

As observed in the above example, we can use the read() method to read the entire content of a file. Let us revisit the above example and see how Python manages the file data.

file_handle = None
try:
    file_handle = open("my_text_file.txt", "r")
    content = file_handle.read()
    print(len(content))
except FileNotFoundError as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

As the above example demonstrates, the read() method reads the file's entire contents and returns them as a single string. Python does not split the content into lines: newline characters are kept inside the string as ordinary "\n" characters. This means that even if the file has several paragraphs, we get one string rather than a separate string per line.

We could write our own code to split the content into separate lines, as the example below demonstrates. It assumes that "my_text_file.txt" again contains the three-line text about Python used earlier:

file_handle = None
try:
    file_handle = open("my_text_file.txt", "r")
    content = file_handle.read()

    # Split the content whenever a newline is encountered.
    lines = content.split("\n")
    print(f"Total line(s) are {len(lines)}")
    for line in lines:
        print(line) 
except FileNotFoundError as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

Total line(s) are 4
Python is a high-level programming Language.
I love to program in Python.
Bye! :)

The above example reports four lines instead of three because the file ends with a newline character, so split() produces an empty string after it. To omit that last empty item, consider writing the following:

lines = content.split("\n")[0:-1]

The read() method returns the entire content of the file as a String object.
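
The extra empty item can be seen directly on a string. The sketch below also shows str.splitlines(), a built-in alternative that this tutorial does not otherwise use; the sample text is the three-line content from earlier, assumed to end with a newline.

```python
content = (
    "Python is a high-level programming Language.\n"
    "I love to program in Python.\n"
    "Bye! :)\n"
)

split_lines = content.split("\n")    # keeps an empty string after the final newline
clean_lines = content.splitlines()   # drops line endings, no empty trailing item

print(len(split_lines))
print(len(clean_lines))
print(split_lines[-1] == "")
```

Unlike slicing with [0:-1], splitlines() also behaves sensibly when the file does not end with a newline, since it never drops a real last line.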


Reading Lines using readlines()

In the above example, we implemented our code to split the string into multiple lines. However, implementing our code to read individual lines would lead to verbosity and code duplication.

In programming, we do not want to reinvent the wheel.

Python provides a method made precisely for reading individual lines from a file: the readlines() method. The readlines() method returns a list in which every element is a separate line, including its trailing newline character. Below is an example:

file_handle = None
try:
    file_handle = open("my_text_file.txt", "r")
    lines = file_handle.readlines()
    print(f"There are a total of {len(lines)} lines in the file. The returned type is {type(lines)}.\n\n")
    
    # Display individual lines
    for line in lines:
        print(line, end="")
except FileNotFoundError as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

There are a total of 3 lines in the file. The returned type is <class 'list'>.

Python is a high-level programming Language.
I love to program in Python.
Bye! :)



Reading files in batches

As stated above, reading huge files can result in the program running out of memory. To avoid such scenarios, Python lets us read a file a specific number of characters at a time by passing that number to the read() method. Here is an example:

# Suppose we want to read the first ten characters
file_handle = None
try:
    file_handle = open("my_text_file.txt", "r")
    content = file_handle.read(10)
    print(len(content), content)
except FileNotFoundError as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

10 Python is


What happens when we try to read more characters than there are in the file?

Good question! Even when I was learning to program, I asked myself this question several times. Fortunately, you won't have to go through the same loop again. If you try to read more characters than the file contains, Python reads as many characters as are available and returns them.

In more technical terms, the read() method reads the specified number of characters or until the end of the file (EOF), whichever comes first.

Here is an example to demonstrate this:

# Trying to read one million characters
file_handle = None
try:
    file_handle = open("my_text_file.txt", "r")
    content = file_handle.read(1_000_000)
    print(len(content), content)
except FileNotFoundError as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

82 Python is a high-level programming Language.
I love to program in Python.
Bye! :)
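
Because read() returns an empty string once the end of the file is reached, the same call can be repeated in a loop to process a whole file in batches. The following is a minimal sketch of that idea; the generator name read_in_chunks(), the chunk size, and the scratch file are assumptions for illustration.

```python
import os
import tempfile

def read_in_chunks(path, chunk_size=10):
    """Yield successive pieces of a text file, each at most chunk_size characters."""
    with open(path, "r") as file_handle:
        while True:
            chunk = file_handle.read(chunk_size)
            if chunk == "":  # an empty string means the end of the file
                break
            yield chunk

sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")  # hypothetical file
with open(sample_path, "w") as out:
    out.write("0123456789abcdefghijXYZ")

chunks = list(read_in_chunks(sample_path, chunk_size=10))
print(chunks)
```

Only one chunk is held in memory at a time, so the approach also suits files that are too large to read in one go. The with statement used here is covered later in this chapter.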


Reading in Binary Mode

As mentioned earlier, Python allows for accessing files in binary mode. To open a file in binary mode, append "b" to the selected mode. Here is an example:

# To read a file in binary mode:
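file_handle = None
try:
    file_handle = open("music.mp3", "rb")
    content = file_handle.read()
    print(len(content))
except OSError as e:
    # Covers a missing file, missing permissions, and similar I/O errors
    print("Exception occurred:", e)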
finally:
    if file_handle:
        file_handle.close()

5289384

Your output will probably be different, depending on the file chosen.
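
To see what binary mode actually returns, the sketch below writes four known bytes to a scratch file (the file name is made up) and reads them back. It is an illustration under those assumptions rather than part of the original example.

```python
import os
import tempfile

binary_path = os.path.join(tempfile.mkdtemp(), "sample.bin")  # hypothetical file
with open(binary_path, "wb") as out:
    out.write(bytes([72, 105, 0, 255]))

with open(binary_path, "rb") as file_handle:
    data = file_handle.read()

print(type(data))  # a bytes object, not a str
print(len(data))   # number of bytes read
print(data[0])     # indexing a bytes object gives an integer
print(data[:2])    # slicing gives another bytes object
```

In binary mode, read() counts bytes rather than characters, and no decoding into text takes place.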


Writing to Files

In the prior sections, we covered the reading of files and some of the pitfalls. In this section, we will learn how to write files in Python. Writing to a file is a highly complex task when operating at a low level. However, Python provides abstractions that handle such overwhelming tasks.

Some examples of writing to a file are:

  • copying/transferring/moving files
  • writing to databases
  • creating a file
  • updating a file

Python provides several methods to handle writing files. Writing to a file involves almost the same structure that we have to follow when reading a file.

Here is the structure we have to follow:

  • open a filehandle
  • perform operations, if any
  • write to the file
  • close the filehandle


Writing to a file in Text Mode.

To perform write operations, Python provides the write() method. The write() method takes the data to be written as its argument. In text mode, that argument must be a string. Below is an example:

# Writing to a file
file_handle = None
try:
    # Will create a new file "my_contacts.txt"
    # Earlier version of this file will get replaced
    file_handle = open("my_contacts.txt", "w")
    contacts_list = (
        "Mr. Anderson 270-878-9889",
        "Mrs. Johnson 502-987-0323",
        "Mr. Neo 606-089-9111",
        "June A. Williams 832-249-6905",
        "Linda S. Austin 360-496-7747",
        "Philomena N. Corley 316-727-0227",
    )

    # Save these contacts on the disk
    for contact in contacts_list:

        # We have added a new line here
        # So that each contact appears on a separate line
        file_handle.write(contact + "\n")
    
except Exception as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

If everything was successful, you should now have a file "my_contacts.txt" with the contacts in it.

Let's break down each statement in the above example, shall we?

Whenever your program deals with external factors such as networks, files, or databases, which are outside your control, always implement the relevant code inside an exception handling mechanism.

In this example, we implement our logic inside a try-except-finally block.

#1:

file_handle = open("my_contacts.txt", "w")
opens a filehandle to the file "my_contacts.txt". Remember, if the file does not exist, it will be created. If the file already exists, it will be replaced.

#2:

contacts_list = (
    "Mr. Anderson 270-878-9889",
    "Mrs. Johnson 502-987-0323",
    "Mr. Neo 606-089-9111",
    "June A. Williams 832-249-6905",
    "Linda S. Austin 360-496-7747",
    "Philomena N. Corley 316-727-0227",
)

This statement defines a tuple containing strings as items. Each string holds a name followed by a contact number.

#3:

for contact in contacts_list:

    # We have added a new line here
    # So that each contact appears on a separate line
    file_handle.write(contact + "\n")

This statement iterates over the tuple that contains the contact info, appends a newline to each string, and writes the result to the file.

#4:

except Exception as e:
    print(e)

This block will catch almost any exception, because the Exception class is the base class of all built-in exceptions that do not exit the interpreter. If an exception is raised, the block displays the exception information.

#5:

finally:
    if file_handle:
        file_handle.close()

This block will execute irrespective of whether an exception occurred. If the file handle exists, meaning a connection was successfully established, the block closes the filehandle.

The writelines() method

Python also provides a writelines() method for writing. The writelines() method accepts an iterable such as a list or a tuple as its argument. However, all the items must be strings. Below are some examples:

Tuple as an argument to writelines()
# Writing to a file using a Tuple
file_handle = None
try:
    # Will create a new file "my_contacts.txt"
    # Earlier version of this file will get replaced
    file_handle = open("my_contacts.txt", "w")
    contacts_list = (
        "Mr. Anderson 270-878-9889",
        "Mrs. Johnson 502-987-0323",
        "Mr. Neo 606-089-9111",
        "June A. Williams 832-249-6905",
        "Linda S. Austin 360-496-7747",
        "Philomena N. Corley 316-727-0227",
    )

    # Save these contacts on the disk
    file_handle.writelines(contacts_list)
except Exception as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

List as an argument to writelines()
# Writing to a file using a List
file_handle = None
try:
    # Will create a new file "my_contacts.txt"
    # Earlier version of this file will get replaced
    file_handle = open("my_contacts.txt", "w")
    contacts_list = [
        "Mr. Anderson 270-878-9889",
        "Mrs. Johnson 502-987-0323",
        "Mr. Neo 606-089-9111",
        "June A. Williams 832-249-6905",
        "Linda S. Austin 360-496-7747",
        "Philomena N. Corley 316-727-0227",
    ]

    # Save these contacts on the disk
    file_handle.writelines(contacts_list)
except Exception as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

Writing non-string data type to the disk.

The previous section discussed that the writelines() method accepts iterables such as lists and tuples, whose items must all be strings. However, this is restrictive when creating real-world applications, as we deal with values of several data types such as numerical data, dictionaries, and sets.

Suppose we have a multi-dimensional list that contains a name and cell phone number, and we want to write that information to the disk. How are we going to do that? Take some time to ponder before reading on.

Here is an example to do precisely that:

file_handle = None
try:
    # Will create a new file "my_contacts.txt"
    # Earlier version of this file will get replaced
    file_handle = open("my_contacts.txt", "w")
    contacts_list = [
        ["Mr. Anderson", 2708789889],
        ["Mrs. Johnson", 5029870323],
        ["Mr. Neo", 6060899111],
        ["June A. Williams", 8322496905],
        ["Linda S. Austin", 3604967747],
        ["Philomena N. Corley", 3167270227],
    ]
    
    # Iterate and convert into string
    for contact in contacts_list:
        contact[1] = str(contact[1])
        file_handle.writelines(contact)
except Exception as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

Please copy the above code, execute it, and observe the output.

It did not go as expected. The formatting is all messed up, with every name and number run together on a single line. The writelines() method writes each string exactly as given and does not append a newline at the end. So the question is, how do we fix the formatting?

Well, it's straightforward. Iterate over the list and either:

  • append a newline
  • format the string

Below is the fix using string formatting:

file_handle = None
try:
    # Will create a new file "my_contacts.txt"
    # Earlier version of this file will get replaced
    file_handle = open("my_contacts.txt", "w")
    contacts_list = [
        ["Mr. Anderson", 2708789889],
        ["Mrs. Johnson", 5029870323],
        ["Mr. Neo", 6060899111],
        ["June A. Williams", 8322496905],
        ["Linda S. Austin", 3604967747],
        ["Philomena N. Corley", 3167270227],
    ]
    
    # Iterate and convert into string
    for contact in contacts_list:
        contact[1] = f", {str(contact[1])}\n"
        file_handle.writelines(contact)
except Exception as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

Please re-write the above example using string concatenation.
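
Separately from the exercise, here is a variation that formats each row without modifying the original list. It is a sketch under a few assumptions: a shortened contact list, a scratch output path, and a generator expression passed to writelines(), which accepts any iterable of strings.

```python
import os
import tempfile

contacts_list = [
    ["Mr. Anderson", 2708789889],
    ["Mrs. Johnson", 5029870323],
]

output_path = os.path.join(tempfile.mkdtemp(), "my_contacts.txt")  # scratch location
with open(output_path, "w") as file_handle:
    # The generator expression yields one formatted line per contact.
    file_handle.writelines(f"{name}, {number}\n" for name, number in contacts_list)

with open(output_path, "r") as file_handle:
    written = file_handle.read()

print(written)
```

Because the numbers are only formatted on the way out, contacts_list still holds integers afterwards and can be reused elsewhere in the program.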



Writing to a file in Binary Mode.

We have covered writing textual data to files in the above section. However, knowing how to write binary data is just as essential. In this section, we will discuss writing binary files. Below are some examples of applications that write binary data:

  • media converters
  • media exporters (such as audio, video, image)
  • file copiers
  • hex editors (most commonly used for altering executables)

To write a file in binary mode, append "b" to the mode. Here is an example:

# Write an integer, list, tuple, string to the disk in binary mode.
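file_handle = None
try:
    file_handle = open("my_binary_file.bin", "wb")
    
    # Integer
    # When writing bytes
    # The value(s) must be in the range of 0 to 255
    my_int = 255

    # Convert into bytes
    # bytes([my_int]) creates a single byte with the value 255,
    # whereas bytes(my_int) would create 255 zero bytes
    my_bin_data = bytes([my_int])
    file_handle.write(my_bin_data)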

    # String 1
    my_str = "A very important message from "
    my_str_enc = my_str.encode("UTF-8")
    file_handle.write(my_str_enc)

    # String 2
    my_bin_encoded_str = b"The President"
    file_handle.write(my_bin_encoded_str)

    # Tuple
    my_bin_value_tuple = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
    my_bin_data = bytes(my_bin_value_tuple)
    file_handle.write(my_bin_data)

    # List
    my_bin_value_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    my_bin_data = bytes(my_bin_value_list)
    file_handle.write(my_bin_data)
except Exception as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

The byte value(s) must always be between 0 and 255 because a byte stores only 8 bits. Hence, the minimum and maximum values that a byte can store are 0 and 255, respectively. Below is the binary representation of 0 and 255.

BitPos: 7 6 5 4 3 2 1 0
0:      0 0 0 0 0 0 0 0
255:    1 1 1 1 1 1 1 1
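
The 0 to 255 limit, and a common pitfall of the bytes() constructor, can be checked in a few lines. This is a small illustrative sketch; the variable names are arbitrary.

```python
single_byte = bytes([255])               # one byte with the value 255
zero_filled = bytes(3)                   # an integer argument means "this many zero bytes"
from_method = (255).to_bytes(1, "big")   # another way to get a single byte

try:
    bytes([256])                         # 256 does not fit in 8 bits
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True

print(single_byte, zero_filled, from_method, out_of_range_rejected)
print(format(255, "08b"))                # the eight bits of 255
```

Note that bytes(255) would therefore produce 255 zero bytes rather than a single byte holding 255, which is why the integer is wrapped in a list.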

In the next segment, we will discuss appending to files.


Appending Files

Appending Files in Text Mode

Before we delve into appending data, let's understand the difference between append mode and write mode. Whenever we open a filehandle in write ("w") mode, a new file gets created; if a file of the same name already exists, it is replaced with a blank file. However, when we open a filehandle in append ("a") mode, the existing file is not overwritten; instead, data is written starting from the very end of the file.

Here is an example:

file_handle = None
try:
    file_handle = open("my_contacts.txt", "a")
    file_handle.write("Mr. Bob Lawblows, 8978786654\n")
except Exception as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()


Appending Files in Binary Mode

Here is an example:

file_handle = None
try:
    file_handle = open("my_binary_file.bin", "ab")
    file_handle.write(b"End of the Secret Message")
except Exception as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()


Misc Methods

Here are some useful miscellaneous functions. You can refer to Python's standard library documentation for in-depth coverage of functions for diverse use cases.

Checking whether a file or folder exists using the os.path.exists() function.

Checking for a file

# Checking for file
import os.path
if os.path.exists("my_binary_file.bin"):
    print("Yes, the file exists")
else:
    print("The file does not exist")

Checking for a folder

# Checking for folder
import os.path
if os.path.exists("my_folder/"):
    print("Yes, the folder exists")
else:
    print("The folder does not exist")
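
Note that os.path.exists() returns True for files and folders alike. When the distinction matters, the os.path module also offers isfile() and isdir(). The sketch below compares the three on a scratch folder and a made-up file name, both assumptions for illustration.

```python
import os
import tempfile

folder = tempfile.mkdtemp()                      # a real, empty scratch folder
file_path = os.path.join(folder, "example.txt")  # hypothetical file name
with open(file_path, "w") as out:
    out.write("hello")

checks = {
    "folder exists": os.path.exists(folder),
    "folder is a file": os.path.isfile(folder),
    "folder is a directory": os.path.isdir(folder),
    "file exists": os.path.exists(file_path),
    "file is a file": os.path.isfile(file_path),
    "file is a directory": os.path.isdir(file_path),
}
for label, result in checks.items():
    print(label, result)
```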

Deleting a File

To delete a file, Python provides the remove() function in the os module. Here is an example:

import os
import os.path

if os.path.exists("my_binary_file.bin"):
    os.remove("my_binary_file.bin")
    print("File deleted!")
else:
    print("No such file exists. Hence, it cannot be deleted.")

Deleting an empty folder

To delete an Empty Folder, Python provides the rmdir() method in the os module. Here is an example:

import os
import os.path
if os.path.exists("my_path/"):
    os.rmdir("my_path/")
    print("Folder deleted!")
else:
    print("No such folder exists. Hence, it cannot be deleted.")

Deleting Non-Empty Folder

To delete a Non Empty Folder, Python provides the rmtree() method in the shutil module. Here is an example:

import os.path
import shutil
if os.path.exists("my_path/"):
    shutil.rmtree("my_path/")
    print("Folder deleted!")
else:
    print("No such folder exists. Hence, it cannot be deleted.")


The with keyword

The with statement in Python simplifies resource management, making code more concise and easier to maintain and debug. It was introduced in Python 2.5 to simplify the management of resources such as files. Below is an example to demonstrate the difference:

# Management of files using try-except-finally
file_handle = None
try:
    file_handle = open("my_text_file.txt", "r")
    lines = file_handle.readlines()
    for line in lines:
        print(line)
except FileNotFoundError as e:
    print(e)
finally:
    if file_handle:
        file_handle.close()

Management of the file using the with statement

try:
    with open("my_text_file.txt", "r") as file_handle:
        lines = file_handle.readlines()
        for line in lines:
            print(line)
except FileNotFoundError as e:
    print(e)

The version using the with statement is roughly a third shorter than the try-except-finally version. The with statement automatically closes the file when the block ends, even if an exception is raised inside it. As a software developer, you should always follow the best methodology to write clear, concise code to ensure faster development, fewer bugs, and a maintainable codebase.
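
That the file really is closed when an exception interrupts the with block can be verified through the handle's closed attribute. The following is a minimal sketch; the scratch file and its content are made up so that int() fails on purpose.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")  # hypothetical file
with open(demo_path, "w") as out:
    out.write("42 is not-a-number\n")

saved_handle = None
try:
    with open(demo_path, "r") as file_handle:
        saved_handle = file_handle  # keep a reference to inspect afterwards
        int(file_handle.read())     # raises ValueError inside the with block
except ValueError:
    error_seen = True
else:
    error_seen = False

print(error_seen)
print(saved_handle.closed)
```

No explicit close() call appears anywhere, yet the handle reports itself as closed once control leaves the with block.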


Understanding the differences between Binary vs. Text mode.

We have been discussing the usage of text and binary file modes. What we haven't covered yet is the actual difference in operation between the two. In my experience, many tutorials and books overlook this and move on, just like your ex. However, I won't.

To understand the differences between binary and text modes, we must know how the data will be used and interpreted. First, we have to understand how text files are stored. Most text files are stored in either ASCII or UTF-8 encoding. For the 128 characters that ASCII defines, ASCII and UTF-8 produce exactly the same bytes; UTF-8 additionally represents every other Unicode character using two to four bytes.

UTF-16 and UTF-32 can represent the same set of Unicode characters as UTF-8 but use a different number of bytes per character. In this tutorial, we will focus on UTF-8 encoding.

Now, let us delve a bit deeper into binary data. Let us take the binary string 1000001. If we interpret it as a number, it is 65. If we interpret it as text, it is the uppercase character 'A'. How do we know whether the binary sequence 1000001 should be considered the numerical value 65 or the character 'A'? This is where encoding comes into the picture. When we save a text file in our text editor, it is encoded as UTF-8 or ASCII and read back as such. We could even create our own encoding with our own character set, where the value 65 represents 'Z'. It is up to the encoding and the software how to interpret it.

However, when we write the numerical value 65 in binary mode, its exact binary value gets written. It is up to the software that uses the file to interpret it and perform the necessary actions. The value 65 could mean an instruction for the CPU, a pixel value, or audio data. Additionally, special characters such as newlines, tabs, spaces, carriage returns, and line feeds are not processed in any way and are read as they are, i.e., as a sequence of bits.
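
The difference can be observed by writing the same two bytes once and reading them back in both modes. This is a sketch under assumed names: the scratch file "encoding_demo.txt" and the explicit encoding="utf-8" argument are choices made for the illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "encoding_demo.txt")  # hypothetical file

# Binary mode: the byte with the value 65 is written exactly as given.
with open(path, "wb") as out:
    out.write(bytes([65, 10]))  # 65, then 10 (the line-feed byte)

# Text mode: the same bytes are decoded into characters.
with open(path, "r", encoding="utf-8") as text_handle:
    as_text = text_handle.read()

# Binary mode: the same bytes come back as raw values.
with open(path, "rb") as binary_handle:
    as_bytes = binary_handle.read()

print(repr(as_text))
print(list(as_bytes))

# A non-ASCII character is one character but two bytes in UTF-8.
e_acute = "\u00e9"
print(len(e_acute), len(e_acute.encode("utf-8")))
```

The file on disk is identical in both cases; only the interpretation applied while reading differs.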


What are Streams?

In programming, streams are flows of data. A stream can be incoming (reading a file) or outgoing (writing to a file).



The above illustration shows that a Python program accepts input streams by reading user keystrokes from the keyboard, reading files from the disk, and receiving packets from the network. It processes the input data and then outputs the result to the screen, writes it to the disk, or sends it as a response to a network request.


Seeking

We have covered reading, writing, and appending file data. However, we have been processing data in streams, i.e., as a continuous flow from start to end, whether reading or writing. What if we want to jump directly to a specific position in a file, which is crucial in applications such as video players, video editors, and large database files?

To achieve such tasks, Python provides the seek() method. Before learning about the seek() method, we must understand more about streams and file pointers.



The above diagram illustrates how the data flows in a stream and how the file pointer points to the current byte offset and returns that value. In the illustration, the file pointer is pointing to offset 4, accessing the value E. When the file pointer is incremented or decremented, it will move to the corresponding byte offset and read the respective value at the offset.
Armed with this knowledge, let us dive into the seek() method.

The seek() method has the following signature:

seek(offset, whence)

offset: The number of positions to move the file pointer, counted from the reference point given by whence.
whence: This can have three values:

  • 0 means relative to the start of the file (absolute positioning).
  • 1 means relative to the current file pointer position.
  • 2 means relative to the end of the file.
This field is optional. The default value is 0. In text mode, only seeking relative to the start of the file is generally supported, apart from seek(0, 2), which moves to the end of the file.

Now, let us write a program to read two characters starting from the middle of the file.

Create a text file called "my_test_file.txt" and save the following content in it:

0123456789

Create a file named seek.py and save the following content in it:

try:
    with open("my_test_file.txt", "r") as file_handle:

        # Reading the file and removing any whitespace character
        content = file_handle.read().strip()
        
        # Find the middle of the string
        middle = len(content) // 2

        # Set the file pointer to the middle
        file_handle.seek(middle)

        # Now read the next two characters from the middle
        chars = file_handle.read(2)
        print(chars)
except FileNotFoundError as e:
    print(e)

As you can observe from the result, the program prints the two characters starting at the middle of the file: 56.

Now let us understand the statement responsible.

The following statement  

middle = len(content) // 2
calculates the length of the string and divides it by two to get the index at the middle of the string, and assigns it to the variable creatively named middle.

The statement  

file_handle.seek(middle)
moves the file pointer to the center of the file.

Finally, the statement  

chars = file_handle.read(2)
reads the following two characters and assigns them to the variable named chars and
print(chars)
prints them to the terminal.
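
The other two whence values are easiest to try in binary mode, where seeking relative to the current position or to the end is allowed. The sketch below uses the os.SEEK_CUR and os.SEEK_END constants (equal to 1 and 2) and the tell() method, which reports the current offset; the scratch file is an assumption for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "my_test_file.txt")  # scratch copy
with open(path, "wb") as out:
    out.write(b"0123456789")

with open(path, "rb") as file_handle:
    file_handle.seek(5)                # whence defaults to 0: from the start
    from_start = file_handle.read(2)
    position = file_handle.tell()      # current offset after reading

    file_handle.seek(-1, os.SEEK_CUR)  # whence 1: one byte back from here
    from_current = file_handle.read(1)

    file_handle.seek(-3, os.SEEK_END)  # whence 2: three bytes before the end
    from_end = file_handle.read()

print(from_start, position, from_current, from_end)
```

Negative offsets are what make whence values 1 and 2 useful: they let the program step backwards from the current position or count back from the end of the file.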


Conclusion

In this chapter, we learned what files are, how to establish a connection to a file in text or binary mode, the pitfalls involved, how to close a file reliably, the benefits of using the with statement, and the differences between text mode and binary mode. We also covered the consumption of file data in streams and moving the file pointer using the seek() method.

