Module 2: Managing Files with Python


INTRODUCTION – Managing Files with Python

This module starts with how to read and write files in Python, using a few simple but powerful commands. It then covers managing files and navigating directories, and points out the layer of abstraction that Python places between your code and the operating system when working with files. Finally, you will learn how to read, write, and edit CSV files.

Learning Objectives:

  • How to read, write, and iterate through files
  • How to move, delete, and rename files
  • How to create and navigate directories
  • What CSV files are and how to read from them
  • How to write to and edit CSV files

PRACTICE QUIZ: MANAGING FILES & DIRECTORIES

1. The create_python_script function creates a new Python script in the current working directory, adds the comment line declared by the ‘comments’ variable, and returns the size of the new file. Fill in the gaps to create a script called “program.py”.

def create_python_script(filename):
  comments = "# Start of a new Python program"
  with open(filename, 'w') as file:
    file.write(comments)
    filesize = file.tell()
  return filesize


print(create_python_script("program.py"))

Great work! Your new python script is now ready for some real code!

2. The new_directory function creates a new directory inside the current working directory, then creates a new empty file inside the new directory, and returns the list of files in that directory. Fill in the gaps to create a file “script.py” in the directory “PythonPrograms”.

import os


def new_directory(directory, filename):
  # Before creating a new directory, check to see if it already exists
  if not os.path.isdir(directory):
    os.mkdir(directory)


  # Create the new file inside of the new directory
  os.chdir(directory)
  with open(filename, 'w') as file:
    pass


  # Return the list of files in the new directory
  return os.listdir()


print(new_directory("PythonPrograms", "script.py"))

Well done, you! Working with files and directories can be a little tricky, and you’re getting the hang of it!

3. Which of the following methods from the os module will create a new directory?

  • path.isdir()
  • listdir()
  • mkdir() (CORRECT)
  • chdir()

Absolutely! The os.mkdir() function creates a new directory whose name is passed in as a string parameter.

4. The file_date function creates a new file in the current working directory, checks the date that the file was modified, and returns just the date portion of the timestamp in the format of yyyy-mm-dd. Fill in the gaps to create a file called “newfile.txt” and check the date that it was modified.

import os
import datetime


def file_date(filename):
  # Create the file in the current directory
  with open(filename, 'w') as file:
    pass
  
  # Get the timestamp of when the file was last modified
  timestamp = os.path.getmtime(filename)


  # Convert the timestamp into a readable format, then into a string
  date_modified = datetime.datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d')


  # Return just the date portion 
  # Hint: how many characters are in “yyyy-mm-dd”? 
  return ("{}".format(date_modified))


print(file_date("newfile.txt")) 
# Should be today's date in the format of yyyy-mm-dd

Way to go! You remembered the commands to convert a timestamp and format it as a string, to get the results that were requested.

5. The parent_directory function returns the name of the directory that’s located just above the current working directory. Remember that ‘..’ is a relative path alias that means “go up to the parent directory”. Fill in the gaps to complete this function.

import os


def parent_directory():
  # Create a relative path to the parent 
  # of the current working directory 
  relative_parent = os.path.join(os.getcwd(), '..')


  # Return the absolute path of the parent directory
  return os.path.abspath(relative_parent)


print(parent_directory())

That’s great! You built the relative path to the parent directory and converted it to an absolute path correctly.

6. What is the difference between the readline() and read() methods?

  • The readline() method starts from the current position, while the read() method reads the whole file.
  • The read() method reads a single line, the readline() method reads the whole file.
  • The readline() method reads the first line of the file, the read() method reads the whole file.
  • The readline() method reads a single line from the current position, the read() method reads from the current position until the end of the file. (CORRECT)

Cool! Both methods read from the current position in the file. The readline() method reads a single line, while the read() method reads everything up to the end of the file.
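The difference is easiest to see by calling both methods on the same open file. The following is a minimal sketch, not part of the quiz; the file name "sample.txt" and its contents are made up for illustration, and the file is written to a temporary directory.

```python
import os
import tempfile

# Hypothetical three-line sample file, created in a temporary directory.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.txt")
with open(sample_path, "w") as file:
    file.write("first\nsecond\nthird\n")

with open(sample_path) as file:
    first_line = file.readline()  # one line, starting at the current position
    rest = file.read()            # everything from the current position to the end

print(repr(first_line))  # 'first\n'
print(repr(rest))        # 'second\nthird\n'
```

Because readline() already moved the position past the first line, the later read() call does not return that line again.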

7. Can you identify which code snippet will correctly open a file and print lines one by one without whitespace?

with open("hello_world.txt") as text:
    for line in text:
        print(line)

with open("hello_world.txt") as text:
    for line in text:
        print(text)

with open("hello_world.txt") as text:
    print(line)

with open("hello_world.txt") as text:
    for line in text:
        print(line.strip()) (CORRECT)

Good work! Here, we iterate over the file line by line, and the strip() method removes the surrounding whitespace, including the trailing newline.
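To see what strip() actually removes, it helps to look at the raw lines first. This is a small sketch under the assumption that an in-memory io.StringIO object behaves like an open text file for iteration; the sample text is invented.

```python
import io

# io.StringIO stands in for an open text file, so no file on disk is needed.
text = io.StringIO("hello\n  world  \n")

# Each line read from a file keeps its trailing newline character.
raw_lines = [line for line in text]

# strip() removes leading and trailing whitespace, newline included.
text.seek(0)
stripped_lines = [line.strip() for line in text]

# rstrip("\n") removes only the trailing newline and keeps other spaces.
text.seek(0)
newline_only = [line.rstrip("\n") for line in text]

print(raw_lines)       # ['hello\n', '  world  \n']
print(stripped_lines)  # ['hello', 'world']
print(newline_only)    # ['hello', '  world  ']
```

Since print() adds its own newline, printing the raw lines would leave a blank line after each one, which is why the quiz answer calls strip() first.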

8. What happens to the previous contents of a file when we open it using “w” (“write” mode)?

  • The new contents get added after the old contents.
  • A new file is created and the old contents are kept in a copy.
  • The old contents get deleted as soon as we open the file. (CORRECT)
  • The old contents get deleted after we close the file.

Got it! When a file is opened in write mode, its old contents are deleted immediately, and whatever we write next becomes the new contents.
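A quick way to confirm this is to open an existing file in "w" mode and write nothing at all. The sketch below is illustrative only; "notes.txt" is a hypothetical file created in a temporary directory. It also contrasts write mode with append mode ("a"), which keeps what is already there.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
notes_path = os.path.join(tmp_dir, "notes.txt")  # hypothetical file name

# Create the file with some initial contents.
with open(notes_path, "w") as file:
    file.write("old contents\n")
size_before = os.path.getsize(notes_path)

# Re-open in write mode without writing: the file is truncated on open.
with open(notes_path, "w") as file:
    pass
size_after = os.path.getsize(notes_path)

# Append mode adds to the end instead of truncating.
with open(notes_path, "a") as file:
    file.write("line one\n")
with open(notes_path, "a") as file:
    file.write("line two\n")
with open(notes_path) as file:
    appended = file.read()

print(size_before > 0, size_after)  # True 0
print(repr(appended))               # 'line one\nline two\n'
```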

9. How can we check if a file exists inside a Python script?

  • Renaming the file with os.rename.
  • Creating the file with os.create.
  • Using the os.path.exists function.  (CORRECT)
  • Deleting the file with os.remove.

Yes, that’s it! The os.path.exists() function returns True if the path exists and False if it does not.
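As a rough illustration, the check can be tried against a file that exists, a file that does not, and a directory. The file names here are hypothetical and live in a temporary directory; os.path.isfile() and os.path.isdir() are included to show how they differ from os.path.exists().

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
existing = os.path.join(tmp_dir, "report.txt")   # hypothetical file, created below
missing = os.path.join(tmp_dir, "missing.txt")   # hypothetical file, never created

with open(existing, "w") as file:
    file.write("data\n")

checks = {
    "exists(existing file)": os.path.exists(existing),
    "exists(missing file)": os.path.exists(missing),
    "exists(directory)": os.path.exists(tmp_dir),   # directories count as existing
    "isfile(directory)": os.path.isfile(tmp_dir),   # but they are not regular files
    "isdir(directory)": os.path.isdir(tmp_dir),
}

for label, result in checks.items():
    print(label, result)
```

Checking first and acting later leaves a small window in which the file can appear or disappear, so scripts that must be robust often also handle FileNotFoundError when opening the file.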

10. Some more functions of the os.path module include getsize() and isfile() which get information on the file size and determine if a file exists, respectively. In the following code snippet, what do you think will print if the file does not exist?

import os
file = "file.dat"
if os.path.isfile(file):
    print(os.path.isfile(file))
    print(os.path.getsize(file))
else:
    print(os.path.isfile(file))
    print("File not found")

  • file.dat
    1024
  • False
    2048
  • True
    512
  • False
    File not found (CORRECT)

Since the file does not exist, os.path.isfile() returns False and the else branch runs: it prints False, followed by “File not found”. The getsize() function is never called.

11. What’s the purpose of the os.path.join function?

  • It creates a string containing cross-platform concatenated directories. (CORRECT)
  • It creates new directories.
  • It lists the file contents of a directory.
  • It returns the current directory.

Yes! The os.path.join() function joins path components using the directory separator of the current operating system, so the resulting path string works across platforms.
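Here is a short sketch of os.path.join() in use. The directory and file names are borrowed from the quizzes above purely as sample values; nothing is created on disk, because os.path.join() only builds the string.

```python
import os

# Build a relative path from three components.
path = os.path.join("PythonPrograms", "data", "flowers.csv")
print(path)

# The separator that was inserted depends on the operating system:
# "/" on Linux and macOS, "\\" on Windows.
print(repr(os.sep))

# The last component can be recovered with os.path.basename().
print(os.path.basename(path))  # flowers.csv
```

Writing the separator by hand (for example "PythonPrograms/data/flowers.csv") ties the script to one platform's convention, which is the problem this function avoids.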

PRACTICE QUIZ: READING & WRITING CSV FILES

1. We’re working with a list of flowers and some information about each one. The create_file function writes this information to a CSV file. The contents_of_file function reads this file into records and returns the information in a nicely formatted block. Fill in the gaps of the contents_of_file function to turn the data in the CSV file into a dictionary using DictReader.

import os
import csv


# Create a file with data in it
def create_file(filename):
  with open(filename, "w") as file:
    file.write("name,color,type\n")
    file.write("carnation,pink,annual\n")
    file.write("daffodil,yellow,perennial\n")
    file.write("iris,blue,perennial\n")
    file.write("poinsettia,red,perennial\n")
    file.write("sunflower,yellow,annual\n")


# Read the file contents and format the information about each row
def contents_of_file(filename):
  return_string = ""


  # Call the function to create the file 
  create_file(filename)


  # Open the file
  with open(filename, mode='r') as file:
    # Read the rows of the file into a dictionary
    reader = csv.DictReader(file)
    
    # Process each item of the dictionary
    for row in reader:
      return_string += "a {} {} is {}\n".format(row["color"], row["name"], row["type"])
  return return_string


# Call the function
print(contents_of_file("flowers.csv"))

Well done! Your garden of Python skills is really blooming!

2. Using the CSV file of flowers again, fill in the gaps of the contents_of_file function to process the data without turning it into a dictionary. How do you skip over the header record with the field names?

import os
import csv


# Create a file with data in it
def create_file(filename):
  with open(filename, "w") as file:
    file.write("name,color,type\n")
    file.write("carnation,pink,annual\n")
    file.write("daffodil,yellow,perennial\n")
    file.write("iris,blue,perennial\n")
    file.write("poinsettia,red,perennial\n")
    file.write("sunflower,yellow,annual\n")


# Read the file contents and format the information about each row
def contents_of_file(filename):
  return_string = ""


  # Call the function to create the file 
  create_file(filename)


  # Open the file
  with open(filename, mode='r') as file:
    # Read the rows of the file
    rows = csv.reader(file)
    
    # Process each row
    for row in rows:
      # Skip over the header record with field names
      if row[0] == "name":
          continue
      
      # Format the return string for data rows only
      return_string += "a {} {} is {}\n".format(row[1], row[0], row[2])
  return return_string


# Call the function
print(contents_of_file("flowers.csv"))

You nailed it! Everything’s coming up roses (pardon the pun!)
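Comparing the first field to "name" works for this file, but it depends on knowing the header text. A common alternative, sketched here as an option rather than as the quiz's expected answer, is to consume the first row with next() before looping. The describe_flowers helper and the shortened sample data are made up for this example, and io.StringIO stands in for an open file.

```python
import csv
import io

csv_text = "name,color,type\ncarnation,pink,annual\niris,blue,perennial\n"


def describe_flowers(text):
    """Return one formatted sentence per data row, skipping the header."""
    rows = csv.reader(io.StringIO(text))
    # Consume the header row; the None default avoids StopIteration on empty input.
    next(rows, None)
    # Unpack each remaining row into three variables, one per column.
    return ["a {} {} is {}".format(color, name, kind) for name, color, kind in rows]


for sentence in describe_flowers(csv_text):
    print(sentence)
# a pink carnation is annual
# a blue iris is perennial
```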

3. In order to use the writerows() function of DictWriter() to write a list of dictionaries to each line of a CSV file, what steps should we take? (Check all that apply)

  • Create an instance of the DictWriter() class (CORRECT)
  • Write the fieldnames parameter into the first row using writeheader() (CORRECT)
  • Open the csv file using with open (CORRECT)
  • Import the OS module

Wonderful work! We have to create a DictWriter() object to work with, and pass it the fieldnames parameter, defined as a list of keys.

Fantastic work! The values of the required fieldnames list should be written into the first row.

Excellent job! The CSV file has to be open before we can write to it.
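Put together, the three steps look roughly like this. It is a minimal sketch: the two flower records and the temporary "flowers.csv" path are sample values, and the file is read back with DictReader only to confirm what was written.

```python
import csv
import os
import tempfile

# Sample data: a list of dictionaries that share the same keys.
flowers = [
    {"name": "carnation", "color": "pink", "type": "annual"},
    {"name": "iris", "color": "blue", "type": "perennial"},
]

tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "flowers.csv")

# Step 1: open the file for writing (newline="" is what the csv docs recommend).
with open(csv_path, "w", newline="") as file:
    # Step 2: create the DictWriter, passing the list of keys as fieldnames.
    writer = csv.DictWriter(file, fieldnames=["name", "color", "type"])
    # Step 3: write the header row, then all the data rows.
    writer.writeheader()
    writer.writerows(flowers)

# Read the file back to check the result.
with open(csv_path, newline="") as file:
    written = list(csv.DictReader(file))

print(written == flowers)  # True
```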

4. Which of the following is true about unpacking values into variables when reading rows of a CSV file? (Check all that apply)

  • We need the same amount of variables as there are columns of data in the CSV (CORRECT)
  • Rows can be read using both csv.reader and csv.DictReader (CORRECT)
  • An instance of the reader class must be created first (CORRECT)
  • The CSV file does not have to be explicitly opened

Nice one! When unpacking rows into variables, the number of variables on the left side of the equals sign must match the length of the sequence on the right side.

Exactly! Although they read the CSV rows into different datatypes, both csv.reader and csv.DictReader can be used to parse CSV files.

Excellent! Before parsing the CSV file, we must create an instance of the reader class we are using.
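These points can be checked with a few lines of code. The sketch below uses a one-row sample of the flowers data and io.StringIO in place of an open file; it shows the datatype each reader produces and what happens when the number of variables does not match the number of columns.

```python
import csv
import io

csv_text = "name,color,type\ncarnation,pink,annual\n"

# csv.reader yields each row as a list of strings (header included).
list_rows = list(csv.reader(io.StringIO(csv_text)))

# csv.DictReader uses the header as keys and yields one dictionary per data row.
dict_rows = list(csv.DictReader(io.StringIO(csv_text)))

# Three columns unpack cleanly into three variables.
name, color, kind = list_rows[1]

# Two variables for three columns raises ValueError.
try:
    only_name, only_color = list_rows[1]
    unpack_error = None
except ValueError as error:
    unpack_error = str(error)

print(list_rows[1])           # ['carnation', 'pink', 'annual']
print(dict_rows[0]["color"])  # pink
print(unpack_error)
```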

5. If we are analyzing a file’s contents to correctly structure its data, what action are we performing on the file?

  • Writing
  • Appending
  • Parsing (CORRECT)
  • Reading

Great work! Parsing a file means analyzing its contents to correctly structure the data. As long as we know the file’s format, we can break its contents into components that our script can use.

6. If we have data in a format we understand, then we have what we need to parse the information from the file. What does parsing really mean?

  • Using rules to understand a file or datastream as structured data. (CORRECT)
  • Uploading a file to a remote server for later use, categorized by format
  • Writing data to a file in a format that can be easily read later
  • Reducing the logical size of a file to decrease disk space used and increase network transmission speed.

Right on! If we know the format of the data, we can apply its rules to split the data into structured pieces that a script can process.

7. Which of the following lines would correctly interpret a CSV file called “file” using the CSV module? Assume that the CSV module has already been imported.

  • file.opencsv()
  • data=file.csv()
  • data=csv.reader(file) (CORRECT)
  • data=csv.open(file)

Right on! The reader() function of the CSV module will interpret the file as a CSV.

8. Which of the following must we do before using the csv.writer() function?

  • Import the functools module.
  • Open the file with write permissions. (CORRECT)
  • Import the argparse module.
  • Open the file with read permissions.

Nice work! The file must be opened with write permissions before csv.writer() can write to it. Opening it in a with open() as block also ensures the file is closed automatically once the writing is done.
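A minimal sketch of that sequence follows. The host names, IP addresses, and the "hosts.csv" file name are invented sample data, and the file is created in a temporary directory.

```python
import csv
import os
import tempfile

# Sample rows: each inner list becomes one line of the CSV file.
hosts = [
    ["workstation.local", "192.168.25.46"],
    ["webserver.cloud", "10.2.5.6"],
]

tmp_dir = tempfile.mkdtemp()
hosts_path = os.path.join(tmp_dir, "hosts.csv")

# Open the file in write mode first, then hand the file object to csv.writer().
with open(hosts_path, "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(hosts)
# The file is closed automatically when the with block ends.

# Read the rows back to confirm what was written.
with open(hosts_path, newline="") as file:
    read_back = list(csv.reader(file))

print(read_back == hosts)  # True
```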

9. DictReader() allows us to convert the data in a CSV file into a standard dictionary. DictWriter() allows us to write data from a dictionary into a CSV file. What’s one parameter we must pass in order for DictWriter() to write our dictionary to CSV format?

  • The DictReader() function must be passed the CSV file
  • The writerows() function requires a list of keys
  • The writeheader() function requires a list of keys
  • The fieldnames parameter of DictWriter() requires a list of keys (CORRECT)

Right on! This will help DictWriter() organize the CSV rows properly.

CONCLUSION – Managing Files with Python

This module taught the basic mechanics of reading from and writing to files, along with the main commands involved. It also covered managing files and moving through the directory structure to locate them, and showed the layer of abstraction between Python and the operating system that makes this work.

You then applied these ideas to CSV files by reading, writing, and editing them. Together, these skills let you manage and manipulate data in your own Python projects.
