Module 4: Managing Data and Processes

INTRODUCTION – Managing Data and Processes

In this module, you will learn how to read and write data files through user interaction. The module covers standard streams, environment variables, and command-line arguments. We will delve into Python subprocesses: how to run system commands from a script and how to collect the output of those commands. We also cover subprocess management, checking exit values, and the difference between normal and abnormal exit statuses.

Finally, you will work with log files: what they are, how to filter them with regular expressions, and how to handle the output captured from them.

Learning Objectives:

  • Use regular expressions to interact with log files.
  • Use Python to interact with a program and get some input value.
  • Use the input() function to interact with the user.
  • Understand how the subprocess.run() runs and interacts with system commands such as ping.
  • Describe the steps to follow to create a log file.
  • Use the get command to extract information from log files.

PRACTICE QUIZ: DATA STREAMS

1. Which command will print out the exit value of a script that just ran successfully?

  • echo $? (CORRECT)
  • import sys
  • echo $PATH
  • wc variables.py

Great! Running echo $? prints the exit value of the last command that ran. A value of zero means the command finished without errors; a nonzero value means it failed.
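
The same exit value can also be read from Python. The sketch below is only an illustration: it starts two child Python interpreters (an assumption made so the example runs anywhere) and reads the returncode attribute that subprocess.run() reports for each.

```python
import subprocess
import sys

# Run a child Python interpreter that exits successfully (exit value 0).
ok = subprocess.run([sys.executable, "-c", "import sys; sys.exit(0)"])

# Run another child that exits with a nonzero value to signal an error.
failed = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])

# returncode holds the same number that `echo $?` would print in a shell.
print(ok.returncode)
print(failed.returncode)
```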

2. Which command will create a new environment variable?

  • export (CORRECT)
  • env
  • input
  • wc

Right on! This command will create a new environment variable, and give it a value.
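
Inside a Python script, environment variables are available through the os.environ dictionary. This is a minimal sketch; the variable names FRUIT and NO_SUCH_VARIABLE_FOR_DEMO are made up for illustration.

```python
import os

# Assigning to os.environ sets the variable for this process and for any
# child processes it starts. It does not change the shell that launched
# the script. FRUIT is a hypothetical variable name.
os.environ["FRUIT"] = "Pineapple"

# get() returns a default instead of raising KeyError when a variable is unset.
fruit = os.environ.get("FRUIT", "")
missing = os.environ.get("NO_SUCH_VARIABLE_FOR_DEMO", "not set")

print(fruit, missing)
```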

3. When a child process is run using the subprocess module, which of the following are true? (check all that apply)

  • The child process is run in a secondary environment. (CORRECT)
  • The parent process is blocked while the child process finishes. (CORRECT)
  • The parent process and child process both run simultaneously. 
  • Control is returned to the parent process when the child process ends. (CORRECT)

Absolutely! A child process runs its command in its own environment, separate from the parent’s.

Yes! subprocess.run() blocks the parent process until the child process finishes; only then can the parent continue with its next task.

Yes! When the child process finishes its command, it exits and control returns to the parent.
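
One way to see the blocking behavior is to time a call to subprocess.run(). In this sketch the child is simply a Python interpreter that sleeps for half a second, which is an assumption chosen to keep the example portable.

```python
import subprocess
import sys
import time

start = time.monotonic()

# The child sleeps for half a second; run() does not return until it exits.
result = subprocess.run([sys.executable, "-c", "import time; time.sleep(0.5)"])

# Control is back in the parent only after the child has ended.
elapsed = time.monotonic() - start
print(f"child exited with {result.returncode} after {elapsed:.2f}s")
```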

4. When using the run command of the subprocess module, what parameter, when set to True, allows us to store the output of a system command?

  • cwd
  • capture_output (CORRECT)
  • timeout
  • shell

Not quite. The cwd parameter sets the working directory in which the command runs; it does not store output. Setting capture_output to True is what stores the output of a system command.
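
A short sketch of capture_output in use follows. The child command is a Python one-liner picked only so that the example does not depend on any particular system command being installed.

```python
import subprocess
import sys

# capture_output=True stores what the child writes to stdout and stderr
# on the returned object instead of letting it go to the terminal.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from the child')"],
    capture_output=True,
)

# stdout is an array of bytes here, so decode it to get a string.
output = result.stdout.decode("utf-8").strip()
print(output)
```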

5. What does the copy method of os.environ do?

  • Creates a new dictionary of environment variables (CORRECT)
  • Runs a second instance of an environment
  • Joins two strings
  • Removes a file from a directory

Nice work! The os.environ.copy() method creates a new dictionary containing the current environment variables, so you can modify the copy without disturbing the original environment.
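
As a rough illustration, the copy can be modified and handed to a child process through the env parameter of subprocess.run(). The variable name DEMO_SETTING_FOR_QUIZ_NOTES is hypothetical.

```python
import os
import subprocess
import sys

# Copy the current environment and change only the copy.
my_env = os.environ.copy()
my_env["DEMO_SETTING_FOR_QUIZ_NOTES"] = "enabled"  # hypothetical variable

# The child sees the modified copy because it is passed through env=.
result = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ.get('DEMO_SETTING_FOR_QUIZ_NOTES'))"],
    env=my_env,
    capture_output=True,
    text=True,
)
child_value = result.stdout.strip()

# The parent's own environment is untouched.
parent_value = os.environ.get("DEMO_SETTING_FOR_QUIZ_NOTES")
print(child_value, parent_value)
```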

6. A system command that sends ICMP packets can be executed within a script by using which of the following?

  • subprocess.run (CORRECT)
  • Ping
  • CompletedProcess
  • Arguments

Right on! This function will execute a system command such as ping.
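
A hedged sketch of running ping from a script is shown below. The helper names build_ping_command and host_is_up are made up for this example, and the -c count flag is an assumption that holds on Linux and macOS (Windows uses -n instead).

```python
import subprocess


def build_ping_command(host, count=1):
    """Build the argument list for ping (hypothetical helper).

    "-c" sets the number of packets on Linux and macOS; Windows uses "-n".
    """
    return ["ping", "-c", str(count), host]


def host_is_up(host):
    """Return True when ping exits with status 0 (hypothetical helper)."""
    result = subprocess.run(build_ping_command(host), capture_output=True)
    return result.returncode == 0


print(build_ping_command("example.com", count=2))
```

Calling host_is_up("example.com") would actually send the ICMP packets, so it needs network access and a ping binary on the PATH.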

7. Which of the following is a Unicode standard used to convert an array of bytes into a string?

  • UTF-8 (CORRECT)
  • stdout
  • capture_output
  • Host

Woohoo! This encoding is part of the Unicode standard that can transform an array of bytes into a string.
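
For example, a non-ASCII word can be encoded to bytes and decoded back again. The sample word is arbitrary.

```python
# Encoding turns a string into an array of bytes.
raw = "café".encode("utf-8")

# Decoding with the same encoding turns the bytes back into a string.
decoded = raw.decode("utf-8")

print(raw)
print(decoded)
```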

8. Which method do you use to prepare a new environment to modify environment variables?

  • join
  • env
  • copy (CORRECT)
  • cwd

Awesome! Exactly! When you call the copy() method of os.environ dictionary, it copies the current environment variables into a new dictionary. This new environment can now be modified and used, leaving the original environment unchanged.

PRACTICE QUIZ: PYTHON SUBPROCESSES

1. What type of object does a run function return?

  • CompletedProcess (CORRECT)
  • returncode
  • stdout
  • capture_output

Awesome! This object includes information related to the execution of a command.
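
The sketch below prints a few attributes of the returned object. The child command is again a Python one-liner, chosen only for portability.

```python
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-c", "print('done')"],
    capture_output=True,
    text=True,  # decode stdout and stderr to strings
)

print(type(result).__name__)   # the class of the returned object
print(result.args)             # the command that was run
print(result.returncode)       # the exit value of the child
print(result.stdout.strip())   # the captured output
```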

2. How can you change the current working directory where a command will be executed?

  • Use the capture_output parameter. 
  • Use the shell parameter.
  • Use the env parameter.
  • Use the cwd parameter. (CORRECT)

Yes! The cwd parameter sets the directory in which the command is executed, so you can run a command in any directory you specify.
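
A small sketch of cwd follows. It uses a temporary directory as the working directory, which is an assumption made only to keep the example self-contained.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # The child reports its own working directory, which cwd= controls.
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    # realpath() resolves symlinks so the two paths compare reliably.
    child_dir = os.path.realpath(result.stdout.strip())
    expected_dir = os.path.realpath(workdir)

print(child_dir == expected_dir)
```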

3. When a child process is run using the subprocess module, which of the following are true? (check all that apply)

  • The child process is run in a secondary environment. (CORRECT)
  • The parent process is blocked while the child process finishes. (CORRECT)
  • The parent process and child process both run simultaneously.
  • Control is returned to the parent process when the child process ends. (CORRECT)

Great work! The child process runs its command in its own environment, separate from the parent’s.

Exactly! While the parent process waits for the subprocess to finish, the parent is blocked and does nothing else until the child completes.

Awesome! When the child process completes its task, it exits and control returns to the parent.

4. When using the run command of the subprocess module, what parameter, when set to True, allows us to store the output of a system command?

  • cwd
  • capture_output (CORRECT)
  • timeout
  • shell

Not quite. The cwd parameter changes the working directory in which the subprocess runs; it does not store output. The capture_output parameter, when set to True, is what stores the output of a system command.

5. What does the copy method of os.environ do?

  • Creates a new dictionary of environment variables (CORRECT)
  • Runs a second instance of an environment
  • Joins two strings
  • Removes a file from a directory

Nice work! That’s correct! The os.environ.copy() method duplicates the dictionary that stores the environment variables. You can then change the copy without changing the original environment, which makes it easier to work with different settings.

6. A system command that sends ICMP packets can be executed within a script by using which of the following?

  • subprocess.run (CORRECT)
  • Ping
  • CompletedProcess
  • Arguments

Right on! This function will execute a system command such as ping.

7. Which of the following is a Unicode standard used to convert an array of bytes into a string?

  • UTF-8 (CORRECT)
  • stdout
  • capture_output
  • Host

Woohoo! That’s right! UTF-8 is an encoding defined by the Unicode standard. Decoding an array of bytes as UTF-8 produces a string, and the encoding can represent special characters, characters from many languages, and symbol sets.

8. Which method do you use to prepare a new environment to modify environment variables?

  • join
  • env
  • copy (CORRECT)
  • cwd

Awesome! Calling this method of the os.environ dictionary will copy the current environment variables to store and prepare a new environment.

PRACTICE QUIZ: PROCESSING LOG FILES

1. You have created a Python script to read a log of users running CRON jobs. The script needs to accept a command line argument for the path to the log file. Which line of code accomplishes this?

  • import sys
  • syslog=sys.argv[1] (CORRECT)
  • print(line.strip())
  • usernames = {}

Right on! This will assign the script’s first command line argument to the variable “syslog”.
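
A slightly more defensive version of the same idea is sketched below. The helper name get_log_path and the sample script and path names are hypothetical; the check guards against the IndexError that sys.argv[1] raises when no argument is given.

```python
import sys


def get_log_path(argv):
    """Return the first command-line argument, or None if it is missing."""
    # argv[0] is the script name; argv[1] is the first real argument.
    if len(argv) < 2:
        return None
    return argv[1]


# Simulated argument lists (hypothetical script name and path).
syslog = get_log_path(["cron_report.py", "/var/log/syslog"])
no_argument = get_log_path(["cron_report.py"])

# In a real script the list comes from the interpreter.
from_real_args = get_log_path(sys.argv)

print(syslog, no_argument)
```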

2. Which of the following is a data structure that can be used to count how many times a specific error appears in a log?

  • Search
  • Continue
  • Dictionary (CORRECT)
  • Get

Great work! A dictionary is useful to count appearances of strings.
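
Counting with a dictionary might look like the sketch below. The error messages are invented sample data.

```python
# Invented sample data standing in for errors pulled from a log.
errors = ["Timeout", "Disk full", "Timeout", "Timeout", "Disk full"]

error_counts = {}
for error in errors:
    # get() returns 0 the first time an error is seen, then the running count.
    error_counts[error] = error_counts.get(error, 0) + 1

print(error_counts)
```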

3. Which keyword will return control back to the top of a loop when iterating through logs?

  • Continue (CORRECT)
  • Get
  • With
  • Search

Excellent! The continue statement is used to return control back to the top of a loop.
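
As an illustration, the loop below skips every line that does not mention CRON. The log lines are invented sample data.

```python
# Invented sample log lines.
log_lines = [
    "Jul 6 14:01:23 computer CRON[29440]: USER (good_user)",
    "Jul 6 14:02:08 computer jam_tag=psim[29187]: (UUID:006)",
    "Jul 6 14:03:01 computer CRON[29440]: USER (naughty_user)",
]

cron_lines = []
for line in log_lines:
    if "CRON" not in line:
        # Jump straight back to the top of the loop for the next line.
        continue
    cron_lines.append(line)

print(len(cron_lines))
```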

4. When searching log files using regex, which regex statement will search for the alphanumeric word “IP” followed by one or more digits wrapped in parentheses using a capturing group?

  • r"IP \(\d+\)$"
  • b"IP \((\w+)\)$"
  • r"IP \((\d+)\)$" (CORRECT)
  • r"IP \((\D+)\)$"

Awesome! Right! The regex r"IP \((\d+)\)$" matches “IP” followed by a space and a pair of literal parentheses at the end of the line. The capturing group (\d+) captures the one or more digits between those parentheses.
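
The pattern can be tried out with re.search(). The two log lines below are invented for illustration.

```python
import re

pattern = r"IP \((\d+)\)$"

# Invented sample lines: one ends with digits in parentheses, one does not.
match = re.search(pattern, "Connection refused from IP (4096)")
no_match = re.search(pattern, "Connection refused from IP (unknown)")

# group(1) returns what the capturing group (\d+) matched.
number = match.group(1)
print(number, no_match)
```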

5. Which of the following are true about parsing log files? (Select all that apply.)

  • Load the entire log files into memory.
  • You should parse log files line by line. (CORRECT)
  • It is efficient to ignore lines that don’t contain the information we need. (CORRECT)
  • We have to open() the log files first. (CORRECT)

Parsing line by line keeps memory use low, which matters most with huge files.

Skipping lines that don’t contain what you’re looking for saves time and resources.

Call open() before you begin parsing, ideally in a with block, so that the file is opened and closed properly.
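
Put together, these points can be sketched as follows. The log contents are invented, and a temporary file stands in for a real log so the example is self-contained.

```python
import os
import tempfile

# Write a small invented log to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("INFO service started\n")
    tmp.write("ERROR disk full\n")
    tmp.write("INFO heartbeat\n")
    tmp.write("ERROR timeout\n")
    log_path = tmp.name

errors = []
# The with block closes the file even if an exception is raised.
with open(log_path) as log_file:
    # Iterating over the file reads one line at a time.
    for line in log_file:
        if "ERROR" not in line:
            continue  # ignore lines we do not need
        errors.append(line.strip())

os.remove(log_path)
print(errors)
```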

6. Which of the following is a correct printout of a dictionary?

  • {'carrots':100, 'potatoes':50, 'cucumbers': 65} (CORRECT)
  • {50:'apples', 55:'peaches', 15:'banana'} (CORRECT)
  • {55:apples, 55:peaches, 15:banana}
  • {carrots:100, potatoes:50, cucumbers: 65}

You got it! A dictionary stores key:value pairs.
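
For comparison, this is how Python itself prints the first dictionary; string keys always appear in quotes.

```python
vegetables = {"carrots": 100, "potatoes": 50, "cucumbers": 65}

# str() gives the same text that print() shows.
printed = str(vegetables)
print(printed)  # {'carrots': 100, 'potatoes': 50, 'cucumbers': 65}
```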

WORKING WITH LOG FILES

1. What should Windows users do to connect to their VM in the provided lab environment?

  • Use the local Terminal application to connect using a PEM key.
  • Download the PPK key file from the Qwiklabs Start Lab page and use PuTTY for SSH connection. (CORRECT)
  • Open the Terminal application in Linux and enter the VM’s IP address.
  • Add Secure Shell to the Chrome browser and enter the VM’s hostname.

Correct

2. What is the primary purpose of the os module?

  • It allows you to perform mathematical operations and calculations efficiently.
  • It provides a portable way to interact with the Python interpreter. (CORRECT)
  • It enables you to create graphical user interfaces (GUIs) for Python applications.
  • It is used for managing data serialization and deserialization tasks.

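Correct. According to the Python documentation, the os module provides a portable way of using operating-system-dependent functionality.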

3. In the lab’s Python script, what is the primary role of the error_search function when working with regular expressions (RegEx)?

  • To compile a fixed RegEx pattern for matching specific error codes in the log file
  • To use RegEx for splitting the log file into individual logs
  • To encrypt and secure log file data using RegEx patterns
  • To create and search for RegEx patterns based on user input to identify errors in the log file (CORRECT)

Correct

4. Which file did you use that contained the system log?

  • import sys
  • error_search
  • fishy.log (CORRECT)
  • find_error.py

Correct

5. What is the purpose of defining the main function in the script, and why is it significant for the script’s execution?

  • It encrypts fishy.log and stores it in a secure location.
  • The script prompts the user for a type of error, searches fishy.log for that error, and writes the found errors to errors_found.log. (CORRECT)
  • The script merges fishy.log with other log files to create a comprehensive error report.
  • It compiles fishy.log into a Python executable file for faster error analysis.

Correct
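
The flow described in this answer can be sketched roughly as below. This is not the lab’s actual find_error.py: the function bodies, the sample log lines, and the choice to pass the error text as a parameter (instead of prompting with input()) are all assumptions made so the example can run on its own.

```python
import os
import re
import shutil
import tempfile


def error_search(log_file, error_text):
    """Return the log lines that contain "error" and every word of error_text."""
    # "error" is the base pattern; each word the user typed adds one more.
    error_patterns = ["error"] + [re.escape(w) for w in error_text.lower().split()]
    returned_errors = []
    with open(log_file, encoding="utf-8") as file:
        for log in file:
            # Keep the line only if every pattern is found in it.
            if all(re.search(p, log.lower()) for p in error_patterns):
                returned_errors.append(log)
    return returned_errors


def file_output(returned_errors, output_path):
    """Write the matching lines to the output file."""
    with open(output_path, "w", encoding="utf-8") as file:
        for error in returned_errors:
            file.write(error)


# Demo with an invented stand-in for fishy.log in a temporary directory.
workdir = tempfile.mkdtemp()
log_file = os.path.join(workdir, "fishy.log")
output_path = os.path.join(workdir, "errors_found.log")
with open(log_file, "w", encoding="utf-8") as f:
    f.write("ERROR CRON job failed to start\n")
    f.write("INFO CRON job completed\n")
    f.write("ERROR disk quota exceeded\n")

returned_errors = error_search(log_file, "CRON")
file_output(returned_errors, output_path)

with open(output_path, encoding="utf-8") as f:
    written = f.read()
shutil.rmtree(workdir)
print(written, end="")
```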

6. In the lab’s Python script find_error.py, what happens when the script is executed with a log file like fishy.log?

  • A contact list
  • A network diagram
  • An incident response plan (CORRECT)
  • A security policy

An incident response plan defines how an organization should act when a ransomware attack occurs.

7. Apply what you’ve learned from this lab to answer this question. You are tasked with enhancing the find_error.py script to also search for warning messages in addition to errors in the fishy.log file. How would you modify the script to accomplish this?

  • Create a separate script specifically for searching warning messages in the log file.
  • Modify the error_patterns list initialization to include both “error” and “WARN” as base patterns. (CORRECT)
  • Add a new input prompt to the script for the user to specify if they want to search for “WARN” messages.
  • Replace the error_patterns list with a new list containing only “WARN” patterns.

Correct

8. Which of the following does the sys module provide information about in the Python interpreter? Select all that apply.

  • Constants (CORRECT)
  • Methods (CORRECT)
  • Operating system
  • Functions (CORRECT)

Correct

9. What is the function that takes the errors returned by another function as a formal parameter? 

  • returned_errors
  • Either file_output or error_search are used for this task.
  • file_output (CORRECT)
  • error_search

Correct

10. What is the primary purpose of the re module in Python?

  • To enhance graphical capabilities and user interface design
  • For data encryption and cybersecurity purposes
  • To enable network connectivity and communication over the internet
  • To provide support for working with Regular Expressions for pattern matching in strings (CORRECT)

Correct

11. Apply what you’ve learned from this lab to answer this question. What is the purpose of using regular expressions when you interact with log files in Python?

  • To make code more readable
  • To speed up script execution
  • To filter and extract information (CORRECT)
  • To modify the log files

Correct

12. In the process of connecting to a virtual machine using SSH and PuTTY on Windows, which of the following steps is necessary?

  • Entering the username and external IP address in the Host Name (or IP address) box.
  • Entering a password for authentication
  • Opening the PuTTY Secure Shell (SSH) client
  • Downloading the PPK key file from the Qwiklabs Start Lab page (CORRECT)

Correct

13. What is the primary function of regular expressions (RegEx) in Python programming?

  • To speed up the execution of code by optimizing algorithm performance
  • To act as a programming language for creating complex software applications
  • To define a sequence of characters that form a search pattern for text processing (CORRECT)
  • To serve as a method for encrypting and securing data within a program

Correct

14. What is the role of fishy.log in the provided Python script for log file analysis?

  • This is a configuration file that dictates how the script should process logs.
  • It is the name of the script that contains the regular expressions for error analysis.
  • It refers to a function within the script that generates log files for testing.
  • It is the log file that is being analyzed for specific error patterns. (CORRECT)

Correct

15. What is the step-by-step process of how errors are searched for and processed in the script within the lab?

  • Set the log_file variable, call the error_search() function with the log_file parameter to search for errors, and store the matching errors in the returned_errors list. (CORRECT)
  • Define the main function, call the error_search() function with the log file path, and display the errors to the console.
  • Define the file_output() function, read the log file, search for a specific error type, and write the errors to an errors_found.log file.
  • Start by defining the error_search() and file_output() functions, and then read the log file specified by the user.

Correct

16. Which of the following statements is the best definition of a log file? Select the best answer. 

  • A file that stores user data
  • A file that contains an application’s source code
  • A file that stores machine code
  • A file that keeps track of events in an operating system (CORRECT)

Correct

17. In the script’s execution process described, what are the main functions called, and in what order?

  • file_output() and error_search() called in that order
  • error_search() and file_output() called in that order (CORRECT)

Correct

18. How does the find_error.py script process the fishy.log file according to the provided content?

  • It compresses fishy.log to reduce its size for storage efficiency.
  • The script first searches fishy.log for user-specified errors and then generates a new file, errors_found.log, containing these errors. (CORRECT)
  • It automatically detects and fixes syntax errors within fishy.log.
  • The script translates the contents of fishy.log into another programming language for cross-platform compatibility.

Correct

19. What is the primary function of the os module in Python?

  • To handle machine learning algorithms and data analysis
  • For managing and manipulating file paths and directory structures (CORRECT)
  • It provides functions for creating and managing graphical interfaces.
  • The os module is used exclusively for web development purposes.

Correct

20. What is the purpose of the sys.exit(0) statement in the script, and how does it affect the execution of the Python script?

  • The sys.exit(0) statement is used to pause the script’s execution and wait for user input before continuing.
  • The sys.exit(0) statement is used to indicate successful termination of the script, and it has no impact on the script’s execution. (CORRECT)
  • The sys.exit(0) statement is used to forcibly terminate the script, even if there are errors in the code.
  • The sys.exit(0) statement is used to display an error message to the user and halt the script’s execution if any errors are encountered.

Correct
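
Under the hood, sys.exit() raises a SystemExit exception carrying the exit value. The sketch below catches that exception only so the value can be inspected; a normal script would let it propagate and end with status 0.

```python
import sys

try:
    sys.exit(0)  # request a successful termination
except SystemExit as exit_request:
    # Caught here for demonstration only; normally the interpreter exits.
    exit_code = exit_request.code

print(exit_code)
```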

21. In the lab’s script, what is the purpose of the if __name__ == "__main__": block, and why is it important for the execution of the script?

  • The if __name__ == "__main__": block is used to define custom functions for the script, and it doesn’t impact the script’s execution.
  • The if __name__ == "__main__": block is used to display an error message to the user if any errors are encountered during script execution.
  • The if __name__ == "__main__": block is the main entry point of the script, where the script’s execution begins when it is run as the main program. (CORRECT)
  • The if __name__ == "__main__": block is used to specify the author’s name and copyright information for the script.

Correct
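
A minimal sketch of the idiom is shown below; the body of main() is a placeholder, not the lab’s code.

```python
def main():
    """Placeholder for the script's real work."""
    return "running as the main program"


# __name__ equals "__main__" only when the file is run directly,
# not when it is imported as a module by another script.
if __name__ == "__main__":
    print(main())
```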

22. Apply what you’ve learned from this lab to answer this question. You are working on the find_error.py script to search for specific errors in the fishy.log file. If you need to find all instances of a network connection failure, which of the following steps would you take to modify the script accordingly?

  • Change the error_patterns list to include only “network” and “failure”, and then run the script with fishy.log.
  • Rewrite the Regular Expression in the script to only match logs with the word “network”.
  • Modify the user input line to specifically ask for “network connection failure” errors, then process fishy.log. (CORRECT)
  • Edit the file_output function to filter out all logs except those containing the word “network”.

Correct

23. What role does the if __name__ == "__main__": block play in the execution of the lab’s script, and at what point in the script’s execution does it come into play?

  • The if __name__ == "__main__": block is responsible for defining custom functions within the script, and it runs at the beginning of the script’s execution.
  • The if __name__ == "__main__": block serves as the main entry point of the script, and it is where the script’s execution begins when run as the main program. (CORRECT)
  • The if __name__ == "__main__": block handles syntax errors and runs only if an error is encountered during the script’s execution.
  • The if __name__ == "__main__": block is used to specify the author’s name and copyright information for the script, and it runs at the end of the script.

Correct

24. Which of the following statements about log files is true? Select all that apply.

  • They can help in identifying and fixing issues. (CORRECT)
  • They can be used to monitor system performance. (CORRECT)
  • They are created only when an error occurs.
  • They can be programmed to record specific events. (CORRECT)

Correct

25. Which term describes a program that provides a text-based interface for typing commands? 

  • An IP address
  • A download
  • A console
  • A terminal (CORRECT)

Correct

26. What is the primary purpose of the sys module in Python?

  • Provides functions and variables to interact with the Python interpreter and the runtime environment (CORRECT)
  • Performing system-level operations like managing files and directories
  • Creating graphical user interfaces (GUIs) in Python
  • Mathematical and numerical computations in Python

Correct

27. In the lab’s Python script, what is the role of the error_search function in relation to processing log files with regular expressions (RegEx)?

  • To interactively receive an error type from the user and use RegEx to find corresponding logs (CORRECT)
  • The function uses RegEx to compress log files for efficient storage
  • To convert all log file data into a single regular expression pattern for bulk processing
  • To apply a standard RegEx pattern to every log file for general error detection

Correct

CONCLUSION – Managing Data and Processes

In a nutshell, this module covered reading from and writing to data files through user interaction. It introduced the standard streams, environment variables, and command-line arguments. It also showed how to run system commands from Python with subprocesses, how to capture their output, and how to check and handle exit values.

The last part covered log files: what they are, how to filter them with regular expressions, and how to interpret the captured output. With these skills, you should now have a solid foundation for handling data files and subprocesses in Python.
