Module 1: Troubleshooting Concepts


INTRODUCTION – Troubleshooting Concepts

This module familiarizes you with fundamental troubleshooting concepts and introduces strategies and techniques for solving practical problems. It teaches the basic principles of debugging, which is at the heart of troubleshooting, and then moves on to tools such as tcpdump, ps, top, ltrace, and others that come in handy while debugging. It also covers what it means to “understand the problem.”

What looks easy on the surface is not always simple; troubleshooting can be challenging. You will delve into techniques for fixing reproducible and intermittent errors. The module also covers the idea of “binary searching a problem,” including how linear and binary searches differ. Finally, you will discover how bisecting can be applied in your troubleshooting process. The module wraps up by showing how to find invalid data in a CSV file.

Learning Objectives:

  • Understand what troubleshooting is.
  • Understand the fundamentals of debugging that help in troubleshooting.
  • Identify the root cause of a problem.
  • Correct recurring and intermittent problems.
  • Understand the difference between linear and binary searches.
  • Search a CSV file for invalid data.

PRACTICE QUIZ: INTRODUCTION TO DEBUGGING

1. What is part of the final step when problem solving?

  • Documentation
  • Long-term remediation (CORRECT)
  • Finding the root cause
  • Gathering information

Nice job! Long-term remediation is part of the final step when problem solving.

2. Which tool can you use when debugging to look at library calls made by the software?

  • top
  • strace
  • tcpdump
  • ltrace (CORRECT)

Keep it up! The ltrace tool helps observe how programs employ libraries.

3. What is the first step of problem solving?

  • Prevention
  • Gathering information (CORRECT)
  • Long-term remediation
  • Finding the root cause

Right on! Gathering information is the first step taken when problem solving.

4. What software tools are used to analyze network traffic to isolate problems? (Check all that apply)

  • tcpdump (CORRECT)
  • wireshark (CORRECT)
  • strace
  • top

tcpdump provides a command-line interface that can capture, sort, and filter network traffic in real time.

Wireshark allows an individual to view the running activity of the network as well as collect data packets to examine how the network really works.

5. The strace (in Linux) tool allows us to see all of the _____ our program has made.

  • Network traffic
  • Disk writes
  • System calls (CORRECT)
  • Connection requests

Awesome! strace is a command-line utility that monitors and records the arguments and return values of system calls made by a program.

6. What is the general description of debugging?  

  • Fixing bugs in the code of the application (CORRECT)  
  • Fixing problems in the system running the application  
  • Fixing issues related to hardware  
  • Fixing configuration issues in the software

Awesome! Generally, debugging means fixing bugs in the code of the application.  

7. What is the second step of problem solving?  

  • Short-term remediation  
  • Long-term remediation  
  • Finding the root cause (CORRECT)  
  • Gathering information

Right on! Finding the root cause is the second step taken when problem solving.  

8. Which command can you use to scroll through a lot of text output after tracing system calls of a script?  

  • strace -o fail.strace ./script.py  
  • strace ./script.py | less  (CORRECT)
  • strace ./script.py  
  • strace ./script.py -o fail.strace 

Great work! Piping the output to the less command lets us scroll through large volumes of text output.

PRACTICE QUIZ: UNDERSTANDING THE PROBLEM

1. When a user reports that an “application doesn’t work,” what is an appropriate follow-up question to gather more information about the problem?

  • Is the server plugged in?
  • Why do you need the application?
  • Do you have a support ticket number?
  • What should happen when you open the app? (CORRECT)

Awesome! Asking the user what should happen when they open the app helps elicit the information needed to understand the problem.

2. What is a heisenbug?

  • The observer effect. (CORRECT)
  • A test environment.
  • The root cause.
  • An event viewer.

Right on! A heisenbug is an example of the observer effect: observing a system or event inadvertently alters it.

3. How do we verify if a problem is still persisting or not?

  • Restart the device or server hardware
  • Attempt to trigger the problem again by following the steps of our reproduction case (CORRECT)
  • Repeatedly ask the user
  • Check again later

Right! If we can recreate the circumstances in which the issue occurred, we can verify whether the problem persists and gather more data to troubleshoot.

4. The datetime module supplies classes for manipulating dates and times, and contains many types, objects, and methods. You’ve seen some of them used in the dow function, which returns the day of the week for a specific date. We’ll use them again in the next_date function, which takes the date_string parameter in the format of “year-month-day”, and uses the add_year function to calculate the next year that this date will occur (it’s 4 years later for the 29th of February during Leap Year, and 1 year later for all other dates). Then it returns the value in the same format as it receives the date: “year-month-day”.

Can you find the error in the code? Is it in the next_date function or the add_year function? How can you determine if the add_year function returns what it’s supposed to? Add debug lines as necessary to find the problems, then fix the code to work as indicated above. 


Answer:

import datetime
from datetime import date


def add_year(date_obj):
    try:
        new_date_obj = date_obj.replace(year=date_obj.year + 1)
    except ValueError:
        # This gets executed when the above method fails,
        # which means that we're making a Leap Year calculation
        new_date_obj = date_obj.replace(year=date_obj.year + 4)
    return new_date_obj


def next_date(date_string):
    # Convert the argument from string to date object
    date_obj = datetime.datetime.strptime(date_string, "%Y-%m-%d")
    next_date_obj = add_year(date_obj)


    # Convert the datetime object to string,
    # in the format of "yyyy-mm-dd"
    next_date_string = next_date_obj.strftime("%Y-%m-%d")
    return next_date_string


today = date.today()  # Get today's date
print(next_date(str(today)))
# Should return a year from today, unless today is Leap Day


print(next_date("2021-01-01"))  # Should return 2022-01-01
print(next_date("2020-02-29"))  # Should return 2024-02-29
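
One way to check whether add_year returns what it is supposed to is to call it directly with known dates and compare the results. The following is a minimal sketch, assuming the add_year logic shown above (repeated here so the block runs on its own); the test dates are illustrative choices, not part of the original exercise.

```python
import datetime


def add_year(date_obj):
    # Same logic as the answer above, repeated so this sketch is self-contained
    try:
        return date_obj.replace(year=date_obj.year + 1)
    except ValueError:
        # February 29 does not exist next year, so jump four years ahead
        return date_obj.replace(year=date_obj.year + 4)


# Debug checks: feed add_year known inputs and compare with the expected dates
checks = {
    datetime.date(2021, 1, 1): datetime.date(2022, 1, 1),
    datetime.date(2020, 2, 29): datetime.date(2024, 2, 29),
}
for given, expected in checks.items():
    result = add_year(given)
    print(given, "->", result, "OK" if result == expected else "MISMATCH")
```

Checking the helper in isolation like this narrows down whether a wrong result comes from add_year or from the string conversion in next_date. Note that adding four years is itself a simplification: it would fail for a Leap Day whose fourth following year is not a leap year, such as 2100.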

5. When a user reports that a “website doesn’t work,” what is an appropriate follow-up question you can use to gather more information about the problem?  

  • What steps did you perform? (CORRECT)
  • Is the server receiving power?  
  • What server is the website hosted on?  
  • Do you have a support ticket number?

You got it! Asking the user about the actions they performed helps gather important information that can make it easier to figure out where the real problem lies.

6. A program fails with an error, “No such file or directory.” You create a directory at the expected file path and the program successfully runs. Describe the reproduction case you’ll submit to the program developer to verify and fix this error.  

  • A report explaining how to open the program without the specific directory on the computer (CORRECT)  
  • A report with application logs exported from Windows Event Viewer   
  • A report listing the contents of the new directory  
  • A report listing the differences between strace and ltrace logs.

You got it! This is a specific way to reproduce the error and verify it exists. The developer can work on fixing it right away.  

7. Generally, understanding the root cause is essential for _____?  

  • Purchasing new devices  
  • Producing test data  
  • Avoiding interfering with users  
  • Providing the long-term resolution (CORRECT)

True! Understanding the root cause is essential for finding the right long-term resolution, one that prevents the same problem from happening again.

8. What sort of software bug might we be dealing with if power cycling resolves a problem?  

  • Poorly managed resources (CORRECT)
  • A heisenbug  
  • Logs filling up  
  • A file remains open

Way to go! Power cycling gives the device a clean slate by releasing the resources held in memory, so a problem that disappears after a restart points to poorly managed resources.

PRACTICE QUIZ: BINARY SEARCHING A PROBLEM

1. You have a list of computers that a script connects to in order to gather SNMP traffic and calculate an average for a set of metrics. The script is now failing, and you do not know which remote computer is the problem. How would you troubleshoot this issue using the bisecting methodology?

  • Run the script with the first half of the computers. (CORRECT)
  • Run the script with the last computer on the list.
  • Run the script with the first computer on the list.
  • Run the script with two-thirds of the computers.

Bisecting, when troubleshooting, involves halving the list of computers and running the script on one half in order to isolate the problem more effectively.
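
The halving idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the course's script: find_failing_host, fake_script, and the host names are hypothetical, and the sketch assumes exactly one computer on the list causes the failure.

```python
def find_failing_host(hosts, works):
    """Return the single host that makes works(subset) fail, by halving the list.

    Assumes exactly one bad host, and that works(subset) returns True when
    the script succeeds for every host in subset.
    """
    candidates = list(hosts)
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if works(first_half):
            # The first half is fine, so the failure is in the second half
            candidates = candidates[middle:]
        else:
            candidates = first_half
    return candidates[0]


def fake_script(subset):
    # Hypothetical stand-in for running the real script against a subset
    return "host07" not in subset


hosts = ["host%02d" % n for n in range(1, 11)]
print(find_failing_host(hosts, fake_script))  # prints host07
```

With ten computers, the failing one is isolated after three runs of the script instead of up to ten.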

2. When trying to find an error in a log file or output to the screen, what command can we use to review, say, the first 10 lines?

  • wc
  • tail
  • head (CORRECT)
  • bisect

Awesome! The head command will print the first lines of a file, 10 lines by default.
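
When the review has to happen inside a Python script rather than at the shell, the same idea can be approximated with the standard library. This is a rough sketch: the head function and the temporary sample log are made up for illustration.

```python
import os
import tempfile
from itertools import islice


def head(path, count=10):
    """Return the first `count` lines of a text file, similar to the head command."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        # islice stops after `count` lines, so the rest of the file is never read
        return [line.rstrip("\n") for line in islice(handle, count)]


# Build a throwaway 25-line log file so the sketch runs on its own
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.writelines("line %d\n" % n for n in range(1, 26))

first_lines = head(tmp.name)
os.remove(tmp.name)
print(first_lines)
```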

3. The best_search function compares linear_search and binary_search functions to locate a key in the list, returns how many steps each method took, and which method is the best for that situation. The list does not need to be sorted, as the binary_search function sorts it before proceeding (and uses one step to do so). Here, linear_search and binary_search functions both return the number of steps that it took to either locate the key or determine that it’s not in the list. If the number of steps is the same for both methods (including the extra step for sorting in binary_search), then the result is a tie. Fill in the blanks to make this work.

def linear_search(list, key):
    # Returns the number of steps to determine if key is in the list

    # Initialize the counter of steps
    steps = 0
    for i, item in enumerate(list):
        steps += 1
        if item == key:
            break
    return steps


def binary_search(list, key):
    # Returns the number of steps to determine if key is in the list

    # List must be sorted:
    list.sort()

    # The sort was 1 step, so initialize the counter of steps to 1
    steps = 1

    left = 0
    right = len(list) - 1
    while left <= right:
        steps += 1
        middle = (left + right) // 2

        if list[middle] == key:
            break
        if list[middle] > key:
            right = middle - 1
        if list[middle] < key:
            left = middle + 1
    return steps


def best_search(list, key):
    # Run both searches and compare how many steps each one took
    steps_linear = linear_search(list, key)
    steps_binary = binary_search(list, key)
    results = "Linear: " + str(steps_linear) + " steps, "
    results += "Binary: " + str(steps_binary) + " steps. "
    if (steps_linear < steps_binary):
        results += "Best Search is Linear."
    elif (steps_binary < steps_linear):
        results += "Best Search is Binary."
    else:
        results += "Result is a Tie."

    return results


print(best_search([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 1))
#Should be: Linear: 1 steps, Binary: 4 steps. Best Search is Linear.


print(best_search([10, 2, 9, 1, 7, 5, 3, 4, 6, 8], 1))
#Should be: Linear: 4 steps, Binary: 4 steps. Result is a Tie.


print(best_search([10, 9, 8, 7, 6, 5, 4, 3, 2, 1], 7))
#Should be: Linear: 4 steps, Binary: 5 steps. Best Search is Linear.


print(best_search([1, 3, 5, 7, 9, 10, 2, 4, 6, 8], 10))
#Should be: Linear: 6 steps, Binary: 5 steps. Best Search is Binary.


print(best_search([5, 1, 8, 2, 4, 10, 7, 6, 3, 9], 11))
#Should be: Linear: 10 steps, Binary: 5 steps. Best Search is Binary.

Way to go! You’re getting good at working with the different search methods!

4. When searching for more than one element in a list, which of the following actions should you perform first in order to search the list as quickly as possible?  

  • Sort the list (CORRECT)  
  • Do a binary search  
  • Do a linear search  
  • Use a base three logarithm

Nailed it! A list must be sorted first before it can take advantage of the binary search algorithm.
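
As a rough illustration of "sort once, search many times," the sketch below uses Python's standard bisect module. The contains helper and the sample keys are assumptions made for the example, not part of the quiz.

```python
from bisect import bisect_left


def contains(sorted_items, key):
    """Binary search in an already sorted list using the standard bisect module."""
    index = bisect_left(sorted_items, key)
    return index < len(sorted_items) and sorted_items[index] == key


items = [10, 2, 9, 1, 7, 5, 3, 4, 6, 8]
items.sort()  # pay the sorting cost once...

# ...then reuse the sorted list for every lookup
found = {key: contains(items, key) for key in (1, 7, 11)}
print(found)  # prints {1: True, 7: True, 11: False}
```

The cost of sorting is paid a single time, and each later lookup only needs a handful of comparisons.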

5. When troubleshooting an XML configuration file that’s failed after being updated for an application, what would you bisect in the code?  

  • File format  
  • File quantity  
  • Folder location  
  • Variables (CORRECT)

Nicely done! The list of variables defined in the file can be bisected, testing one half at a time, until the root cause is identified.

DEBUG PYTHON SCRIPTS

1. How does debugging a script contribute to the quality and reliability of code and applications?

  • Debugging focuses on optimizing code for faster execution and application reliability.
  • Debugging involves documenting the entire codebase.
  • Debugging primarily aims to create various new software features.
  • Debugging helps you identify and rectify errors, enhancing code quality and application reliability. (CORRECT)

Correct

2. In the lab, what was identified as the root cause of the issue that led to the TypeError?

  • An incorrect variable assignment
  • A missing function definition
  • Inconsistent indentation in the code
  • An attempt to concatenate data of different types without conversion (CORRECT)

Correct

3. Why is it necessary to convert the integer variable number to a string using the str() function in the revised print statement?

  • To enable concatenation of the integer number with the string in the print statement. (CORRECT)
  • To prevent the print statement from throwing a syntax error
  • To make the print statement more readable
  • To reduce memory usage in the script

Correct

4. In the lab, what is the main issue with the script involving concatenating two different data points?

  • The script is missing proper indentation.
  • The script uses incompatible data types for concatenation. (CORRECT)
  • The colleague didn’t include the necessary import statements.
  • The colleague didn’t provide enough information.

Correct

5. You receive the following error message (TypeError: Can’t convert ‘int’ object to str implicitly). What do you need to do to resolve this error? 

  • Replace the string with a set of integers.
  • Switch to another programming language.
  • Change the number into a string. (CORRECT)
  • Rewrite the code.

Correct

6. In the lab, you encountered an error message that indicates a problem with concatenating a string and an integer. To figure out the root cause of this bug, what step did you take next in the lab after successfully reproducing the error?

  • Examined the code within the script (CORRECT)
  • Reinstalled the SSH client
  • Checked the internet connection
  • Rebooted the virtual machine

Correct

7. A recurring problem can be defined as what? 

  • A problem that can only be resolved with a system reboot.
  • A problem that occurs only once.
  • A problem that occurs consistently under the same set of conditions. (CORRECT)
  • A problem that occurs randomly, without a predictable pattern.

Correct

8. Which of the following best describes the importance of debugging a script? 

  • It helps you find the root cause of the problem. (CORRECT)
  • It helps you rewrite lines of code that are not working properly
  • It helps you identify the output of a function.
  • It helps you run commands.

Correct

9. In the lab, where was the root cause of the issue located? 

  • Within the print statement (CORRECT)
  • When listing all the files in the directory
  • In the scripts directory
  • When viewing the contents of the file

Correct

10. In the debugging process described, what is the purpose of replacing the original print statement with the following statement?

print("hello " + name + ", your random number is " + str(number))

  • The purpose is to remove the print statement from the script entirely.
  • The purpose is to fix a syntax error in the print statement.
  • The purpose is to ensure the print statement correctly concatenates the name and random number with the appropriate data types. (CORRECT)
  • The purpose is to make the print statement more concise and shorter.

Correct
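
To see the failure and the fix side by side, here is a small sketch. The values of name and number are hypothetical stand-ins for what the lab script produces, and the exact wording of the TypeError message varies between Python versions.

```python
name = "Kay"   # hypothetical value; the lab script obtains the name elsewhere
number = 42    # hypothetical stand-in for the random number

try:
    # Concatenating a str and an int directly raises TypeError
    message = "hello " + name + ", your random number is " + number
except TypeError as error:
    print("TypeError:", error)

# Converting the integer with str() makes both operands strings
message = "hello " + name + ", your random number is " + str(number)
print(message)  # prints: hello Kay, your random number is 42
```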

11. In the lab, what is the main issue with the script involving concatenating two different data points?

  • The script uses incompatible data types for concatenation. (CORRECT)
  • The colleague didn’t provide enough information.
  • The colleague didn’t include the necessary import statements.
  • The script is missing proper indentation.

Correct

12. In the lab, what was the recommended solution to address the issue in the Python script where two different data types (string and int) need to be concatenated?

  • Convert the integer to a string using the str() function. (CORRECT)
  • Modify the file path in the script.
  • Add the necessary import statements to the script.
  • Check for indentation errors in the script.

Correct

13. Why is it necessary to convert the integer variable number to a string using the str() function in the revised print statement?

  • To make the print statement more readable
  • To reduce memory usage in the script
  • To prevent the print statement from throwing a syntax error
  • To enable concatenation of the integer number with the string in the print statement. (CORRECT)

Correct

14. In Python, what issue arises when trying to concatenate two different data types, such as a string and an integer, as seen in the provided code snippet?

  • TypeError (CORRECT)
  • TypeMismatchError
  • ValueError
  • SyntaxError

Correct

15. Complete this sentence. In troubleshooting and debugging, a(n) ______________ type of problem is reproducible under the same set of conditions. 

  • occasional
  • intermittent
  • typical
  • recurring (CORRECT)

Correct

16. When a bug is found in a software program during testing, what is the best way to communicate it to the right individual(s)? 

  • Email the testing and development department.
  • Document it and notify relevant team members (CORRECT)
  • Post it on social media.
  • Ignore it.

Correct

17. What can you infer if a recurring problem goes away when you change the system settings? 

  • The problem was a hardware failure.
  • Something in the system settings resolved the problem permanently.
  • The problem was a software issue.
  • Something in the system settings was causing the problem. (CORRECT)

Correct

18. Why is it important to reproduce an error when debugging scripts? Select all that apply.

  • To confirm that the issue is real and not a one-time occurrence (CORRECT)
  • To skip the debugging process and directly apply fixes
  • To verify that the error occurs only once
  • To document the problem for future reference (CORRECT)
  • To isolate the variables and conditions contributing to the error (CORRECT)

Correct

19. Which of the following statements are true about adding two different data types directly in Python? 

  • In Python, as long as the syntax is correct, you can directly add two or more different data types.
  • In Python, you can directly add two different data types when necessary.
  • In Python, you cannot add two different data types directly unless you are debugging.
  • In Python, you cannot add two different data types directly. (CORRECT)

Correct

20. While following the lab instructions and attempting to connect to your VM or run a script, you encounter the error mentioned in the lab: “TypeError: Can’t convert ‘int’ object to str implicitly.” What should be your initial response to address this error?

  • Ignore the error and continue with the lab exercises.
  • Restart your VM to resolve the error.
  • Analyze the error message and traceback to identify the issue. (CORRECT)
  • Immediately attempt to fix the error without further investigation.

Correct

21. When a bug is found in a software program during testing, what should you do? 

  • Document it and notify relevant team members (CORRECT)
  • Post it on social media
  • Ignore it. Bugs are expected in testing.
  • Email the testing and development department

Correct

22. Complete this sentence. The error message TypeError: Can’t convert ‘int’ object to str implicitly lets you know that ___________________________. 

  • Something in the code is trying to convert a string into an integer.
  • There is a syntax error in the code. 
  • Something in the code is trying to concatenate a string and an integer. (CORRECT)
  • The string within this code is missing.

Correct

23. As part of your testing process, you’ve successfully reproduced the error in the lab. After identifying that the root cause of the issue is a concatenation problem involving different data types (string and integer), you decide to use the str() function to convert the integer variable into a string. How does this step contribute to effective software testing?

  • It verifies the correctness of the software’s user interface.
  • It ensures that the software meets performance requirements.
  • It helps in finding syntax errors in the code.
  • It assists in uncovering and addressing data type compatibility issues. (CORRECT)

Correct

24. What is the definition of a recurring problem in the context of troubleshooting and debugging? 

  • A problem that occurs consistently under the same set of conditions. (CORRECT)
  • A problem that can only be resolved with a system reboot.
  • A problem that occurs only once.
  • A problem that occurs randomly, without a predictable pattern.

Correct

CONCLUSION – Troubleshooting Concepts

This module went through a variety of strategies and techniques for handling practical problems and introduced the fundamentals of problem solving. Debugging was emphasized as the heart of troubleshooting, alongside tools such as tcpdump, ps, top, ltrace, and others that support the debugging process.

It highlighted the importance of fully understanding the problem, which may be more involved than it appears at first sight. Techniques for dealing with reproducible and intermittent errors were also discussed. The module covered “binary searching a problem,” along with linear and binary search methods. Finally, bisecting was presented as a practical way to find invalid data in a CSV file.

You are now equipped with a wide array of tools and knowledge for troubleshooting.
