Module 2: Slowness

INTRODUCTION – Slowness

In this module, you will explore the reasons a machine or application can slow down. You address slowness by finding the bottleneck responsible for it. Tools such as iotop, iftop, and Activity Monitor on macOS help you determine which resource is exhausted.

You will also learn the basics of how a computer uses its resources, such as CPU, RAM, and cache, which leads to a closer look at the causes of slow machines and slow scripts. Because the module is about writing efficient code, you will also learn how to use profiling tools to identify the time-consuming parts of your code.

You will learn about different data structures, like lists, tuples, dictionaries, and sets. You will also explore how expensive loops can become. The module then turns to more complex slowness problems and shows how concurrency and caching services can speed up code execution. Finally, you will see the speed improvement that threaded code execution can achieve.

Learning objectives:

  • Understand what slowness is and how to use tools to find the bottleneck responsible for a slowdown
  • Use the iotop and iftop tools to identify exhausted resources
  • Understand what each part of a computer does and how it contributes to slowness
  • Write efficient code by choosing the right data structures and avoiding expensive loops
  • Apply concurrency, caching services, and threads to improve execution performance

PRACTICE QUIZ: UNDERSTANDING SLOWNESS

1. Which of the following will an application spend the longest time retrieving data from?

  • CPU L2 cache
  • RAM
  • Disk
  • The network (CORRECT)

Right on! An application will take the longest time trying to retrieve data from the network.

2. Which tool can you use to verify reports of ‘slowness’ for web pages served by a web server you manage?

  • The top tool
  • The ab tool (CORRECT)
  • The nice tool
  • The pidof tool

Great work! A web server's performance is measured by sending it requests and checking how long the responses take. The ab tool (Apache Benchmark) does this by issuing a batch of requests and reporting the response times.

3. If our computer running Microsoft Windows is running slow, what performance monitoring tools can we use to analyze our system resource usage to identify the bottleneck? (Check all that apply)

  • Performance Monitor (CORRECT)
  • Resource Monitor (CORRECT)
  • Activity Monitor
  • top

Performance Monitor is a Windows tool that you can use to monitor the usage of basic resources such as CPU and memory.

Resource Monitor is another Windows monitoring tool, with more advanced features that show hardware and software resource usage in real time.

4. Which of the following programs is likely to run faster and more efficiently, with the least slowdown?

  • A program with a cache stored on a hard drive
  • A program small enough to fit in RAM (CORRECT)
  • A program that reads files from an optical disc
  • A program that retrieves most of its data from the Internet

Nice work! A program runs faster when all of it fits in RAM than when it has to be read from a hard drive or fetched over the network.

5. What might cause a single application to slow down an entire system? (Check all that apply)

  • A memory leak (CORRECT)
  • The application relies on a slow network connection
  • Handling files that have grown too large (CORRECT)
  • Hardware faults

Oh, for sure! A memory leak happens when an application keeps holding memory it no longer needs. Over time this leaves less memory for everything else, which can slow down the whole system.

Most certainly. An application that handles files that have grown too large may slow down as it loads them into RAM and writes them back, taking up too much of the system’s resources.

6. When addressing slowness, what do you need to identify?

  • The bottleneck (CORRECT)   
  • The device  
  • The script  
  • The system  

Woohoo! The bottleneck could be the CPU time, or time spent reading data from disk.  

7. After retrieving data from the network, how can an application access that same data quicker next time?

  • Use the swap
  • Create a cache (CORRECT)
  • Use memory leak
  • Store in RAM

You nailed it! A cache stores data in a form that’s faster to access than its original form.
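
As a rough illustration of the caching idea, here is a minimal Python sketch. The function `fetch_from_network` is a hypothetical stand-in for a slow network call (it only counts how often it is invoked), and the names `_cache` and `get_data` are invented for this example.

```python
# Hypothetical stand-in for a slow network fetch; it only counts calls.
network_calls = 0

def fetch_from_network(key):
    global network_calls
    network_calls += 1
    return f"data for {key}"

_cache = {}  # maps key -> previously fetched result

def get_data(key):
    """Return cached data when available, fetching only on a cache miss."""
    if key not in _cache:
        _cache[key] = fetch_from_network(key)
    return _cache[key]

first = get_data("report")   # cache miss: goes to the "network"
second = get_data("report")  # cache hit: served from the dictionary
```

The second call never reaches the slow path, which is the whole point of a cache. A real cache would also need a policy for how long entries stay valid.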

8. A computer becomes sluggish after a few days, and the problem goes away after a reboot. Which of the following is the possible cause?

  • Files are growing too large. 
  • A program is keeping some state while running. (CORRECT)
  • Files are being read from the network. 
  • Hard drive failure.

Awesome! A program that keeps some state while running can let that state grow until it slows the computer down. A reboot clears the state, so the problem goes away until it builds up again.

PRACTICE QUIZ: SLOW CODE

1. Which of the following is NOT considered an expensive operation?

  • Parsing a file
  • Downloading data over the network
  • Going through a list
  • Using a dictionary (CORRECT)

Awesome! Using a dictionary is faster to look up elements than going through a list.
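
The difference can be sketched in a few lines of Python. The user records below are invented for illustration. The list search has to walk the elements one by one, while the dictionary finds the entry by its key.

```python
# Hypothetical user records, invented for illustration.
users = [("alice", 1001), ("bob", 1002), ("carol", 1003)]

def find_in_list(name):
    """Linear search: checks each element until it finds a match."""
    for user_name, uid in users:
        if user_name == name:
            return uid
    return None

# Build the dictionary once; lookups afterwards go straight to the key.
uid_by_name = dict(users)

list_result = find_in_list("carol")
dict_result = uid_by_name.get("carol")
```

With three entries the difference is invisible, but with thousands of entries the list search grows with the size of the list while the dictionary lookup stays roughly constant.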

2. Which of the following may be the most expensive to carry out in most automation tasks in a script?

  • Loops (CORRECT)
  • Lists
  • Vector
  • Hash

Great work! A loop repeats its body many times, so any expensive operation inside it is multiplied. Loops that run indefinitely, or that include subtasks which must finish before the loop can continue, are especially costly in automation scripts.

3. Which of the following statements represents the most sound advice when writing scripts?

  • Aim for every speed advantage you can get in your code
  • Use expensive operations often
  • Start by writing clear code, then speed it up only if necessary (CORRECT)
  • Use loops as often as possible

Awesome! If there is no noticeable slowdown, there is no need to speed the code up, because its performance is already adequate for the task.

4. In Python, what is a data structure that stores multiple pieces of data, in order, which can be changed later?

  • A hash
  • Dictionaries
  • Lists (CORRECT)
  • Tuples

Right on! Lists work well when you want to iterate over the elements or access an element at a specific position.
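
A short sketch of the difference between a list and a tuple, using made-up host names and a made-up address. Both keep their elements in order, but only the list can be changed after it is created.

```python
hosts = ["web01", "web02"]      # list: ordered and mutable
hosts.append("db01")            # add an element at the end
hosts[0] = "web-a"              # replace an element in place

endpoint = ("10.0.0.5", 8080)   # tuple: ordered but immutable
try:
    endpoint[1] = 9090          # tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False
```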

5. What command, keyword, module, or tool can be used to measure the amount of time it takes for an operation or program to execute? (Check all that apply)

  • time (CORRECT)
  • kcachegrind (CORRECT)
  • cProfile (CORRECT)
  • break

The time command measures how long a command or script takes to run. Prefix the command with time, and the shell prints the execution time statistics when it finishes.

The kcachegrind tool helps visualize profile data. If we insert profiling code into our program, it lets us see how long the execution of each function takes.

cProfile is a Python module that provides deterministic profiling of Python programs, reporting how many times and for how long various parts of the code executed.
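
As a hedged illustration of profiling from inside a script, the sketch below uses the standard-library `cProfile` and `pstats` modules. The function `slow_sum` is invented purely so there is something to measure.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Deliberately naive loop used as the function to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Format the collected statistics, sorted by cumulative time.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The printed report lists each profiled function with its call count and the time spent in it, which is usually enough to see where a script spends most of its time.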

6. Which of the following has values associated with keys in Python?

  • A hash
  • A dictionary (CORRECT)
  • A HashMap
  • An Unordered Map

You nailed it! Python uses a dictionary to store values, each with a specific key.

7. Your Python script searches a directory, and runs other tasks in a single loop function for 100s of computers on the network. Which action will make the script the least expensive?  

  • Read the directory once (CORRECT)
  • Loop the total number of computers  
  • Service only half of the computers  
  • Use more memory 

Exactly! Read the directory once, before the loop, and reuse the result for every computer. Re-reading it on each iteration would repeat an expensive disk operation hundreds of times.
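
A minimal sketch of this pattern, assuming a hypothetical helper `plan_tasks` and made-up host names. The directory listing is taken once and then shared by every computer instead of being read again inside the loop.

```python
import os
import tempfile

def plan_tasks(directory, computers):
    """Pair every computer with the directory listing, read only once."""
    files = sorted(os.listdir(directory))  # expensive call, done one time
    return {computer: files for computer in computers}

# Demonstration with a temporary directory and made-up host names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.log", "b.log"):
        open(os.path.join(tmp, name), "w").close()
    plan = plan_tasks(tmp, ["host1", "host2", "host3"])
```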

8. Your script calculates the average number of active user sessions during business hours in a seven-day period. How often should a local cache be created to give a good enough average without updating too often?  

  • Once a week
  • Once a day (CORRECT)
  • Once a month  
  • Once every 8 hours 

Woohoo! A cache refreshed once a day gives one value per day, which is enough to calculate a seven-day average without fetching the data more often than needed.

9. You use the time command to determine how long a script runs to complete its various tasks. Which output value will show the time spent doing operations in the user space?  

  • Real  
  • Wall-clock  
  • Sys  
  • User (CORRECT)

You nailed it! The user value is the time spent doing operations in the user space. 
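
The same distinction between wall-clock time and CPU time can be observed from inside Python. This is only a sketch: `time.perf_counter()` measures elapsed wall-clock time, while `time.process_time()` returns the combined user and system CPU time of the current process, so it does not separate the two the way the time command does.

```python
import time

wall_start = time.perf_counter()   # wall-clock timer
cpu_start = time.process_time()    # user + system CPU time of this process

time.sleep(0.2)                    # waiting: uses almost no CPU

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start
print(f"wall: {wall_elapsed:.3f}s  cpu: {cpu_elapsed:.3f}s")
```

A large gap between the two numbers, as here, suggests the program spent most of its time waiting rather than computing.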

PRACTICE QUIZ: WHEN SLOWNESS PROBLEMS GET COMPLEX

1. Which of the following can cache database queries in memory for faster processing of automated tasks?

  • Threading
  • Varnish
  • Memcached (CORRECT)
  • SQLite

You nailed it! Memcached is a caching service that keeps the most commonly accessed database queries in RAM.

2. What module specifies parts of a code to run in separate asynchronous events?

  • Threading
  • Futures
  • Asyncio (CORRECT)
  • Concurrent

Awesome! Asyncio lets you mark parts of your code to run as separate asynchronous tasks, so the program does not block while one of them waits.
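
A small sketch of what this can look like, assuming a plain script with no event loop already running. The coroutine `check_host` and the host names are hypothetical, and `asyncio.sleep` stands in for real network I/O.

```python
import asyncio

async def check_host(name, delay):
    """Pretend to contact a host; the sleep stands in for network I/O."""
    await asyncio.sleep(delay)
    return f"{name} ok"

async def main():
    # gather() runs the coroutines concurrently and keeps argument order.
    return await asyncio.gather(
        check_host("host1", 0.2),
        check_host("host2", 0.1),
    )

results = asyncio.run(main())
```

Both checks wait at the same time, so the total run time is close to the longest single delay rather than the sum of the two.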

3. Which of the following allows our program to run multiple instructions in parallel?

  • Threading (CORRECT)
  • Swap space
  • Memory addressing
  • Dual SSD

Woohoo! Threading allows a process to split itself into parallel tasks.

4. What is the name of the field of study in computer science that concerns itself with writing programs and operations that run in parallel efficiently?

  • Memory management
  • Concurrency (CORRECT)
  • Threading
  • Performance analysis

Right on! Concurrency is the field of computer science concerned with running different parts of a program, algorithm, or problem out of order or in partial order without affecting the final outcome.

5. What would we call a program that often leaves our CPU with little to do as it waits on data from a local disk and the Internet?

  • Memory-bound
  • CPU-bound
  • User-bound
  • I/O bound (CORRECT)

Right on! If our program mainly finds itself waiting on local disks or the network, it is I/O bound.

6. A script is _____ if you are running operations in parallel using all available CPU time.  

  • I/O bound  
  • Threading  
  • CPU bound (CORRECT)
  • Asyncio

Definitely! A script is CPU-bound when its operations consume all the available CPU time. Running operations in parallel keeps every core busy, so the CPU becomes the limiting resource.

7. You’re creating a simple script that runs a query on a list of product names of a very small business, and initiates automated tasks based on those queries. Which of the following would you use to store product names?  

  • SQLite  
  • Microsoft SQL Server  
  • Memcached 
  • CSV file (CORRECT)

Nice job! A simple CSV file is enough to store a list of product names.  

8. A company has a single web server hosting a website that also interacts with an external database server. The web server is processing requests very slowly. Checking the web server, you found the disk I/O has high latency. Where is the cause of the slow website requests most likely originating from?  

  • Local disk (CORRECT)
  • Remote database  
  • Slow Internet  
  • Database index

You got it! High disk I/O latency on the web server delays request handling, because the application has to wait a long time for data to be read from the local disk.

9. Which module makes it possible to run operations in a script in parallel that makes better use of CPU processing time?  

  • Executor  
  • Futures (CORRECT)
  • Varnish  
  • Concurrency  

Woohoo! The futures module (concurrent.futures) lets a script run operations in parallel and takes care of much of the concurrency management.
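
A minimal sketch of the executor interface in `concurrent.futures`. The function `file_size_kb` and the sizes are invented for illustration. A real script would do actual I/O or computation in each task.

```python
from concurrent import futures

def file_size_kb(size_bytes):
    """Hypothetical unit of work; a real script would do I/O or computation."""
    return size_bytes // 1024

sizes = [2048, 4096, 10240]

# ThreadPoolExecutor suits I/O-bound work; ProcessPoolExecutor has the
# same interface and is the usual choice for CPU-bound work.
with futures.ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(file_size_kb, sizes))
```

`executor.map` returns results in the same order as the inputs, regardless of which task finishes first.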

PERFORMANCE TUNING IN PYTHON SCRIPTS

1. Which of the following best describes a CPU-bound task?

  • A task that frequently waits for network responses.
  • A task that consistently requires more memory than is available.
  • A task that often waits for I/O operations to complete.
  • A task that primarily utilizes only one of the available CPU cores, even when others are free. (CORRECT)

Correct

2. Which of the following best describes rsync (remote sync)?

  • rsync is a system command that enables administrative control over user access to files within networked computers.
  • rsync is a tool that automates system backups by periodically duplicating all files without checking for changes.
  • rsync is a utility for efficiently transferring and synchronizing files between a computer and an external hard drive and across networked computers by comparing the modification time and size of files. (CORRECT)
  • rsync is a networking tool used for monitoring data usage and bandwidth in real-time across multiple computer systems.

Correct

3. In the lab, you employed multiprocessing to reduce backup time. Why was multiprocessing the right choice in this example?

  • It added timestamps to each operation, which enabled better tracking
  • The task was CPU-bound, and multiprocessing leveraged unused CPU cores to run the script significantly faster (CORRECT)
  • It leveraged a faster internet connection to fix the network bottleneck
  • It reduced the number of variables used to speed up the process

Correct

4. True or false: psutil is a cross-platform library for retrieving information on running processes and system utilization (CPU, memory, disks, network, sensors) in Python. 

  • True (CORRECT)
  • False

Correct

5. In the assessment, the result of the psutil.cpu_percent() function is “.6”. What does this mean?

  • The CPU is being utilized at 60% of its capacity.
  • The system has 0.6 cores available for processing tasks.
  • The CPU utilization is 0.6%, indicating very low CPU usage. (CORRECT)
  • The function has encountered an error, and 0.6 is the error code.

Correct

6. What is the correct order of arguments when using the rsync command?

  •  [Destination] [Options] [Source-Files-Dir]
  • [Source-Files-Dir] [Destination] [Options]
  • [Options] [Destination] [Source-Files-Dir]
  • [Options] [Source-Files-Dir] [Destination] (CORRECT)

Correct
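
The argument order can be sketched as a small Python helper that only builds the command line and does not run rsync. The function name `build_rsync_command`, the default options, and the paths are assumptions made for this example.

```python
import shlex

def build_rsync_command(source, destination, options=("-a", "-v", "-z")):
    """Return the rsync argument list: options, then source, then destination."""
    return ["rsync", *options, source, destination]

command = build_rsync_command("/home/user/docs", "/backup/docs")
print(shlex.join(command))
```

A list built this way could be passed to `subprocess.run`, but that step is left out here so the sketch does not depend on rsync being installed.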

7. True or false: In the assessment, the multisync.py script is designed to backup data sequentially, one task after another.

  • True
  • False (CORRECT)

Correct

8. True or false: A script that often waits for I/O operations to complete could be called a CPU-bound task.

  • True
  • False (CORRECT)

Correct

9. How does rsync (remote sync) primarily optimize data transfer?

  • rsync prioritizes files based on their importance and transfers critical files first.
  • rsync duplicates all files every time to ensure no data is missed during transfer.
  • rsync uses advanced compression algorithms to reduce the size of files before transferring.
  • rsync transfers and synchronizes files by comparing the modification time and size, ensuring only changed data is transferred. (CORRECT)

Correct

10. In the Qwiklab, what did you use the psutil.disk_io_counters() function to do?

  • To estimate file space usage on the disk
  • To configure network interface parameters
  • To monitor real-time network traffic
  • To retrieve disk I/O statistics (CORRECT)

Correct

11. In this example, how was the backup script improved to reduce the backup time significantly?

  • By using more verbose logging
  • By increasing the network bandwidth
  • By utilizing multiprocessing (CORRECT)
  • By compressing the files before transferring

Correct

12. In the activity, what was the psutil python3 module used for?

  • Monitoring network bandwidth (CORRECT)
  • Analyzing GPU performance
  • Checking power consumption
  • Checking CPU usage (CORRECT)

Correct

13. In the multisync.py script, what is the role of the map method of the Pool object?

  • To distribute the tasks evenly across available CPUs (CORRECT)
  • To map the output of one task to the input of another
  • To map all tasks to a single processor
  • To create a mapping of task dependencies

Correct

14. Which of the following are options for the rsync command? Select all that apply.

  • -p
  • -z (CORRECT)
  • -v (CORRECT)
  • -a (CORRECT)

Correct

15. True or false: In this example, the efficiency of the script was improved by compressing the files before transferring.

  • False (CORRECT)
  • True

Correct

16. True or false: In order to check how much your program utilizes CPU using psutil.cpu_percent(), you first need to install the pip3 which is a Python package installer. 

  • True
  • False (CORRECT)

Correct

17. What makes rsync (remote sync) distinct from other file transfer methods?

  • rsync encrypts all files before transfer, ensuring maximum security.
  • rsync increases the speed of the internet connection during file transfer.
  • rsync requires manual selection of each file for transfer.
  • rsync uses the delta transfer algorithm, meaning it transfers only the differences between source and destination files. (CORRECT)

Correct

18. Which command did you use for checking disk I/O?

  • df -h
  • netstat
  • psutil.disk_io_counters() (CORRECT)
  • diskcheck

Correct

19. Which of the following performance metrics did you explore to identify system limitations? Select all that apply.

  • Checking power consumption
  • Monitoring network bandwidth (CORRECT)
  • Checking CPU usage (CORRECT)
  • Analyzing GPU performance

Correct

20. Why is it necessary to grant executable permission to the multisync.py script before running it?

  • To enable the script to access system files
  • To allow the script to use network resources
  • To permit the script to modify its own code
  • To allow the operating system to execute the script as a program (CORRECT)

Correct

21. Which of the following options for the rsync command provides a verbose output?

  • -q
  • -z
  • -b
  • -v (CORRECT)

Correct

22. In the rsync command syntax, what does the [Destination] argument represent?

  • The directory or file where the data will be synchronized to (CORRECT)
  • The options or flags that modify the behavior of the command
  • The source directory or file that needs to be synchronized
  • The name of the command itself

Correct

23. In the assessment, 904123904 bytes were written to disk. You found this result using the ______________ function.

  • psutil.net_io_counters()
  • psutil.memory_info()
  • psutil.disk_io_counters() (CORRECT)
  • psutil.cpu_percent()

Correct

24. What is the following Python script used for?

import psutil
psutil.cpu_percent()
  • Memory usage
  • System uptime
  • CPU utilization (CORRECT)
  • Network performance 

Correct

25. If you want to synchronize files from the directory /home/user/docs to /backup/docs with verbose output, which of the following rsync commands would you use?

  • rsync /home/user/docs /backup/docs -v
  • rsync -v /backup/docs /home/user/docs
  • rsync -v /home/user/docs /backup/docs (CORRECT)
  • rsync /backup/docs /home/user/docs

Correct

26. What is the purpose of the Pool class in the multiprocessing Python module as used in the multisync.py script?

  • To create a single process for each task
  • To manage a pool of worker processes (CORRECT)
  • To synchronize execution of processes
  • To limit the CPU usage of the script

Correct

27. Which of the following statements best describes the primary purpose of the multiprocessing module in Python?

  • It allows for the execution of multiple threads within a single process.
  • It provides support for parallel execution of code using multiple CPU cores. (CORRECT)
  • It manages multiple Python interpreters in the same program.
  • It facilitates asynchronous I/O operations without using threads or processes.

Correct

CONCLUSION – Slowness

That wraps up this module. You now have the tools to find the cause of slowness in a machine or program. Tools like iotop, iftop, and Activity Monitor help you spot bottlenecks and exhausted resources, which makes performance problems easier to diagnose. You have also seen how looking at the usage of the CPU, memory, and cache reveals where a system is being held back.

The module also covered strategies for writing efficient code and using profilers to identify the hot spots in a program. You gained a deeper understanding of Python data structures such as tuples, lists, dictionaries, and sets, which helps you pick the right one for a task and avoid expensive loops. Finally, the module introduced more advanced topics, including concurrency and caching services, and showed how running code in threads can speed up slow programs.
