Module 5: APIs and Data Collection

INTRODUCTION – APIs and Data Collection

This module explores data collection techniques in depth, including database operations and manipulation through APIs and scraping data from web pages. It also gives detailed instructions for retrieving data effectively from these sources.

The module also covers reading and collecting data from multiple file formats, giving you the tools to handle different kinds of data. Combining newer web-based collection techniques with conventional file handling provides a thorough foundation for understanding how data is acquired.

Learning Objectives:

  • Become familiar with the HTTP protocol through the Requests library.
  • Explain how the URL, request, and response parts of the HTTP protocol work.
  • Interact with a basic open-source API.
  • Perform basic web scraping in Python.
  • Manipulate different file formats using Python.
  • Explain the difference between APIs and REST APIs.
  • Summarize the process through which APIs send and receive information.

PRACTICE QUIZ 1

1. What does API stand for?

  • Automatic Program Interaction
  • Application Programming Interaction
  • Application Process Interface
  • Application Programming Interface (CORRECT)

2. Which data format is commonly found in the HTTP message for API requests?   

  • JSON (CORRECT)
  • HTML
  • XML
  • YAML

3. What is the primary purpose of an API?  

  • To provide security to web applications.  
  • To design user interfaces for mobile applications.  
  • To connect and enable communication between software applications.  (CORRECT)
  • To handle server-side database operations.  

PRACTICE QUIZ 2

1. What is the function of “GET” in HTTP requests?

  • Sends data to create or update a resource 
  • Carries the request to the client from the requestor (CORRECT)
  • Returns the response from the client to the requestor
  • Deletes a specific resource

2. What does URL stand for?

  • Uniform Request Location
  • Uniform Resource Locator (CORRECT)
  • Unilateral Resistance Locator
  • Uniform Resource Learning

3. What does the file extension “csv” stand for?

  • Comma Separated Values (CORRECT)
  • Comma Separation Valuations 
  • Common Separated Variables
  • Comma Serrated Values 

4. What is webscraping?

  • The process to describe communication options
  • The process to display all data within a URL
  • The process to request and retrieve information from a client
  • The process to extract data from a particular website (CORRECT)

MODULE 5 GRADED QUIZ

1. What are the 3 parts to a response message?

  • HTTP headers, blank line, and body
  • Encoding, body, and cache
  • Start or status line, header, and body (CORRECT)
  • Bookmarks, history, and security

2. What is the purpose of this line of code “table_row=table.find_all(name='tr')” used in webscraping?

  • It will find all of the data within the table marked with a tag “p”
  • It will find all of the data within the table marked with a tag “a”
  • It will find all of the data within the table marked with a tag “h1”
  • It will find all of the data within the table marked with a tag “tr” (CORRECT)

3. In what data format are HTTP responses generally returned?

  • JSON (CORRECT)
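
As a rough illustration of why JSON is convenient for API responses, the sketch below parses a made-up response body with Python's standard json module. The body text and its field names ("name", "prices") are hypothetical and do not come from any real API.

```python
import json

# Hypothetical response body, standing in for the text an API might send back.
body = '{"name": "bitcoin", "prices": [101.5, 102.0]}'

# json.loads converts JSON text into ordinary Python dicts and lists.
data = json.loads(body)

print(type(data).__name__)  # dict
print(data["prices"][0])    # 101.5
```

Once parsed, the result can be indexed like any other dictionary or list.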
  • Lists
  • Nested Lists
  • Tuples

4. The Python library we used to plot the chart in video/lab is

  • MatPlotLib (CORRECT)
  • Plotly
  • PyCoinGecko
  • Pandas

FINAL EXAM

1. When slicing in Python what does the “0” in this statement [0:2] specify?

  • It specifies the step of the slicing
  • It specifies the position to start the slice (CORRECT)
  • It specifies the position to end the slice

2. If var = “01234567” what Python statement would print out only the odd elements?

  • print(var[2::2]) 
  • print(var[3::1])
  • print(var[1::2]) (CORRECT)
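
The slicing questions above can be checked with a short sketch. The variable names first_two, odd and even are only illustrative.

```python
var = "01234567"

first_two = var[0:2]  # start at index 0, stop before index 2
odd = var[1::2]       # start at index 1, then take every second character
even = var[::2]       # start at index 0, then take every second character

print(first_two)  # 01
print(odd)        # 1357
print(even)       # 0246
```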

3. Consider the string Name=”EMILY”, what statement would return the index of 0?

  • Name.find(“I”)
  • Name.find(“L”)
  • Name.find(“E”) (CORRECT)

4. What is the type of the following: 1.0

  • int
  • str
  • float (CORRECT)

5. What will happen if you cast a float to an integer?

  • An error will occur
  • Nothing happens
  • It will remove decimal point (CORRECT)

6. When using the double slash “//” for integer division the result will be?

  • Rounded (CORRECT)
  • Not rounded

7. In Python 3 what following code segment will produce a float?

  • 2//3
  • 1/2 (CORRECT)
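
A minimal sketch of the casting and division behaviour asked about in questions 5 to 7. The variable names are illustrative only.

```python
truncated = int(3.99)  # casting a float to int drops the fractional part
true_div = 1 / 2       # regular division always gives a float in Python 3
floor_div = 1 // 2     # integer division rounds the result down
negative = -7 // 2     # rounding down also applies to negative results

print(truncated)  # 3
print(true_div)   # 0.5
print(floor_div)  # 0
print(negative)   # -4
```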

8. How many identical keys can a dictionary have?

  • 0 (CORRECT)
  • 3
  • 100000000

9. What will this code segment “A[0]” obtain from a list or tuple?

  • The third element of a list or tuple
  • The first element of a list or tuple (CORRECT)
  • The second element of a list or tuple

10. What is the result of the following operation: '1:2,3:4'.split(':')?

  • ['1', '2', '3', '4']
  • ['1,2,3,4']
  • ['1,2', '3,4']
  • ['1', '2,3', '4'] (CORRECT)
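
To see why only the colons act as separators here, this short sketch runs split with two different delimiters. The variable names are illustrative.

```python
# split(':') cuts only at colons, so the comma stays inside the middle piece.
by_colon = "1:2,3:4".split(":")

# split(',') cuts only at commas.
by_comma = "1,2,3,4".split(",")

print(by_colon)  # ['1', '2,3', '4']
print(by_comma)  # ['1', '2', '3', '4']
```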

11. What is an important difference between lists and tuples?

  • Lists are mutable tuples are not (CORRECT)
  • Lists and tuples are the same
  • Tuples can only have integers 
  • Lists can’t contain a string 

12. What code segment is used to cast list “B” to the set “b”?

  • b.set()
  • b=set(B) (CORRECT)
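
A small sketch of casting a list to a set. The contents of B are made up for illustration; the point is that duplicates disappear.

```python
B = [1, 2, 2, 3, 3, 3]  # hypothetical list with repeated values

# set() builds a new set from the list; B itself is left unchanged.
b = set(B)

print(b == {1, 2, 3})  # True
print(len(B), len(b))  # 6 3
```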
  • b=B.dict()

13. If x=1 what will produce the below output?

Hi

Mike

  • if(x!=1):
        print('Hi')
    else:
        print('Hello')
    print('Mike')
  • if(x!=1):
        print('Hello')
    else:
        print('Hi')
    print('Mike') (CORRECT)
  • if(x==1):
        print('Hello')
    else:
        print('Hi')
    print('Mike')

14. What is the process of forcing your program to output an error message when it encounters an issue?

  • Output errors
  • Force Out
  • Exception handling (CORRECT)
  • Error messages

15. What add function would return ‘2’ ?

  • def add(x): return(x+x+x) add(‘1’) 
  • def add(x): return(x+x) add(‘1’) 
  • def add(x): return(x+x) add(1) (CORRECT)

16. What method organizes the elements in a given list in a specific descending or ascending order?

  • sort() (CORRECT)
  • split()
  • replace()
  • join()

17. What segment of code would output the following?

3

6

9

  • A=[1,2,3] for a in A: print(2*a) 
  • A=['1','2','3'] for a in A: print(2*a)
  • A=[1,2,3] for a in A: print(3*a) (CORRECT)

18. What code segment would output the following?

1

3

4

  • for i in range(1,5): if (i!=1): print(i)
  • for i in range(1,5): if (i==2): print(i)
  • for i in range(1,5): if (i!=2): print(i) (CORRECT)

19. What is the method defined in the class Rectangle used to draw the rectangle?

class Rectangle(object):
    def __init__(self,width=2,height=3,color='r'):
        self.height=height
        self.width=width
        self.color=color

    def drawRectangle(self):
        import matplotlib.pyplot as plt
        plt.gca().add_patch(plt.Rectangle((0, 0),self.width, self.height ,fc=self.color))
        plt.axis('scaled')
        plt.show()

  • drawRectangle (CORRECT)
  • class Rectangle
  • import matplotlib

20. What is the result of the following lines of code?

a=np.array([0,1,0,1,0])
b=np.array([1,0,1,0,1])
a*b

  • array([0, 0, 0, 0, 0]) (CORRECT)
  • 0
  • array([1, 1, 1, 1, 1])

21. What is the result of the following lines of code?

a=np.array([1,1,1,1,1])
a+10

  • array([1,1,1,1,1])
  • array([10,10,10,10,10])
  • array([11, 11, 11, 11, 11]) (CORRECT)

22. The following line of code selects the columns along with what headers from the dataframe df?

y=df[['Artist','Length','Genre']]

  • ‘Artist’, ‘Length’ and ‘Genre’ (CORRECT)
  • This line of code does not select the headers
  •  ‘Artist’, ‘Length’ and ‘y’

23. Consider the file object file1. What would the following line of code output?

file1.readline(4) 

  • It would output the entire text file
  • It would output the first 4 characters from the text file (CORRECT)
  • It would output the first 4 lines from the text file

24. What mode will write text at the end of the existing text in a file?

  • Append “a” (CORRECT)
  • Write “w”
  • Read “r”

25. What is the extraction of data from a website?

  • Data mining
  • Webscraping (CORRECT)
  • Web crawling

26. When slicing in Python what does the “2” in this statement [0:2] specify?

  • It specifies the step of the slicing
  • It specifies the position to end the slice (CORRECT)
  • It specifies the position to start the slice

27. What is the result of the following code segment: int(3.99)

  • 3.99
  • 4
  • 3 (CORRECT)

28. What following code segment would produce an output of “0.5”?

  • 1//2
  • 1/2 (CORRECT)

29. In Python 3 what does regular division always result in?

  • Int
  • Float (CORRECT)

30. A dictionary must have what type of keys?

  • Not changeable
  • Duplicate
  • Unique (CORRECT)

31. What is the syntax to obtain the first element of the tuple?

A=('a','b','c')

  • A[0] (CORRECT)
  • A[1]
  • A[:]

32. What line of code would produce this output: ['1','2','3','4']?

  • '1,2,3,4'.reverse(',')
  • '1,2,3,4'.split(',') (CORRECT)
  • '1,2,3,4'.join(',')
  • '1,2,3,4'.split(':')

33. What is a collection that is ordered, changeable and allows duplicate members?

  • List (CORRECT)
  • Set
  • Dictionary
  • Tuple

34. What happens with this segment of code: a=set(A) ?

  • It casts the list “A” to the set “a” (CORRECT)
  • It casts the list “a” to the set “A”
  • It returns an error

35. What value of x will produce the output?

Hi

Mike

x=

if(x!=1):
    print('Hello')
else:
    print('Hi')
print('Mike')

  • x=6
  • x="7"
  • x=1 (CORRECT)

36. Given the function add shown below, what does the following return?

def add(x): return(x+x)
add('1')

  • '11' (CORRECT)
  • '2'
  • 2

37. What function returns a sorted list?

  • sort()
  • find()
  • lower()
  • sorted() (CORRECT)

38. What is the output of the following few lines of code?

A=['1','2','3']
for a in A: print(2*a)

  • error: cannot multiply a string by an integer
  • 2 4 6
  • 11 22 33 (CORRECT)

39. What is the height of the rectangle in the class Rectangle?

class Rectangle(object):
    def __init__(self,width=2,height=3,color='r'):
        self.height=height
        self.width=width
        self.color=color

    def drawRectangle(self):
        import matplotlib.pyplot as plt
        plt.gca().add_patch(plt.Rectangle((0, 0),self.width, self.height ,fc=self.color))
        plt.axis('scaled')
        plt.show()

  • 0
  • 3 (CORRECT)
  • 2

40. What is the result of the following lines of code?

a=np.array([0,1,0,1,0])
b=np.array([1,0,1,0,1])
a/b

  • array([0.1, 1.0, 0.1, 1.0, 0.1])
  • Division by zero error (CORRECT)
  • array([1, 1, 1, 1, 1])

41. What is the result of the following lines of code?

a=np.array([10,9,8,7,6])
a+1

  • array([11,10,9,8,7]) (CORRECT)
  • array([101,91,81,71,61])
  • array([9, 8, 7, 6, 5])

42. How would you select the columns with the headers: Artist, Length and Genre from the dataframe df and assign them to the variable y ?

  • y=df[['Artist','Length','Genre']] (CORRECT)
  • y=df[['Artist'],['Length'],['Genre']]
  • y=df['Artist','Length','Genre']

43. In Python what statement would print out the first two elements “Li” of “Lizz”?

  • print(name[0:2]) (CORRECT)
  • print(name[1:2])
  • print(name[2:0])

44. If var = “01234567” what Python statement would print out only the even elements?

  • print(var[::2]) (CORRECT)
  • print(var[::1])
  • print(var[::3])

45. Consider the string Name=”ABCDE”, what is the result of the following operation Name.find(“B”) ?

  • 0
  • 1 (CORRECT)
  • 2

46. In Python what can be either a positive or negative number but does not contain a decimal point?

  • float(3.99)
  • int(3.99) (CORRECT)
  • str(3.99)

47. What following code segment would return a 3?

48. What does the index of “1” correspond to in a list or tuple?

  • The first element
  • The second element (CORRECT)
  • The third element

49. What is the result of the following operation: '1,2,3,4'.split(',') ?

  • '1','2','3','4'
  • ('1','2','3','4')
  • '1234'
  • ['1','2','3','4'] (CORRECT)

50. How do you cast the list A to the set a?

  • a=set(A) (CORRECT)
  • a=A.dict()
  • a.set()

51. What value of x will produce the following output?

How are you?

x=

if(x!=1):
    print('How are you?')
else:
    print('Hi')

  • x=1
  • x=6 (CORRECT)
  • x="7" (CORRECT)

52. Why is the “finally” statement used?

  • Only execute the remaining code if an error occurs
  • Execute the remaining code no matter the end result (CORRECT)
  • Only execute the remaining code if one condition is false
  • Only execute the remaining code if no errors occur
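
A minimal sketch of how try, except and finally fit together. The helper safe_divide and its log list are hypothetical names used only for this illustration.

```python
def safe_divide(a, b):
    """Hypothetical helper: return a / b, or None when b is zero."""
    log = []
    try:
        result = a / b
    except ZeroDivisionError:
        # Runs only when the division raises an error.
        log.append("error")
        result = None
    else:
        # Runs only when no error occurred.
        log.append("ok")
    finally:
        # Runs in both cases, whether or not an error occurred.
        log.append("finally")
    return result, log


print(safe_divide(1, 2))  # (0.5, ['ok', 'finally'])
print(safe_divide(1, 0))  # (None, ['error', 'finally'])
```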

53. What code segment would output the following?

2

  • for i in range(1,5): if (i!=2): print(i)
  • for i in range(1,5): if (i!=1): print(i)
  • for i in range(1,5): if (i==2): print(i) (CORRECT)

54. What is the width of the rectangle in the class Rectangle?

class Rectangle(object):
    def __init__(self,width=2,height=3,color='r'):
        self.height=height
        self.width=width
        self.color=color

    def drawRectangle(self):
        import matplotlib.pyplot as plt
        plt.gca().add_patch(plt.Rectangle((0, 0),self.width, self.height ,fc=self.color))
        plt.axis('scaled')
        plt.show()

  • 3
  • 2 (CORRECT)
  • 0

55. What line of code would produce the following: array([11, 11, 11, 11, 11])?

  • a=np.array([1,1,1,1,1]) a+10 (CORRECT)
  • a=np.array([1,2,1,1,1]) a+10
  • a=np.array([1,1,1,1,1]) 11-a

56. What is the method readline() used for?

  • It reads 10 lines of a file at a time
  • It reads the entire file all at once
  • It helps to read one complete line from a given text file (CORRECT)

57. Which line of code is in the mode of append?

  • with open("Example.txt","w") as file1:
  • with open("Example.txt","r") as file1:
  • with open("Example.txt","a") as file1: (CORRECT)
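
The file modes and readline questions can be tried out with a sketch like the one below. It writes to a temporary directory so nothing existing is touched; the file contents are made up for illustration.

```python
import os
import tempfile

# Temporary location for the example file (an assumption of this sketch).
path = os.path.join(tempfile.mkdtemp(), "Example.txt")

with open(path, "w") as file1:   # "w" creates the file or overwrites it
    file1.write("first line\n")

with open(path, "a") as file1:   # "a" adds text after the existing text
    file1.write("second line\n")

with open(path, "r") as file1:   # "r" opens the file for reading
    first_four = file1.readline(4)   # at most 4 characters of the current line
    rest_of_line = file1.readline()  # the remainder of that same line
    next_line = file1.readline()     # the following line

with open(path, "w") as file1:   # opening with "w" again discards old text
    file1.write("replaced\n")

with open(path, "r") as file1:
    after_overwrite = file1.read()

print(repr(first_four))       # 'firs'
print(repr(next_line))        # 'second line\n'
print(repr(after_overwrite))  # 'replaced\n'
```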

58. What are the 3 parts to a URL?

  • Put, route, and get
  • Block, post, and route
  • Scheme, internet address, and route (CORRECT)
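
One way to see the three parts of a URL is to split one with Python's standard urllib.parse module, as sketched below. The URL is a made-up example, and mapping netloc to "internet address" and path to "route" is this sketch's reading of the quiz terms.

```python
from urllib.parse import urlsplit

# Hypothetical URL used only for illustration.
url = "https://www.example.com/images/logo.png"

parts = urlsplit(url)

print(parts.scheme)  # https
print(parts.netloc)  # www.example.com
print(parts.path)    # /images/logo.png
```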
  • Get, post, and scheme

59. In Python, if you executed name = ‘Lizz’, what would be the output of print(name[0:2])?

  • Lizz
  • L
  • Li (CORRECT)

60. Consider the string Name=”EMILY”, what statement would return the index of 3?

  • Name.find(“M”)
  • Name.find(“Y”)
  • Name.find(“L”) (CORRECT)
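
A short sketch of how find reports positions, counting from 0. The search for "Z" is an extra case added here to show what happens when nothing matches.

```python
Name = "EMILY"

print(Name.find("E"))     # 0
print(Name.find("L"))     # 3
print(Name.find("Z"))     # -1, meaning "not found"
print("ABCDE".find("B"))  # 1
```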

61. What following code segment would produce an output of “0”?

  • 1/2
  • 1//2 (CORRECT)

62. Lists are:

  • Mutable (CORRECT)
  • Unordered
  • Not indexed
  • Not mutable

63. What is a collection that is unordered, unindexed and does not allow duplicate members?

  • Set (CORRECT)
  • List
  • Tuple

64. What is an error that occurs during the execution of code?

  • Finally
  • Exception (CORRECT)
  • Error messages
  • Exception handling

65. What segment of code would output the following?

11

22

33

  • A=['1','2','3'] for a in A: print(2*a) (CORRECT)
  • A=[1,2,3] for a in A: print(2*a)
  • A=['1','2','3'] for a in A: print(3*a)
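
The difference between multiplying strings and multiplying numbers can be sketched as follows. The list names repeated and tripled are illustrative.

```python
A = ["1", "2", "3"]

# Multiplying a string by an integer repeats the string.
repeated = [2 * a for a in A]

# Multiplying an integer by an integer is ordinary arithmetic.
tripled = [3 * n for n in [1, 2, 3]]

print(repeated)  # ['11', '22', '33']
print(tripled)   # [3, 6, 9]
```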

66. What code segment would output the following?

1

3

4

  • for i in range(1,5): if (i==2): print(i)
  • for i in range(1,5): if (i!=1): print(i)
  • for i in range(1,5): if (i!=2): print(i) (CORRECT)

67. What is the result of the following lines of code?

a=np.array([0,1,0,1,0])
b=np.array([1,0,1,0,1])
a+b

  • array([0, 0, 0, 0, 0])
  • array([1, 1, 1, 1, 1]) (CORRECT)
  • 0

68. What mode overwrites existing text in a file?

  • Write “w” (CORRECT)
  • Append “a”
  • Read “r”

69. What is the correct way to sort list ‘B’ using a method? The result should not return a new list, just change the list ‘B’.

  • sort(B)
  • B.sorted()
  • sorted(B)
  • B.sort() (CORRECT)
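
A brief sketch contrasting the sorted() function with the sort() method. The list contents and the names C and returned are made up for illustration.

```python
B = [3, 1, 2]

# sorted() returns a new list and leaves B as it was.
C = sorted(B)
print(C)  # [1, 2, 3]
print(B)  # [3, 1, 2]

# B.sort() changes B in place and returns None.
returned = B.sort()
print(returned)  # None
print(B)         # [1, 2, 3]
```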

70. In Python what data type is used to represent text and not numbers?

  • str (CORRECT)
  • float
  • int

71. In Python 3 what following code segment will produce an int?

  • 2//3 (CORRECT)
  • 1/2

72. What will be the output if x="7"?

if(x!=1):
    print('Hi')
else:
    print('Hello')
print('Mike')

  • Hello Mike
  • Mike
  • Hi Mike (CORRECT)

73. What code segment would output the following?

2

3

4

  • for i in range(1,5): if (i!=2): print(i)
  • for i in range(1,5): if (i==2): print(i)
  • for i in range(1,5): if (i!=1): print(i) (CORRECT)

74. What is a two-dimensional data structure?

  • Pandas Dataframe (CORRECT)
  • Numpy
  • Pandas Series

75. What is scheme, internet address and route a part of?

  • Error message
  • URL (CORRECT)
  • Text file

CONCLUSION – APIs and Data Collection

This module equips you with practical skills in modern data collection. You will learn to use APIs and web scraping to extract data from the internet.

You will also be able to read and extract data from different file formats, which makes it easier to handle heterogeneous data sources. By the end of the module, you should be able to apply a range of data acquisition methods in your own projects.