This module covers data collection techniques in depth, including database operations, data manipulation using APIs, and scraping data from web pages. It also includes detailed instructions on retrieving data effectively from these sources.
The module also covers reading and collecting data from multiple file formats, giving you the tools to handle different kinds of data. Together, modern web-based collection techniques and conventional file handling form a thorough foundation for understanding data acquisition.
Learning Objectives:
Use the HTTP protocol through the Requests library.
Explain how the URL Request Response HTTP protocol works.
Interact with a basic open-source API.
Perform basic web scraping in Python.
Manipulate different file formats using Python.
Explain the difference between APIs and REST APIs.
Summarize the process through which APIs send and receive information.
PRACTICE QUIZ 1
1. What does API stand for?
Automatic Program Interaction
Application Programming Interaction
Application Process Interface
Application Programming Interface (CORRECT)
2. Which data format is commonly found in the HTTP message for API requests?
JSON (CORRECT)
HTML
XML
YAML
3. What is the primary purpose of an API?
To provide security to web applications.
To design user interfaces for mobile applications.
To connect and enable communication between software applications. (CORRECT)
To handle server-side database operations.
PRACTICE QUIZ 2
1. What is the function of “GET” in HTTP requests?
Sends data to create or update a resource
Carries the request to the client from the requestor (CORRECT)
Returns the response from the client to the requestor
Deletes a specific resource
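To make the GET question concrete, here is a minimal sketch of how a GET request differs from a POST request. It uses the standard library's urllib instead of the Requests library so that nothing is sent over the network; the URL and parameters are hypothetical placeholders, not a real API.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and query parameters, for illustration only
base_url = "https://api.example.com/v1/coins"
params = {"vs_currency": "usd", "days": 30}

# A GET request asks for a resource; its parameters travel in the URL
get_request = Request(f"{base_url}?{urlencode(params)}", method="GET")

# A POST request carries data in its body to create or update a resource
post_request = Request(base_url, data=b'{"name": "demo"}', method="POST")

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```

The requests are only constructed here, never sent, so the sketch runs without an internet connection.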
2. What does URL stand for?
Uniform Request Location
Uniform Resource Locator (CORRECT)
Unilateral Resistance Locator
Uniform Resource Learning
3. What does the file extension “csv” stand for?
Comma Separated Values (CORRECT)
Comma Separation Valuations
Common Separated Variables
Comma Serrated Values
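As a quick illustration of the CSV format, the sketch below reads comma separated values with Python's built-in csv module. The sample data is made up, and an in-memory string stands in for a real file.

```python
import csv
import io

# Made-up CSV text; a real file would be opened with open(path, newline="")
csv_text = "name,score\nAda,90\nLinus,85\n"

# DictReader uses the first line as column headers
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

for row in rows:
    print(row["name"], row["score"])
```

Note that every value comes back as a string, so numeric columns need an explicit cast such as int(row["score"]).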
4. What is web scraping?
The process to describe communication options
The process to display all data within a URL
The process to request and retrieve information from a client
The process to extract data from a particular website (CORRECT)
MODULE 5 GRADED QUIZ
1. What are the 3 parts to a response message?
HTTP headers, blank line, and body
Encoding, body, and cache
Start or status line, header, and body (CORRECT)
Bookmarks, history, and security
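On the wire, an HTTP response is laid out as a status line, then header lines, then a blank line, then the body. The sketch below splits a hand-written response string into those pieces; the response text is a made-up example, not output from a real server.

```python
# Hand-written example response; lines are separated by CRLF as in HTTP/1.1
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/json\r\n"
    "Content-Length: 16\r\n"
    "\r\n"
    '{"status": "ok"}'
)

# The blank line separates the head (status line + headers) from the body
head, body = raw_response.split("\r\n\r\n", 1)

# The first line of the head is the status line; the rest are headers
status_line, *header_lines = head.split("\r\n")
headers = dict(line.split(": ", 1) for line in header_lines)

print(status_line)
print(headers)
print(body)
```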
2. What is the purpose of the line of code “table_row=table.find_all(name='tr')” in web scraping?
It will find all of the data within the table marked with a tag “p”
It will find all of the data within the table marked with a tag “a”
It will find all of the data within the table marked with a tag “h1”
It will find all of the data within the table marked with a tag “tr” (CORRECT)
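The quiz line uses Beautiful Soup's find_all to collect every "tr" (table row) tag. To stay free of third-party dependencies, the sketch below swaps in the standard library's html.parser to do a comparable job; the RowCollector class and the HTML snippet are hypothetical and only illustrate the idea of gathering rows.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collects the cell text of every <tr> row (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])      # start a new row
        elif tag in ("td", "th"):
            self._in_cell = True      # text that follows belongs to a cell

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self.rows:
            self.rows[-1].append(data.strip())


# Made-up table for illustration
html = (
    "<table>"
    "<tr><th>Coin</th><th>Price</th></tr>"
    "<tr><td>BTC</td><td>100</td></tr>"
    "</table>"
)

collector = RowCollector()
collector.feed(html)
table_rows = collector.rows
print(table_rows)
```

With Beautiful Soup installed, table.find_all(name='tr') returns the matching row tags directly, which is usually the more convenient route.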
3. In what data format are HTTP responses generally returned?
JSON (CORRECT)
Lists
Nested Lists
Tuples
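When an API returns JSON, the body arrives as text and has to be parsed into Python objects. A minimal sketch with the built-in json module follows; the payload is invented for illustration.

```python
import json

# Invented JSON body, similar in shape to what a price API might return
body_text = '{"id": "bitcoin", "prices": [[1, 100.0], [2, 101.5]]}'

# json.loads turns JSON objects into dicts and JSON arrays into lists
data = json.loads(body_text)

print(type(data).__name__)
print(data["id"], data["prices"][1][1])
```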
4. The Python library we used to plot the chart in video/lab is
MatPlotLib (CORRECT)
Plotly
PyCoinGecko
Pandas
FINAL EXAM
1. When slicing in Python what does the “0” in this statement [0:2] specify?
It specifies the step of the slicing
It specifies the position to start the slice (CORRECT)
It specifies the position to end the slice
2. If var = “01234567” what Python statement would print out only the odd elements?
print(var[2::2])
print(var[3::1])
print(var[1::2]) (CORRECT)
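The slicing questions can be checked directly. A slice is written [start:stop:step]; the sketch below applies it to the same string used in the question.

```python
var = "01234567"

# Start at index 1 and take every second character: the odd digits
odd = var[1::2]

# Start at index 0 (the default) and take every second character: the even digits
even = var[::2]

# Start at index 0 and stop before index 2
first_two = var[0:2]

print(odd, even, first_two)
```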
3. Consider the string Name="EMILY", what statement would return the index of 0?
Name.find("I")
Name.find("L")
Name.find("E") (CORRECT)
4. What is the type of the following: 1.0
int
str
float (CORRECT)
5. What will happen if you cast a float to an integer?
An error will occur
Nothing happens
It will remove the decimal part (CORRECT)
6. When using the double slash “//” for integer division the result will be?
Rounded (CORRECT)
Not rounded
7. In Python 3, which of the following code segments will produce a float?
2//3
1/2 (CORRECT)
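The division and casting questions are easy to verify. One detail worth noting: "//" rounds down (floor division), while casting with int() drops the decimal part and so rounds toward zero. The two differ for negative numbers.

```python
floor_result = 2 // 3        # integer (floor) division gives an int
true_result = 1 / 2          # true division always gives a float in Python 3

negative_floor = -7 // 2     # floor division rounds down, away from zero here
truncated = int(3.99)        # casting drops the decimal part
negative_truncated = int(-3.99)  # casting rounds toward zero

print(floor_result, type(floor_result).__name__)
print(true_result, type(true_result).__name__)
print(negative_floor, truncated, negative_truncated)
```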
8. How many identical keys can a dictionary have?
0 (CORRECT)
3
100000000
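Dictionary keys are unique. As a small sketch with made-up data, repeating a key in a literal does not create a second entry; the later value simply replaces the earlier one.

```python
# "btc" appears twice; only the last value is kept
prices = {"btc": 100, "eth": 50, "btc": 120}

print(len(prices))
print(prices["btc"])
```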
9. What will this code segment “A[0]” obtain from a list or tuple?
The third element of a list or tuple
The first element of a list or tuple (CORRECT)
The second element of a list or tuple
10. What is the result of the following operation: '1:2,3:4'.split(':')?
['1', '2', '3', '4']
['1,2,3,4']
['1,2', '3,4']
['1', '2,3', '4'] (CORRECT)
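The split behaviour can be confirmed in a couple of lines: the string is cut only at the separator that is passed in, and any other punctuation stays inside the pieces.

```python
# Splitting on ":" leaves the comma inside the middle piece
by_colon = "1:2,3:4".split(":")

# Splitting on "," cuts at every comma
by_comma = "1,2,3,4".split(",")

print(by_colon)
print(by_comma)
```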
11. What is an important difference between lists and tuples?
Lists are mutable tuples are not (CORRECT)
Lists and tuples are the same
Tuples can only have integers
Lists can’t contain a string
12. What code segment is used to cast list “B” to the set “b”?
b.set()
b=set(B) (CORRECT)
b=B.dict()
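The two questions above, on mutability and on casting a list to a set, can be tried out together. The values below are arbitrary examples.

```python
# Lists are mutable: an element can be reassigned in place
items_list = [1, 2, 3]
items_list[0] = 10

# Tuples are immutable: the same assignment raises TypeError
items_tuple = (1, 2, 3)
try:
    items_tuple[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Casting a list to a set removes duplicate elements
B = [1, 2, 2, 3, 3]
b = set(B)

print(items_list, tuple_changed, b)
```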
13. If x=1 what will produce the below output?
Hi
Mike
if(x!=1):
    print('Hi')
else:
    print('Hello')
print('Mike')
if(x!=1):
    print('Hello')
else:
    print('Hi')
print('Mike') (CORRECT)
if(x==1):
    print('Hello')
else:
    print('Hi')
print('Mike')
14. What is the process of forcing your program to output an error message when it encounters an issue?
Output errors
Force Out
Exception handling (CORRECT)
Error messages
15. What add function would return ‘2’ ?
def add(x): return(x+x+x) add('1')
def add(x): return(x+x) add('1')
def add(x): return(x+x) add(1) (CORRECT)
16. What method organizes the elements in a given list in a specific descending or ascending order?
sort() (CORRECT)
split()
replace()
join()
17. What segment of code would output the following?
3
6
9
A=[1,2,3] for a in A: print(2*a)
A=['1','2','3'] for a in A: print(2*a)
A=[1,2,3] for a in A: print(3*a) (CORRECT)
18. What code segment would output the following?
1
3
4
for i in range(1,5): if (i!=1): print(i)
for i in range(1,5): if (i==2): print(i)
for i in range(1,5): if (i!=2): print(i) (CORRECT)
19. What is the method defined in the class Rectangle used to draw the rectangle?
41. What is the result of the following lines of code?
a=np.array([10,9,8,7,6]) a+1
array([11,10,9,8,7]) (CORRECT)
array([101,91,81,71,61])
array([9, 8, 7, 6, 5])
42. How would you select the columns with the headers: Artist, Length and Genre from the dataframe df and assign them to the variable y ?
y=df[['Artist','Length','Genre']] (CORRECT)
y=df[['Artist'],['Length'],['Genre']]
y=df['Artist','Length','Genre']
43. In Python what statement would print out the first two elements “Li” of “Lizz”?
print(name[0:2]) (CORRECT)
print(name[1:2])
print(name[2:0])
44. If var = “01234567” what Python statement would print out only the even elements?
print(var[::2]) (CORRECT)
print(var[::1])
print(var[::3])
45. Consider the string Name=”ABCDE”, what is the result of the following operation Name.find(“B”) ?
0
1 (CORRECT)
2
46. In Python what can be either a positive or negative number but does not contain a decimal point?
float(3.99)
int(3.99) (CORRECT)
str(3.99)
47. What following code segment would return a 3?
48. What does the index of “1” correspond to in a list or tuple?
The first element
The second element (CORRECT)
the third
49. What is the result of the following operation: ‘1,2,3,4’.split(‘,’) ?
'1','2','3','4'
('1','2','3','4')
'1234'
['1','2','3','4'] (CORRECT)
50. How do you cast the list A to the set a?
a=set(A) (CORRECT)
a=A.dict()
a.set()
51. What value of x will produce the following output?
How are you?
x=
if(x!=1):
    print('How are you?')
else:
    print('Hi')
x=1
x=6 (CORRECT)
x="7" (CORRECT)
52. Why is the “finally” statement used?
Only execute the remaining code if an error occurs
Execute the remaining code no matter the end result (CORRECT)
Only execute the remaining code if one condition is false
Only execute the remaining code if no errors occur
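A short sketch of exception handling shows where "finally" fits. The safe_divide function is a hypothetical helper written for this illustration; it records which blocks ran so the order is visible.

```python
def safe_divide(a, b):
    """Divide a by b and record which blocks of the try statement ran."""
    log = []
    try:
        result = a / b
    except ZeroDivisionError:
        # Runs only when the division fails
        result = None
        log.append("error")
    else:
        # Runs only when no exception was raised
        log.append("ok")
    finally:
        # Runs in both cases
        log.append("done")
    return result, log


print(safe_divide(4, 2))
print(safe_divide(1, 0))
```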
53. What code segment would output the following?
2
for i in range(1,5): if (i!=2): print(i)
for i in range(1,5): if (i!=1): print(i)
for i in range(1,5): if (i==2): print(i) (CORRECT)
54. What is the width of the rectangle in the class Rectangle?
68. What mode overwrites existing text in a file?
Write “w” (CORRECT)
Append “a”
Read “r”
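The three file modes can be compared side by side. The sketch below writes to a throwaway file in a temporary directory, so the file name and contents are placeholders only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")

    # "w" creates the file, or overwrites it if it already exists
    with open(path, "w") as f:
        f.write("first\n")
    with open(path, "w") as f:
        f.write("second\n")   # "first" is gone after this write

    # "a" appends to the end without touching existing text
    with open(path, "a") as f:
        f.write("third\n")

    # "r" opens the file for reading only
    with open(path, "r") as f:
        contents = f.read()

print(contents)
```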
69. What is the correct way to sort list ‘B’ using a method? The result should not return a new list, just change the list ‘B’.
sort(B)
B.sorted()
sorted(B)
B.sort() (CORRECT)
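The difference between the sort() method and the sorted() function is easy to see in a short sketch: one changes the list in place and returns None, the other leaves the list alone and returns a new one.

```python
B = [3, 1, 2]

# sorted() returns a new list and leaves B unchanged
C = sorted(B)
unchanged = list(B)

# sort() changes B in place and returns None
returned = B.sort()
ascending = list(B)

# reverse=True gives descending order
B.sort(reverse=True)

print(C, unchanged, returned, ascending, B)
```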
70. In Python what data type is used to represent text and not numbers?
str (CORRECT)
float
int
71. In Python 3, which of the following code segments will produce an int?
2//3 (CORRECT)
1/2
72. What will be the output if x="7"?
if(x!=1):
    print('Hi')
else:
    print('Hello')
print('Mike')
Hello
Mike
Mike
Hi
Mike (CORRECT)
73. What code segment would output the following?
2
3
4
for i in range(1,5): if (i!=2): print(i)
for i in range(1,5): if (i==2): print(i)
for i in range(1,5): if (i!=1): print(i) (CORRECT)
74. What is a two-dimensional data structure?
Pandas Dataframe (CORRECT)
Numpy
Pandas Series
75. What are the scheme, internet address, and route parts of?
Error message
URL (CORRECT)
Text file
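The three parts of a URL named in the question can be pulled apart with the standard library. In the sketch below the URL is a made-up example; urlsplit calls the internet address "netloc" and the route "path".

```python
from urllib.parse import urlsplit

# Made-up URL for illustration
url = "https://www.example.com/images/logo.png"
parts = urlsplit(url)

scheme = parts.scheme            # the protocol
internet_address = parts.netloc  # where the server is found
route = parts.path               # the location of the resource on that server

print(scheme, internet_address, route)
```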
CONCLUSION – APIs and Data Collection
This module equips you with skills in modern data collection techniques. You will learn to use APIs and web scraping to extract data from the internet.
You will also learn to read and extract data from different file formats, which helps when handling heterogeneous data sources. By the end of the module, you will be able to apply a range of data acquisition methods in your projects.