Course 5 – Analyze Data to Answer Questions


Week 1: Organizing Data to Begin Analysis Quiz Answers

Organizing Data to Begin Analysis INTRODUCTION

Organizing data is a foundational step in any data analysis. Sorting and filtering arrange data so that patterns, trends, and important insights are easier to identify. This week of the Google Data Analytics Professional Certificate on Coursera covers these data organization methods.

Participants in this certificate course gain hands-on experience using spreadsheets and SQL to sort and filter datasets. These techniques support more focused data exploration and a more precise understanding of the relationships within the data. Well-organized data paves the way for a successful and insightful analysis.

Learning Objectives

  • Understanding the Data Analysis Process: Identify the major tasks and objectives associated with the analysis of data.
  • Data Organization Significance: Discuss the importance of data organization via sorting and filtering as the first step before even considering an analysis.
  • Improving Sorting Methods: Discuss how sorting operations in spreadsheets and databases could facilitate the arrangement of data, as well as highlight the benefits.
  • Filtering and Sorting using SQL: Demonstrate competency in the steps necessary to filter or sort data using SQL queries, showcasing practical applications and outcomes.

Test Your Knowledge on Understanding Data Analysis

1. You ask volunteers at a theater production which tasks they have already completed and add that data to a spreadsheet containing all required tasks. You will use the information provided by the volunteers to figure out which tasks still need to be done. This is an example of which phase of analysis?

  • Transform data
  • Get input from others (Correct)
  • Format and adjust data
  • Organize data (into a dataset)

Correct: Getting input from others means gathering information, feedback, or insights from other sources so that you can make better-informed decisions. Here, the volunteers' answers tell you which tasks still need to be done.

2. You are working with three datasets about voter turnout in your county. First, you identify relationships and patterns between the datasets. Then, you use formulas and functions to make calculations based on your data. This is an example of which phase of analysis?

  • Get input from others
  • Format and adjust data
  • Organize data (into a dataset)
  • Transform data (Correct)

Correct: Transforming data means examining datasets to identify relationships and patterns and then making calculations based on the data. This step turns raw data into information that can be understood.

3. You are working with a dataset from a local community college. You sort the students alphabetically by last name. This is an example of which phase of analysis?

  • Format and adjust data (Correct)
  • Transform data
  • Organize data (into a dataset)
  • Get input from others

Correct: Sorting students alphabetically by last name is an example of formatting and adjusting data. Analysts sort and filter in this phase to make the data easier to work with.

Test Your Knowledge on Organizing Data

1. Fill in the blank: A data analyst uses _____ to decide which data is relevant to their analysis and which data types and variables are appropriate.

  • database normalization
  • database references
  • database organization (Correct)
  • database relationships

Correct: Database organization enables analysts to decide which data is relevant to their analysis and which data types and variables are appropriate.

2. A data analyst wants to organize a database to show only the 100 most recent real estate sales in Stamford, Connecticut. How can they do that?

  • The data analyst should add a filter to return only sales in Stamford, Connecticut, then sort the most recent sales at the top of their list. (Correct)
  • The data analyst should filter out sales in Stamford, Connecticut, then sort the most recent sales at the top of their list.
  • The data analyst should filter out sales in Stamford, Connecticut, then sort the least recent sales at the top of their list.
  • The data analyst should add a filter to return only sales in Stamford, Connecticut, then sort the least recent sales at the top of their list.

Correct: The data analyst should add a filter to return only sales in Stamford, Connecticut, and then sort so that the most recent sales appear at the top of the list.

3. You are working with a database table that contains customer data. The country column designates the country where each customer is located. You want to find out which customers are located in Brazil.

You write the SQL query below. Add a WHERE clause that will return only customers located in Brazil.


How many customers are located in Brazil?

  • 5 (Correct)
  • 3
  • 9
  • 7

Correct: The query SELECT * FROM customer WHERE country = 'Brazil' returns the customers whose country is Brazil. The WHERE clause filters records based on a condition: it names the column, then a comparison operator (such as =), then the value to match.

Only rows that satisfy the condition country = 'Brazil' are returned. In this dataset, 5 customers are located in Brazil.
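The WHERE filter described above can be tried locally. The sketch below uses Python's built-in sqlite3 module with a few made-up rows; the column names other than country and all of the data are assumptions for illustration, not the course dataset.

```python
import sqlite3

# Hypothetical customer rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER, first_name TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        (1, "Luis", "Brazil"),
        (2, "Leonie", "Germany"),
        (3, "Eduardo", "Brazil"),
        (4, "Bjorn", "Norway"),
    ],
)

# WHERE keeps only the rows whose country column equals 'Brazil'.
brazil_customers = conn.execute(
    "SELECT * FROM customer WHERE country = 'Brazil'"
).fetchall()

# Number of matching rows in this sample table.
print(len(brazil_customers))
```

Counting the returned rows is how the quiz question is answered: the filter does the selection, and the row count gives the number of customers.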

Test Your Knowledge on Sorting in Spreadsheets

1. Which spreadsheet menu function is used to sort all data in a spreadsheet by the ranking of a specific sorted column?

  • Sort Range
  • Sort By Rank
  • Sort Data
  • Sort Sheet (Correct)

Correct: Sort Sheet sorts all of the data in a spreadsheet by the ranking of a specific column. When that column is reordered, every row moves as a unit, so the values in the other columns stay with their original rows. This preserves the integrity of each row and makes values across the sheet easier to compare and analyze.
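To see why keeping rows together matters, here is a small Python sketch. It is only an analogy for the spreadsheet behavior, and the names and scores are made up.

```python
# Hypothetical sheet: each inner list is one row (name, score).
sheet = [
    ["Dana", 72],
    ["Ali", 95],
    ["Kim", 88],
]

# Sort Sheet style: reorder whole rows by the score column (index 1).
sorted_sheet = sorted(sheet, key=lambda row: row[1])

# Sorting one column in isolation breaks the name/score pairing.
names = [row[0] for row in sheet]
scores_only = sorted(row[1] for row in sheet)
misaligned = [[name, score] for name, score in zip(names, scores_only)]

print(sorted_sheet)  # rows intact, ordered by score
print(misaligned)    # names now sit next to the wrong scores
```

In the first result every name still has its own score. In the second, the scores were reordered without the names, which is the kind of error that sorting whole rows avoids.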

2. In spreadsheets, data analysts can sort a range from the Data tab in the menu or by typing a function directly into an empty cell.

  • True (Correct)
  • False

Correct: Data analysts can sort a range either from the Data tab in the menu or by typing a function directly into an empty cell.

3. An analyst uses =SORT to sort spreadsheet data in descending order. What do they type at the end of their sort function?

  • Z-A
  • DESCEND
  • TRUE
  • FALSE (Correct)

Correct: To sort in descending order with the SORT function, the analyst types FALSE at the end of the function.

4. The last query you ran returned the top 10 counties with the highest birth counts for 2018 only. Remove the LIMIT statement and run the query again. What is the county with the 11th highest birth count?

  • Orange County, CA (Correct)
  • Dallas County, TX
  • Unidentified Counties, KY
  • Miami-Dade County, FL

Correct: Orange County, CA had the 11th highest birth count in 2018. The query filters the data to 2018 and orders it by birth count, so removing the LIMIT statement returns every matching row and makes the 11th row visible.
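The effect of removing LIMIT can be sketched with Python's sqlite3 module. The table name, column names, and county values below are hypothetical stand-ins, not the public dataset used in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE births (county TEXT, year INTEGER, birth_count INTEGER)")

# Hypothetical rows: 12 made-up counties for 2018 plus one 2017 row.
rows = [(f"County {i}", 2018, 1000 * i) for i in range(1, 13)]
rows.append(("County 99", 2017, 999999))
conn.executemany("INSERT INTO births VALUES (?, ?, ?)", rows)

base_query = (
    "SELECT county, birth_count FROM births "
    "WHERE year = 2018 ORDER BY birth_count DESC"
)

# With LIMIT 10, only the ten highest counts come back.
top_10 = conn.execute(base_query + " LIMIT 10").fetchall()

# Without LIMIT, every 2018 row comes back, still sorted.
all_ranked = conn.execute(base_query).fetchall()
eleventh = all_ranked[10]  # index 10 is the 11th row

print(len(top_10), len(all_ranked), eleventh)
```

The ordering and the filter are unchanged between the two queries. LIMIT only controls how many of the sorted rows are returned.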

5. What was the average temperature at JFK and La Guardia stations between June 1, 2020 and June 30, 2020?

  • 87.671
  • 92.099
  • 72.883 (Correct)
  • 74.909

Correct: The average temperature at the two stations for June 2020 was 72.883. Running a query against a specific subset of a table, here two stations and one month, returns the result much more directly than scanning the full table. This skill becomes more important as datasets grow more complex.
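A query of this kind combines a filter on station, a filter on a date range, and an average. The following sqlite3 sketch shows one way to write it; the weather table, its column names, and all temperature values are assumptions made for illustration and do not reflect the schema of the dataset used in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (station TEXT, obs_date TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO weather VALUES (?, ?, ?)",
    [
        ("JFK", "2020-05-31", 60.0),        # before the date range
        ("JFK", "2020-06-01", 70.0),
        ("LaGuardia", "2020-06-15", 74.0),
        ("LaGuardia", "2020-06-30", 78.0),
        ("Newark", "2020-06-10", 90.0),     # a station we do not want
        ("JFK", "2020-07-01", 85.0),        # after the date range
    ],
)

# IN restricts the stations; BETWEEN restricts the dates (inclusive).
# ISO-formatted date strings compare correctly as text.
avg_temp = conn.execute(
    """
    SELECT AVG(temp)
    FROM weather
    WHERE station IN ('JFK', 'LaGuardia')
      AND obs_date BETWEEN '2020-06-01' AND '2020-06-30'
    """
).fetchone()[0]

print(avg_temp)
```

Only the three rows that pass both conditions contribute to the average, which is why filtering before aggregating matters.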

Test Your Knowledge on Sorting in SQL

1. A data analyst wants to sort a list of greenhouse shrubs by price from least expensive to most expensive. Which statement should they use?

  • ORDER BY shrub_price (Correct)
  • WHERE shrub_price ASC
  • WHERE shrub_price
  • ORDER BY shrub_price DESC

Correct: ORDER BY shrub_price sorts the shrubs by price from least expensive to most expensive, because ORDER BY sorts in ascending order by default.

2. You are working with a database table that contains data about music genres. You want to sort the genres by name in ascending order. The genres are listed in the genre_name column.

You write the SQL query below. Add an ORDER BY clause that will sort the genres by name in ascending order.


What genre appears in row 3 of your query result?

  • Easy Listening
  • Classical
  • Alternative
  • Blues (Correct)

Correct: The completed query returns all columns from the genre table, sorted in ascending order by the values in the genre_name column. ORDER BY sorts in ascending order by default, so ASC does not need to be included. Blues appears in row 3 of the result, which means two genres precede it alphabetically.
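The default sort direction can be confirmed with a short sqlite3 sketch. The four genre rows are made up for illustration, and the table layout is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genre (genre_id INTEGER, genre_name TEXT)")
conn.executemany(
    "INSERT INTO genre VALUES (?, ?)",
    [(1, "Rock"), (2, "Jazz"), (3, "Blues"), (4, "Alternative")],
)

# No direction keyword: ORDER BY falls back to ascending order.
default_order = [
    row[0]
    for row in conn.execute("SELECT genre_name FROM genre ORDER BY genre_name")
]

# Explicit ASC gives the same result.
explicit_asc = [
    row[0]
    for row in conn.execute("SELECT genre_name FROM genre ORDER BY genre_name ASC")
]

print(default_order)
print(default_order == explicit_asc)
```

Writing ASC explicitly is optional, though some analysts include it to make the intent obvious to readers of the query.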

3. You are working with a database table that contains employee data. You want to sort the employees by hire date in descending order. The hire dates are listed in the hire_date column.

You write the SQL query below. Add an ORDER BY clause that will sort the employees by hire date in descending order.


What employee appears in row 1 of your query result?

  • Nancy Edwards
  • Margaret Park
  • Laura Callahan (Correct)
  • Robert King

Correct: The completed query returns all columns from the employee table, sorted in descending order by the hire_date column. Laura Callahan appears in row 1, which means she is the most recently hired employee in the table. The same query could be narrowed further, for example by limiting the result to a few of the most recent hires or by selecting specific columns.
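As a rough sketch of sorting by date in descending order, the sqlite3 example below selects two specific columns and keeps only the two most recent hires. The employee names and dates are placeholders, not the course data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, title TEXT, hire_date TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Employee A", "Manager", "2002-08-14"),
        ("Employee B", "Support", "2004-03-04"),
        ("Employee C", "Support", "2003-10-17"),
    ],
)

# DESC puts the latest hire_date first; LIMIT keeps the top two rows.
most_recent_two = conn.execute(
    "SELECT name, hire_date FROM employee ORDER BY hire_date DESC LIMIT 2"
).fetchall()

print(most_recent_two)
```

The dates are stored as ISO-formatted text (year, month, day), so sorting them as strings matches chronological order.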

Analyze Data to Answer Questions Weekly Challenge 1

1. In the data analysis process, which of the following refers to a phase of analysis? Select all that apply.

  • Organize data into understandable sections (Correct)
  • Visualize the data
  • Format data using sorts and filters (Correct)
  • Get input from others (Correct)

Correct: The four phases of analysis are: organize data, format and adjust data, get input from others, and transform data by identifying relationships and making calculations.

2. During which of the four phases of analysis do you compare your data to external sources?

  • Transform data
  • Format and adjust data
  • Get input from others (Correct)
  • Organize data

Correct: You compare your data to external sources while getting input from others.

3. You are performing a calculation during your analysis of a dataset. Which phase of analysis are you in?

  • Organize data
  • Format and adjust data
  • Transform data (Correct)
  • Get input from others

Correct: Performing calculations on a dataset is part of the transform data phase.

4. Fill in the blank: Filtering involves showing only the data that meets a specific _____ while hiding the rest.

  • model
  • measure
  • criteria (Correct)
  • observation

Correct: Filtering shows only the data that meets specific criteria and hides the rest.

5. A data analyst is sorting spreadsheet data. They want to make sure that, when they rearrange the data, data across rows is kept together. What technique should they use to sort the data?

  • Sort Together
  • Sort Rows
  • Sort Column
  • Sort Sheet (Correct)

Correct: Sort Sheet sorts all of the data in a spreadsheet by a specific column and keeps the data in each row together.

6. A data analyst uses a function to sort a spreadsheet range between cells H1 and K65. They sort in ascending order by the first column, Column H. What is the syntax they are using?

  • =SORT(H1:K65, 1, FALSE)
  • =SORT(H1:K65, A, FALSE)
  • =SORT(H1:K65, 1, TRUE) (Correct)
  • =SORT(H1:K65, A, TRUE)

Correct: In =SORT(H1:K65, 1, TRUE), the first argument is the range to sort, the number 1 refers to the first column of that range (Column H), and TRUE sorts the data in ascending order.
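The three arguments of the SORT function can be mirrored in a few lines of Python. This is only an analogy for how the arguments behave; sort_range is a hypothetical helper written for this sketch, not a spreadsheet or library function.

```python
def sort_range(rows, sort_column, is_ascending):
    """Mimic =SORT(range, sort_column, is_ascending) for a list of rows.

    sort_column is 1-based, like the spreadsheet function.
    """
    return sorted(
        rows,
        key=lambda row: row[sort_column - 1],
        reverse=not is_ascending,
    )


# Made-up two-column range.
data = [[3, "c"], [1, "a"], [2, "b"]]

ascending = sort_range(data, 1, True)    # like =SORT(range, 1, TRUE)
descending = sort_range(data, 1, False)  # like =SORT(range, 1, FALSE)

print(ascending)
print(descending)
```

The column is given as a number counted from the left edge of the range, which is why options that pass a letter such as A are not valid.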

7. You are querying a database that contains data about music. Each album is given an ID number. You are only interested in data related to the album with ID number 6. The album IDs are listed in the album_id column.

You write the SQL query below. Add a WHERE clause that will return only data about the album with ID number 6.

 SELECT 
 *
 FROM 
 track

How many tracks are on the album with ID number 6?

  • 20
  • 13 (Correct)
  • 5
  • 8

Correct: The clause WHERE album_id = 6 filters the results to the album with ID number 6. The complete query is SELECT * FROM track WHERE album_id = 6, which returns all columns from the track table for that album. The WHERE clause names the column, followed by an equals sign and the value to match.

Thus, the album with ID number 6 consists of 13 tracks.

8. You are working with a database that contains invoice data about online music purchases. You are only interested in invoices sent to customers located in the city of Chicago. You want to sort the invoices by order total in ascending order. The order totals are listed in the total column.

You write the SQL query below. Add an ORDER BY clause that will sort the invoices by order total in ascending order.


What total appears in row 2 of your query result?

  • 1.98 (Correct)
  • 7.96
  • 15.86
  • 5.94

Correct: The clause ORDER BY total sorts the invoices by order total in ascending order. The complete query is SELECT * FROM invoice WHERE billing_city = 'Chicago' ORDER BY total. It returns all columns from the invoice table where the billing city is Chicago, sorted by total in ascending order.

ORDER BY sorts in ascending order by default, so no keyword is needed after the column name. The total 1.98 appears in row 2 of the sorted Chicago invoices.
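Filtering and sorting in the same query can be sketched with sqlite3 as follows. The invoice rows and totals are made up, so the values differ from the quiz answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (invoice_id INTEGER, billing_city TEXT, total REAL)")
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [
        (1, "Chicago", 9.0),
        (2, "Boston", 1.0),
        (3, "Chicago", 2.5),
        (4, "Chicago", 4.0),
        (5, "Paris", 13.5),
    ],
)

# WHERE runs first and keeps the Chicago rows; ORDER BY then sorts them.
chicago_sorted = conn.execute(
    "SELECT invoice_id, total FROM invoice "
    "WHERE billing_city = 'Chicago' ORDER BY total"
).fetchall()

second_row_total = chicago_sorted[1][1]  # row 2 of the result
print(chicago_sorted, second_row_total)
```

The Boston invoice has the lowest total in the table, but it never appears in the result because the filter removes it before sorting.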

9. During which of the four phases of analysis can you find a correlation between two variables?

  • Format and adjust data
  • Organize data
  • Transform data (Correct)
  • Get input from others

Correct: Finding a correlation between two variables occurs while transforming data.

10. Typically, a data analyst uses filters when they want to expand the amount of data they are working with.

  • True
  • False (Correct)

Correct: Data analysts use filters to narrow down the data they are working with, not to expand it. A filter shows only the data that meets certain conditions so the analyst can focus on that subset.

11. A data analyst sorts a spreadsheet range between cells F19 and G82. They sort in ascending order by the second column, Column G. What is the syntax they are using?

  • =SORT(F19:G82, 2, FALSE)
  • =SORT(F19:G82, B, TRUE)
  • =SORT(F19:G82, 2, TRUE) (Correct)
  • =SORT(F19:G82, B, FALSE)

Correct: In =SORT(F19:G82, 2, TRUE), the first argument (F19:G82) is the range to sort, the number 2 refers to the second column of that range (Column G), and TRUE sorts the data in ascending order.

12. Which phase of the data analysis process has the goal of identifying trends and relationships?

  • Analyze (Correct)
  • Process
  • Act
  • Prepare

Correct: The objective of the analysis is to reveal the trends and relationships in the data so that you can answer the question or solve the problem accurately.

13. Which of the following actions might occur when transforming data? Select all that apply.

  • Recognize relationships in your data (Correct)
  • Eliminate irrelevant info from your data
  • Make calculations based on your data (Correct)
  • Identify a pattern in your data (Correct)

Correct: Transforming data involves identifying relationships and patterns in the data and making calculations based on it.

14. Fill in the blank: Sorting ranks data based on a specific _____ that you select.

  • model
  • calculation
  • observation
  • metric (Correct)

Correct: Sorting arranges data in a meaningful order based on a specific metric that you select. This makes the data easier to interpret, analyze, and visualize.

15. A data analyst is sorting data in a spreadsheet. Which tool are they using if all of the data is sorted by the ranking of a specific sorted column and data across rows is kept together?

  • Sort Sheet (Correct)
  • Sort Together
  • Sort Document
  • Sort Rank

Correct: Sort Sheet sorts all of the data in a spreadsheet by the ranking of a specific column. It also keeps the data in each row together, so information stays correctly aligned across rows.

16. You are querying a database that contains data about music. You are only interested in data related to the jazz musician Miles Davis. The names of the musicians are listed in the composer column.

You write the SQL query below. Add a WHERE clause that will return only data about music by Miles Davis.

What track by Miles Davis appears in row 1 of your query result? 

  • So What
  • Summertime
  • Compulsion
  • Now’s The Time (Correct)

Correct: The clause WHERE composer = 'Miles Davis' filters the results to music composed by Miles Davis. The complete query is SELECT * FROM track WHERE composer = 'Miles Davis', which returns all columns from the track table where the composer is Miles Davis.

The WHERE clause names the column, followed by an equals sign and the value to match.

The track “Now’s The Time” appears in row 1 of the query result, so it is the first row that matches this condition.

17. A data analyst at a high-tech manufacturer sorts inventory data in a spreadsheet. They sort all data by ranking in the Order Frequency column, keeping together all data across rows. What spreadsheet tool are they using? 

  • Sort Rows
  • Sort Column
  • Sort Together
  • Sort Sheet (CORRECT)

18. Fill in the blank: To filter for all students in the Sophomore table who live in Fairfield County, a data professional uses the _____ clause in SQL. 

  • LIMIT
  • EXCEPT
  • FILTER
  • WHERE (CORRECT)

19. A junior data analyst performs several calculations on a dataset. What phase of analysis is the analyst in? 

  • Get input from others
  • Format and adjust data
  • Organize data
  • Transform data (CORRECT)

20. Which of the following statements accurately describe sorting and filtering? Select all that apply. 

  • Filtering can be performed in spreadsheets, but not SQL databases.
  • Filtering enables data professionals to view the data that is most important. (CORRECT) 
  • Sorting involves arranging data into a meaningful order. (CORRECT)
  • Sorting can be performed in both spreadsheets and SQL databases. (CORRECT)

21. Fill in the blank: During an analysis project, _____ might involve creating new columns in order to prepare the dataset for analysis. 

  • formatting and adjusting data (CORRECT)
  • organizing data
  • getting input from others
  • transforming data

22. Which query will return a list of all construction businesses that have made more than $8 million, in order from the largest number of employees to the fewest? 

  • SELECT * FROM 'Company_data' WHERE Business = 'Construction', Revenue < 8000000 ORDER BY number_of_employees ASC
  • SELECT * FROM 'Company_data' WHERE Business = 'Construction' AND Revenue > 8000000 ORDER BY number_of_employees DESC (CORRECT)
  • SELECT * FROM 'Company_data' WHERE Business = 'Construction' WHERE Revenue < 8000000 ORDER BY number_of_employees DESC
  • SELECT * FROM 'Company_data' WHERE Business = 'Construction' AND Revenue > 8000000 ORDER BY number_of_employees ASC
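A query that matches this question needs two conditions joined with AND and a sort from largest to smallest, which in SQL is written DESC. The sqlite3 sketch below checks that shape against made-up rows; the name column and all values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Company_data "
    "(name TEXT, Business TEXT, Revenue INTEGER, number_of_employees INTEGER)"
)
conn.executemany(
    "INSERT INTO Company_data VALUES (?, ?, ?, ?)",
    [
        ("A", "Construction", 9000000, 50),
        ("B", "Construction", 12000000, 200),
        ("C", "Construction", 5000000, 500),   # revenue too low
        ("D", "Retail", 20000000, 1000),       # wrong business type
    ],
)

# AND requires both conditions; DESC lists the most employees first.
result = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM Company_data "
        "WHERE Business = 'Construction' AND Revenue > 8000000 "
        "ORDER BY number_of_employees DESC"
    )
]

print(result)
```

A second WHERE keyword or a comma between conditions is a syntax error in SQL, which is how the other option styles can be ruled out.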

23. A data professional at a manufacturing company is tasked with identifying which machines are most likely to need repairs. In the analyze phase of the data analysis process, what activities might this involve? Select all that apply.  

  • Prepare a report for the stakeholders
  • Organize a dataset by machine type and performance levels (CORRECT)
  • Get input from colleagues on the data team (CORRECT)
  • Format the data to filter for machines that need the most maintenance (CORRECT)

24. Which function sorts a spreadsheet range between cells K1 and L80 in ascending order by the first column, Column K? 

  • =SORT(K1:L80, A, TRUE)
  • =SORT(K1:L80, A, FALSE)
  • =SORT(K1:L80, 1, TRUE) (CORRECT)
  • =SORT(K1:L80, 1, FALSE)

25. A data analyst determines whether there are any patterns in a dataset. What phase of analysis is the analyst in? 

  • Format and adjust data
  • Organize data
  • Transform data (CORRECT)
  • Get input from others

26. Which function sorts a spreadsheet range between cells C1 and D70 in ascending order by the first column, Column C? 

  • =SORT(C1:D70, A, TRUE)
  • =SORT(C1:D70, 1, FALSE)
  • =SORT(C1:D70, 1, TRUE) (CORRECT)
  • =SORT(C1:D70, A, FALSE)

27. A data analyst at a retail company sorts customer data in a spreadsheet. They sort all data by ranking in the Purchase Amount column, keeping together all data across rows. What spreadsheet tool are they using? 

  • Sort Document
  • Sort Sheet (CORRECT)
  • Sort Together
  • Sort Rank

28. Fill in the blank: To filter for all items in the Products table that are currently in stock, a data professional uses the _____ clause in SQL. 

  • LIMIT
  • FILTER
  • EXCEPT
  • WHERE (CORRECT)

29. Fill in the blank: During an analysis project, _____ might involve converting dates to a consistent format in order to prepare the dataset for analysis. 

  • transforming data
  • formatting and adjusting data (CORRECT)
  • getting input from others

30. Which function sorts a spreadsheet range between cells C1 and D70 in ascending order by the first column, Column C? 

  • =SORT(C1:D70, A, TRUE)
  • =SORT(C1:D70, 1, TRUE) (CORRECT)
  • =SORT(C1:D70, A, FALSE)
  • =SORT(C1:D70, 1, FALSE)

31. A data professional in human resources is tasked with identifying appropriate staff members to manage upcoming projects. In the analyze phase of the data analysis process, what activities might this involve? Select all that apply. 

  • Prepare a report for the stakeholders
  • Get input from other HR data professionals (CORRECT)
  • Format the data to filter for keywords relevant to the upcoming projects (CORRECT)
  • Organize an employee dataset by skills and experience (CORRECT)

32. A data professional at a finance company sorts spreadsheet data. They sort all data by ranking in the Financial Performance column, keeping together all data across rows. What spreadsheet tool are they using? 

  • Sort Rows
  • Sort Together
  • Sort Column
  • Sort Sheet (CORRECT)

33. A data team investigates possible relationships in a dataset. What phase of analysis is the analyst in? 

  • Get input from others
  • Organize data
  • Transform data (CORRECT)
  • Format and adjust data

34.  Which query will return a list of all corn farms that have made more than $4 million, in order from the oldest to the newest business? 

  • SELECT * FROM 'farms_directory' WHERE Crop = 'Corn', Revenue < 4000000 ORDER BY Years_in_business ASC
  • SELECT * FROM 'farms_directory' WHERE Crop = 'Corn' AND Revenue > 4000000 ORDER BY Years_in_business ASC
  • SELECT * FROM 'farms_directory' WHERE Crop = 'Corn' AND Revenue > 4000000 ORDER BY Years_in_business DESC (CORRECT)
  • SELECT * FROM 'farms_directory' WHERE Crop = 'Corn' WHERE Revenue > 4000000 ORDER BY Years_in_business DESC

35. A data professional in customer service is tasked with identifying customers who are at risk for taking their business to a competitor. In the analyze phase of the data analysis process, what activities might this involve? Select all that apply. 

  • Prepare a report for the stakeholders
  • Request input from other customer service data professionals (CORRECT)
  • Format the data to filter for low customer satisfaction scores (CORRECT)
  • Organize a dataset by customer and purchase history (CORRECT)

36.  Which function sorts a spreadsheet range between cells G1 and H60 in ascending order by the first column, Column G? 

  • =SORT(G1:H60, 1, FALSE)
  • =SORT(G1:H60, A, FALSE)
  • =SORT(G1:H60, 1, TRUE) (CORRECT)
  • =SORT(G1:H60, A, TRUE)

37. Which function sorts a spreadsheet range between cells K1 and L80 in ascending order by the first column, Column K? 

  • =SORT(K1:L80, 1, FALSE)
  • =SORT(K1:L80, A, FALSE)
  • =SORT(K1:L80, 1, TRUE) (CORRECT)
  • =SORT(K1:L80, A, TRUE)

Organizing Data to Begin Analysis CONCLUSION

This section of the course covered organizing data. Sorting and filtering are key steps in keeping your data organized so that it can be analyzed.

You practiced both in spreadsheets and in SQL, which helps prepare data for deeper analysis. With a clear understanding of how to organize data, you are ready to continue with the rest of the course on Coursera.
