COURSE 4 – PROCESS DATA FROM DIRTY TO CLEAN QUIZ ANSWERS

Week 4: Verify and Report on your Cleaning Results

VERIFY AND REPORT ON YOUR CLEANING RESULTS INTRODUCTION

This week of the Google Data Analytics Professional Certificate on Coursera covers verifying and reporting on your data-cleaning efforts. You will learn how to verify and report the results of your data-cleaning process, and why doing so is necessary. Through hands-on activities with real-life datasets, you will practice various data-cleaning techniques and document what you are doing and why you are doing it.

Writing a clear, concise summary is another skill to master, so that others understand which data-cleaning activities were undertaken and the rationale behind them. Verification and reporting make the analysis reliable and trustworthy for anyone who uses the data in the future.

Learning Objectives:

  • Understand the process of verifying data-cleaning results
  • Learn the various steps required to manually clean data
  • Identify what a data-cleaning report must include
  • Identify the benefits of documenting the data-cleaning process

TEST YOUR KNOWLEDGE ON MANUAL DATA CLEANING

1. Making sure data is properly verified is an important part of the data-cleaning process. Which of the following tasks are involved in this verification? Select all that apply

  • Considering whether the data is credible and appropriate for the project. (Correct)
  • Manually fixing any errors found in the data. (Correct)
  • Rechecking the data-cleaning effort. (Correct)
  • Asking stakeholders to check and confirm the data is clean.

Correct: Verification confirms that data cleaning was done well and that the resulting data is accurate and reliable. To verify data, analysts recheck their earlier cleaning steps, manually fix any remaining errors, and consider whether the data is credible and appropriate for the project. This lets them trust that the data is suitable for analysis or reporting.

2. Fill in the blank: To count the total number of spreadsheet values within a specified range, a data analyst uses the _____ function.

  • COUNTA (Correct)
  • SUM
  • WHOLE
  • TOTAL

Correct: A data analyst uses COUNTA to count the total number of values within a specified range. COUNTA counts all non-empty cells in the range, whether they contain numbers, text, or any other kind of data, which is handy for finding out how many entries a dataset has. To count only the numeric values in a range, use COUNT instead of COUNTA.
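
To make the difference between COUNTA and COUNT concrete, here is a minimal Python sketch that imitates both functions on a made-up column. The cell values and the helper names `counta` and `count` are assumptions for illustration only, and `None` stands in for an empty cell; real spreadsheet functions have more edge cases than this.

```python
# Hypothetical spreadsheet column; None stands in for an empty cell.
cells = ["Alice", 42, None, 3.5, "CAD-425", None]

def counta(values):
    """Count every non-empty cell, roughly what COUNTA does."""
    return sum(1 for v in values if v is not None)

def count(values):
    """Count only numeric cells, roughly what COUNT does."""
    return sum(
        1 for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    )

print(counta(cells))  # 4 (two text cells plus two numeric cells)
print(count(cells))   # 2 (only 42 and 3.5)
```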

3. A data analyst is cleaning a dataset with inconsistent formats and repeated cases. They use the TRIM function to remove extra spaces from string variables. What other tools can they use for data cleaning? Select all that apply.

  • Import data
  • Remove duplicates (Correct)
  • Protect sheet
  • Find and replace (Correct)

Correct: In addition to the TRIM function, the analyst can clean data with the Remove duplicates and Find and replace tools.

4. To correct a typo in a database column, where should you insert a CASE statement in a query?

  • As an ORDER BY clause
  • As a GROUP BY clause
  • As a SELECT clause (Correct)
  • As a FROM clause

Correct: The CASE statement belongs in the SELECT clause. A CASE statement checks one or more conditions and returns a value as soon as a condition is met. Here the typo is the condition, and the corrected spelling is the value returned when that condition is true.
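
As an illustration of where the CASE statement sits in a query, the following sketch runs one against an in-memory SQLite database from Python. The table name `customers`, its columns, and the misspelling 'Tnoy' are all made up for this example; the course's own exercises may use a different database and different data.

```python
import sqlite3

# Build a small throwaway table that contains one misspelled name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, first_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Tony"), (2, "Tnoy"), (3, "Maria")],
)

# The CASE statement is part of the SELECT clause: the typo is the
# condition, and the corrected spelling is the value returned.
query = """
SELECT
    id,
    CASE
        WHEN first_name = 'Tnoy' THEN 'Tony'
        ELSE first_name
    END AS cleaned_name
FROM customers
ORDER BY id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 'Tony'), (2, 'Tony'), (3, 'Maria')]
conn.close()
```

Note that this only corrects the value in the query result; the stored table still contains the original spelling.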

TEST YOUR KNOWLEDGE ON DOCUMENTING THE CLEANING PROCESS

1. Why is it important for a data analyst to document the evolution of a dataset? Select all that apply.

  • To determine the quality of the data (Correct)
  • To identify best practices in the collection of data
  • To inform other users of changes (Correct)
  • To recover data-cleaning errors (Correct)

Correct: Documenting how a dataset evolves makes it possible to recover from data-cleaning errors, inform other users of changes, and determine the quality of the data.

2. Fill in the blank: While cleaning data, documentation is used to track _____. Select all that apply.

  • deletions (Correct)
  • errors (Correct)
  • bias
  • changes (Correct)

Correct: During data cleaning, documentation is used to track changes, deletions, and errors.

3. Documenting data-cleaning makes it possible to achieve what goals? Select all that apply.

  • Demonstrate to project stakeholders that you are accountable (Correct)
  • Visualize the results of your data analysis
  • Be transparent about your process (Correct)
  • Keep team members on the same page (Correct)

Correct: Documenting your data-cleaning process lets you be transparent about that process, keep team members on the same page, and demonstrate to project stakeholders that you are accountable.

PROCESS DATA FROM DIRTY TO CLEAN WEEKLY CHALLENGE 4

1. The data collected for an analysis project has just been cleaned. What are the next steps for a data analyst? Select all that apply.

  • Certification
  • Reporting (Correct)
  • Verification (Correct)
  • Validation

Correct: Once the data has been cleaned, the next step for a data analyst is verification and then reporting.

2. What is the first step in the verification process?

  • Compare cleaned data with the original, uncleaned dataset and compare it to what is there now (Correct)
  • Create a chronological list of modifications made to the data
  • Determine the quality of the data
  • Inform others of your data-cleaning effort

Correct: To begin the verification process, a comparison is initially made between the cleaned data and the original unclean data, followed by an evaluation of the changes that occurred.
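
One possible way to picture this first step is the small Python sketch below, which compares a hypothetical original column with its cleaned version and lists what changed and what was removed. The row IDs and names are invented for illustration, and a real comparison would normally be done in the spreadsheet or database itself.

```python
# Hypothetical data keyed by row ID: before and after cleaning.
original = {1: "  Tony ", 2: "Tnoy", 3: "Maria", 4: "Maria"}
cleaned = {1: "Tony", 2: "Tony", 3: "Maria"}

# Rows present in both versions whose value was modified.
changed = {
    row: (original[row], cleaned[row])
    for row in original
    if row in cleaned and original[row] != cleaned[row]
}

# Rows that existed originally but were dropped during cleaning.
removed = sorted(row for row in original if row not in cleaned)

print(changed)  # {1: ('  Tony ', 'Tony'), 2: ('Tnoy', 'Tony')}
print(removed)  # [4]
```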

3. Fill in the blank: TRIM is a function that removes _____ spaces in data. Select all that apply.

  • Trailing (Correct)
  • Leading (Correct)
  • Repeated (Correct)
  • Inner

Correct: TRIM removes leading, trailing, and repeated spaces in data, including extra spaces between words.
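
For readers who want to see the behaviour spelled out, here is a rough Python imitation of TRIM. The helper name `trim` and the sample string are assumptions for illustration; this is a sketch of the idea, not the spreadsheet function itself.

```python
import re

def trim(text):
    """Remove leading and trailing spaces, then collapse repeated spaces."""
    without_edges = text.strip(" ")
    return re.sub(r" {2,}", " ", without_edges)

print(repr(trim("   Jane    Doe  ")))  # 'Jane Doe'
```

The single space between the two words is kept, which is why "inner" is not one of the correct answers.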

4. While verifying cleaned data, a data analyst encounters a misspelled name. Which function can they use to determine if the error is repeated throughout the dataset?

  • CHECK
  • COUNTA (Correct)
  • COUNT
  • CASE

Correct: COUNTA can be used to determine whether the error is repeated throughout the dataset.

5. A WHEN statement considers one or more conditions and returns a value as soon as that condition is met.

  • True
  • False (Correct)

Correct: It is the CASE statement, not WHEN, that checks one or more conditions and returns a value as soon as a condition is met.

6. Fill in the blank: Documentation is the process of tracking _____ during data cleaning. Select all that apply.

  • inactivity
  • deletions (Correct)
  • changes (Correct)
  • additions (Correct)

Correct: Documentation is the process of tracking changes, additions, deletions, and errors during data cleaning.

7. Fill in the blank: While cleaning data, a data analyst can use a changelog to keep a chronological list of changes they make. They can refer to it during the _____ period if there are errors or questions.

  • verification (Correct)
  • visualization
  • presenting
  • documentation

Correct: A data analyst can maintain a chronological list of the changes they have made using a changelog while cleaning the data. This can serve as a reference during the verification in case any faults or questions arise later.
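
A changelog can live in a spreadsheet tab, a text file, or comments in a query. Purely as an illustration, the sketch below models one in Python and sorts it chronologically; the class name `ChangelogEntry`, its fields, the dates, and the notes are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ChangelogEntry:
    when: date   # date the modification was made
    kind: str    # "change", "addition", "deletion", or "error"
    note: str    # what was done and why

# Hypothetical entries, deliberately listed out of order.
changelog = [
    ChangelogEntry(date(2024, 1, 3), "deletion", "Removed duplicate rows found during review"),
    ChangelogEntry(date(2024, 1, 2), "change", "Trimmed extra spaces in the name column"),
]

# Keep the list chronological so it can be replayed during verification.
changelog.sort(key=lambda entry: entry.when)

for entry in changelog:
    print(entry.when.isoformat(), entry.kind, "-", entry.note)
```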

8. Reviewing version history is an effective way to view a changelog in SQL.

  • True
  • False (Correct)

Correct: Reviewing version history is an effective way to view a changelog in spreadsheets, not in SQL.

9. Fill in the blank: Once data is clean, a data analyst moves on to _____ and verification.

  • processing
  • publishing
  • reporting (Correct)
  • confirming

Correct: Once data is clean, a data analyst moves on to verification and reporting.

10. A data analyst is in the verification step. They consider the business problem, the goal, and the data involved in their analytics project. What scenario does this describe?

  • Visualizing the data
  • Seeing the big picture (Correct)
  • Reporting on the data
  • Considering the stakeholders

Correct: Seeing the big picture means considering the business problem, the goal, and the data while verifying data cleaning.

11. Which of the following functions automatically remove extra spaces when cleaning data?

  • SNIP
  • REMOVE
  • CLEAR
  • TRIM (Correct)

Correct: TRIM removes extra spaces when cleaning data: leading, trailing, and repeated spaces.

12. While verifying cleaned data, a data analyst encounters a misspelled name. Which function can they use to determine if the error is repeated throughout the dataset?

  • COUNTA (Correct)
  • COUNT
  • CHECK
  • CASE

Correct: Of the options listed, COUNTA is the expected answer. Note that COUNTA counts all non-empty cells in a range, so on its own it does not single out a specific error. A function such as COUNTIF, which counts the cells that match a given value or condition, is more directly suited to checking how often a particular misspelling occurs.
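
To illustrate the idea of counting how often one specific value occurs, here is a short Python sketch in the spirit of COUNTIF. The helper name `countif`, the list of names, and the misspelling 'Tnoy' are made up for this example.

```python
# Hypothetical name column containing a repeated misspelling.
names = ["Tony", "Tnoy", "Maria", "Tnoy", "Tony"]

def countif(values, target):
    """Count the cells that exactly match the target value."""
    return sum(1 for v in values if v == target)

print(countif(names, "Tnoy"))  # 2, so the error occurs more than once
```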

13. A data analyst uses a changelog while cleaning data. What process does a changelog support?

  • Documentation (Correct)
  • Illumination
  • Disclosure
  • Examination

Correct: A changelog supports documentation.

14. Verification and reporting come directly before the data-cleaning process.

  • True
  • False (Correct)

Correct: Verification and reporting follow data cleaning activities.

15. Which function removes leading, trailing, and repeated spaces in data?

  • TRIM (Correct)
  • CROP
  • TIDY
  • CUT

Correct: TRIM is the function that removes leading spaces, trailing spaces, and repeated spaces between words in data.

16. Which SQL tool considers one or more conditions, then returns a value as soon as a condition is met?

  • CASE (Correct)
  • WHEN
  • THEN
  • ELSE

Correct: A CASE statement checks one or more conditions and returns a value as soon as a condition is met.

17. Fill in the blank: A changelog contains a _____ list of modifications made to a project.

  • approximate
  • random
  • synchronized
  • chronological (Correct)

Correct: A changelog is a file containing a chronological list of the modifications made to a project, so a data analyst can look up any change they need.

18. A data analyst makes changes to SQL queries and uses these comments to create a changelog. This involves specifying the changes they made and why they made them.

  • True (Correct)
  • False

Correct: Adding comments to SQL queries to create a changelog means recording each change that was made along with the reason for it.
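
As a hedged example of what such comments might look like, the sketch below stores changelog notes as `--` comments at the top of a query and runs it against an in-memory SQLite database from Python. The table `products`, the column `product_id`, and the comment text are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?)",
    [("CAD-425",), ("cad-425",)],
)

# The comments record what was changed and why; SQL ignores them at run time.
query = """
-- Changelog (hypothetical entries):
-- Changed: upper-cased product_id so 'cad-425' matches 'CAD-425'.
-- Reason: mixed casing made one product appear as two.
SELECT DISTINCT UPPER(product_id) AS product_id
FROM products
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('CAD-425',)]
conn.close()
```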

19. What is involved in seeing the big picture when verifying data cleaning? Select all that apply

  • Consider the business problem (Correct)
  • Consider the data (Correct)
  • Consider the goal (Correct)
  • Consider the reporting

Correct: To see the big picture when verifying data cleaning, consider the business problem, the goal, and the data. When these are aligned, you can be confident that the cleaned data matches the project objectives and will deliver insights that matter.

20. Fill in the blank: TRIM is a function that removes _____ spaces in data. Select all that apply.

  • Leading (Correct)
  • Repeated (Correct)
  • Inner
  • Trailing (Correct)

Correct: TRIM is a function that removes leading, trailing, and repeated spaces in data.

21. What is the process of tracking changes, additions, deletions, and errors during data cleaning?

  • Documentation (Correct)
  • Cataloging
  • Recording
  • Observation

Correct: Documentation is the process of tracking changes, additions, deletions, and errors during data cleaning.

22. At what point during the analysis process does a data analyst use a changelog?

  • While cleaning the data (Correct)
  • While visualizing the data
  • While gathering the data
  • While reporting the data

Correct: A data analyst uses a changelog while cleaning data.

23. A data analyst is starting a large scale project. The project will be crucial to business success and the data analyst needs to keep the big picture at the forefront when verifying their data cleaning. What is the first step in the verification process?

  • Determine the quality of the data
  • Compare cleaned data with the original, uncleaned dataset and compare it to what is there now (CORRECT)
  • Create a chronological list of modifications made to the data
  • Inform others of the data-cleaning effort

24. During the verification process, you find that you missed a few leading spaces during data cleaning. What function can you use to eliminate these spaces?

  • TIDY
  • TRIM (CORRECT)
  • CROP
  • CUT

25. What tool can a data analyst use to figure out how many identical errors occur in a dataset?

  • CONFIRM
  • CASE
  • COUNT
  • COUNTA (CORRECT)

26. You find a few misspellings in your datatable and need to correct them when running a query. What function can you use when your set condition is met?

  • CASE (CORRECT)
  • THEN
  • WHEN
  • ELSE

27. A data analyst uses a changelog while cleaning their data. What data modifications should they track in the changelog?

  • Changes, resolutions, and deletions
  • Errors, deletions, and notes
  • Errors, additions, and deletions (CORRECT)
  • Additions, changes, and queries

28. Fill in the blank: A process to confirm that a data-cleaning effort was well-executed and the resulting data is accurate and reliable is known as _____.

  • manipulation
  • publishing
  • verification (CORRECT)
  • processing

29. What is the first step in the verification process?

  • Inform others of your data-cleaning effort
  • Compare cleaned data with the original, uncleaned dataset and compare it to what is there now (CORRECT)
  • Create a chronological list of modifications made to the data
  • Determine the quality of the data

30. During data cleaning, you find an error in a username where the ID number was accidentally joined to the user’s last name. You need to figure out if this username has been entered incorrectly more than once in your dataset. If you use a pivot table, what function can you use to determine the number of times this error occurs in your dataset?

  • COUNT
  • CASE
  • CHECK
  • COUNTA (CORRECT)

31. Fill in the blank: A data analyst uses the CASE statement to consider one or more _____, then return a value.

  • changes
  • fields
  • identifications
  • conditions (CORRECT)

32. Fill in the blank: While cleaning data, a data analyst can use a changelog to keep a chronological list of changes they make. They can refer to it during the _____ period if there are errors or questions.

  • documentation
  • presenting
  • verification (CORRECT)
  • visualization

33. A data analyst is reviewing modifications made to a SQL table and a spreadsheet. The data analyst will get similar results when using the changelogs for both data sources.

  • True (CORRECT)
  • False

34. Fill in the blank: A data analyst finishes cleaning their data. The next step in the process is reporting and _____.

  • verification (CORRECT)
  • manipulation
  • replacing
  • processing

35. A data analyst is starting a large scale project that is crucial to business success. The data analyst needs to remember the big picture when verifying their data cleaning. What is involved when focusing on the big picture-view of the project? Select all that apply.

  • Consider the stakeholders
  • Consider the reporting
  • Consider the business problem (CORRECT)
  • Consider the goal (CORRECT)

36. Your manager points out an error in a product ID number in your dataset. The Product IDs can be numbers like 42 or text like “CAD-425”. Using a pivot table, what function can you use to find how many times this error occurs in the dataset?

  • CASE
  • CHECK
  • COUNT
  • COUNTA (CORRECT)

37. A data analyst is in the verification process and needs to verify the modifications that they have made to the data. What could the analyst reference to find the changes they made throughout data cleaning?

  • Changelog (CORRECT)
  • Metadata
  • Spreadsheet
  • Notepad

38. A data analyst uses the COUNTA function to count which of the following?

  • The total number of values within a specified range (CORRECT)
  • The total number of headers in a specific range
  • The specific numbers in a dataset
  • The total number of entries in a changelog

39. You’re working with a dataset that contains categorical variables. You notice that some of the strings are misspelled or are not capitalized. What function can you use to fix these errors when a condition is met?

  • CASE (CORRECT)
  • THEN
  • WHEN
  • ELSE

40. Fill in the blank: Documentation is the process of tracking _____ during data cleaning. Select all that apply.

  • inactivity
  • changes (CORRECT)
  • additions (CORRECT)
  • deletions (CORRECT)

41. Fill in the blank: As a data analyst, you should always create a _____ to track your additions, deletions, errors, and changes to a query.

  • notepad
  • spreadsheet
  • changelog (CORRECT)
  • database

42. In what step of the data-cleaning process do you find mistakes before you begin analyzing the data?

  • Publishing
  • Processing
  • Confirming
  • Verifying (CORRECT)

43. As a data analyst, you will need to keep the big picture in mind throughout any project when verifying data cleaning. What must the analyst do to take a big picture view of the project? Select all that apply.

  • Consider the reporting
  • Consider the goal (CORRECT)
  • Consider the business problem (CORRECT)
  • Consider the data (CORRECT)

VERIFY AND REPORT ON YOUR CLEANING RESULTS CONCLUSION

In brief, data cleaning is a critical step in the data analysis process. It is equally important to verify and report on the cleaning, so that the data is ready for the next step. Taking this course on Coursera will teach you how verification and reporting work in data cleaning and what advantages they offer. Join the learning today!
