Module 3: Preparing Data for Analysis Using Functions

INTRODUCTION – Preparing data for analysis using functions

This module shows Excel users how to prepare data for analysis in Power BI. It covers the essentials of formatting, cleaning, and structuring data so that it integrates smoothly with other tools and supports more advanced analysis. With these skills, you can streamline your data workflows, which makes complex analysis easier and helps you draw valuable insights in Power BI.

Learning Objectives:

  • Use functions to prepare text data for effective analysis.
  • Use functions to create dates in a spreadsheet.
  • Create new content using logical functions.

SELF-REVIEW: STANDARDIZING TEXT-BASED DATA

In the Standardizing text-based data exercise, you created calculations to standardize the data in the Adventure Works spreadsheet. You also used the functions TRIM, UPPER, PROPER, LEFT, MID, RIGHT, and CONCAT to clean the data. 

The results you produced should be similar to those in the following screenshot:

[Screenshot: completed Adventure Works worksheet]

Answer the questions that follow to test your understanding of the processes. Remember that you can refer to previous lesson items if required.

1. In your worksheet, you created formulas in column C that included the TRIM function and cell references from column B as arguments. The entries in column B contained a lot of unnecessary spaces though the entries contain only a single space between each word. What did the formulas remove?

  • The spaces before, after, and in the entries.
  • The spaces before and after the entries. (CORRECT)
  • The spaces before the entries only.

That’s correct! The TRIM function removes the spaces at the beginning and end of a text entry. It does not remove the single spaces between words. It only reduces runs of multiple consecutive spaces between words to a single space.
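The TRIM behavior described in this answer can be sketched in Python. This is only a rough stand-in for illustration, not Excel itself: the function name `excel_trim` and the sample entry are hypothetical.

```python
def excel_trim(text: str) -> str:
    """Rough stand-in for TRIM (hypothetical helper, not an Excel API)."""
    # Split on the space character only, drop the empty pieces left by
    # extra spaces, and rejoin the words with single spaces.
    return " ".join(part for part in text.split(" ") if part)


# Leading, trailing, and repeated inner spaces are cleaned up.
print(repr(excel_trim("   Adventure   Works  ")))  # 'Adventure Works'
```

An entry that already has single spaces between its words comes back unchanged apart from the leading and trailing spaces.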

2. True or False: You created formulas in cells K2 to K200 to combine the content from column G and column I. You were then able to delete columns G and I immediately, as they were no longer required.

  • True
  • False (CORRECT)

That’s correct! The formulas in column K depend on columns G and I, so deleting those columns immediately would break the formulas. You first need to copy column K and paste it back into column K as values, which replaces the formulas with their generated output.

3. In the CONCAT formula in K2, you added a third argument to indicate that the two words needed space between them. What character did you input on either side of the space in the formula?

  • A double quote. (CORRECT)
  • A parenthesis.
  • An exclamation mark.

That’s correct. Placing a double quote on either side of the space tells Excel to treat the space as a text character to include in the output.

4. Jamie wants to perform a time analysis of Adventure Works’ sales data. However, a time analysis isn’t possible because of how the date is currently formatted. What error might have caused this formatting issue?

  • The dates are typed with forward slashes.
  • The dates are typed as text. (CORRECT)
  • The dates are typed as numbers.

That’s correct. Excel stores dates as serial numbers. Dates typed as text are not recognized as calendar dates, so they cannot be used in a time analysis.

KNOWLEDGE CHECK: USING FUNCTIONS TO CLEAN OR STANDARDIZE TEXT

1. True or False: An Excel spreadsheet contains country names split over multiple columns. Cell A2 contains the word United. Cell B2 contains the word States. When executed, the following formula generates the result United States:

=CONCAT(A2," ",B2)

  • True (CORRECT)
  • False

That’s correct! The CONCAT formula joins the contents of A2 and B2 and adds a space between them. The space is supplied as an additional argument, wrapped in double quotes so that Excel treats it as text to include in the result.
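As a minimal sketch of how this joining works, the Python below emulates CONCAT with a hypothetical helper named `concat`. It only illustrates the idea that each argument is appended in order and that nothing is added between arguments unless you supply it.

```python
def concat(*parts) -> str:
    """Rough stand-in for CONCAT (hypothetical helper): join every argument as text."""
    return "".join(str(part) for part in parts)


a2, b2 = "United", "States"

print(concat(a2, " ", b2))  # United States
print(concat(a2, b2))       # UnitedStates (no space argument was supplied)
```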

2. One of the employees at Adventure Works has made some typing errors in a spreadsheet. A cell contains the following text entry:

aDVENTURE wORKS rESELLERS

Which function should you use in a formula to copy this entry so that it is in lowercase with a capital letter at the beginning of each word?

  • LOWER
  • PROPER (CORRECT)
  • UPPER

That’s correct! The PROPER function converts the first letter of each word to uppercase and all the other letters to lowercase.

3. In the Adventure Works Reseller spreadsheet, the reseller names are listed in column D. Cell D2 contains the following text, which is left-aligned in the cell with no redundant spaces:

EastBike Shop

What is the result of this formula?

=MID(D2,5,4)

  • East
  • Shop
  • Bike (CORRECT)

That’s correct! The second argument, 5, tells Excel to start at the fifth character, which is the letter B. The third argument, 4, tells Excel to return four characters in total, starting with that letter.

4. Cell A2 contains the following entry:

aceE6548.

What result would the following formula generate when applied to the above entry?

=UPPER(A2)

  • aCE6548
  • Ace6548.
  • ACEE6548 (CORRECT)

That’s correct! The UPPER function changes the letters into uppercase.
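The case functions in these questions have close counterparts in Python string methods. The sketch below assumes that `str.title()` is a reasonable approximation of PROPER for simple entries like these; the wrapper names `proper`, `upper`, and `lower` are hypothetical.

```python
def proper(text: str) -> str:
    """Approximate PROPER: capitalize each word, lowercase the rest."""
    return text.title()


def upper(text: str) -> str:
    """Approximate UPPER: letters become uppercase, digits are unchanged."""
    return text.upper()


def lower(text: str) -> str:
    """Approximate LOWER: letters become lowercase."""
    return text.lower()


print(proper("aDVENTURE wORKS rESELLERS"))  # Adventure Works Resellers
print(upper("aceE6548"))                    # ACEE6548
print(lower("aDVENTURE wORKS rESELLERS"))   # adventure works resellers
```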

5. Cell E4 contains the city name North Miami Beach. What result would the following formula generate in your worksheet?

=RIGHT(E4,5)

  • Miami
  • Beach (CORRECT)
  • North

That’s correct. The RIGHT function starts at the right-hand end of the text and returns the number of characters given in the second argument. The last five characters of North Miami Beach spell Beach.
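The extraction functions in this knowledge check can be sketched with Python slicing. The helpers `left`, `mid`, and `right` are hypothetical stand-ins. Note that Excel counts characters from 1, while Python slices count from 0, so `mid` subtracts 1 from the start position.

```python
def left(text: str, count: int) -> str:
    """First `count` characters, like LEFT."""
    return text[:count]


def mid(text: str, start: int, count: int) -> str:
    """`count` characters beginning at 1-based position `start`, like MID."""
    return text[start - 1:start - 1 + count]


def right(text: str, count: int) -> str:
    """Last `count` characters, like RIGHT."""
    return text[-count:] if count > 0 else ""


print(mid("EastBike Shop", 5, 4))     # Bike
print(right("North Miami Beach", 5))  # Beach
print(left("EastBike Shop", 4))       # East
```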

SELF-REVIEW: CALCULATING THE NUMBER OF WORKING DAYS REMAINING IN THE YEAR

In the exercise Calculating the number of working days remaining in the year, you created calculations to add new date and time information to a worksheet called USA Launch Dates in a workbook named Advertising Campaign USA. In the formulas you created, you used the functions TODAY, NETWORKDAYS, MONTH, and YEAR to calculate the required information.

Your final workbook should resemble the following screenshot:

[Screenshot: completed USA Launch Dates worksheet]

Now it’s time to review your understanding of the tasks you completed by answering the following questions. Don’t forget that you can revisit the previous learning items to recap the process steps.

1. In the spreadsheet, you were asked to create a formula using the MONTH function. The formula was:

=MONTH(D5)

Cell D5 contained the entry 07/02/23. This date is in American format. What was the result of this calculation?

  • 2023
  • 7 (CORRECT)
  • 2

That’s correct! The MONTH function identifies and displays the month element of a date. In this case, the month is 7.

2. True or False: If you were to use a TODAY function in a formula in cell B1, the result would change each time the formulas in the worksheet recalculate.

  • True
  • False (CORRECT)

That’s correct! The TODAY function reads the current date from the computer’s system clock. The formula recalculates along with the worksheet, but its result only changes when the date itself changes.

3. You created a formula using the NETWORKDAYS function. You added two arguments for the function, which were the start date, and the end date. What data did the formula automatically exclude from the result?

  • Any weekend date (CORRECT)
  • Any holiday date
  • Any weekend and holiday date

That’s correct! The function counts the working days from the start date to the end date. With only these two arguments, it excludes Saturdays and Sundays. Holidays are excluded only if you list them in the optional third argument.
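The counting rule can be sketched in Python as follows. This is an illustration under stated assumptions, not Excel’s implementation: the helper name `networkdays` is hypothetical, it assumes the start date is not after the end date, and the sample dates and holiday are made up for the example.

```python
from datetime import date, timedelta


def networkdays(start: date, end: date, holidays=()) -> int:
    """Count Monday-to-Friday dates from start to end, inclusive.

    Dates listed in `holidays` are skipped as well. Assumes start <= end.
    """
    skipped = set(holidays)
    count = 0
    day = start
    while day <= end:
        # weekday() returns 0-4 for Monday-Friday, 5-6 for the weekend.
        if day.weekday() < 5 and day not in skipped:
            count += 1
        day += timedelta(days=1)
    return count


# Monday 29 May 2023 to Sunday 4 June 2023: five weekdays.
print(networkdays(date(2023, 5, 29), date(2023, 6, 4)))  # 5

# The same span with one hypothetical holiday listed: four working days.
print(networkdays(date(2023, 5, 29), date(2023, 6, 4), [date(2023, 5, 29)]))  # 4
```

With no holiday list supplied, only the weekend dates are left out, which matches the answer to this question.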

4. The date 05/30/23 has a serial number of 45076. What serial number will the date 05/31/23 have?

  • 45077 (CORRECT)
  • 45078
  • 45079

That’s correct! Excel increases the serial number by 1 for every 24-hour period.
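To see the serial numbers in action, here is a small Python sketch. It assumes Excel’s default 1900 date system and is only valid for dates from March 1, 1900 onward; the helper `to_serial` and the constant `SERIAL_ORIGIN` are hypothetical names chosen for this example.

```python
from datetime import date

# Assumption: in the default 1900 date system, counting days from
# 30 December 1899 reproduces Excel's serial numbers for modern dates.
SERIAL_ORIGIN = date(1899, 12, 30)


def to_serial(day: date) -> int:
    """Approximate the Excel serial number of a date (hypothetical helper)."""
    return (day - SERIAL_ORIGIN).days


print(to_serial(date(2023, 5, 30)))  # 45076
print(to_serial(date(2023, 5, 31)))  # 45077
```

Because each day adds 1, subtracting one date from another gives the number of days between them.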

5. How many arguments does the TODAY function require?

  • Two
  • None (CORRECT)
  • One

That’s correct! The TODAY function takes no arguments because it reads the current date from the computer’s system clock.

KNOWLEDGE CHECK: DATE AND TIME FUNCTIONS

1. You have created a formula in your spreadsheet using the TODAY function. What must you include after the word TODAY in your formula?

  • An opening and closing parenthesis only. (CORRECT)
  • An opening and closing parenthesis and one space.
  • An opening and closing parenthesis and two spaces.

That’s correct! Even though the TODAY function takes no arguments, the function name must still be followed by a pair of opening and closing parentheses.

2. You have created a formula in your spreadsheet using the NOW function. By default, what will the formula display in its result?

  • The time only.
  • The date only.
  • The date and time. (CORRECT)

That’s correct! By default, the NOW function returns the current date and time. You can change how the result is displayed by choosing one of the date or time formats in the Number group on the Home tab.

3. You are working on a spreadsheet that contains three columns called Day, Month and Year. In another column, you would like to combine these entries so that it shows a complete date. Which function should you use to achieve this?

  • DATEDIF
  • DATE (CORRECT)
  • CONCAT

That’s correct! The DATE function takes the year, month, and day as arguments and combines them into a single date.

4. True or False: You can use the NETWORKDAYS.INTL function to calculate the number of working days between two dates while excluding national holidays and weekends because it has built-in knowledge of public holidays.

  • True
  • False (CORRECT)

That’s correct! NETWORKDAYS.INTL has no built-in knowledge of public or national holidays. You must list the holiday dates in the spreadsheet and reference that range in the function’s holidays argument. The function also lets you specify which days count as the weekend by using its weekend argument.

5. You have entered the following three dates in your spreadsheet in the month, day and year format. Which of these dates has the largest serial number?

  • 01/30/2023
  • 09/10/2025 (CORRECT)
  • 04/15/2020

That’s correct! Excel assigns the serial number 1 to January 1, 1900 and increases the number by 1 for each day after that. Later dates therefore always have larger serial numbers than earlier dates.

SELF-REVIEW: ADDING A DATA COLUMN USING THE IFS FUNCTION

In the exercise Adding a data column using the IFS function, you created calculations to generate two columns of new data in the Order Details spreadsheet in the workbook named Contoso Bikes. You generated this data using the IF and IFS functions. You also used the function SUMIF to generate regional totals in the spreadsheet.

The results you produced should be like those in the following screenshot:

[Screenshot: completed Order Details worksheet]

Now it’s time to review your understanding of the tasks you completed by answering the questions that follow. Don’t forget that you can revisit the previous learning items to recap the process steps.

1. You created an IF function formula in cell H7 that was designed to display the correct percentage discount if the amount in G7 was greater than $10,000. In what format did you enter the percentage in the Value if true section of the formula?

  • “10%”
  • 10
  • 10% (CORRECT)

That’s correct! Typing 10% without quotation marks in the value if true section of the formula tells Excel to return the numeric value 10% if the logical test returns TRUE.

2. True or False: When you created a formula in cell L7 to check and display the delivery charge for the region, you used IFS rather than IF because you needed to run more than two tests but also create the most concise formula.

  • True (CORRECT)
  • False

That’s correct! Unlike the IF function, a single IFS function can evaluate up to 127 conditions. A single IF function runs only one logical test, so handling three or more regions would require a complex formula with multiple nested IF functions. The IFS function can check three or more conditions within one concise formula.

3. You created a formula in H2 which used SUMIF to obtain a total for Region A entries only. What did you need to specify as the first argument in your formula?

  • The criteria range (CORRECT)
  • The criteria entry
  • The SUM range

4. You have created a nested IF formula using three IF functions. How many closing parentheses do you need at the end of the formula?

  • Three (CORRECT)
  • One
  • Two

That’s correct. Each IF function has its own opening and closing parenthesis. Because the formula nests three IF functions, you need three closing parentheses at the end of the formula, one to close each function.

KNOWLEDGE CHECK: LOGICAL FUNCTIONS

1. Cell A2 of your worksheet contains a value of 250. What is the result of the following formula when added to your worksheet?

=IF(A2>300,10%,IF(A2>200,5%,0%))

  • 10%
  • 0%
  • 5% (CORRECT)

That’s correct. The first IF’s logical test fails because 250 is not greater than 300, so the second IF is tested. Its logical test, A2>200, returns TRUE, so the formula displays the second IF’s “value if true,” which is 5%.
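The nested IF in this question reads naturally as an if/elif/else chain. The Python sketch below mirrors the formula; the function name `discount` is hypothetical, and percentages are written as decimal fractions.

```python
def discount(a2: float) -> float:
    """Mirror of =IF(A2>300,10%,IF(A2>200,5%,0%)) as a Python function."""
    if a2 > 300:      # first IF: logical test
        return 0.10   # first IF: value if true
    elif a2 > 200:    # second IF, reached only when the first test fails
        return 0.05   # second IF: value if true
    else:
        return 0.0    # second IF: value if false


print(discount(250))  # 0.05
```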

2. You create a formula using the IFS function to test for a series of alphabet characters. When typing the criteria to test for, what symbols should you add around each text character?

  • Parentheses.
  • Single quotation marks.
  • Double quotation marks. (CORRECT)

That’s correct. Text values in a formula must be enclosed in double quotation marks so that Excel treats them as text rather than as names or cell references.

3. In your worksheet, cell A2 contains a value of 100. Cell B2 contains a value of 200, and C2 contains a value of 400. What is the result of the following formula when added to your worksheet?

=IF(OR(A2>=200,B2>=200),"Result 1",IF(C2>300,"Result 2","Result 3"))

  • Result 2
  • Result 1 (CORRECT)
  • Result 3

That’s correct! The OR function inside the first IF needs only one of its logical tests to return TRUE. The test B2>=200 returns TRUE, so the first IF displays its “value if true,” Result 1, and the second IF is never evaluated.

4. You create the following formula using the AVERAGEIF function: 

=AVERAGEIF(A2:A50,"Chicago",C2:C50)

What does the first argument of this function represent?

  • The average range.
  • The criteria.
  • The criteria range. (CORRECT)

That’s correct! The first argument is the criteria range in which AVERAGEIF searches for every occurrence meeting certain defined criteria.

5. In your worksheet, cell A2 contains the value 100. Cell B2 contains a value of 200, and C2 contains a value of 400. What is the result of the following formula when added to your worksheet?

=IFS(A2>200,"Rate 1",B2>200,"Rate 2",C2>200,"Rate 3",TRUE,0)

  • Rate 1
  • Rate 2
  • Rate 3 (CORRECT)

That’s correct! The first two tests do not return TRUE, but the third test does, so the IFS function returns the value paired with the third test, Rate 3.
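The way IFS walks through its test and value pairs can be sketched in Python. The helper `ifs` is a hypothetical stand-in: it returns the value paired with the first test that is true and raises an error when no test passes, where Excel would show #N/A.

```python
def ifs(*pairs):
    """Return the value paired with the first true test (hypothetical helper)."""
    tests = pairs[0::2]   # arguments 1, 3, 5, ... are the logical tests
    values = pairs[1::2]  # arguments 2, 4, 6, ... are the paired values
    for test, value in zip(tests, values):
        if test:
            return value
    raise ValueError("no test returned TRUE")  # Excel would display #N/A


a2, b2, c2 = 100, 200, 400
result = ifs(a2 > 200, "Rate 1", b2 > 200, "Rate 2", c2 > 200, "Rate 3", True, 0)
print(result)  # Rate 3
```

The final `True, 0` pair acts as a catch-all, just like the TRUE,0 pair at the end of the formula.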

TEST YOUR KNOWLEDGE: PREPARING DATA FOR ANALYSIS USING FUNCTIONS

1. True or False: In your worksheet, cell A2 contains the word United. B2 contains the word States. You add the following formula to your worksheet:

=CONCAT(A2,B2)

When executed, the result of this formula is:

United States

  • True
  • False (CORRECT)

That’s correct. The formula does not add a space between the words because it has no argument that specifies one, so the result is UnitedStates.

2. You are working on a spreadsheet that contains a column of customer names. You notice that there are a lot of extra spaces both before and after the entries. Which function could you use to tidy and standardize the customer name entries? 

  • RIGHT 
  • TRIM (CORRECT)
  • PROPER

That’s correct! The TRIM function removes the leading and trailing spaces and reduces any multiple spaces between words to a single space.

3. Some of the information in your worksheet has been typed in the wrong columns. The entry in C2 incorrectly reads as: 

32MainAvenueChicagoUSA

What is the result of the following formula which references this entry?

=MID(C2,3,10)

  • MainAvenue (CORRECT)
  • 32MainAven
  • ChicagoUSA

That’s correct! The second argument, 3, instructs Excel to start at the third character from the left. The third argument, 10, asks Excel to return 10 characters from that position.

4. When typing time entries into a cell, what character can you use to separate the hours and minutes?

  • A colon. (CORRECT)
  • A comma.
  • A semi-colon.

That’s correct! The colon symbol is used to separate hours, minutes, and seconds in a time entry.

5. Cell D2 in your spreadsheet contains the date 05/30/23. The date is in MM/DD/YY format. What result does the following formula display?

=DAY(D2)

  • 5
  • 23
  • 30 (CORRECT)

That’s correct! The DAY function identifies the day element in a date and displays that element as its result.

6. You are working on a spreadsheet that contains employee information. Column A lists the date staff members started with the company. Column B lists the date they left the company. Cell A2 lists an employee’s start date as 05/30/16, and cell B2 lists their end date as 05/30/23.

What result does the following formula display?

=DATEDIF(A2,B2,"Y")

  • 5
  • 6
  • 7 (CORRECT)

That’s correct. The DATEDIF function calculates the interval between two dates. The argument “Y” expresses the result in full years, which is 7 in this case.
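The full-years calculation can be sketched in Python as below. The helper `full_years` is a hypothetical illustration of the “Y” unit, not Excel’s implementation: it counts calendar years and subtracts one if the anniversary has not yet been reached in the end year.

```python
from datetime import date


def full_years(start: date, end: date) -> int:
    """Count completed years between two dates (hypothetical helper)."""
    years = end.year - start.year
    # If the end date falls before the anniversary, the last year is incomplete.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


print(full_years(date(2016, 5, 30), date(2023, 5, 30)))  # 7
print(full_years(date(2016, 5, 30), date(2023, 5, 29)))  # 6
```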

7. True or False: Cell A2 contains a value of 100. B2 contains a value of 200. When executed, the following formula, which references these cells, displays a result of TRUE:

=AND(A2>=100,B2>=250)

  • True
  • False (CORRECT)

8. You must create an IF formula that checks the values in three cells and displays a “value if true” message of “target met” if any of the three values is over 1,000. Which function can you “nest” inside the IF function of your formula to complete this task?

  • An AND function.
  • An OR function. (CORRECT)
  • An IF function.

That’s correct! The OR function returns TRUE if any one of the values in the three cells is greater than 1,000, which directs the IF function to display the “value if true” message, “target met.”

9. Cell A2 in your spreadsheet contains a value of 200. Which result does the following formula, which references this cell, display?

=IF(A2<>100,"First Message",IF(A2>300,"Second Message","Third Message"))

  • First Message (CORRECT)
  • Third Message
  • Second Message

That’s correct! The first logical test checks that the value in cell A2 is not equal to 100. The result of this first test is TRUE. So, the formula displays the message “First Message” and does not move on to the second IF.

10. In your customer details worksheet, you’ve noticed that cell A2 contains the entry mARY gOMEZ. What is the result of the following formula which references this cell?

=PROPER(A2)

  • MARY GOMEZ
  • mary gomez
  • Mary Gomez (CORRECT)

That’s correct! The PROPER function capitalizes the first letter of each word and converts the remaining letters to lowercase.

11. Cell A2 of your spreadsheet contains the date 05/30/23. Cell B2 contains the date 06/01/23. Both dates are in the MM/DD/YY format. Tip: If your machine is set to a different region, entering dates in the MM/DD/YY format will generate an error. Enter the dates in the format required for your region. For example, if you live in Europe the format would be DD/MM/YY.

What is the result of the following formula which references these cells?

=B2-A2

  • 3
  • 2 (CORRECT)
  • 1

That’s correct. Excel subtracts the serial number for 05/30/23 from the serial number for 06/01/23, which gives a result of 2.

12. True or False: Cell A2 contains a value of 100. B2 contains a value of 200. When executed, the following formula which references these cells, displays the result TRUE.

=OR(A2>=100,B2>=250)

  • True (CORRECT)
  • False

That’s correct! With OR, only one of the tests needs to return TRUE for the formula to return TRUE. In this case, the formula returns TRUE because the first test, A2>=100, returns TRUE.
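The difference between AND and OR with the same two tests can be sketched in Python, where `all` plays the role of AND and `any` plays the role of OR. The variable names are hypothetical and simply mirror the cell references in the questions.

```python
a2, b2 = 100, 200

tests = [a2 >= 100, b2 >= 250]  # the first test is True, the second is False

and_result = all(tests)  # AND needs every test to be True
or_result = any(tests)   # OR needs at least one test to be True

print(and_result)  # False
print(or_result)   # True
```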

13. You need to create an IF formula that runs a series of tests. If one test fails, the IF formula must move to the next test. Which function do you need to “nest” inside the IF function in this formula to complete this task?

14. Cell A2 in your spreadsheet contains a value of 150. Which message does the following formula display when executed?

=IFS(A2=50,"First Message",A2=100,"Second Message",A2=150,"Third Message",TRUE,"No Message")

  • Second Message
  • First Message
  • No Message
  • Third Message (CORRECT)

That’s correct! The first and second tests fail because A2 contains 150. The third test, A2=150, returns TRUE, so Excel returns the value paired with it, “Third Message.”

15. Columns A, B and C of your worksheet contain numeric entries. The columns are called Day, Month and Year. What type of data does the following formula generate when added to your worksheet?

=CONCAT(A2,B2,C2)

  • Numeric
  • Date
  • Text (CORRECT)

That’s correct! Even though the entries are numeric, the use of the CONCAT function in the formula transforms them into text.
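The difference between joining the parts as text and building a real date can be sketched in Python. The values below are hypothetical Day, Month, and Year entries; the f-string plays the role of CONCAT, and `datetime.date` plays the role of Excel’s DATE function.

```python
from datetime import date

day, month, year = 30, 5, 2023  # hypothetical Day, Month, Year entries

# CONCAT-style joining: the numbers become one text string.
as_text = f"{day}{month}{year}"

# DATE-style combination: the same parts become a real calendar date.
as_date = date(year, month, day)

print(type(as_text).__name__, as_text)  # str 3052023
print(type(as_date).__name__, as_date)  # date 2023-05-30
```

Only the second value can be used in date calculations, which is why a function that builds a date is the right tool for assembling a date from separate columns.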

16. Cell B2 of your worksheet contains the following entry:

mOUNTAIN Bike

What is the result of the following formula which references this cell?

=LOWER(B2)

  • mountain bike (CORRECT)
  • MOUNTAIN BIKE
  • Mountain Bike

That’s correct! The LOWER function changes all letters in the text entry to lowercase.

17. Some of the information in your worksheet has been typed in the wrong columns. The entry in C2 incorrectly reads as:

32MainAvenueChicagoUSA

What is the result of the following formula which references this entry?

=LEFT(C2,12)

  • 32Main
  • 32MainAvenue (CORRECT)
  • 32MainAvenueChicago

That’s correct! The LEFT function returns characters starting from the left of a text entry. The second argument, 12, tells Excel to return the first twelve characters.

18. In your worksheet, cell A2 contains a value of 150. Cell B2 contains a value of 200. Cell C2 contains a value of 300.

The following formula, which contains three logical tests, has returned a result of FALSE. Which of these logical tests failed?

=AND(A2>100,B2>150,C2>350)

  • The first logical test.
  • The third logical test. (CORRECT)
  • The second logical test.

That’s correct. The logical tests for A2 and B2 are met, but C2 contains 300, which is not greater than 350, so the third logical test fails.

19. You must create an IF formula that checks the values in three cells. It must display a “value if true” message of “target met” if all three values exceed 1,000. Which function should you “nest” inside the IF function in this formula to complete this task?

  • An IF function.
  • An AND function. (CORRECT)
  • An OR function.

That’s correct! The AND function returns TRUE only when all three values exceed 1,000, so the IF function displays the “value if true” message of “target met” only in that case. If any value does not exceed 1,000, the message is not displayed.

20. In your worksheet, cell A2 contains the word Super. Cell B2 contains the word Cycles. You add the following formula to your worksheet:

=CONCAT(A2," ",B2," ","Inc.")

What is the result of this formula?

  • SuperCyclesInc
  • Super Cycles Inc. (CORRECT)
  • Super CyclesInc

That’s correct! The two space arguments are wrapped in double quotation marks so that Excel treats them as text, and the suffix “Inc.” is quoted in the same way.

21. True or False: A NOW function formula only generates a new time result every 24 hours.

  • True
  • False (CORRECT)

That’s correct! A NOW formula recalculates every time the worksheet recalculates, so its result updates each time to show the current date and time.

22. You need to create a formula in your worksheet which calculates the number of weekdays between two dates. Which one of the following functions can you use to complete this task?

  • DATEDIF
  • DATE
  • NETWORKDAYS (CORRECT)

That’s correct!

23. Cell A2 in your spreadsheet contains a value of 200. When executed, what message does the following formula display?

=IFS(A2=50,"First Message",A2=100,"Second Message",A2=150,"Third Message",TRUE,"No Message")

  • First Message
  • Second Message
  • Third Message
  • No Message (CORRECT)

That’s correct! Because A2 contains 200, the tests for 50, 100, and 150 all return FALSE. The final test, TRUE, always passes, so Excel returns the value paired with it, “No Message.”

CONCLUSION – Preparing data for analysis using functions

In conclusion, the functions covered in this module strengthen your ability to prepare data in Excel for thorough analysis, for instance in Power BI. They allow you to structure, format, and cleanse data so that datasets integrate easily and support more advanced analytical tasks. This streamlines data workflows and leads to faster, more accurate, and more actionable insights.
