Hypothesis testing is a critical tool for any data professional who needs to determine whether the results of a study or experiment are statistically significant or could have occurred by random chance. This guide walks through each step of a hypothesis test so that you can understand the process systematically.
Learning hypothesis testing gives data professionals a method for drawing logically valid conclusions from data. This course adds to your toolkit so that you can carry out hypothesis tests with confidence and make data-based contributions in your professional work.
Objectives
Use Python to conduct a hypothesis test.
Describe how to conduct a two-sample hypothesis test.
Describe how to conduct a one-sample hypothesis test.
Describe the difference between type I and type II errors.
Define key concepts related to hypothesis testing, such as p-value and significance level.
Explain the concept of statistical significance in the context of hypothesis testing.
Explain the difference between the null hypothesis and the alternative hypothesis.
PRACTICE QUIZ: TEST YOUR KNOWLEDGE: INTRODUCTION TO HYPOTHESIS TESTING
1. Fill in the blank: The _____ typically assumes that observed data does not occur by chance.
subjective hypothesis
alternative hypothesis (CORRECT)
null hypothesis
objective hypothesis
Correct: The alternative hypothesis typically assumes that observed data does not occur by chance. It is a statement that contradicts the null hypothesis and is accepted as true only if there is convincing evidence for it.
2. Which of the following statements describe significance level? Select all that apply.
Significance level is the threshold at which a result is considered to be due to chance.
Significance level is the probability of rejecting an alternative hypothesis when it is true.
Significance level is the threshold at which a result is considered statistically significant. (CORRECT)
Significance level is the probability of rejecting a null hypothesis when it is true. (CORRECT)
Correct: A level of significance is defined as a threshold for making a decision about whether a result is statistically significant. It can be thought of as a measure of the probability of rejecting the null hypothesis when, in fact, it is true.
3. What concept refers to the probability of observing results that are at least as extreme as those observed when the null hypothesis is true?
P-value (CORRECT)
Statistical significance
Confidence level
Z-score
Correct: The p-value is the probability of observing results at least as extreme as those actually observed, assuming that the null hypothesis is true.
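As a rough illustration of that definition, the sketch below turns a test statistic into a two-sided p-value. It assumes the statistic follows a standard normal distribution under the null hypothesis (a z-test); the helper name two_sided_p_value is hypothetical and not part of the course material.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Probability, under the null hypothesis, of a standard normal
    statistic at least as extreme as z in either direction."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# A z-score near 1.96 sits right at the conventional 5% threshold.
print(round(two_sided_p_value(1.96), 3))
```

A statistic of 0 gives a p-value of 1, because every possible result is at least as extreme as no difference at all.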
4. A data professional conducts a hypothesis test. They mistakenly conclude that their result is statistically significant when it actually occurred by chance. What type of error does this scenario describe?
Type I (CORRECT)
Type II
Type III
Type IV
Correct: This scenario is an example of a type I error. A type I error occurs when the null hypothesis is rejected even though it is actually true.
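The link between type I errors and the significance level can be checked with a small simulation. The sketch below is illustrative only: it assumes a z-test with a known standard deviation of 1, draws every sample from a population where the null hypothesis (mean of 0) really is true, and counts how often the test rejects anyway. The function name and default values are assumptions made for this example.

```python
import random
from statistics import NormalDist, fmean


def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Estimate how often a true null hypothesis is rejected at level alpha."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    false_positives = 0
    for _ in range(trials):
        # The null hypothesis is true here: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # z-statistic with known sigma = 1
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            false_positives += 1  # rejected a true null: a type I error
    return false_positives / trials


print(type_i_error_rate())
```

The estimated rate should land close to alpha, which is why the significance level is described as the probability of rejecting a null hypothesis that is true.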
PRACTICE QUIZ: TEST YOUR KNOWLEDGE: ONE-SAMPLE TESTS
1. In a one-sample hypothesis test, what does the null hypothesis state?
The population mean is not equal to an observed value.
The population mean is equal to an observed value. (CORRECT)
The population mean is greater than an observed value.
The population mean is less than an observed value.
Correct: A one-sample hypothesis test is one in which the null hypothesis claims that the mean of the population is equal to some specific value.
2. A data professional conducts a hypothesis test. They discover that their p-value is less than the significance level. What conclusion should they draw?
Reject the null hypothesis. (CORRECT)
Reject the alternative hypothesis.
Fail to reject the null hypothesis.
Decide the test is inconclusive.
Correct: To draw a conclusion from a hypothesis test, compare the p-value with the significance level. If the p-value is less than the significance level, reject the null hypothesis; if the p-value is greater than the significance level, fail to reject the null hypothesis.
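A minimal sketch of this decision rule for a one-sample test of the mean, assuming SciPy is installed. The sample values and the hypothesized mean of 5.0 are made up for illustration; scipy.stats.ttest_1samp() is a real SciPy function, but the course material above does not show it, so treat its use here as an assumption.

```python
from scipy import stats

# Hypothetical sample and hypothesized population mean (illustrative only).
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.5, 5.3, 5.7]
hypothesized_mean = 5.0
significance_level = 0.05

# Null hypothesis: the population mean equals hypothesized_mean.
result = stats.ttest_1samp(sample, popmean=hypothesized_mean)
p_value = float(result.pvalue)

if p_value < significance_level:
    decision = "reject the null hypothesis"
else:
    decision = "fail to reject the null hypothesis"

print(f"p-value = {p_value:.4f}: {decision}")
```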
PRACTICE QUIZ: TEST YOUR KNOWLEDGE: TWO-SAMPLE TESTS
1. What does a two-sample hypothesis test determine?
Whether a population parameter, such as a mean or proportion, is equal to a specific value
Whether a sample statistic, such as a mean or proportion, is equal to a specific value
Whether two population parameters, such as two means or two proportions, are equal (CORRECT)
Whether two sample statistics, such as two means or two proportions, are equal
Correct: A two-sample hypothesis test determines whether two population parameters, such as two means or two proportions, are equal.
2. What is the null hypothesis of a two-sample t-test?
The population mean is equal to an observed value
There is no difference between two population proportions
There is no difference between two population means (CORRECT)
The population proportion is equal to an observed value
Correct: The null hypothesis of a two-sample t-test states that there is no difference between the two population means.
PRACTICE QUIZ: TEST YOUR KNOWLEDGE: HYPOTHESIS TESTING WITH PYTHON
1. A data professional can use the Python function scipy.stats.ttest_ind() to compute the p-value for the two-sample t-test.
True (CORRECT)
False
Correct: A data professional can use the scipy.stats.ttest_ind() function to compute the p-value for a two-sample t-test. The p-value is the probability of observing a difference between sample means at least as large as the one observed, assuming that the null hypothesis is true.
2. What arguments of the Python function scipy.stats.ttest_ind(a, b, equal_var) refer to observations from the sample data? Select all that apply.
alpha
loc
a (CORRECT)
b (CORRECT)
Correct: In scipy.stats.ttest_ind(a, b, equal_var), a refers to the observations from the first sample, b refers to the observations from the second sample, and equal_var indicates whether the two populations are assumed to have equal variances. If equal_var is True, equal variances are assumed; if False, the variances are assumed to be unequal.
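The following sketch shows how those three arguments might be passed in practice, assuming SciPy is installed. The two lists are hypothetical data invented for this example, not values from the course.

```python
from scipy import stats

# Hypothetical observations for two independent samples (illustrative only).
a = [20.1, 22.3, 19.8, 21.5, 20.9, 22.0]
b = [24.2, 25.1, 23.8, 26.0, 24.7, 25.5]

# equal_var=True runs Student's t-test, which pools the two sample variances.
pooled = stats.ttest_ind(a, b, equal_var=True)

# equal_var=False runs Welch's t-test, which does not assume equal variances.
welch = stats.ttest_ind(a, b, equal_var=False)

print(f"equal_var=True:  t = {pooled.statistic:.3f}, p = {pooled.pvalue:.4g}")
print(f"equal_var=False: t = {welch.statistic:.3f}, p = {welch.pvalue:.4g}")
```

Because the two samples here have the same size, both settings give the same t-statistic and differ only in the degrees of freedom used for the p-value. With unequal sample sizes and unequal variances the two results can diverge more noticeably.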
MODULE 5 CHALLENGE
1. Which of the following statements accurately describes the null hypothesis? Select all that apply.
The null hypothesis typically assumes that observed data does not occur by chance.
The null hypothesis is accepted as true only if there is convincing evidence for it.
The null hypothesis is assumed to be true unless there is convincing evidence to the contrary. (CORRECT)
The null hypothesis typically assumes that observed data occurs by chance. (CORRECT)
2. What term describes the probability of rejecting the null hypothesis when it is true?
P-value
Confidence interval
Alternative hypothesis
Significance level (CORRECT)
3. A data professional conducts a hypothesis test. They fail to reject the null hypothesis. What statement best describes their conclusion?
Their significance level is greater than their p-value
Their confidence level is greater than their p-value
Their p-value is greater than their significance level. (CORRECT)
Their p-value is greater than their confidence level
4. A data professional conducts a hypothesis test. When they draw their conclusion, they commit a type I error. Which of the following statements describe their error? Select all that apply.
They fail to reject a null hypothesis that is actually false.
They conclude their result occurred by chance when in fact it is statistically significant.
They reject a null hypothesis that is actually true. (CORRECT)
They conclude their result is statistically significant when in fact it occurred by chance. (CORRECT)
5. A data professional at an emergency response center conducts a hypothesis test to identify optimal ambulance routes. They just found the p-value. What should they do next?
Choose the significance level
State the alternative hypothesis
State the null hypothesis
Reject or fail to reject the null hypothesis (CORRECT)
6. A data professional conducts a hypothesis test. They choose a significance level of 10%. They calculate a p-value of 12.4%. What conclusion should they draw?
Reject the alternative hypothesis.
Fail to reject the null hypothesis. (CORRECT)
Fail to reject the alternative hypothesis.
Reject the null hypothesis
7. A data professional is conducting a two-sample t-test. What does their alternative hypothesis state?
There is no difference between two population means.
There is a difference between two population proportions.
There is no difference between two population proportions.
There is a difference between two population means. (CORRECT)
8. A data professional conducts a hypothesis test to compare the mean annual sales of two different restaurants in the same restaurant chain. They write the following code:
Whether or not the population variance of the two samples is assumed to be equal (CORRECT)
Significance level
P-value
Observations from the first sample
9. Which of the following statements accurately describe the null hypothesis? Select all that apply.
The alternative hypothesis typically assumes that observed data occurs by chance.
The null hypothesis typically assumes that observed data does not occur by chance.
The null hypothesis typically assumes that observed data occurs by chance. (CORRECT)
The alternative hypothesis typically assumes that observed data does not occur by chance. (CORRECT)
10. To draw a conclusion about the null hypothesis, what two concepts are compared?
Confidence level and significance level
P-value and significance level (CORRECT)
P-value and alternative hypothesis
Alternative hypothesis and significance level
11. A data professional conducts a hypothesis test to compare the mean annual sales of two different restaurants in the same restaurant chain. They write the following code:
Whether or not the population variance of the two samples is assumed to be equal
Significance level
P-value
Observations from the first sample (CORRECT)
12. What is the term for the arbitrary threshold determining whether an observed difference between groups occurred by chance?
P-value
Maximum likelihood
Statistical significance (CORRECT)
Confidence level
13. A data professional conducts a hypothesis test. When they draw their conclusion, they fail to reject a null hypothesis, which is actually false. What type of error do they commit?
Type I
Type III
Type II (CORRECT)
Type IV
14. A data professional conducts a hypothesis test. They choose a significance level of 5%. They calculate a p-value of 3.3%. What conclusion should they draw?
Reject the alternative hypothesis.
Fail to reject the null hypothesis.
Reject the null hypothesis. (CORRECT)
Fail to reject the alternative hypothesis.
15. In a one-sample hypothesis test of the mean, what are the typical options for the alternative hypothesis? Select all that apply.
The population mean is equal to an observed value.
The population mean is greater than an observed value. (CORRECT)
The population mean is less than an observed value. (CORRECT)
The population mean is not equal to an observed value. (CORRECT)
16. A data professional conducts a hypothesis test. They choose a significance level of 1%. They calculate a p-value of 0.01%. What conclusion should they draw?
Fail to reject the null hypothesis.
Reject the alternative hypothesis.
Fail to reject the alternative hypothesis.
Reject the null hypothesis. (CORRECT)
17. A data professional is conducting a hypothesis test. Their null hypothesis states that there is no difference between two population proportions. What type of test are they conducting?
Two-sample z-test (CORRECT)
Two-sample t-test
One-sample z-test
One-sample t-test
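For reference, a two-sample z-test for proportions can be sketched with the Python standard library alone. This version assumes the common pooled-proportion form of the test and samples large enough for the normal approximation to hold; the function name and the counts in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test of the null hypothesis that two population
    proportions are equal, using the pooled proportion."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 60 of 100 successes versus 40 of 100.
z, p_value = two_proportion_z_test(60, 100, 40, 100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```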
18. What does the concept of p-value refer to?
The probability of observing results as or more extreme than those observed when the null hypothesis is true (CORRECT)
The probability of observing results less extreme than those observed when the null hypothesis is true
The probability of rejecting the null hypothesis when it is false
The probability of rejecting the null hypothesis when it is true
19. When would a data professional reject the null hypothesis?
When their test statistic is less than their p-value
When their significance level is less than their p-value
When their p-value is less than their test statistic
When their p-value is less than their significance level (CORRECT)
20. A data professional on a marketing team conducts a hypothesis test to compare the mean time customers spend on two different versions of a company’s website. To start, they state the null hypothesis and the alternative hypothesis. What should they do next?
Reject or fail to reject the null hypothesis
Find the margin of error
Choose a significance level (CORRECT)
Find the p-value
21. A data professional conducts a hypothesis test to compare the mean annual sales of two different restaurants in the same restaurant chain. They write the following code:
Whether or not the population variance of the two samples is assumed to be equal
Significance level
22. A data professional conducts a hypothesis test. When they draw their conclusion, they commit a type II error. Which of the following statements accurately describe this scenario? Select all that apply.
They have made an error known as a false positive.
They have made an error known as a false negative. (CORRECT)
They have failed to reject a null hypothesis, which is actually false. (CORRECT)
They concluded their result occurred by chance, but it was actually statistically significant. (CORRECT)
23. A data analytics team in the landscaping industry conducts a hypothesis test to compare the effects of certain fertilizers on flower production. To start, they state the null hypothesis and the alternative hypothesis. Then they choose a significance level. What should they do next?
Reject or fail to reject the null hypothesis
Find the p-value (CORRECT)
Select the sample data
Identify the confirmed assumption
24. What type of hypothesis typically assumes that observed data does not occur by chance?
Type II
Alternative (CORRECT)
Null
Type I
25. The null hypothesis is a statement that is assumed to be true unless there is convincing evidence to the contrary. The null hypothesis typically assumes that observed data occurs by chance.
True (CORRECT)
False
Correct: The null hypothesis is the default statement, assumed to be true unless there is convincing evidence to the contrary. It typically assumes that the observed data occurred by chance, that is, that there is no effect or no difference.
26. What is the first step when conducting a hypothesis test?
Find the p-value
Reject or fail to reject the null hypothesis
Choose a significance level
State the null hypothesis and the alternative hypothesis (CORRECT)
Correct: The first part of a hypothesis test is to form the null hypothesis and the alternative hypothesis. The next three steps include choosing the significance level, calculating the p-value, and either rejecting or not rejecting the null hypothesis based on comparison of the p-value and the significance level.
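The four steps described above can be laid out end to end in code. The sketch below is one possible arrangement, assuming SciPy is installed and a two-sample test of means with unequal variances; the function name run_two_sample_test and the returned dictionary layout are hypothetical choices for this example.

```python
from scipy import stats


def run_two_sample_test(sample_a, sample_b, alpha=0.05):
    """Walk through the four steps for a two-sample t-test of means."""
    # Step 1: state the null hypothesis and the alternative hypothesis.
    null_hypothesis = "There is no difference between the two population means."
    alternative_hypothesis = "There is a difference between the two population means."

    # Step 2: choose a significance level (alpha is supplied by the caller).

    # Step 3: find the p-value.
    p_value = float(stats.ttest_ind(sample_a, sample_b, equal_var=False).pvalue)

    # Step 4: reject or fail to reject the null hypothesis.
    decision = "reject" if p_value < alpha else "fail to reject"

    return {
        "null": null_hypothesis,
        "alternative": alternative_hypothesis,
        "alpha": alpha,
        "p_value": p_value,
        "decision": decision,
    }


# Hypothetical samples whose means are clearly different.
outcome = run_two_sample_test([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
print(outcome["decision"], "the null hypothesis")
```

Fixing alpha before looking at the data keeps the threshold from being adjusted to fit the p-value after the fact.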
27. To compare two population means, a data professional uses a one-sample hypothesis test.
True
False (CORRECT)
Correct: A one-sample hypothesis test determines whether a population mean is equal to a specific value. To compare two population means, a data professional uses a two-sample hypothesis test.
CONCLUSION to Introduction to Hypothesis Testing
This course marks an essential step in learning hypothesis testing, giving data professionals a solid grounding in its principles and application. Learners who know the basic steps of a hypothesis test can assess whether an experimental result is significant, that is, distinguish meaningful results from random effects. This strengthens their ability to draw sound conclusions from data and supports more effective decision-making.
The skills learned here are valuable additions to a data professional's analytical toolkit and support evidence-based solutions to business problems. The course is an important milestone in professional development, providing the understanding and practice needed to carry out hypothesis tests successfully.