class: center, middle, inverse, title-slide .title[ # BANL 6100: Business Analytics ] .subtitle[ ## Hypothesis Testing ] .author[ ### Mehmet Balcilar
mbalcilar@newhaven.edu
] .institute[ ### University of New Haven ] .date[ ### 2023-09-28 (updated: 2024-11-07) ] --- exclude: true --- class: center, middle, sydney-blue <!-- Custom css --> <!-- From xaringancolor --> <div style = "position:fixed; visibility: hidden"> $$ \require{color} \definecolor{purple}{rgb}{0.337254901960784, 0.00392156862745098, 0.643137254901961} \definecolor{navy}{rgb}{0.0509803921568627, 0.23921568627451, 0.337254901960784} \definecolor{ruby}{rgb}{0.603921568627451, 0.145098039215686, 0.0823529411764706} \definecolor{alice}{rgb}{0.0627450980392157, 0.470588235294118, 0.584313725490196} \definecolor{daisy}{rgb}{0.92156862745098, 0.788235294117647, 0.266666666666667} \definecolor{coral}{rgb}{0.949019607843137, 0.427450980392157, 0.129411764705882} \definecolor{kelly}{rgb}{0.509803921568627, 0.576470588235294, 0.337254901960784} \definecolor{jet}{rgb}{0.0745098039215686, 0.0823529411764706, 0.0862745098039216} \definecolor{asher}{rgb}{0.333333333333333, 0.372549019607843, 0.380392156862745} \definecolor{slate}{rgb}{0.192156862745098, 0.309803921568627, 0.309803921568627} \definecolor{cranberry}{rgb}{0.901960784313726, 0.254901960784314, 0.450980392156863} \definecolor{hi}{rgb}{0.984313725490196, 0.12549019607843137, 0.12549019607843137} $$ </div> <script type="text/x-mathjax-config"> MathJax.Hub.Config({ TeX: { Macros: { purple: ["{\\color{purple}{#1}}", 1], navy: ["{\\color{navy}{#1}}", 1], ruby: ["{\\color{ruby}{#1}}", 1], alice: ["{\\color{alice}{#1}}", 1], daisy: ["{\\color{daisy}{#1}}", 1], coral: ["{\\color{coral}{#1}}", 1], kelly: ["{\\color{kelly}{#1}}", 1], jet: ["{\\color{jet}{#1}}", 1], asher: ["{\\color{asher}{#1}}", 1], slate: ["{\\color{slate}{#1}}", 1], cranberry: ["{\\color{cranberry}{#1}}", 1], hi: ["{\\color{hi}{#1}}", 1] }, loader: {load: ['[tex]/color']}, tex: {packages: {'[+]': ['color']}} } }); </script> <style> .purple {color: #5601A4;} .navy {color: #0D3D56;} .ruby {color: #9A2515;} .alice {color: #107895;} .daisy {color: #EBC944;} .coral
{color: #F26D21;} .kelly {color: #829356;} .jet {color: #131516;} .asher {color: #555F61;} .slate {color: #314F4F;} .cranberry {color: #E64173;} .hi {color: #FB2020;} </style> # Testing for Significance --- ## Motivation Last time, we were introduced to the concept of .hi[confidence intervals]. -- Given how fragile a .b[point] estimator is, producing inferences about a population parameter through an .b[interval] allows for a more flexible interpretation of a statistic of interest. -- <br> Now, we move on to a .hi[second] inferential approach: - .hi-green[Hypothesis testing] -- This procedure determines whether there is .red[*enough statistical evidence*] to support a belief/hypothesis about a parameter of interest. --- layout: false class: center, middle, sydney-blue # A *nonstatistical* application of Hypothesis Testing --- ## A *nonstatistical* application of Hypothesis Testing <br><br> When a person is accused of a crime, they face a .hi-green[trial]. -- The prosecution presents the case, and a jury must make a decision, .hi-blue[based on the evidence] presented. -- <br> In fact, what the jury conducts is a test of .b[different hypotheses]: -- - .red[*Prior hypothesis*]: the defendant is .hi-slate[not guilty]. - .red[*Alternative hypothesis*]: the defendant is .hi-slate[guilty]. --- ## A *nonstatistical* application of Hypothesis Testing <br> The jury .hi[does not] know which hypothesis is correct. -- Their decision must be based on the .hi-green[evidence] presented by both the prosecution and the defense. -- In the end, there are only two possible decisions: - .b[Convict] or - .b[Acquit]. --- layout: false class: center, middle, sydney-blue # Back to Statistics --- ## Statistical hypothesis testing The *same reasoning* follows for Statistics: -- - The .b[prior hypothesis] is called the .hi[null hypothesis] (*H<sub>0</sub>*); - The .b[alternative hypothesis] is called the research or .hi[alternative hypothesis] (*H<sub>1</sub>* or *H<sub>a</sub>*).
-- Putting the *trial* example in statistical notation: - *H<sub>0</sub>*: the defendant is .b[not guilty]. - *H<sub>1</sub>*: the defendant is .b[guilty]. -- The hypothesis that the defendant is guilty (*H<sub>1</sub>*) is what we are .hi[actually] testing, since any defendant enters the trial presumed .red[*innocent*] until proven otherwise. -- - That is why this is our .hi-blue[alternative] hypothesis! --- ## Remarks - The testing procedure begins with the assumption that the null hypothesis (*H<sub>0</sub>*) is .hi-blue[true]; - The goal is to determine whether there is enough evidence to infer that the alternative hypothesis (*H<sub>1</sub>*) is true. --- layout: false class: center, middle, sydney-blue # Stating hypotheses --- ## Stating hypotheses The .hi[first step] when doing hypothesis testing is to .hi-blue[state] the *null* and *alternative* hypotheses, *H<sub>0</sub>* and *H<sub>1</sub>*, respectively. -- Let us practice with an .b[example]. -- Recall the inventory example from the last lecture. -- Now, suppose the manager does not want to estimate the exact (or closest) mean inventory level (*μ*), but rather test whether this value is .hi[different from] 350 computers. -- - Is there enough evidence to conclude that *μ* is .hi-green[not equal to 350 computers]? -- As an important .hi-blue[first note], Hypothesis Testing always tests values for .hi[population parameters]. -- Then, the next step is to identify .hi-orange[which] population parameter the problem at hand refers to. --- ## Stating hypotheses <br><br><br> Now, consider the following .hi-slate[change] in the research question for this example: -- - Is there enough evidence to conclude that *μ* is greater than 350? --- ## Example In hypothesis testing, we proceed under the assumption that the null hypothesis is true and then try to disprove it. Assume that we have 10 months of Spotify returns.
For example, we could claim that the average Spotify stock return is 0%, against the alternative that it is different from zero: `$$\hi{H_0}: \mu=0$$` `$$\ruby{H_1}: \mu \ne 0$$` Since the data are drawn from a normal distribution with `\(\sigma^2 = 1\)`, under *H<sub>0</sub>* the sample mean is distributed `\(\bar{x} \sim N(\hi{0}, \frac{1}{10})\)`. -- <br/> <center> <div class="hi">Important:</div> Note that we take the hypothesis that \(\hi{\mathbf{\mu = 0}}\) as true! </center> <br/> Now we can use the sampling distribution to assess how unlikely our sample is under this claim. --- ## Example <img src="11-Hypothesis-Testing_files/figure-html/unnamed-chunk-1-1.png" width="75%" style="display: block; margin: auto;" /> --- layout: false class: center, middle, sydney-blue # The *z* test --- ## The *z* test After the hypotheses are properly stated, what do we do? -- As a simplifying assumption, we will continue to assume that the population standard deviation (*σ*) is known, while *μ* is not. -- - We will .hi-green[relax] this assumption soon. -- ### Another example: A manager is considering establishing a new billing system for customers. After some analyses, they determined that the new system will be cost-effective only if the mean monthly account is more than US$ 170.00. A random sample of 400 monthly accounts is drawn, for which the sample mean is US$ 178.00. The manager assumes that these accounts are normally distributed, with a standard deviation of US$ 65.00. Can the manager conclude from this that the new system will be cost-effective? Also, they assume a confidence level of 95%. --- ## Another example: We would like to see if we can reject the null hypothesis that the mean monthly account is US$ 170.00 in favor of the claim that it is greater than US$ 170.00: `$$\hi{H_0}: \mu=170$$` `$$\ruby{H_1}: \mu > 170$$` --- ## The *z* test After stating the null and alternative hypotheses, we need to calculate a .hi[test statistic].
-- Recall the .hi-blue[standardization] method for a sample statistic: $$ `\begin{aligned} z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} \end{aligned}` $$ -- <br> For hypothesis testing purposes, the above is also known as a .hi-slate[*z* test]. --- ## The *z* test After obtaining the *z* value, let us now make use of the confidence level (1 − *α*) of 95% assumed by the manager. -- This value establishes a .hi-slate[threshold (critical) value] in a Standard Normal curve: <img src="11-Hypothesis-Testing_files/figure-html/unnamed-chunk-2-1.svg" style="display: block; margin: auto;" /> --- ## The *z* test <br> The .hi[shaded area] is called the .b[rejection region]. -- If a *z* statistic falls .hi[within] the rejection region, our inference is to .hi-blue[reject the null hypothesis]. -- In case the *z* value falls .hi[outside] this region, then we .hi-blue[do not reject] the null hypothesis. -- - So what is our decision from the example? --- layout: false class: center, middle, sydney-blue # The critical value method --- ## The critical value method What is the critical value? -- Given `\(\alpha=0.05\)`:

``` r
z_crit <- qnorm(0.05, mean = 0, sd = 1, lower.tail = FALSE)
z_crit
```

```
## [1] 1.644854
```

-- Calculate the z-score:

``` r
mu <- 170
sigma <- 65
n <- 400
xbar <- 178
z <- (xbar - mu)/(sigma/sqrt(n))
z
```

```
## [1] 2.461538
```

-- Compare:

``` r
reject_H0 <- z > z_crit
reject_H0
```

```
## [1] TRUE
```

-- Since *z* = 2.46 falls in the rejection region, we .hi-blue[reject the null hypothesis] that the mean monthly account is US$ 170.00, in favor of the alternative that it is greater. --- layout: false class: center, middle, sydney-blue # The p-value method --- ## The p-value method <br> We may also produce inferences using .hi[p-values] instead of critical values. -- > The *p-value* of a statistical test is the probability of observing a test statistic .red[*at least as extreme*] as the one which has been computed, .red[given that *H<sub>0</sub>* is true]. -- <br> - What is the p-value in our example?
``` r
1 - pnorm(q = 2.46, mean = 0, sd = 1)
```

```
## [1] 0.006946851
```

--- ## The p-value method A p-value of .0069 implies that there is a .69% probability of observing a sample mean at least as large as US$ 178 when the population mean is US$ 170. -- In other words, such a sample mean would be quite unlikely if the null were true — evidence in favor of the alternative. -- However, such interpretation is .b[almost never used] in practice when considering p-values. -- <br> Instead, it is .red[*more convenient*] to compare p-values with the test's significance level (*α*): - If the p-value is .hi-blue[less] than the significance level, we .hi-slate[reject the null hypothesis]; - If the p-value is .hi-blue[greater] than the significance level, we .hi-slate[do not reject the null hypothesis]. --- ## The p-value method Consider, for example, a p-value of .b[.001]. -- This number says that we would fail to reject the null only if the significance level were set .hi[below] .001. -- - Therefore, it would take an unusually strict significance level .hi-blue[not] to reject the null hypothesis in such a situation. -- We can consider the following .hi-blue[ranges] and .hi-blue[significance] features for p-values: - `\(p < .01\)`: .b[highly significant] test, *overwhelming* evidence to infer that *H<sub>1</sub>* is true; - `\(.01 \leq p \leq .05\)`: .b[significant] test, *strong* evidence to infer that *H<sub>1</sub>* is true; - `\(.05 < p \leq .10\)`: .b[weakly significant] test; - `\(p> .10\)`: .b[little or no] evidence that *H<sub>1</sub>* is true. --- layout: false class: center, middle, sydney-blue # One- and two-tailed tests --- ## One- and two-tailed tests <br> As you may have already noticed, the typical Normal distribution density curve has two “tails.” -- Depending on the sign present in the alternative hypothesis (*H<sub>1</sub>*), our test may have .hi[one or two] rejection regions.
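-- As a quick sketch in R (the level `\(\alpha = 0.05\)` is an assumption for illustration), the critical value depends on whether all of `\(\alpha\)` sits in one tail or is split across both:

``` r
alpha <- 0.05

# One rejection region: all of alpha in a single tail
qnorm(alpha, lower.tail = FALSE)
## [1] 1.644854

# Two rejection regions: alpha split evenly across both tails
qnorm(alpha / 2, lower.tail = FALSE)
## [1] 1.959964
```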
-- Whenever the sign in the alternative hypothesis is either “<” or “>,” we have a .b[one-tailed test]. -- - For the former case, the rejection region lies on the *left* tail of the bell curve, whereas for the latter, the rejection region is located on the *right* tail. --- ## One- and two-tailed tests <br><br> A two-tailed test will take place whenever the .red[*not equal to*] sign `\((\neq)\)` is present in the alternative hypothesis. - This happens because, with this sign, the test statistic may fall in either the *right* or the *left* tail. -- For a two-tailed test, we simply divide the significance level (*α*) by 2. - Just as with .hi-green[confidence intervals]! --- ## Example Back to the Spotify example: Let's say we claim there is no return on average. We would like to see if we can reject this hypothesis in favor of the claim that there is a positive return: `$$\alice{H_0}: \mu=0$$` `$$\ruby{H_1}: \mu > 0$$` .ex[Clicker Question:] What type of hypothesis test is this? <ol type = "a"> <li>alternative</li> <li>null</li> <li>one-sided</li> <li>two-sided</li> </ol> --- ## Continuing Example So in the Spotify example, we had 10 observations with a sample mean, `\(\bar{x}=0.3\)`. We also somehow know that `\(\sigma^2 = 1\)`. If we want to test the hypothesis `$$\alice{H_0}: \mu=0$$` `$$\ruby{H_1}: \mu > 0$$` <br> We want to calculate the probability of getting an `\(\bar{x}\)` as extreme as 0.3 if the true population `\(\mu=0\)`. Is 0.3 rare, or is it quite common under the sampling distribution? This means we want to calculate `\(P(\bar{x}>0.3)\)` if `\(\bar{x} \sim N(0,\frac{1}{10})\)`.
We will write that as: $$ P(\bar{x} > 0.3 \ \vert\ \mu = 0) $$ --- class: clear, middle <img src="11-Hypothesis-Testing_files/figure-html/p-value-1.png" width="70%" style="display: block; margin: auto;" /> --- ## Continuing Example We want to calculate `\(P(\bar{x}>0.3)\)` if `\(\bar{x} \sim N(0,\frac{1}{10})\)` -- `$$P(\bar{x}>0.3) = P\left(\frac{\bar{x}-\mu}{\sigma/\sqrt{n}} > \frac{0.3- \alice{0}}{1/\sqrt{10}}\right) = P(z>0.949) = 0.1714$$`

``` r
pnorm(q = 0.3, mean = 0, sd = 1/sqrt(10), lower.tail = FALSE)
```

```
## [1] 0.1713909
```

``` r
# or
pnorm(0.3/(1/sqrt(10)), mean = 0, sd = 1, lower.tail = FALSE)
```

```
## [1] 0.1713909
```

This says that if the *true* population mean is `\(\alice{\mu = 0}\)`, we would expect to see a sample mean greater than 0.3 about 17% of the time. - This is not so rare (~ 1 out of 5), so this isn't strong evidence against the null. - It doesn't prove the null, but it does not rule it out. --- layout: false class: center, middle, sydney-blue # Inference about the mean when `\(\sigma\)` is unknown --- ## Inference when `\(\sigma\)` is unknown <br><br> So far, we have assumed that the .hi[population standard deviation] (*σ*) was known when computing confidence intervals and performing hypothesis tests. -- This assumption, however, is .hi-blue[unrealistic]. -- Now, we .hi-green[relax] this assumption and move on using a very similar approach. --- layout: false class: inverse, middle # The Student *t* distribution --- ## The Student *t* distribution Recall the formula for the *z* test: $$ `\begin{aligned} z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} \end{aligned}` $$ -- Now that the population standard deviation is *unknown*, what is the .hi-blue[best move]? -- <br><br> .right[Replace it by its .hi-green[sample estimator], *s*!] -- $$ `\begin{aligned} t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} \end{aligned}` $$ --- ## The Student *t* distribution Now, we do not know the population standard deviation anymore.
-- However, we still assume that the .hi-blue[population] itself is normally distributed. -- .pull-left[ In 1908, William S. Gosset came up with the .hi[Student *t*] distribution, whose mean and variance are - `\(E(X) = 0\)` - `\(\text{Var}(X) = \dfrac{\nu}{\nu - 2}\)` (for `\(\nu > 2\)`), where `\(\nu = n - 1\)`. ] .pull-right[ <img src="images/lecture11/gosset.jpeg" width="50%"> ] --- ## The Student *t* distribution The *t* distribution has a ".red[*mound-shaped*]" density curve, while the Normal's is bell-shaped. <img src="11-Hypothesis-Testing_files/figure-html/unnamed-chunk-8-1.svg" style="display: block; margin: auto;" /> --- ## The Student *t* distribution Its only parameter is `\(\nu\)`, the .hi-blue[degrees of freedom] of the distribution: $$ `\begin{aligned} X \sim t(\nu) \end{aligned}` $$ -- The larger the value of `\(\nu\)`, the more .hi-green[similar] the *t* distribution is to the Normal. -- <img src="11-Hypothesis-Testing_files/figure-html/unnamed-chunk-9-1.svg" style="display: block; margin: auto;" /> --- ## The Student *t* distribution When *σ* is unknown, we can define the .hi[confidence interval] for the population mean (*μ*) as: $$ `\begin{aligned} \bar{x} \pm t_{\alpha/2, \ \nu} \bigg(\dfrac{s}{\sqrt{n}}\bigg) \end{aligned}` $$ -- <br> Notice that, in addition to the significance level `\((\alpha)\)`, we also must take the number of degrees of freedom `\((\nu = n-1)\)` into account to find the .hi-blue[critical value] `\(t_{\alpha/2, \ \nu}\)`. -- <br> Furthermore, for .hi[hypothesis testing], the .red[*t-statistic*] is obtained with: $$ `\begin{aligned} t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} \end{aligned}` $$ --- ## The Student *t* distribution We .hi[cannot] use the Normal sampling distribution of `\(\bar{x}\)` anymore, since the population standard deviation is unknown. -- Treating *s* as if it were *σ* would only be justified with an .hi-green[infinitely large] sample, which is almost never the case in practice.
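-- A quick numerical check in R (the degrees-of-freedom values below are chosen just for illustration) shows the *t* critical values approaching the Normal's as `\(\nu\)` grows:

``` r
# 97.5th percentile of t for increasing degrees of freedom
sapply(c(5, 10, 30, 100, 1000), function(nu) qt(0.975, df = nu))
## [1] 2.570582 2.228139 2.042272 1.983972 1.962339

# ...compared with the Normal's 97.5th percentile
qnorm(0.975)
## [1] 1.959964
```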
-- For .hi-blue[smaller] sample sizes, the *t* distribution is extremely useful, and its shape is conditional on this sample size. -- <br> .right[But why `\(n-1\)` degrees of freedom?] --- ## The Student *t* distribution An example: Assuming that it can be profitable to recycle newspapers, a company’s financial analyst has computed that the firm would make a profit if the mean weekly newspaper collection from each household exceeded 2 lbs. His study collected data from a sample of 148 households. The calculated sample average weight was 2.18 lbs. Do these data provide sufficient evidence to allow the analyst to conclude that a recycling plant would be profitable? Assume a significance level of 1%, and a sample variance of .962 lbs<sup>2</sup>. .hi-slate[Solution:]

``` r
xbar <- 2.18
mu <- 2
s <- sqrt(0.962)
n <- 148
df <- n - 1
alpha <- 0.01
t <- (xbar - mu)/(s/sqrt(n))
1 - pt(q = t, df = df)
```

```
## [1] 0.01354263
```

``` r
(1 - pt(q = t, df = df)) < alpha
```

```
## [1] FALSE
```

Since the p-value (≈ .0135) is .hi[not] below `\(\alpha = .01\)`, we .hi-blue[do not reject] the null hypothesis: at the 1% level, the evidence is not strong enough to conclude the plant would be profitable. --- ## When to Use Which Distribution If we have a simple random sample, Normal `\(X\)`, and `\(\sigma\)` known: - Use the `\(z\)` distribution If we have a simple random sample, Normal `\(X\)`, and `\(\sigma\)` unknown: - Use the `\(t\)` distribution .purple[What if we don't know that X is normal?] - `\(n<15 \implies\)` only use `\(t\)` if `\(X\)` looks very normally distributed. - `\(15 \leq n \leq 30 \implies\)` use `\(t\)` as long as there are no extreme outliers - `\(n > 30 \implies\)` probably okay using `\(z\)` Recall the .b[Central Limit Theorem], which says that as `\(n \to \infty\)`: $$ \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0,1) $$ --- exclude: true
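--- ## A *t*-based interval in R As a closing sketch, we can reuse the recycling example's numbers to build a confidence interval for `\(\mu\)` when `\(\sigma\)` is unknown (the 95% level here is our own choice for illustration; the example itself used `\(\alpha = .01\)`):

``` r
xbar <- 2.18          # sample mean (lbs)
s    <- sqrt(0.962)   # sample standard deviation
n    <- 148

# 95% t-based interval: xbar +/- t_{alpha/2, n-1} * s/sqrt(n)
xbar + c(-1, 1) * qt(0.975, df = n - 1) * s / sqrt(n)
```

The interval lies entirely above 2 lbs, which mirrors the test's conclusion at the 5% (though not the 1%) significance level.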