Theory 1: Statistical Hypothesis And Random Variables

1. A statistical hypothesis is (T/F for each option):
- a random variable
- a pivotal quantity
- an assertion about the unknown distribution of one or more random variables

2. Given an (absolutely) continuous random variable X ~ F(x), with F(x) unknown, let X1, ..., Xn be a sample from X and Fn(x) the empirical distribution function. The Kolmogorov-Smirnov test is used to verify (T/F for each option):
- the hypothesis H0: X ~ F0 vs Ha: H0 is false, where F0(x) is the hypothesized distribution for X
- the hypothesis H0: X has the Kolmogorov distribution vs Ha: X has the Smirnov distribution
- the hypothesis H0: X ~ F0 vs Ha: H0 is false, using the test statistic sup_{x in R} |Fn(x) - F0(x)|

3. In a hypothesis test the p-value is (T/F for each option):
- the significance 1 - γ
- the value of the test statistic
- the probability of rejecting the true hypothesis
- the smallest significance with which you could reject H0 when it is true

4. In a hypothesis test the significance level α is (T/F for each option):
- the value of the test statistic
- the probability with which you could reject H0 when H0 is true
- the probability with which you could accept H0 when H0 is false

5. For a given experiment only s incompatible results Si (i = 1, ..., s, with s >= 2) are possible, each with hypothesized probability πi = P[Si], so that π1 + ... + πs = 1. Repeating the experiment n times, the result Si will be observed Ni times; obviously N1 + ... + Ns = n. Let ni be the observed value of Ni and consider the statistic X² = Σ_{i=1..s} (Ni − n·πi)² / (n·πi), which is approximately chi-squared with s − 1 degrees of freedom. Answer (T/F for each option):
- the values ni represent the observed frequencies of the result Si
- the statistic X² is approximately normal when n is big
- the condition n·πi >= 5 (i = 1, ..., s) is sufficient to ensure the chi-squared approximation holds

6. Given two simple hypothesis tests T1 and T2 to decide between the null hypothesis H0 and the alternative Ha, with error probabilities of the 1st and 2nd kind equal to α1, β1 for T1 and α2, β2 for T2, T1 will be preferable to T2 if (T/F for each option):
- α1 < α2 and β1 < β2
- α1 = α2 and β1 < β2
- [remaining options not recoverable from the source]

Table of the random variable K (Kolmogorov critical values):
α: 0.20, 0.15, 0.10, 0.05, 0.01
d(α): 1.07/√n, 1.14/√n, 1.22/√n, 1.36/√n, 1.63/√n
For n > 35 the critical value d(α) is obtained by dividing the numerator of the corresponding fraction in the last line by √n.

[Partial table of the t-Student distribution: rows df = n, columns t(.995), t(.99), t(.975), t(.95), t(.9); the numerical entries are not recoverable from the source.]

Paper Addressing the Questions Above

This comprehensive analysis focuses on the core principles of statistical hypothesis testing, the properties of random variables, and model comparisons in scientific experiments. It integrates theoretical foundations with practical applications, employing key statistical tests such as the Kolmogorov-Smirnov test, chi-squared test, and F-test, alongside scenario-based decision-making under uncertainty. Through detailed exploration of hypothesis formulation, test statistic derivation, significance level interpretation, and power analysis, this paper aims to elucidate the critical role of statistical reasoning in experimental validation and decision processes across scientific and industrial contexts.

Introduction

Statistical hypothesis testing is a fundamental component of inferential statistics, providing a basis to evaluate claims about populations based on sample data. It entails formulating null and alternative hypotheses, computing appropriate test statistics, and determining decision rules based on significance levels and p-values. The purpose of this paper is to explore various hypothesis testing scenarios, the application of nonparametric tests like Kolmogorov-Smirnov, and parametric tests such as t-tests, chi-squared tests, and F-tests, focusing on their methodologies, assumptions, and interpretations.

1. Foundations of Hypothesis Testing and Random Variables

A statistical hypothesis is an assertion about the distribution of one or more random variables. It can be either parametric or nonparametric, depending on whether assumptions are made about distributional forms. Random variables are the building blocks of hypothesis testing; their distributions, whether known or unknown, form the basis for constructing test statistics and significance tests.

2. Nonparametric Tests and the Kolmogorov–Smirnov Test

When the distribution of a sample is unknown, nonparametric tests such as the Kolmogorov-Smirnov (K-S) test are employed. The K-S test compares the empirical distribution function, Fn(x), with a hypothesized distribution function, F0(x); its test statistic is the supremum of the absolute difference between the two, sup |Fn(x) − F0(x)|. The null hypothesis H0 posits that the sample distribution matches F0(x), while the alternative Ha indicates a discrepancy.

The critical aspect of the K-S test involves the calculation of this supremum value and comparing it to tabulated values based on sample size and significance level. If the test statistic exceeds the critical value, the null hypothesis is rejected, indicating evidence against the assumed distribution.
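The comparison described above can be sketched in a few lines with scipy; the sample and the hypothesized N(0, 1) distribution here are illustrative assumptions, not data from the source.

```python
# Minimal sketch: K-S test of a sample against a hypothesized N(0, 1) distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=100)   # invented sample for illustration

# D = sup_x |Fn(x) - F0(x)|; scipy returns D together with the associated p-value.
result = stats.kstest(sample, "norm")
print(f"D = {result.statistic:.4f}, p-value = {result.pvalue:.4f}")

# Reject H0 at the 5% level exactly when the p-value falls below 0.05,
# equivalently when D exceeds the tabulated critical value.
print("Reject H0:", result.pvalue < 0.05)
```

The same decision could be made by comparing D directly with the critical values of the K table quoted earlier.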

3. Significance, p-values, and Error Types

In hypothesis testing, the significance level α is the probability of a Type I error, the risk of wrongly rejecting a true null hypothesis. The Type II error probability β is the probability of failing to reject a false null hypothesis, and the power of a test is its complement, 1 − β. The p-value is the smallest significance level at which the null hypothesis would be rejected given the observed data.
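The duality between the p-value and the significance level can be illustrated with a one-sided z-test; all numbers below (μ0, σ, n, the sample mean) are invented for illustration.

```python
# Sketch: the p-value as the smallest significance level at which H0 is rejected.
# One-sided z-test for H0: mu = 50 vs Ha: mu > 50 with known sigma (numbers invented).
from math import sqrt
from scipy.stats import norm

mu0, sigma, n, xbar = 50.0, 2.0, 25, 50.9
z = (xbar - mu0) / (sigma / sqrt(n))      # observed test statistic
p_value = 1.0 - norm.cdf(z)               # P[Z >= z] under H0

print(f"z = {z:.3f}, p-value = {p_value:.4f}")

# Duality: H0 is rejected at level alpha exactly when p_value < alpha.
for alpha in (0.01, 0.05, 0.10):
    critical = norm.ppf(1.0 - alpha)
    assert (z > critical) == (p_value < alpha)
```

Here z = 2.25 gives a p-value of about 0.012, so H0 is rejected at α = 0.05 but not at α = 0.01.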

4. Chi-Squared Test for Goodness-of-Fit

The chi-squared test assesses whether observed frequencies differ from expected frequencies under a specified hypothesis. This test is applicable when categorical data are involved, with the test statistic following a chi-squared distribution with s-1 degrees of freedom, where s is the number of categories. When sample sizes are large, the chi-squared distribution provides a reliable approximation for the distribution of the test statistic.
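A minimal sketch of this goodness-of-fit procedure, using invented counts for a fair-die hypothesis (s = 6 categories, so s − 1 = 5 degrees of freedom):

```python
# Sketch: chi-squared goodness-of-fit for H0: fair die (observed counts invented).
import numpy as np
from scipy.stats import chisquare, chi2

observed = np.array([18, 22, 16, 25, 20, 19])   # n_i, with n = 120
n = observed.sum()
expected = np.full(6, n / 6)                    # n * pi_i under H0

stat, p_value = chisquare(observed, f_exp=expected)
critical = chi2.ppf(0.95, df=5)                 # 5% critical value, s - 1 = 5 df
print(f"chi2 = {stat:.3f}, p-value = {p_value:.4f}, critical = {critical:.3f}")

# Equivalent manual computation of the statistic.
manual = float(((observed - expected) ** 2 / expected).sum())
assert abs(stat - manual) < 1e-12
```

With these counts the statistic is 2.5, well below the 5% critical value, so the fair-die hypothesis would not be rejected.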

5. Comparing Two Hypotheses: Probabilities and Errors

Deciding between two competing hypotheses involves evaluating error probabilities α and β. The hypotheses' effectiveness depends on the balance between these errors, often influenced by the choice of test significance levels and sample sizes. The decision criterion may involve comparing error probabilities and optimizing test parameters to maximize reliability.
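The trade-off between the two error probabilities can be made concrete by computing α and β for a simple one-sided test of the mean; every numeric value below (μ0, μa, σ, n, α) is an assumption chosen for illustration.

```python
# Sketch: computing alpha and beta for a one-sided z-test, so two candidate
# tests can be compared by their error probabilities (numbers invented).
from math import sqrt
from scipy.stats import norm

mu0, mu_a, sigma, n = 0.0, 1.0, 2.0, 16
alpha = 0.05

# Reject H0: mu = mu0 when xbar > c, with c chosen so that P[Type I error] = alpha.
c = mu0 + norm.ppf(1.0 - alpha) * sigma / sqrt(n)

# beta = P[accept H0 | mu = mu_a] = P[xbar <= c | mu = mu_a].
beta = norm.cdf((c - mu_a) / (sigma / sqrt(n)))
print(f"c = {c:.3f}, alpha = {alpha}, beta = {beta:.4f}, power = {1 - beta:.4f}")
```

A test with both a smaller α and a smaller β than a competitor is unambiguously preferable; otherwise the choice involves a trade-off, typically resolved by fixing α and minimizing β.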

6. Probabilistic Models: Uniform Variables and Their Sums

The properties of uniform random variables, especially their joint distribution, are essential in modeling and simulating scenarios such as sums of uniform variates. The sum T = U1 + U2, where U1 and U2 are independent and uniformly distributed on [0, 1], follows a triangular distribution with a piecewise cumulative distribution function FT(t). Geometrically, FT(t) is the area of the region of the unit square where u1 + u2 ≤ t: FT(t) = t²/2 for 0 ≤ t ≤ 1, and FT(t) = 1 − (2 − t)²/2 for 1 ≤ t ≤ 2.
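A Monte Carlo check of the triangular CDF of T = U1 + U2, a quick sketch (sample size and seed are arbitrary choices):

```python
# Sketch: Monte Carlo check of F_T(t) for T = U1 + U2, with U1, U2 ~ Uniform[0, 1] i.i.d.
# Exact CDF: F_T(t) = t^2/2 on [0, 1] and 1 - (2 - t)^2/2 on [1, 2].
import numpy as np

def exact_cdf(t: float) -> float:
    if t <= 0.0:
        return 0.0
    if t <= 1.0:
        return t * t / 2.0
    if t <= 2.0:
        return 1.0 - (2.0 - t) ** 2 / 2.0
    return 1.0

rng = np.random.default_rng(42)
T = rng.random(200_000) + rng.random(200_000)   # simulated sums

for t in (0.5, 1.0, 1.5):
    empirical = float(np.mean(T <= t))
    print(f"t = {t}: empirical {empirical:.4f} vs exact {exact_cdf(t):.4f}")
```

The empirical fractions should agree with 0.125, 0.5, and 0.875 up to Monte Carlo noise.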

7. Application of the Kolmogorov–Smirnov Test to Sample Data

Given sample observations, the objective is to determine whether they conform to a hypothesized distribution by computing the empirical distribution function and applying the K-S test. The test involves calculating the maximum deviation and comparing it to critical values. Repeated testing at a 0.05 significance level guides the decision to accept or reject the null hypothesis.
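The maximum-deviation computation can also be done by hand and compared with the large-sample 5% critical value 1.36/√n from the K table quoted earlier; the sample below is invented for illustration.

```python
# Sketch: manual K-S statistic compared with the asymptotic critical value
# 1.36/sqrt(n) at the 0.05 level (sample invented; H0: X ~ N(0, 1)).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x = np.sort(rng.normal(0.0, 1.0, 50))
n = len(x)
F0 = norm.cdf(x)

# The supremum of |Fn(x) - F0(x)| is attained at a sample point,
# just before or just after a jump of Fn.
D = max(np.max(np.arange(1, n + 1) / n - F0), np.max(F0 - np.arange(n) / n))
critical = 1.36 / np.sqrt(n)

print(f"D = {D:.4f}, critical value = {critical:.4f}")
print("Reject H0 at 5%:", D > critical)
```

Checking both one-sided deviations at each order statistic is what makes this a correct evaluation of the supremum for a step function.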

8. Case Study: Validation of Spaceship Distances Using Hypothesis Testing

The problem involves testing whether the mean distance traveled by a spaceship with liquid hydrogen exceeds a specified value, based on sample means and known or unknown variances. Depending on whether the variance is known, different tests (z-test or t-test) are employed. Power analysis and error probabilities are crucial in assessing the validity of the claims, especially given small sample sizes.
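When the variance is unknown, the unknown-variance case reduces to a one-sample t-test with n − 1 degrees of freedom; the distances and threshold below are invented, since the source does not state them.

```python
# Sketch: one-sided one-sample t-test for H0: mu <= 300 vs Ha: mu > 300
# with unknown variance (all numbers invented for illustration).
import numpy as np
from scipy import stats

sample = np.array([305.0, 312.0, 298.0, 307.0, 310.0, 301.0])
mu0 = 300.0
n = len(sample)

t_stat = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(n))
p_value = float(stats.t.sf(t_stat, df=n - 1))   # one-sided p-value, n - 1 df
print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}")

# With a known variance sigma^2 one would instead use
# z = (xbar - mu0) / (sigma / sqrt(n)) and the standard normal distribution.
```

The small sample size is exactly why the Student t distribution, rather than the normal, governs the statistic here.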

9. Industrial Application: Comparing Filling Machines for Consistency

Two machines, M1 and M2, are evaluated based on their ability to fill bottles accurately. Using sample means and variances, a two-sample variance comparison through an F-test determines which machine exhibits greater precision. The degrees of freedom are associated with their respective sample sizes, and the p-value indicates the strength of evidence. Considerations include cost differences and accuracy, influencing managerial decisions.
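A sketch of the variance comparison via the F statistic; the sample sizes and sample variances for M1 and M2 are assumptions, not figures from the source.

```python
# Sketch: F-test comparing the filling variances of machines M1 and M2
# (sample sizes and sample variances invented for illustration).
from scipy.stats import f

n1, s1_sq = 12, 0.25   # machine M1: sample size, sample variance
n2, s2_sq = 10, 0.49   # machine M2

F = s1_sq / s2_sq                 # test statistic under H0: sigma1^2 = sigma2^2
df1, df2 = n1 - 1, n2 - 1         # degrees of freedom (n1 - 1, n2 - 1)

# Two-sided p-value: double the smaller tail probability of the F distribution.
p_value = 2.0 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2))

print(f"F = {F:.3f} on ({df1}, {df2}) df, p-value = {p_value:.4f}")
```

A small p-value would indicate a real precision difference between the machines, which management would then weigh against the cost difference mentioned above.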

10. Statistical Decision Criteria and Critical Values