Statistical Inference I J Lee Assignment 3 Due On April 28, 2015

Estimate the probability that, for last year in Pennsylvania, at least one married couple has both partners born on April 30 or celebrated their birthday on the same day of the year, making assumptions explicit. Also, analyze a typing agency employing three typists with known error rates, estimating the probabilities of no errors and at most three errors in a 7-page article, assuming equal likelihood of being typed by each typist. Further, analyze a given lifetime density function to compute fractions of bulbs lasting more than 15 minutes, expected lifetime, conditional probabilities, and point probabilities where applicable. Then, determine parameters for a specified density function to satisfy a mean expectation and derive its cumulative distribution function. Lastly, evaluate the properties of a piecewise function serving as a cumulative distribution function, modify it if necessary to ensure validity, and compute the expectation for the associated random variable.

Introduction

Statistical inference involves estimating and making decisions about population parameters based on sample data. The problems span probability calculations for specific events in real-world contexts, such as marriages and error rates, as well as theoretical tasks involving probability density functions (pdfs) and cumulative distribution functions (cdfs). This paper addresses each problem sequentially, applying classical probability theory and statistical methods, explicitly stating assumptions and interpretations to ensure clarity and rigor.

Problem 1: Marriages in Pennsylvania

The first problem asks for the probability that, among the marriages in Pennsylvania last year (taken here to be approximately 80,000, an assumed figure), at least one couple shares a specific birthday characteristic. Assume that the partners' birthdays are independent and uniformly distributed over 365 days (ignoring leap years), and that couples are independent of one another. Then the probability that both partners in a given couple were born on April 30 is \(\left(\frac{1}{365}\right)^2\), and the probability that both partners share a birthday, whichever day it falls on, is \(365 \times \left(\frac{1}{365}\right)^2 = \frac{1}{365}\).

The probability that none of the 80,000 couples have both partners born on April 30 is \(\left[1 - \left(\frac{1}{365}\right)^2\right]^{80,000}\). Therefore, the probability that at least one couple meets this criterion is:

\[

P = 1 - \left[1 - \left(\frac{1}{365}\right)^2\right]^{80,000}

\]

Similarly, the probability that no couple has both partners celebrating their birthday on the same day is \(\left(1 - \frac{1}{365}\right)^{80,000}\), making the probability of at least one such couple:

\[

Q = 1 - \left(1 - \frac{1}{365}\right)^{80,000}

\]

Numerically, \(80{,}000 / 365^2 \approx 0.60\), so \(P \approx 1 - e^{-0.60} \approx 0.45\), while \(Q\) is indistinguishable from 1. These calculations assume independent, uniformly distributed birthdays; real birth dates show seasonal variation, so the figures are approximations.
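As a rough numerical check of \(P\) and \(Q\), the following Python sketch evaluates both expressions. The count of 80,000 couples and the 365-day uniform model are the assumptions stated above, and the function name `at_least_one` is purely illustrative. It uses `log1p` and `expm1` because the per-couple probability for April 30 is very small.

```python
import math


def at_least_one(p_single: float, n_couples: int) -> float:
    """Return 1 - (1 - p_single)**n_couples, computed stably for small p_single."""
    # log1p(-p) is log(1 - p); -expm1(x) is 1 - exp(x).
    return -math.expm1(n_couples * math.log1p(-p_single))


N_COUPLES = 80_000  # assumed number of marriages
DAYS = 365          # leap years ignored

# Both partners born on April 30, for at least one couple.
p_april_30 = at_least_one((1 / DAYS) ** 2, N_COUPLES)
# Both partners share a birthday (any day), for at least one couple.
q_same_day = at_least_one(1 / DAYS, N_COUPLES)

print(f"P = {p_april_30:.4f}")
print(f"Q = {q_same_day:.6f}")
```

Under these assumptions \(P\) comes out near 0.45, and \(Q\) is numerically equal to 1.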

Problem 2: Error Rates in Typing

The second problem asks for the probability that a 7-page article typed by one of three typists (Al, Bob, or Cathy) contains no errors, and the probability that it contains at most three errors. Each typist's error rate per page is given, and the typist is chosen uniformly at random.

The errors per page are modeled as Poisson distributions:

- Al: \(\lambda_A = 3\)

- Bob: \(\lambda_B = 4.2\)

- Cathy: \(\lambda_C = 2.1\)

For a given typist \(i\), the total number of errors in the 7-page article follows a Poisson distribution whose mean \(\lambda_{total}\) is the sum of the seven per-page means:

\[

\lambda_{total} = 7 \times \lambda_i

\]

Because the typist is chosen uniformly at random, the overall error distribution is an equally weighted mixture of three Poisson distributions with means \(7\lambda_A = 21\), \(7\lambda_B = 29.4\), and \(7\lambda_C = 14.7\). Setting \(k = 0\) in each Poisson probability gives:

\[

P(\text{no errors}) = \frac{1}{3} \left[e^{-7\lambda_A} + e^{-7\lambda_B} + e^{-7\lambda_C}\right]

\]

Similarly, the probability of at most 3 errors is obtained by summing the Poisson probabilities for \(k = 0, 1, 2, 3\) under each typist:

\[

P(\text{at most 3 errors}) = \frac{1}{3} \sum_{i \in \{A, B, C\}} \sum_{k=0}^{3} e^{-7\lambda_i} \frac{(7\lambda_i)^k}{k!}

\]

Under the Poisson model, these expressions give the probabilities for the article's error count, averaged over the random assignment of typists.
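The mixture probabilities can be evaluated with a short Python sketch. It assumes, in line with the expression for \(\lambda_{total}\), that the listed rates are per page, so each typist's article-level mean is \(7\lambda_i\). The helper names `poisson_cdf` and `mixture_cdf` are illustrative and do not come from the assignment.

```python
import math


def poisson_cdf(k_max: int, mean: float) -> float:
    """P(N <= k_max) for N ~ Poisson(mean)."""
    return sum(math.exp(-mean) * mean**k / math.factorial(k)
               for k in range(k_max + 1))


def mixture_cdf(k_max: int, rates_per_page, pages: int = 7) -> float:
    """Equal-weight mixture over typists of P(N <= k_max) for the whole article."""
    rates = list(rates_per_page)
    return sum(poisson_cdf(k_max, pages * r) for r in rates) / len(rates)


# Per-page error rates taken from the text (Al, Bob, Cathy).
rates = {"Al": 3.0, "Bob": 4.2, "Cathy": 2.1}

p_none = mixture_cdf(0, rates.values())
p_at_most_3 = mixture_cdf(3, rates.values())

print(f"P(no errors)        = {p_none:.3e}")
print(f"P(at most 3 errors) = {p_at_most_3:.3e}")
```

With article-level means of 21, 29.4, and 14.7, both probabilities are very small, and Cathy's component dominates the mixture.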

Problem 3: Distribution of Light Bulb Lifetimes

The third problem considers a piecewise density function for the lifetime \(X\) of a light bulb:

\[

f(x) =

\begin{cases}

2x, & 0 \leq x < 0.5 \\

\frac{3}{4}, & 2 < x < 3 \\

0, & \text{otherwise}

\end{cases}

\]

The lifetime \(X\), in hours, therefore has a linearly increasing density on \([0, 0.5)\) and a constant density on \((2, 3)\). The two pieces carry probability \(0.25\) and \(0.75\) respectively, so \(f\) integrates to 1.

(a) The fraction of bulbs lasting more than 15 minutes (0.25 hours) is:

\[

P(X > 0.25) = 1 - P(X \leq 0.25) = 1 - \int_0^{0.25} 2x dx = 1 - x^2|_0^{0.25} = 1 - (0.25)^2 = 1 - 0.0625 = 0.9375

\]

(b) The expectation \(E(X)\) is computed as:

\[

E(X) = \int_{0}^{0.5} x \cdot 2x dx + \int_{2}^{3} x \cdot \frac{3}{4} dx

\]

for the respective intervals, integrating:

\[

E(X) = \int_0^{0.5} 2x^2 dx + \frac{3}{4} \int_{2}^{3} x dx

\]

which evaluates to:

\[

E(X) = \frac{2}{3} x^3 |_{0}^{0.5} + \frac{3}{4} \left[\frac{x^2}{2}\right]_{2}^{3} = \frac{2}{3} \times (0.5)^3 + \frac{3}{4} \times \left(\frac{9 - 4}{2}\right) = \frac{2}{3} \times 0.125 + \frac{3}{4} \times \frac{5}{2}

\]

\[

= \frac{2}{3} \times 0.125 + \frac{3}{4} \times 2.5 = 0.0833 + 1.875 = 1.9583

\]
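Parts (a) and (b) can be cross-checked numerically. The sketch below assumes the density is \(2x\) on \([0, 0.5)\) and \(3/4\) on \((2, 3)\), as used in the integrals above, and integrates each piece with a simple midpoint rule. The helper `midpoint_integral` is illustrative, not a library routine.

```python
def midpoint_integral(g, a: float, b: float, n: int = 20_000) -> float:
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def f_low(x: float) -> float:
    """Density piece on [0, 0.5)."""
    return 2 * x


def f_high(x: float) -> float:
    """Density piece on (2, 3)."""
    return 0.75


# Total probability mass: should be 1 for a valid density.
total_mass = midpoint_integral(f_low, 0, 0.5) + midpoint_integral(f_high, 2, 3)

# (a) P(X > 0.25): the rest of the first piece plus all of the second piece.
p_over_15_min = midpoint_integral(f_low, 0.25, 0.5) + midpoint_integral(f_high, 2, 3)

# (b) E[X]: integrate x * f(x) over each piece.
mean_lifetime = (midpoint_integral(lambda x: x * f_low(x), 0, 0.5)
                 + midpoint_integral(lambda x: x * f_high(x), 2, 3))

print(total_mass, p_over_15_min, mean_lifetime)
```

The results should agree with the hand calculations: a total mass of 1, \(P(X > 0.25) = 0.9375\), and \(E(X) = 47/24 \approx 1.9583\) hours.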

(c) The conditional probability \(P(0.25 < X < 2.2 \mid X > 1)\) is, by definition:

\[

P(0.25 < X < 2.2 \mid X > 1) = \frac{P(1 < X < 2.2)}{P(X > 1)}

\]

The density is \(2x\) on \([0, 0.5)\), \(3/4\) on \((2, 3)\), and zero elsewhere, so only the second piece contributes for \(X > 1\). Because the density is zero between 0.5 and 2, the numerator comes only from the interval \([2, 2.2]\):

\[

P(1 < X < 2.2) = P(2 < X < 2.2) = \int_{2}^{2.2} \frac{3}{4} dx = \frac{3}{4} \times 0.2 = 0.15

\]

and

\[

P(X > 1) = P(2 < X < 3) = \int_{2}^{3} \frac{3}{4} dx = 0.75

\]

Thus, the conditional probability is:

\[

\frac{0.15}{0.75} = 0.2

\]

(d) The point probabilities \(P(X=2)\), \(P(X=0)\), and \(P(X=E[X])\) are all zero: \(X\) is a continuous random variable, so the probability of any single point is the integral of the density over an interval of zero length, whatever the value of the density at that point.

Problem 4: Parameter Determination for Density Function

Given the density \(f(x) = a + bx^{2}\) for \(x \in [0,1]\), with \(E[X] = 3/5\), the goal is to find \(a\) and \(b\).

Since \(f(x)\) must be a valid density:

\[

\int_0^1 f(x) dx = 1

\]

which implies:

\[

\int_0^1 (a + bx^{2}) dx = a \times 1 + b \times \frac{1}{3} = 1

\]

so:

\[

a + \frac{b}{3} = 1

\]

The expectation constraint yields:

\[

E[X] = \int_0^1 x f(x) dx = a \times \frac{1}{2} + b \times \frac{1}{4} = \frac{3}{5}

\]

which simplifies to:

\[

\frac{a}{2} + \frac{b}{4} = \frac{3}{5}

\]

Solving this system:

\[

a + \frac{b}{3} = 1

\]

\[

\frac{a}{2} + \frac{b}{4} = \frac{3}{5}

\]

multiplying the second equation by 4:

\[

2a + b = \frac{12}{5}

\]

from the first:

\[

a = 1 - \frac{b}{3}

\]

substituting into the second:

\[

2 \left(1 - \frac{b}{3}\right) + b = \frac{12}{5}

\]

\[

2 - \frac{2b}{3} + b = \frac{12}{5}

\]

\[

2 + \left(-\frac{2b}{3} + b\right) = \frac{12}{5}

\]

\[

2 + \left(-\frac{2b}{3} + \frac{3b}{3}\right) = \frac{12}{5}

\]

\[

2 + \frac{b}{3} = \frac{12}{5}

\]

\[

\frac{b}{3} = \frac{12}{5} - 2 = \frac{12}{5} - \frac{10}{5} = \frac{2}{5}

\]

\[

b = \frac{2}{5} \times 3 = \frac{6}{5}

\]

then:

\[

a = 1 - \frac{b}{3} = 1 - \frac{1}{3} \times \frac{6}{5} = 1 - \frac{2}{5} = \frac{3}{5}

\]

Thus, the parameters are:

\[

a = \frac{3}{5}, \quad b = \frac{6}{5}

\]
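The two constraints form a 2-by-2 linear system in \(a\) and \(b\), so the solution can be checked exactly. The sketch below applies Cramer's rule with Python's `fractions` module. This is a verification aid under the constraints stated above, not the method used in the derivation, and the variable names are illustrative.

```python
from fractions import Fraction as Fr

# Normalization: a * 1   + b * (1/3) = 1
# Mean:          a * 1/2 + b * (1/4) = 3/5
m11, m12, r1 = Fr(1), Fr(1, 3), Fr(1)
m21, m22, r2 = Fr(1, 2), Fr(1, 4), Fr(3, 5)

# Cramer's rule for a 2x2 system.
det = m11 * m22 - m12 * m21
a = (r1 * m22 - m12 * r2) / det
b = (m11 * r2 - r1 * m21) / det

print(f"a = {a}, b = {b}")

# Substitute back into both constraints as a sanity check.
assert a + b / 3 == 1
assert a / 2 + b / 4 == Fr(3, 5)
```

Since both parameters are positive, \(f(x) = a + bx^2\) is non-negative on \([0, 1]\), as a density must be.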

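The cumulative distribution function for \(0 \leq x \leq 1\) is:

\[

F(x) = \int_0^x f(t) dt = \int_0^x (a + b t^{2}) dt = a x + b \frac{x^{3}}{3} = \frac{3}{5} x + \frac{2}{5} x^{3}

\]

with \(F(x) = 0\) for \(x < 0\) and \(F(x) = 1\) for \(x > 1\). As a check, \(F(1) = \frac{3}{5} + \frac{2}{5} = 1\).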
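A small Python sketch of the resulting density and cdf is given below, using \(a = 3/5\) and \(b = 6/5\) from the derivation. The function names `pdf` and `cdf` are illustrative. The checks confirm that the cdf runs from 0 to 1, is increasing on a grid, and has a slope that matches the density.

```python
A, B = 3 / 5, 6 / 5  # parameter values derived in the text


def pdf(x: float) -> float:
    """Density a + b*x^2 on [0, 1], zero elsewhere."""
    return A + B * x**2 if 0 <= x <= 1 else 0.0


def cdf(x: float) -> float:
    """F(x) = a*x + b*x^3/3 on [0, 1], clamped to 0 below and 1 above."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return A * x + B * x**3 / 3


grid = [i / 1000 for i in range(1001)]
is_increasing = all(cdf(u) < cdf(v) for u, v in zip(grid, grid[1:]))

# Central finite difference of the cdf at an interior point.
eps = 1e-6
slope_at_half = (cdf(0.5 + eps) - cdf(0.5 - eps)) / (2 * eps)

print(cdf(0.0), cdf(1.0), is_increasing, slope_at_half, pdf(0.5))
```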

Problem 5: Piecewise Function and Modifications

The function:

\[

F(x) =

\begin{cases}

0, & x < 0 \\

x/2, & 0 \leq x < 1 \\

(x + 2)/6, & 1 \leq x < 4 \\

1, & x \geq 4

\end{cases}

\]

is a candidate cdf.

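(a) To verify that \(F\) is a cdf, check that it is non-decreasing, right-continuous, and tends to 0 as \(x \to -\infty\) and to 1 as \(x \to +\infty\). Each piece is non-decreasing, and adjacent pieces agree at the breakpoints (\(F(0) = 0\), \(F(1) = 1/2\), \(F(4) = 1\)), so \(F\) is continuous and non-decreasing and needs no modification. The limits are: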

- \(\lim_{x \to -\infty} F(x) = 0\).

- \(\lim_{x \to \infty} F(x) = 1\).

(b) The derivative of \(F(x)\), where differentiable, gives the pdf:

- For \(0 \leq x < 1\): \(f(x) = \frac{1}{2}\).

- For \(1 \leq x < 4\): \(f(x) = \frac{1}{6}\).

- Zero elsewhere (since the function is flat outside these intervals).

(c) The expectation \(E[X]\) is:

\[

E[X] = \int_{0}^{1} x \times \frac{1}{2} dx + \int_{1}^{4} x \times \frac{1}{6} dx

\]

Calculations:

\[

E[X] = \frac{1}{2} \times \frac{x^{2}}{2} \big|_0^{1} + \frac{1}{6} \times \frac{x^{2}}{2} \big|_1^{4} = \frac{1}{4}(1^{2} - 0) + \frac{1}{12}(4^{2} - 1^{2}) = \frac{1}{4} + \frac{1}{12}(16 - 1) = \frac{1}{4} + \frac{15}{12} = \frac{1}{4} + \frac{5}{4} = \frac{6}{4} = 1.5

\]
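The cdf properties and the expectation can also be checked numerically. The sketch below assumes breakpoints at 0, 1, and 4, as implied by the integration limits in part (c). Instead of integrating \(x f(x)\), it uses the tail formula \(E[X] = \int_0^\infty (1 - F(x)) dx\), which holds for non-negative random variables. This is a different route from the one in the text and serves as an independent check.

```python
def F(x: float) -> float:
    """Piecewise cdf with breakpoints at 0, 1, and 4."""
    if x < 0:
        return 0.0
    if x < 1:
        return x / 2
    if x < 4:
        return (x + 2) / 6
    return 1.0


# Monotonicity on a grid covering [-1, 5].
grid = [-1 + 6 * i / 6000 for i in range(6001)]
is_monotone = all(F(u) <= F(v) for u, v in zip(grid, grid[1:]))

# Size of any jump at each breakpoint (should be negligible for a continuous cdf).
jumps = {c: F(c) - F(c - 1e-9) for c in (0, 1, 4)}

# Tail formula: E[X] is the integral of 1 - F over [0, 4], via the midpoint rule.
n = 40_000
h = 4 / n
mean = h * sum(1 - F((i + 0.5) * h) for i in range(n))

print(is_monotone, jumps, mean)
```

The computed mean should agree with the value \(E[X] = 1.5\) obtained above.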
