Simulation of Die Rolls and Probabilities Comparison in MATLAB
This assignment involves two parts: Part A focuses on simulating the rolling of a balanced six-faced die, and Part B involves simulating the rolling of two dice. For each part, you will generate multiple experiments, analyze the observed probabilities of certain outcomes at various trial counts, compare these with the theoretical probabilities, and visualize the results with plots. The assignment emphasizes understanding variability, convergence of empirical probabilities to theoretical values, and the effects of sample size on probability estimates.
In both parts of the assignment, the overarching goal is to explore the randomness inherent in dice rolling simulations and how empirical outcome probabilities approach theoretical probabilities as the number of trials increases. MATLAB's random number generation capabilities are utilized extensively for this purpose, with an emphasis on statistical analysis and visualization to interpret the results.
Part A: Simulating Single Die Rolls
Part A uses MATLAB to simulate rolling a fair six-faced die 900 times in each of 10 independent experiments, capturing the variability inherent in random processes. The core of this part is generating a 10 × 900 matrix, X, where each row represents an independent experimental sequence of rolls. The randi function produces integers between 1 and 6, corresponding to the upper face of the die.
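A minimal sketch of this setup follows; the variable names numExperiments and numRolls are illustrative assumptions, not prescribed by the assignment.

```matlab
% Sketch: generate the 10-by-900 matrix of fair die rolls
numExperiments = 10;                       % independent experimental sequences (rows)
numRolls       = 900;                      % rolls per experiment (columns)
X = randi(6, numExperiments, numRolls);    % each entry is a die face, 1..6
```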
For each experiment, the program randomly selects a row of X and analyzes subsets of its data. The analysis focuses on specific cumulative trial counts (30, 150, 300, 600, and 900) to study how the empirical probability of each face evolves as the sample size grows. The occurrences of each face are counted within each subset, and these counts are converted into probabilities by dividing by the number of trials in that subset. The theoretical probability of each face, 1/6, serves as the benchmark for comparison.
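A possible implementation of this per-experiment analysis is shown below; rowIdx, cutoffs, and probs are assumed names, and histcounts is used here to count the faces (older MATLAB releases could use histc instead).

```matlab
% Sketch: empirical face probabilities at each cumulative trial count
rowIdx  = randi(numExperiments);            % pick one experiment at random
rolls   = X(rowIdx, :);
cutoffs = [30 150 300 600 900];             % cumulative trial counts to inspect
probs   = zeros(numel(cutoffs), 6);         % rows: cutoffs, columns: faces 1..6
for k = 1:numel(cutoffs)
    subset = rolls(1:cutoffs(k));
    counts = histcounts(subset, 0.5:1:6.5); % frequency of each face 1..6
    probs(k, :) = counts / cutoffs(k);      % empirical probabilities
end
theoretical = ones(1, 6) / 6;               % each face has probability 1/6
```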
Visualization plays a critical role; a plot displays the probabilities of each face at the different sample sizes, together with the constant theoretical probabilities. The plot contains six lines: five corresponding to the cumulative trial counts, and one representing the theoretical probabilities. This makes the comparison direct and shows how the empirical estimates evolve with increasing data, illustrating convergence to the true probabilities.
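One way to produce such a plot, continuing the sketch above; the marker styles and legend labels are choices rather than requirements.

```matlab
% Sketch: five empirical curves plus the theoretical 1/6 line
figure; hold on;
for k = 1:numel(cutoffs)
    plot(1:6, probs(k, :), '-o', 'DisplayName', sprintf('%d rolls', cutoffs(k)));
end
plot(1:6, theoretical, 'k--', 'LineWidth', 2, 'DisplayName', 'Theoretical (1/6)');
xlabel('Die face'); ylabel('Estimated probability');
legend('show'); hold off;
```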
Running the program three times provides insight into run-to-run variability, reflecting how randomness influences the estimated probabilities. Variations across the three sets of plots highlight the stochastic nature of the simulation. Typically, the empirical probabilities fluctuate more at smaller sample sizes and gradually stabilize as the number of trials increases. The analysis addresses questions about the degree of variability, the point of convergence, and the sample size at which the probabilities stabilize sufficiently close to the true values.
Part B: Simulating Two Dice Rolls and Specific Outcomes
Part B extends the simulation to rolling two independent dice simultaneously, with the goal of analyzing the probabilities of obtaining sums of 2, 7, and 11. As in Part A, a 10 × 900 matrix is generated, where each row contains outcomes of a fair six-faced die. Two distinct rows are then selected at random, ensuring the two dice are independent. The element-wise sum of these two rows forms a new array, Y, containing the sum of the two dice for each trial.
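A sketch of the two-dice setup under the same assumptions as Part A; the while loop simply resamples until the two row indices differ.

```matlab
% Sketch: pick two distinct rows and form the per-trial sums
X = randi(6, numExperiments, numRolls);
row1 = randi(numExperiments);
row2 = randi(numExperiments);
while row2 == row1
    row2 = randi(numExperiments);   % resample until the rows are distinct
end
Y = X(row1, :) + X(row2, :);        % sums of the two dice, values 2..12
```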
The frequencies of observing sums of 2, 7, and 11 are counted across the various cumulative trial sizes, with probabilities derived by dividing counts by the total number of trials considered. The theoretical probabilities for these sums are well known: P(2) = 1/36, P(7) = 6/36, P(11) = 2/36.
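A possible way to tabulate these estimates, reusing the cutoffs from Part A; targets, probsB, and theoreticalB are assumed names.

```matlab
% Sketch: empirical probabilities of sums 2, 7, and 11 at each cutoff
targets = [2 7 11];
probsB  = zeros(numel(cutoffs), numel(targets));
for k = 1:numel(cutoffs)
    subset = Y(1:cutoffs(k));
    for t = 1:numel(targets)
        probsB(k, t) = sum(subset == targets(t)) / cutoffs(k);
    end
end
theoreticalB = [1 6 2] / 36;        % P(2), P(7), P(11)
```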
Visualizing the evolution of these probability estimates with increasing sample size follows the same approach as in Part A, with six lines plotted: one for each trial count (30, 150, 300, 600, and 900), and one for the theoretical values. The comparisons reveal how sampling variability decreases with larger trials, leading to more precise estimations of the true probabilities.
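The plotting pattern from Part A carries over directly; a brief sketch under the same naming assumptions:

```matlab
% Sketch: five empirical curves over the target sums, plus the theoretical line
figure; hold on;
for k = 1:numel(cutoffs)
    plot(targets, probsB(k, :), '-o', 'DisplayName', sprintf('%d rolls', cutoffs(k)));
end
plot(targets, theoreticalB, 'k--', 'LineWidth', 2, 'DisplayName', 'Theoretical');
xlabel('Sum of two dice'); ylabel('Estimated probability');
legend('show'); hold off;
```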
Repeating the simulation three times allows examination of the relationship between sample size and variability, as well as the independence of the different trial count curves. Analysis of these results addresses questions related to convergence, independence of the trajectories, and potential methodological improvements, such as increasing the number of experiments or introducing different randomization techniques.
Analysis and Conclusion
The simulations demonstrate fundamental statistical principles, specifically the Law of Large Numbers, which suggests that empirical probabilities tend to approach their theoretical counterparts as the sample size increases. Smaller sample sizes (e.g., 30 trials) exhibit significant fluctuations, while larger ones (such as 900 trials) display much less variability and better convergence. This effect is evident in both parts of the assignment, though the specific outcomes (faces and sums) influence the rate at which convergence occurs.
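Stated formally (this formulation is an added illustration, not part of the original assignment text): if $\hat{p}_n$ denotes the proportion of the first $n$ trials that produce a given outcome of true probability $p$, then

$$\hat{p}_n = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\{X_i = x\} \longrightarrow p \quad (n \to \infty), \qquad \operatorname{SE}(\hat{p}_n) = \sqrt{\frac{p(1-p)}{n}}.$$

For example, for the sum 7 with $p = 1/6$, the standard error falls from roughly 0.068 at $n = 30$ to about 0.012 at $n = 900$, consistent with the visibly tighter curves at larger trial counts.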
In Part A, every face shares the same probability of 1/6, so no face is inherently more variable than another; apparent differences between faces such as '1' and '6' across runs reflect sampling noise, and all faces stabilize as the number of trials increases. Part B reveals that the probabilities of sums 2 and 11 are inherently lower and thus exhibit higher relative variability at smaller sample sizes. The sum 7, being the most probable outcome for two dice, converges faster, showing lower variability across simulations.
By analyzing three different runs, it becomes clear that the curves for different trial sizes are not independent; they are linked through the cumulative data, with each larger sample size incorporating previous data, leading to a predictable reduction in variability. This dependency underscores the importance of sample size in statistical inference and the utility of simulations for understanding probabilistic behavior.
Methodological improvements could include increasing the number of independent experiment repetitions to better evaluate variability or employing more advanced randomization techniques to ensure uniformity. Additionally, parallel processing or computational optimization can handle larger data sets for more detailed analyses, further enriching the insights gained from such stochastic studies.