
08 Oct 2025
Flipping a coin shows that randomness has variability: 10 tosses rarely give exactly 5 heads and 5 tails, but 100,000 tosses approach the expected 50:50 ratio.
Each toss is independent, so short-term deviations don’t “self-correct.”
Larger sample sizes reduce noise and reveal underlying probabilities.
With more complex outcomes, like dice or natural phenomena, the true distribution only emerges after many trials.
Simulation lets us explore uncertain systems, approximate distributions, and make reliable predictions by running many iterations with varying inputs.
When you flip a coin, there’s a 50–50 chance of getting heads or tails. If you only flip the coin ten times, it’s actually pretty unlikely that you’ll end up with exactly five heads and five tails. Do it 100,000 times, though, and you’ll land very close to 50,000 of each. That’s the law of large numbers in action: the more times you run the experiment, the closer the results get to what you’d expect.
This happens because randomness involves variability. In the coin example, expecting exactly 5 heads and 5 tails from 10 flips is not realistic: each flip is subject to random variation, and with so few tosses the effect of that variation is relatively large. However, if we repeat the flip many times, say 100,000 times, the relative frequency of heads and tails settles near the true 50:50 probability. Over the long run, with a sufficiently large sample size, random deviations tend to cancel out. This variation is also called "noise". With only 10 flips, the proportion of heads is a rough, noisy estimate of the true probability; with 100,000 flips, the noise shrinks and the proportion stabilises at 50%, reflecting the true probability of 0.5.
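To see this concretely, here's a minimal sketch using Python's standard random module; the sample sizes are just the ones from the example above:

```python
import random

# Estimate the proportion of heads for increasingly large numbers of flips.
# The estimate is noisy at n = 10 and settles near 0.5 as n grows.
for n in (10, 100, 1_000, 100_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"{n:>7} flips: proportion of heads = {heads / n:.4f}")
```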
It’s also somewhat counterintuitive: even after a run of heads, it’s a myth to assume that the next flip is “due” to be tails. Each toss is independent, with a 50:50 chance of landing heads or tails. There is no short-term mechanism forcing the outcomes to balance out; randomness doesn’t self-correct in the short run. A run of tails does not make heads more likely on the next toss.
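You can check this by simulation. The sketch below (the run length of 5 is an arbitrary choice) tallies what the flip immediately after a run of heads actually does:

```python
import random

# After observing a run of exactly 5 heads, record what the *next* flip does.
# If randomness "self-corrected", tails would show up more than half the time.
runs_seen = 0
tails_after_run = 0
streak = 0  # consecutive heads ending at the previous flip
for _ in range(1_000_000):
    flip_is_heads = random.random() < 0.5
    if streak == 5:
        runs_seen += 1
        tails_after_run += not flip_is_heads
    streak = streak + 1 if flip_is_heads else 0

print(f"P(tails | 5 heads in a row) = {tails_after_run / runs_seen:.3f}")  # ~0.5
```

The conditional proportion comes out at roughly 0.5, exactly as independence predicts.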
The coin flip example is simple because we already know the underlying distribution: each flip has a 50% chance of being heads or tails. But what happens when the underlying distribution is unknown, or multiple factors influence the outcome? A classic case is the normal (Gaussian) distribution, and the textbook example is children’s heights. Measure the heights of 2 children and they will most likely differ, and be completely unrepresentative of the average height of children their age. Measure 1,000 children of the same age, however, and the familiar bell curve starts to emerge: most children sit near the average height, with far fewer at the shorter and taller ends of the range. Again, the more trials you conduct, the closer your observed outcomes get to the true distribution. Normal distributions appear both in data and in nature because many small, independent effects tend to combine into a bell curve.
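That last point is easy to demonstrate in code: build each “height” out of many small random effects and a bell shape appears. The baseline of 120 cm and the 50 effects below are made-up numbers, purely for illustration:

```python
import random
from collections import Counter

# Model each "height" as a baseline plus many small, independent effects.
# Their sum is roughly bell-shaped (the central limit theorem at work).
def simulated_height():
    return 120 + sum(random.uniform(-1, 1) for _ in range(50))

heights = [simulated_height() for _ in range(1_000)]
bins = Counter(round(h) for h in heights)  # bucket to whole centimetres

# Crude text histogram: one '#' per 5 children in each bucket.
for value in sorted(bins):
    print(f"{value:>3} cm | {'#' * (bins[value] // 5)}")
```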
Things get really interesting when we move beyond a simple coin, where there are only two possible outcomes, to situations with many possible outcomes or where the result is influenced by multiple independent factors. Take a die, for example. Roll it 100,000 times and each face, 1 through 6, will appear roughly the same number of times. Roll it only ten times, though, and the counts will likely be uneven: the sample is too small for the actual (flat) distribution to show.
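The same two-sample-size comparison, sketched in code:

```python
import random
from collections import Counter

# Count how often each face appears for a small and a large number of rolls.
for n in (10, 100_000):
    counts = Counter(random.randint(1, 6) for _ in range(n))
    print(f"{n:>7} rolls:", {face: counts[face] for face in range(1, 7)})
```

With n = 100,000 every face count lands near 16,667; with n = 10 some faces typically never appear at all.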
This is where simulation becomes powerful. When outcomes are uncertain or influenced by many factors, simulation lets you run multiple iterations with varying inputs. By doing so, you can approximate the true distribution and make more reliable predictions about future outcomes. Essentially, it allows you to explore a wide range of possibilities in a safe, controlled space and understand what is likely to happen under different scenarios.
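As a toy illustration, here’s a Monte Carlo sketch that estimates the distribution of a project’s total duration from three uncertain task durations. The tasks and their ranges are invented for the example; the point is the technique of drawing fresh inputs on every iteration and summarising many runs:

```python
import random
import statistics

# One simulated project: draw each task's duration (in days) from a
# triangular distribution (low, high, most likely).
def one_run():
    design = random.triangular(2, 6, 3)
    build = random.triangular(5, 15, 8)
    test = random.triangular(1, 4, 2)
    return design + build + test

# Many iterations with varying inputs approximate the distribution of the total.
totals = sorted(one_run() for _ in range(100_000))
print(f"mean total: {statistics.mean(totals):.1f} days")
print(f"90% of runs finish within {totals[int(0.9 * len(totals))]:.1f} days")
```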