9-Block Stalagmite: Riddle #2

For different settings of the probability-to-be-green, we get different modes in the stalagmite. For example, when the probability-to-be-green is 50% and we're working with 9-blocks, we get a mode at both "4" and "5." By "mode," we mean that there will be more of that type of 9-block as compared to any other 9-block in the stalagmite. Let's call 9-blocks that have exactly 4 green squares "4zees," or in short, "4-z."
This riddle looks at the relation between the setting of probability-to-be-green and the mode. For instance, what would the mode be when the probability is 13%? Will the stalagmite max at the 1-z column? At the 2-z column?
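
To experiment with such questions, one could tabulate the chance of each n-z directly. The following is a minimal Python sketch, not part of the ProbLab models; the function names nz_probabilities and modes are our own, and it assumes that each of the 9 squares is colored green independently with the same probability p.

```python
from math import comb

def nz_probabilities(p, k=9):
    """Return [P(0-z), P(1-z), ..., P(k-z)] for a k-block."""
    # comb(k, n) counts the arrangements of n green squares among k squares;
    # p**n * (1 - p)**(k - n) is the chance of any one such arrangement.
    return [comb(k, n) * p**n * (1 - p)**(k - n) for n in range(k + 1)]

def modes(p, k=9):
    """Return every n whose n-z shares the highest probability."""
    probs = nz_probabilities(p, k)
    top = max(probs)
    # A small tolerance keeps ties (such as 4-z and 5-z at 50%) together.
    return [n for n, q in enumerate(probs) if abs(q - top) < 1e-12]

print(modes(0.5))   # the 4-z and the 5-z tie at 50%
print(modes(0.13))  # the mode asked about above
```

Running modes with other probabilities shows how the mode moves to the right as the probability-to-be-green grows.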
Our riddle was: What is the range of probabilities at which a 2-z is the most prevalent in a randomly-generated 9-block? To look at this question, one must first determine a system for finding when a given n-z should be most prevalent and only then apply this system to the specific case of the 2-z. For instance, one could safely assume that a 9-block with no green squares would be most prevalent if the probability of getting green is 0; similarly, a block with all green squares would be most prevalent with a probability of 1. Thus, the range of probabilities for each n-z is somewhere between the extremes "0" and "1." This may seem trivial, but is useful in framing a range for the probability values.

The n-z and prevalence
For a particular n-z to be the mode, the probability of getting it must be greater than that of any other n-z. Note that the higher the probability-to-be-green, the higher the probability of getting a "greener" n-z. In other words, there is a better chance of randomly getting a 4-z at a probability of 50% than at a probability of 10%. This is even more so for a 5-z, because the 5-z is "farther to the right." Also, this means that the range of probabilities for which the 5-z "rules" as the mode of an experimental outcome is immediately to the right of the respective range for the 4-z. We see here a direct relationship between the range of n-z and the range of probabilities. In general, one might say that if an n-z "rules" over some probability range that ends at X, then the (n+1)-z will rule over a range that begins at X. For instance, the 2-z will begin "ruling" where the 1-z "concedes."

Multiplying it out
The point at which the 2-z becomes more probable than the 1-z is where the probability of getting each is equal. The probability of an n-z = (number of arrangements with exactly n green squares) * (probability of getting any one such arrangement), so:

For # of 1-z = # of 2-z (with x = probability of success):
9 * x^1 * (1 - x)^8 = 36 * x^2 * (1 - x)^7
Dividing both sides by x * (1 - x)^7:
9 * (1 - x) = 36 * x   =>   9 = 45 * x   =>   x = 1 / 5 = 2 / 10 = 0.2 = 20%
Thus, the count of 1-z and 2-z would be expected to be the same at the probability 20%. This means that the beginning of the range of prevalence of 2-z is at the probability of 20%.

For # of 2-z = # of 3-z (with x = probability of success):
36 * x^2 * (1 - x)^7 = 84 * x^3 * (1 - x)^6
Dividing both sides by x^2 * (1 - x)^6:
36 * (1 - x) = 84 * x   =>   36 = 120 * x   =>   x = 3 / 10 = 0.3 = 30%
Thus, the count of 2-z and 3-z would be expected to be the same at the probability 30%. This means that the end of the range of prevalence of 2-z is at the probability of 30%.

Thus, the answer to the riddle: a 2-z is the most prevalent 9-block when the probability-to-be-green is between 2/10 and 3/10, that is, between 20% and 30%, not inclusive, since we expect the # of 2-z to equal the # of 1-z at the probability of 20% and the # of 3-z at the probability of 30%.
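
The two boundary points can be checked with exact arithmetic. This is a small Python sketch under the same assumption used above (each square is green independently with probability x); the helper nz_probability is a name we introduce for illustration only.

```python
from fractions import Fraction
from math import comb

def nz_probability(n, x, k=9):
    """Exact probability of getting an n-z in a k-block."""
    return comb(k, n) * x**n * (1 - x)**(k - n)

x20 = Fraction(1, 5)   # 20%
x30 = Fraction(3, 10)  # 30%

# At 20% the 1-z and the 2-z are equally likely; at 30% the 2-z and the 3-z are.
print(nz_probability(1, x20) == nz_probability(2, x20))  # True
print(nz_probability(2, x30) == nz_probability(3, x30))  # True

# Strictly inside the range (here 25%), the 2-z beats every other n-z.
mid = Fraction(1, 4)
print(all(nz_probability(2, mid) > nz_probability(n, mid)
          for n in range(10) if n != 2))  # True
```

Fractions are used instead of decimals so that the equalities at 20% and 30% hold exactly rather than approximately.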

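Now, to take it a step further, let us look at the general case. It uses the “combination” function (nCr), which gives the number of different ways of arranging r green squares among n squares (the remaining squares being blue):

n C r = n! / ((n - r)! * r!)

(In the derivation below, the roles of the letters change: k is the total number of squares and n is the number of green squares, so the function appears as k C n.)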

For # of n-z = # of (n+1)-z (with x = probability of success, k = number of squares in the block, that is, a k-block, not necessarily a 9-block):
k C n * x^n * (1 - x)^(k - n) = k C (n+1) * x^(n + 1) * (1 - x)^(k - n - 1)
Dividing both sides by x^n * (1 - x)^(k - n - 1) and writing out the combinations:
(k! / (n! * (k - n)!)) * (1 - x) = (k! / ((n + 1)! * (k - n - 1)!)) * x
(1 - x) / (k - n) = x / (n + 1)
(n + 1) * (1 - x) = (k - n) * x
n + 1 = (k - n + n + 1) * x  =>  x = (n + 1) / (k + 1)
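Thus, the n-z and the (n+1)-z are expected to be equally prevalent at the probability (n + 1) / (k + 1). The same calculation for the (n-1)-z and the n-z gives n / (k + 1). The formula therefore shows that the n-z is expected to be the most prevalent when the probability is in the range from

n / (k + 1) to (n + 1) / (k + 1).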
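
The range formula above can be written as a short function and checked against the exact probabilities. This is only a sketch in Python; prevalence_range and nz_probability are names we made up for illustration, and the check assumes independent squares with a common probability x.

```python
from fractions import Fraction
from math import comb

def prevalence_range(n, k):
    """Range of probabilities over which the n-z is the mode of a k-block."""
    return Fraction(n, k + 1), Fraction(n + 1, k + 1)

def nz_probability(n, x, k):
    """Exact probability of getting an n-z in a k-block."""
    return comb(k, n) * x**n * (1 - x)**(k - n)

# At the upper end of each range, the n-z and the (n+1)-z should tie.
for k in (4, 9, 16):
    for n in range(k):
        low, high = prevalence_range(n, k)
        assert nz_probability(n, high, k) == nz_probability(n + 1, high, k)

low, high = prevalence_range(2, 9)
print(f"{low} to {high}")  # 1/5 to 3/10
```

For the 2-z in a 9-block this reproduces the 20% to 30% range found earlier.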

To check our work, we wrote a "Brute-Force" model that finds the probability at which a certain n-z becomes the most prevalent. The model searches the range of 0% - 100% for the point at which the given n-z takes over as the most probable. To use it, select the size of the side of the block (a side of 2 would produce the results for a 4-block, a side of 3 for a 9-block), then press [Setup]. Next, choose the n-z for which you are looking, and press [Go]. When this process stops, the 'probability' slider will indicate the probability at which the n-z begins to be prevalent:



CM ProbLab support model: 9-Block Stalagmite Summation

As may be seen from the model simulation, the probability at which the dominant 9-block switches from the 1-z to the 2-z is just above 20%, from the 2-z to the 3-z just above 30%, etc.
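
A similar brute-force check can be sketched in Python. This is not the code of the ProbLab model, only an illustration of the same idea; the names dominant_nz and scan and the 1% step size are our own assumptions. It steps through the probabilities in whole percents and records the first percent at which each n-z is strictly more probable than all the others.

```python
from fractions import Fraction
from math import comb

def dominant_nz(x, k):
    """Return every n whose n-z has the highest exact probability at x."""
    probs = [comb(k, n) * x**n * (1 - x)**(k - n) for n in range(k + 1)]
    top = max(probs)
    return [n for n, q in enumerate(probs) if q == top]

def scan(side=3, steps=100):
    """Map each n to the first step at which the n-z alone is the mode."""
    k = side * side
    takeover = {}
    for i in range(steps + 1):
        winners = dominant_nz(Fraction(i, steps), k)
        if len(winners) == 1:  # skip exact ties such as 20% and 30%
            takeover.setdefault(winners[0], i)
    return takeover

print(scan())
# {0: 0, 1: 11, 2: 21, 3: 31, 4: 41, 5: 51, 6: 61, 7: 71, 8: 81, 9: 91}
```

With whole-percent steps, the 2-z first stands alone at 21% and the 3-z at 31%, because at exactly 20% and 30% two neighboring n-z are tied.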

Earlier, we established that the range of the 2-z is from 2/10 to 3/10, or in general, that the range of an n-z is from n / (k + 1) to (n + 1) / (k + 1). One could have expected ~ / k instead of ~ / (k + 1), since one could have thought that for 9-blocks (k = 9), we'd be dividing by 9 and not by 9+1.

The division by (k + 1) is due to the “null” option. A common misconception is that a 9-block only has nine options (from 1 – 9 green squares); however, there is also the option of 0 squares with the target-color, the “null” option. The confusion may occur because the user relates the “Sample Stalagmite” situation to one involving dice, where there is no “null” option. Thus, once one realizes that there are (k + 1) options, the range of n / (k + 1) to (n + 1) / (k + 1) makes more sense.

Approximating the most prevalent n-z
Now, suppose that one, armed with the knowledge that the range of the 1-z starts at the probability of 10%, attempts to guess the most probable n-z from a given probability-to-be-green. Given the probability of 9%, one might say that the 1-z is most prevalent, since 9% is closer to 10% than to 0% (the beginning of the range of prevalence of the 0-z). Right? Wrong!!!

Such rounding does not work in this situation. If one sets the side to 3 and the probability to 9%, the probability is closer to 10% (where the 1-z begins to dominate) than to 0% (where the 0-z begins to dominate). However, it turns out that the 0-z is the most dominant at the probability of 9%. This may be surprising. However, if one thinks in terms of range-partitions, the world starts making sense again:

What one may do is divide the range of probabilities into (k + 1) equal portions, each with its own dominant n-z. So, the most dominant n-z of a 9-block's first range-partition is the 0-z. This first range-partition has a width of 1/10 = 10%; thus, the 0-z range is 0% - 10%. Therefore, it does not matter if the probability-to-be-green is 9% or 1%; it is still within the 10% range of the 0-z, making the 0-z the mode.
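
The range-partition idea amounts to rounding down rather than rounding to the nearest boundary. A minimal Python sketch follows; partition_mode is a hypothetical helper name, and at an exact boundary (such as 10%), where two n-z are tied, it simply reports the greener of the two.

```python
from fractions import Fraction
from math import floor

def partition_mode(p, k=9):
    """Index of the range-partition that contains p, i.e. the dominant n-z."""
    # There are (k + 1) partitions of equal width 1 / (k + 1).
    # min(..., k) keeps p = 100% inside the last partition.
    return min(floor((k + 1) * p), k)

p = Fraction(9, 100)
print(partition_mode(p))                  # 0  (the 0-z rules at 9%)
print(partition_mode(Fraction(1, 100)))   # 0  (and at 1% as well)
print(round(10 * p))                      # 1  (naive rounding guesses the 1-z)
```

The last line shows the mistaken guess: rounding 9% to the nearest boundary points at the 1-z, while the partition lookup correctly gives the 0-z.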

[last updated July 8, 2005]