CM: NetLogo ProbLab: Partition Permutation Distribution: Riddle #1

Partition Permutation: Riddle #1

If you run the experiment over many totals, with DIMINISHING-SAMPLE-SPACE? set to "On," an interesting relation emerges between the heights of adjacent columns in the "ADDENDS" histogram (see Figure, above, representing 10,000 runs up to the total of 100). The second column is 1/2 as tall as the first column, the third column is 2/3 as tall as the second column, the fourth column is 3/4 as tall as the third column, etc. In other words, the second column is 1/2 as tall as the first, the third column is 1/3 as tall as the first, the fourth column is 1/4 as tall as the first, etc. This monotonic decreasing function, 1/n, behaves differently from the histogram in Shuffle Board, where we counted 'attempts until success' ("waiting time"). Can you explain why we get a '1/n' function here? Perhaps induction could help.

Elaborating on the question
In this model, you select a 'target total,' and then the program generates random addends that might add up to this target-total. For instance, if you have selected '5' as your target-total, the program begins by randomly selecting a number in the range of 1 through 5. It might, for instance, begin by choosing '3' and then -- because is has only another 2 to fill up to the target-total -- it chooses randomly between '1' and '2.' So the program might either choose a '2' and be done with it, or it might choose a '1' and then, subsequently, choose another '1' so as to complete the adding up to 5. [Note that this is the way the model works with the DIMINISHING-SAMPLE-SPACE? switch set at "On."]

We have noted that the histogram within the model, which shows how often each addend was selected over many runs, bears a patterned relation between the value of n, an addend building up the target total, and the chance of getting n. Specifically, we saw that an addend n occurs 1/n as often as the addend '1' occurs. For instance, if the number '1' has been selected 1,000 times, then the addend '5' will have been selected roughly 200 times (1/5 * 1000). We ask you to think with us why we get this distribution.

Exploring a single case
To solve this problem, we will conduct a combinatorial analysis of the sample space of all possible addend sequences up to the target total. This theoretical-probability analysis will help us understand the distribution of randomly selected addends that we observed in the empirical-probability experiment. For example, we’ll find all the possible addend sequences adding up to 5 that include the addend '1', e.g., 4-1, 1-4, and 1-3-1. Note that the sequence 1-4 has a single '1,' and the sequence 1-3-1 has two '1's. So in terms of the histogram in this experiment, the single addend sequence 1-3-1 is “worth” two occurrences of '1'—twice as much as the addend sequence 1-4. Next, based on the rules of the experiment, we will determine the chances of getting each one of these sequences. Also note that the chance of getting a 1-4 and a 4-1 are not equal. Once the sequence has randomly begun with '1,' there are four possible addends to choose from next in building towards the target total (addends '1' though '4'). However, if the first addend is '4,' the only possible way of completing up to 5 is by selecting a 1. Therefore, we should expect to get a 1-4 less often that a 4-1 (1/5 * 1/4 vs. 1/5 * 1/1, respectively). Combining the counts and the chances will give us the overall frequency of an addend in the experiment. We'll explain this further now.

Let's say that we want to show that the chance, per permutation, of getting a '2' is 1/2 as compared to the chance of getting a '1.' When we run the simulation at a target-total of '5,' we don't know which permutation we're about to get: it might have no 2's at all, such as in the permutations 1-4, 3-1-1, 5, and 4-1; it might have a single 2, such as in the permutation 3-2; and it might have two 2's, such as in the permutation 2-1-2. So the permutation 1-4 has a value of 0 (no 2's in it), the permutation 3-2 has a value of 1 (one 2 in it), and the permutation 2-1-2 has a value of 2 (because there are two 2's). Now, there are exactly six different partition permutations totaling 5 with a single '2' in them: 1-1-1-2, 1-1-2-1, 1-2-1-1, 2-1-1-1, 3-2, and 2-3. Also, there are exactly three different permutations with two 2's in them: 1-2-2, 2-1-2, and 2-2-1. Hmmmm... three is half of six... It seems as though we're almost there... Are we?

Now here is where it gets really tricky... You might think that there is an equal chance of getting each of these permutations, and so all we'd need to do is compare the number of permuations that have a '2' to the number of those that have '1.' However, the way this model works, there is not an equal chance of getting each permutation. And when the DIMINISHING-SAMPLE-SPACE? switch is set to "On," the ADDENDS histogram expresses these different chances in the non-flat distribution that we are analyzing in this riddle solution. For instance, there is a higher chance of getting a 3-2 (1/5 * 1/2) as compared to getting a 2-3 (1/5 * 1/3). Also, the value ("worth") of 2-1-1-1, in terms of the number of 2's we get, is just 1 (because there is only a single 2). But the value of 2-2-1 is two (because there are two 2). We need to take this value into consideration as we look at the possible combinations and what they give us in terms of the addend we're looking for. Looking both at the probability of getting somthing and at what it's worth to you is called calculating the expected value (or expectation value). To better understand these probability calculations, we wrote out every possible combination of addends with the target-total of '5.'

Explaining the behavior
We considered two explanations for the behavior described in the riddle, both focusing on the probabilities of getting the partition permutations:
The first exlanation was the General Case approach, in which we considered the situations where the target-total is one more, two more and three more than the number for which we are looking. Using these cases, we attempted to prove that the number of target addends (n) is expected to be 1 / n of the number of addends of 1. This approach only gave partial results, though it gave us a feeling that the behavior we are studying "makes sense."

The Induction Approach (coming very shortly) gave more hopes for a mathematical proof. There, we considered how one could derive the calculation of probabilities of each of the addends. Using that, one could prove, either by the "brute force" method (using the computer) or mathematically that the probability would indeed turn out to be 1/n. The approach had a couple of holes, including the question of how to get from the problem to the derivation of the probabilities using mathematics.

Thus, we have two unfinished methods of solving this riddle. We are trying to work with mathematical procedures that a dedicated middle-school student could understand. Can you finish our solutions or come up with one of your own?

[Last updated Dec 4, 2004]