NetLogo Models Library:
Sampler is a HubNet Participatory Simulation in statistics. It is part of the ProbLab curricular models. Students engage in statistical analysis as individuals and as a classroom. Through these activities, students discover the meaning and use of basic concepts in statistics.
Students take samples from a hidden population and experience the mathematics of statistics, such as mean, distribution, margin of error, etc. The graphics in the SAMPLER interface are designed to ground students' understanding of statistics in proportional judgments of color distribution. The collaborative tools are designed to help students appreciate the power of large numbers for making inferences about populations. Students experience distributions both at an individual level -- variation in their own samples -- and at a group level -- variation in all students' guesses. This analogy is designed for students to appreciate the diversity of opinions in the classroom and the power of embracing everyone to achieve a complex task.
In SAMPLER, statistics is presented as a task of making inferences about a population under conditions of uncertainty and limited resources. For example, if you wanted to know what percentage of students in your city speak a language other than English, how would you go about it? Would it be enough to measure the distribution of this variable in your own class? If yes, then how sure could you be that your statistic is representative of the whole city? If not, why not? Are there certain groups of people that it would make more sense to use as a sample? Are there other groups it would make no sense to use? For instance, would it make sense to stand outside a movie house that is showing a French film with no subtitles and ask each patron whether they speak a second language? Is this a representative sample? Should we look at certain parts of town? Would all parts of town be the same? Oh, and by the way, what is an average (a mean)? A variable? A value? What does it mean to measure a distribution of a variable within a population?
Many students have a very difficult time understanding statistics -- not only in middle and high school, but also in college and beyond. Yet there are certain visual-mental capabilities we all have -- even very young children -- that could be thought of as naive statistics. These capabilities are the proportional judgments we make constantly. We make proportional judgments when we need to decide how to maximize the utility of our actions. For instance, when we come to a new place we may say, "People in this town are very nice." How did we decide that? Or, "Don't buy fruit there -- it's often overripe." How did we infer that? Or, "To get to school, take Main Street -- it's the fastest route in the morning; but drive back through High Street, I find that's faster in the afternoon."
The teacher works in NetLogo and acts as the server for the students (the "clients") who each have their own client interface on their computer screens. Students see the teacher's interface projected on the classroom screen, and they can instruct the teacher to manipulate settings of the microworld that they do not have on their own client interfaces. The view in the projected interface features a square "population" of 3600 squares. Individual patches are either green or blue. The squares' color is the attribute we measure in SAMPLER. So, the SAMPLER color is a variable that can have one of two values: green or blue (a dichotomous variable, like a coin). In a basic SAMPLER activity, students and/or the teacher reveal(s) parts of or all the population and students discuss, approximate, take samples, and input their individual guesses as to the percentage of green patches within the revealed sector of the population. All participating students' inputs are collected, pooled, and represented in monitors and in the plot. Thus, each student constitutes a data-point agent and can experience impacting the class statistics.
Through collaboration, students are to achieve, as a class, the best possible approximation of the population.
The $$ game: At the beginning of every round and later, whenever the facilitator decides, all clients receive max-points, for instance $100. Students can then bet either on their own guess or on the group guess. They pay 1 point for every percentage point by which their bet misses the truth or, if a margin of error has been agreed upon, by which it falls outside that margin. The winner of a $$ game is the player with the most points remaining after all of the rounds. This is an optional feature.
Basic Activity: If you change %-GREEN, RANDOM-%-GREEN? or ABNORMALITY you will need to press SETUP for the changes to take effect. You may also press SETUP if you want to get a new population with the current settings.
Press the GO button. You will now be able to reveal samples of the population by clicking in the view. However, students will not be able to take samples until STUDENT-SAMPLING? is set to true.
Before the students start sampling you might want to present questions to them, such as: 'What is this?' 'How green is this?' 'How could we figure out?'
When users log in they receive their own interface. To change their personal guess for the percent green they should move the %-GREEN slider. When the user has a final guess, s/he should press the SUBMIT-ANSWER button (otherwise the guess will not be counted).
After all the students have submitted guesses, press the PLOT-GUESSES button, which will plot all data from this round and advance to the next. You cannot advance the activity if no students have submitted answers. If some students have not submitted answers you will be warned, though you may continue if you wish. Each round is simply a period in which students may make guesses about the greenness of the population. When a new round begins the students' submitted? flag will be reset to false so they can make another guess. The plots are kept from round to round and the population does not change. If you wish to change the population press the SETUP button (this will clear all plotted data too).
$$ Game: The procedure to play the $$ game is similar to the basic activity: take samples, guess the % green, and press the SUBMIT-ANSWER button. The student should then also decide whether to bet on that guess or on the average guess among all students. By default students are scored using their own guesses. To change this they should press the GO WITH GROUP button. Students will be scored on how close their bet is to the actual percent green in the population.
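The choice between the two bets can be sketched in Python. This is a minimal illustration, not the model's NetLogo code: the helper name `choose_bet`, the list of guesses, and the true percent green are all made up for the example.

```python
from statistics import mean

def choose_bet(own_guess, all_guesses, go_with_group):
    """Return the value a student is scored on (hypothetical helper).

    By default the student's own guess is used; after GO WITH GROUP
    the average of all submitted guesses is used instead.
    """
    return mean(all_guesses) if go_with_group else own_guess

guesses = [40, 55, 60, 65]   # made-up submitted guesses, in percent green
true_green = 56              # made-up true percent green

own_bet = choose_bet(40, guesses, go_with_group=False)
group_bet = choose_bet(40, guesses, go_with_group=True)

print(abs(own_bet - true_green))    # distance of the individual bet from the truth
print(abs(group_bet - true_green))  # distance of the group bet from the truth
```

In this made-up class the student who guessed 40 is far from the truth, while the group average is close to it, which is the situation the GO WITH GROUP option is meant to let students explore.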
Buttons: SETUP - Creates a new patch population with a new %-green (or random percent green if RANDOM-%-GREEN? is enabled) and the new value of abnormality. Clears the plot and data from all rounds. Students need not log out; user names and student scores will not be lost.
GO - Starts the activity. The teacher can always reveal samples by clicking in the view. The students can only take samples if STUDENT-SAMPLING? is enabled.
SHOW/HIDE-GRID - Turns on and off the grid that shows clear dividing lines between patches.
SHOW/HIDE-POPULATION - reveal the true color (green or blue) of each patch, or return any sampled patches to gray. If ORGANIZE? is true all the green patches will appear on the left and all the blue patches on the right. If you want to "disorganize" the population, turn the ORGANIZE? switch off and press SHOW-POPULATION again.
POOL-SAMPLES - reveal all the samples taken by the server and the clients.
PLOT-GUESSES - histograms the collected guesses in the plot. Does the bookkeeping required at the end of a round and prepares for the next round. Once you have pressed PLOT-GUESSES the current round has ended and the next round has begun.
REPLENISH-SAMPLING-ALLOWANCE - resets each of the clients' sampling allowance to SAMPLING-ALLOWANCE.
%-GREEN - controls the percent of patches that are green if RANDOM-%-GREEN? is off.
ABNORMALITY - controls to what extent the distribution deviates from 'normal' (for a given percent green you'll get larger clumps for a larger setting).
SAMPLING-ALLOWANCE - The total number of patches clients are allowed to reveal. The teacher may REPLENISH-SAMPLING-ALLOWANCE to set all clients back to SAMPLING-ALLOWANCE.
SAMPLE-SIZE - determines the number of patches on a side of a sample block. For instance, SAMPLE-SIZE of 5 reveals a block of 25 patches. If STUDENT-SAMPLE-SIZE? is off this is also the sample size on the clients.
STUDENT-SAMPLING? - if true, students can sample; otherwise not.
STUDENT-SAMPLE-SIZE? - if true, students can set the size of their samples; otherwise not.
RANDOM-%-GREEN? - if true when SETUP is pressed, a random percentage of green patches is chosen. Otherwise %-GREEN is used.
KEEP-SAMPLES? - when sampling, if true, old samples are still displayed. If false, old samples are removed and cannot be seen.
ORGANIZE? - if true, all the green patches will be pushed to the left and the blue will be pushed to the right when you press the SHOW-POPULATION button. Otherwise, the patches will be shown as their true colors.
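The relation between SAMPLE-SIZE and SAMPLING-ALLOWANCE described above can be worked through with a little arithmetic. The Python sketch below is only an illustration: the function names and the allowance of 200 are hypothetical, and it ignores the question of whether patches that were already revealed are charged again.

```python
def patches_revealed(sample_size):
    """A sample is a square block with sample_size patches on a side."""
    return sample_size ** 2

def samples_affordable(sampling_allowance, sample_size):
    """Whole samples a client can take before the allowance runs out,
    assuming each revealed patch uses up one unit of the allowance."""
    return sampling_allowance // patches_revealed(sample_size)

print(patches_revealed(5))         # 25, as in the SAMPLE-SIZE example
print(samples_affordable(200, 5))  # 8 samples from a hypothetical allowance of 200
```

A smaller SAMPLE-SIZE lets a student spread the same allowance over more locations in the population.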
# STUDENTS - shows the number of connected clients.
# GUESSES - shows how many guesses were collected when you last pressed PLOT-GUESSES.
MEAN THIS ROUND - shows the average of guesses that are currently plotted in the histogram.
STANDARD DEV - shows the standard deviation of guesses plotted in the histogram.
# ROUNDS - shows how many rounds have been played since the last time SETUP was pressed. A round is a period in which students may make guesses about the greenness of a given population. A round ends and a new one begins each time the PLOT-GUESSES button is pressed. This is reset when you press SETUP.
MEAN ALL ROUNDS - the cumulative average for all rounds per this population (since you last pressed SETUP).
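The quantities behind these monitors can be reproduced by hand. The Python sketch below uses made-up guesses for two rounds. It assumes that STANDARD DEV is the sample standard deviation and that MEAN ALL ROUNDS pools every guess since SETUP; the text above does not state either detail, so treat both as assumptions.

```python
from statistics import mean, stdev

# Made-up guesses (percent green) for two rounds on the same population.
rounds = [[40, 50, 60, 70], [50, 55, 60, 65]]

this_round = rounds[-1]
n_guesses = len(this_round)          # # GUESSES
mean_this_round = mean(this_round)   # MEAN THIS ROUND
sd_this_round = stdev(this_round)    # STANDARD DEV (sample form, an assumption)

all_guesses = [g for r in rounds for g in r]
mean_all_rounds = mean(all_guesses)  # MEAN ALL ROUNDS (pooled, an assumption)

print(n_guesses, mean_this_round, round(sd_this_round, 2), mean_all_rounds)
```

Because both rounds here have the same number of guesses, the pooled mean equals the mean of the two round means; with unequal numbers of guesses the two definitions would differ.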
AVERAGES OF STUDENT GUESSES - X-axis is %-GREEN and Y-axis is # STUDENTS. Here you see four statistics as displayed by four different plot pens:
%-GREEN - The user's guess for the percent green.
SAMPLING ALLOWANCE - the number of patches left in the user's sampling allowance.
MY-SAMPLE-SIZE - the width of the sample blocks given that STUDENT-SAMPLE-SIZE? is on.
SUBMIT-ANSWER - let the server know that you've locked in the current value of %-GREEN as your guess for this round.
SUBMITTED? - false until the user presses SUBMIT-ANSWER this round.
For the $$ Game only:
REPLENISH $$ - resets each client's my-$$ to the starting quantity. Clients' $$-REMAINING are never replenished unless you press this button.
MARGIN-OF-ERROR - This determines how accurate the guess has to be in order to be correct. For example, if it's set at 3 and the greenness is 70 then you can guess between 67 and 73 and not have points taken off, but if you guess 74 or 66 you get 1 point off, etc.
CLASS MEAN $$ - shows the mean of students' MY-$$.
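The MARGIN-OF-ERROR rule can be written as a one-line penalty function. The Python sketch below follows the worked example given for MARGIN-OF-ERROR (margin 3, greenness 70); the function name `penalty` is hypothetical and this is not the model's own code.

```python
def penalty(bet, true_green, margin_of_error):
    """Points lost: one per percentage point the bet lies outside the margin."""
    return max(0, abs(bet - true_green) - margin_of_error)

# The example from the text: margin of error 3, true greenness 70.
for bet in (66, 67, 70, 73, 74):
    print(bet, penalty(bet, 70, 3))
```

Bets of 67 through 73 lose no points, while 66 and 74 each lose 1 point, matching the description above.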
$$ Game on the client:
GO WITH GROUP - When scoring use the group guess rather than this individual's guess.
$$ - the $$ remaining for this client (essentially his/her score).
When you set ORGANIZE? to on and press SHOW-POPULATION, the green patches move left and the blue patches move right in the view, forming a contour line. This line should fall directly below the slider handle above it and similarly should line up with the mean line in the plot. We can compare these three features directly because the 0 and 'whole' (100%) of each of these features are aligned. That is, the sliders, view, and plot have all been placed carefully so as to subtend each other precisely.
The abnormality distribution feature does not take much code to write, but is effective. Look at the code and try to understand it.
Set RANDOM-%-GREEN? to true, press SETUP, and take samples. What is the minimal number of samples you need in order to get a good idea of the distribution of colors in the population? How 'good' must a good idea be? Can you think of a way of describing this 'goodness'? What is a good way of spreading the samples on the population?
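One way to explore these questions away from the classroom is a small simulation. The Python sketch below is a stand-in, not the SAMPLER code: it assumes a 60 x 60 grid (inferred from the square population of 3600 patches), scatters the green patches purely at random (no ABNORMALITY clumping), and counts a patch twice if two sample blocks happen to overlap. All function names are hypothetical.

```python
import random

SIDE = 60  # 60 x 60 = 3600 patches (inferred from the population size)

def make_population(percent_green, rng):
    """Return a SIDE x SIDE grid of booleans (True = green), shuffled at random."""
    n_green = round(SIDE * SIDE * percent_green / 100)
    cells = [True] * n_green + [False] * (SIDE * SIDE - n_green)
    rng.shuffle(cells)
    return [cells[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]

def take_sample(pop, size, rng):
    """Reveal a size x size block at a random position; return its patches."""
    r0 = rng.randrange(SIDE - size + 1)
    c0 = rng.randrange(SIDE - size + 1)
    return [pop[r][c] for r in range(r0, r0 + size) for c in range(c0, c0 + size)]

def estimate(pop, n_samples, size, rng):
    """Percent green among all patches revealed by n_samples blocks."""
    seen = [p for _ in range(n_samples) for p in take_sample(pop, size, rng)]
    return 100 * sum(seen) / len(seen)

rng = random.Random(1)            # fixed seed so runs are repeatable
pop = make_population(70, rng)    # hidden population, 70% green
for n in (1, 4, 16, 64):
    print(n, round(estimate(pop, n, 3, rng), 1))
```

Running the loop several times with different seeds gives a feel for how much the estimate varies for each number of samples, which is one way of describing how 'good' an estimate is.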
Try setting the ABNORMALITY slider to different values and press SETUP over and over for the same percentage green, for instance 50%. Can you think of situations in the world where a certain attribute is distributed in a population in a way that corresponds to a high value of ABNORMALITY? What do we mean when we speak of a 'uniform distribution' within a population? For instance, is a distribution of ABNORMALITY = 0 uniform? Or must there be strict order, for instance stripes of target-color, in order for you to feel that the distribution is uniform? Also, is there a difference between your sense of uniformity whether you're looking at the whole population or just at certain parts of it? If you threw a handful of pebbles onto a square area, would you say they fell 'uniformly'? What kinds of patterns are natural, and what kinds of patterns would you think of as coincidental?
What other quantitative aspects of sampling might a teacher or student need so as to understand and do more in this activity? Perhaps the class would want to keep a record of how well they are doing over an entire lesson. How would you quantify such performance and how would you display it? Would a plot be useful for this or just a list of numbers?
Since one of the most common configurations of this model is a 50-50 split between green and blue, the world has an even number of columns and rows so that there are exactly 50% of the patches that are green rather than a close approximation. Since an even grid is required the origin was moved to the lower left corner instead of being slightly off-center near the middle of the world.
This activity uses HUBNET-SEND-OVERRIDE to reveal the samples in the client views.
All models in ProbLab deal with probability and statistics in ways that may enrich student understanding of sample space, randomness, and distributions. In particular, many models share with SAMPLER the 3-by-3 sample that we call a "9-block."
If you mention this model or the NetLogo software in a publication, we ask that you include the citations below.
For the model itself:
Please cite the NetLogo software as:
Copyright 2003 Uri Wilensky.
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.
Commercial licenses are also available. To inquire about commercial licenses, please contact Uri Wilensky at firstname.lastname@example.org.
This activity and associated models and materials were created as part of the projects: PARTICIPATORY SIMULATIONS: NETWORK-BASED DESIGN FOR SYSTEMS LEARNING IN CLASSROOMS and/or INTEGRATED SIMULATION AND MODELING ENVIRONMENT. The project gratefully acknowledges the support of the National Science Foundation (REPP & ROLE programs) -- grant numbers REC #9814682 and REC-0126227.