
NetLogo Models Library:
HubNet Activities/Unverified

Note: This model is unverified. It has not yet been tested and polished as thoroughly as our other models.


Sampler

uses NetLogo 4.0.4
requires Java 1.4.1+


This activity is available only for computer HubNet, not calculator HubNet.

WHAT IS IT?

Sampler is a HubNet Participatory Simulation in statistics. It is part of the ProbLab curricular models. Students engage in statistical analysis as individuals and as a classroom. Through these activities, students discover the meaning and use of basic concepts in statistics.

Students take samples from a hidden population and experience the mathematics of statistics, such as mean, distribution, and margin of error. The graphics in the SAMPLER interface are designed to ground students' understanding of statistics in proportional judgments of color distribution. The collaborative tools are designed to help students appreciate the power of large numbers for making inferences about populations. Students experience distributions both at an individual level -- variation in their own samples -- and at a group level -- variation in all students' guesses. This analogy is designed to help students appreciate the diversity of opinions in the classroom and the power of including everyone in achieving a complex task.

Learning Statistics:
In SAMPLER, statistics is presented as a task of making inferences about a population under conditions of uncertainty and limited resources. For example, if you wanted to know what percentage of students in your city speak a language other than English, how would you go about it? Would it be enough to measure the distribution of this variable in your own class? If yes, then how sure could you be that your statistic is representative of the whole city? If not, why not? Are there certain groups of people that it would make more sense to use as a sample? Are there other groups it would make no sense to use? For instance, would it make sense to stand outside a movie house that is showing a French film with no subtitles and ask each patron whether they speak a second language? Is this a representative sample? Should we look at certain parts of town? Would all parts of town be the same? Oh, and by the way, what is an average (a mean)? A variable? A value? What does it mean to measure a distribution of a variable within a population?

Many students have a very difficult time understanding statistics--not only in middle and high school, but also in college and beyond. Yet on the other hand, there are certain visual-mental capabilities we all have--even very young children--that could be thought of as naive statistics. These capabilities are the proportional judgments we make constantly. We make proportional judgments when we need to decide how to maximize the utility of our actions. For instance, when we come to a new place we may say, "People in this town are very nice." How did we decide that? Or, "Don't buy fruit there--it's often overripe." How did we infer that? Or, "To get to school, take Main street--it's the fastest route in the morning; but drive back through High street, I find that's faster in the afternoon."

HOW IT WORKS

The teacher works in NetLogo and acts as the server for the students (the "clients"), who each have their own client interface on their computer screens. Students see the teacher's interface projected on the classroom screen, and they can instruct the teacher to manipulate settings of the microworld that they do not have on their own client interfaces. The View in the projected interface features a square "population" of 3721 squares (patches). Individual patches are either green (target-color) or blue (other-color). The patches' color is the attribute we measure in SAMPLER. So, the SAMPLER color is a variable that can have one of two values: green or blue (a dichotomous variable, like a coin). In a basic SAMPLER activity, the students and/or the teacher reveal part or all of the population, and students discuss, approximate, take samples, and input their individual guesses as to the percentage of green patches within the revealed sector of the population. All participating students' inputs are collected, pooled, and represented in monitors and in the plot. Thus, each student constitutes a data-point agent and can experience impacting the class statistics.

Through collaboration, students are to achieve, as a class, the best possible approximation of the population.

The $$ game: At the beginning of every round, and later whenever the facilitator decides, all clients receive max-points, for instance $100. Students can then bet either on their own guess or on the group guess. They pay 1 point for every percentage point by which their bet misses the true value, or by which it falls outside the margin of error that the class agrees upon. This is an optional feature.

HOW TO USE IT

Quickstart Instructions:
SETUP initializes all variables, but it also disconnects the users. You needn't press it now; use it only if you wish to begin the simulation again. Normally, choose RERUN to initialize variables for using Sampler with the same class. If you've just begun, check that the %-TARGET-COLOR slider over the View is set to 50, the RANDOM-RERUN? switch is set to Off, and the ABNORMALITY slider is set to 0. Note: the target color is green and the other color is blue.

Now press REVEAL POP. to see the current population: 50% of the patches are green; you have 50% greenness. Practice giving the population a different percentage and distribution type: set the percentage green with the %-TARGET-COLOR slider over the View, and the distribution type with the ABNORMALITY slider. ABNORMALITY controls how much the distribution of green deviates from 'normal'. Explore it.

If RANDOM-RERUN? is On, then the computer will choose the percentage green for you. That means that even you will not know the percentage green in the population. Press the RERUN button for a new population. REVEAL POP. varies according to the values of ORGANIZE? and GRID?. ORGANIZE? Off shows the colors where they are and ORGANIZE? On does color separation (see how the green/blue contour falls exactly under the %-TARGET-COLOR slider handle). GRID? On puts frames around the patches and GRID? Off does not. SAMPLE allows you to use the mouse to reveal square areas on the View. The size of these square samples depends on the value of the SAMPLE-BLOCK-SIDE slider. Set KEEP-SAMPLES? to choose whether or not to keep successive samples displayed.

In your own preparation for class, in order to anticipate students' questions, you should practice by using RANDOM-RERUN? and taking samples. Think the way your students would. There are many questions to be asked, especially when you play with ABNORMALITY and get color clusters in your population. Work on strategies for maximizing the accuracy AND efficiency of your sample measurement: Should you take small samples or big ones? Just one sample or many? Where from?

To start off your students, do more or less what you did yourself:
-Set the sliders and switches as follows: %-TARGET-COLOR 50; ABNORMALITY 0; RANDOM-RERUN?, ORGANIZE?, and GRID? Off.
-Press REVEAL POP.
-Now ask students 'What is this?', 'How green is this?', 'How could we figure it out?'
-To show the class how green it is, set ORGANIZE? to On and press REVEAL POP.
-Draw students' attention to the contour line and how it compares to their guess on the slider above it: Did they over- or under-guess? By how much? How well did they do? Were they close enough?
-Try this with other percentages.

Now you'll want students to open the HubNet client and log in. When they log in, they should each see their own SAMPLER interface. Now they can input their own guesses as long as COLLECT DATA is pressed. COLLECT DATA imports values from all active participants -- always keep this pressed down. Ask the students to set their sliders to their guesses and press their INPUT-GUESS buttons. PLOT GUESSES calculates the statistics of participants' data and displays them in the plot and monitors.

The logic of the $$ game is that students bet on either their own guess or the group guess; press COMMIT to register each student's decision whether or not to go with the group guess. Students each 'pay' for their error: one point is deducted from their $$ for every percentage point by which their guess was off. You can set the MARGIN-OF-ERROR slider to allow for more flexibility.
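
The deduction rule can be sketched in Python. This is only an illustration, not the model's NetLogo code: the function names points_deducted and settle_round are hypothetical, and the rule assumed is the one given for the MARGIN-OF-ERROR slider (one point for each percentage point of error beyond the margin).

```python
def points_deducted(guess, true_percent, margin_of_error=0):
    """Points lost for a guess: one per percentage point of error
    beyond the allowed margin (hypothetical helper, not model code)."""
    error = abs(guess - true_percent)
    return max(0, error - margin_of_error)


def settle_round(dollars, own_guess, group_guess, go_with_group,
                 true_percent, margin_of_error=0):
    """Return a student's remaining $$ after betting on one of the two guesses."""
    bet = group_guess if go_with_group else own_guess
    return dollars - points_deducted(bet, true_percent, margin_of_error)


# Margin 3, true greenness 70: guesses from 67 to 73 cost nothing.
print(points_deducted(73, 70, margin_of_error=3))  # 0
print(points_deducted(74, 70, margin_of_error=3))  # 1

# A student whose own guess is 60 loses 7 points, but loses nothing
# by going with a group guess of 68.
print(settle_round(100, 60, 68, False, 70, margin_of_error=3))  # 93
print(settle_round(100, 60, 68, True, 70, margin_of_error=3))   # 100
```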

When the basic procedure is clear to students, switch ALLOW-STUDENT-SAMPLING? to On. Students can sample as many patches as their sampling allowance permits. Press REPLENISH SAMPLING ALLOWANCE if you want to reset all students' allowances to TOTAL-SAMPLE-PATCHES. Discuss individual and collaborative sampling strategies and their relation to the histogram distribution. Good luck!

See the SAMPLER section of the Computer HubNet Participatory Simulations Guide for an in-depth lesson plan.

Buttons:
SETUP - clears all turtles and patches and the plot. This button should only be pressed when starting out with a new group of users since all data is lost.
RERUN - creates a new population from which to sample. In creating this population, each patch has either a random (if RANDOM-RERUN? is true) or a user-chosen (if RANDOM-RERUN? is false) percent chance of being green. The user-chosen chance is set with the %-TARGET-COLOR slider. This button should be used to set up the model again for collecting data with a new population and the same users connected.
SAMPLE - allows the server to reveal mouse-selected square areas of side SAMPLE-BLOCK-SIDE within the population. Set the KEEP-SAMPLES? switch to choose whether or not to keep successive samples.
COLLECT DATA - collects student samples and guess values from all active clients. Clients, if allowed to sample, can reveal as many patches as they have left in their sample allowance. One sample allowance unit allows you to reveal one patch.
POOL SAMPLES - shows all samples (from the clients and the server) at once
REVEAL POP. - shows all the patches' colors. If ORGANIZE? is false, the patches reveal whatever color they got when RERUN was pressed. If ORGANIZE? is true, the patches segregate themselves by color, green to the left and blue to the right. If GRID? is true, each patch has a thin frame surrounding it.
PLOT GUESSES - histograms the collected guesses in the plot. Once you have plotted, guesses are no longer accepted for this population until RERUN or MORE DATA is pressed.
MORE DATA - prepares the model for another round of class guesses for this specific population. To create a new population, use the RERUN button.
REPLENISH SAMPLING ALLOWANCE - resets each client's sampling allowance to TOTAL-SAMPLE-PATCHES
COMMIT - collects students' current setting of their GO-WITH-GROUP? switch. This is used for the $$ game, and is not needed if you are not doing the game.
REPLENISH $$ - resets each client's $$-REMAINING to the starting quantity. Note: clients' $$-REMAINING are never replenished unless you press this button. We suggest that a good time to press it might be when you press the RERUN button.
NEXT >>> - shows the next quick start instruction
<<< PREVIOUS - shows the previous quick start instruction
RESET INSTRUCTIONS - shows the first quick start instruction

Sliders:
ABNORMALITY - controls to what extent the distribution deviates from 'normal' (for a given percent green you'll get larger clumps for a larger setting)
MARGIN-OF-ERROR - used for the $$ game. This determines how accurate the guess has to be in order to be correct. For example, if it's set at 3 and the greenness is 70 then you can guess between 67 and 73 and not have points taken off, but if you guess 74 or 66 you get 1 point off, etc.
TOTAL-SAMPLE-PATCHES - determines how much sampling allowance students get when it is replenished. When sampling, one sample allowance unit allows you to reveal one patch.
SAMPLE-BLOCK-SIDE - determines the side of the square that is revealed when sampling in the model. For instance, SAMPLE-BLOCK-SIDE of 5 gives a sample size of 25 patches. This value does not affect how big the samples are on the clients. They have their own versions of this slider.
CLUSTER-GUESSES - controls the histogram interval: the higher the setting, the higher the interval.
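
To illustrate how SAMPLE-BLOCK-SIDE relates to sample size, here is a rough Python sketch. It is not taken from the model: the helper names sample_block and percent_green, the toy checkerboard population, and the choice to place a block by its top-left corner are all assumptions made for the example.

```python
def sample_block(population, top, left, block_side):
    """Return the patches in a block_side x block_side square whose
    top-left corner is (top, left); slicing clips the block at the edges."""
    rows = population[top:top + block_side]
    return [patch for row in rows for patch in row[left:left + block_side]]


def percent_green(patches):
    """Percentage of patches that are green (True)."""
    return 100 * sum(patches) / len(patches)


# Toy 10 x 10 checkerboard population: True = green, False = blue.
population = [[(r + c) % 2 == 0 for c in range(10)] for r in range(10)]

sample = sample_block(population, top=0, left=0, block_side=5)
print(len(sample))            # 25
print(percent_green(sample))  # 52.0
```

A block side of 5 reveals 25 patches, and the sample's greenness (52%) is close to, but not exactly, the population's 50%.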

Switches:
ALLOW-STUDENT-SAMPLING? - if true, students can sample; otherwise not.
ALLOW-STUDENT-SET-BLOCK-SIDE? - if true, students can set the size of their samples, but only before sampling for the first time that round; otherwise they cannot set it.
ORGANIZE? - if true, when REVEAL POP. is pressed, the green and blue colors will be segregated (green on the left). If false, then when you press REVEAL POP., the colors will not be organized.
GRID? - if true, each patch will have a thin frame around it to help you count them. If false, then the frames will not be seen.
RANDOM-RERUN? - when RERUN is pressed, each patch has either a random (if RANDOM-RERUN? is true) or a user-chosen (if RANDOM-RERUN? is false) percent chance of being green. The user-chosen chance is set with the %-TARGET-COLOR slider.
KEEP-SAMPLES? - when sampling, if true, old samples are still displayed. If false, old samples are removed and cannot be seen.

Monitors:
SAMPLE SIZE - shows the number of patches in the current sample chosen in the NetLogo model (the teacher's screen)
CLASS MEAN $$ - shows the mean of students' $$-REMAINING, counting only students whose $$-REMAINING is 0 or above
SAMPLE/POP - shows the quotient of TARGET COLOR % IN THIS SAMPLE and TARGET COLOR % IN POPULATION. So, in a sense, it tells you how indicative a sample is of the population statistic. If the quotient is exactly 1, then the sample can be said to be representative of the population. If the quotient is smaller than 1, then the sample underrepresents the population statistic, and if the quotient is larger than 1, then the sample overrepresents it. Larger samples generally give quotients that are closer to 1. This monitor shows pertinent information only when SHOW-SAMPLE-%? and SHOW-POP-%? are both true.
# STUDENTS - shows how many students are actively connected to the NetLogo model (the teacher's screen)
# GUESSES - shows how many guesses were collected when you last pressed PLOT GUESSES.
MEAN THIS ROUND - shows the average of guesses that are currently plotted in the histogram.
STANDARD DEV - shows the standard deviation of guesses
# ROUNDS - shows how many rounds are represented in the plot.
MEAN ALL ROUNDS - the cumulative average for all rounds per this population (since you last pressed RERUN).
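
The SAMPLE/POP quotient described above can be sketched in Python as follows. The function names sample_pop_quotient and describe are hypothetical and the numbers are made up; this is an illustration of the arithmetic, not the model's code.

```python
def sample_pop_quotient(sample_percent, pop_percent):
    """Quotient of the sample's percent green and the population's percent green."""
    if pop_percent == 0:
        raise ValueError("quotient is undefined when the population has no green")
    return sample_percent / pop_percent


def describe(quotient):
    """Read the quotient the way the SAMPLE/POP monitor is meant to be read."""
    if quotient < 1:
        return "sample underrepresents the population statistic"
    if quotient > 1:
        return "sample overrepresents the population statistic"
    return "sample is representative of the population"


# A sample that is 40% green, drawn from a population that is 50% green.
q = sample_pop_quotient(40, 50)
print(q)            # 0.8
print(describe(q))  # sample underrepresents the population statistic
```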

Plots:
AVERAGES OF STUDENT GUESSES - X-axis is %-TARGET-COLOR and Y-axis is # STUDENTS. Here you see four statistics as displayed by four different plot pens:
1. GUESSES: Students' collected guesses for a round represented in histograms.
2. MEAN-OF-GUESSES: the average value of guesses for the recent round
3. MEANS: the average values from successive rounds
4. MEAN-OF-MEANS: the average value of 'means'.
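
As a rough illustration of how the four pens relate, the following Python sketch computes each statistic from made-up guesses. It is not the model's code: the histogram helper is hypothetical, and treating CLUSTER-GUESSES as a bin width is an assumption.

```python
from statistics import mean


def histogram(guesses, interval):
    """Count guesses in bins of width `interval`, keyed by each bin's lower edge."""
    counts = {}
    for guess in guesses:
        lower_edge = (guess // interval) * interval
        counts[lower_edge] = counts.get(lower_edge, 0) + 1
    return dict(sorted(counts.items()))


# Made-up guesses (percent green) from two rounds on the same population.
rounds = [[40, 45, 50, 55], [48, 52, 50]]

guesses = histogram(rounds[-1], interval=5)   # 1. GUESSES (latest round)
mean_of_guesses = mean(rounds[-1])            # 2. MEAN-OF-GUESSES
means = [mean(r) for r in rounds]             # 3. MEANS (one per round)
mean_of_means = mean(means)                   # 4. MEAN-OF-MEANS

print(guesses)          # {45: 1, 50: 2}
print(mean_of_guesses)  # 50
print(means)            # [47.5, 50]
print(mean_of_means)    # 48.75
```

A wider interval merges neighboring bins: with interval=10, the first round's guesses fall into just two bins, 40 and 50.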

Client Information:
MY-GUESS-FOR-THIS-SAMPLE - students use this slider to set their guess value.
MY-SAMPLE-BLOCK-SIDE - students use this slider to set the size of their sample square
GO-WITH-GROUP-GUESS? - students use this switch to set whether or not they are committing themselves to go with the class average guess.
MESSAGE FOR YOU - displays messages sent from the server; these may be collective or for subsets of students.
MY SAMPLE ALLOWANCE - shows how many patches the student may still sample.
$$-REMAINING - shows how many dollar points the student still has left.
INPUT-GUESS - press this button to send to the server the guess that you have set on the MY-GUESS-FOR-THIS-SAMPLE slider.

THINGS TO NOTICE

When you press REVEAL POP. with ORGANIZE? set to On, the target-color and the other-color move to the left and the right of the screen, respectively, forming a contour line. The location of this contour line is comparable to two other elements on the interface: the contour line falls directly below the slider handle above it (if this was not a random run) and it relates similarly to the mean line in the plot. We can compare these three features directly because the 0 and 'whole' (100%) of each of these features are aligned. That is, the sliders, View, and plot have all been placed carefully so as to subtend each other precisely.
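
The idea that the contour position encodes the proportion of green can be sketched in Python. This is an assumption-laden illustration, not the model's code: the helpers organize and contour_fraction are hypothetical, and packing the green patches column by column from the left is only one plausible way to segregate the colors.

```python
def organize(population):
    """Return a grid of the same shape with all green (True) patches
    packed into the leftmost columns, filling one column at a time."""
    rows, cols = len(population), len(population[0])
    greens = sum(sum(row) for row in population)
    organized = [[False] * cols for _ in range(rows)]
    for i in range(greens):
        col, row = divmod(i, rows)
        organized[row][col] = True
    return organized


def contour_fraction(population):
    """Horizontal position of the green/blue contour, as a fraction of the width."""
    total = len(population) * len(population[0])
    greens = sum(sum(row) for row in population)
    return greens / total


# Toy 4 x 10 checkerboard population: half the patches are green.
population = [[(r + c) % 2 == 0 for c in range(10)] for r in range(4)]
organized = organize(population)

print(contour_fraction(population))                # 0.5
print(["G" if p else "b" for p in organized[0]])   # five G's, then five b's
```

Because the contour sits at the same fraction of the View's width as the percentage green, it lines up with a slider whose 0 and 100 marks are aligned with the View's edges.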

The abnormality distribution feature does not take much code to write, but is effective. Look at the code and try to understand it.

THINGS TO TRY

Set RANDOM-RERUN? to true and press RERUN and take samples. What is the minimal number of samples you need in order to get a good idea of the target-color distribution in the population? Anyway, how 'good' must a good idea be? Can you think of a way of describing this 'goodness'? What is a good way of spreading the samples on the population?

Try setting the ABNORMALITY slider to different values and press RERUN over and over for the same percentage green, for instance 50%. Can you think of situations in the world where a certain attribute is distributed in a population in a way that corresponds to a high value of ABNORMALITY? What do we mean when we speak of a 'uniform distribution' within a population? For instance, is a distribution of ABNORMALITY = 0 uniform? Or must there be strict order, for instance stripes of target-color, in order for you to feel that the distribution is uniform? Also, is there a difference between your sense of uniformity whether you're looking at the whole population or just at certain parts of it? If you threw a handful of pebbles onto a square area, would you say they fell 'uniformly'? What kinds of patterns are natural, and what kinds of patterns would you think of as coincidental?

Set RANDOM-RERUN? to false and press RERUN multiple times. Each time you press RERUN, set SHOW-POP-%? to true. Do you notice that the number of green patches varies from trial to trial? The way the population is established every trial is that each patch "flips a weighted coin" to see whether it should be green or not. Because there are so many patches, it turns out that the percentage of patches that "landed on" green is roughly the same as the percentage green you set with the slider. You can explore this intriguing idea further in the ProbLab model Stochastic Patchwork.
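
The "weighted coin" idea can be tried outside NetLogo with a short Python sketch. It is an illustration only: the function name green_count is hypothetical, and the population size of 3721 is taken from the description of the View above.

```python
import random


def green_count(num_patches, percent_green, rng):
    """Each patch independently turns green with the given percent chance;
    return how many patches ended up green."""
    chance = percent_green / 100
    return sum(rng.random() < chance for _ in range(num_patches))


rng = random.Random(0)  # fixed seed so the run can be repeated
for trial in range(5):
    greens = green_count(3721, 50, rng)
    print(trial, greens, round(100 * greens / 3721, 1))
```

Each trial prints a slightly different count, but with this many patches the percentage stays close to the 50 that was requested.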

EXTENDING THE MODEL

What other quantitative aspects of sampling might a teacher or student need so as to understand and do more in this activity? Perhaps the class would want to keep a record of how well they are doing over an entire lesson. How would you quantify such performance and how would you display it? Would a plot be useful for this or just a list of numbers?

NETLOGO FEATURES

Since one of the most common configurations of this model is a 50-50 split between green and blue, the world has an even number of columns and rows, so that exactly 50% of the patches can be green rather than a close approximation. Since an even grid is required, the origin was moved to the lower left corner instead of being slightly off-center near the middle of the world.

RELATED MODELS

All models in ProbLab deal with probability and statistics in ways that may enrich student understanding of sample space, randomness, and distributions. In particular, many models share with SAMPLER the 3-by-3 sample that we call a "9-block."

CREDITS AND REFERENCES

To refer to this model in academic publications, please use: Abrahamson, D. and Wilensky, U. (2003). NetLogo HubNet Sampler model. http://ccl.northwestern.edu/netlogo/models/HubNetSampler. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL.

In other publications, please use: Copyright 2003 Uri Wilensky. All rights reserved. See http://ccl.northwestern.edu/netlogo/models/HubNetSampler for terms of use.
