Mahler, 317 Some notes on sampling

Suppose you were asked to predict how college students will vote in the next presidential election. As you’re a college student, you decide to poll other students at your college. But your college is an urban public college, so you have reason to believe that the students enrolled here are not necessarily typical of the entire population of college students. Students at your school may be more inclined to vote for the Democratic candidate than a group of students at an expensive private college. Both of these groups would have their own bias (an inclination that is not typical of the larger population from which they were chosen). Your challenge is to survey (ask questions of) a group of college students, known as a sample.

Sampling is the process of selecting observations. Since you can’t ask everyone in the population, you try to ask a group that is typical, or representative, of the population. That way, when you find out how the sample plans to vote, you can infer that the population will vote similarly, give or take a few points one way or the other.

A group is considered representative when its characteristics closely approximate the characteristics of the population on key variables. So what might be some key variables that your sample needs to match? Can you explain why the following variables might influence how a college campus might vote?

sex

age

income level

class level, e.g. freshman, sophomore, etc.

GPA

public vs. private school

SAT scores

high school average required for entrance

marital status

dorm or commuter student

If we knew the characteristics of the entire population of college students, we could ensure that our sample approximated these characteristics. We would need to know how many female students of various ages, income levels, class levels, etc. existed in the population of college students. Then we could make sure that the same proportion of these particular elements was present in our sample. (More about this type of sampling procedure later. But for now, let’s turn to some basics.)

An element is the unit about which information is collected, in this case, the college student. We need to define our unit in operational terms. Is any student enrolled at any college for any course to be considered an element in the entire population of college students? Perhaps you want to focus on only full-time USA students who are matriculated during the spring semester prior to Nov. 1996; maybe you’ll only consider students who will be eligible to vote. Your definition specifies the collection of elements to be studied by describing the elements in time and space. When you specify the elements in time and space, you have defined the population.

Good pollsters who follow rigorous sampling procedures know that a major way to reduce bias (or to ensure that the sample is representative of the population) is to make sure that each element in the population has the same chance of being selected for the sample. So hypothetically, if there are 1,000,000 USA college students, using our more restrictive definition, and we can only afford to survey 3000, each student would have a 3000 in 1,000,000 chance of being selected for our sample (3 out of 1000, or a .003 chance). When sampling is done so that each element has the same chance of being selected, the sampling procedure is known as an equal probability of selection method, EPSEM.

There are several sampling methods that are likely to result in more representative samples. These methods are called probability sampling methods or designs because each element has a known chance (maybe not the same chance) of being selected. For example, consider simple random sampling, known as SRS. The researcher locates a valid sampling frame, that is, a list of elements from which the sample can be drawn. A number is assigned to each element on this list. A table of random numbers or a computer program is used to generate random numbers until you acquire the number of elements desired for the sample. In our example, this was 3000.

Systematic sampling is another EPSEM. With this method, the researcher knows the number of elements in the entire population since each element is listed. He or she also knows how many elements need to be in the sample, e.g. 3000. A sampling ratio or fraction is arrived at as we did above, and its inverse, the sampling interval (the number in the population divided by the number wanted in the sample), is easily determined (1,000,000 divided by 3000, or 333.3). This means that after you choose a random element to start with, you take every 333rd element after that. Once you reach the second element, you can measure the distance between the two elements on the list and use this measurement from then on, so that you don’t have to count to 333 each time.

What if your list of one million was arranged by academic major and grade point average? You might find that the 333rd, 667th, and 1000th persons were all borderline students in their major. This problem is known as periodicity (there is a cyclical arrangement to the sampling list such that a biased sample results). It is unusual, but it has been known to occur with systematic sampling. Since this problem is rare and can often be corrected, systematic sampling is considered easier than SRS and just as effective.

Stratified sampling is an even more effective way to arrive at a representative sample. Researchers who employ stratified sampling can reduce sampling error even with a slightly smaller sample size. This can save time and money. [More about stratified sampling later. First, we need to know more about sampling error.]
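If you would like to try simple random sampling and systematic sampling yourself, here is a rough sketch in Python. It is only an illustration under assumed conditions: the sampling frame is a made-up list of student ID numbers, scaled down to 10,000 elements with a sample of 30 so that the sampling ratio is still .003, and the function names are hypothetical.

```python
import random

def simple_random_sample(frame, n, rng):
    """SRS: every element has the same chance, n / len(frame), of being drawn."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Systematic sampling: pick a random start, then take every k-th element."""
    interval = len(frame) // n        # sampling interval, here 10,000 // 30 = 333
    start = rng.randrange(interval)   # random starting point within the first interval
    return [frame[start + i * interval] for i in range(n)]

rng = random.Random(317)              # fixed seed so the run can be repeated
frame = list(range(1, 10_001))        # hypothetical sampling frame: student IDs 1..10,000

srs = simple_random_sample(frame, 30, rng)
sys_sample = systematic_sample(frame, 30, rng)

print("sampling ratio:", 30 / len(frame))
print("sampling interval:", len(frame) // 30)
print("first systematic picks:", sys_sample[:4])
```

Notice that the systematic sample needs only one random number (the starting point), while SRS needs a random draw for every element. That is the sense in which systematic sampling is easier.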

Your sample, even if arrived at in a non-biased manner, will still be somewhat different from the population. Thus, the percentage of students who vote Democratic in your sample will be slightly different from the percentage of students who vote Democratic in Nov. 1996. This difference is called sampling error: the difference between our estimate of the population’s position and the true measure, or parameter, of how the population of college students votes in the next presidential election. If you follow a probability sampling procedure, you can estimate the amount of sampling error by using formulas developed by statisticians. For example, when the issue being surveyed is a matter of choice between one position or another, there is a formula for estimating the sampling error in a binomial distribution (that’s the distribution of a variable that has only 2 choices, like Democrat vs. Republican, or yes or no).

standard error (or sampling error) = square root of (P x Q / N),

where P = the proportion (the percentage written as a decimal) voting for the Democrat,

Q = the proportion not voting for the Democrat,

P + Q = 1, and N = the number in the sample.

NOTE! There are two ways to reduce the size of the sampling error. One is to decrease the numerator. As a pollster, you really don’t influence this, but a more lopsided split of opinion will result in a smaller numerator. [Try it yourself: if 80% want the Democratic candidate, then only 20% will not vote for him/her. The product for this combination would be .16, whereas a fifty-fifty split would result in a product of .25.] The other way to reduce the sampling error is to increase the sample size (N); as N gets larger, the number under the square root sign gets smaller. In fact, if you made your sample 4 times larger, you would cut the sampling error in half.
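You can check both of these claims with a few lines of Python. This is just a sketch of the formula given above; the function name standard_error is my own label, and the sample sizes of 3000 and 12,000 are picked only to show the effect of quadrupling N.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion: square root of (P x Q / N), where Q = 1 - P."""
    q = 1 - p
    return math.sqrt(p * q / n)

# The numerator P x Q is largest at a fifty-fifty split.
print(round(0.8 * 0.2, 2))   # 0.16
print(round(0.5 * 0.5, 2))   # 0.25

# Making the sample 4 times larger cuts the standard error in half.
se_small = standard_error(0.5, 3000)
se_large = standard_error(0.5, 12000)
print(round(se_large / se_small, 2))   # 0.5
```

The halving comes from the square root: dividing by 4N instead of N divides the whole expression by the square root of 4, which is 2.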

Suppose that in our sample, where N = 3000, we determined that .42 or 42% planned to vote for the Democratic candidate. Then P = .42 and Q = .58. Substituting in our formula, we would determine that the sampling error (s.e.) in this case was .009, because .42 times .58 equals .2436, which divided by 3000 equals .0000812, and the square root of this number is .009.

By arriving at the sampling error or standard error, we can estimate how the population will vote within a certain confidence level and interval. Let me show you what I mean. Statisticians have learned from taking numerous samples that 68% of the samples will be within plus or minus one standard error of the population parameter (the actual measure of the population). This means that our estimate, which is .42 or 42 percent, has a 68 percent chance of being in the ball park if we give ourselves a little leeway, plus or minus. We could be off .009 on the high side or the low side, so we subtract .009 from .42 to get .411 and we add .009 to .42 to get .429. Now we have a confidence interval and a confidence level: we say that we are 68% sure that the best estimate of how the population will vote is between .411 and .429 for the Democratic candidate.

If we want to be more than 68% sure, then we have to expand our confidence interval to include plus or minus 2 standard errors of our estimate. Statisticians have learned that 95% of the samples’ estimates will be within plus or minus two standard errors of the population parameter. This would require multiplying the standard error by 2.

.009 x 2 = .018. We would then have to add .018 to enlarge our estimate on the high side:

.42 + .018 = .438.

We would also want to subtract .018 to broaden our confidence interval on the low side.

.42 - .018 = .402.

So we are 95% confident that the population parameter lies within the confidence interval of .402 to .438. In other words, we are 95% sure that 42% of the college population will vote for the Democratic candidate, plus or minus 1.8 percentage points.
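The same arithmetic can be written as a short Python sketch. The function name confidence_interval and its num_se argument are hypothetical labels of mine; the numbers (P = .42, N = 3000) are the ones from our example.

```python
import math

def confidence_interval(p, n, num_se):
    """Return (low, high): the estimate p widened by num_se standard errors."""
    se = math.sqrt(p * (1 - p) / n)
    margin = num_se * se
    return p - margin, p + margin

p, n = 0.42, 3000
for num_se, level in [(1, "68%"), (2, "95%")]:
    low, high = confidence_interval(p, n, num_se)
    print(f"{level} confident: {low:.3f} to {high:.3f}")
```

Rounded to three places, this gives .411 to .429 for one standard error and .402 to .438 for two, the same intervals we worked out by hand.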

Statisticians have learned that to be 99.7% sure that a sample’s estimate is in close range of the population parameter, you need to increase the confidence interval to plus or minus 3 standard errors of your estimate. See if you can figure out this expanded confidence interval. [.027 + or -].
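You don’t have to take the statisticians’ word for these percentages. The sketch below, which is only an illustration, pretends that the true population figure is .42 (an assumption made for the simulation), draws 500 separate samples of 3000 imaginary voters each, and counts how often the sample estimate lands within 1, 2, and 3 standard errors of the truth. The function name coverage and the seed are hypothetical choices.

```python
import math
import random

def coverage(true_p, n, trials, rng):
    """Share of sample estimates falling within 1, 2 and 3 standard errors of true_p."""
    se = math.sqrt(true_p * (1 - true_p) / n)
    hits = {1: 0, 2: 0, 3: 0}
    for _ in range(trials):
        # Draw one sample of n voters; each says "Democrat" with probability true_p.
        votes = sum(1 for _ in range(n) if rng.random() < true_p)
        estimate = votes / n
        for k in hits:
            if abs(estimate - true_p) <= k * se:
                hits[k] += 1
    return {k: count / trials for k, count in hits.items()}

rng = random.Random(1996)   # fixed seed so the run can be repeated
result = coverage(true_p=0.42, n=3000, trials=500, rng=rng)
for k, share in result.items():
    print(f"within {k} standard error(s): {share:.1%}")
```

With only 500 samples the shares will wobble a bit from run to run, but they should come out near 68% for one standard error, near 95% for two, and very close to 100% for three.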

Stratified sampling decreases sampling error by drawing sub-samples from groups known as strata. A stratum is a collection of like elements; for example, a list of all the college freshmen in the population would be one stratum of the college population. We could develop a stratum of college freshmen from private schools. If the entire college population included 100,000 of these students, then we would select .003 of them (that is, 0.3 percent), or 300, for our sample. If the other stratum, college freshmen from public schools, amounted to 150,000, then using the same sampling ratio we would select .003 of them, or 450. In the same manner, we would determine how many sophomores there are from public and then from private colleges. These strata would be sampled in a proportionate manner so that our sample would be similar to the population with respect to these two variables: college class (which has 4 levels: freshman, sophomore, junior, senior) and type of institution (public or private). In effect, we would have a 4 x 2 design, resulting in 8 strata. Within each stratum there is considered to be less variability than among the different strata, because people who have the shared characteristic are presumed to have more similar opinions. In other words, the people on each of these 8 separate lists have more in common with the others on their respective list than with individuals on another list. As a researcher, you can select fewer elements from each list and still make sure that you have a diversity of opinion by using the numerous strata. In this manner, stratified sampling keeps the sampling error low even though the ultimate size of the sample can be slightly smaller.
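Proportionate allocation is easy to sketch in Python. The example below uses only the two stratum sizes given above (100,000 and 150,000) and the .003 sampling ratio; the sizes of the other six strata are not given in these notes, so they are left out. The function name proportionate_allocation is a hypothetical label.

```python
def proportionate_allocation(stratum_sizes, sampling_ratio):
    """Number of elements to draw from each stratum so the sample mirrors the population."""
    return {name: round(size * sampling_ratio) for name, size in stratum_sizes.items()}

# The two strata whose population sizes appear in the notes.
strata = {
    ("freshman", "private"): 100_000,
    ("freshman", "public"): 150_000,
}
allocation = proportionate_allocation(strata, sampling_ratio=0.003)
for stratum, count in allocation.items():
    print(stratum, count)

# Crossing 4 class levels with 2 types of institution gives the 8 strata.
levels = ["freshman", "sophomore", "junior", "senior"]
kinds = ["public", "private"]
all_strata = [(level, kind) for level in levels for kind in kinds]
print(len(all_strata))   # 8
```

Once each stratum has its quota, the elements within it would still be chosen by a probability method such as SRS or systematic sampling.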

Sometimes stratified sampling is not done strictly proportionate to size. When a researcher wants to ensure that the positions of minority groups are clearly understood, he or she can survey more minorities than a proportionate estimate would merit. For example, suppose there are only 5,000 freshmen from private schools who are Native Americans. Following a proportionate-to-size formula, we would select .003 times 5,000, or only 15, for this stratum. In a nationwide survey, including only 15 in this category would reduce our ability to draw conclusions about Native American freshmen in private schools. A solution to this is to include more than 15, possibly a minimum of 60, and to correct our overall findings by weighting the results of this sub-group by .25 (since we included four times as many as we should have, we weight their results by one quarter, or .25). Thus, disproportionate probability sampling is used when some of the elements are considered to have rare event status.
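Here is a small Python sketch of how the .25 weight works when the results are combined. The stratum counts (15 proportionate, 60 actually surveyed) come from the example above, but the voting percentages and the 2,985 other respondents are made-up numbers used only for illustration, and the function names are hypothetical.

```python
def design_weight(proportionate_n, actual_n):
    """Weight that scales an oversampled stratum back to its proportionate share."""
    return proportionate_n / actual_n

def weighted_share(groups):
    """groups: list of (respondents, share_for_democrat, weight) tuples."""
    weighted_yes = sum(n * share * w for n, share, w in groups)
    weighted_total = sum(n * w for n, _, w in groups)
    return weighted_yes / weighted_total

w = design_weight(proportionate_n=15, actual_n=60)
print(w)   # 0.25

# Made-up results: 2,985 other respondents at 42%, 60 oversampled respondents at 70%.
groups = [(2985, 0.42, 1.0), (60, 0.70, w)]
print(round(weighted_share(groups), 4))
```

After weighting, the 60 oversampled respondents count as 15 in the overall estimate, so the sub-group can be studied in detail without distorting the nationwide figure.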