Simple random sampling, also known as pure random sampling, is the most basic form of probability sampling. Following the principle of equal probability, it draws n elements directly and at random from a population of N elements to form a sample (N > n). The usual method resembles drawing lots: number every unit in the population, write each number on a small slip of paper, put the slips into a container (such as a paper box or a bag), mix them thoroughly, and draw slips at random until the predetermined sample size is reached. The elements whose numbers are drawn form a simple random sample.
For example, a department has 300 students, and its students' union plans to select 60 of them for a survey by simple random sampling. To make the sampling sound, they first obtained the list of students from the department office and numbered every student on it (from 001 to 300). With this sampling frame compiled, they wrote 001, 002, …, 300 on 300 slips of paper, put the slips in a box, mixed them, and drew 60 slips at random. They then used the numbers on these 60 slips to find the corresponding 60 students on the list. These 60 students constitute the sample. This method is easy to learn. However, when the population contains many elements, writing out the numbers is laborious and the slips are hard to mix evenly, so the method is mostly used for small populations.
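The drawing of lots described above can be imitated in a few lines of Python. This is a minimal sketch, not part of the original procedure: the function name `draw_lots` and the `seed` parameter are assumptions made for illustration, and the pseudo-random generator of the standard `random` module stands in for the physical mixing of paper slips.

```python
import random


def draw_lots(population_size, sample_size, seed=None):
    """Draw sample_size distinct numbers from 1..population_size.

    Every number has the same chance of being drawn, and no number
    can be drawn twice (sampling without replacement).
    """
    rng = random.Random(seed)  # seed is optional; fixing it makes the draw repeatable
    numbers = range(1, population_size + 1)  # the numbered slips: 1, 2, ..., N
    return sorted(rng.sample(numbers, sample_size))


# The department example: 300 students, 60 to be selected.
sample = draw_lots(300, 60, seed=2024)  # the seed value is arbitrary
labels = [f"{number:03d}" for number in sample]  # three-digit labels such as "007"
print(labels[:5])
```

The sorted list of labels plays the role of the 60 slips taken from the box; looking each label up in the numbered student list gives the sample.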
When the population contains many elements, we sample with a random number table instead. A random number table is provided at the back of this book. The digits in the table and their arrangement are generated at random and follow no pattern (hence the name random number table). The specific steps of sampling with a random number table are as follows:
(1) Obtain a list of all elements in the population (that is, the sampling frame);
(2) Number all elements in the population one by one in sequence;
(3) Based on the number of digits in the population size, decide how many digits to read from the random number table at a time;
(4) Using the population size as the criterion, check the numbers in the random number table one by one to decide whether to accept or reject each;
(5) Continue until as many numbers as the sample size requires have been selected;
(6) Using the numbers selected from the random number table, find the corresponding elements in the sampling frame.
The set of elements selected by these steps is the required sample. For example, suppose a population has 3,000 people (a four-digit number) and 100 of them are to be selected as a sample for a survey. First, obtain a list of all members of the population. Next, number everyone from 1 to 3,000. Then, because the population size has four digits, read four digits at a time from the random number table. The selection proceeds as follows: starting from a four-digit number in any row and any column of the table, take each four-digit number in sequence, reading from top to bottom or from left to right, and compare it with 3000. Numbers less than or equal to 3000 are selected; numbers greater than 3000, and numbers that have already been selected, are skipped. Continue until 100 numbers have been selected. Finally, use the selected numbers to find the corresponding 100 members on the list. These 100 members constitute a random sample. Table 6-2 shows an example of selecting four-digit numbers from a random number table when sampling a population of 3,000 (the last four digits of each entry are used, reading from top to bottom).

Table 6-2
Number in the random number table | Selected number | Reason for not selecting
8432990906 | 0906 |
1053873020 | | Last four digits are greater than 3000
9427410041 | 0041 |
0139…2507 | 2507 |
93666… | | Last four digits are greater than 3000
1359866042 | | Last four digits are greater than 3000
6321912683 | 2683 |
9420582507 | | Repeats the third selected number (2507)
2725651… | … | …
If the first four digits are used instead, still reading from top to bottom, the four numbers selected from Table 6-2 are 1053, 0139, 1359, and 2725. If the middle four digits are used, the four numbers are 2990, 1404, 1912, and 0582.
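The accept-or-reject rule used with a random number table can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `select_from_table`, its parameters, and the short list of ten-digit entries are hypothetical (the entry `7700012683` is invented to show how a repeated number is skipped), and a real application would read entries from a full random number table.

```python
def select_from_table(entries, population_size, sample_size, start, width):
    """Pick numbers from table entries by the accept-or-reject rule.

    entries: strings of digits read from the table, in reading order.
    start, width: which digits of each entry to use, e.g. start=6, width=4
    takes the last four digits of a ten-digit entry.
    """
    chosen = []
    seen = set()
    for entry in entries:
        number = int(entry[start:start + width])
        # Reject numbers outside 1..population_size (0000 matches nobody).
        if number < 1 or number > population_size:
            continue
        # Reject numbers that have already been selected.
        if number in seen:
            continue
        seen.add(number)
        chosen.append(number)
        if len(chosen) == sample_size:
            break
    # If the entries run out first, fewer than sample_size numbers are returned.
    return chosen


# Illustrative ten-digit entries (assumed for this sketch).
entries = [
    "8432990906",
    "1053873020",
    "9427410041",
    "1359866042",
    "6321912683",
    "7700012683",  # hypothetical entry: its last four digits repeat 2683
    "9420582507",
]

last_four = select_from_table(entries, 3000, 4, start=6, width=4)
print(last_four)  # [906, 41, 2683, 2507]
```

Changing `start` to 0 or 3 reads the first four or the middle four digits of each entry instead, which is why the same table can yield different samples depending on the reading rule fixed in advance.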
Second, systematic sampling.
Systematic sampling is also called equidistant sampling or interval sampling. The units of the population are arranged and numbered in order, a sampling interval is calculated, and the individuals whose numbers fall at this fixed interval are drawn to form the sample. Like simple random sampling, it requires a complete sampling frame, and the sample is drawn directly from the population without any intermediate stages.
The specific steps of systematic sampling are:
(1) Number each individual in the population in sequence, that is, compile a sampling frame.
(2) Calculate the sampling interval by dividing the size of the population by the size of the sample. If the population size is N and the sample size is n, the sampling interval K is obtained from the following formula:
K (sampling interval) = N (population size) / n (sample size)
(3) Among the first K individuals, select one by simple random sampling and record its number (call it a). This number is the random starting point.
(4) In the sampling frame, starting from a, draw one individual every K individuals, so that the numbers of the individuals drawn are a, a+K, a+2K, …, a+(n-1)K.
(5) These n individuals together constitute the sample.
For example, suppose 100 students are to be drawn as a sample from the 3,000 students of a university. First, we number the list of 3,000 students in sequence. Then, using the formula above, we obtain the sampling interval:
K = 3,000 / 100 = 30
That is, one student is drawn out of every 30. To do so, we first draw one number from 1 to 30 by simple random sampling. If 12 is drawn, we take 12 as the first number and then draw one more number every 30 numbers. This gives 12, 42, 72, …, 2982, or 100 numbers in total. Using these 100 numbers, we find the corresponding 100 students on the population list one by one, and these 100 students constitute the sample.
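The steps and the worked example above can be sketched in Python. This is a minimal illustration rather than a definitive implementation: the function name `systematic_sample` and its parameters are assumptions, and the sketch uses integer division for the interval, which matches the text only when the population size is an exact multiple of the sample size.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the numbers a, a+K, ..., a+(n-1)K of a systematic sample."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    # Sampling interval K = N / n (assumption: rounded down if not exact).
    interval = population_size // sample_size
    if start is None:
        # Random starting point drawn from the first K individuals.
        start = random.Random(seed).randint(1, interval)
    elif not 1 <= start <= interval:
        raise ValueError("start must lie within the first interval")
    return [start + i * interval for i in range(sample_size)]


# The university example: N = 3,000, n = 100, random start 12.
numbers = systematic_sample(3000, 100, start=12)
print(numbers[:3], numbers[-1])  # [12, 42, 72] 2982
```

Only one random draw is needed, for the starting point; every other number follows from the interval, which is what makes the method quick for large populations.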
The process above shows that systematic sampling takes far less work than simple random sampling, especially when the population and the sample are large. This is why social research uses simple random sampling less often and systematic sampling more often.
It is worth noting that systematic sampling rests on an important premise: the arrangement of individuals in the population must be random with respect to the variables studied, that is, it must show no regular pattern related to those variables. Otherwise, the results of systematic sampling will be seriously biased. Therefore, when using systematic sampling, we must pay attention to how the sampling frame was compiled. Two situations deserve special attention:
First, the individuals in the population list may be arranged in some rank order. For example, suppose we need to sample households to study consumption, and the list of households is ordered by total household income from high to low. If two researchers each draw a systematic sample from this population with a sampling interval of 40, one with a random starting point of 3 and the other with a random starting point of 38, then the average household income calculated from the first researcher's sample is bound to be higher than that calculated from the second researcher's sample. This is because every household in the first sample ranks 35 places above the corresponding household in the second sample in total income. If this situation is noticed in advance, one can start from the middle position of the interval, that is, the 20th.
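The effect of an ordered list can be checked with a small numerical sketch. All figures here are hypothetical and chosen only for illustration: a list of 4,000 households sorted by income from high to low, with each household earning 25 units less than the one ranked just above it.

```python
N, K = 4000, 40  # assumed population size and sampling interval

# Hypothetical incomes, sorted from high to low: rank 1 earns 200,000,
# and each following rank earns 25 less.
incomes = [200_000 - 25 * rank for rank in range(N)]


def sample_mean(start):
    """Mean income of the systematic sample that starts at position start."""
    positions = range(start, N + 1, K)  # start, start+K, start+2K, ...
    values = [incomes[position - 1] for position in positions]
    return sum(values) / len(values)


population_mean = sum(incomes) / N
mean_start_3 = sample_mean(3)
mean_start_38 = sample_mean(38)
mean_start_20 = sample_mean(20)

print(mean_start_3 - mean_start_38)  # 875.0, i.e. 35 ranks x 25 per rank
print(abs(mean_start_20 - population_mean))
```

With these assumed figures, the sample starting at 3 overstates the population mean and the sample starting at 38 understates it, while the sample starting at the middle position 20 lands much closer to it.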
Second, the arrangement of individuals in the population list may follow a periodic pattern that coincides with the sampling interval. For example, in the earlier example about college students, we calculated a sampling interval of 30. Suppose the population list is arranged by teaching class, each class has about 30 students, and within each class the list is ordered by academic performance, or in the order of class officers, ordinary students, and students with poor grades. Then, when the random starting point falls near the top of the interval, the sample consists entirely of the students with the best grades in each class, or entirely of the class officers of each class;