What are the basic characteristics of statistical inference?

Statistical inference is the drawing of probabilistic conclusions about unknown quantities, based on random observation data (samples) and on the conditions and assumptions (models) of the problem. It is the main task of mathematical statistics, and its theory and methods make up the main content of that field.

Basic characteristics

A basic feature of statistical inference is that the information it relies on includes random observation data. Probability theory, which takes random phenomena as its object of study, is the theoretical basis of statistical inference.

Forms of expression

In mathematical statistics, a problem of statistical inference usually takes the following form: the problem concerns a definite population whose distribution is unknown or only partially known, and conclusions about that unknown distribution are drawn from samples (observation data) taken from the population. For example, the heights of a group of people constitute a population. It is generally assumed that height follows a normal distribution, but the mean of this population is unknown. Some people are selected at random, their heights are measured, and these data are used to estimate the average height of the group. This is one form of statistical inference, namely parameter estimation. If the question of interest is "Does the average height exceed 1.7 m?", the sample must be used to test whether this proposition holds. This is another form of inference, namely hypothesis testing. Because statistical inference reasons from a part (the sample) to the whole (the population), it cannot determine the population with certainty, and its conclusions should be stated in probabilistic terms. The aim of statistical inference is to use the basic assumptions of the problem and the information contained in the observation data to reach conclusions that are as accurate and reliable as possible.
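
The height question above can be sketched in code. The following is a minimal illustration, not a prescribed procedure: the data are simulated, the function name `one_sided_z_test` is hypothetical, and a large-sample normal (z) approximation is assumed in place of an exact small-sample test.

```python
import random
from statistics import NormalDist, mean, stdev

def one_sided_z_test(sample, mu0):
    """Test H0: mean <= mu0 against H1: mean > mu0 (large-sample z approximation)."""
    n = len(sample)
    x_bar = mean(sample)                 # point estimate of the population mean
    std_err = stdev(sample) / n ** 0.5   # estimated standard error of the mean
    z = (x_bar - mu0) / std_err          # standardized distance from mu0
    p_value = 1.0 - NormalDist().cdf(z)  # P(Z >= z) when the true mean equals mu0
    return x_bar, z, p_value

# Simulated data (an assumption for illustration): 200 heights in metres
rng = random.Random(0)
heights = [rng.gauss(1.72, 0.07) for _ in range(200)]

x_bar, z, p_value = one_sided_z_test(heights, mu0=1.70)
print(f"sample mean = {x_bar:.3f} m, z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value would count as evidence that the average height exceeds 1.7 m. The conclusion remains a probabilistic statement, not a certainty.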

Statistical inference means drawing a sample from the population and then, through sound analysis of the random data obtained from that sample, making a scientific judgment about the population. It is inference that carries a stated degree of probability, and its defining feature is that the population is inferred from the sample. Statistical inference is the core of mathematical statistics. Its basic problems fall into two categories: parameter estimation and hypothesis testing.

Methods

In quality activities and management practice, people are concerned with the quality level of specific products, for example the mean of a product quality characteristic or the nonconforming (defect) rate. Determining these requires drawing a sample from the population and estimating from the observed sample values. Inferring the unknown parameters of the population distribution from a sample in this way is called parameter estimation. There are two basic forms of parameter estimation: point estimation and interval estimation.
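
As a rough sketch of the two forms, the code below computes a point estimate and an approximate interval estimate for a mean and for a defect rate. The data, the function names `mean_interval` and `proportion_interval`, and the use of a normal-approximation (Wald-type) interval are all assumptions made for illustration. For small samples a t-based interval would usually be more appropriate.

```python
from statistics import NormalDist, mean, stdev

def mean_interval(values, confidence=0.95):
    """Point estimate and approximate confidence interval for a population mean."""
    n = len(values)
    x_bar = mean(values)                              # point estimate
    z = NormalDist().inv_cdf(0.5 + confidence / 2)    # two-sided normal quantile
    half_width = z * stdev(values) / n ** 0.5
    return x_bar, (x_bar - half_width, x_bar + half_width)

def proportion_interval(defects, n, confidence=0.95):
    """Point estimate and normal-approximation interval for a defect rate."""
    p_hat = defects / n                               # point estimate
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * (p_hat * (1 - p_hat) / n) ** 0.5
    # Clip to [0, 1] because a rate cannot fall outside that range
    return p_hat, (max(0.0, p_hat - half_width), min(1.0, p_hat + half_width))

# Hypothetical measurements of a quality characteristic (e.g. a diameter in mm)
diameters = [10.02, 9.98, 10.05, 9.97, 10.01, 10.03, 9.99, 10.00, 10.04, 9.96]
print(mean_interval(diameters))

# Hypothetical inspection result: 6 nonconforming items among 200 sampled
print(proportion_interval(defects=6, n=200))
```

The point estimate is a single number, while the interval estimate gives a range together with a confidence level.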

Improvement methods

An individual is a part of the whole, and the characteristics of a part can reflect those of the whole. However, because the population is heterogeneous and the sample is random, a sample cannot reflect the population exactly. Conclusions about the population drawn from analyzing some of its individuals are therefore subject to error and are not fully reliable. In theory, there are two ways to eliminate or reduce this error.

Make the population as homogeneous as possible.

The population is the unknown thing we want to study, and it is usually impossible for us to change how homogeneous it is. If we could bring it to ideal homogeneity, we would already understand it completely and would have no need to study it further.

Ensure that the sample is representative.

Using an appropriate sampling method to ensure that the sample is representative can effectively control and improve the reliability and correctness of statistical inference.

There are many methods of random sampling. The common ones are:

1. Simple random sampling

Simple random sampling means that draws are made independently and that every individual in the population has an equal chance of being drawn. Random sampling is not the same as arbitrary (haphazard) selection, which is easily influenced by personal preferences. To achieve randomization, one can draw lots, roll dice, or consult a random number table. For example, to select 10 products at random from 100 products, number the products from 1 to 100, draw 10 numbers by lot, and take the products with those 10 numbers as the sample. The advantage of this method is its small sampling error. The disadvantage is its complicated procedure. In practice, it is not easy to truly give each individual an equal chance of being drawn.
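
The 10-out-of-100 example can be sketched with a pseudorandom generator standing in for drawing lots. The variable names and the fixed seed are assumptions for illustration. The seed is there only so the sketch is reproducible.

```python
import random

rng = random.Random(42)                      # fixed seed only for reproducibility
product_ids = list(range(1, 101))            # products numbered 1 to 100
sample_ids = rng.sample(product_ids, k=10)   # 10 distinct numbers, drawn without replacement
print(sorted(sample_ids))
```

`random.sample` draws without replacement, so no product can appear twice in the sample.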

2. Periodic systematic sampling

Periodic systematic sampling, also known as equidistant sampling or mechanical sampling, numbers the population sequentially, determines the first unit by drawing lots or consulting a random number table, and then takes the remaining units at equal intervals. For example, to take five samples from 120 parts, number the parts in production order, select the first part by simple random sampling from among the first 24 numbers, and then take one part every 24 numbers (120 ÷ 5 = 24), for a total of five samples. This method is especially suitable for sampling on a production line. It is simple to carry out, and errors in execution are unlikely. However, once the starting point is chosen, the whole sample is completely determined. If the quality characteristic of the population varies periodically and the sampling interval coincides with that period, the sample may be heavily biased.
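
The 5-out-of-120 example might be sketched as follows. The function name `systematic_sample` is hypothetical, and the sketch assumes the population size is an exact multiple of the sample size, as it is here.

```python
import random

def systematic_sample(population_size, sample_size, rng=random):
    """Return item numbers (1-based) chosen at a fixed interval from a random start."""
    interval = population_size // sample_size   # 120 // 5 = 24
    start = rng.randint(1, interval)            # random start within the first interval
    return [start + i * interval for i in range(sample_size)]

part_numbers = systematic_sample(120, 5, rng=random.Random(1))
print(part_numbers)
```

Only the starting number is random. Every later number follows from it, which is why a periodic pattern in the production sequence can bias the whole sample.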

3. Stratified sampling method

In stratified sampling, a population that can be divided into distinct subgroups (strata) is sampled by drawing individuals at random from each stratum in a prescribed proportion. When the same product is produced on different equipment or in different environments, product quality may vary considerably with those conditions. To make the sample representative, the products made under different conditions can be grouped so that products within a group are of similar quality. Units are then drawn at random from each group in proportion and combined into one sample. Samples obtained in this way are representative and have a small sampling error. The disadvantage is that the sampling procedure is complicated. The method is commonly used for product quality inspection.
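
A minimal sketch of proportional stratified sampling is shown below. The strata (three hypothetical machines with 500, 300, and 200 products), the function name `stratified_sample`, and the simple rounding rule are all assumptions for illustration. With other stratum sizes the rounded counts may not add up exactly to the requested sample size.

```python
import random

def stratified_sample(strata, sample_size, rng=random):
    """Draw a proportional stratified sample from a dict {stratum_name: [items]}."""
    total = sum(len(items) for items in strata.values())
    sample = []
    for name, items in strata.items():
        # Proportional allocation: each stratum contributes according to its share
        k = round(sample_size * len(items) / total)
        sample.extend((name, item) for item in rng.sample(items, k))
    return sample

# Hypothetical strata: products grouped by the machine that produced them
strata = {
    "machine_A": list(range(500)),
    "machine_B": list(range(300)),
    "machine_C": list(range(200)),
}
sample = stratified_sample(strata, sample_size=20, rng=random.Random(3))
counts = {name: sum(1 for s, _ in sample if s == name) for name in strata}
print(counts)
```

Within each stratum the draw is a simple random sample. Only the allocation across strata is fixed in advance.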

4. Cluster sampling method

In this method, the population is divided into a number of groups (clusters) in some way, several groups are selected at random, and the sample consists of all individuals in the selected groups. For example, following the production process, 1000 parts are packed into 20 boxes of 50 parts each. One box is then selected at random, and the 50 parts in that box constitute the sample. This method is convenient to carry out, but because the sample comes from only a few groups, it is not spread evenly across the population, so it is less representative and has a larger sampling error.
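
The box example can be sketched as follows. The sketch assumes that parts are packed into boxes in production order, and the variable names and seed are illustrative only.

```python
import random

rng = random.Random(7)                       # fixed seed only for reproducibility
parts = list(range(1, 1001))                 # 1000 parts numbered in production order
boxes = [parts[i:i + 50] for i in range(0, len(parts), 50)]   # 20 boxes of 50 parts

chosen_box = rng.randrange(len(boxes))       # pick one box at random
sample = boxes[chosen_box]                   # every part in that box is inspected
print(f"box {chosen_box + 1}: parts {sample[0]} to {sample[-1]}")
```

All 50 sampled parts come from one stretch of production, which illustrates why a cluster sample can represent the population poorly.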