What are the characteristics of research errors in biostatistics, and how should they be handled in actual statistical analysis?

The first step of biostatistical analysis is the collection and organization of data. There are two main ways to collect data: surveys and biological experiments. Organizing data mainly consists of checking and proofreading the original data and constructing frequency distribution tables and frequency distribution plots. Experimental data in the life sciences generally have three basic characteristics. Central tendency is mainly described by the arithmetic mean, median and geometric mean. Dispersion is mainly measured by the standard deviation, variance and coefficient of variation. The shape of the distribution is mainly described by skewness and kurtosis. This chapter first introduces the most basic biostatistical terms, such as population and variable, and then illustrates how to use software to organize original experimental data and analyze their statistical characteristics, giving a comprehensive account of the methods for organizing and analyzing data.

In scientific experiments and surveys, a large amount of original data is often obtained; the observed results of a specific thing or phenomenon are called data. Before statistical analysis, these data are generally scattered, fragmentary and isolated, appearing as a disordered collection of numbers. To reveal their scientific meaning, they must be organized and analyzed so that their underlying patterns emerge.
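The summary measures named in this introduction (arithmetic mean, median, geometric mean, standard deviation, variance, coefficient of variation, skewness and kurtosis) can be sketched with the Python standard library. This is only a minimal illustration: the function name describe and the shell-height values are hypothetical, and the skewness and kurtosis use the simple moment-based formulas, whereas statistical packages often apply bias-corrected variants that give slightly different numbers.

```python
import math
import statistics


def describe(values):
    """Return common descriptive statistics for a list of positive numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    variance = statistics.variance(values)  # sample variance, divisor n - 1
    sd = math.sqrt(variance)

    # Central moments (divisor n), used for the moment-based shape measures.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n

    return {
        "n": n,
        # Central tendency
        "mean": mean,
        "median": statistics.median(values),
        "geometric_mean": statistics.geometric_mean(values),
        # Dispersion
        "variance": variance,
        "sd": sd,
        "cv_percent": sd / mean * 100,
        # Shape of the distribution
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }


if __name__ == "__main__":
    # Hypothetical shell heights in mm, invented purely for illustration.
    shell_heights_mm = [52.1, 48.7, 55.3, 50.2, 49.8, 53.6, 51.0, 47.9]
    for name, value in describe(shell_heights_mm).items():
        print(f"{name}: {value:.4f}")
```

The coefficient of variation is reported as a percentage of the mean, which makes the dispersion of traits measured in different units comparable.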

2.1 Common statistical terms

To better understand the biostatistics in the following chapters, we must first master the basic concepts below.

2.1.1 Population, individuals and samples

A population is the entire set of objects under study, and each member of it is called an individual. According to the number of individuals it contains, a population is either finite or infinite. For example, in a study of the shell height of Pinctada, the number of Pinctada cannot be determined, so Pinctada can be treated as an infinite population.

A population is often very large. Measuring all of it takes a great deal of time, manpower and material resources, and it may be impossible to measure every individual. In addition, obtaining the data is sometimes destructive to the research object: to measure the hardness of a shell, the shell must be crushed. Therefore, we can only infer the characteristics of the population by studying some of its individuals. The process of obtaining some individuals from the population is called sampling. For the results to be representative, sampling must be random, so that every individual in the population has an equal chance of being selected and the biological characteristics of the population can be estimated from the sample. Simple random sampling methods include drawing lots and using a random number table. The set of individuals drawn from a population is called a sample. The number of individuals in a sample is called the sample size (also sample content or sample capacity) and is usually denoted by n. If n ≤ 30, the sample is a small sample; if n > 30, it is a large sample. For example, in March 2009 a pearl farm randomly selected a number of cages, containing 227 Pinctada martensii in total, in order to investigate the growth of the Pinctada martensii it had cultured in 2007. Here, all the Pinctada martensii cultured in 2007 form the population, each Pinctada martensii is an individual, and the 227 randomly selected Pinctada martensii form the sample. The sample size is 227, which is much larger than 30, so this is a large sample.
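Simple random sampling and the small/large sample rule can be sketched as follows. This is a minimal illustration rather than the procedure used by the pearl farm: the function names, the fixed seed and the population of 1000 numbered individuals are all hypothetical. Python's random.sample draws without replacement, which plays the role of drawing lots or reading a random number table.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n individuals without replacement, each with equal probability."""
    rng = random.Random(seed)  # a fixed seed only makes the draw reproducible
    return rng.sample(population, n)


def sample_size_class(n):
    """Classify a sample by the n <= 30 / n > 30 rule given in the text."""
    return "small" if n <= 30 else "large"


if __name__ == "__main__":
    # Hypothetical finite population: individuals labelled 1..1000.
    population = list(range(1, 1001))
    sample = simple_random_sample(population, 227, seed=2009)
    print(len(sample), sample_size_class(len(sample)))
```

Because every individual has the same chance of entering the sample, statistics computed from the sample can be used to estimate the corresponding characteristics of the population.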

2.1.2 Variables and constants

Variables are the characteristics measured on the research object, such as the content of chlorophyll a in seawater, the weight and length of animals, the food intake of fish, the activity of enzymes, the diameter of cells, the size of DNA molecules and so on. Variables are usually denoted by capital letters such as X or Y, and an observed value of a variable is denoted by the lowercase x; observed values are called data. For example, to measure the body length X of a batch of fish, we can randomly select 10 fish as a sample and measure the body length of each (x, cm). The resulting observed values, 14.2, 15.4, 13.6, 15.8, 15.5, 16.1 and so on, are the data.

According to the values they can take, variables are divided into continuous variables and discrete variables. A continuous variable can take any value within a certain interval: its measured value can be subdivided without limit. For example, rice plant height in the range 50 to 60 cm is a continuous variable, because it can take infinitely many values within this range. Similarly, the speed of molecular movement, the weight of fish, the shell height of shellfish, enzyme activity and the size of DNA molecules are all continuous random variables. Values of continuous variables can only be obtained by measurement, and their observed values are called continuous data, also called measurement data, such as lengths, times and weights. If the possible values of a variable are natural numbers or integers, the variable is called a discrete variable. Its values are generally obtained by counting, such as the number of eggs carried by fish and shellfish. The observed values of discrete variables are called discrete data, also called counting data.

If the value of a variable remains relatively stable within a certain range, the variable is called a constant. For example, the acceleration due to gravity is constant within a small range of space and time. The value of a constant is therefore fixed only in a relative sense.
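The difference between counting data and measurement data shows up when a frequency distribution table is built: discrete values can be tallied directly, while continuous values must first be grouped into class intervals. The sketch below illustrates this under stated assumptions. The function names, the egg counts and the choice of 1 cm classes starting at 13 cm are hypothetical; the body lengths are the six values that appear in the text above.

```python
from collections import Counter


def discrete_frequency(counts):
    """Tally each distinct integer value of a discrete variable."""
    return dict(sorted(Counter(counts).items()))


def continuous_frequency(values, lower, width, n_classes):
    """Tally continuous values into classes [lower + k*width, lower + (k+1)*width)."""
    freq = [0] * n_classes
    for x in values:
        k = int((x - lower) // width)  # index of the class containing x
        if not 0 <= k < n_classes:
            raise ValueError(f"{x} lies outside the class intervals")
        freq[k] += 1
    return [(lower + k * width, lower + (k + 1) * width, f)
            for k, f in enumerate(freq)]


if __name__ == "__main__":
    # Hypothetical counting data, invented for illustration.
    egg_counts = [3, 5, 4, 3, 5, 5, 2, 4]
    print(discrete_frequency(egg_counts))

    # Measurement data: the six body lengths (cm) quoted in the text.
    body_lengths_cm = [14.2, 15.4, 13.6, 15.8, 15.5, 16.1]
    for low, high, f in continuous_frequency(body_lengths_cm, 13, 1, 4):
        print(f"[{low}, {high}) cm: {f}")
```

The class width is a choice made by the analyst; a different width gives a different table for the same measurement data, whereas the tally of a discrete variable is fixed by the data.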