Chapter 1 Algebra, Trigonometric Formulas, and Elementary Algebra

Test results (table with columns $B_1, B_2, \dots, B_m$)

2. Statistical Hypothesis Testing

1. Steps of Statistical Hypothesis Testing

One assumes that the population has certain statistical properties (for example, that a parameter has a certain value, or that the population follows a certain distribution) and then tests whether this hypothesis is credible. This method is called statistical hypothesis testing (or simply hypothesis testing). The steps are as follows:

Example. The average strength $\mu_0$ of a certain product is known (in kilograms). The production method is changed, $n$ parts are selected at random, and the sample mean $\bar{x}$ and the sample standard deviation $s$ are calculated (in kilograms). Does the change in the production method have a significant effect on the strength?

Statistical hypothesis testing steps, each followed by its analysis for the example:

( 1 ) State the hypothesis $H_0$. Here $H_0$: $\mu = \mu_0$ ($\mu$ is the population mean after the production method has been changed).

( 2 ) Select a statistic and determine its distribution. Here $u = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$, which for large $n$ approximately follows $N(0, 1)$.

( 3 ) Give the significance level $\alpha$. Here $\alpha = 5\%$.

( 4 ) Find the critical value $K_{\alpha/2}$ from $P(|u| > K_{\alpha/2}) = \alpha$. Checking the normal distribution table gives $K_{0.025} = 1.96$.

( 5 ) Calculate the statistic $u$ from the sample.

( 6 ) Statistical inference. When $|u| \le K_{\alpha/2}$, accept $H_0$; when $|u| > K_{\alpha/2}$, reject $H_0$. In the example $|u| \le 1.96$, so $H_0$ is accepted: at the 5% significance level, the change in the production method is considered to have no significant effect on the strength of the product.
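The six steps can be sketched in Python as follows. This is a minimal illustration rather than the handbook's own procedure: the function name `u_test` and the numbers passed to it are hypothetical, and the statistic is assumed to be the usual large-sample form $u = (\bar{x} - \mu_0)/(s/\sqrt{n})$.

```python
from math import sqrt
from statistics import NormalDist


def u_test(sample_mean, mu0, s, n, alpha=0.05):
    """Two-sided large-sample u-test of H0: mu = mu0.

    Returns (u, K, accept_h0), where K is the critical value K_{alpha/2}.
    """
    u = (sample_mean - mu0) / (s / sqrt(n))
    # K_{alpha/2} is the upper alpha/2 point of the standard normal distribution
    k = NormalDist().inv_cdf(1 - alpha / 2)
    return u, k, abs(u) <= k


# Hypothetical numbers, chosen only to exercise the function
u, k, accept_h0 = u_test(sample_mean=10.1, mu0=10.0, s=1.0, n=100)
print(f"u = {u:.2f}, K = {k:.2f}, accept H0: {accept_h0}")
```

With $\alpha = 0.05$ the computed critical value agrees with the tabulated $K_{0.025} = 1.96$ to two decimals.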

2. Statistical hypothesis test table for normal population parameters

For large samples, whatever distribution the population follows, the central limit theorem implies that the sample mean is asymptotically normally distributed. Hypotheses about the population parameters can therefore be tested with the "$u$-test" described below.

In the table, $\alpha$ is the given significance level, $\bar{x}$ is the sample mean, and $s$ is the sample standard deviation.

The table has the following columns: name; condition and purpose of the test; hypothesis; statistic and its distribution; rejection region; determination of the critical value. The tests and the conditions under which they are used are:

$u$-test
The population variance $\sigma^2$ is known; test whether the population mean $\mu$ is equal to (or less than, or greater than) a known constant $\mu_0$.
The two population variances are known to be equal; compare the two population means $\mu_1$ and $\mu_2$.
The two population variances are known; compare the two population means $\mu_1$ and $\mu_2$.

$t$-test
The population variance is unknown; test whether the population mean $\mu$ is equal to (or less than, or greater than) a known constant $\mu_0$.
The two populations are known to have the same variance (but its value is unknown); compare the two population means $\mu_1$ and $\mu_2$.

$\chi^2$-test
The population mean is known; test whether the population variance $\sigma^2$ is equal to (or less than, or greater than) a known constant $\sigma_0^2$.
The population mean is unknown; test whether the population variance $\sigma^2$ is equal to (or less than, or greater than) a known constant $\sigma_0^2$.

$F$-test
The means and variances of the two populations are unknown; compare the two population variances.
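As an illustration of the kind of statistic each row of the table calls for, the sketch below computes one statistic for each of the $t$-, $\chi^2$- and $F$-tests. The formulas are the standard textbook forms, assumed here because the table entries are not reproduced above; the function names and the data are hypothetical. Critical values still have to be read from the $t$, $\chi^2$ and $F$ tables.

```python
from math import sqrt
from statistics import mean, variance  # variance() is the sample variance s^2


def t_statistic(xs, mu0):
    """One-sample t statistic for H0: mu = mu0 (n - 1 degrees of freedom)."""
    return (mean(xs) - mu0) / sqrt(variance(xs) / len(xs))


def chi2_statistic(xs, sigma0_sq):
    """Statistic for H0: sigma^2 = sigma0^2, mean unknown (n - 1 degrees of freedom)."""
    return (len(xs) - 1) * variance(xs) / sigma0_sq


def f_statistic(xs, ys):
    """Variance ratio for comparing two variances; d.f. (len(xs) - 1, len(ys) - 1)."""
    return variance(xs) / variance(ys)


# Hypothetical samples
xs = [2.0, 4.0, 6.0, 8.0]
ys = [1.0, 2.0, 3.0, 4.0]
print(t_statistic(xs, 5.0), chi2_statistic(xs, 5.0), f_statistic(xs, ys))
```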

3. Statistical hypothesis testing of the population distribution function

Let $F_0(x) = F_0(x; \theta_1, \dots, \theta_k)$ be a distribution function of known type with parameters $\theta_1, \dots, \theta_k$ (all known, or only partly known), and let $x_1, \dots, x_n$ be a sample from the population. With $F_0(x)$ as the hypothesized distribution function, the test is carried out in two cases.

1° All parameters of $F_0(x)$ are known. Divide the real axis into $m$ disjoint intervals

$(a_{i-1}, a_i], \quad i = 1, 2, \dots, m,$

where $a_0$ is understood to be $-\infty$ and $a_m$ to be $+\infty$. Let the theoretical frequency be

$p_i = F_0(a_i) - F_0(a_{i-1}),$

and let $\nu_i$ be the number of sample values that fall in the $i$-th interval (the empirical frequency). Then, when $n$ is large, the statistic

$\chi^2 = \sum_{i=1}^{m} \dfrac{(\nu_i - n p_i)^2}{n p_i}$

approximately follows the $\chi^2$ distribution with $m - 1$ degrees of freedom, and the $\chi^2$-test can be applied to test whether the hypothesis $H_0$: $F(x) = F_0(x)$ is credible.

2° All or part of the parameters of $F_0(x)$ are unknown. If $l$ parameters are unknown, the maximum likelihood method (this section, 1, 3) can be used to estimate them. Taking the estimates as the corresponding parameter values, calculate the theoretical frequencies $\hat{p}_i$ as in case 1° and count the empirical frequencies $\nu_i$. Then, when $n$ is large, the statistic

$\chi^2 = \sum_{i=1}^{m} \dfrac{(\nu_i - n \hat{p}_i)^2}{n \hat{p}_i}$

approximately follows the $\chi^2$ distribution with $m - l - 1$ degrees of freedom, and the $\chi^2$-test can again be applied to test whether the hypothesis $H_0$: $F(x) = F_0(x)$ is credible.
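For case 1° (all parameters known), the calculation of the statistic can be sketched as below. The function name, the cut points and the sample are hypothetical; the statistic is assumed to be the usual Pearson form $\sum (\nu_i - n p_i)^2 / (n p_i)$, compared against a $\chi^2$ table with $m - 1$ degrees of freedom.

```python
from statistics import NormalDist


def chi_square_statistic(sample, cut_points, cdf):
    """Pearson statistic for H0: F(x) = F0(x), with F0 fully specified.

    cut_points a_1 < ... < a_{m-1} split the real axis into the m intervals
    (-inf, a_1], (a_1, a_2], ..., (a_{m-1}, +inf).
    Returns (chi2, degrees_of_freedom).
    """
    n = len(sample)
    # F0 at the interval end points, with F0(-inf) = 0 and F0(+inf) = 1
    cdf_values = [0.0] + [cdf(a) for a in cut_points] + [1.0]
    bounds = [float("-inf")] + list(cut_points) + [float("inf")]
    chi2 = 0.0
    for i in range(len(bounds) - 1):
        p_i = cdf_values[i + 1] - cdf_values[i]                       # theoretical frequency
        nu_i = sum(bounds[i] < x <= bounds[i + 1] for x in sample)   # empirical frequency
        chi2 += (nu_i - n * p_i) ** 2 / (n * p_i)
    return chi2, len(bounds) - 2  # m - 1 degrees of freedom


# Hypothetical sample tested against the standard normal distribution
sample = [-1.2, -0.4, 0.3, 0.5, 0.9, 1.4, 0.2, 0.7]
chi2, dof = chi_square_statistic(sample, cut_points=[0.0], cdf=NormalDist().cdf)
print(chi2, dof)
```

In practice the intervals should be chosen so that every expected count $n p_i$ is not too small; the two-interval split here is only for brevity.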

4. Statistical hypothesis test of whether two samples come from the same population distribution

[ Sign test ] This method is simple and intuitive, and does not require knowledge of the distribution of the quantity being tested. It is often used to test whether the degree of fluctuation is the same and whether the production conditions have changed noticeably.

The signs "+", "-" and "0" indicate that the datum of A is larger than, smaller than, or equal to the corresponding datum of B, and $n_+$, $n_-$ and $n_0$ denote the numbers of occurrences of "+", "-" and "0". The steps of the test are illustrated by the following example.

Example. A and B each analyze the content of a certain component in the same substance and obtain the following table:

A	14.7	15.0	15.2	14.8	15.5	14.6	14.9	14.8	15.1	15.0
B	14.6	15.1	15.4	14.7	15.2	14.7	14.8	14.6	15.2	15.0
Sign	+	-	-	+	+	-	+	+	-	0

A	14.7	14.8	14.7	15.0	14.9	14.9	15.2	14.7	15.4	15.3
B	14.6	14.6	14.8	15.3	14.7	14.6	14.8	14.9	15.2	15.0
Sign	+	+	-	-	+	+	+	-	+	+

Is there a significant difference between the results of the two analyses?

Statistical hypothesis testing steps, each followed by its analysis for the example:

( 1 ) State the hypothesis $H_0$: the two sets of analysis results have the same distribution function.

( 2 ) Select the statistic $r = \min\{n_+, n_-\}$.

( 3 ) Give the significance level: $\alpha = 10\%$.

( 4 ) Find the critical value. Check the sign test table (see below): for $N = n_+ + n_- = 12 + 7 = 19$ and $\alpha = 10\%$ the sign limit is $r_{10\%} = 5$, so the rejection region is $r \le 5$.

( 5 ) Calculate the statistic: $r = \min\{12, 7\} = 7$.

( 6 ) Statistical inference. When $r > r_\alpha$, accept $H_0$; when $r \le r_\alpha$, reject $H_0$. Because $r = 7 > 5 = r_{10\%}$, $H_0$ is accepted: at the 10% significance level there is no significant difference between the analysis results of A and B.
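The counts and the sign limit in this example can be checked with a short Python sketch. The data are those of the example; the helper names are hypothetical, and `sign_limit` assumes that the tabulated limit is the largest $r$ with $P(X \le r) \le \alpha/2$ for $X \sim \mathrm{Binomial}(N, 1/2)$, an assumption that reproduces the value $r_{10\%} = 5$ for $N = 19$ quoted above.

```python
from math import comb

a = [14.7, 15.0, 15.2, 14.8, 15.5, 14.6, 14.9, 14.8, 15.1, 15.0,
     14.7, 14.8, 14.7, 15.0, 14.9, 14.9, 15.2, 14.7, 15.4, 15.3]
b = [14.6, 15.1, 15.4, 14.7, 15.2, 14.7, 14.8, 14.6, 15.2, 15.0,
     14.6, 14.6, 14.8, 15.3, 14.7, 14.6, 14.8, 14.9, 15.2, 15.0]


def sign_counts(xs, ys):
    """Return (n_plus, n_minus, n_zero) for the paired data."""
    n_plus = sum(x > y for x, y in zip(xs, ys))
    n_minus = sum(x < y for x, y in zip(xs, ys))
    return n_plus, n_minus, len(xs) - n_plus - n_minus


def sign_limit(n, alpha):
    """Largest r with P(X <= r) <= alpha / 2, X ~ Binomial(n, 1/2); -1 if none."""
    tail, r = 0.0, -1
    for k in range(n + 1):
        tail += comb(n, k) / 2 ** n
        if tail > alpha / 2:
            break
        r = k
    return r


n_plus, n_minus, n_zero = sign_counts(a, b)
r = min(n_plus, n_minus)
limit = sign_limit(n_plus + n_minus, 0.10)
print(n_plus, n_minus, n_zero, r, limit, "accept H0" if r > limit else "reject H0")
```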

Sign test table

Columns: $N$, followed by the sign limits at the significance levels $\alpha$ = 1, 5, 10, 25 ( % ).

0 0

0 0 1

0 0 1 1

0 1 1 2

0 1 2 3

1 2 2 3

1 2 3 3

1 2 3 4

2 3 3 4

2 3 4 5

2 4 4 5

3 4 5 6

3 5 5 6

4 5 6 7

4 6 7 8

5 6 7 8

5 7 7 9

6 7 8 9

6 7 8 10

6 8 9 10

7 8 9 10

7 9 10 11

8 9 10 12

8 10 11 12

9 10 11 13

9 11 12 13

9 11 12 14

10 12 13 14

11 12 13 15

11 13 14 15

11 13 14 16

12 14 15 16

12 14 15 17

13 15 16 17

13 15 16 18

14 16 17 19

15 17 18 19

15 17 18 20

15 18 19 20

16 18 19 21

16 18 20 21

17 19 20 22

17 20 21 23

18 20 21 23

18 21 22 24

19 21 22 24

19 21 23 25

20 22 23 25

20 22 24 25

20 23 24 26

21 23 24 26

21 24 25 27

22 24 25 27

22 25 26 28

23 25 27 29

23 26 27 29

24 26 28 30

24 27 28 30

25 27 28 31

25 28 29 31

25 28 29 32

26 28 30 32

26 29 30 32

27 29 31 33

27 30 31 33

28 30 32 34

28 31 32 34

28 31 33 35

29 32 33 35

29 32 33 36

30 32 34 36

30 33 34 37

31 33 35 37

31 34 35 38

31 34 36 38

32 35 36 39

[ Note ] The numbers in the table are the sign limits $r_\alpha$ corresponding to $N = n_+ + n_-$ and the significance level $\alpha$.

[ Rank sum test ] This method is more accurate than the sign test, makes better use of the information provided by the data, and does not require the data to be "paired". The steps are illustrated by the following example.

Example. A life test is carried out on a product made of two materials, A and B, with the following results:

A 1610 1650 1680 1700 1750 1720 1800

B 1580 1600 1640 1640 1700

Is there a significant difference in the effect of the two materials on product life?

Solution. Arrange the above data in increasing order in the following table:

Rank	1	2	3	4	5	6	7	8	9	10	11	12
A			1610			1650	1680	1700		1720	1750	1800
B	1580	1600		1640	1640				1700

The rank in the first row of the table is the ordinal position of each datum when the data are arranged from small to large. The value 1700 occurs in both A and B and occupies positions 8 and 9; each of the two data is assigned the average rank 8.5.

Statistical hypothesis testing steps, each followed by its analysis for the example:

( 1 ) State the hypothesis $H_0$: there is no significant difference in the effect of the two materials on product life.

( 2 ) Select the statistic $T$ = sum of the ranks of the group with the smaller number of samples.

( 3 ) Give the significance level: $\alpha = 5\%$.

( 4 ) Find the critical values. Check the rank sum test table (see below) with the parameters $n_1 = 5$, $n_2 = 7$ ($n_1 \le n_2$ are the sizes of the two samples) to get the lower limit $T_1$ and the upper limit $T_2$ of $T$ (that is, the rejection region is $T \le T_1$ or $T \ge T_2$).

( 5 ) Calculate the statistic: $T = 1 + 2 + 4 + 5 + 8.5 = 20.5$ (the rank sum of group B).

( 6 ) Statistical inference. When $T_1 < T < T_2$, accept $H_0$; when $T \le T_1$ or $T \ge T_2$, reject $H_0$. Because $T = 20.5 < T_1$, $H_0$ is rejected: at the 5% significance level, the effects of the two materials on product life are considered significantly different.
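The rank sum for this example can be reproduced with the sketch below, which uses the data of the example and assigns average ranks to tied values. The helper name `average_ranks` is hypothetical, and the limits $T_1$, $T_2$ still have to be read from the rank sum test table.

```python
def average_ranks(values):
    """Map each distinct value to the average of the positions it occupies
    when all values are sorted in increasing order (positions start at 1)."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return {v: sum(p) / len(p) for v, p in positions.items()}


a = [1610, 1650, 1680, 1700, 1750, 1720, 1800]
b = [1580, 1600, 1640, 1640, 1700]

ranks = average_ranks(a + b)
smaller = a if len(a) < len(b) else b      # group with fewer observations
t = sum(ranks[v] for v in smaller)
print(t)  # 20.5
```

The two values 1640 both belong to group B, so giving them ranks 4 and 5 (as in the text) or 4.5 each leaves the rank sum unchanged.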

Rank sum test table (columns $n_1$, $n_2$ and the rank sum limits $T_1$, $T_2$)

[ Note ] The header gives the numbers of data $n_1$ and $n_2$ in the two groups; $T_1$ and $T_2$ are the lower and upper limits of the rank sum, respectively. The limits corresponding to one significance level are printed in bold, and those corresponding to the other in ordinary type.

3. Analysis of variance

Analysis of variance is a method of analyzing experimental (or observational) data. The basic problem it solves is to clarify the influence of various factors related to the research object and the interaction between various factors on the object through data analysis. The objects it studies are assumed to follow a normal distribution.

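[ One-way ANOVA ] Consider the influence of the different levels of a single factor $A$ on the object under investigation. Tests are made at $k$ different levels $A_i$ of $A$ (with distributions $N(\mu_i, \sigma_i^2)$), giving the test data $x_{ij}$ ($i = 1, 2, \dots, k$; $j = 1, 2, \dots, n_i$). Assuming $\sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$ (although the common value is unknown), test whether the means of the test results at the levels $A_i$ differ significantly. The steps of the test are as follows:

( 1 ) Hypothesis $H_0$: $\mu_1 = \mu_2 = \dots = \mu_k$.

( 2 ) Select the statistic and determine its distribution:

$F = \dfrac{S_1 / (k - 1)}{S_2 / (n - k)} \sim F(k - 1, n - k),$

where

$S_1 = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2, \quad S_2 = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2, \quad \bar{x}_i = \dfrac{1}{n_i} \sum_{j=1}^{n_i} x_{ij}, \quad \bar{x} = \dfrac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}, \quad n = \sum_{i=1}^{k} n_i.$

( 3 ) Give the significance level $\alpha$.

( 4 ) Find the critical value $F_\alpha$ from the $F$ distribution table (degrees of freedom $(k - 1, n - k)$); it satisfies $P(F > F_\alpha) = \alpha$.

( 5 ) Tabulate the data and calculate the statistic.

Level	Test data $x_{ij}$	$n_i$
$A_1$	$x_{11}, x_{12}, \dots, x_{1 n_1}$	$n_1$
$A_2$	$x_{21}, x_{22}, \dots, x_{2 n_2}$	$n_2$
...	...	...
$A_k$	$x_{k1}, x_{k2}, \dots, x_{k n_k}$	$n_k$

( 6 ) One-way ANOVA table

Source of variance	Sum of squares	Degrees of freedom	Mean square	Statistic	Critical value	Statistical inference
Between groups	$S_1$	$k - 1$	$S_1 / (k - 1)$	$F = \dfrac{S_1 / (k - 1)}{S_2 / (n - k)}$	$F_\alpha$	When $F \le F_\alpha$, accept $H_0$; when $F > F_\alpha$, reject $H_0$
Within groups	$S_2$	$n - k$	$S_2 / (n - k)$
Total	$S_1 + S_2$	$n - 1$

Explanation: 1° If the values of $x_{ij}$ are large, choose a constant $c$ and use $x_{ij} - c$ in place of $x_{ij}$ in the above calculation; the result of the analysis does not change. 2° The between-group variance $S_1$ reflects the systematic differences caused by the different levels of factor $A$, while the within-group variance $S_2$ reflects the differences within groups caused by random factors. If the effects of the different levels $A_i$ are similar, the ratio of the between-group variance to the within-group variance is small, and $H_0$ can be accepted; if the effects of the different levels $A_i$ differ significantly, the ratio is larger, and $H_0$ cannot be accepted.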
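The between-group and within-group sums of squares and the $F$ ratio can be computed as in the following sketch. The function name and the data are hypothetical, and the formulas assume the standard definitions $S_1 = \sum_i n_i (\bar{x}_i - \bar{x})^2$ and $S_2 = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2$; the critical value $F_\alpha$ is still taken from an $F$ table with $(k - 1, n - k)$ degrees of freedom.

```python
def one_way_anova(groups):
    """Return (S1, S2, F, (k - 1, n - k)) for a list of groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares
    s1 = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares
    s2 = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    f = (s1 / (k - 1)) / (s2 / (n - k))
    return s1, s2, f, (k - 1, n - k)


# Hypothetical data: three levels of factor A, three observations each
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
s1, s2, f, dof = one_way_anova(groups)
print(s1, s2, f, dof)

# Subtracting a constant from every observation leaves F unchanged
shifted = [[x - 2.0 for x in g] for g in groups]
print(one_way_anova(shifted)[2])
```

The second call illustrates point 1° of the explanation: shifting all data by a constant does not change the analysis.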

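[ Two-way ANOVA ] Consider the influence of two factors $A$ and $B$. $A$ is divided into $l$ levels $A_1, A_2, \dots, A_l$, and $B$ is divided into $m$ levels $B_1, B_2, \dots, B_m$. Under each combination $(A_i, B_j)$ of the two factors (so that $A_i$ and $B_j$ form $lm$ combinations in the tests), $n$ trials are made, giving $lmn$ data $x_{ij}^{(k)}$ ($i = 1, \dots, l$; $j = 1, \dots, m$; $k = 1, \dots, n$), which are assumed to follow the distribution $N(\mu_{ij}, \sigma^2)$. Test whether the effect of $A$, the effect of $B$, or the interaction of $A$ and $B$ has a significant effect on the test results. The steps of the test are as follows:

( 1 ) Hypothesis $H_0$: the corresponding effect ($A$, $B$, or the interaction of $A$ and $B$) has no significant effect on the test results.

( 2 ) Select the statistics and determine their distributions:

$F_A = \dfrac{S_A / (l - 1)}{S_E / (lm(n - 1))}, \quad F_B = \dfrac{S_B / (m - 1)}{S_E / (lm(n - 1))}, \quad F_{AB} = \dfrac{S_{AB} / ((l - 1)(m - 1))}{S_E / (lm(n - 1))},$

where $F_A$, $F_B$ and $F_{AB}$ represent the effect of factor $A$, the effect of factor $B$ and the interaction of factors $A$ and $B$, respectively, and, with $x_{ij} = \sum_{k=1}^{n} x_{ij}^{(k)}$,

$P = \dfrac{1}{lmn} \Big( \sum_{i=1}^{l} \sum_{j=1}^{m} x_{ij} \Big)^2, \quad Q = \dfrac{1}{mn} \sum_{i=1}^{l} \Big( \sum_{j=1}^{m} x_{ij} \Big)^2, \quad R = \dfrac{1}{ln} \sum_{j=1}^{m} \Big( \sum_{i=1}^{l} x_{ij} \Big)^2,$

$T = \dfrac{1}{n} \sum_{i=1}^{l} \sum_{j=1}^{m} x_{ij}^2, \quad W = \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} \big( x_{ij}^{(k)} \big)^2.$

( 3 ) Give the significance level $\alpha$.

( 4 ) Find the critical value $F_\alpha$ from the $F$ distribution table. When the degrees of freedom are $(f_1, f_2)$, $F_\alpha$ satisfies $P(F > F_\alpha) = \alpha$.

( 5 ) Tabulate the data and calculate the statistics (Table 1 and Table 2).

	$B_1$	$B_2$	...	$B_m$
$A_1$	$x_{11}$	$x_{12}$	...	$x_{1m}$
$A_2$	$x_{21}$	$x_{22}$	...	$x_{2m}$
...	...	...	...	...
$A_l$	$x_{l1}$	$x_{l2}$	...	$x_{lm}$

( 6 ) Two-way ANOVA table

Source of variance	Sum of squares	Degrees of freedom	Mean square	Statistic
Effect of $A$	$S_A = Q - P$	$l - 1$	$S_A / (l - 1)$	$F_A$
Effect of $B$	$S_B = R - P$	$m - 1$	$S_B / (m - 1)$	$F_B$
Interaction of $A$ and $B$	$S_{AB} = T - Q - R + P$	$(l - 1)(m - 1)$	$S_{AB} / ((l - 1)(m - 1))$	$F_{AB}$
Random error	$S_E = W - T$	$lm(n - 1)$	$S_E / (lm(n - 1))$
Total sum of squares	$W - P$	$lmn - 1$

Statistical inference: for each statistic, with $F_\alpha$ the critical value for the corresponding degrees of freedom, when $F \le F_\alpha$, accept $H_0$; when $F > F_\alpha$, reject $H_0$.

When the interaction of the two factors $A$ and $B$ is not significant, $S_{AB}$ and $S_E$ are pooled together. In this case, if only one test is made under each condition $(A_i, B_j)$ (that is, $n = 1$), denote the measured data by $x_{ij}$ and let

$P = \dfrac{1}{lm} \Big( \sum_{i=1}^{l} \sum_{j=1}^{m} x_{ij} \Big)^2, \quad Q = \dfrac{1}{m} \sum_{i=1}^{l} \Big( \sum_{j=1}^{m} x_{ij} \Big)^2, \quad R = \dfrac{1}{l} \sum_{j=1}^{m} \Big( \sum_{i=1}^{l} x_{ij} \Big)^2, \quad W = \sum_{i=1}^{l} \sum_{j=1}^{m} x_{ij}^2;$

then

$S_A = Q - P, \quad S_B = R - P, \quad S_E = W - Q - R + P.$

In this case the statistics for factor $A$ and factor $B$ and their distributions are

$F_A = \dfrac{S_A / (l - 1)}{S_E / ((l - 1)(m - 1))} \sim F(l - 1, (l - 1)(m - 1)), \quad F_B = \dfrac{S_B / (m - 1)}{S_E / ((l - 1)(m - 1))} \sim F(m - 1, (l - 1)(m - 1)).$

The calculation procedure and the analysis of variance are the same as before.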
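The two-way calculation with $n$ repeated trials per cell can be sketched as follows. This is an illustration under stated assumptions: the quantities `P`, `Q`, `R`, `T`, `W` are taken to be the usual computing forms (squared totals divided by the number of observations they contain), so that $S_A = Q - P$, $S_B = R - P$, $S_{AB} = T - Q - R + P$ and $S_E = W - T$; the function name and the data are hypothetical.

```python
def two_way_anova(data):
    """data[i][j] is the list of n observations at the combination (A_i, B_j).

    Returns a dict with the sums of squares and the three F ratios.
    """
    l, m, n = len(data), len(data[0]), len(data[0][0])
    cell = [[sum(obs) for obs in row] for row in data]   # cell totals x_ij

    P = sum(sum(row) for row in cell) ** 2 / (l * m * n)
    Q = sum(sum(row) ** 2 for row in cell) / (m * n)                          # row totals
    R = sum(sum(cell[i][j] for i in range(l)) ** 2 for j in range(m)) / (l * n)  # column totals
    T = sum(x ** 2 for row in cell for x in row) / n
    W = sum(x ** 2 for row in data for obs in row for x in obs)

    s_a, s_b, s_ab, s_e = Q - P, R - P, T - Q - R + P, W - T
    ms_e = s_e / (l * m * (n - 1))                       # error mean square
    return {
        "S_A": s_a, "S_B": s_b, "S_AB": s_ab, "S_E": s_e,
        "F_A": (s_a / (l - 1)) / ms_e,
        "F_B": (s_b / (m - 1)) / ms_e,
        "F_AB": (s_ab / ((l - 1) * (m - 1))) / ms_e,
    }


# Hypothetical data: l = 2 levels of A, m = 2 levels of B, n = 2 trials per cell
data = [[[1.0, 3.0], [2.0, 4.0]],
        [[5.0, 7.0], [8.0, 10.0]]]
result = two_way_anova(data)
print(result)
```

Each $F$ ratio is then compared with the tabulated $F_\alpha$ for its own degrees of freedom, for example $(l - 1, lm(n - 1))$ for $F_A$.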

[ Analysis of variance for systematic grouping ] Investigations often use grouping by system. For example, when a county is surveyed, several communes are selected, several brigades are selected in each commune, and several production teams are selected in each brigade. This approach is called systematic grouping.

ANOVA for systematic grouping differs from multi-factor ANOVA. For example, in two-way ANOVA the factors $A$ and $B$ are parallel, but in ANOVA for systematic grouping $A$ and $B$ are not parallel: the data are first grouped by factor $A$ into $A_1, A_2, \dots, A_l$, and then within each group $A_i$ they are grouped by factor $B$ into $B_{i1}, B_{i2}, \dots, B_{im}$. The method of analysis, however, is similar.

Suppose $n$ tests are made under the conditions of factor $A_i$ and factor $B_{ij}$, and the test data are $x_{ij}^{(k)}$ ($i = 1, \dots, l$; $j = 1, \dots, m$; $k = 1, \dots, n$). The steps of the test are as follows:

( 1 ) Hypothesis $H_0$: the effect of factor $A$ (or $B$) is not significant.

( 2 ) Select the statistics $F_A$ and $F_B$, which represent the significance of the influence of factor $A$ and factor $B$, respectively.

( 3 ) Give the significance level $\alpha$.

( 4 ) Find the critical values from the $F$ distribution table for the corresponding degrees of freedom.

( 5 ) Tabulate the data and calculate the statistics.

Factor $A$	Factor $B$	Test results $x_{ij}^{(k)}$
$A_1$	$B_{11}, B_{12}, \dots, B_{1m}$	$x_{11}, x_{12}, \dots, x_{1m}$
...	...	...
$A_l$	$B_{l1}, B_{l2}, \dots, B_{lm}$	$x_{l1}, x_{l2}, \dots, x_{lm}$

Let