
The level of statistical significance (p) in psychology

The significance level is the probability that we have considered differences significant when they are in fact random.

When we state that differences are significant at the 5% significance level, or at p < 0.05, we mean that the probability that they are in fact unreliable is 0.05.

When we state that differences are significant at the 1% significance level, or at p < 0.01, we mean that the probability that they are in fact unreliable is 0.01.

In more formal language, the significance level is the probability of rejecting the null hypothesis when it is true.

The error of rejecting the null hypothesis when it is in fact true is called a Type I error (see Table 1).

Tab. 1. Null and alternative hypotheses and possible outcomes of the test.

                          H0 is true              H0 is false
  H0 is rejected          Type I error (α)        correct decision
  H0 is not rejected      correct decision        Type II error (β)

The probability of such an error is usually denoted α. Strictly speaking, we should write in parentheses not p < 0.05 or p < 0.01 but α < 0.05 or α < 0.01.

If the probability of error is α, then the probability of a correct decision is 1 − α. The smaller α is, the greater the probability of a correct decision.
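To make the meaning of α concrete, here is a minimal Python sketch (assuming NumPy and SciPy are installed; the data and sample sizes are invented) that simulates many experiments in which the null hypothesis is true by construction and counts how often p < 0.05 is obtained, i.e. how often a Type I error would be made:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments = 10_000
false_positives = 0

for _ in range(n_experiments):
    # Two samples drawn from the SAME distribution: H0 is true by construction.
    a = rng.normal(loc=0.0, scale=1.0, size=30)
    b = rng.normal(loc=0.0, scale=1.0, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1  # differences declared "significant" by chance

# Prints a proportion close to 0.05, i.e. close to alpha.
print(false_positives / n_experiments)
```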

Historically, psychology treats the 5% level (p ≤ 0.05) as the lowest level of statistical significance, the 1% level (p ≤ 0.01) as sufficient, and the 0.1% level (p ≤ 0.001) as the highest. Therefore, tables of critical values usually give the values of the test statistics corresponding to p ≤ 0.05 and p ≤ 0.01, and sometimes p ≤ 0.001. For some tests the tables indicate the exact significance level of particular empirical values; for example, for φ* = 1.56, p = 0.06.

Until the level of statistical significance reaches p = 0.05, however, we are not entitled to reject the null hypothesis. We will adhere to the following rule for rejecting the hypothesis of no differences (H0) and accepting the hypothesis of statistically significant differences (H1).

Rule for rejecting H0 and accepting H1

If the empirical value of the test equals or exceeds the critical value corresponding to p ≤ 0.05, then H0 is rejected, but we cannot yet definitely accept H1.

If the empirical value of the test equals or exceeds the critical value corresponding to p ≤ 0.01, then H0 is rejected and H1 is accepted.

Exceptions: the sign test G, Wilcoxon's T test, and the Mann-Whitney U test. For these the relation is inverse: smaller empirical values are more significant.

Fig. 4. An example of the "significance axis" for the Rosenbaum Q test.

The critical values of the test are denoted Q0.05 and Q0.01, and the empirical value Qemp; in the figure it is enclosed in an ellipse.

To the right of the critical value Q0.01 extends the "zone of significance": empirical values that exceed Q0.01 fall here and are therefore certainly significant.

To the left of the critical value Q0.05 extends the "zone of insignificance": empirical values of Q that are below Q0.05 fall here and are therefore certainly insignificant.

We see that Q0.05 = 6, Q0.01 = 9, and Qemp = 8.

The empirical value falls in the range between Q0.05 and Q0.01. This is the "zone of uncertainty": we can already reject the hypothesis that the differences are unreliable (H0), but we cannot yet accept the hypothesis that they are reliable (H1).
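As an illustration, here is a minimal Python sketch (not from the original text; the function is my own) that places an empirical value on the "significance axis" using the critical values quoted above:

```python
def significance_zone(emp, crit_05, crit_01):
    """Classify an empirical test value against the two critical values.

    Assumes the usual direct relation: larger empirical values are
    more significant (this is reversed for the sign test G,
    Wilcoxon's T and the Mann-Whitney U).
    """
    if emp < crit_05:
        return "zone of insignificance: H0 is retained"
    if emp >= crit_01:
        return "zone of significance: H0 is rejected, H1 is accepted"
    return "zone of uncertainty: H0 rejected at p <= 0.05, H1 not yet accepted"

# The Rosenbaum Q example from the text: Q0.05 = 6, Q0.01 = 9, Qemp = 8.
print(significance_zone(8, 6, 9))  # -> zone of uncertainty
```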

In practice, however, a researcher may consider significant any differences that do not fall into the zone of insignificance, declaring them significant at p < 0.05 or indicating the exact significance level of the obtained empirical value, for example p = 0.02. With the standard tables found in all textbooks on mathematical methods, this can be done for the Kruskal-Wallis H test, Friedman's χ²r, Page's L, and Fisher's φ*.
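Modern software makes the table lookup unnecessary. Here is a hedged sketch showing how the significance level of an empirical Kruskal-Wallis H value could be obtained with SciPy (the data are invented purely for illustration):

```python
from scipy import stats

# Three invented small samples, for illustration only.
g1 = [3, 5, 7, 9]
g2 = [4, 6, 8, 10]
g3 = [1, 2, 3, 4]

# kruskal returns the empirical H and its significance level directly.
h, p = stats.kruskal(g1, g2, g3)
print(f"H = {h:.2f}, significance level p = {p:.3f}")
```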

The level of statistical significance, or the critical values of a test, are determined differently for directional and non-directional statistical hypotheses.

With a directional statistical hypothesis a one-tailed test is used; with a non-directional hypothesis, a two-tailed test. The two-tailed test is more stringent because it tests differences in both directions, so an empirical value that previously corresponded to the significance level p < 0.05 now corresponds only to p < 0.10.

We do not have to decide each time whether to use a one-tailed or a two-tailed test. The tables of critical values are arranged so that directional hypotheses correspond to a one-tailed test and non-directional hypotheses to a two-tailed test, and the tabulated values satisfy the requirements of each. The researcher only needs to make sure that his hypotheses coincide in meaning and form with the hypotheses proposed in the description of each test.
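The relation between one-tailed and two-tailed levels can be checked numerically. A minimal sketch with SciPy's normal distribution (the z value is arbitrary, chosen to match the claim above):

```python
from scipy import stats

z = 1.75  # an arbitrary empirical value of a z-type statistic
p_one_tailed = stats.norm.sf(z)           # directional hypothesis
p_two_tailed = 2 * stats.norm.sf(abs(z))  # non-directional hypothesis

print(p_one_tailed)  # ~0.040: significant at p < 0.05 one-tailed
print(p_two_tailed)  # ~0.080: only p < 0.10 two-tailed, as stated above
```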

Lecture 4

General Principles for Testing Statistical Hypotheses

We emphasize once again that data obtained from an experiment on any sample serve as the basis for judgments about the general population. However, because of random, probabilistic causes, an estimate of the parameters of the general population made from experimental (sample) data is always accompanied by error, so such estimates should be regarded as conjectures rather than final statements. Such assumptions about the properties and parameters of the general population are called statistical hypotheses.

The essence of testing a statistical hypothesis is to establish whether the experimental data are consistent with the hypothesis put forward: is it permissible to attribute the discrepancy between the hypothesis and the result of the statistical analysis of the experimental data to random causes? Thus a statistical hypothesis is a scientific hypothesis that admits statistical testing, and mathematical statistics is the scientific discipline whose task is the scientifically grounded testing of statistical hypotheses.

Statistical hypotheses

When testing statistical hypotheses, two concepts are used: the null hypothesis (denoted H0) and the alternative hypothesis (denoted H1).

The null hypothesis is the hypothesis of no differences. It is denoted H0 and is called null because it contains the number 0: X1 − X2 = 0, where X1 and X2 are the compared feature values.

The null hypothesis is what we want to disprove if we are faced with the task of proving the significance of the differences.

The alternative hypothesis is the hypothesis about the significance of differences. It is denoted H1. The alternative hypothesis is what we want to prove, which is why it is sometimes called the experimental hypothesis.

There are problems in which it is required to prove precisely the insignificance of differences, i.e. to confirm the null hypothesis. More often, however, it is required to prove the significance of differences, since they are more informative in the search for something new.

The null and alternative hypotheses can be directional or non-directional.

Directional hypotheses

H0: X1 does not exceed X2.

H1: X1 exceeds X2.

Non-directional hypotheses

H0: X1 does not differ from X2.

H1: X1 differs from X2.

If during the experiment it is noticed that in one group the individual values of the subjects on some trait, for example social courage, are higher, while in the other they are lower, then to test the significance of these differences we must formulate directional hypotheses.

If we need to prove that the first group underwent more pronounced changes under some experimental influence than the second group, then here too directional hypotheses must be formulated.

If we need to prove that the distributions of a feature differ between the first and second groups, then non-directional hypotheses are formulated.
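In code, the choice between directional and non-directional hypotheses is simply the choice of the alternative in the test call. A hedged sketch with SciPy's independent-samples t-test (the data are invented; the `alternative` argument requires SciPy 1.6 or newer):

```python
from scipy import stats

group1 = [14, 16, 15, 18, 17, 16]  # e.g. social-courage scores, invented
group2 = [12, 13, 15, 14, 12, 13]

# Non-directional hypotheses (H1: group1 differs from group2) -> two-sided test.
print(stats.ttest_ind(group1, group2, alternative="two-sided").pvalue)

# Directional hypotheses (H1: group1 exceeds group2) -> one-sided test.
print(stats.ttest_ind(group1, group2, alternative="greater").pvalue)
```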

Comment. The description of each test gives the formulations of the hypotheses it helps to verify.

Generally speaking, when accepting or rejecting hypotheses, various options are possible.

For example, a psychologist conducted selective intelligence testing of adolescents from complete and single-parent families. Processing of the experimental data showed that adolescents from single-parent families have, on average, lower intelligence scores than their peers from complete families. Can the psychologist conclude from these results that an incomplete family leads to lower intelligence in adolescents? The conclusion adopted in such cases is called a statistical decision. We emphasize that such a decision is always probabilistic.

When testing a hypothesis, the experimental data may contradict it, in which case the hypothesis is rejected. Otherwise, i.e. if the experimental data agree with the hypothesis, it is not rejected. In such cases it is often said that the hypothesis is accepted (this formulation is not entirely accurate, but it is widely used and we will use it in what follows). This shows that statistical testing of hypotheses on experimental, sample data is inevitably associated with the risk (probability) of making a false decision. Two kinds of errors are possible here.

A Type I error occurs when the decision is made to reject a hypothesis that is in fact true.

A Type II error occurs when the decision is made not to reject a hypothesis that is in fact false. Obviously, correct conclusions can also be drawn in two cases. The above is best presented in the form of Table 1:

Table 1

                          H0 is true              H0 is false
  H0 is rejected          Type I error (α)        correct decision
  H0 is not rejected      correct decision        Type II error (β)

The psychologist may thus be mistaken in his statistical decision; as Table 1 shows, these errors can be of only two kinds. Since errors in accepting statistical hypotheses cannot be excluded, the possible consequences of accepting an incorrect statistical hypothesis must be minimized. In most cases the only way to minimize errors is to increase the sample size.
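The effect of sample size on the Type II error can be seen in a small simulation. A hedged sketch (the effect size and sample sizes are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def type_ii_rate(n, n_sim=2000, effect=0.5):
    """Proportion of simulations in which a real difference of `effect`
    standard deviations is missed at alpha = 0.05 (a Type II error)."""
    misses = 0
    for _ in range(n_sim):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(effect, 1.0, n)  # H0 is false by construction
        if stats.ttest_ind(a, b).pvalue >= 0.05:
            misses += 1
    return misses / n_sim

for n in (10, 30, 100):
    print(n, type_ii_rate(n))  # the miss rate falls as the sample grows
```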

The concept of the level of statistical significance

When justifying a statistical inference, one should decide: where is the line between accepting and rejecting the null hypothesis? Because of random influences in the experiment, this boundary cannot be drawn with absolute exactness. It is based on the concept of the significance level.

Def. The significance level is the probability of incorrectly rejecting the null hypothesis; in other words, it is the probability of a Type I error in decision making.

This probability is usually denoted either by the Greek letter α or by the Latin letter p. In what follows we will use the letter p.

Historically, in the applied sciences that use statistics, and in psychology in particular, the lowest level of statistical significance is taken to be p = 0.05, a sufficient level p = 0.01, and the highest level p = 0.001. Therefore, the statistical tables given in the appendices of statistics textbooks usually list tabular values for the levels p = 0.05, p = 0.01 and p = 0.001; sometimes tabular values are also given for p = 0.025 and p = 0.005. The values 0.05, 0.01 and 0.001 are the so-called standard levels of statistical significance. In the statistical analysis of experimental data the psychologist must choose the required significance level depending on the aims and hypotheses of the study. As you can see, the largest value, or the lower bound of the level of statistical significance, is 0.05: this means that five errors are allowed in a sample of one hundred elements (cases, subjects), or one error in twenty. It is considered that we cannot afford to be mistaken six, seven, or more times out of a hundred: the cost of such mistakes would be too high.

Note that modern statistical software packages use not the standard significance levels but levels computed directly in the course of applying the corresponding statistical method. These levels, denoted by the letter p, can take any numerical value in the range from 0 to 1, for example p = 0.7, p = 0.23 or p = 0.012. In the first two cases the obtained significance levels are too high and the result cannot be called significant; in the last case the result is significant at the level of 12 thousandths, which is a reliable level.

The rule for reaching a statistical conclusion is as follows. From the experimental data the psychologist calculates, by the statistical method he has chosen, the so-called empirical statistic, or empirical value, which it is convenient to denote Chemp. The empirical statistic Chemp is then compared with two critical values, which correspond to the 5% and 1% significance levels for the chosen statistical method and which are denoted Chcr. The values of Chcr are found for the given method from the corresponding tables given in the appendix of any statistics textbook. These values are, as a rule, different, and for convenience they can be called Chcr1 and Chcr2. The critical values found from the tables are conveniently written in the following standard form:

Chcr1 (p = 0.05); Chcr2 (p = 0.01).

We emphasize, however, that the notation Chemp and Chcr is used here simply as an abbreviation of the word "number". Each statistical method has its own accepted symbols for all these quantities: both for the empirical value calculated by the method and for the critical values found from the tables. For example, when calculating Spearman's rank correlation coefficient from the table of critical values of this coefficient, which for this method is denoted by the Greek letter ρ ("rho"), the following critical values were found: for p = 0.05, ρcr1 = 0.61, and for p = 0.01, ρcr2 = 0.76.

In the standard notation adopted below, this is written:

ρcr1 = 0.61 (p = 0.05); ρcr2 = 0.76 (p = 0.01).

Now we need to compare our empirical value with the two critical values found in the tables. This is best done by placing all three numbers on the so-called "significance axis". The "significance axis" is a straight line at whose left end lies 0 (although it is usually not marked on the line itself), with the number series increasing from left to right; in essence, this is the familiar x-axis of the Cartesian coordinate system. The peculiarity of this axis, however, is that three sections, or "zones", are marked out on it. The left zone is called the zone of insignificance, the right the zone of significance, and the intermediate one the zone of uncertainty. The boundaries of the three zones are Chcr1 for p = 0.05 and Chcr2 for p = 0.01, as shown in the figure below.


Depending on the decision rule (inference rule) prescribed by the statistical method, two options are possible.

First option: the alternative hypothesis is accepted if Chemp ≥ Chcr. Second option: the alternative hypothesis is accepted if Chemp ≤ Chcr (this is the case for the tests with the inverse relation noted above: the sign test G, Wilcoxon's T, and the Mann-Whitney U).

[Figure: the "significance axis". The zone of insignificance lies to the left of Chcr1 (p = 0.05), the zone of significance to the right of Chcr2 (p = 0.01), and the zone of uncertainty between them.]

The value Chemp calculated by a given statistical method must fall into one of the three zones.

If the empirical value falls into the zone of insignificance, the hypothesis H0 of no differences is accepted.

If Chemp falls into the zone of significance, the alternative hypothesis H1 of the presence of differences is accepted and the hypothesis H0 is rejected.

If Chemp falls into the zone of uncertainty, the researcher faces a dilemma. Depending on the importance of the problem being solved, he may consider the obtained statistical estimate reliable at the 5% level, thus accepting hypothesis H1 and rejecting H0, or unreliable at the 1% level, thus accepting hypothesis H0. We emphasize, however, that this is exactly the case in which the psychologist may commit an error of the first or second kind. As discussed above, in these circumstances it is best to increase the sample size.

Note also that the value Chemp may coincide exactly with Chcr1 or Chcr2. In the first case we may consider the estimate reliable at exactly the 5% level and accept hypothesis H1 or, conversely, accept hypothesis H0. In the second case the alternative hypothesis H1 about the presence of differences is, as a rule, accepted and the hypothesis H0 is rejected.
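Putting the whole procedure together, here is a sketch that reproduces the Spearman example above (ρcr1 = 0.61 at p = 0.05, ρcr2 = 0.76 at p = 0.01) and also covers the inverse-relation tests; the helper function and the Mann-Whitney critical values (27 and 19) are hypothetical, invented for illustration:

```python
def decide(emp, crit_05, crit_01, larger_is_significant=True):
    """Three-zone decision rule. Set larger_is_significant=False for
    tests with the inverse relation (sign test G, Wilcoxon T, Mann-Whitney U)."""
    if not larger_is_significant:
        # Mirror the comparisons so the same zone logic applies.
        emp, crit_05, crit_01 = -emp, -crit_05, -crit_01
    if emp < crit_05:
        return "zone of insignificance: accept H0"
    if emp >= crit_01:
        return "zone of significance: reject H0, accept H1"
    return "zone of uncertainty: reliable at the 5% level only"

# Spearman example: rho_cr1 = 0.61 (p = 0.05), rho_cr2 = 0.76 (p = 0.01).
print(decide(0.70, 0.61, 0.76))  # -> zone of uncertainty

# Mann-Whitney U (invented critical values): SMALLER values are more significant.
print(decide(15, 27, 19, larger_is_significant=False))  # -> zone of significance
```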

Determine the expected results of your experiment. Usually, when scientists conduct an experiment, they already have an idea of which results to consider "normal" or "typical". This may be based on the results of past experiments, on reliable data sets, on data from the scientific literature, or on other sources. For your experiment, determine the expected results and express them as numbers.

  • Example: suppose earlier nationwide research has shown that speeding tickets are issued to red cars more often than to blue ones, in an average ratio of 2:1. Our task is to determine whether the police in our city show the same bias with respect to car color, by analyzing the speeding tickets they issue. If we take a random set of 150 speeding tickets issued to owners of red or blue cars, we expect 100 tickets to go to owners of red cars and 50 to owners of blue cars, if our city's police are as biased about car color as police nationwide.

Determine the observed results of your experiment. Now that you have determined the expected results, run the experiment and find the actual (or "observed") values, again expressed as numbers. If we create the experimental conditions and the observed results differ from the expected ones, there are two possibilities: either this happened by chance, or it was caused by our experimental manipulation. The purpose of finding the p-value is precisely to determine whether the observed results differ from the expected ones enough to warrant rejecting the "null hypothesis", the hypothesis that there is no relation between the experimental variables and the observed results.

  • Example: in our city we randomly selected 150 speeding tickets issued to owners of either red or blue cars and found that 90 tickets went to owners of red cars and 60 to owners of blue ones. This differs from the expected results of 100 and 50, respectively. Did our experimental manipulation (in this case, changing the data source from national to municipal) actually lead to this change in results, or are our city's police biased in just the same way as the national average, and we are merely seeing a random deviation? The p-value will help us decide.
  • Determine the number of degrees of freedom of your experiment. The number of degrees of freedom is a measure of the variability in your experiment, determined by the number of categories you are examining. The equation for the number of degrees of freedom is: degrees of freedom = n − 1, where "n" is the number of categories or variables analyzed in your experiment.

    • Example: in our experiment there are two categories of outcomes: one for owners of red cars and one for owners of blue cars. Therefore we have 2 − 1 = 1 degree of freedom. If we were comparing red, blue, and green cars, we would have 2 degrees of freedom, and so on.
  • Compare the expected and observed results using the chi-square test. Chi-square (written χ²) is a numerical value measuring the difference between the expected and observed values of the experiment. The equation for chi-square is: χ² = Σ((o − e)²/e), where "o" is the observed value and "e" is the expected value. Sum the results of this equation over all possible outcomes (see below).

    • Note that this equation includes the summation operator Σ (sigma). In other words, you calculate (o − e)²/e for each possible outcome and add the numbers together to obtain the chi-square value. In our example there are two possible outcomes: the ticketed car is either red or blue. So we compute (o − e)²/e twice, once for the red cars and once for the blue cars.
    • Example: let us substitute our expected and observed values into the equation χ² = Σ((o − e)²/e). Remember that because of the summation operator we compute (o − e)²/e twice, once for the red cars and once for the blue ones. The calculation goes as follows:
      • χ² = ((90 − 100)²/100) + ((60 − 50)²/50)
      • χ² = ((−10)²/100) + ((10)²/50)
      • χ² = (100/100) + (100/50) = 1 + 2 = 3
  • Select a significance level. Now that we know the number of degrees of freedom and the chi-square value, one more thing is needed before we can find our p-value: we must set the significance level. In simple terms, the significance level indicates how confident we want to be in our results: a low significance value corresponds to a low allowed probability that the experimental results arose by chance, and vice versa. Significance levels are written as decimal fractions (such as 0.01), corresponding to the probability of having obtained the experimental results by chance (here, 1%).

  • Use a chi-square distribution table to find the p-value. Scientists and statisticians use large tables to find the p-values of their experiments. Such tables usually have a vertical axis on the left for the number of degrees of freedom and a horizontal axis on top for the p-values. First find your number of degrees of freedom, then scan that row from left to right until you find the first value greater than your chi-square value. Look at the corresponding p-value at the top of that column: your p-value lies between this number and the next one (the one to its left).

    • Chi-square distribution tables are available from many sources: they can be found online or in statistics and science books. If you do not have such a book handy, use a freely available online table, for example at medcalc.org.
    • Example: our chi-square value was 3. Since our experiment has 1 degree of freedom, we choose the very first row of the table and move from left to right along it until we meet a value greater than 3, our chi-square value. The first such value is 3.84. Looking up its column, we see that the corresponding p-value is 0.05. This means our p-value lies between 0.05 and 0.1 (the next p-value in the table in ascending order).
  • Decide whether to reject or retain the null hypothesis. Since you have found the approximate p-value of your experiment, you must decide whether or not to reject its null hypothesis (recall, this is the hypothesis that the experimental variables you manipulated did not influence the results you observed). If the p-value is less than the significance level, congratulations: you have shown that a relationship between the manipulated variables and the observed results is very likely. If the p-value is greater than the significance level, you cannot say with confidence whether the observed results were the product of pure chance or of the manipulation of the variables.

    • Example: our p-value lies between 0.05 and 0.1, so it is clearly not less than 0.05, and unfortunately we cannot reject our null hypothesis. This means we have not reached the minimum 95% confidence needed to say that the police in our city issue tickets to owners of red and blue cars at rates appreciably different from the national average.
    • In other words, there is a 5-10% chance that the results we observe are not a consequence of the change of setting (analyzing the city rather than the whole country) but simply an accident. Since we have required that the error probability not exceed 5%, we cannot say with certainty that the police in our city are less biased toward owners of red cars: there remains a small (but not negligible) probability that they are not. The same calculation in code is sketched below.
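The entire worked example can be reproduced in a few lines; a minimal sketch assuming SciPy is available (scipy.stats.chisquare takes the observed and expected frequencies):

```python
from scipy import stats

observed = [90, 60]   # tickets to red and blue cars in our city
expected = [100, 50]  # expected under the nationwide 2:1 ratio

chi2, p = stats.chisquare(observed, f_exp=expected)
print(chi2)  # 3.0, matching the hand calculation above
print(p)     # ~0.083: between 0.05 and 0.1, so H0 is not rejected at 0.05
```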
  • Hypothesis testing is carried out by means of statistical analysis. Statistical significance is assessed with the p-value, which corresponds to the probability of observing a given result under the assumption that a certain statement (the null hypothesis) is true. If the p-value is less than the chosen level of statistical significance (usually 0.05), the experimenter may safely conclude that the null hypothesis is false and move on to the alternative hypothesis. Using Student's t-test, you can calculate the p-value and determine the significance for two data sets.

    Steps

    Part 1

    Setting up an experiment

      Define your hypotheses. The first step in evaluating statistical significance is to choose the question you want to answer and to formulate a hypothesis. A hypothesis is a statement about the experimental data, their distribution and properties. Every experiment has both a null and an alternative hypothesis. Generally speaking, you will be comparing two sets of data to determine whether they are similar or different.

      • The null hypothesis (H0) usually states that there is no difference between the two data sets. For example: students who read the material before class do not get higher marks.
      • The alternative hypothesis (Ha) is the opposite of the null hypothesis and is the statement to be confirmed with the experimental data. For example: students who read the material before class get higher marks.
    1. Set the significance level: how much must the distribution of the data differ from the usual one for the result to be considered significant? The significance level (also called the α-level) is the threshold you set for statistical significance. If the p-value is less than or equal to the significance level, the data are considered statistically significant.

      Decide whether to use a one-tailed or a two-tailed test. One of the assumptions of Student's t-test is that the data are normally distributed. The normal distribution is a bell-shaped curve with most of the results in the middle of the curve. Student's t-test is a mathematical method of checking whether your data fall outside the normal distribution (above, below, or in the "tails" of the curve).

      • If you're not sure if the data is above or below the control group, use a two-tailed test. This will allow you to determine the significance in both directions.
      • If you know in which direction the data might fall outside of the normal distribution, use a one-tailed test. In the example above, we expect students' grades to go up, so a one-tailed test can be used.
    2. Determine the sample size using statistical power. The statistical power of a study is the probability that, with a given sample size, the expected result will be obtained. A common power threshold is 80% (which corresponds to β = 0.2). A power analysis without any prior data can be tricky, because it requires some idea of the expected means of each data set and of their standard deviations. Use an online statistical power calculator to determine the optimal sample size for your data (see the sketch after this list).

      • Typically, researchers conduct a small pilot study to provide data for power analysis and determine the sample size needed for a larger and more complete study.
      • If you do not have the opportunity to conduct a pilot study, try to estimate possible average values ​​based on the literature data and the results of other people. This may help you determine the optimal sample size.
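As an illustration of such a power calculation, here is a hedged sketch using the statsmodels package (assuming it is installed; the effect size of 0.5 is invented for the example):

```python
from statsmodels.stats.power import TTestIndPower

# Solve for the per-group sample size that gives 80% power to detect
# a medium standardized effect (Cohen's d = 0.5) at alpha = 0.05.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
print(round(n_per_group))  # ~64 participants per group
```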

      Part 2

      Compute Standard Deviation
      1. Write down the formula for the standard deviation. The standard deviation shows how large the spread of the data is and allows you to judge how close together the data from a particular sample lie. At first glance the formula seems complicated, but the explanations below will help you understand it. The formula is: s = √(∑(xi − µ)²/(N − 1)).

        • s is the standard deviation;
        • the sign ∑ indicates that all the values obtained in the sample are to be summed;
        • xi is the i-th value, that is, an individual result;
        • µ is the mean for the group;
        • N is the total number of values in the sample.
      2. Find the mean of each group. To calculate the standard deviation you must first find the mean for each study group. The mean is denoted by the Greek letter µ (mu). To find it, simply add all the obtained values and divide by the number of data points (the sample size).

        • For example, to find the average grade in a group of students who study material before class, consider a small data set. For simplicity, we use a set of five points: 90, 91, 85, 83 and 94.
        • Let's add all the values ​​together: 90 + 91 + 85 + 83 + 94 = 443.
        • Divide the sum by the number of values, N = 5: 443/5 = 88.6.
        • Thus, the average value for this group is 88.6.
      3. Subtract the mean from each value obtained. The next step is to compute the differences (xi − µ): subtract the found mean from each obtained value. In our example we need to find five differences:

        • (90 - 88.6), (91 - 88.6), (85 - 88.6), (83 - 88.6) and (94 - 88.6).
        • As a result, we get the following values: 1.4, 2.4, -3.6, -5.6 and 5.4.
      4. Square each value obtained and add them together. Each of the quantities just found should be squared. This step will remove all negative values. If after this step you still have negative numbers, then you forgot to square them.

        • For our example, we get 1.96, 5.76, 12.96, 31.36 and 29.16.
        • We add the obtained values: 1.96 + 5.76 + 12.96 + 31.36 + 29.16 = 81.2.
      5. Divide by the sample size minus 1. In the formula the sum is divided by N − 1 because we are dealing not with the whole general population but with a sample of students used for the estimate.

        • Subtract: N - 1 = 5 - 1 = 4
        • Divide: 81.2/4 = 20.3
      6. Take the square root. After dividing the sum by the sample size minus one, take the square root of the found value. This is the last step in calculating the standard deviation. There are statistical programs that, after entering the initial data, perform all the necessary calculations.

        • In our example, the standard deviation of the marks of those students who read the material before class is s = √20.3 = 4.51.
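The whole of Part 2 can be checked with NumPy (ddof=1 gives the sample standard deviation with the N − 1 divisor used above):

```python
import numpy as np

scores = np.array([90, 91, 85, 83, 94])
print(scores.mean())       # 88.6, the group mean
print(scores.std(ddof=1))  # ~4.51, the sample standard deviation
```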

      Part 3

      Determine Significance
      1. Calculate the variance between the two groups of data. Up to this step, we have considered the example for only one group of data. If you want to compare two groups, obviously you should take the data for both groups. Calculate the standard deviation for the second group of data and then find the variance between the two experimental groups. The dispersion is calculated using the following formula: s d = √((s 1 /N 1) + (s 2 /N 2)).