why statistics is used (what I want to learn from the statistical information)
when I do not know exactly what the data means (when I need more knowledge about the field)
when the data itself is easy to understand but there is a lot of it (with a large volume, it is difficult to figure out the details)
to learn about the data in case the measurement was done incorrectly (the data contains too much noise, or its reliability needs to be checked)
the tendency of covariance
the degree to which two variables deviate together from their respective means => positive covariance: when one variable moves away from its mean in the positive direction, the other also tends to move away from its mean in the positive direction. Negative covariance: when one moves in the positive direction, the other tends to move in the negative direction.
Covariance is built from the products of the deviations, so larger deviations give a larger value. But when the variables are measured in different units, the scale of the deviations also differs. In other words, covariances of such data cannot be compared directly, so you need to standardize the data first.
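A minimal sketch of the point above, in plain Python. The height and weight values are made up for illustration, and the helper names (mean, covariance, standardize) are my own, not from any library. The same heights expressed in metres and in centimetres give covariances that differ by a factor of about 100, while the covariance of the standardized values does not depend on the unit.

```python
def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def standardize(xs):
    """Convert values to z-scores (mean 0, sample standard deviation 1)."""
    mx = mean(xs)
    sd = covariance(xs, xs) ** 0.5  # variance is the covariance of xs with itself
    return [(x - mx) / sd for x in xs]

# Hypothetical data: the same heights in two units, plus weights in kg.
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
height_cm = [h * 100 for h in height_m]
weight_kg = [50.0, 58.0, 65.0, 74.0, 80.0]

# Raw covariance changes with the unit of measurement.
cov_m = covariance(height_m, weight_kg)
cov_cm = covariance(height_cm, weight_kg)

# After standardizing, the unit no longer matters.
cov_std_m = covariance(standardize(height_m), standardize(weight_kg))
cov_std_cm = covariance(standardize(height_cm), standardize(weight_kg))

print(cov_m, cov_cm)          # cov_cm is about 100 times cov_m
print(cov_std_m, cov_std_cm)  # the two values agree
```

The covariance of the standardized values is the correlation coefficient, which is why it can be compared across data with different units.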
correlation coefficient
The correlation coefficient alone does not explain anything (e.g. why or how the variables are related); it only measures how strongly they move together linearly.
The correlation coefficient cannot describe a nonlinear relationship. (You have to check first whether the relationship among the data is linear.)
Because raw data inherently includes noise, you cannot observe a perfectly linear relationship. (The coefficients of a linear regression include that error.)
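A small sketch of the two limits noted above, in plain Python. The data and the noise values are made up, and pearson_r and ols_slope are my own helper names. A perfect but nonlinear relationship (y = x squared over symmetric x) gives a correlation of zero, and a linear relationship with noise gives a correlation below 1 and a fitted slope that is slightly off from the true slope of 2.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def ols_slope(xs, ys):
    """Slope of the least-squares line y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

x = [-3, -2, -1, 0, 1, 2, 3]

# Perfect nonlinear relationship: y is fully determined by x, yet r is 0.
y_quadratic = [v * v for v in x]
r_quadratic = pearson_r(x, y_quadratic)

# Linear relationship with true slope 2, plus hypothetical measurement noise.
noise = [0.5, -0.4, 0.3, 0.0, -0.3, 0.4, -0.5]
y_noisy = [2 * v + e for v, e in zip(x, noise)]
r_noisy = pearson_r(x, y_noisy)
slope_noisy = ols_slope(x, y_noisy)

print(r_quadratic)   # zero: the linear measure misses the curve
print(r_noisy)       # close to 1 but below it
print(slope_noisy)   # close to 2 but not exactly 2
```

Plotting the data before computing the coefficient is the usual way to catch the first case.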
basic concepts of the sample versus the population
The data should be unbiased. (If the sample is biased in some specific way, what it shows cannot explain the population.)
The features of the sample should be the same as those of the population. (You cannot predict anything if they are totally different.)
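A rough sketch of why bias matters, in plain Python. The population here is just the numbers 1 to 100, chosen for illustration, and the seed is arbitrary. A sample taken only from the largest values gives a mean far from the population mean, while a random sample of the same size lands much closer.

```python
import random

# Hypothetical population: the integers 1 to 100.
population = list(range(1, 101))
pop_mean = sum(population) / len(population)

# Biased sample: only the 20 largest values are observed.
biased_sample = sorted(population)[-20:]
biased_mean = sum(biased_sample) / len(biased_sample)

# Unbiased sample: 20 values drawn at random without replacement.
rng = random.Random(42)  # fixed seed so the run is repeatable
random_sample = rng.sample(population, 20)
random_mean = sum(random_sample) / len(random_sample)

print(pop_mean)     # 50.5
print(biased_mean)  # 90.5
print(random_mean)  # varies with the seed, but much closer to 50.5
```

The random sample is not exact either, but its error comes from chance and shrinks as the sample grows, whereas the biased sample stays wrong no matter how carefully it is measured.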