The Student t distribution

by admin

In analyses, tests, and calibrations it is usual to assume that repeated observations, made without external influences that change their probabilities, follow a normal (Gaussian) distribution defined by the mean and the standard deviation calculated from the sample. Strictly speaking, this holds only when the number of repetitions is large, consistent with the central limit theorem. When the sample is too small to describe the properties of this Gaussian distribution and we nevertheless assume that these conditions are met, we will almost certainly underestimate the uncertainty of the measurement, as indicated in JCGM 100, Guide to the expression of uncertainty in measurement.

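William Gosset, who signed his work as “Student” for reasons of business confidentiality at the company where he worked, faced this same problem. Gosset needed to estimate, from experimental data, a distribution that represented small samples of unknown variance. The distribution function he proposed is known as the Student t distribution, and in its standard form it follows the general equation:

$$ f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^{2}}{\nu}\right)^{-\frac{\nu+1}{2}} $$

where ν is the number of degrees of freedom.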

For any normally distributed population, the Student t distribution widens the resulting normal distribution, increasing the uncertainty associated with the measurand to reflect how little information a small sample provides about the whole lot. As the sample grows, the t distribution approaches the normal distribution obtained from the sample standard deviation, and it becomes identical to it for an infinite number of repetitions.

The correct approach in any analysis is to assign repeated observations a t distribution with a parameter gl, the degrees of freedom, whose value is the number of repetitions minus 1. MCM Alchimia can simulate a random sample from the Student t distribution using not only this shape parameter (degrees of freedom) but also scale and location parameters, given by the standard deviation and the mean respectively, so that it can be used wherever appropriate without additional operations.

Input parameters:

  • Mean value. This parameter defines the displacement of the function along the abscissa axis. It corresponds to the mean of the random variable, so the data for this variable will be distributed on both sides of this value. For this distribution, as for all symmetrical unimodal functions, the mean coincides with the statistical mode.
  • Degrees of freedom. Corresponds to the number of repetitions minus 1. It represents the number of values that can vary without modifying the value of the sample mean.
  • Standard deviation. A measure of the dispersion of the values about the sample mean. If this distribution is used for Type A (statistical) uncertainty components, this value can be calculated according to the equation:

    $$ s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}} $$

    where n is the number of values or repetitions, x_i are the individual values and x̄ is their mean. If what is needed instead is the standard deviation of the sample mean, it is obtained as s / √n.
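The Type A quantities and the shifted, scaled t sample described above can be sketched in Python with the standard library alone. This is an illustration, not MCM Alchimia's implementation: the function names and the readings are made up, and using s / √n as the scale assumes that the quantity of interest is the mean of the readings.

```python
import math
import random
import statistics


def type_a_summary(readings):
    """Return mean, sample standard deviation s, s/sqrt(n) and degrees of freedom."""
    n = len(readings)
    mean = statistics.fmean(readings)
    s = statistics.stdev(readings)  # divides by n - 1
    return mean, s, s / math.sqrt(n), n - 1


def sample_student_t(mean, scale, dof, size, rng):
    """Draw `size` values from a t distribution shifted by `mean` and scaled by `scale`."""
    draws = []
    for _ in range(size):
        z = rng.gauss(0.0, 1.0)
        # A chi-square variable with `dof` degrees of freedom is Gamma(dof/2, scale 2).
        chi2 = rng.gammavariate(dof / 2.0, 2.0)
        draws.append(mean + scale * z / math.sqrt(chi2 / dof))
    return draws


readings = [10.1, 9.9, 10.0, 10.2, 9.8]  # hypothetical repeated readings
mean, s, s_mean, dof = type_a_summary(readings)
rng = random.Random(1)  # fixed seed so the sketch is reproducible
draws = sample_student_t(mean, s_mean, dof, 20000, rng)
print(mean, s, s_mean, dof)
```

With only 4 degrees of freedom the simulated values spread noticeably more than a normal sample with the same scale would, which is the widening effect the text describes.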

Constant

This is not strictly a distribution, but a value with zero uncertainty, such as the water dissociation constant, normal gravity, the reference temperature of a test, or the nominal capacity of volumetric ware. It can also be used for model components that have several uncertainty contributors, by writing the component as a constant added to a list of zero-valued components with various distributions.


Triangular distribution

The continuous triangular distribution is bounded by two extremes, as in the rectangular case, but it also has a mode (or most probable value) within that range. The probability density rises linearly from the lower bound to the mode and then falls linearly to the upper bound. This distribution is widely used for variables where information is limited, as with the uniform distribution, but where the modal value is approximately known; that is, although the exact location of the mode is not known, the region or subinterval in which it lies is.

Important: In MCM Alchimia, only the centered triangular distribution is available, that is, the statistical mode coincides with the midpoint of the interval AB.

The general equation is therefore defined on the interval AB, while outside those extremes the distribution function is 0. For the centered case the formula is:

$$ f(x) = \begin{cases} \dfrac{4\,(x - A)}{(B - A)^{2}} & A \le x \le \dfrac{A + B}{2} \\[2ex] \dfrac{4\,(B - x)}{(B - A)^{2}} & \dfrac{A + B}{2} < x \le B \\[2ex] 0 & \text{otherwise} \end{cases} $$

Input parameters:

  • Average. Mean value and modal value of the random variable.
  • Semi-interval. Corresponds to half the width of the interval to which this distribution is applied, that is (B-A) / 2, where A and B are the lower and upper bounds of the interval. When this function is applied to the uncertainty due to the resolution of an analog instrument, this parameter corresponds to the appreciation (or estimate, e). In chemistry it is also usual to treat the tolerance of volumetric ware, or even of reference materials, as an uncertainty contributor with a triangular distribution (EURACHEM/QUAM:2012 8.1.6). In both cases, the semi-interval corresponds to the tolerance value of the material.
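As an illustration of these two parameters, the following Python sketch draws from a centered triangular distribution given its average and semi-interval. It is only a sketch under stated assumptions: the function names and the flask values are hypothetical, and the a / √6 standard uncertainty is the usual result for a symmetric triangular distribution of semi-interval a.

```python
import math
import random
import statistics


def sample_centered_triangular(average, semi_interval, size, rng):
    """Draw from a triangular distribution whose mode is the midpoint of [A, B]."""
    low = average - semi_interval   # A
    high = average + semi_interval  # B
    return [rng.triangular(low, high, average) for _ in range(size)]


def triangular_standard_uncertainty(semi_interval):
    """Standard deviation of a symmetric triangular distribution."""
    return semi_interval / math.sqrt(6)


# Hypothetical example: a 100 mL flask with a tolerance of 0.1 mL.
average, semi_interval = 100.0, 0.1
rng = random.Random(2)  # fixed seed for reproducibility
draws = sample_centered_triangular(average, semi_interval, 20000, rng)
u = triangular_standard_uncertainty(semi_interval)
print(u, statistics.stdev(draws))
```

The two printed values should agree closely, since the sample standard deviation estimates a / √6.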

Rectangular Distribution (Uniform)

This continuous distribution is characterized by having the same probability density for every value of the interval. It is widely used for Type B uncertainty contributions in which only the upper and lower limits of the interval are known, for example the division or resolution of a digital instrument. In many cases this distribution can also be assigned when there is little information about the random variable, for bibliographic data, or when the coverage factor of an uncertainty is not known.

The general formula of this distribution is defined for all values of x for which A ≤ x ≤ B, according to the equation:

$$ f(x) = \frac{1}{B - A}, \qquad A \le x \le B $$

and f(x) = 0 outside that interval.

Input parameters:

  • Average. Mean value of the random variable.
  • Semi-interval. Corresponds to half the width of the interval to which this distribution is applied, that is (B-A) / 2, where A and B are the lower and upper bounds of the interval. If this function is applied to the uncertainty due to the resolution of a digital instrument, this parameter corresponds to half of the smallest division (d / 2). It is sometimes also applied to analog instruments, taking the appreciation (or estimate) as if it were an estimated division beyond which no further visual information can be obtained. In this case the semi-interval will be e / 2.
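The digital-resolution case can be sketched in Python as follows. The function names and the instrument values are hypothetical, and the a / √3 standard uncertainty is the usual result for a rectangular distribution of semi-interval a; this is not a description of how MCM Alchimia generates its samples.

```python
import math
import random
import statistics


def sample_rectangular(average, semi_interval, size, rng):
    """Draw uniformly from [average - semi_interval, average + semi_interval]."""
    low = average - semi_interval
    high = average + semi_interval
    return [rng.uniform(low, high) for _ in range(size)]


def rectangular_standard_uncertainty(semi_interval):
    """Standard deviation of a rectangular distribution."""
    return semi_interval / math.sqrt(3)


# Hypothetical digital instrument reading 25.00 with a smallest division d = 0.01.
reading, d = 25.00, 0.01
semi_interval = d / 2
rng = random.Random(3)  # fixed seed for reproducibility
draws = sample_rectangular(reading, semi_interval, 20000, rng)
u = rectangular_standard_uncertainty(semi_interval)
print(u, statistics.stdev(draws))
```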

Normal Distribution (Gaussian)

This is the distribution that most frequently represents natural and social phenomena. Many of the tests of classical statistics, as well as the estimation of uncertainties, are based on the assumption that the data conform to a normal distribution. From the theoretical perspective, the Central Limit Theorem states that, given random samples of sufficiently large size, the distribution of their means is approximately normal. The general formula of this distribution is:

$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^{2}}{2\sigma^{2}}\right) $$

where μ represents the location and σ the scale of the function. For estimating a measurement uncertainty, μ corresponds to the mean (and mode) of the random variable, while σ is the standard deviation.

Input parameters:

  • Mean. Mean value of the random variable. The data for this variable will therefore be distributed on both sides of this value. For the Normal or Gaussian distribution, the mean coincides with the mode.
  • Standard deviation. A measure of the dispersion of the values about the sample mean. If this distribution is used for Type A (statistical) uncertainty components, this value can be calculated according to the equation:

    $$ s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}} $$

    where n is the number of values or repetitions, x_i are the individual values and x̄ is their mean. If what is needed instead is the standard deviation of the sample mean, it is obtained as s / √n.

If the parameter to which this distribution is assigned corresponds to the uncertainty contribution from a calibration certificate, the standard deviation corresponds to the standard uncertainty ( u ), or to the expanded uncertainty divided by the coverage factor k.
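A minimal Python sketch of the calibration-certificate case follows. The certificate values and the function name are hypothetical; the only relation used is the one stated above, standard uncertainty u = expanded uncertainty / k.

```python
import math
import random
import statistics


def standard_uncertainty_from_certificate(expanded_uncertainty, coverage_factor):
    """Convert an expanded uncertainty U with coverage factor k into u = U / k."""
    return expanded_uncertainty / coverage_factor


# Hypothetical certificate: value 20.0, expanded uncertainty 0.30 with k = 2.
certified_value = 20.0
u = standard_uncertainty_from_certificate(0.30, 2.0)

rng = random.Random(4)  # fixed seed for reproducibility
draws = [rng.gauss(certified_value, u) for _ in range(20000)]
print(u, statistics.fmean(draws), statistics.stdev(draws))
```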


Lognormal distribution

This distribution represents random variables whose logarithms follow a normal distribution. The lognormal distribution takes different forms depending on the value of its scale parameter. It is often used in reliability studies of high-technology products and in microbiological counts, since these are based on a multiplicative growth model.

Input parameters:

As indicated, the logarithms of the values of the lognormal random variable follow a Gaussian function. This distribution can be defined from either of two sets of parameters, selected with the radio buttons on the right of the data panel.

  • μ (Y). Mean of the population Y. This population Y is defined according to the group of data we wish to refer to, that is, either the lognormal population or the normal population of its logarithms.
  • s (Y). Standard deviation of Y, with Y defined as indicated above.
  • Y = X (LogNormal) / Y = ln (X) (Normal). This selector chooses which data group the input parameters refer to.
    • Y = X (LogNormal). In this first case the generated pseudo-random values form a lognormal distribution whose mean is μ (Y) and whose standard deviation is s (Y).
    • Y = ln (X) (Normal). In this case the generated values are also lognormally distributed, but it is the set formed by their logarithms that has a Normal distribution whose mean is μ (Y) and whose standard deviation is s (Y).
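The relation between the two parameter sets can be sketched in Python. The conversion uses the standard lognormal identities, σ² = ln(1 + s²/m²) and μ = ln(m) − σ²/2, where m and s are the mean and standard deviation of X; the function name and the numbers are hypothetical, and how MCM Alchimia performs the conversion internally is not stated in the text.

```python
import math
import random
import statistics


def lognormal_to_log_params(mean_x, sd_x):
    """Convert mean and standard deviation of X into those of ln(X)."""
    sigma2 = math.log(1.0 + (sd_x / mean_x) ** 2)
    mu = math.log(mean_x) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


# Case Y = X (LogNormal): the inputs describe the lognormal data themselves.
mean_x, sd_x = 50.0, 10.0  # hypothetical values
mu, sigma = lognormal_to_log_params(mean_x, sd_x)

# Case Y = ln (X) (Normal): mu and sigma are used directly.
rng = random.Random(5)  # fixed seed for reproducibility
draws = [rng.lognormvariate(mu, sigma) for _ in range(50000)]
logs = [math.log(v) for v in draws]
print(statistics.fmean(draws), statistics.stdev(draws))
print(statistics.fmean(logs), statistics.stdev(logs))
```

The first printed pair should be close to the inputs for X, and the second pair close to the converted μ and σ.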

Chi Square distribution

This continuous probability distribution over the positive reals is closely related to the Normal distribution. For example, it describes the sampling distribution of the sample variance: for n normally distributed values with variance σ², the quantity (n − 1)s²/σ² follows a chi-square distribution with n − 1 degrees of freedom. The chi-square (χ²) distribution is defined by a single parameter, the degrees of freedom. The function is always asymmetric and skewed to the right. This distribution is used very frequently in various branches of science, since it allows data sets to be analyzed to determine whether the difference between them is due to chance (null hypothesis) or to some external factor.

Input parameters:

  • Degrees of freedom. Represents the number of values that are free to vary without influencing the result.
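The link with the Normal distribution can be checked numerically. The Python sketch below draws chi-square values directly (as a Gamma distribution with shape dof/2 and scale 2) and also builds the scaled sample variance (n − 1)s²/σ² from normal data; both should average close to the degrees of freedom. Function names and parameter values are made up for the illustration.

```python
import random
import statistics


def sample_chi_square(dof, size, rng):
    """Chi-square with `dof` degrees of freedom, drawn as Gamma(dof/2, scale=2)."""
    return [rng.gammavariate(dof / 2.0, 2.0) for _ in range(size)]


def scaled_sample_variance(n, sigma, rng):
    """Return (n - 1) * s^2 / sigma^2 for n normal values with standard deviation sigma."""
    values = [rng.gauss(0.0, sigma) for _ in range(n)]
    return (n - 1) * statistics.variance(values) / sigma ** 2


rng = random.Random(6)  # fixed seed for reproducibility
n, sigma = 5, 2.0       # hypothetical sample size and population sigma
direct = sample_chi_square(n - 1, 20000, rng)
from_normal = [scaled_sample_variance(n, sigma, rng) for _ in range(20000)]
print(statistics.fmean(direct), statistics.fmean(from_normal))
```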

Weibull distribution

This distribution is a continuous function over the positive real numbers, frequently used in economics, meteorology and telecommunications, as well as in specific applications such as reliability analysis and the survival of organisms or machines. Random variables with a Weibull distribution model the distribution of failures in systems whose failure rate is proportional to a power of time. The distribution is defined by a characteristic shape parameter k (> 0) that indicates how the failure rate behaves: the failure rate decreases, is constant, or increases with time when k is less than, equal to, or greater than 1, respectively.

Input parameters:

  • Shape. This parameter defines the shape of the distribution. It can take any real value greater than zero.
  • Scale. This second parameter scales the generated pseudo-random values, so that a larger scale gives a distribution with the same shape but a greater standard deviation.
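The role of both parameters can be illustrated with a short Python sketch. The hazard (failure-rate) expression h(t) = (k/λ)(t/λ)^(k−1) is the standard one for a Weibull distribution with shape k and scale λ; the function names and values are hypothetical. Note that Python's random.weibullvariate takes the scale first and the shape second.

```python
import random
import statistics


def weibull_hazard(t, shape, scale=1.0):
    """Failure rate at time t for a Weibull distribution."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def sample_weibull(shape, scale, size, rng):
    """Draw Weibull values; weibullvariate(alpha, beta) means (scale, shape)."""
    return [rng.weibullvariate(scale, shape) for _ in range(size)]


# Failure rate at t = 1 and t = 2 for shape below, equal to and above 1.
for k in (0.5, 1.0, 2.0):
    print(k, weibull_hazard(1.0, k), weibull_hazard(2.0, k))

# Same shape, two scales: the spread grows with the scale.
narrow = sample_weibull(1.5, 1.0, 20000, random.Random(7))
wide = sample_weibull(1.5, 2.0, 20000, random.Random(7))
print(statistics.stdev(narrow), statistics.stdev(wide))
```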

Cauchy distribution

The Cauchy distribution resembles the Gaussian in its bell shape, but it has a higher peak and its tails decay very slowly. Although MCM Alchimia generates the pseudo-random samples for this distribution correctly, the results graph will look like an isolated peak, since its abscissa axis spans the 99% coverage interval. Because the tails decay so gradually, that interval is very wide, and the region of significant probability density occupies only a very narrow part of it.

Input parameters:

  • Xo. The Cauchy distribution has no mean. This parameter represents the location of the distribution on the x axis, and it coincides with the median and with the axis of symmetry of the distribution.
  • Scale. The scale parameter must be a real number greater than zero.
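The isolated-peak effect can be quantified with the Cauchy quantile function, x0 + scale · tan(π(p − 0.5)). In the Python sketch below (function names and parameter values are hypothetical), half of the probability lies within one scale unit of Xo, while the 99% interval extends to roughly 64 scale units on each side.

```python
import math
import random
import statistics


def cauchy_quantile(p, x0, scale):
    """Inverse cumulative distribution function of the Cauchy distribution."""
    return x0 + scale * math.tan(math.pi * (p - 0.5))


def sample_cauchy(x0, scale, size, rng):
    """Draw Cauchy values by inverse-transform sampling."""
    return [cauchy_quantile(rng.random(), x0, scale) for _ in range(size)]


x0, scale = 3.0, 0.5  # hypothetical parameters
half_width_50 = cauchy_quantile(0.75, x0, scale) - x0   # central 50% interval
half_width_99 = cauchy_quantile(0.995, x0, scale) - x0  # central 99% interval
print(half_width_50, half_width_99)

rng = random.Random(8)  # fixed seed for reproducibility
draws = sample_cauchy(x0, scale, 20000, rng)
print(statistics.median(draws))  # the median estimates Xo; the mean is undefined
```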

Von Mises distribution

The Von Mises distribution is one of the so-called circular distributions, that is, it is a continuous function defined for real numbers in the interval from 0 to 2π. It is currently used mainly in epidemiology, to describe the spread of diseases, and in technological applications such as signal processing. The Von Mises distribution is also known as the circular normal distribution, since it is similar to the Gaussian but restricted to the circle.

Input parameters:

  • Mean. In this case the mean defines the position of the centre of the function. The values will be distributed on both sides of this value, at a maximum distance of π. If this field is left blank, the distribution is simulated with Mean = 0.
  • k. The parameter k must be a real number greater than zero. In the Von Mises distribution k represents the concentration of the values around the mean, so it plays a role analogous to the inverse of the variance.
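The two parameters can be explored with Python's random.vonmisesvariate(mu, kappa), which returns angles in radians between 0 and 2π. The sketch below is only an illustration with hypothetical values: it recovers the mean direction from the samples and shows that, for a large k, the spread of the angular deviations is close to 1 / √k.

```python
import math
import random


def circular_mean(angles):
    """Mean direction of a set of angles, returned in [0, 2*pi)."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c) % (2.0 * math.pi)


def wrapped_deviation(angle, center):
    """Signed angular distance from center, wrapped into [-pi, pi)."""
    return (angle - center + math.pi) % (2.0 * math.pi) - math.pi


mean_angle, k = 1.0, 50.0  # hypothetical mean (radians) and concentration
rng = random.Random(9)     # fixed seed for reproducibility
angles = [rng.vonmisesvariate(mean_angle, k) for _ in range(20000)]

deviations = [wrapped_deviation(a, mean_angle) for a in angles]
spread = math.sqrt(sum(d * d for d in deviations) / len(deviations))
print(circular_mean(angles), spread, 1.0 / math.sqrt(k))
```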
