Correlated variables

by admin 0 Comments

Test models often contain two or more quantities that are correlated to some degree, that is, when the value of one of them changes, the other systematically increases or decreases. The correlation coefficient between two variables ranges from -1 to 1: its absolute value indicates the strength of the correlation, and its sign indicates the direction. A coefficient of 1 means a perfect direct linear relationship between the quantities, whereas -1 means a perfect inverse linear relationship. A coefficient of zero indicates that the variables are uncorrelated, and they are then treated as independent.

If we know the correlation coefficients between the quantities in the mathematical model of our test, we can use the correlation panel. The correlation matrix is built automatically from the variables of the model, so we only need to enter the correlation coefficient for each pair. Once the matrix is complete, we click the < connect to the project > button; after the matrix has been verified as correct, the green light on the button indicates that it is connected.

The simulation will then be run with correlated variables. To work with independent quantities again, we only have to uncheck the < correlated > checkbox on the simulation button. In this way we can switch between the two states without editing the correlation matrix each time.
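As an illustration of what simulating with correlated variables means, the following minimal sketch draws pairs of normal values with a chosen correlation coefficient. It is not MCM Alchimia's implementation: the function names, the numeric values and the mixing formula (the two-variable case of a Cholesky factorization) are assumptions made for this example.

```python
import math
import random


def correlated_normal_pair(mean1, sd1, mean2, sd2, r, rng):
    """Draw one pair of normal values whose correlation coefficient is r."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    # Mix the two independent draws so that corr(z1, z2_corr) equals r.
    z2_corr = r * z1 + math.sqrt(1.0 - r * r) * z2
    return mean1 + sd1 * z1, mean2 + sd2 * z2_corr


def sample_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(12345)  # fixed seed so the run is repeatable
# Hypothetical quantities: means 10.0 and 5.0, correlation coefficient 0.8.
pairs = [correlated_normal_pair(10.0, 0.2, 5.0, 0.1, 0.8, rng) for _ in range(20000)]
xs, ys = zip(*pairs)
r_observed = sample_correlation(xs, ys)
print(round(r_observed, 2))
```

The observed coefficient should be close to the requested 0.8. Setting r to 0 in the same function reproduces the independent case.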

Regression

NOTE: This tool is only available for models with connected curves (See work with curves)

At the end of the list of distributions, MCM Alchimia provides a tool for assigning regression parameters to our test model. It offers the following options, all related to the connected curve:

  • B0. Constant term (intercept, or ordinate at the origin) of the connected curve.
  • B1. First-order coefficient of the connected curve.

Both B0 and B1 have associated uncertainties that result from the simulation. If at least one of the values used to construct the curve has an uncertainty, the simulation produces a new curve for each set of simulated values, each with slightly different coefficients B0 and B1. The size of this variation depends on the uncertainty of the input data on both axes.

  • Include residuals. If this option is selected, the uncertainty of the selected coefficient (B0 or B1) is increased by the contribution of the least-squares residuals. The uncertainty associated with the parameter then includes both the simulation and the residual contributions.
  • Uncertainty due to adjustment. If we assign this option to our parameter, we obtain the standard uncertainty of the connected curve due to the residuals, centered at zero (that is, only as a contribution to uncertainty). Spreadsheet regression tools report this value as the typical error or standard error of the regression.

Note: If we simulate the typical error of a curve in isolation, the standard deviation of the resulting population is slightly larger than the same parameter obtained in a spreadsheet. Although the variable is defined with mean 0 and a deviation equal to the standard uncertainty, MCM Alchimia does not simulate it as a Normal distribution but as a Student t distribution, with the degrees of freedom determined by the number of points used to build the curve. This results in a larger deviation for the measurand.
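The quantities described above can be sketched for a straight line as follows. This is only an illustration under stated assumptions: an ordinary least-squares fit, degrees of freedom taken as the number of points minus 2, and hypothetical data and function names. It shows why a scaled Student t variable has a larger standard deviation than the standard error of the regression itself.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = b0 + b1 * x.

    Returns (b0, b1, s_res), where s_res is the standard error of the
    regression (the "typical error" reported by spreadsheets).
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_res = math.sqrt(ss_res / (n - 2))  # two fitted parameters
    return b0, b1, s_res


def scaled_t_std(s, dof):
    """Standard deviation of a Student t variable with scale s (needs dof > 2)."""
    return s * math.sqrt(dof / (dof - 2))


# Hypothetical calibration points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
b0, b1, s_res = fit_line(xs, ys)
dof = len(xs) - 2  # assumption: degrees of freedom = number of points - 2
print(b0, b1, s_res, scaled_t_std(s_res, dof))
```

With 4 degrees of freedom the factor sqrt(dof / (dof - 2)) is about 1.41, and it shrinks towards 1 as more points are used to build the curve.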

Experimental (raw data)

This is not a distribution in itself, but a powerful feature exclusive to MCM Alchimia for entering raw repeatability values into the model of our test without any prior calculations in a spreadsheet.

As the figure shows, the data entry panel has several selectors that define the input characteristics of our uncertainty component.

1. Input data set

On the left of the panel there are 5 radio buttons (selectors) that tell the software how the input data will be entered. There are therefore 5 input formats:

  • Direct. With this option we tell MCM Alchimia that we will enter the data one by one. From these data the software automatically obtains the mean, the standard deviation of the mean and the degrees of freedom, the statistical parameters needed to perform the simulation.
  • Indirect SX. With this option the application requests two columns of data, and each repeatability value is obtained from the subtraction Value = (X - S). One example is the calibration of two digital instruments by comparison, when we want to calculate the repeatability of the errors: we enter the reading of the standard in column S and the reading of the sample in column X. It can also be used to obtain the weight of a substance contained in a crucible from the weight of the empty crucible and the weight with its contents. In this case, the masses of the empty crucible are entered in S and the masses of the crucible with its contents in X.
  • Indirect SXXS. In certain cases the error values are obtained from a set of measurements in the format also known as ABBA, which is entered in a table of 4 columns. This format is sometimes used to eliminate the bias caused by potential drift of the measuring instruments. Each value is obtained from the calculation Value = (X1 + X2) / 2 - (S1 + S2) / 2.
  • Indirect X / S. In this indirect format, each repeatability value is obtained from the ratio Value = X / S, entered in a table of two columns.
  • Indirect SXS. This format is similar to SXXS, but with three columns. The application obtains each value automatically through the operation Value = X - (S1 + S2) / 2.
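The five input formats can be sketched in a few lines. This is an illustration only, not the application's code: the function names, the format labels used as strings, the column order within each row and the sample readings are all assumptions.

```python
import math
import statistics


def repeatability_values(fmt, rows):
    """Turn rows of raw readings into one repeatability value per row.

    fmt is one of "direct", "SX", "SXXS", "X/S", "SXS". Each row lists the
    readings in the column order given by the format name (an assumption).
    """
    if fmt == "direct":
        return [x for (x,) in rows]
    if fmt == "SX":
        return [x - s for s, x in rows]
    if fmt == "SXXS":
        return [(x1 + x2) / 2 - (s1 + s2) / 2 for s1, x1, x2, s2 in rows]
    if fmt == "X/S":
        return [x / s for x, s in rows]
    if fmt == "SXS":
        return [x - (s1 + s2) / 2 for s1, x, s2 in rows]
    raise ValueError(f"unknown format: {fmt}")


def summary(values):
    """Mean, standard deviation of the mean, and degrees of freedom."""
    n = len(values)
    mean = statistics.mean(values)
    s_mean = statistics.stdev(values) / math.sqrt(n)
    return mean, s_mean, n - 1


# Hypothetical crucible example: S = empty crucible, X = crucible with contents.
rows_sx = [(10.0, 12.5), (10.1, 12.5), (10.0, 12.7)]
values = repeatability_values("SX", rows_sx)
mean, s_mean, dof = summary(values)
print(values, mean, s_mean, dof)
```

The `summary` function returns the three statistics that the Direct option is described as producing, so the same step applies after any of the indirect formats.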

2. Entering values.

In the upper right corner of the panel, MCM Alchimia has a field where we must indicate the number of repetitions from which the simulation parameters will be estimated, that is, the mean and the standard deviation of the mean. By default the application uses 10 values, but this number can be changed by editing it manually or with the increment / decrement arrows.

After indicating the number of measurements to be entered, we click the “Values” button, which opens a grid for entering the values. The table will have the number of columns required by the input format chosen with the radio buttons.

3. Simulation distribution.

This section of the panel provides two ways to perform the simulation from the distribution parameters obtained from the entered values:

Student t distribution. With this option, MCM Alchimia calculates the mean and the standard deviation of the mean from the table of values entered. It then generates pseudo-random values distributed according to the Student t function, with a number of degrees of freedom equal to the number of values minus 1.

Normal. In this case, the software performs the same calculations as before and then takes the inverse value of the t function (coverage factor k) for the chosen coverage probability and the calculated degrees of freedom. The simulation is then done with a pseudo-random normal distribution with the calculated mean and a standard deviation s1 = k s / k', where k' is the inverse value of the t distribution for the same coverage probability but infinite degrees of freedom.
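The two options can be compared with a short sketch. It is not the software's implementation: the t draw uses the textbook construction (a standard normal divided by the square root of a scaled chi-square), the coverage factor k = 2.32 is an assumed t-table value for 9 degrees of freedom at 95.45 % that should be checked against your own table, and k' is taken from the normal distribution, which is the limit of the t distribution for infinite degrees of freedom.

```python
import math
import random
import statistics
from statistics import NormalDist


def student_t_draw(mean, s_mean, dof, rng):
    """One draw from a Student t distribution shifted to mean and scaled by s_mean."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(dof / 2.0, 2.0)  # chi-square with dof degrees of freedom
    return mean + s_mean * z / math.sqrt(chi2 / dof)


def equivalent_normal_sd(s_mean, k, coverage):
    """s1 = k * s / k', with k' computed from the normal distribution."""
    k_inf = NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    return k * s_mean / k_inf


rng = random.Random(2024)
mean, s_mean, dof = 100.0, 0.5, 9  # hypothetical summary of 10 readings
k = 2.32                           # assumed t-table value (dof = 9, p = 95.45 %)
s1 = equivalent_normal_sd(s_mean, k, 0.9545)

t_draws = [student_t_draw(mean, s_mean, dof, rng) for _ in range(50000)]
n_draws = [rng.gauss(mean, s1) for _ in range(50000)]
print(s1, statistics.stdev(t_draws), statistics.stdev(n_draws))
```

Under these assumptions s1 is about 1.16 times the standard deviation of the mean, so the normal simulation is widened to give the same coverage interval as the t distribution at the chosen probability.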

For a high number of degrees of freedom, both simulations will have similar or identical characteristics. In contrast, for fewer than 10 degrees of freedom the results of the two simulations can differ significantly. Which one should we choose, then?

Depending on the intended use of the application, one or the other simulation distribution may be required, but for technicians who are not experts in statistics we recommend the following rule:

  1. For validation of traditional calculation estimates according to the GUM (JCGM 100), routine laboratory tests, proficiency tests, GUM vs MCM validation, etc., simulate according to a Normal distribution.
  2. For research, statistics, economics and cases where it is necessary to know accurately the impact of a variable on the uncertainty of the measurand, simulate according to a Student t distribution.

4. Forcing Mean = 0.

This option is intended for cases where the contributor to which this distribution applies should be entered for uncertainty purposes only. The mean of the function is cancelled so that it equals 0 and contributes no value. This is especially useful when a component of the model has more than one source of uncertainty, for example resolution and repeatability. Such a component can be written as a sum of variables: one constant carrying the value of the parameter, and the rest entered only as uncertainty contributors with zero value. In this case the Type A contribution can be entered as experimental. The example in this help uses this tool.
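A component written as a constant plus zero-mean contributors might be sketched as follows. The readings, the half-division and the variable names are hypothetical, and the Student t draw uses a textbook construction rather than the application's own generator.

```python
import math
import random
import statistics

rng = random.Random(7)

nominal = 25.00        # constant part: carries the value of the parameter
half_division = 0.005  # resolution contribution: rectangular semi-interval
readings = [25.01, 24.99, 25.02, 25.00, 24.98]  # hypothetical raw repeatability data

n = len(readings)
s_mean = statistics.stdev(readings) / math.sqrt(n)
dof = n - 1


def draw():
    """One simulated value: constant + resolution + repeatability."""
    resolution = rng.uniform(-half_division, half_division)  # mean 0
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(dof / 2.0, 2.0)
    repeatability = s_mean * z / math.sqrt(chi2 / dof)  # Student t, mean forced to 0
    return nominal + resolution + repeatability


draws = [draw() for _ in range(20000)]
print(statistics.mean(draws), statistics.stdev(draws))
```

Only the constant determines the central value. The mean of the raw readings is deliberately not added, which is the effect of forcing the mean to 0.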


The Student t distribution

In all types of analyses, tests or calibrations, it is usual to assume that repeated events, with no external stimuli changing their probabilities, follow a normal or Gaussian distribution defined by the mean and the standard deviation calculated for the sample. Strictly speaking, this is only true when the number of repetitions is large, consistent with the central limit theorem. When the sample is not large enough to describe the properties of this Gaussian distribution and we nevertheless assume that these conditions are fulfilled, we will almost certainly underestimate the uncertainty of our measurement, as indicated in the guide JCGM 100 (Guide to the expression of uncertainty in measurement).

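This same problem was faced by William Gosset, who signed his work as “Student” for reasons of business confidentiality at the company where he worked. Gosset needed to estimate, from experimental data, a distribution that represented small samples of unknown variance. The distribution function he proposed is known as the Student t distribution and, in its standard form, has the following general equation:

$$ f(t) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\sqrt{\nu \pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu + 1}{2}} $$

where ν is the number of degrees of freedom and Γ is the gamma function.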

For any normally distributed population, the Student t distribution widens the resulting normal distribution, increasing the uncertainty associated with the measurand to account for the limited information that a small sample provides about the whole population. As the sample grows, the t distribution approaches the Normal obtained from the standard deviation of the sample, and becomes identical to it for infinite repetitions of the event.

The correct approach in all types of analysis is to assign to repeated events the t distribution with a parameter gl (the degrees of freedom) equal to the number of repetitions minus 1. MCM Alchimia can simulate a random sample according to the Student t distribution not only with this shape parameter (degrees of freedom), but also with scale and location parameters, through the standard deviation and the mean respectively, so that it can be used wherever appropriate with no additional operations.

Input parameters:

  • Mean value. This parameter defines the displacement of the function along the abscissa axis. It corresponds to the average value of the random variable, so the data of this variable will be distributed on both sides of this value. In this distribution, as in all symmetric unimodal functions, the mean coincides with the statistical mode.
  • Degrees of freedom. Corresponds to the number of repetitions minus 1. It represents the number of values that can vary freely once the sample mean is fixed.
  • Standard deviation. Measure of the dispersion of the values with respect to the sample mean. If this distribution is used for Type A (statistical) uncertainty components, this value can be calculated according to the equation:

    $$ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} $$

    where x_i are the individual values, x̄ is their mean and n is the number of values or repetitions. If what is needed is the standard deviation of the sample mean, it can be obtained as s / √n.

Constant

Strictly speaking this is not a distribution, but a value with zero uncertainty, such as the water dissociation constant, standard gravity, the reference temperature of a test, the nominal capacity of volumetric ware, etc. It can also be used for model components with several uncertainty contributors, by writing the component as a constant added to a list of zero-value components with various distributions.


Triangular distribution

The continuous triangular distribution is bounded by two extremes, as in the rectangular case, but it also has a mode (most probable value) within that range. The probability density increases linearly up to the mode and then decreases in the same way to the upper bound. This distribution is widely used for variables where information is limited, as with the uniform distribution, but where there is approximate knowledge of the modal value, that is, where the exact location of this value is not known but the region or subinterval in which it lies is.

Important: In MCM Alchimia only the centered triangular distribution is available, that is, the statistical mode coincides with the midpoint of the interval AB.

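The general equation is defined on the interval AB, while outside those extremes the density function is 0. For the centered case the formula is:

$$ f(x) = \begin{cases} \dfrac{4 (x - A)}{(B - A)^2}, & A \le x \le \dfrac{A + B}{2} \\[1ex] \dfrac{4 (B - x)}{(B - A)^2}, & \dfrac{A + B}{2} < x \le B \\[1ex] 0, & \text{otherwise} \end{cases} $$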

Input parameters:

  • Average. Mean value and modal value of the random variable.
  • Semi-interval. Corresponds to half the width of the interval to which this distribution is applied, that is (B-A) / 2, where A and B are the lower and upper bounds of the interval. When this function is applied to the resolution uncertainty of an analog instrument, this parameter corresponds to the appreciation (or estimate <e>). In chemistry it is also usual to treat the tolerance of volumetric ware, or even of reference materials, as uncertainty contributors with a triangular distribution (EURACHEM/QUAM:2012 8.1.6). In both of these cases, the average and the semi-interval correspond to the value and the tolerance of the material, respectively.
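The two input parameters can be mapped to a simulation as in the sketch below, which uses Python's standard `random.triangular`. The pipette value and tolerance are hypothetical, and the comparison value a / √6 is the well-known standard uncertainty of a centered triangular distribution with semi-interval a.

```python
import math
import random
import statistics


def centered_triangular_draw(average, semi_interval, rng):
    """One draw from a triangular distribution whose mode is the midpoint."""
    low = average - semi_interval
    high = average + semi_interval
    return rng.triangular(low, high, average)


rng = random.Random(1)
average, semi_interval = 10.0, 0.02  # hypothetical 10 mL pipette, tolerance 0.02 mL
draws = [centered_triangular_draw(average, semi_interval, rng) for _ in range(50000)]

u_expected = semi_interval / math.sqrt(6)  # standard uncertainty a / sqrt(6)
u_observed = statistics.stdev(draws)
print(u_expected, u_observed)
```

The simulated standard deviation should agree closely with a / √6, which is the value normally used when the same contribution is evaluated by the traditional calculation.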

Rectangular Distribution (Uniform)

This continuous distribution has the same probability density for every value in the interval. It is widely used for Type B uncertainty contributions in which only the upper and lower bounds of the interval are known, for example the division or resolution of a digital instrument. In many cases this distribution can also be assigned when there is little information about the random variable, for bibliographic data, or when the coverage factor of an uncertainty is not known.

The general formula of this distribution is defined for all values of x for which A ≤ x ≤ B, according to the equation:

$$ f(x) = \frac{1}{B - A}, \quad A \le x \le B $$

and f(x) = 0 outside this interval.

Input parameters:

  • Average. Average value of the random variable.
  • Semi-interval. Corresponds to half the width of the interval to which this distribution is applied, that is (B-A) / 2, where A and B are the lower and upper bounds of the interval. If this function is applied to the resolution uncertainty of a digital instrument, this parameter corresponds to half of the smallest division (d / 2). It is sometimes also applied to analog instruments, taking the appreciation (or estimation) as if it were an estimated division beyond which no further visual information can be obtained. In this case the semi-interval is e / 2.
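As a minimal sketch of the digital-resolution case, the following example draws values from a rectangular distribution with semi-interval d / 2 and compares the simulated standard deviation with the usual value a / √3. The reading, the division and the function name are hypothetical.

```python
import math
import random
import statistics


def rectangular_draw(average, semi_interval, rng):
    """One draw from a rectangular distribution centered on average."""
    return rng.uniform(average - semi_interval, average + semi_interval)


rng = random.Random(11)
division = 0.01               # hypothetical smallest division d of a digital display
semi_interval = division / 2  # d / 2
average = 20.00               # hypothetical displayed reading

draws = [rectangular_draw(average, semi_interval, rng) for _ in range(50000)]
u_expected = semi_interval / math.sqrt(3)  # standard uncertainty a / sqrt(3)
u_observed = statistics.stdev(draws)
print(u_expected, u_observed)
```

For the same semi-interval, the rectangular distribution gives a larger standard uncertainty (a / √3) than the centered triangular one (a / √6), which reflects the weaker information assumed about where the value lies.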

Normal Distribution (Gaussian)

This is the distribution that most frequently represents natural and social phenomena. Many tests in classical statistics, as well as the estimation of uncertainties, rely on the assumption that the data follow a normal distribution. From the theoretical perspective, the Central Limit Theorem states that, for random samples of sufficiently large size, the distribution of the sample means is approximately normal. The general formula of this distribution is:

$$ f(x) = \frac{1}{\sigma \sqrt{2 \pi}} \exp\left(-\frac{(x - \mu)^2}{2 \sigma^2}\right) $$

where μ represents the location and σ the scale of the function. In order to estimate a measurement uncertainty, μ corresponds to the mean and mode value of the random variable, while σ is the standard deviation.

Input parameters:

  • Mean. Average value of the random variable, so the data of this variable will be distributed on both sides of this value. In this Normal or Gaussian distribution, the mean coincides with the mode.
  • Standard deviation. Measure of the dispersion of the values with respect to the sample mean. If this distribution is used for Type A (statistical) uncertainty components, this value can be calculated according to the equation:

    $$ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} $$

    where x_i are the individual values, x̄ is their mean and n is the number of values or repetitions. If what is needed is the standard deviation of the sample mean, it can be obtained as s / √n.

If the parameter to which this distribution is assigned corresponds to the uncertainty contribution from a calibration certificate, the standard deviation corresponds to the standard uncertainty ( u ), or to the expanded uncertainty divided by the coverage factor k.
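The certificate case reduces to one division, shown here as a short sketch with hypothetical certificate values (a reported value of 100.02, an expanded uncertainty of 0.06 and a coverage factor k = 2). The function name is an assumption for this example.

```python
import random
import statistics


def standard_uncertainty(expanded_uncertainty, k):
    """Standard uncertainty u = U / k from a calibration certificate."""
    return expanded_uncertainty / k


value, U, k = 100.02, 0.06, 2.0  # hypothetical certificate data
u = standard_uncertainty(U, k)

rng = random.Random(3)
draws = [rng.gauss(value, u) for _ in range(50000)]
print(u, statistics.mean(draws), statistics.stdev(draws))
```

The simulated standard deviation should be close to u = 0.03, the value that would be entered as the standard deviation of this Normal distribution.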


Lognormal distribution

This distribution represents random variables whose logarithms follow a normal distribution. The lognormal distribution takes different forms depending on the value of its shape parameter (the standard deviation of the logarithms), and it is often used in reliability studies of high-technology products and in microbiological counts, since these are based on a multiplicative growth model.

Input parameters:

As indicated, the logarithms of the values of the lognormal random variable follow a Gaussian function. The distribution can be defined from two sets of parameters, selected with the radio buttons on the right of the data panel.

  • μ (Y). Mean of the population Y. The population Y is defined by the group of data we wish to refer to, that is, either the lognormal population or the normal population of its logarithms.
  • s (Y). Standard deviation of Y, with Y as described above.
  • Y = X (LogNormal) / Y = ln (X) (Normal). This selector chooses which group of data the input parameters refer to.
    • Y = X (LogNormal). In this case the generated pseudo-random values form a lognormal distribution whose mean is μ (Y) and whose standard deviation is s (Y).
    • Y = ln (X) (Normal). In this case the generated values are also lognormally distributed, but it is the set of their logarithms that has a Normal distribution with mean μ (Y) and standard deviation s (Y).
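The relationship between the two parameter sets can be sketched with the standard lognormal moment formulas. The function names and the numeric values are assumptions for this example, and the sketch does not claim to reproduce how MCM Alchimia converts the parameters internally.

```python
import math
import random
import statistics


def log_parameters(mean_x, sd_x):
    """Mean and standard deviation of ln(X) for a lognormal X with the given mean and sd."""
    sigma2 = math.log(1.0 + (sd_x / mean_x) ** 2)
    mu = math.log(mean_x) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def lognormal_moments(mu, sigma):
    """Mean and standard deviation of X when ln(X) is Normal(mu, sigma)."""
    mean_x = math.exp(mu + sigma ** 2 / 2.0)
    sd_x = mean_x * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean_x, sd_x


# Case Y = X (LogNormal): hypothetical mean 50 and standard deviation 10 of X itself.
mu, sigma = log_parameters(50.0, 10.0)

rng = random.Random(5)
draws = [rng.lognormvariate(mu, sigma) for _ in range(50000)]
logs = [math.log(x) for x in draws]
print(statistics.mean(draws), statistics.stdev(draws))  # refer to X
print(statistics.mean(logs), statistics.stdev(logs))    # refer to ln(X)
```

In the Y = ln (X) (Normal) case no conversion is needed: the entered mean and standard deviation would be passed directly as `mu` and `sigma`.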

Chi Square distribution

This continuous probability distribution over the positive reals is closely related to the Normal distribution. For example, it describes the sampling distribution of the variance of a normal sample: (n - 1) s² / σ² follows a Chi Square distribution with n - 1 degrees of freedom. The Chi Square (χ²) distribution is defined by a single parameter, the degrees of freedom. The function is always asymmetric and skewed to the right. This distribution is used very frequently in various branches of science, since it allows data sets to be analyzed to determine whether the difference between them is due to chance (null hypothesis) or to some external factor.

Input parameters:

  • Degrees of freedom. Represents the number of values that are free to vary without influencing the result.
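A Chi Square variable with a given number of degrees of freedom can be simulated with the standard library through its equivalence with a gamma distribution (shape dof / 2, scale 2). This is a sketch under that assumption, not the application's generator, and the names and values are hypothetical.

```python
import random
import statistics


def chi_square_draw(dof, rng):
    """One draw from a Chi Square distribution with dof degrees of freedom."""
    return rng.gammavariate(dof / 2.0, 2.0)


rng = random.Random(42)
dof = 5  # hypothetical number of degrees of freedom
draws = [chi_square_draw(dof, rng) for _ in range(50000)]

sample_mean = statistics.mean(draws)
sample_variance = statistics.variance(draws)
print(sample_mean, sample_variance)
```

The sample mean should be close to the degrees of freedom and the sample variance close to twice that number, and the median lies below the mean because the distribution is skewed to the right.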
