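where $\bar{X} = \frac{1}{n}\sum_{j=1}^{n} X_j$ and $S = \frac{1}{n-1}\sum_{j=1}^{n} (X_j - \bar{X})(X_j - \bar{X})^\top$
$T^2 \sim \frac{(n-1)p}{n-p} F_{p,n-p}$
where $F_{p,n-p}$ denotes the F-distribution with $p$ and $n-p$ degrees of freedom.
At significance level $\alpha$, we reject $H_0$ if $T^2 > \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha)$
where $F_{p,n-p}(\alpha)$ is the upper $(100\alpha)$-th percentile of the $F_{p,n-p}$ distribution.
A large value of $T^2$ means that $\bar{x}$ is too far from $\mu_0$, which is evidence against $H_0$.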
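The rejection rule above can be sketched in a few lines of Python. This is only an illustration, assuming NumPy and SciPy are available, that $n > p \ge 2$, and that $T^2 = n(\bar{x}-\mu_0)^\top S^{-1}(\bar{x}-\mu_0)$; the function name `hotelling_t2_test` and the simulated data are hypothetical and not part of these notes.

```python
import numpy as np
from scipy import stats


def hotelling_t2_test(X, mu0, alpha=0.05):
    """One-sample Hotelling T^2 test of H0: mu = mu0 (hypothetical helper).

    X is an (n, p) data matrix with n > p >= 2; mu0 is a length-p vector.
    Returns the statistic, the critical value, and the reject decision.
    """
    X = np.asarray(X, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)              # divisor n - 1, as in the text
    diff = xbar - mu0
    t2 = n * diff @ np.linalg.solve(S, diff)
    # Critical value (n-1)p/(n-p) * F_{p,n-p}(alpha); ppf(1 - alpha) is the
    # upper (100*alpha)-th percentile.
    crit = (n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p)
    return t2, crit, bool(t2 > crit)


# Simulated data, used only to exercise the function.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
t2, crit, reject = hotelling_t2_test(X, mu0=np.zeros(3))
print(f"T2 = {t2:.3f}, critical value = {crit:.3f}, reject H0: {reject}")
```

Using `np.linalg.solve` instead of forming $S^{-1}$ explicitly is a numerical choice, not something the notes prescribe.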
Confidence Regions of Component Means:
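A $100(1-\alpha)\%$ confidence region for the mean of a $p$-dimensional normal distribution is the ellipsoid determined by all $\mu$ such that $n(\bar{x}-\mu)^\top S^{-1}(\bar{x}-\mu) \le \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha)$
where $\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j$ and $S = \frac{1}{n-1}\sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^\top$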
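The axes of the confidence ellipsoid are $\bar{x} \pm \sqrt{\lambda_i}\,\sqrt{\frac{(n-1)p}{n(n-p)} F_{p,n-p}(\alpha)}\; e_i$, where $\lambda_i$ and $e_i$ are the eigenvalues and unit eigenvectors of $S$.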
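As a rough sketch, the centre, half-lengths, and directions of the ellipsoid can be obtained from an eigendecomposition of $S$. The half-lengths are chosen so that each axis endpoint satisfies the region's defining inequality with equality. The function name `confidence_ellipsoid_axes` and the simulated data are hypothetical; NumPy and SciPy are assumed.

```python
import numpy as np
from scipy import stats


def confidence_ellipsoid_axes(X, alpha=0.05):
    """Centre, half-lengths and unit directions of the 100(1-alpha)%
    confidence ellipsoid for the mean (hypothetical helper, n > p >= 2)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    lam, E = np.linalg.eigh(S)               # columns of E are unit eigenvectors
    c2 = (n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p)
    # Endpoints xbar +/- half_lengths[i] * E[:, i] lie on the boundary
    # n (xbar - mu)' S^{-1} (xbar - mu) = c2.
    half_lengths = np.sqrt(lam) * np.sqrt(c2 / n)
    return xbar, half_lengths, E


# Simulated data, used only to exercise the function.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
centre, half_lengths, directions = confidence_ellipsoid_axes(X)
for i in range(X.shape[1]):
    print(f"axis {i}: half-length {half_lengths[i]:.3f}, direction {directions[:, i]}")
```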
When a point is out of the control region, individual $\bar{X}$ charts are constructed.
When the lower control limit (LCL) is less than zero for data that must be nonnegative, the LCL is generally set to zero.
Points are displayed in time order rather than as a scatter plot, and this makes patterns and trends visible.
For the $j$-th point, we calculate the $T^2$ statistic: $T_j^2 = (x_j - \bar{x})^\top S^{-1} (x_j - \bar{x})$
We then plot the $T^2$ values on a time axis. The lower limit is 0 and the upper control limit is $\mathrm{UCL} = \chi_p^2(\alpha)$. There is no centerline in the $T^2$ chart.
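A minimal sketch of the chart's numbers, assuming NumPy and SciPy: it computes $T_j^2$ for every observation and compares it with the chi-square upper limit. The function name `t2_chart`, the default `alpha=0.01`, and the simulated data are assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def t2_chart(X, alpha=0.01):
    """T^2 value for each row of X, the UCL, and the indices above the UCL
    (hypothetical helper; X is (n, p) with p >= 2)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    D = X - xbar                              # row j holds x_j - xbar
    # Row-wise quadratic form (x_j - xbar)' S^{-1} (x_j - xbar).
    t2 = np.einsum("ij,ij->i", D @ np.linalg.inv(S), D)
    ucl = stats.chi2.ppf(1 - alpha, df=p)     # upper (100*alpha)-th percentile
    return t2, ucl, np.flatnonzero(t2 > ucl)


# Simulated data, used only to exercise the function.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
t2, ucl, flagged = t2_chart(X)
print("UCL:", round(float(ucl), 3), "flagged time points:", flagged.tolist())
```

Plotting `t2` against the row index, with a horizontal line at `ucl`, gives the time-ordered chart described above.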
When the multivariate $T^2$ chart signals that the $j$-th unit is out of control, it should be determined which variables are responsible.
A region based on Bonferroni intervals is frequently chosen for this purpose. The $k$-th variable is out of control if $x_{jk}$ does not lie in the interval $\bar{x}_k \pm t_{n-1}(0.005/p)\sqrt{s_{kk}}$, where $p$ is the total number of measured variables.
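The Bonferroni check can be sketched as follows, assuming NumPy and SciPy. The tail area $0.005/p$ in the text is written here as `alpha / (2 * p)` with `alpha = 0.01`. The function name `bonferroni_flags` and the planted shift in the example data are hypothetical.

```python
import numpy as np
from scipy import stats


def bonferroni_flags(X, j, alpha=0.01):
    """Boolean vector: True where variable k of unit j falls outside
    xbar_k +/- t_{n-1}(alpha / (2p)) * sqrt(s_kk) (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    xbar = X.mean(axis=0)
    s = X.std(axis=0, ddof=1)                 # sqrt(s_kk), divisor n - 1
    # alpha = 0.01 gives the tail area 0.005 / p used in the text.
    t_crit = stats.t.ppf(1 - alpha / (2 * p), df=n - 1)
    lower, upper = xbar - t_crit * s, xbar + t_crit * s
    return (X[j] < lower) | (X[j] > upper)


# Simulated data with an obvious shift planted in variable 1 of unit 5.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
X[5, 1] = 50.0
flags = bonferroni_flags(X, j=5)
print("out-of-control variables for unit 5:", np.flatnonzero(flags).tolist())
```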
Inference when some observations are missing:
Often, some components of a vector observation are unavailable. We treat situations where data are missing at random.
To compute estimates from incomplete data, we use the EM algorithm, which alternates between two steps.
Prediction step. Given some estimate $\tilde{\theta}$ of the unknown parameters, predict the contribution of any missing observation to the (complete-data) sufficient statistics.
Estimation step. Use the predicted sufficient statistics to compute a revised estimate of the parameters.
When the observations $X_1, \ldots, X_n$ are a random sample from a $p$-variate normal population, the prediction–estimation algorithm is based on the complete-data sufficient statistics
$T_1 = \sum_{j=1}^{n} X_j = n\bar{X}$
and $T_2 = \sum_{j=1}^{n} X_j X_j^\top = (n-1)S + n\bar{X}\bar{X}^\top$
We assume that the population mean $\mu$ and covariance matrix $\Sigma$ are unknown and are estimated by $\tilde{\mu}$ and $\tilde{\Sigma}$.
Estimation step:
$\tilde{\mu} = \frac{\tilde{T}_1}{n}$ and $\tilde{\Sigma} = \frac{1}{n}\tilde{T}_2 - \tilde{\mu}\tilde{\mu}^\top$
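As a small sanity check on the estimation step, the sketch below applies it to a complete data set, where the sufficient statistics can be computed directly. In that case $\tilde{\mu}$ equals the sample mean and $\tilde{\Sigma}$ equals $\frac{n-1}{n}S$. The function name `estimation_step` and the simulated data are hypothetical; only NumPy is assumed.

```python
import numpy as np


def estimation_step(T1, T2, n):
    """Revised estimates from the (predicted) sufficient statistics:
    mu = T1 / n and Sigma = T2 / n - mu mu' (hypothetical helper)."""
    mu = np.asarray(T1, dtype=float) / n
    Sigma = np.asarray(T2, dtype=float) / n - np.outer(mu, mu)
    return mu, Sigma


# Complete simulated data, so T1 and T2 need no prediction here.
rng = np.random.default_rng(2)
X = rng.normal(size=(25, 3))
n = X.shape[0]
T1 = X.sum(axis=0)                            # sum of the x_j
T2 = X.T @ X                                  # sum of the x_j x_j'
mu_tilde, Sigma_tilde = estimation_step(T1, T2, n)
print(mu_tilde)
print(Sigma_tilde)
```

With missing values, `T1` and `T2` would instead be the predicted statistics produced by the prediction step.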
Prediction step:
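For each vector $x_j$ with missing values, let $x_j^{(1)}$ denote the vector of missing components and $x_j^{(2)}$ the vector of available components.
The predicted contribution of $x_j^{(1)}$ to $T_1$ is the conditional mean evaluated at the current estimates: $\tilde{x}_j^{(1)} = E(X_j^{(1)} \mid x_j^{(2)}; \tilde{\mu}, \tilde{\Sigma}) = \tilde{\mu}^{(1)} + \tilde{\Sigma}_{12}\tilde{\Sigma}_{22}^{-1}(x_j^{(2)} - \tilde{\mu}^{(2)})$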
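The conditional-mean prediction can be sketched with NumPy as below. Missing components are assumed to be encoded as `NaN`, and at least one component of the vector is assumed to be observed. The function name `predict_missing` and the numbers in the example are hypothetical.

```python
import numpy as np


def predict_missing(x, mu, Sigma):
    """Replace NaN entries of x by mu1 + Sigma12 Sigma22^{-1} (x2 - mu2),
    where block 1 is the missing part and block 2 the observed part
    (hypothetical helper; at least one entry of x must be observed)."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    miss = np.isnan(x)
    obs = ~miss
    filled = x.copy()
    if miss.any():
        S12 = Sigma[np.ix_(miss, obs)]        # covariance of missing with observed
        S22 = Sigma[np.ix_(obs, obs)]         # covariance among observed
        filled[miss] = mu[miss] + S12 @ np.linalg.solve(S22, x[obs] - mu[obs])
    return filled


# Illustrative current estimates and one observation with a missing first component.
mu_tilde = np.array([1.0, 2.0, 3.0])
Sigma_tilde = np.array([[2.0, 1.0, 0.0],
                        [1.0, 2.0, 1.0],
                        [0.0, 1.0, 2.0]])
x_j = np.array([np.nan, 4.0, 2.0])
x_j_filled = predict_missing(x_j, mu_tilde, Sigma_tilde)
print(x_j_filled)
```

The filled vector is what this observation would contribute to $\tilde{T}_1$; fully observed vectors are returned unchanged.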