Rich Jones
2025-07-12
Overview
This note walks through the definition and calculation of McDonald's Omega ($\omega$) and factor score determinacy ($\rho$), and illustrates a special case in which $\omega = \rho^2$: a single-factor model in which all items are continuous and all factor loadings are equal (so-called tau-equivalence; with equal residual variances as well, parallel tests).
Omega ($\omega$) reflects the proportion of variance in a unit-weighted observed composite score that is attributable to the common factor (McDonald, 1999, page 89).
Factor determinacy ($\rho$) reflects the correlation between a regression-based factor score estimate and the underlying common factor (Muthén, Mplus Technical Appendices, page 47). Therefore, $\rho^2$ would reflect the proportion of variance in a regression-based factor score estimate that is attributable to the underlying common factor.
Some authors and programs refer to determinacy as $\rho$, others as $\rho^2$, so read carefully.
Latent variables, often referred to as factors, represent theoretical constructs that cannot be directly measured but are inferred from a set of observed indicators. In some SEM applications, researchers aim to obtain individual scores on these latent variables, known as factor scores. These scores are essentially estimates of an individual's standing on the unobserved construct. However, estimated factor scores do not inherently possess the exact properties of the true, underlying latent factors. Rather, they serve as approximations, and the accuracy of these approximations directly influences the validity and reliability of any subsequent analyses in which these scores are utilized, whether as predictors, dependent variables, or for classification purposes.
Methodologists have developed various ways to express the quality of these factor score estimates, and this note concerns two of these: factor reliability via coefficient omega, and factor determinacy.
"Omega is the ratio of the true-score variance of $Y$ to the total variance of $Y$. Here the true-score variance is interpreted as the variance due to the (common) attribute" (McDonald 1999, page 89), and $Y$ is the unit-weighted total score (i.e., the sum of item scores), The “common attribute” is the latent factor in a common factor model. McDonald’s omega quantifies:
$$ \omega = \frac{\operatorname{Var}(\text{true score of } Y)}{\operatorname{Var}(Y)} $$

And in the context of a single-factor model, the true-score variance of $Y$ is the variance due to the common factor, not just any shared variance among items.
Given a single-factor model for $p$ standardized items with loadings $\lambda_1, \dots, \lambda_p$ and residual variances $\theta_i = 1 - \lambda_i^2$: the numerator is the squared sum of loadings, $(\lambda_1 + \lambda_2 + \cdots + \lambda_p)^2$, and the denominator is that same squared sum plus the sum of residual variances.
from sympy import symbols, Matrix, sqrt, eye, simplify
# Step 1: Define parameters
p = 5 # number of indicators
lambda_val = 1 / sqrt(2) # standardized loading (0.707...)
theta_val = 1 - lambda_val**2 # residual variance for each indicator
# Step 2: Create the lambda vector
lambda_vec = Matrix([lambda_val] * p)
# Step 3: Construct the observed covariance matrix Σ = λλ' + Θ
lambda_outer = lambda_vec * lambda_vec.T # λλ'
theta_matrix = eye(p) * theta_val # Diagonal matrix with residual variances
Sigma = lambda_outer + theta_matrix # Total observed covariance matrix
#Sigma
lambda_vec
# Factor loadings in this example are 1/sqrt(2) = 0.707...
loading = sqrt(2) / 2
loading.evalf()
lambda_outer
theta_matrix
Sigma
# Step 4: Compute omega
sum_lambda = sum(lambda_vec)
numerator_omega = sum_lambda**2
denominator_omega = numerator_omega + p * theta_val
omega = simplify(numerator_omega / denominator_omega)
# omega
sum_lambda
numerator_omega
denominator_omega
omega
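As a numeric check (with $p = 5$ and $\lambda^2 = 1/2$, the expression reduces to $2.5/3$):

omega.evalf()  # 0.8333...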
In a single-factor model, for the unit-weighted total score $Y = \mathbf{1}^\top \mathbf{y}$, omega can be written as:

$$ \omega = \frac{\mathbf{1}^\top \boldsymbol{\Lambda} \boldsymbol{\Lambda}^\top \mathbf{1}}{\mathbf{1}^\top \boldsymbol{\Sigma} \mathbf{1}} $$

where $\mathbf{1}$ is a $p \times 1$ vector of ones, $\boldsymbol{\Lambda}$ is the $p \times 1$ vector of factor loadings, and $\boldsymbol{\Sigma} = \boldsymbol{\Lambda} \boldsymbol{\Lambda}^\top + \boldsymbol{\Theta}$ is the model-implied covariance matrix of the observed items, with $\boldsymbol{\Theta}$ the diagonal matrix of residual variances.
So, the simplified omega becomes:
$$ \omega = \frac{(\mathbf{1}^\top \lambda)^2}{\mathbf{1}^\top (\lambda \lambda^\top + \Theta) \mathbf{1}} $$

The numerator can be further simplified:

$$ \mathbf{1}^\top \lambda = \sum_{i=1}^p \lambda_i \quad \Rightarrow \quad (\mathbf{1}^\top \lambda)^2 = \left( \sum_{i=1}^p \lambda_i \right)^2 $$

As can the denominator:

$$ \mathbf{1}^\top (\lambda \lambda^\top + \Theta) \mathbf{1} = \mathbf{1}^\top \lambda \lambda^\top \mathbf{1} + \mathbf{1}^\top \Theta \mathbf{1} = \left(\sum \lambda_i\right)^2 + \sum \theta_i $$

where $\theta_i = 1 - \lambda_i^2$ if variables are standardized. This leaves:

$$ \omega = \frac{\left( \sum_{i=1}^p \lambda_i \right)^2}{\left( \sum_{i=1}^p \lambda_i \right)^2 + \sum_{i=1}^p (1 - \lambda_i^2)} $$

# Computation using matrix form
# Step 2: Construct Lambda * Lambda'
# lambda_outer = lambda_vec * lambda_vec.T <- already created
# Step 3: Residual variance matrix
# theta_matrix = eye(p) * theta_val <- already created
# Step 4: Sigma = Lambda*Lambda' + Theta
Sigma = lambda_outer + theta_matrix
Sigma
# Step 5: 1' * Sigma * 1 and 1' * Lambda * Lambda' * 1
ones = Matrix([1] * p)
ones
numerator = (ones.T * lambda_outer * ones)[0]
numerator
denominator = (ones.T * Sigma * ones)[0]
denominator
# Step 6: Omega
omega = simplify(numerator / denominator)
omega
omega.evalf()
Factor determinacy ($\rho$) reflects the correlation between a regression-based factor score estimate and the underlying common factor (Muthén, Mplus Technical Appendices, page 47). Beauducel & Hilger (2017) provide the following formula (rewritten here using Mplus notation):
$$ \rho^2 = \operatorname{diag} \left( \boldsymbol{\Psi} \boldsymbol{\Lambda}^\top \boldsymbol{\Sigma}^{-1} \boldsymbol{\Lambda} \boldsymbol{\Psi} \right) $$

where:
Meaning | Mplus Notation |
---|---|
Factor loading matrix | $\boldsymbol{\Lambda}$ |
Factor covariance matrix | $\boldsymbol{\Psi}$ |
Observed variable covariance matrix | $\boldsymbol{\Sigma}$ |
Factor determinacy coefficient squared | $\rho^2$ |
This gives the squared correlation between the latent variable $\eta$ and its regression-based factor score estimate $\hat{\eta}$. Note that Mplus output provides $\rho$ as the factor determinacy estimate.
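To make the general formula concrete, here is a small NumPy sketch for a hypothetical two-factor model (the loadings and factor correlation below are made up for illustration):

import numpy as np

# Hypothetical two-factor example (made-up values for illustration)
Lambda = np.array([[0.7, 0.0],
                   [0.6, 0.0],
                   [0.5, 0.0],
                   [0.0, 0.8],
                   [0.0, 0.6]])          # 5 items, 2 factors
Psi = np.array([[1.0, 0.3],
                [0.3, 1.0]])             # factor covariance (unit variances)
Theta = np.diag(1 - np.diag(Lambda @ Psi @ Lambda.T))  # standardized residuals
Sigma = Lambda @ Psi @ Lambda.T + Theta  # model-implied covariance matrix

# rho^2 = diag(Psi Lambda' Sigma^{-1} Lambda Psi), one value per factor
rho2 = np.diag(Psi @ Lambda.T @ np.linalg.inv(Sigma) @ Lambda @ Psi)
print(np.sqrt(rho2))  # factor determinacies (rho), one per factor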
In the case of a single-factor model with unit variance assumed for the common factor, we have:

$$ \rho^2 = \operatorname{diag} \left( \boldsymbol{\Lambda}^\top \boldsymbol{\Sigma}^{-1} \boldsymbol{\Lambda} \right) = \lambda^\top \Sigma^{-1} \lambda $$

where $\lambda^\top \Sigma^{-1} \lambda$ is a scalar quantity that can be interpreted as a generalized squared length of the factor loading vector $\lambda$ in the space defined by the inverse of the observed variables' covariance matrix, $\Sigma^{-1}$. It is scalar because $\lambda^\top$ is $1 \times p$, $\Sigma^{-1}$ is $p \times p$, and $\lambda$ is $p \times 1$, so the product is $1 \times 1$.

Therefore, $\lambda^\top \Sigma^{-1} \lambda$ is a single number (a scalar) that tells you how well the factor can be recovered from the observed indicators. It equals the squared correlation between the latent factor and its optimal (regression-based) score estimate.
# Step 5: Compute factor determinacy: ρ² = λ' Σ⁻¹ λ
Sigma_inv = simplify(Sigma.inv())
rho_squared = simplify((lambda_vec.T * Sigma_inv * lambda_vec)[0])
rho_squared
rho_squared.evalf()
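This equals the omega computed above (0.8333...), the special case flagged in the Overview. The equality does not depend on the particular loading value; here is a minimal symbolic check, assuming the same five-item structure but a free common loading $\lambda$:

from sympy import symbols, Matrix, eye, simplify
# Free common loading: omega and rho^2 reduce to the same expression
lam = symbols('lambda', positive=True)
lam_vec_sym = Matrix([lam] * 5)
Sigma_sym = lam_vec_sym * lam_vec_sym.T + eye(5) * (1 - lam**2)
ones_sym = Matrix([1] * 5)
omega_sym = (ones_sym.T * lam_vec_sym * lam_vec_sym.T * ones_sym)[0] / (ones_sym.T * Sigma_sym * ones_sym)[0]
rho2_sym = (lam_vec_sym.T * Sigma_sym.inv() * lam_vec_sym)[0]
simplify(omega_sym - rho2_sym)  # 0: omega = rho^2 for any common loading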
On the Mplus Discussion Board, Bengt Muthén states:
“If you have categorical outcomes you don’t get factor determinacy because that is a concept valid only for continuous outcomes. With categorical outcomes you would instead consider ‘item information’ which Mplus provides.” (https://www.statmodel.com/discussion/messages/9/533.html?1576705397)
Factor determinacy -- the correlation between true latent factor scores and estimated regression-based factor scores -- only applies when items are continuous, as the regression-based factor scores depend on linear relationships between indicators and factor. When indicators are binary or ordinal, Mplus does not provide this coefficient. Instead, Muthén suggests looking at item and test information functions, closely aligned with IRT (Item Response Theory) concepts, which reflect how well each item informs the underlying latent trait.
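For intuition about that alternative, here is a minimal sketch of the standard two-parameter logistic (2PL) item information function; the discrimination (a) and difficulty (b) values are made up for illustration and are not Mplus output:

import numpy as np

# 2PL item information: I(theta) = a^2 * P(theta) * (1 - P(theta))
def item_information(theta, a, b):
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))  # endorsement probability
    return a**2 * p * (1 - p)

theta = np.linspace(-3, 3, 7)
print(item_information(theta, a=1.5, b=0.0))  # information peaks at theta = b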
See Appendix 2 for discussion of an idea for using Mplus Bayesian plausible values to obtain a factor determinacy-like statistic, applicable when some of the indicators are categorical.
Coefficient alpha (Cronbach's alpha) is conceptually similar to coefficient omega. However, alpha assumes that all loadings are equal (tau-equivalence). In our example we have set all loadings to be equal, but this is not usually the case with real data. Omega can be calculated with varying factor loadings; alpha does not consider the loadings at all, and is instead defined from the variances and covariances among the items (cf. Mplus Discussion via Statmodel.com):

$$ \alpha = \frac{p}{p-1} \left( 1 - \frac{\operatorname{tr}(\boldsymbol{\Sigma})}{\mathbf{1}^\top \boldsymbol{\Sigma} \mathbf{1}} \right) $$

where $p$ is the number of items, $\operatorname{tr}(\boldsymbol{\Sigma})$ is the sum of the item variances, and $\mathbf{1}^\top \boldsymbol{\Sigma} \mathbf{1}$ is the variance of the unit-weighted total score.
# Cronbach's alpha from the model-implied covariance matrix
from sympy import Rational
p = Sigma.shape[0] # number of items (dimension of Sigma)
ones = Matrix([1] * p) # column vector of ones
numerator = Sigma.trace() # sum of item variances
denominator = (ones.T * Sigma * ones)[0] # variance of the total score
alpha = simplify(Rational(p, p - 1) * (1 - numerator / denominator))
alpha.evalf()
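Because the loadings (and residual variances) in this example are equal, the items are parallel, and alpha coincides with omega; a quick symbolic check:

simplify(alpha - omega)  # 0: alpha equals omega under equal loadings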
Reliability Measure | Meaning |
---|---|
alpha | Common variance of the sum score and the latent trait, if the items were tau-equivalent |
omega | Common variance of a sum score and modeled latent trait |
determinacy ($\rho^2$) | Common variance of a regression-based factor score estimate and a modeled latent trait |
See Appendix 1 for definitions of congeneric, tau-equivalent, and parallel model types.
Beauducel, A., & Hilger, N. (2017). On the bias of factor score determinacy coefficients based on different estimation methods of the exploratory factor model. Communications in Statistics-Simulation and Computation, 46(8), 6144-6154.
Grice, J. W. (2001). Computing and evaluating factor scores. Psychological Methods, 6(4), 430-450.
Lord, F. M., & Novick, M. R. (1968). Statistical Theories of Mental Test Scores. Reading, MA: Addison-Wesley.
McDonald, R. P. (1999). Test Theory: A Unified Treatment. Mahwah, NJ: Lawrence Erlbaum.
Muthén, B. (1998-2004). Mplus Technical Appendices (Version 3, March 2004 ed.). Muthén & Muthén. https://statmodel.com/download/techappen.pdf
Raykov, T., & Marcoulides, G. A. (2011). Introduction to Psychometric Theory. New York: Routledge.
Zinbarg, R. E., Revelle, W., Yovel, I., & Li, W. (2005). Cronbach’s α, Revelle’s β, and McDonald’s ωH: Their relations with each other and two alternative conceptualizations of reliability. Psychometrika, 70(1), 123-133.
In measurement theory there is a hierarchy of models in terms of the restrictions they impose on the items:
Congeneric model: Each indicator is allowed its own factor loading and unique error variance. In other words, the items may differ in how strongly they relate to the latent construct and in their measurement error.
Tau-equivalent model: Here the indicators are assumed to have equal factor loadings, meaning they all scale the latent variable in the same way. However, the error variances (unique variances) can differ across items.
Parallel model: Both the factor loadings and the unique variances are equal across all indicators. This is the most restrictive form, implying that the items are essentially interchangeable, differing only by random error that is identical in variance across items.
The term "tau-equivalent" comes from classical test theory (CTT), particularly from the foundational work by Lord and Novick (1968) in Statistical Theories of Mental Test Scores. The Greek letter τ (tau) was used to denote the true score of an individual on a psychological or educational test.
In CTT, the observed score $Y_i$ on item $i$ is modeled as:

$$ Y_i = \tau_i + \varepsilon_i $$

where $\tau_i$ is the true score contributing to item $i$ and $\varepsilon_i$ is random measurement error. The tau-equivalent model assumes that every item measures the same true score, $\tau_i = \tau$ for all $i$, while the error variances $\operatorname{Var}(\varepsilon_i)$ may differ across items.
In other words, the items are equally sensitive to the latent trait (same slope or loading) but may differ in measurement error. That’s what distinguishes tau-equivalence from other models in the hierarchy:
Model Type | Factor Loadings | Error Variances |
---|---|---|
Congeneric | Free | Free |
Tau-equivalent | Equal | Free |
Parallel | Equal | Equal |
The use of τ (tau) was conventional in earlier psychometric theory to represent true scores, much like we use $\theta$ for latent traits in modern IRT or factor models. So "tau-equivalent" refers to items that are equally related to the same true score (or latent factor), hence having equal factor loadings.
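To see why the distinction matters, here is a small numeric sketch with made-up congeneric (unequal) loadings for standardized items; alpha falls below omega once tau-equivalence is violated:

import numpy as np

# Hypothetical congeneric loadings (unequal) for five standardized items
lam = np.array([0.9, 0.7, 0.5, 0.4, 0.3])
Sigma = np.outer(lam, lam) + np.diag(1 - lam**2)  # model-implied covariance
p = len(lam)

omega = lam.sum()**2 / Sigma.sum()                        # model-based reliability
alpha = p / (p - 1) * (1 - np.trace(Sigma) / Sigma.sum())
print(f"omega = {omega:.3f}, alpha = {alpha:.3f}")        # alpha < omega here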
This appendix discusses using Bayesian plausible values (PVs) from Mplus to approximate a measure similar to factor score determinacy $\rho^2$. Factor determinacy $\rho$ is:
$$ \rho = \operatorname{corr}(\hat{\eta}, \eta) $$

and $\rho^2$ is:

$$ \rho^2 = \operatorname{Var}(\mathbb{E}[\eta \mid \mathbf{x}]) / \operatorname{Var}(\eta) $$

This quantifies how much information the observed indicators provide about the latent variable: it is the $R^2$ for predicting the factor from the indicators.
In Mplus's Bayesian framework, plausible values (PVs) are posterior draws from:
$$ p(\eta_j \mid \mathbf{x}_j) $$

So for person $j$, Mplus gives you multiple samples $\eta_j^{(1)}, \eta_j^{(2)}, \dots, \eta_j^{(M)}$, representing uncertainty about their factor given their observed data.
We can estimate:
$$ \rho^2 = \frac{\operatorname{Var}(\mathbb{E}[\eta_j \mid \mathbf{x}_j])}{\operatorname{Var}(\eta_j)} $$

via these steps:
Let $N$ be the number of persons and $M$ the number of plausible value draws per person.
Obtain an $N \times M$ matrix of plausible values: each row is a person, each column is one PV draw.
For each person $j$, compute:
$$ \bar{\eta}_j = \frac{1}{M} \sum_{m=1}^{M} \eta_j^{(m)} $$

Then compute:

$$ \operatorname{Var}(\bar{\eta}_j) \quad \text{(across persons)} $$

This is the numerator: the variance of posterior means.
The total variance of $\eta$ is approximated by:
$$ \operatorname{Var}(\eta_j^{(m)}) = \operatorname{Var}(\bar{\eta}_j) + \mathbb{E}_j[\operatorname{Var}(\eta_j^{(m)} \mid \mathbf{x}_j)] $$

Compute this as:
total_var = np.var(pvs, ddof=1) # flattened matrix
Or manually as:
$$ \text{Total variance} = \text{Between-person variance} + \text{Average within-person variance} $$

This aligns with the interpretation of $R^2$: the fraction of variance in the latent factor "explained" by the observed indicators.
Example code in Python (if the PVs are in an $N \times M$ matrix pvs):
import numpy as np
# pvs: N x M matrix of plausible values
posterior_means = np.mean(pvs, axis=1)
between_var = np.var(posterior_means, ddof=1)
# within-person variance
within_vars = np.var(pvs, axis=1, ddof=1)
avg_within_var = np.mean(within_vars)
# total variance of factor
total_var = between_var + avg_within_var
# approximate factor determinacy squared
rho2 = between_var / total_var
print(f"Estimated rho^2 ≈ {rho2:.3f}")