Structural equation modeling based on covariance (SEM-CB) has attracted great attention from researchers since Jöreskog introduced it in 1978. However, the dominance of LISREL, probably the best-known software for this type of analysis, means that not all researchers are aware of alternative SEM techniques such as partial least squares (PLS). This prediction-oriented technique is particularly useful when constructs are measured by a large number of indicators and when structural equation modeling estimated by maximum likelihood reaches its limits. In this post, we will skip the mathematical details of the analysis and focus on the general idea.

#### Statistical analysis – the first generation

The first generation of statistical techniques, such as regression-based approaches (multivariate regression analysis, discriminant analysis, analysis of variance, etc.), factor analysis, and cluster analysis, belongs to the core statistical toolset designed for identifying and confirming theoretical hypotheses through statistical analysis of empirical data.

Researchers from many scientific disciplines have applied these methods to discover phenomena. These methods have significantly shaped the way we see the world today (e.g. the theory of intelligence was built on factor analysis).

What all these methods have in common is three limitations:

• the postulate of a simple model structure (at least in the case of regression approaches)
• the assumption that every variable can be considered observable
• the assumption that every variable is measured without error, which may limit their application in many research settings
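The third limitation can be made concrete with a small simulation. The sketch below is only an illustration under assumed values (the function names, sample size, true slope, and error variance are all hypothetical, not taken from any study mentioned here): it regresses an outcome on a predictor once using the error-free scores and once using scores contaminated with measurement error.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    x_c = x - x.mean()
    y_c = y - y.mean()
    return float(x_c @ y_c / (x_c @ x_c))

def simulate_attenuation(n=20000, true_slope=0.5, error_sd=1.0, seed=0):
    """Compare the slope from error-free and error-contaminated predictors."""
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal(n)                          # true predictor scores
    y = true_slope * xi + 0.5 * rng.standard_normal(n)   # outcome
    x_observed = xi + error_sd * rng.standard_normal(n)  # predictor measured with error
    # Reliability: share of observed variance that is true-score variance.
    reliability = 1.0 / (1.0 + error_sd ** 2)
    return ols_slope(xi, y), ols_slope(x_observed, y), reliability

slope_true, slope_noisy, reliability = simulate_attenuation()
print(f"slope with error-free predictor: {slope_true:.3f}")
print(f"slope with noisy predictor:      {slope_noisy:.3f}")
print(f"reliability of noisy predictor:  {reliability:.2f}")
```

With a single predictor, the slope estimated from the noisy scores shrinks toward zero by roughly the reliability of the measurement, which is the kind of bias an explicit measurement model is meant to address.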

To overcome these limitations, more and more authors have started using structural equation modeling (SEM) as an alternative. Compared with regression approaches, which analyze only one layer of relationships between independent and dependent variables at a time, SEM, as a second-generation technique, allows simultaneous modeling of associations among many independent and dependent variables, as well as mediating and moderating variables. As a result, the distinction between dependent and independent variables becomes blurred in this approach: the same variable can be explained by some variables and explain others, so only the variables explained last in the model are treated as dependent.

SEM allows the researcher to construct unobservable variables measured by indicators (also called items, manifest variables, or observable variables) and to model the measurement error of the observable variables explicitly. The method thus overcomes the limitations of first-generation techniques described earlier and gives the researcher the flexibility to statistically test measurement assumptions on empirical data (e.g. confirmatory analysis, analysis of the full measurement model, or discriminant validity analysis, an assumption that is among the most important when testing structural models).

#### Two approaches to estimating parameters: variance- and covariance-based structural equation modeling

Generally, there are two approaches to estimating structural model parameters: the covariance-based approach and the variance-based (or prediction-oriented) approach. The covariance-based approach tries to minimize the difference between the sample variance-covariance matrix and the matrix implied by the theoretical model. Unfortunately, this type of analysis is quite challenging, because all model parameters are estimated simultaneously and because the criteria for assessing model fit are conservative. The variance-based approach, in contrast, tries to maximize the variance of the dependent variables explained by the independent variables (instead of reproducing the empirical covariance matrix, as the covariance-based approach does), and the model is estimated stepwise, in stages. Like every SEM, a PLS (partial least squares) model contains a structural part reflecting the associations between latent variables (the inner model) and a measurement part showing how the latent variables are related to their indicators (the outer model).
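To make the covariance-based objective more tangible, here is a minimal sketch of one common discrepancy measure, the maximum likelihood fit function F = ln|Σ| + tr(SΣ⁻¹) − ln|S| − p, where S is the sample covariance matrix, Σ the model-implied matrix, and p the number of indicators. The one-factor model, the function names, and the loading values are assumptions chosen for illustration; a real CB-SEM program would search for the parameters that minimize this value rather than evaluate it at fixed points.

```python
import numpy as np

def ml_discrepancy(S, Sigma):
    """Maximum likelihood discrepancy between sample (S) and implied (Sigma) covariances."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return float(logdet_sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - p)

def implied_covariance(loadings, error_variances):
    """One-factor model with unit factor variance: Sigma = lambda lambda' + Theta."""
    lam = np.asarray(loadings, dtype=float).reshape(-1, 1)
    return lam @ lam.T + np.diag(error_variances)

# Hypothetical "sample" matrix generated by a one-factor model.
S = implied_covariance([0.8, 0.7, 0.6], [0.36, 0.51, 0.64])

# Candidate 1: the generating parameters; candidate 2: deliberately wrong parameters.
fit_exact = ml_discrepancy(S, implied_covariance([0.8, 0.7, 0.6], [0.36, 0.51, 0.64]))
fit_wrong = ml_discrepancy(S, implied_covariance([0.5, 0.5, 0.5], [0.75, 0.75, 0.75]))

print(f"discrepancy at generating parameters: {fit_exact:.6f}")
print(f"discrepancy at wrong parameters:      {fit_wrong:.6f}")
```

The discrepancy is zero when the implied matrix reproduces the sample matrix exactly and grows as the two diverge, which is what "minimizing the difference" means in the covariance-based approach.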

#### Advantages of the SEM method, particularly the PLS method

An advantage of PLS is that the tested structural models can include both reflective latent variables (common factors) and formative latent variables (linear combinations of indicators), each built from a large number of observable variables. Moreover, because the structural model can be estimated in this way, measurement error is taken into account in the analysis. For this reason, with this type of structure, the estimated influence coefficients and explained variance are closer to the actual values than in the case of formative latent models or variables assumed to be measured without error.
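The two kinds of latent variable can be written compactly. The notation below is an assumption introduced only for illustration ($\xi$ for the latent variable, $x_k$ for its $K$ indicators, $\lambda_k$ for loadings, $w_k$ for weights, $\varepsilon_k$ for measurement error):

$$
\text{reflective:}\quad x_k = \lambda_k\,\xi + \varepsilon_k, \qquad k = 1, \dots, K
$$

$$
\text{formative:}\quad \xi = \sum_{k=1}^{K} w_k\,x_k
$$

In the reflective case each indicator is a noisy manifestation of the common factor, so the error term $\varepsilon_k$ is part of the model. In the formative case the latent variable is simply a weighted sum of its indicators.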

For an empirically established pattern in the acceptance of autonomous car technology, and for how I used structural equation modeling to study it, have a look at my research paper.

More about CB and PLS modeling can be found in the work of the leading authors in this field.

What is PLS structural equation modeling?

What is structural equation modeling? This question is asked by many young scientists starting their research careers, and even by more experienced ones who do not follow advances in the testing of advanced statistical models.

#### Structural equation models (SEM) – in general

Structural equation models (SEM) (Bollen, 1989; Kaplan, 2000) comprise statistical methodologies aimed at estimating a network of causal relations, defined in accordance with the tested theoretical model, that links two or more unobservable concepts, each measured by a number of observable indicators. The fundamental idea is that the complexity inside a system can be analyzed by taking into account the network of causal relations between latent (hidden) concepts, called latent variables. Every latent variable is measured by several observable variables, usually called manifest variables. In this sense, structural equation models represent a meeting point between path analysis (Tukey, 1964) and confirmatory factor analysis (Thurstone, 1931). A graphic example of such a representation is shown in the picture below:

* Every ellipse is an unobservable variable. Every unobservable variable is reflected in its observable manifestations, e.g. research questionnaire items. The above calculations were performed using the SEM-PLS method. The analysis used the PTH2 algorithm (a consistent PLS algorithm), which controls for measurement error in the variables with Dijkstra's reliability coefficient.

#### What is PLS structural equation modeling?

This section describes what PLS structural equation modeling is. The PLS (partial least squares) approach, also known as PLS path modeling (PLS-SEM), was designed as a procedure distinct from the classical CB-SEM approach based on the variance-covariance matrix. PLS modeling is based on estimating composite or reflective variables (Tenenhaus, 2008). At its core is an iterative algorithm that first calculates the estimates for the measurement model separately and then estimates the path coefficients of the structural model. PLS-SEM therefore aims to explain as much as possible of the variance of the unobservable variables through their observable indicators, as well as of the variables entering each regression in the path model. That is why PLS modeling is considered a predictive rather than a confirmatory approach to analysis. Unlike the classical CB-SEM approach, PLS-SEM does not aim to reproduce the variance-covariance matrix.
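The two-stage logic described above can be sketched in a few lines. The code below is a simplified illustration, not the PTH2 algorithm mentioned earlier and not the implementation of any particular package: it assumes only two latent variables, reflective blocks estimated with Mode A weights, and a centroid-style inner scheme. The function names and the simulated data (loadings of 0.8, a generating path of 0.6) are hypothetical.

```python
import numpy as np

def standardize(a):
    """Center and scale to unit variance (column-wise for 2-D input)."""
    return (a - a.mean(axis=0)) / a.std(axis=0)

def pls_two_blocks(X1, X2, max_iter=300, tol=1e-8):
    """Simplified PLS path model: latent 1 -> latent 2, Mode A outer weights."""
    blocks = [standardize(X1), standardize(X2)]
    n = blocks[0].shape[0]
    weights = [np.ones(b.shape[1]) for b in blocks]
    for _ in range(max_iter):
        # Outer estimation: latent scores as weighted sums of own indicators.
        scores = [standardize(b @ w) for b, w in zip(blocks, weights)]
        # Inner estimation: each latent is proxied by its (signed) neighbour.
        sign = np.sign(np.mean(scores[0] * scores[1]))
        inner = [sign * scores[1], sign * scores[0]]
        # Mode A update: covariance of each indicator with the inner proxy.
        new_weights = [b.T @ z / n for b, z in zip(blocks, inner)]
        change = max(np.max(np.abs(nw - w)) for nw, w in zip(new_weights, weights))
        weights = new_weights
        if change < tol:
            break
    # Structural stage: regress latent 2 scores on latent 1 scores.
    scores = [standardize(b @ w) for b, w in zip(blocks, weights)]
    path = float(np.mean(scores[0] * scores[1]))  # standardized slope
    return path, path ** 2, weights

# Simulated data: two latent variables, three reflective indicators each.
rng = np.random.default_rng(1)
n = 2000
xi1 = rng.standard_normal(n)
xi2 = 0.6 * xi1 + 0.8 * rng.standard_normal(n)
X1 = 0.8 * xi1[:, None] + 0.6 * rng.standard_normal((n, 3))
X2 = 0.8 * xi2[:, None] + 0.6 * rng.standard_normal((n, 3))

path, r2, weights = pls_two_blocks(X1, X2)
print(f"estimated path coefficient: {path:.3f}")
print(f"explained variance (R^2):   {r2:.3f}")
```

Because the composite scores still contain some measurement error, the estimated path in a simulation like this will typically fall somewhat below the generating value of 0.6.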

PLS-SEM is considered a soft modeling approach, in which there are no strong assumptions about the distribution of the variables, the sample size, or the measurement scale of the tested variables. This is a particularly valuable feature, given that in many areas of science such assumptions are hard to meet, at least in full, e.g. in clinical studies (Kock & Gaskins, 2016; Kock & Hadaya, 2018). On the other hand, it rules out parametric inference, which PLS replaces with confidence interval analysis and hypothesis testing based on resampling procedures such as the bootstrap, jackknife, blindfolding, or stable exponential smoothing (see Stable1, Stable2, and Stable3 in Kock, 2014). The result is less ambitious statistical inference about the accuracy of the estimates, but it allows the explained variance to be maximized. PLS coefficients are known to be biased because the method maximizes prediction, but by and large the method returns consistent results.
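As an illustration of resampling-based inference, the sketch below computes a percentile bootstrap confidence interval for a path coefficient. It is a minimal example under stated assumptions: the latent scores are simulated, the "model" is a single standardized path between two score columns, and the function names are hypothetical. In a full PLS analysis the whole model would be re-estimated on every resample.

```python
import numpy as np

def path_coefficient(scores):
    """Standardized slope of column 1 on column 0 (equal to their correlation)."""
    return float(np.corrcoef(scores[:, 0], scores[:, 1])[0, 1])

def bootstrap_percentile_ci(data, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample cases with replacement, re-estimate each time."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        rows = rng.integers(0, n, size=n)  # indices drawn with replacement
        estimates[b] = statistic(data[rows])
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper), float(estimates.std(ddof=1))

# Simulated latent scores for 300 cases with a generating path of 0.5.
rng = np.random.default_rng(42)
predictor = rng.standard_normal(300)
outcome = 0.5 * predictor + np.sqrt(0.75) * rng.standard_normal(300)
scores = np.column_stack([predictor, outcome])

estimate = path_coefficient(scores)
lower, upper, std_error = bootstrap_percentile_ci(scores, path_coefficient)
print(f"path estimate: {estimate:.3f}")
print(f"95% bootstrap interval: [{lower:.3f}, {upper:.3f}]")
print(f"bootstrap standard error: {std_error:.3f}")
```

An interval that excludes zero is then read as support for the path, without relying on a distributional assumption for the estimate.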

#### Summary

What is structural equation modeling? PLS structural equation modeling appears to be an approach that maximizes prediction (it maximizes the explained variance in the measurement and path models) rather than the theoretical accuracy of the estimates. Undoubtedly, it is a statistical method worthy of interest, not only because of its novelty but also because of its low data requirements, which make it easy and practical to apply in research with complex measurement and a complex research design.

Check our PLoS ONE article, where we used the SEM-PLS technique (and a topic modeling algorithm) in an autonomous vehicle technology acceptance model.

Figure: topic modeling algorithm results.