A simulation study of Pathmox with nonnormal data


1
A simulation study of Pathmox with non-normal data

Gastón Sánchez, Tomàs Aluja-Banet

Based on the Ph.D. work of Gastón Sánchez: PATHMOX Approach, Segmentation Trees in PLS-PM
Laboratory of Information Analysis and Modeling
Universitat Politècnica de Catalunya
2
Outline

Heterogeneity in PLS-PM
PATHMOX Approach
Simulation Studies
Conclusions

3
Heterogeneity
Standard application of a PLS Path Model: one model for the whole population.
We assume the same model for all individuals (i.e., we assume that the satisfaction model for mobile phone operators is the same regardless of whether the individual is a teenager or an adult).
4
Heterogeneity
Different sets of individuals with particular
behavior
5
Assignable sources of heterogeneity
Heterogeneity with observed segmentation variables
Segments defined by observable segmentation variables
Group information (age, gender, ethnicity, religion, etc.)
Number of segments based on the levels (categories) of segmentation variables
Multi-group approaches
Heterogeneity without segmentation variables
Unknown variables causing heterogeneity
No group information
Unknown number of segments
Cluster-based / latent class approaches
6
Heterogeneity in PATHMOX
Where do the data come from?
Survey data
Segmentation variables: socio-demographic, psycho-demographic, consumption
It is interesting to detect segments with different PLS-PM models.
We are searching for groups of individuals where the relation between image and satisfaction (for instance) differs from one group to another. We perform this search using the external segmentation variables to define the groups.
7
Heterogeneity in PATHMOX
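How do we use the segmentation variables?
For each segmentation variable, we can define binary partitions.
[Figure: global model in which latent variables ξ1 (indicators X11, ..., X1k) and ξ2 (indicators X21, ..., X2q) explain η3 (indicators Y31, Y32, ..., Y3p), with segmentation variables Z1, ..., Zk, ..., Zm]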
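The binary partitions of a segmentation variable can be enumerated with a short sketch like the one below. The function name `binary_partitions` and the `ordered` flag are hypothetical, and the treatment of ordinal variables (only order-preserving cut points) is an assumption rather than something stated on the slide.

```python
from itertools import combinations


def binary_partitions(levels, ordered=False):
    """Return all binary splits (left, right) of a segmentation variable's levels."""
    levels = list(levels)
    if ordered:
        # Ordinal variable (assumed rule): only cut points that keep the order.
        return [(levels[:i], levels[i:]) for i in range(1, len(levels))]
    # Nominal variable: keep the first level on the left side so that
    # mirrored splits such as ({a}, {b, c}) and ({b, c}, {a}) appear once.
    first, rest = levels[0], levels[1:]
    splits = []
    for size in range(len(rest)):  # the right side is never left empty
        for combo in combinations(rest, size):
            left = [first, *combo]
            right = [lvl for lvl in rest if lvl not in combo]
            splits.append((left, right))
    return splits


# Example with a hypothetical three-level nominal variable.
for left, right in binary_partitions(["teen", "adult", "senior"]):
    print(left, "|", right)
```

A nominal variable with k levels yields 2^(k-1) - 1 candidate splits, while an ordinal one yields k - 1, so the number of child-model pairs to compare grows quickly with the number of categories.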
8
The PATHMOX Approach
Segmentation Tree of PLS Path Models
Root Node Global Model
Child Node
MOX > MOXEXELOA: from Nahuatl (Aztec language), meaning "divide into groups"
Leaf Node
9
Split criterion
Parent model: segments A and B together
Child models: A, B
10
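Hypothesis test
Equality of coefficients between child nodes A and B
Under H0: both nodes share the same coefficients, $\beta_{A1} = \beta_{B1} = \beta_1$ and $\beta_{A2} = \beta_{B2} = \beta_2$
Under H1: each node has its own coefficients, $\beta_{A1} \neq \beta_{B1}$ or $\beta_{A2} \neq \beta_{B2}$
Assuming a common variance: $\zeta_1 \sim N(0, \sigma^2 I_n)$, $\zeta_2 \sim N(0, \sigma^2 I_n)$
F-statistic for multiple regression (Lebart et al., 1979)
F distribution with $p_1 + p_2$ and $2n - 2(p_1 + p_2)$ d.f.
Best split: the one with the most significant F-statistic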
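The idea behind the test can be sketched for a single regression equation, in the spirit of a Chow-type comparison of a pooled fit against two separate fits. This is a simplified illustration under stated assumptions, not the exact PATHMOX statistic: it handles one endogenous equation with p predictors and no intercept (so the degrees of freedom are p and n_A + n_B - 2p), and the names `rss` and `split_f_statistic` as well as the simulated data are hypothetical.

```python
import numpy as np


def rss(X, y):
    """Residual sum of squares of an ordinary least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def split_f_statistic(X_a, y_a, X_b, y_b):
    """F statistic for H0: the same coefficients hold in nodes A and B."""
    p = X_a.shape[1]
    n_total = len(y_a) + len(y_b)
    # H0: one coefficient vector fitted on the pooled data.
    rss_h0 = rss(np.vstack([X_a, X_b]), np.concatenate([y_a, y_b]))
    # H1: separate coefficient vectors for each node.
    rss_h1 = rss(X_a, y_a) + rss(X_b, y_b)
    df1, df2 = p, n_total - 2 * p
    f_stat = ((rss_h0 - rss_h1) / df1) / (rss_h1 / df2)
    return f_stat, df1, df2


# Illustrative data: two nodes whose coefficients differ (values are made up).
rng = np.random.default_rng(0)
X_a = rng.normal(size=(200, 2))
X_b = rng.normal(size=(200, 2))
y_a = X_a @ np.array([0.5, 0.5]) + rng.normal(scale=0.3, size=200)
y_b = X_b @ np.array([0.9, 0.1]) + rng.normal(scale=0.3, size=200)
f_stat, df1, df2 = split_f_statistic(X_a, y_a, X_b, y_b)
print(f_stat, df1, df2)
```

A p-value could then be read from the F distribution, for example with `scipy.stats.f.sf(f_stat, df1, df2)` if SciPy is available. Candidate splits would be ranked by that p-value.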
11
Stopping criterion
1. Minimum number of elements inside one node
2. p-value > threshold (p-value > α)
3. Specifying a growth depth-level
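The three rules can be combined into a single check, as in the sketch below. The function name `should_stop` and the default values for `min_size`, `alpha`, and `max_depth` are hypothetical placeholders, not settings taken from the slides.

```python
def should_stop(node_size, p_value, depth, min_size=50, alpha=0.05, max_depth=3):
    """Return True if any of the three stopping rules applies to a node."""
    too_small = node_size < min_size   # rule 1: too few elements in the node
    not_significant = p_value > alpha  # rule 2: best split is not significant
    too_deep = depth >= max_depth      # rule 3: growth depth level reached
    return too_small or not_significant or too_deep


# A large node with a significant split at depth 1 keeps growing.
print(should_stop(node_size=400, p_value=0.001, depth=1))
```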
12
Simulation studies
Sensitivity of the F-test with respect to:

Skewness of data
Distance between path coefficients
Sample Sizes
Levels of noise of the endogenous term
Levels of noise of indicators
Unbalanced segments
Variances of endogenous residuals

13
Experimental conditions
[Figure: path diagrams for Node A (fixed inner model) and Node B. In both nodes, the exogenous latent variables ξ1 (indicators x11, x12, x13) and ξ2 (indicators x21, x22, x23) explain the endogenous latent variable ξ3 (indicators x31, x32, x33) through path coefficients β31 and β32, with disturbance ζ. Each indicator xjk has loading λjk and error εjk.]
Simulation according to the following experimental conditions:

Data distributions: normal (Sánchez & Aluja, PLS07), non-normal
Path coefficients
Sample size: 100, 200, 500, 1000
Balancing proportions: 60%, 70%, 80%, 90%
Noise levels of ζ: 10%, 30%, 50%
Noise levels of ε: 10%, 30%, 50%
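The numeric factors listed above can be laid out as a design grid, as in the following sketch. It assumes a fully crossed design, reads the balancing proportions and noise levels as percentages, and takes the balancing proportion to be the share of cases in node A. None of these choices is confirmed by the slides. Data distributions and the pairs of path coefficients are left out because their values are not given in this list.

```python
from itertools import product

# Levels taken from the list of experimental conditions.
SAMPLE_SIZES = [100, 200, 500, 1000]
BALANCE_PCT = [60, 70, 80, 90]   # assumed: percentage of cases in node A
NOISE_LV_PCT = [10, 30, 50]      # noise of the endogenous term (zeta)
NOISE_MV_PCT = [10, 30, 50]      # noise of the indicators (epsilon)


def design_grid():
    """Yield one dict per combination of the four design factors."""
    for n, bal, noise_lv, noise_mv in product(
        SAMPLE_SIZES, BALANCE_PCT, NOISE_LV_PCT, NOISE_MV_PCT
    ):
        n_a = n * bal // 100  # cases assigned to node A
        yield {
            "n": n,
            "n_a": n_a,
            "n_b": n - n_a,
            "noise_lv": noise_lv,
            "noise_mv": noise_mv,
        }


grid = list(design_grid())
print(len(grid))  # 4 * 4 * 3 * 3 combinations
```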

14
Path coefficients
9 Pairs of path coefficients
Gradually changed values
Fixed values
15
Data distributions
Examples of Distributions
Beta (6,6)
Beta (9,4)
Beta (9,1)
16
Symmetric distribution: Beta(6,6)
17
Moderate skew distribution: Beta(9,4)
18
High skew distribution: Beta(9,1)
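The labels symmetric, moderate skew, and high skew can be checked against the closed-form skewness of a Beta(a, b) distribution. The sketch below computes it and draws standardized samples. The function names are hypothetical, and rescaling the draws to mean 0 and variance 1 is an assumption about how such data might be prepared, not a step described in the slides.

```python
import math
import random


def beta_skewness(a, b):
    """Theoretical skewness of a Beta(a, b) distribution."""
    return 2 * (b - a) * math.sqrt(a + b + 1) / ((a + b + 2) * math.sqrt(a * b))


def standardized_beta_sample(a, b, size, seed=0):
    """Draw Beta(a, b) values and rescale them to mean 0 and variance 1."""
    rng = random.Random(seed)
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return [(rng.betavariate(a, b) - mean) / sd for _ in range(size)]


for a, b in [(6, 6), (9, 4), (9, 1)]:
    print(f"Beta({a},{b}) skewness: {beta_skewness(a, b):.3f}")
```

The skewness is zero for Beta(6,6) and increasingly negative (a longer left tail) for Beta(9,4) and Beta(9,1).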
19
Global results
20
Influence of β distance by distribution
21
Influence of sample size by distribution
22
Influence of noise of LVs
23
Influence of noise of MVs
24
Unbalanced Segments (normal data)
25
Unbalanced Segments Beta(9,4)
26
Influence of different variances
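Different variances of endogenous error terms
[Figure: two copies of a path model with latent variables ξ1, η1, η2, disturbances ζ1, ζ2, and path coefficients 0.7, 0.8, 0.5]
$\zeta_1 \sim N(0, \sigma_1^2 I_n)$, $\zeta_2 \sim N(0, \sigma_2^2 I_n)$
Four types of different variances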
27
Influence of different variances
28
Conclusions

Non-normality of distributions doesn't affect the results of the test statistic.
Splits with unbalanced children nodes deliver less sensitive p-values of the statistic.
The F-statistic favors balanced splits.
Unequal variances of endogenous latent variables render the test statistic, and hence the tree, less reliable.
The F-test is used as a data mining tool to discover unexpected segments by ordering the splits for a given node.

29
References

Cassel, C., Hackl, P., & Westlund, A.H. (1999) Robustness of partial least squares method for estimating latent variable quality structures. Journal of Applied Statistics, 26(4), 435-446.
Chin, W. (2000) Frequently Asked Questions: PLS & PLS-Graph. http://disc-nt.cba.uh.edu/chin/plsfaq/plsfaq.htm
Chin, W. (2003) A Permutation Based Procedure for Multi-Group Comparison of PLS Models. In Proceedings of the PLS03 Intl. Symposium, 33-43. M. Vilares, M. Tenenhaus, P. Coelho, V. Esposito Vinzi, A. Morineau (Eds), Decisia.
Chin, W.W., Marcolin, B.L., & Newsted, P.R. (2003) A Partial Least Squares Latent Variable Approach for Measuring Interaction Effects: Results from a Monte Carlo Simulation Study and Voice Mail Emotion/Adoption Study. Information Systems Research, 14(2), 189-217.
Chow, G. (1960) Tests of Equality between Sets of Coefficients in Two Linear Regressions. Econometrica, 28(3), 591-605.
Dillon, W.R., & Kumar, A. (1994) Latent Structure and Other Mixture Models in Marketing: An Integrative Survey and Overview. In Advanced Methods of Marketing Research, Richard P. Bagozzi (Ed.), Blackwell, 295-351.
Esposito Vinzi, V., Trinchera, L., Squillacciotti, S., & Tenenhaus, M. (2008) REBUS-PLS: A Response-based procedure for detecting unit segments in PLS Path Modelling. Applied Stochastic Models in Business and Industry, 24(5), 439-459.
Goodhue, D., Lewis, W., & Thompson, R. (2006) PLS, Small Sample Size, and Statistical Power in MIS Research. In Proceedings of the 39th Hawaii International Conference on System Sciences (HICSS06), Track 8.
Hahn, C., Johnson, M.D., Herrmann, A., & Huber, A. (2002) Capturing Customer Heterogeneity Using a Finite Mixture PLS Approach. Schmalenbach Business Review, 54, 243-269.
Henseler, J. (2007) A New and Simple Approach to Multi-Group Analysis in PLS Path Modeling. In H. Martens & T. Naes (Eds), Proceedings of the PLS07 International Symposium, Matforsk, Ås, Norway, 104-107.

30
References

Jedidi, K., Jagpal, H.S., & DeSarbo, W.S. (1997) Finite-Mixture Structural Equation Models for Response-Based Segmentation and Unobserved Heterogeneity. Marketing Science, 16(1), 39-59.
Lebart, L., Morineau, A., & Fénelon, J.P. (1979) Traitement des données statistiques. Paris: Dunod.
Lohmöller, J.B. (1989) Latent Variable Path Modeling with Partial Least Squares. Heidelberg: Physica-Verlag.
Lubke, G.H., & Muthén, B. (2005) Investigating Population Heterogeneity with Factor Mixture Models. Psychological Methods, 19(1), 21-39.
Palumbo, F., & Romano, R. (2008) Possibilistic PLS Path Modeling: A New Approach to the Multigroup Comparison. In Proceedings in Computational Statistics, 303-314. Paula Brito (Ed), Heidelberg: Physica-Verlag.
Ringle, C., & Schlittgen, R. (2007) A Genetic Algorithm Segmentation Approach for Uncovering and Separating Groups of Data in PLS-PM. In H. Martens & T. Naes (Eds), Proceedings of the PLS07 Intl. Symposium, Matforsk, Ås, Norway, 75-78.
Sánchez, G., & Aluja, T. (2007) A Simulation Study of PATHMOX (PLS Path Modeling Segmentation Tree) Sensitivity. In H. Martens & T. Naes (Eds), Proceedings of the PLS07 Intl. Symposium, Matforsk, Ås, Norway, 33-36.
Serch, O. (2008) Sistema de Visualització de models PLS-PM. Projecte Final de Carrera. Facultat d'Informàtica de Barcelona, Universitat Politècnica de Catalunya. January 2008.
Squillacciotti, S. (2005) Prediction oriented classification in PLS Path Modelling. In Proceedings of the PLS05 Intl. Symposium, T. Aluja, J. Casanovas, V. Esposito, A. Morineau, M. Tenenhaus (Eds), SPAD Test&Go, 499-506.
Tenenhaus, M., Esposito Vinzi, V., Chatelin, Y.M., & Lauro, C. (2005) PLS path modeling. Computational Statistics & Data Analysis, 48, 159-205.