Principal Component Analysis

Principal components analysis (PCA) is a data reduction technique that transforms a larger number of correlated variables into a much smaller set of uncorrelated variables called principal components.
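
As a minimal illustration of this definition (a sketch only, not the app's implementation), the two-variable case can be worked by hand. After standardizing, the two principal components are the scaled sum and the scaled difference of the variables, and their scores are uncorrelated. The data values and function names below are made up for illustration.

```python
import math

def standardize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def correlation(x, y):
    """Pearson correlation of two equal-length lists."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

def pca_two_variables(x, y):
    """Return (pc1_scores, pc2_scores) for two standardized variables."""
    zx, zy = standardize(x), standardize(y)
    root2 = math.sqrt(2)
    total = [(a + b) / root2 for a, b in zip(zx, zy)]
    contrast = [(a - b) / root2 for a, b in zip(zx, zy)]
    # The direction with the larger variance is PC1: the sum when the
    # correlation is non-negative, the difference otherwise.
    if correlation(x, y) >= 0:
        return total, contrast
    return contrast, total

# Made-up, positively correlated measurements (illustration only)
height = [150.0, 160.0, 165.0, 172.0, 180.0, 185.0]
weight = [52.0, 58.0, 63.0, 70.0, 79.0, 83.0]

pc1, pc2 = pca_two_variables(height, weight)
print(round(correlation(height, weight), 3))  # the inputs are strongly correlated
print(abs(correlation(pc1, pc2)) < 1e-9)      # the component scores are not
```

With more than two variables the components come from an eigendecomposition of the correlation (or covariance) matrix, but the idea is the same: correlated inputs, uncorrelated outputs.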

1. Functionalities

To estimate the number of components using parallel analysis
To obtain a correlation matrix and draw plots
To obtain the principal components and loadings result tables
To obtain the principal components and loadings distribution plots in 2D and 3D

2. About your data

All data used in the analysis must be numeric
The sample size must be larger than the number of independent variables; that is, the number of rows must be greater than the number of columns
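
The two requirements above can be checked before uploading data. The following is a minimal sketch, assuming the table is held as a list of rows; the function name `check_pca_input` is hypothetical and is not part of the app.

```python
def check_pca_input(rows):
    """Return a list of problems found in a table given as a list of rows."""
    if not rows or not rows[0]:
        return ["the table is empty"]
    problems = []
    n_rows, n_cols = len(rows), len(rows[0])
    if any(len(row) != n_cols for row in rows):
        problems.append("rows have different lengths")
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            # bool is a subclass of int in Python, so exclude it explicitly
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"non-numeric value at row {i}, column {j}: {value!r}")
    if n_rows <= n_cols:
        problems.append(
            f"need more rows than columns, got {n_rows} rows and {n_cols} columns"
        )
    return problems

good = [[1.0, 2.0], [2.0, 1.5], [3.0, 3.5]]   # 3 rows, 2 numeric columns
bad = [[1.0, "a"], [2.0, 3.0]]                # text value, and rows == columns
print(check_pca_input(good))
print(check_pca_input(bad))
```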

Please follow the Steps to build the model, and click Outputs to get analytical results.

Output 1. Data Explores

Part of Data

Please edit data in Data tab

Output 2. Model Results

Explanations

This plot shows the relationship between two components. You can use the score plot to assess the data structure and to detect clusters, outliers, and trends
Groupings of data on the plot may indicate two or more separate distributions in the data
If the data follow a normal distribution and no outliers are present, the points are randomly distributed around zero
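
Outliers in a score plot are usually judged by eye. As a rough numerical companion, the sketch below flags points that lie far from the center of two score columns. The robust scaling (median and MAD), the cutoff of 3.5, the labels, and the score values are all assumptions made for illustration; the app does not necessarily apply any such rule.

```python
import math
import statistics

def robust_scale(scores):
    """Center by the median and scale by the median absolute deviation (MAD)."""
    center = statistics.median(scores)
    mad = statistics.median([abs(s - center) for s in scores])
    if mad == 0:
        raise ValueError("MAD is zero; scores are too concentrated to scale")
    # 1.4826 makes the MAD comparable to a standard deviation for normal data
    return [(s - center) / (1.4826 * mad) for s in scores]

def flag_outliers(pc_x, pc_y, labels, cutoff=3.5):
    """Return labels of points whose scaled distance from the center exceeds cutoff."""
    zx, zy = robust_scale(pc_x), robust_scale(pc_y)
    return [label for label, a, b in zip(labels, zx, zy)
            if math.hypot(a, b) > cutoff]

# Made-up scores: ten points near zero and one far away
pc_x = [0.5, -0.3, 0.1, -0.6, 0.4, -0.2, 0.3, -0.1, -0.4, 0.2, 8.0]
pc_y = [-0.2, 0.4, -0.5, 0.1, 0.3, -0.1, -0.3, 0.6, 0.2, -0.4, 7.0]
labels = [f"s{i}" for i in range(1, 12)]

print(flag_outliers(pc_x, pc_y, labels))  # only the distant point "s11" is flagged
```

A flagged point is only a candidate; whether it is a real outlier should still be decided from the plot and from knowledge of the data.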

2. When A >=2, choose 2 components to show component and loading 2D plot

2.1. Component at x-axis

2.2. Component at y-axis

In the plot of PC1 and PC2 (without group circles), we can see some outliers, for example, points 11 and 23. If we choose diet as the grouping variable and add group circles based on Euclidean distance, we can see that diet type sun is separated from the others.

Explanations

This plot shows the contributions of the variables to the PCs (choose the PC in the left panel)
Red indicates negative and blue indicates positive effects
Use the cumulative proportion of variance (in the variance table) to determine the amount of variance that the components explain.
For descriptive purposes, explaining 80% (0.8) of the variance may be enough.
If you want to perform other analyses on the data, you may want at least 90% of the variance to be explained by the components.
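
The 80% and 90% guidelines above can be applied mechanically to the eigenvalues (component variances) behind a variance table. The following is a minimal sketch; the eigenvalues are made up, and the function names are hypothetical.

```python
def cumulative_proportions(eigenvalues):
    """Cumulative share of total variance, with components sorted largest first."""
    total = sum(eigenvalues)
    running, result = 0.0, []
    for value in sorted(eigenvalues, reverse=True):
        running += value
        result.append(running / total)
    return result

def components_needed(eigenvalues, target):
    """Smallest number of components whose cumulative proportion reaches target."""
    for k, cumulative in enumerate(cumulative_proportions(eigenvalues), start=1):
        # small tolerance so that floating-point rounding does not add a component
        if cumulative >= target - 1e-12:
            return k
    return len(eigenvalues)

# Made-up eigenvalues for five standardized variables (they sum to 5)
eigenvalues = [2.5, 1.5, 0.5, 0.25, 0.25]

print(cumulative_proportions(eigenvalues))   # [0.5, 0.8, 0.9, 0.95, 1.0]
print(components_needed(eigenvalues, 0.80))  # 2
print(components_needed(eigenvalues, 0.90))  # 3
```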

Loadings

Variance table

Explanations

This plot (a biplot) overlays the components and the loadings (choose the PCs in the left panel)
If the data follow a normal distribution and no outliers are present, the points are randomly distributed around zero
Loadings identify which variables have the largest effect on each component.
Loadings can range from -1 to 1. Loadings close to -1 or 1 indicate that the variable strongly influences the component. Loadings close to 0 indicate that the variable has a weak influence on the component.
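
One way to see why loadings stay between -1 and 1: under a common convention for PCA on standardized variables, a loading is the eigenvector entry multiplied by the square root of the eigenvalue, which equals the correlation between the variable and the component. The sketch below works this out for the two-variable case, where the eigenvalues are 1 + r and 1 - r. This is an illustration under that assumed convention; the app may scale its loadings differently.

```python
import math

def loadings_two_variables(r):
    """Loadings for two standardized variables with correlation r (0 <= r < 1).

    Returns {component: (loading_of_variable_1, loading_of_variable_2)},
    where each loading is eigenvector entry * sqrt(eigenvalue).
    """
    eigenvalues = {"PC1": 1 + r, "PC2": 1 - r}
    unit = 1 / math.sqrt(2)
    eigenvectors = {"PC1": (unit, unit), "PC2": (unit, -unit)}
    return {
        pc: tuple(entry * math.sqrt(eigenvalues[pc]) for entry in eigenvectors[pc])
        for pc in eigenvalues
    }

loadings = loadings_two_variables(0.6)
for pc, values in loadings.items():
    print(pc, [round(v, 3) for v in values])

# For each variable, the squared loadings over all components add up to 1,
# so no single loading can fall outside the range -1 to 1.
variable_1_total = sum(values[0] ** 2 for values in loadings.values())
print(round(variable_1_total, 6))
```

Here both variables load strongly on PC1 (about 0.89) and more weakly on PC2 (about 0.45 in absolute value), which matches the reading given above: values near -1 or 1 mean strong influence, values near 0 mean weak influence.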

When A >=2, choose 2 components to show component and loading 2D plot

2.1. Component at x-axis

2.2. Component at y-axis

In the plot of PC1 and PC2, we can see that ACAT2 has a comparatively strong negative effect on PC1 and PKD4 has a strong positive effect on PC1. For PC2, THIOL has a strong positive effect and VDR has a strong negative effect. These results correspond to the loading plot.

Explanations

This plot extends the 2D plot. It overlays the components and the loadings for 3 PCs (choose the PCs and the length of the lines in the left panel)
We can find the outliers in the plot.
If the data follow a normal distribution and no outliers are present, the points are randomly distributed around zero
Loadings identify which variables have the largest effect on each component
Loadings can range from -1 to 1. Loadings close to -1 or 1 indicate that the variable strongly influences the component. Loadings close to 0 indicate that the variable has a weak influence on the component.

This plot may take some time to load the first time

When A >=3, choose 3 components to show component and loading 3D plot

The default is to show the first 3 PC in the 3D plot

1. Component at x-axis

2. Component at y-axis

3. Component at z-axis

4. (Optional) Change line scale (length)

Trace legend

Exploratory Factor Analysis

Exploratory factor analysis (EFA) is a statistical method that describes the variability among observed, correlated variables in terms of a potentially smaller number of unobserved variables called factors.

1. Functionalities

To estimate the number of factors using parallel analysis
To obtain a correlation matrix and plots
To obtain the factors and loadings result tables
To obtain the factors and loadings distribution plots in 2D and 3D
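
The idea behind parallel analysis can be sketched in a few lines: the eigenvalues of the observed correlation matrix are compared with the eigenvalues obtained from random data of the same size, and only the leading ones that exceed their random counterparts are kept. The code below is a simplified illustration, assuming the mean of the simulated eigenvalues as the reference and a basic Jacobi eigenvalue routine. The app may use a different reference (for example a percentile) or a factor-based variant, and all names and data here are hypothetical.

```python
import math
import random

def correlation_matrix(rows):
    """Pearson correlation matrix of a table given as a list of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]

def eigenvalues_symmetric(matrix, max_sweeps=100, tol=1e-20):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    p = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if a[i][j] == 0.0:
                    continue
                diff = a[j][j] - a[i][i]
                # rotation angle that zeroes the (i, j) entry
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[i][j] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(p):  # rotate columns i and j
                    aki, akj = a[k][i], a[k][j]
                    a[k][i], a[k][j] = c * aki - s * akj, s * aki + c * akj
                for k in range(p):  # rotate rows i and j
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k], a[j][k] = c * aik - s * ajk, s * aik + c * ajk
    return sorted((a[i][i] for i in range(p)), reverse=True)

def parallel_analysis(rows, n_sims=50, seed=0):
    """Count leading eigenvalues that exceed the average from random normal data."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    observed = eigenvalues_symmetric(correlation_matrix(rows))
    totals = [0.0] * p
    for _ in range(n_sims):
        noise = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
        for k, value in enumerate(eigenvalues_symmetric(correlation_matrix(noise))):
            totals[k] += value
    reference = [t / n_sims for t in totals]
    keep = 0
    for obs, ref in zip(observed, reference):
        if obs <= ref:
            break
        keep += 1
    return keep, observed, reference

# Made-up data: four noisy copies of one hidden factor, 100 samples
data_rng = random.Random(1)
rows = []
for _ in range(100):
    hidden = data_rng.gauss(0, 1)
    rows.append([hidden + 0.3 * data_rng.gauss(0, 1) for _ in range(4)])

keep, observed, reference = parallel_analysis(rows, n_sims=20)
print(keep)
print([round(v, 2) for v in observed])
print([round(v, 2) for v in reference])
```

Because the four made-up variables share a single hidden factor, only the first observed eigenvalue stands clearly above the random reference.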

2. About your data

All data used in the analysis must be numeric
The sample size must be larger than the number of independent variables; that is, the number of rows must be greater than the number of columns

Please follow the Steps to build the model, and click Outputs to get analytical results.

Output 1. Data Explores

Part of Data

Please edit data in Data tab

Output 2. Model Results

Explanations

This plot shows how the factors relate to the variables
The results in the window show the statistical test of whether the number of factors is sufficient.

Explanations

This plot shows the relationship between two factors. You can use the score plot to assess the data structure and to detect clusters, outliers, and trends
Groupings of data on the plot may indicate two or more separate distributions in the data
If the data follow a normal distribution and no outliers are present, the points are randomly distributed around zero

2. When A >=2, choose 2 factors to show component and loading 2D plot

2.1. Component at x-axis

2.2. Component at y-axis

In the plot of ML1 and ML2, we can see some outliers, for example, points 169 and 208. These points can be removed in the Data tab. If we choose type as the grouping variable and add group circles based on Euclidean distance, we can see that group B is somewhat different. Not all groups have circles, because some groups contain too few points.

Explanations

This plot shows the contributions of the variables to the factors (choose PC in the left panel)
Red indicates negative and blue indicates positive effects
Use the proportion of variance (in the variance table) to determine the amount of variance that the factors explain.
For descriptive purposes, you may need only 80% (0.8) of the variance explained.
If you want to perform other analyses on the data, you may want to have at least 90% of the variance explained by the factors.
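
For factor analysis on standardized variables with uncorrelated factors, the entries of a variance table can be derived from the loadings: the sum of squared loadings of a factor, divided by the number of variables, gives the proportion of variance that the factor explains. The sketch below assumes that setting; the loadings and variable names are made up, and the function name is hypothetical.

```python
def variance_table(loadings):
    """Build SS loadings, proportion, and cumulative proportion for each factor.

    `loadings` maps a variable name to its list of loadings, one per factor.
    Assumes standardized variables, so the total variance equals the number
    of variables.
    """
    n_vars = len(loadings)
    n_factors = len(next(iter(loadings.values())))
    table, cumulative = [], 0.0
    for f in range(n_factors):
        ss = sum(row[f] ** 2 for row in loadings.values())
        proportion = ss / n_vars
        cumulative += proportion
        table.append({
            "factor": f + 1,
            "ss_loadings": ss,
            "proportion": proportion,
            "cumulative": cumulative,
        })
    return table

# Made-up loadings for four variables on two factors
example_loadings = {
    "v1": [0.9, 0.1],
    "v2": [0.8, 0.2],
    "v3": [0.1, 0.7],
    "v4": [0.2, 0.6],
}

table = variance_table(example_loadings)
for row in table:
    print(row["factor"], round(row["ss_loadings"], 3),
          round(row["proportion"], 3), round(row["cumulative"], 3))
```

In this made-up example the two factors together explain 60% of the variance, which would fall short of the 80% guideline mentioned above.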

Loadings

Variance table

Explanations

This plot (a biplot) overlays the factors and the loadings (choose PC in the left panel)
If the data follow a normal distribution and no outliers are present, the points are randomly distributed around zero
Loadings identify which variables have the largest effect on each component
Loadings can range from -1 to 1. Loadings close to -1 or 1 indicate that the variable strongly influences the component. Loadings close to 0 indicate that the variable has a weak influence on the component.

When A >=2, choose 2 factors to show factors and loading 2D plot

1. Factor at x-axis

2. Factor at y-axis

After removing points 169 and 208, we can see that chem2 has a comparatively strong relation to ML2.

Explanations

This plot extends the 2D plot. It overlays the factors and the loadings for 3 factors (choose the factors and the length of the lines in the left panel)
We can find the outliers in the plot.
If the data follow a normal distribution and no outliers are present, the points are randomly distributed around zero
Loadings identify which variables have the largest effect on each component
Loadings can range from -1 to 1. Loadings close to -1 or 1 indicate that the variable strongly influences the component. Loadings close to 0 indicate that the variable has a weak influence on the component.

This plot may take some time to load the first time

When A >=3, choose 3 factors to show factors and loading 3D plot

The default is to show the first 3 factors in the 3D plot

1. Factor at x-axis

2. Factor at y-axis

3. Factor at z-axis

4. (Optional) Change line scale (length)

Trace legend

Data Preparation

1. Functionalities

2. About your data

Case Example 1: Mouse gene expression data

Case Example 2: Chemical data

Please follow the Steps, and Outputs will give real-time analytical results. After the data are ready, find the model in the following tabs.

Data Preparation

Change the types of some variables?

Output 1. Data Information

Output 2. Descriptive Results

Principal Component Analysis

Build the Model

Step 1. Choose parameters to build the model

Step 2. If data and model are ready, click the blue button to generate model results.

Exploratory Factor Analysis

Build the Model

Step 1. Choose parameters to build the model

Step 2. If data and model are ready, click the blue button to generate model results.
