Problem Description
I am working on a project that involves using an unsupervised neural network (autoencoder) in MATLAB to preprocess data and generate transformed features (encoded_X) as inputs for a supervised neural network. The main goal is to capture both linear and non-linear relationships among variables in the dataset, which consists of 113 observations and 40 features. However, I am facing challenges at multiple levels, including:
Behavior of the Unsupervised Neural Network
The unsupervised neural network (autoencoder) is generating a transformed dataset (encoded_X) with dimensions identical to the original dataset (X, 113 × 40), as intended. However, the encoded features exhibit the following issues:
- Highly correlated features: The transformed features in encoded_X are almost uniformly positively correlated, which suggests that the autoencoder might not be learning meaningful or diverse patterns in the data.
- Limited variability: Although the original data contains both linear and non-linear relationships, encoded_X does not appear to capture the full complexity of the input data.
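
To put numbers on these two symptoms, a minimal diagnostic sketch along the following lines may help. It assumes encoded_X is a 113-by-40 numeric matrix with observations in rows; the random matrix below is only a placeholder so the snippet runs on its own, and the variable names (offDiag, featVar, k95) and the 95% threshold are illustrative choices, not part of my actual code.

```matlab
% Placeholder data (assumption): replace with the real 113-by-40 encoder output.
rng(0);
encoded_X = rand(113, 40);

% Pairwise correlations between encoded features (columns are variables).
R = corrcoef(encoded_X);                 % 40-by-40 correlation matrix
offDiag = R(~eye(size(R)));              % keep only the off-diagonal entries
fprintf('Mean off-diagonal correlation: %.3f\n', mean(offDiag));
fprintf('Share of positive pairs:       %.3f\n', mean(offDiag > 0));

% Per-feature variance: values near zero indicate features that barely vary.
featVar = var(encoded_X);                % 1-by-40 row vector
fprintf('Min / median / max variance:   %.4g / %.4g / %.4g\n', ...
        min(featVar), median(featVar), max(featVar));

% Rough effective dimensionality: number of singular directions needed to
% explain 95% of the variance (the 95% threshold is an arbitrary choice).
Xc = encoded_X - mean(encoded_X);        % center columns (needs R2016b or later)
s = svd(Xc);
explained = cumsum(s.^2) / sum(s.^2);
k95 = find(explained >= 0.95, 1);
fprintf('Components for 95%% of variance: %d of %d\n', k95, size(encoded_X, 2));
```

A mean off-diagonal correlation close to 1 together with a small k95 would suggest that the 40 encoded features are largely redundant, which is consistent with the behavior described above.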