Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit of the matrix size for sparse Eigendecomposition; memory or CPU? #160

Open
dsl2501 opened this issue Aug 12, 2023 · 2 comments
Open

Comments

@dsl2501
Copy link

dsl2501 commented Aug 12, 2023

I found that 1024 x 1024 size was maximum for the following code. Performance profiling did not yield the reason, as memory of max. 8 max was being used without any evidence of leakage and CPU 1% all the time (32 core AMD CPU with 1T memory).
I wonder if there is any method to handle 4096x4096 or larger matrices with sparsity condition met. Actually, my real-life matrix is much much larger, and could be handled by matlab eigs.

Here is the code.

#include <Eigen/Core>
#include <Eigen/SparseCore>
#include
#include
#include
#include <Spectra/SymEigsSolver.h>
#include <Spectra/MatOp/SparseSymMatProd.h>

// save results of Eigenvalue Eigenvector
void saveResults(const Eigen::VectorXd& eigenvalues, const Eigen::MatrixXd& eigenvectors) {
std::ofstream file("results.txt");
if (file.is_open()) {
file << "Eigenvalues:\n" << eigenvalues << "\nEigenvectors:\n" << eigenvectors << '\n';
file.close();
std::cout << "Results have been saved to results.txt\n";
} else {
std::cerr << "Unable to open file.\n";
}
}

int main() {
int size = 1024; // Matrix size is fixed to 256x256.

Eigen::SparseMatrix<double> T(size, size); // Sparse matrix declaration.

// Randomly fill the sparse matrix
std::default_random_engine generator;
std::uniform_real_distribution<double> distribution(0.0, 1.0);
for (int row = 0; row < size; ++row) {
    for (int col = 0; col < size; ++col) {
        if (distribution(generator) < 0.05) { // 5% chance to fill an element
            T.insert(row, col) = distribution(generator);
        }
    }
}

// Define the relative path to the project root
std::string pathToProjectRoot = "C:/Users/.../MatrixMultiplyEigen/data_matrix.txt"; // Modify this path based on your project structure

// Save the matrix to a text file
std::ofstream txtFile(pathToProjectRoot);
for (int row = 0; row < size; ++row) {
    for (int col = 0; col < size; ++col) {
        txtFile << T.coeff(row, col) << '\t'; // '\t' is the tab character for formatting
    }
    txtFile << '\n'; // Newline for each row
}
txtFile.close();
std::cout << "Matrix has been saved to data_matrix.txt" << std::endl;

// Construct matrix operation object using the wrapper class SparseSymMatProd
Spectra::SparseSymMatProd<double> op(T);

// Construct eigen solver object, requesting the largest eigenvalue
Spectra::SymEigsSolver< Spectra::SparseSymMatProd<double> > eigs(op, 3, 6);

// Initialize and compute
std::cout << "Initialization and computation starting..." << std::endl;
eigs.init();
Eigen::Index nconv = eigs.compute(); // Use Eigen::Index instead of int
std::cout << "Computation finished. Results:" << std::endl;

// Retrieve results
if (eigs.info() == Spectra::CompInfo::Successful)
{
    Eigen::VectorXd evalues = eigs.eigenvalues();
    Eigen::MatrixXd evectors = eigs.eigenvectors();
    std::cout << "Eigenvalues:\n" << evalues << std::endl;
    std::cout << "Eigenvectors:\n" << evectors << std::endl;

    // Save the results to a file
    saveResults(evalues, evectors);
}
else
{
    std::cout << "Eigenvalue computation failed." << std::endl;
}

return 0;

}

@yixuan
Copy link
Owner

yixuan commented Aug 15, 2023

Interestingly, I can finish running this code in less than one second.
(One quick note: the matrix you generate is not symmetric, and the solver will only use the lower triangular part.)

g++ -O2 -I../../include main.cpp -o main.out
time ./main.out

Matrix has been saved to data_matrix.txt
Initialization and computation starting...
Computation finished. Results:
Eigenvalues:
 26.3354
  8.1581
-8.20074
Eigenvectors:
  -0.0267025   -0.0147411    -0.057598
  -0.0338742      0.06437    0.0199869
  -0.0250167      0.02961    0.0065676
  -0.0387617 -0.000350736     0.028971
<output truncated>
  -0.0321336  -0.00566424    0.0268457
  -0.0284954   -0.0135238   -0.0122153
  -0.0367226   -0.0587411   -0.0181242
Results have been saved to results.txt

real	0m0.349s
user	0m0.312s
sys	0m0.028s

When I set the size to 4096, change eigs(op, 3, 6) to eigs(op, 3, 20), and disable the printing code, then the code also runs reasonably fast. In fact, most of the time was spent in writing files.

time ./main.out 

Matrix has been saved to data_matrix.txt
Initialization and computation starting...
Computation finished. Results:
Results have been saved to results.txt

real	0m12.656s
user	0m10.647s
sys	0m1.985s

Did you turn on the compiler optimization?

@dsl2501
Copy link
Author

dsl2501 commented Aug 17, 2023

Thanks a lot. yixuan. Your comment helped me understand the importance of subspace 6 (or 20 despite only 3 eigenvalues) and compiler optimization on visual studio 2022.. It worked with 16K as well as 4096. Writing time decreased from 3 hour to 1 hour 50min after adopting binary saving.
I am trying to increase the size upto the size (1 million x 1 million) of the real-life asymmetric matrix. Once I got results or questions, I will be back.
Lee

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants