Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid memory access when calling transform(vector_2d, fn) with heterogeneous vector #431

Open
3 of 8 tasks
tjdwill opened this issue Nov 19, 2024 · 4 comments
Open
3 of 8 tasks

Comments

@tjdwill
Copy link

tjdwill commented Nov 19, 2024

Bug category

  • bug - compilation error
  • bug - compilation warning
  • bug - runtime error
  • bug - runtime warning
  • bug - logic error

Describe the bug

Hello, while studying matplot/util/common.cpp, I noticed that the implementation for transform derives the number of columns based on the number of columns in the first row of a 2D vector. This results in access violations when calling the function with a vector in which any subsequent row has less than x[0].size() elements (see example below).

Similarly, I think this issue also presents itself in flatten because if x[0].size() happens to be smaller than any other row, the output vector's reserved size will be too small, requiring reallocation during the insert operation. This reallocation would render the iterators invalid (if I understood std::vector::insert correctly).

Steps to Reproduce

Compile and run these tests:

Test 1: Non-empty Rows

// matplot_seg_fault.cpp
#include <iostream>
#include <vector>
#include <matplot/matplot.h>


void print_vec(std::vector<std::vector<double>> v)
{
    for(int i {0}; i < v.size(); ++i)
	{
		std::cout << "\n";
        for (int j {0}; j < v[i].size(); ++j)
            std::cout << v[i][j] << " ";
	}
    std::cout << "\n";
}

int main()
{
    namespace mpl = matplot;
    std::vector<std::vector<double>> bad_input {
        std::vector<double(10, 0.),
        {0.},
        {0.},
    };

    auto bad_transformed = mpl::transform(bad_input, [](double x){ return x; });
    print_vec(bad_transformed);
}

Output

0 0 0 0 0 0 0 0 0 0 
0 0 0 1.63042e-322 2 0 0 4.00193e-322 2.74672e-315 2.74672e-315 
0 0 0 4.00193e-322 2.74672e-315 2.74672e-315 2.74672e-315 2.74672e-315 2.74672e-315 2.74672e-315 

Test 2: Empty Rows

// matplot_seg_fault.cpp
#include <iostream>
#include <vector>
#include <matplot/matplot.h>


void print_vec(std::vector<std::vector<double>> v)
{
    for(int i {0}; i < v.size(); ++i)
	{
		std::cout << "\n";
        for (int j {0}; j < v[i].size(); ++j)
            std::cout << v[i][j] << " ";
	}
    std::cout << "\n";
}

int main()
{
    namespace mpl = matplot;
    std::vector<std::vector<double>> bad_input {
        {0., 1., 2., 3.},
        {},
        {},
        {},
    };

    auto bad_transformed = mpl::transform(bad_input, [](double x){ return x; });  // seg faults here
    print_vec(bad_transformed);
}

Output

Segmentation fault (core dumped)

Platform

  • cross-platform issue - linux
  • cross-platform issue - windows
  • cross-platform issue - macos

Environment Details:

  • OS: Fedora Linux
  • OS Version: 6.11.7-300.fc41.x86_64
  • Compiler: gcc
  • Compiler version: 14.2.1 20240912

Additional context

I am aware that most users would pass in homogeneously-dimensioned vectors to the API, but I wanted to test to see if the potential violation was there. I don't know whether the library's perspective is to assume users wouldn't do that or to ensure they won't, so I decided to report it just in case.

P.S.

Not really the subject of the report, but I also noticed an inconsistency of capitalization on _TEMP_MACRO_ on line 248 of common.cpp when compared to lines 227, 250, and 252. I don't know if macros are case-sensitive or not, so I'm just noting it here.

@tjdwill tjdwill changed the title Invalid memory access when calling `transform(vector_2d, fn) with heterogeneous vector Invalid memory access when calling transform(vector_2d, fn) with heterogeneous vector Nov 19, 2024
@alandefreitas
Copy link
Owner

I don't know whether the library's perspective is to assume users wouldn't do that or to ensure they won't, so I decided to report it just in case

Yes, that's true. I don't know if there's something we can do about it. I never considered this possibility. If not, it could be a precondition for the function to be asserted in debug mode.

@alandefreitas
Copy link
Owner

Not really the subject of the report, but I also noticed an inconsistency of capitalization on TEMP_MACRO on line 248 of common.cpp when compared to lines 227, 250, and 252. I don't know if macros are case-sensitive or not, so I'm just noting it here.

Yes. That's a bug. Macros are case-sensitive.

@tjdwill
Copy link
Author

tjdwill commented Nov 20, 2024

Yes, that's true. I don't know if there's something we can do about it. I never considered this possibility. If not, it could be a precondition for the function to be asserted in debug mode.

Just so I understand (still learning), if the user is building in debug mode the library would run input checks?

 

Yeah, I'm not sure what the best fix is that wouldn't introduce potentially-costly checks; refactors; or additional dependencies like Eigen.

I don't see why a user would pass in non-uniform data in an actual use-case, but maybe a documentation note specifying that the functions expect uniformly-dimensioned input would work as a short-term remedy?

Macros are case-sensitive.

Good to know, thank you.

@alandefreitas
Copy link
Owner

Just so I understand (still learning), if the user is building in debug mode the library would run input checks?

Yes. If they're cheap.

I don't see why a user would pass in non-uniform data in an actual use-case, but maybe a documentation note specifying that the functions expect uniformly-dimensioned input would work as a short-term remedy?

Yes. The pre-conditions should be documented. Although we only have comments for now. There's no library reference in the documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants