Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop relying on FILENAME_MAX #8757

Open
bradcray opened this issue Mar 9, 2018 · 18 comments
Open

Stop relying on FILENAME_MAX #8757

bradcray opened this issue Mar 9, 2018 · 18 comments

Comments

@bradcray
Copy link
Member

bradcray commented Mar 9, 2018

On PR #8733, @cassella writes:

...it would be simpler and more flexible if the executableFilename buffer
were dynamically allocated. Especially if someone does use a
filesystem that allows names longer than FILENAME_MAX. ...

...the more I read about FILENAME_MAX, the less confident I
feel about relying on it.
(http://www.gnu.org/software/libc/manual/html_node/Limits-for-Files.html
says it can be larger than you can make an array, (at least on
GNU/Hurd)). I think it'd be better to be able to accept whatever filenames
the user gives, and let the OS worry about whether they're too long.

So the proposal here is to do as Paul suggests on a rainy day.

[edit: there's a strong start to this on PR #17059 that needs someone to see it through. See also #17517 ]

@yozaam
Copy link

yozaam commented Aug 23, 2020

I see that FILENAME_MAX is used in many places, library,files,log etc should we convert all those char arrays into strings?

@bradcray
Copy link
Member Author

bradcray commented Sep 1, 2020

@yozaam: Sorry for the slow response; I've been on summer vacation and just got back today.

I think you're right that converting char arrays within the compiler to strings would be attractive and get away from this issue (a git blame would probably show that I'm responsible for a good many of these, as I'm more C than C++ programmer). Note that files in the runtime/ directory are in C (not C++), so you'd want to limit this conversion to sources in the compiler/ directory.

@rchinmay
Copy link
Contributor

@bradcray I tried looking into the issue. In the compiler/ directory, we can replace char array as string so that its size would be dynamically assigned. In runtime/ directory where C is used, could we not use malloc to assign its size rather than creating it to the length of FILENAME_MAX.
For example, in runtime/src/launch/slurm-srun/launch-slurm-srun.c :

char slurmFilename[FILENAME_MAX];
...
sprintf(slurmFilename, "%s%d", baseSBATCHFilename, (int)mypid);

is used.
Could we not use this:

char* slurmFilename=NULL;
...
slurmFilename=(char *)malloc((sizeof(char))*(sizeof(baseSBATCHFilename) + snprintf(NULL, 0, "%d", (int)mypid) + 1));
sprintf(slurmFilename, "%s%d", baseSBATCHFilename, (int)mypid);

This could reduce the reliance on FILENAME_MAX. We could also check if the pointer returned by malloc is NULL and then gives some error, like Memory allocation error. This could be done in almost all of the places in runtime/ directory.
But I understand that it could slower than static allocation if the size is of the same order of FILENAME_MAX.
Thanks,
R. Chinmay

@daviditen
Copy link
Member

@rchinmay I think you're on the right track.

I'd probably use strlen(baseSBATCHFilename) instead of sizeof(baseSBATCHFilename) so that it would be robust to having baseSBATCHFilename change from a macro to a const char* at some point in the future.

If you're adding a malloc call there should probably be a corresponding free added as well.

@rchinmay
Copy link
Contributor

rchinmay commented Jan 29, 2021

@daviditen @bradcray I tried to solve this. Changes made to runtime/ directory work well. But I had some doubts regarding design issues for the changes in compiler/ directory. I thought that there were 2 possible design selection to achieve this. I will brief them here, please guide me which should be preferred.

  1. Changing char arrays to std::string. This would solve the main problem very efficiently. But there are many functions to which these are passed to (or returned) as char * in the code. So, this change would cause all these functions to change. Example: snprintf(), sprintf(), sizeof(), strlen(), astr() are some of the functions which do not accept these changes implicitly. So, we would have to use c_str() function if we need const char*. When we need char* and not const char*, we could use &s[0] (where s is the name of the string), but this would sometimes cause problems for converting char* back to std::string like changing some part of char* in function beyond the length of the string do not reflect back in passed string. These problems are difficult to handle. Simpler way would be to create a new char array pointing to first element's address of the string. Then after the function, the char* can be assigned to the string. But this causes copying, and hence uses more memory.
  2. Changing char arrays to char* by using malloc(), free() for dynamic memory allocation, similar to what worked in runtime/ directory. But here, most of the char arrays are extern and hence freeing them can be confusing. I am not so clear of the order of usage of these variables in different files and hence I feel the safest way is to have a separate file for the purpose of memory destruction. This can be simpler to implement than the first method, according to me, but would still cause some changes in the structure of the project, and hence can be problematic in future. The other downside to this is that there would sometimes be problems of allocating memory, but an error being called and the compilation stops, without calling the free() method.

Please correct me if I am wrong anywhere. I have presently completed the changes in runtime/ directory, and everything works fine. But I was facing a problem to choose between these two options to implement in compiler/ directory.
Thanks,
R. Chiinmay

@bradcray
Copy link
Member Author

@rchinmay : As a high-level thought, how about doing the changes to the runtime/ directory in one PR if that felt clean and straightforward; and then we could worry about the more challenging compiler directory in a separate effort. That way the part that it sounds like you've successfully completed could be captured rather than lost (if we can't land the compiler portion for some reason).

As far as the compiler goes, my instinct would be to lean on std::string rather than reverting to char*/malloc/free, just to take advantage of the C++ setting. Using .c_str() to get a const char* when needed makes sense to me. I don't have an intuition for cases that would want a char*, but wonder whether that's just code that we should rewrite to use std::string as well (if it's under our control). Could you give some examples of where using std::string + .c_str() break down / aren't applicable?

Admittedly, I think a lot of this code was written by me at the dawn of time, and I'm much more of a C than C++ programmer, so thanks for taking a look.

@rchinmay
Copy link
Contributor

@bradcray Yes, I think it is possible to convert to std::string in compiler/ directory. I had seen some errors where using std::string with c_str() was causing error since some functions like sprintf() to write in the string. But, this could be solved by changing sprintf to assignment or concatenation. Most of the other functions work well with the change to std::string.

@bradcray
Copy link
Member Author

bradcray commented Feb 1, 2021

@rchinmay : That matches my thinking: If we could get away from things like sprintf() (which can arguably be problematic anyway) and change such cases to use C++ string operations, that seems like a win. If you decide at some point that this is a larger effort than you wanted to take on and you just want to stop at the runtime, that'd be fine too, of course.

@rchinmay
Copy link
Contributor

rchinmay commented Feb 7, 2021

@bradcray @daviditen I was stuck in a problem since a couple of days while solving this issue. In compiler/main/driver.cpp argument description accepts the location as void *. Char arrays can be directly passed to this. But std::string cannot. Converting the string to const char* did not work because the -o file specified does not make any changes in the executableFilename string. Hence this causes the executable file to be always named as filename without the part after last dot. I tried a couple of things here. Using const_cast<char*>(s.c_str()) does not work because the information given after the flag -o is not stored in the string and hence lost. Using &s[0] also did not work due to similar reasons. However, I expected &s to work because this gives the pointer to the string, and can be implicitly converted to void*. But it gives internal error on make check. Other cases give [Fail] There was an issue with the installation, test job output incorrect. with ./hello6-taskpar-dist: No such file or directory because hello6-taskpar-dist is created in CHPL_HOME and not in the path mentioned by -o flag.
(Here s was referring to char arrays which are now converted to std::string.)
Could you please guide me how to tackle this?
Thanks,
R. Chinmay

@bradcray
Copy link
Member Author

@rchinmay : Sorry for the slow response here. Do you have a branch that contains your current attempt here? I think it might be easiest for me to understand the problem well and suggest paths forward if I could see that.

@bradcray
Copy link
Member Author

@rchinmay : As we wrap up the spring release, I wanted to ping on this again to see whether you have a branch that I could look at to understand your last question?

@rchinmay
Copy link
Contributor

@bradcray Thanks for informing about the spring release. I had lost the progress of the changes made as I had not made a separate branch for it, neither committed the changes to any branch. I will rework on it on the areas that I had problem implementing with, and I will tag you after linking the branch here.
Thanks,
R. Chinmay.

@rchinmay
Copy link
Contributor

@bradcray
BRANCH LINK
Sorry for the delayed response. I have implemented the part which was producing the problem. I will work on other parts of the issue, once this problem is resolved. I will brief about the problem here. I have only changed all the occurrences of executableFilename in compiler/ in the branch I have provided. The problem is that, it is not accepting the executableFilename from the -o flag during compilation, and every time the default executableFilename is selected, which is always sourceFilename without the .chpl part. I assume this problem is because in compiler/main/driver.cpp, the line 996 was changed. Since executableFilename is now a string and the field only accepts void*, I have tried &executableFIlename[0], &executableFilename and const_cast<char*>(executableFilename.c_str()) in that field. But, none of them resolve this issue. It is because the pointer of the string generated in all these cases are either copies or casted constant pointers and hence the inputs accepted to the pointer is not visible in the string. Hence the -o flag does not work and always the default executableFilename is kept for the executable file generated on compilation. Simple way to verify this would be to run make check and the final test fails and produces (with CHPL_CHECK_DEBUG=1):

[Debug] Starting "make check" script.
[Debug] Debug output is turned on, because $CHPL_CHECK_DEBUG == 1
[Info] Running minimal test script: $CHPL_HOME/util/test/checkChplInstall
[Info] Found executable chpl in /home/rchinmay/Documents/First_PR/chapel/bin/linux64-x86_64/chpl.
[Info] Found $CHPL_HOME directory: /home/rchinmay/Documents/First_PR/chapel
[Info] /home/rchinmay/.chpl does not exist. Creating it.
[Info] Temporary test job directory: /home/rchinmay/.chpl/chapel-test-n1pL8
[Debug] printchplenv, because $CHPL_CHECK_DEBUG == 1
machine info: Linux rchinmay-VivoBook-ASUSLaptop-X430FA-S430FA 5.4.0-67-generic #75-Ubuntu SMP Fri Feb 19 18:03:38 UTC 2021 x86_64
CHPL_HOME: /home/rchinmay/Documents/First_PR/chapel *
script location: /home/rchinmay/Documents/First_PR/chapel/util/chplenv
CHPL_HOST_PLATFORM: linux64
CHPL_HOST_COMPILER: gnu
CHPL_HOST_ARCH: x86_64
CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: gnu
CHPL_TARGET_ARCH: x86_64
CHPL_TARGET_CPU: native
CHPL_LOCALE_MODEL: flat
CHPL_COMM: none *
  CHPL_COMM_SUBSTRATE: none
  CHPL_GASNET_SEGMENT: none
  CHPL_LIBFABRIC: none
CHPL_TASKS: fifo *
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: cstdlib *
CHPL_ATOMICS: cstdlib
  CHPL_NETWORK_ATOMICS: none
CHPL_GMP: none *
CHPL_HWLOC: none
CHPL_REGEXP: none *
CHPL_LLVM: none *
CHPL_AUX_FILESYS: none
CHPL_LIB_PIC: none
CHPL_SANITIZE: none
CHPL_SANITIZE_EXE: none
[Info] Compiling $CHPL_HOME/test/release/examples/hello6-taskpar-dist.chpl
[Info] Compiling with C backend
[Debug] Compilation exit status was 0
[Info] Test job compiled into /home/rchinmay/.chpl/chapel-test-n1pL8/hello6-taskpar-dist
[Info] $CHPL_LAUNCHER=none is compatible with test script.
[Info] Running test job.
[Info] Test job complete.
[Fail] There was an issue with the installation, test job output incorrect.
[Debug] Full test output is turned on, because $CHPL_CHECK_DEBUG == 1

== Reference Test output hello6-taskpar-dist.comm-none.good ==

Hello, world! (from locale 0 of 1)

== Actual Test Output (sorted to compare v. reference) ==

/home/rchinmay/Documents/First_PR/chapel/util/test/checkChplInstall: line 241: ./hello6-taskpar-dist: No such file or directory

== Difference ==

1c1
< Hello, world! (from locale 0 of 1)
---
> /home/rchinmay/Documents/First_PR/chapel/util/test/checkChplInstall: line 241: ./hello6-taskpar-dist: No such file or directory


== Actual Test Output (raw, with verbose) ==
/home/rchinmay/Documents/First_PR/chapel/util/test/checkChplInstall: line 277: ./hello6-taskpar-dist: No such file or directory
[Debug] Test job exit status was 127
[Debug] Exit "make check" script with status code 20
[Debug] Removing /home/rchinmay/.chpl/chapel-test-n1pL8
[Debug] Removing /home/rchinmay/.chpl
make: *** [Makefile:215: check] Error 20

Also, the hello6-taskpar-dist executable file would be created in $CHPL_HOME/ directory since the file name provided through -o flag was not reflected in executableFIlename. I could not find any other way to pass string to void* and also accept input from the pointer passed. Please help me solve this problem.
Thanks,
R. Chinmay

@cassella
Copy link
Contributor

Hmm. I think in order to have the Chapel process_arg() code be able to store into a C++ string, it needs to know it's operating on a string. As you've found you can't just write into memory to extend it.

Updating all the type=='P' options to change their char* buffers to strings would be a pretty big effort. Another option would be to add a new type representing C++ strings that knows how to set a string value, and use it for one option at a time as you convert them. process_arg() and ApplyValue() would both need to be able to handle it.

(I'd think that as long as process_arg(type='P') calls strncpy(), it should cause a fatal error if the provided argument will be truncated by the strncpy, like 'S' does. IMO the compiler shouldn't silently do something different than what the user asked.

And also both cases in ApplyValue when getting the values from environment variables.)

@rchinmay
Copy link
Contributor

@cassella Thank you for your advice. I have added a new flag type for strings with type='R' (its name may be changed later). Now the -o flag is working perfectly fine. I will also change the other char arrays with length in order of FILENAME_MAX soon.
Thanks,
R. Chinmay

@bradcray
Copy link
Member Author

bradcray commented Jun 22, 2021

I just edited the OP to note the good start at this on PRs #17059 and #17517.

@riftEmber
Copy link
Member

Looking at picking this up due to #26261. We could resume from those partially-completed PRs, but they might be old enough now that it'd just be less work to start fresh.

@riftEmber
Copy link
Member

PR'd #26357 for the compiler (not based on the partial work) and #26381 for the runtime (based on the partial work).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants