Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R-package] Introduce R-specific control of lib_lightgbm #2837

Merged
merged 4 commits into from
Mar 10, 2020

Conversation

jameslamb
Copy link
Collaborator

I've been working on #1440 / #1909 (as part of #629 ), and am still struggling with it. But I think that a small part of what I've done can be introduced right now, and that it actually might be easier to discuss it as its own thing.

I'm currently working on this NOTE from R CMD CHECK:

Note: information on .o files is not available
File ‘/Users/jlamb/repos/lgb-logging-thing/LightGBM/lightgbm.Rcheck/lightgbm/libs/lib_lightgbm.so’:
  Found ‘___assert_rtn’, possibly from ‘assert’ (C)
  Found ‘___stderrp’, possibly from ‘stderr’ (C)
  Found ‘___stdoutp’, possibly from ‘stdout’ (C)
  Found ‘_printf’, possibly from ‘printf’ (C)
  Found ‘_putchar’, possibly from ‘putchar’ (C)
  Found ‘_vprintf’, possibly from ‘vprintf’ (C)

Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console, nor use Fortran I/O
nor system RNGs. The detected symbols are linked into the code but
might come from libraries and not actually be called.

Similar to the way XGBoost solved this, I'd like to introduce a new define that tells the compiler "hey lib_lightgbm is being compiled for use with the R package".

I first tried to do this to introduce Rprintf(), the R equivalent to printf(), in log.h. I've been running into some complications there that will take some more time to address, so for this PR I'd like to focus on one isolated case not related to logging.

Found ‘___assert_rtn’, possibly from ‘assert’ (C)

This comes from src/io/json11.cpp. This PR uses -DLGB_R_BUILD to say "exclude cassert if you're building the R package"...and I can confirm it works! After running the code below:

export CXX=/usr/local/bin/g++-8
export CC=/usr/local/bin/gcc-8
Rscript build_r.R
R CMD CHECK lightgbm_2.3.2.tar.gz --as-cran --no-manual

I get the following output (notice that assert is no longer mentioned!):

Note: information on .o files is not available
File ‘/Users/jlamb/repos/lgb-logging-thing/LightGBM/lightgbm.Rcheck/lightgbm/libs/lib_lightgbm.so’:
  Found ‘___stderrp’, possibly from ‘stderr’ (C)
  Found ‘___stdoutp’, possibly from ‘stdout’ (C)
  Found ‘_printf’, possibly from ‘printf’ (C)
  Found ‘_putchar’, possibly from ‘putchar’ (C)
  Found ‘_vprintf’, possibly from ‘vprintf’ (C)

That command also runs all the R tests, so this should pass CI and doesn't seem to be affecting the R package's functionality.

If this PR is approved and merged, then the next thing I'll work on is log.h, where for example I'll try to make something like this work:

#include <R.h>

static void Write(LogLevel level, const char* level_str, const char *format, va_list val) {
    if (level <= GetLevel()) {  // omit the message with low level
      // write to STDOUT
      #ifndef LGB_R_BUILD
        printf("[LightGBM] [%s] ", level_str);
        vprintf(format, val);
        printf("\n");
      #else
        Rprintf("[LightGBM] [%s] ", level_str);
        Rvprintf(format, val);
        Rprintf("\n");
      #endif
      fflush(stdout);
    }
}

Thanks for your time!

@guolinke
Copy link
Collaborator

guolinke commented Feb 28, 2020

thanks @jameslamb !
BTW, should we add something for pre-compiled dll?
I mean, the build flag, and related documentation for the pre-compiled dll for R package.

@StrikerRUS
Copy link
Collaborator

Just want to say that I'm really excited to see we have some progress on #629! Thank you James!

@StrikerRUS
Copy link
Collaborator

Can't we achieve the same goal via CMake options to not play around with export? I'm asking because afraid that export inside the system call introduce side effects for user's environment, doesn't it? Or it only affects one particular R session?

if (R_ver >= 3.5) {
cmake_cmd <- paste0(cmake_cmd, " -DUSE_R35=ON ")
}

@jameslamb
Copy link
Collaborator Author

Can't we achieve the same goal via CMake options to not play around with export? I'm asking because afraid that export inside the system call introduce side effects for user's environment, doesn't it? Or it only affects one particular R session?

if (R_ver >= 3.5) {
cmake_cmd <- paste0(cmake_cmd, " -DUSE_R35=ON ")
}

system() spins off an isolated shell session that doesn't impact the calling environment:

image

BUT you're right, passing the argument to cmake directly would be cleaner, and would be portable to Windows. Let me try that!

@jameslamb
Copy link
Collaborator Author

thanks @jameslamb !
BTW, should we add something for pre-compiled dll?
I mean, the build flag, and related documentation for the pre-compiled dll for R package.

Can you point me to the code where we build the dll / so files that we include with each release? I'd be happy to update that code / documentation to show how to build a version of the library specifically for R (with this flag).

But I don't think we need to do anything in install.libs.R...by the time you reach that script, you have to already have the pre-compiled dll in hand. Also I'll add a note on #629 to this effect, but I think our first priority should be getting R source installs working in a CRAN-compliant way, and then work back to pre-compiled.

@jameslamb jameslamb mentioned this pull request Feb 28, 2020
12 tasks
@StrikerRUS
Copy link
Collaborator

@guolinke

BTW, should we add something for pre-compiled dll?
I mean, the build flag, and related documentation for the pre-compiled dll for R package.

I think we will publish a separate artifacts in our Releases which will be sent to CRAN. But I guess it should be done after closing #629. Or am I misunderstand something and you are speaking about different things?

Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirm that one item from a note is eliminated! 🎉

@guolinke
Copy link
Collaborator

I mean, sometimes user will build the dll first, then install python /R packages. After this PR, I think python/R will not share the same dll. So we need the build options to build R dll, maybe including msvc project file.

@jameslamb
Copy link
Collaborator Author

I mean, sometimes user will build the dll first, then install python /R packages. After this PR, I think python/R will not share the same dll. So we need the build options to build R dll, maybe including msvc project file.

yep, makes sense! I agree, from this point we should have a different lib_lightgbm.so / lib_lightgbm.dll based on whether or not you are using the R package.

Another option we could give is an R package .tar.gz that just has that library in it.

@StrikerRUS to answer your question...yes CRAN allows for precompiled packages, but you cannot publish them yourself. That is done for security reasons, so that CRAN tightly controls what software people can get into your environment via CRAN.

We have to have a working source distribution (one that gets compiled when users download and install it) first, which means yes we have to close #629 first.

From the CRAN policies:

Binary packages are not accepted from maintainers: CRAN will only host binary packages prepared by those responsible for the binary areas. Their packages are made automatically by batch jobs and can take a day or two to appear on the CRAN master site (maybe longer to reach CRAN mirrors).

@Laurae2
Copy link
Contributor

Laurae2 commented Mar 7, 2020

@jameslamb According to my lgbdl package, the precompiled library should be at the root of the LightGBM folder.

The file is then checked by install.libs.R if use_precompile is TRUE.

@jameslamb
Copy link
Collaborator Author

@jameslamb According to my lgbdl package, the precompiled library should be at the root of the LightGBM folder.

The file is then checked by install.libs.R if use_precompile is TRUE.

perfect, thanks

@jameslamb
Copy link
Collaborator Author

I'm going to rebase to master and rebuild just to be sure, then I'll merge this. Thanks for the reviews!

@jameslamb jameslamb merged commit 17b8ce1 into microsoft:master Mar 10, 2020
@jameslamb jameslamb deleted the feat/r-specific branch March 25, 2020 20:01
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants