-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Features & build cleanups #189
Conversation
Pull Request Test Coverage Report for Build f1d87733566eed8fa74d89ca325696057da832f6-PR-189
💛 - Coveralls |
Thanks for this @pmarks! It seems that it is now failing (against master w/ htslib 1.10.x) because of https://github.com/rust-bio/rust-htslib/runs/729762497?check_suite_focus=true#step:9:114 I'll try to get my PR finished and passing (#193) ASAP so that this doesn't become a blocker to merge this PR. |
@brainstorm I have |
Woah, great work @pmarks, checking it out right now... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great refactor Patrick, thanks for that! A bit concerned about functionality from hfile_s3.c
and hfile_gcs.c
potentially missing though? Personally, I would like to have those enabled on the MUSL builds at least.
@pmarks @brainstorm I trust your judgement here, as you are both way better in building C than me. Feel free to merge whenever you feel this is done. |
OK @johanneskoester @brainstorm @nlhepler, this should be ready to go. Last question what do you think about taking In our experience compiling bindgen then rust-htslib become a huge part of the critical path of build time. I think there's also some machines that don't have the required clang installation that would now work out of the box. I'm not aware of any downsides. |
I'm very happy with shorter build times indeed... only thing I can think of is whether we will face any kind of operational/caching issue re-generating those prebuilt bindings in the future? I guess it should be mentioned somewhere in docs? Slightly OT but relevant since the rebuilding of
Equivalent C code fine though: $ cat s3_htslib.c
#include <stdio.h>
#include <stdlib.h>
#include <htslib/hfile.h>
#include <htslib/hts.h>
#include <htslib/sam.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char **argv)
{
samFile *fp_in = hts_open(argv[1],"r"); //open bam file
bam_hdr_t *bamHdr = sam_hdr_read(fp_in); //read header
printf("%s\n", bamHdr->target_name[1]);
return 0;
}
$ ./s3_htslib s3://test-bucket/test.bam
2
$ Would you mind double-checking that the new |
Just noticed that the
|
@brainstorm - htslib is causing curl 7.70 to generate err = CURLE_UNKNOWN_OPTION while setting an FTP option here: This is because the I'm not getting a 403 back from sw, which is the same as what happens in my browser. Do you have crdentials for that test bucket that we can put into a test? |
@brainstorm I added a simple s3 test to the bucket you mentioned - can you add creds? I can get it with firefox, but I'm getting access when I put |
TIL: I would have never imagined that htslib would actually require FTP bits, always thought of S3 as HTTP-only protocol, nice forensics @pmarks 👍 I was testing with an internal private bucket/object so I cannot publish that, so let's stick with the public 1000genomes one? Unfortunately there's still something else off after activating FTP support on libcurl: $ cross test --release --features="s3" --target x86_64-unknown-linux-musl
(... all tests ok until...)
[E::hts_open_format] Failed to open file "s3://1000genomes/1000G_2504_high_coverage/additional_698_related/data/ERR3989458/NA20358.final.cram" : Input/output error
test bam::tests::test_s3_connect ... FAILED
failures:
---- bam::tests::test_s3_connect stdout ----
Err(
Open {
target: "s3://1000genomes/1000G_2504_high_coverage/additional_698_related/data/ERR3989458/NA20358.final.cram",
},
)
thread 'bam::tests::test_s3_connect' panicked at 'called `Result::unwrap()` on an `Err` value: Open { target: "s3://1000genomes/1000G_2504_high_coverage/additional_698_related/data/ERR3989458/NA20358.final.cram" }', src/bam/mod.rs:1823:18
failures:
bam::tests::test_s3_connect
test result: FAILED. 74 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
error: test failed, to rerun pass '--lib' Same with MUSL:
... could be that CRAM over S3 is not supported via plain |
@pmarks Found a suitable S3-BAM public dataset:
Same effect with BAM though: |
@brainstorm ok I'm making prebuilt bindings the default. I added a note to the README about it. I added a test to access a BAM file via HTTP. I think the s3 support is basically there, but I think there's various details about authentication that the user needs to sort out with env vars and url params. |
100% agree, merge this, S3 support can come right after, thanks @pmarks, great work here indeed! |
Weird, it works fine for me (at least just printing the BAM header:
It does work as well for private buckets that need priv/pubkeys....anyway, we'll get to the bottom of this in any case ;) |
Hi @pmarks, are the Context: Pull request #214 adds faidx bindings and compilation works well with PS: I like #189 since I had quite some trouble with building htslib on a CentOS with old bindgen :-) |
@pmarks I believe I'm facing a similar issue with the bindgen prebuilt bindings that Manuel found here... could you document the prebuilding of bindgen a bit more thoroughly under hts-sys? I tried to just diff --git a/hts-sys/osx_prebuilt_bindings.rs b/hts-sys/osx_prebuilt_bindings.rs
index d0521c1..c0dbeff 100644
--- a/hts-sys/osx_prebuilt_bindings.rs
+++ b/hts-sys/osx_prebuilt_bindings.rs
@@ -344,7 +344,6 @@ pub const __MAC_10_14_1: u32 = 101401;
pub const __MAC_10_14_4: u32 = 101404;
pub const __MAC_10_15: u32 = 101500;
pub const __MAC_10_15_1: u32 = 101501;
-pub const __MAC_10_15_4: u32 = 101504;
pub const __IPHONE_2_0: u32 = 20000;
pub const __IPHONE_2_1: u32 = 20100;
pub const __IPHONE_2_2: u32 = 20200;
@@ -386,10 +385,6 @@ pub const __IPHONE_12_3: u32 = 120300;
pub const __IPHONE_13_0: u32 = 130000;
pub const __IPHONE_13_1: u32 = 130100;
pub const __IPHONE_13_2: u32 = 130200;
-pub const __IPHONE_13_3: u32 = 130300;
-pub const __IPHONE_13_4: u32 = 130400;
-pub const __IPHONE_13_5: u32 = 130500;
-pub const __IPHONE_13_6: u32 = 130600;
pub const __TVOS_9_0: u32 = 90000;
pub const __TVOS_9_1: u32 = 90100;
pub const __TVOS_9_2: u32 = 90200;
@@ -407,9 +402,7 @@ pub const __TVOS_12_1: u32 = 120100;
pub const __TVOS_12_2: u32 = 120200;
pub const __TVOS_12_3: u32 = 120300;
pub const __TVOS_13_0: u32 = 130000;
-pub const __TVOS_13_2: u32 = 130200;
-pub const __TVOS_13_3: u32 = 130300;
-pub const __TVOS_13_4: u32 = 130400;
+pub const __TVOS_13_1: u32 = 130100;
pub const __WATCHOS_1_0: u32 = 10000;
pub const __WATCHOS_2_0: u32 = 20000;
pub const __WATCHOS_2_1: u32 = 20100;
@@ -426,8 +419,7 @@ pub const __WATCHOS_5_0: u32 = 50000;
pub const __WATCHOS_5_1: u32 = 50100;
pub const __WATCHOS_5_2: u32 = 50200;
pub const __WATCHOS_6_0: u32 = 60000;
-pub const __WATCHOS_6_1: u32 = 60100;
-pub const __WATCHOS_6_2: u32 = 60200;
+pub const __WATCHOS_6_0_1: u32 = 60001;
pub const __DRIVERKIT_19_0: u32 = 190000;
pub const __MAC_OS_X_VERSION_MAX_ALLOWED: u32 = 101500;
pub const __ENABLE_LEGACY_MAC_AVAILABILITY: u32 = 1;
@@ -3246,7 +3238,7 @@ extern "C" {
) -> ::std::os::raw::c_int;
}
extern "C" {
- pub static seq_nt16_table: [::std::os::raw::c_uchar; 256usize];
+ pub static mut seq_nt16_table: [::std::os::raw::c_uchar; 256usize];
}
extern "C" {
pub static mut seq_nt16_str: [::std::os::raw::c_char; 0usize];
@@ -14561,13 +14553,6 @@ fn bindgen_test_layout_fd_set() {
)
);
}
-extern "C" {
- pub fn __darwin_check_fd_set_overflow(
- arg1: ::std::os::raw::c_int,
- arg2: *const ::std::os::raw::c_void,
- arg3: ::std::os::raw::c_int,
- ) -> ::std::os::raw::c_int;
-}
pub type fd_mask = __int32_t;
pub type pthread_cond_t = __darwin_pthread_cond_t;
pub type pthread_condattr_t = __darwin_pthread_condattr_t;
@@ -16436,7 +16421,7 @@ fn bindgen_test_layout_sam_hdr_t() {
}
pub type bam_hdr_t = sam_hdr_t;
extern "C" {
- pub static bam_cigar_table: [i8; 256usize];
+ pub static mut bam_cigar_table: [i8; 256usize];
}
#[repr(C)]
#[derive(Debug, Copy, Clone)]
@@ -19423,277 +19408,6 @@ fn bindgen_test_layout_kbitset_iter_t() {
)
);
}
-#[repr(C)]
-#[derive(Debug, Copy, Clone)]
-pub struct __faidx_t {
- _unused: [u8; 0],
-}
-pub type faidx_t = __faidx_t;
-pub const fai_format_options_FAI_NONE: fai_format_options = 0;
-pub const fai_format_options_FAI_FASTA: fai_format_options = 1;
-pub const fai_format_options_FAI_FASTQ: fai_format_options = 2;
-pub type fai_format_options = u32;
-extern "C" {
- pub fn fai_build3(
- fn_: *const ::std::os::raw::c_char,
- fnfai: *const ::std::os::raw::c_char,
- fngzi: *const ::std::os::raw::c_char,
- ) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn fai_build(fn_: *const ::std::os::raw::c_char) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn fai_destroy(fai: *mut faidx_t);
-}
-pub const fai_load_options_FAI_CREATE: fai_load_options = 1;
-pub type fai_load_options = u32;
-extern "C" {
- pub fn fai_load3(
- fn_: *const ::std::os::raw::c_char,
- fnfai: *const ::std::os::raw::c_char,
- fngzi: *const ::std::os::raw::c_char,
- flags: ::std::os::raw::c_int,
- ) -> *mut faidx_t;
-}
-extern "C" {
- pub fn fai_load(fn_: *const ::std::os::raw::c_char) -> *mut faidx_t;
-}
-extern "C" {
- pub fn fai_load3_format(
- fn_: *const ::std::os::raw::c_char,
- fnfai: *const ::std::os::raw::c_char,
- fngzi: *const ::std::os::raw::c_char,
- flags: ::std::os::raw::c_int,
- format: fai_format_options,
- ) -> *mut faidx_t;
-}
-extern "C" {
- pub fn fai_load_format(
- fn_: *const ::std::os::raw::c_char,
- format: fai_format_options,
- ) -> *mut faidx_t;
-}
-extern "C" {
- pub fn fai_fetch(
- fai: *const faidx_t,
- reg: *const ::std::os::raw::c_char,
- len: *mut ::std::os::raw::c_int,
- ) -> *mut ::std::os::raw::c_char;
-}
-extern "C" {
- pub fn fai_fetch64(
- fai: *const faidx_t,
- reg: *const ::std::os::raw::c_char,
- len: *mut hts_pos_t,
- ) -> *mut ::std::os::raw::c_char;
-}
-extern "C" {
- pub fn fai_fetchqual(
- fai: *const faidx_t,
- reg: *const ::std::os::raw::c_char,
- len: *mut ::std::os::raw::c_int,
- ) -> *mut ::std::os::raw::c_char;
-}
-extern "C" {
- pub fn fai_fetchqual64(
- fai: *const faidx_t,
- reg: *const ::std::os::raw::c_char,
- len: *mut hts_pos_t,
- ) -> *mut ::std::os::raw::c_char;
-}
-extern "C" {
- pub fn faidx_fetch_nseq(fai: *const faidx_t) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn faidx_fetch_seq(
- fai: *const faidx_t,
- c_name: *const ::std::os::raw::c_char,
- p_beg_i: ::std::os::raw::c_int,
- p_end_i: ::std::os::raw::c_int,
- len: *mut ::std::os::raw::c_int,
- ) -> *mut ::std::os::raw::c_char;
-}
-extern "C" {
- pub fn faidx_fetch_seq64(
- fai: *const faidx_t,
- c_name: *const ::std::os::raw::c_char,
- p_beg_i: hts_pos_t,
- p_end_i: hts_pos_t,
- len: *mut hts_pos_t,
- ) -> *mut ::std::os::raw::c_char;
-}
-extern "C" {
- pub fn faidx_fetch_qual(
- fai: *const faidx_t,
- c_name: *const ::std::os::raw::c_char,
- p_beg_i: ::std::os::raw::c_int,
- p_end_i: ::std::os::raw::c_int,
- len: *mut ::std::os::raw::c_int,
- ) -> *mut ::std::os::raw::c_char;
-}
-extern "C" {
- pub fn faidx_fetch_qual64(
- fai: *const faidx_t,
- c_name: *const ::std::os::raw::c_char,
- p_beg_i: hts_pos_t,
- p_end_i: hts_pos_t,
- len: *mut hts_pos_t,
- ) -> *mut ::std::os::raw::c_char;
-}
-extern "C" {
- pub fn faidx_has_seq(
- fai: *const faidx_t,
- seq: *const ::std::os::raw::c_char,
- ) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn faidx_nseq(fai: *const faidx_t) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn faidx_iseq(
- fai: *const faidx_t,
- i: ::std::os::raw::c_int,
- ) -> *const ::std::os::raw::c_char;
-}
-extern "C" {
- pub fn faidx_seq_len(
- fai: *const faidx_t,
- seq: *const ::std::os::raw::c_char,
- ) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn fai_parse_region(
- fai: *const faidx_t,
- s: *const ::std::os::raw::c_char,
- tid: *mut ::std::os::raw::c_int,
- beg: *mut hts_pos_t,
- end: *mut hts_pos_t,
- flags: ::std::os::raw::c_int,
- ) -> *const ::std::os::raw::c_char;
-}
-extern "C" {
- pub fn fai_set_cache_size(fai: *mut faidx_t, cache_size: ::std::os::raw::c_int);
-}
-#[repr(C)]
-#[derive(Debug, Copy, Clone)]
-pub struct hts_tpool_process {
- _unused: [u8; 0],
-}
-#[repr(C)]
-#[derive(Debug, Copy, Clone)]
-pub struct hts_tpool_result {
- _unused: [u8; 0],
-}
-extern "C" {
- pub fn hts_tpool_init(n: ::std::os::raw::c_int) -> *mut hts_tpool;
-}
-extern "C" {
- pub fn hts_tpool_size(p: *mut hts_tpool) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn hts_tpool_dispatch(
- p: *mut hts_tpool,
- q: *mut hts_tpool_process,
- func: ::std::option::Option<
- unsafe extern "C" fn(arg: *mut ::std::os::raw::c_void) -> *mut ::std::os::raw::c_void,
- >,
- arg: *mut ::std::os::raw::c_void,
- ) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn hts_tpool_dispatch2(
- p: *mut hts_tpool,
- q: *mut hts_tpool_process,
- func: ::std::option::Option<
- unsafe extern "C" fn(arg: *mut ::std::os::raw::c_void) -> *mut ::std::os::raw::c_void,
- >,
- arg: *mut ::std::os::raw::c_void,
- nonblock: ::std::os::raw::c_int,
- ) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn hts_tpool_dispatch3(
- p: *mut hts_tpool,
- q: *mut hts_tpool_process,
- exec_func: ::std::option::Option<
- unsafe extern "C" fn(arg: *mut ::std::os::raw::c_void) -> *mut ::std::os::raw::c_void,
- >,
- arg: *mut ::std::os::raw::c_void,
- job_cleanup: ::std::option::Option<unsafe extern "C" fn(arg: *mut ::std::os::raw::c_void)>,
- result_cleanup: ::std::option::Option<
- unsafe extern "C" fn(data: *mut ::std::os::raw::c_void),
- >,
- nonblock: ::std::os::raw::c_int,
- ) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn hts_tpool_wake_dispatch(q: *mut hts_tpool_process);
-}
-extern "C" {
- pub fn hts_tpool_process_flush(q: *mut hts_tpool_process) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn hts_tpool_process_reset(
- q: *mut hts_tpool_process,
- free_results: ::std::os::raw::c_int,
- ) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn hts_tpool_process_qsize(q: *mut hts_tpool_process) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn hts_tpool_destroy(p: *mut hts_tpool);
-}
-extern "C" {
- pub fn hts_tpool_kill(p: *mut hts_tpool);
-}
-extern "C" {
- pub fn hts_tpool_next_result(q: *mut hts_tpool_process) -> *mut hts_tpool_result;
-}
-extern "C" {
- pub fn hts_tpool_next_result_wait(q: *mut hts_tpool_process) -> *mut hts_tpool_result;
-}
-extern "C" {
- pub fn hts_tpool_delete_result(r: *mut hts_tpool_result, free_data: ::std::os::raw::c_int);
-}
-extern "C" {
- pub fn hts_tpool_result_data(r: *mut hts_tpool_result) -> *mut ::std::os::raw::c_void;
-}
-extern "C" {
- pub fn hts_tpool_process_init(
- p: *mut hts_tpool,
- qsize: ::std::os::raw::c_int,
- in_only: ::std::os::raw::c_int,
- ) -> *mut hts_tpool_process;
-}
-extern "C" {
- pub fn hts_tpool_process_destroy(q: *mut hts_tpool_process);
-}
-extern "C" {
- pub fn hts_tpool_process_empty(q: *mut hts_tpool_process) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn hts_tpool_process_len(q: *mut hts_tpool_process) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn hts_tpool_process_sz(q: *mut hts_tpool_process) -> ::std::os::raw::c_int;
-}
-extern "C" {
- pub fn hts_tpool_process_shutdown(q: *mut hts_tpool_process);
-}
-extern "C" {
- pub fn hts_tpool_process_attach(p: *mut hts_tpool, q: *mut hts_tpool_process);
-}
-extern "C" {
- pub fn hts_tpool_process_detach(p: *mut hts_tpool, q: *mut hts_tpool_process);
-}
-extern "C" {
- pub fn hts_tpool_process_ref_incr(q: *mut hts_tpool_process);
-}
-extern "C" {
- pub fn hts_tpool_process_ref_decr(q: *mut hts_tpool_process);
-}
extern "C" {
#[link_name = "\u{1}_wrap_kbs_init2"]
pub fn kbs_init2(ni: size_t, fill: ::std::os::raw::c_int) -> *mut kbitset_t; Pregenerating a correct |
cc
and aconfig.h
generated bybuild.rs
to compile htsliblibdeflate
and pipe that through to htslib.TODO:
libdeflater
once it's updated with required changes. (PR at add build.rs outputs and link directive adamkewley/libdeflater#4)