
Commit

update readme
Zilong-Li committed May 7, 2024
1 parent f9eb583 commit 7059964
Showing 6 changed files with 12 additions and 33 deletions.
9 changes: 5 additions & 4 deletions README.Rmd
@@ -43,7 +43,7 @@ remotes::install_github("Zilong-Li/vcfppR") ## from latest github

If you find it useful, please cite the [paper](https://doi.org/10.1093/bioinformatics/btae049)

``` {r cite}
```r
library(vcfppR)
citation("vcfppR")
```
@@ -64,6 +64,7 @@ popfile <- "https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_25
Want to investigate the concordance between two VCF files? `vcfcomp` is the utility function you need! For example, in benchmarkings, we intend to calculate the genotype correlation between the test and the truth.
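
As an aside to this hunk, here is a rough base-R sketch of what a genotype correlation (r2) between test and truth measures at a single site. The dosage vectors are hypothetical toy data, and this is only an illustration of the statistic under that assumption, not how `vcfcomp` computes it internally.

```r
# Hypothetical toy data: alternate-allele dosages (0, 1, 2) for six samples at one site.
truth_dosage <- c(0, 1, 2, 1, 0, 2)
test_dosage  <- c(0, 1, 2, 0, 0, 2)  # one sample is miscalled (1 -> 0)

# Squared Pearson correlation between the test and truth dosages.
r2 <- cor(test_dosage, truth_dosage)^2
print(round(r2, 3))
```

A value close to 1 means the test genotypes track the truth closely; the single miscall above pulls the value below 1.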

```{r r2}
+library(vcfppR)
res <- vcfcomp(test = rawvcf, truth = phasedvcf,
region = "chr21:1-5100000", stats = "r2",
formats = c("GT","GT"))
@@ -95,21 +96,21 @@ vcfplot(res, main = "Structure Variant Counts", col = 1:7)

`vcftable` gives you fine control over what you want to extract from VCF/BCF files.

-**Read SNP variants**
+**Read SNP variants with GT format**

```{r snp}
res <- vcftable(phasedvcf, "chr21:1-5100000", vartype = "snps")
str(res)
```

-**Read SNP variants with PL format and drop the INFO column in the VCF/BCF**
+**Read SNP variants with PL format and drop the INFO column**

```{r pl}
res <- vcftable(rawvcf, "chr21:1-5100000", vartype = "snps", format = "PL", info = FALSE)
str(res)
```
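
As an aside to the PL example in this hunk, the sketch below shows how phred-scaled genotype likelihoods (PL) map back to normalized likelihoods. The PL vector is hypothetical toy data for one sample at a biallelic site; how `vcftable` lays out PL values in its result is not assumed here.

```r
# Hypothetical PL values for one sample, ordered as genotypes 0/0, 0/1, 1/1.
pl <- c(0, 30, 200)

# PL is phred-scaled: PL = -10 * log10(likelihood), shifted so the best genotype has PL 0.
lik <- 10^(-pl / 10)

# Normalize so the three genotype likelihoods sum to 1.
gl <- lik / sum(lik)
print(signif(gl, 3))
```

The genotype with PL 0 always receives the largest normalized likelihood, and a larger PL means a less likely genotype.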

-**Read INDEL variants with DP format in the VCF/BCF**
+**Read INDEL variants with DP format**

```{r indel}
res <- vcftable(rawvcf, "chr21:1-5100000", vartype = "indels", format = "DP")
30 changes: 4 additions & 26 deletions README.md
@@ -41,28 +41,6 @@ If you find it useful, please cite the
``` r
library(vcfppR)
citation("vcfppR")
-#> To cite package 'vcfppR' in publications use:
-#>
-#> Li Z (2024). "vcfpp: a C++ API for rapid processing of the variant
-#> call format." _Bioinformatics_, *40*(2), btae049. ISSN 1367-4811,
-#> doi:10.1093/bioinformatics/btae049
-#> <https://doi.org/10.1093/bioinformatics/btae049>.
-#>
-#> A BibTeX entry for LaTeX users is
-#>
-#> @Article{,
-#> title = {vcfpp: a C++ API for rapid processing of the variant call format},
-#> author = {Zilong Li},
-#> journal = {Bioinformatics},
-#> volume = {40},
-#> number = {2},
-#> pages = {btae049},
-#> year = {2024},
-#> month = {01},
-#> issn = {1367-4811},
-#> abstract = {Given the widespread use of the variant call format (VCF/BCF) coupled with continuous surge in big data, there remains a perpetual demand for fast and flexible methods to manipulate these comprehensive formats across various programming languages.This work presents vcfpp, a C++ API of HTSlib in a single file, providing an intuitive interface to manipulate VCF/BCF files rapidly and safely, in addition to being portable. Moreover, this work introduces the vcfppR package to demonstrate the development of a high-performance R package with vcfpp, allowing for rapid and straightforward variants analyses.vcfpp is available from https://github.com/Zilong-Li/vcfpp under MIT license. vcfppR is available from https://cran.r-project.org/web/packages/vcfppR.},
-#> doi = {10.1093/bioinformatics/btae049},
-#> }
```

## URL as filename
@@ -83,6 +61,7 @@ the utility function you need\! For example, in benchmarkings, we intend
to calculate the genotype correlation between the test and the truth.

``` r
+library(vcfppR)
res <- vcfcomp(test = rawvcf, truth = phasedvcf,
region = "chr21:1-5100000", stats = "r2",
formats = c("GT","GT"))
@@ -126,7 +105,7 @@ vcfplot(res, main = "Structure Variant Counts", col = 1:7)
`vcftable` gives you fine control over what you want to extract from
VCF/BCF files.

-**Read SNP variants**
+**Read SNP variants with GT format**

``` r
res <- vcftable(phasedvcf, "chr21:1-5100000", vartype = "snps")
@@ -145,8 +124,7 @@ str(res)
#> - attr(*, "class")= chr "vcftable"
```

-**Read SNP variants with PL format and drop the INFO column in the
-VCF/BCF**
+**Read SNP variants with PL format and drop the INFO column**

``` r
res <- vcftable(rawvcf, "chr21:1-5100000", vartype = "snps", format = "PL", info = FALSE)
@@ -165,7 +143,7 @@ str(res)
#> - attr(*, "class")= chr "vcftable"
```

-**Read INDEL variants with DP format in the VCF/BCF**
+**Read INDEL variants with DP format**

``` r
res <- vcftable(rawvcf, "chr21:1-5100000", vartype = "indels", format = "DP")
2 changes: 1 addition & 1 deletion docs/index.html

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion docs/pkgdown.yml
@@ -3,5 +3,5 @@ pkgdown: 2.0.9
pkgdown_sha: ~
articles:
concordance-of-two-vcf-files: concordance-of-two-vcf-files.html
-last_built: 2024-05-07T13:49Z
+last_built: 2024-05-07T15:08Z

Binary file modified docs/reference/figures/README-summary_sm-1.png
2 changes: 1 addition & 1 deletion docs/search.json

Large diffs are not rendered by default.
