Commit

Merge pull request #4 from zmughal-contrib/zmughal/fix-tox21-raw
Write raw file data to tox21.parquet
tomlue authored Sep 15, 2023
2 parents 5c16c50 + 01be6e5 commit 2664dc5
Showing 2 changed files with 1 addition and 11 deletions.
10 changes: 0 additions & 10 deletions stages/2_unzip.sh
@@ -25,13 +25,3 @@ cat $temppath/files.txt | tail -n +2 | xargs -P14 -n1 bash -c '
 echo '$rawpath'/$filename
 unzip -q '$downloadpath'/$1 -d '$rawpath'/$filename
 ' {}
-
-for x in $rawpath/*/; do
-echo $x
-for y in $x*; do
-echo $y
-z=$(printf %s "$y" | tr "-" "_")
-echo $z
-mv $y $z
-done
-done
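
For context, the loop removed from stages/2_unzip.sh applied tr to the whole path held in $y, so a "-" in a parent directory name was rewritten along with the file name, and none of the variables were quoted. The sketch below is not from the repository. It is a hypothetical variant, assuming the intent was only to replace "-" with "_" in the last path component of each entry one level below each subdirectory. The function name rename_hyphens and the demo names tox21-assay and tox21-ahr-p1.txt are made up for illustration.

```shell
# Hypothetical helper, not part of the repository: for every entry one level
# below each subdirectory of "$1", replace "-" with "_" in the entry's own
# name, leaving the parent directory names untouched.
rename_hyphens() {
  for entry in "$1"/*/*; do
    [ -e "$entry" ] || continue            # the glob matched nothing
    parent=${entry%/*}                     # directory part of the path
    name=${entry##*/}                      # last path component only
    new_name=$(printf %s "$name" | tr '-' '_')
    if [ "$name" != "$new_name" ]; then
      mv -- "$entry" "$parent/$new_name"
    fi
  done
}

# Demonstration in a throwaway directory (names are invented).
demo=$(mktemp -d)
mkdir -p "$demo/tox21-assay"
touch "$demo/tox21-assay/tox21-ahr-p1.txt"
rename_hyphens "$demo"
ls "$demo/tox21-assay"
```

Quoting every expansion keeps paths containing spaces intact, and splitting the path before calling tr keeps the mv target inside a directory that already exists.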
2 changes: 1 addition & 1 deletion stages/3_write_brick.R
@@ -20,4 +20,4 @@ rawfiles <- discard(rawfiles,~grepl("description",.x))
 rawtable <- map(rawfiles,~readr::read_tsv(.x))
 rawtable <- keep(rawtable,~nrow(.x)>0)
 rawmerge <- bind_rows(rawtable)
-arrow::write_parquet(aggmerge,"brick/tox21.parquet")
+arrow::write_parquet(rawmerge,"brick/tox21.parquet")
