batch-vs-index-warc

See the blog post: S3 Throughput: Scans vs Indexes.

A benchmark to explore the speed of reading WARC entries in bulk vs individually.

mvn clean install assembly:single   # Build the JAR
# Read entries individually
NUM_RECORDS=100000 NUM_CORES=16 java -Xmx20g -Dhttp.maxConnections=1000 -cp target/batch-vs-index-warc-1.0-SNAPSHOT-jar-with-dependencies.jar com.code402.Single

# Read entries in bulk
NUM_RECORDS=100000 NUM_CORES=16 java -Xmx20g -Dhttp.maxConnections=1000 -cp target/batch-vs-index-warc-1.0-SNAPSHOT-jar-with-dependencies.jar com.code402.Batch
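
Both commands pass NUM_RECORDS and NUM_CORES as environment variables. This README does not show how the benchmark classes consume them, so the following is only a minimal sketch of one plausible pattern: read the two settings, then spread the record count across a fixed-size thread pool. The class name `HarnessSketch`, the helpers `envInt` and `run`, and the default values are hypothetical and are not taken from the repository; the loop body is a placeholder where a real benchmark would fetch a WARC entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical harness: illustrates the NUM_RECORDS / NUM_CORES pattern only.
public class HarnessSketch {

    // Read an integer setting from an environment map, falling back to a default.
    static int envInt(Map<String, String> env, String name, int fallback) {
        String raw = env.get(name);
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        return Integer.parseInt(raw.trim());
    }

    // Split numRecords across numCores workers and return how many were processed.
    static long run(final int numRecords, final int numCores) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numCores);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < numCores; w++) {
                final int worker = w;
                Callable<Long> task = () -> {
                    long done = 0;
                    // Worker w handles records w, w + numCores, w + 2 * numCores, ...
                    for (int i = worker; i < numRecords; i += numCores) {
                        done++; // placeholder: a real benchmark would read record i here
                    }
                    return done;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // waits for the worker to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Defaults here are assumptions chosen for the sketch.
        int numRecords = envInt(System.getenv(), "NUM_RECORDS", 1000);
        int numCores = envInt(System.getenv(), "NUM_CORES", 4);

        long start = System.nanoTime();
        long processed = run(numRecords, numCores);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("processed " + processed + " records on "
                + numCores + " threads in " + elapsedMs + " ms");
    }
}
```

Run with the same environment variables as the commands above, for example `NUM_RECORDS=100000 NUM_CORES=16 java HarnessSketch.java` on Java 11 or later.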