A benchmark to explore the speed of reading WARC entries in bulk vs individually.

code402/batch-vs-index-warc

batch-vs-index-warc

See the blog post: S3 Throughput: Scans vs Indexes.

A benchmark to explore the speed of reading WARC entries in bulk vs individually.

mvn clean install assembly:single   # Build the JAR

# Run the Single benchmark
NUM_RECORDS=100000 NUM_CORES=16 java -Xmx20g -Dhttp.maxConnections=1000 -cp target/batch-vs-index-warc-1.0-SNAPSHOT-jar-with-dependencies.jar com.code402.Single

# Run the Batch benchmark
NUM_RECORDS=100000 NUM_CORES=16 java -Xmx20g -Dhttp.maxConnections=1000 -cp target/batch-vs-index-warc-1.0-SNAPSHOT-jar-with-dependencies.jar com.code402.Batch
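
To make the "bulk vs individually" distinction concrete, here is a minimal, self-contained sketch of the two access patterns. It is not the benchmark's code. It assumes (the README does not state this) that each record is stored as its own gzip member and that an index holds each member's byte offset and length. The class `ScanVsIndexSketch`, the `Archive` type, and the methods `buildArchive`, `readOne`, and `scanAll` are hypothetical names chosen for illustration. An in-memory byte array stands in for the remote object, and slicing it stands in for a ranged read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class ScanVsIndexSketch {
    /** Hypothetical archive: concatenated gzip members plus an offset/length index. */
    static final class Archive {
        final byte[] bytes;
        final int[] offsets;
        final int[] lengths;

        Archive(byte[] bytes, int[] offsets, int[] lengths) {
            this.bytes = bytes;
            this.offsets = offsets;
            this.lengths = lengths;
        }
    }

    /** Compress each record as a separate gzip member and note where it starts and ends. */
    static Archive buildArchive(List<String> records) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int[] offsets = new int[records.size()];
        int[] lengths = new int[records.size()];
        try {
            for (int i = 0; i < records.size(); i++) {
                offsets[i] = out.size();
                // close() finishes the member; closing a ByteArrayOutputStream is a no-op.
                try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                    gz.write(records.get(i).getBytes(StandardCharsets.UTF_8));
                }
                lengths[i] = out.size() - offsets[i];
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new Archive(out.toByteArray(), offsets, lengths);
    }

    /** Decompress the given byte range as UTF-8 text. */
    private static String gunzip(byte[] data, int offset, int length) {
        try (GZIPInputStream gz =
                 new GZIPInputStream(new ByteArrayInputStream(data, offset, length))) {
            return new String(gz.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Individual read: fetch only the bytes the index points at, then decompress them. */
    static String readOne(Archive archive, int i) {
        return gunzip(archive.bytes, archive.offsets[i], archive.lengths[i]);
    }

    /** Bulk read: stream the whole archive; GZIPInputStream walks member after member. */
    static String scanAll(Archive archive) {
        return gunzip(archive.bytes, 0, archive.bytes.length);
    }

    public static void main(String[] args) {
        Archive archive = buildArchive(List.of("record-0\n", "record-1\n", "record-2\n"));
        System.out.print("indexed read of #1: " + readOne(archive, 1));
        System.out.print("full scan:\n" + scanAll(archive));
    }
}
```

Against real storage, the indexed path would issue one small ranged request per record, while the scan path would issue one large sequential read and discard what it does not need. Which one wins depends on request overhead and on how many of the records are actually wanted, which is what the benchmark sets out to measure.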
