Research on sorting, esp. related to caching and branch prediction
pbiggar/sorting-branches-caching
This file describes the experimental framework used in the technical report "Sorting in the Presence of Branch Prediction and Caches", by Paul Biggar and David Gregg, and the paper "An Experimental Study of Sorting and Branch Prediction", by Paul Biggar, Nicholas Nash, Kevin Williams and David Gregg.

NB: Most of the scripts and programs used here are hard coded, and will only work from the working directories specified below.

There are 4 sets of tests: testing the sorts to see if they work, hardware performance counters, software branch predictors and simplescalar tests. Most of the results are included in this package in their rawest form. It is necessary to generate some graphs and process some results. Commands which must be run to be able to create the docs are marked with a * below.

main.c:

Enter the code/ directory. In main.c there is a list of sorts to be tested (look for 'test_sort'). The #defines TEST_MIN and TEST_MAX control how many keys are to be tested. Each sort is tested on an array of every size between the min and max, and that size is also used as the seed for the random number generator.

The #define RANDOM_SIZE controls time_sort(), which gives a quick tick count of how long it takes to run a sort. This timing is not very official, at all. RANDOM_SIZE also defines the size of the array used for the software predictors, which is official.

The #define RUN_VISUAL decides if the visual tests are being done. These were used in development, and aren't that interesting.

Hardware counters:

In the code/ directory, call do_all_papiex or do_all_perfex. You need a patched kernel for this. See: http://icl.cs.utk.edu/papi/. Capture the output.
Press Ctrl-C when bored, or when you need results, as this will take a long time (3 days or so on a Pentium 4 1.8GHz):

    $ ./do_all_papiex > papiex_results

or

    $ ./do_all_perfex > perfex_results

If you use papiex (more accurate), then its output needs to be converted to the other format:

    $ ../scripts/papi_convert papiex_results > perfex_results

Then process the results to give you graphs:

    $ ../scripts/process_perfex_output perfex_results

There's a little bit of hard coding in this script, so change $key_num to the number of times perfex actually ran. Also modify the input so that only completed work cycles are there. This means if the last sort you ran was selectsort, and it ran 256 times, but other sorts ran 257 times, remove the sort results for the 257th iteration from the results file. Additionally, this requires results from 'do_nothing', or else the output will be empty.

This produces a file in the data/ directory called processed_cycles_data, which is in gnuplot format. If you add more sorts to the framework, you'll need to edit the gnuplot scripts in docs/*/scripts/ with the column numbers that are printed out by process_perfex_output.

To create the gnuplot images, enter the data/perfex_gnuplot_scripts/ directory and run:

  * $ gnuplot *

The images appear in the data/plots/ directory.

Software branch predictors:

To run the branch predictors, enter the code/ directory, and ensure that -D_USE_SOFTWARE_PREDICTOR is set in the CFLAGS line in the Makefile. Then:

    $ make
    $ ./fastsort

The software counters will appear in data/counters/. To process these, from the data/ directory, run (after unbzipping some files):

  * $ ../scripts/process_counter counters/*

Warnings from process_counter probably mean that the .bz2 files weren't unzipped. This will print out the maximum branches. If this isn't slightly lower than the y-axis in the gnuplot scripts, it will need to be changed (on line 8 of process_counter, then rerun it).
This creates the gnuplot data files in data/counter_gnuplot_data and gnuplot scripts in the data/counter_gnuplot_generated_scripts/ directory. To create the images, enter the data/ directory and run:

  * $ for i in counter_gnuplot_generated_scripts/*; do gnuplot $i; done

The images are put into the data/plots directory.

Simplescalar:

If your simplescalar binaries are big-endian (if gcc is ssbig-na-sstrix-gcc), then edit the $CC variable in scripts/do_simple to reflect this. In the code/ directory, run:

    $ ./do_all_simple

This will run do_simple 10 times per sort, using data/BIG as its input. Simplescalar's output is captured by the scripts, edited and output in a more human-readable format, in data/simple_logs. To convert this to gnuplot format, run:

  * $ ../scripts/do_all_convert

To create the gnuplot scripts for these files, run:

  * $ ../scripts/do_all_plot

Despite its name, do_all_plot doesn't actually run gnuplot. do_all_convert creates gnuplot data files in data/converted_simple_logs/, and do_all_plot creates gnuplot scripts for each sort type and each sort, stored in data/simplescalar_generated_gnuplot_scripts. To create the images, enter the data/ directory and run:

  * $ gnuplot simplescalar_generated_gnuplot_scripts/*

Documents:

There are three documents in docs/. tech_report/ is the technical report incorporating everything you see here. branch_paper/ is the first paper, mostly discussing branch prediction. To create the papers, run the following, again in the respective directories:

  * $ make

All the papers use plots from the data/plots/ directory, and have sym-links to it. Some of the plots might be slightly and subtly different, so it's best to enter the docs/*/scripts directory first, and run:

  * $ ./gnuplot_dir

Running make will not check the plots as a dependency, so to resolve this it will be necessary to run make again:

  * $ make

or:

  * $ make clean
  * $ make

All the macros used to present the graphics are in report.tex or paper.tex.
The code is in docs/*/code/ and has been edited for readability from that which is in code/.

Some extra images will need to be created for the technical report. From data/:

  * $ ../scripts/process_two_point_counter counters/bubblesort
  * $ gnuplot counter_gnuplot_generated_scripts/two_point_counter_bubblesort
  * $ ../scripts/process_two_point_counter counters/improved\ bubblesort
  * $ gnuplot \
      counter_gnuplot_generated_scripts/two_point_counter_improved_bubblesort

All these tests were done on a Pentium 4, which is little-endian, and using a little-endian version of simple-scalar. If your machine is big-endian, you'll need to take some steps to get the same results.

Tool versions:

The following are the versions of the tools used in the creation of this report.

- Valgrind: valgrind-2.4.0
- PapiEx: 0.99rc1, with perfctl patch 2.6.8-rc2, using papi 3.0.8.1, on Linux 2.6.8.1
- gcc: gcc (GCC) 3.3.5 (Debian 1:3.3.5-13)
- perl: v5.8.7 built for i386-linux-thread-multi
- gnuplot: gnuplot 4.0 patchlevel 0
- SimpleScalar: SimpleScalar/PISA Tool Set version 3.0 of August, 2003.
- sslittle-na-sstrix-gcc: gcc version 2.6.3
- host: i386-redhat-linux
- Pentium 4:

    $ cat /proc/cpuinfo
    processor       : 0
    vendor_id       : GenuineIntel
    cpu family      : 15
    model           : 1
    model name      : Intel(R) Pentium(R) 4 CPU 1.60GHz
    stepping        : 2
    cpu MHz         : 1595.321
    cache size      : 256 KB
    fdiv_bug        : no
    hlt_bug         : no
    f00f_bug        : no
    coma_bug        : no
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 2
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss tm
    bogomips        : 3185.04
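Sketch of the correctness test:

The correctness pass described under main.c (every array size between TEST_MIN and TEST_MAX, with the size reused as the random seed) might look roughly like the sketch below. It is not the code from main.c: check_sort, is_sorted, the sort_fn type, the insertion_sort stand-in, and the use of srand()/rand() to fill the array are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical signature for a sort under test; the real framework may differ. */
typedef void (*sort_fn)(unsigned int *keys, size_t n);

/* Stand-in sort, used only so the sketch has something to check. */
void insertion_sort(unsigned int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        unsigned int key = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}

/* Returns 1 if a[0..n-1] is in non-decreasing order, 0 otherwise. */
int is_sorted(const unsigned int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (a[i - 1] > a[i])
            return 0;
    }
    return 1;
}

/*
 * Runs `sort` once for every array size in [min, max].
 * The size is also used as the seed, so each size always gets the same keys.
 * Returns the number of sizes that came back unsorted, or -1 if allocation fails.
 */
int check_sort(sort_fn sort, size_t min, size_t max)
{
    int failures = 0;

    assert(sort != NULL);
    for (size_t n = min; n <= max; n++) {
        /* Allocate at least one element so a size of 0 is still valid. */
        unsigned int *keys = malloc((n ? n : 1) * sizeof *keys);
        if (keys == NULL)
            return -1;

        srand((unsigned int)n);            /* array size doubles as the seed */
        for (size_t i = 0; i < n; i++)
            keys[i] = (unsigned int)rand();

        sort(keys, n);
        if (!is_sorted(keys, n))
            failures++;

        free(keys);
    }
    return failures;
}
```

Calling check_sort(insertion_sort, 1, 64) from a main() returns 0 when every size sorts correctly. Seeding with the size keeps a failure reproducible: rerunning the same size regenerates the same keys.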