SimEngine

SimEngine is an open-source similarity measurement and feature selection repository in Python. It is built upon several standard math libraries (e.g. math, NumPy) and machine learning libraries (scikit-learn, SciPy).
SimEngine can be used to determine the similarity between proxy and parent applications. It also includes algorithms to facilitate feature selection: minimizing the number of features reduces the amount of data that must be collected and thus lowers the number of application runs.
The quantitative similarity measurements developed in SimEngine guide users in choosing suitable proxy applications for particular uses. Besides quantifying the fidelity of proxy applications, the similarity measurement approaches in SimEngine can also be applied to various HPC problems, such as compiler optimization, code refactoring, and application input sensitivity.
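
To make the idea of a quantitative similarity score concrete, here is a minimal sketch that compares two feature vectors with cosine similarity. Cosine similarity is chosen only for illustration; this README does not state which measures SimEngine implements. The function name and the feature values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical per-application feature vectors (made-up values, illustration only).
parent = [0.82, 0.10, 0.45]
proxy = [0.80, 0.12, 0.40]
score = cosine_similarity(parent, proxy)
print(f"similarity: {score:.4f}")
```

A score close to 1 means the two vectors point in nearly the same direction, which under this measure would suggest that the proxy behaves like its parent on the chosen features.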

Prerequisites:

Python 3
NumPy
SciPy
scikit-learn
pandas (version 1.4.4)
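
Before running anything, it can help to confirm that these packages are installed. The following is a small sketch using only the Python standard library. The distribution names are assumed to match the PyPI names of the packages listed above, and the helper name is hypothetical.

```python
from importlib import metadata

# Distribution names as published on PyPI (assumed to match the list above).
REQUIRED = ["numpy", "scipy", "scikit-learn", "pandas"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If pandas reports a version other than 1.4.4, the analysis may still run, but this README only names 1.4.4.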

How to run SimEngine

Clone the repository to your computing platform: $ git clone https://github.com/SimBioSysLab/SimEngine

Then, if you want to process the raw data, go to step A. If you want to start from the processed CSV files, go to step B.

A. Start from the raw data

This step creates the accumulated CSV and delta CSV. Follow the instructions below:
1- Download the raw data (https://github.com/sandialabs/proxy-parent-data/tree/main/SKX) to your machine.
2- Go to the dataprocess directory in the repo.
a. $ cd dataprocess
3- Open the dataprep.sh shell file and change the TopDataPrep variable (at the beginning of the file) to point to the raw data directory.
4- Make the script executable: $ chmod +x dataprep.sh
5- Run the shell file ./dataprep.sh to generate two CSV directories, csv_acc and csv_std, inside the SimEngine directory. They contain the average and standard deviation data over all ranks. Note that this step may take a long time (e.g. one or two days) to build the two CSV directories.
6- Follow the steps in B.
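
As a rough illustration of what "average and standard deviation over all ranks" means, the sketch below reduces per-rank values to one mean and one standard deviation per metric. It is not the logic of dataprep.sh, which this README does not describe. The metric names, the numbers, and the choice of population standard deviation are all assumptions.

```python
import statistics


def summarize_ranks(per_rank):
    """Reduce {metric: [value per rank]} to {metric: (mean, population std)}."""
    return {
        metric: (statistics.fmean(values), statistics.pstdev(values))
        for metric, values in per_rank.items()
    }


# Hypothetical readings from four ranks (made-up numbers, illustration only).
per_rank = {
    "cycles": [100.0, 110.0, 90.0, 100.0],
    "instructions": [50.0, 50.0, 50.0, 50.0],
}
summary = summarize_ranks(per_rank)
for metric, (mean, std) in summary.items():
    print(f"{metric}: mean={mean:.2f} std={std:.2f}")
```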

B. Use the processed CSV directories for analysis

1- Go to the SimEngine directory at the top level of the repo. $ cd SimEngine/
2- Download the two CSV directories from https://github.com/sandialabs/proxy-parent-data/tree/main/ProcessedData/SKX. (If you started from step A, you do not need to download them again.)
3- Run the analysis: $ python main.py
4- You can find the analysis output figures inside the graphs directory.
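
Since the analysis depends on both CSV directories, a quick pre-flight check can save a failed run. This sketch uses the directory names csv_acc and csv_std from step A. It assumes they sit in the directory you run main.py from, and the helper name is hypothetical.

```python
from pathlib import Path

# Directory names taken from step A; their location next to main.py is an assumption.
EXPECTED_DIRS = ("csv_acc", "csv_std")


def missing_inputs(base="."):
    """Return the expected CSV directories that are absent or contain no .csv files."""
    missing = []
    for name in EXPECTED_DIRS:
        folder = Path(base) / name
        if not folder.is_dir() or not any(folder.rglob("*.csv")):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_inputs(".")
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("Both CSV directories found.")
```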
