
PHPD (Process Hive Partitioned Data)

Usage Guidelines

To build and run this program, Rust must be installed on your system. You can download and install it from the official website: https://www.rust-lang.org/tools/install.

Installation

Clone the repository using the following command:

git clone https://github.com/TechfaneTechnologies/phpd.git

Then, navigate to the project directory and build the project:

cd phpd
cargo build --release

Generating Dummy Data

To generate dummy data, execute the following command:

cargo run --release --bin generate_dummy_data

Alternatively, you can run the compiled binary directly:

./target/release/generate_dummy_data

Example Terminal Output:

$ ./target/release/generate_dummy_data
    Proceeding with the generation of dummy data for the following instruments: ["BANKNIFTY", "BANKEX", "FINNIFTY", "MIDCPNIFTY", "NIFTY", "NIFTYNXT50", "SENSEX"]
    For the year 2024
    At directory: /Users/DrJuneMoone/Document/hive_partitioned_data
    Successfully generated dummy data at: /Users/DrJuneMoone/Document/hive_partitioned_data
    Generated 5502 CSV files across 1841 subfolders, totaling 8.69 GiB
    Processing speed: 923.87 MiB per second in 9.63 seconds
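The printed paths above suggest a partition layout of the form {base_path}/{instrument}/{YYYYMMDD}/{INSTRUMENT}-{seq_id}.csv. A minimal sketch of how such a path could be assembled; the helper name and base path here are illustrative, not taken from the repository:

```rust
use std::path::PathBuf;

// Hypothetical helper: builds a partition path matching the layout
// implied by the terminal output above.
fn partition_path(base: &str, instrument: &str, date: &str, seq_id: u32) -> PathBuf {
    PathBuf::from(base)
        .join(instrument)          // e.g. SENSEX
        .join(date)                // e.g. 20240101
        .join(format!("{instrument}-{seq_id}.csv"))
}

fn main() {
    let p = partition_path("/data/hive", "SENSEX", "20240101", 2);
    println!("{}", p.display());
}
```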

Processing Hive Partitioned Dummy Data

To process the generated hive-partitioned dummy data, run the following command:

cargo run --release --bin process_dummy_data

Or execute the binary directly:

./target/release/process_dummy_data

Example Terminal Output:

$ ./target/release/process_dummy_data
    Found 7 instruments

    Instrument: SENSEX Grouped CSV Files {2: [CsvFile { path: "/Users/DrJuneMoone/Document/hive_partitioned_data/SENSEX/20240101/SENSEX-2.csv", date: "20240101", seq_id: 2 }, CsvFile { path: "/Users/DrJuneMoone/Document/hive_partitioned_data/SENSEX/20240102/SENSEX-2.csv", date: "20240102", seq_id: 2 }, .... ]}

    Processing instrument: BANKNIFTY
    Found 3 sequence groups
    Processing sequence group: 2

    Processing file: /Users/DrJuneMoone/Document/hive_partitioned_data/NIFTYNXT50/20240102/NIFTYNXT50-1.csv
    Processing file: /Users/DrJuneMoone/Document/hive_partitioned_data/NIFTY/20240102/NIFTY-2.csv
    Processing file: /Users/DrJuneMoone/Document/hive_partitioned_data/SENSEX/20240103/SENSEX-2.csv
    ............
    ............
    Successfully merged dummy data at: /Users/DrJuneMoone/Document/hive_partitioned_data
    Generated 21 sequentially merged CSV files, totaling 8.69 GiB
    Processed 5502 CSV files across 1841 subfolders, totaling 8.69 GiB
    Processing speed: 3.30 GiB per second in 5.27 seconds
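The sequence groups shown in the output above can be sketched in a few lines: files sharing a seq_id are collected together, then ordered by date so each group can be merged chronologically. The CsvFile struct below mirrors the fields printed by the program (path, date, seq_id); the grouping function and sample data are hypothetical, not the repository's actual implementation:

```rust
use std::collections::HashMap;

// Hypothetical mirror of the CsvFile entries printed by process_dummy_data;
// the real definition in the source may differ.
#[derive(Debug, Clone)]
struct CsvFile {
    path: String,
    date: String,
    seq_id: u32,
}

// Group files by seq_id, then sort each group by its YYYYMMDD date string
// so a sequence can be merged in chronological order.
fn group_by_seq(files: Vec<CsvFile>) -> HashMap<u32, Vec<CsvFile>> {
    let mut groups: HashMap<u32, Vec<CsvFile>> = HashMap::new();
    for f in files {
        groups.entry(f.seq_id).or_default().push(f);
    }
    for group in groups.values_mut() {
        group.sort_by(|a, b| a.date.cmp(&b.date));
    }
    groups
}

fn main() {
    // Illustrative sample data (paths shortened).
    let files = vec![
        CsvFile { path: ".../SENSEX/20240102/SENSEX-2.csv".into(), date: "20240102".into(), seq_id: 2 },
        CsvFile { path: ".../SENSEX/20240101/SENSEX-2.csv".into(), date: "20240101".into(), seq_id: 2 },
    ];
    let groups = group_by_seq(files);
    println!("{}", groups[&2][0].date); // earliest date comes first
}
```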

Changing the Data Directory

To change where the hive-partitioned data is stored, modify the base_path variable in the source code:

  1. Edit src/generate_dummy_data.rs, lines 18-20.
  2. Edit src/process_dummy_data.rs, lines 8-10.
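As a rough illustration, the base_path assignment being edited might look something like this; the variable name comes from the text above, while the concrete path and surrounding code are assumptions:

```rust
use std::path::PathBuf;

fn main() {
    // Hypothetical: point this at the root directory that holds
    // (or will hold) the hive-partitioned data.
    let base_path = PathBuf::from("/data/hive_partitioned_data");
    println!("{}", base_path.display());
}
```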

After making the necessary changes, rebuild and run the program to regenerate and process the data with the updated location.

Processing Actual Hive Partitioned Data

To process your actual hive-partitioned data, update the base_path variable in src/process_dummy_data.rs (lines 8-10) and run the following command:

cargo run --release --bin process_dummy_data

Or execute the compiled binary:

./target/release/process_dummy_data

Example Video

Watch On YouTube
