Skip to content

jerry0012000/CS6140ProjectTask1_Data_Synthesize

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Overview

This script automates the integration of Principal Component (PC) data from a CSV file into a TXT file, ensuring that each date in the TXT file is appropriately matched with the PC values based on quarterly financial report dates. It is designed to handle datasets where companies report data on different dates, dynamically matching and appending Principal Component 1 and 2 values to the TXT file. Features

Dynamic Date Matching: Ensures that each row in the TXT file is assigned the correct Principal Component data based on the corresponding company's financial report release date.
Flexible File Selection: Allows users to select the TXT and CSV files through a graphical file dialog.
Automated Data Integration: Reads, processes, and appends new columns (Principal Component 1 and 2) to the TXT file.
Version Control: Generates a new, timestamped TXT file to preserve the original data.
Sorted Results: The updated TXT file is sorted by date in descending order for consistency.

How It Works

Input Files:
    TXT File: Contains dates and other data to which the Principal Components will be added.
    CSV File: Contains dates and corresponding Principal Component 1 and 2 values. Dates in the CSV are assumed to be in descending order.
Process:
    The script reads the CSV file, extracts and sorts the Principal Component data.
    For each date in the TXT file:
        If the date corresponds to a financial report period, the PC values from the CSV are appended.
        If the date falls outside the report range, placeholders ("None") are added.
    New dates from the CSV file that do not exist in the TXT file are appended with their respective PC values.
Output:
    A new TXT file is generated with the additional columns for Principal Component 1 and 2, ensuring data integrity and separation.

Usage Instructions

Run the Script:
    Ensure you have Python installed with the required libraries (tkinter, csv, and datetime).
    Execute the script using python AddComponents.py.
File Selection:
    A dialog box will appear for selecting the TXT file. Choose the desired file.
    A second dialog box will appear for selecting the CSV file. Choose the file containing Principal Component data.
Output File:
    The script will generate a new TXT file in the same directory as the original file, with a timestamp in the filename (e.g., data_20241207_103015.txt).

About

Project Folder of CS6140 Machine Learning

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages