Separating CLI from elasticsearch-file-importer.py #5

Closed
peter279k opened this issue Oct 23, 2019 · 2 comments

Comments

@peter279k

As the title says: to make issue #2 easier to work on, I suggest separating the CLI from elasticsearch-file-importer.py.

That is, the following Python code could live in a new Python file called es-importer.py:

if __name__ == '__main__':

    PARSER = argparse.ArgumentParser(
        description='Read a data from a variety of file formats and post the data to Elasticsearch'
        )
    
    SUBPARSERS = PARSER.add_subparsers(title="data_type", description="Supported data format", help="Choose one of the supported data formats.")
    CSV_PARSER = SUBPARSERS.add_parser("CSV", help="Import a CSV file into Elasticsearch")
    CSV_PARSER.add_argument('csvFile', help='Path to the CSV file to read')
    CSV_PARSER.add_argument('esIndex', help='Name of the Elasticsearch index mapping')
    CSV_PARSER.add_argument('--stopWordsFile', help='Path to a file of stopwords')
    CSV_PARSER.set_defaults(func=process_report)

    LOG_PARSER = SUBPARSERS.add_parser("Logs", help="Import a log into ElasticSearch")
    LOG_PARSER.add_argument("logFile", help="Path to the log file to read")
    LOG_PARSER.add_argument("formatFile", help="Path to file containing log format regex string.")
    LOG_PARSER.add_argument("esIndex", help="Name of the Elasticsearch index mapping")
    LOG_PARSER.set_defaults(func=process_log)

    JSON_PARSER = SUBPARSERS.add_parser("JSON", help="Import a JSON file into Elasticsearch")
    JSON_PARSER.add_argument("jsonFile", help="Path to JSON file to read")
    JSON_PARSER.add_argument("esIndex", help="Name of the Elasticsearch index mapping")
    JSON_PARSER.set_defaults(func=process_json)

    ARGS = PARSER.parse_args()
    ARGS.func(ARGS)
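
One way the separated CLI could look is sketched below. This is only an illustration under assumptions, not the code that was eventually merged: `build_parser`, `main`, and the `handlers` dictionary are hypothetical names, and the processing functions are passed in rather than imported so the CLI can be exercised on its own. Note also that a file named `elasticsearch-file-importer.py` cannot be loaded with a plain `import` statement because of the hyphens, so the real split would need either a rename or `importlib`.

```python
import argparse


def build_parser(handlers):
    """Build the CLI parser.

    `handlers` (hypothetical) maps a sub-command name ("CSV", "Logs", "JSON")
    to the function that should process the parsed arguments.
    """
    parser = argparse.ArgumentParser(
        description="Read data from a variety of file formats and post it to Elasticsearch"
    )
    subparsers = parser.add_subparsers(
        title="data_type",
        dest="data_type",
        description="Supported data format",
        help="Choose one of the supported data formats.",
    )
    # Fail with a usage message when no sub-command is given.
    subparsers.required = True

    csv_parser = subparsers.add_parser("CSV", help="Import a CSV file into Elasticsearch")
    csv_parser.add_argument("csvFile", help="Path to the CSV file to read")
    csv_parser.add_argument("esIndex", help="Name of the Elasticsearch index mapping")
    csv_parser.add_argument("--stopWordsFile", help="Path to a file of stopwords")
    csv_parser.set_defaults(func=handlers["CSV"])

    log_parser = subparsers.add_parser("Logs", help="Import a log into Elasticsearch")
    log_parser.add_argument("logFile", help="Path to the log file to read")
    log_parser.add_argument("formatFile", help="Path to file containing log format regex string")
    log_parser.add_argument("esIndex", help="Name of the Elasticsearch index mapping")
    log_parser.set_defaults(func=handlers["Logs"])

    json_parser = subparsers.add_parser("JSON", help="Import a JSON file into Elasticsearch")
    json_parser.add_argument("jsonFile", help="Path to JSON file to read")
    json_parser.add_argument("esIndex", help="Name of the Elasticsearch index mapping")
    json_parser.set_defaults(func=handlers["JSON"])

    return parser


def main(argv=None, handlers=None):
    """Parse `argv` (defaults to sys.argv[1:]) and dispatch to the chosen handler."""
    args = build_parser(handlers).parse_args(argv)
    return args.func(args)
```

In es-importer.py, an `if __name__ == '__main__':` guard would then call `main(handlers=...)` with the real `process_report`, `process_log`, and `process_json` functions, however they end up being made importable.
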
@jadonn
Owner

jadonn commented Oct 24, 2019

That sounds like a great idea! Would it be good to split the other functions into their own files? That might be unnecessary, but it might also make the code easier to organize.

@peter279k If you want to submit a pull request with any contributions, I would be happy to review it and get it merged in. I'm going to try to work on the issues you have listed, but any help would be great, too!

@jadonn
Owner

jadonn commented Nov 12, 2019

I finally got a chance to finish splitting off the CLI portion into its own script. I merged in the changes in #10. It seems like the script still works ok so far. Next, I am going to work on splitting off the rest of the functions into their own scripts for each type of file (CSV, log, and JSON) and start writing automated tests. I am going to close this issue. If there are any more improvements or changes that need to be made for the CLI, we can reopen this issue or file a new one.
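
As a rough idea of what an automated test for the separated CLI might look like, here is a minimal sketch using only the standard library. It assumes the parser is built by a function that accepts the handler as a parameter; `build_parser` here is a hypothetical stand-in that covers only the JSON sub-command, not the project's actual code.

```python
import argparse
import unittest
from unittest import mock


def build_parser(process_json):
    # Hypothetical stand-in for the separated CLI; only the JSON sub-command is shown.
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(title="data_type")
    json_parser = subparsers.add_parser("JSON", help="Import a JSON file into Elasticsearch")
    json_parser.add_argument("jsonFile", help="Path to JSON file to read")
    json_parser.add_argument("esIndex", help="Name of the Elasticsearch index mapping")
    json_parser.set_defaults(func=process_json)
    return parser


class JsonCommandTest(unittest.TestCase):
    def test_dispatches_to_handler(self):
        # A mock replaces the real handler, so no Elasticsearch instance is needed.
        handler = mock.Mock()
        args = build_parser(handler).parse_args(["JSON", "data.json", "my-index"])
        args.func(args)
        handler.assert_called_once_with(args)
        self.assertEqual(args.jsonFile, "data.json")
        self.assertEqual(args.esIndex, "my-index")

    def test_missing_index_is_rejected(self):
        # argparse exits with a usage error when a positional argument is missing.
        with self.assertRaises(SystemExit):
            build_parser(mock.Mock()).parse_args(["JSON", "data.json"])
```

Saved in a test file, this could be run with `python -m unittest`. Passing the handler in keeps the CLI tests independent of the code that actually talks to Elasticsearch.
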

jadonn closed this as completed Nov 12, 2019