
[IDEA] Granular Message Storage Settings #6255

Open
thorst opened this issue Jul 11, 2024 · 5 comments
Labels
enhancement New feature or request

Comments


thorst commented Jul 11, 2024

Is your feature request related to a problem? Please describe.
Our problem is that our database is currently 6 TB. Our devs want 45 days of history and are unwilling to budge on that; in fact, they would like more.

Describe your use case
Having granular settings for what content is stored is imperative for us to be better stewards of our data and disk requirements. In Cloverleaf we stored only the raw inbound from the inbound thread and what was sent in the outbound thread. In Mirth, that would mean saving the raw message from the source connector and the final state on the outbound connectors.

Describe the solution you'd like
I would like the message storage section of the channel configuration's Summary tab to have an Advanced button. Clicking it would let you select which states to save for each connector: each connector would be a row in a table, with a checkbox for each state.

I realize this is much less user friendly than the current solution, so I would keep the current options for regular folks, but power users need more configuration. We don't need all of this data stored; we would rather keep just the raw message, resend it if needed, and store the data for longer. Imagine storing 90 days of messages in less disk space than we currently use for 45.
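To show the shape of what the advanced view would edit, here is a purely hypothetical sketch written as a config object. None of these names exist in Mirth today; it only illustrates the per-connector matrix:

```javascript
// Hypothetical only -- no such setting exists in Mirth today. One row per
// connector, one flag per stored state; anything false is never written to disk.
var storageMatrix = {
    'Source':                     { raw: true,  transformed: false, encoded: false },
    'Destination 1 (EMR feed)':   { sent: true, transformed: false, response: false },
    'Destination 2 (audit copy)': { sent: true, transformed: false, response: false }
};
```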

Add to that that we are now discussing how we will upgrade PostgreSQL. The usual procedure is to export, uninstall Postgres, install the new version of Postgres, and then import. That would take prohibitively long with 6 TB.

Describe alternatives you've considered

  1. Store less - our devs prefer the ability to go back 45 days if possible; they are very adamant about not going lower.
  2. Use the global pre/postprocessor - I am already getting concurrency errors with the postprocessor (a different topic), but we could store the transaction to an external db. This is nice because users don't have to make any changes. We would most likely want an exclusion list for channels where we don't need to save a copy, but ultimately a little more storage than we strictly need is fine; it would still be far less than what is being stored currently. (A sketch follows this list.)
  3. Code template - users could call a code template wherever they want to store the transaction, and it would save to an external db. This is perfect in that it saves exactly when you want it to, but bad because you have to remember to call it.
  4. Archiving - I could potentially use the archiver to save the files out to a directory, which a channel could read from and insert into the db. This seems too clunky, though, and without testing I'm afraid it would try to clean up old files (which no longer exist because another process has read them in) - though I should say I'm not sure what the intended use case for that feature is.
  5. Database scraper - a channel or an external script/process could loop over the database tables and extract the data you want. The positive is that it could live in an external process, so Mirth isn't bogged down with the details. The negative is that it would be slightly delayed, and it could be intensive if it ran over too much at once, say a full previous day's worth of data. (A second sketch below shows this.)
  6. Clustering - would allow you to upgrade one db at a time.
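To make option 2 concrete, here is a rough, untested sketch of a global postprocessor doing it. The `mirth_history` database, the `message_history` table, and the `historyExclusions` globalMap entry are invented for illustration; only the script-scope objects (`message`, `channelId`, `DatabaseConnectionFactory`) are standard Mirth 3.x:

```javascript
// Global postprocessor sketch (untested). Copies the source connector's raw
// content to an external history database, skipping channels on an exclusion
// list. Table name, columns, and the globalMap key are made up for this example.
var exclusions = globalMap.get('historyExclusions'); // e.g. a java.util.HashSet of channel IDs
if (exclusions == null || !exclusions.contains(channelId)) {
    var raw = message.getMergedConnectorMessage().getRawData();
    var dbConn = DatabaseConnectionFactory.createDatabaseConnection(
        'com.mysql.cj.jdbc.Driver',
        'jdbc:mysql://history-host:3306/mirth_history', 'mirth', 'secret');
    try {
        var params = new java.util.ArrayList();
        params.add(channelId);
        params.add(message.getMessageId());
        params.add(raw);
        dbConn.executeUpdate(
            'INSERT INTO message_history (channel_id, message_id, raw_content) VALUES (?, ?, ?)',
            params);
    } finally {
        dbConn.close();
    }
}
```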

All of these solutions assume an external database, which could go down and cause issues for the process, and there could be bugs in my code. I would need to build all of this, which is fine, but an official solution would be preferred.
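For option 5, the relevant detail is that Mirth keeps one set of d_m&lt;n&gt;/d_mm&lt;n&gt;/d_mc&lt;n&gt; tables per channel, keyed by the local channel id in D_CHANNELS, with a content-type code on each content row (raw appears to be 1 and the source connector metadata_id 0 on our 3.x schema; verify against your own install). A rough sketch as a JavaScript Reader source script:

```javascript
// JavaScript Reader sketch (untested): pull the previous day's raw source
// content out of Mirth's own tables. 42 stands in for the channel's local id
// from D_CHANNELS, and the 'History Writer' channel is hypothetical.
var db = DatabaseConnectionFactory.createDatabaseConnection(
    'org.postgresql.Driver',
    'jdbc:postgresql://localhost:5432/mirthdb', 'mirth', 'secret');
try {
    var rows = db.executeCachedQuery(
        "SELECT m.id, c.content FROM d_m42 m " +
        "JOIN d_mc42 c ON c.message_id = m.id AND c.metadata_id = 0 " +
        "WHERE c.content_type = 1 " +
        "AND m.received_date >= NOW() - INTERVAL '1 day'");
    while (rows.next()) {
        // hand each raw message to a channel that writes it to the history db
        router.routeMessage('History Writer', rows.getString('content'));
    }
} finally {
    db.close();
}
```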

Messages would be stored in a cold-storage db, which would make them harder to search and retrieve. It would pull them out of their current workflow, and while I could get snazzy and add a resend button that takes the data and calls the Client API, it would still be harder to resend than with the message history we have today. The obvious answer is for Mirth to make the store-or-not decision at the time of saving, so the disk space is never occupied to begin with; the user then interacts with messages in the message history the same as always.
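On the resend-button idea: the Client API does expose an endpoint for processing a new message through a channel (POST /api/channels/{channelId}/messages), so a cold-storage UI could push archived raw content back through. A rough sketch, where the host, credentials, and auth details are assumptions to verify against your own install:

```javascript
// Sketch (untested) of resending an archived raw message via the Client API.
// Assumes basic auth is accepted; newer 3.x builds also want an
// X-Requested-With header on REST calls. URL and credentials are examples.
function resendRaw(channelGuid, rawMessage) {
    var url = new java.net.URL('https://mirth-host:8443/api/channels/' + channelGuid + '/messages');
    var conn = url.openConnection();
    conn.setRequestMethod('POST');
    conn.setDoOutput(true);
    var auth = java.util.Base64.getEncoder().encodeToString(
        new java.lang.String('admin:admin').getBytes('UTF-8'));
    conn.setRequestProperty('Authorization', 'Basic ' + auth);
    conn.setRequestProperty('X-Requested-With', 'OpenAPI');
    conn.setRequestProperty('Content-Type', 'text/plain');
    var out = conn.getOutputStream();
    out.write(new java.lang.String(rawMessage).getBytes('UTF-8'));
    out.close();
    return conn.getResponseCode(); // 2xx means the channel accepted the message
}
```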


thorst commented Jul 12, 2024

An alternative could be a "first and last" setting that saves the raw message on the source connector and the sent content for each destination. This is much less optimized and configurable, but probably easier for you all to implement.

It wouldn't solve all issues for everyone, but I am sure many would use it.


kirbykn2 commented Jul 15, 2024 via email


thorst commented Jul 15, 2024

Just convenience. I'm not the one setting the requirement, but for systems where you don't have access to resend from the source, the interface engine is the next best thing and faster than interacting with the vendor.


kirbykn2 commented Jul 15, 2024 via email


thorst commented Jul 15, 2024

> You pay for convenience.

For sure. Our disk usage and associated cost is high, and Mirth uses more than our old Cloverleaf system did; that's what this ticket is about.

> I would really question resending from Mirth messages that are that old. How often does it happen? Why does it happen?

Sometimes you can use the history to build knowledge of all the transactions sent for a patient. Other times a downstream consumer will say they didn't get, or misplaced, a message (like a result) and want us to resend. Or they received it, but it didn't file because of some piece of data on the message, at which point we would manually hack the message to get it to file into their system (a very edge case). Of course you wouldn't want to resend data that had been corrected or otherwise superseded, but there are plenty of cases where this is perfectly fine.

> Should you be using database level resources (storing in Mirth for that long)? I would, they do not have to resend messages in Mirth that often, and if they do, I would like to fix the issue that is causing resends.

The current plan is to write the message history to a MySQL db. That way I can upgrade our primary Postgres db, which will be much smaller, and then separately upgrade MySQL. Since MySQL would hold only message history, it could tolerate a longer downtime without impacting patient care. I can also get pickier about what I store, so the size will be much smaller than what Mirth is storing currently. It would be much nicer if we could make these granular tweaks in Mirth itself, store much less, and keep it all in one db; for now, though, I will store to the backup db, and it will be pared down.

I wouldn't describe us as needing to resend often, or say there is a common cause. With a complex system there will always be things that pop up here or there, and we could just say, "oh, no, we can't do that" - but where possible we like to say, "sure, give me 10 minutes". So with my new setup we would set the pruner very short, like 2-7 days, and rely on the archive system in the situations where it's needed. Several versions ago in Cloverleaf everything was file based; that was nice for compression, but whenever you needed to search for something it was a PITA.
