DSL query support #49

Open
balintnadasi opened this issue Feb 28, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@balintnadasi

Hi guys!

Is there any chance that this backend will support pure DSL query generation in the near future?

@thomaspatzke thomaspatzke added the enhancement New feature or request label Mar 8, 2024
@Mat0vu
Contributor

Mat0vu commented May 28, 2024

Hi,
I would also be glad to see a DSL backend. In the long term we are considering switching to EQL or ES|QL, but at the moment we are still using the sigmac converter with some customizations and the DSL output format. A DSL backend for pySigma would let us keep using aggregation/correlation queries while making it relatively easy to compare the sigmac output with the new pySigma output, to ensure that the searches still work as expected. Additionally, as long as ES|QL is still in technical preview, we will probably avoid using that backend in production.

If you also still think that a DSL backend is a useful feature, I could offer to start working on one, since I don't see a DSL branch in this repo or in the forks that would indicate somebody is already working on this. If you have already started, I can maybe help with testing :)

@thomaspatzke
Member

I haven't started a DSL backend, and I don't know of anyone who has. So feel free 😉

@balintnadasi
Author

Hello @Mat0vu !

I saw that you forked the project and I would be happy to help with my unit tests. Where can I get the JSONQueryBackend?

@Mat0vu
Contributor

Mat0vu commented Jul 17, 2024

Hi @balintnadasi ,

Sorry for the late response. I've just updated my fork, where I've been working on the implementation of a DSL backend for Elasticsearch.

Since DSL uses JSON queries, in contrast to EQL or ES|QL, and because it was difficult to get the desired output using the variables provided by the TextQueryBackend, which passes the data to Python's str.format() function, I initially decided to create a new class, JsonQueryBackend, which could have been included in base.py.
However, in the end the code of my new JsonQueryBackend was almost identical to the TextQueryBackend, with only a few adjustments. That's why I thought it was probably better to switch back to TextQueryBackend.
Now the DSLBackend is based on the TextQueryBackend again and completely overrides some methods (especially to get correlation rules working), because I could not get these to work with JSON and the various str.format() calls within the default superclass.
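
To illustrate why JSON output is awkward with str.format(), here is a minimal sketch. It is not taken from the fork or from pySigma; `TERM_TEMPLATE`, `term_via_format` and `term_via_dict` are hypothetical names. Every literal JSON brace in a format template has to be doubled, whereas building a dict and serializing it once avoids the escaping entirely.

```python
import json

# Hypothetical template in the style of a text-based backend. Literal JSON
# braces are doubled so that str.format() does not read them as fields.
TERM_TEMPLATE = '{{"term": {{"{field}": {value}}}}}'


def term_via_format(field: str, value: str) -> str:
    # json.dumps() quotes and escapes the value; the field name is assumed
    # to contain no characters that need JSON escaping.
    return TERM_TEMPLATE.format(field=field, value=json.dumps(value))


def term_via_dict(field: str, value: str) -> str:
    # Build a plain dict and serialize once: no brace doubling needed.
    return json.dumps({"term": {field: value}})


if __name__ == "__main__":
    print(term_via_format("user.name", 'ad"min'))
    print(term_via_dict("user.name", 'ad"min'))
```

Both functions produce equivalent JSON for a single term query. The template variant gets harder to read with every extra nesting level, which is presumably where the correlation queries became painful.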

So far I´ve managed to implement (hopefully) most of the basic use cases:

  • term(s) queries, both case-sensitive and case-insensitive (the latter requires Elastic >= 7.10)
  • regex searches for |contains, |startswith, |endswith and |re
  • event_count and value_count correlations can be used and are translated into composite aggregations.
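
For readers unfamiliar with the query shapes involved, the list above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the fork: all function names are hypothetical, and using a `cardinality` sub-aggregation for value_count is my guess at one reasonable mapping. The `term`, `regexp`, `composite` and `cardinality` constructs themselves are standard Elasticsearch Query DSL.

```python
import re

# Characters reserved in Lucene regular expressions, including the
# optional operators (# @ & < > ~).
_LUCENE_RESERVED = re.compile(r'([.?+*|{}\[\]()"\\#@&<>~])')


def escape_regex(value):
    """Backslash-escape characters that Lucene regex treats as operators."""
    return _LUCENE_RESERVED.sub(r"\\\1", value)


def term_query(field, value, case_insensitive=False):
    body = {"value": value}
    if case_insensitive:
        # The case_insensitive parameter requires Elasticsearch >= 7.10.
        body["case_insensitive"] = True
    return {"term": {field: body}}


def regexp_query(field, value, modifier):
    """Map a Sigma modifier to a regexp query (patterns are anchored)."""
    if modifier == "re":
        pattern = value  # assumed to already be a valid Lucene regex
    else:
        escaped = escape_regex(value)
        pattern = {
            "contains": f".*{escaped}.*",
            "startswith": f"{escaped}.*",
            "endswith": f".*{escaped}",
        }[modifier]
    return {"regexp": {field: {"value": pattern}}}


def count_aggregation(group_by, distinct_field=None, page_size=1000):
    """Composite aggregation grouping by the given fields.

    Without distinct_field, each bucket's doc_count serves as the event
    count. With it, a cardinality sub-aggregation approximates the number
    of distinct values (an assumed mapping for value_count).
    """
    sources = [{f.replace(".", "_"): {"terms": {"field": f}}} for f in group_by]
    agg = {"composite": {"size": page_size, "sources": sources}}
    if distinct_field is not None:
        agg["aggs"] = {"distinct_values": {"cardinality": {"field": distinct_field}}}
    return {"size": 0, "aggs": {"groups": agg}}
```

In this sketch, applying the threshold from the correlation rule's condition (for example "at least 10 events") is left to whatever consumes the buckets.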

Currently not supported (only the stuff I know of, so probably not complete):

  • temporal correlations
  • correlations using more than one base rule (correlation_search_multi_rule is not implemented yet)

I'm not a specialist in Elastic mappings, and all of the fields we search in are mapped as keyword fields, for which regex and term queries work well. However, searching in text fields might require changing the query type (e.g. to a match query)...
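
As a rough illustration of that caveat, a backend could choose the query type from the field's mapping type. The helper below is a hypothetical sketch, not part of the fork; `value_query` and its `field_type` parameter are assumptions, while `term` and `match` (with `operator`) are standard Query DSL.

```python
def value_query(field, value, field_type="keyword"):
    """Pick a query type based on the (assumed known) mapping type."""
    if field_type == "text":
        # Text fields are analyzed, so the search value is tokenized too.
        # operator "and" requires every token to be present in the field.
        return {"match": {field: {"query": value, "operator": "and"}}}
    # Keyword fields store the exact value, so a term query matches it as-is.
    return {"term": {field: {"value": value}}}
```

Note that a match query is not an exact-value comparison, so hits on text fields can differ from those on keyword fields for the same Sigma rule.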

I would say the backend is far from finished, but the current state seemed to work fine when I translated some of our existing rules and compared the hits with those of the rules translated with sigmac. Aggregations also seemed to work fine.

I will not be able to continue working on this topic for the next few weeks, and because ES|QL is going to be fully supported in Elastic >= 8.14, we are currently considering switching to the new language. Anyway, you are very welcome to add unit tests and improve the code :)
If I find time, I will also try to continue working on this; however, this won't be possible in the next few weeks...

@andurin
Collaborator

andurin commented Nov 3, 2024

@balintnadasi / @Mat0vu
Is this still an issue, and would you like to prepare a pull request for ES-DSL in the future, or has EQL/ES|QL successfully superseded the need?

@Mat0vu
Contributor

Mat0vu commented Nov 5, 2024

Hi @andurin,
because my team is currently switching to ESQL, we do not need DSL support anymore. If @balintnadasi or anyone else still wants DSL, they can use the code from here as a starting point.

@balintnadasi
Author

Hello @andurin !

I revised @Mat0vu's code to get a query that approximates the old sigmac output (unfortunately, regex filters seemed slower in some cases). For now, I'm facing some escaping issues, and I'm working on resolving them. If all goes well (and I have some time before xmas), I hope to create the merge request in December.
