I use a fork of your n0s1 code to scan our (large) confluence cloud instance. Thanks for that, it is very useful.
However, I found that not all spaces are being scanned, even though I didn't get an error message or timeout. I only noticed because a test space I had added was missing from the report. The total scan took about 5 hours. My guess is that the connection was somehow closed and the client object became empty. I saw that you recently added error handling and did some refactoring. The strange thing is that we didn't get any errors, but I will adopt the error handling in any case.
For now, I solved the issue with missing spaces by adding a self._connect() call in the method 'get_data' for every batch of spaces to be collected. There might be a better way, but for now this works.
def set_config(self, config):
    SERVER = config.get("server", "")
    EMAIL = config.get("email", "")
    TOKEN = config.get("token", "")
    LABEL_FALSE_POSITIVE = config.get("label_false_positive", "cict-no-secrets-confirmed")
    self._url = SERVER
    self._user = EMAIL
    self._password = TOKEN
    self.label_false_positive = LABEL_FALSE_POSITIVE
    self._connect()
    return self.is_connected()

def _connect(self):
    from atlassian import Confluence
    # Use email + token when a user is configured, otherwise token-only auth.
    if self._user and len(self._user) > 0:
        self._client = Confluence(url=self._url, username=self._user, password=self._password)
    else:
        # SERVER and TOKEN are local to set_config, so use the stored attributes here.
        self._client = Confluence(url=self._url, token=self._password)
and in get_data:
def get_data(self, include_comments=False, test=""):
    if not self._client:
        return None, None, None, None, None, None
    start = 0
    limit = 50
    finished = False
    while not finished:
        logging.info(f"Spaces batch: {start} - {start+limit}")
        # reconnect for every batch
        self._connect()
        if not test:
            res = self._client.get_all_spaces(
                start=start, limit=limit, expand="history"
            )
            start += limit
            spaces = res.get("results", [])
        else:
            key = test
            res = self._client.get_space(key, expand="history")
            finished = True
            spaces = [res]
I also added a parameter, test, that makes it possible to scan only a single space, as the total scan takes such a long time.
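The reconnect-per-batch idea can also be sketched as a small standalone generator. This is only an illustration, not the n0s1 implementation: `iter_spaces`, the `connect` callable and `FakeClient` are hypothetical names, and stopping on a short batch is an assumption about how the last page looks.

```python
def iter_spaces(connect, limit=50):
    """Yield all spaces, building a fresh client for every batch.

    `connect` is a hypothetical callable that returns a client object
    exposing get_all_spaces(start=..., limit=..., expand=...).
    """
    start = 0
    while True:
        client = connect()  # reconnect for every batch
        res = client.get_all_spaces(start=start, limit=limit, expand="history")
        spaces = res.get("results", [])
        yield from spaces
        # Assumption: a batch shorter than `limit` means it was the last one.
        if len(spaces) < limit:
            break
        start += limit


class FakeClient:
    """Stand-in for the real client so the sketch runs without a server."""

    def __init__(self, keys):
        self._keys = keys

    def get_all_spaces(self, start=0, limit=50, expand=None):
        batch = self._keys[start:start + limit]
        return {"results": [{"key": k} for k in batch]}
```

With the real library, `connect` would be something like `lambda: Confluence(url=..., username=..., password=...)`, so a connection that was closed during a long scan is never reused for the next batch.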
For your interest, another improvement I made for our use case is a change in config.yaml to the rule with id: generic-api-key, as we got tons of false positives because this regex matches the Confluence user macro and link macro in combination with 'key'.
We also added a method to skip a page when a specific label is set on it. This is meant for pages where the secret that was found is only an example: the user can add the label to mark the page as a false positive.
def is_false_positive(self, page_id):
    labels_json = self._client.get_page_labels(page_id)
    labels = labels_json.get("results", [])
    for label in labels:
        if label["name"] == self.label_false_positive:
            logging.info(f"INFO: page {page_id} is false positive due to label {label}")
            return True
    return False
And in the method get_data:
for p in pages:
    comments = []
    title = p.get("title", "")
    page_id = p.get("id", "")
    if self.is_false_positive(page_id):
        continue
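The same label check can be written as a standalone sketch that runs without a Confluence connection. The helper names `has_label` and `pages_to_scan` are hypothetical, and the response shape (a "results" list of dicts with a "name" key) is taken from the snippet above.

```python
def has_label(labels_json, wanted):
    """Return True if the label response contains a label named `wanted`."""
    return any(
        label.get("name") == wanted
        for label in labels_json.get("results", [])
    )


def pages_to_scan(pages, get_page_labels, wanted="cict-no-secrets-confirmed"):
    """Yield only the pages that are not marked as false positives.

    `get_page_labels` is any callable that takes a page id and returns the
    label response, for example the client's get_page_labels method.
    """
    for page in pages:
        page_id = page.get("id", "")
        if has_label(get_page_labels(page_id), wanted):
            continue  # page is labelled as a confirmed false positive
        yield page
```

Note that this costs one extra label request per page, which adds up on a large instance.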
In any case, thanks for your code. Hope my comments are useful.
Kind regards,
Mariska
Thank you for reporting the issue and for the proposed enhancement. I will add it to the next release.
Did you have the chance to test your enhancements on top of the latest main branch? Does it fix your bug, or are you still having issues?
Apologies for the late response. I am back to business now, and I should be way more responsive from now on.
Hi, No problem. I've not yet had the time to test your latest version. I've planned this for the first week of September. With the reconnect I implemented, it works in any case, but I'll let you know what the results are after I've tested with your latest version again.
Hi @tallandtree, I just wanted to let you know that I have a new PR #31 out that implements a solution for this issue. Please let me know if it addresses your use cases.
Thanks