-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamic pagination in OCP tab #118
base: revamp
Are you sure you want to change the base?
Conversation
Signed-off-by: Vishnu Challa <[email protected]>
Updating actions to get triggered only on main branch
Merging revamp into main
Signed-off-by: Vishnu Challa <[email protected]>
Adding --host in vite config
Signed-off-by: Vishnu Challa <[email protected]>
Removing branch name from release action
} | ||
aggregate = {} | ||
for x,y in keysDict.items(): | ||
obj = {x:{"terms":{"field":y,"size":10}}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have size hardcoded to 10 here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10 instances is actually the Elasticsearch default for aggregations, so there's no point in specifying a size unless you want it bigger or smaller.
And, especially if you want to frequently override the instance limit here, a global (or class) constant would be better than an inline literal so they're easier to find and control.
query["aggs"].update(aggregate) | ||
|
||
isFilterReset = False | ||
response = await es.filterPost(query=query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MVarshini Did you make sure that this query is working?
RequestError was raised19: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [build.keyword] in order to sort on')
TypeError was raised14: 'NoneType' object is not subscriptable
TypeError was raised15: argument of type 'NoneType' is not iterable
INFO: 127.0.0.1:42960 - "GET /api/v1/ocp/jobs?pretty=true&start_date=2024-10-10&end_date=2024-10-15&size=75&offset=1&sort=%7B%22build.keyword%22:%7B%22order%22:%22asc%22%7D%7D&filter=%7B%22ciSystem%22:[%22prow%22]%7D HTTP/1.1" 200 OK
Error retrieving filter data': RequestError(400, 'search_phase_execution_exception', 'No mapping found for [build.keyword] in order to sort on')
TypeError was raised: 'NoneType' object is not subscriptable
TypeError was raised: 'NoneType' object is not subscriptable
INFO: 127.0.0.1:42966 - "GET /api/v1/ocp/filters?pretty=true&start_date=2024-10-10&end_date=2024-10-15&size=75&offset=1&sort=%7B%22build.keyword%22:%7B%22order%22:%22asc%22%7D%7D&filter=%7B%22ciSystem%22:[%22prow%22]%7D HTTP/1.1" 200 OK
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x7feb9ce77be0>, 441476.621)]']
connector: <aiohttp.connector.TCPConnector object at 0x7feb8308b430>
{'query': {'query_string': {'query': 'uuid: "bc78c231-6d7f-4726-8afb-ef4a87d84331"'}}}
{'ciSystem': 'PROW', 'uuid': 'bc78c231-6d7f-4726-8afb-ef4a87d84331', 'releaseStream': '4.15.0-0.nightly', 'platform': 'AWS', 'clusterType': 'self-managed', 'benchmark': 'cluster-density-v2', 'masterNodesCount': 3, 'workerNodesCount': 6, 'infraNodesCount': 3, 'masterNodesType': 'm6a.xlarge', 'workerNodesType': 'm6a.xlarge', 'infraNodesType': 'r5.xlarge', 'totalNodesCount': 12, 'clusterName': 'ci-op-yfrkcx95-96149-rzgmn', 'ocpVersion': '4.15.0-0.nightly-2024-10-11-141906', 'networkType': 'OVNKubernetes', 'buildTag': '1844750966205714432', 'jobStatus': 'success', 'buildUrl': 'https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-qe-ocp-qe-perfscale-ci-main-aws-4.15-nightly-x86-payload-control-plane-6nodes/1844750966205714432', 'upstreamJob': 'periodic-ci-openshift-qe-ocp-qe-perfscale-ci-main-aws-4.15-nightly-x86-payload-control-plane-6nodes', 'upstreamJobBuild': '4.15.0-0.nightly-2024-10-11-141906-qe-perfscale', 'executionDate': '2024-10-11T16:08:02Z', 'jobDuration': '1634', 'startDate': '2024-10-11T16:08:02Z', 'endDate': '2024-10-11T16:35:16Z', 'startDateUnixTimestamp': '1728662882', 'endDateUnixTimestamp': '1728664516', 'timestamp': '2024-10-11T16:35:29Z', 'ipsec': 'false', 'ipsecMode': 'Disabled', 'fips': 'false', 'encrypted': 'false', 'encryptionType': '', 'publish': 'External', 'computeArch': 'amd64', 'controlPlaneArch': 'amd64'}
{'query': {'bool': {'must': [{'query_string': {'query': 'benchmark: "cluster-density-v2$" AND workerNodesType: "m6a.xlarge" AND masterNodesType: "m6a.xlarge" AND masterNodesCount: "3" AND workerNodesCount: "6" AND platform: "AWS" AND ocpVersion: 4.15* AND jobStatus: success'}}]}}}
{'query': {'query_string': {'query': '( uuid: "bc78c231-6d7f-4726-8afb-ef4a87d84331" ) AND metricName: "jobSummary"'}}}
{'query': {'query_string': {'query': '( uuid: "f1055fa3-8c5c-4cc9-b75d-39594c499a58" OR uuid: "4d59851e-b665-4c6e-914d-fa7fea1e2cb0" OR uuid: "ce07640a-a094-411a-84d2-f24d5cd6d064" OR uuid: "14a0d6ae-d4e4-435e-95ab-2a37b5633d3c" OR uuid: "c2c59959-7980-4d04-8495-c42d75957e28" OR uuid: "7335a811-95a3-453b-ab8d-182b6c1d1c3e" OR uuid: "44ee7997-9327-4dff-8d42-e64a40110cfe" OR uuid: "28e7d49f-535e-43ca-a383-feff6e669908" OR uuid: "d80cfd21-d005-4085-8f4f-57117a9a016f" OR uuid: "d3b26ef5-54a5-44df-b025-e6465e00a596" ) AND metricName: "jobSummary"'}}}
INFO: 127.0.0.1:51932 - "GET /api/v1/ocp/graph/bc78c231-6d7f-4726-8afb-ef4a87d84331 HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/anyio/streams/memory.py", line 98, in receive
return self.receive_nowait()
File "/usr/local/lib/python3.9/site-packages/anyio/streams/memory.py", line 93, in receive_nowait
raise WouldBlock
anyio.WouldBlock
I see a bunch of errors in the dashboard backend pod. Also it doesn't render frontend UI at all while applying filters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minimize the search this is only happening to me when looking at cluster-density-v2
jobs in ocp tab, seems to be realted to the Graph of this type of jobs
if self.prev_es and self.new_es: | ||
self.new_index = self.new_index_prefix + (self.new_index if indice is None else indice) | ||
today = datetime.today().date() | ||
seven_days_ago = today - timedelta(days=7) | ||
if end_date and end_date < seven_days_ago: | ||
new_results = [] | ||
new_results = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this being changed to a dictionary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems now it has extra values, I am guessing to deal with pagination, because now it includes the total
@@ -135,8 +153,6 @@ const OCPReducer = (state = initialState, action = {}) => { | |||
return { ...state, categoryFilterValue: payload }; | |||
case TYPES.SET_OCP_FILTER_OPTIONS: | |||
return { ...state, filterOptions: payload }; | |||
case TYPES.SET_OCP_FILTERED_DATA: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did we remove this and not other filters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few questions in general while reviewing this PR
- How are the queries being changed to from the existing ones? What are the reasons behind making these choices?
- What is plan of implementation? .i.e in what order do we want our changes to be merged. Can we take them in small and atomic PRs breaking down into smaller components.
- How are we making sure that any of these changes are not breaking the existing components in the dashboard?
@MVarshini Can we please take a step back and come up with a design document with HLD and LLD for this work? I think we need an alignment on the execution/implementation plan first before we make any code changes. Thanks for understanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find that the Quay
tab is not working.
Some comments regarding what Vishnu already pointed out.
Edit:
The home page seems also to only show jobs from the telco
product
query["aggs"].update(aggregate) | ||
|
||
isFilterReset = False | ||
response = await es.filterPost(query=query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minimize the search this is only happening to me when looking at cluster-density-v2
jobs in ocp tab, seems to be realted to the Graph of this type of jobs
if self.prev_es and self.new_es: | ||
self.new_index = self.new_index_prefix + (self.new_index if indice is None else indice) | ||
today = datetime.today().date() | ||
seven_days_ago = today - timedelta(days=7) | ||
if end_date and end_date < seven_days_ago: | ||
new_results = [] | ||
new_results = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems now it has extra values, I am guessing to deal with pagination, because now it includes the total
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a quick pass through, so some comments.
I think the big thing here is that I believe the pandas and numpy versions we set were only needed for compatibility with Python 3.10. In exploring the FastAPI unhandled exception behavior today (which hadn't been working for me, either) I discovered that several of the packages we're using (including uvloop
) are not (fully) compatible with Python 3.10, and the backend container is built on CentOS stream9, where 3.9.19 is the standard Python.
If we stick to Python 3.9, the unhandled exception traceback works, which I think negates the need for your problematic try blocks, and also I believe the need for the numpy and pandas version changes. The containers build and run fine against main
with Python 3.9. (And if we're going to fully support Python 3.10, as I said, we need to change more than just numpy and pandas, so maybe we should defer that complication until we need it.)
@@ -4,8 +4,9 @@ | |||
from app import config | |||
import bisect | |||
import re | |||
import traceback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you actually using traceback
? I don't see it in the diff lines in this file.
# return await self.scan_indices(self.new_es, self.new_index, query, timestamp_field, start_date, end_date, size, offset) | ||
#else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be removed or restored -- commented-out code is generally bad, and these lines drastically change the meaning of the block.
except Exception as err: | ||
print(f"{type(err).__name__} was raised19: {err}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing that you did this for something along the same lines as my attempt at a default unhandled exception reporter. There's something flaky about the FastAPI default exception handling -- we should be getting full tracebacks on the console, while the API call should get a 500 response.
This code, however, is printing a message without much context and then ignoring the exception, implicitly returning None
from the method, which is almost certainly not what you want. I'd recommend removing these try
blocks.
I'm not sure what Python version the CPT dashboard deployment usually uses. The toml file says "^3.9", which I think means anything >= 3.9 but < 4.0 -- however, the container installs on centos:stream9
, where python3
is 3.9.19.
I found experimentally that running Python 3.10 breaks the traceback, possibly because several packages (including uvloop 0.15.2) seem to not be 3.10 ready. It does appear to work with Python 3.9, however. So if you're testing under Python 3.9 and not getting tracebacks on the server stderr on an unhandled exception, you're seeing something different ... but if you're running it on Python 3.10, try going to Python 3.9 instead.
That'd factor out some of these changes, as well as making real problems behave a bit better.
query['query']['bool']['filter'][0]['range'][timestamp_field]['lte'] = str(min(end_date, each_index.timestamps[1])) | ||
query['query']['bool']['filter'][0]['range'][timestamp_field]['gte'] = str(max(start_date, each_index.timestamps[0])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really bulky and long (and harder to read in Python than the equivalent would be in JavaScript). I'd recommend
f = query["query"]["bool"]["filter"][0]["range"][timestamp_field]
f["lte"] = str(min(end_date, each_index.timestamps[1]))
f["gte"] = str(max(start_date, each_index.timestamps[0]))
(And note that while both Python and JavaScript allow single quotes, standard formatters for both will generally convert them to double-quotes. And this code seems to go back and forth between single and double quoting, which isn't ideal...)
pandas = "2.2.2" | ||
numpy = "1.26.4" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that, in line with my comments on the unhandled exception traceback, I discovered this afternoon that the CPT dashboard backend, with its current set of dependencies, should only be running on Python 3.9. (At least, the unhandled exception traceback only works on 3.9, but I wouldn't want to guarantee that everything else works as expected on 3.10!) The problems we experienced during container startup that led to this change also seem to be tied to Python 3.10, so (if we stick to Python 3.9) I'm not sure we really need this, either.
Or, if we're going to support Python 3.10, we need to update at least uvloop
, probably uvcorn
and fastapi
. (And maybe others, I gave up untangling this after a while and just went back to 3.9.)
|
||
const startIdx = page !== 1 ? (page - 1) * perPage : 0; | ||
const endIdx = page !== 1 ? page * perPage - 1 : perPage; | ||
//dispatch(setOCPPageOptions(page, perPage)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More commented-out code. Is this needed?
// const applyOtherFilter = () => { | ||
// removeStatusFilter(); | ||
// setOtherSummaryFilter(props.type); | ||
// }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to keep this? (Souvenir? Battle trophy? 😆 )
@@ -84,7 +84,7 @@ const MetricsTab = (props) => { | |||
<MetricCard | |||
title={"Others"} | |||
footer={summary?.othersCount} | |||
clickHandler={applyOtherFilter} | |||
// clickHandler={applyOtherFilter} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer clickable? I don't actually know what this UI element does, so I can't comment on the appropriateness ... but again, commented-out code just raises questions, and should be avoided. (And if it's important, comment why it was commented.)
} | ||
aggregate = {} | ||
for x,y in keysDict.items(): | ||
obj = {x:{"terms":{"field":y,"size":10}}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10 instances is actually the Elasticsearch default for aggregations, so there's no point in specifying a size unless you want it bigger or smaller.
And, especially if you want to frequently override the instance limit here, a global (or class) constant would be better than an inline literal so they're easier to find and control.
except Exception as err: | ||
print(f"{type(err).__name__} was raised14: {err}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think relying on the default unhandled exception traceback is better for debugging -- and simply returning None
after a print
in production seems bad. (And this should work with Python 3.9 and the existing package dependencies.)
Type of change
Description
This PR includes handling Pagination, sorting and filtering of OCP page at the server side.
Related Tickets & Documents
Checklist before requesting a review
Testing