use pod logs as feedback mechanism from flytekit to flytepropeller #3838
Replies: 4 comments 6 replies
-
Checking logs at scale can be extremely expensive, as you will invariably stream logs into propeller memory?
-
Going to keep riffing on this with myself. Many logging frameworks deploy a sidecar to handle seamless integration. We could use a lightweight sidecar (container B) to accept feedback requests from the flytekit process (container A), over a localhost connection or similar, and dump them immediately to stdout. Propeller can then retrieve the sidecar's logs through the k8s API server as a feedback mechanism. There would be very little traffic, only as much as we write (hopefully 10s of lines), and it would give all of the same benefits as parsing logs directly from the flytekit container.
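To make the sidecar idea a bit more concrete, here is a minimal sketch of what container B could look like. Everything in it is an assumption for illustration: the names `ReportSidecar`, `ReportHandler` and `send_report`, the newline-delimited TCP protocol, and the `ok` acknowledgement are all hypothetical, not anything flytekit or flytepropeller provides today.

```python
import socket
import socketserver
import sys


class ReportHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: copy every line received to the sidecar's output."""

    def handle(self):
        for raw in self.rfile:
            line = raw.decode("utf-8", errors="replace").strip()
            if not line:
                continue
            # Writing to stdout is what makes the line visible in the Pod logs.
            print(line, file=self.server.sink, flush=True)
            # Acknowledge so the sender knows the line was written.
            self.wfile.write(b"ok\n")


class ReportSidecar(socketserver.ThreadingTCPServer):
    """Hypothetical sidecar server (container B), listening on localhost only."""

    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address, sink=None):
        super().__init__(address, ReportHandler)
        # `sink` defaults to stdout; it is injectable only to ease testing.
        self.sink = sink if sink is not None else sys.stdout


def send_report(port, line, host="127.0.0.1"):
    """What the flytekit process (container A) might call to emit one report."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("utf-8") + b"\n")
        return conn.makefile("rb").readline().strip() == b"ok"
```

The sidecar would be started with something like `ReportSidecar(("127.0.0.1", 9000)).serve_forever()`, where the port number is again just a placeholder. Binding to localhost keeps the endpoint reachable only from containers in the same Pod.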
-
2023-11-09 Contributor's meetup notes: with support from @hamersaw, @fg91 volunteers to champion an RFC for this idea.
-
@fg91 do you still plan to shepherd an RFC for this?
-
Motivation
As Flyte scales, users are requesting more information flowing from flytekit to flyteconsole. This runtime information may include additional configuration, execution metadata, etc. The current approach is to use a blobstore to write another file, or append to an existing file. This has many issues, foremost of which:
(1) small writes, especially file appends, are an anti-pattern.
(2) we are essentially piggybacking on a storage framework for inter-process communication to break out of k8s.
Proposal
This proposal is to use k8s Pod logs to encode flytekit "reports" which may then be parsed by flytepropeller and included in TaskExecutionEvents. Information will be regex parsable, where each specific category has a unique regex, before being emitted in the container logs. For example, runtime configurable log links could be encoded as one such category.
Flytepropeller will check Pod logs at most once every N (configurable) seconds, rather than every single time the task is evaluated, to reduce k8s apiserver stress. Check frequency versus near-real-time reporting is an obvious tradeoff. Additionally, flytepropeller will automatically check on Pod startup and completion to ensure correctness regardless of the throttling. At the risk of over-simplifying this, flytepropeller uses a regex to parse each category of information and uses the result to modify the current TaskExecutionEvent. This can be further optimized by including a lastCheckTimestamp to reduce duplicate reporting.
Use Cases
There are a number of current efforts being blocked by a lack of flytekit to flytepropeller feedback. Currently in consideration (with a brief explanation of how this proposal is a solution) are:
Considerations
The log information must be a single line and very small (128 / 256 character max?), so logs are not unnecessarily bloated. The increase in verbosity will affect the performance of each log check and parse from flytepropeller. I do not think there is a Pod log k8s request field to get all the logs after timestamp N (though the sinceTime and sinceSeconds fields of PodLogOptions are worth checking for this); without one, every check will retrieve the entirety of the logs.
Ideas
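One way to bound the cost of log checks is the throttling described in the proposal. The sketch below is illustrative only and is written in Python for brevity, not as flytepropeller code. `ThrottledLogChecker` and the `fetch(since)` callback are hypothetical names; `fetch` stands in for whatever call actually retrieves Pod logs, and it is assumed here that the call can be limited to lines newer than `since`.

```python
import time
from typing import Callable, List, Optional


class ThrottledLogChecker:
    """Hypothetical helper: fetch Pod log lines at most once per `interval` seconds."""

    def __init__(
        self,
        fetch: Callable[[Optional[float]], List[str]],
        interval: float,
        clock: Callable[[], float] = time.time,
    ):
        self._fetch = fetch        # fetch(since) -> lines newer than `since` (assumed)
        self._interval = interval  # the configurable N seconds from the proposal
        self._clock = clock
        # Plays the role of lastCheckTimestamp; None means "never checked".
        self._last_check: Optional[float] = None

    def check(self, force: bool = False) -> List[str]:
        """Return new log lines, or [] if the throttle window has not elapsed.

        Pass force=True on Pod startup and completion so those checks
        always happen regardless of throttling.
        """
        now = self._clock()
        due = self._last_check is None or now - self._last_check >= self._interval
        if not (force or due):
            return []
        lines = self._fetch(self._last_check)
        # Record the time taken before the fetch, so lines written while the
        # fetch was in flight are seen again rather than skipped.
        self._last_check = now
        return lines
```

Recording the pre-fetch time errs toward duplicate lines instead of missed ones, so the consumer would still need to tolerate an occasional repeated report.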
TRACE level, configurable log links are INFO level, etc.
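To illustrate the "one regex per category" encoding together with the single-line and size constraints, here is a minimal sketch of both ends. The `FLYTE_REPORT` prefix, the `log_link` category layout, the function names, and the 256-character cap are all assumptions made up for this example; the proposal does not fix a wire format.

```python
import re
from typing import Dict, List

# Assumed cap, picked from the "128 / 256" range floated in the considerations.
MAX_REPORT_CHARS = 256

# Hypothetical wire format: one regex per report category.
CATEGORY_PATTERNS: Dict[str, re.Pattern] = {
    "log_link": re.compile(
        r"^FLYTE_REPORT log_link name=(?P<name>\S+) uri=(?P<uri>\S+)$"
    ),
}


def encode_log_link(name: str, uri: str) -> str:
    """flytekit side: build a single-line report, rejecting oversized input."""
    if not name or not uri:
        raise ValueError("name and uri must be non-empty")
    if any(ch.isspace() for ch in name + uri):
        # Whitespace (including newlines) would break the one-line format.
        raise ValueError("name and uri must not contain whitespace")
    line = f"FLYTE_REPORT log_link name={name} uri={uri}"
    if len(line) > MAX_REPORT_CHARS:
        raise ValueError(f"report exceeds {MAX_REPORT_CHARS} characters")
    return line


def parse_reports(log_text: str) -> List[dict]:
    """propeller side: pick report lines out of ordinary container output."""
    reports = []
    for line in log_text.splitlines():
        line = line.strip()
        if len(line) > MAX_REPORT_CHARS:
            continue  # too long to be a valid report, skip without matching
        for category, pattern in CATEGORY_PATTERNS.items():
            match = pattern.match(line)
            if match:
                reports.append({"category": category, **match.groupdict()})
                break
    return reports
```

On the flytekit side this would amount to `print(encode_log_link("tensorboard", "http://example.com/tb"))`; the parsed dictionaries are what would then be folded into the TaskExecutionEvent. Anchoring each pattern with `^` and `$` keeps ordinary user log lines that merely mention the prefix from being picked up.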