What happened?
The TensorRT integration test went red when the workflow was moved from a base container with Python 3.8 to Python 3.10 as part of the 3.8 support deprecation. The problem is that the model engine staged at gs://apache-beam-ml/models/ssd_mobilenet_v2_320x320_coco17_tpu-8.trt (based on a TF Model Garden config) was built with TensorRT 8.x, while Python 3.10 containers use TensorRT 10.x, and serialized TensorRT engines are not compatible across major versions. The documentation around loading the model from the TF side is somewhat out of date or not quite what we need; additionally, we need to convert the model from a TF format to a TensorRT format, since the test does not use the ONNX route.
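Since serialized TensorRT engines ("plans") can only be deserialized by a runtime with the same major version, one option is to fail fast with a clear message instead of an opaque deserialization error. A minimal sketch (the helper name is hypothetical; in practice the runtime version would come from `tensorrt.__version__`):

```python
def plan_matches_runtime(plan_trt_version: str, runtime_trt_version: str) -> bool:
    """Return True if a serialized TensorRT engine built with
    plan_trt_version can be deserialized by runtime_trt_version.
    Engines are not portable across major versions: an 8.x plan
    fails to load under a 10.x runtime."""
    plan_major = int(plan_trt_version.split(".")[0])
    runtime_major = int(runtime_trt_version.split(".")[0])
    return plan_major == runtime_major

# The staged engine was built with TensorRT 8.x; the Python 3.10
# containers ship TensorRT 10.x, hence the red test.
assert not plan_matches_runtime("8.5.3", "10.0.1")
assert plan_matches_runtime("10.0.1", "10.2.0")
```

The long-term fix is to rebuild and restage the engine with the TensorRT version matching the container, but a guard like this makes the next version bump fail with an actionable error.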
The gradle task for the test is :sdks:python:test-suites:dataflow:py310:tensorRTtests and is defined in beam/sdks/python/test-suites/dataflow/common.gradle (line 444 at commit 2488ca1).
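The failing suite can be invoked from the root of the Beam repository with:

```shell
# Needs a CUDA-capable GPU (or run on Dataflow); without one the
# workflow fails with CUDA error 35.
./gradlew :sdks:python:test-suites:dataflow:py310:tensorRTtests
```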
When testing the workflow, ensure that you're running on Dataflow or on a machine with a GPU; the workflow will fail with CUDA error 35 if no GPU is present.
Issue Failure
Failure: Test is continually failing
Issue Priority
Priority: 2 (backlog / disabled test but we think the product is healthy)
Issue Components
Component: Python SDK
Component: Java SDK
Component: Go SDK
Component: Typescript SDK
Component: IO connector
Component: Beam YAML
Component: Beam examples
Component: Beam playground
Component: Beam katas
Component: Website
Component: Infrastructure
Component: Spark Runner
Component: Flink Runner
Component: Samza Runner
Component: Twister2 Runner
Component: Hazelcast Jet Runner
Component: Google Cloud Dataflow Runner