[ONNX] Refactor delegate memory management #32661
base: master
Conversation
c994e2b to f01eae9
Signed-off-by: Maxim Vafin <[email protected]>
Pull Request Overview
Refactors memory management in the ONNX frontend by consolidating tensor data handling from multiple pointer-based members to a single AlignedBuffer approach.
- Replaces multiple data-related fields in TensorMetaInfo with a single m_buffer field (see the sketch below)
- Unifies tensor data access through get_buffer() instead of separate data pointers and external locations
- Consolidates external data loading into reusable utility functions
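For illustration, a minimal sketch of the consolidated layout this describes; TensorMetaInfo, m_buffer and ov::AlignedBuffer are taken from the PR, while the helper functions, the omitted fields and the exact header path are assumptions:

#include <cstddef>
#include <memory>

#include "openvino/runtime/aligned_buffer.hpp"  // assumed header for ov::AlignedBuffer

// Sketch only: the real struct also carries element type, shape, name, etc.
struct TensorMetaInfo {
    // Single owning buffer replacing the old m_tensor_data / m_tensor_data_any /
    // m_tensor_data_size trio.
    std::shared_ptr<ov::AlignedBuffer> m_buffer;
};

// Hypothetical helpers showing how callers read the data and its size through
// the unified buffer instead of separate pointers.
inline const void* tensor_data(const TensorMetaInfo& info) {
    return info.m_buffer ? info.m_buffer->get_ptr() : nullptr;
}

inline std::size_t tensor_data_size(const TensorMetaInfo& info) {
    return info.m_buffer ? info.m_buffer->size() : 0;
}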
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.
Summary per file:
| File | Description |
|---|---|
| src/frontends/onnx/tests/CMakeLists.txt | Adds tensor external data utility file to test sources and includes |
| src/frontends/onnx/frontend/src/translate_session.cpp | Updates tensor data check to use unified buffer approach |
| src/frontends/onnx/frontend/src/input_model.cpp | Updates tensor data check and constructor to use buffer |
| src/frontends/onnx/frontend/src/core/tensor.hpp | Removes old data fields, adds buffer-based constructor and accessor |
| src/frontends/onnx/frontend/src/core/tensor.cpp | Implements unified buffer-based data extraction and constant creation |
| src/frontends/onnx/frontend/src/core/node.cpp | Updates attribute tensor creation to use buffer |
| src/frontends/onnx/frontend/src/core/graph_iterator_proto.cpp | Major refactor of tensor data extraction to use buffer utilities |
| src/frontends/onnx/frontend/src/core/decoder_proto.hpp | Removes old tensor data field initialization |
| src/frontends/onnx/frontend/include/openvino/frontend/onnx/decoder.hpp | Updates TensorMetaInfo structure to use buffer field |
Review not completed yet. Publishing the remarks found so far and switching to a more urgent review.
-    const uint8_t* m_tensor_data;
-    ov::Any m_tensor_data_any;
-    size_t m_tensor_data_size;
+    std::shared_ptr<ov::AlignedBuffer> m_buffer;
Missing include:
#include <memory> // For shared_ptr
Actually your AlignedBuffer already is a resource manager. Maybe you could keep it by value in TensorMetaInfo (instead of shared_ptr) in order to avoid managing the memory twice (and paying with memory fragmentation / non-locality)?
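To make the trade-off concrete, a rough sketch of the two ownership layouts being discussed; only m_buffer and ov::AlignedBuffer come from the PR, the struct names and header path are illustrative:

#include <memory>

#include "openvino/runtime/aligned_buffer.hpp"  // assumed header for ov::AlignedBuffer

// As proposed in the PR: the metadata holds a shared_ptr, so the buffer object
// itself lives in a separate heap allocation with its own control block.
struct TensorMetaInfoShared {
    std::shared_ptr<ov::AlignedBuffer> m_buffer;
};

// As suggested here: AlignedBuffer already owns and frees its allocation, so it
// could be held by value, removing one level of indirection and one allocation.
struct TensorMetaInfoByValue {
    ov::AlignedBuffer m_buffer;
};

The by-value form avoids the double bookkeeping and the locality cost mentioned above, at the price of making it harder for several tensors to alias the same underlying data, which shared_ptr gives for free.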
-target_include_directories(ov_onnx_frontend_tests PRIVATE "${CMAKE_CURRENT_SOURCE_DIR}")
+target_include_directories(ov_onnx_frontend_tests PRIVATE
I have a feeling that the objects in frontend/src belong in a separate target of their own; the need to use relative paths that traverse the file tree backwards indicates that.
In kosher CMake these libraries would export their interface directories, so that linking against them would "infect" the current target with the necessary paths. See: https://cmake.org/cmake/help/latest/prop_tgt/INTERFACE_INCLUDE_DIRECTORIES.html
Yeah, those are files from the frontend library. We include them here to avoid duplicating the code (or creating a separate target for the decoder).
-    } else {
-        return m_tensor_place->get_data_size();
-    }
+    FRONT_END_NOT_IMPLEMENTED(get_data_size);
Question: is it planned to be implemented?
Maybe yes, or maybe we will delete it later.
-        return true;
-    } else {
-        throw std::runtime_error("Unsupported memory management mode");
+template <typename T, typename Container>
I have a feeling that this belongs in AlignedBuffer (in a constructor or a factory member function).
Since we're using C++17, you'd need to detect things yourself:
#include <algorithm>    // for std::copy
#include <iterator>     // for std::begin, std::end
#include <memory>       // for std::shared_ptr, std::make_shared
#include <type_traits>
#include <utility>      // for std::declval

// C++17 has no concepts, so detect iterability ourselves.
template <typename T, typename = void>
struct is_iterable : std::false_type {};

template <typename T>
struct is_iterable<T, std::void_t<decltype(std::begin(std::declval<T>())),
                                  decltype(std::end(std::declval<T>()))>> : std::true_type {};

// These two factories return a shared_ptr, but as members of AlignedBuffer
// they could/should act in-place (i.e. fill the current AlignedBuffer).

// Selected when the container's value_type differs from the requested element
// type T: copy element by element with a cast.
template <typename T, typename Container,
          std::enable_if_t<!std::is_same<T, typename Container::value_type>::value, int> = 0>
std::shared_ptr<ov::AlignedBuffer> make_buffer_from_container(const Container& container) {
    static_assert(is_iterable<Container>::value, "Container must be iterable");
    auto buffer = std::make_shared<ov::AlignedBuffer>(container.size() * sizeof(T));
    T* ptr = buffer->template get_ptr<T>();
    for (const auto& elem : container) {
        *ptr++ = static_cast<T>(elem);
    }
    return buffer;
}

// Selected when the container already holds T: copy the range directly.
template <typename T, typename Container,
          std::enable_if_t<std::is_same<T, typename Container::value_type>::value, int> = 0>
std::shared_ptr<ov::AlignedBuffer> make_buffer_from_container(const Container& container) {
    static_assert(is_iterable<Container>::value, "Container must be iterable");
    auto buffer = std::make_shared<ov::AlignedBuffer>(container.size() * sizeof(T));
    std::copy(container.begin(), container.end(), buffer->template get_ptr<T>());
    return buffer;
}
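A hypothetical call site for the factories sketched above (container types and values are purely illustrative):

#include <cstdint>
#include <vector>

void example_usage() {
    // value_type (int64_t) differs from the requested T (int32_t):
    // the element-wise static_cast overload is selected.
    std::vector<int64_t> int64_values = {1, 2, 3};
    auto int32_buffer = make_buffer_from_container<int32_t>(int64_values);

    // value_type already matches T: the std::copy overload is selected.
    std::vector<float> float_values = {1.0f, 2.0f, 3.0f};
    auto float_buffer = make_buffer_from_container<float>(float_values);
}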
AlignedBuffer is just an interface for transferring memory; it doesn't need to have such an interface, we can do this in the frontend.
    T* ptr = buffer->template get_ptr<T>();
    size_t idx = 0;
    for (const auto& elem : container) {
        ptr[idx++] = static_cast<T>(elem);
You can use ptr directly:
*ptr++ = static_cast<T>(elem);
Hello @mvafin, I'm working on implementing the delegate interface for ORT. At first glance, the changes here seem to pose a problem for that implementation. I'd like to ask that this PR be put on hold until we can discuss the changes.
Details:
The changes allow memory management to be moved to the delegate side.
Tickets: