Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-29332 Add opentelemetry based tracing #17538

Closed
wants to merge 12 commits into from

Conversation

rpastrana
Copy link
Member

@rpastrana rpastrana commented Jun 30, 2023

  • Adds opentel cpp lib + otlp + otlp-http

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

- Adds opentel cpp lib + otlp + otlp-http

Signed-off-by: Rodrigo Pastrana <[email protected]>
@github-actions
Copy link

@rpastrana
Copy link
Member Author

@GordonSmith I will add cmake option to suppress this feature if it makes sense in the future, right now it seems open telemetry will be part of the 9.4+ platform.
Also, I added this PR: hpcc-systems/vcpkg#32
let me know if that needs to change

Copy link
Member

@GordonSmith GordonSmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana you should include this change in your PR for Open Telemetry Support as there is no benefit to having it in master without it (and there is a slight downside to including it when not needed)?

@rpastrana
Copy link
Member Author

@rpastrana you should include this change in your PR for Open Telemetry Support as there is no benefit to having it in master without it (and there is a slight downside to including it when not needed)?

Thanks @GordonSmith, that makes sense, Gavin's Jira listed this change as a stand-alone subtask but I can merge together.
Can you comment on the actual change, is it fine as-is or does it need any changes before eventual merging?

@GordonSmith
Copy link
Member

That change to vcpkg.json.in looks fine - this is an "always on" thing?

@rpastrana
Copy link
Member Author

It's likely to be an always-on feature once implemented. What are the vcpkg side effects of making it optional based on a cmake flag?

@GordonSmith
Copy link
Member

GordonSmith commented Jul 10, 2023

You wouldn't make it optional if its always needed - so what you have is good.
If it was an optional thing, then we would add its own flag to the vcpkg.json.in to enable/disable as needed.

@rpastrana
Copy link
Member Author

You wouldn't make it optional if its always needed - so what you have is good. If it was an optional thing, then we would add its own flag to the vcpkg.json.in to enable/disable as needed.

Clearly, and that's why it's currently not optional, the flag would be to suppress it until needed.

- WIP
- Adds system/tracing/tracemanager
- Adds instrumented server/client samples

Signed-off-by: Rodrigo Pastrana <[email protected]>
Copy link
Member

@GordonSmith GordonSmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only looked at the cmake stuff.

@@ -0,0 +1,73 @@
################################################################################
# HPCC SYSTEMS software Copyright (C) 2012 HPCC Systems®.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Year is wrong (do we need this block anyway?)


if (opentelemetry-cpp_FOUND)
MESSAGE(STATUS "Found Opentelemetry")
MESSAGE("^^^^^^^^^${OPENTELEMETRY_CPP_INCLUDE_DIRS}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these needed - it will fail above with the REQUIRED?

jlib
)
else()
MESSAGE(STATUS "Opentelemetry libs/headers not found")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these needed - it will fail above with the REQUIRED?

${CMAKE_BINARY_DIR}/oss
${HPCC_SOURCE_DIR}/system/tracing
)
#SET ( SRCS tracemanager.hpp opentel/instrumented/httpserver.cpp opentel/instrumented/httpclient.cpp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented code.

opentelemetry-cpp::ostream_span_exporter
#opentelemetry-cpp::jaeger_trace_exporter
#opentelemetry-cpp::otlp_grpc_exporter
#opentelemetry-cpp::otlp_grpc_log_exporter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented code


#find_package(opentelemetry-cpp CONFIG REQUIRED)

if (opentelemetry-cpp_FOUND)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file won't be included if its not found?

#find_package(opentelemetry-cpp CONFIG REQUIRED)

if (opentelemetry-cpp_FOUND)
MESSAGE(STATUS "Found Opentelemetry")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed

opentelemetry-cpp::ostream_span_exporter
#opentelemetry-cpp::jaeger_trace_exporter
#opentelemetry-cpp::otlp_grpc_exporter
#opentelemetry-cpp::otlp_grpc_log_exporter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented code

opentelemetry-cpp::http_client_curl
jlib)
else()
MESSAGE(STATUS "Opentelemetry libs/headers not found")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed

jlib)
else()
MESSAGE(STATUS "Opentelemetry libs/headers not found")
endif()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be an INSTALL?

@rpastrana rpastrana changed the title HPCC-29332 Add opentelemetry-cpp to vcpkg list HPCC-29332 Add opentelemetry based tracing Jul 30, 2023
@rpastrana rpastrana closed this Jul 30, 2023
@rpastrana rpastrana reopened this Jul 30, 2023
@rpastrana rpastrana closed this Jul 30, 2023
@rpastrana rpastrana reopened this Jul 30, 2023
@github-actions
Copy link

https://track.hpccsystems.com/browse/HPCC-29332
This pull request is already registered

1 similar comment
@github-actions
Copy link

https://track.hpccsystems.com/browse/HPCC-29332
This pull request is already registered

@rpastrana
Copy link
Member Author

@ghalliday this PR is nowhere near ready for review but as discussed, wanted to share with you to review at a very high level while I'm out. We can discuss when I return

Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments on my first pass are a bit random, but the general impression is:

  • The open telemetry code should be encapsulated within jtrace, and ideally would not leak outside it. (That may not be completely possible, but it is a good initial ideal.)
  • The jtrace code should have classes which map to the different concepts that we want to support and can be used to manage them. That was not clear when I looked at it.
  • Generally we use interfaces rather than template classes - see a more detailed comment above.

I will reread it through again before you get back - I expect some of it will become clearer.

@@ -18,21 +18,197 @@
#ifndef JTRACE_HPP
#define JTRACE_HPP

#undef UNIMPLEMENTED //opentelemetry defines UNIMPLEMENTED
#include "opentelemetry/exporters/ostream/span_exporter_factory.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid these includes in the header file if at all possible... Otherwise everything in the system is suddenly going to be including many more header files.

namespace opentel_trace = opentelemetry::trace;

//#include "jexcept.hpp" //re-define UNIMPLEMENTED
#define UNIMPLEMENTED throw makeStringExceptionV(-1, "UNIMPLEMENTED feature at %s(%d)", sanitizeSourceFile(__FILE__), __LINE__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are cleaner ways of achieving this
#define OLD_X X
#undef X
#include....
#undef X
#define X OLD_X

But it also shouldn't be in the header file unless there is no other solution.

// virtual Keys(nostd::function_ref<bool(nostd::string_view)> /* callback */)
// list of all the keys in the carrier.
// By default, it returns true without invoking callback */
template <typename T>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks wrong being a template class - it suggests a design decision hasn't been made, and requires everything is in the header - which will cause compile sloath.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This largely as-is example from OpenTel doesn't look wrong to me given what they were hoping to accomplish (get,set,keys for various representations of "httpheaders"), but we can provide certainly go a different way (do I recall you mentioning a lambda function interface )?


//Declare the span, provide appropriate attributes, and options
//Trace ID generated if no parent context is provided
auto processingRequestSpan =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect much of this code to be encapsulated inside jtrace. That may not be possible, but having open telemetry code that manages spans scattered through the code suggests poor encapsulation.

@@ -450,6 +483,9 @@ int CEspHttpServer::processRequest()
return 0;
}

//need to ensure that the span is ended when out of scope Owend<ISpan> processingRequestSpan?
processingRequestSpan->End();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be managed by a local class object - otherwise any exceptions thrown will not call this code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

//Target propogator? http? grpc? binary? custom?
//HPCC component tracing switches?
Owned<const IPropertyTree> traceConfig;
try
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure you need the try block. If this throws an exception it suggests you are calling it at the wrong point.

static constexpr const char *kCallerIdHTTPHeader = "HPCC-Caller-Id";
}

// github copilot generated comment:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It didn't really make it any clearer!


TraceManager(const std::string & moduleName)
{
TraceManager(moduleName.c_str());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost certainly going to lead to traceName pointing at a name that has been fred.

static opentelemetry::nostd::shared_ptr<opentelemetry::trace::Tracer> getTracer(std::string moduleName);

//Extracts parent contex from the carrier's headers and returns it as callerSpanId
static void getCallerSpanId(context::propagation::TextMapCarrier &carrier, std::string & callerSpanId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect these should be member functions of another class, rather than static functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss offline to clearly define "another class"

// HPCCHttpTextMapCarrier<IProperties> , HttpTextMapCarrier<http_client::Headers>, etc
// The carrier must implement the TextMapCarrier interface
// and it will determine the header which contains the parent span
template<class CARRIER>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again the use of templates suggests a lack of design. The HPCC codebase tends to use interfaces rather than templates classes.
templates are generally reserved for

  • simple integer types
  • generic general purpose container classes (e.g. hash table)
  • rarely implementing classes on top of other base classes.

@afishbeck
Copy link
Member

I think the general concept is good. But I would vastly simplify the abstraction, especially in the beginning. Then as I added features I could be forced to expose a little more complexity.

But basically my first pass would to implement something like:

**process main:**

initTracing("roxie");

**incoming query:**

inside the roxie context, or passed into it:

HttpRequest req;

//the idea of the transaction span is an opentelemetry server span, but I would make a distinction of a process vs a transaction
Owned<ITraceTransactionSpan> tspan = createTracingTransactionSpan(req.headers);  //can grab tracing headers and global-id headers.  Can also grab whatever http attributes we want
tspan->addAttribute(SemanticConventions::kProcessPid, pid); //just an example, not sure which non http semantic conventions we want.
tspan->addAttribute("hpcc.version", ver);  //add well known or custom attributes.  this is where I'd consider not getting in the way of the opentelemetry registered values.  needs some thought.


**ECL code:**

getTraceId(){getRoxieContext->getTransactionSpan()->getTraceId()}
getSpanId(){getRoxieContext->getTransactionSpan()->getSpanId()}
getGlovalId(){getRoxieContext->getTransactionSpan()->getGlobalId()}

//The ECL developers do their own proprietary logging of IDs.

**SOAPCALL:**

Owned<ITraceChildSpan> childspan = getRoxieContext->getTransactionSpan()->createChildSpan("SOAPCALL");

//add well known or custom attributes.  this is where I'd consider not getting in the way of the opentelemetry registered values.  The header with those value names are really just defined strings
span->addAttribute("hpcc.soapcall.timeout", timeout);  


**SocketCall out from soapcall:**

Owned<ITraceClientSpan> clientspan = childspan->createClientSpan("SOAPCALL client");
HttpRequest req;
clientSpan->setHeaders(req.headers, includeTraceHeaders, includeGlobalId);
sendHttpReq(req);

for now, when each span goes out of scope it is logged through jlog, likely in its opentelemetry json format?
Future we can support other exporters.

@@ -426,6 +427,11 @@ int init_main(int argc, const char* argv[])
//save off generated config to register with container. Legacy can always reference the config file, application based ESP needs generated config saved off
Owned<IPropertyTree> appConfig;

TraceManager traceManager("esp");
auto espTracer = traceManager.getTracer();//All instances of this esp process can share the same tracer same as TraceManager::getTracer("esp")
auto rootEspProcspan = espTracer->StartSpan(__func__); //Dummy top level span for the esp process
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this dummy span really needed? If so, at least for now, I would completely hide it from a process that handles requests. Even for job type processes, like say eclagent, I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not required, the component owner will define the appropriate spans, this illustrates a top level span for the current function

@afishbeck
Copy link
Member

Spans for queued workunits and agent jobs would be a different discussion.

@rpastrana
Copy link
Member Author

The comments on my first pass are a bit random, but the general impression is:

* The open telemetry code should be encapsulated within jtrace, and ideally would not leak outside it.  (That may not be completely possible, but it is a good initial ideal.)

* The jtrace code should have classes which map to the different concepts that we want to support and can be used to manage them.  That was not clear when I looked at it.

* Generally we use interfaces rather than template classes - see a more detailed comment above.

I will reread it through again before you get back - I expect some of it will become clearer.

Great comments and I'm in agreement.

@rpastrana
Copy link
Member Author

The comments on my first pass are a bit random, but the general impression is:

* The open telemetry code should be encapsulated within jtrace, and ideally would not leak outside it.  (That may not be completely possible, but it is a good initial ideal.)

* The jtrace code should have classes which map to the different concepts that we want to support and can be used to manage them.  That was not clear when I looked at it.

* Generally we use interfaces rather than template classes - see a more detailed comment above.

I will reread it through again before you get back - I expect some of it will become clearer.

@ghalliday thanks for all the insightful comments, I prob should have been more explicit that this PR is nowhere near ready and only provided to invoke high level conversation, as you suggested we can discuss actual requirements and high level structure offline. (please don't expect replies to all your current comments)

- Removes system/tracing/tracemanager

Signed-off-by: Rodrigo Pastrana <[email protected]>
Signed-off-by: Rodrigo Pastrana <[email protected]>
Signed-off-by: Rodrigo Pastrana <[email protected]>
Signed-off-by: Rodrigo Pastrana <[email protected]>
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana from an initial scan that looks much closer to what I would hope for. More of the implementations can be moved from the header file to the cpp file. I think there is a good chance you can remove all references to open telemetry in the header if you do that.

I will go back and look at it much more closely.

virtual void activate() = 0;
};

class CSpan : public CInterfaceOf<ISpan>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implementation can also move into the cpp file.

Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana some more detailed comments. A brief summary form memory:

  • Perform the traceManager initialisation once when the config is processed.
  • I don't think we need IHPCCTracer as well as ITraceManager
  • Simplify the interface a bit by removing a few functions, and not passing attributes when a span is created.
  • activate() not needed
  • move some more code into the cpp file from the header.

@@ -184,6 +184,26 @@ void checkSetCORSAllowOrigin(EspHttpBinding *binding, CHttpRequest *req, CHttpRe

int CEspHttpServer::processRequest()
{
Owned<IHPCCTracer> tracer = queryTraceManager()->initTracing("esphttpserver"); //Initialize the trace manager, and ESP trace
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the trace manager to only be intialised at process startup, rather than per request. Ideally in loadConfiguration so that it was common to all components.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure you need the tracer interface - is there a situation where you need more than one per process?

Owned<IHPCCTracer> tracer = queryTraceManager()->initTracing("esphttpserver"); //Initialize the trace manager, and ESP trace

//This is the span that will be used to track the processing of the request
Owned<IProperties> reqProcessSpanAttributes = createProperties();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attributes can be modified after the span creation, so it may be simpler (and more flexible/efficient) to not pass the spanAttributes to the span creation, and use span->setProp() afterwards.

reqProcessSpanAttributes->setProp("http.url", m_request->queryPath());
reqProcessSpanAttributes->setProp("http.host", m_request->queryHost());
reqProcessSpanAttributes->setProp("http.request_content_length", m_request->getContentLength());
reqProcessSpanAttributes->setProp("http.request_query_string", m_request->queryParamStr());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this is an example, I suspect we do not want to do this because it will contain PII.

//This is the span that will be used to track the processing of the request
Owned<IProperties> reqProcessSpanAttributes = createProperties();
reqProcessSpanAttributes->setProp("http.request_port", "8010");
reqProcessSpanAttributes->setProp("app.name", "esp");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good as an example, but we should only add attributes that are actually useful.

Owned<ISpan> reqProcessSpan = tracer->createTransactionSpan("ProcessingHTTPRequest", mockHTTPHeaders, reqProcessSpanAttributes);

//ideally the span should be activated when created
reqProcessSpan->activate();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Any reason why it cannot be activated?

if (!span)
return false;

if (!span->IsRecording())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this check for? I think recording means something specific for spans, it is not required for a valid span id.

}
}

void CTransactionSpan::setAttriburesFromHTTPHeaders(StringArray & httpHeaders)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: attributes
same for the function below.

hpccCallerId.set(httpHeaders.item(currentHeaderIndex+1));
continue;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect this function and the next to do the same thing. This doesn't contain the latter code.

For simplicitly I would create an IProperties from the array and then call the other function. If it is time critical then investigate some more later.


void CSpan::activate()
{
auto provider = opentelemetry::trace::Provider::GetTracerProvider();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to use this functionality. Because we can have multiple active spans at the same time we will need to separately track the active span.

auto tracer = provider->GetTracer(tracerName.get());
options.kind = spanKind;
name.set(spanName);
span = tracer->StartSpan(spanName, {}, options);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this StartSpan should be done a bit differently, but it will become clearer as the structure settles down.

- Removes sample usage to avoid code review complications
- Adds realistic sample in ESP processrequest
- Initialize OpenTel at loadConfig time
- Migrate OT imports to implementation
- Eliminate HPCCTracer interface
- Add appropriate interface methods
- Does not yet use LogTrace

Signed-off-by: Rodrigo Pastrana <[email protected]>
@ghalliday ghalliday self-requested a review August 24, 2023 11:25
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana various comments. We can walk through them later.

Also you seem to have merged master into this branch - that will cause problems later, you need to rebase onto master instead.


DBGLOG("Initializing OpenTel based tracing for component: '%s'", componentTag);
//Owned<CTraceManager> tracer =
queryTraceManager(componentTag); //inits opentel singleton internals
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be

initTraceManager(componentTag, componentConfiguration);

or similar.

{
opentelemetry::trace::StartSpanOptions options;
options.kind = opentelemetry::trace::SpanKind::kClient;
CSpan(&options, spanName, tracerName_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there are reason to prefer this syntax? Nothing else that I know of is coded this way. I don't think it does what you expect, I think this creates a temporary object within the function.

}

CSpan::CSpan(opentelemetry::trace::SpanKind spanKind, const char * spanName, const char * tracerName_, const IProperties * spanAttributes)
void CSpan::setOTTraceID()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: structure of the c++ file - should have the CSpan functions together, and before the classes that are derived from them.


virtual void activate() = 0;
class CNoOpSpan : public CInterfaceOf<ISpan>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is best to avoid implementations of interfaces in header files. It can lead the c++ compile to assume this is the most common implementation and special case checks for it. That tends to bloat the code, and since we're using dlls it almost never matches.

{
Owned<IProperties> httpHeaderProps = createProperties();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: This could be extracted to a general purpose function that could live in jprop.hpp:

void extractHeaders(IProperties * target, const StringArray & httpHeaders);


//This span tracks processing of httprequest
//Owned<ISpan> reqProcessSpan = queryTraceManager("esp")->createTransactionSpan("ProcessingHTTPRequest", mockHTTPHeaders);
Owned<ISpan> reqProcessSpan = queryTraceManager("esp")->createTransactionSpan("ProcessingHTTPRequest", mockHTTPHeadersSA);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should not be a name passed to the trace manager (see comment later on).


virtual void activate() = 0;
class CNoOpSpan : public CInterfaceOf<ISpan>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need this null class, or would the default implementation do? Unless it is needed I would not bother specialising.

nostd::shared_ptr<opentelemetry::trace::Span> span;
};

class CTransactionSpan : public CSpan
{
private:
opentelemetry::v1::trace::SpanContext parentContext = opentelemetry::trace::SpanContext::GetInvalid();
void setAttriburesFromHTTPHeaders(const IProperties * httpHeaders);
void setAttriburesFromHTTPHeaders(StringArray & httpHeaders);
void setAttributesFromHTTPHeaders(const IProperties * httpHeaders, opentelemetry::trace::StartSpanOptions * options);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the options parameter. Possibly ok if this was in the c++ file, not if it was in the header.

class CClientSpan : public CSpan
{
public:
CClientSpan(const char * spanName, const char * tracerName_, const IProperties * spanAttributes);
CClientSpan(const char * spanName, const char * tracerName_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constructor for a client span should almost certainly have a ISpan/CSpan as a parent parameter.

};
virtual ISpan * createTransactionSpan(const char * name, StringArray & httpHeaders) = 0;
virtual ISpan * createTransactionSpan(const char * name, const IProperties * httpHeaders) = 0;
virtual ISpan * createClientSpan(const char * name) = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should probably be members of a ISpan rather than a trace manager. They can stay here for the moment, but they should primarily(/exclusively?) be created as a member of an ISpan. They should also have a parent span parameter.

- Provides simple clear interface definition
- Provides traceman init and query
- Migrate ALL OT imports to implementation
- Add appropriate interface methods
- Provides manual parent span injection
- Does not yet use LogTrace
- Does not yet defint helm/k8s schema

Signed-off-by: Rodrigo Pastrana <[email protected]>
- Updates interface
- Forces specific Server->client/internal span relationship
- Starts OTLP support
- Attempts to remove all getcurrent references
- Does not yet use LogTrace
- Does not yet defint helm/k8s schema

Signed-off-by: Rodrigo Pastrana <[email protected]>
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana a couple of comments, but looks a good base for discussion.


interface ITraceManager : extends IInterface
{
virtual ISpan * createServerSpan(const char * name, StringArray & httpHeaders, ISpan * parentSpan) = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do server spans have parent spans?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, the parent would typically be assigned from caller's context (http headers in this case)

virtual void setSpanAttribute(const char * key, const char * val) = 0;
virtual void setSpanAttributes(const IProperties * attributes) = 0;
virtual void addSpanEvent(const char * eventName) = 0;
virtual void setActive() = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I debated not exposing this because it is very openTel specific, but ultimately decided the caller should have the ability to set a particular trace as the active (parent span for all new spans). If we don't expose this, or similar functionality the caller will be responsible for tracking and explicitly setting the parent span which is sensible I suppose

virtual const char * queryHPCCCallerID() = 0;
virtual const char * querySpanName() = 0;
virtual const char * queryTraceID() = 0;
virtual const char * queryTraceFlags() = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not clear what this function would return, but fine if it has a purpose.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The traceflags are def openTelemetry specific, and I could see an argument to keep the interface agnostic...
They, along with the traced and span id make up the span context which is used to create and inject parentContext

Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana a few minor comments. I think now would be a good time to squash all the commits, close this PR and open a new one with the branch.

@@ -184,6 +184,48 @@ void checkSetCORSAllowOrigin(EspHttpBinding *binding, CHttpRequest *req, CHttpRe

int CEspHttpServer::processRequest()
{
//Mock http headers from request
Owned<IProperties> mockHTTPHeaders = createProperties();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code should be moved to some unit tests - probably in jlibtests.cpp. That can then be expanded with other unit tests. E.g. creating server spans with and without incoming traceids, invalid trace ids etc. and check that they behave correctly.
Then can extend to check that client and internal spans can be created on parallel threads and that it doesn't cause problems, and gather overhead for creating the spans.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agree on the usefulness of unit tests

@@ -268,8 +313,16 @@ int CEspHttpServer::processRequest()
ESPLOG(LogMin, "%s %s, from %s", method.str(), m_request->getPath(pathStr).str(), m_request->getPeer(peerStr).str());
else //user ID is in HTTP header
ESPLOG(LogMin, "%s %s, from %s@%s", method.str(), m_request->getPath(pathStr).str(), userid, m_request->getPeer(peerStr).str());

//checkUserAuth could declare nested span
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably also move to the unit tests. Creating spans when we call ldap etc. is something worth recording in a jira to come back to later.


virtual void querySpanContextProperties(IProperties * contextProps) = 0;
//virtual bool injectSpanContext(CHPCCHttpTextMapCarrier * carrier) = 0;
virtual bool injectSpanContext(IProperties * contextProps) = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name could be improved. Probably get rather than inject. What is the difference with the function above? If different I suspect the parameter names should be different, and possibly combine into a single call.

virtual void setSpanAttributes(const IProperties * attributes) = 0;
virtual void addSpanEvent(const char * eventName) = 0;

virtual void querySpanContextProperties(IProperties * contextProps) = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistentency: Functions that start query generally return a value. get is generally used if returns a value in a supplied parameter.


const CHPCCHttpTextMapCarrier carrier(httpHeaders);
auto globalPropegator = context::propagation::GlobalTextMapPropagator::GetGlobalPropagator();
opentelemetry::v1::context::Context currentContext = context::RuntimeContext::GetCurrent();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling getCurrent() - I don't think that is valid.

init();
}

CSpan(const char * spanName, SpanType type, const char * tracerName_)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistency: 3 different ways of differentiating the parameter from the member i) diff name ii) use this for the member iii) append an underscore.
I don't really mind any of them (although everywhere else uses _ as a prefix rather than a suffix), but much better to be consistent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one of the many reasons I prefer to prefix all members with m_

tracerName.set(parent->queryTraceName());
this->type = type;

init();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent: extra space

@rpastrana rpastrana closed this Sep 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants