Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-29817 Eliminate seeks, scans & wildseeks from KeyStatsCollector and track cache stats in hthor #17541

Merged
merged 1 commit into from
Oct 4, 2023

Conversation

shamser
Copy link
Contributor

@shamser shamser commented Jul 3, 2023

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

@shamser shamser changed the title HPCC-29492 Eliminate querySeeks, queryScan & queryWildSeeks from ContextLogger HPCC-29817 Eliminate querySeeks, queryScan & queryWildSeeks from ContextLogger Jul 3, 2023
@shamser shamser requested a review from jakesmith July 4, 2023 09:03
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamser - looks in good in general, please see feedback.

@@ -672,6 +673,108 @@ class THORHELPER_API IndirectCodeContext : implements ICodeContext
protected:
ICodeContext * ctx;
};
class CThorBaseContextLogger : public CSimpleInterfaceOf<IContextLogger>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial/formatting: we norrmally have a new line before a new class definition.

system/jlib/jstats.cpp Show resolved Hide resolved
mutable CRuntimeStatisticCollection stats;

public:
CThorBaseContextLogger() : stats(jhtreeCacheStatistics)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make this a more generliazed stats logger context, it would be good to rename it (as it's not just 'Thor') - call it something like CStatsContextLogger, and pass in the mapping:

CThorBaseContextLogger(CRuntimeStatisticCollection  &_mapping) : mapping(_mapping)

{
va_list args;
va_start(args, format);
CTXLOGva(MCdebugProgress, unknownJob, NoLogMsgCode, format, args);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think worth passing in a LogMsgJobInfo & to the ctor and using here and in logOperatorExceptionVA below.
then can the CThorContextLogger::CTXLOG and CThorContextLogger::logOperatorExceptionVA can be deleted.

@@ -3036,11 +3012,11 @@ class KeyedLookupPartHandler : extends ThreadedPartHandler<MatchSet>, implements
Owned<IKeyManager> manager;
IAgentContext &agent;
DistributedKeyLookupHandler * tlk;

CHThorContextLogger &contextLogger;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: unless needs to be an implementation reference, better if it was an interface reference (IContextLogger &)
Declaring it as a CHThorContextLogger makes it look like it needs to / code relies on it.

and similarly in a few other places

// improvement: override constructor to store setHttpIdHeaderNames, override CTXLOG & logOperatorExceptionVA to log LogMsgJobInfo
class CHThorContextLogger : public CThorBaseContextLogger
{
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With a rename of the base to CStatsContextLogger (+ passing in mapping + job),
I think this can be deleted in favour of using CStatsContextLogger.
Thor's derivation can remain for it's specialization if need be.

unsigned postFiltered;
unsigned wildseeks;
CHThorContextLogger contextLogger;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does there need to be a logger per activity type that uses one..
Could there just be 1 in the agent (which is how it's done in roxie) ?

@@ -1179,7 +1179,7 @@ class InlineXmlDataReader : public WorkUnitRowReaderBase

//---------------------------------------------------------------------------------------

static const StatisticsMapping graphStatistics({});
static const StatisticsMapping roxieGraphStatistics({});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to rename, related to comment re. moving Thor mapping (back to Thor code),

unsigned __int64 seeks = stats.getStatisticValue(StNumIndexSeeks);
unsigned __int64 scans = stats.getStatisticValue(StNumIndexScans);
unsigned __int64 skips = stats.getStatisticValue(StNumIndexSkips);
logctx.CTXLOG("Indexread returning result set %d rows from %" I64F"u seeks, %" I64F"u scans, %" I64F"u skips", processed-processedBefore, seeks, scans, skips);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may not be trivial, but wondering if the 4 incidents of this type of code ( I think they're same with different messages?) could be commoned up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly but would prefer to keep as it is to stop the scope of this jira from creeping.

public:
CThorContextLogger() : stats(jhtreeCacheStatistics)
CThorContextLogger()
{
if (globals->hasProp("@httpGlobalIdHeader"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps pass a IPropertyTree in to CStatsContextLogger too and move this code there.

@shamser shamser force-pushed the issue29817 branch 10 times, most recently from a63c70a to 2b74f3b Compare July 20, 2023 10:13
@shamser shamser requested a review from jakesmith August 1, 2023 10:06
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamser - made 1st review pass at this, it does look a lot cleaner overall, but I think IContextLogger/IRoxieContextLogger and their implementations need looking at/refactoring.
See comments.

const LogMsgJobInfo & logMsgJobInfo;
LogTrace logTrace;
public:
IContextLogger(const IPropertyTree *cfg, const LogMsgJobInfo & _logMsgJobInfo=unknownJob) : logMsgJobInfo(_logMsgJobInfo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interfaces should really not have concrete implementations.
I realize IContextLogger already had some, but with a ctor it looks pointedly wrong.

Whatever is implementing IContextLogger should have the ctor and implement the concrete implementations.

In this case, if we want a default implementation with a ctor, there should be a CDefaultContextLogger or similar, and other implementations can derive from it.

I see that IRoxieContextLogger also broke this rule, but given a default IContextLogger implementation, implementations of IRoxieContextLogger can also derive from that default implementation and override what they want to (i.e. the "ContextLogger" implementation in ccd.hpp)

@@ -59,6 +59,7 @@ enum TracingCategory
class LogItem;
interface IRoxieContextLogger : extends IContextLogger
{
IRoxieContextLogger(IPropertyTree *cfg) : IContextLogger(cfg) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see related comments in jlog.hpp re. IContextLogger.
interfaces should really not have constructors and concrete implementations.

ContextLogger(const ContextLogger &); // Disable copy constructor
public:
IMPLEMENT_IINTERFACE;

ContextLogger() : stats(accumulatedStatistics, true)
ContextLogger() : IRoxieContextLogger(topology), stats(accumulatedStatistics, true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see other comments re. interfaces shouldn't implement.
I know this is not entirely new, but looks very wrong now with interface taking ctor.
There should instead be a base class that implements IRoxieContextLogger and inherit from it.

@shamser shamser force-pushed the issue29817 branch 2 times, most recently from 4a8c9b5 to a72b6ae Compare August 14, 2023 12:20
@shamser shamser requested a review from jakesmith August 14, 2023 13:06
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamser - looks cleaner, but it would best if this non-trivial refactoring of these IContextLogger and IRoxieContextLogger implementations were in a separate PR.

And I think can be done without removing roxie's extended interface (IRoxieContextLogger), which should also be pure.

The implementations could share a common base, either by the concrete implementations implementing the pure methods and calling the common base methods, or to avoid all the trivial 1-liner method implementations, the base class could be a template class, that takes either IContextLogger or IRoxieContextLogger, e.g.:

template class <INTERFACE>
class CDefaultContextLogger : public CInterfaceOf<INTERFACE>
{
..
}

class CBaseRoxieContextLogger : public CDefaultContextLogger<IRoxieContextLogger>
{
..
}

That way, the roxie code will not need changing to use a class and can continue to use IRoxieContextLogger (which will now be purely abstract).

@shamser shamser changed the title HPCC-29817 Eliminate querySeeks, queryScan & queryWildSeeks from ContextLogger HPCC-29817 Eliminate querySeeks, queryScan & queryWildSeeks from KeyStatsCollector Sep 21, 2023
@shamser shamser changed the base branch from candidate-9.2.x to candidate-9.4.x September 21, 2023 16:50
@shamser shamser changed the title HPCC-29817 Eliminate querySeeks, queryScan & queryWildSeeks from KeyStatsCollector HPCC-29817 Eliminate seeks, scans & wildseeks from KeyStatsCollector and track cache stats in hthor Sep 21, 2023
@shamser shamser force-pushed the issue29817 branch 2 times, most recently from a167289 to 5260a0a Compare September 21, 2023 17:19
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any problems.

@ghalliday
Copy link
Member

@shamser the smoke test is failing because it fails to build.

@shamser shamser force-pushed the issue29817 branch 2 times, most recently from b10392c to ee606f3 Compare September 22, 2023 11:23
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamser - looks good, 2 minor comments

class CStatsContextLogger : public CSimpleInterfaceOf<IContextLogger>
{
protected:
const LogMsgJobInfo & job;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it may not make any practical difference (if the passed in LogMsgJobInfo stay in scope/aren't destroyed),
but I think when keeping a LogMsgJobInfo we normally copy it (only by reference when not taking a copy/referenced copy) - e.g. see LogMsg class member.

@@ -944,4 +943,6 @@ class jlib_decl RuntimeStatisticTarget : implements IStatisticTarget
extern jlib_decl StringBuffer & formatMoney(StringBuffer &out, unsigned __int64 value);
extern jlib_decl stat_type aggregateStatistic(StatisticKind kind, IStatisticCollection * statsCollection);

extern jlib_decl const StatisticsMapping jhtreeCacheStatistics;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: would be easier to find if moved with the other jstats.h StatisticMappings (around line 494)

@shamser shamser force-pushed the issue29817 branch 2 times, most recently from fc1c60a to cdfe748 Compare September 22, 2023 13:51
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamser - looks good, please squash.

@ghalliday
Copy link
Member

@shamser needs rebasing before it can be merged

…and track cache stats in hthor

Signed-off-by: Shamser Ahmed <[email protected]>
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamser - looks good

Owned<ISpan> activeSpan;
mutable CRuntimeStatisticCollection stats;
public:
CStatsContextLogger(const CRuntimeStatisticCollection &_mapping, const LogMsgJobInfo & _job=unknownJob) : job(_job), stats(_mapping) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial/formatting: double space before &_mapping

public:
CStatsContextLogger(const CRuntimeStatisticCollection &_mapping, const LogMsgJobInfo & _job=unknownJob) : job(_job), stats(_mapping) {}

virtual void CTXLOGva(const LogMsgCategory & cat, const LogMsgJobInfo & job, LogMsgCode code, const char *format, va_list args) const override __attribute__((format(printf,5,0)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial/formatting: double space before __attribute

@ghalliday ghalliday merged commit 8d6574d into hpcc-systems:candidate-9.4.x Oct 4, 2023
49 of 50 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants