
Shared communicator #652

Merged
merged 4 commits into from
Jul 30, 2024

Conversation

christophertubbs
Contributor

This PR:

- Expanded context functionality into its own package
- Applied linter suggestions
- Made some of the logging interfaces look more like Python's vanilla logging
- Got shared Redis communicators working
- Expanded READMEs
- Fixed some GUI issues for the evaluation service
- Enhanced error tracking for the evaluation service's runner
- Abstracted out object creation functions to make it easier to slide in new functionality
- Added some fixes to Django tests
- Added a decorator for alerting when functionality is outside its supported Python version range
- Added a custom remote object manager

Adds the foundation for sharing complex or large resources across multiple processes.

closes #527


class ProxiablePropertyMixin(ProxiableGetPropertyProtocol, ProxiableSetPropertyProtocol, ProxiableDeletePropertyProtocol):
"""
Mixin functions that allow property functions (fget, fset, fdel) to be called explicitly rather than implicitly
"""

Member

I've not finished looking through all the code, so this might be a "hey, look over there" comment. When and why would you need to call a descriptor explicitly?

Contributor Author

When descriptors are referenced directly on instances, they of course pretend to be raw values rather than the functions they really are. If you want to forward the value of property b through a proxy, you don't forward the property; you forward the resolved value. That breaks down in a lot of places because much of the core vanilla Python multiprocessing code only deals with callables. If you want to forward a call to a property, you can't do so through the instance member, since that returns the value before it can be manipulated. As a result, you have to go from the top down: reach into the instance's class, get a reference to the property's descriptor, pass in the instance that owns it, and only then get direct access to the functions that may pass the value back and forth along the wire. If you want to pass along the fget function, and not the value that fget produces, you have to pass along fget itself.

The situation surrounding it is like having a step in a staircase that's half an inch taller than the rest. It doesn't seem like it'll cause a lot of issues, but it ends up causing a lot of problems if not accounted for.
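The top-down access pattern described above can be sketched in a few lines (`Widget` and its `value` property are illustrative names, not code from this PR):

```python
class Widget:
    def __init__(self) -> None:
        self._value = 42

    @property
    def value(self) -> int:
        return self._value


instance = Widget()

# Accessing through the instance triggers the descriptor protocol and
# yields the raw value, not anything callable:
assert instance.value == 42

# Reaching through the class yields the property descriptor itself;
# its fget is a plain function that can be forwarded like any callable:
descriptor = type(instance).__dict__["value"]
getter = descriptor.fget
assert getter(instance) == 42
```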

Member

@aaraney aaraney left a comment

There is a lot here. I've tried my best to make a few passes, getting more thorough on each one. I am sure that I've missed some things, though (@robertbartel, it would be great to get your eyes on this too). I still need to go through all the tests more thoroughly, but I figured this would be a good starting point.

if not hasattr(cls, "__name__"):
    raise TypeError(f"Cannot create a proxy name for {cls} - it has no consistent '__name__' attribute")

return f"{cls.__name__}{PROXY_SUFFIX}"
Member

I suggest we use __qualname__ here instead. See PEP 3155 for motivation.

Suggested change
return f"{cls.__name__}{PROXY_SUFFIX}"
return f"{cls.__qualname__}{PROXY_SUFFIX}"

Contributor Author

__qualname__ provides the full name of the class starting from its root package, so something like dmod.core.common.collections.Whatsit instead of Whatsit. This function is supposed to get the name of a class, so naming something dmod.core.common.collections.WhatsitProxy instead of WhatsitProxy will yield a syntax error.

Member

__qualname__ provides the full name of the class starting from its root package, so something like dmod.core.common.collections.Whatsit instead of Whatsit.

That's not true. From the pep

For top-level functions and classes, the __qualname__ attribute is equal to the __name__ attribute. For nested classes, methods, and nested functions, the __qualname__ attribute contains a dotted path leading to the object from the module top-level. A function's local namespace is represented in that dotted path by a component named `<locals>`.

Contributor Author

Ok, I'm going to experiment with it to ensure that everything's kosher. I feel like I ran afoul of this when I was first writing it.
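For reference, a minimal demonstration of the difference (the class names are illustrative): __qualname__ only adds the nesting path within the module, while the package path lives separately in __module__:

```python
class Outer:
    class Inner:
        pass


# __name__ is the bare class name; __qualname__ is the dotted path from
# the module top level.  Neither includes the package/module itself,
# which is stored separately on __module__.
assert Outer.Inner.__name__ == "Inner"
assert Outer.Inner.__qualname__ == "Outer.Inner"
assert isinstance(Outer.Inner.__module__, str)
```

So for a top-level class like Whatsit, the two attributes are identical, and the dotted form only appears for nested definitions.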

Args:
    scope: The scope object to add
"""
self.__scopes[scope.name] = scope
Member

Is overwriting a scope okay?

Contributor Author

I didn't consider it, given the low likelihood due to downstream and intended use, but I'm going to say no and think of a better implementation for it rather than just a del or a raise.

Contributor Author

Made inject_scope private and added a check for the name that raises a KeyError if it's already present. It's highly unlikely that this will occur, but it's now even harder to trigger.
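A minimal sketch of the guard described here; ScopeRegistry and _inject_scope are hypothetical names used for illustration, not necessarily what the PR uses:

```python
class ScopeRegistry:
    """Hypothetical container demonstrating the duplicate-name check."""

    def __init__(self) -> None:
        self.__scopes = {}

    def _inject_scope(self, scope) -> None:
        # Refuse to silently overwrite a scope that is already tracked
        if scope.name in self.__scopes:
            raise KeyError(f"A scope named '{scope.name}' is already being tracked")
        self.__scopes[scope.name] = scope
```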

Returns:
    A semi-random string of a semi-random length
"""
if min_length is None:
Member

Just make these the default values for the parameters, please. That is way clearer.

Contributor Author

Done

return total


def make_word(min_length: int = None, max_length: int = None, character_set: str = None, avoid: str = None) -> str:
Member

Unless we are doing fuzz testing, it's generally not the best idea to test using random data. Perhaps we could use something like this to automate the process of creating random data that drives a table-driven test, but this just feels like the beginning of a flaky test waiting to happen.

Contributor Author

This is used in a way that's almost like a table-driven test: it generates the values that make up tens of identity mutation tests. The specific value doesn't matter as much as "was this the same before and after mutation?"

Random words and random numbers are the easiest way to generate that amount of data.

It's used for mass parameter generation when generating individual tests. These values flow into others, and when assertions are made they produce messages like "#{step_number}) '{step_result.step_name}' failed - expected '{step_result.expected_result}' but resulted in '{step_result.value}'", for example:

[Test SharedClassTwo] Test Equality

#3) 'Use 'set_c' to set 'c' to 's79SD8' on Partially Mixed Class Two' - expected 's79SD8' but resulted in 'fUsd8Sl8'

return False

def __hash__(self):
    return -0
Member

wut

Contributor Author

Blame autocomplete for the -0, but __hash__ needs to be implemented since __eq__ is overridden.

Member

Why not this instead?

Suggested change
return -0
return id(self)
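One caveat on the suggestion: equal objects must hash equal, so when __eq__ compares by value, deriving __hash__ from the same fields is safer than id(self) (two equal instances would otherwise land in different hash buckets). A sketch with an illustrative class:

```python
class Tracked:
    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Tracked) and self.name == other.name

    # Defining __eq__ sets __hash__ to None, so instances become
    # unhashable unless __hash__ is restored.  Deriving it from the
    # fields used by __eq__ keeps the hash/equality contract intact.
    def __hash__(self) -> int:
        return hash(self.name)
```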

return False


SENTINEL = Sentinel()
Member

Why not just use object()? This is the standard way to create a sentinel value.

Contributor Author

Used for lines like if self.ignore_result and isinstance(expectation, Sentinel):. You can't do isinstance(expectation, object) or expectation == object() and get the same kind of result.

Member

Right, but you can do expectation == SENTINEL.
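For reference, the standard plain-object sentinel idiom looks like this (lookup is an illustrative function, not code from this PR):

```python
# The plain-object idiom: a module-level instance compared by identity.
SENTINEL = object()


def lookup(mapping: dict, key, default=SENTINEL):
    # `is SENTINEL` distinguishes "caller passed nothing" from every
    # legitimate value, including None.
    value = mapping.get(key, SENTINEL)
    if value is SENTINEL:
        if default is SENTINEL:
            raise KeyError(key)
        return default
    return value
```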

Contributor

@robertbartel robertbartel left a comment

As noted in some of the comments, there is some uncertainty about whether everything being put into dmod.core should be added there. Also, there is just a lot here to try to absorb at once. To address both of those things, let's break out the dmod.core changes into their own separate PR.

@@ -1 +1 @@
__version__ = '0.16.0'
__version__ = '0.18.0'
Contributor

Two things here ...

First, (and this applies to all the updated packages), I don't think any of the other packages requiring dmod.core are getting updated to ensure they use this new version. I expected that at least one of them actually needs code added in the new version. I'm actually a little surprised there weren't any check or test failures related to not updating the versions and dependencies.

Second, I think it could cause a problem for us to jump versions like this (though I'm having some deja vu here). The currently open PR #650 is also updating dmod.core, incrementing the version to 0.17.0, and updating dependents to use that. Depending on the order of merging these PRs, we could have the version go from 0.16.0 to 0.18.0 to 0.17.0. I'm not 100% sure that causes a problem, but it's the type of thing that sure looks like it could. It also seems like it would be very hard to diagnose, if this was the underlying cause for something.

Contributor

I have had the same thought about a lot of the new things in this set of changes and other similar ones involving dmod.core. On the one hand, we want to be able to reuse things. On the other, we don't want to pack too much into a "core" package that isn't really fundamental to what DMOD does. And on the third hand, too many library packages doesn't make sense either.

If all these new things are general enough to belong in the dmod.core package, then, given how large this PR is, I'd like to see the dmod.core changes (or at least the bulk of them) broken out into a dedicated PR. That can be handled first in isolation and will make things much easier to review and discuss effectively.

There may also be other similar points of decoupling - e.g., perhaps GUI-related changes - that can also be separated into individual PRs to further help make all of this easier to digest.

aaraney
aaraney previously approved these changes Jul 29, 2024
Member

Maybe we can land somewhere in the middle? The majority of the features introduced in this PR could make up an entire library or framework. I think we can and should use Python's subpackaging semantics to make that clearer than what we have now. I'm thinking we create dmod.core.multiprocessing or dmod.core.rpc and move the majority of this work into that subpackage. As for naming, I'm just throwing ideas out there; I'm certain there are better names than what I just suggested.

The main point of this is to group code in submodules that capture the expressed concepts and keep related or dependent concepts close to one another (A needs B, so A is close to B). I hear your point about classes like a Bag, but I think the classes added in this PR are more specialized and aren't really abstract data types like a bag is. Thoughts?

@@ -43,6 +44,34 @@ def get_mro_names(value) -> typing.List[str]:
return list()


def format_stack_trace(first_frame_index: int = 1) -> str:
Member

Fair enough. Seeing use of re without it being tested gives me goosebumps.

Any reason not to either wrap or directly use traceback.format_stack instead? It seems to do pretty much the same thing.
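A minimal sketch of wrapping the stdlib helper; the wrapper name is illustrative:

```python
import traceback


def current_stack() -> str:
    # traceback.format_stack() returns one pre-formatted string per
    # frame of the active call stack, oldest first, so no regex or
    # manual frame walking is required.
    return "".join(traceback.format_stack())
```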

return os.linesep.join(lines)


def parse_duration(time_delta: str) -> timedelta:
Member

Let's slightly improve this type hint.

Suggested change
def parse_duration(time_delta: str) -> timedelta:
def parse_duration(time_delta: str | bytes | timedelta) -> timedelta:

Member

Will need to import from __future__ import annotations at the top of the module to use the | union operator.
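A sketch of how the future-annotations import and the widened hint fit together. The "HH:MM:SS" parsing body is an assumption for illustration, not the PR's actual implementation:

```python
from __future__ import annotations  # makes `X | Y` annotations legal on Python < 3.10

from datetime import timedelta


def parse_duration(time_delta: str | bytes | timedelta) -> timedelta:
    # Hypothetical body: pass timedeltas through, normalize bytes,
    # then parse an assumed "HH:MM:SS" layout.
    if isinstance(time_delta, timedelta):
        return time_delta
    if isinstance(time_delta, bytes):
        time_delta = time_delta.decode()
    hours, minutes, seconds = (float(part) for part in time_delta.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)
```

Note that the future import only affects annotations (they are no longer evaluated at runtime), so the `|` syntax in the hint is safe even on interpreters that predate PEP 604.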

Member

Also, why not just use pandas.Timedelta?

- "12._313" FALSE
- "- 12.313" FALSE
"""
return re.match(
Member

Why not just do the following? There's no need to break out the regular expressions.

Suggested change
return re.match(
try:
    float(value)
    return True
except ValueError:
    return False
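The try/except version as a complete function (is_float is an illustrative name):

```python
def is_float(value: str) -> bool:
    # float() already accepts what the regex was matching; strings with
    # misplaced underscores ("12._313") or embedded spaces ("- 12.313")
    # raise ValueError and are correctly rejected.
    try:
        float(value)
        return True
    except ValueError:
        return False
```

One behavioral difference to be aware of: float() also accepts values like "1e-3", "inf", and "nan", which a digits-only pattern may have rejected.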

STRING_INTEGER_PATTERN = re.compile(r"-?\d+")
MEMBER_SPLIT_PATTERN = re.compile(r"[./]")

MutationTuple = namedtuple("MutationTuple", ["field", "value", "should_be_equal"])
Member

I guess it's more or less a stylistic preference, but why not use a dataclass here instead?
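For comparison, a frozen dataclass equivalent of the namedtuple; the field types are assumptions for illustration:

```python
from dataclasses import dataclass
from typing import Any


# Field access reads the same as with the namedtuple, but the intent
# and types are explicit, and frozen=True keeps instances immutable.
@dataclass(frozen=True)
class Mutation:
    field: str
    value: Any
    should_be_equal: bool
```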

"""
with self.__lock:
    for index, (scope, result) in enumerate(self.__scopes):
        if result == future_result:
Member

Assuming that you can have multiple futures in the same scope that are tracked by the same FutureMonitor instance, I don't think you can effectively end the scope here, since you will still have references to the futures in that scope. Wouldn't this just corrupt the state of the ObjectManagerScope?

Member

Similarly, you could call the ObjectManagerScope's scope-closed callback when the state of the system doesn't reflect the scope actually being closed.

self.logger.error(f"Scope '{scope.name}' created at:{os.linesep}{scope.started_at}")

# Failure or not, we want to remove any sort of scope information
self.end_scope(future_result=future_result)
Member

It seems like you would only want to do this if this is the last tracked item in the scope, yeah?

self.logger.error(
    f"Killing {self.class_name} #{id(self)}. Waiting {wait_seconds} seconds for the thread to stop"
)
self.__thread.join(wait_seconds)
Member

Any reason not to have a default timeout value here? If one is not provided, this could wait forever.
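A sketch of the bounded-join idea; the names and the default value are hypothetical:

```python
import threading
from typing import Optional

DEFAULT_STOP_WAIT_SECONDS = 5.0  # hypothetical default


def stop_worker(thread: threading.Thread, wait_seconds: Optional[float] = None) -> bool:
    # join(None) blocks forever; substituting a bounded default keeps
    # shutdown from hanging indefinitely.  The return value reports
    # whether the thread actually stopped within the window.
    if wait_seconds is None:
        wait_seconds = DEFAULT_STOP_WAIT_SECONDS
    thread.join(wait_seconds)
    return not thread.is_alive()
```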

elif isinstance(timeout, timedelta):
    timeout = timeout.total_seconds()

self._queue: queue.Queue[futures.Future] = queue.Queue()
Member

I'm probably missing something, but what is the thinking behind having __scopes and _queue as separate entities? Why not just have a queue of (ObjectManagerScope | None, futures.Future)?

try:
    entry = self._queue.get()
    if isinstance(entry, futures.Future) and entry.running():
        entry.cancel()
Member

I feel like we should warn or at least debug-log if the future could be cancelled (i.e. it's still running). Thoughts?
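A sketch of the suggested logging, leaning on the fact that Future.cancel() returns False when the future has already started; drain is a hypothetical name, not code from this PR:

```python
import logging
import queue
from concurrent import futures

logger = logging.getLogger(__name__)


def drain(pending: "queue.Queue[futures.Future]") -> None:
    # Future.cancel() returns False when the future is already running
    # (or finished), so that return value is the natural place to hang
    # a warning rather than cancelling silently.
    while not pending.empty():
        entry = pending.get()
        if isinstance(entry, futures.Future) and not entry.cancel():
            logger.warning("Could not cancel a future; it had already started")
```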

robertbartel
robertbartel previously approved these changes Jul 30, 2024
… functionality is out of its supported python version range, and added a custom remote object manager
…uggestions, made some of the logging interfaces look more like python's vanilla logging, got shared redis communicators working, expanded readmes, fixed some GUI issues for the evaluation service, enhanced error tracking for the evaluation service's runner, abstracted out object creation functions to make it easier to slide in new functionality,
…a linter warning with the evaluation service's urls, removed the new `DestinationParameters` object and its accompanying logic, and removed the confusion between 'output_format' and 'writer_format' when writing outputs.
@robertbartel robertbartel merged commit 76c2eb1 into NOAA-OWP:master Jul 30, 2024
8 checks passed
Successfully merging this pull request may close these issues:

Convert Evaluation Communicators to shared objects