-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QC-779 Multiple mesos executor fix #613
Conversation
core/task/schedulerstate.go
Outdated
} else { | ||
log.Debug("scheduler quit, no errors") | ||
state.sm.Event(context.Background(), "EXIT") | ||
} | ||
}() | ||
} | ||
|
||
func (state *schedulerState) CopyExecutor() *mesos.ExecutorInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rename to CopyExecutorInfo
for clarity?
also, perhaps proto.Clone()
might be faster than marshalling and unmarshalling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why I did not use Clone
originally was, that it did not build, hence the stupid marshalling/unmarshalling. I got error
*mesos.ExecutorInfo does not implement protoreflect.ProtoMessage (missing method ProtoReflect)
but the object in defined in filename.pb.go
but now it works... ffs literally the same code I used earlier today. I wonder what I did differently before (I even copied the same code as before :D)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it could have been that you imported a wrong lib, I saw in GoLand that you could pick three different libraries with proto.Clone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe... truth is, that I let lsp handle adding dependencies...
Is there still a risk that we deep-copy |
Nope, this was the only place where the executor from state is written to. It is supposed to be this immutable object that is safe to read from, so it is just created as a part of |
And one more thing... this is a problem of golang, as there is no const keyword like in eg. c++, rust, ... If this was correctly written c++ code using const or rust code, this bug would not be able to happen. So maybe it might be worth it to note this somewhere, when someone will be doing +- of golang language... |
… not race condition creating copy of ExecutorInfo for each mesos offer so there is not race condition
Indeed, I missed the fact that executor ID is set just a line after the (now) deep-copy:
before I thought that the right executor ID is already in all clear, thanks! |
@@ -36,6 +36,7 @@ import ( | |||
"github.com/AliceO2Group/Control/common/logger/infologger" | |||
"github.com/AliceO2Group/Control/core/controlcommands" | |||
"github.com/AliceO2Group/Control/core/task/schedutil" | |||
"github.com/gogo/protobuf/proto" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gogo/protobuf
is deprecated though, can you try using a different one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this is the official one now: https://github.com/golang/protobuf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it will not work, see previous build error.. but this package is used in mesos, so we are using it either way inherently. So I don't mind using it for cloning an object, that is already created by this package in mesos
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I personally also don't mind using marshall and unmarshall because we are using it once per run and per offer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, since mesos brings it anyway, let's keep gogo/protobuf, thanks for checking.
sorry for some of the formatting fixes.
There was a race condition which was in method
makeTaskForMesosResources
where we sharingglobal
object to write. This should create a deep copy of given object and it seems to fix the problem on staging.