Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exchanges continually report as not_responsible #1011

Open
martincox opened this issue Mar 20, 2020 · 11 comments
Open

Exchanges continually report as not_responsible #1011

martincox opened this issue Mar 20, 2020 · 11 comments

Comments

@martincox
Copy link
Contributor

2020-03-20 10:08:14.611 [info] <0.1670.0>@riak_kv_entropy_manager:maybe_exchange:986 Attempt to start exchange between {1096126227998177188652763624537212264741949407232,3} and 1141798154164767904846628775559596109106197299200 resulted in not_responsible

Must be the case that the exchange_queue in entropy_manager is being populated errorneously. I'll dig into this a bit more, doesn't seem like expected behaviour, and I've not noticed it in previous builds. Have you seen this befgore @martinsumner?

@martinsumner
Copy link
Contributor

Is this during a cluster change operation, or ongoing in steady-state running?

@martincox
Copy link
Contributor Author

martincox commented Mar 20, 2020

Ongoing, after the cluster has been initialised and transfers have completed etc.

@martinsumner
Copy link
Contributor

Hmm, looked back at the last time I tested with this form of AAE enabled - and I got this constantly in the background as well.

I've touched the code here, so let me have a look - this isn't right.

I have it in Riak 2.9.1, which version have you found it in?

@martincox
Copy link
Contributor Author

Merged bet branches into a 3.1 and saw it there, so assumed it was something from my merge. But looked on 3.0 and it's there also.

@martinsumner
Copy link
Contributor

Ah - my change was to add the log!

@martincox
Copy link
Contributor Author

Hmmm, so it's possibly something that's been there for a while, but silently?

@martinsumner
Copy link
Contributor

martinsumner commented Mar 20, 2020

Perhaps. I may go back to when the log was added to make sure it occurs in that version. The log got introduced in 2.9.0

@martinsumner
Copy link
Contributor

martinsumner commented Mar 20, 2020

I have a working theory, that this is just wrong:

https://github.com/basho/riak_kv/blob/develop-3.0/src/riak_kv_entropy_manager.erl#L958-L964

an Exchange is {LocalIdx, RemoteIdx, IndexN} - and prune_exchanges/1 in some cases switches the local and remote index!!?? Why?

The RemoteIdx must be on another node (if target_n__val has been met by the cluster logic). So when this happens, and the RemoteIdx is used as a LocalIdx, that partition cannot be a primary when he hit this point - https://github.com/basho/riak_kv/blob/develop-3.0/src/riak_kv_entropy_manager.erl#L895.

I just don't understand what prune_exchanges/1 is trying to achieve. It has been there since the introduction of AAE.

@martinsumner
Copy link
Contributor

The only theory is that prune_exchanges/1 is to make sure that each exchange happens only in 1 direction. So by screwing it by switching the indexes in the other direction all exchanges occur, but once not twice?

@martinsumner
Copy link
Contributor

martinsumner commented Mar 20, 2020

So I think that does make sense. With prune_exchanges/1 doing the index switcheroo whenever LocalIdx > RemoteIdx, then 50% of the exchanges result in not_responsible.

So Two nodes will generate the exchange between A and B for IndexN: one as {A, B, IndexN}, the other as {B, A, IndexN}. The first node will run the exchange for that combination, the second will not. So the exchange will happen, but only once per cycle.

If I remove prune_exchanges/1 (just make it lists:usort(Exchanges)) - then all exchanges result in ok - but for each cycle every exchange happens twice (once from each node involved in the exchange). So prune_exchanges/1 is deliberately hitting an unexpected response, to behave in an expected way.

So I think the not_responsible is safe to ignore for now. Whether the code should do this or not ... I'm not sure.

@martincox
Copy link
Contributor Author

Yep, you're right there I think. It seems like it is deliberately switched to prevent us from uneccessarily doing the same exchange twice, instigated from both the owner and the fallback, which makes sense. Just seems a crappy way of going about it - unless there's more to it that I'm not seeing.

Changing the list comp in prune_exchanges produces the same thing but without the incorrect exchanges. Works as expected, with no not_responsible exchanges being handled.

L = [Exchange || {A, B, _IndexN} = Exchange <- Exchanges, A < B]

I'll make this change PR anyway and let it sit for review for a bit / mull it over.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants