Added public ips on a farm are not always synced with the chain #157

Closed
AhmedHanafy725 opened this issue Feb 15, 2024 · 13 comments
Labels: type_bug (Something isn't working)

@AhmedHanafy725

env: devnet

farm: 52
[screenshots]

also, farm 4432 has the same issue

@sameh-farouk
Member

sameh-farouk commented Feb 20, 2024

Update:
Still trying to figure out what is causing this.

  • Did a code review.
  • Checked if we are missing a Farm type change.
  • Reviewed previous changes made to the Farm type on the chain.

What I have found so far is that during an old tfchain migration, a state cleaning was performed to remove some bad IPs from the chain. Since the cleaning didn't emit any events, this could leave GraphQL holding more IPs than what currently exists on the chain. However, the OP mentioned that there are fewer IPs in GraphQL, indicating that a different issue is at play.

@Omarabdul3ziz
Contributor

I think we may also have a problem with deleting: some farms have IPs in the database but not on the chain. The ones I noticed are farms 138, 33, and 14.
[screenshots]

@xmonader xmonader removed this from 3.13.x Mar 21, 2024
@xmonader xmonader added this to 3.14.x Mar 21, 2024
@sameh-farouk
Member

sameh-farouk commented Mar 27, 2024

@Omarabdul3ziz Yes, there is another issue, but it is not with deleting; note what I said in my previous comment. There are indeed two issues at play here, and the one you described is due to a previous chain migration where we added a validation and removed invalid IP info from the chain state without propagating this to our processor via tracked events. A gateway always has to be public and on the same subnet as the IP address. In your case, these IPs show up in GraphQL but do not exist on the chain because they are invalid.
A GraphQL fix should follow to remove these IPs from GraphQL as well.
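For anyone reading along, here is a minimal sketch (in TypeScript, with assumed helper names; this is not the actual tfchain pallet code) of the subnet rule described above. It only covers the "gateway must be inside the IP's subnet" part:

```typescript
// Sketch only: helper names are assumptions, not the tfchain implementation.
function ipToInt(ip: string): number {
  return ip.split(".").reduce((acc, octet) => ((acc << 8) + Number(octet)) >>> 0, 0);
}

// `ip` uses CIDR notation (e.g. "1.1.1.5/24"), `gateway` is a plain IPv4 address.
function gatewayInSubnet(ip: string, gateway: string): boolean {
  const [addr, prefixStr] = ip.split("/");
  const prefix = Number(prefixStr);
  if (!addr || Number.isNaN(prefix) || prefix < 0 || prefix > 32) return false;
  const mask = prefix === 0 ? 0 : (~0 << (32 - prefix)) >>> 0;
  return (ipToInt(addr) & mask) === (ipToInt(gateway) & mask);
}

// A gateway inside the IP's /24 passes; a gateway outside the subnet fails.
console.log(gatewayInSubnet("1.1.1.5/24", "1.1.1.1")); // true
console.log(gatewayInSubnet("1.1.1.5/24", "2.2.2.1")); // false
```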

@sameh-farouk
Member

sameh-farouk commented Mar 28, 2024

Update:

  • I tracked the propagated calls and events related to this farm in our indexer.

Farm 52 was created in block 1313143 with an empty set of IPs.
It was updated a handful of times in different extrinsics since then, but this one is the most interesting:

https://polkadot.js.org/apps/?rpc=wss%3A%2F%2Ftfchain.dev.grid.tf%2Fws#/extrinsics/decode/0x1908840060e6768d0f49b892e5d9e7f8247320687c28c2651c37e168c4a244c649ebe11c014e861a913967e5a97bffeadd0c7a8c834fe2cdc7ed8331079a991b0401c129651108390a6f00d97f2ef69ba9b30e88540ce3b03d53e1eacccf7bc171c78522892502c9440003024019053400000028312e312e312e352f32341c312e312e312e3019053400000028312e312e312e362f32341c312e312e312e3019053400000028312e312e312e372f32341c312e312e312e3019053400000028312e312e312e382f32341c312e312e312e3019053400000028312e312e312e392f32341c312e312e312e301905340000002c312e312e312e31302f32341c312e312e312e301905340000002c312e312e312e31312f32341c312e312e312e301905340000002c312e312e312e31322f32341c312e312e312e301905340000002c312e312e312e31332f32341c312e312e312e301905340000002c312e312e312e31342f32341c312e312e312e301905340000002c312e312e312e31352f32341c312e312e312e301905340000002c312e312e312e31362f32341c312e312e312e301905340000002c312e312e312e31372f32341c312e312e312e301905340000002c312e312e312e31382f32341c312e312e312e301905340000002c312e312e312e31392f32341c312e312e312e301905340000002c312e312e312e32302f32341c312e312e312e30

The same farm was updated about 15 times in the same batch extrinsic with different IP addresses each time.

The processor received all the farm-updated events, but because they were propagated from the same extrinsic and included in the same block, it processed them in an incorrect order, resulting in an outdated set of IPs being persisted in the processor DB.

I would expect the processor to process them based on their index within the block, but it seems this is not the case.
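To illustrate the ordering concern, a small sketch (with an assumed event shape, not the real tfchain-graphql processor types) of how updates from one block should be applied by their in-block index:

```typescript
// Sketch only: field names are assumptions, not the real processor types.
interface FarmUpdatedEvent {
  blockHeight: number;
  indexInBlock: number; // position of the event inside its block
  publicIps: string[];  // full IP set carried by this update
}

// Apply updates in chain order: by block first, then by the event's index
// within the block, so the update the chain applied last wins.
function latestIpSet(events: FarmUpdatedEvent[]): string[] {
  const ordered = [...events].sort(
    (a, b) => a.blockHeight - b.blockHeight || a.indexInBlock - b.indexInBlock,
  );
  return ordered.length > 0 ? ordered[ordered.length - 1].publicIps : [];
}
```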

@sameh-farouk
Member

sameh-farouk commented Mar 28, 2024

Update:
Based on @AhmedHanafy725's feedback, this batch extrinsic was initiated from the dashboard by adding an IP range.
Will try to reproduce.

@sameh-farouk
Member

Update:
I wasn't able to reproduce the issue. I tried adding an IP range from the dashboard, and events from the batch call were processed with respect to their event index in the block, just as expected.

It would be very helpful if you could reproduce the issue and share the steps with us.

For now, what I can do is run a migration to remove the invalid IPs from GraphQL.
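As a rough illustration of that cleanup (pure set logic, not the actual migration code), the rows to delete are simply the IPs present in the GraphQL DB but absent from the chain:

```typescript
// Sketch of the cleanup idea: report IPs stored in the GraphQL DB that no
// longer exist on the chain. The real migration would run against the
// processor's database.
function ipsToRemove(chainIps: string[], dbIps: string[]): string[] {
  const onChain = new Set(chainIps);
  return dbIps.filter((ip) => !onChain.has(ip));
}

// Example with made-up values:
console.log(ipsToRemove(["1.1.1.5/24"], ["1.1.1.5/24", "2.2.2.2/24"])); // ["2.2.2.2/24"]
```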

@sameh-farouk sameh-farouk added this to the 3.0.0 milestone May 1, 2024
@sameh-farouk sameh-farouk moved this to Accepted in 3.14.x May 2, 2024
@sameh-farouk sameh-farouk moved this from Accepted to In Progress in 3.14.x May 2, 2024
@sameh-farouk sameh-farouk moved this from In Progress to Blocked in 3.14.x May 2, 2024
@AhmedHanafy725
Author

I tried it again, adding 5 IPs with a batch call, and nothing is showing in GraphQL: [screenshot]

while they are added on the chain
[screenshot]

@sameh-farouk
Member

@AhmedHanafy725 Which network?

@AhmedHanafy725
Author

@AhmedHanafy725 Which network?

devnet

@sameh-farouk
Member

sameh-farouk commented May 13, 2024

@AhmedHanafy725 There is a validation on the processor that does not allow saving duplicate IPs, and all the IPs you tried to add were previously stored on other farms, so they weren't added to your farm.

You can verify this using:

query MyQuery {
  publicIps(where: {ip_eq: "1.1.1.1/24"}) {
    ip
    id
    farm {
      id
      name
      farmID
      dedicatedFarm
      certification
      twinID
      pricingPolicyID
    }
    contractId
  }
}

It appears that this validation was intended to prevent the same public IP address from being added to different farms. However, this should have been implemented at the chain level to prevent such an occurrence.

How should we approach this? Do we need to alter this validation?
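For context, a rough sketch of the kind of processor-side uniqueness check being discussed (entity and function names are assumptions, not the actual processor code):

```typescript
// Sketch only: names are illustrative, not the real processor entities.
interface PublicIp {
  ip: string;    // e.g. "1.1.1.1/24"
  farmId: number;
}

// The processor skips an incoming IP if any farm already registered the same
// address; the open question above is whether this rule should live on the
// chain instead, so the dashboard call fails visibly rather than silently.
function canAddIp(existing: PublicIp[], candidate: PublicIp): boolean {
  return !existing.some((e) => e.ip === candidate.ip);
}

// "1.1.1.1/24" already belongs to another farm, so adding it again is refused.
const taken = [{ ip: "1.1.1.1/24", farmId: 7 }];
console.log(canAddIp(taken, { ip: "1.1.1.1/24", farmId: 52 })); // false
```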

@sameh-farouk
Member

sameh-farouk commented May 14, 2024

Now it's clear what caused these synchronization issues.

  1. IPs registered on the chain but not indexed by the processor (the main issue mentioned by the OP) are explained here.

  2. IPs indexed by the processor but not existing on the chain are explained here.

@sameh-farouk
Member

sameh-farouk commented May 20, 2024

Update:
After discussing this with @xmonader and @AhmedHanafy725, I suggested a solution here.

@sameh-farouk sameh-farouk moved this from Blocked to In Progress in 3.14.x May 21, 2024
@sameh-farouk sameh-farouk moved this from In Progress to Pending Review in 3.14.x May 22, 2024
@sameh-farouk
Member

sameh-farouk commented May 22, 2024

Update: A PR handling this case (#157 (comment)) is ready for review.

@sameh-farouk sameh-farouk moved this from Pending Review to Pending Deployment in 3.14.x May 27, 2024
@sameh-farouk sameh-farouk moved this from Pending Deployment to In Verification in 3.14.x May 29, 2024
@xmonader xmonader closed this as completed Aug 1, 2024
@github-project-automation github-project-automation bot moved this from In Verification to Done in 3.14.x Aug 1, 2024