EurekaClientConfigurationRefresher & DiscoveryClient#refreshInstanceInfo concurrent execution #4094

Open
yuhuangbin opened this issue May 17, 2022 · 5 comments


yuhuangbin commented May 17, 2022

Describe the bug
When a RefreshScopeRefreshedEvent is published, the service instance deregisters and then registers again.
See: EurekaDiscoveryClientConfiguration
Both deregister and register call ApplicationInfoManager#setInstanceStatus to modify the InstanceStatus.

protected static class EurekaClientConfigurationRefresher
        implements ApplicationListener<RefreshScopeRefreshedEvent> {

    // ...

    public void onApplicationEvent(RefreshScopeRefreshedEvent event) {
        if (eurekaClient != null) {
            eurekaClient.getApplications();
        }
        if (autoRegistration != null) {
            // deregister instance
            this.autoRegistration.stop();
            // register instance
            this.autoRegistration.start();
        }
    }
}

However, DiscoveryClient also modifies the InstanceStatus from a scheduled task.

void refreshInstanceInfo() {
    applicationInfoManager.refreshDataCenterInfoIfRequired();
    applicationInfoManager.refreshLeaseInfoIfRequired();

    InstanceStatus status;
    try {
        // get instance status
        status = getHealthCheckHandler().getStatus(instanceInfo.getStatus());
    } catch (Exception e) {
        logger.warn("Exception from healthcheckHandler.getStatus, setting status to DOWN", e);
        status = InstanceStatus.DOWN;
    }

    if (null != status) {
        // modify instance status
        applicationInfoManager.setInstanceStatus(status);
    }
}

So when EurekaClientConfigurationRefresher and DiscoveryClient#refreshInstanceInfo execute concurrently, the following interleaving can leave the instance in the wrong state:

Step 1: EurekaClientConfigurationRefresher deregisters the instance, which sets InstanceStatus to DOWN.

// deregister instance
this.autoRegistration.stop();

Step 2: DiscoveryClient#refreshInstanceInfo() reads the InstanceStatus from instanceInfo and gets DOWN.

// get instance status
status = getHealthCheckHandler().getStatus(instanceInfo.getStatus());

Step 3: EurekaClientConfigurationRefresher registers the instance again, which sets InstanceStatus to UP.

// register instance
this.autoRegistration.start();

Step 4: DiscoveryClient#refreshInstanceInfo() writes the stale value from Step 2, setting InstanceStatus back to DOWN.

// modify instance status
applicationInfoManager.setInstanceStatus(status);

If the steps execute in this order, the service's InstanceStatus ends up as DOWN even though the instance has just been registered again as UP.
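The four steps can be replayed deterministically in a small stand-alone sketch. This is not Eureka code: StaleStatusRace, Status, and replayInterleaving are hypothetical names, the AtomicReference stands in for the status held by instanceInfo, and the sketch assumes the health check handler simply returns the status it is given.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model of the interleaving described above; not the real Eureka classes.
class StaleStatusRace {
    enum Status { UP, DOWN }

    // Stand-in for the status stored on the shared instanceInfo.
    static final AtomicReference<Status> instanceStatus =
            new AtomicReference<>(Status.UP);

    // Replays Steps 1-4 in order on a single thread and returns the final status.
    static String replayInterleaving() {
        instanceStatus.set(Status.UP);

        // Step 1: refresher path, autoRegistration.stop() marks the instance DOWN.
        instanceStatus.set(Status.DOWN);

        // Step 2: scheduled path reads the current status (assumed to be
        // passed through unchanged by the health check handler).
        Status observed = instanceStatus.get();

        // Step 3: refresher path, autoRegistration.start() marks the instance UP.
        instanceStatus.set(Status.UP);

        // Step 4: scheduled path writes back the stale value from Step 2.
        instanceStatus.set(observed);

        return instanceStatus.get().name();
    }

    public static void main(String[] args) {
        System.out.println(replayInterleaving()); // prints DOWN
    }
}
```

Each individual write is atomic here, yet the result is still wrong, because the read in Step 2 and the write in Step 4 are not performed as one unit.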

@yuhuangbin
Author

I suggest holding a lock around the read-then-update of the shared status here. With the lock added, I have not been able to reproduce the problem. @OlgaMaciaszek, could you give some advice?

    void refreshInstanceInfo() {
        
        applicationInfoManager.refreshDataCenterInfoIfRequired();
        applicationInfoManager.refreshLeaseInfoIfRequired();
        // lock needed here so the status read and the status write happen as one unit
        synchronized (applicationInfoManager) {
            InstanceStatus status;
            try {
                status = getHealthCheckHandler().getStatus(instanceInfo.getStatus());
            } catch (Exception e) {
                logger.warn("Exception from healthcheckHandler.getStatus, setting status to DOWN", e);
                status = InstanceStatus.DOWN;
            }

            if (null != status) {
                applicationInfoManager.setInstanceStatus(status);
            }
        }
    }
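Why the lock helps can be shown with a toy model. This is a sketch under assumptions, not the Eureka implementation: GuardedStatusSketch, Manager, reRegister, and refresh are hypothetical names, and the model assumes that every status write takes the same monitor that refreshInstanceInfo would synchronize on. If the writes in the refresher path did not take that monitor, locking only the read-then-write side would not be enough.

```java
// Toy model of the proposed locking; not the real Eureka classes.
class GuardedStatusSketch {
    enum Status { UP, DOWN }

    // Stand-in for the manager; each accessor locks on the manager itself (assumption).
    static class Manager {
        private Status status = Status.UP;
        synchronized void setStatus(Status next) { status = next; }
        synchronized Status getStatus() { return status; }
    }

    // Refresher path: deregister (DOWN), then register (UP).
    static void reRegister(Manager m) {
        m.setStatus(Status.DOWN);
        m.setStatus(Status.UP);
    }

    // Scheduled path: read and write back under one lock on the same monitor,
    // so no other write can land between the read and the write.
    static void refresh(Manager m) {
        synchronized (m) {
            Status observed = m.getStatus();
            m.setStatus(observed);
        }
    }

    // Runs both paths concurrently and returns the final status.
    static String runConcurrently(int rounds) {
        Manager m = new Manager();
        Thread refresher = new Thread(() -> {
            for (int i = 0; i < rounds; i++) reRegister(m);
        });
        Thread scheduled = new Thread(() -> {
            for (int i = 0; i < rounds; i++) refresh(m);
        });
        refresher.start();
        scheduled.start();
        try {
            refresher.join();
            scheduled.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return m.getStatus().name();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(10_000)); // prints UP
    }
}
```

Because the guarded block only ever writes back the value it has just read, it can no longer overwrite a newer status, so the last registration always wins.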

@OlgaMaciaszek
Collaborator

Hello, @yuhuangbin, thanks for reporting the issue. Will take a look next week.

@yuhuangbin
Author

Hello @OlgaMaciaszek, is there a conclusion on this issue?

@OlgaMaciaszek
Collaborator

Looks like a bug.

@OlgaMaciaszek
Collaborator

This needs to be addressed within Netflix/Eureka DiscoveryClient. Have submitted a PR with a fix: Netflix/eureka#1566.

@OlgaMaciaszek OlgaMaciaszek moved this from Todo to In Progress in 2023.0.5 Dec 23, 2024
@OlgaMaciaszek OlgaMaciaszek moved this from Todo to In Progress in 2024.0.1 Dec 23, 2024
@spencergibb spencergibb removed this from 2023.0.5 Jan 10, 2025
@spencergibb spencergibb moved this to In Progress in 2023.0.6 Jan 10, 2025