-
Notifications
You must be signed in to change notification settings - Fork 667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Azure managed identity #4897
Conversation
Signed-off-by: Ben Sherman <[email protected]>
✅ Deploy Preview for nextflow-docs-staging ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sound good but there are zero unit tests
@bentsherman in the example I do instantiate a DefaultAzureCredentials as you do in your change. This should be enough as the instance(s) are configured to use that clientId |
I'm with Ben - where do you authenticate against the Batch service? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused as to how this is authenticating against the Batch service. Also, we should leave some scope open for a system assigned (anonymous) managed identity.
final credential = new DefaultAzureCredentialBuilder() | ||
.managedIdentityClientId(clientId) | ||
.build() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be worth checking if this can pick up a system assigned identity as well as a named one. You should be able to do this by dropping the managedIdentityClientId(clientId)
part (educated guess here):
final credential = new DefaultAzureCredentialBuilder()
if ( clientId ) {
credential.managedIdentityClientId(clientId)
}
finalCredential = credential.build()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will also enable people to authenticate as themselves on their personal machines if they're logged in to Azure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll need another config option to explicitly enable the system assigned identity.
Here are the docs for the default credential builder: https://learn.microsoft.com/en-us/java/api/com.azure.identity.defaultazurecredentialbuilder?view=azure-java-stable
It's not obvious to me that it defaults to the system-assigned identity, since it seems to support other credentials. But maybe you can discern better than me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you just call defaultCredentialBuilder
it goes in this order: https://learn.microsoft.com/en-us/azure/developer/java/sdk/identity-azure-hosted-auth#default-azure-credential
- Environment - DefaultAzureCredential reads account information specified via environment variables and use it to authenticate.
- Managed Identity - If the application deploys to an Azure host with Managed Identity enabled, DefaultAzureCredential authenticates with that account.
- IntelliJ - If you've authenticated via Azure Toolkit for IntelliJ, DefaultAzureCredential authenticates with that account.
- Visual Studio Code - If you've authenticated via the Visual Studio Code Azure Account plugin, DefaultAzureCredential authenticates with that account.
- Azure CLI - If you've authenticated an account via the Azure CLI az login command, DefaultAzureCredential authenticates with that account.
More specifically they give this example where they state that the clientId
is only required if using a user assigned identity:
/**
* Authenticate with a managed identity.
*/
public void createManagedIdentityCredential() {
ManagedIdentityCredential managedIdentityCredential = new ManagedIdentityCredentialBuilder()
.clientId("<USER ASSIGNED MANAGED IDENTITY CLIENT ID>") // only required for user assigned
.build();
// Azure SDK client builders accept the credential as a parameter
SecretClient client = new SecretClientBuilder()
.vaultUrl("https://{YOUR_VAULT_NAME}.vault.azure.net")
.credential(managedIdentityCredential)
.buildClient();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that with the system-assigned identity you would only need to know the batch/storage account names and then you could submit jobs from anywhere without any client-side credentials?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is the "system" the node from which you are submitting tasks i.e. the Nextflow head job which would presumably be running in Azure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both. You would need to give it the batch and storage account names, but it would authenticate because of what it is, rather than what it knows. So if you ran it on an Azure VM with a System-assigned identity, you would need to:
- Grant that system assigned identity permissions for the Batch and Storage account
- Configure Nextflow to try and submit to that Batch and Storage account
- Run Nextflow on that virtual machine, which will assume that identity
plugins/nf-azure/src/main/nextflow/cloud/azure/config/AzManagedIdentityOpts.groovy
Outdated
Show resolved
Hide resolved
You will also need to update the azcopy version for this to work, see here for some additional details: #3314 (comment) |
IF the pool has (manually at the moment) assigned to a managed identity (user assigned) then any operation performed (towards batch or storage) by instances of the pool will use the clientId for authentication |
plugins/nf-azure/src/main/nextflow/cloud/azure/batch/AzBatchService.groovy
Show resolved
Hide resolved
But how does it know which Batch account to authenticate against? It will assume the identity, then it needs to know which batch account to be using when it submits jobs and tasks, that doesn't appear to be here in the code? |
Good point, currently with the managed identity it will leave the batch credentials as null and the batch account is ignored. Here are the docs for batch credentials: https://learn.microsoft.com/en-us/java/api/com.microsoft.azure.batch.auth.batchcredentials?view=azure-java-stable The following implementations are available:
|
According to a post by @vsmalladi on StackOverflow, you can use a Managed Identity to generate a token, then use that with |
plugins/nf-azure/src/main/nextflow/cloud/azure/nio/AzFileSystemProvider.groovy
Outdated
Show resolved
Hide resolved
…mProvider.groovy Signed-off-by: Adam Talbot <[email protected]>
@adamrtalbot this will be great to try with TES plugin |
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
plugins/nf-azure/src/main/nextflow/cloud/azure/batch/AzBatchService.groovy
Outdated
Show resolved
Hide resolved
Signed-off-by: Paolo Di Tommaso <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Signed-off-by: Matteo Fiandesio <[email protected]>
Signed-off-by: Matteo Fiandesio <[email protected]>
Signed-off-by: Matteo Fiandesio <[email protected]>
task: update azure batch to 1.0.0-beta version
Signed-off-by: Ben Sherman <[email protected]>
plugins/nf-azure/src/main/nextflow/cloud/azure/batch/AzBatchService.groovy
Outdated
Show resolved
Hide resolved
plugins/nf-azure/src/main/nextflow/cloud/azure/nio/AzFileSystemProvider.groovy
Outdated
Show resolved
Hide resolved
…mProvider.groovy Co-authored-by: Paolo Di Tommaso <[email protected]> Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Something was messed up and it didn't include all the changes from #5042. Now it produces this error:
|
Is the new The current configuration even uses a bogus azure {
managedIdentity {
clientId = 'foobar'
}
} I would expect this to fail, but instead it appears as if user@host:~ $ NXF_VER=24.05.0-edge nextflow run ./tmp/main.nf -profile azure_mi
N E X T F L O W ~ version 24.05.0-edge
Launching `./tmp/main.nf` [insane_cori] DSL2 - revision: ac120465ff
ERROR ~ Missing Azure storage credentials: please specify a managed identity, service principal, or storage account key
-- Check '.nextflow.log' file for details |
Not necessarily Azure Batch, it could just be Azure Storage. You will need to name the storage account though:
|
Fixed it. My problem was that the Regarding the |
It's not a requirement for the Managed Identity per se, but it's needed so that you know what resources you are trying to access, so practically you will need to supply a storage account name. |
I have been testing this manually in the context of https://github.com/seqeralabs/fusion/issues/526 and I have found it to fail occasionally (some runs succeed, some fail with the error below): azureuser@vm:~ $ NXF_VER=24.05.0-edge nextflow run hello
N E X T F L O W ~ version 24.05.0-edge
Launching `https://github.com/nextflow-io/hello` [naughty_wilson] DSL2 - revision: 7588c46ffe [master]
executor > azurebatch (4)
[83/640680] process > sayHello (3) [100%] 4 of 4, failed: 4 ✘
Execution cancelled -- Finishing pending tasks before exit
ERROR ~ Error executing process > 'sayHello (1)'
Caused by:
One of the specified Azure Blob(s) is not found
Command executed:
echo 'Bonjour world!'
Command exit status:
-
Command output:
(empty)
Work dir:
az://fusion-develop/scratch/9f/c7b067d8cd7aa5f91971f47ed8b914
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
-- Check '.nextflow.log' file for details The execution environment is an Azure VM, with process.executor = 'azurebatch'
workDir = "az://fusion-develop/scratch"
fusion {
enabled = false
}
azure {
managedIdentity {
system = true
}
storage {
accountName = "<redacted>"
}
batch {
accountName = "<redacted>"
location = "<redacted>"
autoPoolMode = true
deletePoolsOnCompletion = true
allowPoolCreation = true
pools {
auto {
autoScale = true
vmCount = 1
maxVmCount = 10
}
}
}
} I initially thought the problems came from our implementation of MI in Fusion, but after stripping everything away the error persists. Any thoughts on what could be happening? |
I wasn't able to recreate the issue after running it 10 times with Nextflow v24.06.0-edge, are you using anything else?
|
@adamrtalbot I launched several runs with user@host:~$ NXF_VER=24.06.0-edge nextflow run hello -c nextflow.config-mi
N E X T F L O W ~ version 24.06.0-edge
NOTE: Your local project version looks outdated - a different revision is available in the remote repository [7588c46ffe]
Launching `https://github.com/nextflow-io/helloworld` [pensive_yonath] DSL2 - revision: 1d71f857bb [master]
ERROR ~ Managed Identity authentication is not available.
-- Check '.nextflow.log' file for detail Also, it might not be relevant but your configuration snippet seems to be using a Service Principal ( |
You can't use an MI from your laptop (at least, not practically). So from your laptop you need to user a service principal. I will try from the VM and see if it's an issue with the machine. |
Close #4871
Thanks @swampie for the minimal example. Is there any config required to authenticate with Azure Batch? Currently with this PR, no batch credentials will be provided if only the managed identity is specified