-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial commit for Azure Blob Storage adaptor #438
Initial commit for Azure Blob Storage adaptor #438
Conversation
Hello @josephjclark @stuartc. @taylordowns2000 suggested that I flag this to your attention. I suspect that this isn't yet in a merge-able state, but it feels close enough that I wanted to open a PR. |
@bensonmiller Thank you for this! Don't worry too much about the testing side - mocking is a really hard problem (we're working on a solution but no news yet). I'll take a look at this on Monday and see if we can get it merged. Does Azure have a sandbox that I can test this adaptor against? What's the effort for me, who thinks azure is a colour, to test this adaptor on my own machine? |
@josephjclark Thank you! There are a couple ways to handle Azure testing. I could grant you access to a small Azure subscription that I have. That's not terrible, but would still require you to create and manage a user account in Azure. Alternatively (and maybe the best approach) is to have you sign up for an Azure account. When you create an account for the first time, MSFT puts some cloud credits ($200 IIRC) into new accounts. Pretty standard stuff, but I can't imagine that would exceed those credits on blob storage testing in 1000 years. If you want to go down that second path, this is how I'd approach it:
That should be all that's necessary to get a storage account that works with this adaptor. I think that's a relatively easy process and still (mostly) allows you to assert that Azure is just a color. I'm not far from that perspective myself, feeling much more comfortable in AWS and GCP. . . EDIT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @bensonmiller thank you for such a well written and well-documented PR.
There is absolutely nothing wrong with your Javascript and frankly I think you should have a little more faith in yourself! It's all nice and clean and concise and my comments are more around high-level design or low-level patterns that OpenFn uses (and perhaps doesn't document so well).
I really like the idea of splintering out this "mini adaptor" to just cover part of Azure 👌 I wish we'd thought of that elsewhere.
I haven't tested anything yet. Regarding that, given that is adaptor is by you and for you, I am happy to take it on your word that it works. I can see that the API is fairly simple (and maybe will get simpler?) which again makes me feel like I don't need to test too extensively.
A temporary access key could be a good way to go but let's get through this review round first and then see where we are!
@@ -0,0 +1 @@ | |||
# @openfn/language-azureblobstorage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When this is approved, you'll want to run pnpm changeset
from repo root and follow to steps to generate release notes. But this is the last thing you'll do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll take care of this before releasing :)
"author": "Open Function Group", | ||
"license": "LGPLv3", | ||
"files": [ | ||
"dist/", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a note to myself - I think we need to include assets
here? We should update the template
@josephjclark -- Thank you for this superb review. I've skimmed it and I love the feedback. I have a fairly busy day today, so it's unlikely that I'll have any updates to commit before tomorrow, but I'll work through your comments and incorporate at soonest. |
@josephjclark I've updated this PR to incorporate almost all of your comments. Notable changes include:
|
packages/azure-storage/README.md
Outdated
Create a file `job.js`, as shown below, to run with the OpenFn CLI. | ||
|
||
```js | ||
const data = state.data; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example code here should show some typical adaptor usage and tell a bit of a story. I'd like to re-work it heavily, make it shorter, and align with our best practices.
We're not very good at explaining this stuff so let me take a minute to explain a few things!
- You don't need to call the execute function - the OpenFn runtime will do that for you!
- Job code is designed to be a small number of top level "operation" functions with as little procedural code as possible. It's designed to be very lightweight and elegant (although to be fair it rarely is in practice!)
- As part of that, every line of job code should start with an operation from an adaptor (not a
const
orlet
). I'm pretty sure this code won't actually run on our existing app at openfn.org. - A cheat code for this is to declare an
fn(() => { ... })
block at the start of the job and do all your initialisation code there. Any variables that are re-used in the job should be written to state. - By the way, we are guilty all the time of using
fn
blocks to nest lots of complex procedural code. It's kind of inevitable when dealing with messy data from the real world!
The big thing with your example is that it front-loads a lot of variable declarations and then feeds them through to the operations. What we'd usually do is pass a function in place of a value, and use the function to dynamically calculate stuff (we call this an "open function" by the way).
Ok, that's a lot of words. Here's what I would do with your example:
uploadBlob(
state => {
const date = new Date();
const id = '0e82962a-6ed0-4a88-92c1-51ae785b4126';
return `${date.getFullYear}/${date.month}/com.example/${id}.json`;
},
state.data,
{
blobHTTPHeaders: { blobContentType: 'application/json' }
}
);
Note that the container name is coming from config here - so the state object described above probably needs updating:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@josephjclark -- Thank you for this comment. I've reworked my example per your guidance and I'm getting close. I'm experiencing a problem when I use an open function to create and return the blobName
argument to the uploadBlob()
function. For my experiment, I'm using the following two job definitions, below. The first one uses a plain string for the blobName
argument, which works fine. The second one uses the open function approach, which results in an error when trying to instantiate the blockBlobClient
(which is the first place in the uploadBlob
function where the blobName
argument is used).
This works:
uploadBlob(
"some/path/myblobname.json",
state.data,
{
blobHTTPHeaders: { blobContentType: 'application/json' }
},
{ createContainer: true, containerName: 'thisone'}
);
This doesn't work
uploadBlob(
() => { return "some/path/myblobname.json"; },
state.data,
{
blobHTTPHeaders: { blobContentType: 'application/json' }
},
{ createContainer: true, containerName: 'thisone'}
);
I run the job via the CLI. When it fails, it returns the following output. The "blobName.split is not a function
" error is thrown when I try to instantiate the blockBlobClient
. I'm wondering if the open function hasn't yet returned, so the blockBlobClient
is trying to instantiate itself with a /function/ instead of a /string/, which causes it to error.
➜ nhgh_workflows openfn uploadAzureSimpler.js -ma azure-storage -s tmp/state_for_azure_upload_simple.json -O -v
[CLI] ♦ Versions:
▸ node.js 18.17.1
▸ cli 0.4.10
▸ runtime 0.2.1
▸ compiler 0.0.38
▸ @openfn/language-azure-storage latest
[CLI] ✔ Loading adaptors from monorepo at /Users/benson/sandbox/newhorizons/openfn_adaptors
[CLI] ✔ Loaded state from tmp/state_for_azure_upload_simple.json
[CLI] ✔ Compiled from uploadAzureSimple.js
[R/T] ♦ Starting job job-1
[R/T] ✘ Failed job job-1 after 154ms
[R/T] ✘ TypeError: TypeError: blobName.split is not a function
[R/T] ✘ Check state.errors.job-1 for details.
[CLI] ✔ Result:
[CLI] ♦ {
"data": {
"foo": "bar",
"bif": "baz",
"sunday": "monday",
"answer": "42"
},
"errors": {
"job-1": {
"type": "TypeError",
"jobId": "job-1",
"message": "TypeError: blobName.split is not a function",
"error": {
"source": "runtime",
"type": "RuntimeError",
"severity": "fail",
"subtype": "TypeError",
"name": "RuntimeError",
"message": "TypeError: blobName.split is not a function"
}
}
}
}
[CLI] ⚠ Errors reported in 1 jobs
[CLI] ✔ Finished in **231ms**
I'd appreciate your thoughts on this. I'd be happy to jump on a GMT call if it would help. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @bensonmiller - that's the expandReferences
thing biting us.
Check out this comment: #438 (comment)
You need to expand the blobName
reference and use the "resolved" value in the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes! Thanks for the re-iteration. This should be all fixed up now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @bensonmiller, this is looking great!
I think I accidentally fired off my review notes early - apologies for that, how embarassing!!
I've tidied them all up now ahem.
There's just a few more things we ought to deal with, then we can get this out in to the world!
- Updating configuration schema - Better handling of function options - Adding support for - Bumped to 1.0
@josephjclark I think (hope) that this is ready for a final look before prepping for release ( |
Thank you @bensonmiller! I am very happy with this - we'll handle the changeset stuff and get this released for you ASAP. |
9985fb4
into
OpenFn:release/azure-blobstorage
Summary
Adds an OpenFn adaptor for Azure Blob Storage
Details
This adaptor adds the following capabilities for working with Azure Blob Storage:
Issues
This is my first adaptor PR and I'm not an especially competent JS developer. I'm certain there are issues. But I've subjected this adaptor to a fair amount of testing via the CLI and it is working respectably. Some thoughts:
StorageSharedKeyCredential
at the moment.downloadAsJSON()
anddownloadAsString()
. The difference between these is minimal, but I saw a couple other adaptors doing similar things so I did it here. I'm not sure that I love it (or, even, that I've implemented it especially well).Review Checklist
Before merging, the reviewer should check the following items:
migration guide followed?
dev only changes don't need a changeset.