[FEATURE] Metadata schema #176

sdatkinson · 2023-04-08T18:51:56Z

This Issue is for defining and implementing a schema for metadata.

"display_name" (str) What the model name might be displayed as (instead of the file name!)
"loudness" (float, <= 0.0) The output (in dB) of a model, given a nominal signal. (Implemented in Define metadata, store loudness #137)
"gear_type" (enum): "amp", "pedal", "amp_pedal".
"with_cab" (bool)
"compression" (float, between 0.0 and 1.0) a quantification of how much compression the model imparts on a signal. Compression of 0.0 means that the response is linear in the input; 1.0 means that an infinitesimal input signal would cause the model to output a signal at its max loudness (example: y = a * sign(x))
"character" (enum): "clean", "crunch", "hi_gain"
"tags": a list of strs
"modeled_by": (str) the name of the person who made the model
"date": {"year": int, "month": int, "day": int} when the model was exported to a .nam file (e.g. determined during .export()).
"instrument": (enum) "bass", "guitar", "other" what instrument the model is intended to be used with. (wrt "other", there have been instances of e.g. modeling amps intended for harmonicas!)
"misc": (str) Anything else?
"standard_architecture" (enum) "standard", "lite", "feather" (if the architecture matches one of the architectures I've made easy to pick), or None.
"esr_val1": (float or None) The final ESR, if the model was validated on the validation set from the v1_1_1.wav file (If you used different training data, then the ESR wouldn't be "apples to apples". There's also the possibility of validation set leakage...). So long as the validation set stays the same, this can continue to be reported. This would be retired for e.g. "esr_val2" if a future standard training signal changed the validation set.
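
To make the constraints in the list above concrete, here is a minimal sketch of a validator. It is not the shipped implementation: the function name `validate_metadata`, the choice to treat every field as optional, and the error messages are all assumptions made for illustration. Only the field names, enum values, and numeric ranges come from the proposal.

```python
from typing import Any, Dict, List

# Enum values copied from the proposed schema above.
ENUMS = {
    "gear_type": {"amp", "pedal", "amp_pedal"},
    "character": {"clean", "crunch", "hi_gain"},
    "instrument": {"bass", "guitar", "other"},
    "standard_architecture": {"standard", "lite", "feather"},
}


def validate_metadata(meta: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the dict passed.

    Hypothetical helper: a missing key or a None value is accepted for
    every field, which is an assumption of this sketch.
    """
    problems: List[str] = []

    # Enumerated string fields.
    for key, allowed in ENUMS.items():
        value = meta.get(key)
        if value is not None and (not isinstance(value, str) or value not in allowed):
            problems.append(f"{key}: {value!r} not in {sorted(allowed)}")

    # "loudness" is in dB and must be <= 0.0.
    loudness = meta.get("loudness")
    if loudness is not None and not (
        isinstance(loudness, (int, float)) and loudness <= 0.0
    ):
        problems.append("loudness: expected a number <= 0.0")

    # "compression" runs from 0.0 (linear) to 1.0 (fully limited).
    compression = meta.get("compression")
    if compression is not None and not (
        isinstance(compression, (int, float)) and 0.0 <= compression <= 1.0
    ):
        problems.append("compression: expected a number between 0.0 and 1.0")

    # "tags" is a list of strings.
    tags = meta.get("tags")
    if tags is not None and not (
        isinstance(tags, list) and all(isinstance(tag, str) for tag in tags)
    ):
        problems.append("tags: expected a list of str")

    # "date" is {"year": int, "month": int, "day": int}.
    date = meta.get("date")
    if date is not None and not (
        isinstance(date, dict)
        and all(isinstance(date.get(part), int) for part in ("year", "month", "day"))
    ):
        problems.append("date: expected int 'year', 'month' and 'day'")

    return problems
```

Returning a list of problems rather than raising on the first one is a design choice here, so that a display or database can report everything wrong with a file at once.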

I'd like for there to be metadata about the gear being modeled, but I'm not sure what the most reasonable way to do this is.

Option "1":

"gear_modeled": (str) Just write it out
Option "2": "make" and "model" (and "year"?)
Option "n":
"gear_modeled": a full-on graph where each node is an element of the signal chain. (Yikes...can you imagine?)

Seeking suggestions and feedback, esp. about how to represent the gear being modeled.

Edits:

2023-04-08 4:15p Add "standard_architecture", "esr_val1"


heebje · 2023-04-08T19:22:08Z

Gear type could have more entries (for tape machines, etc). Maybe something like "studio_hardware"?
Licence! The licence that the creator of the profile wants to attach to it, if any. Like, is it public domain, or CC, or commercial, or ...?

mikeoliphant · 2023-04-08T19:22:25Z

"final_esr" - The final esr of the model at the end of training.
"sample_rate" - The sample rate used for training.

gtirard · 2023-04-08T19:27:08Z

Could we include an image of the original amp in the model's metadata?

JZ1978 · 2023-04-08T19:37:02Z

Not sure if this would make it easy for people to "steal" other people's work and put new/fake metadata on other people's models, but... it would be nice if there was a way to write metadata onto the already created .nam files, so the last few months of contributions can have the same search features as any new stuff that comes out once this is implemented.

colin-campbell · 2023-04-08T20:33:44Z

Essential: a metadata format "version".
Awesome: A license string, a bag of parameter names, types(?) and values (that UIs can interpret how they wish), an automatic cryptographic MAC hash for the capture data and/or metadata (authentication), and that hash signed with a cryptographic sig of the author (attribution).

Aelfstone · 2023-04-08T21:57:58Z

"gear_type" could also be "preamp" and "other".

sdatkinson · 2023-04-08T23:26:20Z

@heebje

* Licence! The licence that the creator of the profile wants to attach to it, if any. Like, is it public domain, or CC, or commercial, or ...?

Is there precedent for holding this as metadata in a file?

sdatkinson · 2023-04-08T23:27:22Z

@mikeoliphant

"sample_rate" - The sample rate used for training.

I intend that to be a proper part of the model (not just metadata) when non-48k is supported in the trainer (see #13)

sdatkinson · 2023-04-08T23:28:45Z

@gtirard

Could we include an image of the original amp in the model's metadata?

I'm not keen on that for now, since I'm basically using the JSON file format for the moment 🙂

sdatkinson · 2023-04-08T23:30:37Z

@JZ1978

Not sure if this would make it easy for people to "steal" other people's work and put new/fake metadata on other people's models, but... it would be nice if there was a way to write metadata onto the already created .nam files, so the last few months of contributions can have the same search features as any new stuff that comes out once this is implemented.

I wouldn't be against it. I don't think this makes it meaningfully harder to "steal" anyone's models; it's mainly to help facilitate building displays, databases, and such.
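
For illustration, writing metadata onto an existing file could look roughly like the sketch below. It assumes the .nam file is still plain JSON and that the metadata lives under a top-level "metadata" key; both the key name and the helper `add_metadata` are hypothetical, not something this repo is confirmed to provide.

```python
import json
from pathlib import Path
from typing import Any, Dict


def add_metadata(nam_path: Path, new_fields: Dict[str, Any]) -> Dict[str, Any]:
    """Merge new_fields into the file's metadata dict and rewrite the file.

    Hypothetical helper. The top-level "metadata" key is an assumption.
    """
    model = json.loads(nam_path.read_text())
    # Keep whatever metadata is already there; new values win on conflict.
    metadata = model.get("metadata") or {}
    metadata.update(new_fields)
    model["metadata"] = metadata
    # Everything else in the file (weights, config, ...) is written back untouched.
    nam_path.write_text(json.dumps(model))
    return model
```

Nothing in this sketch stops someone from overwriting a field such as "modeled_by", which is the concern raised in the quoted comment.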

scottcorgan · 2023-04-08T23:31:55Z

One of the only things that matters from the ToneHunt perspective of parsing the file is that it remains valid JSON. Other than that, we'd like a strategy to make sure there is versioning in place that will help us parse updates (various versions) to the .nam file spec as it progresses.

We'll track our work in our own issue when it's ready tonehunt-org/tonehunt#161

sdatkinson · 2023-04-08T23:33:46Z

@colin-campbell

Essential: a metadata format "version". Awesome: A license string, a bag of parameter names, types(?) and values (that UIs can interpret how they wish), an automatic cryptographic MAC hash for the capture data and/or metadata (authentication), and that hash signed with a cryptographic sig of the author (attribution).

I'm interested in the hash, but not convinced that it does its job. Why couldn't I just rewrite the hash with my own? (Infosec isn't my area of expertise.)

sdatkinson · 2023-04-08T23:41:23Z

@scottcorgan

One of the only things that matters from the ToneHunt perspective of parsing the file is that it remains valid JSON.

I won't commit to that (hence why I renamed the extension to .nam instead of .json), but if it does become incompatible w/ the JSON spec, I'll give you tools to load the model 👍🏻 (Using Python, in this repo; C++, in NeuralAmpModelerCore). And I'll publish a spec for the file format.

Other than that, a strategy to make sure there is versioning in place that will help us parse updates (various versions) to the .nam file spec as it progresses.

The plan is to handle this with the main model version, which I've fixed to the repo version (current main branch is 0.5.1), and the intent is to follow semantic versioning. Since this is pre-v1 and it seems (to me) that there's a little bit of ambiguity on it, I'm bumping the minor version when breaking changes are introduced; patch otherwise. So adding new fields would bump the patch version; renaming or removing one would bump the minor version. Open to feedback on this, but I think it'll be fine.
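
As a rough sketch of that pre-v1 policy (the helper names `parse_version`, `bump` and `is_compatible` are made up for this example and are not part of the repo):

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split "major.minor.patch" into three ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump(version: str, breaking: bool) -> str:
    """Apply the pre-v1 rule: breaking changes bump minor, anything else bumps patch."""
    major, minor, patch = parse_version(version)
    if breaking:
        # e.g. renaming or removing a field
        return f"{major}.{minor + 1}.0"
    # e.g. adding a new field
    return f"{major}.{minor}.{patch + 1}"


def is_compatible(reader_version: str, file_version: str) -> bool:
    """Pre-v1 only: files sharing major.minor differ by non-breaking changes."""
    return parse_version(reader_version)[:2] == parse_version(file_version)[:2]
```

Under this reading, a parser written against 0.5.1 would expect to handle any 0.5.x file, provided it tolerates fields it doesn't know about, and would need updating for 0.6.0.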

We'll track our work in our own issue when it's ready scottcorgan/tonehunt#161

🎉

scottcorgan · 2023-04-09T00:01:45Z

Thanks @sdatkinson. We're running a JS only app. We might be able to do some C++ parsing, but I'm sure we can whip something up quickly to parse whichever format you end up using. Do what's best for NAM. We'll follow.

rossbalch · 2023-04-09T00:36:17Z

I like the current set you have going on there. I think it's important not to go too overboard. I think option two for the gear modelled is good. Make people be concise: "Manufacturer", "Model", "Channel".

Maybe for character you have two fields:
"tonal_character" Clean, Crunch, Hi-Gain, Fuzzy etc
"tonal_balance" Bright, Dark, Full, Mid-forward

I can't think of any other fields I would like to see. Maybe "Direct" vs "Speaker"

In my opinion, any other details can be described by a user as they share the file. I'm not really a details person though; I don't really care about the settings on the amp, the mic used, the specific preamp or load box, etc.

fichl · 2023-04-09T11:56:40Z

Hi everybody,
just an idea... how about putting gear-specific frequencies for bass, mid and treble in the metadata?
Maybe a model of a bass amp would rather EQ at 100 Hz instead of 150 Hz.
If left blank, the default settings are used.

colin-campbell · 2023-04-09T12:10:42Z

I'm interested in the hash, but not convinced that it does its job. Why couldn't I just rewrite the hash with my own? (Infosec isn't my area of expertise.)

@sdatkinson In short:

The MAC hash authenticates (validates) the contents of the "message", i.e. the integrity of the nam file - any app will know if a nam file has been tampered with or corrupted in some way. It doesn't provide secure attribution of authorship - although if, for example, you hash the metadata and that metadata includes a string such as "© 2023-04-09 Steven Atkinson", you will know if your work is being ripped off when someone changes it and passes it off as their own, because the hash won't match your hash.
The secure attribution (and non-repudiation) comes from digitally signing your hash with a private key and including the public key in the file. This ensures that people can't impersonate you. There are existing technologies like JWS https://en.wikipedia.org/wiki/JSON_Web_Signature that take care of the technical details. However, what these don't do on their own is establish trust.
Trust is established by verification of the identity of the publisher, and that is the tricky part - one could use an SSH key attached to a GitHub account, for example, but that is not always great for mere mortals. However, the good news is that there are open platforms for this like https://www.sigstore.dev (like letsencrypt for code) that help with this.
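
A minimal sketch of the first point (integrity only), using Python's standard library. Note the swap: this uses an HMAC with a shared secret key, so it is not the public-key signature described in the second point; that would need a JWS or similar library and is not shown. The function names and the key are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def canonical_bytes(payload: Dict[str, Any]) -> bytes:
    """Serialize deterministically so that equal dicts give equal bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def compute_mac(payload: Dict[str, Any], key: bytes) -> str:
    """HMAC-SHA256 over the canonical form of the payload, as a hex string."""
    return hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()


def verify_mac(payload: Dict[str, Any], key: bytes, expected: str) -> bool:
    """Constant-time comparison of the recomputed MAC against the stored one."""
    return hmac.compare_digest(compute_mac(payload, key), expected)
```

If someone edits the metadata without knowing the key, the stored MAC no longer verifies. Anyone who does hold the key can recompute it, which is why attribution needs the signing step rather than the MAC alone.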

I'm happy to discuss this further with you. My take on this - as with the licensing question in the other issue - is ensuring that Free Software creators don't get ripped off of their works, and not allowing proprietary commercial interests an unfair advantage by ripping these people off.

MirrorProfiles · 2023-04-13T15:01:19Z

I think it would be REALLY amazing to have the dBu reference level of the model as metadata.

The plugin could then have a preference where you set the headroom of your DI input, and the plugin can internally adjust so that the gain is matched exactly as intended for the model.

This is important because we are all using different interfaces and DI boxes, and also different reamp boxes. If someone records DIs with lots of headroom and sets their reamp box to unity gain for that, then someone else using a different interface will get the same level of accuracy without any guessing or additional steps.

It would be very simple: the model has its own reference level as metadata, and the user sets their input reference level in the plugin (which is hopefully saved, so once it's set you leave it). Whenever you change models, the gain will automatically correct if there is a difference between the user's input level and the level the model was captured at.
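
The correction itself could be sketched as below, assuming both levels are expressed as "the dBu level that corresponds to 0 dBFS". That convention and the function names are assumptions made for this example, not something defined in this thread.

```python
def calibration_gain_db(user_input_dbu: float, model_input_dbu: float) -> float:
    """Gain (dB) to apply to the user's signal before it reaches the model.

    Both arguments are assumed to mean "dBu at 0 dBFS": one for the user's
    interface, one for the rig the model was captured with.
    """
    # A signal at V dBu reads (V - user_input_dbu) dBFS on the user's interface,
    # but the model expects it at (V - model_input_dbu) dBFS.
    return user_input_dbu - model_input_dbu


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)
```

For example, with an interface that clips at 12 dBu and a model captured at 18 dBu, `calibration_gain_db(12.0, 18.0)` gives -6.0 dB, so the plugin would attenuate the input by roughly half before the model sees it.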

sdatkinson · 2023-04-28T15:45:05Z

Thanks, all!

Closing this as the first version of the metadata schema is shipped in v0.5.1 (#202, #203).

Let's give it a spin and re-visit if anything is obviously lacking. If so, then new issues may reference topics discussed here.

sdatkinson added the enhancement New feature or request label Apr 8, 2023

scottcorgan mentioned this issue Apr 8, 2023

Read future metadata from NAM files tonehunt-org/tonehunt#161

Open

sdatkinson self-assigned this Apr 13, 2023

This was referenced Apr 23, 2023

Metadata in GUI trainer #202

Merged

Metadata in easy_colab.ipynb #203

Merged

sdatkinson closed this as completed Apr 28, 2023

vvasilikos mentioned this issue May 13, 2023

[FEATURE] Metadata: Settings used in profile #242

Open

This issue was closed.
