Organisation of data #8

Open
ioquatix opened this issue Apr 10, 2017 · 10 comments

@ioquatix

ioquatix commented Apr 10, 2017

Some MIME types come from IANA, right?

Is there some way to have separate data sources based on different orgs, e.g. IANA, Apache, Unofficial, etc.?

Does that even make sense?

@halostatue
Member

The primary sources are Apache, IANA, and user contributions. Given the primary use of the data, I do not think it makes sense to separate these in the data organization here: there is generally enough information to post-process the data into separate groups as needed. However, IANA generally doesn’t include extensions, so if they are needed, they will (almost) always be user-contributed.

@ioquatix
Author

Thanks for the quick reply.

I was surprised that IANA don't list extensions. I sort of understand why, but it also seems like it would be really useful.

@halostatue
Member

You might be able to get it from individual RFCs that most IANA things are based on, but…obtaining that information would mean parsing the text of the RFCs. :/

@ioquatix
Author

Yeah, that's crazy talk :p

@halostatue
Member

Some of this could potentially be obtained by parsing the individual records. The application/javascript record has the following block embedded:

Additional information:

   Magic number(s):             n/a
   File extension(s):           .js
   Macintosh File Type Code(s): TEXT

It’s not structured, but it may be possible for some types. However, image/png doesn’t have similar information…which makes it ugly. Therefore, I generally try to parse from the information that is guaranteed to be present (plus, downloading each of those individual items would start being unkind to IANA servers).
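To illustrate the kind of best-effort parsing described above, here is a minimal sketch in Python. It assumes the record text has already been fetched and that the field label matches the `File extension(s):` line quoted earlier. The function name `parse_extensions`, the list of "empty" values, and the sample strings are hypothetical and not part of this project.

```python
import re

# Matches a line such as "   File extension(s):           .js".
# The label comes from the block quoted above; the rest of the record
# layout is an assumption, since the records are free-form text.
EXTENSION_FIELD = re.compile(
    r"^[ \t]*File extension\(s\):[ \t]*(.*)$", re.MULTILINE
)


def parse_extensions(record_text):
    """Return the extensions listed in a record, or [] if none are found."""
    match = EXTENSION_FIELD.search(record_text)
    if match is None:
        return []  # the record has no such field at all
    value = match.group(1).strip()
    if value.lower() in ("", "n/a", "none"):
        return []  # field present but explicitly empty (assumed spellings)
    # Split on commas and whitespace, then drop the leading dot.
    parts = re.split(r"[,\s]+", value)
    return [p.lstrip(".").lower() for p in parts if p.strip(".")]


SAMPLE = """\
Additional information:

   Magic number(s):             n/a
   File extension(s):           .js
   Macintosh File Type Code(s): TEXT
"""

print(parse_extensions(SAMPLE))                 # ['js']
print(parse_extensions("Type name: image\n"))   # []
```

Returning an empty list rather than raising keeps records without the field from breaking a batch run; it does nothing about the cost of downloading every record.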

@ioquatix
Author

Have you considered proposing to IANA that they maintain a list of canonical file extensions?

@halostatue
Member

I haven’t; it’s probably not a bad idea, but I’m not sure how I’d start that particular discussion.

@ioquatix
Author

I looked: they do capture file extensions on their submission form, so perhaps they are thinking about it, or perhaps they do publish them, just not on that page. I've emailed them.

@ioquatix
Author

Hi Samuel,

We don't have a date yet, but a column that presents the file extensions listed in the registration template will be added to the media type registry.

Best regards,

Amanda Baber
Lead IANA Services Specialist
PTI

@halostatue
Member

They still haven’t added an extensions column.
