Add datasource value to records. #28
Comments
(0) Brainstorming The Reference Format Name

The temporary column name of Using

I think the terms "content" and "package" combine into a good general term for what we are talking about. It is more descriptive than the term "work", which can be mistaken for other senses of the word. Also, a work can be unfinished, unpublished, not information-based, etc.

The term "content package" differs from the word "package" alone by indicating that we may be talking about things other than software: ebooks, structured data packages, Web-site snapshots (ex. ZIM), videos, etc. It differs from the word "content" alone by indicating something that isn't a free-flowing scrap of content (like this post) but an organized, versioned aggregate of all the pieces needed for some end (like a source repository complete with issue discussions).

And what we are defining here is a way to link to, or reference, content package metadata in other databases, authorities, or indexes. We are connecting different ecosystems: FreeBSD packages link to other FreeBSD packages (ex. as dependencies), and NPM packages link to other NPM packages, but we want to link to both. I think the term "interlink" is fitting.

And so, until I get better ideas, my suggested name for this standard is: Content Package Interlink Format, or C.P.I.F.! 😃 It's also a play on "copy if": you may or may not want to copy this content package, depending on the metadata in its CPIF links. But I hope someone suggests something better, so the name is of course subject to change.

(1) Brainstorming The Reference Format Structure

The CPIF link format has to be multi-part, with the first part identifying the package database, and at least one more part to uniquely identify the specific package. Since BSD ports datasources (likely our most significant data source for software) and Gentoo Portage use a slash-delimited path to identify records, I also used a slash following the prefix.
Other content package databases may use a deeper-layer hierarchy. This also maps easily to URLs and the Unix filesystem, including some new ideas for the latter (ex

It is an open question whether CPIF should organize the data sources by category (ex. /cpif/video/youtube, /cpif/software/cabal, etc). I currently think this is a bad idea, because some databases could fall into multiple categories, and it would be best to deal with that further down the path (ex. /cpif/facebook/video/$ID, /cpif/facebook/photo/$ID). Also, some data sources defy easy categorization.

(2) Database Identifiers

I think my web paste example covers most foreseeable scenarios. (Note that it contained an error: I forgot to edit out ".se" from the pkgsrc prefix when pasting.) In light of the above brainstorming, it should now read:

    h2o:
      uri:
        - https://h2o.examp1e.net
      tags:
        - server
        - software
        - web
      license:
        - MIT/X11 License
      license_reference:
        - https://h2o.examp1e.net/faq.html
      cpif:
        - github/h2o/h2o
        - freebsd/www/h2o
        - pkgsrc/www/h2o
        - opensuse/h2o
        - homebrew/h2o

It is an open question whether we should use domain names for the projects (ex. brew.sh) rather than a simplified ID string (ex. homebrew). I think that the latter is the way to go. This way we can maintain consistency even if domain names change (ex. a gTLD to Namecoin exodus). Also, sometimes there are multiple sites for a package database: some are more formal sites for the project (ex. freebsd.org, pkgsrc.org), while other third-party sites contain the actual metadata (ex. freshports.org, pkgsrc.SE).

(To be continued...)
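To make the proposed structure concrete, here is a minimal Python sketch of how a CPIF reference string could be split into a database identifier and a package path. Nothing here is agreed upon: the names `parse_cpif`, `CpifRef`, and `KNOWN_DATABASES` are hypothetical, and the set of identifiers is simply copied from the example record above.

```python
from typing import NamedTuple, Tuple

# Hypothetical registry of simplified database IDs (taken from the h2o example).
KNOWN_DATABASES = {"github", "freebsd", "pkgsrc", "opensuse", "homebrew"}


class CpifRef(NamedTuple):
    database: str          # first segment: which package database
    path: Tuple[str, ...]  # remaining segments: the package within that database


def parse_cpif(ref: str) -> CpifRef:
    """Split a slash-delimited reference into database prefix and package path."""
    segments = ref.split("/")
    # A reference needs a prefix plus at least one more part, with no empty parts.
    if len(segments) < 2 or any(not s for s in segments):
        raise ValueError(f"malformed reference: {ref!r}")
    database, *rest = segments
    if database not in KNOWN_DATABASES:
        raise ValueError(f"unknown database prefix: {database!r}")
    return CpifRef(database, tuple(rest))
```

Because the path is kept as a tuple of any length, the same function would handle both flat databases (homebrew/h2o) and those with a deeper hierarchy (freebsd/www/h2o).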
Format Name

I'm not a big fan of the term "content" for this purpose. You say this to

This usage seems to imply the common use of the term "content" on the internet,

I find your objection to the word "work" unconvincing. The term has some

Of course, I also think that the "interlink format" you propose is naturally

I think all I really like about the name as a whole that you invented is

Format Structure / Database Identifiers

I think I'm happier (but not fully happy -- more on that in a moment) with something like:

. . . or something like that. On the other hand, URIs might be appropriate

Those are my sixteen cents. Two cents ain't what they used to be.
I'm fine with

I'll just make a few nitpicks, but leave the final decision to CI.

It would be interesting to have a lengthy debate on codifying a reexamined computing terminology (like how the word "content" means "digital stuff you can download and consume", now including apps, rather than package "contents", etc.), but this isn't the place.

I haven't thought of this as a new file format (*.mif), but as a string syntax to be used in other file formats, like how URI / Href syntax is used in HTML, etc. (Major differences from Href would include: *1* It can be a list of reference strings instead of just one. *2* It has a centrally-defined prefix lookup table instead of an arbitrary server. *3* No protocol, port, etc; only path.)

Some of your

But, again, I leave the details up to CI. "Don't let perfect be the enemy of good." I just wanted to emphasize the importance of coming up with a reusable standard for referencing works metadata sources, ideally with a memorable name. This syntax can then be used for a number of my future projects, like a package manager for installing copyfree software, fonts, Nim libraries, ebooks, offline website snapshots, etc.
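The three differences from Href listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `PREFIX_TABLE`, `resolve`, and the URL templates are hypothetical placeholders, not an agreed registry, and a real lookup table would have to be defined centrally by the project.

```python
from typing import Dict, Iterable, Optional

# Hypothetical central lookup table: database prefix -> URL template.
# The templates are illustrative assumptions only.
PREFIX_TABLE = {
    "github": "https://github.com/{path}",
    "freebsd": "https://www.freshports.org/{path}",
}


def resolve(refs: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each 'prefix/path' reference to a URL, or None if it cannot be resolved."""
    resolved = {}
    for ref in refs:
        # The reference carries no protocol or port: just a prefix and a path.
        prefix, _, path = ref.partition("/")
        template = PREFIX_TABLE.get(prefix)
        resolved[ref] = template.format(path=path) if template and path else None
    return resolved
```

Calling `resolve(["github/h2o/h2o", "homebrew/h2o"])` would return a URL for the first reference and `None` for the second, since this sketch's table has no homebrew entry. A consumer can therefore accept a whole list of references and use whichever ones it knows how to follow.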
gists, GitHub Pages, and wikis
I'm not entirely sure what you're suggesting.
That's a good idea, of course, and I don't object in principle. Getting into the practical details, though, I still think that just providing literal directions to the source information is probably more useful.
@lbmn proposed new metadata:

The proposal involved a works_databases_references key to provide references to sources of information about works included in the list, as exemplified in a web paste for a YAML work submission @lbmn provided. This kind of thing could prove useful in the transition to a proper works database, with some automated population of new works entries and perhaps automated alerts of changes to projects behind existing entries based on those data sources.

At this time, datasources seems like a better metadata key name, though the specific format of the values in the datasources array must still be considered. Please share any ideas and suggestions in comments here, or in discussion in the #copyfree and/or ##copyfree channels on freenode.