I just learned about an interesting standard called "Metalink."
Metalink is designed to be used for applications that need to download
sets of files (e.g. linux package managers). It uses an xml format to
describe the aggregation of files and things like checksums, primary
and secondary sources (i.e., URLs), protocols (http, bittorrent).
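To make the description above concrete, here is a minimal sketch that reads file names, checksums, and source URLs out of a Metalink document. It assumes the Metalink 3.0 XML layout (namespace http://www.metalinker.org/); the sample document, its URLs, and its hash value are made up for illustration, and `list_files` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample assuming the Metalink 3.0 layout; URLs and hash are invented.
SAMPLE = """<metalink version="3.0" xmlns="http://www.metalinker.org/">
  <files>
    <file name="example.iso">
      <verification>
        <hash type="md5">0123456789abcdef0123456789abcdef</hash>
      </verification>
      <resources>
        <url type="http" preference="100">http://mirror-a.example.org/example.iso</url>
        <url type="bittorrent" preference="50">http://mirror-b.example.org/example.iso.torrent</url>
      </resources>
    </file>
  </files>
</metalink>"""

NS = {"ml": "http://www.metalinker.org/"}


def list_files(xml_text):
    """Return one dict per <file>: its name, checksums, and (protocol, URL) sources."""
    root = ET.fromstring(xml_text)
    result = []
    for f in root.findall("ml:files/ml:file", NS):
        # Checksums keyed by algorithm, e.g. {"md5": "..."}
        hashes = {h.get("type"): h.text.strip()
                  for h in f.findall("ml:verification/ml:hash", NS)}
        # Each source is a (protocol, URL) pair, in document order
        urls = [(u.get("type"), u.text.strip())
                for u in f.findall("ml:resources/ml:url", NS)]
        result.append({"name": f.get("name"), "hashes": hashes, "urls": urls})
    return result


if __name__ == "__main__":
    for entry in list_files(SAMPLE):
        print(entry["name"], entry["hashes"], entry["urls"])
```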
I cannot help but note the overlap between such a standard and the OAI-
ORE effort -- while they obviously come from very different domains,
the generic nature of the task (aggregating sets of objects, adding
some metadata about the objects, and the relationships between
objects) inevitably leads to overlap. There are other standards
(Maven's Project Object Model, the Yahoo Media RSS extension, etc.)
that also tackle the aggregation issue from various angles.
I wouldn't suggest that such diversity is a bad thing, but I cannot
help but think that an awareness/understanding about how they differ
(or not) would be helpful to implementors of OAI-ORE. When learning
about a new technology, I like to see a comparison chart with other,
related technologies for two reasons: one, I want to know if I am
choosing the right technology/standard for the task at hand, and two,
I want to know that the creators of the technology/standard are aware
of other similar efforts and have (presumably) taken those into
account (drawing best practices, avoiding wheel re-invention, etc.).
On Fri, Aug 22, 2008 at 8:58 AM, pkeane <pjke...@gmail.com> wrote:
> Hi All-
> Any thoughts on: 1. OAI-ORE's relationship to Metalink, 2. the usefulness of a "comparison chart" that differentiates OAI-ORE from related standards?
1. I can see the surface similarity, inasmuch as Metalink allows multiple resources with the same bytestream to be linked together in the <file> blocks, and multiple resources to be grouped in the <files> element. I couldn't see any defined semantics for the relationship between entries in the <files> element; did I just miss it? You could create a Metalink-style profile of ORE by identifying a few relationships for the message digests, owl:sameAs (xxx:isMirrorOf?) between the mirror files, and so forth. It's a very specific sort of Aggregation.
2. Definitely a good idea! Suggestions welcome as to which other projects to include in a comparison chart!
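As a rough illustration of the "Metalink-style profile of ORE" idea in point 1, the sketch below emits plain (subject, predicate, object) triples for an aggregation of files with digests and mirrors. Only ore:aggregates comes from the ORE vocabulary; the `EX` namespace, its `md5` and `isMirrorOf` properties, and the function name are hypothetical stand-ins (owl:sameAs could be substituted for `isMirrorOf`, as suggested above).

```python
ORE = "http://www.openarchives.org/ore/terms/"
EX = "http://example.org/terms/"  # hypothetical vocabulary, not a published one


def metalink_profile(aggregation_uri, files):
    """Build triples for an aggregation.

    files maps a primary URL to {"md5": str, "mirrors": [url, ...]}.
    """
    triples = []
    for primary, info in files.items():
        # The aggregation aggregates each primary file
        triples.append((aggregation_uri, ORE + "aggregates", primary))
        # Message digest attached to the file (hypothetical property)
        triples.append((primary, EX + "md5", info["md5"]))
        # Each mirror points back at the primary copy (hypothetical property)
        for mirror in info["mirrors"]:
            triples.append((mirror, EX + "isMirrorOf", primary))
    return triples


if __name__ == "__main__":
    demo = metalink_profile(
        "http://example.org/agg",
        {"http://a.example.org/f.iso": {
            "md5": "abc",
            "mirrors": ["http://b.example.org/f.iso"]}},
    )
    for t in demo:
        print(t)
```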
On Fri, Aug 22, 2008 at 2:58 PM, pkeane <pjke...@gmail.com> wrote:
> I just learned about an interesting standard called "Metalink." Metalink is designed to be used for applications that need to download sets of files (e.g. linux package managers). It uses an xml format to describe the aggregation of files and things like checksums, primary and secondary sources (i.e., URLs), protocols (http, bittorrent).
> [...]
> Any thoughts on: 1. OAI-ORE's relationship to Metalink, 2. the usefulness of a "comparison chart" that differentiates OAI-ORE from related standards?
> 1. I can see the surface similarity, in as much as metalink allows the linking of multiple resources with the same bytestream together in the <file> blocks, and multiple resources in the <files> element. I couldn't see any defined semantics of the relationship between entries in the <files> element, but did I just miss it? You could create a metalink style profile of ORE by identifying a few relationships for the message digests, owl:sameAs (xxx:isMirrorOf?) between the mirror files and so forth. It's a very specific sort of Aggregation.
> 2. Definitely a good idea! Suggestions welcome as to which other projects to include in a comparison chart!
-- Phil Barker Learning Technology Adviser ICBL, School of Mathematical and Computer Sciences Mountbatten Building, Heriot-Watt University, Edinburgh, EH14 4AS Tel: 0131 451 3278 Fax: 0131 451 3327 Web: http://www.icbl.hw.ac.uk/~philb/
-- Heriot-Watt University is a Scottish charity registered under charity number SC000278.
On Tue, Aug 26, 2008 at 6:39 AM, Robert Sanderson <azarot...@gmail.com> wrote:
> 1. I can see the surface similarity, in as much as metalink allows the linking of multiple resources with the same bytestream together in the <file> blocks, and multiple resources in the <files> element. I couldn't see any defined semantics of the relationship between entries in the <files> element, but did I just miss it? You could create a metalink style profile of ORE by identifying a few relationships for the message digests, owl:sameAs (xxx:isMirrorOf?) between the mirror files and so forth. It's a very specific sort of Aggregation.
Yes, exactly what I was driving at: if I have an application that "knows" ORE, it should be possible to meet the use case that Metalink addresses with ORE itself.
This sort of list and comparison would be truly useful. BTW, it was pointed out to me that Maven's POM (which I had originally mentioned) might not merit inclusion since it's a very different sort of thing. Not being a Maven user, I'll defer to others' expertise.
An underlying motivation here (for me, at least) would be to say, in essence, "ORE can meet many/all of your aggregation needs, and here's how it might apply to all of these use cases." In any case, it's a useful exercise to see how different domains approach the need for descriptions of aggregated resources.
> On Tue, Aug 26, 2008 at 6:39 AM, Robert Sanderson <azarot...@gmail.com> wrote:
> 1. I can see the surface similarity, in as much as metalink allows the linking of multiple resources with the same bytestream together in the <file> blocks, and multiple resources in the <files> element. I couldn't see any defined semantics of the relationship between entries in the <files> element, but did I just miss it? You could create a metalink style profile of ORE by identifying a few relationships for the message digests, owl:sameAs (xxx:isMirrorOf?) between the mirror files and so forth. It's a very specific sort of Aggregation.
> Yes, exactly what I was driving at: if I have an application that "knows" ORE, it should be possible to meet the use case that metalink addresses with ORE itself.
> 2. Definitely a good idea! Suggestions welcome as to which other projects to include in a comparison chart!
> Starting with:
> * METS
METS should map fairly closely to ORE, at least in that its use of XLink pointers to other METS instances and to actual resources allows the representation of both aggregated aggregations and aggregated resources. Its allowance for embedding other heterogeneous XML formats permits the encoding of more complex descriptive metadata.
I would also add DDI 1.X/2.X/3.X, IMS-CP, and LOM to this category of XML Schema-driven standards for encoding aggregations of resources and their relationships to one another. I'm sure there are a dozen others.
> This sort of list & comparisons would be truly useful. BTW, it was pointed out to me that Maven's POM (which I had originally mentioned) might not merit inclusion since it's a very different sort of thing. Not being a Maven user, I'll defer to others expertise.
A Maven POM is a descriptive representation of a Maven "project" used to construct an "Artifact" (Resource). Dependencies, Plugins, Inheritance and Overlays might represent mechanisms of aggregation within this representation of the Project. For an example of such mechanisms, one can review the following example...
But it's important to understand that what Maven supports as an "Identifier" mechanism is a composite set of tags that identify resources, and not necessarily a URI (though one can be constructed by the underlying application, Maven). For instance...
Which represents the base URI for accessing a number of important resources used within the Maven transitive dependency resolution mechanism including pom.xml "manifests", "time-stamped versions", and "signatures".
I'll finally toss in that there already exists an evolving RDF standard in this area, though it may not have such an elaborate dependency/inheritance mechanism... DOAP
> An underlying motivation here (for me, at least) would be to say in essence "ORE can meet many/all of your aggregation needs" and here's how it might apply to all of these use cases. In any regard, it's a useful exercise to see how different domains approach the need for descriptions of aggregated resources.
In an environment as pluralistic as the traditional WWW (and now the Semantic Web/LOD), it seems to me extremely unlikely that everyone would look to ORE to meet their aggregation needs. It might only happen if users in many communities recognize something highly unique and valuable in its implementation, beyond what can be done in a more tractable, case-specific format (which, IMO, is more easily adopted, customized, and utilized given their use case and limited resources). We have already seen this with OAI-PMH: initially, groups like Google saw an immediate possibility/need to access resources in our niche, but then abandoned it as a global solution in favor of a more "platform specific" representation that met their needs (the Sitemap protocol and standard, http://sitemaps.org). Now it is we who are adopting their vision and not the other way around.
Metalink is a local, use-case-specific format pretty much dedicated to efficient file transfer. The data it carries is not dissimilar to the type of metadata one might get directly off a file system and/or from the basic representation of those files in services such as FTP or HTTP servers. It's less abstract, so there is a much lower bar to representing the state of the resources you may be exposing in a service, given that the metadata is intrinsic to the mechanism by which they are stored and delivered.
So, while someone very interested in ORE might dedicate effort to a mapping, I doubt it would be of interest to those whose tools already utilize an existing standard with popular uptake. Likewise, in the RDF world, what you're more likely to see is an RDF representation of such attributes and relationships that has little to do with ORE and is more specifically related to Metalink.
In fact, IMO, I would be less inclined to utilize ORE for such a case than an ontology that is more specific to the application/client base I am attempting to serve metadata to. For the moment, I did use ORE in some of our RDF representations:
After a clear expression (modelling) in the above ontology, we (the DSpace community) should be much freer to map to (or insert in) ORE or any other ontology when necessary. Once we have a greater capability to support statements attached to specific DSpace Resources (part of the DSpace 2.0 work), a door will open that allows the Submitter and Curator to attach whatever statements they need to any DSpace Resource (and this could include ORE Statements).
That said, I don't think folks will be interested in, for instance, having to replicate all the linkages of their "dc:relations" as ORE relations (describes, isDescribedBy, aggregates, isAggregatedBy, ...). Hence my previous question about utilizing property/class inheritance in one's RDF ontology to intrinsically express such mappings. Based on the response I got to that question, I've started to shy away altogether from the ORE model expressing the contents explicitly via predicates. I think ORE aggregations should actually just be containers of "loosely predicated", "ORE typed" rdf:resources.
For instance, any non-literal object of type "ore:Resource" in a statement whose "subject" is of rdf:type ore:Aggregation is part of that aggregation, no matter the "predicate".
> <URI-R> a ore:ResourceMap
> <URI-R> some:predicate <URI-A>
> <URI-A> a ore:Aggregation
> <URI-A> some:predicate <URI-AR>
> <URI-AR> a ore:Resource
Then, regardless of the underlying "predicates" used by ontologies, an application (or more concretely, an "Application Profile Ontology") can simply label its resources with a specific type (ore:Aggregation, ore:ResourceMap, ore:Resource) and remain more flexible in what it expresses on top of that simple model. This would mean the following representations might each be valid ORE:
Here, when the behavior is properly specified, an application could behave very simply...
For any rdf:resource in the ore:ResourceMap...
1a.) Resolve all non-literal object resources
1b.) Determine if resources referenced are of rdf:type ore:Aggregation (by looking for existing statements or attempting to resolve as SW/LOD).
For any rdf:resource in the ore:Aggregation...
2a.) Resolve all non-literal object resources
2b.) Determine if resources referenced are of rdf:type ore:Resource
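The two passes above can be sketched roughly as follows, treating a graph as a plain list of (subject, predicate, object) tuples. This is only an illustration under stated assumptions: the type names follow the proposal in this message, `members` is a hypothetical function name, type checks rely solely on statements already in the graph (the "attempt to resolve as SW/LOD" step is not implemented), and literals are simply objects that carry no type statement.

```python
RDF_TYPE = "rdf:type"


def members(triples, resource_map):
    """Map each aggregation reachable from resource_map to its typed resources,
    ignoring which predicate links them."""
    # Index the rdf:type statements: subject -> set of types
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    def objects_of(subject, wanted_type):
        # Any object of any non-type predicate counts, provided it is
        # declared to be of the wanted type (untyped literals drop out here).
        return sorted({o for s, p, o in triples
                       if s == subject and p != RDF_TYPE
                       and wanted_type in types.get(o, set())})

    result = {}
    # Steps 1a/1b: find aggregations referenced from the resource map
    for agg in objects_of(resource_map, "ore:Aggregation"):
        # Steps 2a/2b: find resources referenced from each aggregation
        result[agg] = objects_of(agg, "ore:Resource")
    return result


if __name__ == "__main__":
    graph = [
        ("R", "rdf:type", "ore:ResourceMap"),
        ("R", "some:predicate", "A"),
        ("A", "rdf:type", "ore:Aggregation"),
        ("A", "dc:relation", "AR1"),
        ("A", "some:other", "AR2"),
        ("A", "dc:title", "A title"),
        ("AR1", "rdf:type", "ore:Resource"),
        ("AR2", "rdf:type", "ore:Resource"),
    ]
    print(members(graph, "R"))
```

Note that in this sketch AR1 and AR2 both end up in the aggregation even though they hang off different predicates, which is the "loosely predicated" behavior being proposed.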
> On Fri, Aug 22, 2008 at 2:58 PM, pkeane <pjke...@gmail.com> wrote:
> > I just learned about an interesting standard called "Metalink."
> > Metalink is designed to be used for applications that need to download
> > sets of files (e.g. linux package managers). It uses an xml format to
> > describe the aggregation of files and things like checksums, primary
> > and secondary sources (i.e., URLs), protocols (http, bittorrent).
> > [...]
> > Any thoughts on: 1. OAI-ORE's relationship to Metalink, 2. the
> > usefulness of a "comparison chart" that differentiates OAI-ORE from
> > related standards?
Would it make sense for us to start a page on the ORE wiki (http://foresite.cheshire3.org/wiki/) with a list, and perhaps (as a start) links to information about each? With the assumption that some prose would be added describing the similarities/differences/use-cases for each vis-a-vis ORE...
> Would it make sense for us to start a page on the ORE wiki (http://foresite.cheshire3.org/wiki/) with a list, and perhaps (as a start) links to information about each? With the assumption that some prose would be added describing the similarities/differences/use-cases for each vis-a-vis ORE...
> --peter
> On Wed, Aug 27, 2008 at 1:09 PM, Jerome <jmcdo...@uiuc.edu> wrote:
> I would think that the XFDU work being done by the CCSDS (of OAIS Fame) should be in that chart as well.
> > 2. Definitely a good idea! Suggestions welcome as to which other projects to include in a comparison chart!
And to add a couple more to the list: EAD and CIDOC-CRM
I put the wiki up because it was easy for me (10 minutes) as opposed to on the host for the main openarchives.org site. One thing that could be done would be to have: wiki.openarchives.org rather than my own domain name.
Rob
On Wed, Aug 27, 2008 at 7:24 PM, Mark Diggory <mdigg...@mit.edu> wrote:
> Shouldn't a wiki and any such documentation be centralized to the OAI group somehow?
> On Aug 27, 2008, at 11:16 AM, Peter Keane wrote:
> Would it make sense for us to start a page on the ORE wiki (http://foresite.cheshire3.org/wiki/) with a list, and perhaps (as a start) links to information about each? With the assumption that some prose would be added describing the similarities/differences/use-cases for each vis-a-vis ORE...
> --peter
> On Wed, Aug 27, 2008 at 1:09 PM, Jerome <jmcdo...@uiuc.edu> wrote:
>> I would think that the XFDU work being done by the CCSDS (of OAIS Fame) should be in that chart as well.
Just wondering if anyone here has taken a look at Yahoo's SearchMonkey
project? It obviously has a much different aim than ORE -- improving
web search for producers and consumers -- but the path they have taken
bears a striking resemblance to OAI-ORE. They have developed a data
format called DataRSS [1], which uses a subset of the attributes
of RDFa to describe arbitrary RDF graphs. DataRSS can be used
standalone or as an Atom extension (which ends up looking quite a lot
like the most recent thinking on ORE's Atom serialization, with the
DataRSS piece being the "triples" mechanism [2]).
The interesting bits in the (quite extensive) documentation
include an appendix on recommended vocabularies [3] and a large set
of examples for various types of data (personal profiles, business
addresses, reviews, events, etc.) [4].
Anyway, I just began looking at it today after reading an article in
Nodalities [5]. I found it interesting as another point of
triangulation, as it becomes increasingly clear that there is only
one challenge/problem that the web offers and we are all trying to solve
it ;-).
> well, there are links on the OAI site, that'd seem to suffice. I
> like the virtual host idea, might be a nice addition and make it
> look more unified.
> -Mark
> On Aug 27, 2008, at 12:41 PM, Robert Sanderson wrote:
> > And to add a couple more to the list: EAD and CIDOC-CRM
> > I put the wiki up because it was easy for me (10 minutes) as
> > opposed to on the host for the main openarchives.org site.
> > One thing that could be done would be to have:
> > wiki.openarchives.org rather than my own domain name.
> > Rob
> > Wed, Aug 27, 2008 at 7:24 PM, Mark Diggory <mdigg...@mit.edu> wrote:
> > Shouldn't a wiki and any such documentation be centralized to the
> > OAI group somehow?
> > On Aug 27, 2008, at 11:16 AM, Peter Keane wrote:
> >> Would it make sense for us to start a page on the ORE wiki (http://
> >> foresite.cheshire3.org/wiki/) with a list, and perhaps (as a
> >> start) links to information about each? With the assumption that
> >> some prose would be added describing the similarities/differences/
> >> use-cases for each vis-a-vis ORE...
> >> --peter
> >> On Wed, Aug 27, 2008 at 1:09 PM, Jerome <jmcdo...@uiuc.edu> wrote:
> >> I would think that the XFDU work being done by the CCSDS (of OAIS
> >> Fame) should be in that chart as well.