Sorry for taking so long to reply. I didn't see the email till just now.
I envisioned that each application would understand its own event set
and develop its own namespace in XRI. So non-text is just as easy as text:
you develop an XRI spec to describe the document along with its parts and
attributes, then you invent a series of events (described in XML) that
operate on that document and those parts and attributes. The XMPP
portion delivers the event messages to the participants and provides a
framework for inventing new "Presence" codes and publish/subscribe
opportunities (if needed). XMPP can already deliver avatar images, which
are non-text data, so the structure is there for non-text data, but I'll
talk more about it just the same.
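To make the event idea concrete, here is a minimal Python sketch of an
XML-described event that records what happened, what it happened to, and
who did it. The namespace, the element and attribute names, and the
identifier string are all invented for illustration; in particular the
identifier is only a placeholder and has not been checked against the XRI
syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for this sketch; not part of any published spec.
EVENT_NS = "urn:example:collab:events"


def build_event(action, target, actor, payload=None):
    """Serialize an event: what happened, what it happened to, who did it."""
    event = ET.Element(f"{{{EVENT_NS}}}event", {"action": action, "actor": actor})
    ET.SubElement(event, f"{{{EVENT_NS}}}target").text = target
    if payload is not None:
        ET.SubElement(event, f"{{{EVENT_NS}}}payload").text = payload
    return ET.tostring(event, encoding="unicode")


def parse_event(xml_text):
    """Turn a received event back into a plain dictionary."""
    event = ET.fromstring(xml_text)
    payload = event.find(f"{{{EVENT_NS}}}payload")
    return {
        "action": event.get("action"),
        "actor": event.get("actor"),
        "target": event.findtext(f"{{{EVENT_NS}}}target"),
        "payload": payload.text if payload is not None else None,
    }


# Placeholder identifier for "paragraph 3 of some document" (assumed format).
example = build_event(
    action="insert-text",
    target="xri://@example*docs/report/paragraph/3",
    actor="michael@example.org",
    payload="Hello",
)
```

The resulting string is what would travel inside an XMPP message; the
transport does not need to understand it.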
As for the non-text parts and data, the options as I see them are: translate
the binary bits into text descriptions (or use some kind of text encoding
such as Base64); use a CDATA section to embed the data (though XML can't
carry arbitrary bytes, so truly raw binary would still need a text
encoding); look for something in the EXI (Efficient XML Interchange) spec
that addresses this directly; or use an alternate channel to deliver the
bits and use the XMPP event to tell applications where to go for the new
bits. (In this last case, let's
pretend that the only two apps in the session were on the same physical box
- they could use a shared memory area to pass the data between them and use
the XMPP channel to pass the addresses within that shared memory space.
Another case might be some kind of object where it doesn't make sense to
maintain a local copy of the whole object (perhaps it's too big) and the app
is sophisticated enough to splice together the remote object with changes
stored locally (I'm thinking updates to something like the Google Maps
database, or a local augmentation/overlay to the Google Maps data) - the
event would describe the change, but instead of putting all the data
directly in the event message, it would describe the event in a way that
told the other participants they needed to go elsewhere to get the details
of this event - this "elsewhere" need not be XML based even though the event
telling them about it was.)
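Two of the options above (text-encoding the bits inline, or pointing at an
alternate channel) can be sketched as follows. This is only an
illustration: the size threshold, the field names, and the `fetch`
callback are assumptions, and a checksum is added so the receiver can
verify whatever it retrieves from "elsewhere".

```python
import base64
import hashlib

INLINE_LIMIT = 4096  # assumed cut-off in bytes; tune per application


def describe_binary(data, fetch_url):
    """Describe binary data for an event: inline if small, by reference if not."""
    digest = hashlib.sha256(data).hexdigest()
    if len(data) <= INLINE_LIMIT:
        return {
            "mode": "inline",
            "encoding": "base64",
            "data": base64.b64encode(data).decode("ascii"),
            "sha256": digest,
        }
    # Too large for the event itself: tell participants where to get it.
    return {"mode": "reference", "href": fetch_url, "size": len(data), "sha256": digest}


def recover_binary(description, fetch):
    """Rebuild the bytes; `fetch` is a caller-supplied function(href) -> bytes."""
    if description["mode"] == "inline":
        data = base64.b64decode(description["data"])
    else:
        data = fetch(description["href"])
    if hashlib.sha256(data).hexdigest() != description["sha256"]:
        raise ValueError("payload does not match the digest in the event")
    return data
```

The `fetch` callback is where a shared-memory lookup, an HTTP download, or
any other non-XML channel would plug in.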
The XMPP server is the message hub, and one of the participating apps is the
host for the session. The non-text app joins the right XMPP channel. If
this app has never participated on this document before, the host sends the
new participant the complete current version of the document (wrapped up in
XML or one of the alternatives I mentioned). If the app has participated
before, then there is an existing version local to the new member which
needs to be synchronized. Using a process very similar to version control
updates, the two systems would check each "part". (I envision a scheme that
first checks the version of the whole object; if there is no match, it
checks the next level of major parts; for the ones that don't match, it
checks the next level of parts, and so on until there are no more levels to
go down, at which point the host resends the current complete version of
those lowest-level parts.) I'm reminded of LDAP
synchronization where each object and attribute has a version on it and two
hosts communicate and figure out what the right version of each object is.
In really sticky situations, where there are conflicts, you get the human to
figure out how to merge the two objects. To start however let's just assume
that new participants always get a complete copy of the current object and
there are no "offline changes" to try and merge in.
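The top-down version check described above can be sketched as a recursive
comparison of two trees of parts. This is a rough illustration under
stated assumptions: versions are plain integers, a parent's version
changes whenever any descendant changes, and the host's copy always wins
(no merging of offline changes). The `Part` class and function name are
invented for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    version: int
    children: dict = field(default_factory=dict)  # part name -> Part


def parts_to_resend(host, client, path=()):
    """Return the paths of the host parts that the client needs resent."""
    if client is not None and client.version == host.version:
        return []  # versions match: nothing below this point needs checking
    if client is None or not host.children:
        return [path]  # part is missing locally, or is a lowest-level part
    stale = []
    for name, child in host.children.items():
        stale.extend(parts_to_resend(child, client.children.get(name), path + (name,)))
    # If every child matched but the parent differs (for example a child was
    # removed on the host), fall back to resending the whole parent.
    return stale or [path]


host_doc = Part(3, {"a": Part(1), "b": Part(2, {"x": Part(1), "y": Part(2)})})
client_doc = Part(2, {"a": Part(1), "b": Part(1, {"x": Part(1), "y": Part(1)})})
stale_paths = parts_to_resend(host_doc, client_doc)  # only ("b", "y") differs
```

Each returned path is a tuple of part names from the root, so an empty
tuple means "resend the whole document".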
In fact, I've been thinking about this quite a lot and if what we're
developing is a collaboration architecture/framework, then what's really
needed are versioned document type interfaces and protocols. Something very
much along the lines of SNMP, where there are generic information
libraries that operate on a generic class of object (for instance, the
equivalent of "router" in SNMP in our framework might be "text/plain" or
"image/png"), but there can also be published specialized libraries that can
implement more functionality for a subclass of those objects (for instance
"cisco router" in SNMP might be "text/html" or
"image/game-objects-grid-with-bitmasks"). The generic library can certainly
operate on the subclass of objects (though the subclass library obviously
can't operate on all generic objects), even if at a reduced level of
functionality or efficiency compared with using the specialized
libraries. But what's been enabled
is that any application that implements the "text/plain" interface
description can collaborate on all "text/plain" documents with other apps
that know how to speak "text/plain". This of course extends to "text/html"
and "image/x-bmp".
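One way to picture the generic/specialized relationship is a small table
that maps each specialized document type to the generic type it refines,
plus a lookup that walks from the most specialized interface down to the
most generic one. The parent table below simply encodes the examples from
this email; the function names are made up for the sketch.

```python
# Hypothetical parent map: specialized type -> the generic type it refines.
PARENT_TYPE = {
    "text/html": "text/plain",
    "image/game-objects-grid-with-bitmasks": "image/png",
}


def type_chain(doc_type):
    """List a type followed by its increasingly generic ancestors."""
    chain = [doc_type]
    while chain[-1] in PARENT_TYPE:
        chain.append(PARENT_TYPE[chain[-1]])
    return chain


def pick_library(doc_type, installed):
    """Pick the most specialized installed library able to handle doc_type."""
    for candidate in type_chain(doc_type):
        if candidate in installed:
            return installed[candidate]
    return None


def common_interface(doc_type, participants):
    """Find the most specialized interface that every participant supports."""
    for candidate in type_chain(doc_type):
        if all(candidate in supported for supported in participants):
            return candidate
    return None
```

With this, a "text/html" session that includes a plain text editor would
settle on "text/plain" as the shared interface, which matches the reduced
but still workable collaboration described above.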
Let's take the simple text document or bitmap image (the model expands to
much more complicated structures too). There are lots of apps out there to
manipulate these types of documents. These document types are well known
and their structure is well understood, so it seemed to me that it should be
quite possible, using a collaboration framework, to get many different apps
collaborating at the same time.
I also was thinking that while many peer applications could collaborate,
differing versions of the same application might not be able to collaborate
because they wouldn't support the same event structures or necessarily
understand the same addressing identifiers. This led me down the interfaces
path.
All applications in the same session must be able to speak the same language
(understand the same event structures) and be talking about the same thing
(the nomenclature for the objects).
In my idealized system at the moment, application developers who want to
collaborate on specific types of documents would implement the
interface(s) for that document type. The interface chosen would be
versioned, e.g. "text/plain v20080503.100" (I made up a date.rev scheme).
This way the session host can interrogate the client up front about whether
or not that participant supports the version of the interface that session
is using.
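The up-front interrogation could look something like the sketch below. It
assumes the made-up "type vDATE.REV" label format from the example above
and, as a simplifying assumption, requires an exact version match; a real
design might allow compatible ranges instead.

```python
from typing import NamedTuple


class InterfaceVersion(NamedTuple):
    doc_type: str
    date: int
    rev: int


def parse_interface(label):
    """Parse a label such as 'text/plain v20080503.100' into its parts."""
    doc_type, version = label.rsplit(" v", 1)
    date, rev = version.split(".")
    return InterfaceVersion(doc_type, int(date), int(rev))


def can_join(session_label, client_labels):
    """Host-side check: does the client support the session's exact interface?"""
    wanted = parse_interface(session_label)
    return any(parse_interface(label) == wanted for label in client_labels)
```

The host would call `can_join` with the list of interfaces the client
advertises, and refuse (or offer a more generic interface) when it returns
False.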
Events could show up in the event loop of the application just like other OS
based events. There would be certain mandatory events to implement and
other optional (may/should) events. One question I'm still working through is
whether or not the XRI format description has a separate version from the
event set. In other words "We're using v1.2 of the object format and event
set v5.4" (because it was a simple object that was easy to describe, but the
kinds of things being done to that object have evolved with time).
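As a sketch of how mandatory and optional events might surface in an
application, the dispatcher below keeps the object format version and the
event set version as two separate values (one possible answer to the open
question above, not a settled one). The event names are placeholders
invented for the example.

```python
MANDATORY = {"insert", "delete"}  # hypothetical "must implement" events
OPTIONAL = {"cursor-moved"}       # hypothetical "may/should" events


class EventDispatcher:
    def __init__(self, format_version, event_set_version):
        # Kept separate so the event set can evolve without a format change.
        self.format_version = format_version
        self.event_set_version = event_set_version
        self.handlers = {}

    def on(self, name, handler):
        """Register a handler for one event defined by this event set."""
        if name not in MANDATORY | OPTIONAL:
            raise ValueError(f"unknown event for this event set: {name}")
        self.handlers[name] = handler

    def missing_mandatory(self):
        """List mandatory events the application has not implemented yet."""
        return sorted(MANDATORY - set(self.handlers))

    def dispatch(self, name, event):
        """Deliver an event; optional events without a handler are ignored."""
        handler = self.handlers.get(name)
        if handler is None:
            if name in MANDATORY:
                raise LookupError(f"no handler for mandatory event: {name}")
            return False
        handler(event)
        return True
```

A host could refuse a participant whose `missing_mandatory()` list is not
empty before the session starts.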
By ignoring the actual document itself and using versioned interfaces as the
"agreement/contract" model, libraries that provide rich API experiences for
app developers can be built up and app developers aren't stuck redeveloping
that layer, and any object that can be viewed through the lens of an
existing document type can be collaborated on (Think text/plain editors
being used to collaborate on text/xml documents before there were text/xml
native editors).
At the same time, the library developers get a specific nomenclature for
describing the document and a specific set of events that can happen to
its parts. The versioning allows for the evolution of the libraries and
standards, and protects end users from putting incompatible applications in
the same session.
Finally, it allows heterogeneous applications to collaborate, which is
something of a holy grail in my mind's eye.
Hope this was useful.
-- Michael --
PS In a lot of ways it seems to me that other technologies have addressed
various pieces of the same puzzles.
- Integration with the KDE Decibel project (which itself seems to be based
on Telepathy) seems important to abstract even the XMPP layer a little bit
further.
- SNMP has lots of good ideas on how to approach the "Many vendors, many
objects, many applications, one framework" problem. They have a
hierarchical object id based system where a library written for one level in
the hierarchy will work for all children within that hierarchy.
- It just kind of happened in this email, but MIME types might be a decent
starting point for the kinds of objects folks will initially be able to
collaborate on.
- MMORPGs have many participants collaborating on the same shared objects.
They've probably got some very good ideas, both for treating each local
application as a game client that receives events from its peers and for
giving the participants communication structures beyond the actual
document editing.
- As an overall end-user experience, some thought needs to be given to how
end users will be able to find each other's sessions. I think the IRC and
IM worlds of today give us lots of ideas: private chats, public channels,
invitations, channel listings, away messages?, channel kicks/bans, and
others I'm sure.
- I strongly believe security needs to be in at the base of the design.
Since I'm proposing XRI as the way to identify objects and parts, I'm
thinking that somehow associating an ACL to those XRI components is the way
to go. LDAP might provide some inspiration as it is a collaborative
hierarchical object model.
- Lastly, I'm hoping to avoid being locked into a realtime paradigm for
collaboration. Many people collaborate by emailing the document back and
forth. The main advantage of this method is that the two parties don't have
to be online at the same time. It reminds me of those PBEM (Play by Email)
games like chess, where you'd make your move in a game, the game would send
an email to your opponent, your opponent would open the email
attachment which would open using their copy of the game which would render
your move for them, they'd make their move, and the cycle would continue. I
think it would be great if these non-realtime protocols could be used as
transports for the events in the collaborative framework.
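For the security point in the list above, one possible shape for
"associating an ACL to those XRI components" is a table keyed by
identifier path, where the most specific entry that mentions a user
decides. Everything here is an assumption for illustration: the paths are
plain tuples of segments rather than real XRIs, and the rights are just
strings.

```python
def allowed(acl, user, path, right):
    """Check the most specific ACL entry covering `path`; deny by default.

    `acl` maps a tuple of path segments to {user: set_of_rights}.
    """
    segments = tuple(path)
    # Walk from the full path up towards the root, most specific first.
    for depth in range(len(segments), -1, -1):
        entry = acl.get(segments[:depth])
        if entry is not None and user in entry:
            return right in entry[user]
    return False


# Hypothetical ACL: bob can read the document but nothing under "budget".
example_acl = {
    ("doc",): {"alice": {"read", "write"}, "bob": {"read"}},
    ("doc", "budget"): {"bob": set()},
}
```

A host could run this check on the "who did it" and "what it happened to"
fields of each incoming event before rebroadcasting it.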
On Thu, Apr 17, 2008 at 9:32 AM, Roger Pixley <skree...@gmail.com> wrote:
> Seems like a good proposal. The more that I think about it, the better
> it is to have a central server per user logged in. How would you deal
> with non-text type documents?
> On 4/11/08, Michael Fair <mich...@daclubhouse.net> wrote:
> > So assuming that you all bought into my prior model of using the MMORPG
> > as the event model, and XMPP as the Message Bus for said events, the
> > next thought I had was regarding the events themselves.
> > Each event needs to communicate at least two things: what happened, and
> > what it happened to. It is also good to know who or what did it, for
> > authorization purposes. At some point the security model, where not all
> > session members have complete control over the shared resource, needs
> > to be addressed.
> > I propose using the Extensible Resource Identifier (XRI) (if you don't
> > already know it, please look it up, as I can't explain it in this brief
> > email) as the way of describing what it happened to. I'm sure some
> > bright spark can also use it to describe the "What Happened" too.
> > In short, XRI has thought about how to address objects, their
> > properties, and even fragments of the properties, as well as attributes
> > of all of the above. They took from URL, URI, and IRI
> > (internationalized) and extended it so it fit well within a Federated
> > XML world and could be referenced through "synonyms". In other words,
> > multiple identifiers can be used to identify the same object.
> > It's the federation and the synonyms part (where different authorities
> > can control different parts of the identifier) that really made me take
> > notice. The folks who worked on that spec really thought this thing
> > through and seemed to know the subject matter of what would be needed
> > well.
> > Thanks for the chance to share my thoughts,
> > -- Michael --