In the few weeks since I published the first drafts of AtomActivity, ActivitySchema and friends, several things have come about:
- FriendFeed is collapsing multiple photo-related activities together into a single entry in its activity feeds.
- FriendFeed is using MediaRSS-in-Atom to publish Flickr photos in its activity feeds.
The former opposes a decision made in the name of simplicity at the last activity streams meetup to go with one activity per entry. However, since the spec is being led by implementations rather than the other way around, I'm planning to reverse this decision and go for a slightly less constrained model where an activity entry can have multiple objects, as long as the verb, actor, context and other activity properties are the same for each. Doing any further coalescing (similar activities by multiple actors, for example) is the job of the UI layer of the aggregator and should not be reflected in feeds. The next iteration of the spec will contain a section with requirements specifically targeting activity re-publishers and aggregators describing what their feeds should look like.
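To make the less constrained model a bit more concrete, here is a rough Python sketch of how a publisher might coalesce single-object activities into multi-object entries. The `Activity` class and its field names are hypothetical, invented purely for illustration; none of this comes from the spec drafts.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Activity:
    # Hypothetical in-memory shape; the field names are illustrative only.
    actor: str
    verb: str
    context: Optional[str]
    objects: list = field(default_factory=list)


def coalesce(activities):
    """Merge activities that share the same actor, verb and context.

    Activities by different actors are never merged here; that kind of
    grouping is left to the aggregator's UI layer.
    """
    merged = {}
    for act in activities:
        key = (act.actor, act.verb, act.context)
        if key not in merged:
            merged[key] = Activity(act.actor, act.verb, act.context, [])
        merged[key].objects.extend(act.objects)
    # dicts preserve insertion order, so first-seen order is kept
    return list(merged.values())


entries = coalesce([
    Activity("alice", "post", "flickr", ["photo-1"]),
    Activity("alice", "post", "flickr", ["photo-2"]),
    Activity("bob", "post", "flickr", ["photo-3"]),
])
```

Here Alice's two photo posts end up in one entry with two objects, while Bob's post stays in an entry of its own.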
The latter further erodes the argument that we can do AtomMedia because MediaRSS-in-Atom is not yet widely deployed. However, it's still not a slam-dunk for MediaRSS-in-Atom because the way FriendFeed is publishing these elements (as direct children of the activity's atom:entry element) puts them outside the purview of AtomActivity, which expressly ignores everything except the publication date and the activity-specific elements in an activity entry, under the assumption that the content there is intended for non-activity feed readers... therefore (and this will be written explicitly in the next draft of the spec) activity feed publishers can put anything they like in there that will make non-activity feed readers behave in the desired way, and activity-aware readers should ignore it and use the activity information.
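As a sketch of what "ignore everything except the publication date and the activity-specific elements" could look like in a consumer, here is a small Python example. The activity namespace URI is a placeholder I made up for this example, and the `activity:verb` and `activity:object` element names are assumptions rather than quotes from the draft.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Placeholder: the real AtomActivity namespace URI is not given in this post.
ACTIVITY_NS = "http://example.org/atom-activity-placeholder"
MEDIA_NS = "http://search.yahoo.com/mrss/"

ENTRY_XML = f"""
<entry xmlns="{ATOM_NS}" xmlns:activity="{ACTIVITY_NS}" xmlns:media="{MEDIA_NS}">
  <title>Alice posted a photo</title>
  <published>2009-01-20T12:00:00Z</published>
  <media:thumbnail url="http://example.com/thumb.jpg"/>
  <activity:verb>post</activity:verb>
  <activity:object><title>Sunset</title></activity:object>
</entry>
"""


def activity_view(entry):
    """Return only the parts an activity-aware reader would look at."""
    published = entry.findtext(f"{{{ATOM_NS}}}published")
    # Keep direct children in the activity namespace; skip everything else,
    # including the media:thumbnail meant for non-activity feed readers.
    activity_children = [
        child for child in entry
        if child.tag.startswith(f"{{{ACTIVITY_NS}}}")
    ]
    return published, activity_children


published, elements = activity_view(ET.fromstring(ENTRY_XML))
names = [el.tag.split("}", 1)[1] for el in elements]
```

The `media:thumbnail` and `atom:title` children are simply never consulted, which is why a publisher is free to put whatever it likes there.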
Folks who were keen on AtomMedia as an alternative to MediaRSS-in-Atom should take note, though, that the likelihood of success of the former is getting weaker with each system that implements MediaRSS-in-Atom; I don't personally have time to work on AtomMedia at the same time as AtomActivity, so I'd love it if someone else would take over as author of the media spec.
One pain point that exists for activity streams right now is the dispersal of responses over various networks. When I post a blog entry like this one, folks get the opportunity to comment on my blog itself (via TypePad Connect), or they can comment on the copy of my entry that gets sucked into LiveJournal from my Atom feed, or they can comment on the activity that shows up in FriendFeed or Plaxo. If I had it set up, they'd be able to comment on Facebook too.
It would be useful if all of these comments were aggregated together so that the entire thread of conversation could be viewed whatever context you end up reading my entry in. This is, unfortunately, not an easy problem to solve on the decentralized social web. Bloggers are accustomed to being "master of their domain", so in an ideal world they want their blog to be the master source of comments. However, it's clear that FriendFeed users want to leave their comments directly from the FriendFeed UI, not follow the link through to the original entry and comment there.
One idea is to provide some sort of endpoint where comments can be submitted by remote systems, but it's difficult to see how that would work with authenticated comments and with comment forms that have features such as CAPTCHAs. It would also be tricky to get right with the "paste in a chunk of HTML" comment systems such as Disqus and TypePad Connect. Every blog has its own variation of allowed comment markup, too, not to mention odd-ball cases like YouTube's video comments. Coming at it from the other side, it's unlikely that the other systems will be willing to relinquish control of "their" comments; for many sites, the discussions they host are a big part of their value.
Another approach is a more passive model where comments simply get added to an activity stream somewhere and it somehow gets consumed by all other sites displaying comments, but then discovery of these various streams and figuring out how to deal with abuses becomes the problem.
I don't know the solution right now, but I do feel that this is an important problem to work on as we move towards a more decentralized social web; as people start to use more and more different activity aggregators it will become increasingly difficult to stay engaged with the conversations that are going on.
Last Thursday Six Apart hosted a very productive meet-up for the Activity Streams community -- which turned out to be far bigger than I imagined -- where we had some good discussions about where we are and where we're going. I think overall the feedback on the current spec drafts was positive, though there was a definite desire to grow the schema to support the activities exposed by more social sites. MySpace joining the fray has also made purely social activities such as friendship relationships, which we were previously deferring for a later draft, suddenly more important.
I like to work to defined goals, so here are my high-level goals for the next iteration:
- Write a spec for the representation of people as Atom entries, to enable them to be used as activity objects. This will probably be based on the XML serialization of the PortableContacts schema, though there will be some adjustments to address the redundancy that exists between some existing Atom elements and the PortableContacts fields.
- Expand the schema to include verbs and object types necessary to support a large proportion of the publishers currently supported by FriendFeed and Plaxo. These are easier to specify because FriendFeed and Plaxo already process these in a particular way so there are examples to draw from.
- Start to spec out some schema additions for the purely social activities exposed by MySpace. This will be harder, because we don't really have any good examples of what this might look like in Atom, but I hope to work with Monica Keller from MySpace to figure out what makes sense for them and hopefully extrapolate that to Facebook and other similar systems. MySpace also exposes activities raised by OpenSocial, so we'll need to address how AtomActivity and OpenSocial work together at some point, but I'm hoping we can defer that at least to the next iteration.
Since I'm working on this largely in my spare time I can't really give a timeframe for the above, but I'd certainly like to do them sooner rather than later. MySpace in particular seems to be ready to launch, so that's pushing things forward faster than they might otherwise have moved. I'm getting some good feedback from a number of folks online too, and I've made a list of the outstanding issues that've been raised which I intend to post online shortly.
It was exciting to see so many folks at the meet-up enthusiastic about solving this problem. Hopefully with a couple more iterations we'll get to a place where folks can start to feel more comfortable implementing this stuff, as things solidify.
ItoWorld.com has produced a really neat visualization of 2008's OpenStreetMap edits. It starts zoomed in on Ipswich, which is not far from my former home of Colchester, and I was quite pleased to see the little flash of Colchester as it zooms out... that's (partially) me! You wouldn't spot it if you didn't know exactly what to look for, but whatever.
More interesting on the global scale is just how much editing went on last year, not only in the UK but all over the world. It's great to see OpenStreetMap taking off, though I'm still sceptical whether it can really ever have any use beyond getting CC-licenced maps to go on a website's "How To Find Us" page, since you can't rely on it for any place you don't know. (Having said that, though, there are of course parts of the world where maps are not so readily available, and I do hope OpenStreetMap can go some way to addressing that problem.)
(FWIW, Colchester is still incomplete — there are a few folks still working on it, but there are still large chunks of it not mapped — but I believe neighbouring town Wivenhoe (which is where I actually lived) has all of the important streetmap-level details.)
I've been thinking some more and talking a bit with folks about whether Activity Streams should be in RSS or Atom. I did get some feedback saying that both should be supported, but I'm not sure I really want to create two different ways to publish/consume activity data. Here are some advantages of each...
First the advantages of switching to RSS:
- We don't have to invent a new way to represent media objects.
- Almost all sites publish RSS, in some cases exclusively. (So in order to publish an Atom activity stream, they'd need to build out an extra feed endpoint.)
- Sites that don't currently publish Atom would need to add an additional autodiscovery link, which may confuse aggregators and complicate the UI for feed subscription in browsers.
But here are the advantages of staying with Atom:
- Its core elements are in an XML namespace, which makes it easier/nicer to include inside weird containers like XMPP stanzas.
- We can use atom:source to deal (to a certain extent) with activity aggregation feeds such as the Atom feeds that FriendFeed publishes. No such concept exists in RSS.
- We don't have to deal with the complexities and ambiguities of Media RSS. (In other words, we can decide on something sensible without being constrained by existing practice.)
- The Atom schema and data model are much better defined than those of RSS. (Though lots of software just treats Atom as a funny serialization of RSS, so this benefit doesn't really manifest in practice.)
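To illustrate the atom:source point from the list above, here is a minimal Python sketch of how a consumer might recover the originating feed of an entry found in an aggregated feed. The sample entry and the `origin_feed` helper are made up for this example; only the `atom:source`, `atom:id` and `atom:title` elements come from Atom itself.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Illustrative entry, as it might appear in an aggregator's feed.
entry = ET.fromstring("""
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Sunset</title>
  <source>
    <id>http://example.com/photos/feed</id>
    <title>Alice's photos</title>
  </source>
</entry>
""")


def origin_feed(entry):
    """Return (id, title) of the feed an entry was copied from, if stated."""
    source = entry.find(ATOM + "source")
    if source is None:
        return None  # entry was not republished, or the source was dropped
    return source.findtext(ATOM + "id"), source.findtext(ATOM + "title")


origin = origin_feed(entry)
```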
Here's where my head is at right now: the concept of "object feeds" in the AtomActivity spec could in theory be adapted to map onto RSS without many changes. Therefore we could include a section in AtomActivity for how to construct the "implied activity" for an RSS item much like we currently describe how to construct the same for a non-action Atom entry. The concept of "activity entries" is more complicated to adapt to RSS due to its re-use of Atom elements, but given that there are currently only a few implementations that contain something resembling "activity entries", hopefully we can get them to converge on Atom for this.
What this means in practice is that sites publishing feeds of objects can take their existing feeds, whether Atom or RSS, and add the activity:object-type annotations, and be done. Sites publishing feeds of activities (FriendFeed, Plaxo, Movable Type Action Streams, WordPress Activity Streams, ...) would need to use Atom, because there would be no representation defined for this in RSS. Consumer libraries would need to support both RSS and Atom, but there would be a well-defined mapping for how to turn both sorts of object entries into Atom-based activity entries.
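As a very rough sketch of what that mapping might look like in a consumer library, the Python below builds the same "implied activity" from either an RSS item or an Atom entry. Several things here are my assumptions for the sake of the example: the placeholder namespace URI, the default verb of "post", the plain-text "photo" value, and the shape of the resulting dictionary.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
# Placeholder namespace; the real AtomActivity URI is not stated in this post.
ACTIVITY = "{http://example.org/atom-activity-placeholder}"


def implied_activity(node):
    """Build an implied activity from an RSS <item> or an Atom <entry>."""
    if node.tag == "item":  # RSS 2.0 core elements have no namespace
        title = node.findtext("title")
        link = node.findtext("link")
        when = node.findtext("pubDate")
    else:
        title = node.findtext(ATOM + "title")
        link_el = node.find(ATOM + "link")
        link = link_el.get("href") if link_el is not None else None
        when = node.findtext(ATOM + "published")
    return {
        "verb": "post",  # assumed default verb for an object feed
        "object": {
            "type": node.findtext(ACTIVITY + "object-type"),
            "title": title,
            "link": link,
        },
        "time": when,  # left in the feed's own date format
    }


rss_item = ET.fromstring(
    '<item xmlns:activity="http://example.org/atom-activity-placeholder">'
    "<title>Sunset</title><link>http://example.com/1</link>"
    "<pubDate>Tue, 20 Jan 2009 12:00:00 GMT</pubDate>"
    "<activity:object-type>photo</activity:object-type></item>"
)
activity = implied_activity(rss_item)
```

The point of the sketch is only that both feed formats funnel into one activity shape; a real mapping would also need to normalize dates and pick the right link relation.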
This would make Atom the primary format but there would be some limited (but well-defined) support for RSS. Does that seem reasonable to folks?
In my previous entries I alluded to research into the popularity of different approaches for publishing feeds, particularly those containing media objects such as photos, videos and audio. I've now written up a short summary of my findings.
The three things that spring right out here are:
- RSS is published by just about everyone.
- You usually find Atom in the traditional blogging space, but it isn't even in the game when it comes to media publishing.
- The only thing you can actually reliably get out of MediaRSS is a thumbnail image.
I'm continuing to mull over whether to rewrite activity streams in terms of RSS or to hope for increased adoption of Atom. My leaning right now is to the former.
Another interesting fact not reflected in my results document is that none of the RSS feeds I examined used any RSS features that are not available in Atom when augmented with my AtomMedia draft, and AtomMedia allows only one way to publish each case rather than the myriad combinations of media:content and other element nesting that are allowed by Media RSS and used by feeds in the wild. It's too bad that if I move to RSS/MediaRSS for activity streams I'll have no need for AtomMedia; I'd be delighted if someone else would pick it up and finish it off, though.
If you've been following my adventures this weekend you'll know that I started off wondering why RSS is still so prevalent when we have Atom. However, not long after that I started doing research in preparation for specifying the media object types in AtomActivity and discovered one big reason why RSS is still widely used: folks publish media objects like podcasts using MediaRSS, but there is no standard for media objects in Atom.
So faced with the need to mark up activities involving photos and videos in Atom, what is a boy to do? Last night I took a whack at adapting a subset of MediaRSS to Atom, with the hope that AtomActivity could refer to that. However, today I played around some more with various software that consumes feeds with embedded media, and found that there does seem to be a subset of MediaRSS that does actually work in software today, and that made me take a step back and reassess my goals.
My design goal with AtomActivity was always to describe some minimal extra markup that would allow existing feeds to be consumed by activity aggregators. Asking providers to add a bunch of additional junk to their Atom feeds when they already have fully functional MediaRSS feeds doesn't really jibe with that design goal.
The research that motivated me to ask why sites still publish RSS does seem to indicate that RSS is far more widely deployed, both by publishers and by aggregators, than Atom is. Aside from a few Six Apart products, no major service that I looked at publishes Atom only. Most publish both Atom and RSS at the same time with only basic content in the Atom feed. Many others publish only RSS, or they publish both but only have autodiscovery for RSS. I'm less certain about the consumer side, but given that only a tiny handful of publishers actually publish media information in Atom at all I'm guessing that today systems like FriendFeed and Plaxo Pulse are using the RSS feeds when they're pulling from sites like YouTube and Flickr. If the goal is to make only minimal changes to existing practice, it does look like we're barking up the wrong tree by building on Atom.
The question now is whether to persevere with AtomActivity or to repurpose it as an RSS extension instead. Using RSS has the benefit that MediaRSS is already widely used in RSS to mark up media content, so we can do a reasonable job at consuming these feeds as they exist today. It does mean that we lose out on some Atom features such as atom:source and the Atom Threading Extensions, but neither of these is widely used today so that's no major loss.
If we did go this route I'd still want to write up a proper spec for a subset of MediaRSS that serves the same use-cases as my AtomMedia draft, since the current "specification" for MediaRSS is too big and not really detailed enough. However, at least this approach means that there is existing implementation practice to base such a subset on, so I'd be describing what works today rather than what might work in a few years if anyone actually bothers to implement it when their RSS feeds already work anyway.
As ever, I'm eager to hear what the rest of the world thinks. It's lonely here inside my head...
Sam Ruby quite fairly called me out for hating on folks that publish RSS while doing it myself. The reason is quite unexciting, though: my blog is, for historical reasons, hosted by LiveJournal. LiveJournal provides Atom and RSS feeds for all blogs it hosts.
However, I'm already doing a bunch of munging of LiveJournal's output to do things like using TypePad Connect for comments, so it didn't take long to munge out the RSS stuff. While I was at it I finally got shot of all of the script and CSS cruft that LiveJournal adds to every page to support ads, contextual popups, navigation strips and all sorts of other things that I don't have on my blog anyway.
The long-term plan is to move from LiveJournal to something else — either MovableType or TypePad most likely — but I'm putting that off until I can figure out a way to keep all of my old content appearing at the same URLs with the same comments attached.
In my last entry I noted that there doesn't seem to be any standard practice for publishing media in Atom. A handful of publishers do the best they can with the stock Atom spec and make a single link with rel="enclosure", while Google (Picasa, YouTube) is the only publisher I could find that actually uses the MediaRSS elements in Atom. Most sites just don't bother: if you want that information, you need to go fetch the RSS feed.
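For reference, this is roughly what the stock-Atom approach gives a consumer to work with. The sample entry below is invented for illustration; the `rel`, `type`, `length` and `href` attributes are the standard ones on atom:link.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Illustrative podcast-style entry with a single enclosure link.
entry = ET.fromstring("""
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Episode 12</title>
  <link rel="alternate" href="http://example.com/ep12"/>
  <link rel="enclosure" type="audio/mpeg" length="1234567"
        href="http://example.com/ep12.mp3"/>
</entry>
""")

# Collect (href, type, length) for every enclosure link on the entry.
enclosures = [
    (link.get("href"), link.get("type"), link.get("length"))
    for link in entry.findall(ATOM + "link")
    if link.get("rel") == "enclosure"
]
```

That is about all you get: a URL, a MIME type and a byte length, with nothing about dimensions, duration or thumbnails.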
Since only Google's using it right now anyway, rather than import wholesale the whole of MediaRSS into Atom — MediaRSS is a pretty big, complex beast with lots of stuff that's arguably unimportant for most use-cases — I decided to design an Atom extension that's based on some of the features of MediaRSS but bashed into a more Atom-like shape and without the elements for which Atom already provides equivalents.
I now have a first draft of "AtomMedia". Here are the main differences between AtomMedia and MediaRSS:
- AtomMedia has the narrower scope of being aimed at the aggregation and activity stream use-cases. Much of MediaRSS's complexity is so that it can be used by the indexer for Yahoo! Video Search, but that's not my goal here.
- MediaRSS uses extension elements exclusively, while AtomMedia extends the atom:link element. In particular, it extends standard Atom's link rel="enclosure" for compatibility with existing implementations.
- AtomMedia excludes the MediaRSS features that are not directly useful for the aggregation and activity stream use-cases. In particular, I did not include content ratings, regional exclusions, "credits", timed text and media hashes. Many of these feel like things that are more general than this use-case, anyway.
- AtomMedia excludes some bits that Atom already has equivalents or near-equivalents of:
- Due to the tighter scope, I was able to include tighter requirements for specific use-cases that will hopefully mean that there will be less variation between publishers.
- AtomMedia reduces the media metadata considerably: it has only width/height for visual things and duration for time-based things. Some of the other attributes (fileSize, lang, type, ...) have equivalents in generic Atom and are thus omitted.
- AtomMedia assumes that each entry describes exactly one media object that might have multiple representations. MediaRSS looks like it's trying to allow entries with multiple objects associated with them, but it doesn't define well exactly how that works in practice and I've seen no feeds actually make use of this feature.
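The data model implied by the list above (one media object per entry, several representations, only width/height and duration as media-specific metadata) can be sketched in Python as follows. The class names, field names, the choice of seconds for duration and the `best_fit` helper are all my own inventions for illustration, not part of the draft.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Representation:
    href: str
    type: str                         # MIME type, as on a generic Atom link
    width: Optional[int] = None       # visual media only
    height: Optional[int] = None      # visual media only
    duration: Optional[float] = None  # time-based media only; seconds assumed


@dataclass
class MediaObject:
    # Exactly one media object per entry, with any number of representations.
    title: str
    representations: list


def best_fit(media, max_width):
    """Pick the widest representation no wider than max_width, if any."""
    candidates = [
        r for r in media.representations
        if r.width is not None and r.width <= max_width
    ]
    return max(candidates, key=lambda r: r.width, default=None)


photo = MediaObject("Sunset", [
    Representation("http://example.com/sunset-small.jpg", "image/jpeg", 240, 160),
    Representation("http://example.com/sunset-large.jpg", "image/jpeg", 1024, 683),
])
chosen = best_fit(photo, 500)
```

An aggregator with a 500-pixel column would end up with the small JPEG here, and with nothing at all if every representation were too wide.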
If I can get some traction on this I'd like to use it as the representation format for the photo and video object types in the AtomActivity schema specification. The main important thing I'm missing right now is a namespace URI. How does one register URLs under http://purl.org/syndication/, as seems to be the done thing for Atom extensions in development?