Thursday, 11 December 2008

Experience report: Kosmix

I've been playing a little with Kosmix (http://www.kosmix.com/), a site that uses a categorisation engine to pull together resources from the web and present them in what it calls a magazine-style format. The idea is that you get a more rounded view of a subject through the combination of sources, with the output presented in an attractive and easy-to-assimilate layout.

Kosmix seems strong in searches for "hard" topics - that is, people, places and things. Type in "cupcake" and you'll get a very coherent result that looks like a mixture of encyclopedia entry and advertising supplement.

"Soft", more abstract, topics give less certain results. I tried "evolution" and while Kosmix brought back information about the core concept, it also retrieved information that proves how over-used in our language such abstract concepts can be. "Evolution" is a car by Mitsubishi as well as a scientific principle, and Kosmix showcases both.

On the principle that maybe *any* search strategy is going to be better with people, places and things, I tried "darwin" as an alternative to "evolution". (And yes, I know Charles Darwin is not synonymous with evolution, but with theories of evolutionary mechanisms - but the two terms are going to crop up in close association.) Kosmix has a good idea that when I type in "darwin" I'm after the eponymous Charles. But it also knows that Darwin is a place, so I get a Google map of the city and accommodation tips too. (Just wait until I revive my campaign for building a city named Greenspan - then we'll have some fun.)

One of the delights of language is that its malleability makes context all. Kosmix mostly regards "mars" as a planet. It also understands that Mars make candy. However, the Flickr images it retrieves were all taken on earth...

I'm British so I also tend to notice US bias in the retrievals. "Polo" is predominantly a sport according to Kosmix, but where I live it's more of a mint, and occasionally a Volkswagen. Kosmix knows that a "life saver" may go in your mouth as well as around your neck, which is again somewhat US-centric. Some of these effects reflect the content that's out there, while others reflect the theory of knowledge that underpins Kosmix's categorisation engine.

So far, with admittedly light use, I'm finding Kosmix to be good on ultra-hard topics (try "iraq") and less good on ultra-soft topics (I tried "call-off", a slippery concept that took me a long time to figure out a few weeks back, and which Kosmix doesn't really get a grip on). The thing is, if a topic is sufficiently hardened to make its categorisation certain in Kosmix, then the chances are that you already know how to navigate to the information you need on the web.

Except that... Kosmix did surprise me (in a pleasant way) with its treatment of "ancient iraq". The result set for this search term brings both contemporary news about the fate of ancient sites in the current war and wider material about civilisations in Mesopotamia. This is the kind of fusion currently on show at the British Museum's "Babylon" exhibition - which was creatively curated by experts. If Kosmix can "curate" web content with the same kind of (automated) insight, then maybe these folks really are on to something.

One minor aspect of Kosmix does have me scratching my head: Why does it capitalise your search terms for you when it presents its results? I kind-of feel like I'm being corrected when it does this. Maybe that's just me being picky... It usually is.

2 comments:

  1. Good analysis. I manage this product, and this is great feedback for us. For "Mars", if you look at the disambiguation hints at the top (under "we have several topic guides..."), we do tell you that mars could mean a lot of different things. The link to "mars (planet)" is the very first link there. And if you look under "more categories", you'll find a link to "mars incorporated", the company that manufactures the candy. If you click on those links, you'll get pages that have a more specific interpretation of "mars".

    Likewise, with Darwin and Polo, you could narrow down to the specific one you want.

    You do make some good observations around soft topics, and we'll be working on these in our future releases. For now, we offer you ways to make your soft topic harder - for instance, if you look at "Related in the Kosmos" for Evolution, you find links to Abiogenesis, Natural Selection, Molecular Evolution and a lot more.

  2. You're absolutely right, Vijay: I was scanning the results very quickly, and reacting off the top of my head. I'm going to let this first experience drift out of my brain before making another trial, and probably my understanding and engagement will have improved by then. I'll post another set of impressions in a few days' time. Hope you're enjoying the project - it looks like fun.
