Shirky and Ontology vs. Folksonomies
January 27, 2006
Fields: semweb Semantic Web Ontology
An interesting article here, with some comments here. It’s obviously linked to the last post, and the topic is probably bubbling up. (Incidentally, it’s an area that seems to have been rather ignored by the ‘academic’ community: when I checked, CiteSeer was down, and arXiv (cs) had only one paper on folksonomies, and that was a network analysis.) I actually think there are some far more incisive comments about the problems with SemWeb in a different Shirky article, so I’ll try to comment on both. This is just a collection of comments, rather than a point-by-point reply.
He’s fundamentally right about the weakness in deductive logic; the world just isn’t neat enough to let us hope for closure on it. On the other hand, that doesn’t mean that we should just throw ontology out of the window; we just need to be careful about our uses and claims and, more than that, careful about our definitions.
1: Gruber’s (famous) definition of Ontology is: ‘An ontology is a specification of a conceptualization.’ I still don’t know what this means. I think it means that an ontology is a MODEL of the world. I tend to chop its functions up into terminological (shared words), taxonomic (Is-A in a tree) and ontological (complex definitions of classes). I don’t know whether it works for others, but it works for me, and I think it’s got an important bit to it: the reason we go to so much trouble in defining classes is that it allows us to say: If
2: (Many) things aren’t definitely one thing or another. This is also very true. I’m not talking here about different uses of ontology, which need different views, but about the fundamental failure of ontologies to reflect real life. And he’s right. It’s just that it doesn’t matter. What matters is not whether it’s right, but whether it’s right enough. In 1993, some people from MIT (Davis, Shrobe and Szolovits, in ‘What Is a Knowledge Representation?’) pointed out some uncomfortable truths about Knowledge Representation (of which Ontology is part). Their first, and most powerful, point is that all KR is a surrogate, and an imperfect surrogate, so it will always produce some incorrect inferences. What matters is whether it also produces enough correct ones.
3: The failure of classification systems, which brings us on to more interesting ideas. If you start by saying that there are books about Russia, of course you run into problems. That’s because books don’t ‘have’ a country. If instead you start by asking ‘what can we say about a book?’ (Title, Main subject, Author, Colour) then you get somewhere. What you then need to do is to define ‘Books about Russia’ as being those books that have a main subject of Russia (I’m trying to resist the urge to break into RDF here), and run your reasoner. It will tell you which books are ‘about Russia’. Of course, if you want a green sociology book, then you can do that too. The point is that defined classes are essentially queries, and poly-hierarchical ontologies let each book be the answer to multiple queries. As long as you can define your class/query, you can find the book. Of course, it’s not always perfect, but see (2) above.
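To make the ‘defined classes are essentially queries’ idea concrete, here is a rough sketch in Python rather than RDF. It is only a toy: the Book fields follow the four properties listed above, but the class names, the sample shelf and the classify/members helpers are all made up for illustration, and a real reasoner does far more than filter a list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Book:
    """What we can say about a book: the four properties from the post."""
    title: str
    main_subject: str
    author: str
    colour: str


# A defined class is just a named test over a book's properties.
# These class names are hypothetical, chosen to mirror the examples in the text.
DEFINED_CLASSES: Dict[str, Callable[[Book], bool]] = {
    "BooksAboutRussia": lambda b: b.main_subject == "Russia",
    "GreenBooks": lambda b: b.colour == "green",
    "GreenSociologyBooks": lambda b: b.colour == "green"
    and b.main_subject == "sociology",
}


def classify(book: Book) -> Set[str]:
    """Toy stand-in for a reasoner: every defined class this book satisfies."""
    return {name for name, is_member in DEFINED_CLASSES.items() if is_member(book)}


def members(class_name: str, books: List[Book]) -> List[str]:
    """Run one defined class as a query and return the matching titles."""
    is_member = DEFINED_CLASSES[class_name]
    return [b.title for b in books if is_member(b)]


# Invented sample data, purely for illustration.
SHELF = [
    Book("A History of Russia", "Russia", "Author A", "green"),
    Book("Introducing Sociology", "sociology", "Author B", "green"),
    Book("Cooking Basics", "cookery", "Author C", "red"),
]

if __name__ == "__main__":
    print(members("BooksAboutRussia", SHELF))
    for book in SHELF:
        print(book.title, "->", sorted(classify(book)))
```

Note that the first book ends up in two classes at once without anybody having to decide where it ‘really’ lives, which is the poly-hierarchy point.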
Now, where this gets interesting is: what base terms should we use to build our queries? In the past, we’ve tried big top-down approaches. Where it might get interesting, and where I think folksonomies get good, is if you use the folk-tags as your base terms, and build queries/classes out of those. Of course, you might need a bit more (like pulling the country out of the URL), and language encoding would be a cherry on top, but it might be nice… very nice, in fact.
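A minimal sketch of what ‘folk-tags as base terms’ might look like, assuming bookmarks are stored as a URL plus a set of free-text tags. Everything here is hypothetical (the TagQuery shape, the country: prefix, the example URLs and tags), and treating a two-letter top-level domain as a country is a crude guess that will be wrong for plenty of sites.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set
from urllib.parse import urlparse


@dataclass(frozen=True)
class TagQuery:
    """A defined class built from folk-tags: required tags and excluded tags."""
    all_of: FrozenSet[str] = frozenset()
    none_of: FrozenSet[str] = frozenset()

    def matches(self, tags: Set[str]) -> bool:
        # Every required tag is present and no excluded tag is present.
        return self.all_of <= tags and not (self.none_of & tags)


def country_tag(url: str) -> Optional[str]:
    """Derive an extra base term from the URL (assumption: 2-letter TLD = country)."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1]
    return f"country:{tld}" if len(tld) == 2 else None


def enrich(url: str, tags: Set[str]) -> Set[str]:
    """Folk-tags plus whatever we can pull out of the URL itself."""
    extra = country_tag(url)
    return tags | {extra} if extra else set(tags)


def matching_urls(query: TagQuery, bookmarks: Dict[str, Set[str]]) -> List[str]:
    """Run the query over every bookmark and return the matching URLs, sorted."""
    return sorted(url for url, tags in bookmarks.items()
                  if query.matches(enrich(url, tags)))


# Invented bookmarks, purely for illustration.
BOOKMARKS = {
    "http://example.co.uk/owl-intro": {"ontology", "semweb", "tutorial"},
    "http://example.com/tagging": {"folksonomy", "semweb"},
    "http://example.org/rant": {"semweb", "rant"},
}

UK_SEMWEB = TagQuery(all_of=frozenset({"semweb", "country:uk"}))
CALM_SEMWEB = TagQuery(all_of=frozenset({"semweb"}), none_of=frozenset({"rant"}))

if __name__ == "__main__":
    print(matching_urls(UK_SEMWEB, BOOKMARKS))
    print(matching_urls(CALM_SEMWEB, BOOKMARKS))
```

The taggers supply the vocabulary from the bottom up; the only top-down part is the handful of queries somebody cares enough to write.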
4: N% belonging: The final thing is to deal with the fact that we can’t be sure that things are always something. His suggestion is to use n% membership, which is OK (although I have no real idea of the semantics of this: you might be able to jury-rig a frequentist approach, but it would be very variable in its response, and I don’t know what a subjective approach would mean here. Anyway, I digress…), and there has been some work on Bayesian ontologies (somewhere, sorry). The other approach is to say that things either are, or are not, in a category, but we don’t know which. What we might be able to do is to come up with some reasons - some arguments - for believing one or the other. Of course, which you trust is up to you (and I’m not sure there’s a ‘right’ answer). Then again, seeing as my thesis is on hooking up ontologies and arguments, I would say this.
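For what it’s worth, the two approaches can be sketched side by side. This is a guess at one possible reading, not anybody’s actual proposal: tag_fraction is the jury-rigged frequentist version (the share of taggers who applied a tag), and decide is a deliberately naive version of the argument approach, where membership stays yes-or-no and you only listen to sources you trust. All names and sample data are invented.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


def tag_fraction(taggings: List[Set[str]], tag: str) -> float:
    """Frequentist reading of n% membership: share of taggers who used `tag`."""
    if not taggings:
        return 0.0
    return sum(1 for tags in taggings if tag in tags) / len(taggings)


@dataclass(frozen=True)
class Argument:
    """A reason, from some source, for or against membership in a category."""
    supports: bool   # True = 'it is in the category', False = 'it is not'
    reason: str
    source: str


def decide(arguments: List[Argument], trusted: Set[str]) -> Optional[bool]:
    """Crisp membership, unknown until argued.

    Only arguments from trusted sources count. Returns True or False when the
    trusted arguments all point one way, and None when there are none or they
    conflict (a simplification: real argumentation frameworks do much more).
    """
    heard = [a for a in arguments if a.source in trusted]
    has_for = any(a.supports for a in heard)
    has_against = any(not a.supports for a in heard)
    if has_for and not has_against:
        return True
    if has_against and not has_for:
        return False
    return None


# Invented data: each set is one person's tags for the same book.
TAGGINGS = [{"russia", "history"}, {"history"}, {"russia"}, {"travel"}]

ARGUMENTS = [
    Argument(True, "main subject is Russia", "librarian"),
    Argument(False, "it is really about the Cold War", "reviewer"),
]

if __name__ == "__main__":
    print(tag_fraction(TAGGINGS, "russia"))
    print(decide(ARGUMENTS, trusted={"librarian"}))
    print(decide(ARGUMENTS, trusted={"librarian", "reviewer"}))
```

The variability worry shows up straight away in the first function: with only a handful of taggers, one extra person moves the number a long way. The second function makes the ‘which you trust is up to you’ part explicit, since changing the trusted set changes the answer.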
Any comments on this insanely long post gratefully received…
