Semantic web updates

Published by Martin Kleppmann on 02 Oct 2008.

A few weeks ago I noted down some links to current developments of the semantic web. After hearing Tom Morris speak again on “The State of the Semantic Web” at BarCampLondon5, here are some more:

Getting started with the Semantic Web - Get Semantic Wiki
Freebase Parallax, a new user interface for browsing Freebase
BBC programmes use RDF: e.g. Doctor Who - in RDF
SIOC (Semantically-Interlinked Online Communities) is an ontology/vocabulary for expressing social links
FOAF project
Searching semantic web data sources: Sindice
SPARQL/Update or SPARUL - for modifying RDF data stores
POWDER - define sets of URLs?
Simple Knowledge Organization System (SKOS) - simpler than ontologies, without inference?
Rule Interchange Format (RIF) - define inference rules

(OMG mad W3C acronyms!)

I also heard Sian Clark of Yahoo speak about SearchMonkey at BCS Search Solutions 2008. This is a very interesting development: it lets site owners annotate their pages with structured information (using RDFa or Microformats), so that those pages can be presented more meaningfully in the search results. A great idea, I think!
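To make the idea of annotated pages a bit more concrete, here is a minimal sketch of how a consumer might read such annotations. It is not how SearchMonkey works internally (I don't know that), and it is a deliberately simplified reading of RDFa: it only looks at the `property` and `content` attributes and ignores prefixes, `about`, `typeof` and nesting. The `ex:` property names and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical sample page fragment; the "ex:" vocabulary is invented.
SAMPLE_HTML = """
<div>
  <span property="ex:name">Example Cafe</span>
  <span property="ex:rating">4</span>
  <meta property="ex:updated" content="2008-10-02">
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from elements with a `property` attribute.

    Simplification: the value is either the `content` attribute or the text
    up to the next closing tag, so nested markup inside an annotated element
    is not handled properly.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._current = None  # property whose text is being collected
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # The value is given explicitly, e.g. on a <meta> element.
            self.pairs.append((prop, attrs["content"]))
        else:
            self._current = prop
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._current is not None:
            self.pairs.append((self._current, "".join(self._text).strip()))
            self._current = None


def extract_properties(html):
    """Return the list of (property, value) pairs found in `html`."""
    parser = PropertyCollector()
    parser.feed(html)
    parser.close()
    return parser.pairs


if __name__ == "__main__":
    for prop, value in extract_properties(SAMPLE_HTML):
        print(prop, "=", value)
```

A real consumer would use a proper RDFa or Microformats parser rather than this; the point is only that once the markup carries explicit property names, pulling out the structured values takes very little code.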

This move by Yahoo starts giving a first convincing answer to the chicken-and-egg problem of the semantic web: “why would anybody bother to annotate their data in a machine-readable way?” There has got to be some reward attached to it, and doing search engine optimisation (SEO) for Yahoo is a very good reason for creating some semantic metadata! (It’s unlikely to really fly until Google also adopts the idea, but surely that’s just a matter of time.)

What I wonder about: what attempts will there be to parse structured data out of unstructured data sources? A few companies are doing more or less this. For example, Globrix extracts structured information about properties (rent or buy, location, price, number of bedrooms and bathrooms, etc.) from plain text descriptions on estate agents’ websites, and Mydeco extracts structured information about furniture (type of item, colour, width/depth/height, weight, retailer’s location, etc.) from similar unstructured text. There is no technical reason why they couldn’t release that information in a machine-readable RDF format, although there may well be commercial reasons for wanting to keep it to themselves.
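As a toy illustration of that last point, here is a sketch that pulls a few fields out of a plain text property description and writes them out as N-Triples, one of the simplest RDF serialisations. It says nothing about how Globrix or Mydeco actually do their extraction, which is surely far more sophisticated than a handful of regular expressions. The `http://example.org/` namespace, the property names and the sample listing are all assumptions made up for this example.

```python
import re

EX = "http://example.org/property#"  # hypothetical vocabulary namespace
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def extract_listing(text):
    """Pull a few fields out of a free-text property description."""
    fields = {}
    m = re.search(r"(\d+)[ -]bed(?:room)?s?\b", text, re.I)
    if m:
        fields["bedrooms"] = int(m.group(1))
    m = re.search(r"(\d+)[ -]bath(?:room)?s?\b", text, re.I)
    if m:
        fields["bathrooms"] = int(m.group(1))
    m = re.search(r"£\s?([\d,]+)", text)
    if m:
        fields["price"] = int(m.group(1).replace(",", ""))
    # Very rough guess at rent vs. buy from a few common phrases.
    if re.search(r"\b(to let|to rent|for rent|per month|pcm)\b", text, re.I):
        fields["tenure"] = "rent"
    elif re.search(r"\bfor sale\b", text, re.I):
        fields["tenure"] = "buy"
    return fields


def to_ntriples(subject_uri, fields):
    """Serialise the extracted fields as N-Triples, one triple per field."""
    lines = []
    for key, value in sorted(fields.items()):
        if isinstance(value, int):
            obj = f'"{value}"^^<{XSD_INTEGER}>'
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject_uri}> <{EX}{key}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    description = "Spacious 2 bedroom flat to let in Camden, 1 bathroom, £1,450 pcm."
    listing = extract_listing(description)
    print(to_ntriples("http://example.org/listing/1", listing))
```

The serialisation step is the easy part, which is rather the point: once the extraction has been done, publishing the result as RDF costs almost nothing technically.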
