Are You Prepared for the Semantic Web?
James Mathewson is the global search strategy lead for IBM and co-author of Audience, Relevance and Search: Targeting Web Audiences with Relevant Content. He writes regularly for the book's companion blog, writingfordigital.com, from which this article is reprinted.
I recorded a four-part podcast series with Kristina Halvorson, moderated by Mike Moran. We had a great time talking about the content strategy around created, curated and aggregated content. In this article, I want to expand on a point I didn’t have time to elaborate on during the podcast series.
Neither Kristina nor I care much for “set-it-and-forget-it” content aggregation. If you set up a feed of content to your pages and you don’t monitor or moderate the end result, you risk doing damage to your company’s website and brand. Mike asked me if I thought there was ever a good use for such aggregation and I said, “Maybe when the Semantic Web is more mature, there might be. But not with our current Web.” I started to elaborate on this point, and caught myself because we were just about out of time in that segment and the explanation was about to get really long and technical. So I just left it there. After turning the unfinished thought over in my head for a few days, I needed to write about it to relieve the tension.
If you want to stay ahead of the web publishing curve, now is the time to start learning about and experimenting with the Semantic Web. It’s been in development since the ’90s, led by Tim Berners-Lee himself. When media historians write books about the information transformation we are in (from print to web), the Semantic Web will be at least as important as the invention of HTML. Having Semantic Web-enabled pages will soon be a big competitive advantage for you and your company.
For example, you can get rich snippets in your Google search engine results if you enable linked data on your pages. But that is just the tip of the iceberg of what you will be able to do with Semantic Web-enabled content.
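One common way to publish linked data on a page today is a JSON-LD block that describes the page with the schema.org vocabulary. The Python sketch below builds such a block for this article. The helper names article_json_ld and as_script_tag are made up for illustration, and whether a search engine actually shows a rich snippet depends on its own eligibility rules.

```python
import json


def article_json_ld(headline, author_name):
    """Return a schema.org Article description serialized as JSON-LD.

    Hypothetical helper for illustration; not part of any library.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
    }
    return json.dumps(data, indent=2)


def as_script_tag(json_ld):
    """Wrap JSON-LD text in the script element used to embed it in HTML."""
    return '<script type="application/ld+json">\n' + json_ld + "\n</script>"


snippet = article_json_ld("Are You Prepared for the Semantic Web?", "James Mathewson")
print(as_script_tag(snippet))
```

The printed script element would go in the page's HTML, where a crawler can read the headline and author as data instead of guessing them from the text.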
What is the Semantic Web?
The Semantic Web is a set of standards developed by the W3C that allows website owners to embed all kinds of data in their content and organize the data with inference rules. To applications, the Web is a big blob of unstructured data, mostly raw text. It’s really hard for applications like search engines to make sense of it all. For those who adopt it, the Semantic Web will add structure to the data so that applications can better understand what to do with all the content. This will allow more automated Web processing, such as intelligent RSS feeds and better search algorithms.
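To make "data plus inference rules" a little more concrete, here is a minimal sketch in plain Python. It stores facts as subject-predicate-object triples, which is the shape RDF uses, and applies one rule from RDF Schema: if something has a type, and that type is a subclass of a broader type, then it also has the broader type. The ex: names are invented for the example, and a real project would more likely use an RDF library than hand-rolled tuples.

```python
def infer_types(triples):
    """Apply the RDFS subclass rule repeatedly until no new facts appear.

    Rule: (x rdf:type C) and (C rdfs:subClassOf D) implies (x rdf:type D).
    """
    facts = set(triples)
    while True:
        derived = set()
        for subject, predicate, obj in facts:
            if predicate != "rdf:type":
                continue
            for sub_class, relation, super_class in facts:
                if relation == "rdfs:subClassOf" and sub_class == obj:
                    derived.add((subject, "rdf:type", super_class))
        if derived <= facts:  # nothing new was learned, so stop
            return facts
        facts |= derived


# Hypothetical data: the ex: identifiers are assumptions for this sketch.
published = [
    ("ex:post1", "rdf:type", "ex:BlogPost"),
    ("ex:BlogPost", "rdfs:subClassOf", "ex:Article"),
    ("ex:Article", "rdfs:subClassOf", "ex:CreativeWork"),
]

all_facts = infer_types(published)
```

The publisher only said that ex:post1 is a blog post. An application that applies the rule can also treat it as an article and as a creative work without anyone stating so.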
The promise I alluded to in the podcast series was about setting up intelligent RSS feeds that automatically serve relevant content to your audiences. Imagine being able to specify not only the keywords at the center of your RSS feed aggregators, but all kinds of other data, such as date, author, publisher, ratings, tags, etc. Suppose you developed a feed of all Kristina Halvorson’s tweets about just one facet of content strategy within the last two months, and you combined it on the same page with feeds from five other leading experts on content strategy that meet the same parameters in the same time frame. That might be the kind of page where “set-it-and-forget-it” aggregation makes sense, to answer Mike’s question more precisely. This kind of feed requires Semantic Web functions.
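Assuming each feed item arrived with reliable author, date and tag data attached, the selection logic itself would be simple. The sketch below shows the idea in Python; the FeedItem fields, the select_items helper and the sample items are all invented for illustration and do not come from any real feed.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FeedItem:
    """One item in an aggregated feed, with structured metadata (hypothetical shape)."""
    author: str
    published: date
    tags: frozenset
    text: str


def select_items(items, authors, tag, since):
    """Keep items by the chosen authors that carry the tag and are recent enough."""
    chosen = [
        item for item in items
        if item.author in authors and tag in item.tags and item.published >= since
    ]
    # Newest items first, as a feed page would normally show them.
    return sorted(chosen, key=lambda item: item.published, reverse=True)


# Made-up sample data for the sketch.
items = [
    FeedItem("expert_a", date(2011, 3, 1), frozenset({"content-strategy", "governance"}), "Post on governance"),
    FeedItem("expert_a", date(2010, 11, 5), frozenset({"governance"}), "Older post"),
    FeedItem("expert_b", date(2011, 3, 10), frozenset({"seo"}), "Off-topic post"),
    FeedItem("expert_c", date(2011, 3, 12), frozenset({"governance"}), "Not a followed author"),
    FeedItem("expert_b", date(2011, 2, 20), frozenset({"governance"}), "Another governance post"),
]

page = select_items(items, {"expert_a", "expert_b"}, "governance", date(2011, 1, 15))
```

The hard part is not this filter but getting trustworthy metadata onto every item in the first place, which is what the Semantic Web standards are meant to provide.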
Search is the other area of web content that will change radically when more publishers enable Semantic Web standards such as RDF and OWL. These standards enable content owners not only to write content with the keywords their target audiences use, but also to disambiguate the meanings of those keywords. When search engines parse this code, they can serve more relevant content to their users. Using entity extraction, Semantic Web search engines will be able to assign the appropriate RDF data to pages whether they are coded appropriately or not. This is how the next generation of search engines will work.
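A toy example may help show what disambiguation buys you. In the Python sketch below, two pages use the same keyword for different things, and each page also declares what it is about with an identifier. The page URLs, the "about" field and the entity identifiers are all assumptions made up for this sketch; real pages would express this with RDF markup and shared vocabularies.

```python
# Hypothetical pages: same keyword, different meanings.
PAGES = [
    {
        "url": "http://example.org/cars",
        "text": "The new jaguar has a quieter engine.",
        "about": "http://example.org/entity/JaguarCar",
    },
    {
        "url": "http://example.org/wildlife",
        "text": "The jaguar hunts along the riverbank.",
        "about": "http://example.org/entity/JaguarAnimal",
    },
]


def keyword_search(pages, keyword):
    """Match on raw text only, as a purely keyword-based engine would."""
    return [page["url"] for page in pages if keyword.lower() in page["text"].lower()]


def entity_search(pages, entity_uri):
    """Match on the declared subject of the page rather than on its wording."""
    return [page["url"] for page in pages if page["about"] == entity_uri]


by_keyword = keyword_search(PAGES, "jaguar")
by_entity = entity_search(PAGES, "http://example.org/entity/JaguarAnimal")
```

The keyword search returns both pages, while the entity search returns only the wildlife page, because the match is on meaning rather than on the string.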
As with search, most automated content efforts center on keywords. Social media listening is one case in point. The keywords you feed into any listening program determine the nature of your results. It’s a bit of a chicken-and-egg problem, because you need to monitor social media to learn about the words your audience favors, but you need to know what words your audience favors to accurately monitor social media on their behalf. Also, when the research relies on keywords alone, it needs human validation to ensure accuracy. Social media listening will be much more accurate on the first pass if it looks for Semantic Web data in addition to keywords and links.
In this article, all I have been able to do is scratch the surface of this emerging field. But I wanted to give you a taste of it without overwhelming you. The Web seems hard enough as it is without overcomplicating it. The good news is, the Semantic Web should actually simplify Web publishing by giving publishers more tools to help their target audiences find and use relevant content. Preparing for the Semantic Web needs to be on your roadmap now.