How the Semantic Web Will Change Information Management: Three Predictions

October 2008

By Silver Oliver

A lot has been written about what the Semantic Web is, but little has been said about the impact it will have. After a brief recap of what the Semantic Web is all about, I will discuss three effects that it will have on information management.

I am particularly indebted to the work of Chris Sizemore, Michael Smethurst and Tom Coates, who have inspired and informed my interest in this topic.

 

The Semantic what?

 

The goal of the Semantic Web is to enable people to share structured data on the Web as easily as they can share documents today.

 

‘The current Web is a Web of documents where documents (Web pages) are connected by embedded Hyperlinks (links). Thus when you click on one document, the result is a single step Web traversal to another document. This widely understood, and accepted, Web interaction pattern is facilitated by a resource locator called a Uniform Resource Locator (URL) and a messaging protocol known as Hypertext Transfer Protocol (HTTP).’

 

Deploying Linked Data

We all know how successful this has been as a model because of its openness and simplicity. The Semantic Web builds on the things that made this model successful for publishing documents, but instead uses it for the publishing of structured data.

What do we mean by structured data?

 

Take a link to J.K. Rowling on the Amazon page for the book ‘The Half Blood Prince’. Now, this is fine for browsing from a document about the book to a document about the author. But it does not provide any data which is of use in the way that a library catalogue might be, where the database would contain the machine readable statement:

 

Rowling, J.K. (1965-), MARC code: Author, Half Blood Prince, ISBN 0-7475-8108-8

 

This is structured data that can be handled by a computer and is more useful, in this case, than a simple link between two documents. People like documents but machines need structured data.

 

Now, in a Web of data we would want to be explicit about the things or concepts (J.K. Rowling and the ‘Half Blood Prince’) in question, as is done in the library catalogue with name authorities and ISBN numbers. In keeping with the convention of the document Web, URLs are used to point to things. So, using URLs, we point to a resource that represents that concept or thing. Let us say, for example, we use J.K. Rowling’s Wikipedia page to 'point' to the concept 'J.K. Rowling'.

 

One difference is that, unlike the Web of documents, which is not explicit about the type of links between resources, the Semantic Web is explicit. So, in this case, we might use Dublin Core ‘creator’ as the link type (property) between the concept/URL for J.K. Rowling and the concept/URL for ‘Half Blood Prince’.

 

So, a semantic link between two URLs can express the same statement as the library catalogue:

 

J.K. Rowling (URL) is the ‘dc:creator’ of ‘Half Blood Prince’ (URL)
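
As a rough illustration, the statement above can be written down as a triple of three URLs: the thing described, the property and the value. The sketch below is not taken from any particular Semantic Web toolkit. It assumes that Wikipedia pages stand in as identifiers for the author and the book, and the helper name to_ntriples is made up for this example. The Dublin Core ‘creator’ URL is the published one. Note that Dublin Core states the relationship from the book's side: the book has a creator.

```python
# The Dublin Core "creator" property, identified by its published URL.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# Assumption for this sketch: Wikipedia pages act as identifiers for
# the author and the book, as suggested in the text.
ROWLING = "http://en.wikipedia.org/wiki/J._K._Rowling"
HALF_BLOOD_PRINCE = (
    "http://en.wikipedia.org/wiki/Harry_Potter_and_the_Half-Blood_Prince"
)

# A statement is a (subject, property, object) triple of URLs.
# Read it as: the book has the creator J.K. Rowling.
statement = (HALF_BLOOD_PRINCE, DC_CREATOR, ROWLING)


def to_ntriples(triple):
    """Write one triple as a single line: three bracketed URLs and a full stop."""
    subject, prop, obj = triple
    return "<{}> <{}> <{}> .".format(subject, prop, obj)


print(to_ntriples(statement))
```

Because every part of the statement is a URL, any other site that uses the same URLs for the same things is making statements that a machine can combine with this one.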

 

Same old Web, only better. Got it?


Prediction number 1: a move from the pull to the push search paradigm, or more ‘context-aware’ applications

 

Much has been said about the impact the Semantic Web will have on search. That impact is undoubtedly real, but it seems to me that the bigger change will come from improving push and recommendation style services.

 

Most of our information-seeking behaviour on the Web is through keyword search: pulling the information we need. The push approach has ended up with a bad name because of its clumsy and intrusive nature. Think about banner ads and pop-ups; they have little understanding of a user’s intent or the context in which they have come to a particular Web service.

 

The Semantic Web could help here by publishing data in a form that smart applications can use to make better, context-aware recommendations: the right thing, in the right place, at the right time.

 

Applications that take advantage of semantically published data are already here. The Google Social Graph API is a new service that allows developers to expose social relationships embedded in Web sites. Basically, it allows Web services to be aware of your network of friends. This is possible because of standards like FOAF and XFN that publish this data using Semantic Web technologies and philosophies.

 

The Google Social Graph API is only one of a number of services offering to make our sites context aware. FireEagle is an API from Yahoo that can be used to make a site aware of a user’s current location. The new BBC Programmes site has been built in a way that others can use its data to make their sites BBC programme aware.

 

In order to do this, Web services need structured data. The Semantic Web will help these kinds of services because it has the advantage of providing a single, standardised access mechanism for getting structured data instead of relying on diverse interfaces and result formats. 


Prediction number 2: the battle of the identifiers or the age of pointing at things

 

In the world of the Semantic Web, until you have named something you cannot do much useful with it.

 

‘We will not be able to work with concepts until we have pointed at each new thing (and like Adam in the Bible I guess) give it a unique identity. This is an age of naming, it is an age of pointing at things.’

Tom Coates

We have already seen an example of this with our statement about J.K. Rowling. When making this statement we need to point to the concept ‘J.K. Rowling’. This could be a representation of the concept J.K. Rowling on the Amazon website, but it becomes a lot more useful when it is a universally accepted representation. This would be analogous to using Library of Congress Subject Headings for your library classification rather than a home-grown scheme.

 

But where will these identifiers come from, who will manage them and who will own them?

 

Bloggers and creators of websites have, for a while, been pointing at things on the Web. Not just saying 'have a look at this' but expressing 'this is what I mean'. A blog post that uses an unusual or new concept will often point to Wikipedia as a way of disambiguating the concept as well as pointing the reader to a definition. This logic has been carried through to a number of Semantic Web projects which are using Wikipedia pages as Web-scale subject identifiers - not just Amazon or BBC scale.

 

Other Semantic Web services decide to use their own identifiers. The very successful, free-to-use semantic tagging service Open Calais applies subject identifiers to submitted text. Currently the service uses its own identifiers for pointing at things (people, places, organisations, subjects). This might have advantages for Reuters, which runs the service, but it potentially limits the usefulness of the service.

 

For example, when the BBC looked at adding subject tags to programmes, it thought about the type of identifiers it wanted to use:

 

'if the vocabulary used to tag programmes and news was Web-scale then The Times, The New York Times, Fox News etc (or someone in between) could start to aggregate stories around a shared sense of topic. It's like Yahoo! Term Extraction or Open Calais except the terms returned are Web native or Web-scale identifiers if you will.'

Michael Smethurst
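
To make the aggregation idea concrete, here is a small sketch under stated assumptions. The two publishers, their story titles and the helper name aggregate_by_topic are invented for illustration. The only point is that stories from different sources fall together without any mapping work once both sources tag with the same identifier.

```python
from collections import defaultdict

# Assumption: both publishers tag with the same Web-scale identifier,
# here a Wikipedia page, rather than with in-house subject codes.
ROWLING = "http://en.wikipedia.org/wiki/J._K._Rowling"
BUDGET = "http://en.wikipedia.org/wiki/Budget"

# Hypothetical tagged stories from two unrelated publishers.
publisher_a = [
    {"title": "Author interview", "topic": ROWLING},
    {"title": "Budget analysis", "topic": BUDGET},
]
publisher_b = [
    {"title": "New book announced", "topic": ROWLING},
]


def aggregate_by_topic(*feeds):
    """Group story titles from any number of feeds by their topic identifier."""
    by_topic = defaultdict(list)
    for feed in feeds:
        for story in feed:
            by_topic[story["topic"]].append(story["title"])
    return dict(by_topic)


stories = aggregate_by_topic(publisher_a, publisher_b)
for topic, titles in stories.items():
    print(topic, titles)
```

If each publisher used its own identifiers instead, someone would first have to build and maintain a mapping between the two schemes before this grouping could work.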

 

It is unclear what the business value of owning identifiers will be, but there is certainly support for keeping them open, like the Web.

 

 

'Whilst the BBC or *insert your organisation here* should own their data (whilst hopefully making it free - as in beer; as in speech) we don't have to own our identifiers. If we choose to use the power of Web-scale identifiers we free our content to fly and leave it to other people to add value / make money in the middle.'

Michael Smethurst

Whether other organisations will be as open with their identifiers remains to be seen but they may not be able to afford not to.

Prediction number 3: the changing role of the information professional

Does this mean we will all need to become ontologists? Probably no more than we are all taxonomists at the moment. That is, you don't need to know how to build them so much as how to apply them to a domain. In fact we might need to develop our user-centred design skills as much as our ontology-building ones.

 

When Google released its Google Social Graph API, it managed to find friends of mine who were also subscribed to Google Reader. It then added these friends’ RSS feeds to my list of feeds on the assumption that I am interested in the same things as they are. The feature was clever, but I quickly realised that none of my friends have the same professional interests as I do. The experience was intrusive and confusing, and it showed that Google had misunderstood the sensitive domain of my network of friends.

 

We have a tolerance for poor suggestions in Google, but this will not necessarily be the same when decisions are being made on our behalf by smart semantic agents. These are user-centred design challenges. Our understanding of a domain has to be balanced with a profound understanding of a user’s relationship to their virtual world.


The other side of this is modelling the domain of information we are dealing with. We have explored a little of what is involved in expressing information semantically, and this could be generalised into three steps.

 

·       Identify the resources that are of interest.

·       Name them – ideally with Web-scale identifiers.

·       Create relationships between them – using existing standards where possible (SKOS, FOAF etc).
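
The three steps above can be sketched in a few lines. This is only an illustration: the report, the people and the example.org URLs are hypothetical, and the helper objects_of is made up for this example. The Dublin Core, SKOS and FOAF property URLs are the published ones.

```python
# Namespaces of existing vocabularies (published URLs).
DC = "http://purl.org/dc/elements/1.1/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Step 1: identify the resources of interest (all hypothetical here).
resources = ["report", "author", "colleague", "subject", "broader_subject"]

# Step 2: name them, ideally with Web-scale identifiers. The example.org
# URLs are placeholders; the subjects reuse Wikipedia pages.
names = {
    "report": "http://example.org/reports/42",
    "author": "http://example.org/people/jane-doe",
    "colleague": "http://example.org/people/john-smith",
    "subject": "http://en.wikipedia.org/wiki/Semantic_Web",
    "broader_subject": "http://en.wikipedia.org/wiki/World_Wide_Web",
}

# Step 3: create relationships between them using existing vocabularies.
triples = [
    (names["report"], DC + "creator", names["author"]),
    (names["report"], DC + "subject", names["subject"]),
    (names["subject"], SKOS + "broader", names["broader_subject"]),
    (names["author"], FOAF + "knows", names["colleague"]),
]


def objects_of(subject, prop):
    """Return every value stated for the given subject and property."""
    return [o for s, p, o in triples if s == subject and p == prop]


print(objects_of(names["report"], DC + "creator"))
```

Choosing the properties from Dublin Core, SKOS and FOAF rather than inventing new ones is what lets other people's software interpret the relationships without prior agreement.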

 

An example in this case would be the BBC Programmes site. What was of interest were individual episodes of programmes. These were given names (unique identifiers).

 

‘But once you have decided what constitutes a programme episode then something really significant happens – you can give it a name, make it addressable, you can – for the first time point to it.’

Tom Coates

http://www.plasticbag.org/archives/2005/04/the_age_of_pointatthings/

 

Finally, relationships are created between individual episodes and the series of which they are part, the actors appearing in the programme and its subject. These relationships need to be expressed using an appropriate vocabulary, probably one of those that exist already. Dublin Core, SKOS and FOAF provide us with most of the relationships we need to make statements about bibliographic resources, subjects and people. It is just a case of picking the appropriate one.

 

Actually, this process is not that different from traditional librarianship, where a librarian would identify useful resources (collections), name them (with shelf numbers) and create relationships to other resources through the library catalogue (using MARC, for example).

 

The skills of information professionals will be essential in populating and managing the Web of data and, to make this happen, we must make the shift from thinking repository-scale to thinking Web-scale.

 


Silver Oliver is an Information Architect currently working at the BBC. His background is in library science but most of the last six years has been spent working in user experience design. His interests are in metadata, taxonomies and navigation. Silver blogs at www.blockslabpillar.com




Copyright 2008 Free Pint Ltd.
