Let's face it right at the start, with no ducking the issue: image tagging for business findability is hard. The light at the end of the tunnel is that it's not too hard and, done well, it can make a huge difference to the value of a set of images.
This article outlines the options for tagging images to meet a business need: selling, sharing, reducing duplication of effort and so on. It assumes that an image-focused audit or assessment has already established how image content is created and used, and that the task now is to choose from a set of options in order to create a tagging plan with a set of rules, guidelines and success metrics.
Please consider the bulk of this article to be a list of focus areas and pick from them depending on your needs.
Image findability - basic attributes
Perhaps the most familiar area, for those more used to working with documents than images, is the basic attributes of images. Think Dublin Core for images, but don't forget to consider the layers or instances of many images. The original object could be a sculpture, a painting, a daguerreotype, or an original digital creation. The second generation image could be an archival image depicting the original object, plus any further images: cut-down image files for differing screen sizes or images in different formats such as TIFF or JPEG.
In order to create a useful metadata scheme, capturing and allowing access to basic attributes of images, consider at least the following metadata types:
Locations where second generation images were created
Points of view - view from below, close-up etc
Unique image id numbers and batch numbers
Secondary image codes that may come from various legacy systems
Techniques used in the images - grain, blur etc
Whether the images are part of a series and where they fit in that series
The type of image - photographic print, glass plate negative, colour images, black and white images.
This metadata gives background to the original and the second generation images created during production. Much of the data can be obtained freely or cheaply, and a lot of it will be quick and easy to grab and enter into systems. It should also be objective and easy to check.
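As a rough illustration, the metadata types listed above could be held in a simple record like the one below. This is only a sketch: the class name, field names and the `missing_fields` helper are all hypothetical and not taken from any particular system or standard.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ImageRecord:
    """Basic attributes for one second generation image (hypothetical schema)."""
    image_id: str                                   # unique image id number
    batch_number: Optional[str] = None
    legacy_codes: list[str] = field(default_factory=list)   # codes from legacy systems
    creation_location: Optional[str] = None         # where this image was created
    point_of_view: Optional[str] = None             # e.g. "view from below", "close-up"
    techniques: list[str] = field(default_factory=list)     # e.g. "grain", "blur"
    image_type: Optional[str] = None                # e.g. "glass plate negative"
    series_name: Optional[str] = None
    series_position: Optional[int] = None           # where the image sits in its series
    original_object: Optional[str] = None           # e.g. "sculpture", "daguerreotype"


def missing_fields(record: ImageRecord) -> list[str]:
    """Return the names of basic attributes that are still empty."""
    return [name for name, value in vars(record).items() if value in (None, [])]


record = ImageRecord(image_id="IMG-0001", point_of_view="close-up", techniques=["grain"])
gaps = missing_fields(record)
```

Because these attributes are meant to be objective and easy to check, a list of empty fields like `gaps` is one simple way to see how complete a batch of records is.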
Image findability - depicted content
A major part of image findability is what the images depict. Classifying images based on depicted content means considering anything and everything that is and can be depicted in an image.
Broadly speaking, people searching for depicted content are looking for a number of types:
Generic and named:
Places: cities, towns, villages, streets
Structures: parks, skyscrapers, cottages, walls, doors, windows
Topography: mountains, valleys
Groups and organisations: air forces, choirs, police departments
Animals and plants.
People and their:
Roles and occupations: mothers, doctors
Ethnicity and nationality: Caucasians, French, Germans
Actions, activities and events: running, writing, laughing, smiling, birthdays, parties, book signings, meetings.
Objects: a myriad of items.
Anatomy and attributes of people, animals and plants - arms, legs, adults, leaves, trunks, paws, tails.
Also of use are:
Depicted text shown in images - often signs or writing
Commercial tags such as 'Copy space'
Depicted periods and art and architecture styles
When dealing with depicted content I've found some of the biggest issues to be:
Identification - knowing what is in an image
Focus and specificity - knowing what to include and what to exclude
Consistency - applying the same term in the same way for the same depicted content
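One common way to tackle the consistency issue is a controlled vocabulary that maps the variants people type to a single preferred term. The sketch below assumes a tiny, made-up vocabulary; `PREFERRED_TERMS` and `normalise_tags` are hypothetical names used only for illustration.

```python
# Hypothetical controlled vocabulary: variant term -> preferred term.
PREFERRED_TERMS = {
    "skyscraper": "skyscrapers",
    "high-rise": "skyscrapers",
    "high rise": "skyscrapers",
    "doctor": "doctors",
    "physician": "doctors",
    "mother": "mothers",
    "mum": "mothers",
}


def normalise_tags(raw_tags):
    """Map free-typed tags to preferred terms and collect anything unrecognised."""
    known = set(PREFERRED_TERMS.values())
    accepted, unrecognised = [], []
    for tag in raw_tags:
        key = tag.strip().lower()
        term = PREFERRED_TERMS.get(key, key)
        if term in known:
            if term not in accepted:      # avoid tagging the same term twice
                accepted.append(term)
        else:
            unrecognised.append(tag)      # left for a cataloguer to review
    return accepted, unrecognised


accepted, unrecognised = normalise_tags(["High-rise", "Skyscraper", "physician", "zeppelin"])
```

Here the two spellings for tall buildings collapse to one term, and the unknown tag is set aside for review rather than silently added to the vocabulary.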
Image findability - conceptual aboutness
Image indexing gets especially tricky, and really parts company from the world of document indexing, with 'aboutness' access to images. By their nature images convey a myriad of messages to any number of people. Few images are not 'about' some type of abstract concept, and few image users make no use of this important access point to image content.
Conceptual aboutness includes a variety of types:
Emotions - love, hate, fear, alienation
Behaviours - surrender, blame, hypocrisy
Social and political concepts - poverty, inequality, democracy, capitalism, fascism
Characteristics and ideals - purity, innocence, wisdom
Other concepts - Childhood, Fantasy, Momentum, Repetition
Popular phrases - Message in a Bottle, Time is Money, Not in Context, The Morning After, Word of Mouth, You are What You Eat
Moods - bleak, gloomy, rustic, eerie
Some people object to the application of these concepts to digital images. They see these concepts as hugely subjective, hard to apply and hard to search for. However, applying descriptive terms to images can be just as subjective and just as hard to do, and it is just as valuable.
It's tempting to dismiss conceptual aboutness access to images when adopting a classification perspective. But it's less easy when looking from the users' viewpoint. Many people, in many roles, like to and need to access images based on their conceptual aboutness. This demand exists in some image markets more than in others, but it is important, it will not go away, and it needs to be addressed by image classification staff.
Whilst being important and useful, applying these concepts to images brings its own challenges:
Images can be seen in new ways overnight. For example, an image of a passenger plane can have innocuous meanings one day: speed, travel, communication and technology. Then, following a terrorist event or a well-publicised crash, new meanings surface in people's minds when they look at the same image: fear, terrorism, death, danger, risk etc.
How to handle the messages people get from images? Should we categorise the messages or rely on the firmer depictions? Perhaps a case can be made for not allowing this type of access to images; instead, searchers could look for objects that represent a given concept rather than looking for the concept. This has the advantage that a plane is always a plane. If one day a user associates a plane with fear and another day with speed, they'll always be able to find the plane by searching using the depicted term. However, this is not how people always search. People looking for images of planes because they want to use them to represent fear or speed will often simply type in fear, speed etc, starting with the concept and not the depiction.
I prefer to take action based on users and their requirements and in many cases I'd argue for the use of these abstract concepts - though one size does not fit all in this world.
Abstract concept access to images can often be through a positive or a negative route. For example, in positive times an image of a handshake can be about 'recruitment', 'welcome' or 'new beginnings'. In more negative times the same image becomes about 'redundancy', 'endings' and 'goodbyes'. In these cases, should images be tagged to reflect both sides of a concept, the good and the bad? If a person is looking for a positive or a negative, would it make sense to them to see the same image for either search? Would this be useful? Can certain concepts always be attached to certain objects? Is a lightbulb always about creativity, ideas, innovation and inspiration? I think not, but how to control the application of these concepts so the right concept is associated with the right image - at least most of the time?
This is a good time to admit that it is impossible to always consistently apply abstract aboutness concepts to still images. It cannot be guaranteed that all images a searcher may consider relevant to a concept will always be indexed with that concept. Concepts do inherently mean different things to different people and images do convey different concepts at different times. One person's 'isolation' may be another person's 'solitude'. What can be achieved is a relatively consistent understanding of the meaning and ways to apply a given concept and a guarantee that when a searcher uses a given term the images they see in their results set will be good examples of that concept.
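To make the plane example concrete, here is a minimal sketch of an index that keeps depicted terms and concept terms in separate fields, so a searcher can start from either the depiction or the concept. The image ids, tags and function names are invented for illustration and do not describe any real system.

```python
from collections import defaultdict

# Hypothetical tagged images: depicted terms and concept terms kept apart.
IMAGES = {
    "img-001": {"depicted": {"plane", "sky"}, "concepts": {"speed", "travel"}},
    "img-002": {"depicted": {"plane", "runway"}, "concepts": {"fear", "risk"}},
    "img-003": {"depicted": {"handshake"}, "concepts": {"welcome", "new beginnings"}},
}


def build_index(images):
    """Build term -> image ids, recording whether each hit is depicted or conceptual."""
    index = defaultdict(lambda: {"depicted": set(), "concepts": set()})
    for image_id, tags in images.items():
        for kind in ("depicted", "concepts"):
            for term in tags[kind]:
                index[term][kind].add(image_id)
    return index


def search(index, term, include_concepts=True):
    """Return image ids for a term; concept matches can be switched off."""
    entry = index.get(term.lower())
    if entry is None:
        return set()
    hits = set(entry["depicted"])
    if include_concepts:
        hits |= entry["concepts"]
    return hits


index = build_index(IMAGES)
plane_hits = search(index, "plane")
fear_hits = search(index, "fear")
fear_depicted_only = search(index, "fear", include_concepts=False)
```

A search for the depicted term finds both planes whatever they currently stand for, while a search for 'fear' only succeeds if someone has chosen to apply that concept. Keeping the two fields apart also lets concept tags be revised over time without touching the firmer depicted terms.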
Image findability - anything else?
Yes, two more things: titles and captions. At least one of these is pretty crucial to most images and often images will need both. Titles of three or four words introduce an image; they aim to do this quickly and simply. Captions of three to four sentences provide more space to explain an image or to put it into a useful context. The more there is to say about an image the more important these two free text fields are - crucial for an historical image, not as important for a simple trademark.
Bring it all together and what have we got?
A set of images that are tagged and findable based on a reasoned approach to the needs of the users of these images. Enhanced findability should produce easier and more effective access to the right images in the right ways.
At Dow Jones, Ian delivers solutions for global clients. Projects include: assessments, metadata and vocabulary strategy and creation, search and browse support, and asset tagging. Ian also manages www.taxonomywarehouse.com - a large resource for licensable vocabularies.
Ian spent 13 years developing vocabulary and indexing for images at Corbis and Photonica. Ian led Corbis' UK image cataloguing division. At Photonica, Ian created e-commerce websites and developed vocabularies underpinning classification and retrieval of all image content. This included an English thesaurus and its localisation into five languages.
Ian tweets and blogs about information management and extols the virtues of findability at conferences.