Federated search - also known as metasearch or distributed information retrieval - provides one portal for searching information from multiple sources. Federated search software has become necessary because searching multiple databases one by one has become too onerous as the number of databases has grown rapidly. Organisations often lease databases from several providers, each with its own database structure and search methodology. Federated search software allows users to issue a single search; the software then runs it against each of the specified target databases and returns one list of results. Whether hosted by the vendor or by the subscribing organisation, it simplifies the search process for the user and enhances information discovery.
Federated search software sits between the user and information sources as a discovery tool that allows the user to find information quickly from multiple sources. At its core, federated search is a single interface that can search multiple data sources simultaneously. Those data sources may be corporate intranets, fee-based databases, library catalogues, Internet resources, and user-specific digital storage. Federated search software aims to improve the accuracy and relevance of individual searches while reducing the time required to search specific resources one by one. In addition, because many sources are searched simultaneously, more content becomes visible to the user quickly.
Federated search software presents a simple search interface to the user. Rather than needing to learn several search interfaces, each with its own nuances, the user learns one search interface.
Behind the scenes, the federated search software transforms the single query into specific queries for each information source, which are then executed simultaneously. The results of those queries are returned to the user in a unified list with minimal or no item duplication. The results may be displayed to the user in a list or clustered by categories. Some federated search software products will display the number of hits by database, and then allow the user to select which hits to display.
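The fan-out and merge steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not any vendor's implementation: the two source functions, their names, and their sample records are hypothetical stand-ins for real databases.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for two information sources. A real source would be
# a remote database reached over the network.
def search_catalogue(query):
    records = ["Open source in business", "Permaculture basics"]
    return [r for r in records if query.lower() in r.lower()]

def search_articles(query):
    records = ["Permaculture basics", "Permaculture in cities"]
    return [r for r in records if query.lower() in r.lower()]

SOURCES = {"catalogue": search_catalogue, "articles": search_articles}

def federated_search(query):
    """Run one query against every source at once and merge the results."""
    # Submit all searches before waiting on any of them, so they run concurrently.
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in SOURCES.items()}
        hits_by_source = {name: f.result() for name, f in futures.items()}

    # Merge into one list, dropping titles already seen (a very naive de-dupe).
    merged, seen = [], set()
    for hits in hits_by_source.values():
        for title in hits:
            key = title.lower()
            if key not in seen:
                seen.add(key)
                merged.append(title)

    # Number of hits per database, which some products show to the user.
    hit_counts = {name: len(hits) for name, hits in hits_by_source.items()}
    return merged, hit_counts

merged, hit_counts = federated_search("permaculture")
print(merged)      # unified list without the duplicate title
print(hit_counts)  # hits reported by each source
```

The thread pool is only one way to issue the queries at the same time; the point is that the user submits one query and receives one list, however many sources sit behind it.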
For the federated search software to work properly, it must understand the content sources it is accessing. The software must have been programmed with the information needed to use each source correctly. Such a program (or application programming interface, API) is often called a connector. Each connector works in a specific way with its database or content source. It is responsible for translating the query so that it is executed correctly in that database and for translating the results for display to the user. The federated search software will likely be pre-programmed with a set of connectors. Additional connectors can be added if the organisation requires them.
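One way to picture a connector is as a small class with two jobs: translating the query and translating the results. The sketch below is illustrative only. The class names, the two query syntaxes, and the raw record shapes are all assumptions made for the example, not the interface of any real product.

```python
from abc import ABC, abstractmethod

class Connector(ABC):
    """Hypothetical connector interface: one subclass per content source."""

    @abstractmethod
    def translate_query(self, terms):
        """Turn generic search terms into the source's own query syntax."""

    @abstractmethod
    def translate_results(self, raw):
        """Turn the source's raw records into a common display format."""

class BooleanConnector(Connector):
    # Assumed source: wants terms joined with AND, returns dicts keyed by "ti".
    def translate_query(self, terms):
        return " AND ".join(terms)

    def translate_results(self, raw):
        return [{"title": r["ti"], "source": "boolean-db"} for r in raw]

class FieldedConnector(Connector):
    # Assumed source: wants title:"..." clauses, returns (title, year) tuples.
    def translate_query(self, terms):
        return " ".join(f'title:"{t}"' for t in terms)

    def translate_results(self, raw):
        return [{"title": title, "source": "fielded-db"} for title, _year in raw]

terms = ["open", "source"]
boolean_query = BooleanConnector().translate_query(terms)
fielded_query = FieldedConnector().translate_query(terms)
print(boolean_query)  # open AND source
print(fielded_query)  # title:"open" title:"source"
```

Because every connector returns records in the same shape, the rest of the software can merge and display results without knowing which source they came from. It also shows why a change on the database side can break searching: the connector's assumptions about syntax or record layout stop holding.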
Some databases and content sources can be searched by the software in real time (or on-the-fly), while others may need to be periodically crawled (indexed). It is better to have content sources that can be searched in real time because the search software then accesses the most current version of the searchable index. Although a content source could be crawled frequently, the index that is searched will only be as up to date as the date of the last crawl.
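The difference between real-time searching and searching a crawled index can be shown with a small Python sketch. The `ContentSource` and `CrawledIndex` classes and the sample titles are hypothetical, chosen only to show why a crawled index lags behind the source.

```python
from datetime import datetime

class ContentSource:
    """Hypothetical content source whose holdings change over time."""
    def __init__(self):
        self.documents = []

    def add(self, title):
        self.documents.append(title)

    def live_search(self, query):
        # Real-time search: always looks at the current holdings.
        return [d for d in self.documents if query.lower() in d.lower()]

class CrawledIndex:
    """Hypothetical index built by periodically copying the source."""
    def __init__(self, source):
        self.source = source
        self.snapshot = []
        self.last_crawl = None

    def crawl(self):
        # Take a copy of the holdings as they are at this moment.
        self.snapshot = list(self.source.documents)
        self.last_crawl = datetime.now()

    def search(self, query):
        # Only sees what existed at the time of the last crawl.
        return [d for d in self.snapshot if query.lower() in d.lower()]

source = ContentSource()
index = CrawledIndex(source)

source.add("Federated search primer")
index.crawl()
source.add("Search connectors explained")  # added after the crawl

print(index.search("search"))        # misses the newer document
print(source.live_search("search"))  # finds both documents
```

Crawling more often narrows the gap but never removes it, which is the trade-off the paragraph above describes.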
Federated search software may have the following features:
Ability to search multiple databases concurrently
Search databases in real time
Simple and advanced search capabilities
De-duping (de-duplication) of records - removing duplicates according to a set of rules (e.g. which database takes precedence)
Merged search results
Sorting of records
Faceted - topically clustered - search results
Save, print and email results
Exporting of results to a file
Extensive patron authentication support
Display databases by categories
Compliance with an OpenURL resolver
Search status report
Ability to search local and remote collections as well as Internet resources and search engines using three modes: HTTP (or screen scraping), Z39.50 and XML
Personalised access to resources
Ability to access electronically available content without further authentication
Compliant with guidelines for accessibility by people with disabilities
Relevance ranking
Unlimited simultaneous users
Ability to link into interlibrary loan (ILL) system
Extensive search statistics
Ability to be branded to match the look-and-feel associated with the local organisation
Ability to do distributed federated search where the search feature can be built into various sections of the organisation's web site
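The de-duplication feature in the list above, where a rule decides which database takes precedence, could look something like the following Python sketch. The precedence table, the source names, and the title-based matching rule are assumptions for illustration; real products may match on identifiers such as ISBN or DOI instead.

```python
# Assumed precedence rule: the lower number wins when two databases
# hold what looks like the same record.
PRECEDENCE = {"library_catalogue": 0, "fulltext_db": 1, "web": 2}

def normalise(title):
    """Build a comparison key that ignores case, punctuation and extra spaces."""
    cleaned = "".join(ch for ch in title.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())

def dedupe(records):
    """Keep one record per normalised title, preferring the higher-precedence source."""
    best = {}
    for rec in records:
        key = normalise(rec["title"])
        current = best.get(key)
        if current is None or PRECEDENCE[rec["source"]] < PRECEDENCE[current["source"]]:
            best[key] = rec
    # Sort the survivors by title so the merged list has a stable order.
    return sorted(best.values(), key=lambda r: r["title"].lower())

records = [
    {"title": "Open Source in Corporations", "source": "web"},
    {"title": "open source in corporations.", "source": "library_catalogue"},
    {"title": "Permaculture Basics", "source": "fulltext_db"},
]
unique = dedupe(records)
for rec in unique:
    print(rec["source"], "-", rec["title"])
```

Here the web copy is dropped in favour of the library catalogue copy. Matching on normalised titles alone will miss duplicates whose titles differ slightly, which is one reason complete de-duping can be difficult.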
The Benefits and Shortcomings of Federated Search
The benefits of federated search include:
One-stop 'shopping' - a feature users have proven they like (as demonstrated by the popularity of Google and similar search tools)
Enhanced information discovery
The ability to search multiple repositories without having to learn the specific search options for each repository. Those repositories include library catalogues, full-text databases, abstract and index databases, ebooks and more
The ability to create one portal for all library content
Searching library and non-library content simultaneously (e.g. in a corporate environment).
The shortcomings include:
Federated search does not emulate native search. Searching a content source through its native search function will always offer more sophisticated options
It is harder to delve deeper into a collection with a federated search product
Complete de-duping (removing of duplicates) may be difficult
A change in a database's configuration will render it unsearchable by the federated search engine until the 'connector' is fixed
Configuring a federated search system for a new or unknown database takes time and may require money to compensate the vendor or a contractor
Searching for a 'known item' is not what federated search does well. For example, if you know the name of the article you want, it is better to go directly to the database that contains that article than to use federated search software.
Since every federated search software product is different, it is important for an organisation to understand what features are required, what benefits the organisation is seeking, and how it will be affected by the shortcomings. It is also important for the organisation to understand the industry, since the industry - companies, products, and features - is in flux. The organisation should therefore assess the offerings carefully and choose the software that best fits its needs, rather than 'following the leader' and automatically selecting what others have chosen.
Is Your Organisation a Good Candidate for a Federated Search Solution?
Prior to doing any needs assessment on federated search, you may wonder if you should even bother. While it would be best to do the needs assessment in order to discern the needs of your organisation, you may decide not to pursue investigating federated search if your organisation does not:
Have funding to invest in federated search software. Even if you use an open source product, you will need to invest the time of your staff, and time is money
Have personnel resources to investigate software options and work with a federated search company on installation. No matter how easily a federated search product can be installed, it will require the time and attention of your staff
Subscribe to or use multiple databases. If your organisation only uses a handful of content sources, your users may find it tolerable to search each source individually
Have complaints from its users concerning the time it takes to search various content sources. Of course, they may be complaining to each other and not to you and your staff, so you should take time to ensure that your users are voicing their real thoughts and concerns.
You may want to dedicate part of a staff meeting to this topic to gather information quickly from your staff and to ensure that this decision is not made in a vacuum.
A Tale of Two Searches - Federated Search in Action
Sara sits in front of her PC and looks at the library's web site. Sara is looking for information on the use of open source software in corporations, but isn't sure where to look for it. The library's web site lists 100 databases. Should she search each one individually? She estimates that doing so could easily take her two hours. Should she guess at specific databases to search and hope that she selects the correct ones? Sara decides to search as many of the databases as possible, but gives up after 20 minutes, frustrated at not finding what she needs quickly.
Sean looks at the library web site from his PC and sees not only a list of 100 databases to search, but also a 'Quick Database Search' option that allows him to search all of the databases at the same time. He types his query into the Quick Database Search box and presses enter. Results begin to appear on his screen immediately. Sean, who is searching on permaculture, finds references in databases that he would not have searched on his own. This Quick Database Search (i.e., federated search) allowed him to search many databases at once, discover those databases that contained pertinent information, and receive on-target results.
Evaluating and Implementing Federated Search
As with any strategic project, it's important to develop and follow a process to make the best decision with regard to federated search solutions. The Federated Search Report and Tool Kit provides step-by-step guidance in:
Jill Hurst-Wahl, MLS, has over 20 years' experience in the library and information sector, including work as an IT business systems analyst and as a corporate librarian. Since founding Hurst Associates in 1998, Jill has consulted with for-profit and non-profit institutions including many cultural heritage organisations.
Jill's Digitization 101 blog (http://www.Digitization101.com) is a frequently read resource among those involved in digitization programs. She also blogs about social networking tools at eNetworking 101: The Blog (http://www.eNetworking101.com/blog).