By Jill Hurst-Wahl
Federated search -
also known as metasearch or a distributed information retrieval system -
provides one portal for searching information from multiple sources. Federated search software has become
necessary because searching multiple databases one by one has
become too onerous as the number of databases has grown
rapidly. Organisations often lease
databases from several providers, each with its own database structure and
search methodology. Federated search
software allows users to issue a single search; the software then runs it
against each of the specified target databases, returning one list of
results. Hosted by the vendor or by the
subscribing organisation, it provides a simplified search process for the user
and enhances information discovery.
This article is extracted from the new FUMSI report, Federated Search Report and Tool Kit.
What Is Federated Search?
Federated search software sits between the user and
information sources as a discovery tool that allows the user to find
information quickly from multiple sources.
At its core, federated search is a single interface that has the ability
to simultaneously search multiple data sources.
Those data sources may be corporate intranets, fee-based databases,
library catalogues, Internet resources, and user-specific digital storage. Using federated search software is an attempt
to improve the accuracy and relevance of individual searches while reducing the
amount of time required to search specific resources one by one. In addition, because many
sources are searched simultaneously, more content becomes visible to the user
quickly.
Federated search software presents a simple search interface
to the user. Rather than needing to learn several search interfaces, each with
its own nuances, the user learns one search interface.
Behind the scenes, the federated search software transforms
the single query into specific queries for each information source, which are
then executed simultaneously. The
results of those queries are returned to the user in a unified list with
minimal or no item duplication. The
results may be displayed to the user in a list or clustered by categories. Some federated search software products will
display the number of hits by database, and then allow the user to select which
hits to display.
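The flow described above (one query, fanned out to every source at the same time, then merged into a single list with hit counts per database) can be sketched in Python. This is a minimal illustration under stated assumptions, not how any particular product works: the `SOURCES` dictionary, the titles in it, and the function names `search_source` and `federated_search` are all invented for the example, and in-memory lists stand in for real remote databases.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "databases"; a real deployment would call remote services.
SOURCES = {
    "catalogue": ["Open source in the enterprise", "Permaculture basics"],
    "journals": ["Open source licensing", "Corporate open source policy"],
    "intranet": ["Permaculture basics", "Staff handbook"],
}


def search_source(name, query):
    """Run the query against one source and return (source name, matching titles)."""
    terms = query.lower().split()
    hits = [t for t in SOURCES[name] if all(term in t.lower() for term in terms)]
    return name, hits


def federated_search(query):
    """Send one query to every source concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        per_source = dict(pool.map(lambda n: search_source(n, query), SOURCES))

    merged, seen = [], set()
    for name in SOURCES:              # fixed order keeps the output deterministic
        for title in per_source[name]:
            if title not in seen:     # minimal de-duplication by exact title
                seen.add(title)
                merged.append(title)
    hit_counts = {name: len(hits) for name, hits in per_source.items()}
    return merged, hit_counts


results, counts = federated_search("permaculture")
print(results)  # one merged entry, although two sources matched
print(counts)   # hits reported per database
```

Here the thread pool plays the role of the "executed simultaneously" step, and `hit_counts` corresponds to products that show the number of hits by database before the user chooses what to display.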
For the federated search software to work properly, it must
understand the content sources it is accessing.
The software must have been programmed with the information needed to
use each source correctly. Such a
program (sometimes described as an application programming interface, or API) is
often called a connector. Each connector
is tailored to one database or content source. It is responsible for translating the query
so that it is executed correctly in each database and for translating the
results for display to the user. The
federated search software will likely be pre-programmed with a set of
connectors. Additional connectors can be
added if the organisation requires them.
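A connector's two jobs, translating the query for its source and translating the source's results back into a common shape, might look like the following Python sketch. Everything here is hypothetical: the `Connector` interface, the two example sources, their query syntaxes, and the raw field names `TITLE_MAIN` and `headline` are invented purely to illustrate the idea.

```python
from abc import ABC, abstractmethod


class Connector(ABC):
    """Hypothetical connector interface: one subclass per content source."""

    @abstractmethod
    def translate_query(self, terms):
        """Turn a list of search terms into the source's own query syntax."""

    @abstractmethod
    def translate_results(self, raw):
        """Turn the source's raw records into a common {'title', 'source'} shape."""


class CatalogueConnector(Connector):
    # Assumed syntax: this imaginary catalogue wants 'ti=' prefixes joined by AND.
    def translate_query(self, terms):
        return " AND ".join(f"ti={t}" for t in terms)

    def translate_results(self, raw):
        return [{"title": r["TITLE_MAIN"], "source": "catalogue"} for r in raw]


class ArticleDbConnector(Connector):
    # Assumed syntax: this imaginary article database wants '+' separated terms.
    def translate_query(self, terms):
        return "+".join(terms)

    def translate_results(self, raw):
        return [{"title": r["headline"], "source": "articles"} for r in raw]


terms = ["open", "source"]
for connector in (CatalogueConnector(), ArticleDbConnector()):
    print(connector.translate_query(terms))
```

Adding support for a new database then means writing one more subclass, which is also why a change on the database's side can break searching until that subclass is updated.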
Some databases and content sources can be searched by the
software in real time (or on-the-fly), while others may need to be periodically
crawled (indexed). It is better to have
content sources that can be searched in real time because the search software
then accesses the most current version of the searchable index. Although a content source could be crawled
frequently, the index that is searched will only be as up to date as the
last crawl.
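The difference between real-time searching and searching a crawled index can be shown with a small Python sketch. The class names `ContentSource` and `CrawledIndex` and the sample titles are assumptions made for illustration; the point is only that a crawled index is a snapshot that goes stale until the next crawl.

```python
class ContentSource:
    """Hypothetical content source holding a list of document titles."""

    def __init__(self, titles):
        self.titles = list(titles)

    def search_live(self, term):
        # Real-time search: always looks at the current content.
        return [t for t in self.titles if term.lower() in t.lower()]


class CrawledIndex:
    """Snapshot of a source, taken whenever crawl() is called."""

    def __init__(self, source):
        self.source = source
        self.snapshot = []
        self.crawl()

    def crawl(self):
        # Copy the content as it exists right now.
        self.snapshot = list(self.source.titles)

    def search(self, term):
        return [t for t in self.snapshot if term.lower() in t.lower()]


source = ContentSource(["Annual report 2009"])
index = CrawledIndex(source)
source.titles.append("Annual report 2010")   # new content arrives after the crawl

print(len(source.search_live("report")))  # the live search sees both documents
print(len(index.search("report")))        # the index still sees only the first
index.crawl()
print(len(index.search("report")))        # after re-crawling, the index catches up
```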
Federated search software may have the following features:
- Ability to search multiple databases concurrently
- Search databases in real time
- Simple and advanced search capabilities
- De-duping (de-duplication) of records - removing duplicates according to a set of rules (e.g. which database takes precedence)
- Merged search results
- Sorting of records
- Faceted (topically clustered) search results
- Save, print and email results
- Exporting of results to a file
- Extensive patron authentication support
- Display databases by categories
- Compliance with an OpenURL resolver
- Search status report
- Ability to search local and remote collections as well as Internet resources and search engines using three modes: HTTP (or screen scraping), Z39.50 and XML
- Personalised access to resources
- Ability to access electronically available content without further authentication
- Compliant with guidelines for accessibility by people with disabilities
- Relevance ranking
- Unlimited simultaneous users
- Ability to link into an interlibrary loan (ILL) system
- Extensive search statistics
- Ability to be branded to match the look-and-feel associated with the local organisation
- Ability to do distributed federated search where the search feature can be built into various sections of the organisation's web site
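The de-duping feature in the list above, where a rule decides which database takes precedence, could be sketched as follows in Python. This is one possible rule under stated assumptions, not a description of any vendor's method: the record shape (`title` and `db` keys), the function name `dedupe`, and the choice to match duplicates on a normalised title are all invented for the example. Real products usually compare richer metadata, which is one reason complete de-duping is hard.

```python
def dedupe(records, precedence):
    """Keep one record per title, preferring the database listed first in `precedence`."""
    rank = {db: i for i, db in enumerate(precedence)}
    unranked = len(rank)   # databases missing from the list sort last
    best = {}
    for rec in records:
        key = " ".join(rec["title"].lower().split())   # normalise case and spacing
        current = best.get(key)
        if current is None or rank.get(rec["db"], unranked) < rank.get(current["db"], unranked):
            best[key] = rec
    return list(best.values())


records = [
    {"title": "Open Source  Software", "db": "aggregator"},
    {"title": "open source software", "db": "publisher"},
    {"title": "Permaculture", "db": "aggregator"},
]
for rec in dedupe(records, ["publisher", "aggregator"]):
    print(rec["db"], "-", rec["title"])
```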
The Benefits and Shortcomings of Federated Search
The benefits of federated search include:
- One-stop 'shopping' - a feature users have
proven they like (as demonstrated by the popularity of Google and similar
search tools)
- Enhanced information discovery
- The ability to search multiple repositories
without having to learn the specific search options for each repository. Those repositories include library
catalogues, full-text databases, abstract and index databases, ebooks and more
- The ability to create one portal for all library
content
- Searching library and non-library content simultaneously (e.g. in a corporate
environment).
The shortcomings include:
- Federated search does not emulate native
search. Searching a content source
through its native search function will always offer more sophisticated options
- It is harder to delve deeper into a collection
with a federated search product
- Complete de-duping (removing of duplicates) may
be difficult
- A change in a database's configuration will
render it unsearchable by the federated search engine until the 'connector' is
fixed
- Configuring a federated search system for a new
or unknown database takes time and may require money to compensate the vendor
or a contractor
- Searching for a 'known item' is not what federated
search does well. For example, if you
know the name of the article you want, it would be better to go directly to the
database that contains that article than to use federated search software.
Since every
federated search software product is different, it is important for an
organisation to understand what features are required, what benefits the
organisation is seeking, and how it will be impacted by the shortcomings. It is also important for the organisation to
understand the industry, since the industry - companies, products, and features
- is in flux. Thus, it is recommended
that the organisation carefully assess the offerings to choose the best software
as opposed to 'following the leader' and automatically selecting what others
have chosen.
Is Your Organisation a Good Candidate for a Federated Search Solution?
Before undertaking a
needs assessment on federated search, you may wonder whether you should even
bother. While a needs assessment is the best way
to discern the needs of your organisation, you may
decide not to investigate federated search if your organisation does
not:
- Have funding to invest in federated search
software. Even if you use an open source
product, you will need to invest the time of your staff, and time is money
- Have personnel resources to investigate
software options and work with a federated search company on installation. No matter how easily a federated search
product can be installed, it will require the time and attention of your staff
- Subscribe to or use multiple databases. If your organisation only uses a handful of
content sources, your users may find it tolerable to search each source
individually
- Have complaints from its users concerning
the time it takes to search various content sources. Of course, they may be complaining to each
other and not to you and your staff, so you should take time to ensure that
your users are voicing their real thoughts and concerns.
You may want to
dedicate part of a staff meeting to this topic to gather information
quickly from your staff and ensure that this decision is not made in a
vacuum.
A Tale of Two Searches - Federated Search in Action
Sara sits in front
of her PC and looks at the library's web site.
Sara is looking for information on the use of open source software in
corporations, but isn't sure where to look for the information. The library's web site lists 100
databases. Should she search each one
individually? She estimates that could
easily take her two hours to complete.
Should she guess at specific databases to search and hope that she
selects the correct ones? Sara decides
to search as many of the databases as possible, but gives up after 20 minutes
out of frustration from not finding what she needs quickly.
Sean looks at the
library web site from his PC and sees not only a list of 100 databases to
search, but also a 'Quick Database Search' option that allows him to search all
of the databases at the same time. He
types his query into the Quick Database Search box and presses Enter. Immediately, results begin to appear on his
screen. Sean, who is searching on
permaculture, finds references in databases that he would not have searched on
his own. This Quick Database Search
(i.e., federated search) allowed him to search many databases at once, discover
those databases that contained pertinent information, and receive on-target
results.
Evaluating and Implementing Federated Search
As with any strategic project, it's important to develop and
follow a process to make the best decision with regard to federated search
solutions. The Federated Search Report and Tool Kit provides step-by-step
guidance in:
- Conducting a needs assessment
- Understanding staff requirements
- Identifying appropriate federated search
solutions
- Engaging in research on solutions and completing
due diligence
- Measuring results.
Case studies, figures, and 10 hands-on activities make the Tool
Kit customisable to your organisational needs.
Jill Hurst-Wahl, MLS, has over
20 years' experience in the library and information sector, including work as
an IT business systems analyst and as a corporate librarian. Since founding Hurst Associates in 1998, Jill
has consulted with for-profit and non-profit institutions including many
cultural heritage organisations.
Jill's Digitization
101 blog (http://www.Digitization101.com) is a
frequently read resource among those involved in digitization programs. She also blogs about social networking tools
at eNetworking 101: The Blog (http://www.eNetworking101.com/blog).
Copyright 2010 Free Pint Limited