The Yahoo! Pipes application
(http://pipes.yahoo.com) is a
powerful tool for mashing up a variety of information. I've
found it particularly useful for aggregating and filtering RSS feeds,
so you can create a specific and relevant flow of information on a
variety of topics.
The added advantage is that you can also easily copy Pipes and
tweak them just a bit to create additional topic feeds as you need
them. In this article, I'll walk you through how to do this.
Fig. 1
Getting acquainted with Pipes
Yahoo! Pipes have been around for quite a while -- and still remain in
'beta' version. Seeing this 'beta' status after so many years made me
worry that Pipes would go away unexpectedly in the near future, but
as an update
on the site from June 2009 indicates:
'We've been getting some
questions about what's going on with Pipes these day(s) from various
blogs and our message boards (Phew!
So I'm not the only one who was worried), so here's a small
update about its progress and what we're working on.
Today,
Pipes serves hundreds of millions requests monthly and its usage
continues to grow. As part of the Yahoo! Open Strategy, we're excited
about the ongoing potential for Pipes and can't wait to see what
developers like you use it for next.'
There are some really cool things
being created in Pipes which, I'll admit, are way beyond my current
competence level in using Pipes. For example, check out the Pipe
'Social
Media Firehose' -- a fairly quick way to get a scan on product,
company or phrase mentions across several social media tools.
When I look at the construction
behind some of these Pipes, I'll admit I often don't understand what
I'm seeing. When you start to build a Pipe from scratch, you quickly
come to realise that building a brand new Pipe is similar to using
Microsoft Visual Basic -- it's almost like a programming language.
My goal here is to keep the learning curve and confusion to a
minimum -- to show you a quick and easy way to create a Pipe that
aggregates and filters feeds, and then copy your Pipe to easily
create additional feeds in your topic areas of interest. This process
uses only a couple of the modules available in Pipes.
Getting started: the importance of cloning
Before you get started playing with
Pipes, you will need to sign up for a Yahoo! ID, if you don't already
have one. You can do this at https://edit.yahoo.com/registration.
You can watch the Pipes
introductory video on the site, but I've found the easiest way to
create a new pipe is to copy or 'clone' an existing one. You can
clone ones that you've made yourself, or any available 'public'
pipes. You can browse or search available Pipes at
http://pipes.yahoo.com/pipes/pipes.popular.
I've 'published' a few of my Pipes
(in other words, made them available to the public) on the main Pipes
site at http://pipes.yahoo.com.
When you're creating a Pipe, you can choose to publish it or
not. If you don't, it's only for your own use. You can still share
the information created by your Pipes; it's just that others can't
see the sources or mechanism behind your Pipe. This is especially
helpful if you are doing sensitive business or intelligence work.
Depending upon the intensity of the
Pipe (number of sources, operations, etc.), it may take a while
to load -- just be patient. Once the Pipe does come up, you'll see a
few different options:
View Source: This allows you to
look at how the Pipe is constructed.
Clone: This allows you to copy the
Pipe and to modify it for your own usage.
Fig. 3
You'll also notice that there are
several ways to capture the information created by the Pipe. You can
get a straight RSS feed, which you can then put into any RSS reader
or service. You also have the ability to get a 'badge', so that the
Pipe will feed onto TypePad, Blogger or WordPress blogs or iGoogle.
Pipes also offers the raw code that you can embed onto a page.
Fig. 3.5
Again, we'll
use the Open Source Pipe as an example to walk you through Cloning
(or copying) a Pipe, modifying it, and testing it.
Once you've signed in, you can
Clone the Pipe. When you click on Clone, you'll get a screen that
looks almost exactly like the original Pipe, but will be named
'(Original name of the Pipe) copy'. In this case, the copy will be
called 'Open Source copy'.
Modifying your sources
The Pipe copy will run. Now you have an
additional option to 'Edit source'.
When you click on 'Edit source',
you can see the detail of the feeds that are part of the Pipe and the
filters.
Fig. 4
The Open Source Pipe is a fairly
simple example. You'll see a module on the left called 'Fetch site
feed', which has several URLs listed. The nice thing about the 'Fetch
site feed' module is that you don't need to search for the RSS feeds
on the site URL that you enter; you can simply put in the URL and
Pipes will fetch any and all RSS feeds available on the page.
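To give a feel for what this kind of feed auto-discovery involves, here is a minimal Python sketch. It assumes the site advertises its feeds through link tags marked rel="alternate" in the page's HTML, which is a common convention; it is not a description of how Pipes itself does the job, and the class name, function name and sample page are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects feed URLs advertised in <link rel="alternate"> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel = attr.get("rel", "").lower().split()
        if "alternate" in rel and attr.get("type", "").lower() in FEED_TYPES:
            href = attr.get("href")
            if href:
                # Resolve relative links against the page URL.
                self.feeds.append(urljoin(self.base_url, href))


def discover_feeds(html, base_url):
    """Return every feed URL advertised in the given HTML text."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


# Hypothetical page content, used only for illustration.
SAMPLE_PAGE = """
<html><head>
<link rel="alternate" type="application/rss+xml" href="/news/rss.xml">
<link rel="stylesheet" href="/style.css">
</head><body></body></html>
"""

print(discover_feeds(SAMPLE_PAGE, "http://example.com/"))
```

The stylesheet link is ignored because it is not marked as an alternate feed, which is why you only need to supply the page URL rather than hunting for the feed address yourself.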
Fig. 5
You'll notice on the far left,
there are other modules you can also use. One of these is 'Fetch
feed'. You would use this option if you have the actual RSS feed
links.
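If you are curious what working with an actual RSS feed link looks like outside Pipes, the following Python sketch reads the items out of an RSS 2.0 document. The sample feed text and the parse_rss function are hypothetical, and the sketch parses a string rather than downloading a live feed so that it runs anywhere.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document, standing in for a downloaded feed.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Open source project releases new version</title>
      <description>The release adds several features.</description>
      <pubDate>Mon, 01 Jun 2009 09:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Quarterly results announced</title>
      <description>Revenue rose on open source services.</description>
      <pubDate>Tue, 02 Jun 2009 09:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>
"""


def parse_rss(xml_text):
    """Return each <item> as a dict of title, description and pubDate."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
            "pubDate": item.findtext("pubDate", default=""),
        })
    return items


for entry in parse_rss(SAMPLE_RSS):
    print(entry["title"])
```

The title, description and pubDate fields roughly correspond to the item fields you can filter on in Pipes.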
Fig. 6
For 'Fetch site feed' URLs, you can
enter a home page URL for a site, and you can also enter a URL that
lists RSS feeds available from the source. (Two good examples are the
Wall Street Journal RSS feed page and the Financial Times RSS feed page.)
To remove any URLs, you can simply
click on the 'minus sign' circle to the left of the URL link. To add
additional URLs, simply click on the 'plus sign' next to 'URL' at the
top of the list. An open field will be added to the bottom of the
list.
Typically, if the auto-discovery
process has been successful, the source logo will appear to the left
after a moment. Otherwise, a question mark may appear. This doesn't
necessarily mean that no feeds were found or available; it may simply
mean the feeds are down at the moment.
Filtering your information
Now we'll look at the Filter box. The
Filter module will allow you to set up some Boolean limiters around
the raw feeds. You'll see that you have the option to Permit or Block
the words you specify, and the option to use Any or All of the words
you specify.
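The Permit/Block and Any/All choices can be sketched in a few lines of Python. This is only an illustration of the logic, assuming each feed item is a simple dictionary of text fields; the function name, parameter names and sample items are hypothetical and are not part of Pipes.

```python
def filter_items(items, keywords, field="description",
                 mode="permit", match="any"):
    """Keep or drop items depending on whether a field contains keywords.

    mode  -- "permit" keeps matching items, "block" drops them
    match -- "any" needs one keyword to match, "all" needs every keyword
    """
    if mode not in ("permit", "block") or match not in ("any", "all"):
        raise ValueError("unknown mode or match setting")
    combine = any if match == "any" else all
    kept = []
    for item in items:
        text = item.get(field, "").lower()
        matched = combine(word.lower() in text for word in keywords)
        # Permit keeps the matches; Block keeps everything else.
        if matched == (mode == "permit"):
            kept.append(item)
    return kept


# Hypothetical feed items, used only for illustration.
ITEMS = [
    {"title": "A", "description": "New open source database released"},
    {"title": "B", "description": "Proprietary database gets an update"},
    {"title": "C", "description": "Open standards and source control tips"},
]

permitted = filter_items(ITEMS, ["open source"])
blocked = filter_items(ITEMS, ["open source"], mode="block")
print([item["title"] for item in permitted])
print([item["title"] for item in blocked])
```

Note that matching on the phrase 'open source' is stricter than requiring the two separate words 'open' and 'source' with All: item C contains both words but not the phrase.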
Fig. 7
In the Open Source example, the
filter permits those items where the item description contains 'open
source'. You can modify the words in that text box, or add additional
keywords or 'Rules' using the 'plus sign' next to 'Rules'.
Here is where Pipes can begin to
get confusing, but it really isn't too bad if you keep it simple. If
you click on the 'item:description' box, you'll find that it is actually a
drop-down box with a multitude of options. For keyword filtering,
I typically stick with either 'item:description' or 'item:title'.
Using one of these typically gets me the relevance I need out of the
feed. When I'm setting up a new feed, I will try both of these to see
how results vary, and select the one that gives me the best results.
If you scroll down further in the
drop-down box, you'll also see that you can filter by publishing
date. I have not yet tried this function. In my opinion, if you're
simply looking to capture the most recent and breaking information,
it's not necessary to use this function. Do note, though, that by
adding more Rules, you can filter both by keywords and by publishing
date.
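As a rough illustration of combining a keyword rule with a publishing date rule, here is a short Python sketch. It assumes each item carries its date in the standard RSS pubDate format; the function name, the sample items and the cutoff date are hypothetical, and this is not how Pipes implements the feature.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def recent_and_relevant(items, keyword, published_after):
    """Keep items that mention the keyword and are newer than a cutoff."""
    kept = []
    for item in items:
        published = parsedate_to_datetime(item["pubDate"])
        if (keyword.lower() in item["description"].lower()
                and published > published_after):
            kept.append(item)
    return kept


# Hypothetical feed items, used only for illustration.
DATED_ITEMS = [
    {"description": "Open source release announced",
     "pubDate": "Mon, 01 Jun 2009 09:00:00 +0000"},
    {"description": "More open source news",
     "pubDate": "Tue, 02 Jun 2009 09:00:00 +0000"},
    {"description": "Unrelated story",
     "pubDate": "Wed, 03 Jun 2009 09:00:00 +0000"},
]

cutoff = datetime(2009, 6, 1, 12, 0, tzinfo=timezone.utc)
for item in recent_and_relevant(DATED_ITEMS, "open source", cutoff):
    print(item["description"])
```

Only the second item passes both rules: the first is older than the cutoff and the third does not mention the keyword.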
You'll also see that the
'Contains' box is a drop-down box. You can choose 'Does not
contain', and other operators that you would use with a publishing
date filter. Usually I will stick with 'Contains', but adding a 'Does
not contain' operator often will screen out duplicative or
less-relevant results.
For example, I created a feed
specifically for Facebook news, and chose to filter out items
containing 'Twitter'. The results were much more specific to
Facebook, and filtered out those items where Facebook was simply
mentioned along with other social networking tools.
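The Facebook example can be expressed as two rules that must both hold. The Python sketch below is an illustration only: the rule format, the passes function and the sample items are hypothetical, and the operator names simply mirror the labels in the Pipes drop-down.

```python
def passes(item, rules):
    """Return True when the item satisfies every (field, operator, word) rule."""
    for field, operator, word in rules:
        found = word.lower() in item.get(field, "").lower()
        if operator == "contains":
            if not found:
                return False
        elif operator == "does not contain":
            if found:
                return False
        else:
            raise ValueError("unknown operator: " + operator)
    return True


# Keep items about Facebook, but drop those that also mention Twitter.
FACEBOOK_RULES = [
    ("description", "contains", "Facebook"),
    ("description", "does not contain", "Twitter"),
]

# Hypothetical feed items, used only for illustration.
NEWS = [
    {"description": "Facebook launches a new privacy setting"},
    {"description": "How brands use Facebook and Twitter together"},
    {"description": "Twitter adds a search feature"},
]

for item in NEWS:
    if passes(item, FACEBOOK_RULES):
        print(item["description"])
```

The second item is dropped even though it mentions Facebook, which is exactly the kind of general social networking roundup the exclusion is meant to screen out.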
That said, I encourage you to
experiment with different filters, operators and keywords until you
come upon a combination that works for you.
The key advantages
Now that you know a little bit about
the logistics, let me share my broader approach to this process.
You'll see that the Open Source Pipe
example pulls from a lot of technical market sources (such as CNET
and VARbusiness), which makes sense for the Open Source Pipe. It also
pulls from more mainstream sources that cover technology, such as the
New York Times and Wall Street Journal. I've
found that, if I can create a good base of resources around an
industry, I can often simply Clone a Pipe and just put in different
keywords to get a strong Pipe on a new topic.
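The clone-and-tweak idea can be pictured as copying a small configuration and changing only the keywords. The Python sketch below is a hypothetical model of that approach, not a Pipes feature or API; the PipeConfig class, the clone_pipe function and the example.com source URLs are all invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PipeConfig:
    """A simplified stand-in for a Pipe: a name, sources and keywords."""
    name: str
    sources: tuple
    keywords: tuple


def clone_pipe(pipe, new_name, new_keywords):
    """Copy a pipe's sources but swap in a new name and new keywords."""
    return replace(pipe, name=new_name, keywords=tuple(new_keywords))


# Hypothetical base of technology sources shared by every clone.
open_source_pipe = PipeConfig(
    name="Open Source",
    sources=("http://example.com/tech", "http://example.com/business"),
    keywords=("open source",),
)

cloud_pipe = clone_pipe(open_source_pipe, "Cloud Computing", ["cloud computing"])
print(cloud_pipe.name, cloud_pipe.keywords)
```

The original configuration is left untouched, and the clone reuses the same list of sources, which is where most of the effort in building a good Pipe goes.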
The other advantage is, I can
quickly add sources to the 'Fetch site feed' list of resources as I
find them, continually making my Pipes stronger.
Pipes can get very confusing very
quickly, which I believe is one of the reasons they are not more
widely used. The other challenge with Pipes is finding a good base of
resources around an industry that's new to you. This is not a new
challenge to researchers. The good news is that there are many
colleagues and resources we can rely on to help us locate these
resources. A hidden challenge in this process, however, is that some
of the resources I'd really like to integrate into Pipes still don't
offer RSS feeds.
Despite the complexity of Pipes, by
using this relatively simple process of copying and modifying
existing Pipes, I've found a way
to maximise my information feeds and their relevancy. I encourage you
to Clone the Pipes example I've shared here, and see what you might
be able to create for yourself!
By Scott Brown
Scott
Brown is owner of Social Information Group
(http://www.socialinformationgroup.com), an independent information
practice focused on the effective use of social networking tools for
sharing and finding information. He has worked with Fortune 500
companies, government and non-profit organisations, and individuals to
help them understand and effectively use these tools. As Senior
Information Specialist with Digital Libraries & Research, the
library and information organisation at Sun Microsystems, he provided
in-depth secondary research and competitive intelligence, conducted
stakeholder work, and used Six Sigma tools to determine customer needs
and wants. He received his library degree from San Jose State
University in California in 1999.
Contact Marcy Phelps, our contributing editor for the Find practice area, with your feedback and suggestions for articles or resources.