<?xml version="1.0" encoding="UTF-8"?>
<post>
  <body>&lt;p&gt;For those of you who do not know Yahoo Pipes:&lt;/p&gt;


	&lt;p&gt;&lt;cite&gt;Pipes is a powerful composition tool to aggregate, manipulate, and mash-up content from around the web.&lt;/cite&gt;&lt;/p&gt;


	&lt;p&gt;So basically, do you love the Unix pipe operator? Now imagine using your mouse to drag the components (processes) and connect them with lines (pipes), all in your regular browser, defining a visually attractive workflow without any external software. Now that&amp;#8217;s Web 2.0!&lt;/p&gt;


	&lt;p&gt;After you create a pipe, it becomes accessible via a &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt;, and you can integrate it with your web page or, better, download the results in &lt;span class=&quot;caps&quot;&gt;XML&lt;/span&gt; (RSS) or &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; format, consuming whatever data your pipe produces! For &lt;strong&gt;free&lt;/strong&gt;! Now my mind starts thinking evil&amp;#8230; :-)&lt;/p&gt;


	&lt;h2&gt;Requirements and motivation&lt;/h2&gt;

	&lt;p&gt;First you need a valid Yahoo account :-). What do we want to achieve? Imagine you want to build a feed aggregator (maybe to build a planet site). You need to parse, collect and process all members&amp;#8217; feeds, and watch for errors, duplicates&amp;#8230; You have to know how to handle all the &lt;span class=&quot;caps&quot;&gt;RSS&lt;/span&gt; and &lt;span class=&quot;caps&quot;&gt;ATOM&lt;/span&gt; specifications and find the &lt;a href=&quot;http://code.google.com/p/feed-normalizer/&quot;&gt;common denominator&lt;/a&gt; of these formats. Worse than that, you will need the &lt;span class=&quot;caps&quot;&gt;CPU&lt;/span&gt; and network power to cope with more users and more posts.&lt;/p&gt;


	&lt;p&gt;You can use one of the gazillion libraries to do this job and waste your own &lt;span class=&quot;caps&quot;&gt;CPU&lt;/span&gt; cycles and network bandwidth&amp;#8230; or you can do it with Yahoo Pipes :-)&lt;/p&gt;


	&lt;p&gt;To achieve that, we need to give Yahoo a list of the blogs we want to aggregate. One way of doing that is to create a &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt; file that Yahoo can access every time it runs your pipe. The &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt; can have just one column with the &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; of each blog (it does not have to be the actual feed &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt;, more on that later):&lt;/p&gt;


&lt;pre&gt;
&quot;URL&quot; 
&quot;http://www.digg.com&quot; 
&quot;http://www.slashdot.org&quot; 
&quot;http://www.osnews.com&quot; 
&lt;/pre&gt;
	&lt;p&gt;Make this file (blogs.csv) accessible on your web server, and we can start to build our pipe!&lt;/p&gt;
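
	&lt;p&gt;If your blog list lives somewhere else (a database, a text file), a file in this format can be generated instead of written by hand. The following is only a minimal Python sketch: the function name write_blog_list is made up for illustration, and it assumes that quoting every field, as in the sample, is what you want.&lt;/p&gt;

&lt;pre&gt;
```python
import csv
import io


def write_blog_list(urls, out):
    """Write a one-column CSV with a "URL" header row, quoting every field."""
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(["URL"])  # header row, later usable as the column name
    for url in urls:
        writer.writerow([url])


if __name__ == "__main__":
    buffer = io.StringIO()
    write_blog_list(
        ["http://www.digg.com", "http://www.slashdot.org", "http://www.osnews.com"],
        buffer,
    )
    print(buffer.getvalue(), end="")
```
&lt;/pre&gt;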


	&lt;h2&gt;Building the pipe&lt;/h2&gt;

	&lt;p&gt;Now we can get our hands dirty!&lt;/p&gt;


	&lt;h3&gt;Fetch &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt;&lt;/h3&gt;

	&lt;p&gt;Start by creating a new pipe. The first element we need is the &amp;#8220;Source &amp;rarr; Fetch &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt;&amp;#8221; component. Drag it to the pipe edit pane. In the &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; field, put the address of your &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt; file. You can accept the default option to use the first row as the column names. Yahoo uses them later when you need to select which columns you want to extract. The component should look like this:&lt;/p&gt;


&lt;form mt:asset-id=&quot;43&quot; class=&quot;mt-enclosure mt-enclosure-image&quot;&gt;&lt;img alt=&quot;YahooPipe1.png&quot; src=&quot;http://blog.0x82.com/Picture%201.png&quot; width=&quot;332&quot; height=&quot;189&quot; class=&quot;mt-image-center&quot; style=&quot;text-align: center; display: block; margin: 0 auto 20px;&quot;/&gt;&lt;/form&gt;
	&lt;h3&gt;Fetch Site Feed (Loop)&lt;/h3&gt;

	&lt;p&gt;Now, &lt;strong&gt;for each&lt;/strong&gt; blog in our &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt; file, you want to fetch the &lt;span class=&quot;caps&quot;&gt;RSS&lt;/span&gt; or &lt;span class=&quot;caps&quot;&gt;ATOM&lt;/span&gt; feed of the site. Remember that I told you that you didn&amp;#8217;t have to know the exact &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; of the feed? That&amp;#8217;s right, Yahoo uses the standards to find the correct feed by looking at the blog content, much like the way Firefox shows that a website has a feed available to subscribe to.&lt;/p&gt;
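
	&lt;p&gt;To give an idea of what this kind of feed discovery involves, here is a small Python sketch. It is not Yahoo&amp;#8217;s implementation, which I have not seen; it only assumes the common convention where a page advertises its feeds through link elements with rel set to alternate and an RSS or Atom media type. The names FeedLinkFinder and discover_feeds are invented for this example.&lt;/p&gt;

&lt;pre&gt;
```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Media types that conventionally mark a link element as a feed.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect href values of link elements that advertise an RSS or Atom feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").strip().lower()
        href = attributes.get("href")
        if "alternate" in rel and mime in FEED_TYPES and href:
            self.feeds.append(href)


def discover_feeds(page_html, base_url):
    """Return absolute feed URLs advertised by the page, in document order."""
    finder = FeedLinkFinder()
    finder.feed(page_html)
    return [urljoin(base_url, href) for href in finder.feeds]
```
&lt;/pre&gt;

	&lt;p&gt;Calling discover_feeds with the downloaded page and the blog address returns the candidate feed URLs; relative links are resolved against the blog address.&lt;/p&gt;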


	&lt;p&gt;For this you&amp;#8217;ll want the module &amp;#8220;Source &amp;rarr; Fetch Site Feed&amp;#8221; that knows how to do just that. But wait, don&amp;#8217;t we have a &lt;strong&gt;list&lt;/strong&gt; of blogs to parse? That&amp;#8217;s right, we need to apply the &amp;#8220;Fetch Site Feed&amp;#8221; to &lt;strong&gt;all&lt;/strong&gt; blogs. To do that, we use the &amp;#8220;Operators &amp;rarr; Loop&amp;#8221; component!&lt;/p&gt;


	&lt;p&gt;So we drag a &amp;#8220;Loop&amp;#8221; module and put a &amp;#8220;Fetch Site Feed&amp;#8221; inside it. Now, we need to configure these components. Yahoo can do much of the configuration by itself when you connect the blocks. So just connect the two components, creating a pipe! The loop and the site feed will then react, letting you choose only from what the previous module produces. It is a kind of &amp;#8220;strongly typed pipe&amp;#8221; :-)&lt;/p&gt;


	&lt;p&gt;Basically you have to configure:&lt;/p&gt;


&lt;ul&gt;
	&lt;li&gt;The &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; on the &amp;#8220;Fetch Site Feed&amp;#8221; should be &amp;#8220;item.URL&amp;#8221; (i.e., the &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; column from the &lt;span class=&quot;caps&quot;&gt;CSV&lt;/span&gt; file).&lt;/li&gt;
	&lt;li&gt;The &amp;#8220;emit all&amp;#8221; option should be selected on the &amp;#8220;Loop&amp;#8221; module, because we are interested in all the results.&lt;/li&gt;
&lt;/ul&gt;

	&lt;p&gt;The end result should be this:&lt;/p&gt;


&lt;form mt:asset-id=&quot;44&quot; class=&quot;mt-enclosure mt-enclosure-image&quot;&gt;&lt;img alt=&quot;YahooPipes2.png&quot; src=&quot;http://blog.0x82.com/Picture%202.png&quot; width=&quot;433&quot; height=&quot;242&quot; class=&quot;mt-image-center&quot; style=&quot;text-align: center; display: block; margin: 0 auto 20px;&quot;/&gt;&lt;/form&gt;
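	&lt;p&gt;If you think in code rather than in boxes, the Loop with &amp;#8220;emit all&amp;#8221; behaves roughly like a loop that flattens every feed into one list. This Python sketch is only an analogy: loop_emit_all is a hypothetical name, and fetch_site_feed stands for whatever function returns the items of one site.&lt;/p&gt;

&lt;pre&gt;
```python
def loop_emit_all(rows, fetch_site_feed):
    """Run fetch_site_feed on each row's URL and emit every resulting item."""
    items = []
    for row in rows:
        # row["URL"] plays the role of item.URL in the pipe editor
        items.extend(fetch_site_feed(row["URL"]))
    return items
```
&lt;/pre&gt;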
	&lt;h3&gt;| sort | uniq &gt; output&lt;/h3&gt;

	&lt;p&gt;Now we have our feed items almost ready! However, we should process them in order to output a more &amp;#8220;user friendly&amp;#8221; result. For me, this includes sorting the feed items by publication date (newest first) and removing duplicates (who knows&amp;#8230;).&lt;/p&gt;


	&lt;p&gt;This couldn&amp;#8217;t be simpler with Yahoo Pipes. Just drag an &amp;#8220;Operators &amp;rarr; Sort&amp;#8221; module and connect it to the previous &amp;#8220;Loop&amp;#8221; module. Tell it to sort by &amp;#8220;item.pubDate&amp;#8221; in descending order (remember to connect the module &lt;strong&gt;before&lt;/strong&gt; selecting &amp;#8220;item.pubDate&amp;#8221;).&lt;/p&gt;


	&lt;p&gt;Then you can drag and connect an &amp;#8220;Operators &amp;rarr; Unique&amp;#8221; component and tell it to filter out items with a duplicated &amp;#8220;item.link&amp;#8221;! The result should be like this figure:&lt;/p&gt;


&lt;form mt:asset-id=&quot;45&quot; class=&quot;mt-enclosure mt-enclosure-image&quot;&gt;&lt;img alt=&quot;YahooPipes3.png&quot; src=&quot;http://blog.0x82.com/Picture%203.png&quot; width=&quot;425&quot; height=&quot;206&quot; class=&quot;mt-image-center&quot; style=&quot;text-align: center; display: block; margin: 0 auto 20px;&quot;/&gt;&lt;/form&gt;
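	&lt;p&gt;For comparison, this is roughly what the Sort and Unique steps would look like if you did them yourself. It is a Python sketch under a few assumptions: items are dictionaries with pubDate and link keys, pubDate values can be compared directly (datetime objects, or ISO 8601 strings in the same time zone), and the function names are mine.&lt;/p&gt;

&lt;pre&gt;
```python
def sort_newest_first(items):
    """Return the items ordered by pubDate, newest first."""
    return sorted(items, key=lambda item: item["pubDate"], reverse=True)


def unique_by_link(items):
    """Keep the first item seen for each link, preserving order."""
    seen = set()
    kept = []
    for item in items:
        if item["link"] not in seen:
            seen.add(item["link"])
            kept.append(item)
    return kept
```
&lt;/pre&gt;

	&lt;p&gt;Chaining them as unique_by_link(sort_newest_first(items)) mirrors the order of the modules in the pipe.&lt;/p&gt;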
	&lt;h3&gt;Connect to output and have fun&lt;/h3&gt;

	&lt;p&gt;The final step is to connect the output of the &amp;#8220;Unique&amp;#8221; component to the &amp;#8220;Pipe Output&amp;#8221;. You&amp;#8217;re done :-) Now you can play with the debugger and see the output of your pipe. You can click on any of your components and get the intermediate results too!&lt;/p&gt;


	&lt;p&gt;Now that you&amp;#8217;ve tested your pipe, save it and click &amp;#8220;Run Pipe&amp;#8221; at the top; Yahoo redirects you to a page where you can see the results of running your pipe :-) From there, you can publish (make the pipe public), edit, delete or clone it, add the results to your Google account, see a bunch of statistics about your pipe, etc.&lt;/p&gt;


	&lt;p&gt;For me, the best feature is under the &amp;#8220;More Options&amp;#8221; menu. You can download the results of your pipe as &lt;span class=&quot;caps&quot;&gt;XML&lt;/span&gt; or (even better) as &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt;! You can then use this in your favorite application to do&amp;#8230; well&amp;#8230; whatever you want :-) Using the pipe this way follows the &amp;#8220;pull model&amp;#8221;, because you poll the pipe every time you want results.&lt;/p&gt;
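
	&lt;p&gt;Polling the pipe from your own code could look something like the Python sketch below. Treat the details as assumptions: it expects the JSON output to be an object whose value field holds an items list, and json_url is whatever JSON address the &amp;#8220;More Options&amp;#8221; menu gives you for your pipe. The function names are only illustrative.&lt;/p&gt;

&lt;pre&gt;
```python
import json
from urllib.request import urlopen


def items_from_pipe_json(text):
    """Extract the list of items from a pipe's JSON output (assumed layout)."""
    payload = json.loads(text)
    # Assumption: items live under payload["value"]["items"].
    return payload.get("value", {}).get("items", [])


def poll_pipe(json_url):
    """Download the pipe's JSON output and return its items (pull model)."""
    with urlopen(json_url) as response:
        return items_from_pipe_json(response.read().decode("utf-8"))
```
&lt;/pre&gt;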


	&lt;p&gt;The other option is to use the &amp;#8220;Operators &amp;rarr; Web Service&amp;#8221; component, which pushes the pipe results to your application &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt;, so you don&amp;#8217;t have to poll Yahoo anymore&amp;#8230; (well, you still have to run the pipe by invoking its &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt;, so&amp;#8230;).&lt;/p&gt;


	&lt;h2&gt;Conclusion and final thoughts&lt;/h2&gt;

	&lt;p&gt;I&amp;#8217;ve just scratched the surface of &amp;#8220;Yahoo Pipes&amp;#8221;. There are tons of other features that allow you to build a pipe with whatever complexity you want. But we can already raise some questions:&lt;/p&gt;


&lt;ul&gt;
	&lt;li&gt;How much easier could it be?&lt;/li&gt;
	&lt;li&gt;Does it scale? (imagine you have 1000 blogs to aggregate&amp;#8230;)&lt;/li&gt;
	&lt;li&gt;Will it always be free (regardless of the number of times you run the pipe)?&lt;/li&gt;
	&lt;li&gt;Are they using &lt;a href=&quot;http://www.plagger.org&quot;&gt;plagger&lt;/a&gt;? :-)&lt;/li&gt;
	&lt;li&gt;Is there any technical information about the Yahoo implementation of this service?&lt;/li&gt;
	&lt;li&gt;Is Yahoo hiring in Europe? This is a seriously interesting project&amp;#8230;&lt;/li&gt;
&lt;/ul&gt;

	&lt;p&gt;Questions or corrections are welcome :-) Start playing with your pipes today!&lt;/p&gt;</body>
  <excerpt>&lt;p&gt;Today I discovered &lt;a href=&quot;http://pipes.yahoo.com&quot;&gt;Yahoo Pipes&lt;/a&gt; after reading &lt;a href=&quot;http://blog.karlus.net/archives/2007/12/10/1833&quot;&gt;this post&lt;/a&gt;. A friend of mine told me the service is already one year old. It seems I&amp;#8217;m a little bit distracted.&lt;/p&gt;


	&lt;p&gt;First, I have to say this: &lt;span class=&quot;caps&quot;&gt;IMHO&lt;/span&gt;, this wins the &amp;#8220;best web service of the year&amp;#8221; award, and by a large margin! Google has some serious competition with this kind of service&amp;#8230;&lt;/p&gt;


	&lt;p&gt;So in this post I decided to give a brief introduction to this Yahoo service and demonstrate how easily you can build a dynamic feed aggregator using Yahoo Pipes!&lt;/p&gt;</excerpt>
  <id type="integer">75</id>
  <permalink>building-a-dynamic-feed-aggregator-with-yahoo-pipes</permalink>
  <published-at type="datetime">2007-12-12T11:54:47-08:00</published-at>
  <title>Building a dynamic feed aggregator with Yahoo Pipes</title>
</post>
