<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments for Shut up and code</title>
	<atom:link href="http://lepunk.co.uk/comments/feed/" rel="self" type="application/rss+xml" />
	<link>http://lepunk.co.uk</link>
	<description>lePunk&#039;s thoughts about life, universe and everything</description>
	<lastBuildDate>Sun, 21 Apr 2013 16:15:10 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5</generator>
	<item>
		<title>Comment on Scraping the sh*t out of the interwebz &#8211; Part #1 by Scraping the sh*t out of the interwebz – Part #2 &#124; Shut up and code</title>
		<link>http://lepunk.co.uk/scraping-the-sht-out-of-the-interwebz-part-1/#comment-1690</link>
		<dc:creator>Scraping the sh*t out of the interwebz – Part #2 &#124; Shut up and code</dc:creator>
		<pubDate>Sun, 21 Apr 2013 16:15:10 +0000</pubDate>
		<guid isPermaLink="false">http://lepunk.co.uk/?p=83#comment-1690</guid>
		<description><![CDATA[[...] you haven&#8217;t read the first part of my php curl tutorial you should do so here. As in the previous one we will use my open source curl.class.php but this time we will do [...]]]></description>
		<content:encoded><![CDATA[<p>[...] you haven&#8217;t read the first part of my php curl tutorial you should do so here. As in the previous one we will use my open source curl.class.php but this time we will do [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scraping the sh*t out of the interwebz &#8211; Part #1 by Gary Teh</title>
		<link>http://lepunk.co.uk/scraping-the-sht-out-of-the-interwebz-part-1/#comment-247</link>
		<dc:creator>Gary Teh</dc:creator>
		<pubDate>Wed, 13 Mar 2013 15:34:58 +0000</pubDate>
		<guid isPermaLink="false">http://lepunk.co.uk/?p=83#comment-247</guid>
		<description><![CDATA[Yup, the engine allows for the processing of RSS feeds. Inappropriate use of web scraping does impose a huge bandwidth penalty on the content provider.]]></description>
		<content:encoded><![CDATA[<p>Yup, the engine allows for the processing of RSS feeds. Inappropriate use of web scraping does impose a huge bandwidth penalty on the content provider.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scraping the sh*t out of the interwebz &#8211; Part #1 by David Bradbury</title>
		<link>http://lepunk.co.uk/scraping-the-sht-out-of-the-interwebz-part-1/#comment-219</link>
		<dc:creator>David Bradbury</dc:creator>
		<pubDate>Mon, 11 Mar 2013 19:44:56 +0000</pubDate>
		<guid isPermaLink="false">http://lepunk.co.uk/?p=83#comment-219</guid>
		<description><![CDATA[Thanks for posting this. I&#039;ve been working on scraping data recently so that I can keep track of trends in certain products I use. Remember to use RSS feeds or APIs when available to be more polite and decrease processing time! In the case of eBay, they have ebay dot com/sch/rss/. You can customize the results as well, and they will be nicely organized for processing.]]></description>
		<content:encoded><![CDATA[<p>Thanks for posting this. I&#8217;ve been working on scraping data recently so that I can keep track of trends in certain products I use. Remember to use RSS feeds or APIs when available to be more polite and decrease processing time! In the case of eBay, they have ebay dot com/sch/rss/. You can customize the results as well, and they will be nicely organized for processing.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scraping the sh*t out of the interwebz &#8211; Part #1 by Gary Teh</title>
		<link>http://lepunk.co.uk/scraping-the-sht-out-of-the-interwebz-part-1/#comment-210</link>
		<dc:creator>Gary Teh</dc:creator>
		<pubDate>Mon, 11 Mar 2013 02:41:57 +0000</pubDate>
		<guid isPermaLink="false">http://lepunk.co.uk/?p=83#comment-210</guid>
		<description><![CDATA[Cool! My teammate and I decided to abstract the entire scraping layer and used this instead:

https://krake.io/scrap-gm

{
  origin_url: &#039;http://www.ebay.com/sch/?_nkw=GM%20Part&amp;_pgn=1&#039;,
  columns: [
    {
      col_name: &#039;item_name&#039;,
      dom_query: &#039;h4 a&#039;      
    }, {      
      col_name: &#039;item_detail_url&#039;,
      dom_query: &#039;h4 a&#039;,
      required_attribute: &#039;href&#039;,
      options : {
        columns: [{
            col_name: &#039;description&#039;,
            dom_query: &#039;#desc_div&#039;
          },{
            col_name: &#039;seller_name&#039;,
            dom_query: &#039;.mbg a[[0]]&#039;            
          },{
            col_name: &#039;seller_profile_url&#039;,
            dom_query: &#039;.mbg a[[0]]&#039;,
            required_attribute: &#039;href&#039;
        }]
      }
    }, {      
      col_name: &#039;item_image&#039;,
      dom_query: &#039;.img img&#039;,
      required_attribute: &#039;src&#039;  
    }
  ],
  next_page: {
    dom_query: &#039;.next&#039;
  }
};]]></description>
		<content:encoded><![CDATA[<p>Cool! My teammate and I decided to abstract the entire scraping layer and used this instead:</p>
<p><a href="https://krake.io/scrap-gm" rel="nofollow">https://krake.io/scrap-gm</a></p>
<p>{<br />
  origin_url: 'http://www.ebay.com/sch/?_nkw=GM%20Part&amp;_pgn=1',<br />
  columns: [<br />
    {<br />
      col_name: 'item_name',<br />
      dom_query: 'h4 a'<br />
    }, {<br />
      col_name: 'item_detail_url',<br />
      dom_query: 'h4 a',<br />
      required_attribute: 'href',<br />
      options : {<br />
        columns: [{<br />
            col_name: 'description',<br />
            dom_query: '#desc_div'<br />
          },{<br />
            col_name: 'seller_name',<br />
            dom_query: '.mbg a[[0]]'<br />
          },{<br />
            col_name: 'seller_profile_url',<br />
            dom_query: '.mbg a[[0]]',<br />
            required_attribute: 'href'<br />
        }]<br />
      }<br />
    }, {<br />
      col_name: 'item_image',<br />
      dom_query: '.img img',<br />
      required_attribute: 'src'<br />
    }<br />
  ],<br />
  next_page: {<br />
    dom_query: '.next'<br />
  }<br />
};</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scraping the sh*t out of the interwebz &#8211; Part #1 by dibbz</title>
		<link>http://lepunk.co.uk/scraping-the-sht-out-of-the-interwebz-part-1/#comment-209</link>
		<dc:creator>dibbz</dc:creator>
		<pubDate>Sun, 10 Mar 2013 21:24:26 +0000</pubDate>
		<guid isPermaLink="false">http://lepunk.co.uk/?p=83#comment-209</guid>
		<description><![CDATA[I prefer to use this class (below); it&#039;s clean, lean code. I use curl instead of the built-in loader because it is more resilient and I also like/need to follow redirects.
http://simplehtmldom.sourceforge.net/index.htm

Badly structured html? Look at Tidy.
http://www.php.net/manual/en/intro.tidy.php]]></description>
		<content:encoded><![CDATA[<p>I prefer to use this class (below); it&#8217;s clean, lean code. I use curl instead of the built-in loader because it is more resilient and I also like/need to follow redirects.<br />
<a href="http://simplehtmldom.sourceforge.net/index.htm" rel="nofollow">http://simplehtmldom.sourceforge.net/index.htm</a></p>
<p>Badly structured html? Look at Tidy.<br />
<a href="http://www.php.net/manual/en/intro.tidy.php" rel="nofollow">http://www.php.net/manual/en/intro.tidy.php</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scraping the sh*t out of the interwebz &#8211; Part #1 by lePunk</title>
		<link>http://lepunk.co.uk/scraping-the-sht-out-of-the-interwebz-part-1/#comment-208</link>
		<dc:creator>lePunk</dc:creator>
		<pubDate>Sun, 10 Mar 2013 20:46:42 +0000</pubDate>
		<guid isPermaLink="false">http://lepunk.co.uk/?p=83#comment-208</guid>
		<description><![CDATA[Yeah, good suggestion. However, when I tried phpQuery I found that it fails on badly structured HTML.]]></description>
		<content:encoded><![CDATA[<p>Yeah, good suggestion. However, when I tried phpQuery I found that it fails on badly structured HTML.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scraping the sh*t out of the interwebz &#8211; Part #1 by Phillip</title>
		<link>http://lepunk.co.uk/scraping-the-sht-out-of-the-interwebz-part-1/#comment-207</link>
		<dc:creator>Phillip</dc:creator>
		<pubDate>Sun, 10 Mar 2013 20:36:17 +0000</pubDate>
		<guid isPermaLink="false">http://lepunk.co.uk/?p=83#comment-207</guid>
		<description><![CDATA[If you&#039;re looking to grab something oddly specific, phpQuery can make it easier than DOMDocument can; it lets you use jQuery-style selectors to grab elements. Because they&#039;re jQuery-based, you can easily use the console to build them out too. Just something to keep in mind if you want the 4th header containing the word &quot;Trunip&quot; that has the class &quot;farkle&quot;.]]></description>
		<content:encoded><![CDATA[<p>If you&#8217;re looking to grab something oddly specific, phpQuery can make it easier than DOMDocument can; it lets you use jQuery-style selectors to grab elements. Because they&#8217;re jQuery-based, you can easily use the console to build them out too. Just something to keep in mind if you want the 4th header containing the word &#8220;Trunip&#8221; that has the class &#8220;farkle&#8221;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scraping the sh*t out of the interwebz &#8211; Part #1 by lePunk</title>
		<link>http://lepunk.co.uk/scraping-the-sht-out-of-the-interwebz-part-1/#comment-205</link>
		<dc:creator>lePunk</dc:creator>
		<pubDate>Sun, 10 Mar 2013 16:57:07 +0000</pubDate>
		<guid isPermaLink="false">http://lepunk.co.uk/?p=83#comment-205</guid>
		<description><![CDATA[Yes Emil, you are absolutely right. I will make sure to cover this topic in one of the following tutorials.]]></description>
		<content:encoded><![CDATA[<p>Yes Emil, you are absolutely right. I will make sure to cover this topic in one of the following tutorials.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scraping the sh*t out of the interwebz &#8211; Part #1 by Emil Vikström</title>
		<link>http://lepunk.co.uk/scraping-the-sht-out-of-the-interwebz-part-1/#comment-203</link>
		<dc:creator>Emil Vikström</dc:creator>
		<pubDate>Sun, 10 Mar 2013 16:42:56 +0000</pubDate>
		<guid isPermaLink="false">http://lepunk.co.uk/?p=83#comment-203</guid>
		<description><![CDATA[This lacks a discussion of robots.txt and user agents, stuff that all web crawlers/scrapers should adhere to. In short: Make sure your user-agent is set to something sensible, preferably with some contact information, and make sure you follow the rules set up by the target site&#039;s robots.txt file.]]></description>
		<content:encoded><![CDATA[<p>This lacks a discussion of robots.txt and user agents, stuff that all web crawlers/scrapers should adhere to. In short: Make sure your user-agent is set to something sensible, preferably with some contact information, and make sure you follow the rules set up by the target site&#8217;s robots.txt file.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Tech support tips for developers by HK</title>
		<link>http://lepunk.co.uk/tech-support-tips-for-developers/#comment-121</link>
		<dc:creator>HK</dc:creator>
		<pubDate>Wed, 27 Feb 2013 11:50:18 +0000</pubDate>
		<guid isPermaLink="false">http://lepunk.co.uk/?p=68#comment-121</guid>
		<description><![CDATA[I&#039;m perfectly calm, thanks. Perhaps my faith in humanity is slightly dampened knowing that a cretin like you exists, but at least people like you make the rest of us look better.

So all you have to say is: &quot;You don’t have the foggiest idea what you’re talking about&quot;? ...Which is just what I said to you repeated back to me?! Your intellect astounds me. 
Basically, you have nothing more to say on the subject matter than &quot;OMG, this is so wrong, everyone is a sociopath!!!1111&quot; with ZERO to back it up. There is absolutely no point in wasting any time on you.

Tip for the future: Don&#039;t just disagree with posts for the sake of it. IT DOES NOT MAKE YOU LOOK COOL.

Good luck finding a company stupid enough to employ you, especially as you&#039;ve &quot;forgotten&quot; about management and development. Bahaha!]]></description>
		<content:encoded><![CDATA[<p>I&#8217;m perfectly calm, thanks. Perhaps my faith in humanity is slightly dampened knowing that a cretin like you exists, but at least people like you make the rest of us look better.</p>
<p>So all you have to say is: &#8220;You don’t have the foggiest idea what you’re talking about&#8221;? &#8230;Which is just what I said to you repeated back to me?! Your intellect astounds me.<br />
Basically, you have nothing more to say on the subject matter than &#8220;OMG, this is so wrong, everyone is a sociopath!!!1111&#8221; with ZERO to back it up. There is absolutely no point in wasting any time on you.</p>
<p>Tip for the future: Don&#8217;t just disagree with posts for the sake of it. IT DOES NOT MAKE YOU LOOK COOL.</p>
<p>Good luck finding a company stupid enough to employ you, especially as you&#8217;ve &#8220;forgotten&#8221; about management and development. Bahaha!</p>
]]></content:encoded>
	</item>
</channel>
</rss>
