Public Notes by chase_ats Tagged #scraping

Claims to be whitehat, and on the surface that is what we see, but this blog is filled with not-so-whitehat gold (for a non-blackhat Ruby programmer) #whitehat #masquerading #blackhat #automation #screen #scraping #testing #automated #generate #data #blog #watir #pub
It's Selenium! #selenium #automation #web #testing #masquerading #blackhat #whitehat #server #api #scraping #screen #data #pub
This site is just scraping http://800notes.com, which always seems to be at the top of Google rankings for phone number searches #blackhat #automation #scraping #steal #content #web #2.0 #user #generated #pub
"Returns the most important pieces of content on a web page. Finds the best block of text, image and title by analysing the page content." #readability #parse #scraping #pub
"40+ channels on various subjects and latest trends. Top ranked stories from trusted sources based on popularity and quality. Sign in with twitter and choose your channels." "In-A-Gist algorithmically curates tweets based on popularity in real-time. We collate tweets on the same topic and this page is built from such curated tweets. We keep refreshing this page as and when we find popular tweets on topics mentioned in the tweet. They are presented in the 'Related Tweets' section." #twitter #API #scraping #idea #pub
From Skype HN chat #bootstrap_layout #coupons #deals #aggregation #automation #scraping #idea #copy #pub
"Retrieve full-text mobile-optimized JSON-encoded articles from a RSS feed" _It uses the Streamified.me API. Not sure how much of the app is just a wrapper of that and how much is actually legit coding_ #scraping #closed_source #simple #RSS #feeds #web_tools #pub
"A feed fetching and parsing library that treats the internet like Godzilla treats Japan: it dominates and eats all." #ruby #open_source #scraping #feeds #automation #pub
"Feedbag is a Ruby library for the auto-discovery of syndicated feeds (RSS/Atom)." _Give it a URL and it'll try to find the feed for the site_ #gems #ruby #open_source #parsing #scraping #automation #feeds #pub
"information from the web into useable data Turn any website into a table of data or an API in minutes without writing any code" #SaaS #scraping #pub

"Websites are full of useful data. Extracting that data is difficult. Your web browser doesn't help. Today, you extract data by writing code. We provide tools to make extraction simple. We're in Developer Preview. We love feedback." #data_mining #web #tools #desktop_apps #beta #@to_investigate #pub
crawlera.com #SaaS #paid #scraping #pub
Posting ESPN Insider articles on a Google AdSense-ridden site with almost no design. And the scraping is badly done too. But still, definitely something people want. Found it via a one-time poster, likely the site owner, who posted a link to it in a related forum discussion in early 2013. Site is still up as of late 2014. -- Bottom says: "Powered by Howtoknow.us" #content_scraping #scraping #ideas #copy #SEO #bluehat #sports #fantasy_sports #paywalls #pub