Public Notes on histre
Node.js or Ruby for Scraping - Stack Overflow
stackoverflow.com
"When you say that mechanize can't scrape dynamic content, you really mean that it's a little bit more work to figure out which ajax requests need to be made and make them. The other side of that is that once you do you generally get a nice json response that's easy to deal with. Mechanize is also much faster than a full browser solution so my opinion is that it's usually worth the extra work.
As far as Node goes, there's potential and maybe once it's been around for a while some great libraries will become available, but I haven't seen anything yet that would make up for the ruby things I would miss."
##dynamic_screen_scraping #project #mechanize #ruby #ajax #javascript #screen_scraping #pub
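A minimal sketch of the approach the answer describes: skip the browser, call the AJAX endpoint directly with Mechanize, and parse the JSON that comes back. The URL, the `X-Requested-With` header, and the `items`/`title` response shape are all assumptions for illustration; the real endpoint and payload have to be read off the browser's network tab for the site in question.

```ruby
require "json"

# Hypothetical endpoint; the real one comes from watching the page's XHR traffic.
AJAX_URL = "https://example.com/api/items?page=1"

# Parse a JSON body and pull out item titles.
# The {"items" => [{"title" => ...}]} shape is assumed, not taken from any real site.
def extract_titles(json_body)
  data = JSON.parse(json_body)
  data.fetch("items", []).map { |item| item["title"] }
end

# Fetch the endpoint with Mechanize, sending the header many sites
# expect on AJAX requests, then hand the raw body to the parser.
def fetch_titles(url = AJAX_URL)
  require "mechanize" # loaded lazily so extract_titles works without the gem
  agent = Mechanize.new
  response = agent.get(url, [], nil, { "X-Requested-With" => "XMLHttpRequest" })
  extract_titles(response.body)
end
```

Calling `fetch_titles` needs the `mechanize` gem and a reachable endpoint; `extract_titles` can be tried on a saved response body first to check the assumed JSON shape.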
jnicklas/capybara · GitHub
github.com
"Normally Capybara expects to be testing an in-process Rack application, but you can also use it to talk to a web server running anywhere on the internets, by setting app_host:"
_To reiterate: with app_host set, Capybara no longer expects to be testing URLs inside the current Ruby Rack app._
##dynamic_screen_scraping #capybara #automation #pub
_"Using the DSL elsewhere": "You can mix the DSL into any context by including Capybara::DSL…This enables its use in unsupported testing frameworks, and for general-purpose scripting."_ #capybara ##dynamic_screen_scraping #screen_scraping #user_testing #pub
Does the Twitter feed really bottom out around 800 tweets? That's nothing. Wtf...
#twitter #APIs #hacking #scraping #rare #open_source #bookmarked_on_site #C# #windows #apps #screen_scraping #pub