Public Notes on histre
Node.js or Ruby for Scraping - Stack Overflow
stackoverflow.com
"When you say that mechanize can't scrape dynamic content, you really mean that it's a little bit more work to figure out which ajax requests need to be made and make them. The other side of that is that once you do you generally get a nice json response that's easy to deal with. Mechanize is also much faster than a full browser solution so my opinion is that it's usually worth the extra work.
As far as Node goes, there's potential and maybe once it's been around for a while some great libraries will become available, but I haven't seen anything yet that would make up for the ruby things I would miss."
##dynamic_screen_scraping #project #mechanize #ruby #ajax #javascript #screen_scraping #pub
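A minimal sketch of the approach the answer describes: skip the browser, request the ajax endpoint directly, and work with the JSON it returns. This uses only Ruby's standard library (Net::HTTP and JSON) rather than Mechanize itself. The method names and the `items`/`title` JSON shape are hypothetical; the real endpoint URL and field names have to be read from the browser's network inspector.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Fetch the raw body of the endpoint that the page's JavaScript would call.
# Hypothetical helper: the URL comes from inspecting the page's network traffic.
def fetch_ajax_json(url)
  uri = URI(url)
  request = Net::HTTP::Get.new(uri)
  # Some servers only answer ajax endpoints when these headers are present.
  request['X-Requested-With'] = 'XMLHttpRequest'
  request['Accept'] = 'application/json'
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  response.body
end

# Pull the fields of interest out of the JSON body.
# Assumed shape: {"items": [{"title": "..."}, ...]}
def extract_titles(json_body)
  JSON.parse(json_body).fetch('items', []).map { |item| item['title'] }
end
```

Usage would look like `extract_titles(fetch_ajax_json(endpoint_url))`, where `endpoint_url` is the request found in the inspector. With Mechanize, the fetch step could instead be `Mechanize.new.get(endpoint_url).body`, which also carries cookies across requests.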
ruby on rails - Is it possible to run capybara-webkit (i.e. forked webkit_server) on Heroku Cedar? - Stack Overflow
stackoverflow.com
"This project has a JavaScript, rather than Ruby API, although the browser instance can expose a web-server, allowing you to communicate with it from Ruby over HTTP."
##dynamic_screen_scraping #phantomjs #ruby #javascript #project #pub
jonleighton/poltergeist · GitHub
github.com
"Poltergeist is a driver for Capybara. It allows you to run your Capybara tests on a headless WebKit browser, provided by PhantomJS."
##dynamic_screen_scraping #ruby #phantomjs #project #capybara #pub
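A configuration sketch for wiring Poltergeist into Capybara, assuming the `poltergeist` gem and a PhantomJS binary are installed. The `js_errors: false` option is an illustrative choice for scraping third-party pages, not something the note above prescribes.

```ruby
require 'capybara'
require 'capybara/poltergeist'

# Register a Poltergeist-backed driver. js_errors: false keeps JavaScript
# errors on the target page from being raised as Ruby exceptions.
Capybara.register_driver :poltergeist do |app|
  Capybara::Poltergeist::Driver.new(app, js_errors: false)
end

# Use the headless PhantomJS browser for JavaScript-dependent pages.
Capybara.javascript_driver = :poltergeist
```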
mattheworiordan/capybara-screenshot · GitHub
github.com
"Using this gem, whenever a Capybara test in Cucumber, Rspec or Minitest fails, the HTML for the failed page and a screenshot (when using capybara-webkit, Selenium or poltergeist) is saved into $APPLICATION_ROOT/tmp/capybara. This is a huge help when trying to diagnose a problem in your failing steps as you can view the source code and potentially how the page looked at the time of the failure." _Can also have it run manually on demand_
##dynamic_screen_scraping #screenshots #project #capybara #phantomjs #ruby #javascript #testing #@to_do #pub
"Poltergeist the Capybara driver for PhantomJS nearly has what you're asking but not quite. If you're willing to fill the gap yourself you can try and hack the render binding to pass the coordinates of the element you're interested in to the setClipRect() setter which dictates the area to be captured."
_The answer talks about hacking the Poltergeist JavaScript code that exposes PhantomJS, in order to add CasperJS features._
##dynamic_screen_scraping #phantomjs #casperjs #ruby #capybara #hacks #pub
jnicklas/capybara · GitHub
github.com
"Normally Capybara expects to be testing an in-process Rack application, but you can also use it to talk to a web server running anywhere on the internets, by setting app_host:"
_To reiterate: with app_host set, Capybara no longer expects to be testing URLs within the current Ruby Rack app._
##dynamic_screen_scraping #capybara #automation #pub
_"Using the DSL elsewhere": "You can mix the DSL into any context by including Capybara::DSL… This enables its use in unsupported testing frameworks, and for general-purpose scripting."_ #capybara ##dynamic_screen_scraping #screen_scraping #user_testing #pub
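The two Capybara notes above (app_host and Capybara::DSL) can be combined into a small scraping script. This is a sketch under stated assumptions: it uses the Selenium driver, which needs the `selenium-webdriver` gem and a local browser, because the default `:rack_test` driver does not run against a remote server. The host, the `PageScraper` class, and the `h1` selector are placeholders.

```ruby
require 'capybara'
require 'capybara/dsl'

# Assumption: a remote-capable driver is available (Selenium here).
Capybara.current_driver = :selenium
Capybara.run_server = false               # no in-process Rack app to boot
Capybara.app_host = 'http://example.com'  # placeholder remote host

# Hypothetical scraper: including Capybara::DSL gives the class
# visit, find, and the rest of the DSL outside any test framework.
class PageScraper
  include Capybara::DSL

  # Visit the site root (resolved against app_host) and return the
  # text of the first h1 element.
  def heading_text
    visit('/')
    find('h1').text
  end
end

puts PageScraper.new.heading_text
```

Mixing the DSL into a dedicated class, rather than into the top-level scope, keeps the Capybara methods from leaking into every object in the script.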