Public Notes by chase_ats Tagged #open_data

commoncrawl.org
"non-profit foundation dedicated to providing an open repository of web crawl data that can be accessed and analyzed by everyone" #open_source #open_data #crawlers #Google #spiders #Data #text #web_index #pub
urlsearch.commoncrawl.org
"Enter a domain to find the location of files in the corpus that have pages from that URL. The output will be an alphabetically ordered list and a JSON file that can be downloaded" #open_data #web_index #crawlers #search_engines #Data #open_source #bootstrap_layout #pub
#!TO_TAG_for_real #startups #data #databases #open_data #paid #freemium #APIs #dev #SEO #automated #work #schemes #pub