Public Notes on histre
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation. - apify/crawlee
#nodejs #crawler #scraper #headless-browser #framework
Twitter image and video crawler; one-click download. Contribute to caolvchong-top/twitter_download development by creating an account on GitHub.
#twitter #crawler #python
Home - Firecrawl
www.firecrawl.dev
Firecrawl crawls and converts any website into clean markdown.
#api #crawler #markdown #ai #llm #readability
A social networking service scraper in Python. Contribute to JustAnotherArchivist/snscrape development by creating an account on GitHub.
#osint #crawler #twitter #telegram #python #scraper
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API. - mendableai/firecrawl
#crawler #api #ai #markdown #service
Incredibly fast crawler designed for OSINT. Contribute to s0md3v/Photon development by creating an account on GitHub.
#crawler #python #osint #archive
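Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, and Weibo posts and comments - NanmiCoder/MediaCrawler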
#crawler #bilibili #xiaohongshu #python
Web Scraping in Python – The Complete Guide | Hacker News
news.ycombinator.com
Web Scraping Proxies API for Developers
proxiesapi.com
Web Scraping in Python - The Complete Guide | ProxiesAPI
proxiesapi.com
Web crawling framework based on asyncio. Contribute to gaojiuli/gain development by creating an account on GitHub.
#python #crawler #asyncio
Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / Cloudflare IUAM) - ultrafunkamsterdam/undetected-chromedriver
#headless #chrome #crawler #anti-crawler
Extract web data on big scale.
scrapeninja.net
Scrape and Monitor Data from Any Website with No Code
www.browse.ai
Telegram groups - Telegram group search - TgSql.com
www.tgsql.com
阅读 / Legado (io.legado.app.release) - 3.21.080316 - Apps - Coolapk
www.coolapk.com
HTTrack Website Copier - Free Software Offline Browser (GNU GPL)
www.httrack.com
HTTrack is a free (GPL, libre/free software) and easy-to-use offline browser utility. It allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site's relative link-structure. Simply open a page of the 'mirrored' website in your browser, and you can browse the site from link to link, as if you were viewing it online....
#website #archiving #download #crawler #pub
Teleport -- Offline Browsing Webspider
www.tenmax.com
Teleport Pro: The world's most widely used webspider. Fast, reliable, robust, comprehensive webspidering, Teleport Pro by Tennyson Maxwell Information Systems, Inc.
#website #archiving #crawler #pub