xin / crawler
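A simple API for collecting articles from multiple platforms. Currently supported: WeChat Official Accounts, CSDN, Cnblogs, Jianshu, People's Daily Online, QQ News, Baijiahao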
v2.0.0
2022-05-10 10:32 UTC
Requires
- ext-curl: *
- ext-dom: *
- ext-json: *
- ext-libxml: *
- ext-simplexml: *
- guzzlehttp/guzzle: ^6.3 | ^7.0
- jaeger/querylist: ^4.1
- xin/capsule: ^1.0
- xin/support: ^1.0
This package is auto-updated.
Last update: 2024-11-17 13:57:40 UTC
README
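Introduction
A simple API for collecting articles from multiple platforms. Currently supported: WeChat Official Accounts, CSDN, Jianshu, Ruan Yifeng's blog, People's Daily Online, QQ News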
Installation
composer require xin/crawler
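Usage
<?php
use Xin\Crawler\CrawlerManager;
// Load Composer's autoloader; adjust the relative path to your project layout.
require_once '../vendor/autoload.php';
$crawler = new CrawlerManager();
// Pass an article URL to craw(). Alternative example URLs are commented out below.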
//$data = $crawler->craw('https://mp.weixin.qq.com/s/3vD2wppFR7Ljl4nO9p97uA');
//$data = $crawler->craw('https://blog.csdn.net/jimlong/article/details/8606005');
//$data = $crawler->craw('https://www.jianshu.com/p/871c604d9aa2');
//$data = $crawler->craw('http://www.ruanyifeng.com/blog/2019/08/information-theory.html');
//$data = $crawler->craw('http://money.people.com.cn/n1/2019/0805/c42877-31276626.html');
//$data = $crawler->craw('https://new.qq.com/omn/FIN20190/FIN2019080500948300.html');
//$data = $crawler->craw('https://www.cnblogs.com/mithrandirw/p/8468925.html');
//$data = $crawler->craw('https://baijiahao.baidu.com/s?id=1619247656903510608&wfr=spider&for=pc');
// Crawl one article and dump whatever craw() returns.
$data = $crawler->craw('https://www.sohu.com/a/223511457_99893391');
var_dump($data);
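The usage above fetches one URL at a time. To process several URLs in one pass, a small wrapper can help. The following is a minimal sketch, not part of xin/crawler: `crawlAll` is a hypothetical helper name, and it assumes that a failed crawl surfaces as a thrown exception or error, which the package documentation above does not confirm.

```php
<?php

/**
 * Hypothetical helper (not part of xin/crawler): runs a fetch callable over
 * several URLs and separates successful results from failures.
 *
 * @param callable $fetch Any callable that takes a URL and returns article data.
 * @param string[] $urls  URLs to fetch.
 * @return array{results: array, errors: array}
 */
function crawlAll(callable $fetch, array $urls): array
{
    $results = [];
    $errors = [];

    foreach ($urls as $url) {
        try {
            // Store the fetched data keyed by its source URL.
            $results[$url] = $fetch($url);
        } catch (\Throwable $e) {
            // Assumption: a failed crawl raises an exception or error.
            $errors[$url] = $e->getMessage();
        }
    }

    return ['results' => $results, 'errors' => $errors];
}
```

With the `$crawler` instance from the usage example, this could be called as `crawlAll([$crawler, 'craw'], $urls)`. Taking a callable rather than a `CrawlerManager` keeps the helper independent of the package, so it can be exercised with a stub.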