smweb / web-scrapper
Web scraper for 10web's blog
Type: app
Requires
- php: >=7.1.0
This package is auto-updated.
Last update: 2024-10-03 10:51:32 UTC
README
Simple CLI web application that scrapes and aggregates the latest blog posts and shows them on the front page.
Please note that the app is in dev mode.
Some of the features include:
- Dependencies managed through composer
- Options for date range and article limit
Dependencies
- PHP web server
- PHP >= 7.1
- MySQL >= 8.0
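The PHP requirement above can be checked from a terminal before installing. The following is a minimal sketch, not part of the project: version_ge and check_php are hypothetical helper names, and the comparison assumes a sort that supports the -V (version sort) option, as GNU sort does.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with the project).

# Return 0 when version $1 is greater than or equal to version $2.
# Assumption: sort supports -V (version sort), as GNU coreutils does.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report whether the installed PHP satisfies the README's minimum (7.1).
check_php() {
  # PHP_MAJOR_VERSION and PHP_MINOR_VERSION are built-in PHP constants.
  php_version="$(php -r 'echo PHP_MAJOR_VERSION, ".", PHP_MINOR_VERSION;')"
  if version_ge "$php_version" 7.1; then
    echo "PHP $php_version is new enough"
  else
    echo "PHP $php_version is too old, 7.1 or later is required" >&2
    return 1
  fi
}
```

After sourcing the file, running check_php prints one of the two messages. The MySQL version is not checked here because the output of the mysql client differs between distributions.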
How to quickly set up
- Ensure you have composer installed.
- Create the project with composer (recommended):
  composer create-project smweb/web-scrapper:dev-master myproject
  (replace myproject with any name), or download the project in zip format here and extract it to your HTTP server.
- In the root folder, run composer install.
- In the app/ folder, edit db_config.php to set the proper DB credentials.
- After editing the DB config, run this command in the root folder to create the DB tables:
  php app/create_tables.php
- To scrape posts and save them to the DB, run this command in the root folder:
  php app/scraper_cli.php --count "{count}" --startDate "{startDate}" --endDate "{endDate}"
  where
  - {count} is the number of articles to scrape, an integer (10 by default)
  - {startDate} is the earliest publication date of an article to include
  - {endDate} is the latest publication date of an article to include
  - Date format: mm/dd/yyyy (example: 04/23/2021)
- To view the front page with the scraped data, start a server in the root folder:
  php -S localhost:800
  The front page can then be accessed at localhost:800.
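Because the scraper expects dates as mm/dd/yyyy, a small wrapper can catch a mistyped argument before the command runs. The sketch below is not part of the project: is_valid_date and build_scraper_cmd are hypothetical helper names, and only the command line, the date format, and the integer count are taken from this README. The date check covers the shape of the string only, so a value such as 02/31/2021 still passes.

```shell
#!/bin/sh
# Hypothetical wrapper helpers (not shipped with the project).

# Return 0 if $1 has the shape mm/dd/yyyy (month 01-12, day 01-31).
# This does not verify that the day exists in the given month.
is_valid_date() {
  printf '%s\n' "$1" |
    grep -Eq '^(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])/[0-9]{4}$'
}

# Print the scraper command for a count, a start date and an end date.
# Prints an error to stderr and returns 1 if an argument is malformed.
build_scraper_cmd() {
  if ! printf '%s\n' "$1" | grep -Eq '^[1-9][0-9]*$'; then
    echo "count must be a positive integer" >&2
    return 1
  fi
  if ! is_valid_date "$2" || ! is_valid_date "$3"; then
    echo "dates must use the mm/dd/yyyy format" >&2
    return 1
  fi
  printf 'php app/scraper_cli.php --count "%s" --startDate "%s" --endDate "%s"\n' \
    "$1" "$2" "$3"
}
```

For example, build_scraper_cmd 10 04/01/2021 04/23/2021 prints the command with those three values filled in, while build_scraper_cmd 10 23/04/2021 04/23/2021 is rejected because the first date has the day and month swapped. The helper only prints the command, so the output can be reviewed before it is run from the root folder.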
Versioning
The project uses GitHub for versioning.