starfruit / crawler-bundle
Starfruit Crawler Bundle
0.0.2
2025-09-18 10:02 UTC
Requires
- google/apiclient: ^2.18
- google/auth: ^1.48
Last update: 2025-09-18 10:02:54 UTC
Requirements
Google Cloud
- Create a new project, then enable the libraries below:
- Create a service account and download its JSON credentials file.
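Once the JSON credentials file has been downloaded, a quick sanity check can catch a wrong or truncated file before the bundle is configured. The following is a minimal sketch, not part of the bundle: the function name `loadServiceAccountKey` is hypothetical, and the check relies only on the `type`, `client_email` and `private_key` fields that Google Cloud service account key files contain.

```php
<?php

/**
 * Hypothetical helper (not provided by the bundle): loads a Google Cloud
 * service account key file and verifies that it looks like one.
 *
 * @return array<string, mixed> the decoded key file
 */
function loadServiceAccountKey(string $path): array
{
    if (!is_file($path) || !is_readable($path)) {
        throw new RuntimeException("Credentials file not readable: {$path}");
    }

    // json_decode returns null for invalid JSON, which fails the is_array check.
    $data = json_decode((string) file_get_contents($path), true);
    if (!is_array($data)) {
        throw new RuntimeException("Credentials file is not valid JSON: {$path}");
    }

    // Fields expected in a service account key file.
    foreach (['type', 'client_email', 'private_key'] as $field) {
        if (!isset($data[$field])) {
            throw new RuntimeException("Missing '{$field}' in credentials file.");
        }
    }

    if ($data['type'] !== 'service_account') {
        throw new RuntimeException('Expected a service account key file.');
    }

    return $data;
}
```

It could be called with the same path that is stored in `CRAWLER_BUNDLE_GOOGLE_JSON`, for example `loadServiceAccountKey('/root/project/public/crawler-google-credential.json')`.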
Installation
composer require starfruit/crawler-bundle
OR, to skip the ext-amqp platform requirement check:
composer require starfruit/crawler-bundle --ignore-platform-req=ext-amqp
- Update the config/bundles.php file:

    return [
        ....
        Starfruit\CrawlerBundle\StarfruitCrawlerBundle::class => ['all' => true],
    ];
Setup
- Create a new variable in the .env file:
# path to the Google Cloud JSON credentials file, for example:
CRAWLER_BUNDLE_GOOGLE_JSON=/root/project/public/crawler-google-credential.json
- Update the config/config.yaml file:

    imports:
        - { resource: 'local/' }

    pimcore:
        ...
        ...

    # config for crawler bundle
    starfruit_crawler:
        target:
            class_object: # list of class names as keys, each with its fields
                News: # name of class
                    content_field: 'content' # field to paste crawled content into
                    last_version_field: 'importUrl' # field to store last version, can be null
                Event: # name of class
                    content_field: 'mainContent'
        # custom asset path in Admin to store images, media
        asset_store_path: '/default-crawler-media/image'
        # custom format for html after crawling
        content_format:
            heading: # default mapping of heading values to html tags
                default: 'p' # default tag
                HEADING_1: 'h1'
                HEADING_2: 'h2'
                HEADING_3: 'h3'
                HEADING_4: 'h4'
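To illustrate how the `content_format.heading` mapping might be applied, here is a minimal sketch. It assumes each crawled paragraph carries a style name such as `HEADING_1`; the functions `resolveHeadingTag` and `wrapParagraph` and the unmapped style name `NORMAL_TEXT` are hypothetical and are not taken from the bundle's code.

```php
<?php

/**
 * Hypothetical helper: picks the HTML tag for a style name from a map shaped
 * like the `content_format.heading` config, falling back to `default`.
 */
function resolveHeadingTag(array $headingMap, string $style): string
{
    return $headingMap[$style] ?? $headingMap['default'] ?? 'p';
}

/**
 * Hypothetical helper: wraps escaped text in the tag resolved for its style.
 */
function wrapParagraph(array $headingMap, string $style, string $text): string
{
    $tag = resolveHeadingTag($headingMap, $style);

    return sprintf('<%1$s>%2$s</%1$s>', $tag, htmlspecialchars($text, ENT_QUOTES));
}

// Same values as the heading section of the config above.
$headingMap = [
    'default'   => 'p',
    'HEADING_1' => 'h1',
    'HEADING_2' => 'h2',
    'HEADING_3' => 'h3',
    'HEADING_4' => 'h4',
];

echo wrapParagraph($headingMap, 'HEADING_2', 'Latest news'), PHP_EOL;   // <h2>Latest news</h2>
echo wrapParagraph($headingMap, 'NORMAL_TEXT', 'Body & more'), PHP_EOL; // <p>Body &amp; more</p>
```

Any style without an explicit entry falls back to the `default` tag, so only the headings that need a specific tag have to be listed in the config.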