wilr / silverstripe-algolia
Algolia Indexer and Search Functionality
Installs: 15 806
Dependents: 0
Suggesters: 0
Security: 0
Stars: 13
Watchers: 6
Forks: 16
Open Issues: 5
Type:silverstripe-vendormodule
Requires
- php: ^8
- algolia/algoliasearch-client-php: ^3
- masterminds/html5: ^2.7
- ramsey/uuid: ^4
- symbiote/silverstripe-queuedjobs: ^4 || ^5
Requires (Dev)
- phpunit/phpunit: ^9.5
- squizlabs/php_codesniffer: ^3
This package is auto-updated.
Last update: 2024-10-16 23:04:09 UTC
README
Maintainer Contact
- Will Rossiter (@wilr) will@fullscreen.io
Installation
composer require "wilr/silverstripe-algolia"
Features
☑️ Supports multiple indexes and saving records into multiple indexes.
☑️ Integrates into existing versioned workflow.
☑️ No dependencies on the CMS, supports any DataObject subclass.
☑️ Queued job support for offloading operations to Algolia.
☑️ Easily configure search configuration and indexes via YAML and PHP.
☑️ Indexes your webpage template so supports Elemental and custom fields out of the box
Documentation
Algolia’s search-as-a-service and full suite of APIs allow teams to easily develop tailored, fast Search and Discovery experiences that delight and convert.
This module adds the ability to sync Silverstripe pages to a Algolia Index.
Indexing and removing documents is done transparently for any objects which
subclass SiteTree
or by applying the
Wilr\SilverStripe\Algolia\Extensions\AlgoliaObjectExtension
to your
DataObjects.
🛠️ Setting Up
First, sign up for Algolia.com account and install this module. Once installed, Configure the API keys via YAML (environment variables recommended).
app/_config/algolia.yml
--- Name: algolia After: silverstripe-algolia --- SilverStripe\Core\Injector\Injector: Wilr\SilverStripe\Algolia\Service\AlgoliaService: properties: adminApiKey: "`ALGOLIA_ADMIN_API_KEY`" searchApiKey: "`ALGOLIA_SEARCH_API_KEY`" applicationId: "`ALGOLIA_SEARCH_APP_ID`" indexes: IndexName: includeClasses: - SilverStripe\CMS\Model\SiteTree indexSettings: attributesForFaceting: - "filterOnly(objectClassName)"
Once the indexes and API keys are configured, run a dev/build
to update the
database and refresh the indexSettings. Alternatively you can run
AlgoliaConfigure
to manually rebuild the indexSettings.
Configuring the index names
This module will assume your indexes are setup as dev_{IndexName}
,
test_{IndexName}
and live_{IndexName}
where the result of your environment
type is prefixed to the names listed in the main YAML config.
If you explictly want to disable the environment prefix (or use a custom
approach) use the ALGOLIA_PREFIX_INDEX_NAME
environment variable.
ALGOLIA_PREFIX_INDEX_NAME='dev_will'
Or for testing with live data on dev use ALGOLIA_PREFIX_INDEX_NAME='live'
Defining Replica Indexes
If your search form provides a sort option (e.g latest or relevance) then you will be using replica indexes (https://www.algolia.com/doc/guides/managing-results/refine-results/sorting/how-to/creating-replicas/)
These can be defined using the same YAML configuration.
--- Name: algolia After: silverstripe-algolia --- SilverStripe\Core\Injector\Injector: Wilr\SilverStripe\Algolia\Service\AlgoliaService: properties: adminApiKey: "`ALGOLIA_ADMIN_API_KEY`" searchApiKey: "`ALGOLIA_SEARCH_API_KEY`" applicationId: "`ALGOLIA_SEARCH_APP_ID`" indexes: IndexName: includeClasses: - SilverStripe\CMS\Model\SiteTree indexSettings: attributesForFaceting: - "filterOnly(ObjectClassName)" replicas: - IndexName_Latest IndexName_Latest: indexSettings: ranking: - "desc(objectCreated)" - "typo" - "words" - "filters" - "proximity" - "attribute" - "exact" - "custom"
Indexing
If installing on a existing website run the AlgoliaReindex
task (via CLI) to
import existing data. This will batch import all the records from your database
into the indexes configured above.
./vendor/bin/sake dev/tasks/AlgoliaReindex "flush=1"
Individually records will be indexed automatically going forward via the
onAfterPublish
hook and removed via the onAfterUnpublish
hook which is
called when publishing or unpublishing a document. If your DataObject does not
implement the Versioned
extension you'll need to manage this state yourself by
calling $item->indexInAlgolia()
and $item->removeFromAlgolia()
.
AlgoliaReindex
takes a number of arguments to allow for customisation of bulk
indexing. For instance, if you have a large amount of JobVacancies to bulk
import but only need the active ones you can trigger the task as follows:
/vendor/bin/sake dev/tasks/AlgoliaReindex "onlyClass=Vacancy&filter=ExpiryDate>NOW()"
If you do not have access to a CLI (i.e Silverstripe Cloud) then you can also
bulk reindex via a queued job AlgoliaReindexAllJob
.
Optional
forceAll
forces every Silverstripe record to be re-synced.
./vendor/bin/sake dev/tasks/AlgoliaReindex "flush=1&forceAll=1"
clearAll
truncates the search index before reindexing.
./vendor/bin/sake dev/tasks/AlgoliaReindex "flush=1&clearAll=1&forceAll=1"
Customising the indexed attributes (fields)
By default only ID
, Title
and Link
, LastEdited
will be indexed from each
record. To specify additional fields, define a algolia_index_fields
config
variable.
class MyPage extends Page { // .. private static $algolia_index_fields = [ 'Content', 'MyCustomColumn', 'RelationshipName' ]; }
Or, you can define a exportObjectToAlgolia
method on your object. This
receives the default index fields and then allows you to add or remove fields as
required
use SilverStripe\ORM\ArrayList; use SilverStripe\ORM\Map; class MyPage extends Page { public function exportObjectToAlgolia($data) { $data = array_merge($data, [ 'MyCustomField' => $this->MyCustomField() ]); $map = new Map(ArrayList::create()); foreach ($data as $k => $v) { $map->push($k, $v); } return $map; } }
Customizing the indexed relationships
Out of the box, the default is to push the ID and Title fields of any
relationships ($has_one
, $has_many
, $many_many
) into a field
relation{name}
with the record ID
and Title
as per the behaviour with
records.
Additional fields from the relationship can be indexed via a PHP function
public function updateAlgoliaRelationshipAttributes(\SilverStripe\ORM\Map $attributes, $related) { $attributes->push('CategoryName', $related->CategoryName); }
Excluding an object from indexing
Objects can define a canIndexInAlgolia
method which should return false if the
object should not be indexed in algolia.
public function canIndexInAlgolia(): bool { return ($this->Expired) ? false : true; }
Queued Indexing
To reduce the impact of waiting on a third-party service while publishing
changes, this module utilizes the queued-jobs
module for uploading index
operations. The queuing feature can be disabled via the Config YAML.
Wilr\SilverStripe\Algolia\Extensions\AlgoliaObjectExtension: use_queued_indexing: false
Displaying and fetching results
For your website front-end you can use InstantSearch.js libraries if you wish,
or to fetch a PaginatedList
of results from Algolia, create a method on your
Controller
subclass to call Wilr\SilverStripe\Algolia\Service\AlgoliaQuerier
<?php use SilverStripe\Core\Injector\Injector; use Wilr\SilverStripe\Algolia\Service\AlgoliaQuerier; class PageController extends ContentController { public function results() { $hitsPerPage = 25; $paginatedPageNum = floor($this->request->getVar('start') / $hitsPerPage); $results = Injector::inst()->get(AlgoliaQuerier::class)->fetchResults( 'indexName', $this->request->getVar('search'), [ 'page' => $this->request->getVar('start') ? $paginatedPageNum : 0, 'hitsPerPage' => $hitsPerPage ] ); return [ 'Title' => 'Search Results', 'Results' => $results ]; } }
Or alternatively you can make use of JS Search SDK (https://www.algolia.com/doc/api-client/getting-started/install/javascript/)
🔍 Inspect Object Fields
To assist with debugging what fields will be pushed into Algolia and see what
information is already in Algolia use the AlgoliaInspect
BuildTask. This can
be run via CLI
./vendor/bin/sake dev/tasks/AlgoliaInspect "class=Page&id=1"
Will output the Algolia data structure for the Page with the ID of '1'.
Elemental Support
Out of the box this module scrapes the webpage's main
HTML section and stores
it in a objectForTemplate
field in Algolia. This content is parsed via the
AlgoliaPageCrawler
class.
<main> $ElementalArea <!-- will be indexed via Algolia --> </main>
If this behaviour is undesirable then it can be disabled via YAML.
Wilr\SilverStripe\Algolia\Service\AlgoliaIndexer:
include_page_content: false
Or you can specify the HTML selector you do want to index using YAML. For
instance to index any elements with a data-index
attribute.
Wilr\SilverStripe\Algolia\Service\AlgoliaPageCrawler:
content_xpath_selector: '//[data-index]'
Subsite support
If you use the Silverstripe Subsite module to run multiple websites you can handle indexing in a couple ways:
- Use separate indexes per site.
- Use a single index, but add a
SubsiteID
field in Algolia.
The decision to go either way depends on the nature of the websites and how related they are but separate indexes are highly recommended to prevent leaking information between websites and mucking up analytics and query suggestions.
Subsite support with a single index
If subsites are frequently being created then you may choose to prefer a single index since index names need to be controlled via YAML so any new subsite would require a code change.
The key to this approach is added SubsiteID
to the attributes for faceting
and at the query time.
Step 1. Add the field to Algolia
SilverStripe\Core\Injector\Injector:
Wilr\SilverStripe\Algolia\Service\AlgoliaService:
properties:
adminApiKey: "`ALGOLIA_ADMIN_API_KEY`"
searchApiKey: "`ALGOLIA_SEARCH_API_KEY`"
applicationId: "`ALGOLIA_SEARCH_APP_ID`"
indexes:
index_main_site:
includeClasses:
- SilverStripe\CMS\Model\SiteTree
indexSettings:
distinct: true
attributeForDistinct: "objectLink"
searchableAttributes:
- objectTitle
- objectContent
- objectLink
- Summary
- objectForTemplate
attributesForFaceting:
- "filterOnly(objectClassName)"
***- "filterOnly(SubsiteID)"***
Step 2. Expose the field on SiteTree
via a DataExtension (make sure to apply the extension)
<?php
class SiteTreeExtension extends DataExtension
{
private static $algolia_index_fields = [
'SubsiteID'
];
}
Step 3. Filter by the Subsite ID in your results
<?php use SilverStripe\Core\Injector\Injector; use Wilr\SilverStripe\Algolia\Service\AlgoliaQuerier; class PageController extends ContentController { public function results() { $hitsPerPage = 25; $paginatedPageNum = floor($this->request->getVar('start') / $hitsPerPage); $results = Injector::inst()->get(AlgoliaQuerier::class)->fetchResults( 'indexName', $this->request->getVar('search'), [ 'page' => $this->request->getVar('start') ? $paginatedPageNum : 0, 'hitsPerPage' => $hitsPerPage, 'facetFilters' => [ 'SubsiteID' => SubsiteState::singleton()->getSubsiteId() ] ] ); return [ 'Title' => 'Search Results', 'Results' => $results ]; } }
Subsite support with separate indexes
Create multiple indexes in your config and use the includeFilter
parameter to
filter the records per index.
The includeFilter
should be in the format {$Class}
: {$WhereQuery}
where
the $WhereQuery
is a basic SQL statement performed by the ORM on the given
class.
SilverStripe\Core\Injector\Injector:
Wilr\SilverStripe\Algolia\Service\AlgoliaService:
properties:
adminApiKey: "`ALGOLIA_ADMIN_API_KEY`"
searchApiKey: "`ALGOLIA_SEARCH_API_KEY`"
applicationId: "`ALGOLIA_SEARCH_APP_ID`"
indexes:
index_main_site:
includeClasses:
- SilverStripe\CMS\Model\SiteTree
includeFilter:
"SilverStripe\\CMS\\Model\\SiteTree": "SubsiteID = 0"
indexSettings:
distinct: true
attributeForDistinct: "objectLink"
searchableAttributes:
- objectTitle
- objectContent
- objectLink
- Summary
- objectForTemplate
attributesForFaceting:
- "filterOnly(objectClassName)"
index_subsite_pages:
includeClasses:
- SilverStripe\CMS\Model\SiteTree
includeFilter:
"SilverStripe\\CMS\\Model\\SiteTree": "SubsiteID > 0"
indexSettings:
distinct: true
attributeForDistinct: "objectLink"
searchableAttributes:
- objectTitle
- objectContent
- objectLink
- Summary
- objectForTemplate
attributesForFaceting:
- "filterOnly(objectClassName)"