medienreaktor/meilisearch

Integrates Meilisearch into Neos.

1.1 2024-06-18 06:45 UTC

Last update: 2024-12-09 12:07:56 UTC


Integrates Meilisearch into Neos. Compatibility tested with Meilisearch versions 1.2 to 1.8.

This package aims for simplicity and minimal dependencies. It may therefore be less sophisticated and extensible than packages like Flowpack.ElasticSearch.ContentRepositoryAdaptor. To keep dependencies minimal, some code parts were copied from these great packages (see Credits).

✨ Features

  • ✅ Indexing the Neos Content Repository in Meilisearch
  • ✅ Supports Content Dimensions for all node variants
  • ✅ CLI commands for building and flushing the index
  • ✅ Querying the index via Search-/Eel-Helpers and QueryBuilder
  • ✅ Frontend search form, result rendering and pagination
  • ✅ Faceting and snippet highlighting
  • ✅ Geosearch filtering and sorting
  • ✅ Vector Search for semantic search / AI search
  • 🔴 No asset indexing (yet)
  • 🔴 No autocomplete / autosuggest (this is currently not supported by Meilisearch)

🚀 Installation

Install via composer:

composer require medienreaktor/meilisearch

There are several ways to install Meilisearch for development. If you are using DDEV, a Meilisearch snippet is available.

⚙️ Configuration

Configure the Meilisearch client in your Settings.yaml and set the Endpoint and API Key:

Medienreaktor:
  Meilisearch:
    client:
      endpoint: ''
      apiKey: ''

You can adjust all Meilisearch index settings to fit your needs (see Meilisearch Documentation). All settings configured here will directly be passed to Meilisearch.

Medienreaktor:
  Meilisearch:
    settings:
      displayedAttributes:
        - '*'
      searchableAttributes:
        - '__fulltext.text'
        - '__fulltext.h1'
        - '__fulltext.h2'
        - '__fulltext.h3'
        - '__fulltext.h4'
        - '__fulltext.h5'
        - '__fulltext.h6'
      filterableAttributes:
        - '__identifier'
        - '__dimensionsHash'
        - '__path'
        - '__parentPath'
        - '__nodeType'
        - '__nodeTypeAndSupertypes'
        - '_hidden'
        - '_hiddenBeforeDateTime'
        - '_hiddenAfterDateTime'
        - '_hiddenInIndex'
        - '_geo'
      sortableAttributes:
        - '_geo'
      rankingRules:
        - 'words'
        - 'typo'
        - 'proximity'
        - 'attribute'
        - 'sort'
        - 'exactness'
      stopWords: []
      typoTolerance:
        enabled: true
        minWordSizeForTypos:
          oneTypo: 5
          twoTypos: 9
      faceting:
        maxValuesPerFacet: 100

Please only extend the filterableAttributes above and do not remove any of them, as they are required for the base functionality to work. After finishing or changing the configuration, build the node index once via the CLI command flow nodeindex:build.

Document NodeTypes should be configured as fulltext root (this comes by default for all Neos.Neos:Document subtypes):

'Neos.Neos:Document':
  search:
    fulltext:
      isRoot: true
      enable: true

Properties of Content NodeTypes that should be included in fulltext search must also be configured appropriately:

'Neos.NodeTypes:Text':
  search:
    fulltext:
      enable: true
  properties:
    'text':
      search:
        fulltextExtractor: "${Indexing.extractHtmlTags(node.properties.text)}"

'Neos.NodeTypes:Headline':
  search:
    fulltext:
      enable: true
  properties:
    'title':
      search:
        fulltextExtractor: "${Indexing.extractHtmlTags(node.properties.title)}"

You will see that some properties are indexed twice, like _path and __path, or _nodeType and __nodeType. This is because the two prefixes mark properties with different owners:

  • _*-properties are default Neos node properties that are private to Neos (and may change)
  • __*-properties are private properties that are required for the Meilisearch integration

The properties required by this package must always be present, so they are indexed separately from the Neos-internal ones.

📖 Usage with Neos and Fusion

There is a built-in Content NodeType Medienreaktor.Meilisearch:Search for rendering the search form, results and pagination that may serve as a boilerplate for your projects. Just place it on your search page to start.

You can also use search queries, results and facets in your own Fusion components.

prototype(Medienreaktor.Meilisearch:Search) < prototype(Neos.Neos:ContentComponent) {
    searchTerm = ${String.toString(request.arguments.search)}

    page = ${String.toInteger(request.arguments.page) || 1}
    hitsPerPage = 10

    searchQuery = ${this.searchTerm ? Search.query(site).fulltext(this.searchTerm).nodeType('Neos.Neos:Document') : null}
    searchQuery.@process {
        page = ${value.page(this.page)}
        hitsPerPage = ${value.hitsPerPage(this.hitsPerPage)}
    }

    facets = ${this.searchQuery.facets(['__nodeType', '__parentPath'])}
    totalPages = ${this.searchQuery.totalPages()}
    totalHits = ${this.searchQuery.totalHits()}
}

If you want facet distribution for certain node properties or search in them, make sure to add them to filterableAttributes and/or searchableAttributes in your Settings.yaml.

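The search query builder supports, among other features, the methods used in this README: fulltext(), nodeType(), page(), hitsPerPage(), facets(), totalPages(), totalHits(), geoRadius(), geoPoint(), vector() and execute().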

⚡ Usage with JavaScript / React / Vue

If you want to build your frontend with JavaScript, React or Vue, you can completely ignore the Neos and Fusion integration above and use instant-meilisearch.

Please mind these three things:

1. Filtering for node context and dimensions

Set up your filter to always include the following filter string: (__parentPath = "$nodePath" OR __path = "$nodePath") AND __dimensionsHash = "$dimensionsHash" where $nodePath is the NodePath of your context node (e.g. site) and $dimensionsHash is the MD5 hash of the JSON-encoded context dimensions array.

You can obtain these values in PHP using:

$nodePath = (string) $contextNode->findNodePath();
$dimensionsHash = md5(json_encode($contextNode->getContext()->getDimensions()));

In Fusion, you get these values (assuming site is your desired context node) using:

nodePath = ${site.path}
dimensionsHash = ${String.md5(Json.stringify(site.context.dimensions))}
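On the frontend, the filter string described above can be assembled from these two values. The following is a minimal sketch, not part of the package: the helper names buildBaseFilter and withBaseFilter, the example node path, the example hash and the extra filter are all assumptions made for illustration. It also assumes that neither value contains a double quote, which holds for node paths and MD5 hashes.

```typescript
// Hypothetical helper (not part of the package): assembles the base filter
// that every frontend query should include.
// Assumes neither value contains a double quote.
function buildBaseFilter(nodePath: string, dimensionsHash: string): string {
  return (
    `(__parentPath = "${nodePath}" OR __path = "${nodePath}")` +
    ` AND __dimensionsHash = "${dimensionsHash}"`
  );
}

// Hypothetical helper: appends optional user-selected filters (e.g. facets)
// to the base filter. Each extra filter is wrapped in parentheses so that an
// OR inside it cannot weaken the base filter.
function withBaseFilter(baseFilter: string, extraFilters: string[] = []): string {
  return [baseFilter, ...extraFilters.map((f) => `(${f})`)].join(" AND ");
}

// Example values only; in practice they come from PHP or Fusion as shown above.
const baseFilter = buildBaseFilter(
  "/sites/mysite",
  "d41d8cd98f00b204e9800998ecf8427e"
);
const combinedFilter = withBaseFilter(baseFilter, [
  '__nodeTypeAndSupertypes = "Neos.Neos:Document"',
]);
```

The resulting string can be passed as the filter search parameter of whichever Meilisearch client you use.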

2. The node URI

The public URI to the node is in the __uri attribute of each Meilisearch result hit.

It is generated at indexing time, which is one reason we create separate index records for each node variant, even if they are redundant due to dimension fallback behaviour. This is in contrast to Flowpack.ElasticSearch.ContentRepositoryAdaptor, where only one record is created and multiple dimensions hashes are assigned.

If you have assigned a primary domain to your site, the URI will be absolute, otherwise relative.
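Because __uri may be absolute or relative depending on your site setup, a frontend may want to normalize it before rendering links. The sketch below uses the standard URL constructor for this. The SearchHit interface and the resolveHitUrl helper are hypothetical names, and the fallback origin is an assumption (in a browser it would typically be window.location.origin).

```typescript
// Only the hit field used here; real hits carry more attributes.
interface SearchHit {
  __uri: string;
}

// Hypothetical helper: returns an absolute URL for a hit.
// An absolute __uri is kept as it is; a relative one is resolved
// against fallbackOrigin.
function resolveHitUrl(hit: SearchHit, fallbackOrigin: string): string {
  return new URL(hit.__uri, fallbackOrigin).href;
}

// Example hits with made-up URIs.
const relativeHit: SearchHit = { __uri: "/en/about.html" };
const absoluteHit: SearchHit = { __uri: "https://neos.example.org/en/about.html" };

const relativeUrl = resolveHitUrl(relativeHit, "https://example.com");
const absoluteUrl = resolveHitUrl(absoluteHit, "https://example.com");
```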

3. Image URIs

If you need image URIs in your frontend, this can also be configured.

Configure your specific properties or all image properties to be indexed:

Neos:
  ContentRepository:
    Search:
      defaultConfigurationPerType:
        Neos\Media\Domain\Model\ImageInterface:
          indexing: '${AssetUri.build(value, 600, 400)}'

You can set your desired width, height and optional allowCropping, allowUpScaling and format values in the method arguments.

If you have set the baseUri in your Settings.yaml, the path to your image will be absolute and not asynchronous (e.g. https://example.com/_Resources/Persistent/1/2/3/4/1234567890n/filename-800x600.jpg).

Otherwise, the image paths will be relative and asynchronous (e.g. /media/thumbnail/12345678-1234-1234-1234-1234567890).

To set the baseUri, add your URI to your Settings.yaml:

Neos:
  Flow:
    http:
      baseUri: https://example.com/

📍 Geosearch

Meilisearch supports filtering and sorting on geographic location. For this feature to work, your nodes should supply the __geo property with an object of lat/lng values. An easy way to achieve this is to use a proxy property:

'Neos.Neos:Document':
  properties:
    latitude:
      type: 'string'
      ui:
        label: 'Latitude'
    longitude:
      type: 'string'
      ui:
        label: 'Longitude'
    __geo:
      search:
        indexing: "${{lat: node.properties.latitude, lng: node.properties.longitude}}"

The search query builder supports filtering with geoRadius() and sorting with geoPoint() (see above).
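If you query Meilisearch directly from a JavaScript frontend instead, the corresponding Meilisearch expressions are _geoRadius(lat, lng, distance_in_meters) as a filter and _geoPoint(lat, lng):asc or :desc as a sort rule. The following sketch builds both strings. The helper names and the example coordinates are assumptions for illustration and not part of this package; the geo filter should still be combined with the node context and dimensions filter described in the JavaScript section.

```typescript
interface GeoPoint {
  lat: number;
  lng: number;
}

// Hypothetical helper: Meilisearch filter expression matching documents
// within radiusInMeters of the given center.
function geoRadiusFilter(center: GeoPoint, radiusInMeters: number): string {
  return `_geoRadius(${center.lat}, ${center.lng}, ${radiusInMeters})`;
}

// Hypothetical helper: Meilisearch sort expression ordering results by
// distance from origin, nearest first ("asc") or farthest first ("desc").
function geoPointSort(origin: GeoPoint, direction: "asc" | "desc" = "asc"): string {
  return `_geoPoint(${origin.lat}, ${origin.lng}):${direction}`;
}

// Example: documents within 5 km of an example location, nearest first.
const center: GeoPoint = { lat: 52.52, lng: 13.405 };
const geoSearchParams = {
  filter: geoRadiusFilter(center, 5000),
  sort: [geoPointSort(center)],
};
```

Both expressions rely on _geo being listed in filterableAttributes and sortableAttributes, as in the default settings shown in the Configuration section.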

📐 Vector Search

You can use Meilisearch as a vector store with the experimental Vector Search feature. Activate it using the /experimental-features endpoint as described in the release notes.

Vectors for each document have to be provided by you and indexed in the _vector-property of your node. This can be done by writing a custom Eel-helper that computes the vectors using a third-party service like OpenAI or Hugging Face.

'Neos.Neos:Document':
  properties:
    _vector:
      search:
        indexing: "${VectorIndexing.computeByNode(node)}"

The search query builder supports querying by vectors. Depending on your use case, vectors have to be computed again for the search phrase, e.g.:

prototype(Medienreaktor.Meilisearch:Search) < prototype(Neos.Neos:ContentComponent) {
    searchTerm = ${String.toString(request.arguments.search)}
    searchVector = ${VectorIndexing.computeByString(this.searchTerm)}

    vectorSearchQuery = ${this.searchVector ? Search.query(site).vector(this.searchVector) : null}

    searchResults = ${this.vectorSearchQuery.execute()}
}

To show similar documents to your current document (e.g. for Wikis, Knowledge Bases or News Rooms), use the current document's vector as search vector.

👩‍💻 Credits

This package is heavily inspired by other Neos search packages such as Flowpack.ElasticSearch.ContentRepositoryAdaptor, and some smaller code parts are copied from them.

All credits go to the original authors of these packages.