shabushabu / laravel-paradedb-search
Integrates the pg_search extension by ParadeDB into Laravel
Requires
- php: ^8.2
- illuminate/contracts: ^11.0
- spatie/laravel-package-tools: ^1.16
Requires (Dev)
- larastan/larastan: ^2.9
- laravel/pint: ^1.14
- nunomaduro/collision: ^8.0
- orchestra/testbench: ^9.0
- pestphp/pest: ^2.34
- pestphp/pest-plugin-arch: ^2.7
- pestphp/pest-plugin-laravel: ^2.3
- pestphp/pest-plugin-type-coverage: ^2.8
- phpstan/extension-installer: ^1.3
- phpstan/phpstan-deprecation-rules: ^1.1
- phpstan/phpstan-phpunit: ^1.3
- roave/security-advisories: dev-latest
- tpetry/laravel-postgresql-enhanced: ^2.0
- tpetry/laravel-query-expressions: ^1.4
Suggests
- tpetry/laravel-postgresql-enhanced: Adds vector operators to use in regular Eloquent where statements
- tpetry/laravel-query-expressions: Provides useful expressions for use in ParadeDB search queries
README
ParadeDB Search for Laravel
Integrates the pg_search Postgres extension by ParadeDB into Laravel
Supported minimum versions
Installation
Caution
Please note that this is a new package and, even though it is well tested, it should be considered pre-release software.
Before installing the package you should install and enable the pg_search extension.
You can then install the package via composer:
composer require shabushabu/laravel-paradedb-search
You can also publish the config file:
php artisan vendor:publish --tag="laravel-paradedb-search-config"
These are the contents of the published config file:
return [
    'index_suffix' => env('PG_SEARCH_INDEX_SUFFIX', '_idx'),
    'highlighting_tag' => env('PG_SEARCH_HIGHLIGHTING_TAG', '<b></b>'),
];
Usage
Add a bm25 index
Each model that you want to be searchable needs a corresponding bm25 index. These can be generated within a migration like so:
use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;
use ShabuShabu\ParadeDB\Indices\Bm25;

return new class extends Migration
{
    public function up(): void
    {
        Schema::create('products', static function (Blueprint $table) {
            // all your product fields
        });

        Bm25::index('products')
            ->addNumericFields(['amount'])
            ->addBooleanFields(['is_available'])
            ->addDateFields(['created_at', 'deleted_at'])
            ->addJsonFields(['options'])
            ->addRangeFields(['size'])
            ->addTextFields([
                'name',
                'currency',
                'description' => [
                    'tokenizer' => [
                        'type' => 'default',
                    ],
                ],
            ])
            ->create(drop: true);
    }

    public function down(): void
    {
        Bm25::index('products')->drop();
    }
};
Add a partial bm25 index
Bm25::index('teams')
    ->partialBy('max_members > 2')
    // ...
    ->create();
TantivyQL
ParadeDB Search for Laravel comes with a fluent builder for TantivyQL, a simple string-based query language.
This builder can be used within various ParadeDB expressions.
Basic query
use ShabuShabu\ParadeDB\TantivyQL\Query;

Query::string()->where('description', 'keyboard')->get();

// results in: description:keyboard
Add an IN condition
Query::string()
    ->where('description', ['keyboard', 'toy'])
    ->get();

// results in: description:IN [keyboard, toy]
Add an AND NOT condition
(string) Query::string()
    ->where('category', 'electronics')
    ->whereNot('description', 'keyboard');

// results in: category:electronics AND NOT description:keyboard
Boost a condition
Query::string()
    ->where('description', 'keyboard', boost: 1)
    ->get();

// results in: description:keyboard^1
Apply the slop operator
Query::string()
    ->where('description', 'ergonomic keyboard', slop: 1)
    ->get();

// results in: description:"ergonomic keyboard"~1
More complex example with a sub condition
Query::string()
    ->where('description', ['keyboard', 'toy'])
    ->where(
        fn (Builder $builder) => $builder
            ->where('category', 'electronics')
            ->orWhere('tag', 'office')
    )
    ->get();

// results in: description:IN [keyboard, toy] AND (category:electronics OR tag:office)
Apply a simple filter
use ShabuShabu\ParadeDB\TantivyQL\Operators\Filter;

Query::string()
    ->whereFilter('rating', Filter::equals, 4)
    ->get();

// results in: rating:4
Apply a boolean filter
use ShabuShabu\ParadeDB\TantivyQL\Operators\Filter;

Query::string()
    ->whereFilter('is_available', '=', false)
    ->get();

// results in: is_available:false
Apply a basic range filter
use ShabuShabu\ParadeDB\TantivyQL\Operators\Filter;

Query::string()
    ->whereFilter('rating', '>', 4)
    ->get();

// results in: rating:>4
Apply an inclusive range filter
use ShabuShabu\ParadeDB\TantivyQL\Operators\Range;

Query::string()
    ->whereFilter('rating', Range::includeAll, [2, 5])
    ->get();

// results in: rating:[2 TO 5]
Apply an exclusive range filter
use ShabuShabu\ParadeDB\TantivyQL\Operators\Range;

Query::string()
    ->whereFilter('rating', Range::excludeAll, [2, 5])
    ->get();

// results in: rating:{2 TO 5}
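To make the mapping from fluent calls to TantivyQL strings more concrete, here is a minimal, self-contained sketch of how such a builder could assemble its output. The MiniQuery class and all of its methods are hypothetical and written for illustration only; this is not the package's Query class, and the real implementation may differ.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for illustration; NOT the package's
// ShabuShabu\ParadeDB\TantivyQL\Query implementation.
final class MiniQuery
{
    /** @var list<string> query fragments, each already prefixed with its operator */
    private array $parts = [];

    /** @param string|list<string> $value */
    public function where(string $field, string|array $value, ?int $boost = null, ?int $slop = null): self
    {
        return $this->push('AND', $this->term($field, $value, $boost, $slop));
    }

    /** @param string|list<string> $value */
    public function whereNot(string $field, string|array $value): self
    {
        return $this->push('AND', 'NOT ' . $this->term($field, $value));
    }

    /** @param string|list<string> $value */
    public function orWhere(string $field, string|array $value): self
    {
        return $this->push('OR', $this->term($field, $value));
    }

    public function get(): string
    {
        return implode(' ', $this->parts);
    }

    private function push(string $operator, string $fragment): self
    {
        // The first fragment has nothing to be joined to, so it gets no operator.
        $this->parts[] = $this->parts === [] ? $fragment : $operator . ' ' . $fragment;

        return $this;
    }

    /** @param string|list<string> $value */
    private function term(string $field, string|array $value, ?int $boost = null, ?int $slop = null): string
    {
        if (is_array($value)) {
            return $field . ':IN [' . implode(', ', $value) . ']';
        }

        // Values containing a space are quoted so they are read as one phrase.
        $term = str_contains($value, ' ') ? '"' . $value . '"' : $value;

        if ($slop !== null) {
            $term .= '~' . $slop;
        }
        if ($boost !== null) {
            $term .= '^' . $boost;
        }

        return $field . ':' . $term;
    }
}

$basic = (new MiniQuery())->where('description', 'keyboard')->get();
$in = (new MiniQuery())->where('description', ['keyboard', 'toy'])->get();
$andNot = (new MiniQuery())
    ->where('category', 'electronics')
    ->whereNot('description', 'keyboard')
    ->get();
$slop = (new MiniQuery())->where('description', 'ergonomic keyboard', slop: 1)->get();

echo $basic, PHP_EOL;  // description:keyboard
echo $in, PHP_EOL;     // description:IN [keyboard, toy]
echo $andNot, PHP_EOL; // category:electronics AND NOT description:keyboard
echo $slop, PHP_EOL;   // description:"ergonomic keyboard"~1
```

The sketch leaves out sub conditions and filters; it only shows that each fluent call appends one fragment and that get() joins the fragments with spaces.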
Performing a basic search
To search, you just use the custom @@@ operator in a regular Eloquent where condition.
Product::query()
    ->where('description', '@@@', 'shoes')
    ->get();
See: https://docs.paradedb.com/documentation/full-text/overview
ParadeDB functions
For more complex operations, it might be necessary to use some of the provided ParadeDB functions, all of which have corresponding query expressions:
JSON
The right side of the @@@ operator also accepts JSON query objects, similar to how Elasticsearch Query DSL works.
use ShabuShabu\ParadeDB\Expressions\JsonB;

Product::query()
    ->where('id', '@@@', new JsonB([
        'fuzzy_term' => [
            'field' => 'description',
            'value' => 'shoez'
        ]
    ]))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/overview
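For reference, the snippet below shows the JSON document that PHP's json_encode produces for the array used in the fuzzy_term example. It is an assumption that the JsonB expression serializes its input in a comparable way; the sketch only illustrates the shape of the query object.

```php
<?php

declare(strict_types=1);

// The same array that is passed to JsonB in the fuzzy_term example.
$query = [
    'fuzzy_term' => [
        'field' => 'description',
        'value' => 'shoez',
    ],
];

// JSON_THROW_ON_ERROR raises a JsonException instead of returning false.
$json = json_encode($query, JSON_THROW_ON_ERROR);

echo $json, PHP_EOL;
// {"fuzzy_term":{"field":"description","value":"shoez"}}
```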
Get all the records
use ShabuShabu\ParadeDB\Expressions\All;
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\Boolean;

Product::query()
    ->where('id', '@@@', new Boolean(
        should: new All(),
        mustNot: new Term('description', 'shoes')
    ))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/compound/all
Check that a field exists
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\Exists;
use ShabuShabu\ParadeDB\Expressions\Boolean;

Product::query()
    ->where('id', '@@@', new Boolean(
        must: [
            new Term('description', 'shoes'),
            new Exists('rating')
        ],
    ))
    ->limit(5)
    ->get();
See: https://docs.paradedb.com/documentation/advanced/term/exists
Get none of the records
use ShabuShabu\ParadeDB\Expressions\Blank;

Product::query()
    ->where('id', '@@@', new Blank())
    ->get();
See: https://docs.paradedb.com/documentation/advanced/compound/empty
Boost a query
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\Boost;
use ShabuShabu\ParadeDB\Expressions\Boolean;

Product::query()
    ->where('id', '@@@', new Boolean(
        should: [
            new Term('description', 'shoes'),
            new Boost(new Term('description', 'running'), 2.0)
        ]
    ))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/compound/boost
Add a constant score
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\Boolean;
use ShabuShabu\ParadeDB\Expressions\ConstScore;

Product::query()
    ->selectWithScore()
    ->where('id', '@@@', new Boolean(
        should: [
            new ConstScore(new Term('description', 'shoes'), 1.0),
            new Term('description', 'running'),
        ]
    ))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/compound/const
Perform a disjunction max query
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\Score;
use ShabuShabu\ParadeDB\Expressions\DisjunctionMax;

Product::query()
    ->select(['*', new Score()])
    ->where('id', '@@@', new DisjunctionMax([
        new Term('description', 'shoes'),
        new Term('description', 'running'),
    ]))
    ->get();
The DisjunctionMax constructor accepts an array of queries. When you have several queries, the fluent interface might be more convenient:
use ShabuShabu\ParadeDB\TantivyQL\Query;
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\Score;
use ShabuShabu\ParadeDB\Expressions\DisjunctionMax;

Product::query()
    ->select(['*', new Score()])
    ->where('id', '@@@', DisjunctionMax::query()
        ->add(Query::string()->where('description', 'shoes'))
        ->add('description:running')
        ->tieBreaker(1.2)
    )
    ->get();
This also allows you to conditionally add queries:
use ShabuShabu\ParadeDB\TantivyQL\Query;
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\Score;
use ShabuShabu\ParadeDB\Expressions\DisjunctionMax;

Product::query()
    ->select(['*', new Score()])
    ->where('id', '@@@', DisjunctionMax::query()
        ->add(Query::string()->where('description', 'shoes'))
        ->add('description:running', when: false)
    )
    ->get();
See: https://docs.paradedb.com/documentation/advanced/compound/disjunction_max
Search for a fuzzy term
use ShabuShabu\ParadeDB\Expressions\FuzzyTerm;

Product::query()
    ->where('id', '@@@', new FuzzyTerm('description', 'shoez'))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/term/fuzzy_term
Search for a fuzzy phrase
use ShabuShabu\ParadeDB\Expressions\FuzzyPhrase;

Product::query()
    ->where('id', '@@@', new FuzzyPhrase('description', 'ruining shoez'))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/phrase/fuzzy_phrase
Parse a Tantivy query string
Useful for directly searching for user-supplied queries.
use ShabuShabu\ParadeDB\Expressions\Parse;

Product::query()
    ->where('id', '@@@', new Parse('description:"running shoes" OR category:footwear'))
    ->get();
Additionally, ParadeDB Search for Laravel comes with its own Tantivy Query Language Builder:
use ShabuShabu\ParadeDB\TantivyQL\Query;
use ShabuShabu\ParadeDB\Expressions\Parse;

Product::query()
    ->where('id', '@@@', new Parse(
        Query::string()
            ->where('description', 'running shoes')
            ->orWhere('category', 'footwear')
    ))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/compound/parse
Parse a Tantivy query string for a given field
Like ShabuShabu\ParadeDB\Expressions\Parse above, but it takes a query string without fields and searches for the given field.
use ShabuShabu\ParadeDB\Expressions\ParseWithField;

Product::query()
    ->where('id', '@@@', new ParseWithField(
        field: 'description',
        query: 'speaker bluetooth',
        conjunctionMode: true,
    ))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/compound/parse#parse-with-field
Highlight search terms
use ShabuShabu\ParadeDB\Expressions\Snippet;

Product::query()
    ->select(['id', new Snippet('description')])
    ->where('description', '@@@', 'shoes')
    ->limit(5)
    ->get();
See: https://docs.paradedb.com/documentation/full-text/highlighting
Search for a phrase
use ShabuShabu\ParadeDB\Expressions\Phrase;

Product::query()
    ->where('id', '@@@', new Phrase(
        field: 'description',
        phrases: ['sleek', 'shoes'],
        slop: 1,
    ))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/phrase/phrase
Perform a phrase prefix query
use ShabuShabu\ParadeDB\Expressions\PhrasePrefix;

Product::query()
    ->where('id', '@@@', new PhrasePrefix('description', ['running', 'sh']))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/phrase/phrase_prefix
Search within a given range
use ShabuShabu\ParadeDB\Expressions\Range;
use ShabuShabu\ParadeDB\Expressions\Ranges\Int4;
use ShabuShabu\ParadeDB\Expressions\Ranges\Bounds;

Product::query()
    ->where('id', '@@@', new Range('rating', new Int4(1, 3, Bounds::includeStartExcludeEnd)))
    ->get();
Here are the supported range types (all within the ShabuShabu\ParadeDB\Expressions\Ranges namespace, as imported in the example above), plus their corresponding Postgres type:
- Int4::class or int4range
- Int8::class or int8range
- Numeric::class or numrange
- Date::class or daterange
- Timestamp::class or tsrange
- TimestampTz::class or tstzrange
See: https://docs.paradedb.com/documentation/advanced/term/range
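The bounds passed to a range type decide whether each end point is included. The sketch below maps bounds to the bracket notation Postgres uses for range literals, where a square bracket includes the end point and a parenthesis excludes it. DemoBounds and rangeLiteral are hypothetical names made up for this illustration; the case names are modeled on those seen in this README, but this is not the package's Bounds enum.

```php
<?php

declare(strict_types=1);

// Hypothetical enum for illustration only; not the package's Bounds enum.
// Each case stores the opening and closing bracket of a Postgres range literal.
enum DemoBounds: string
{
    case includeAll = '[]';
    case excludeAll = '()';
    case includeStartExcludeEnd = '[)';
    case excludeStartIncludeEnd = '(]';
}

function rangeLiteral(?int $start, ?int $end, DemoBounds $bounds): string
{
    [$open, $close] = str_split($bounds->value);

    // A null bound is left empty, which Postgres reads as unbounded on that side.
    return $open . ($start ?? '') . ',' . ($end ?? '') . $close;
}

echo rangeLiteral(1, 3, DemoBounds::includeStartExcludeEnd), PHP_EOL; // [1,3)
echo rangeLiteral(null, 2, DemoBounds::excludeAll), PHP_EOL;          // (,2)
```

Under this reading, Int4(1, 3, Bounds::includeStartExcludeEnd) would cover the ratings 1 and 2, while a null start leaves the range open at the lower end.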
Find ranges for a given value
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\Boolean;
use ShabuShabu\ParadeDB\Expressions\RangeTerm;

Product::query()
    ->where('id', '@@@', new Boolean(
        must: [
            new RangeTerm('weight_range', 1),
            new Term('category', 'footwear')
        ]
    ))
    ->get();
Ranges can also be compared to other ranges:
use ShabuShabu\ParadeDB\Expressions\RangeTerm;
use ShabuShabu\ParadeDB\Expressions\Ranges\Int4;
use ShabuShabu\ParadeDB\Expressions\Ranges\Relation;

Product::query()
    ->where('id', '@@@', new RangeTerm(
        field: 'weight_range',
        term: new Int4(10, 12),
        relation: Relation::intersects,
    ))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/term/range_term
Perform a regex query
use ShabuShabu\ParadeDB\Expressions\Regex;

Product::query()
    ->where('id', '@@@', new Regex('description', '(plush|leather)'))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/term/regex
Search for a term
use ShabuShabu\ParadeDB\Expressions\Term;

Product::query()
    ->where('id', '@@@', new Term('rating', 4))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/term/term
Search for a set of terms
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\TermSet;

Product::query()
    ->where('id', '@@@', new TermSet([
        new Term('description', 'shoes'),
        new Term('description', 'running'),
    ]))
    ->get();
The above query can also be written in a fluent manner:
Product::query()
    ->where('id', '@@@', TermSet::query()
        ->add(new Term('description', 'shoes'))
        ->add(new Term('description', 'running'))
    )
    ->get();
The add method allows you to conditionally add terms by passing a boolean as the second argument:
$when = false;

Product::query()
    ->where('id', '@@@', TermSet::query()
        ->add(new Term('description', 'shoes'), $when)
    )
    ->get();
See: https://docs.paradedb.com/documentation/advanced/term/term_set
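The conditional-add pattern used by the fluent builders can be sketched in plain PHP as follows. ConditionalSet is a hypothetical class written for this illustration and is not part of the package; it only shows why a boolean second argument keeps a call chain free of if statements.

```php
<?php

declare(strict_types=1);

// Hypothetical collector for illustration; not the package's TermSet class.
final class ConditionalSet
{
    /** @var list<string> */
    private array $items = [];

    public function add(string $item, bool $when = true): self
    {
        // Items are silently skipped when the condition is false,
        // so the caller never has to break the chain.
        if ($when) {
            $this->items[] = $item;
        }

        return $this;
    }

    /** @return list<string> */
    public function all(): array
    {
        return $this->items;
    }
}

$filterByBrand = false; // e.g. derived from request input

$terms = (new ConditionalSet())
    ->add('description:shoes')
    ->add('brand:acme', $filterByBrand)
    ->all();

print_r($terms); // contains only 'description:shoes'
```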
Perform a complex boolean query
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\Range;
use ShabuShabu\ParadeDB\Expressions\Boolean;
use ShabuShabu\ParadeDB\Expressions\FuzzyTerm;
use ShabuShabu\ParadeDB\Expressions\Ranges\Int4;
use ShabuShabu\ParadeDB\Expressions\Ranges\Bounds;

Product::query()
    ->where('id', '@@@', new Boolean(
        should: new Term('description', 'headphones'),
        must: [
            new Term('category', 'electronics'),
            new FuzzyTerm('description', 'bluetooht'),
        ],
        mustNot: new Range('rating', new Int4(null, 2, Bounds::excludeAll)),
    ))
    ->get();
Boolean queries can also be constructed in a fluent manner:
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\Range;
use ShabuShabu\ParadeDB\Operators\FullText;
use ShabuShabu\ParadeDB\Expressions\Boolean;
use ShabuShabu\ParadeDB\Expressions\FuzzyTerm;
use ShabuShabu\ParadeDB\Expressions\Ranges\Int4;
use ShabuShabu\ParadeDB\Expressions\Ranges\Bounds;

Product::query()
    ->where('id', FullText::search->value, Boolean::query()
        ->should(new Term('description', 'headphones'))
        ->must(new Term('category', 'electronics'))
        ->must(new FuzzyTerm('description', 'bluetooht'))
        ->mustNot(new Range('rating', new Int4(null, 2, Bounds::excludeAll)))
    )
    ->get();
The two queries above are identical. The fluent methods allow you to conditionally add queries, though:
use ShabuShabu\ParadeDB\Expressions\Term;
use ShabuShabu\ParadeDB\Expressions\Range;
use ShabuShabu\ParadeDB\Expressions\Boolean;
use ShabuShabu\ParadeDB\Expressions\FuzzyTerm;
use ShabuShabu\ParadeDB\Expressions\Ranges\Int4;
use ShabuShabu\ParadeDB\Expressions\Ranges\Bounds;

$when = false;

Product::query()
    ->where('id', '@@@', Boolean::query()
        ->should(new Term('description', 'headphones'))
        ->must(new Term('category', 'electronics'))
        ->must(new FuzzyTerm('description', 'bluetooht'), $when)
        ->mustNot(new Range('rating', new Int4(null, 2, Bounds::excludeAll)))
    )
    ->get();
See: https://docs.paradedb.com/documentation/advanced/compound/boolean
Sort by rank
use ShabuShabu\ParadeDB\Expressions\Score;

Product::query()
    ->addSelect(new Score())
    ->where('description', '@@@', 'shoes')
    ->orderBy(new Score())
    ->limit(5)
    ->get();
See: https://docs.paradedb.com/documentation/full-text/scoring
Find similar documents
When you pass a document ID (that is, an Eloquent model key), documents similar to that document are returned.
use ShabuShabu\ParadeDB\Expressions\MoreLikeThis;

Product::query()
    ->where('id', '@@@', new MoreLikeThis(
        idOrFields: 3,
        minTermFrequency: 1,
    ))
    ->get();
Alternatively, you can pass in document fields instead of an id to search against:
use ShabuShabu\ParadeDB\Expressions\MoreLikeThis;

Product::query()
    ->where('id', '@@@', new MoreLikeThis(
        idOrFields: ['description' => 'shoes'],
        minDocFrequency: 0,
        maxDocFrequency: 100,
        minTermFrequency: 1,
    ))
    ->get();
See: https://docs.paradedb.com/documentation/advanced/specialized/more_like_this
Using the query builder macro
In quite a lot of cases, the column you search against will be id. For this reason, you can also use the provided whereSearch macro.
use ShabuShabu\ParadeDB\Expressions\MoreLikeThis;

Product::query()
    ->whereSearch(new MoreLikeThis(idOrFields: 3, minTermFrequency: 1))
    ->get();
Hybrid search
pg_search also allows you to perform hybrid full-text/similarity searches. For this to work you will need to install pgvector. Please note that ParadeDB Search for Laravel already registers all custom pgvector operators for you.
use Tpetry\QueryExpressions\Value\Value;
use ShabuShabu\ParadeDB\Expressions\Rank;
use ShabuShabu\ParadeDB\Expressions\Score;
use Tpetry\QueryExpressions\Language\Alias;
use ShabuShabu\ParadeDB\Operators\Distance;
use ShabuShabu\ParadeDB\Expressions\Similarity;
use Tpetry\QueryExpressions\Operator\Arithmetic\Add;
use Tpetry\QueryExpressions\Operator\Arithmetic\Divide;
use Tpetry\QueryExpressions\Function\Conditional\Coalesce;

Product::query()
    ->withExpression('semantic_search', Product::query()
        ->select([
            'id',
            new Alias(new Rank([
                [new Similarity('embedding', Distance::cosine, [1, 2, 3]), 'asc']
            ]), 'rank'),
        ])
        ->orderBy(new Similarity('embedding', Distance::cosine, [1, 2, 3]))
        ->limit(20)
    )
    ->withExpression('bm25_search', Product::query()
        ->select([
            'id',
            new Alias(new Rank([new Score(), 'asc']), 'rank'),
        ])
        ->where('description', '@@@', 'keyboard')
        ->limit(20)
    )
    ->select([
        new Alias(new Coalesce(['semantic_search.id', 'bm25_search.id']), 'id'),
        new Alias(new Add(
            new Coalesce([
                new Divide(new Value(1.0), new Add(new Value(60), 'semantic_search.rank')),
                new Value(0.0),
            ]),
            new Coalesce([
                new Divide(new Value(1.0), new Add(new Value(60), 'bm25_search.rank')),
                new Value(0.0),
            ]),
        ), 'score'),
        'products.description',
        'products.embedding'
    ])
    ->from('semantic_search')
    ->join('bm25_search', 'semantic_search.id', '=', 'bm25_search.id', 'full outer')
    ->join('products', 'products.id', '=', new Coalesce(['semantic_search.id', 'bm25_search.id']))
    ->orderByDesc('score')
    ->orderBy('description')
    ->limit(5);
See: https://docs.paradedb.com/documentation/guides/hybrid
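The score expression in the hybrid query adds 1 / (60 + rank) from each of the two result sets and treats a missing rank as 0, a scheme commonly known as reciprocal rank fusion. The plain-PHP sketch below reproduces that arithmetic on made-up data so the ordering is easier to follow. The function name reciprocalRankFusion and the sample ids and ranks are assumptions for illustration and are not part of the package.

```php
<?php

declare(strict_types=1);

/**
 * Combine two rankings the same way the score column above does.
 *
 * @param array<int, int> $semanticRanks document id => rank (1 is best)
 * @param array<int, int> $bm25Ranks     document id => rank (1 is best)
 * @return array<int, float> document id => fused score, highest first
 */
function reciprocalRankFusion(array $semanticRanks, array $bm25Ranks, int $k = 60): array
{
    $scores = [];

    foreach ([$semanticRanks, $bm25Ranks] as $ranks) {
        foreach ($ranks as $id => $rank) {
            // A document missing from one list simply gets no contribution
            // from it, which mirrors the Coalesce(..., 0.0) in the query.
            $scores[$id] = ($scores[$id] ?? 0.0) + 1.0 / ($k + $rank);
        }
    }

    arsort($scores); // highest score first, keys preserved

    return $scores;
}

// Made-up sample data: document 11 appears in both result sets.
$fused = reciprocalRankFusion(
    [10 => 1, 11 => 2, 12 => 3],
    [11 => 1, 13 => 2],
);

foreach ($fused as $id => $score) {
    printf("%d: %.6f\n", $id, $score);
}
```

With this sample data, document 11 ends up first because it collects a contribution from both lists, even though it is not ranked first in the semantic list.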
A word of caution
While it is possible to combine ParadeDB queries with regular Eloquent queries, you will incur some performance penalties.
For optimal performance it is recommended to let the bm25 index do as much work as possible!
Getting help
If your issue has something to do with this package, then please use the issues and discussions!
If your issue is related to pg_search, though, then please create a discussion in the ParadeDB repo.
To make this a bit easier, you can use the paradedb:help command that ships with this package:
php artisan paradedb:help
Please note that this command is just an implementation of the paradedb.help() function. Please use this command wisely!
Testing
The tests require a PostgreSQL database, which can easily be set up by running the following script:
composer testdb
Warning
Please note that both the pg_search and pgvector extensions need to be available already.
Then run the tests:
composer test
Or with test coverage:
composer test-coverage
Or with type coverage:
composer type-coverage
Or run PHPStan:
composer analyse
ParadeDB test table
There is also a command that allows you to create and drop the built-in test table:
php artisan paradedb:test-table create
Changelog
Please see CHANGELOG for more information on what has changed recently.
Contributing
Please see CONTRIBUTING for details.
Security Vulnerabilities
Please review our security policy on how to report security vulnerabilities.
Credits
- Taylor Otwell for creating Laravel
- ParadeDB for creating pg_search
- ShabuShabu
- All Contributors
Disclaimer
This is a third-party package and ShabuShabu is not affiliated with either Laravel or ParadeDB.
License
The MIT License (MIT). Please see License File for more information.