Laravel MongoDB Full-Text Search tutorial: The Art of the Relevancy

There are very compelling reasons to use a full-text search based on an inverted index and a relevancy scoring model. In my experience, the best reason is when you're actually trying to perform a Search function and expect the first result to be the most relevant. That is exactly why search engines were built, and I'll assume that's your main use case.

Secondly, the inverted index may be superior to classic database indexes in some cases, but remember that it's not its primary purpose.

For the remainder of this article, we'll use "search" and "query" this way:

  • "Search" means to retrieve and return information ranked by relevancy (the most important concept here). The first document returned is the most relevant, and subsequent search results are less and less relevant according to the relevancy algorithm/score.
  • "Query" focuses on finding information but does not imply that relevancy is important, and is more akin to a regular database query that returns information matching certain criteria.
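To make the distinction concrete, here is a minimal plain-PHP sketch. The tiny in-memory dataset, the helper names queryByTerm and searchByTerm, and the occurrence-count score are all illustrative assumptions; this is not how MongoDB works internally.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory "collection"; titles and plots are made up for illustration.
$docs = [
    ['title' => 'Cooking Basics', 'plot' => 'A chef teaches cooking.'],
    ['title' => 'Space Chef', 'plot' => 'Cooking in orbit: cooking, cooking, and more cooking.'],
    ['title' => 'Road Trip', 'plot' => 'Two friends drive across the country.'],
];

// "Query": return every document that matches, in storage order.
function queryByTerm(array $docs, string $term): array
{
    return array_values(array_filter(
        $docs,
        fn (array $d): bool => stripos($d['title'] . ' ' . $d['plot'], $term) !== false
    ));
}

// "Search": return matches ordered by a score (here, a naive occurrence count).
function searchByTerm(array $docs, string $term): array
{
    $scored = [];
    foreach ($docs as $d) {
        $score = substr_count(strtolower($d['title'] . ' ' . $d['plot']), strtolower($term));
        if ($score > 0) {
            $scored[] = $d + ['score' => $score];
        }
    }
    usort($scored, fn (array $a, array $b): int => $b['score'] <=> $a['score']);
    return $scored;
}

$queryResult = queryByTerm($docs, 'cooking');
$searchResult = searchByTerm($docs, 'cooking');
echo $queryResult[0]['title'], PHP_EOL;  // Cooking Basics (first match in storage order)
echo $searchResult[0]['title'], PHP_EOL; // Space Chef (most occurrences of the term)
```

Both functions find the same two documents; only the search version makes a claim about which one should come first.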

Oftentimes, using a search engine requires setting up, maintaining, and securing a dedicated search system. On top of that, you have to learn a new API and the quirks that come with every system. It can be so involved that some people would rather have a poor search experience than deal with the hassle.

Introduction: The New Era of MongoDB Search

Removing this friction is the motivation behind MongoDB Search and Vector Search being built into the same software and accessible by one connection URL and API. Today, this powerful search functionality is available in the free MongoDB Community edition (run it locally!) and, of course, on Atlas, the cloud MongoDB platform.

Laravel users are top of mind, and this article will demonstrate how to utilize Full Text Search (FTS) with an existing MongoDB database.

Laravel implementation

In this article, we'll use a GitHub code repository to illustrate how MongoDB Search works in Laravel. I created a tag, as the repo will evolve in the future.

The repo's README.md has more information and is structured to mirror the tutorial's natural progression: Each section includes the "why" behind configuration choices, example commands with expected outputs, and a troubleshooting guide that explains what went wrong and how to fix it.

Prerequisites to run the repo

Connect the Laravel app to the MongoDB database

If you haven't connected to MongoDB from Laravel before, we have a more detailed tutorial on how to build a back-end service with Laravel and MongoDB. In this article, we'll emphasize only the main points related to using MongoDB Search in Laravel.

We'll assume that your MongoDB Atlas cluster is now running and that you have loaded the sample data, especially the sample_mflix database, which we will be using. To browse the data, you can use the Atlas web interface or, alternatively, MongoDB Compass, a native app that offers a more responsive and user-friendly experience.

The sample_mflix database contains two movie collections, "movies" and "embedded_movies." We are going to work on the "movies" collection, as it does not contain vector embeddings. Our Laravel app will create the embeddings.

Connect to the database

Using the code repo, let's first connect to the MongoDB database.

Cluster network access: Make sure your current IP address is allowed through the cluster's firewall by adding it to the allowed list. If you are working on public WiFi (hotel, convention center, airport, etc.), adding the IP won't be enough, and you may need to allow all IPs to have access, which is not recommended for security. Follow the instructions on the official documentation page (jump to "Add IP Access List Entries" and choose the "Atlas GUI" tab).

Connect from Laravel: Create an .env file, based on the .env.example file, and look for DB_CONNECTION=mongodb. Right below that, you have to update the DB_DSN entry with the actual connection string of your live cluster that includes username and password. Here's a tutorial that shows how to get the connection string in Atlas (jump to "In Atlas, go to the Clusters page for your project").

Our Codespaces environment runs the three commands below at startup (see init_repo.sh in the repo). If you prefer your own setup, don't forget to initialize the repo with the same commands:

# create a new .env file
cp .env.example .env

# download libraries required by our app
composer install

# generate the keys for this app
php artisan key:generate

In the .env file, replace the sample DB_DSN MongoDB connection string with your own.

DB_CONNECTION=mongodb
DB_DSN=mongodb+srv://USERNAME:PASSWORD@cluster.mongodb.net/sample_mflix?retryWrites=true&w=majority
DB_DATABASE=sample_mflix

Depending on your Laravel environment, you may have different base URLs, for example:

Environment | Example URL
Local PHP | http://localhost:8000/api/hello
Codespaces | https://[unique-id]-8000.app.github.dev/api/hello
Docker | http://127.0.0.1:8080/api/hello

In the article, we'll simply use

{{BASE_URL}}/api/hello

In the Codespaces environment, you can find the URL by hovering above the "globe" icon in the Ports tab.

We're going to build some API endpoints, so in Codespaces, port 8000 is made "public" to facilitate access. If you see a warning message when opening the URL, just click on Continue.

Note that MongoDB's schema flexibility allows us to remain migration-free for now, so we won't have to execute "php artisan migrate". Once your credentials are in, run this command to launch the app:

php artisan serve

There are some API endpoints that have been created for testing the app and the connection to MongoDB.

Remember, in Codespaces, the URL format is {friendly-name}-{random-hash}-{port}.app.github.dev and can be obtained in the Ports tab.

# returns {"response":"hello world"} if the app is up and running
curl {{BASE_URL}}/api/hello
 
# returns {"status":"success","connection":"MongoDB connection successful"...
curl {{BASE_URL}}/api/mongodb-test

If both API calls are successful, our connection is solid, and we are ready for the next step: start our Relevancy-based Search journey!

MongoDB Full-Text Search Options: $text Index vs. MongoDB Search (Lucene-powered)

Why LIKE Queries and Regex Fail: Moving Beyond Basic Pattern Matching

Let's provide some context. When developers want to search records based on text, it is not uncommon to start with an exact match on a text field, with inherent usability limitations.

Subsequently, regex matching is introduced to return data based on certain string patterns. However, a regex often involves a full (B-tree) index scan. Although that is better than a full collection scan, it does not scale well, and the added latency will degrade the user experience as the dataset grows.

A MongoDB text index can be a bit better for natural language queries: it features tokenization, removes stop words (the, a, and…), and applies stemming (the word "running" becomes the "run" token). However, you won't be able to use regex on this kind of index, as the original strings are processed into tokens.
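The three steps (tokenization, stop-word removal, stemming) can be sketched in a few lines of plain PHP. This is a toy analyzer written for illustration only: the stop-word list and the "strip -ing" stemming rule are assumptions, and real analyzers are far more sophisticated.

```php
<?php
declare(strict_types=1);

// Illustrative only: a toy analyzer, not the one MongoDB uses.
function analyze(string $text): array
{
    $stopWords = ['the', 'a', 'an', 'and', 'of', 'is', 'in']; // hypothetical, tiny stop list
    // Lowercase, then split on anything that is not a letter or digit.
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $tokens = [];
    foreach ($words as $word) {
        if (in_array($word, $stopWords, true)) {
            continue; // drop stop words
        }
        // Very naive stemming: strip "ing", then collapse a doubled final letter.
        if (strlen($word) > 5 && str_ends_with($word, 'ing')) {
            $word = substr($word, 0, -3);
            if (strlen($word) > 2 && $word[-1] === $word[-2]) {
                $word = substr($word, 0, -1);
            }
        }
        $tokens[] = $word;
    }
    return $tokens;
}

$tokens = analyze('The man is running in the rain');
echo implode(' ', $tokens), PHP_EOL; // man run rain
```

Once the text has been reduced to tokens like these, the original string is gone from the index, which is why a regex has nothing to match against.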

The results can be ordered in various ways. For example, to use these techniques on a blog, you may sort the results and have the most recent articles appear at the top. The search process seemingly works, but the most recent article may not be the most relevant article for that search phrase.

MongoDB Search Powered by Lucene: Enterprise Search in Your Database

MongoDB Search is a powerful, built-in search capability based on Lucene, an open-source search library on which many big-name search engines are built. This feature started in the Atlas cloud and has also been available in the MongoDB Community edition since September 17, 2025.

MongoDB exposes the Lucene functionality as an aggregation pipeline, which looks and feels just like other MongoDB database queries and is accessed with the same database connection. No additional DevOps work. In this article, we'll explore how to use MongoDB Search using the native Laravel API.

Why Your Current Search Is Probably Sub-Optimal

Before diving into the code, it's good to know some fundamental principles of MongoDB Search and the underlying Lucene architecture. Using an ordinary MongoDB database index (B-tree) to search for text is likely to be slower. In some instances, you can scan a limited range of the index, but many use cases end up triggering a full index scan or collection scan. At scale, this is challenging, and going from an exact match to a regex makes it even more costly.

The legacy text database index ($text) is better as it introduces some elements important to search.

  1. Tokens: strings are processed, insignificant words are removed, and word indexing is optimized by transforming the original words into "tokens". A token is a word/string that is used for indexing.
  2. Relevancy: the legacy text index has a basic relevancy algorithm based on Term Frequency (TF).
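The two ideas above can be sketched together: an inverted index maps each token to the documents that contain it, and a raw term-frequency count gives a first, crude ranking. The mini-corpus and the function names buildInvertedIndex and searchByTf are hypothetical, chosen only for this sketch.

```php
<?php
declare(strict_types=1);

// Hypothetical mini-corpus keyed by document id.
$docs = [
    1 => 'mafia boss retires',
    2 => 'mafia war mafia revenge',
    3 => 'space adventure',
];

// Build the inverted index: token => [docId => term frequency].
function buildInvertedIndex(array $docs): array
{
    $index = [];
    foreach ($docs as $id => $text) {
        foreach (explode(' ', strtolower($text)) as $token) {
            $index[$token][$id] = ($index[$token][$id] ?? 0) + 1;
        }
    }
    return $index;
}

// Look up one token and rank the matching documents by raw term frequency.
function searchByTf(array $index, string $token): array
{
    $postings = $index[strtolower($token)] ?? [];
    arsort($postings); // highest frequency first, doc ids preserved as keys
    return $postings;
}

$index = buildInvertedIndex($docs);
$ranked = searchByTf($index, 'mafia');
print_r($ranked); // doc 2 (tf 2) ranks above doc 1 (tf 1); doc 3 is never visited
```

The lookup touches only the documents listed under the token, which is what makes an inverted index attractive compared with scanning every string.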

MongoDB Search (powered by Lucene) takes search to the next level and features:

  1. Fuzzy matching using Levenshtein Distance logic.
    1. It can automatically find "Smartphone" even if the user makes a typo
  2. Specific spoken and written language support
  3. Much better scoring mechanism with the BM25 algorithm
  4. Autocomplete to suggest relevant results in real time (see the Relevant As-You-Type Suggestions tutorial).
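To get a feel for the Levenshtein distance mentioned in the first item, PHP's built-in levenshtein() function is enough. The term list, the fuzzyMatch helper, and the limit of 2 edits are assumptions made for this sketch; the engine applies the same idea against its own index rather than a PHP array.

```php
<?php
declare(strict_types=1);

// Return the terms within $maxEdits single-character edits of the user's input.
// $maxEdits = 2 is an assumption chosen for this sketch.
function fuzzyMatch(string $input, array $terms, int $maxEdits = 2): array
{
    $matches = [];
    foreach ($terms as $term) {
        $distance = levenshtein(strtolower($input), strtolower($term));
        if ($distance <= $maxEdits) {
            $matches[$term] = $distance;
        }
    }
    asort($matches); // closest terms first
    return $matches;
}

$terms = ['Smartphone', 'Smartwatch', 'Headphones'];
$matches = fuzzyMatch('Smartfone', $terms);
print_r($matches); // only "Smartphone" is within 2 edits of the typo
```

"Smartfone" becomes "Smartphone" with one substitution (f to p) and one insertion (h), so its distance is 2, while the other two terms are much further away.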

There are more advantages (multiple clauses, phrase search, relevancy tuning controls, analysis configuration, index intersection, and more), but for now, I think these are the primary things to focus on. Together, they can vastly improve your search functionality by surfacing the most relevant results first.

Creating a Lucene-like Full-Text Search Index in MongoDB

Now let's code! Assuming the Laravel app is running, and you've been able to test that your MongoDB Atlas database is connected, you only need to take two additional steps before launching your first high-end search query! First, we'll create a search Index, the inverted index we talked about before. Secondly, we'll use the MongoDB Aggregation Pipeline to run the search query.

Search Index Creation

The search index is created in the CreateFullTextSearchIndex command. The main code is:

$indexName = config('fulltext.index.name');
$collectionName = config('vector.collection');

// Get full-text search configuration
$searchFields = ['title', 'plot', 'fullplot', 'cast', 'directors'];

// Build field mappings for full-text search
$fieldMappings = [];
foreach ($searchFields as $field) {
    $fieldMappings[$field] = ['type' => 'string'];
}

// Create full-text search index
// ($collection is the MongoDB collection object for $collectionName, obtained elsewhere in the command)
$this->info('Creating new full-text search index...');
$result = $collection->createSearchIndex(
    [
        'mappings' => [
            'dynamic' => false,
            'fields' => $fieldMappings,
        ],
    ],
    [
        'name' => $indexName,
    ]
);

We call createSearchIndex with dynamic=false because we want to be intentional in the selection of attributes to be indexed. We know our data, and at the moment, the five attributes in $searchFields are the ones we think we'll need.
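To see exactly what the mapping loop produces, this standalone sketch rebuilds the definition array and prints it as JSON. It does not connect to MongoDB; it only reproduces the array-building part of the command so you can inspect the structure before creating the index.

```php
<?php
declare(strict_types=1);

$searchFields = ['title', 'plot', 'fullplot', 'cast', 'directors'];

// Same mapping loop as in the command: one "string" mapping per field.
$fieldMappings = [];
foreach ($searchFields as $field) {
    $fieldMappings[$field] = ['type' => 'string'];
}

// The definition array that would be handed to createSearchIndex().
$definition = [
    'mappings' => [
        'dynamic' => false, // only the fields listed below are indexed
        'fields' => $fieldMappings,
    ],
];

echo json_encode($definition, JSON_PRETTY_PRINT), PHP_EOL;
```

Any attribute that is not listed under fields, such as genres or year, is left out of the search index because dynamic is false.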

To trigger the creation of the index, execute the command:

php artisan fulltext:create-index
 
# Force recreate (deletes existing index first)
# php artisan fulltext:create-index --force

Implementing MongoDB $search in Laravel Eloquent: PHP Code Examples

Great, we know our search index is ready (check in the GUI if you want) and working inside MongoDB, but let's access that functionality via the Laravel framework.

We've implemented a "naive" search query in MovieSearchTextController::naive() to show you the mechanics and basic syntax, and what comes out with zero tuning. The main query is:

 
$results = Movie::query()
    ->aggregate()
    ->search(
        operator: Search::text(
            path: config('fulltext.index.fields', ['title', 'plot', 'fullplot', 'cast', 'directors']),
            query: $query
        ),
        index: config('fulltext.index.name')
    )
    ->addFields(score: ['$meta' => 'searchScore'])
    ->limit(config('fulltext.search.limit'))
    ->get();

To gain more insight, we asked the search engine to return its internal relevancy score by using $meta. This is important because we want to gauge how relevant the results are.

 
curl -X POST {{BASE_URL}}/api/search-text-naive \
-H "Content-Type: application/json" \
-d '{"query":"your search term here"}'

Sample output

{
  "query": "The Godfather",
  "results": [
    {
      "_id": {
        "$oid": "573a13b0f29313caabd341d2"
      },
      "title": "C(r)ook",
      "plot": "A killer for the Russian Mafia in Vienna wants to retire and write a book about his passion - cooking. The mafia godfather suspects treason.",
      "fullplot": "A killer for the Russian Mafia in Vienna wants to retire and write a book about his passion - cooking. The mafia godfather suspects treason.",
      "genres": [
        "Comedy"
      ],
      "year": 2004,
      "cast": [
        "Henry Hèbchen",
        "Moritz Bleibtreu",
        "Corinna Harfouch",
        "Nadeshda Brennicke"
      ],
      "directors": [
        "Pepe Danquart"
      ],
      "poster": "https://m.media-amazon.com/images/M/MV5BNDY2MjlkMjYtNjJkYi00Yjc0LWI2MTItOTEwOWU4YzNkYjEwL2ltYWdlL2ltYWdlXkEyXkFqcGdeQXVyMzA3Njg4MzY@._V1_SY1000_SX677_AL_.jpg",
      "score": 8.478110313415527
    },
    {<movie-2>},
    ...
    {<movie-10>}
  ],
  "count": 10,
  "search_type": "naive",
  "index": "movies_fulltext_index"
}

Suggested search terms for testing: "Titanic", "space adventure aliens", "Tom Hanks drama."

We sent a search query and let the BM25 algorithm use its default settings to return somewhat relevant results.

To unlock the full power of search, you, the developer, need to spice things up. The art of search is to take action to increase the relevancy using your intimate knowledge of both the data and how users want to search it.

MongoDB Search Field Weighting: Boosting Title, Cast, and Plot Fields

From my experience, movie searches fall into three main patterns: title-first searches (60-70%), where users know exactly what they want; discovery/conceptual searches (20-30%), where users describe themes, plots, or moods; and actor/director searches (10-15%), where users look for content by talent. For text-based search systems, this suggests the title should receive the highest weight, followed by curated plot summaries for conceptual matching, with full descriptions serving as supplementary context.

Based on the above and given our dataset, we can start assigning different weights to different fields. We'll go with this set of weights:

  • Title exact phrase match (10x) - Highest priority for exact title matches
  • Title match (7x) - High priority for partial title matches
  • Cast (5x) - Medium-high priority for actor-based searches
  • Plot (3x) - Medium priority for curated summaries that capture the movie's essence
  • Directors (2x) - Lower priority for director-based searches
  • Fullplot (1x) - Standard weight for comprehensive descriptions

The query is implemented in MovieSearchTextController::weighted(), and the interesting part is:

$results = Movie::query()
    ->aggregate()
    ->search(
        operator: Search::compound(
            should: [
                // Exact phrase match on title - highest priority
                Search::phrase(
                    path: 'title',
                    query: $query,
                    score: ['boost' => ['value' => 10]]
                ),
                // Text match on title (any query term) - high priority
                Search::text(
                    path: 'title',
                    query: $query,
                    score: ['boost' => ['value' => 7]]
                ),
                // Cast match - actor-based searches
                Search::text(
                    path: 'cast',
                    query: $query,
                    score: ['boost' => ['value' => 5]]
                ),
                // Plot match - curated summary
                Search::text(
                    path: 'plot',
                    query: $query,
                    score: ['boost' => ['value' => 3]]
                ),
                // Directors match - director-based searches
                Search::text(
                    path: 'directors',
                    query: $query,
                    score: ['boost' => ['value' => 2]]
                ),
                // Full plot match - standard weight
                Search::text(
                    path: 'fullplot',
                    query: $query,
                    score: ['boost' => ['value' => 1]]
                ),
            ]
        ),
        index: config('fulltext.index.name')
    )
    ->addFields(score: ['$meta' => 'searchScore'])
    ->limit(config('fulltext.search.limit'))
    ->get();

You can see how each attribute gets a boost factor and how the syntax works. You can refer to the MongoDB Search documentation to learn more about the MongoDB Query Language for search.
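As a rough mental model, you can think of the final score as the sum of every matching clause's base score multiplied by its boost. The sketch below works under that assumption; the base scores are invented numbers, not real engine output, and combinedScore is a hypothetical helper used only to show the arithmetic.

```php
<?php
declare(strict_types=1);

// Boost factors copied from the weighted() query above.
$boosts = [
    'title_phrase' => 10,
    'title' => 7,
    'cast' => 5,
    'plot' => 3,
    'directors' => 2,
    'fullplot' => 1,
];

// Each matching clause contributes (base score x boost).
// Clauses that did not match are simply absent from $baseScores.
function combinedScore(array $baseScores, array $boosts): float
{
    $total = 0.0;
    foreach ($baseScores as $clause => $score) {
        $total += $score * ($boosts[$clause] ?? 1);
    }
    return $total;
}

// Hypothetical base scores, invented for illustration.
$titleHit = ['title_phrase' => 2.0, 'title' => 2.0]; // query matches the title only
$plotHit = ['plot' => 2.0, 'fullplot' => 2.0];       // query matches the plot fields only

$titleScore = combinedScore($titleHit, $boosts); // 2*10 + 2*7 = 34
$plotScore = combinedScore($plotHit, $boosts);   // 2*3 + 2*1 = 8
printf("title hit: %.1f, plot hit: %.1f\n", $titleScore, $plotScore);
```

With identical base scores, the title match ends up more than four times ahead of the plot match, which is the effect the weights are meant to produce.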

You can use the weighted ("non-naive") search with this endpoint:

 
curl -X POST {{BASE_URL}}/api/search-text \
-H "Content-Type: application/json" \
-d '{"query":"your search term here"}'

Naive vs Weighted Results

Since the two search methods use different scoring profiles, we should not compare scores between naive and weighted. Instead, the relative scores within each set of results are what's important.

Query: "The Godfather" (title-first use case)

Rank | Naive Search: Title | Score | Weighted Search: Title | Score
1 | C(r)ook | 8.48 | The Godfather (1972) | 76.45
2 | Eadweard | 7.94 | The Godfather: Part III | 61.04
3 | Maqbool | 7.51 | The Godfather: Part II | 57.67
4 | The Godfather: Part III | 7.38 | Godfather | 26.28
5 | The Kennedys | 7.13 | The Kennedys | 17.17

Alternatively, search "The Matrix."

The Naive Search fails due to "Keyword Dilution" and "Length Normalization" biases; BM25 rewards shorter documents like C(r)ook because the term "Godfather" makes up a larger percentage of their metadata compared to the dense, text-heavy records of the actual trilogy. Furthermore, without field weights, a single mention of "Godfather" in an obscure plot summary (such as Maqbool) is treated as equal to a match in the title.
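The length-normalization effect is easy to reproduce with the classic BM25 formula for a single term. The sketch below uses the commonly cited defaults k1 = 1.2 and b = 0.75, and every input number (document lengths, corpus size) is made up; it illustrates the formula and is not a reproduction of the engine's exact scoring.

```php
<?php
declare(strict_types=1);

// Classic BM25 score for a single term in a single document.
// $k1 and $b are commonly cited defaults; all inputs below are made-up numbers.
function bm25(
    int $tf,
    int $docLen,
    float $avgDocLen,
    int $docCount,
    int $docsWithTerm,
    float $k1 = 1.2,
    float $b = 0.75
): float {
    // Rare terms get a higher inverse document frequency.
    $idf = log(1 + ($docCount - $docsWithTerm + 0.5) / ($docsWithTerm + 0.5));
    // Documents longer than average are penalized, shorter ones are rewarded.
    $lengthNorm = 1 - $b + $b * ($docLen / $avgDocLen);
    return $idf * ($tf * ($k1 + 1)) / ($tf + $k1 * $lengthNorm);
}

// Same term frequency (1) and same corpus statistics, different document lengths.
$shortDoc = bm25(tf: 1, docLen: 25, avgDocLen: 100.0, docCount: 1000, docsWithTerm: 10);
$longDoc = bm25(tf: 1, docLen: 400, avgDocLen: 100.0, docCount: 1000, docsWithTerm: 10);

printf("short: %.3f, long: %.3f\n", $shortDoc, $longDoc);
```

A single mention of the term scores noticeably higher in the 25-token document than in the 400-token one, which mirrors how a short record can outrank the dense records of the actual trilogy.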

In contrast, the Weighted Search corrects this by applying a 10x boost to exact title phrase matches, ensuring that the exact sequence "The Godfather" anchors the top result. The strategy groups the entire trilogy at the top and creates a clear "relevance gap": in the table above, the intended movie scores roughly 2.9x higher (76.45 vs. 26.28) than the first result outside the trilogy.

Query: "Tom Hanks"

Rank | Naive Search: Title | Score | Weighted Search: Title | Score
1 | Shooting War | 19.21 | Shooting War | 50.44
2 | Larry Crowne | 14.53 | Larry Crowne | 39.86
3 | Tom and Huck | 10.25 | Nothing in Common | 36.48
4 | Tom and Huck | 10.25 | Tom Sawyer | 36.07
5 | Jerry and Tom | 10.00 | Tom Sawyer | 35.78

Shooting War: Tom Hanks is the narrator and executive producer. Because his name appears in multiple weighted fields (Cast, Director/Producer, and Plot), the matches add up to a "cumulative score" that pushes the movie to the top.

Larry Crowne & Nothing in Common: Tom Hanks is the lead actor. The search successfully surfaced them because the 5x Cast boost prioritized his name in the actor metadata over incidental mentions elsewhere.

Tom and Huck, Tom Sawyer, & Jerry and Tom: These are "false positives" triggered by the 7x Title boost. Since the engine was looking for "Tom" OR "Hanks," it found the name "Tom" in the titles and treated them as highly relevant, even though the "Hanks" part was missing.

While the common first name "Tom" still allows some noise, such as Tom Sawyer, to linger, the 10:7:5:3:2:1 weighting strategy prioritizes structured entity data over unstructured plot descriptions. This shift from purely statistical keyword matching to field-level importance reflects user intent better than unweighted BM25, though there's always room for improvement.

Conclusion: We Just Scratched the Surface

By now, you’ve experienced the "art" of search relevancy and seen how layering weights transforms raw data into an intuitive user experience. Together, we have built a search system that far outpaces standard database read queries by moving beyond simple string and pattern matching and into the realm of intent-driven ranking.

If MongoDB is already your application database, congratulations—you just unlocked enterprise-grade Lucene search with zero infrastructure changes, no ETL pipelines, and a single command (`php artisan fulltext:create-index`).

Even if you're running another database as your primary, MongoDB could serve as a scalable, best-of-breed search extension that handles full-text, vector, and geospatial queries on a single managed platform.

While we’ve made strides, there is always more to learn; every dataset is unique, and the path to a "perfect" search result involves a constant, customizable cycle of testing, tuning, and iteration. My advice is that you come up with an evaluation mechanism, potentially multi-layered, that would indicate if the results are helping your business objectives.
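One simple building block for such an evaluation is the reciprocal rank: 1 divided by the position of the first relevant result. The sketch below applies it to the top-5 titles from the "The Godfather" comparison in this article; treating the trilogy as the relevant set is my assumption, and reciprocalRank is a hypothetical helper, not part of the repo.

```php
<?php
declare(strict_types=1);

// Reciprocal rank: 1 / position of the first relevant result, or 0.0 if none is found.
function reciprocalRank(array $rankedTitles, array $relevantTitles): float
{
    foreach ($rankedTitles as $position => $title) {
        if (in_array($title, $relevantTitles, true)) {
            return 1.0 / ($position + 1);
        }
    }
    return 0.0;
}

// Top-5 titles for "The Godfather", taken from the comparison table in this article.
$naive = ['C(r)ook', 'Eadweard', 'Maqbool', 'The Godfather: Part III', 'The Kennedys'];
$weighted = ['The Godfather (1972)', 'The Godfather: Part III', 'The Godfather: Part II', 'Godfather', 'The Kennedys'];

// Assumption: the trilogy is what the user wanted.
$relevant = ['The Godfather (1972)', 'The Godfather: Part II', 'The Godfather: Part III'];

$naiveRr = reciprocalRank($naive, $relevant);       // 0.25 (first relevant hit at rank 4)
$weightedRr = reciprocalRank($weighted, $relevant); // 1.0 (first relevant hit at rank 1)
printf("naive: %.2f, weighted: %.2f\n", $naiveRr, $weightedRr);
```

Averaging this value over a list of representative queries gives a single number you can track each time you change the weights.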

This article is part of a series, and previously, we showed how to use MongoDB Vector Search with Laravel via Eloquent to perform semantic searches that go well beyond keywords.

Hubert Nguyen

Lead Developer Advocate at MongoDB
