Typesense Search

Typesense is an open-source, typo-tolerant search engine optimized for instant (typically sub-50ms) search-as-you-type experiences and developer productivity.

If you've heard of Elasticsearch or Algolia, a good way to think about Typesense is that it is:

  • An open source alternative to Algolia, with some key quirks solved and
  • An easier-to-use batteries-included alternative to Elasticsearch

If you prefer a video walk-through, here's a step-by-step course from Laracasts and here's a video walk-through from Aaron Francis.

Key Features

  • Typo Tolerance: Handles typographical errors elegantly, out-of-the-box.
  • Simple and Delightful: Simple to set up, integrate with, operate and scale.
  • Blazing Fast: Built in C++. Meticulously architected from the ground up for low-latency (<50ms) instant searches.
  • Tunable Ranking: Easy to tailor your search results to perfection.
  • Sorting: Dynamically sort results based on a particular field at query time (helpful for features like "Sort by Price (asc)").
  • Faceting & Filtering: Drill down and refine results.
  • Grouping & Distinct: Group similar results together to show more variety.
  • Federated Search: Search across multiple collections (indices) in a single HTTP request.
  • Geo Search: Search for and sort results around a latitude/longitude, or within a bounding box.
  • Vector Search: Index embeddings from your machine learning models in Typesense and do a nearest-neighbor search. Can be used to build similarity search, semantic search, visual search, recommendations, etc.
  • Semantic / Hybrid Search: Automatically generate embeddings from within Typesense using built-in models such as S-BERT and E-5, or use external APIs such as OpenAI and PaLM, for both queries and indexed data. This allows you to send JSON data into Typesense and build an out-of-the-box semantic search + keyword search experience.
  • Conversational Search (Built-in RAG): Send questions to Typesense and have the response be a fully-formed sentence, based on the data you've indexed in Typesense. Think ChatGPT, but over your own data.
  • Natural Language Search: LLM-powered intent detection and query understanding that converts free-form natural language queries into structured filters, sorts and queries.
  • Image Search: Search through images using text descriptions of their contents, or perform similarity searches, using the CLIP model.
  • Voice Search: Capture and send queries as voice recordings; Typesense transcribes them (via the Whisper model) and returns search results.
  • Scoped API Keys: Generate API keys that only allow access to certain records, for multi-tenant applications.
  • JOINs: Connect one or more collections via common reference fields and join them during query time. This allows you to model SQL-like relationships elegantly.
  • Synonyms: Define words as equivalents of each other, so searching for a word will also return results for the synonyms defined.
  • Curation & Merchandizing: Boost particular records to a fixed position in the search results, to feature them.
  • Raft-based Clustering: Set up a distributed cluster that is highly available.
  • Seamless Version Upgrades: As new versions of Typesense come out, upgrading is as simple as swapping out the binary and restarting Typesense.
  • No Runtime Dependencies: Typesense is a single binary that you can run locally or in production with a single command.
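
To make a few of the features above concrete, here is a minimal sketch of how a single search request might combine typo-tolerant querying, filtering, sorting and faceting. The `products` collection, its field names, the local host and the `xyz` API key are all assumptions made up for illustration; the snippet only builds the request URL and headers and does not contact a server.

```python
from urllib.parse import urlencode


def build_search_url(host, collection, params):
    """Return the search URL for a collection, with params URL-encoded."""
    return f"{host}/collections/{collection}/documents/search?{urlencode(params)}"


# Hypothetical "products" collection with name, description, brand and price fields.
params = {
    "q": "runing shoes",             # misspelled on purpose; typo tolerance is server-side
    "query_by": "name,description",  # fields to search in
    "filter_by": "price:<100",       # filtering
    "sort_by": "price:asc",          # "Sort by Price (asc)"
    "facet_by": "brand",             # ask for facet counts per brand
    "per_page": 10,
}

url = build_search_url("http://localhost:8108", "products", params)
headers = {"X-TYPESENSE-API-KEY": "xyz"}  # placeholder key, not a real credential

print(url)
```

Sending `url` with `headers` through any HTTP client would run the search. Keeping the parameters in a plain dictionary makes it easy to toggle filters or sort order from user input.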

Typesense Use Cases

Here is an evolving set of use cases for Typesense:

  1. Typo-tolerant fuzzy search-as-you-type to power autocomplete search bars and search results pages
  2. Faceted navigation and browsing experiences, where users don't need to type in a keyword; instead, they directly apply filters across multiple attributes to drill down to the documents they're looking for. (E.g., https://ecommerce-store.typesense.org/)
  3. As a geo-distributed cache that places data close to users. Your primary database is probably hosted in a single geographic region; since you already send Typesense a snapshot of your data, you could instead fetch documents directly from a geo-distributed Typesense cluster, which routes requests to the node closest to the user, reducing latency.
  4. For finding documents that are similar to each other, using vector search. "Similarity" can be defined by any ML model you build: you take the output of the model (vectors), index them in Typesense and then do a nearest-neighbor search. Using this, you can implement features like personalization, recommendations, visual search, semantic search, similarity search, etc.
  5. Multi-tenant search, where certain records / fields can only be accessed by certain sets of users (eg, logged-in users, admins, users on a certain pricing plan, etc)
  6. Federated search, where the search is performed across multiple indices and the results are shown to users side-by-side.
  7. Geo-search, to search / sort records that are in proximity to a given latitude/longitude.
  8. Semantic search, where users can type in a conceptually related keyword and Typesense will return results from your dataset, even if that exact keyword doesn't exist. For example, if a user searches for "ocean" and your dataset only contains the keyword "sea", semantic search can still retrieve relevant results containing the word "sea".
  9. Long-term memory for chat-based LLMs like ChatGPT. Typesense's vector search and hybrid search features can be used to make Chat LLMs respond to users' questions with information in your JSON dataset. Using this you can build conversational chat bots over your dataset.
  10. To power data visualizations like charts and tables, using the aggregated facet metrics returned by Typesense.
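
The nearest-neighbor idea behind use case 4 can be illustrated with a small brute-force sketch. This is not how Typesense implements vector search; it only shows the concept of ranking documents by the similarity of their vectors to a query vector. The document IDs and three-dimensional vectors are invented for the example, whereas real embeddings come from an ML model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, documents, k=2):
    """Return the IDs of the k documents most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in documents.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings, made up for illustration.
documents = {
    "sea-kayak": [0.9, 0.1, 0.0],
    "ocean-cruise": [0.8, 0.2, 0.1],
    "mountain-bike": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

print(nearest_neighbors(query, documents, k=2))  # → ['sea-kayak', 'ocean-cruise']
```

Scanning every vector like this is only practical for tiny datasets; a search engine would typically rely on an index to avoid comparing the query against every document.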