Fulltext Search
This section documents the fulltext search classes and functions that provide advanced text search capabilities.
FulltextFilter
- class matrixone.sqlalchemy_ext.fulltext_search.FulltextFilter(columns: List[str], mode: str = 'boolean mode')[source]
Bases: ColumnElement
Advanced fulltext filter for integrating fulltext search with ORM queries.
This class wraps FulltextQueryBuilder to provide seamless integration with MatrixOne ORM’s filter() method, allowing fulltext search to be combined with other SQL conditions.
- Core Methods (Group-level operators):
must(): Required terms/groups (+ operator)
must_not(): Excluded terms/groups (- operator)
encourage(): Optional terms/groups with normal weight (no prefix)
discourage(): Optional terms/groups with reduced weight (~ operator)
- Parameter Types:
str: Single term (e.g., "python")
FulltextGroup: Group of terms (e.g., group().medium("java", "kotlin"))
Usage with ORM:
    # Basic fulltext filter
    results = client.query(Article).filter(
        boolean_match("title", "content").must("python").encourage("tutorial")
    ).all()

    # Combined with other conditions
    results = client.query(Article).filter(
        boolean_match("title", "content").must("python")
    ).filter(
        Article.category == "Programming"
    ).all()

    # Complex fulltext with groups
    results = client.query(Article).filter(
        boolean_match("title", "content", "tags")
        .must("programming")
        .must(group().medium("python", "java"))
        .discourage(group().medium("legacy", "deprecated"))
    ).all()
Weight Operator Examples
    # Encourage tutorials, discourage legacy content
    # (parentheses let the chained calls span several lines)
    (
        boolean_match("title", "content")
        .must("python")
        .encourage("tutorial")   # Boost documents with 'tutorial'
        .discourage("legacy")    # Lower ranking for 'legacy' documents
    )
- Supported MatrixOne Boolean Mode Operators:
Group-level: +, -, ~, (no prefix) - applied to entire groups/terms
Element-level: >, < - applied within groups using high(), low()
Other: "phrase", term* - exact phrases and prefix matching
Complex: +red -(<blue >is) - nested groups with mixed operators
- Important MatrixOne Requirements:
Column Matching: The columns specified must exactly match the columns defined in the FULLTEXT index. If your index is FULLTEXT(title, content, tags), you must include all three columns.
Limitations:
- Only one MATCH() function per query is supported
- Complex nested groups may have syntax restrictions
- Use fulltext_and/fulltext_or for combining with other conditions
- inherit_cache: bool | None = False
Indicate if this HasCacheKey instance should make use of the cache key generation scheme used by its immediate superclass.
The attribute defaults to None, which indicates that a construct has not yet taken into account whether or not it's appropriate for it to participate in caching; this is functionally equivalent to setting the value to False, except that a warning is also emitted.
This flag can be set to True on a particular class, if the SQL that corresponds to the object does not change based on attributes which are local to this class, and not its superclass.
See also
Enabling Caching Support for Custom Constructs - General guidelines for setting the HasCacheKey.inherit_cache attribute for third-party or user-defined SQL constructs.
- __bool__()[source]
Override bool to prevent SQLAlchemy from treating this as a boolean value. This is important for proper WHERE clause generation.
- columns(*columns: str) FulltextFilter[source]
Set the columns to search in.
- must(*items) FulltextFilter[source]
Add required terms or groups (+ operator at group level).
- must_not(*items) FulltextFilter[source]
Add excluded terms or groups (- operator at group level).
- encourage(*items) FulltextFilter[source]
Add terms or groups that should be encouraged (normal positive weight).
- phrase(*phrases: str) FulltextFilter[source]
Add exact phrases - equivalent to "phrase".
- prefix(*terms: str) FulltextFilter[source]
Add prefix terms - equivalent to term*.
- boost(term: str, weight: float) FulltextFilter[source]
Add a boosted term (term^weight).
- discourage(*items) FulltextFilter[source]
Add terms or groups that should be discouraged (~ operator at group level).
- set_natural_query(query: str) FulltextFilter[source]
Set natural language query string (used for NATURAL_LANGUAGE mode).
- group(*filters: FulltextFilter) FulltextFilter[source]
Add nested query groups (OR semantics).
- natural_language() FulltextFilter[source]
Set to natural language mode.
- boolean_mode() FulltextFilter[source]
Set to boolean mode.
- query_expansion() FulltextFilter[source]
Set to query expansion mode.
- label(name: str)[source]
Create a labeled version for use in SELECT clauses.
This allows using fulltext expressions as selectable columns with aliases:
Args:
name: The alias name for the column
Returns:
A SQLAlchemy labeled expression
Examples:
    # Use as a SELECT column with score
    query(Article, Article.id,
          boolean_match("title", "content").must("python").label("score"))

    # Multiple fulltext scores
    query(Article, Article.id,
          boolean_match("title", "content").must("python").label("relevance"),
          boolean_match("tags").must("programming").label("tag_score"))
Generated SQL:
SELECT articles.id, MATCH(title, content) AGAINST('+python' IN BOOLEAN MODE) AS score FROM articles
FulltextSearchBuilder
- class matrixone.sqlalchemy_ext.fulltext_search.FulltextSearchBuilder(client: Client)[source]
Bases: object
Elasticsearch-like fulltext search builder for MatrixOne.
Provides a chainable interface for building complex fulltext queries with support for various search modes, filters, and sorting.
- Boolean Mode Operators:
+word: Required term (must contain)
-word: Excluded term (must not contain)
~word: Lower weight term (reduces relevance score)
<word: Lower weight term (reduces relevance score)
>word: Higher weight term (increases relevance score)
word: Optional term (may contain)
"phrase": Exact phrase match
word*: Wildcard prefix match
(word1 word2): Grouping (contains any of the words)
Note: MatrixOne supports all boolean mode operators.
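As a rough illustration of how the operators above combine into a single boolean-mode string, the sketch below assembles the tokens by hand. The helper names (required, excluded, phrase, prefix, any_of) are hypothetical and are not part of the matrixone package; they only build strings and never contact a database.

```python
# Hypothetical helpers mirroring the boolean-mode operators listed above.
# They only build strings; nothing here talks to MatrixOne.

def required(word: str) -> str:
    return f"+{word}"  # +word: must contain


def excluded(word: str) -> str:
    return f"-{word}"  # -word: must not contain


def phrase(text: str) -> str:
    return f'"{text}"'  # "phrase": exact phrase match


def prefix(stem: str) -> str:
    return f"{stem}*"  # word*: prefix match


def any_of(*words: str) -> str:
    return "(" + " ".join(words) + ")"  # (word1 word2): any of the words


tokens = [
    required("machine"),
    excluded("java"),
    phrase("deep learning"),
    prefix("neur"),
    any_of("python", "rust"),
]
boolean_query = " ".join(tokens)
print(boolean_query)  # +machine -java "deep learning" neur* (python rust)
```

A string like this could be handed to query() in BOOLEAN mode, while add_term(), add_phrase(), and add_wildcard() are the documented way to build the same kind of query step by step.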
- Search Modes:
NATURAL_LANGUAGE: Automatic stopword removal, stemming, relevance scoring
BOOLEAN: Full control with operators, no automatic processing
QUERY_EXPANSION: Not supported in MatrixOne
Examples:
    # Natural language search
    # (parentheses let the chained calls span several lines)
    results = (
        client.fulltext_search()
        .table("articles")
        .columns(["title", "content"])
        .with_mode(FulltextSearchMode.NATURAL_LANGUAGE)
        .query("machine learning")
        .with_score()
        .limit(10)
        .execute()
    )

    # Boolean search with complex terms
    results = (
        client.fulltext_search()
        .table("articles")
        .columns(["title", "content"])
        .with_mode(FulltextSearchMode.BOOLEAN)
        .add_term("machine", required=True)
        .add_term("learning", required=True)
        .where("category = 'AI'")
        .order_by("score", "DESC")
        .limit(20)
        .execute()
    )
- table(table_name: str) FulltextSearchBuilder[source]
Set the target table for the search.
Args:
table_name: Name of the table to search
Returns:
FulltextSearchBuilder: Self for chaining
- columns(columns: List[str]) FulltextSearchBuilder[source]
Set the columns to search in.
Args:
columns: List of column names to search
Returns:
FulltextSearchBuilder: Self for chaining
- with_mode(mode: str) FulltextSearchBuilder[source]
Set the search mode.
Args:
mode: Search mode
- FulltextSearchMode.NATURAL_LANGUAGE: Automatic processing, user-friendly
- FulltextSearchMode.BOOLEAN: Full control with operators
- FulltextSearchMode.QUERY_EXPANSION: Not supported in MatrixOne
Returns:
FulltextSearchBuilder: Self for chaining
Examples:
    # Natural language mode (default)
    .with_mode(FulltextSearchMode.NATURAL_LANGUAGE)

    # Boolean mode for complex queries
    .with_mode(FulltextSearchMode.BOOLEAN)
- with_algorithm(algorithm: str) FulltextSearchBuilder[source]
Set the search algorithm.
Args:
algorithm: Search algorithm
- FulltextSearchAlgorithm.TF_IDF: Traditional TF-IDF scoring
- FulltextSearchAlgorithm.BM25: Modern BM25 scoring (recommended)
Returns:
FulltextSearchBuilder: Self for chaining
Examples:
    # Use BM25 algorithm (recommended)
    .with_algorithm(FulltextSearchAlgorithm.BM25)

    # Use TF-IDF algorithm
    .with_algorithm(FulltextSearchAlgorithm.TF_IDF)
- query(query_string: str) FulltextSearchBuilder[source]
Set a simple query string (resets previous terms).
Args:
query_string: The search query (natural language or boolean syntax)
Returns:
FulltextSearchBuilder: Self for chaining
Examples:
    # Natural language query
    .query("machine learning algorithms")

    # Boolean query
    .query("+machine +learning -java")
Note: This method resets any previously added terms, phrases, or wildcards.
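To make the reset behavior concrete, here is a toy model of it. MiniQuery and its build() method are hypothetical stand-ins written for this illustration; they are not the real FulltextSearchBuilder, whose internals are not shown in this section.

```python
from typing import List


class MiniQuery:
    """Toy stand-in (not the real builder) showing reset-on-query()."""

    def __init__(self) -> None:
        self.terms: List[str] = []

    def add_term(self, term: str, required: bool = False) -> "MiniQuery":
        # Accumulate tokens, prefixing required terms with '+'.
        self.terms.append(f"+{term}" if required else term)
        return self

    def query(self, query_string: str) -> "MiniQuery":
        # Replace, rather than extend, whatever was accumulated so far.
        self.terms = [query_string]
        return self

    def build(self) -> str:
        return " ".join(self.terms)


text = MiniQuery().add_term("java", required=True).query("machine learning").build()
print(text)  # machine learning  (the earlier +java term is gone)
```

In practice this means query() is best called before any add_term(), add_phrase(), or add_wildcard() calls, or used on its own.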
- add_term(term: str, required: bool = False, excluded: bool = False, proximity: int | None = None) FulltextSearchBuilder[source]
Add a search term to the query.
Args:
term: The search term
required: Whether the term is required (+) - must contain this term
excluded: Whether the term is excluded (-) - must not contain this term
proximity: Proximity modifier for boolean mode (not supported in MatrixOne)
Returns:
FulltextSearchBuilder: Self for chaining
Examples:
    # Required term: +machine
    .add_term("machine", required=True)

    # Excluded term: -java
    .add_term("java", excluded=True)

    # Optional term: learning
    .add_term("learning")

    # Complex query: +machine +learning -java
    .add_term("machine", required=True)
    .add_term("learning", required=True)
    .add_term("java", excluded=True)
- add_phrase(phrase: str) FulltextSearchBuilder[source]
Add an exact phrase to the query.
Args:
phrase: The exact phrase to search for (wrapped in double quotes)
Returns:
FulltextSearchBuilder: Self for chaining
Examples:
    # Exact phrase: "machine learning"
    .add_phrase("machine learning")

    # Multiple phrases
    .add_phrase("deep learning")
    .add_phrase("neural networks")
- add_wildcard(pattern: str) FulltextSearchBuilder[source]
Add a wildcard pattern to the query.
Args:
pattern: Wildcard pattern with * suffix (e.g., "test*", "neural*")
Returns:
FulltextSearchBuilder: Self for chaining
Examples:
    # Prefix match: neural*
    .add_wildcard("neural*")

    # Multiple wildcards
    .add_wildcard("machine*")
    .add_wildcard("learn*")
- with_score(include: bool = True) FulltextSearchBuilder[source]
Include relevance score in results.
Args:
include: Whether to include the score
Returns:
FulltextSearchBuilder: Self for chaining
- select(columns: List[str]) FulltextSearchBuilder[source]
Set the columns to select in the result.
Args:
columns: List of column names to select
Returns:
FulltextSearchBuilder: Self for chaining
- where(condition: str) FulltextSearchBuilder[source]
Add a WHERE condition.
Args:
condition: WHERE condition
Returns:
FulltextSearchBuilder: Self for chaining
- order_by(column: str, direction: str = 'DESC') FulltextSearchBuilder[source]
Set ORDER BY clause.
Args:
column: Column to order by
direction: Order direction (ASC/DESC)
Returns:
FulltextSearchBuilder: Self for chaining
- limit(count: int) FulltextSearchBuilder[source]
Set LIMIT clause.
Args:
count: Number of results to return
Returns:
FulltextSearchBuilder: Self for chaining
- offset(count: int) FulltextSearchBuilder[source]
Set OFFSET clause.
Args:
count: Number of results to skip
Returns:
FulltextSearchBuilder: Self for chaining
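limit() and offset() are typically used together for pagination. The sketch below shows one common way to derive both values from a page number; page_window is a hypothetical helper, and 1-based page numbering is an assumption of this example rather than something the builder prescribes.

```python
from typing import Tuple


def page_window(page: int, page_size: int) -> Tuple[int, int]:
    """Return (limit, offset) for a 1-based page number (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return page_size, (page - 1) * page_size


limit, offset = page_window(page=3, page_size=20)
print(limit, offset)  # 20 40
```

The two values would then be passed along the chain as .limit(limit).offset(offset).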
FulltextQueryBuilder
- class matrixone.sqlalchemy_ext.fulltext_search.FulltextQueryBuilder[source]
Bases: object
Builder for constructing fulltext boolean queries.
This class provides a chainable API for building complex fulltext search queries that are compatible with MatrixOne’s MATCH() AGAINST() syntax.
- Core Methods:
must(): Required terms/groups (+ operator)
must_not(): Excluded terms/groups (- operator)
encourage(): Optional terms/groups with normal weight (no prefix)
discourage(): Optional terms/groups with reduced weight (~ operator)
Examples:
    # Basic usage
    query.must("python")            # +python
    query.encourage("tutorial")     # tutorial
    query.discourage("legacy")      # ~legacy
    query.must_not("deprecated")    # -deprecated

    # Group usage
    query.must(group().medium("java", "kotlin"))            # +(java kotlin)
    query.encourage(group().medium("tutorial", "guide"))    # (tutorial guide)
    query.must_not(group().medium("spam", "junk"))          # -(spam junk)
Note: Group-level operators (+, -, ~) are applied to entire terms or groups. Element-level operators (>, <) are applied within groups using high() and low().
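The mapping from the four core methods to boolean-mode prefixes can be sketched in a few lines. This is a simplified model, not the library's implementation: a plain tuple stands in for group().medium(...), the names PREFIX, render, and render_all are hypothetical, and tokens are emitted in call order, which is an assumption of this sketch.

```python
from typing import List, Tuple, Union

# Hypothetical stand-in: a str models a term, a tuple models group().medium(...).
Item = Union[str, Tuple[str, ...]]

# Group-level prefixes as documented for the four core methods.
PREFIX = {"must": "+", "must_not": "-", "encourage": "", "discourage": "~"}


def render(method: str, item: Item) -> str:
    """Render one term or group with its group-level prefix."""
    body = item if isinstance(item, str) else "(" + " ".join(item) + ")"
    return PREFIX[method] + body


def render_all(calls: List[Tuple[str, Item]]) -> str:
    """Join the rendered items in call order (ordering is an assumption)."""
    return " ".join(render(method, item) for method, item in calls)


result = render_all([
    ("must", "python"),
    ("must", ("java", "kotlin")),
    ("encourage", "tutorial"),
    ("discourage", ("old", "outdated")),
    ("must_not", "deprecated"),
])
print(result)  # +python +(java kotlin) tutorial ~(old outdated) -deprecated
```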
- must(*items) FulltextQueryBuilder[source]
Add required terms or groups (+ operator at group level).
Documents MUST contain these terms/groups to match. This is equivalent to the ‘+’ operator in MatrixOne’s boolean mode syntax.
Args:
*items: Can be strings (terms) or FulltextGroup objects
Examples:
    # Required term - documents must contain 'python'
    query.must("python")
    # Generates: +python

    # Required group - documents must contain either 'java' OR 'kotlin'
    query.must(group().medium("java", "kotlin"))
    # Generates: +(java kotlin)

    # Multiple required terms
    query.must("python", "programming")
    # Generates: +python +programming

    # Unpack list to search multiple terms
    words = ["python", "programming"]
    query.must(*words)  # Correct: unpacks the list
Raises:
TypeError: If a list or tuple is passed directly without unpacking
Returns:
FulltextQueryBuilder: Self for method chaining
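The TypeError described above guards against passing a list where separate arguments are expected. A minimal sketch of that kind of check follows; check_items and its error message are hypothetical, and the library's actual validation code and wording may differ.

```python
def check_items(*items):
    """Reject lists/tuples passed as a single argument (hypothetical check)."""
    for item in items:
        if isinstance(item, (list, tuple)):
            raise TypeError(
                "pass terms as separate arguments, e.g. must(*words), not must(words)"
            )
    return items


words = ["python", "programming"]

ok = check_items(*words)  # unpacked: each word is its own argument

try:
    check_items(words)  # not unpacked: the list arrives as one item
    rejected = False
except TypeError:
    rejected = True

print(ok, rejected)  # ('python', 'programming') True
```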
- must_not(*items) FulltextQueryBuilder[source]
Add excluded terms or groups (- operator at group level).
Documents MUST NOT contain these terms/groups to match. This is equivalent to the ‘-’ operator in MatrixOne’s boolean mode syntax.
Args:
*items: Can be strings (terms) or FulltextGroup objects
Examples:
    # Excluded term - documents must not contain 'deprecated'
    query.must_not("deprecated")
    # Generates: -deprecated

    # Excluded group - documents must not contain 'spam' OR 'junk'
    query.must_not(group().medium("spam", "junk"))
    # Generates: -(spam junk)

    # Multiple excluded terms
    query.must_not("spam", "junk")
    # Generates: -spam -junk

    # Unpack list to exclude multiple terms
    words = ["spam", "junk"]
    query.must_not(*words)  # Correct: unpacks the list
Raises:
TypeError: If a list or tuple is passed directly without unpacking
Returns:
FulltextQueryBuilder: Self for method chaining
- encourage(*items) FulltextQueryBuilder[source]
Add terms or groups that should be encouraged (normal positive weight).
Documents can match without these terms, but containing them will INCREASE the relevance score. This provides normal positive weight boost.
Args:
*items: Can be strings (terms) or FulltextGroup objects
Examples:
    # Encourage documents with 'tutorial'
    query.encourage("tutorial")
    # Generates: tutorial

    # Encourage documents with 'beginner' OR 'intro'
    query.encourage(group().medium("beginner", "intro"))
    # Generates: (beginner intro)

    # Multiple encouraged terms
    query.encourage("tutorial", "guide")
    # Generates: tutorial guide

    # Unpack list to encourage multiple terms
    words = ["tutorial", "guide"]
    query.encourage(*words)  # Correct: unpacks the list
- Weight Comparison:
encourage("term"): Normal positive boost (encourages term)
discourage("term"): Reduced/negative boost (discourages term)
Raises:
TypeError: If a list or tuple is passed directly without unpacking
Returns:
FulltextQueryBuilder: Self for method chaining
- discourage(*items) FulltextQueryBuilder[source]
Add terms or groups that should be discouraged (~ operator at group level).
Documents can match without these terms, but containing them will DECREASE the relevance score. This provides reduced or negative weight boost, effectively discouraging documents that contain these terms.
Args:
*items: Can be strings (terms) or FulltextGroup objects
Examples:
    # Discourage documents with 'legacy'
    query.discourage("legacy")
    # Generates: ~legacy

    # Discourage documents with 'old' OR 'outdated'
    query.discourage(group().medium("old", "outdated"))
    # Generates: ~(old outdated)

    # Multiple discouraged terms
    query.discourage("legacy", "deprecated")
    # Generates: ~legacy ~deprecated

    # Unpack list to discourage multiple terms
    words = ["legacy", "deprecated"]
    query.discourage(*words)  # Correct: unpacks the list
- Weight Comparison:
encourage("term"): Normal positive boost (encourages term)
discourage("term"): Reduced/negative boost (discourages term)
- Use Cases:
    # Search Python content, but discourage legacy versions
    query.must("python").encourage("3.11").discourage("2.7")

    # Find tutorials, but avoid outdated content
    query.must("tutorial").discourage(group().medium("old", "deprecated"))
Raises:
TypeError: If a list or tuple is passed directly without unpacking
Returns:
FulltextQueryBuilder: Self for method chaining
- phrase(phrase: str) FulltextQueryBuilder[source]
Add a phrase search to the main group.
- prefix(prefix: str) FulltextQueryBuilder[source]
Add a prefix search to the main group.
- boost(term: str, weight: float) FulltextQueryBuilder[source]
Add a boosted term to the main group.
- group(*builders: FulltextQueryBuilder) FulltextQueryBuilder[source]
Add nested query builders as groups (OR semantics).
- as_sql(table: str, columns: List[str], mode: str = 'boolean mode', include_score: bool = False, select_columns: List[str] | None = None, where_conditions: List[str] | None = None, order_by: str | None = None, limit: int | None = None, offset: int | None = None) str[source]
Build a complete SQL query with optional AS score support.
This method generates a full SQL query similar to FulltextSearchBuilder but using the query built by FulltextQueryBuilder.
Args:
table: Table name to search in
columns: List of columns to search in (must match FULLTEXT index)
mode: Search mode (BOOLEAN, NATURAL_LANGUAGE, etc.)
include_score: Whether to include relevance score in results
select_columns: Columns to select (default: all columns "*")
where_conditions: Additional WHERE conditions
order_by: ORDER BY clause (e.g., "score DESC")
limit: LIMIT value
offset: OFFSET value
Returns:
str: Complete SQL query
Examples:
    # Basic query with score
    query = FulltextQueryBuilder().must("python").encourage("tutorial")
    sql = query.as_sql("articles", ["title", "content"], include_score=True)
    # SELECT *, MATCH(title, content) AGAINST('+python tutorial' IN boolean mode) AS score
    # FROM articles WHERE MATCH(title, content) AGAINST('+python tutorial' IN boolean mode)

    # Query with custom columns and ORDER BY score
    sql = query.as_sql("articles", ["title", "content"],
                       select_columns=["id", "title"],
                       include_score=True,
                       order_by="score DESC",
                       limit=10)
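To show how the arguments of as_sql() could fit together, the sketch below assembles a comparable statement by hand. build_sql is a hypothetical function written for this illustration, not the library's code; joining extra conditions with AND and the clause order are assumptions, and no escaping is performed, so it should not be fed untrusted input.

```python
from typing import List, Optional


def build_sql(
    table: str,
    columns: List[str],
    against: str,
    mode: str = "boolean mode",
    include_score: bool = False,
    select_columns: Optional[List[str]] = None,
    where_conditions: Optional[List[str]] = None,
    order_by: Optional[str] = None,
    limit: Optional[int] = None,
    offset: Optional[int] = None,
) -> str:
    """Assemble a MATCH ... AGAINST query string (illustrative only)."""
    match = f"MATCH({', '.join(columns)}) AGAINST('{against}' IN {mode})"
    select = ", ".join(select_columns) if select_columns else "*"
    if include_score:
        select += f", {match} AS score"
    # Assumption: extra conditions are combined with AND after the MATCH clause.
    conditions = [match] + list(where_conditions or [])
    sql = f"SELECT {select} FROM {table} WHERE {' AND '.join(conditions)}"
    if order_by:
        sql += f" ORDER BY {order_by}"
    if limit is not None:
        sql += f" LIMIT {limit}"
    if offset is not None:
        sql += f" OFFSET {offset}"
    return sql


basic = build_sql("articles", ["title", "content"], "+python tutorial", include_score=True)
ranked = build_sql(
    "articles", ["title", "content"], "+python tutorial",
    select_columns=["id", "title"], include_score=True,
    order_by="score DESC", limit=10,
)
print(basic)
print(ranked)
```

With include_score=True the MATCH expression appears twice, once in the SELECT list as score and once in the WHERE clause, which mirrors the generated SQL shown in the documented example.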
- as_score_sql(table: str, columns: List[str], mode: str = 'boolean mode') str[source]
Convenient method to generate SQL with score included.
This is equivalent to calling as_sql() with include_score=True.
Args:
table: Table name to search in
columns: List of columns to search in
mode: Search mode
Returns:
str: Complete SQL query with AS score
Example:
    query = FulltextQueryBuilder().must("python").encourage("tutorial")
    sql = query.as_score_sql("articles", ["title", "content"])
    # Generates SQL with AS score automatically included
Fulltext Functions
- matrixone.sqlalchemy_ext.fulltext_search.boolean_match(*columns) FulltextFilter[source]
Create a boolean mode fulltext filter for specified columns.
This is the main entry point for creating fulltext search queries that integrate seamlessly with MatrixOne ORM’s filter() method.
Args:
*columns: Column names or SQLAlchemy Column objects to search against
Returns:
FulltextFilter: A chainable filter object
Examples:
    # Basic search - must contain 'python'
    boolean_match("title", "content").must("python")

    # Multiple conditions (parentheses let the chain span several lines)
    (
        boolean_match("title", "content")
        .must("python")
        .encourage("tutorial")
        .discourage("legacy")
    )

    # Group search - either 'python' or 'java'
    boolean_match("title", "content").must(group().medium("python", "java"))

    # Using SQLAlchemy Column objects
    boolean_match(Article.title, Article.content).must("python")
Note: The columns specified must exactly match the FULLTEXT index columns. For example, if your index is FULLTEXT(title, content, tags), you must use boolean_match("title", "content", "tags").
- matrixone.sqlalchemy_ext.fulltext_search.natural_match(*columns, query: str) FulltextFilter[source]
Create a natural language mode fulltext filter for specified columns.
Natural language mode provides user-friendly search with automatic processing:
- Stopword removal (e.g., 'the', 'a', 'an')
- Stemming and variations
- Relevance scoring based on TF-IDF or BM25 algorithm
- Best for end-user search interfaces
- Parameters:
*columns – Column names or SQLAlchemy Column objects to search against
- Must exactly match the columns in your fulltext index
- Can be strings or Column objects
query – Natural language query string
- User-friendly search terms
- Automatically processed for best results
- Multi-word queries are supported
- Important - Column Matching:
The columns specified in MATCH() must exactly match the columns defined in the FULLTEXT index. Mismatches will cause errors.
- Examples:
If index is: FULLTEXT(title, content)
- ✅ natural_match("title", "content", query="…") - Correct
- ❌ natural_match("title", query="…") - Error (partial)
- ❌ natural_match("content", query="…") - Error (partial)
If index is: FULLTEXT(content)
- ✅ natural_match("content", query="…") - Correct
- ❌ natural_match("title", "content", query="…") - Error (extra column)
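The column-matching rule above can be checked before a query is issued. The sketch below is one possible pre-flight check; matches_index is a hypothetical helper, and it compares the columns as sets because this section does not say whether column order matters.

```python
from typing import Iterable


def matches_index(match_columns: Iterable[str], index_columns: Iterable[str]) -> bool:
    """True when MATCH() columns equal the index columns (order ignored: an assumption)."""
    return set(match_columns) == set(index_columns)


index = ("title", "content")  # models FULLTEXT(title, content)

checks = {
    "exact": matches_index(["title", "content"], index),
    "partial": matches_index(["title"], index),
    "extra": matches_index(["title", "content", "tags"], index),
}
print(checks)  # {'exact': True, 'partial': False, 'extra': False}
```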
- Parser Compatibility:
Works with all parser types:
- Default parser: Standard text tokenization
- JSON parser: Searches JSON values within documents
- NGRAM parser: Chinese and Asian language tokenization
- Returns:
A fulltext filter object for use in queries
- Return type:
FulltextFilter
Examples:
    # Basic natural language search
    result = client.query("articles.id", "articles.title", "articles.content").filter(
        natural_match("title", "content", query="machine learning")
    ).execute()

    # Using with ORM models
    result = client.query(Article).filter(
        natural_match(Article.title, Article.content, query="artificial intelligence")
    ).execute()

    # Single column search
    result = client.query(Article).filter(
        natural_match(Article.content, query="python programming")
    ).execute()

    # With relevance scoring
    result = client.query(
        Article.id,
        Article.title,
        Article.content,
        natural_match(Article.content, query="deep learning").label("score")
    ).execute()

    # JSON parser - searching within JSON documents
    result = client.query(Product).filter(
        natural_match(Product.details, query="Dell laptop")
    ).execute()

    # NGRAM parser - Chinese content search
    result = client.query(ChineseArticle).filter(
        natural_match(ChineseArticle.title, ChineseArticle.body, query="神雕侠侣")
    ).execute()

    # Combined with SQL filters
    result = client.query(Article).filter(
        natural_match(Article.content, query="programming tutorial")
    ).filter(Article.category == "Education").execute()
- matrixone.sqlalchemy_ext.fulltext_search.group() FulltextGroup[source]
Create a new query group builder with OR semantics between elements.
Creates a group whose elements have an OR relationship. The group-level semantics (required, excluded, optional, reduced weight) are determined by how the group is used:
- must(group()) → +(…) - group is required
- must_not(group()) → -(…) - group is excluded
- encourage(group()) → (…) - group is optional with normal weight
- discourage(group()) → ~(…) - group is optional with reduced weight
- Element-level Methods (use inside groups):
medium(): Add terms with medium weight (no operators)
high(): Add terms with high weight (>term)
low(): Add terms with low weight (<term)
phrase(): Add exact phrase matches (“phrase”)
prefix(): Add prefix matches (term*)
IMPORTANT: Inside groups, do NOT use must()/must_not() as they add +/- operators. Use medium() for plain terms or high()/low() for element-level weight control.
- Examples
    # Required group - must contain 'java' OR 'kotlin'
    query.must(group().medium("java", "kotlin"))  # +(java kotlin)

    # Excluded group - must not contain 'spam' OR 'junk'
    query.must_not(group().medium("spam", "junk"))  # -(spam junk)

    # Optional group with normal weight
    query.encourage(group().medium("tutorial", "guide"))  # (tutorial guide)

    # Optional group with reduced weight
    query.discourage(group().medium("old", "outdated"))  # ~(old outdated)

    # Complex MatrixOne style with element-level weights
    query.must("red").must_not(group().low("blue").high("is"))  # Generates: '+red -(<blue >is)'
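The difference between element-level weights (inside the parentheses) and the group-level prefix (in front of them) can be modeled with a small class. MiniGroup and its render() method are hypothetical and exist only for this illustration; the real FulltextGroup is not reproduced here.

```python
from typing import List


class MiniGroup:
    """Toy model of a group (hypothetical; not matrixone's FulltextGroup)."""

    def __init__(self) -> None:
        self.parts: List[str] = []

    def medium(self, *terms: str) -> "MiniGroup":
        self.parts.extend(terms)  # plain terms, no operator
        return self

    def high(self, *terms: str) -> "MiniGroup":
        self.parts.extend(f">{t}" for t in terms)  # element-level: higher weight
        return self

    def low(self, *terms: str) -> "MiniGroup":
        self.parts.extend(f"<{t}" for t in terms)  # element-level: lower weight
        return self

    def render(self, prefix: str = "") -> str:
        # The group-level prefix (+, -, ~ or empty) goes outside the parentheses.
        return f"{prefix}({' '.join(self.parts)})"


inner = MiniGroup().low("blue").high("is")
complex_query = "+red " + inner.render("-")
print(complex_query)  # +red -(<blue >is)
```

Keeping + and - out of the group body, as the IMPORTANT note above advises, is what keeps the two operator levels separate in the rendered string.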