Secondary Index Verification Guide

This guide demonstrates how to verify the consistency of secondary indexes in MatrixOne using the Python SDK.

Overview

MatrixOne uses secondary index tables to speed up queries. The SDK provides utilities to:

  1. Get all secondary index table names for a table

  2. Get a specific index table by its index name

  3. Verify that all index tables have the same row count as the main table

These operations are available in both synchronous (Client) and asynchronous (AsyncClient) clients.

Basic Usage

Getting Secondary Index Tables

from matrixone import Client

client = Client()
client.connect(host='localhost', port=6001, user='root', password='111', database='test')

# Get all secondary index tables
index_tables = client.get_secondary_index_tables('my_table')
print(f"Found {len(index_tables)} secondary indexes:")
for table in index_tables:
    print(f"  - {table}")

Getting Index Table by Name

# Get physical table name for a specific index
physical_table = client.get_secondary_index_table_by_name('my_table', 'idx_name')

if physical_table:
    print(f"Index 'idx_name' maps to: {physical_table}")
else:
    print("Index not found")

Getting Detailed Index Information

For comprehensive index information including IVF, HNSW, Fulltext, and regular indexes:

# Get detailed information about all indexes for a table
indexes = client.get_table_indexes_detail('my_table')

for idx in indexes:
    print(f"Index: {idx['index_name']}")
    print(f"  Algorithm: {idx['algo'] or 'regular'}")
    print(f"  Table Type: {idx['algo_table_type'] or 'index'}")
    print(f"  Physical Table: {idx['physical_table_name']}")
    print(f"  Columns: {', '.join(idx['columns'])}")

# Example output for IVF index:
# Index: idx_embedding_ivf
#   Algorithm: ivfflat
#   Table Type: metadata
#   Physical Table: __mo_index_secondary_018e1234_meta
#   Columns: embedding
# Index: idx_embedding_ivf
#   Algorithm: ivfflat
#   Table Type: centroids
#   Physical Table: __mo_index_secondary_018e5678_centroids
#   Columns: embedding
# Index: idx_embedding_ivf
#   Algorithm: ivfflat
#   Table Type: entries
#   Physical Table: __mo_index_secondary_018e9abc_entries
#   Columns: embedding

Verifying Index Consistency

# Verify all index tables have the same count as main table
try:
    count = client.verify_table_index_counts('my_table')
    print(f"✓ Verification passed! Row count: {count}")
except ValueError as e:
    print(f"✗ Verification failed!")
    print(e)
    # Error message includes details about which indexes are mismatched

Complete Example

from matrixone import Client
from matrixone.orm import declarative_base
from sqlalchemy import Column, String, Integer, Index

# Connect to database
client = Client()
client.connect(host='localhost', port=6001, user='root', password='111', database='test')

# Define model with secondary indexes
Base = declarative_base()

class Product(Base):
    __tablename__ = 'products'

    id = Column(Integer, primary_key=True)
    name = Column(String(100))
    category = Column(String(50))
    price = Column(Integer)

    __table_args__ = (
        Index('idx_name', 'name'),
        Index('idx_category', 'category'),
        Index('idx_price', 'price'),
    )

# Create table with indexes
client.create_table(Product)

# Insert data
products = [
    {'id': i, 'name': f'Product {i}', 'category': f'Cat {i % 5}', 'price': i * 100}
    for i in range(1, 1001)
]
client.batch_insert(Product, products)

# Get all secondary indexes
print("Secondary indexes:")
index_tables = client.get_secondary_index_tables('products')
for idx_table in index_tables:
    print(f"  {idx_table}")

# Get specific index by name
name_index = client.get_secondary_index_table_by_name('products', 'idx_name')
print(f"\nName index table: {name_index}")

# Get detailed index information
print("\nDetailed index information:")
indexes = client.get_table_indexes_detail('products')
for idx in indexes:
    print(f"  {idx['index_name']} ({idx['algo'] or 'regular'}) - {idx['physical_table_name']}")

# Verify consistency
try:
    count = client.verify_table_index_counts('products')
    print(f"\n✓ All indexes verified! Row count: {count}")
except ValueError as e:
    print(f"\n✗ Verification failed: {e}")

client.disconnect()

Async Usage

The same functionality is available in async mode:

import asyncio
from matrixone import AsyncClient

async def verify_indexes():
    client = AsyncClient()
    await client.connect(
        host='localhost',
        port=6001,
        user='root',
        password='111',
        database='test'
    )

    # Get secondary index tables (async)
    index_tables = await client.get_secondary_index_tables('my_table')

    # Get specific index by name (async)
    idx_table = await client.get_secondary_index_table_by_name('my_table', 'idx_name')

    # Verify consistency (async)
    try:
        count = await client.verify_table_index_counts('my_table')
        print(f"✓ Verified! Count: {count}")
    except ValueError as e:
        print(f"✗ Verification failed: {e}")

    await client.disconnect()

asyncio.run(verify_indexes())

Use Cases

Data Integrity Checks

Use these methods to verify data integrity after:

  • Bulk data operations

  • Data migration

  • Index rebuilds

  • Database recovery

# After bulk insert
client.batch_insert(MyTable, large_dataset)

# Verify indexes are consistent
count = client.verify_table_index_counts('my_table')
print(f"Verified {count} rows across all indexes")

Monitoring and Diagnostics

Monitor index health in production:

import time

while True:
    try:
        count = client.verify_table_index_counts('critical_table')
        print(f"{time.ctime()}: ✓ Indexes OK ({count} rows)")
    except ValueError as e:
        print(f"{time.ctime()}: ✗ INDEX MISMATCH DETECTED!")
        print(e)
        # Alert monitoring system

    time.sleep(60)  # Check every minute

API Reference

Client.get_secondary_index_tables

Client.get_secondary_index_tables(table_name: str, database_name: str | None = None) List[str][source]

Get all secondary index table names for a given table.

This includes both regular secondary indexes (MULTIPLE type) and UNIQUE indexes.

Parameters:
  • table_name – Name of the table to get secondary indexes for

  • database_name – Name of the database (optional). If None, uses the current database.

Returns:

List of secondary index table names (includes both __mo_index_secondary_… and __mo_index_unique_… tables)

Examples:

>>> client = Client()
>>> client.connect(host='localhost', port=6001, user='root', password='111', database='test')
>>> # Use current database
>>> index_tables = client.get_secondary_index_tables('cms_all_content_chunk_info')
>>> # Or specify database explicitly
>>> index_tables = client.get_secondary_index_tables('cms_all_content_chunk_info', 'test')
>>> print(index_tables)
['__mo_index_secondary_..._cms_id', '__mo_index_unique_..._email']

Client.get_secondary_index_table_by_name

Client.get_secondary_index_table_by_name(table_name: str, index_name: str, database_name: str | None = None) str | None[source]

Get the physical table name of a secondary index by its index name.

Parameters:
  • table_name – Name of the table

  • index_name – Name of the secondary index

  • database_name – Name of the database (optional). If None, uses the current database.

Returns:

Physical table name of the secondary index, or None if not found

Examples:

>>> client = Client()
>>> client.connect(host='localhost', port=6001, user='root', password='111', database='test')
>>> # Use current database
>>> index_table = client.get_secondary_index_table_by_name('cms_all_content_chunk_info', 'cms_id')
>>> # Or specify database explicitly
>>> index_table = client.get_secondary_index_table_by_name('cms_all_content_chunk_info', 'cms_id', 'test')
>>> print(index_table)
'__mo_index_secondary_018cfbda-bde1-7c3e-805c-3f8e71769f75_cms_id'

Client.get_table_indexes_detail

Client.get_table_indexes_detail(table_name: str, database_name: str | None = None) List[dict][source]

Get detailed information about all indexes for a table, including IVF, HNSW, Fulltext, and regular indexes.

This method returns comprehensive information about each index physical table, including: - Index name - Index type (MULTIPLE, PRIMARY, UNIQUE, etc.) - Algorithm type (ivfflat, hnsw, fulltext, etc.) - Algorithm table type (metadata, centroids, entries, etc.) - Physical table name - Column names - Algorithm parameters

Parameters:
  • table_name – Name of the table to get indexes for

  • database_name – Name of the database (optional). If None, uses the current database.

Returns:

  • index_name: Name of the index

  • index_type: Type of index (MULTIPLE, PRIMARY, UNIQUE, etc.)

  • algo: Algorithm type (ivfflat, hnsw, fulltext, or None for regular indexes)

  • algo_table_type: Algorithm table type (metadata, centroids, entries, etc., or None)

  • physical_table_name: Physical table name

  • columns: List of column names

  • algo_params: Algorithm parameters (or None)

Return type:

List of dictionaries, each containing

Examples:

>>> client = Client()
>>> client.connect(host='localhost', port=6001, user='root', password='111', database='test')
>>> # Get all index details for a table
>>> indexes = client.get_table_indexes_detail('ivf_health_demo_docs')
>>> for idx in indexes:
...     print(f"{idx['index_name']} ({idx['algo']}) - {idx['algo_table_type']}: {idx['physical_table_name']}")
idx_embedding_ivf (ivfflat) - metadata: __mo_index_secondary_...
idx_embedding_ivf (ivfflat) - centroids: __mo_index_secondary_...
idx_embedding_ivf (ivfflat) - entries: __mo_index_secondary_...

This method provides comprehensive information about all indexes including:

  • Regular secondary indexes (MULTIPLE, UNIQUE)

  • IVF vector indexes (with metadata, centroids, and entries tables)

  • HNSW vector indexes

  • Fulltext indexes

Each index may have multiple physical tables (especially for IVF indexes which have metadata, centroids, and entries tables).

Client.verify_table_index_counts

Client.verify_table_index_counts(table_name: str) int[source]

Verify that the main table and all its secondary index tables have the same row count.

This method compares the COUNT(*) of the main table with all its secondary index tables in a single SQL query for consistency. If counts don’t match, raises an exception.

Parameters:

table_name – Name of the table to verify

Returns:

Row count (int) if verification succeeds

Raises:

ValueError – If any secondary index table has a different count than the main table, with details about all counts in the error message

Examples:

>>> client = Client()
>>> client.connect(host='localhost', port=6001, user='root', password='111', database='test')
>>> count = client.verify_table_index_counts('cms_all_content_chunk_info')
>>> print(f"✓ Verification passed, row count: {count}")

>>> # If verification fails:
>>> try:
...     count = client.verify_table_index_counts('some_table')
... except ValueError as e:
...     print(f"Verification failed: {e}")

AsyncClient Methods

All methods are also available in async form:

  • get_secondary_index_tables()

  • get_secondary_index_table_by_name()

  • get_table_indexes_detail()

  • verify_table_index_counts()

Error Handling

The verify_table_index_counts() method raises a ValueError with detailed information when verification fails:

try:
    count = client.verify_table_index_counts('my_table')
except ValueError as e:
    # Example error message:
    # Index count verification failed!
    # Main table 'my_table': 20000 rows
    # ✗ MISMATCH Index '__mo_index_secondary_..._idx1': 19900 rows
    # ✓ Index '__mo_index_secondary_..._idx2': 20000 rows
    # ✓ Index '__mo_index_secondary_..._idx3': 20000 rows
    print(str(e))

Performance Notes

  • All count comparisons are done in a single SQL query for consistency

  • The verification is atomic - all counts are from the same transaction

  • Efficient for tables with multiple indexes (no N+1 query problem)

Example SQL generated:

SELECT
    (SELECT COUNT(*) FROM `main_table`) as main_count,
    (SELECT COUNT(*) FROM `__mo_index_secondary_..._idx1`) as idx1_count,
    (SELECT COUNT(*) FROM `__mo_index_secondary_..._idx2`) as idx2_count,
    (SELECT COUNT(*) FROM `__mo_index_secondary_..._idx3`) as idx3_count

See Also