Secondary Index Verification Guide
This guide demonstrates how to verify the consistency of secondary indexes in MatrixOne using the Python SDK.
Overview
MatrixOne uses secondary index tables to speed up queries. The SDK provides utilities to:
Get all secondary index table names for a table
Get a specific index table by its index name
Verify that all index tables have the same row count as the main table
These operations are available in both synchronous (Client) and asynchronous (AsyncClient) clients.
Basic Usage
Getting Secondary Index Tables
from matrixone import Client
client = Client()
client.connect(host='localhost', port=6001, user='root', password='111', database='test')
# Get all secondary index tables
index_tables = client.get_secondary_index_tables('my_table')
print(f"Found {len(index_tables)} secondary indexes:")
for table in index_tables:
print(f" - {table}")
Getting Index Table by Name
# Get physical table name for a specific index
physical_table = client.get_secondary_index_table_by_name('my_table', 'idx_name')
if physical_table:
print(f"Index 'idx_name' maps to: {physical_table}")
else:
print("Index not found")
Getting Detailed Index Information
For comprehensive index information including IVF, HNSW, Fulltext, and regular indexes:
# Get detailed information about all indexes for a table
indexes = client.get_table_indexes_detail('my_table')
for idx in indexes:
print(f"Index: {idx['index_name']}")
print(f" Algorithm: {idx['algo'] or 'regular'}")
print(f" Table Type: {idx['algo_table_type'] or 'index'}")
print(f" Physical Table: {idx['physical_table_name']}")
print(f" Columns: {', '.join(idx['columns'])}")
# Example output for IVF index:
# Index: idx_embedding_ivf
# Algorithm: ivfflat
# Table Type: metadata
# Physical Table: __mo_index_secondary_018e1234_meta
# Columns: embedding
# Index: idx_embedding_ivf
# Algorithm: ivfflat
# Table Type: centroids
# Physical Table: __mo_index_secondary_018e5678_centroids
# Columns: embedding
# Index: idx_embedding_ivf
# Algorithm: ivfflat
# Table Type: entries
# Physical Table: __mo_index_secondary_018e9abc_entries
# Columns: embedding
Verifying Index Consistency
# Verify all index tables have the same count as main table
try:
count = client.verify_table_index_counts('my_table')
print(f"✓ Verification passed! Row count: {count}")
except ValueError as e:
print(f"✗ Verification failed!")
print(e)
# Error message includes details about which indexes are mismatched
Complete Example
from matrixone import Client
from matrixone.orm import declarative_base
from sqlalchemy import Column, String, Integer, Index
# Connect to database
client = Client()
client.connect(host='localhost', port=6001, user='root', password='111', database='test')
# Define model with secondary indexes
Base = declarative_base()
class Product(Base):
__tablename__ = 'products'
id = Column(Integer, primary_key=True)
name = Column(String(100))
category = Column(String(50))
price = Column(Integer)
__table_args__ = (
Index('idx_name', 'name'),
Index('idx_category', 'category'),
Index('idx_price', 'price'),
)
# Create table with indexes
client.create_table(Product)
# Insert data
products = [
{'id': i, 'name': f'Product {i}', 'category': f'Cat {i % 5}', 'price': i * 100}
for i in range(1, 1001)
]
client.batch_insert(Product, products)
# Get all secondary indexes
print("Secondary indexes:")
index_tables = client.get_secondary_index_tables('products')
for idx_table in index_tables:
print(f" {idx_table}")
# Get specific index by name
name_index = client.get_secondary_index_table_by_name('products', 'idx_name')
print(f"\nName index table: {name_index}")
# Get detailed index information
print("\nDetailed index information:")
indexes = client.get_table_indexes_detail('products')
for idx in indexes:
print(f" {idx['index_name']} ({idx['algo'] or 'regular'}) - {idx['physical_table_name']}")
# Verify consistency
try:
count = client.verify_table_index_counts('products')
print(f"\n✓ All indexes verified! Row count: {count}")
except ValueError as e:
print(f"\n✗ Verification failed: {e}")
client.disconnect()
Async Usage
The same functionality is available in async mode:
import asyncio
from matrixone import AsyncClient
async def verify_indexes():
client = AsyncClient()
await client.connect(
host='localhost',
port=6001,
user='root',
password='111',
database='test'
)
# Get secondary index tables (async)
index_tables = await client.get_secondary_index_tables('my_table')
# Get specific index by name (async)
idx_table = await client.get_secondary_index_table_by_name('my_table', 'idx_name')
# Verify consistency (async)
try:
count = await client.verify_table_index_counts('my_table')
print(f"✓ Verified! Count: {count}")
except ValueError as e:
print(f"✗ Verification failed: {e}")
await client.disconnect()
asyncio.run(verify_indexes())
Use Cases
Data Integrity Checks
Use these methods to verify data integrity after:
Bulk data operations
Data migration
Index rebuilds
Database recovery
# After bulk insert
client.batch_insert(MyTable, large_dataset)
# Verify indexes are consistent
count = client.verify_table_index_counts('my_table')
print(f"Verified {count} rows across all indexes")
Monitoring and Diagnostics
Monitor index health in production:
import time
while True:
try:
count = client.verify_table_index_counts('critical_table')
print(f"{time.ctime()}: ✓ Indexes OK ({count} rows)")
except ValueError as e:
print(f"{time.ctime()}: ✗ INDEX MISMATCH DETECTED!")
print(e)
# Alert monitoring system
time.sleep(60) # Check every minute
API Reference
Client.get_secondary_index_tables
- Client.get_secondary_index_tables(table_name: str, database_name: str | None = None) List[str][source]
Get all secondary index table names for a given table.
This includes both regular secondary indexes (MULTIPLE type) and UNIQUE indexes.
- Parameters:
table_name – Name of the table to get secondary indexes for
database_name – Name of the database (optional). If None, uses the current database.
- Returns:
List of secondary index table names (includes both __mo_index_secondary_… and __mo_index_unique_… tables)
Examples:
>>> client = Client() >>> client.connect(host='localhost', port=6001, user='root', password='111', database='test') >>> # Use current database >>> index_tables = client.get_secondary_index_tables('cms_all_content_chunk_info') >>> # Or specify database explicitly >>> index_tables = client.get_secondary_index_tables('cms_all_content_chunk_info', 'test') >>> print(index_tables) ['__mo_index_secondary_..._cms_id', '__mo_index_unique_..._email']
Client.get_secondary_index_table_by_name
- Client.get_secondary_index_table_by_name(table_name: str, index_name: str, database_name: str | None = None) str | None[source]
Get the physical table name of a secondary index by its index name.
- Parameters:
table_name – Name of the table
index_name – Name of the secondary index
database_name – Name of the database (optional). If None, uses the current database.
- Returns:
Physical table name of the secondary index, or None if not found
Examples:
>>> client = Client() >>> client.connect(host='localhost', port=6001, user='root', password='111', database='test') >>> # Use current database >>> index_table = client.get_secondary_index_table_by_name('cms_all_content_chunk_info', 'cms_id') >>> # Or specify database explicitly >>> index_table = client.get_secondary_index_table_by_name('cms_all_content_chunk_info', 'cms_id', 'test') >>> print(index_table) '__mo_index_secondary_018cfbda-bde1-7c3e-805c-3f8e71769f75_cms_id'
Client.get_table_indexes_detail
- Client.get_table_indexes_detail(table_name: str, database_name: str | None = None) List[dict][source]
Get detailed information about all indexes for a table, including IVF, HNSW, Fulltext, and regular indexes.
This method returns comprehensive information about each index physical table, including: - Index name - Index type (MULTIPLE, PRIMARY, UNIQUE, etc.) - Algorithm type (ivfflat, hnsw, fulltext, etc.) - Algorithm table type (metadata, centroids, entries, etc.) - Physical table name - Column names - Algorithm parameters
- Parameters:
table_name – Name of the table to get indexes for
database_name – Name of the database (optional). If None, uses the current database.
- Returns:
index_name: Name of the index
index_type: Type of index (MULTIPLE, PRIMARY, UNIQUE, etc.)
algo: Algorithm type (ivfflat, hnsw, fulltext, or None for regular indexes)
algo_table_type: Algorithm table type (metadata, centroids, entries, etc., or None)
physical_table_name: Physical table name
columns: List of column names
algo_params: Algorithm parameters (or None)
- Return type:
List of dictionaries, each containing
Examples:
>>> client = Client() >>> client.connect(host='localhost', port=6001, user='root', password='111', database='test') >>> # Get all index details for a table >>> indexes = client.get_table_indexes_detail('ivf_health_demo_docs') >>> for idx in indexes: ... print(f"{idx['index_name']} ({idx['algo']}) - {idx['algo_table_type']}: {idx['physical_table_name']}") idx_embedding_ivf (ivfflat) - metadata: __mo_index_secondary_... idx_embedding_ivf (ivfflat) - centroids: __mo_index_secondary_... idx_embedding_ivf (ivfflat) - entries: __mo_index_secondary_...
This method provides comprehensive information about all indexes including:
Regular secondary indexes (MULTIPLE, UNIQUE)
IVF vector indexes (with metadata, centroids, and entries tables)
HNSW vector indexes
Fulltext indexes
Each index may have multiple physical tables (especially for IVF indexes which have metadata, centroids, and entries tables).
Client.verify_table_index_counts
- Client.verify_table_index_counts(table_name: str) int[source]
Verify that the main table and all its secondary index tables have the same row count.
This method compares the COUNT(*) of the main table with all its secondary index tables in a single SQL query for consistency. If counts don’t match, raises an exception.
- Parameters:
table_name – Name of the table to verify
- Returns:
Row count (int) if verification succeeds
- Raises:
ValueError – If any secondary index table has a different count than the main table, with details about all counts in the error message
Examples:
>>> client = Client() >>> client.connect(host='localhost', port=6001, user='root', password='111', database='test') >>> count = client.verify_table_index_counts('cms_all_content_chunk_info') >>> print(f"✓ Verification passed, row count: {count}") >>> # If verification fails: >>> try: ... count = client.verify_table_index_counts('some_table') ... except ValueError as e: ... print(f"Verification failed: {e}")
AsyncClient Methods
All methods are also available in async form:
get_secondary_index_tables()get_secondary_index_table_by_name()get_table_indexes_detail()verify_table_index_counts()
Error Handling
The verify_table_index_counts() method raises a ValueError with detailed information when verification fails:
try:
count = client.verify_table_index_counts('my_table')
except ValueError as e:
# Example error message:
# Index count verification failed!
# Main table 'my_table': 20000 rows
# ✗ MISMATCH Index '__mo_index_secondary_..._idx1': 19900 rows
# ✓ Index '__mo_index_secondary_..._idx2': 20000 rows
# ✓ Index '__mo_index_secondary_..._idx3': 20000 rows
print(str(e))
Performance Notes
All count comparisons are done in a single SQL query for consistency
The verification is atomic - all counts are from the same transaction
Efficient for tables with multiple indexes (no N+1 query problem)
Example SQL generated:
SELECT
(SELECT COUNT(*) FROM `main_table`) as main_count,
(SELECT COUNT(*) FROM `__mo_index_secondary_..._idx1`) as idx1_count,
(SELECT COUNT(*) FROM `__mo_index_secondary_..._idx2`) as idx2_count,
(SELECT COUNT(*) FROM `__mo_index_secondary_..._idx3`) as idx3_count
See Also
ORM Usage Guide - ORM model definition with indexes
Client - Complete Client API reference
AsyncClient - Complete AsyncClient API reference