1. Add explicit reindex/backfill tooling.
   Right now, only future PostTableData / PutTableData calls index rows, so rows that already exist never reach the index. There should be an admin/dev command like:

       ReindexProfile(profile_name)
       ReindexTable(profile_name, table_name)
       ReindexRow(profile_name, table_name, id)

   This is the biggest missing piece.

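
One way to shape that command is a single entry point whose scope depends on how many arguments it gets. This is only a sketch: `ReindexScope` and `parse_reindex_args` are hypothetical names, and the row id is kept as a string because the id type is not stated here.

```rust
/// Scope of a reindex request, mirroring the three commands named above.
/// All names in this block are hypothetical.
#[derive(Debug, PartialEq)]
pub enum ReindexScope {
    Profile { profile: String },
    Table { profile: String, table: String },
    Row { profile: String, table: String, id: String },
}

/// Parse the positional arguments of a hypothetical `reindex` admin command:
/// `<profile>`, `<profile> <table>`, or `<profile> <table> <id>`.
pub fn parse_reindex_args(args: &[&str]) -> Result<ReindexScope, String> {
    match args {
        [profile] => Ok(ReindexScope::Profile {
            profile: profile.to_string(),
        }),
        [profile, table] => Ok(ReindexScope::Table {
            profile: profile.to_string(),
            table: table.to_string(),
        }),
        [profile, table, id] => Ok(ReindexScope::Row {
            profile: profile.to_string(),
            table: table.to_string(),
            id: id.to_string(),
        }),
        _ => Err(format!("expected 1 to 3 arguments, got {}", args.len())),
    }
}
```

The actual reindex work (reading rows from the DB and feeding them to the indexer) would hang off a `match` on the returned scope.
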
2. Stop using relative ./tantivy_indexes.
   Both the writer and the reader resolve this path against the process working directory, so the index location depends on where the server is started. Make it config/env-driven, e.g. TANTIVY_INDEX_DIR.
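
A minimal sketch of resolving the directory from the environment, assuming the variable is named TANTIVY_INDEX_DIR as suggested above. The function names are hypothetical, and rejecting relative paths outright is one possible policy, not the only one.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Validate a configured index directory. Relative paths are rejected so the
/// result never depends on the process working directory.
pub fn resolve_index_dir(configured: Option<&str>) -> Result<PathBuf, String> {
    let raw = configured.ok_or_else(|| "TANTIVY_INDEX_DIR is not set".to_string())?;
    let path = Path::new(raw);
    if !path.is_absolute() {
        return Err(format!("TANTIVY_INDEX_DIR must be absolute, got {raw:?}"));
    }
    Ok(path.to_path_buf())
}

/// Read the variable once at startup; writer and reader should share the result.
pub fn index_dir_from_env() -> Result<PathBuf, String> {
    resolve_index_dir(env::var("TANTIVY_INDEX_DIR").ok().as_deref())
}
```

Failing at startup when the variable is missing is deliberate here: a silent fallback to a relative default would reintroduce the original problem.
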
3. Add index schema/version metadata.
   If you change tokenizers/schema later, old indexes should fail with a clear “index version mismatch, reindex required” error instead of behaving strangely.
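
The version check could be as small as a marker file next to the index. This sketch assumes a file named `index_version` holding a single integer; the file name, the constant, and the function names are all hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Bump this whenever tokenizers or the schema change (hypothetical constant).
pub const INDEX_FORMAT_VERSION: u32 = 1;
/// Hypothetical marker file stored inside the index directory.
pub const VERSION_FILE: &str = "index_version";

/// Read the marker file, treating a missing file as "no version recorded".
pub fn read_stored_version(index_dir: &Path) -> io::Result<Option<String>> {
    match fs::read_to_string(index_dir.join(VERSION_FILE)) {
        Ok(contents) => Ok(Some(contents)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

/// Write the marker after a successful (re)index.
pub fn write_version(index_dir: &Path, version: u32) -> io::Result<()> {
    fs::write(index_dir.join(VERSION_FILE), version.to_string())
}

/// Refuse to open an index whose recorded version differs from `expected`.
pub fn check_index_version(stored: Option<&str>, expected: u32) -> Result<(), String> {
    match stored.map(|s| s.trim().parse::<u32>()) {
        Some(Ok(found)) if found == expected => Ok(()),
        Some(Ok(found)) => Err(format!(
            "index version mismatch (found {found}, expected {expected}), reindex required"
        )),
        Some(Err(_)) => Err("index version marker is unreadable, reindex required".to_string()),
        None => Err("index has no version marker, reindex required".to_string()),
    }
}
```

Both the writer and the reader would call `check_index_version` before opening the index, so a stale index is reported once at startup rather than through odd search results.
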
4. Batch index commits.
   Current code opens a writer and commits per row. Fine for dev, not great for many inserts. A long-lived writer task that batches commits, every N docs or at a short interval, would be more reliable and faster.
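
The batching loop can be sketched with only the standard library. `IndexSink` is a hypothetical stand-in for the real index writer (anything that can add a document and commit), and the sketch uses `std::sync::mpsc`; an async channel would need the equivalent timeout logic.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the real index writer.
pub trait IndexSink {
    fn add(&mut self, doc: String);
    fn commit(&mut self);
}

/// In-memory sink that records what it was asked to do.
#[derive(Default)]
pub struct RecordingSink {
    pub docs: Vec<String>,
    pub commits: usize,
}

impl IndexSink for RecordingSink {
    fn add(&mut self, doc: String) {
        self.docs.push(doc);
    }
    fn commit(&mut self) {
        self.commits += 1;
    }
}

/// Long-lived loop: commit after `max_docs` pending docs, or once `max_wait`
/// has passed since the first uncommitted doc, whichever comes first.
pub fn run_batching_writer<S: IndexSink>(
    rx: Receiver<String>,
    sink: &mut S,
    max_docs: usize,
    max_wait: Duration,
) {
    let mut pending = 0usize;
    let mut deadline: Option<Instant> = None; // set while uncommitted docs exist
    loop {
        let msg = match deadline {
            // Nothing pending: block until a doc arrives or all senders are gone.
            None => rx.recv().map_err(|_| RecvTimeoutError::Disconnected),
            // Docs pending: wait only until the commit deadline.
            Some(d) => rx.recv_timeout(d.saturating_duration_since(Instant::now())),
        };
        match msg {
            Ok(doc) => {
                sink.add(doc);
                pending += 1;
                if deadline.is_none() {
                    deadline = Some(Instant::now() + max_wait);
                }
                if pending >= max_docs {
                    sink.commit();
                    pending = 0;
                    deadline = None;
                }
            }
            Err(RecvTimeoutError::Timeout) => {
                sink.commit();
                pending = 0;
                deadline = None;
            }
            Err(RecvTimeoutError::Disconnected) => {
                if pending > 0 {
                    sink.commit(); // flush what is left before shutting down
                }
                return;
            }
        }
    }
}
```

With `max_docs = 2`, five queued documents followed by shutdown produce three commits: two full batches and one final flush.
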
5. Make the indexing queue durable.
   The current mpsc queue is in-memory. If the server crashes after the DB insert but before indexing, search goes stale: the row exists in the DB but never reaches the index. For serious use, store pending index jobs in Postgres, process them, and mark them done.
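
The store/process/mark-done cycle can be sketched against a small trait. Everything here is hypothetical: `JobStore` stands in for a Postgres-backed jobs table, `MemoryJobStore` exists only to make the sketch runnable, and `row_id` is assumed to be an integer.

```rust
use std::collections::BTreeMap;

/// One pending indexing job (field types are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct IndexJob {
    pub profile: String,
    pub table: String,
    pub row_id: i64,
}

/// Hypothetical interface over a durable jobs table.
pub trait JobStore {
    /// Oldest pending jobs first, at most `limit`.
    fn pending(&self, limit: usize) -> Vec<(u64, IndexJob)>;
    fn mark_done(&mut self, job_id: u64);
}

/// In-memory stand-in so the sketch can run without a database.
#[derive(Default)]
pub struct MemoryJobStore {
    next_id: u64,
    jobs: BTreeMap<u64, IndexJob>,
}

impl MemoryJobStore {
    pub fn enqueue(&mut self, job: IndexJob) -> u64 {
        self.next_id += 1;
        self.jobs.insert(self.next_id, job);
        self.next_id
    }
}

impl JobStore for MemoryJobStore {
    fn pending(&self, limit: usize) -> Vec<(u64, IndexJob)> {
        self.jobs.iter().take(limit).map(|(id, job)| (*id, job.clone())).collect()
    }
    fn mark_done(&mut self, job_id: u64) {
        self.jobs.remove(&job_id);
    }
}

/// Process one batch. A job is marked done only after indexing succeeded, so a
/// crash or error in between leaves it pending and it is retried later.
pub fn drain_once<S: JobStore>(
    store: &mut S,
    limit: usize,
    mut index: impl FnMut(&IndexJob) -> Result<(), String>,
) -> usize {
    let mut done = 0;
    for (id, job) in store.pending(limit) {
        if index(&job).is_ok() {
            store.mark_done(id);
            done += 1;
        }
    }
    done
}
```

This gives at-least-once indexing, so the indexing step has to be idempotent (for example, delete-then-add by row id). In a real setup the job would be enqueued in the same transaction as the row insert.
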
6. Index only live rows intentionally.
   handle_add_or_update currently fetches the row by id without checking deleted = false, and search then filters deleted rows later. I’d either skip indexing deleted rows or make delete/update semantics explicit.
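
Making the semantics explicit could mean deciding the index action from the fetched row in one place. A sketch with hypothetical names (`RowState`, `IndexAction`, `index_action`), assuming a boolean soft-delete flag:

```rust
/// Minimal view of a fetched row; `deleted` mirrors the soft-delete flag.
pub struct RowState {
    pub deleted: bool,
}

/// What the indexer should do for a given row id.
#[derive(Debug, PartialEq)]
pub enum IndexAction {
    /// Row is live: replace its document in the index.
    Upsert,
    /// Row is missing or soft-deleted: make sure no document remains.
    Remove,
}

/// Decide the action from the DB lookup result (`None` means row not found).
pub fn index_action(row: Option<&RowState>) -> IndexAction {
    match row {
        Some(r) if !r.deleted => IndexAction::Upsert,
        _ => IndexAction::Remove,
    }
}
```

With this in place, the search-time filter on deleted rows becomes a safety net rather than the only line of defense.
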
7. Add typed fields for numbers/dates if you need range queries.
   Right now numbers are converted to strings. Good for text search, bad for real numeric filtering/sorting. Tantivy can do numeric/date fields, but JSON text fields are not enough for robust range search.
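
The underlying problem is easy to show without any search engine: strings compare lexicographically, so a range over stringified numbers gives wrong answers. This is a plain illustration of that ordering issue, not a description of how Tantivy evaluates queries, and the function names are hypothetical.

```rust
/// Range check on stringified values: compares character by character.
pub fn in_range_as_text(value: &str, low: &str, high: &str) -> bool {
    low <= value && value <= high
}

/// Range check on typed values: compares numerically.
pub fn in_range_as_i64(value: i64, low: i64, high: i64) -> bool {
    low <= value && value <= high
}
```

For example, 9 lies between 5 and 10 numerically, but as text "9" sorts after "10" because '9' is greater than '1', so `in_range_as_text("9", "5", "10")` is false while `in_range_as_i64(9, 5, 10)` is true.
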
8. Decide column-name strategy.
   Indexing lowercases raw DB JSON keys. If the UI uses display names/aliases, column constraints can fail to match unless the frontend sends exactly what the index expects. I’d centralize the display-name to physical-name mapping before search.
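
A centralized mapping might look like the following sketch. `ColumnNames` and its methods are hypothetical, and it assumes the index field name is the lowercased physical column name, as described above.

```rust
use std::collections::HashMap;

/// Normalize any incoming column name the same way before lookup.
fn normalize(name: &str) -> String {
    name.trim().to_lowercase()
}

/// Maps display names and physical names to the field name used in the index.
#[derive(Default)]
pub struct ColumnNames {
    by_name: HashMap<String, String>,
}

impl ColumnNames {
    /// Register a column under both its display name and its physical name.
    pub fn register(&mut self, display_name: &str, physical_name: &str) {
        let field = normalize(physical_name);
        self.by_name.insert(normalize(display_name), field.clone());
        self.by_name.insert(field.clone(), field);
    }

    /// Resolve whatever the frontend sent to the index field, if known.
    pub fn index_field(&self, name: &str) -> Option<&str> {
        self.by_name.get(&normalize(name)).map(String::as_str)
    }
}
```

Returning `None` for unknown names lets the search layer reject a bad column constraint explicitly instead of silently matching nothing.
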
9. Add delete hooks for table/profile deletion.
   When a table or profile is deleted, the matching Tantivy docs/index directory should be cleaned up by code, not manually.
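
If each table has its own index directory, the delete hook can be a tolerant directory removal. The `<root>/<profile>/<table>` layout and both function names are assumptions made for this sketch; if several tables share one index, the hook would delete the matching documents instead.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Assumed layout: `<root>/<profile>/<table>` (hypothetical).
pub fn table_index_dir(root: &Path, profile: &str, table: &str) -> PathBuf {
    root.join(profile).join(table)
}

/// Remove an index directory. Returns `Ok(false)` if it was already gone, so
/// the hook is safe to call again after a partial failure.
pub fn remove_index_dir(dir: &Path) -> io::Result<bool> {
    match fs::remove_dir_all(dir) {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(false),
        Err(e) => Err(e),
    }
}
```

Profile deletion would call `remove_index_dir` on `root.join(profile)`. Profile and table names should be validated before being joined into a path.
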