fix(embedder): replace text-embedding-004 with gemini-embedding-001 #98
Conversation
text-embedding-004 is not the correct Gemini embedding model to use. Replace all references with gemini-embedding-001 throughout the codebase.

- embedder.go: update the default Gemini fallback model from text-embedding-004 to gemini-embedding-001
- inferEmbeddingDimensions: remove the text-embedding-004/005 case; the gemini-embedding-001 → 3072 entry already handles the correct model
- isLikelyGeminiEmbeddingModel: remove the redundant text-embedding-004 check; gemini-embedding-001 already matches via strings.Contains(m, "gemini")
- DOCS/examples (single-repo, multi-repo): update model and dimensions (768 → 3072 to match the gemini-embedding-001 output size)
- cmd/simili-web/README.md: update the example config snippet
- README.md: update the default model reference and dimension table

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
Sequence Diagram(s): Skipped — changes are focused on configuration/defaults, detection logic, tests, and a Qdrant helper; no new multi-component runtime flow requiring visualization.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Simili Triage Report — Quality Score: 9.5/10 (Excellent)

Generated by Simili Bot
🧹 Nitpick comments (1)
internal/integrations/ai/embedder.go (1)
205-208: The narrowed detection for legacy Google embedding models is technically valid but has minimal practical impact.

The function correctly checks only for the `gemini` substring, meaning legacy Google models like `text-embedding-004` and `text-embedding-005` won't be detected. However, `text-embedding-004` was shut down on the Gemini API on January 14, 2026, and `text-embedding-005` is a Vertex AI-specific model, not a Gemini API model. If a Gemini provider is accidentally configured with these legacy models, the auto-correction to `gemini-embedding-001` is actually the desired migration path. Documenting this behavior change in migration materials remains a good practice for users transitioning from older models.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/integrations/ai/embedder.go` around lines 205 - 208, isLikelyGeminiEmbeddingModel currently only matches the substring "gemini", which misses legacy Google embedding model names; update the function isLikelyGeminiEmbeddingModel to also detect legacy tokens (e.g., "text-embedding-004" and "text-embedding-005") by normalizing the input (strings.ToLower/TrimSpace) and returning true if the model string contains "gemini" OR contains any of those legacy identifiers so the auto-migration to "gemini-embedding-001" still triggers for those older names.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/integrations/ai/embedder.go`:
- Around line 205-208: isLikelyGeminiEmbeddingModel currently only matches the
substring "gemini", which misses legacy Google embedding model names; update the
function isLikelyGeminiEmbeddingModel to also detect legacy tokens (e.g.,
"text-embedding-004" and "text-embedding-005") by normalizing the input
(strings.ToLower/TrimSpace) and returning true if the model string contains
"gemini" OR contains any of those legacy identifiers so the auto-migration to
"gemini-embedding-001" still triggers for those older names.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e2efd6e5-65c7-4caa-a78d-87f9ffb44e16
📒 Files selected for processing (5)
- DOCS/examples/multi-repo/simili.yaml
- DOCS/examples/single-repo/simili.yaml
- README.md
- cmd/simili-web/README.md
- internal/integrations/ai/embedder.go
Pull request overview
Updates Simili’s Gemini embedding defaults to use the correct model and aligns documented/vector dimensions with the model’s 3072-d output.
Changes:
- Switch the Gemini default embedding model to `gemini-embedding-001`.
- Update embedding dimension inference to return 3072 for `gemini-embedding-001` and remove stale cases.
- Refresh docs/examples to reflect the new model name and 3072 dimensions.
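For reference, the documented example configuration after this change would look roughly like the following sketch (the exact key layout of the simili.yaml examples is assumed here, not confirmed from the diff):

```yaml
embedding:
  provider: gemini
  model: gemini-embedding-001   # was text-embedding-004
  dimensions: 3072              # was 768
```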
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/integrations/ai/embedder.go | Updates Gemini default model and adjusts dimension inference / model-detection heuristics. |
| DOCS/examples/single-repo/simili.yaml | Updates example embedding model and dimensions to 3072. |
| DOCS/examples/multi-repo/simili.yaml | Updates example embedding model and dimensions to 3072. |
| cmd/simili-web/README.md | Updates config snippet to the new model/dimensions. |
| README.md | Updates default model reference and the model→dimensions mapping table. |
```diff
 if strings.TrimSpace(model) == "" || isLikelyOpenAIEmbeddingModel(model) {
-	model = "text-embedding-004"
+	model = "gemini-embedding-001"
 }
```
NewEmbedder will now leave an explicitly configured legacy Gemini model like text-embedding-004/text-embedding-005 unchanged when provider == ProviderGemini, which is likely to result in a runtime API error given this PR’s premise that those aren’t valid Gemini embedding models. Consider mapping those legacy names to gemini-embedding-001 (optionally with a warning) or failing fast with a clear error so misconfigurations are caught at startup rather than during embedding.
```go
	}
}
trimmedModel := strings.TrimSpace(model)
if trimmedModel == "text-embedding-004" || trimmedModel == "text-embedding-005" {
	return nil, fmt.Errorf(
		"model %q is not a valid Gemini embedding model; use %q instead or update configuration",
		trimmedModel, "gemini-embedding-001",
	)
}
model = trimmedModel
```
internal/integrations/ai/embedder.go (Outdated)
```diff
-case strings.Contains(m, "text-embedding-004"), strings.Contains(m, "text-embedding-005"):
-	return 768
 default:
 	return 768
```
inferEmbeddingDimensions now defaults Gemini to 768 for any model string other than gemini-embedding-001. If the model name is provided in a different but equivalent form (e.g., fully-qualified) or if additional Gemini embedding models are supported later, this will silently produce a wrong dimension and can break vector DB collection sizing. Consider broadening the match (or using the returned embedding length from a probe call) and/or changing the Gemini default to the current default model’s dimension (3072).
```diff
-	return 768
+	// Default to the current Gemini embedding model dimension (3072)
+	return 3072
```
internal/integrations/ai/embedder.go (Outdated)
```diff
 func isLikelyGeminiEmbeddingModel(model string) bool {
 	m := strings.ToLower(strings.TrimSpace(model))
-	return strings.Contains(m, "gemini") || strings.Contains(m, "text-embedding-004") || strings.Contains(m, "text-embedding-005")
+	return strings.Contains(m, "gemini")
```
isLikelyGeminiEmbeddingModel was narrowed to only match strings containing gemini. This removes the previous safeguard that auto-corrected legacy Gemini embedding model names (e.g., text-embedding-004/text-embedding-005) when provider == ProviderOpenAI, which can lead to sending invalid model names to OpenAI. Consider keeping legacy checks (even if only for migration) or adding explicit per-provider model validation with a clear startup error.
```diff
-	return strings.Contains(m, "gemini")
+	// Primary check: explicit Gemini model identifiers.
+	if strings.Contains(m, "gemini") {
+		return true
+	}
+	// Legacy Gemini embedding model identifiers that may still be used or misconfigured
+	// under other providers (e.g., ProviderOpenAI). Keep recognizing them here to allow
+	// migration logic to auto-correct and avoid sending invalid models to OpenAI.
+	switch m {
+	case "text-embedding-004", "text-embedding-005":
+		return true
+	}
+	return false
```
internal/integrations/ai/embedder.go (Outdated)
```diff
 if strings.TrimSpace(model) == "" || isLikelyOpenAIEmbeddingModel(model) {
-	model = "text-embedding-004"
+	model = "gemini-embedding-001"
 }
```
The default model/dimension selection logic changed here, but internal/integrations/ai has unit tests for related selection logic (provider_test.go, retry_test.go) while embedder.go currently has no coverage. Adding a small unit test around model fallback and dimension inference (Gemini default -> 3072; OpenAI default -> 1536; cross-provider mismatches) would help prevent regressions.
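A self-contained sketch of the coverage this comment asks for, with a hypothetical re-implementation of the dimension-inference helper so the example runs standalone (the real signature of `inferEmbeddingDimensions` and the exact model-to-size table in embedder.go are assumptions, not confirmed source):

```go
package main

import (
	"fmt"
	"strings"
)

// inferEmbeddingDimensions is a hypothetical sketch of the helper under review:
// map known model names to their vector sizes, and default unknown Gemini
// variants to the current default model's dimension (3072) per the suggestion.
func inferEmbeddingDimensions(model string) int {
	m := strings.ToLower(strings.TrimSpace(model))
	switch {
	case strings.Contains(m, "gemini-embedding-001"):
		return 3072
	case strings.Contains(m, "text-embedding-3-large"):
		return 3072
	case strings.Contains(m, "text-embedding-3-small"),
		strings.Contains(m, "text-embedding-ada-002"):
		return 1536
	case strings.Contains(m, "gemini"):
		// Unknown Gemini variant: assume the default model's size.
		return 3072
	default:
		// OpenAI-style fallback.
		return 1536
	}
}

func main() {
	// Table-driven spot checks, mirroring the cases the review comment names.
	cases := []struct {
		model string
		want  int
	}{
		{"gemini-embedding-001", 3072},
		{"text-embedding-3-small", 1536},
		{"models/gemini-embedding-exp", 3072}, // fully-qualified / unknown Gemini form
	}
	for _, c := range cases {
		got := inferEmbeddingDimensions(c.model)
		fmt.Printf("%-30s -> %d (want %d)\n", c.model, got, c.want)
	}
}
```

In the real package this would live in an embedder_test.go alongside provider_test.go, using `testing.T` rather than a `main` function.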
Copilot review fixes:

1. Fail fast with a clear error when a legacy Gemini model name (text-embedding-004 / text-embedding-005) is passed at startup, guiding users to gemini-embedding-001 immediately.
2. inferEmbeddingDimensions: default unknown Gemini models to 3072 (the current default model's dimension) instead of the old 768.
3. isLikelyGeminiEmbeddingModel: restore legacy name recognition for text-embedding-004/005 so they are never forwarded to OpenAI.
4. Add embedder_test.go with full coverage of dimension inference, Gemini model detection, and legacy model rejection.

Stale 768-dimension references replaced with 3072 across all files:

- internal/core/config/config.go (applyDefaults)
- .simili.yaml
- .github/workflows/e2e-test.yml (two inline config blocks)
- DOCS/0.0.2v/plan.md
- .claude/sessions/2026-02-02-0941-v0.0.2v-foundation.md

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
Actionable comments posted: 1
🧹 Nitpick comments (1)
.github/workflows/e2e-test.yml (1)
108-108: Reduce config drift by centralizing embedding constants in this workflow.

Lines 108 and 208 are consistent now, but this model/dimension pair is duplicated in multiple heredocs. Consider templating these values once (e.g., placeholders + a single substitution step) so future model upgrades don’t miss one path.
Also applies to: 208-208
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/e2e-test.yml at line 108, Hardcoded embedding constants (e.g., the "dimensions: 3072" entries repeated in multiple heredocs) should be centralized: add workflow-level variables (e.g., EMBEDDING_DIMENSIONS and EMBEDDING_MODEL) in the job/env section and replace each literal "dimensions: 3072" and model string inside the heredocs with a placeholder that references those variables, then ensure the heredocs are rendered with those env vars (use GitHub Actions env interpolation or a single substitution step like envsubst) so future model/dimension updates only change the centralized EMBEDDING_* values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/core/config/config.go`:
- Around line 283-285: The config default change to 3072 can hide mismatches
with existing Qdrant collections (e.g., 768), so update the Qdrant startup flow
to detect and fail on dimension mismatches: in
internal/integrations/qdrant/client.go, inside CreateCollection (the branch that
treats existing collections as success), fetch the existing collection metadata
(collection info/describe), compare its vector size to
config.Embedding.Dimensions (from internal/core/config/config.go), and if they
differ return a clear fatal error describing the expected vs actual dimensions
and a remediation (e.g., recreate collection or set correct embedding dimension)
so the service fails at startup rather than deferring to runtime; include the
dimension numbers and next steps in the error message.
---
Nitpick comments:
In @.github/workflows/e2e-test.yml:
- Line 108: Hardcoded embedding constants (e.g., the "dimensions: 3072" entries
repeated in multiple heredocs) should be centralized: add workflow-level
variables (e.g., EMBEDDING_DIMENSIONS and EMBEDDING_MODEL) in the job/env
section and replace each literal "dimensions: 3072" and model string inside the
heredocs with a placeholder that references those variables, then ensure the
heredocs are rendered with those env vars (use GitHub Actions env interpolation
or a single substitution step like envsubst) so future model/dimension updates
only change the centralized EMBEDDING_* values.
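The centralization described above could be sketched roughly as follows (the `EMBEDDING_*` variable names come from the review comment; the step name and heredoc structure of the actual workflow are assumptions):

```yaml
env:
  EMBEDDING_MODEL: gemini-embedding-001
  EMBEDDING_DIMENSIONS: "3072"

jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - name: Write test config
        # Unquoted EOF lets the shell expand the centralized env vars,
        # so every heredoc picks up the same model/dimension pair.
        run: |
          cat > simili.yaml <<EOF
          embedding:
            model: ${EMBEDDING_MODEL}
            dimensions: ${EMBEDDING_DIMENSIONS}
          EOF
```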
ℹ️ Review info — Run ID: 0a3ddf6f-7ecf-4f98-ab3f-ef41f51b0720
📒 Files selected for processing (7)
- .claude/sessions/2026-02-02-0941-v0.0.2v-foundation.md
- .github/workflows/e2e-test.yml
- .simili.yaml
- DOCS/0.0.2v/plan.md
- internal/core/config/config.go
- internal/integrations/ai/embedder.go
- internal/integrations/ai/embedder_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/integrations/ai/embedder.go
🧪 E2E Test — ✅ Bot responded: yes | Auto-closer (dry-run) | processed: 0, closed: 0, grace: 0, human: 0 | Test repo → gh-simili-bot/simili-e2e-22705431212. Auto-generated by E2E pipeline.
…nt mismatches
When CreateCollection finds an existing collection it previously returned
nil immediately, meaning a 768-dim collection would survive startup and
fail only at first write when a 3072-dim vector was upserted.
Add validateCollectionDimension: fetches the existing collection's
VectorParams.Size via GetCollectionInfoRequest and compares against the
requested dimension. Returns a clear fatal-ready error describing the
mismatch and the remediation steps (recreate collection or align
embedding.dimensions in config).
Path: CollectionInfo → CollectionConfig → CollectionParams
→ VectorsConfig → VectorParams.GetSize()
If collection info is unavailable the check is skipped gracefully so
read-only or restricted deployments are not broken.
Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>
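The comparison logic the commit describes can be modeled in isolation (this sketch replaces the real Qdrant lookup via GetCollectionInfoRequest with plain integer arguments so it runs standalone; the function name and error wording here are illustrative, not the actual client code):

```go
package main

import "fmt"

// validateDimension mirrors the startup check described above: compare the
// existing collection's vector size against the configured embedding
// dimension and return a remediation-oriented error on mismatch. The real
// implementation obtains `existing` from VectorParams.GetSize() on the
// collection info response.
func validateDimension(name string, existing, configured int) error {
	if existing == configured {
		return nil
	}
	return fmt.Errorf(
		"collection %q has vector size %d but embedding.dimensions is %d; "+
			"recreate the collection or set embedding.dimensions to %d",
		name, existing, configured, existing,
	)
}

func main() {
	// Matching sizes pass silently at startup.
	fmt.Println(validateDimension("issues", 3072, 3072)) // prints <nil>
	// A stale 768-dim collection now fails fast instead of at first upsert.
	fmt.Println(validateDimension("issues", 768, 3072))
}
```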
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/integrations/qdrant/client.go (1)
95-115: ⚠️ Potential issue | 🟠 Major — Reject non-positive dimensions before issuing collection requests.

`CreateCollection` should fail fast for invalid `dimension` values instead of proceeding into existence/creation logic. This prevents invalid vector-size requests from reaching Qdrant.

Proposed fix

```diff
 func (c *Client) CreateCollection(ctx context.Context, name string, dimension int) error {
+	if dimension <= 0 {
+		return fmt.Errorf("invalid embedding dimension %d: must be > 0", dimension)
+	}
+
 	// Check if exists first
 	exists, err := c.CollectionExists(ctx, name)
 	if err != nil {
 		return err
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/integrations/qdrant/client.go` around lines 95 - 115, In Client.CreateCollection, validate the incoming dimension argument and return an error immediately for non-positive values (<= 0) before calling CollectionExists or making any collection requests; update the method so it checks dimension at the top of the function (in CreateCollection) and returns a clear error instead of proceeding, keeping the rest of the flow (CollectionExists call and subsequent Create flow and call to validateCollectionDimension) unchanged.
🧹 Nitpick comments (1)
internal/integrations/qdrant/client.go (1)
137-140: Avoid silently skipping validation on collection-info fetch errors.

On lines 138-139, returning `nil` for any metadata lookup failure weakens the startup check and can reintroduce write-time failures. Consider returning an error (or only skipping on explicitly tolerated cases).

Proposed fix

```diff
 resp, err := c.collections.Get(authCtx, &pb.GetCollectionInfoRequest{
 	CollectionName: name,
 })
 if err != nil {
-	// Cannot confirm — proceed and let Qdrant reject mismatched vectors at write time.
-	return nil
+	return fmt.Errorf("failed to inspect collection %q dimension: %w", name, err)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/integrations/qdrant/client.go` around lines 137 - 140, The code currently swallows any error from the collection-info fetch (the block where `if err != nil { return nil }`), which weakens startup validation; change this to propagate the error instead of returning nil (i.e., return the `err` or wrap it with context) so callers can fail fast, but if you intentionally want to tolerate specific cases only allow skipping on explicit, documented conditions (e.g., a NotFound/404 from Qdrant or a known transient error type such as `qdrant.ErrCollectionNotFound`) by checking `err` for those cases and continuing only then; update the collection-info fetch error handling to either return `err` (or a wrapped error) or explicitly handle tolerated error types.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@internal/integrations/qdrant/client.go`:
- Around line 95-115: In Client.CreateCollection, validate the incoming
dimension argument and return an error immediately for non-positive values (<=
0) before calling CollectionExists or making any collection requests; update the
method so it checks dimension at the top of the function (in CreateCollection)
and returns a clear error instead of proceeding, keeping the rest of the flow
(CollectionExists call and subsequent Create flow and call to
validateCollectionDimension) unchanged.
---
Nitpick comments:
In `@internal/integrations/qdrant/client.go`:
- Around line 137-140: The code currently swallows any error from the
collection-info fetch (the block where `if err != nil { return nil }`), which
weakens startup validation; change this to propagate the error instead of
returning nil (i.e., return the `err` or wrap it with context) so callers can
fail fast, but if you intentionally want to tolerate specific cases only allow
skipping on explicit, documented conditions (e.g., a NotFound/404 from Qdrant or
a known transient error type such as `qdrant.ErrCollectionNotFound`) by checking
`err` for those cases and continuing only then; update the collection-info fetch
error handling to either return `err` (or a wrapped error) or explicitly handle
tolerated error types.
ℹ️ Review info — Run ID: 14d79580-7d9d-41b2-9116-44b676a1e093
📒 Files selected for processing (1)
internal/integrations/qdrant/client.go
🧪 E2E Test — ✅ Bot responded: yes | Auto-closer (dry-run) | processed: 0, closed: 0, grace: 0, human: 0 | Test repo → gh-simili-bot/simili-e2e-22706877355. Auto-generated by E2E pipeline.
Problem
`text-embedding-004` is not the correct Gemini embedding model. The right model is `gemini-embedding-001`, which also outputs 3072 dimensions rather than 768.

Changes

- internal/integrations/ai/embedder.go: text-embedding-004 → gemini-embedding-001; remove stale dimension case and detection check
- DOCS/examples/single-repo/simili.yaml: dimensions 768 → 3072
- DOCS/examples/multi-repo/simili.yaml: dimensions 768 → 3072
- cmd/simili-web/README.md: update example config snippet
- README.md: update default model reference and dimension table

Test plan

- `go test ./internal/integrations/ai/...` — all pass
- `go test ./internal/core/config/...` — all pass
- `grep -r "text-embedding-004"` returns no results

Summary by CodeRabbit
Documentation
Configuration Updates
Behavior Change
Tests