Release v0.10.x: Digital Twin, M35 COS Enhancement, AI Provider Fallback #9

Merged: atomantic merged 108 commits into main from dev on Feb 1, 2026

@atomantic atomantic commented Jan 21, 2026

Summary

Major release introducing the Digital Twin identity system, M35 Chief of Staff Enhancement (proactive autonomous agent), AI provider usage limit handling with automatic fallback, and significant UI/UX improvements.

🎉 Key Features

M35: Chief of Staff Enhancement

  • Hybrid memory search with BM25 + vector search (40/60 weighting)
  • Tool execution state machine (IDLE → START → RUNNING → UPDATE → END → ERROR)
  • Agent gateway with request deduplication and 10-minute cache
  • Error recovery with 6 strategies (retry, escalate, fallback, decompose, defer, investigate)
  • Event scheduler with cron expressions and timeout-safe timers
  • Execution lanes: critical (1), standard (2), background (3) concurrent slots
  • Mission system for long-term goals with sub-tasks
  • LM Studio integration for local model completions
  • Thinking levels (off/minimal/low/medium/high/xhigh) with dynamic model selection
  • Context upgrader with complexity analysis
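
The 40/60 weighting in the hybrid memory search bullet could be combined roughly as follows. This is a minimal sketch, not the PortOS implementation: the function names, the input shape, and the min-max normalization step are assumptions, and it assumes the weights follow the order in which the two retrievers are listed (0.4 for BM25, 0.6 for vector).

```javascript
// Hypothetical sketch of hybrid ranking: 40% BM25 (keyword) + 60% vector (semantic).
const BM25_WEIGHT = 0.4;
const VECTOR_WEIGHT = 0.6;

// Min-max normalize scores to [0, 1] so the two score scales are comparable.
function normalize(scores) {
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  if (max === min) return scores.map(() => (max === 0 ? 0 : 1));
  return scores.map((s) => (s - min) / (max - min));
}

// results: [{ id, bm25, vector }] holding the raw score from each retriever.
function hybridRank(results) {
  if (results.length === 0) return [];
  const bm25 = normalize(results.map((r) => r.bm25));
  const vector = normalize(results.map((r) => r.vector));
  return results
    .map((r, i) => ({ id: r.id, score: BM25_WEIGHT * bm25[i] + VECTOR_WEIGHT * vector[i] }))
    .sort((a, b) => b.score - a.score);
}
```

Normalizing before weighting matters because raw BM25 scores are unbounded while cosine similarities are not; without it the 40/60 split would not mean much.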

Digital Twin (formerly Soul)

  • Complete identity scaffold system for AI persona management
  • Document management with categories (core, audio, behavioral, enrichment)
  • Multi-model behavioral testing with side-by-side comparison
  • Enrichment questionnaire across 14 categories
  • Quantitative personality modeling (Big Five, values hierarchy, communication profile)
  • Confidence scoring with gap recommendations
  • External data import (Goodreads, Spotify, Letterboxd, iCal)
  • Writing sample analysis for voice pattern extraction
  • Export formats: System Prompt, CLAUDE.md, JSON, individual files
  • Mobile-responsive design throughout

AI Provider Usage Limit Handling

  • Automatic detection of provider usage limits
  • Provider status tracking with persistence across restarts
  • Configurable fallback providers at both provider and task level
  • Real-time status updates via WebSocket
  • Manual recovery option when limits reset
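
One way to picture the fallback behavior listed above is a walk along a chain of configured fallbacks until a provider that is not rate-limited is found. The sketch below is illustrative only: the record shape, the `limited` flag, the `fallbackId` field, and the rule that a task-level fallback takes precedence over the provider-level one are all assumptions, not details confirmed by this PR.

```javascript
// Hypothetical provider records; field names are illustrative only.
const providers = {
  primary: { id: 'primary', fallbackId: 'secondary' },
  secondary: { id: 'secondary', fallbackId: 'local' },
  local: { id: 'local', fallbackId: null },
};

// status[id] = { limited: boolean }; a real system would persist this across restarts.
function resolveProvider(startId, providers, status, taskFallbackId = null) {
  const seen = new Set(); // guards against cycles in the fallback chain
  let id = startId;
  let first = true;
  while (id && providers[id] && !seen.has(id)) {
    seen.add(id);
    if (!status[id]?.limited) return id;
    // Assumption: a task-level fallback overrides the provider-level one on the first hop.
    id = first && taskFallbackId ? taskFallbackId : providers[id].fallbackId;
    first = false;
  }
  return null; // every candidate is limited, or the chain is broken
}
```

Returning `null` rather than throwing leaves it to the caller to decide whether to defer the task or surface the manual recovery option.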

Dashboard Improvements

  • Activity streak tracking with gamification (current/longest streak, visual badges)
  • ETA countdown for running CoS agents with overtime indicators
  • Hourly activity heatmap
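
The current/longest streak values could be derived from a list of active days along these lines. This is a sketch under stated assumptions: the function name, the `YYYY-MM-DD` (UTC) input format, and the rule that a streak stays "current" if the last active day is today or yesterday are not taken from the PR.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// days: array of 'YYYY-MM-DD' strings (parsed as UTC) on which activity occurred.
// today: 'YYYY-MM-DD' string for the reference day.
function computeStreaks(days, today) {
  const stamps = [...new Set(days)].map((d) => Date.parse(d)).sort((a, b) => a - b);
  let longest = 0;
  let run = 0;
  let prev = null;
  for (const t of stamps) {
    // Extend the run only when this day directly follows the previous active day.
    run = prev !== null && t - prev === DAY_MS ? run + 1 : 1;
    longest = Math.max(longest, run);
    prev = t;
  }
  // Assumption: the streak is still current if the latest active day is today or yesterday.
  const gap = prev === null ? Infinity : Date.parse(today) - prev;
  const current = gap <= DAY_MS ? run : 0;
  return { current, longest };
}
```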

CoS Navigation & Mobile UX

  • Sidebar sub-navigation for CoS (Tasks, Agents, Scripts, Schedule, Digest, Learning, Memory, Health, Config)
  • Collapsible agent panel on mobile with toggle header
  • Touch-friendly tabs with 40px minimum height
  • Badges on collapsible nav sections in both states

🔧 Improvements

  • Task enhancer now uses cos-task-enhance prompt stage for provider/model configuration
  • Prompt Manager stages/variables sorted alphabetically
  • Learning-based model selection for CoS (upgrades models for historically failing tasks)
  • Provider/model compatibility validation in agent spawning
  • AI Toolkit integration in PortOS Stack templates
  • Folder picker for directory selection
  • Structured changelog documentation system
  • Auto-rehabilitation for skipped task types after grace period
  • Learning data reset for auto-paused task types

🐛 Fixes

  • Fixed ENOENT error in task prompt enhancement (now uses runId from createRun)
  • Fixed multi-line context loss in agent resume tasks (proper newline escaping in TASKS.md)
  • Fixed "Unexpected end of JSON input" errors with safe JSON parsing across 15+ services
  • Fixed circular dependency breaking task spawning
  • Fixed deadlock in memory soft delete causing delete button to hang
  • Fixed Brain MemoryTab delete success detection
  • Provider override now works correctly in CoS tasks
  • System task deletion passes correct task type
  • Removed debug console.log statements from Media page
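
For context on the "Unexpected end of JSON input" fix: that error is what `JSON.parse` throws on an empty or truncated string. A defensive wrapper might look like the sketch below. The name `safeJSONParse` appears in a later commit in this PR, but the signature and fallback behavior shown here are assumptions.

```javascript
// Parse JSON text, returning `fallback` instead of throwing on bad input.
function safeJSONParse(text, fallback = null) {
  // Empty or whitespace-only input is the usual source of "Unexpected end of JSON input".
  if (typeof text !== 'string' || text.trim() === '') return fallback;
  try {
    return JSON.parse(text);
  } catch {
    return fallback; // truncated or otherwise corrupted content
  }
}
```

Callers pass a fallback that matches the expected shape, for example `safeJSONParse(raw, [])` for a list store, so downstream code does not need its own null checks.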

🗑️ Removed

  • Dead code cleanup (~1,220 lines of unused .old.js files removed)
  • Shared file utilities reduce DRY violations

Test Plan

  • Verify CoS sidebar sub-navigation works on desktop and mobile
  • Test collapsible agent panel on mobile devices
  • Verify Digital Twin documents can be created, edited, and deleted
  • Test AI provider fallback when primary provider is unavailable
  • Confirm activity streak displays correctly on Dashboard
  • Validate ETA countdown shows and updates for running agents
  • Test task enhancement uses prompt stage configuration
  • Verify JSON parsing doesn't crash on empty/corrupted files

github-actions bot and others added 30 commits January 10, 2026 05:11
…asks

Add ability to configure specific AI provider and model for each scheduled
improvement task type in the CoS schedule UI.

Changes:
- Add providerId and model fields to task schedule schema
- Update ScheduleTab UI with provider and model dropdowns
- Update task generation to use configured provider/model settings
- Update API endpoints to accept provider and model configuration
- Fall back to active provider and default model when not specified

Allows fine-grained control over which AI models handle different types of
improvement tasks (e.g., use Opus for security audits, Sonnet for code quality).

Add ability to view and customize prompts for each scheduled improvement task type in the CoS schedule UI.

Changes:
- Extract default prompts from cos.js to taskSchedule.js
- Add prompt field to task schedule schema (nullable, falls back to defaults)
- Add prompt editing UI in ScheduleTab with textarea and Save/Cancel buttons
- Update API endpoints to accept prompt parameter
- Update task generation to use stored prompts (custom or default)
- Support template variables {appName} and {repoPath} for app improvement prompts
- Show visual indicator when using custom vs default prompts
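
The nullable prompt field and the `{appName}` / `{repoPath}` template variables from the list above could work together roughly like this. It is a sketch, not the code from `taskSchedule.js`: the default prompt text and the function name are hypothetical, and leaving unknown placeholders untouched is an assumption.

```javascript
// Hypothetical default prompt; the real defaults live in taskSchedule.js per the commit.
const DEFAULT_PROMPT = 'Review {appName} at {repoPath} for code quality issues.';

function buildPrompt(customPrompt, vars) {
  const template = customPrompt ?? DEFAULT_PROMPT; // null or undefined falls back to the default
  // Replace {name} placeholders; unknown placeholders are left as written.
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(vars, key) ? String(vars[key]) : match
  );
}
```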

Allows fine-grained control over task instructions, enabling users to customize
analysis depth, focus on specific areas, or adjust prompts for better results.

Implements offline-first "second brain" with AI-powered classification:
- Single inbox capture → AI classifies to People/Projects/Ideas/Admin
- Confidence threshold gating (60%) with human-in-loop corrections
- Daily digests (<150 words) and weekly reviews (<250 words)
- JSONL storage for append-heavy logs, JSON for entity stores
- Tab-based UI with deep linking (/brain/inbox, /brain/memory, etc.)
- Full CRUD for all entity types with inline editing
- Trust panel with audit trail and settings management
- Scheduler for automated digest/review generation
- 46 tests covering all Brain API endpoints
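
The confidence threshold gating in the Brain commit above can be sketched as a small routing decision. The 60% threshold comes from the commit message; the function name, the status strings, and treating the threshold as inclusive are assumptions.

```javascript
const CONFIDENCE_THRESHOLD = 0.6; // 60%, per the commit message

// classification: { destination: 'people' | 'projects' | 'ideas' | 'admin', confidence: 0..1 }
function routeCapture(classification, threshold = CONFIDENCE_THRESHOLD) {
  const { destination, confidence } = classification;
  if (Number.isFinite(confidence) && confidence >= threshold) {
    return { status: 'filed', destination };
  }
  // Below threshold (or no usable score): keep in the inbox for a human to confirm or correct.
  return { status: 'needs_review', suggested: destination };
}
```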
Add runner stability check before spawning agents to prevent
the race condition where server starts before cos-runner during
a rolling restart, spawns an agent, then the runner restarts
and orphans the agent.

The fix waits up to 15 seconds for the runner to have at least
10 seconds of uptime before spawning any agents.
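
The stability check described above amounts to polling the runner's uptime with a deadline. A minimal sketch, assuming a hypothetical `getUptimeSeconds` callback and a 1-second poll interval (only the 15-second wait and 10-second uptime figures come from the commit message):

```javascript
// getUptimeSeconds: async function returning the runner's uptime in seconds,
// or null when the runner is unreachable. Both the callback and options are hypothetical.
async function waitForStableRunner(getUptimeSeconds, {
  minUptimeSec = 10,
  maxWaitMs = 15000,
  pollMs = 1000,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
} = {}) {
  let waited = 0;
  while (true) {
    const uptime = await getUptimeSeconds();
    if (uptime !== null && uptime >= minUptimeSec) return true;
    if (waited >= maxWaitMs) return false; // the caller decides whether to skip spawning
    await sleep(pollMs);
    waited += pollMs;
  }
}
```

The `sleep` option is injectable so the loop can be exercised in tests without real delays.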
- PLAN.md: Mark M16 complete, add M31 (LLM Memory Classification) and M32 (Brain) milestones with full documentation
- README.md: Add Weekly Digest and Brain features to feature list
- API.md: Add CoS Weekly Digest endpoints and Brain API endpoints
- ARCHITECTURE.md: Add Brain service and data directory structure

- Add ConfigTab.jsx with AI provider/model selection, confidence threshold, and schedule settings
- Update Brain settings route to validate provider IDs and models against actual providers
- Alphabetize main navigation with Dashboard at top followed by separator
- Ensure model selection is validated against provider's available models
- Fix retryClassification to update existing entry instead of creating duplicate
- Add CLI provider support to brain service callAI function
- Brain prompt templates already exist in data.sample (need to be copied to data/prompts/stages)
- Prompt stage config needs brain-classifier, brain-daily-digest, brain-weekly-review registered

Server updates:
- vitest: 2.1.8 -> 4.0.16 (fixes esbuild/vite moderate vulnerabilities)
- @vitest/coverage-v8: 2.1.8 -> 4.0.16 (companion update)
- supertest: 7.1.4 -> 7.2.2 (patch update)

Client updates:
- react-router-dom: updated via npm audit fix (fixes CSRF and XSS vulnerabilities)

Remaining known issues:
- pm2 has a low severity ReDoS vulnerability with no fix available

All 445 tests pass, client build successful.

Fixed multiple issues with the Brain retry AI functionality:
- Fixed undefined function call: replaced fileThought with fileToDestination
- Fixed provider/model parameter passing in retryClassification
- Rewrote retryClassification to update existing entry instead of creating duplicates
- Added CLI provider support to callAI function
- Removed excessive debug logging

Added comprehensive tooltips to all buttons throughout the Brain inbox:
- Capture thought button
- Route to destination buttons (People, Projects, Ideas, Admin)
- Retry AI button
- View, Fix, Move, Cancel buttons in filed entries

Added edit and delete functionality to inbox entries:
- Edit button to modify captured text with save/cancel
- Delete button with inline confirmation dialog (no window.confirm)
- Works for all entry types: needs review, error, and filed entries

Backend changes:
- Added PUT /api/brain/inbox/:id endpoint to update entry text
- Added DELETE /api/brain/inbox/:id endpoint to delete entries
- Added updateInboxInputSchema validation
- Added deleteInboxLog function to brainStorage
- Added updateInboxEntry and deleteInboxEntry service functions

Frontend changes:
- Added edit mode with textarea and save/cancel buttons
- Added delete confirmation UI with delete/cancel buttons
- Added tooltips (title attributes) to all interactive buttons
- Imported Trash2, Save, and X icons from lucide-react

All changes follow CLAUDE.md guidelines:
- No window.alert or window.confirm usage
- Uses inline confirmation dialogs
- Maintains modular DRY design
- Single-line logging with emoji prefixes

- Replace Reload button with Run Now button that triggers task evaluation
- Use Play icon and emerald styling to match Start button pattern
- Tasks stay up to date via WebSocket events, making reload unnecessary

- Remove manual Task ID input from User Tasks form
- Task IDs are now always auto-generated
- Add toggle to specify task position (top/bottom of queue)
- Default position is bottom of queue
- Update server route and service to handle position parameter

- Add checkForTaskCommit() to validate if agent made a commit with task ID
- Modify completeAgentRun() to check for task commits even if exit code is non-zero
- Override success flag if commit with [task-{taskId}] pattern is found
- Prevents false negatives where agents report failure but work completed
- Add execSync import for git log validation

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
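
The commit-based override above can be sketched as a pure check over commit subjects, which keeps the matching logic testable without a git repository. The helper names here are hypothetical; per the commit message, the actual `checkForTaskCommit()` reads `git log` via `execSync`, which this sketch leaves out.

```javascript
// Escape regex metacharacters so a task ID is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// commitSubjects: array of commit subject lines, e.g. the output of
// `git log --format=%s` split on newlines.
function hasTaskCommit(commitSubjects, taskId) {
  const pattern = new RegExp(`\\[task-${escapeRegExp(taskId)}\\]`);
  return commitSubjects.some((line) => pattern.test(line));
}

// A non-zero exit code is overridden when a matching commit shows the work landed.
function resolveSuccess(exitCode, commitSubjects, taskId) {
  return exitCode === 0 || hasTaskCommit(commitSubjects, taskId);
}
```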
Add vision test service to verify LM Studio models can correctly interpret
images and screenshots. Includes API endpoints for health checks, single
image tests, and full test suites.

- Send screenshots to vision-capable models via proper image_url format
- Client passes screenshot paths to createRun API
- Server loads screenshots as base64 and constructs OpenAI-compatible vision messages
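
For reference, an OpenAI-compatible vision message carries images as `image_url` content parts, and a local file is typically embedded as a base64 data URL. The sketch below shows that shape; the function name, the MIME lookup table, and the injectable `readFile` parameter are assumptions rather than the PortOS code.

```javascript
const fs = require('node:fs');
const path = require('node:path');

const MIME_BY_EXT = { '.png': 'image/png', '.jpg': 'image/jpeg', '.jpeg': 'image/jpeg', '.webp': 'image/webp' };

// Build one user message containing the prompt text plus each screenshot as a data URL.
function buildVisionMessage(prompt, screenshotPaths, readFile = fs.readFileSync) {
  const images = screenshotPaths.map((p) => {
    const mime = MIME_BY_EXT[path.extname(p).toLowerCase()] ?? 'image/png'; // default is a guess
    const base64 = readFile(p).toString('base64');
    return { type: 'image_url', image_url: { url: `data:${mime};base64,${base64}` } };
  });
  return { role: 'user', content: [{ type: 'text', text: prompt }, ...images] };
}
```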
- Fix output not being saved by writing partial output even on stream errors
- Add outputSize to failed run metadata for consistency

- Add Success, Running, and Failed status filters
- Client-side filtering of runs based on status
- Show appropriate empty state when no runs match filters

Comprehensive architecture plan for @portos/ai-toolkit package that would
extract common AI provider, model selection, and prompt template patterns
into a reusable library for PortOS-style Express/React/Tailwind projects.

- Add execution ID section at top of expanded run details
- Include copy-to-clipboard button for easy ID copying
- Shows run.id as the execution ID for tracking and reference

The media page now accesses the camera and microphone from the PortOS server
machine rather than the client browser. This enables remote monitoring and
recording of the server's physical environment.

Changes:
- Create mediaService for FFmpeg-based camera/mic capture
- Add /api/media routes for device enumeration and streaming
- Update Media.jsx to consume server streams via MJPEG (video) and WebM (audio)
- Add audio analyzer with CORS support for real-time level monitoring
- Add default export to api.js for simplified HTTP methods

server/routes/media.js:118
server/services/mediaService.js:167
client/src/pages/Media.jsx:325
atomantic and others added 2 commits January 31, 2026 13:12
- fileUtils: wrap JSON.parse in try/catch in safeJSONParse to handle syntax errors
- taskParser: use JSON encoding for values with special chars, preserving reversibility
- index.js: remove promptPreview/outputTail from error events to avoid leaking secrets
- PromptManager: handle createStage errors properly with user feedback
- DevTools: combine source and status filters in filteredRuns
- Layout: use unique keys for separator elements
@atomantic atomantic requested a review from Copilot January 31, 2026 21:13
Copilot AI left a comment

Pull request overview

Copilot reviewed 93 out of 167 changed files in this pull request and generated 6 comments.

Files not reviewed (1)
  • client/package-lock.json: Language not supported

atomantic and others added 2 commits January 31, 2026 13:25
- taskParser.js: Use sentinel prefix for JSON-encoded metadata values
  to prevent incorrectly interpreting user values wrapped in quotes
- cos-runner/index.js: Validate cliArgs type before spawn() call
- DirectoryPicker.jsx: Add value to useEffect dependency array
- TrustTab.jsx: Guard against NaN from parseFloat on confidence threshold
- PromptManager.jsx: Handle fetch errors in deleteStage usage check
- fileUtils.test.js: Add comprehensive unit tests for JSON utilities
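
The sentinel-prefix idea in the taskParser.js item above is that only values carrying an explicit marker are JSON-decoded, so user text that merely happens to be wrapped in quotes is left alone. A minimal sketch follows; the `@json:` marker and both function names are hypothetical, since the commit does not state the actual prefix.

```javascript
const SENTINEL = '@json:'; // hypothetical marker; the real prefix is not stated in the commit

function encodeValue(value) {
  // Plain single-line values are stored as-is; anything risky is JSON-encoded and tagged.
  const needsEncoding = /[\r\n"]/.test(value) || value.startsWith(SENTINEL);
  return needsEncoding ? SENTINEL + JSON.stringify(value) : value;
}

function decodeValue(stored) {
  if (!stored.startsWith(SENTINEL)) return stored; // includes untagged text wrapped in quotes
  try {
    return JSON.parse(stored.slice(SENTINEL.length));
  } catch {
    return stored; // malformed payload: fall back to the raw text
  }
}
```

Because `JSON.stringify` escapes newlines, an encoded value always fits on one line, which is what a line-oriented file such as TASKS.md needs.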
@atomantic atomantic requested a review from Copilot January 31, 2026 21:27
Copilot AI left a comment

Pull request overview

Copilot reviewed 94 out of 168 changed files in this pull request and generated 7 comments.

Files not reviewed (1)
  • client/package-lock.json: Language not supported

atomantic and others added 3 commits January 31, 2026 13:37
- Moved M8 Prompt Manager to docs/features/prompt-manager.md
- Moved M19 CoS Agent Runner to docs/features/cos-agent-runner.md
- Moved M33 Soul System to docs/features/soul-system.md
- Moved M34 Digital Twin to docs/features/digital-twin.md
- Moved M35 CoS Enhancement to docs/features/cos-enhancement.md
- Condensed PLAN.md from 1060 to 178 lines with doc references
- Updated Next Actions section with current priorities
- Removed obsolete SHARED_AI_LIBRARY_PLAN.md (toolkit now external npm module)
- release.yml: Capture changelog commit hash before switching branches
  to fix cherry-pick picking wrong commit
- cos-runner/index.js: Use !== undefined for cliArgs to allow empty arrays
- server/index.js: Use absolute paths from __dirname for AI toolkit config
- AIProviders.jsx: Treat undefined enabled as enabled (p.enabled !== false)
- ScheduleTab.jsx: Reset model when provider changes to avoid stale selection
- FolderPicker.jsx: Add dialog accessibility attributes (role, aria-modal, aria-labelledby)
- taskParser.js: Clarify legacy fallback is only for pre-sentinel historical data
@atomantic atomantic requested a review from Copilot January 31, 2026 21:39
Copilot AI left a comment

Pull request overview

Copilot reviewed 98 out of 172 changed files in this pull request and generated 7 comments.

Files not reviewed (1)
  • client/package-lock.json: Language not supported

atomantic and others added 2 commits January 31, 2026 13:52
Security and reliability improvements:
- cos-runner/index.js: Add command allowlist to prevent RCE via spawn()
- fileUtils.js: Remove existsSync() TOCTOU races, use async-only patterns
- fileUtils.js: Handle ENOENT specifically, log other I/O errors
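
The TOCTOU point above is that checking `existsSync()` and then reading leaves a window in which the file can disappear; attempting the read and handling `ENOENT` avoids that. A sketch of the pattern, with a hypothetical function name and fallback parameter:

```javascript
const fs = require('node:fs/promises');

// Read and parse a JSON file without a separate existence check.
async function readJSONFile(filePath, fallback = null) {
  let text;
  try {
    text = await fs.readFile(filePath, 'utf8'); // no existsSync() pre-check
  } catch (err) {
    // A missing file is expected; other I/O errors are worth a warning.
    if (err.code !== 'ENOENT') console.warn(`Failed to read ${filePath}: ${err.message}`);
    return fallback;
  }
  try {
    return JSON.parse(text);
  } catch {
    return fallback; // empty or corrupted content
  }
}
```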

UI fixes:
- ScheduleTab.jsx: Show current model in dropdown even when provider not loaded

Documentation:
- soul-system.md: Update to digital-twin naming (routes, services, pages)
@atomantic atomantic requested a review from Copilot January 31, 2026 21:54
Copilot AI left a comment

Pull request overview

Copilot reviewed 98 out of 172 changed files in this pull request and generated 14 comments.

Files not reviewed (1)
  • client/package-lock.json: Language not supported

atomantic and others added 3 commits January 31, 2026 15:20
cos-runner:
- Default cliArgs to [] when cliCommand provided but args missing
- Use path.basename for cross-platform command allowlist validation
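
The cos-runner changes above (allowlist, basename comparison, default `cliArgs`) might combine as in the sketch below, which also strips a Windows `.exe` suffix as a later commit in this PR does. The allowlist contents and function name are hypothetical. The sketch uses `path.win32.basename` rather than plain `path.basename` so that both `/` and `\` separators are handled regardless of host OS; that choice, and the lowercasing, are assumptions of this example.

```javascript
const path = require('node:path');

// Hypothetical allowlist; the real list of permitted CLIs is not shown in this PR page.
const ALLOWED_COMMANDS = new Set(['claude', 'codex']);

function validateSpawnRequest({ cliCommand, cliArgs }) {
  if (typeof cliCommand !== 'string' || cliCommand.length === 0) {
    throw new Error('cliCommand must be a non-empty string');
  }
  // win32.basename accepts both "/" and "\" separators, so the check is host-independent.
  const base = path.win32.basename(cliCommand).toLowerCase().replace(/\.exe$/, '');
  if (!ALLOWED_COMMANDS.has(base)) {
    throw new Error(`Command not allowed: ${base}`);
  }
  const args = cliArgs === undefined ? [] : cliArgs; // missing args default to []
  if (!Array.isArray(args) || !args.every((a) => typeof a === 'string')) {
    throw new Error('cliArgs must be an array of strings');
  }
  return { command: cliCommand, args };
}
```

Validating before calling `spawn()` means a rejected request never reaches the process layer at all.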

ScheduleTab:
- Remove unnecessary disable condition on model dropdown

ProviderStatusCard:
- Show "Unable to load status" when statuses is null/failed

fileUtils:
- Support CRLF line endings in JSONL parsing
- Add CRLF tests for Windows compatibility
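
CRLF-tolerant JSONL parsing usually comes down to splitting on `\r?\n` and skipping blank lines. The sketch below illustrates that; the function name is hypothetical, and silently skipping corrupted rows is an assumption about the desired behavior.

```javascript
// Parse JSONL text into an array, tolerating CRLF endings, blank lines, and bad rows.
function parseJSONL(text) {
  const records = [];
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === '') continue;
    try {
      records.push(JSON.parse(line));
    } catch {
      // Skipping corrupted rows is an assumption; a real implementation might log them.
    }
  }
  return records;
}
```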

server/index.js:
- Use path.join instead of template literals for paths

Brain.jsx:
- Remove unused toast import

soul-system.md:
- Update directory structure to data/digital-twin/
@atomantic atomantic requested a review from Copilot January 31, 2026 23:24
Copilot AI left a comment

Pull request overview

Copilot reviewed 98 out of 172 changed files in this pull request and generated 9 comments.

Files not reviewed (1)
  • client/package-lock.json: Language not supported

atomantic and others added 2 commits January 31, 2026 15:39
- Add trailing comma before comment in server/index.js for consistency
- Handle Windows .exe extensions in CLI command allowlist validation
- Fix doc path: digitalTwin.js -> digital-twin.js in digital-twin.md
- Add try/finally error handling in SoulWizard to prevent stuck saving state
- Add aria-label to FolderPicker icon buttons for accessibility
@atomantic atomantic requested a review from Copilot January 31, 2026 23:44
Copilot AI left a comment

Pull request overview

Copilot reviewed 98 out of 172 changed files in this pull request and generated 4 comments.

Files not reviewed (1)
  • client/package-lock.json: Language not supported

- Add useEffect to ScheduleTab to sync local state with external config updates
- Use console.warn instead of emoji-prefixed console.log/error in fileUtils
- Update fileUtils tests to check console.warn instead of console.log/error
@atomantic (Owner, Author) commented:

Re: CI npm install vs npm ci (line 32 in ci.yml):

The install:all script intentionally uses npm install rather than npm ci because:

  1. This is a monorepo with workspaces that requires workspace linking
  2. The script runs npm install in root, client, and server directories sequentially, which correctly handles workspace dependencies
  3. npm still respects the lockfile during npm install - it just has more flexibility for workspace linking

Using npm ci at the workspace level would require a different approach (single npm ci at root with proper workspace config). The current approach is intentional and deterministic for this project structure.

@atomantic atomantic merged commit 39b76d4 into main Feb 1, 2026