main <- staging #205

Merged
ducnmm merged 22 commits into main from staging
May 28, 2026
Conversation

ducnmm commented May 28, 2026

Collaborator

No description provided.

ducnmm and others added 22 commits May 27, 2026 09:05
feat: add Slack alert for exhausted Walrus uploads
… 2.17.0

Aligns sidecar SDKs with the current Walrus on-chain package so future
package upgrades are picked up by a fresh client without breaking API
surface. Bumps are minor-version only (no breaking changes observed).

When the Walrus on-chain package is upgraded after the sidecar boots,
the cached @mysten/walrus client surfaces MoveAbort EWrongVersion from
walrus::system::inner_mut until restart. This change makes that
self-recovering instead of redeploy-dependent.

Sidecar (TS):
- isWalrusPackageVersionMismatch detector anchored on the location
  fragment "::system::inner_mut" (cross-transport stable) and the
  symbolic "EWrongVersion" (gRPC/GraphQL).
- Upload catch block calls existing refreshWalrusClient() and logs
  on-chain System.version before vs after refresh via systemObject().
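The detector described above can be sketched roughly as follows. This is a minimal illustration, not the sidecar's actual code: the matching rule (location fragment alone, or the symbolic code only when the text also mentions Walrus) is an assumption chosen to reject bare tokens, and the sample error strings are hypothetical rather than captured from a real node.

```typescript
// Anchors named in the commit message, lowercased for case-insensitive matching.
const LOCATION_FRAGMENT = "::system::inner_mut";
const SYMBOLIC_CODE = "ewrongversion";

// Assumption: the symbolic code only counts when the text also mentions Walrus,
// so a bare "EWrongVersion" from an unrelated module is not a hit.
function isWalrusPackageVersionMismatch(message: unknown): boolean {
  if (typeof message !== "string" || message.length === 0) {
    return false;
  }
  const text = message.toLowerCase();
  if (text.includes(LOCATION_FRAGMENT)) {
    return true;
  }
  return text.includes(SYMBOLIC_CODE) && text.includes("walrus");
}

// Hypothetical error strings for illustration only.
const samples: Array<[string, boolean]> = [
  ["MoveAbort in walrus::system::inner_mut, code 1", true],
  ["EWRONGVERSION raised by Walrus system", true],
  ["MoveAbort in 0x2::balance::split, code 1", false],
  ["EWrongVersion", false],
];
const allAsExpected = samples.every(
  ([text, expected]) => isWalrusPackageVersionMismatch(text) === expected,
);
```

Taking `unknown` rather than `string` lets the function reject `undefined` and `null` inputs without a type assertion at the call site.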

Server (Rust):
- classify_sidecar_error: EWrongVersion is now Transient so Apalis
  retries against the refreshed client instead of marking Dead.
- AlertManager gains notify_walrus_package_upgrade_detected backed by
  WalrusPackageUpgradeDetectedAlert; fired from the wallet-job upload
  error path when the same pattern is observed.

Tests:
- 11 new sidecar tests covering JSON-RPC + gRPC error formats,
  case-insensitivity, and false-positive guards (balance::split,
  bare tokens, abort code 1 from non-Walrus modules).
- 4 new Rust tests on the classifier carve-out, the Walrus-only
  detector, and the new Slack payload (version diff, missing-data,
  oversized-error truncation).

User feedback (2026-05-27) flagged confusion around namespace overwrite,
hierarchy, and restore() response. Add two dedicated sections sourced
from the actual server behavior:

- Namespace Semantics: opaque flat strings, exact-equality match, no
  validation, no hierarchy, and every remember() is append-only (no
  upsert by namespace).
- Restore Semantics: response field meanings (restored / skipped / total),
  default limit, no pagination cursor, and bounded concurrency model
  (10 downloads, 3 decrypts) so on-call can reason about latency.
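The namespace rules listed above (flat strings, exact-equality match, append-only writes) can be pictured with a toy in-memory model. `ToyMemoryStore` and its methods are hypothetical names for illustration; the real server stores encrypted blobs and is not implemented this way.

```typescript
interface MemoryRecord {
  namespace: string;
  text: string;
}

// Hypothetical in-memory stand-in for the server's namespace behaviour.
class ToyMemoryStore {
  private readonly records: MemoryRecord[] = [];

  // Append-only: no lookup by namespace, no overwrite, no validation of the string.
  remember(namespace: string, text: string): void {
    this.records.push({ namespace, text });
  }

  // Exact-equality match: no prefix, hierarchy, or case folding.
  inNamespace(namespace: string): string[] {
    return this.records
      .filter((record) => record.namespace === namespace)
      .map((record) => record.text);
  }
}

const store = new ToyMemoryStore();
store.remember("project", "first note");
store.remember("project", "second note"); // second entry, not an upsert
store.remember("project/alpha", "looks nested, but is a separate namespace");
store.remember("Project", "different string, different namespace");

const projectCount = store.inNamespace("project").length; // 2
```

Under this model `"project/alpha"` is not a child of `"project"`; the slash is just another character in an opaque string.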

Also re-points the existing recall() example to the new object form
so the recommended call style is consistent across the doc.

Positional recall(query, limit, namespace) was easy to mis-read as
recall(query, namespace), so both SDKs now accept an object/dataclass
input alongside the existing positional signature. Positional calls
keep working — this is additive, not a breaking change.

TS SDK:
- New RecallParams type exposing { query, limit?, namespace?, maxDistance?, topK? }
- recall() overloaded: recall(params) and recall(query, limitOrOptions?, namespace?)
- JSDoc reframed to call out the object form as preferred
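One way the two call shapes could be reconciled is a small normalizer that maps both onto `RecallParams`. This is a sketch under assumptions: `normalizeRecallArgs` and `RecallOptions` are hypothetical names, and the fields accepted by the positional `limitOrOptions` object are guessed to be the optional fields of `RecallParams`.

```typescript
interface RecallParams {
  query: string;
  limit?: number;
  namespace?: string;
  maxDistance?: number;
  topK?: number;
}

// Assumption: the positional options object carries the same optional fields.
type RecallOptions = Omit<RecallParams, "query">;

function normalizeRecallArgs(params: RecallParams): RecallParams;
function normalizeRecallArgs(
  query: string,
  limitOrOptions?: number | RecallOptions,
  namespace?: string,
): RecallParams;
function normalizeRecallArgs(
  first: string | RecallParams,
  limitOrOptions?: number | RecallOptions,
  namespace?: string,
): RecallParams {
  // Object form: already in the target shape, so return a shallow copy.
  if (typeof first !== "string") {
    return { ...first };
  }
  // Positional form: the second argument is either a bare limit or an options object.
  const options: RecallOptions =
    typeof limitOrOptions === "number" ? { limit: limitOrOptions } : { ...(limitOrOptions ?? {}) };
  const result: RecallParams = { ...options, query: first };
  if (namespace !== undefined) {
    result.namespace = namespace;
  }
  return result;
}

const fromObject = normalizeRecallArgs({ query: "favorite color", namespace: "prefs" });
const fromPositional = normalizeRecallArgs("favorite color", 5, "prefs");
```

Both calls end up with the same `query` and `namespace`; only the object form makes it obvious at the call site which value is which.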

Python SDK:
- New RecallParams dataclass with the same shape
- recall() now accepts either str or RecallParams as the first arg
- Sync MemWalSync.recall() mirrors the overload
- Both new and existing forms exercised by test_client.py (6 recall tests)

Also fixes a default-value drift: Python restore() defaulted to limit=50
while TS and the server default to 10. Bringing Python in line removes
a foot-gun where the same call across SDKs returned different result
counts. Existing callers that pass an explicit limit are unaffected.

Both SDKs' restore() docstrings now spell out response field meanings,
the no-pagination caveat, and the linear-in-limit performance profile.

… fns

Next.js server-action modules reject any non-async-function export with
a build-time error. WALM-53 user feedback called out that this is easy
to miss because the failure mode is a build error pointing at the file
rather than at the offending export. Add a README section with the rule
and a wrong/right example so the constraint is discoverable next to the
running-locally instructions.
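A wrong/right pair along the lines the commit describes might look like the sketch below. The names `MAX_NOTE_LENGTH`, `normalize`, and `saveNote` are hypothetical and not taken from the chatbot app; the rejected exports are shown commented out so the file itself stays valid.

```typescript
"use server";

// Wrong: a constant export in a server-action module fails the build.
// export const MAX_NOTE_LENGTH = 280;

// Wrong: a synchronous function export is rejected as well.
// export function normalize(text: string): string { return text.trim(); }

// Right: keep constants and helpers module-private (or move them to another file)...
const MAX_NOTE_LENGTH = 280;

function normalize(text: string): string {
  return text.trim().slice(0, MAX_NOTE_LENGTH);
}

// ...and export only async functions.
export async function saveNote(text: string): Promise<{ saved: string }> {
  return { saved: normalize(text) };
}
```

If a constant has to be shared with client code, moving it to a separate module without the `"use server"` directive avoids the restriction.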

(Originally drafted as apps/chatbot/CLAUDE.md, but CLAUDE.md is
gitignored repo-wide as a per-dev workspace overlay convention, so the
note belongs in committed docs instead.)

Phase 1 of the WALM-53 recall() rollout plan: the new object-style
recall({ query, limit, namespace }) overload landed in the previous
commit; this commit adds a formal @deprecated JSDoc tag on the
positional signature so IDEs surface the warning at call sites. The
positional form keeps working — this is signaling intent, not removing
behavior.
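For readers unfamiliar with per-overload deprecation, the tag placement might look like this. `recallSketch` is a hypothetical stand-in that just echoes its arguments; the only point is that the `@deprecated` tag sits on the positional overload, so editors flag only calls that resolve to that signature.

```typescript
interface RecallParams {
  query: string;
  limit?: number;
  namespace?: string;
}

/** Preferred: object form. */
function recallSketch(params: RecallParams): string;
/**
 * @deprecated Use the object form, recallSketch({ query, limit, namespace }).
 */
function recallSketch(query: string, limit?: number, namespace?: string): string;
function recallSketch(first: string | RecallParams, limit?: number, namespace?: string): string {
  // Both forms still work at runtime; the tag only affects editor tooling.
  const params: RecallParams =
    typeof first === "string" ? { query: first, limit, namespace } : first;
  return `${params.query}|${params.limit ?? "default"}|${params.namespace ?? "none"}`;
}

const preferred = recallSketch({ query: "q", limit: 5, namespace: "n" });
const legacy = recallSketch("q", 5, "n"); // struck through in editors that honor @deprecated
```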
Phase 1 doc migration: every recall() snippet in user-facing docs and
examples now uses the recommended object form so the first example a
new user copy-pastes is the unambiguous one. Positional form still
works (and the test suite still exercises it) — this is a doc-only
change with no runtime effect.

Updated:
- top-level README.md
- SKILL.md (Quick Start)
- docs/getting-started/quick-start.md
- docs/sdk/{quick-start,examples}.md, docs/sdk/usage/memwal.md
- docs/python-sdk/{quick-start,usage}.md, docs/python-sdk/usage/memwal.md
- docs/examples/example-apps.md
- docs/relayer/nautilus-tee.md
- packages/python-sdk-memwal/README.md
- packages/python-sdk-memwal/memwal/__init__.py + client.py docstrings
- packages/python-sdk-memwal/examples/{async_remember_demo,interactive_demo}.py

Left on positional intentionally:
- tests/test_client.py — verifies the backwards-compat path still works
- memwal/middleware.py — internal, will follow when callers update
- scripts/test-namespace.ts, test-zero-restore.ts — internal smoke

…comment-references

feat: clean up internal comment references
…4-marketing-analytics-for-memwalai-launch-traffic

Add GA4 launch analytics

The new wallet-job upload path (jobs.rs:866) already classified Walrus
package-version mismatches as Transient + fired the informational alert.
The legacy RememberJob handler that still drains older queue rows was
calling fail!() unconditionally on UploadBlobError::App, so the same
error would mark the row failed and skip the Apalis retry that the
sidecar's client refresh was meant to set up.

Carve out the same is_walrus_package_version_mismatch detection ahead
of fail!() — fire the same alert, log a warn, and return Err WITHOUT
writing status='failed'. Apalis then retries against the refreshed
client; the row only flips to 'failed' if the retry budget is exhausted
by some other transient error.
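The control flow of that carve-out can be modelled in a few lines. The real handler is Rust running under Apalis; this TypeScript rendering is only an illustration, and `QueueRow`, `HandlerResult`, `looksLikeVersionMismatch`, and `onUploadError` are hypothetical names.

```typescript
type RowStatus = "pending" | "failed";

interface QueueRow {
  id: number;
  status: RowStatus;
}

interface HandlerResult {
  retry: boolean; // true: return an error and let the queue retry
  alert: boolean; // true: fire the informational upgrade alert
}

// Simplified stand-in for the shared mismatch detector.
function looksLikeVersionMismatch(message: string): boolean {
  const text = message.toLowerCase();
  return text.includes("::system::inner_mut") || text.includes("ewrongversion");
}

function onUploadError(row: QueueRow, message: string): HandlerResult {
  if (looksLikeVersionMismatch(message)) {
    // Carve-out: leave row.status untouched so the retry runs against the refreshed client.
    return { retry: true, alert: true };
  }
  // Every other application error keeps the old behaviour.
  row.status = "failed";
  return { retry: false, alert: false };
}

const mismatchRow: QueueRow = { id: 1, status: "pending" };
const mismatchResult = onUploadError(mismatchRow, "MoveAbort in walrus::system::inner_mut");

const otherRow: QueueRow = { id: 2, status: "pending" };
const otherResult = onUploadError(otherRow, "blob too large");
```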

Reported by @ducnmm on PR #199 review.

The package-upgrade alert was described as one-shot but currently fires
for every matching upload failure. A real upgrade event triggers
EWrongVersion on every queued job concurrently, so the alerter could
emit dozens of identical messages back-to-back.

Add an in-memory dedup gate inside AlertManager keyed by
(sui_network, sidecar_walrus_dep_version) — the natural identity of a
distinct upgrade event:

- Same key, fired within the window → suppress.
- Different network or different dep version → independent event,
  fires immediately.
- After the window expires, the key re-fires (so a second upgrade on
  the same network surfaces).

Window default 600s, env-overridable via
WALRUS_PACKAGE_UPGRADE_ALERT_DEDUP_SECS so ops can widen / narrow it
without redeploying.
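The three dedup rules above can be sketched as a small gate with an injectable clock. The actual gate lives inside the Rust AlertManager; `UpgradeAlertGate` and `dedupWindowSecs` are hypothetical names, and the choice that a suppressed attempt does not extend the window is an assumption the commit message does not settle.

```typescript
const DEFAULT_WINDOW_SECS = 600;

// Reads the override from an env-like map; falls back to the default on missing or invalid input.
function dedupWindowSecs(env: Record<string, string | undefined>): number {
  const raw = env["WALRUS_PACKAGE_UPGRADE_ALERT_DEDUP_SECS"];
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_WINDOW_SECS;
  }
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : DEFAULT_WINDOW_SECS;
}

class UpgradeAlertGate {
  private readonly windowMs: number;
  private readonly lastFiredMs = new Map<string, number>();

  constructor(windowSecs: number) {
    this.windowMs = windowSecs * 1000;
  }

  // Returns true when the alert should be sent, false when it is suppressed.
  shouldFire(suiNetwork: string, depVersion: string, nowMs: number): boolean {
    const key = JSON.stringify([suiNetwork, depVersion]);
    const last = this.lastFiredMs.get(key);
    if (last !== undefined && nowMs - last < this.windowMs) {
      return false; // same key inside the window
    }
    // Assumption: only a fired alert resets the window, not a suppressed one.
    this.lastFiredMs.set(key, nowMs);
    return true;
  }
}

const gate = new UpgradeAlertGate(dedupWindowSecs({}));
const first = gate.shouldFire("mainnet", "1.0.0", 0); // fires
const burst = gate.shouldFire("mainnet", "1.0.0", 1_000); // suppressed
const otherNetwork = gate.shouldFire("testnet", "1.0.0", 1_000); // independent key, fires
const afterWindow = gate.shouldFire("mainnet", "1.0.0", 600_000); // window expired, fires
```

Passing `nowMs` explicitly keeps the gate deterministic in tests; a production version would read the clock itself.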

Reported by @ducnmm on PR #199 review.

Cast undefined / null through `as unknown as string` so the empty-input
test keeps working under any tsconfig strictness, instead of relying on
@ts-expect-error directives that go stale if the test file is ever
typechecked under different flags.
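The casting pattern looks roughly like this. `isEmptyMessage` is a hypothetical function under test, not the sidecar's detector; the point is that the double assertion compiles under strict and non-strict settings alike, whereas `@ts-expect-error` itself becomes an error as soon as the line it guards stops failing to typecheck.

```typescript
// Hypothetical function under test: typed to accept a string, but defensive at runtime.
function isEmptyMessage(message: string): boolean {
  return typeof message !== "string" || message.length === 0;
}

// The double assertion hands a non-string to a string parameter without a directive.
const missing = undefined as unknown as string;
const nothing = null as unknown as string;

const undefinedIsEmpty = isEmptyMessage(missing);
const nullIsEmpty = isEmptyMessage(nothing);
const textIsEmpty = isEmptyMessage("EWrongVersion");
```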

Reported by @ducnmm on PR #199 review.

Code comments should carry stable technical rationale, not Linear-ticket
narrative — the ticket ID rots once the system is renamed or the ticket
is closed. Reframe the carve-out comment as the invariant it's actually
protecting: the sidecar refreshes the cached Walrus client before the
error bubbles up, so the row must not flip to status='failed' here
because the next Apalis attempt is about to succeed.

Per repo convention, in-source comments and doc text should carry stable
technical rationale — not the Linear ticket that introduced the change.
Ticket IDs rot once renamed or closed; a "WALM-53 — Response semantics"
heading reads like a release note rather than a contract.

Strip the WALM-53 prefix from every docstring, JSDoc block, test docstring,
and markdown section it was introduced in. No behavior change; the prose
already explains why the symbol exists.
…package-version-mismatch-without-redeploy

fix(MEM-34): handle Walrus package version mismatch without redeploy
…sks-from-user-feedback

fix(WALM-53): address DX risks from 2026-05-27 user feedback
harrymove-ctrl self-requested a review May 28, 2026 10:02
ducnmm merged commit dacd004 into main May 28, 2026
24 checks passed
railway-app bot temporarily deployed to Walrus Memory / staging June 2, 2026 01:42 (Inactive)

3 participants