fix(cli): retry mechanism when multiple systems query the MAPI #358
Conversation
Pull Request Overview
This PR improves the MAPI client's ability to handle concurrent requests from multiple systems using the same access token by adjusting retry and batching configurations. The changes reduce the likelihood of overwhelming the API when multiple systems query simultaneously.
Key Changes:
- Increased retry attempts from 3 to 12 with linear backoff strategy
- Reduced batch sizes from 100 to 6 in CLI migration commands
- Updated documentation and tests to reflect new retry configuration
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| packages/mapi-client/src/client/client.ts | Increased maxRetries to 12 and implemented linear backoff for 429 responses |
| packages/mapi-client/src/tests/integration.test.ts | Updated test expectations to match new retry count (13 total calls) |
| packages/mapi-client/README.md | Updated documentation to reflect new maxRetries value of 12 |
| packages/cli/src/commands/migrations/run/index.ts | Reduced batchSize from 100 to 6 for fetching and updating stories |
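For orientation, a 429 retry loop of the kind described above could look roughly like the following sketch. The function and option names are illustrative assumptions, not the actual client.ts code, and the backoff shown is the simple linear variant this review refers to.

```typescript
// Illustrative sketch only: names and signatures are assumptions, not the actual client code.
interface RetryOptions {
  maxRetries: number; // e.g. 12, as configured in this PR
  baseDelayMs: number; // delay unit for the linear backoff
}

async function fetchWithRetry(
  url: string,
  init: RequestInit,
  { maxRetries, baseDelayMs }: RetryOptions,
): Promise<Response> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, init);
    // Only rate-limit responses are retried; everything else is returned as-is.
    if (response.status !== 429 || attempt === maxRetries) {
      return response;
    }
    // Linear backoff: wait (attempt + 1) delay units before trying again.
    await new Promise(resolve => setTimeout(resolve, baseDelayMs * (attempt + 1)));
  }
  throw new Error('unreachable');
}
```

With maxRetries set to 12, a persistently rate-limited request would be attempted 13 times in total, which matches the updated integration test expectation.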
```diff
    starts_with: startsWith,
  },
- batchSize: 100,
+ batchSize: 6,
```
Copilot AI · Oct 21, 2025
[nitpick] Reducing batch size from 100 to 6 is a significant decrease that may substantially increase the total number of API requests and overall execution time. Consider using a more moderate value (e.g., 20-30) to balance between avoiding rate limits and maintaining reasonable performance.
```diff
- batchSize: 6,
+ batchSize: 20,
```
@maoberlehner what do you think about this?
So I did go down the rabbit hole of batch sizes, exponential backoff, and jitter. I arrived at a slightly more sophisticated delay mechanism and a batchSize of 12. The higher the batchSize, the higher maxRetries must be, and I think 12 retries is already quite a lot. With a batchSize >= 20, I quickly run into the maxRetries limit when I start two or more migrations in parallel. With the new settings, 4 parallel migrations are still stable.
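To make the trade-off concrete: a batchSize in this sense roughly bounds how many requests are in flight at once, along the lines of the sketch below (made-up names, not the actual migrations command code).

```typescript
// Sketch only: illustrates how a batchSize bounds concurrency; names are
// assumptions, not the actual CLI code.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number, // e.g. 12, as settled on in this PR
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    // At most `batchSize` requests are in flight at once; running several
    // migrations in parallel multiplies that number against the same token.
    const batch = items.slice(i, i + batchSize);
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}
```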
This brings the MAPI client and CLI reliability to the next level, awesome!
Force-pushed from eb28875 to d737058
alexjoverm left a comment
LGTM! Seems that the CI/CD is failing, though; other than that it looks great.
Also pinged you re the batch size, but not a blocker at all.
Force-pushed from d737058 to b6e543c
Pull Request Overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
Force-pushed from b6e543c to 7b25716
```ts
 * @param maxDelay The maximum possible delay in milliseconds (e.g., 20000).
 * @returns The calculated delay in milliseconds to wait before the next attempt.
 */
export function calculateRetryDelay(
```
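Given the visible JSDoc and the full-jitter approach referenced in the commit message (via the AWS architecture blog post), the body could plausibly look like the sketch below; the parameter names other than maxDelay and the exact formula are assumptions, not the actual source.

```typescript
// Plausible full-jitter implementation; everything beyond the visible JSDoc
// (parameter names, exact formula) is an assumption, not the actual source.
export function calculateRetryDelay(
  attempt: number, // how many retries have already happened
  baseDelay: number, // smallest delay unit in milliseconds
  maxDelay: number, // the maximum possible delay in milliseconds (e.g., 20000)
): number {
  // Exponential backoff capped at maxDelay ...
  const cappedDelay = Math.min(maxDelay, baseDelay * 2 ** attempt);
  // ... with full jitter: a random delay between 0 and the cap, so parallel
  // clients that hit a 429 at the same time do not retry in lockstep.
  return Math.floor(Math.random() * cappedDelay);
}
```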
question: As this logic already lives in the MAPI client, and the CLI is leveraging it, is it necessary to repeat it at this level? (same with delay.ts)
Yeah, it's not using the client for the login: https://github.com/storyblok/monoblok/blob/bugfix/cli-migrations-retry/packages/cli/src/commands/login/actions.ts#L38
I don't like the repetition either, but I wanted to use the same algorithm in both cases, and I don't think it's warranted to introduce a new package for shared code just for that. But maybe we should think about it in the future.
Not quite sure why we don't use the client for login. Probably it would be a better fix to do so.
Ah, I see. Would you say it's crucial for the login? Since it's not a bulk operation, I'm wondering whether we can leave it as it is, without the retry algorithm (to avoid the duplication), and review the login/logout calls in a separate ticket, keeping that out of the scope of this one.
What do you think? I'm also fine with leaving the duplication temporarily until we review them, if you think that's the better solution.
Agree, changed back.
When multiple systems hit the MAPI with the same access token, the retry mechanisms quickly gave up. By decreasing the semaphore batch size and increasing the max retries, we make such situations less likely. However, it is still possible to run into limits when many systems query the MAPI with the same access token in parallel. See the following article for the rationale behind the full-jitter retry delay: https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
Fixes WDX-184
Force-pushed from 7b25716 to d59c972
alexjoverm left a comment
Good to go, thank you very much!