Conversation

@maoberlehner maoberlehner commented Oct 21, 2025

When multiple systems hit the MAPI with the same access token, the retry mechanisms quickly gave up. By decreasing the semaphore batch size and increasing the max retries, we make such situations less likely. However, it is still possible to run into limits when many systems query the MAPI with the same access token in parallel.

See the following article for the rationale behind the full-jitter retry delay: https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
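The full-jitter strategy from that article can be sketched in TypeScript. The signature below loosely mirrors the `calculateRetryDelay` function added in this PR, but the exact parameters and defaults (`baseDelay`, `maxDelay`) are assumptions for illustration, not the PR's actual code:

```typescript
// Sketch of a full-jitter retry delay, per the AWS article linked above.
// Parameter names and defaults are illustrative assumptions.
export function calculateRetryDelay(
  attempt: number, // 1-based retry attempt
  baseDelay: number = 500, // initial delay in ms (assumed)
  maxDelay: number = 20000, // upper cap in ms (assumed)
): number {
  // Exponential backoff, capped at maxDelay.
  const exponential = Math.min(maxDelay, baseDelay * 2 ** (attempt - 1));
  // "Full jitter": pick a uniformly random delay in [0, exponential],
  // so parallel clients spread their retries apart instead of retrying
  // in lockstep and hitting the rate limit again together.
  return Math.random() * exponential;
}
```

The key property is that the jitter covers the whole window down to zero, which (per the article) drains contended retries faster than "equal jitter" variants.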

Copilot AI left a comment

Pull Request Overview

This PR improves the MAPI client's ability to handle concurrent requests from multiple systems using the same access token by adjusting retry and batching configurations. The changes reduce the likelihood of overwhelming the API when multiple systems query simultaneously.

Key Changes:

  • Increased retry attempts from 3 to 12 with linear backoff strategy
  • Reduced batch sizes from 100 to 6 in CLI migration commands
  • Updated documentation and tests to reflect new retry configuration

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| packages/mapi-client/src/client/client.ts | Increased maxRetries to 12 and implemented linear backoff for 429 responses |
| packages/mapi-client/src/tests/integration.test.ts | Updated test expectations to match new retry count (13 total calls) |
| packages/mapi-client/README.md | Updated documentation to reflect new maxRetries value of 12 |
| packages/cli/src/commands/migrations/run/index.ts | Reduced batchSize from 100 to 6 for fetching and updating stories |


```diff
    starts_with: startsWith,
  },
- batchSize: 100,
+ batchSize: 6,
```
Copilot AI commented Oct 21, 2025

[nitpick] Reducing batch size from 100 to 6 is a significant decrease that may substantially increase the total number of API requests and overall execution time. Consider using a more moderate value (e.g., 20-30) to balance between avoiding rate limits and maintaining reasonable performance.

Suggested change:

```diff
- batchSize: 6,
+ batchSize: 20,
```
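To make the tradeoff concrete: with this style of batching, the total number of requests stays the same, but a smaller batch size means more sequential waves of parallel requests. A quick sketch (the story count of 1000 is a made-up example, not a figure from this PR):

```typescript
// Number of sequential "waves" of parallel requests needed to process
// all items when at most batchSize requests run at once.
function batchCount(totalItems: number, batchSize: number): number {
  return Math.ceil(totalItems / batchSize);
}

// Hypothetical 1000 stories:
// batchSize 100 -> 10 waves, batchSize 20 -> 50 waves, batchSize 6 -> 167 waves.
```

Each wave takes at least one round-trip, so wall-clock time scales roughly with the wave count; that is the performance cost the nitpick is weighing against rate-limit safety.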

Contributor
@maoberlehner what do you think about this?

Contributor Author
So I went down the rabbit hole of batch sizes, exponential backoff, and jitter. I arrived at a somewhat more sophisticated delay mechanism and a batchSize of 12. The higher the batchSize, the higher maxRetries must be, and I think 12 retries is already quite a lot. With a batchSize >= 20, I quickly hit the maxRetries limit when I start two or more migrations in parallel. With the new settings, 4 parallel migrations are still stable.
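For context, the semaphore-style batching under discussion can be sketched as follows; `processInBatches` and its shape are illustrative assumptions, not the CLI's actual implementation:

```typescript
// Illustrative sketch of semaphore-style batching: at most batchSize
// requests are in flight at once; the next wave starts only after the
// whole previous wave settles. Names here are assumptions, not the
// CLI's actual code.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  handler: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // Each wave runs its requests concurrently, then waits for all of them.
    results.push(...(await Promise.all(batch.map(handler))));
  }
  return results;
}
```

With a lower batchSize, each wave puts less parallel pressure on the API, giving the jittered retry delays room to spread contended requests out; that matches the observation above that four parallel migrations stay stable.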

Contributor
This brings the MAPI client and CLI reliability to the next level, awesome!

@pkg-pr-new pkg-pr-new bot commented Oct 21, 2025

  • @storyblok/astro: `npm i https://pkg.pr.new/@storyblok/astro@358`
  • storyblok: `npm i https://pkg.pr.new/storyblok@358`
  • @storyblok/eslint-config: `npm i https://pkg.pr.new/@storyblok/eslint-config@358`
  • @storyblok/js: `npm i https://pkg.pr.new/@storyblok/js@358`
  • storyblok-js-client: `npm i https://pkg.pr.new/storyblok-js-client@358`
  • @storyblok/management-api-client: `npm i https://pkg.pr.new/@storyblok/management-api-client@358`
  • @storyblok/nuxt: `npm i https://pkg.pr.new/@storyblok/nuxt@358`
  • @storyblok/react: `npm i https://pkg.pr.new/@storyblok/react@358`
  • @storyblok/region-helper: `npm i https://pkg.pr.new/@storyblok/region-helper@358`
  • @storyblok/richtext: `npm i https://pkg.pr.new/@storyblok/richtext@358`
  • @storyblok/svelte: `npm i https://pkg.pr.new/@storyblok/svelte@358`
  • @storyblok/vue: `npm i https://pkg.pr.new/@storyblok/vue@358`

commit: d59c972

@alexjoverm alexjoverm left a comment

LGTM! Seems that CI/CD is failing, though; other than that it looks great.

Also pinged you re the batch size, but not a blocker at all.

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.



```typescript
 * @param maxDelay The maximum possible delay in milliseconds (e.g., 20000).
 * @returns The calculated delay in milliseconds to wait before the next attempt.
 */
export function calculateRetryDelay(
```
Contributor

question: As this logic already lives in the MAPI client, and the CLI is leveraging it, is it needed to repeat this logic at this level? (same with delay.ts)

Contributor Author

Yeah, the login doesn't use the client: https://github.com/storyblok/monoblok/blob/bugfix/cli-migrations-retry/packages/cli/src/commands/login/actions.ts#L38

I don't like the repetition either, but I wanted to use the same algorithm in both places, and I don't think introducing a new package for shared code is warranted just for that. But maybe we should think about it for the future.

@maoberlehner maoberlehner commented Oct 22, 2025

Not quite sure why we don't use the client for login. Probably it would be a better fix to do so.

Contributor

Ah, I see. Would you say it's crucial for the login? As it's not a bulk operation, I'm wondering if we can leave it as it is, without the retry algorithm (to avoid the duplication), and review the login/logout calls in a separate ticket to keep them out of the scope of this one.

What do you think? I'm also fine with leaving the duplication temporarily until we review them, if you think that's the better solution.

Contributor Author
Agree, changed back.

Fixes WDX-184
@alexjoverm alexjoverm left a comment

Good to go, thank you very much!

@maoberlehner maoberlehner merged commit fe3a6f7 into main Oct 22, 2025
5 checks passed
@maoberlehner maoberlehner deleted the bugfix/cli-migrations-retry branch October 22, 2025 11:59