Problem Description
When using persistent browser sessions, cookies are not persisting between workflow runs. After executing a login workflow successfully, subsequent workflows using the same browser_session_id are not reusing the cookies, resulting in "NOT_LOGGED_IN" errors.
Expected Behavior
When a persistent browser session is created and a login workflow is executed:
- Cookies should be saved to a persistent `user_data_dir` (e.g., `/app/skyvern/persistent_browser_sessions/{org_id}/{session_id}`)
- When a subsequent workflow uses the same `browser_session_id`, it should:
  - Connect to the existing browser via CDP using the stored `browser_address`
  - Use the same persistent `user_data_dir` to load cookies
  - Preserve authentication state between workflows
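The directory layout described in the first bullet can be sketched as a small helper. This is only an illustration, assuming the path is derived purely from the organization ID and the session ID as in the example path; `persistent_user_data_dir` and `SESSIONS_ROOT` are hypothetical names, not part of Skyvern's API.

```python
from pathlib import PurePosixPath

# Assumed root, taken from the example path in this issue.
SESSIONS_ROOT = PurePosixPath("/app/skyvern/persistent_browser_sessions")


def persistent_user_data_dir(
    org_id: str,
    session_id: str,
    root: PurePosixPath = SESSIONS_ROOT,
) -> str:
    """Return the per-session profile directory (hypothetical helper)."""
    if not org_id or not session_id:
        raise ValueError("org_id and session_id are both required")
    return str(root / org_id / session_id)
```

Because the result depends only on the two IDs, any code path that knows both can recompute the same directory instead of falling back to a temporary one.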
Actual Behavior
- Session Creation: The persistent session is created correctly with the right `user_data_dir`:
  [persistent_sessions_manager.py:453] Using persistent user_data_dir for session browser_session_id=pbs_457827170959729696 user_data_dir=/app/skyvern/persistent_browser_sessions/o_454165536444531154/pbs_457827170959729696 browser_address=None
- Browser Address Persisted: The browser address is correctly stored in the database:
  [persistent_sessions_manager.py:376] Persisted browser address for session browser_session_id=pbs_457827170959729696 browser_address=http://127.0.0.1:9222
- Workflow Execution Issue: When a workflow run is created for this session, it creates a new browser state with a temporary `user_data_dir` instead of using the persistent one:
  [browser_manager.py:215] Creating browser state for workflow run workflow_run_id=wr_457827265449010210
  [browser_factory.py:557] Using temporary user_data_dir user_data_dir=./temp/skyvern_browser_3gaxhz4_
Root Cause Analysis
The issue appears to be in browser_manager.py -> get_or_create_for_workflow_run():
- When `get_browser_state()` from `PersistentSessionsManager` returns `None` (browser state not in memory)
- The code creates a new browser state via `_create_browser_state()`
- At this point, `workflow_run.browser_address` is `None`, so it doesn't retrieve the `browser_address` from the database
- Without `browser_address`, the code falls back to creating a new browser instead of connecting via CDP
- Without explicit `browser_session_id` propagation in some code paths, `user_data_dir` is not calculated correctly, resulting in a temporary directory
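The fallback order we would expect from the steps above can be sketched as a pure decision function. This is a minimal sketch and not Skyvern's actual code: `BrowserPlan`, `plan_browser`, and the `lookup_address` callback (standing in for the database read) are all hypothetical names, and the directory layout is assumed from the logged paths.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BrowserPlan:
    mode: str  # "cdp", "persistent_launch", or "temporary_launch"
    browser_address: Optional[str] = None
    user_data_dir: Optional[str] = None


def plan_browser(
    run_browser_address: Optional[str],
    browser_session_id: Optional[str],
    organization_id: Optional[str],
    lookup_address: Callable[[str], Optional[str]],  # hypothetical DB lookup
    sessions_root: str = "/app/skyvern/persistent_browser_sessions",
) -> BrowserPlan:
    # 1. Prefer an address already attached to the workflow run.
    address = run_browser_address
    # 2. Otherwise fall back to the address persisted for the session.
    if address is None and browser_session_id:
        address = lookup_address(browser_session_id)
    if address:
        return BrowserPlan("cdp", browser_address=address)
    # 3. No known browser: relaunch, but keep the persistent profile.
    if browser_session_id and organization_id:
        path = f"{sessions_root}/{organization_id}/{browser_session_id}"
        return BrowserPlan("persistent_launch", user_data_dir=path)
    # 4. Only runs without a session should get a temporary profile.
    return BrowserPlan("temporary_launch")
```

With the values from this issue, `plan_browser(None, "pbs_457827170959729696", "o_454165536444531154", lookup)` would take the CDP branch as long as `lookup` returns the stored `http://127.0.0.1:9222`. The behavior we observe corresponds to branch 4 being reached even though a session ID exists.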
Attempted Fixes
We've tried several approaches:
- Modified `_create_headless_chromium` and `_create_headful_chromium` to calculate `user_data_dir` from `browser_session_id` when not explicitly provided
- Added logic to retrieve `browser_address` from database when `browser_session_id` is present but `workflow_run.browser_address` is `None`
- Added retry logic in `_connect_to_cdp_browser` to find existing contexts
However, we're still seeing the "Using temporary user_data_dir" log, indicating the issue persists.
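One check that is independent of these changes is whether the stored `browser_address` still answers from the process that runs the workflow. The sketch below uses only the Python standard library and Chromium's DevTools HTTP endpoint `/json/version`; the function names `cdp_version_url` and `probe_cdp` are hypothetical and not taken from Skyvern.

```python
import http.client
import json
import urllib.error
import urllib.request
from typing import Optional


def cdp_version_url(browser_address: str) -> str:
    """Build the DevTools version endpoint for a stored browser_address."""
    return browser_address.rstrip("/") + "/json/version"


def probe_cdp(browser_address: str, timeout: float = 2.0) -> Optional[dict]:
    """Return the browser's version info, or None if nothing usable answers."""
    url = cdp_version_url(browser_address)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, malformed URL, or a non-JSON reply.
        return None
```

If `probe_cdp("http://127.0.0.1:9222")` returns `None` at the moment the workflow run starts, connecting via CDP cannot succeed regardless of how `browser_address` was retrieved, which would help narrow down where the fallback to a new browser is triggered.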
Relevant Logs
Session Creation (Correct):
[persistent_sessions_manager.py:453] Using persistent user_data_dir for session
browser_session_id=pbs_457827170959729696
user_data_dir=/app/skyvern/persistent_browser_sessions/o_454165536444531154/pbs_457827170959729696
[browser_factory.py:536] Using provided user_data_dir for persistent session
user_data_dir=/app/skyvern/persistent_browser_sessions/o_454165536444531154/pbs_457827170959729696
[persistent_sessions_manager.py:376] Persisted browser address for session
browser_session_id=pbs_457827170959729696
browser_address=http://127.0.0.1:9222
Workflow Execution (Problem):
[browser_manager.py:215] Creating browser state for workflow run
workflow_run_id=wr_457827265449010210
[browser_factory.py:557] Using temporary user_data_dir
user_data_dir=./temp/skyvern_browser_3gaxhz4_
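To verify a candidate fix against logs like these, a small scanner can flag every `user_data_dir` value that falls outside the persistent root. This is a rough sketch that assumes the `key=value` log format shown above; `temporary_dirs` and `PERSISTENT_ROOT` are hypothetical names.

```python
import re
from typing import List

# Assumed root, taken from the logged persistent paths.
PERSISTENT_ROOT = "/app/skyvern/persistent_browser_sessions/"

# Matches only the key=value form, not the words inside the log message.
_DIR_RE = re.compile(r"\buser_data_dir=(\S+)")


def temporary_dirs(log_text: str, persistent_root: str = PERSISTENT_ROOT) -> List[str]:
    """Return each logged user_data_dir outside the persistent root, once."""
    seen: List[str] = []
    for value in _DIR_RE.findall(log_text):
        if not value.startswith(persistent_root) and value not in seen:
            seen.append(value)
    return seen
```

An empty result for a run that uses a `browser_session_id` would indicate that the temporary-directory path is no longer being taken.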
Questions
- Has anyone else encountered this cookie persistence issue with persistent browser sessions?
- Is there a recommended pattern for ensuring `browser_session_id` is correctly propagated throughout the browser creation chain?
- Should we always retrieve `browser_address` from the database when `browser_session_id` is present, even if `workflow_run.browser_address` exists?
Environment
- Skyvern version: Latest (Docker deployment)
- Python version: As per Docker image
- Playwright version: As per Skyvern dependencies
We're willing to pay someone an hourly rate to help us with this.