
feat: add 'up_to_current_unit' retrieval mode to OpenEdXProcessor #213

Merged
felipemontoya merged 2 commits into openedx:main from raccoongang:feature/add-function-to-retrive-sequence-content on Apr 14, 2026

Conversation

@Pavilion4ik
Contributor

@Pavilion4ik Pavilion4ik commented Apr 10, 2026

Overview

The retrieval_mode feature allows AI interactions to leverage context beyond the immediate unit where the interaction is happening. This is particularly useful for courses with granular content structures where a single unit may not provide enough context for meaningful AI responses.

Configuration Options

The system now supports three primary retrieval modes:

  • unit (Default): Retrieves content only from the current unit.
  • up_to_current_unit: Retrieves content from the sequence up to (and including) the current unit.
  • sequence: Retrieves content from the entire parent sequence (e.g., all units in the same lesson/subsection).
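As a rough illustration of how the three modes differ, the sketch below picks unit IDs out of an ordered sequence. The function name `select_units` and the plain-string unit IDs are assumptions made for this example; they are not taken from the processor's actual code.

```python
from typing import List

VALID_MODES = ("unit", "up_to_current_unit", "sequence")


def select_units(sequence_units: List[str], current_unit: str, mode: str = "unit") -> List[str]:
    """Return the unit IDs whose content a given retrieval mode would include.

    Hypothetical helper: unit IDs are plain strings here, not real usage keys.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"Unknown retrieval_mode: {mode!r}")
    if mode == "unit" or current_unit not in sequence_units:
        # Default mode, or the unit is not part of the given sequence.
        return [current_unit]
    if mode == "sequence":
        # Every unit in the parent sequence.
        return list(sequence_units)
    # up_to_current_unit: everything before the current unit, plus the unit itself.
    return sequence_units[: sequence_units.index(current_unit) + 1]
```

For a sequence `["u1", "u2", "u3"]` with `u2` as the current unit, the three modes would select one, two, and three units respectively.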

How it Works

The OpenEdXProcessor determines the retrieval_mode from its configuration (typically defined in the active workflow profile's JSON file).

Example: Setting in a Workflow Profile (.json file)

To enable sequence-level retrieval for a specific workflow, add "retrieval_mode": "sequence" to the OpenEdXProcessor configuration:

{
  "orchestrator_class": "DirectLLMResponse",
  "processor_config": {
    "OpenEdXProcessor": {
      "function": "get_location_content",
      "retrieval_mode": "sequence"
    },
    "LLMProcessor": {
      "provider": "default",
      "prompt": "Summarize the lesson content provided below..."
    }
  }
}
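A minimal sketch of how a mode might be read from such a profile, falling back to unit when the key is absent. The helper `resolve_retrieval_mode` is hypothetical, and the standard-library `json` module is used here only because this example profile contains no comments; the project itself reads profiles with json5.

```python
import json

# Trimmed copy of the example profile above, inlined so the sketch is self-contained.
PROFILE_TEXT = """
{
  "orchestrator_class": "DirectLLMResponse",
  "processor_config": {
    "OpenEdXProcessor": {
      "function": "get_location_content",
      "retrieval_mode": "sequence"
    }
  }
}
"""


def resolve_retrieval_mode(profile: dict) -> str:
    """Hypothetical helper: read retrieval_mode, defaulting to 'unit'."""
    processor_cfg = profile.get("processor_config", {}).get("OpenEdXProcessor", {})
    return processor_cfg.get("retrieval_mode", "unit")


profile = json.loads(PROFILE_TEXT)
mode = resolve_retrieval_mode(profile)
```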

Technical Implementation

  • Processor Logic: In openedx_processor.py, the get_location_content method checks self.config for retrieval_mode.
  • Sequence Retrieval: If set to sequence, it uses store.get_parent_location(unit_key) to find the parent sequence and retrieves content for all its child units using a new _get_unit_data helper method.
  • Data Structure:
    • In unit mode, the returned JSON contains data for a single unit.
    • In sequence mode, the returned JSON contains a sequence_id, display_name, and a units list containing the processed content of every unit in that sequence.
  • Fallback Safety: If a parent sequence cannot be found, the system gracefully falls back to returning the single unit's content.
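The sequence branch and its fallback can be sketched roughly as follows. This is not the code from openedx_processor.py: `FakeStore`, `get_child_units`, the placeholder display name, and the per-unit fields are all assumptions standing in for the real modulestore and helper output. Only `get_parent_location`, `_get_unit_data`, and the `sequence_id` / `display_name` / `units` keys come from the description above, and the sketch covers only the unit and sequence modes.

```python
from typing import Dict, List, Optional


class FakeStore:
    """In-memory stand-in for the modulestore (hypothetical, illustration only)."""

    def __init__(self, sequences: Dict[str, List[str]]):
        self._sequences = sequences

    def get_parent_location(self, unit_key: str) -> Optional[str]:
        # Return the ID of the sequence containing the unit, or None if there is none.
        for sequence_id, units in self._sequences.items():
            if unit_key in units:
                return sequence_id
        return None

    def get_child_units(self, sequence_id: str) -> List[str]:
        # Hypothetical helper: ordered unit keys of a sequence.
        return list(self._sequences[sequence_id])


def _get_unit_data(unit_key: str) -> dict:
    # Placeholder: the real helper would return the processed unit content.
    return {"unit_id": unit_key, "blocks": []}


def get_location_content(store: FakeStore, unit_key: str, retrieval_mode: str = "unit") -> dict:
    if retrieval_mode != "sequence":
        # This sketch treats anything other than "sequence" as single-unit retrieval.
        return _get_unit_data(unit_key)
    sequence_id = store.get_parent_location(unit_key)
    if sequence_id is None:
        # Fallback: no parent sequence found, so return only the single unit.
        return _get_unit_data(unit_key)
    return {
        "sequence_id": sequence_id,
        "display_name": f"Sequence {sequence_id}",  # placeholder display name
        "units": [_get_unit_data(key) for key in store.get_child_units(sequence_id)],
    }
```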

Issue: #173

 - implement new 'up_to_current_unit' retrieval mode to fetch sequence content up to the current unit
 - update get_location_content tool schema to support dynamic retrieval_mode selection by the LLM
 - enhance mock_keys in tests to support string comparison and make_usage_key method
 - add comprehensive tests for the new retrieval mode and parameter overrides
 - update implementation details documentation to reflect the three supported retrieval modes
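
To make the tool-schema and parameter-override items above more concrete, here is a hedged sketch of what an LLM-selectable retrieval_mode could look like. The schema layout, the description strings, and the `effective_mode` helper are assumptions for illustration, as is the precedence rule (a valid mode in the tool call wins over the configured one); the schema actually shipped in this PR may differ.

```python
RETRIEVAL_MODES = ["unit", "up_to_current_unit", "sequence"]

# Hypothetical JSON-Schema-style tool definition; not copied from the repository.
GET_LOCATION_CONTENT_TOOL = {
    "name": "get_location_content",
    "description": "Retrieve course content around the learner's current unit.",
    "parameters": {
        "type": "object",
        "properties": {
            "retrieval_mode": {
                "type": "string",
                "enum": RETRIEVAL_MODES,
                "description": "How much of the parent sequence to include.",
            }
        },
        "required": [],
    },
}


def effective_mode(config: dict, tool_args: dict) -> str:
    """Prefer a valid mode from the tool call, then the config, then 'unit'."""
    requested = tool_args.get("retrieval_mode")
    if requested in RETRIEVAL_MODES:
        return requested
    # Missing or unrecognized tool argument: fall back to the configured mode.
    return config.get("retrieval_mode", "unit")
```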
@openedx-webhooks openedx-webhooks added the open-source-contribution label (PR author is not from Axim or 2U) Apr 10, 2026
@openedx-webhooks

Thanks for the pull request, @Pavilion4ik!

This repository is currently maintained by @felipemontoya.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@codecov

codecov Bot commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.13%. Comparing base (61ace2e) to head (edfecf5).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #213      +/-   ##
==========================================
+ Coverage   95.08%   95.13%   +0.05%     
==========================================
  Files          67       67              
  Lines        7203     7322     +119     
  Branches      380      387       +7     
==========================================
+ Hits         6849     6966     +117     
- Misses        265      267       +2     
  Partials       89       89              
Flag Coverage Δ
unittests 95.13% <ø> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.


Member

@felipemontoya felipemontoya left a comment

This is working exactly as I would expect. Thanks a lot @Pavilion4ik.

I only have one request before we merge: can you add an explicit retrieval_mode to the profiles?

Since the profiles are read using json5, you could even leave a comment with options.

"OpenEdXProcessor": {
  "function": "get_location_content",
  "retrieval_mode": "up_to_current_unit" // "unit", "sequence", "up_to_current_unit"
},

@felipemontoya
Member

I would still leave the default being unit, but you can mix it up in the profile examples

 - Added retrieval_mode examples to profiles
@Pavilion4ik
Contributor Author

I would still leave the default being unit, but you can mix it up in the profile examples

@felipemontoya Got it - I’ve added unit as the default retrieval_mode to profiles

@felipemontoya felipemontoya merged commit 3aefaad into openedx:main Apr 14, 2026
10 checks passed
@github-project-automation github-project-automation Bot moved this from Needs Triage to Done in Contributions Apr 14, 2026
