Feat: Add TCADP parser for PPTX and spreadsheet document types. #11041
Merged
KevinHuSh merged 11 commits into infiniflow:main from aidansu:performance/perf_tcadp_parser on Nov 20, 2025
- Remove custom signature implementation and adopt Tencent Cloud's official SDK
- Update configuration files to support new SDK parameters
- Upgrade dependencies to the latest stable versions
- Optimize streaming response handling mechanism
- Unify environment variable reading logic
- Enhance control over table and image response types
…add_adp_parser
# Conflicts:
# rag/app/naive.py
# rag/flow/parser/parser.py
# web/src/components/layout-recognize-form-field.tsx
…sdk-python from 3.0.1215 to 3.0.1478
- Add spreadsheet parsing field component in data flow and agent forms
- Update spreadsheet parsing constant configurations to support both DeepDOC and TCADP parsing methods
- Implement TCADP parsing logic for spreadsheet files in rag/app/naive.py
- Extend rag/flow/parser/parser.py to support both TCADP and DeepDOC spreadsheet parsing methods
- Add handling of TCADP parsing results for HTML, JSON, and Markdown output formats
- Update frontend utility functions to pass spreadsheet parsing method configurations
- Add tcadp_parser method for PPT files
- Support both PPT and PPTX file formats
- Add PPT form field component
- Add new output format options: markdown, text, and html
…mance/perf_tcadp_parser
# Conflicts:
# rag/app/naive.py
# rag/flow/parser/parser.py
# uv.lock
# web/src/components/layout-recognize-form-field.tsx
# web/src/pages/data-flow/constant.tsx
# web/src/pages/data-flow/form/parser-form/index.tsx
# web/src/pages/data-flow/form/parser-form/pdf-form-fields.tsx
# web/src/pages/data-flow/utils.ts
… for the TCADP Parser
- Added TCADP Parser-related configuration fields to the PDF, PPT, and spreadsheet parsing forms
- Added support for setting table result type (Markdown/HTML) and Markdown image response type (URL/Text)
- Updated the TCADP Parser to support obtaining return format settings from configuration or parameters
- Updated frontend logic to dynamically display TCADP configuration options based on the selected parsing method
- Modified backend logic to pass the corresponding format configuration parameters when calling the TCADP API
- Optimized the form default value setting logic to ensure TCADP configuration items have appropriate initial values
- Updated multilingual resource files to support the UI display of the new configuration items
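For readers skimming the commit list, the "configuration or parameters" behaviour described above could look roughly like the sketch below. This is not the code from the PR: the function names, option keys, accepted values, and defaults are all hypothetical, chosen only to mirror the labels in the commit message (Markdown/HTML for tables, URL/Text for Markdown images). It assumes an explicit argument wins over the stored configuration, which in turn wins over a default.

```python
from typing import Mapping, Optional

# Hypothetical option values and defaults; the real parser may use different
# keys, encodings, or defaults.
ALLOWED = {
    "table_result_type": {"markdown", "html"},
    "markdown_image_response_type": {"url", "text"},
}
DEFAULTS = {
    "table_result_type": "markdown",
    "markdown_image_response_type": "url",
}


def resolve_option(name: str, explicit: Optional[str], config: Mapping[str, str]) -> str:
    """Return the explicit argument if given, else the config value, else the default."""
    raw = explicit if explicit is not None else config.get(name, DEFAULTS[name])
    value = str(raw).lower()
    if value not in ALLOWED[name]:
        raise ValueError(f"unsupported {name}: {raw!r}")
    return value


def resolve_tcadp_formats(
    config: Mapping[str, str],
    table_result_type: Optional[str] = None,
    markdown_image_response_type: Optional[str] = None,
) -> dict:
    """Resolve both return-format settings for a single (hypothetical) parse call."""
    return {
        "table_result_type": resolve_option(
            "table_result_type", table_result_type, config
        ),
        "markdown_image_response_type": resolve_option(
            "markdown_image_response_type", markdown_image_response_type, config
        ),
    }


if __name__ == "__main__":
    # A parameter passed at call time overrides the stored configuration.
    print(resolve_tcadp_formats({"table_result_type": "html"}, table_result_type="markdown"))
```

Validating the resolved value in one place means a bad form value fails early instead of being forwarded to the remote API.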
…nce/perf_tcadp_parser
# Conflicts:
# rag/app/naive.py
Member
Thanks, please fix the CI first.
Contributor, Author
@yingfeng CI issues fixed, checks are passing now. Please review again. Thanks!
yngvarhuang pushed a commit to yngvarhuang/ragflow that referenced this pull request on Nov 20, 2025
…1120
* main: (53 commits)
  Use array syntax for commands in docker-compose-base.yml (infiniflow#11391)
  Feature (canvas): Add mind tagging support (infiniflow#11359)
  locale en add russian language option (infiniflow#11392)
  Locale: update russian language (infiniflow#11393)
  Feat: Add TCADP parser for PPTX and spreadsheet document types. (infiniflow#11041)
  fix(llm): handle None response in total_token_count_from_response (infiniflow#10941)
  feat: add OceanBase doc engine (infiniflow#11228)
  fix cohere rerank base_url default (infiniflow#11353)
  Feat: Fixed an issue where modifying fields in the agent operator caused the loss of structured data. infiniflow#10427 (infiniflow#11388)
  Docs: minor (infiniflow#11385)
  Doc: Optimize read me (infiniflow#11386)
  Fix some multilingual issues (infiniflow#11382)
  Feat: If a query variable in a data manipulation operator is deleted, a warning message should be displayed to the user. infiniflow#10427 infiniflow#11255 (infiniflow#11384)
  Fix: refine error msg. (infiniflow#11380)
  Doc: Added v0.22.1 release notes (infiniflow#11383)
  Feat: The key for the begin operator can only contain alphanumeric characters and underscores. infiniflow#10427 (infiniflow#11377)
  Fix: circle imports issue. (infiniflow#11374)
  Feat: Structured data will still be stored in outputs for compatibility with older versions. infiniflow#10427 (infiniflow#11368)
  Add release notes (infiniflow#11372)
  Update README for supporting Gemini 3 Pro (infiniflow#11369)
  ...
# Conflicts:
# pyproject.toml
# web/src/locales/ru.ts
What problem does this PR solve?
Type of change