Open
Labels: enhancement (New feature or request), priority:medium (Medium priority task)
Overview
Add support for OpenDocument formats (ODT, ODS, ODP) used by LibreOffice and OpenOffice.
Parent Epic
Part of #91 - Document & Office Format Awareness
Description
Parse OpenDocument files (ZIP-based) to extract metadata and text content from XML streams.
Implementation Details
- OpenDocument files are ZIP archives containing XML
- Parse meta.xml for metadata
- Parse content.xml for document content
- Handle styles.xml for formatting information
- Extract embedded media metadata
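A minimal sketch of the first three steps, assuming Python's standard `zipfile` module as a stand-in for the Phase 2 ZIP parser. The function names `read_metadata` and `read_text` are hypothetical and not part of this project; the namespace URIs are the standard ODF ones.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Standard ODF namespace URIs.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
TEXT_P = "{%s}p" % NS["text"]
TEXT_H = "{%s}h" % NS["text"]


def read_metadata(zf):
    """Return {local-name: text} for the children of <office:meta> in meta.xml."""
    root = ET.fromstring(zf.read("meta.xml"))
    meta = root.find("office:meta", NS)
    if meta is None:
        return {}
    fields = {}
    for child in meta:
        if child.text and child.text.strip():
            # ElementTree tags look like "{namespace-uri}local"; keep the local part.
            local = child.tag.rsplit("}", 1)[-1]
            fields[local] = child.text.strip()
    return fields


def read_text(zf):
    """Return the non-empty text of each <text:p> and <text:h> in content.xml."""
    root = ET.fromstring(zf.read("content.xml"))
    chunks = []
    for el in root.iter():
        if el.tag in (TEXT_P, TEXT_H):
            # itertext() also picks up text inside nested spans and links.
            s = "".join(el.itertext()).strip()
            if s:
                chunks.append(s)
    return chunks
```

Usage would look like `with zipfile.ZipFile(path) as zf: read_metadata(zf), read_text(zf)`. This sketch does not deduplicate paragraphs nested inside text boxes, and for untrusted input a hardened XML parser may be worth considering.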
String Sources
- Document metadata (meta.xml)
- Text content (content.xml)
- Styles and formatting names
- Embedded media filenames
- Hyperlinks and references
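The less obvious sources in this list (style names, media filenames, hyperlinks) could be collected roughly as follows. This is a sketch under assumptions: `collect_strings` is a hypothetical helper, hyperlinks are taken from `text:a` elements, media paths from `draw:image` elements, and style names from `style:style` elements, all identified by the standard ODF namespaces.

```python
import xml.etree.ElementTree as ET

# Standard ODF / XLink namespace URIs.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"

TEXT_A = "{%s}a" % TEXT_NS
DRAW_IMAGE = "{%s}image" % DRAW_NS
STYLE_STYLE = "{%s}style" % STYLE_NS
HREF = "{%s}href" % XLINK_NS
STYLE_NAME = "{%s}name" % STYLE_NS


def collect_strings(content_xml, styles_xml):
    """Collect hyperlink targets, embedded media paths, and style names.

    Both arguments are the raw bytes of the respective XML streams.
    """
    found = {"links": [], "media": [], "styles": []}

    for el in ET.fromstring(content_xml).iter():
        href = el.get(HREF)
        if el.tag == TEXT_A and href:
            found["links"].append(href)
        elif el.tag == DRAW_IMAGE and href:
            # Only the package-relative path is recorded; the binary is never read.
            found["media"].append(href)

    for el in ET.fromstring(styles_xml).iter():
        name = el.get(STYLE_NAME)
        if el.tag == STYLE_STYLE and name:
            found["styles"].append(name)

    return found
```

Recording only the `xlink:href` path of an image keeps the filename as a string source without touching the embedded binary itself.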
Acceptance Criteria
- Unzip OpenDocument files
- Parse meta.xml for metadata
- Extract text from content.xml
- Handle ODT, ODS, ODP formats
- Skip binary embedded objects
- Tests with LibreOffice-created files
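Two of these criteria (handling ODT, ODS, and ODP, and skipping binary embedded objects) could be approached as in the sketch below. It assumes the package carries the usual `mimetype` entry and again uses Python's `zipfile` in place of the Phase 2 ZIP parser; `detect_format` and `xml_members` are hypothetical names.

```python
import zipfile

# Media types of the three target formats, as stored in the package's
# "mimetype" entry.
ODF_MIMETYPES = {
    "application/vnd.oasis.opendocument.text": "odt",
    "application/vnd.oasis.opendocument.spreadsheet": "ods",
    "application/vnd.oasis.opendocument.presentation": "odp",
}


def detect_format(zf):
    """Return "odt", "ods", or "odp", or None if the archive is not recognized."""
    try:
        raw = zf.read("mimetype")
    except KeyError:
        # No mimetype entry: treat as not an OpenDocument package.
        return None
    return ODF_MIMETYPES.get(raw.decode("ascii", "replace").strip())


def xml_members(zf):
    """List only the XML streams, so embedded binaries are never decompressed."""
    return [name for name in zf.namelist() if name.endswith(".xml")]
```

Filtering by the `.xml` suffix is a deliberately simple rule for this sketch; consulting `META-INF/manifest.xml` for each entry's media type would be a stricter alternative.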
Related
Project: #76
Depends on: Phase 2 ZIP parser