Feature: PostScript/EPS format support #96

@coderabbitai

Overview

Add basic support for PostScript and Encapsulated PostScript (EPS) files.

Parent Epic

Part of #91 - Document & Office Format Awareness

Description

Parse PostScript files to extract metadata and text strings while skipping binary image data.

Implementation Details

  • PostScript is a text-based programming language
  • Parse comments for metadata (%%Title, %%Creator, etc.)
  • Extract string literals
  • Identify DSC (Document Structuring Conventions) comments
  • Skip binary image data sections
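As a starting point for the DSC part of the list above, here is a minimal sketch in Python. The function name `parse_dsc_comments` and the returned dict shape are assumptions made for illustration, not an existing API in this project. It only handles `%%Keyword: value` lines, keeps the first value seen per keyword, and skips `(atend)` placeholders so that the trailer value is picked up instead. Continuation lines (`%%+`) are not handled.

```python
import re

# Matches "%%Keyword: value"; DSC comments without a colon (e.g. %%EndComments) are ignored.
DSC_LINE = re.compile(rb"^%%([A-Za-z]+):[ \t]*(.*)$")


def parse_dsc_comments(data: bytes) -> dict[str, str]:
    """Collect `%%Keyword: value` DSC comments from raw PostScript bytes.

    Hypothetical helper: keeps the first real value per keyword and treats
    "(atend)" as "value deferred to the trailer".
    """
    meta: dict[str, str] = {}
    for raw in data.splitlines():  # handles \n, \r and \r\n line endings
        match = DSC_LINE.match(raw)
        if match is None:
            continue
        key = match.group(1).decode("ascii")
        # latin-1 never fails to decode, which suits files of unknown encoding
        value = match.group(2).decode("latin-1").strip()
        if value == "(atend)":
            continue  # the real value appears later, after %%Trailer
        meta.setdefault(key, value)
    return meta
```

Repeated keywords such as `%%Page` collapse to their first occurrence here; a real implementation would probably want a list for those.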

String Sources

  • DSC comments (metadata)
  • String literals in code
  • Font names
  • Resource identifiers
  • BoundingBox and page information
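For the "string literals in code" source, a rough sketch of a scanner for parenthesized PostScript strings could look like the following. The names `extract_string_literals` and `_read_paren_string` are hypothetical. The sketch assumes the input has already been decoded to `str` (for example as latin-1), handles balanced nested parentheses, backslash escapes and octal codes, and skips `%` comments. Hex strings (`<...>`) and ASCII85 strings (`<~...~>`) are deliberately left out.

```python
_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
            "\\": "\\", "(": "(", ")": ")"}


def _read_paren_string(src: str, i: int) -> tuple[str, int]:
    """Read a string body starting just after '('; return (text, next index)."""
    depth, buf, n = 1, [], len(src)
    while i < n:
        ch = src[i]
        if ch == "\\" and i + 1 < n:
            nxt = src[i + 1]
            if nxt in _ESCAPES:
                buf.append(_ESCAPES[nxt])
                i += 2
            elif nxt in "01234567":
                j = i + 1
                while j < n and j < i + 4 and src[j] in "01234567":
                    j += 1  # up to three octal digits
                buf.append(chr(int(src[i + 1:j], 8) & 0xFF))
                i = j
            elif nxt in "\r\n":
                i += 2  # backslash-newline is a line continuation
                if nxt == "\r" and i < n and src[i] == "\n":
                    i += 1
            else:
                buf.append(nxt)  # unknown escape: the backslash is dropped
                i += 2
        elif ch == "(":
            depth += 1
            buf.append(ch)
            i += 1
        elif ch == ")":
            depth -= 1
            i += 1
            if depth == 0:
                return "".join(buf), i
            buf.append(ch)
        else:
            buf.append(ch)
            i += 1
    return "".join(buf), i  # unterminated string: return what was read


def extract_string_literals(source: str) -> list[str]:
    """Return the contents of every (...) literal outside of % comments."""
    out, i, n = [], 0, len(source)
    while i < n:
        ch = source[i]
        if ch == "%":
            while i < n and source[i] not in "\r\n":
                i += 1  # skip the comment up to the end of the line
        elif ch == "(":
            text, i = _read_paren_string(source, i + 1)
            out.append(text)
        else:
            i += 1
    return out
```

Because DSC comments start with `%`, this scanner ignores them; metadata would come from a separate DSC pass.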

Acceptance Criteria

  • Parse DSC comments
  • Extract string literals
  • Identify binary sections
  • Handle both ASCII and binary PS
  • Tests with PS and EPS files
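One concrete case of "binary PS" is the DOS EPS binary wrapper, where a 30-byte header starting with the bytes C5 D0 D3 C6 points at the PostScript section and at optional WMF/TIFF previews. The sketch below covers only that case; the function name `postscript_section` is an assumption, and inline binary image data inside the PostScript section itself would still need separate handling.

```python
import struct

# Magic bytes that open a DOS EPS binary file.
DOS_EPS_MAGIC = b"\xc5\xd0\xd3\xc6"


def postscript_section(data: bytes) -> bytes:
    """Return the PostScript portion of an EPS file.

    Hypothetical helper: plain ASCII files are returned unchanged; for DOS EPS
    binary files the little-endian offset and length at bytes 4-11 are used
    to slice out the PostScript section and drop the preview images.
    """
    if not data.startswith(DOS_EPS_MAGIC) or len(data) < 12:
        return data
    ps_offset, ps_length = struct.unpack_from("<II", data, 4)
    return data[ps_offset:ps_offset + ps_length]
```

The result can then be fed to the DSC and string-literal passes, so both file flavors share one code path.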

Related

Project: #76

Labels

enhancement (New feature or request)
