Is there an existing issue for this?
Are you using the latest version of this package?
Can other PDF readers read the file?
When running this snippet
$text = (new PdfParser())->parseFile('/path/to/file.pdf')->getText();
I run into the following issue/exception (Please attach the pdf)
@PrinsFrank I hope you are doing well. I haven't heard from you since your last mail.
Some words are missing characters, and these characters appear to be "moved" to the next line. For instance, the line Frühwald, Norbert, Hemau is extracted as:
rühwald, orbert, emau
FNH
where F, N and H are the first letters of the words on the line above.
Raw extracted text
Related part in the PDF
PDFs
Here is the related PR for your PDF samples repository: PrinsFrank/pdf-samples#8
Uploaded PDF: BAnz AT 08.10.2025 B1.pdf
Online version of the PDF: https://www.bundesanzeiger.de/pub/publication/pifGpbbuJiFDBbgfH0P/content/pifGpbbuJiFDBbgfH0P/BAnz%20AT%2008.10.2025%20B1.pdf?inline
Do you allow attachment files to be used in tests to prevent regressions?