Commit cedc183

Author: Derek Zen
feat: integrate professional HTML to Markdown conversion
- Replace custom HTML conversion with node-html-markdown library
- Provide three content formats: text, markdown, and html
- Update README with new features and output structure
- Remove redundant fields and clean up code
- Version bump to 2.11.8
1 parent 358aea9 commit cedc183

5 files changed: +146 -70 lines changed

CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -1,3 +1,12 @@
+## [2.11.8](https://github.com/callzhang/n8n-nodes-imap/compare/v2.11.7...v2.11.8) (2025-01-XX)
+
+### Improvements
+
+* **Professional HTML to Markdown Conversion**: Replaced custom HTML conversion with `node-html-markdown` library
+* **Better Markdown Quality**: Now uses industry-standard library for more accurate and reliable HTML to Markdown conversion
+* **Simplified Code**: Removed custom HTML parsing logic in favor of proven library
+* **Three Content Formats**: Clean separation of text, markdown, and html fields
+
 ## [2.11.7](https://github.com/callzhang/n8n-nodes-imap/compare/v2.11.6...v2.11.7) (2025-01-XX)

 ### Features

README.md

Lines changed: 28 additions & 6 deletions
@@ -1,6 +1,6 @@
 # <img src="nodes/Imap/node-imap-enhanced-icon.svg" height="40"> n8n-nodes-imap-enhanced

-This is an enhanced n8n community node that adds support for [IMAP](https://en.wikipedia.org/wiki/Internet_Message_Access_Protocol) email servers with advanced features including custom labels, limit parameters, and structured email fields.
+This is an enhanced n8n community node that adds support for [IMAP](https://en.wikipedia.org/wiki/Internet_Message_Access_Protocol) email servers with advanced features including custom labels, limit parameters, and professional HTML to Markdown conversion.

 * [Installation](#installation)
 * [Operations](#operations)
@@ -27,7 +27,7 @@ NPMJS: [n8n-nodes-imap-enhanced](https://www.npmjs.com/package/n8n-nodes-imap-en
 * Rename a mailbox
 * ~Delete a mailbox~ (disabled due to danger of accidental data loss and no apparent use case)
 * Email
-* Get list of emails in a mailbox **with limit parameter and enhanced structured fields**
+* Get list of emails in a mailbox **with limit parameter and three content formats (text, markdown, html)**
 * Download attachments from an email
 * Move an email to another mailbox
 * Copy an email into another mailbox
@@ -38,10 +38,11 @@ NPMJS: [n8n-nodes-imap-enhanced](https://www.npmjs.com/package/n8n-nodes-imap-en

 ## New Features

-### Enhanced Email Fields
-- **Structured Output**: Returns title, from, to, cc, bcc, labels, content (text and HTML)
-- **Simplified HTML**: Converts text content to simplified HTML format
-- **All Email Metadata**: Includes date, messageId, size, and other envelope fields
+### Professional Content Processing
+- **Three Content Formats**: Returns `text`, `markdown`, and `html` fields for email body content
+- **Professional Markdown**: Uses `node-html-markdown` library for accurate HTML to Markdown conversion
+- **Clean Text**: Converts HTML to readable plain text with proper formatting
+- **Standard Fields**: Always includes flags/labels and structured envelope data

 ### Custom Labels Support
 - **Search by Custom Labels**: Search emails using custom labels/keywords
@@ -53,6 +54,27 @@ NPMJS: [n8n-nodes-imap-enhanced](https://www.npmjs.com/package/n8n-nodes-imap-en
 - **Mailbox List Limit**: Control maximum number of mailboxes returned
 - **Performance Optimization**: Prevents excessive data fetching

+### Output Structure
+```json
+{
+  "seq": 415,
+  "uid": 18718,
+  "mailboxPath": "INBOX",
+  "envelope": {
+    "subject": "Email Subject",
+    "from": [{"name": "Sender", "address": "[email protected]"}],
+    "to": [{"name": "Recipient", "address": "[email protected]"}],
+    "date": "2025-01-XX",
+    "messageId": "<message-id>"
+  },
+  "labels": ["\\Seen"],
+  "size": 12345,
+  "text": "Plain text content",
+  "markdown": "# Header\n\n**Bold** text",
+  "html": "<p>HTML content</p>"
+}
+```
+
 ## Credentials

 Currently, this node supports only basic authentication (username and password).
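
For workflow code that consumes this node's output, the Output Structure example added to the README above can be sketched as TypeScript types. This is an illustration only: `EmailAddress`, `EmailListItem`, `BodyFormat`, and `preferredBody` are hypothetical names that do not exist in the package, and marking the body fields as optional assumes they are only populated when body content is requested.

```typescript
// Hypothetical types mirroring the README's "Output Structure" example.
interface EmailAddress {
  name?: string;
  address?: string;
}

interface EmailListItem {
  seq: number;
  uid: number;
  mailboxPath: string;
  envelope: {
    subject?: string;
    from?: EmailAddress[];
    to?: EmailAddress[];
    date?: string;
    messageId?: string;
  };
  labels: string[];
  size?: number;
  // Assumption: body fields are only set when body content is requested.
  text?: string;
  markdown?: string;
  html?: string;
}

type BodyFormat = 'markdown' | 'text' | 'html' | 'none';

// Return the first non-empty body field, preferring markdown, then text, then html.
function preferredBody(item: EmailListItem): { format: BodyFormat; body: string } {
  if (item.markdown) return { format: 'markdown', body: item.markdown };
  if (item.text) return { format: 'text', body: item.text };
  if (item.html) return { format: 'html', body: item.html };
  return { format: 'none', body: '' };
}

const sample: EmailListItem = {
  seq: 415,
  uid: 18718,
  mailboxPath: 'INBOX',
  envelope: { subject: 'Email Subject', date: '2025-01-XX', messageId: '<message-id>' },
  labels: ['\\Seen'],
  size: 12345,
  text: 'Plain text content',
  markdown: '# Header\n\n**Bold** text',
  html: '<p>HTML content</p>',
};

console.log(preferredBody(sample).format); // prints "markdown"
```

The preference order here is a design choice for the example, not something the node prescribes; a consumer that renders HTML directly would check `html` first.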

nodes/Imap/operations/email/functions/EmailGetList.ts

Lines changed: 36 additions & 61 deletions
@@ -6,6 +6,7 @@ import { getMailboxPathFromNodeParameter, parameterSelectMailbox } from "../../.
 import { emailSearchParameters, getEmailSearchParametersFromNode } from "../../../utils/EmailSearchParameters";
 import { simpleParser } from 'mailparser';
 import { getEmailPartsInfoRecursive } from "../../../utils/EmailParts";
+import { NodeHtmlMarkdown } from 'node-html-markdown';


 enum EmailParts {
@@ -48,54 +49,40 @@ function textToSimplifiedHtml(text: string): string {
     .replace(/\r/g, '<br>');
 }

-function formatEmailAddresses(addresses: any[]): string[] {
-  if (!addresses || !Array.isArray(addresses)) return [];

-  return addresses.map(addr => {
-    if (typeof addr === 'string') return addr;
-    if (addr.name && addr.address) {
-      return `${addr.name} <${addr.address}>`;
-    }
-    return addr.address || '';
-  }).filter(addr => addr);
+// Initialize HTML to Markdown converter
+const nhm = new NodeHtmlMarkdown();
+
+function htmlToMarkdown(html: string): string {
+  if (!html) return '';
+  return nhm.translate(html);
 }

-function simplifyHtmlToReadable(html: string): string {
+function htmlToText(html: string): string {
   if (!html) return '';

   // Remove script and style elements completely
-  let simplified = html.replace(/<script[^>]*>[\s\S]*?<\/script>/gi, '');
-  simplified = simplified.replace(/<style[^>]*>[\s\S]*?<\/style>/gi, '');
+  let text = html.replace(/<script[^>]*>[\s\S]*?<\/script>/gi, '');
+  text = text.replace(/<style[^>]*>[\s\S]*?<\/style>/gi, '');

-  // Convert common HTML elements to readable format
-  simplified = simplified
-    // Convert headers to readable format
-    .replace(/<h[1-6][^>]*>(.*?)<\/h[1-6]>/gi, '\n\n$1\n' + '='.repeat(50) + '\n')
-    // Convert paragraphs
-    .replace(/<p[^>]*>(.*?)<\/p>/gi, '\n\n$1\n')
+  // Convert HTML elements to plain text
+  text = text
     // Convert line breaks
     .replace(/<br[^>]*>/gi, '\n')
-    // Convert divs to line breaks
+    // Convert paragraphs
+    .replace(/<p[^>]*>(.*?)<\/p>/gi, '\n\n$1\n\n')
+    // Convert headers
+    .replace(/<h[1-6][^>]*>(.*?)<\/h[1-6]>/gi, '\n\n$1\n\n')
+    // Convert divs
     .replace(/<div[^>]*>(.*?)<\/div>/gi, '\n$1\n')
     // Convert lists
     .replace(/<ul[^>]*>(.*?)<\/ul>/gi, '\n$1\n')
     .replace(/<ol[^>]*>(.*?)<\/ol>/gi, '\n$1\n')
     .replace(/<li[^>]*>(.*?)<\/li>/gi, '• $1\n')
     // Convert blockquotes
     .replace(/<blockquote[^>]*>(.*?)<\/blockquote>/gi, '\n> $1\n')
-    // Convert links to readable format
+    // Convert links to plain text with URL
     .replace(/<a[^>]*href=["']([^"']*)["'][^>]*>(.*?)<\/a>/gi, '$2 ($1)')
-    // Convert emphasis
-    .replace(/<(strong|b)[^>]*>(.*?)<\/(strong|b)>/gi, '**$2**')
-    .replace(/<(em|i)[^>]*>(.*?)<\/(em|i)>/gi, '*$2*')
-    // Convert code
-    .replace(/<code[^>]*>(.*?)<\/code>/gi, '`$1`')
-    .replace(/<pre[^>]*>(.*?)<\/pre>/gi, '\n```\n$1\n```\n')
-    // Convert tables to readable format
-    .replace(/<table[^>]*>(.*?)<\/table>/gi, '\n$1\n')
-    .replace(/<tr[^>]*>(.*?)<\/tr>/gi, '\n$1\n')
-    .replace(/<td[^>]*>(.*?)<\/td>/gi, '$1 | ')
-    .replace(/<th[^>]*>(.*?)<\/th>/gi, '$1 | ')
     // Remove all remaining HTML tags
     .replace(/<[^>]*>/g, '')
     // Decode HTML entities
@@ -113,7 +100,7 @@ function simplifyHtmlToReadable(html: string): string {
     .replace(/\n /g, '\n') // Remove leading spaces from lines
     .replace(/ \n/g, '\n'); // Remove trailing spaces from lines

-  return simplified.trim();
+  return text.trim();
 }


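
One caveat with the pattern-based `htmlToText` shown in this diff: in JavaScript regular expressions, `.` does not match line terminators unless the `s` flag is set, so a pattern such as `/<p[^>]*>(.*?)<\/p>/gi` only matches elements whose content sits on a single line. Multi-line elements fall through to the generic tag-stripping step and lose their paragraph spacing. The sketch below illustrates the difference on one pattern; `paragraphsSingleLine` and `paragraphsMultiLine` are hypothetical names, and the second variant is a possible alternative, not code from this commit.

```typescript
// Pattern as used in the diff: `.` stops at line breaks.
function paragraphsSingleLine(html: string): string {
  return html.replace(/<p[^>]*>(.*?)<\/p>/gi, '\n\n$1\n\n');
}

// Variant using [\s\S] so the captured content may span several lines.
function paragraphsMultiLine(html: string): string {
  return html.replace(/<p[^>]*>([\s\S]*?)<\/p>/gi, '\n\n$1\n\n');
}

const sampleHtml = '<p>first\nsecond</p>';

// The single-line pattern finds no match, so the input comes back unchanged.
console.log(JSON.stringify(paragraphsSingleLine(sampleHtml))); // prints "<p>first\nsecond</p>"

// The multi-line pattern replaces the element and keeps the inner line break.
console.log(JSON.stringify(paragraphsMultiLine(sampleHtml))); // prints "\n\nfirst\nsecond\n\n"
```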

@@ -214,8 +201,8 @@ export const getEmailsListOperation: IResourceOperationDef = {
       name: 'enhancedFields',
       type: 'boolean',
       default: true,
-      description: 'Whether to include email body content (text and HTML) in the results',
-      hint: 'Returns both text and HTML versions of the email body content',
+      description: 'Whether to include email body content in the results',
+      hint: 'Returns text, markdown, and html fields with email body content',
     }
   ],
   async executeImapAction(context: IExecuteFunctions, itemIndex: number, client: ImapFlow): Promise<INodeExecutionData[] | null> {
@@ -322,18 +309,7 @@ export const getEmailsListOperation: IResourceOperationDef = {
     item_json.size = email.size;
   }

-  // Always include structured fields from envelope (clean format)
-  if (email.envelope) {
-    item_json.title = email.envelope.subject || '';
-    item_json.from = formatEmailAddresses(email.envelope.from || []);
-    item_json.to = formatEmailAddresses(email.envelope.to || []);
-    item_json.cc = formatEmailAddresses(email.envelope.cc || []);
-    item_json.bcc = formatEmailAddresses(email.envelope.bcc || []);
-    item_json.replyTo = formatEmailAddresses(email.envelope.replyTo || []);
-    item_json.date = email.envelope.date;
-    item_json.messageId = email.envelope.messageId;
-    item_json.inReplyTo = email.envelope.inReplyTo;
-  }
+  // Note: All envelope fields are already included in the envelope object above

   // process the headers
   if (includeParts.includes(EmailParts.Headers)) {
@@ -409,11 +385,11 @@ export const getEmailsListOperation: IResourceOperationDef = {
       if (textContent.content) {
         item_json.textContent = await streamToString(textContent.content);

-        // if include body is enabled, also provide simplified HTML version
+        // if include body is enabled, provide text content
         if (enhancedFields) {
-          item_json.contentText = item_json.textContent;
-          item_json.contentHtml = textToSimplifiedHtml(item_json.textContent);
-          item_json.contentReadable = item_json.textContent; // Text content is already readable
+          item_json.text = item_json.textContent;
+          item_json.markdown = item_json.textContent; // Plain text is already readable
+          item_json.html = textToSimplifiedHtml(item_json.textContent);
         }
       }
     }
@@ -428,25 +404,24 @@ export const getEmailsListOperation: IResourceOperationDef = {
       if (htmlContent.content) {
         item_json.htmlContent = await streamToString(htmlContent.content);

-        // if include body is enabled, also provide the HTML content
+        // if include body is enabled, provide HTML content
         if (enhancedFields) {
-          item_json.contentHtml = item_json.htmlContent;
-          // if we don't have text content, create readable text from the HTML content
-          if (!item_json.contentText) {
-            item_json.contentText = simplifyHtmlToReadable(item_json.htmlContent);
+          item_json.html = item_json.htmlContent;
+          item_json.markdown = htmlToMarkdown(item_json.htmlContent);
+          // if we don't have text content, create plain text from the HTML content
+          if (!item_json.text) {
+            item_json.text = htmlToText(item_json.htmlContent);
           }
-          // Always provide a readable version of the HTML content
-          item_json.contentReadable = simplifyHtmlToReadable(item_json.htmlContent);
         }
       }
     }
   }

   // if include body is enabled but no content was found, set empty values
-  if (enhancedFields && !item_json.contentText && !item_json.contentHtml) {
-    item_json.contentText = '';
-    item_json.contentHtml = '';
-    item_json.contentReadable = '';
+  if (enhancedFields && !item_json.text && !item_json.html) {
+    item_json.text = '';
+    item_json.markdown = '';
+    item_json.html = '';
   }
 }

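
The body-field logic spread across the hunks above can be summarized as one small pure function. This is a sketch under stated assumptions, not the node's implementation: `buildBodyFields`, `BodyFields`, and `Converter` are hypothetical names, the text part is assumed to be processed before the HTML part (as the hunk order suggests), and the converters are passed in as parameters so the example runs without any dependency. In the commit those roles are played by `htmlToMarkdown`, `htmlToText`, and `textToSimplifiedHtml`.

```typescript
interface BodyFields {
  text: string;
  markdown: string;
  html: string;
}

type Converter = (input: string) => string;

// Hypothetical helper reproducing the fallback order seen in the diff.
function buildBodyFields(
  textPart: string | undefined,
  htmlPart: string | undefined,
  toMarkdown: Converter,
  toText: Converter,
  // Stand-in for textToSimplifiedHtml, whose full body is not shown in the diff.
  textToHtml: Converter = (t) => t.replace(/\n/g, '<br>'),
): BodyFields {
  // Empty strings are the default when no body part is found.
  const fields: BodyFields = { text: '', markdown: '', html: '' };

  if (textPart) {
    fields.text = textPart;
    fields.markdown = textPart; // plain text is used as-is for markdown
    fields.html = textToHtml(textPart);
  }

  if (htmlPart) {
    fields.html = htmlPart; // the real HTML part wins over generated HTML
    fields.markdown = toMarkdown(htmlPart);
    if (!fields.text) fields.text = toText(htmlPart); // only when no text part exists
  }

  return fields;
}

// Toy converters for demonstration only; they are not the library's behavior.
const stripTags: Converter = (html) => html.replace(/<[^>]*>/g, '');
const toyMarkdown: Converter = (html) => html.replace(/<\/?b>/g, '**').replace(/<[^>]*>/g, '');

const htmlOnly = buildBodyFields(undefined, '<p><b>Hi</b> there</p>', toyMarkdown, stripTags);
console.log(htmlOnly.text);     // prints "Hi there"
console.log(htmlOnly.markdown); // prints "**Hi** there"
```

Note the consequence for messages that carry both parts: `text` comes from the text part while `markdown` and `html` come from the HTML part, so the three fields are not guaranteed to be renderings of the same source.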

package-lock.json

Lines changed: 71 additions & 2 deletions
Some generated files are not rendered by default.

package.json

Lines changed: 2 additions & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "n8n-nodes-imap-enhanced",
-  "version": "2.11.7",
+  "version": "2.11.8",
   "description": "Enhanced IMAP node with custom labels support, limit parameters, and structured email fields for n8n workflows.",
   "keywords": [
     "n8n-community-node-package"
@@ -64,6 +64,7 @@
   "dependencies": {
     "imapflow": "^1.0.188",
     "mailparser": "^3.7.3",
+    "node-html-markdown": "^1.3.0",
     "nodemailer": "^6.10.1"
   }
 }
