Add values to "rel" instead of overwriting existing ones #322

@buhtz

Hello,

let me start with an example.

<a href="https://fosstodon.org/@backintime" rel="me">foobar</a>

This input is transformed into this, with an overwritten rel-attribute:

<a href="https://fosstodon.org/@backintime" rel="nofollow">foobar</a>

But what I want is

<a href="https://fosstodon.org/@backintime" rel="me nofollow">foobar</a>

cleaned = nh3.clean(
    html,
    tags=ALLOWED_TAGS,
    attributes=ALLOWED_ATTRIBUTES,
    link_rel="nofollow",
    url_schemes={"http", "https", "mailto"},
)

Background

I am aware that you use the nh3 package for this. I asked there and got the answer that there is no option on the nh3 side to modify that behavior: messense/nh3#71

The broader picture is that I need to verify a mailman3 list on Mastodon. To make this happen, the list landing page in mailman3 needs to contain a rel=me link like the one shown above.

Approach

I can imagine a regex-based approach/workaround and would provide a PR if you accept the concept of the solution.

The approach described in short steps:

  1. Before calling nh3.clean(), record (via regex) all a-tags (and, if you want, link-tags contained in head-tags) that have a "rel" attribute.
  2. Call nh3.clean().
  3. Iterate over the recorded cases:
    1. Use their nh3.clean() version to find them in the sanitized HTML.
    2. Add the old rel-values to the rel-attribute.
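
The steps above could be sketched roughly as follows. This is only an illustration of the idea, not a finished patch: the names `clean_keep_rel` and `merge_rel` are made up for this example, the sanitizer is passed in as a plain callable instead of calling nh3 directly, and the regexes assume double-quoted attribute values. As an extra precaution that is not part of the steps above, only old rel values that look like plain tokens are restored.

```python
import re
from typing import Callable

# Opening <a ...> tag carrying a double-quoted rel attribute (assumption:
# attribute values are double-quoted, as in the example above).
A_TAG_WITH_REL = re.compile(r'<a\s[^>]*?\brel="([^"]*)"[^>]*>', re.IGNORECASE)
A_TAG_OPEN = re.compile(r'<a\s[^>]*>', re.IGNORECASE)
REL_ATTR = re.compile(r'\brel="([^"]*)"', re.IGNORECASE)
# Only restore old rel values that are simple tokens such as "me".
TOKEN = re.compile(r'[A-Za-z][A-Za-z0-9-]*')


def merge_rel(old_rel: str, sanitized_rel: str) -> str:
    """Combine both rel values, old ones first, dropping duplicates."""
    merged = []
    old_tokens = [t for t in old_rel.split() if TOKEN.fullmatch(t)]
    for token in old_tokens + sanitized_rel.split():
        if token not in merged:
            merged.append(token)
    return " ".join(merged)


def clean_keep_rel(html: str, clean: Callable[[str], str]) -> str:
    """Sanitize html with clean(), then add the original rel values back."""
    # Step 1: record every opening a-tag that has a rel attribute.
    recorded = [(m.group(0), m.group(1)) for m in A_TAG_WITH_REL.finditer(html)]

    # Step 2: sanitize the whole document.
    result = clean(html)

    # Step 3: revisit each recorded tag.
    for tag, old_rel in recorded:
        # 3.1: sanitize the tag on its own to learn what it looks like
        # in the sanitized document.
        opening = A_TAG_OPEN.match(clean(tag + "</a>"))
        if opening is None:
            continue  # the sanitizer dropped the tag entirely
        sanitized_tag = opening.group(0)
        rel = REL_ATTR.search(sanitized_tag)
        if rel is None:
            continue  # the sanitizer emitted no rel attribute to extend

        # 3.2: put the old rel values in front of the sanitizer's values.
        new_rel = merge_rel(old_rel, rel.group(1))
        patched = sanitized_tag[:rel.start(1)] + new_rel + sanitized_tag[rel.end(1):]
        result = result.replace(sanitized_tag, patched, 1)

    return result
```

With nh3 installed, the `clean` argument could be something like `functools.partial(nh3.clean, link_rel="nofollow")` plus the existing tags and attributes settings. One known weakness of this sketch: if the same link appears twice and only one occurrence had a rel attribute, the first matching sanitized tag is patched, which may be the wrong one.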

What do you think?
