Add workarounds for anti-framing scripts #636

@Mr0grog

Some pages have anti-framing/anti-clickjacking code that checks whether the page is in a frame and, if so, hides the content and/or attempts to redirect the top frame to the page. For example, https://www.census.gov/programs-surveys/economic-census.html has this code in the <head>:

<head>
  ...
  <style id="antiClickjack">body { display: none; }</style>
  <script type="text/javascript">
    if (self === top) {
      var antiClickjack =  document.getElementById("antiClickjack");
      antiClickjack.parentNode.removeChild(antiClickjack);
    } else {
      top.location = self.location
    }
  </script>
  ...
</head>

Since we show pages in iframes, this is a problem. We set restrictions on the frame’s code so it can’t redirect the top frame, but this still leaves us with a blank page (and in some cases, a broken page because the script might throw an exception). Some workaround ideas:

  • Inject a script that runs after page load (or maybe just at the end of the page?) and checks whether the html or body element’s computed style has display: none or visibility: hidden. If so, explicitly set that element’s style to display: block; visibility: visible;. Something like:

    [document.documentElement, document.body].forEach(function ensureVisible(element) {
        // document.body can be null if this runs before the body is parsed.
        if (!element) return;
        const style = getComputedStyle(element);
        // Check and set these in one go because setting one and then checking
        // the next will cause layout thrashing.
        if (style.display === 'none' || style.visibility === 'hidden') {
            element.style.display = 'block';
            element.style.visibility = 'visible';
        }
    });

    Some downsides: won’t fix scripts that errored out, won’t work if the thing being hidden is some arbitrary wrapper element in the page (although we could maybe come up with some heuristics for that).

  • Wrap any scripts on the page in a with block that acts as a proxy for the window. For example, we’d transform the above example from census.gov to:

    <head>
      ...
      <style id="antiClickjack">body { display: none; }</style>
    
      <!-- Insert this element before the first <script> tag -->
      <script type="text/javascript">
        // Create a fake `window` object that makes `self` and `top` look identical.
        if (window.Proxy) {
          window.WINDOW_PROXY = new Proxy(window, {
            get (target, prop, receiver) {
              if (prop === "top" || prop === "self" || prop === "window") {
                return receiver;
              }
              return Reflect.get(target, prop, target);
            }
          });
        }
        else {
          window.WINDOW_PROXY = {self: window, top: window};
        }
      </script>
    
      <!-- Wrap the contents of any <script> tags in `with (WINDOW_PROXY) {...}` -->
      <script type="text/javascript">
        // Wrap the original contents of the script so properties are grabbed from
        // a special proxy object.
        with (WINDOW_PROXY) {
          if (self === top) {
            var antiClickjack =  document.getElementById("antiClickjack");
            antiClickjack.parentNode.removeChild(antiClickjack);
          } else {
            top.location = self.location
          }
        }
      </script>
      ...
    </head>

    Also not perfect: it only covers scripts that are in the page, rather than external references (i.e. <script src="some_url"></script>); the fallback version that doesn’t use Proxy could be error-prone in other ways (maybe just don’t support that case?). We could also expand this approach to solve some of the things that the iframe sandbox is causing errors with (e.g. referencing or setting document.cookie).

  • REALLY complex: add a service worker to essentially do the above to external scripts. This probably won’t work in a lot of cases (service workers don’t always apply) and may not really be worthwhile. It’s probably better accomplished by something even more messy: rewriting all [script] URLs so that the front-end server proxies them, and have it do this wrapping. On the other hand, proxying & rewriting (kinda like Wayback/the memento API) will solve lots of other issues, like CORS problems.

  • Any other ideas? These are the only two obvious approaches that jump out at me.
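
To sanity-check the `with`-wrapping idea outside a browser, here is a minimal sketch. The helper names (`createWindowProxy`, `wrapInlineScript`, `detectsFraming`) and the `fakeWindow` object are hypothetical stand-ins, not existing code. The `has` trap is an addition to the proxy shown above: `with` asks the scope object whether it owns a name before reading it.

```javascript
// Hypothetical helper: build the proxy around any window-like object.
function createWindowProxy(win) {
  const aliases = ['top', 'self', 'window'];
  return new Proxy(win, {
    // `with` calls `has` to decide whether a name resolves on this object.
    has(target, prop) {
      return aliases.includes(prop) || prop in target;
    },
    get(target, prop, receiver) {
      // Make `top`, `self` and `window` all look like the same object.
      if (aliases.includes(prop)) {
        return receiver;
      }
      return Reflect.get(target, prop, target);
    }
  });
}

// Hypothetical helper: wrap the text of an inline script.
function wrapInlineScript(source) {
  return `with (WINDOW_PROXY) {\n${source}\n}`;
}

// Runs the census.gov-style check against a given scope object.
// `new Function` bodies are non-strict, so `with` is allowed here.
function detectsFraming(scope) {
  const body = wrapInlineScript('return self !== top;');
  return new Function('WINDOW_PROXY', body)(scope);
}

// Stand-in for a framed window: `top` is a different object from `self`.
const fakeWindow = { top: { name: 'outer frame' } };
fakeWindow.self = fakeWindow;
fakeWindow.window = fakeWindow;

const framedWithoutProxy = detectsFraming(fakeWindow);
const framedWithProxy = detectsFraming(createWindowProxy(fakeWindow));
```

One thing that would need checking in a real browser: functions resolved through `with` are called with the scope object as `this`, so native methods such as `setTimeout` may reject the proxy unless the `get` trap returns bound functions.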

It might make the most sense to do a combination of the above. We could also push this into the HTML differ instead of doing it here in the front-end.
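
For the script-proxying idea, the URL rewriting step might look roughly like the sketch below. The proxy endpoint (`https://example.org/proxy-script`), its `url` query parameter, and both function names are assumptions made up for illustration. The server behind that endpoint would fetch the original script and apply the same `with (WINDOW_PROXY) {...}` wrapping before returning it.

```javascript
// Hypothetical proxy endpoint; the real route depends on the front-end server.
const PROXY_BASE = 'https://example.org/proxy-script';

// Resolve a script's `src` against the page URL and point it at the proxy.
function toProxiedScriptUrl(src, pageUrl, proxyBase = PROXY_BASE) {
  const absolute = new URL(src, pageUrl);
  const proxied = new URL(proxyBase);
  proxied.searchParams.set('url', absolute.href);
  return proxied.href;
}

// Rewrite every external script in a parsed (not yet rendered) document.
// `doc` is assumed to be a Document, e.g. one produced by DOMParser.
function rewriteScriptElements(doc, pageUrl) {
  for (const script of doc.querySelectorAll('script[src]')) {
    const src = script.getAttribute('src');
    script.setAttribute('src', toProxiedScriptUrl(src, pageUrl));
  }
}

const exampleProxied = toProxiedScriptUrl(
  '/js/app.js',
  'https://www.census.gov/programs-surveys/economic-census.html'
);
```

The rewriting has to happen before the HTML is rendered in the iframe, since changing `src` on a script that has already loaded does not re-run it.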
