Analyzing Malicious PDF With Peepdf

Perform static analysis of malicious PDF documents using peepdf, pdfid, and pdf-parser to extract embedded JavaScript, shellcode, and suspicious objects.

Published by @mukul975·0 agent reads / 30d·0 saves·

Analyzing Malicious PDF with peepdf

When to Use

  • When triaging suspicious PDF attachments from phishing emails
  • During malware analysis of PDF-based exploit documents
  • When extracting embedded JavaScript, shellcode, or executables from PDFs
  • For forensic examination of weaponized document artifacts
  • When building detection signatures for PDF-based threats

Prerequisites

  • Python 3.8+ with peepdf-3 installed (pip install peepdf-3)
  • pdfid.py and pdf-parser.py from Didier Stevens suite
  • Isolated analysis environment (VM or sandbox)
  • Optional: PyV8 for JavaScript emulation within peepdf
  • Optional: Pylibemu for shellcode analysis

Workflow

  1. Triage with pdfid: Scan PDF for suspicious keywords (/JS, /JavaScript, /OpenAction, /Launch, /EmbeddedFile).
  2. Interactive Analysis: Open PDF in peepdf interactive mode to explore object structure.
  3. Identify Suspicious Objects: Locate objects containing JavaScript, streams, or encoded data.
  4. Extract Content: Dump suspicious streams and decode filters (FlateDecode, ASCIIHexDecode).
  5. Deobfuscate JavaScript: Analyze extracted JS for shellcode, heap sprays, or exploit code.
  6. Check VirusTotal: Use peepdf vtcheck to cross-reference file hash with AV detections.
  7. Generate IOCs: Extract URLs, domains, hashes, and shellcode signatures.

Key Concepts

ConceptDescription
/OpenActionAutomatic action executed when PDF is opened
/JavaScript /JSEmbedded JavaScript code in PDF objects
/LaunchAction that launches external applications
/EmbeddedFileFile embedded within the PDF structure
FlateDecodezlib compression filter used to hide content
Object StreamsPDF objects stored in compressed streams

Tools & Systems

ToolPurpose
peepdf / peepdf-3Interactive PDF analysis with JS emulation
pdfid.pyQuick triage scanning for suspicious keywords
pdf-parser.pyDeep object-level PDF parsing
VirusTotalHash lookup and AV detection cross-reference
CyberChefDecode and transform extracted payloads

Output Format

Analysis Report: PDF-MAL-[DATE]-[SEQ]
File: [filename.pdf]
SHA-256: [hash]
Suspicious Keywords: [/JS, /OpenAction, etc.]
Objects with JavaScript: [Object IDs]
Extracted URLs: [List]
Shellcode Detected: [Yes/No]
Embedded Files: [Count and types]
VirusTotal Detections: [X/Y engines]
Risk Level: [Critical/High/Medium/Low]

Bundled with this artifact

5 files

Reference files that ship alongside this artifact. Agents pull these in only when the task needs them.

More on the bench

SKILL0

Devsecops Ssdlc Appsec Cursor Rule

Cursor rules for secure coding, secret handling, dependency hygiene, authentication, authorization, security testing, and compliance documentation.

cybersecurity-soc+1
0
SKILL0

Audit Skills

Expert security auditor for AI Skills and Bundles. Performs non-intrusive static analysis to identify malicious patterns, data leaks, system stability risks, and obfuscated payloads across Windows, macOS, Linux/Unix, and Mobile (Android/iOS).

cybersecurity-soc+2
0
SKILL0

VibeSec Skill

This skill helps Claude write secure web applications. Use this when working on any web application or when a user requests a scan or audit to ensure security best practices are followed.

cybersecurity-soc+2
0