CPG Analysis

Deep code property graph analysis with Joern CPG (AST+CFG+PDG) and CodeQL for control flow, data flow, taint analysis, and security auditing

Published by @alinaqi

CPG Analysis Skill

Purpose: Deep code analysis beyond AST. Use Joern for full Code Property Graph (control flow, data flow, program dependencies) and CodeQL for interprocedural taint analysis and vulnerability detection.

These are opt-in tools. They require Docker/JVM (Joern) or CodeQL CLI. Use codebase-memory-mcp (Tier 1, always-on) for everyday navigation. Use these for deep analysis when Tier 1 is not enough.

┌────────────────────────────────────────────────────────────────┐
│  CODE PROPERTY GRAPH = AST + CFG + PDG (PDG = CDG + DDG)       │
│  ─────────────────────────────────────────────────────────────│
│  AST  = Abstract Syntax Tree (structure)                       │
│  CFG  = Control Flow Graph (execution paths)                   │
│  CDG  = Control Dependency Graph (conditional dependencies)    │
│  DDG  = Data Dependency Graph (data flow between statements)   │
│  PDG  = Program Dependency Graph (CDG + DDG combined)          │
│                                                                │
│  Tier 2 (Joern): Full CPG with 40+ query tools                │
│  Tier 3 (CodeQL): Interprocedural taint + security queries     │
└────────────────────────────────────────────────────────────────┘
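
To make the layers more concrete, here is a minimal Python sketch that derives data dependency (DDG) edges for straight-line code using Python's own `ast` module. It is a toy illustration, not how Joern builds its graph; the function name `data_dependencies` and the sample snippet are assumptions made for this example.

```python
import ast

def data_dependencies(source: str) -> list[tuple[int, int, str]]:
    """Return (def_line, use_line, variable) edges for straight-line code.

    Toy illustration of DDG edges only: it ignores branches, loops, and
    calls, which a real CPG handles through the CFG and CDG layers.
    """
    last_def: dict[str, int] = {}  # variable -> line of its latest assignment
    edges: list[tuple[int, int, str]] = []
    for stmt in ast.parse(source).body:
        # Variables read by this statement (sorted for stable output)
        reads = sorted({n.id for n in ast.walk(stmt)
                        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)})
        for name in reads:
            if name in last_def:
                edges.append((last_def[name], stmt.lineno, name))
        # Variables written by this statement
        for n in ast.walk(stmt):
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store):
                last_def[n.id] = stmt.lineno
    return edges

snippet = "a = 1\nb = a + 2\nc = a * b\n"
print(data_dependencies(snippet))
# prints [(1, 2, 'a'), (1, 3, 'a'), (2, 3, 'b')]
```

Each edge says "the value defined on the first line is used on the second line". The AST supplies the names, and a full CPG adds the CFG and CDG so that such edges stay correct across branches and loops.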

Tier Selection Guide

Simple symbol lookup, dependency trace, blast radius?
  → Tier 1: codebase-memory-mcp (always on, sub-ms)

Control flow paths, data flow, dead code, complex refactoring?
  → Tier 2: Joern CPG (on-demand, seconds)

Security audit, taint analysis, vulnerability detection?
  → Tier 3: CodeQL (on-demand, seconds to minutes)

Full security review before release?
  → All three tiers in sequence
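
The guide above can be sketched as a small lookup. The task labels (`"symbol_lookup"`, `"dead_code"`, and so on) are hypothetical names chosen for this example, and falling back to Tier 1 for unknown tasks is an assumption, not something the guide prescribes.

```python
from enum import Enum

class Tier(Enum):
    TIER1 = "codebase-memory-mcp"
    TIER2 = "Joern CPG"
    TIER3 = "CodeQL"

# Hypothetical task labels; the mapping mirrors the selection guide.
TASK_TIERS: dict[str, list[Tier]] = {
    "symbol_lookup": [Tier.TIER1],
    "dependency_trace": [Tier.TIER1],
    "blast_radius": [Tier.TIER1],
    "control_flow": [Tier.TIER2],
    "data_flow": [Tier.TIER2],
    "dead_code": [Tier.TIER2],
    "complex_refactoring": [Tier.TIER2],
    "security_audit": [Tier.TIER3],
    "taint_analysis": [Tier.TIER3],
    "vulnerability_detection": [Tier.TIER3],
    "release_security_review": [Tier.TIER1, Tier.TIER2, Tier.TIER3],
}

def select_tiers(task: str) -> list[Tier]:
    """Return the tiers to run, in order (assumed default: cheapest tier)."""
    return TASK_TIERS.get(task, [Tier.TIER1])

print([t.value for t in select_tiers("release_security_review")])
# prints ['codebase-memory-mcp', 'Joern CPG', 'CodeQL']
```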

Tier 2: Joern CPG (CodeBadger MCP)

When to Use Joern

| Scenario | Why Joern | Tier 1 Can't Do This |
| --- | --- | --- |
| Trace data flow through functions | Full DDG traversal | Tier 1 has no data flow |
| Understanding control flow paths | CFG analysis with branch conditions | Tier 1 has no CFG |
| Finding dead/unreachable code | PDG reachability analysis | Tier 1 only detects unused exports |
| Complex refactoring impact | Cross-function dependency chains | Tier 1 limited to call graph |
| Auditing third-party library usage | Deep call chain traversal | Tier 1 stops at import boundary |
| Understanding exception flow | CFG includes throw/catch paths | Tier 1 ignores exceptions |

Key MCP Tools (Joern/CodeBadger)

| Tool | Purpose | Example Query |
| --- | --- | --- |
| generate_cpg | Build CPG for project | First-time setup or after major changes |
| get_cpg_status | Check CPG build status | Verify CPG is ready before querying |
| run_cpgql_query | Run arbitrary CPGQL queries | cpg.method("login").callOut.code.l |
| get_cpgql_syntax_help | Query language reference | When unsure about query syntax |
| get_cfg | Control flow graph for a method | Understand execution paths in a function |
| list_methods | List all methods in project | Overview of available functions |
| get_method_source | Get source code of a method | Read specific function source |
| list_calls | List calls from/to a method | Caller/callee analysis |
| get_call_graph | Full call graph visualization | Understand call chains |
| get_type_definition | Type/class definitions | Understand type hierarchy |
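
As a rough sketch of how these tools are invoked, the Python below builds MCP `tools/call` requests (JSON-RPC 2.0) in the order build, check status, query. The envelope follows the MCP specification, but the argument names (`"path"`, `"query"`) and the project path are assumptions for illustration; the server's own tool schemas (via MCP `tools/list`) are the authority.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids

def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request (JSON-RPC 2.0 envelope)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

def query_plan(project_path: str, cpgql: str) -> list[dict]:
    """Requests in a sensible order: build the CPG, check status, then query.

    Argument names ("path", "query") are hypothetical; check the tool
    schemas exposed by the server before relying on them.
    """
    return [
        tool_call("generate_cpg", {"path": project_path}),
        tool_call("get_cpg_status", {"path": project_path}),
        tool_call("run_cpgql_query", {"query": cpgql}),
    ]

for request in query_plan("./my-project", 'cpg.method("login").callOut.code.l'):
    print(json.dumps(request))
```

A real client would send each request to the configured MCP endpoint and wait for `get_cpg_status` to report a finished build before sending the query.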

Supported Languages (Joern)

Java, Scala, C/C++, Python, JavaScript, TypeScript, PHP, Ruby, Go, Kotlin, Swift, Lua

Not supported: Rust (use CodeQL for Rust)

MCP Configuration (Joern)

{
  "mcpServers": {
    "codebadger": {
      "url": "http://localhost:4242/mcp",
      "type": "http"
    }
  }
}

Prerequisites

  • Docker (for Joern backend)
  • Python 3.10+ (for MCP server)
  • Install: ~/.claude/install-graph-tools.sh --joern

Common CPGQL Queries

// Find all methods that handle user input
cpg.method.where(_.parameter.name(".*input.*|.*request.*")).name.l

// Trace data flow from parameter to return (pattern: sink.reachableBy(source))
cpg.method("processPayment").methodReturn.reachableBy(cpg.method("processPayment").parameter).l

// Find methods with many control structures (a rough proxy for cyclomatic complexity)
cpg.method.filter(_.controlStructure.size > 10).name.l

// Dead code candidates: internal methods with no callers
cpg.method.internal.filter(_.callIn.isEmpty).filter(_.name != "main").name.l

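// Exception flow: methods that throw, whose calling methods contain no try block
// (assumes the language frontend models throw/try as control structures)
cpg.method.where(_.controlStructure.isThrow).callIn.method.whereNot(_.controlStructure.isTry).name.dedup.l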
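
In CPGQL the data-flow step reads `sink.reachableBy(source)`: the search starts at the sink and walks data dependencies backward. The Python sketch below mimics that direction on a toy graph. It is not Joern's engine, and the node names (`amount`, `total`, `<return>`, `logger`) are hypothetical.

```python
from collections import deque

def reachable_by(ddg: dict[str, set[str]], sink: str, sources: set[str]) -> set[str]:
    """Return the sources from which data can flow into `sink`.

    `ddg` maps each node to the nodes it passes data to (forward edges).
    The search walks those edges backward from the sink, mirroring the
    direction of `sink.reachableBy(source)`.
    """
    # Invert the forward edges once so we can walk from sink toward sources.
    preds: dict[str, set[str]] = {}
    for src, dsts in ddg.items():
        for dst in dsts:
            preds.setdefault(dst, set()).add(src)

    seen, queue, hits = {sink}, deque([sink]), set()
    while queue:
        node = queue.popleft()
        if node in sources:
            hits.add(node)
        for pred in preds.get(node, ()):
            if pred not in seen:
                seen.add(pred)
                queue.append(pred)
    return hits

# Hypothetical flow: amount -> total -> return value; `logger` never
# influences the return value.
ddg = {"amount": {"total"}, "total": {"<return>"}, "logger": {"log_call"}}
print(reachable_by(ddg, "<return>", {"amount", "logger"}))
# prints {'amount'}
```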

Tier 3: CodeQL

When to Use CodeQL

| Scenario | Why CodeQL | Other Tiers Can't Do This |
| --- | --- | --- |
| Security audit before release | Interprocedural taint analysis | Joern has basic taint, CodeQL is deeper |
| Reviewing auth/payment code | Data flow from source to sink | Cross-function, cross-file taint |
| PR security review | Targeted vulnerability scan | Pre-built OWASP query packs |
| Compliance checking | CWE/OWASP pattern matching | Curated security query suites |
| Rust security analysis | Full Rust support | Joern doesn't support Rust |

Key MCP Tools (CodeQL)

| Tool | Purpose |
| --- | --- |
| run_query | Execute a CodeQL query against the database |
| find_definitions | Locate symbol definitions |
| find_references | Find all references to a symbol |
| get_results | Parse BQRS (Binary Query Result Sets) |

Supported Languages (CodeQL)

C/C++, C#, Go, Java, Kotlin, JavaScript, TypeScript, Python, Ruby, Swift, Rust

MCP Configuration (CodeQL)

{
  "mcpServers": {
    "codeql": {
      "command": "codeql-mcp",
      "args": ["--database", ".code-graph/codeql-db"]
    }
  }
}

Prerequisites

  • CodeQL CLI (brew install codeql on macOS)
  • Install: ~/.claude/install-graph-tools.sh --codeql
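
The MCP configuration above points at a database in `.code-graph/codeql-db`. If the install script does not create it for you (this document does not say either way), the CodeQL CLI can. The sketch below only builds the command lines; the helper name `codeql_commands`, the query pack `codeql/python-queries`, and the output file name are assumptions for a Python project.

```python
import shlex

def codeql_commands(source_root: str, language: str,
                    db_path: str = ".code-graph/codeql-db",
                    query_pack: str = "codeql/python-queries",
                    output: str = "results.sarif") -> list[list[str]]:
    """Build (not run) the CLI calls that create and analyze a database."""
    create = ["codeql", "database", "create", db_path,
              f"--language={language}", f"--source-root={source_root}"]
    analyze = ["codeql", "database", "analyze", db_path, query_pack,
               "--format=sarif-latest", f"--output={output}"]
    return [create, analyze]

for cmd in codeql_commands(".", "python"):
    print(shlex.join(cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` would execute it. Compiled languages may also need a build command when creating the database, and the query pack may need to be downloaded first.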

Common CodeQL Patterns

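/**
 * SQL injection: user input flows to a SQL query (Python).
 * @kind path-problem
 */
import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.dataflow.new.RemoteFlowSources
import semmle.python.Concepts

module SqlInjectionConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node node) { node instanceof RemoteFlowSource }
  predicate isSink(DataFlow::Node node) { node = any(SqlExecution e).getSql() }
}

module SqlInjectionFlow = TaintTracking::Global<SqlInjectionConfig>;
import SqlInjectionFlow::PathGraph

from SqlInjectionFlow::PathNode source, SqlInjectionFlow::PathNode sink
where SqlInjectionFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "SQL injection from $@.", source.getNode(), "user input"

// Unvalidated redirect: reuse the configuration above with a different sink, e.g.
//   predicate isSink(DataFlow::Node node) {
//     node = any(Http::Server::HttpRedirectResponse r).getRedirectLocation()
//   }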
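
What a taint query computes can be illustrated with a few lines of Python: taint starts at the sources, follows flow edges forward, and stops at sanitizers. This is a toy worklist sketch, not CodeQL's algorithm, and every node name in the sample graph is hypothetical.

```python
def tainted_sinks(flows: dict[str, list[str]], sources: set[str],
                  sinks: set[str], sanitizers: set[str]) -> set[str]:
    """Return the sinks reachable from a source without passing a sanitizer."""
    tainted: set[str] = set()
    worklist = list(sources)
    while worklist:
        node = worklist.pop()
        if node in tainted or node in sanitizers:
            continue  # already visited, or taint is cleaned here
        tainted.add(node)
        worklist.extend(flows.get(node, []))
    return tainted & sinks

# Hypothetical flow graph spanning several functions
flows = {
    "request.id": ["build_query"],
    "build_query": ["cursor.execute"],
    "request.next": ["validate_url"],
    "validate_url": ["redirect"],
}
print(tainted_sinks(
    flows,
    sources={"request.id", "request.next"},
    sinks={"cursor.execute", "redirect"},
    sanitizers={"validate_url"},
))
# prints {'cursor.execute'}
```

The redirect is not reported because its only path runs through the sanitizer, which is the same role a barrier plays in a CodeQL data flow configuration.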

Combined Workflow: Deep Analysis

When performing security review or complex refactoring, use all tiers:

1. SCOPE       → Tier 1: detect_changes / get_architecture
                 Identify files and modules in scope

2. STRUCTURE   → Tier 1: search_graph / trace_call_path
                 Map the call graph and dependencies

3. FLOW        → Tier 2: get_cfg / run_cpgql_query
                 Analyze control flow and data flow paths

4. SECURITY    → Tier 3: run_query with taint analysis
                 Check for vulnerabilities in data paths

5. REPORT      → Combine findings from all tiers
                 Prioritize: Critical > High > Medium > Low
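
The REPORT step can be sketched as a merge of per-tier findings followed by a severity sort. The `Finding` fields, file locations, and messages below are made up for illustration; only the ordering Critical > High > Medium > Low comes from the workflow above.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

@dataclass(frozen=True)
class Finding:
    tier: int       # 1, 2, or 3: which tier produced the finding
    severity: str   # one of SEVERITY_ORDER's keys
    location: str   # hypothetical "file:line" string
    message: str

def build_report(findings: list[Finding]) -> list[Finding]:
    """Drop exact duplicates, then sort Critical > High > Medium > Low."""
    unique = list(dict.fromkeys(findings))  # keeps first-seen order
    return sorted(unique, key=lambda f: (SEVERITY_ORDER[f.severity], f.location))

findings = [
    Finding(2, "Medium", "auth/session.py:88", "Unreachable branch after early return"),
    Finding(3, "Critical", "payments/charge.py:42", "Tainted input reaches SQL execution"),
    Finding(1, "Low", "utils/format.py:10", "Unused export"),
    Finding(3, "Critical", "payments/charge.py:42", "Tainted input reaches SQL execution"),
]
for f in build_report(findings):
    print(f"[{f.severity}] Tier {f.tier} {f.location}: {f.message}")
```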

Anti-Patterns

| Anti-Pattern | Do This Instead |
| --- | --- |
| Using Joern/CodeQL for simple symbol lookup | Use Tier 1 search_graph (sub-ms vs seconds) |
| Running full CPG build on every commit | Build CPG on-demand; use Tier 1 for continuous monitoring |
| Querying Joern without checking get_cpg_status | Always verify CPG is built and current before querying |
| Running CodeQL without a specific security question | Have a hypothesis first; CodeQL queries are expensive |
| Ignoring Tier 1 blast radius before deep analysis | Always scope with Tier 1 first, then go deep on flagged areas |
| Using CodeQL for non-security structural queries | Use Joern CPGQL for structural/flow queries; CodeQL for security |
