Kubernetes Specialist

Use when deploying or managing Kubernetes workloads. Invoke to create deployment manifests, apply Pod Security Standards, set up service accounts, define network isolation rules, debug pod crashes, analyze resource limits, inspect container logs, or right-size workloads. Use for Helm charts, RBAC policies, NetworkPolicies, storage configuration, performance optimization, GitOps pipelines, and multi-cluster management.

Published by @Jeffallan

When to Use This Skill

  • Deploying workloads (Deployments, StatefulSets, DaemonSets, Jobs)
  • Configuring networking (Services, Ingress, NetworkPolicies)
  • Managing configuration (ConfigMaps, Secrets, environment variables)
  • Setting up persistent storage (PV, PVC, StorageClasses)
  • Creating Helm charts for application packaging
  • Troubleshooting cluster and workload issues
  • Implementing security best practices

Core Workflow

  1. Analyze requirements — Understand workload characteristics, scaling needs, security requirements
  2. Design architecture — Choose workload types, networking patterns, storage solutions
  3. Implement manifests — Create declarative YAML with proper resource limits, health checks
  4. Secure — Apply RBAC, NetworkPolicies, Pod Security Standards, least privilege
  5. Validate — Run kubectl rollout status, kubectl get pods -w, and kubectl describe pod <name> to confirm health; roll back with kubectl rollout undo if needed

Reference Guide

Load detailed guidance based on context:

| Topic | Reference | Load When |
|-------|-----------|-----------|
| Workloads | references/workloads.md | Deployments, StatefulSets, DaemonSets, Jobs, CronJobs |
| Networking | references/networking.md | Services, Ingress, NetworkPolicies, DNS |
| Configuration | references/configuration.md | ConfigMaps, Secrets, environment variables |
| Storage | references/storage.md | PV, PVC, StorageClasses, CSI drivers |
| Helm Charts | references/helm-charts.md | Chart structure, values, templates, hooks, testing, repositories |
| Troubleshooting | references/troubleshooting.md | kubectl debug, logs, events, common issues |
| Custom Operators | references/custom-operators.md | CRD, Operator SDK, controller-runtime, reconciliation |
| Service Mesh | references/service-mesh.md | Istio, Linkerd, traffic management, mTLS, canary |
| GitOps | references/gitops.md | ArgoCD, Flux, progressive delivery, sealed secrets |
| Cost Optimization | references/cost-optimization.md | VPA, HPA tuning, spot instances, quotas, right-sizing |
| Multi-Cluster | references/multi-cluster.md | Cluster API, federation, cross-cluster networking, DR |

Constraints

MUST DO

  • Use declarative YAML manifests to create and change resources (reserve imperative kubectl commands for inspection, debugging, and rollback)
  • Set resource requests and limits on all containers
  • Include liveness and readiness probes
  • Use secrets for sensitive data (never hardcode credentials)
  • Apply least privilege RBAC permissions
  • Implement NetworkPolicies for network segmentation
  • Use namespaces for logical isolation
  • Label resources consistently for organization
  • Document configuration decisions in annotations
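The namespace, labeling, and annotation rules above, together with the Pod Security Standards named in the workflow, can be sketched in a single Namespace manifest. This is an illustrative example, not a prescribed layout: the app.kubernetes.io/part-of value and the example.com/* annotation keys are hypothetical, and the choice to enforce "baseline" while only warning on "restricted" is an assumption that suits a gradual rollout.

```yaml
# Sketch only: label values and annotation keys are assumptions.
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace
  labels:
    app.kubernetes.io/part-of: my-platform          # hypothetical grouping label
    # Pod Security Admission: reject pods that violate "baseline",
    # and warn/audit on anything that would fail "restricted".
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
  annotations:
    example.com/owner: "platform-team"              # hypothetical annotation key
    example.com/decision: "baseline enforced until all workloads pass restricted"
```

Once the warnings are clean, the enforce label can be raised to restricted.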

MUST NOT DO

  • Deploy to production without resource limits
  • Store secrets in ConfigMaps or as hardcoded environment variable values in manifests
  • Use the default ServiceAccount for application pods
  • Allow unrestricted network access (default allow-all)
  • Run containers as root without justification
  • Skip health checks (liveness/readiness probes)
  • Use the :latest image tag in production
  • Expose unnecessary ports or services
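One way to backstop the resource-limit rule at the namespace level is a LimitRange, which applies default requests and limits to containers that omit them. A minimal sketch, assuming the my-namespace namespace; the object name and the values are illustrative and should be sized for the actual workloads:

```yaml
# Sketch only: name and values are assumptions.
apiVersion: v1
kind: LimitRange
metadata:
  name: default-container-limits
  namespace: my-namespace
spec:
  limits:
    - type: Container
      defaultRequest:        # applied when a container sets no requests
        cpu: "100m"
        memory: "128Mi"
      default:               # applied when a container sets no limits
        cpu: "500m"
        memory: "512Mi"
```

This does not replace explicit requests and limits in each manifest. It only keeps a forgotten container from running unbounded.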

Common YAML Patterns

Deployment with resource limits, probes, and security context

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: my-namespace
  labels:
    app: my-app
    version: "1.2.3"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
        version: "1.2.3"
    spec:
      serviceAccountName: my-app-sa   # never use default SA
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000
        seccompProfile:
          type: RuntimeDefault   # required by the "restricted" Pod Security Standard
      containers:
        - name: my-app
          image: my-registry/my-app:1.2.3   # never use latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
          envFrom:
            - secretRef:
                name: my-app-secret   # pull credentials from Secret, not ConfigMap
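The Deployment above reads its credentials from a Secret named my-app-secret. A minimal sketch of that object follows, assuming a single hypothetical key; with envFrom, each key becomes an environment variable of the same name. The value shown is a placeholder, not something to commit: in a GitOps setup the real value would typically come from a sealed or externally managed secret.

```yaml
# Sketch only: the key name and placeholder value are assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: my-app-secret
  namespace: my-namespace
type: Opaque
stringData:                                    # plain text in the manifest, stored base64-encoded in .data
  DATABASE_PASSWORD: "<set-at-deploy-time>"    # hypothetical key; never commit a real value
```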

Minimal RBAC (least privilege)

apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app-sa
  namespace: my-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-app-role
  namespace: my-namespace
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list"]   # grant only what is needed
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-app-rolebinding
  namespace: my-namespace
subjects:
  - kind: ServiceAccount
    name: my-app-sa
    namespace: my-namespace
roleRef:
  kind: Role
  name: my-app-role
  apiGroup: rbac.authorization.k8s.io
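If the application only needs one known ConfigMap, the Role can be narrowed further with resourceNames. The sketch below is an alternative to my-app-role, assuming a hypothetical ConfigMap called my-app-config. It grants only get, because list requests are not reliably restricted by resource name.

```yaml
# Sketch only: the Role name and ConfigMap name are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-app-role-narrow
  namespace: my-namespace
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["my-app-config"]   # hypothetical ConfigMap name
    verbs: ["get"]                     # read a single named object only
```

To use it, point the roleRef of the RoleBinding at my-app-role-narrow instead of my-app-role.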

NetworkPolicy (default-deny + explicit allow)

# Deny all ingress and egress by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: my-namespace
spec:
  podSelector: {}
  policyTypes: ["Ingress", "Egress"]
---
# Allow only specific traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-my-app
  namespace: my-namespace
spec:
  podSelector:
    matchLabels:
      app: my-app
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
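Because default-deny-all also blocks egress, pods in the namespace can no longer reach cluster DNS, so service names stop resolving. A companion policy along these lines is usually needed. It is a sketch under two assumptions: the DNS pods run in kube-system and carry the k8s-app: kube-dns label, which is common but not universal. Clusters using a node-local DNS cache need a different rule.

```yaml
# Sketch only: the DNS pod label and namespace are assumptions about the cluster.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: my-namespace
spec:
  podSelector: {}                  # applies to every pod in the namespace
  policyTypes: ["Egress"]
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns    # assumption: label on CoreDNS/kube-dns pods
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Any other outbound traffic, such as frontend pods calling my-app on port 8080, needs its own explicit egress rule.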

Validation Commands

After deploying, verify health and security posture:

# Watch rollout complete
kubectl rollout status deployment/my-app -n my-namespace

# Watch pod status changes to catch crash loops or image pull errors
kubectl get pods -n my-namespace -w

# Inspect a specific pod for failures
kubectl describe pod <pod-name> -n my-namespace

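# Check container logs for the running container
kubectl logs <pod-name> -n my-namespace

# After a crash, read the logs of the previous (terminated) container instance
kubectl logs <pod-name> -n my-namespace --previous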

# Check current CPU and memory usage (requires metrics-server) and compare it with the configured limits
kubectl top pods -n my-namespace

# Audit RBAC permissions for a service account within its namespace
kubectl auth can-i --list --as=system:serviceaccount:my-namespace:my-app-sa -n my-namespace

# Roll back a failed deployment
kubectl rollout undo deployment/my-app -n my-namespace

Output Templates

When implementing Kubernetes resources, provide:

  1. Complete YAML manifests with proper structure
  2. RBAC configuration if needed (ServiceAccount, Role, RoleBinding)
  3. NetworkPolicy for network isolation
  4. Brief explanation of design decisions and security considerations

Documentation

Bundled with this artifact

13 files

Reference files that ship alongside this artifact. Agents pull these in only when the task needs them.
