Waterfall Enrichment

Execute multi-provider enrichment waterfalls with credit-aware routing, validation, and export options.

Published by @gtmagents

Waterfall Enrichment Command

Execute multi-provider enrichment waterfalls to maximize data discovery success rates while optimizing credit usage.

Command Syntax

/data-enrichment:waterfall --type <email|phone|company|full> --input <data> [--max-credits <limit>] [options]

Parameters

  • --type: Type of waterfall (email, phone, company, full)
  • --input: Input data (name+company, email, domain, CSV file)
  • --max-credits: Maximum credits to spend per record (default: 10)
  • --providers: Specific provider sequence (optional, uses optimized defaults)
  • --validate: Validate discovered data (default: true)
  • --cache: Use cached results (default: true; TTL is 30 days for email and varies by data type)
  • --parallel: Process multiple records in parallel (default: true)
  • --output: Output format (json|csv|salesforce|hubspot)
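
As a rough illustration of how the parameters and defaults above fit together, the sketch below models them with Python's argparse. The parser, the `build_parser` function, and the `str_to_bool` helper are assumptions made for illustration; how the command actually parses its arguments is not described in this document, and no default is assumed for --output because none is stated.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Hypothetical helper: parse 'true'/'false' style flag values."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented flags and defaults (illustrative only)."""
    parser = argparse.ArgumentParser(prog="waterfall")
    parser.add_argument("--type", required=True,
                        choices=["email", "phone", "company", "full"])
    parser.add_argument("--input", required=True)
    parser.add_argument("--max-credits", type=float, default=10)
    # Comma-separated provider list; None means "use optimized defaults".
    parser.add_argument("--providers", default=None,
                        type=lambda s: [p.strip() for p in s.split(",")])
    parser.add_argument("--validate", type=str_to_bool, default=True)
    parser.add_argument("--cache", type=str_to_bool, default=True)
    parser.add_argument("--parallel", type=str_to_bool, default=True)
    parser.add_argument("--output",
                        choices=["json", "csv", "salesforce", "hubspot"])
    return parser


args = build_parser().parse_args(
    ["--type", "email", "--input", "John Smith, Acme Corp",
     "--providers", "apollo,hunter"]
)
print(args.type, args.max_credits, args.providers)
```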

Waterfall Sequences

Email Discovery Waterfall

Default Sequence:
  1. Cache Check (0 credits)
  2. Apollo.io (1-2 credits)
  3. Hunter (1-2 credits)
  4. RocketReach (1-2 credits)
  5. People Data Labs (1-2 credits)
  6. ContactOut (1-2 credits)
  7. Findymail (1-2 credits)
  8. BetterContact (2-5 credits)
  9. AI Web Research (2-5 credits)
  
Validation:
  - ZeroBounce (0.5 credits)
  - NeverBounce backup (0.5 credits)
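
The sequence above amounts to trying each provider in order and stopping at the first hit, while keeping per-record spend within --max-credits. A minimal sketch of that loop follows. The `Provider` class, `run_waterfall`, the stub lookups, and the placeholder email addresses are all hypothetical, and the sketch assumes credits are charged for every attempt, including misses, which this document does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Provider:
    name: str
    cost: float  # credits charged per lookup (illustrative value)
    lookup: Callable[[dict], Optional[str]]  # returns an email or None


def run_waterfall(record: dict, providers: List[Provider],
                  max_credits: float) -> dict:
    """Try providers in order; stop at the first hit."""
    credits_used = 0.0
    tried = []
    for provider in providers:
        if credits_used + provider.cost > max_credits:
            continue  # too expensive now; a cheaper provider may still fit
        tried.append(provider.name)
        credits_used += provider.cost  # assumption: misses are charged too
        email = provider.lookup(record)
        if email is not None:
            return {"email": email, "providers_used": tried,
                    "credits_used": credits_used}
    return {"email": None, "providers_used": tried,
            "credits_used": credits_used}


# Stub providers standing in for real API clients.
providers = [
    Provider("cache", 0.0, lambda r: None),
    Provider("apollo", 1.0, lambda r: None),
    Provider("hunter", 1.0, lambda r: "jsmith@example.com"),
    Provider("rocketreach", 1.0, lambda r: "unused@example.com"),
]
result = run_waterfall({"name": "John Smith", "company": "Acme Corp"},
                       providers, max_credits=10)
print(result)
```

Because the loop returns on the first hit, RocketReach is never called in this run and costs nothing.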

Phone Discovery Waterfall

Default Sequence:
  1. Cache Check (0 credits)
  2. Apollo.io (1-2 credits)
  3. RocketReach (1-2 credits)
  4. LeadMagic (1-2 credits)
  5. SignalHire (1-2 credits)
  6. BetterContact Phone (2-5 credits)
  7. People Data Labs (1-2 credits)
  
Validation:
  - ClearoutPhone (0.5 credits)
  - Phone type detection

Company Enrichment Waterfall

Default Sequence:
  1. Clearbit (1-2 credits)
  2. Ocean.io (2-3 credits)
  3. ZoomInfo (2-3 credits) [if enterprise]
  4. Crunchbase (1-2 credits) [if funded]
  5. BuiltWith (1-2 credits) [technographics]
  6. HG Insights (2-3 credits) [tech spend]
  7. Intent providers (3-5 credits) [if qualified]
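
The bracketed conditions in the sequence above suggest that some steps run only for certain companies. One way to model this is to pair each step with an optional predicate, as sketched below. The field names (`employee_count`, `total_funding`, `qualified`) and the 1000-employee threshold for "enterprise" are assumptions for illustration; this document does not define those conditions.

```python
from typing import Callable, List, Optional, Tuple

Condition = Optional[Callable[[dict], bool]]

# (provider, condition) pairs; None means the step always runs.
# Field names and thresholds are hypothetical.
COMPANY_STEPS: List[Tuple[str, Condition]] = [
    ("clearbit", None),
    ("ocean_io", None),
    ("zoominfo", lambda c: c.get("employee_count", 0) >= 1000),  # if enterprise
    ("crunchbase", lambda c: c.get("total_funding", 0) > 0),     # if funded
    ("builtwith", None),
    ("hg_insights", None),
    ("intent", lambda c: bool(c.get("qualified", False))),       # if qualified
]


def applicable_steps(company: dict) -> List[str]:
    """Return the providers whose condition holds for this company."""
    return [name for name, cond in COMPANY_STEPS
            if cond is None or cond(company)]


startup = {"employee_count": 50, "total_funding": 2_000_000}
print(applicable_steps(startup))
```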

Full Contact Enrichment

Comprehensive Sequence:
  1. Email discovery waterfall
  2. Phone discovery waterfall
  3. Social profile discovery
  4. Company enrichment
  5. Technographics
  6. Intent signals
  7. Validation & scoring

Examples

Basic Email Discovery

/data-enrichment:waterfall \
  --type email \
  --input "John Smith, Acme Corp"

Bulk Email Enrichment with Validation

/data-enrichment:waterfall \
  --type email \
  --input "prospects.csv" \
  --validate true \
  --max-credits 5

Custom Provider Sequence

/data-enrichment:waterfall \
  --type email \
  --input "[email protected]" \
  --providers "clearbit,apollo,hunter" \
  --validate true

Enterprise Full Enrichment

/data-enrichment:waterfall \
  --type full \
  --input "target_accounts.csv" \
  --max-credits 20 \
  --output salesforce

Provider Selection Logic

def select_providers(input_type, data_available, target_quality):
    providers = []
    
    # Email discovery logic
    if input_type == "email":
        if has_linkedin_url(data_available):
            providers = ["contactout", "rocketreach", "apollo"]
        elif has_full_name_and_company(data_available):
            providers = ["apollo", "hunter", "rocketreach"]
        elif has_domain_only(data_available):
            providers = ["hunter", "apollo", "clearbit"]
        else:
            providers = ["people_data_labs", "bettercontact", "ai_research"]
    
    # Phone discovery logic
    elif input_type == "phone":
        if has_email(data_available):
            providers = ["apollo", "rocketreach", "leadmagic"]
        else:
            providers = ["bettercontact_phone", "signalhire", "lusha"]
    
    # Quality-based filtering
    if target_quality == "high":
        providers = filter_high_accuracy_providers(providers)
    
    return providers
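
select_providers relies on several predicate helpers that are not defined in this document. The sketch below shows one plausible shape for them, assuming `data_available` is a dict with optional `linkedin_url`, `name`, `company`, `domain`, and `email` keys. Those key names and the checks themselves are assumptions, not the command's actual logic.

```python
def has_linkedin_url(data: dict) -> bool:
    """True when the record carries a LinkedIn profile URL."""
    return "linkedin.com/in/" in (data.get("linkedin_url") or "")


def has_full_name_and_company(data: dict) -> bool:
    """True when there is a first and last name plus a company."""
    name_parts = (data.get("name") or "").split()
    company = (data.get("company") or "").strip()
    return len(name_parts) >= 2 and bool(company)


def has_domain_only(data: dict) -> bool:
    """True when a domain is known but no person name is."""
    return bool(data.get("domain")) and not data.get("name")


def has_email(data: dict) -> bool:
    """Loose shape check only; this is not deliverability validation."""
    email = data.get("email") or ""
    return email.count("@") == 1 and "." in email.split("@")[1]


print(has_full_name_and_company({"name": "John Smith", "company": "Acme Corp"}))
```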

Credit Optimization

Smart Routing Algorithm

def optimize_provider_sequence(providers, max_credits, historical_success):
    # Score each provider on success rate, credit cost, and data quality
    scored_providers = []
    
    for provider in providers:
        score = calculate_efficiency_score(
            success_rate=historical_success[provider],
            credit_cost=PROVIDER_COSTS[provider],
            data_quality=PROVIDER_QUALITY[provider]
        )
        scored_providers.append((provider, score))
    
    # Sort by efficiency score
    scored_providers.sort(key=lambda x: x[1], reverse=True)
    
    # Build sequence within credit limit
    sequence = []
    remaining_credits = max_credits
    
    for provider, _score in scored_providers:
        # Greedy fill: skip any provider that no longer fits the remaining budget
        if PROVIDER_COSTS[provider] <= remaining_credits:
            sequence.append(provider)
            remaining_credits -= PROVIDER_COSTS[provider]
    
    return sequence
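
The routing code above calls calculate_efficiency_score without defining it. One simple possibility is expected quality-weighted hits per credit, sketched below. The formula and the `PROVIDER_QUALITY` values are assumptions; the success rates and average credit costs are the per-provider figures listed under Success Metrics.

```python
# Average credits and success rates per provider, from the metrics section.
PROVIDER_COSTS = {"apollo": 1.5, "hunter": 1.2, "zoominfo": 2.5}
historical_success = {"apollo": 0.75, "hunter": 0.70, "zoominfo": 0.90}
# Quality weights are hypothetical placeholders.
PROVIDER_QUALITY = {"apollo": 0.90, "hunter": 0.85, "zoominfo": 0.95}


def calculate_efficiency_score(success_rate: float, credit_cost: float,
                               data_quality: float) -> float:
    """Quality-weighted success per credit (an assumed formula)."""
    # Floor the cost so zero-cost sources such as a cache do not divide by zero.
    return success_rate * data_quality / max(credit_cost, 0.1)


ranked = sorted(
    historical_success,
    key=lambda p: calculate_efficiency_score(
        success_rate=historical_success[p],
        credit_cost=PROVIDER_COSTS[p],
        data_quality=PROVIDER_QUALITY[p],
    ),
    reverse=True,
)
print(ranked)
```

Under this formula a cheap provider with a moderate hit rate can outrank an expensive one with a high hit rate, which is the trade-off credit-aware routing is meant to capture.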

Success Metrics

Tracking Performance

Metrics:
  success_rate:
    email_found: 85%
    phone_found: 65%
    company_enriched: 95%
  
  average_credits:
    email: 2.3 credits
    phone: 3.1 credits
    company: 4.5 credits
    full_contact: 8.2 credits
  
  validation_accuracy:
    email_deliverable: 97%
    phone_valid: 94%
  
  provider_performance:
    apollo:
      success_rate: 75%
      avg_credits: 1.5
    hunter:
      success_rate: 70%
      avg_credits: 1.2
    zoominfo:
      success_rate: 90%
      avg_credits: 2.5
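
Per-provider figures like those above could be derived from a log of individual lookups. The sketch below assumes a hypothetical log record shape (`provider`, `found`, `credits`) and uses made-up sample rows; it is not the command's actual tracking code.

```python
from statistics import mean


def summarize(runs):
    """Group lookup logs by provider and compute success rate and avg credits."""
    by_provider = {}
    for run in runs:
        by_provider.setdefault(run["provider"], []).append(run)
    return {
        provider: {
            "success_rate": sum(r["found"] for r in rows) / len(rows),
            "avg_credits": mean(r["credits"] for r in rows),
        }
        for provider, rows in by_provider.items()
    }


# Illustrative sample data only.
runs = [
    {"provider": "apollo", "found": True, "credits": 1.0},
    {"provider": "apollo", "found": True, "credits": 2.0},
    {"provider": "apollo", "found": False, "credits": 1.5},
    {"provider": "apollo", "found": True, "credits": 1.5},
    {"provider": "hunter", "found": False, "credits": 1.0},
]
summary = summarize(runs)
print(summary["apollo"])
```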

Error Handling

Provider Failures

def handle_provider_failure(provider, error, context):
    # Log failure
    log_provider_error(provider, error)
    
    # Determine action
    if is_rate_limit(error):
        # Exponential backoff, then retry the same provider
        wait_time = calculate_backoff(provider)
        return schedule_retry(provider, context, wait_time)

    elif is_auth_error(error):
        # Alert and skip provider
        alert_admin(f"Auth failed for {provider}")
        return next_provider()

    elif is_data_not_found(error):
        # Continue to next provider
        return next_provider()

    else:
        # Generic error - retry once, then skip
        if not has_retried(provider, context):
            return retry_provider(provider, context)
        return next_provider()
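
The handler above calls calculate_backoff(provider) without showing how the wait time is chosen. A common approach is exponential backoff with jitter, sketched below. This version takes an attempt count rather than a provider name, and the `base` and `cap` values are assumptions; the actual helper presumably tracks attempts per provider in some way this document does not describe.

```python
import random


def calculate_backoff(attempt: int, base: float = 1.0,
                      cap: float = 60.0) -> float:
    """Exponential backoff with full jitter.

    Returns a random wait in seconds between 0 and
    min(cap, base * 2**attempt).
    """
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)


for attempt in range(5):
    print(attempt, round(calculate_backoff(attempt), 2))
```

The jitter spreads retries out so that many records hitting the same rate limit do not all retry at the same instant.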

Output Formats

JSON Output

{
  "input": {
    "name": "John Smith",
    "company": "Acme Corp"
  },
  "results": {
    "email": "[email protected]",
    "email_confidence": 95,
    "email_deliverable": true,
    "phone": "+1-555-0123",
    "phone_type": "mobile",
    "phone_valid": true,
    "linkedin": "linkedin.com/in/johnsmith",
    "providers_used": ["apollo", "zerobounce"],
    "credits_used": 2.5
  },
  "metadata": {
    "enriched_at": "2024-01-20T10:30:00Z",
    "cache_hit": false,
    "processing_time": 1.2
  }
}

CSV Output

name,company,email,email_confidence,phone,phone_type,linkedin,credits_used
John Smith,Acme Corp,[email protected],95,+1-555-0123,mobile,linkedin.com/in/johnsmith,2.5
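
As a sketch of how the JSON result could be flattened into the CSV layout above, the following uses Python's csv module. The `to_csv` function is hypothetical, the email address is a placeholder, and fields that have no CSV column (such as `email_deliverable`) are simply dropped.

```python
import csv
import io

CSV_COLUMNS = ["name", "company", "email", "email_confidence",
               "phone", "phone_type", "linkedin", "credits_used"]


def to_csv(records):
    """Flatten enrichment results into CSV text with the documented columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_COLUMNS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Merge the input fields with the results; extra keys are ignored.
        writer.writerow({**record["input"], **record["results"]})
    return buffer.getvalue()


record = {
    "input": {"name": "John Smith", "company": "Acme Corp"},
    "results": {
        "email": "jsmith@example.com",  # placeholder address
        "email_confidence": 95,
        "email_deliverable": True,
        "phone": "+1-555-0123",
        "phone_type": "mobile",
        "linkedin": "linkedin.com/in/johnsmith",
        "credits_used": 2.5,
    },
}
print(to_csv([record]))
```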

Salesforce Format

{
  "Lead": {
    "FirstName": "John",
    "LastName": "Smith",
    "Company": "Acme Corp",
    "Email": "[email protected]",
    "Phone": "+1-555-0123",
    "LinkedIn__c": "linkedin.com/in/johnsmith",
    "Enrichment_Score__c": 95,
    "Last_Enriched__c": "2024-01-20T10:30:00Z"
  }
}
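
The mapping from the JSON result to the Lead payload above might look like the sketch below. The `to_salesforce_lead` function is hypothetical, the name is split naively on the first space, and sourcing `Enrichment_Score__c` from `email_confidence` is an assumption based only on the two example values matching.

```python
def to_salesforce_lead(record: dict) -> dict:
    """Map an enrichment result onto the documented Lead fields."""
    # Naive split: everything after the first space becomes the last name.
    first, _, last = record["input"]["name"].partition(" ")
    results = record["results"]
    return {
        "Lead": {
            "FirstName": first,
            "LastName": last,
            "Company": record["input"]["company"],
            "Email": results.get("email"),
            "Phone": results.get("phone"),
            "LinkedIn__c": results.get("linkedin"),
            # Assumption: the score is taken from email_confidence.
            "Enrichment_Score__c": results.get("email_confidence"),
            "Last_Enriched__c": record["metadata"]["enriched_at"],
        }
    }


record = {
    "input": {"name": "John Smith", "company": "Acme Corp"},
    "results": {"email": "jsmith@example.com",  # placeholder address
                "email_confidence": 95,
                "phone": "+1-555-0123",
                "linkedin": "linkedin.com/in/johnsmith"},
    "metadata": {"enriched_at": "2024-01-20T10:30:00Z"},
}
lead = to_salesforce_lead(record)
print(lead["Lead"]["FirstName"], lead["Lead"]["LastName"])
```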

Caching Strategy

Cache Management

CACHE_CONFIG = {
    "email": {
        "ttl_days": 30,
        "refresh_if_bounced": True
    },
    "phone": {
        "ttl_days": 60,
        "refresh_if_invalid": True
    },
    "company": {
        "ttl_days": 90,
        "refresh_on_trigger": ["funding", "acquisition", "ipo"]
    },
    "intent": {
        "ttl_days": 7,
        "always_refresh": True
    }
}
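
A cache lookup under this configuration needs to decide whether a stored entry is still usable. The sketch below is one way to apply the TTLs, using a trimmed copy of the configuration. The `should_use_cache` function is hypothetical, and it assumes `always_refresh` takes precedence over `ttl_days`, which this document does not state.

```python
from datetime import datetime, timedelta, timezone

# Trimmed copy of the configuration above (TTL and always_refresh only).
CACHE_CONFIG = {
    "email": {"ttl_days": 30},
    "phone": {"ttl_days": 60},
    "company": {"ttl_days": 90},
    "intent": {"ttl_days": 7, "always_refresh": True},
}


def should_use_cache(data_type: str, cached_at: datetime,
                     now: datetime = None) -> bool:
    """Return True when a cached entry is within its TTL."""
    config = CACHE_CONFIG[data_type]
    if config.get("always_refresh"):
        return False  # assumption: always_refresh overrides the TTL
    now = now or datetime.now(timezone.utc)
    return now - cached_at < timedelta(days=config["ttl_days"])


now = datetime(2024, 1, 20, tzinfo=timezone.utc)
fifty_days_ago = now - timedelta(days=50)
print(should_use_cache("email", fifty_days_ago, now))  # past the 30-day TTL
print(should_use_cache("phone", fifty_days_ago, now))  # within the 60-day TTL
```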

Best Practices

  1. Start with cached data - Always check cache first
  2. Set appropriate credit limits - Balance cost vs. data quality
  3. Use parallel processing - For bulk enrichments
  4. Validate critical data - Especially emails before outreach
  5. Monitor provider performance - Adjust sequences based on success rates
  6. Handle failures gracefully - Automatic fallback to next provider
  7. Track ROI - Measure enrichment value vs. credit cost
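
For the parallel-processing practice, a thread pool is a natural fit because provider calls are typically network-bound. The sketch below uses concurrent.futures with a stub in place of the real per-record waterfall; `enrich_record`, `enrich_bulk`, and the worker count are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def enrich_record(record: dict) -> dict:
    """Stub standing in for a real per-record waterfall run."""
    return {**record, "enriched": True}


def enrich_bulk(records, max_workers: int = 8):
    """Enrich records concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(enrich_record, records))


records = [{"name": f"Prospect {i}"} for i in range(5)]
results = enrich_bulk(records)
print(len(results), results[0])
```

In practice the worker count would need to respect each provider's rate limits, otherwise parallel requests simply trigger more rate-limit retries.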

Execution model: claude-haiku-4-5 for provider routing, parallel processing for bulk operations
