BigQuery Basics

Manages datasets, tables, and jobs in BigQuery, and integrates with BigQuery ML and Gemini for advanced data analytics and AI-driven insights. Use when you need to interact with BigQuery, run SQL queries, manage BigQuery resources, or leverage BigQuery's built-in ML capabilities. Also use when performing data analysis, ingesting data into BigQuery, or developing AI applications on BigQuery.

Published by @google

BigQuery is a serverless, AI-ready data platform that enables high-speed analysis of large datasets using SQL and Python. Its disaggregated architecture separates compute and storage, allowing them to scale independently while providing built-in machine learning, geospatial analysis, and business intelligence capabilities.

Setup and Basic Usage

  1. Enable the BigQuery API:

    gcloud services enable bigquery.googleapis.com --quiet
    
  2. Create a Dataset:

    bq mk --dataset --location=US my_dataset
    
  3. Create a Table:

    Create a file named schema.json with your table schema:

    [
      {
        "name": "name",
        "type": "STRING",
        "mode": "REQUIRED"
      },
      {
        "name": "post_abbr",
        "type": "STRING",
        "mode": "NULLABLE"
      }
    ]
    

    Then create the table with the bq tool:

    bq mk --table my_dataset.mytable schema.json
    
  4. Run a Query:

    bq query --use_legacy_sql=false \
    'SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = "TX" LIMIT 10'
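
The four steps above can be collected into a single script. What follows is a minimal sketch under stated assumptions, not an official workflow. It reuses the names my_dataset, mytable, and schema.json from the steps. It introduces RUN_BQ, a hypothetical opt-in environment variable invented for this example, so that the gcloud and bq calls run only when requested. It also adds a --dry_run pass before the real query, which asks BigQuery to validate the SQL and report the bytes it would process without executing it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Names taken from the steps above.
DATASET="my_dataset"
TABLE="mytable"

# Step 3: write the two-column schema to schema.json.
cat > schema.json <<'EOF'
[
  {"name": "name", "type": "STRING", "mode": "REQUIRED"},
  {"name": "post_abbr", "type": "STRING", "mode": "NULLABLE"}
]
EOF

# Sanity check: count the lines that define a column.
column_count=$(grep -c '"name":' schema.json)
echo "schema.json defines ${column_count} columns"

# Step 4: keep the SQL in one variable so both bq calls use the same text.
# Inside single quotes the newline is part of the string; no backslash is needed.
QUERY='SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE state = "TX" LIMIT 10'

# RUN_BQ is a hypothetical switch for this sketch, not a bq or gcloud setting.
if [ "${RUN_BQ:-0}" = "1" ]; then
  gcloud services enable bigquery.googleapis.com --quiet
  bq mk --dataset --location=US "${DATASET}"
  bq mk --table "${DATASET}.${TABLE}" schema.json
  # Validate the query and report bytes to be processed, without running it.
  bq query --use_legacy_sql=false --dry_run "${QUERY}"
  bq query --use_legacy_sql=false "${QUERY}"
else
  echo "RUN_BQ is not 1; skipping gcloud and bq calls"
fi
```

Run it as `RUN_BQ=1 bash setup.sh` (setup.sh is a placeholder file name) from an authenticated environment. The sketch does not check for existing resources, so a second run is expected to stop at the first bq mk step once the dataset already exists.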
    

Reference Directory

  • Core Concepts: Storage types, analytics workflows, and BigQuery Studio features.

  • CLI Usage: Essential bq command-line tool operations for managing data and jobs.

  • Client Libraries: Using Google Cloud client libraries for Python, Java, Node.js, and Go.

  • MCP Usage: Using the BigQuery remote MCP server and Gemini CLI extension.

  • Infrastructure as Code: Terraform examples for datasets, tables, and reservations.

  • IAM & Security: Roles, permissions, and data governance best practices.

  • AI Forecast: Leveraging the pre-trained TimesFM model for forecasting without custom training.

  • AI Detect Anomalies: Identifying deviations in time series data using the pre-trained TimesFM model.

  • AI Generate: General-purpose text and content generation using Gemini models.

If you need product information not found in these references, use the Developer Knowledge MCP server search_documents tool.

Related Skills

  • BigQuery AI & ML Skill: SKILL.md file for BigQuery AI and ML capabilities.
  • BigQuery AI & ML References: Reference files published for the BigQuery AI and ML skill.
    • bigquery_ai_classify.md
    • bigquery_ai_generate_bool.md
    • bigquery_ai_generate_double.md
    • bigquery_ai_generate_int.md
    • bigquery_ai_if.md
    • bigquery_ai_score.md
    • bigquery_ai_search.md
    • bigquery_ai_similarity.md

Bundled with this artifact

9 files

Reference files that ship alongside this artifact. Agents pull these in only when the task needs them.
