Data Engineering on Google Cloud Platform Training

Price
$3,600.00 USD

Duration
4 Days

Delivery Methods
Virtual Instructor Led
Private Group

Course Overview

According to Google Cloud, data-driven companies are 23 times more likely to acquire customers and 19 times more likely to be profitable. But building the right infrastructure for data success requires the right skills—and the right cloud platform.

Data Engineering on Google Cloud Platform provides training in building scalable data pipelines, managing batch and streaming data, and applying machine learning to large datasets. Through hands-on labs, you’ll work directly with Google Cloud Platform (GCP) tools such as BigQuery, Cloud Dataflow, Cloud Composer, and Kubeflow to design solutions that drive business intelligence, improve agility, and support real-time decision-making.

Course Objectives

This training helps professionals grow their cloud skills and prepare for the Professional Data Engineer certification. It’s ideal for anyone pursuing the role of a data engineer, especially those working with cloud data, big data, or real-time data processing needs.

  • Design and implement scalable data pipelines on Google Cloud
  • Analyze massive datasets using BigQuery, SQL, and machine learning
  • Build both batch and streaming data pipelines with Dataflow and Pub/Sub
  • Use Dataproc and Spark to manage big data workloads efficiently
  • Automate and deploy AI workflows using BigQuery ML, AutoML, and Kubeflow

Who Should Attend?

Developers responsible for handling their organization's data
  • Top-rated instructors: Our crew of subject matter experts has an average instructor rating of 4.8 out of 5 across thousands of reviews.
  • Authorized content: We maintain more than 35 Authorized Training Partnerships with the top players in tech, ensuring your course materials contain the most relevant and up-to-date information.
  • Interactive classroom participation: Our virtual training includes live lectures, demonstrations, and virtual labs, so you can join discussions with your instructor and fellow classmates and get real-time feedback.
  • Post-class resources: Review your class content, catch up on any material you may have missed, or perfect your new skills with access to resources after your course is complete.
  • Private Group Training: Let our world-class instructors deliver exclusive training courses just for your employees. Our private group training is designed to promote your team’s shared growth and skill development.
  • Tailored Training Solutions: Our subject matter experts can customize the class to specifically address the unique goals of your team.

What is the Data Engineering on Google Cloud Platform training course?

It’s a Google Cloud certification course designed to teach professionals how to build and manage scalable data systems. It covers GCP tools like BigQuery, Dataflow, Composer, and Kubeflow—equipping learners with the skills needed for the Professional Data Engineer exam.

Is the course worth it?

Yes. As demand for data engineers grows, this training provides hands-on experience that directly supports high-value roles in cloud data architecture, real-time analytics, and machine learning implementation.

How will this course help me build real-time pipelines?

You’ll learn to create streaming data pipelines using Cloud Pub/Sub and Dataflow, enabling low-latency analytics, fraud detection, and live dashboards for business intelligence.

What kind of labs are included?

Labs are integrated throughout the course and cover the full lifecycle—batch data, streaming ingestion, data warehouse development, and machine learning workflows in Google Cloud.

Course Prerequisites

  • Basic proficiency with a common query language such as SQL.
  • Experience with data modeling and ETL (extract, transform, load) activities.
  • Experience with developing applications using a common programming language such as Python.
  • Familiarity with machine learning and/or statistics.

Agenda

Module 1: Introduction to Data Engineering

  • Define the role of a data engineer on GCP
  • Explore challenges in data processing and pipeline development
  • Get started with BigQuery and its capabilities
  • Compare data lakes and data warehouse models
  • Hands-on lab: Analyze data using BigQuery

Module 2: Building a Data Lake

  • Understand architecture for data lakes on Google Cloud
  • Store structured and unstructured data in Cloud Storage
  • Optimize with tiered storage and Cloud Functions
  • Secure and manage data access
  • Hands-on lab: Load taxi data into Cloud SQL

Module 3: Building a Data Warehouse

  • Learn modern data warehouse architecture
  • Perform advanced queries in BigQuery
  • Use schemas, arrays, and nested fields
  • Optimize partitioning and performance
  • Hands-on lab: Work with JSON and BigQuery
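
For a sense of what working with arrays and nested fields involves, here is a minimal sketch in plain Python. It is not BigQuery code: it only imitates the idea of flattening a repeated field into one row per element, and the record, field names, and the `unnest` helper are hypothetical.

```python
import json

# Hypothetical order record: "items" is a repeated field of nested records.
raw = '{"order_id": "A1", "items": [{"sku": "x", "qty": 2}, {"sku": "y", "qty": 1}]}'


def unnest(record, repeated_field):
    """Yield one flat row per element of the repeated field."""
    # Copy every top-level field except the repeated one.
    parent = {k: v for k, v in record.items() if k != repeated_field}
    for element in record.get(repeated_field, []):
        # Merge the parent fields with the fields of each nested element.
        yield {**parent, **element}


rows = list(unnest(json.loads(raw), "items"))
for row in rows:
    print(row)
```

Each nested item becomes its own row that still carries the parent `order_id`, which is the shape a flattened query result would have.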

Module 4: Building Batch Data Pipelines

  • Compare ETL, ELT, and EL processes
  • Improve data quality with built-in tools
  • Execute batch data operations in BigQuery
  • Demo: Improve pipeline quality using ELT
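
The ELT pattern named above can be sketched as follows. This is an illustration under stated assumptions, not course material: SQLite from the Python standard library stands in for the warehouse, and the table names and sample rows are hypothetical. The point is the order of operations: raw data is loaded first, then cleaned with SQL inside the database.

```python
import sqlite3

# Hypothetical raw rows; one has an empty fare to show a simple quality check.
raw_trips = [("t1", "12.50"), ("t2", ""), ("t3", "7.25")]

conn = sqlite3.connect(":memory:")

# Extract + Load: land the data untouched in a staging table.
conn.execute("CREATE TABLE staging_trips (trip_id TEXT, fare TEXT)")
conn.executemany("INSERT INTO staging_trips VALUES (?, ?)", raw_trips)

# Transform: filter bad rows and cast types with SQL, after loading.
conn.execute(
    """
    CREATE TABLE trips AS
    SELECT trip_id, CAST(fare AS REAL) AS fare
    FROM staging_trips
    WHERE fare <> ''
    """
)

clean = conn.execute("SELECT trip_id, fare FROM trips ORDER BY trip_id").fetchall()
print(clean)
```

In an ETL variant, the filtering and casting would happen in application code before the insert, and only clean rows would reach the database.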

Module 5: Running Spark on Cloud Dataproc

  • Explore Hadoop vs. Dataproc
  • Migrate from HDFS to Cloud Storage (GCS)
  • Tune Spark clusters for performance
  • Run big data jobs using Apache Spark
  • Hands-on lab: Spark processing on Cloud Dataproc

Module 6: Serverless Processing with Cloud Dataflow

  • Build Dataflow pipelines for batch and streaming
  • Use templates, side inputs, and autoscaling
  • Hands-on lab: Build and run Dataflow pipelines

Module 7: Managing Pipelines with Data Fusion & Composer

  • Create visual pipelines in Data Fusion
  • Use Cloud Composer and Apache Airflow
  • Schedule and monitor DAGs
  • Hands-on lab: Orchestrate a data pipeline
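
A DAG (directed acyclic graph) describes which pipeline tasks must finish before others start. The sketch below uses only Python's standard `graphlib` module (Python 3.9+), not the Airflow API, and the task names are hypothetical. It shows how a valid run order falls out of the declared dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
dag = {
    "extract": set(),
    "load": {"extract"},
    "transform": {"load"},
    "report": {"transform"},
}

# static_order() yields every task after all of its dependencies.
run_order = list(TopologicalSorter(dag).static_order())
print(run_order)
```

An orchestrator adds scheduling, retries, and monitoring on top of this ordering.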

Module 8: Streaming Data Fundamentals

  • Understand streaming vs. batch processing
  • Identify tools and use cases for real-time analytics
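
One way to picture the difference between batch and streaming is a small plain-Python sketch. The events, the 60-second window size, and the helper name are assumptions made for illustration; no Google Cloud service is involved. Batch computes a single total once all data is present, while the streaming-style loop updates a per-window total as each event arrives.

```python
from collections import defaultdict

# Hypothetical (event_time_in_seconds, value) pairs.
events = [(3, 10), (42, 5), (61, 7), (75, 1), (130, 4)]
WINDOW_SECONDS = 60


def window_start(ts):
    """Return the start of the fixed window that contains timestamp ts."""
    return ts - ts % WINDOW_SECONDS


# Batch: one answer, computed after the full dataset is available.
batch_total = sum(value for _, value in events)

# Streaming-style: keep a running total per fixed window, event by event.
window_totals = defaultdict(int)
for ts, value in events:
    window_totals[window_start(ts)] += value

print(batch_total)
print(dict(window_totals))
```

A real streaming system also has to handle late and out-of-order events, which this sketch ignores.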

Module 9: Messaging with Cloud Pub/Sub

  • Use Cloud Pub/Sub for streaming messaging
  • Understand architecture and security controls
  • Hands-on lab: Stream data to Pub/Sub

Module 10: Streaming with Cloud Dataflow

  • Expand pipelines to support streaming use cases
  • Monitor and troubleshoot live streams
  • Hands-on lab: Real-time data processing

Module 11: Streaming with BigQuery and Bigtable

  • Ingest live data into BigQuery
  • Analyze patterns using dashboards
  • Leverage Cloud Bigtable for fast I/O
  • Hands-on lab: Build a streaming pipeline

Module 12: Advanced BigQuery and Performance

  • Use advanced SQL and GIS features
  • Tune complex queries for efficiency
  • Optional: Partition tables by date

Module 13: Analytics and AI Foundations

  • Understand AI in analytics workflows
  • Compare machine learning tools in GCP
  • Prepare data for model development

Module 14: ML APIs for Unstructured Data

  • Use Natural Language API and Vision API
  • Hands-on lab: Analyze unstructured text

Module 15: AI Platform Notebooks

  • Use Jupyter notebooks in Google Cloud
  • Analyze BigQuery data with Pandas
  • Visualize results with Python
  • Hands-on lab: Build reports in notebooks

Module 16: ML Pipelines with Kubeflow

  • Build scalable ML workflows
  • Use pipeline templates from AI Hub
  • Hands-on lab: Train and monitor models

Module 17: BigQuery ML for Model Building

  • Train models using SQL with BigQuery ML
  • Compare regression and classification types
  • Demo: Predict taxi fares using BigQuery ML
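
BigQuery ML trains models with SQL statements. As a rough illustration of the regression idea behind a fare prediction, and not of BigQuery ML itself, the sketch below fits a one-variable least-squares line in plain Python. The distances and fares are made-up numbers chosen so the arithmetic is easy to follow.

```python
# Made-up training data: trip distance (miles) and fare (dollars).
distances = [1.0, 2.0, 3.0, 4.0]
fares = [5.0, 7.0, 9.0, 11.0]

n = len(distances)
mean_x = sum(distances) / n
mean_y = sum(fares) / n

# Ordinary least squares for a single feature.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, fares))
s_xx = sum((x - mean_x) ** 2 for x in distances)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def predict(distance):
    """Predict a fare from distance using the fitted line."""
    return intercept + slope * distance


print(slope, intercept, predict(5.0))
```

A classification model would predict a category (for example, whether a trip includes a tip) instead of a number.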

Module 18: Custom Models with AutoML

  • Create models using AutoML Tables, Vision, and Natural Language
  • Evaluate model performance with minimal coding

Upcoming Class Dates and Times

Sep 9, 10, 11, 12
8:00 AM - 4:00 PM

Do You Have Additional Questions? Please Contact Us Below.

Contact Us about Starting Your Business Training Strategy with New Horizons