HASHMAP SOLUTION PROFILE

NATIONAL PETROLEUM & EXPLORATION COMPANY

Migrating Hadoop to GCP BigQuery to Lower Costs and Provide Self-Serve Data Consumption

CHALLENGE
  • Migrate Hadoop based data warehouse to company standard DW platform of GCP and BigQuery

  • Replicate the existing Hive environment to the cloud in GCP/BigQuery while keeping the data pipelines sustainable and minimizing rework

  • Minimize the use of cloud-vendor specific tooling (be cloud-native)

  • Provide end-user self-service consumption via Spotfire hosted in GCP 

  • Lower overall costs for the data platform and services

  • Leverage in house GCP and BigQuery skills

APPROACH
  • Developed a Kubernetes (GKE) focused solution using

    • Hashmap custom kubernetes operator for data ingestion

    • Kubernetes hosted dbt (data build tool) job execution to replace legacy Spark ETL transformations

  • Migrated on-premises Spotfire to GCP as IaaS (GCE)

  • Infrastructure managed using Terraform (IaC)

  • Use of Gitlab for source control and CI/CD

  • Mentoring and coaching during migration, featuring hands-on demos and quick micro POCs across a variety of dimensions showcasing GCP and BigQuery

  • Complete documentation 

OUTCOME
  • Utilized modern compute solution designs, DataOps principles and practices, and hybridization of legacy Hadoop tooling (where necessary) and modern computing solutions 

  • Provided an end-to-end solution using project-level trust boundary resource segregation, automated testing, and deployment, infrastructure managed by Terraform IaC, utilization of GCP managed compute resources (BigQuery and GKE) for data warehousing and data movement + transformation using dbt, data storage on GCS, and hosting of 3rd party enterprise BI tooling (Spotfire).

  • Stackdriver was used for all of the monitoring and alerting needs and CloudSQL for all metadata management.

SOLUTION
  • Google BigQuery

  • Google Kubernetes Engine 

  • Google Compute Engine

  • Google Cloud Storage

  • Stackdriver

  • CloudSQL

  • Google Container Registry

  • dbt

  • Docker

  • Spotfire

  • Terraform

  • Gitlab

  • Hashmap Consulting Services: Assessment, Design, and Architecture Enablement Services, Cloud Modernization and Migration Services, Data Engineering and DataOps Enablement

HashmapLogoBlack_edited.png
  • Hashmap on LinkedIn
  • Tweet Hashmap
  • Hashmap on Facebook
  • Hashmap Stories
  • Data Rebels on Tap on Spotify
  • Hashmap on YouTube
  • Hashmap on Instagram

© 2020 by Hashmap, Inc.