Google Cloud (GCP) Dataflow Implementation to Ingest data into BigQuery

akfincode/gcp-dfpnewco

dfpnewco

language:Java

A Google Cloud Dataflow implementation that ingests DoubleClick time series data into BigQuery. It demonstrates Dataflow as a better alternative to the Lambda architecture: a single pipeline definition replaces separately maintained batch and speed layers.
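
The contrast with the Lambda architecture can be sketched in plain Java. In a Lambda setup the transform logic is typically written twice, once for the batch layer and once for the speed layer; the idea illustrated here is that one transform can serve any source of records. This is only a sketch under assumptions: it uses the JDK alone rather than the Dataflow SDK, and the class name LineItemSketch, the CSV layout, and the column names event_time, line_item_id, and impressions are hypothetical, not taken from this repository.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

/** Hypothetical sketch: one transform shared by bounded and unbounded sources. */
public class LineItemSketch {

    // Assumed input layout (not from the repository):
    // "<ISO-8601 timestamp>,<line item id>,<impressions>"
    static Map<String, Object> toRow(String csvLine) {
        String[] parts = csvLine.split(",", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields, got " + parts.length);
        }
        Map<String, Object> row = new LinkedHashMap<>();
        // Instant.parse validates the timestamp before it is stored.
        row.put("event_time", Instant.parse(parts[0].trim()).toString());
        row.put("line_item_id", parts[1].trim());
        row.put("impressions", Long.parseLong(parts[2].trim()));
        return row;
    }

    // The same logic applies whether the stream is a finite file or a live feed.
    static Stream<Map<String, Object>> transform(Stream<String> lines) {
        return lines
            .filter(line -> !line.trim().isEmpty()) // skip blank lines
            .map(LineItemSketch::toRow);
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList(
            "2016-01-01T00:00:00Z,li-1,1200",
            "2016-01-01T01:00:00Z,li-2,300");
        transform(batch.stream()).forEach(System.out::println);
    }
}
```

In an actual Dataflow pipeline the body of toRow would sit inside a per-element transform, and the runner would decide whether the input is bounded or unbounded; the transform itself would not need to change.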

Setup

  • GCP credentials
  • Dataflow SDK 1.x for Java
  • JDK
  • Maven (macOS: brew install maven; Ubuntu, including GCP Ubuntu instances: sudo apt-get install maven)
  • SLF4J


Run the dfpnewco application

After you've made the necessary changes, do a clean build:

mvn clean package

Then run the main Dataflow pipeline to load data into BigQuery:

mvn -Pgcp exec:exec -Dexec.mainClass="com.newco.dataflow.pipeline.LineItemTransformPipeline"
