Get 100% OFF Coupon For Google Cloud Certified Professional Data Engineer 2023 Course
Course Description:
Designing data processing systems
Selecting the appropriate storage technologies. Considerations include:
● Mapping storage systems to business requirements
● Data modeling
● Trade-offs involving latency, throughput, transactions
● Distributed systems
● Schema design
Designing data pipelines. Considerations include:
● Data publishing and visualization (e.g., BigQuery)
● Batch and streaming data (e.g., Dataflow, Dataproc, Apache Beam, Apache Spark and Hadoop ecosystem, Pub/Sub, Apache Kafka)
● Online (interactive) vs. batch predictions
● Job automation and orchestration (e.g., Cloud Composer)
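The pipeline design topics above can be illustrated with a minimal sketch: a batch pipeline expressed as composable read → transform → aggregate stages, loosely mirroring how frameworks such as Apache Beam chain transforms. All names and data here are illustrative, not a specific Google Cloud API.

```python
# A tiny batch pipeline: bounded source -> transform -> aggregation.
# Loosely mirrors the transform-chaining style of Apache Beam; every
# name and record below is an illustrative assumption.

def read_source():
    """Stand-in for a bounded source (e.g., files or a table export)."""
    yield from [
        {"user": "a", "clicks": 3},
        {"user": "b", "clicks": 5},
        {"user": "a", "clicks": 2},
    ]

def transform(records):
    """Map raw records to (key, value) pairs."""
    for r in records:
        yield (r["user"], r["clicks"])

def aggregate(pairs):
    """Sum values per key, like a group-by-key-and-combine step."""
    totals = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0) + value
    return totals

totals = aggregate(transform(read_source()))
# totals == {"a": 5, "b": 5}
```

In a managed service the same shape would run distributed across workers; the point here is only the staged, composable structure.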
Designing a data processing solution. Considerations include:
● Choice of infrastructure
● System availability and fault tolerance
● Use of distributed systems
● Capacity planning
● Hybrid cloud and edge computing
● Architecture options (e.g., message brokers, message queues, middleware, service-oriented architecture, serverless functions)
● At least once, in-order, and exactly once, etc., event processing
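The delivery-semantics bullet deserves a concrete picture: at-least-once delivery means duplicates can arrive, and a common way to get effectively exactly-once processing is to deduplicate on a stable event ID. This is a pure-Python sketch; the function names and in-memory `seen_ids` set are illustrative stand-ins for durable state such as a database table.

```python
# Sketch: at-least-once delivery made effectively exactly-once by
# deduplicating on a stable event ID. All names are illustrative,
# not a specific GCP API.

def process_events(events, seen_ids, apply_fn):
    """Apply apply_fn to each event at most once, keyed by event['id'].

    seen_ids stands in for durable state; redelivered events are
    skipped, so retries and redeliveries are safe.
    """
    results = []
    for event in events:
        if event["id"] in seen_ids:
            continue  # duplicate redelivery: already processed
        seen_ids.add(event["id"])
        results.append(apply_fn(event))
    return results

# Simulate at-least-once delivery: the event with id 1 arrives twice.
deliveries = [
    {"id": 1, "value": 10},
    {"id": 2, "value": 20},
    {"id": 1, "value": 10},  # redelivery
]
seen = set()
values = process_events(deliveries, seen, lambda e: e["value"])
# Each logical event is counted once despite the duplicate delivery.
```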
Migrating data warehousing and data processing. Considerations include:
● Awareness of current state and how to migrate a design to a future state
● Migrating from on-premises to cloud (Data Transfer Service, Transfer Appliance, Cloud Networking)
● Validating a migration
Building and operationalizing data processing systems
Building and operationalizing storage systems. Considerations include:
● Effective use of managed services (Cloud Bigtable, Cloud Spanner, Cloud SQL, BigQuery, Cloud Storage, Datastore, Memorystore)
● Storage costs and performance
● Life cycle management of data
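Life cycle management of data in Cloud Storage is typically expressed as a declarative lifecycle configuration on the bucket. The sketch below uses the documented JSON shape for `gsutil lifecycle set`; the 30-day and 365-day thresholds and the NEARLINE target class are illustrative choices, not recommendations.

```json
{
  "lifecycle": {
    "rule": [
      {
        "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
        "condition": {"age": 30}
      },
      {
        "action": {"type": "Delete"},
        "condition": {"age": 365}
      }
    ]
  }
}
```

Rules like these trade storage cost against retrieval latency and cost, which is exactly the "storage costs and performance" consideration above.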
Building and operationalizing pipelines. Considerations include:
● Data cleansing
● Batch and streaming
● Transformation
● Data acquisition and import
● Integrating with new data sources
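A minimal cleansing-and-transformation step might look like the sketch below, assuming raw records arrive as dicts (e.g., parsed from a CSV or JSON import). The field names and validation rules are illustrative assumptions.

```python
# Sketch: batch data cleansing and transformation over raw records.
# Field names ("name", "amount") and rules are illustrative.

def clean_record(raw):
    """Normalize one record; return None if it fails validation."""
    name = (raw.get("name") or "").strip()
    if not name:
        return None  # drop records with no usable name
    try:
        amount = float(raw.get("amount", ""))
    except ValueError:
        return None  # drop records with a non-numeric amount
    return {"name": name.title(), "amount": round(amount, 2)}

raw_batch = [
    {"name": "  ada lovelace ", "amount": "12.5"},
    {"name": "", "amount": "3"},             # invalid: empty name
    {"name": "alan turing", "amount": "x"},  # invalid: bad amount
]
cleaned = [r for r in (clean_record(x) for x in raw_batch) if r is not None]
# Only the valid, normalized record survives.
```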
Building and operationalizing processing infrastructure. Considerations include:
● Provisioning resources
● Monitoring pipelines
● Adjusting pipelines
● Testing and quality control
Operationalizing machine learning models
Leveraging pre-built ML models as a service. Considerations include:
● ML APIs (e.g., Vision API, Speech API)
● Customizing ML APIs (e.g., AutoML Vision, AutoML text)
● Conversational experiences (e.g., Dialogflow)
Deploying an ML pipeline. Considerations include:
● Ingesting appropriate data
● Retraining of machine learning models (AI Platform Prediction and Training, BigQuery ML, Kubeflow, Spark ML)
● Continuous evaluation
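Continuous evaluation in its simplest form compares a model's metric on recent labeled traffic against a threshold and flags the model for retraining when it drifts. The sketch below is a hedged illustration: the accuracy metric, the 0.9 threshold, and the function name are all assumptions, not a specific Google Cloud feature's API.

```python
# Sketch: a continuous-evaluation check that flags a model for
# retraining when live accuracy falls below a threshold. The metric
# and threshold are illustrative choices.

def needs_retraining(y_true, y_pred, min_accuracy=0.9):
    """Return True when observed accuracy drops below min_accuracy."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    return accuracy < min_accuracy

# Recent labeled traffic: 3 of 5 predictions correct -> accuracy 0.6.
flag = needs_retraining([1, 0, 1, 1, 0], [1, 0, 0, 0, 0])
# flag is True, so this model would be queued for retraining.
```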
Choosing the appropriate training and serving infrastructure. Considerations include:
● Distributed vs. single machine
● Use of edge compute
● Hardware accelerators (e.g., GPU, TPU)
Measuring, monitoring, and troubleshooting machine learning models. Considerations include:
● Machine learning terminology (e.g., features, labels, models, regression, classification, recommendation, supervised and unsupervised learning, evaluation metrics)
● Impact of dependencies of machine learning models
● Common sources of error (e.g., assumptions about data)
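Of the terminology above, evaluation metrics are the easiest to pin down concretely. This sketch computes the standard precision and recall definitions for a binary classifier by hand; the example labels are made up for illustration.

```python
# Sketch: standard precision/recall for a binary classifier,
# computed from true/false positives and false negatives.

def precision_recall(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

p, r = precision_recall([1, 1, 0, 0, 1], [1, 0, 1, 0, 1])
# tp=2, fp=1, fn=1 -> precision = 2/3, recall = 2/3
```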
Ensuring solution quality
Designing for security and compliance. Considerations include:
● Identity and access management (e.g., Cloud IAM)
● Data security (encryption, key management)
● Ensuring privacy (e.g., Data Loss Prevention API)
● Legal compliance (e.g., Health Insurance Portability and Accountability Act (HIPAA), Children's Online Privacy Protection Act (COPPA), FedRAMP, General Data Protection Regulation (GDPR))
Ensuring scalability and efficiency. Considerations include:
● Building and running test suites
● Pipeline monitoring (e.g., Cloud Monitoring)
● Assessing, troubleshooting, and improving data representations and data processing infrastructure
● Resizing and autoscaling resources
Ensuring reliability and fidelity. Considerations include:
● Performing data preparation and quality control (e.g., Dataprep)
● Verification and monitoring
● Planning, executing, and stress testing data recovery (fault tolerance, rerunning failed jobs, performing retrospective re-analysis)
● Choosing between ACID, idempotent, eventually consistent requirements
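"Rerunning failed jobs" usually means bounded retries with backoff, which is only safe when the job body is idempotent. The sketch below is an illustrative pattern, not a specific service's retry API; the attempt count and delays are arbitrary.

```python
# Sketch: rerunning a failed job with bounded retries and exponential
# backoff. Combined with an idempotent job body, reruns are safe.
import time

def run_with_retries(job, max_attempts=3, base_delay=0.01):
    """Run job() until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            time.sleep(base_delay * 2 ** (attempt - 1))

# A flaky job that fails twice before succeeding.
calls = {"n": 0}
def flaky_job():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "done"

result = run_with_retries(flaky_job)
# result == "done" after two transient failures.
```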
Ensuring flexibility and portability. Considerations include:
● Mapping to current and future business requirements
● Designing for data and application portability (e.g., multicloud, data residency requirements)
● Data staging, cataloging, and discovery
Who this course is for:
- Beginner
- Intermediate
- Advanced