Machine Learning Engineering
ML Project Lifecycle
Typical stages of an ML project, from scope definition to production.
ML Pipelines
Pipeline orchestrators: infrastructure for automating, monitoring, and maintaining model training and deployment. Examples: Airflow, Argo, Celery, Luigi, Kubeflow.
TensorFlow Extended (TFX): open-source end-to-end platform for deploying production ML pipelines.
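To make the orchestrator idea concrete, here is a minimal sketch of the core job such tools perform: running pipeline steps in dependency order and passing results downstream. It uses only Python's standard library and does not reflect the API of Airflow, Argo, or any other tool named above. The step names (`ingest`, `validate`, `train`, `deploy`) and the toy "model" (a mean) are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, dependencies):
    """Run each task once, after all of its upstream tasks have finished."""
    # static_order() yields every node after all of its predecessors.
    order = list(TopologicalSorter(dependencies).static_order())
    results = {}
    for name in order:
        # Hand each task the outputs of the tasks it depends on.
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = tasks[name](upstream)
    return order, results


# Hypothetical steps: each one is a function of its upstream outputs.
tasks = {
    "ingest": lambda up: [1.0, 2.0, 3.0, 4.0],
    "validate": lambda up: [x for x in up["ingest"] if x >= 0],
    "train": lambda up: sum(up["validate"]) / len(up["validate"]),  # toy "model"
    "deploy": lambda up: f"deployed model with mean={up['train']}",
}

# Maps each step to the set of steps that must run before it.
dependencies = {
    "ingest": set(),
    "validate": {"ingest"},
    "train": {"validate"},
    "deploy": {"train"},
}

order, results = run_pipeline(tasks, dependencies)
print(order)
print(results["deploy"])
```

Real orchestrators add what this sketch leaves out: scheduling, retries, distributed execution, and monitoring.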
Data
Data Inspection
- Identify data sources
- Check how they are refreshed
- Check consistency (formatting, data types)
- Look for outliers and errors

Responsible Data
- Bias in data
- User privacy:
  - Aggregation: replace unique values with a summary
  - Redaction: remove some data to create a less complete picture
- Compliance with GDPR and other regulations

Data Problems

Data/Concept Drift
- Data changes:
  - Trend and seasonality
  - Distribution of features changes
  - Relative importance of features changes
- World changes:
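One way to check for a change in the distribution of a feature is to compare a reference sample (for example, training data) with a recent sample (for example, serving data). The sketch below computes the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical CDFs, in plain Python. The choice of this statistic, the function name `ks_statistic`, and the sample values are assumptions made for illustration, not something the notes above prescribe.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest absolute gap between the empirical CDFs of two samples.

    0.0 means the empirical distributions coincide; 1.0 means the
    samples do not overlap at all.
    """
    ref = sorted(reference)
    cur = sorted(current)
    largest_gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so it is enough to evaluate the gap at those values.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_cur))
    return largest_gap


# Hypothetical feature values at training time and at serving time.
training_sample = [1, 2, 3, 4]
serving_sample = [3, 4, 5, 6]

drift_score = ks_statistic(training_sample, serving_sample)
print(drift_score)
```

The threshold at which a score counts as drift depends on the sample size and the application, so it is left open here.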
TensorFlow
Modeling
Hyperparameter Tuning
Neural architecture search (NAS) is a technique for automating the design of artificial neural networks.
Trainable parameters:
- learned by the algorithm during training
- e.g. weights of a neural network
Hyperparameters:
- set before launching the training
- not updated by the training itself
- e.g. learning rate, number of units in a dense layer
Even for a small model, the number of tunable hyperparameters can be significant.
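The distinction between trainable parameters and hyperparameters can be shown with a small sketch. Here the weight `w` is learned by gradient descent, while the learning rate is fixed before training and chosen by an outer grid search. The toy dataset, the candidate learning rates, and the function names `train` and `grid_search` are assumptions made for illustration. Grid search is only one simple tuning strategy and is not the NAS technique mentioned above.

```python
def train(learning_rate, steps=50):
    """Fit y = w * x by gradient descent; w is the trainable parameter."""
    xs = [1.0, 2.0, 3.0]
    ys = [2.0, 4.0, 6.0]  # toy data generated with w = 2
    w = 0.0
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad  # training updates w, never the learning rate
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return w, loss


def grid_search(learning_rates):
    """Train once per candidate and keep the one with the lowest final loss."""
    results = {lr: train(lr) for lr in learning_rates}
    best_lr = min(results, key=lambda lr: results[lr][1])
    return best_lr, results


# Hypothetical candidates: too small learns slowly, too large diverges.
best_lr, results = grid_search([0.001, 0.01, 0.1, 0.3])
print(best_lr)
for lr, (w, loss) in results.items():
    print(lr, w, loss)
```

With a single hyperparameter the grid has four points. Each additional hyperparameter multiplies the number of combinations, which is why the search space grows quickly even for small models.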
MLE Path