How to Use GitScrum for Machine Learning Projects?

Manage ML projects in GitScrum by tracking each experiment as a task, labeling tasks by model type and stage, and documenting results in NoteVault. Coordinate data preparation, model training, and deployment through a visual workflow. Teams with structured ML workflows iterate 50% faster [Source: MLOps Research 2024].

ML project workflow:

  1. Research - Problem definition
  2. Data prep - Collection, cleaning
  3. Experimentation - Model development
  4. Evaluation - Performance testing
  5. Productionize - Engineering handoff
  6. Deploy - Model serving
  7. Monitor - Performance tracking

ML labels

| Label | Purpose |
|---|---|
| stage-research | Problem exploration |
| stage-data | Data work |
| stage-training | Model training |
| stage-evaluation | Testing |
| stage-deployment | Production |
| model-[type] | Model type |
| experiment | Experiment tracking |
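
Applying the label scheme by hand tends to produce inconsistent spellings, so it can help to generate label names from one place. The following is a minimal sketch in plain Python, not a GitScrum API call; the function name `experiment_labels` and the slug rule for model types are assumptions made for illustration.

```python
from __future__ import annotations

# Stage names taken from the label table; the tuple itself is illustrative.
STAGES = ("research", "data", "training", "evaluation", "deployment")


def experiment_labels(stage: str, model_type: str, is_experiment: bool = True) -> list[str]:
    """Build the label set for one task, following the naming scheme in the table."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    # Assumed slug rule: lowercase, whitespace replaced by hyphens.
    slug = "-".join(model_type.lower().split())
    labels = [f"stage-{stage}", f"model-{slug}"]
    if is_experiment:
        labels.append("experiment")
    return labels


if __name__ == "__main__":
    print(experiment_labels("training", "Random Forest"))
```

The returned names can then be attached to the task manually or through whatever integration the team already uses.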

ML columns

| Column | Purpose |
|---|---|
| Backlog | Ideas, requirements |
| Research | Problem analysis |
| Data Prep | Data collection/cleaning |
| Training | Model development |
| Validation | Performance evaluation |
| Staging | Pre-production |
| Production | Deployed models |

Experiment task template

## Experiment: [name]

### Hypothesis
What we expect to learn/achieve

### Parameters
- Model: [type]
- Dataset: [name]
- Features: [list]
- Hyperparameters: [values]

### Results
- Accuracy: [metric]
- Other metrics: [values]
- Observations: [notes]

### Conclusion
Keep/discard and why
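
When many experiments are filed, the template above can be filled in programmatically so every task description has the same shape. This is a sketch under stated assumptions: the `Experiment` dataclass and `render_task` function are hypothetical helpers, the example values are invented for illustration, and the output is plain markdown text to paste into a task description.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Experiment:
    name: str
    hypothesis: str
    model: str
    dataset: str
    features: list[str]
    hyperparameters: dict[str, object]
    # Result fields default to the template's placeholders until the run finishes.
    accuracy: str = "[metric]"
    other_metrics: str = "[values]"
    observations: str = "[notes]"
    conclusion: str = "Keep/discard and why"


def render_task(exp: Experiment) -> str:
    """Render the experiment as markdown matching the task template."""
    hyper = ", ".join(f"{k}={v}" for k, v in exp.hyperparameters.items())
    lines = [
        f"## Experiment: {exp.name}",
        "",
        "### Hypothesis",
        exp.hypothesis,
        "",
        "### Parameters",
        f"- Model: {exp.model}",
        f"- Dataset: {exp.dataset}",
        f"- Features: {', '.join(exp.features)}",
        f"- Hyperparameters: {hyper}",
        "",
        "### Results",
        f"- Accuracy: {exp.accuracy}",
        f"- Other metrics: {exp.other_metrics}",
        f"- Observations: {exp.observations}",
        "",
        "### Conclusion",
        exp.conclusion,
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical example values.
    demo = Experiment(
        name="baseline",
        hypothesis="A linear model sets a usable lower bound.",
        model="logistic regression",
        dataset="churn-v1",
        features=["tenure", "plan"],
        hyperparameters={"C": 1.0, "max_iter": 200},
    )
    print(render_task(demo))
```

Leaving the result fields at their placeholder defaults makes it obvious in the task which experiments have not yet been written up.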

NoteVault ML documentation

| Document | Content |
|---|---|
| Model registry | All models, versions |
| Data catalog | Datasets, features |
| Experiment log | All experiments |
| Model cards | Model documentation |
| Deployment guides | How to deploy |

Data science vs engineering

| Team | Tasks |
|---|---|
| Data Science | Research, experiments, evaluation |
| ML Engineering | Productionization, deployment |
| Data Engineering | Data pipelines |
| DevOps | Infrastructure, MLOps |

Experiment tracking

| Field | Record |
|---|---|
| Parameters | Model config |
| Metrics | Performance measures |
| Artifacts | Model files |
| Code | Experiment version |
| Data | Dataset version |
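
The five fields in the table can be captured as one structured record per run and pasted into the task or the experiment log. The sketch below is one possible shape, not a GitScrum or NoteVault format: `RunRecord` and its `fingerprint` method are hypothetical, and hashing the inputs to spot repeated runs is an added convenience rather than something the guide prescribes.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class RunRecord:
    parameters: dict      # model config
    metrics: dict         # performance measures
    artifacts: list[str]  # paths or URIs of model files
    code_version: str     # e.g. a commit hash
    data_version: str     # dataset version identifier

    def fingerprint(self) -> str:
        """Hash the inputs (not the metrics) so reruns of the same setup match."""
        inputs = {
            "parameters": self.parameters,
            "code": self.code_version,
            "data": self.data_version,
        }
        payload = json.dumps(inputs, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]

    def to_json(self) -> str:
        """Serialize the record, including its fingerprint, for documentation."""
        record = {**asdict(self), "fingerprint": self.fingerprint()}
        return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical example values.
    run = RunRecord({"lr": 0.1}, {"accuracy": 0.9}, ["model.pkl"], "abc123", "v1")
    print(run.to_json())
```

Two runs with the same fingerprint used the same parameters, code, and data, which makes accidental duplicates easier to notice in the experiment log.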

Model lifecycle stages

| Stage | Tasks |
|---|---|
| Development | Experiments, prototypes |
| Validation | A/B tests, holdout tests |
| Canary | Limited production |
| Production | Full deployment |
| Retired | Replaced/removed |
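
The lifecycle stages can be treated as a small state machine so that a model cannot skip validation on its way to production. The stage names come from the table; the allowed transitions in `ALLOWED` are an assumption for illustration, and teams will likely want their own rules.

```python
from enum import Enum


class Stage(Enum):
    DEVELOPMENT = "Development"
    VALIDATION = "Validation"
    CANARY = "Canary"
    PRODUCTION = "Production"
    RETIRED = "Retired"


# Assumed transition rules: forward one step, back one step before
# production, and retirement from any active stage.
ALLOWED = {
    Stage.DEVELOPMENT: {Stage.VALIDATION, Stage.RETIRED},
    Stage.VALIDATION: {Stage.CANARY, Stage.DEVELOPMENT, Stage.RETIRED},
    Stage.CANARY: {Stage.PRODUCTION, Stage.VALIDATION, Stage.RETIRED},
    Stage.PRODUCTION: {Stage.RETIRED},
    Stage.RETIRED: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the target stage if the move is allowed, otherwise raise."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


if __name__ == "__main__":
    stage = Stage.DEVELOPMENT
    for nxt in (Stage.VALIDATION, Stage.CANARY, Stage.PRODUCTION):
        stage = advance(stage, nxt)
    print(stage.value)
```

A check like this could run before a task is moved between board columns, though how it is wired in depends on the team's tooling.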

MLOps workflow

| Phase | GitScrum Tracking |
|---|---|
| CI | Training pipeline tasks |
| CD | Deployment tasks |
| CT | Continuous training tasks |
| Monitoring | Performance tasks |

Common ML project issues

| Issue | Solution |
|---|---|
| Lost experiments | Experiment label, documentation |
| Handoff failures | Linked tasks DS ↔ Eng |
| Model drift | Monitoring tasks |
| Data issues | Data quality tasks |

ML team metrics

| Metric | Track |
|---|---|
| Experiments run | Task count |
| Models deployed | Deployment tasks |
| Iteration time | Cycle time |
| Model performance | NoteVault tracking |
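
The first three metrics can be derived from task data alone. The sketch below assumes tasks are available as plain dictionaries with `labels`, `started`, and `done` keys; that shape is hypothetical and does not represent a GitScrum export format. Cycle time is taken here as whole days between start and completion.

```python
from __future__ import annotations

from datetime import datetime
from statistics import median


def cycle_days(task: dict) -> int:
    """Whole days between a task's start and completion dates (ISO format)."""
    start = datetime.fromisoformat(task["started"])
    end = datetime.fromisoformat(task["done"])
    return (end - start).days


def team_metrics(tasks: list[dict]) -> dict:
    """Count experiment and deployment tasks and compute median cycle time."""
    finished = [t for t in tasks if t.get("done")]
    return {
        "experiments_run": sum("experiment" in t["labels"] for t in tasks),
        "models_deployed": sum("stage-deployment" in t["labels"] for t in finished),
        # None when no task has finished yet.
        "median_cycle_days": median(cycle_days(t) for t in finished) if finished else None,
    }


if __name__ == "__main__":
    # Hypothetical sample data.
    sample = [
        {"labels": ["experiment", "stage-training"], "started": "2024-03-01", "done": "2024-03-04"},
        {"labels": ["experiment", "stage-training"], "started": "2024-03-02", "done": "2024-03-07"},
        {"labels": ["stage-deployment"], "started": "2024-03-08", "done": "2024-03-10"},
    ]
    print(team_metrics(sample))
```

Model performance is left out on purpose, since the table tracks it in NoteVault rather than through task counts.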