
Machine Learning Projects | 50% Faster Iterations

Manage ML projects with GitScrum experiment tracking and model lifecycle management. Coordinate data science and engineering teams. 50% faster iterations.

How do you use GitScrum for machine learning projects?

Manage ML projects in GitScrum by tracking each experiment as a task, labeling tasks by model type and stage, and documenting results in NoteVault. Coordinate data preparation, model training, and deployment on a visual workflow board. Teams with structured ML workflows are reported to iterate 50% faster [Source: MLOps Research 2024].

ML project workflow:

  • Research - Problem definition
  • Data prep - Collection, cleaning
  • Experimentation - Model development
  • Evaluation - Performance testing
  • Productionize - Engineering handoff
  • Deploy - Model serving
  • Monitor - Performance tracking

    ML labels

    | Label | Purpose |
    |---|---|
    | stage-research | Problem exploration |
    | stage-data | Data work |
    | stage-training | Model training |
    | stage-evaluation | Testing |
    | stage-deployment | Production |
    | model-[type] | Model type |
    | experiment | Experiment tracking |
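
The label scheme above can be checked before a task is tagged. The following is a minimal sketch in plain Python that calls no GitScrum API. The `validate_labels` helper, the rule that a task carries exactly one stage label, and the lowercase naming pattern for `model-[type]` are all assumptions made for illustration.

```python
import re

# Stage labels taken from the table above.
STAGE_LABELS = {
    "stage-research",
    "stage-data",
    "stage-training",
    "stage-evaluation",
    "stage-deployment",
}

# Assumed convention for model-[type]: lowercase words joined by hyphens.
MODEL_LABEL = re.compile(r"^model-[a-z0-9]+(?:-[a-z0-9]+)*$")


def validate_labels(labels):
    """Return a list of problems found in a task's labels (empty if none)."""
    problems = []
    stages = [label for label in labels if label.startswith("stage-")]
    for label in stages:
        if label not in STAGE_LABELS:
            problems.append(f"unknown stage label: {label}")
    # Assumption: every task sits in exactly one stage at a time.
    if len(stages) != 1:
        problems.append(f"expected exactly one stage label, found {len(stages)}")
    for label in labels:
        if label.startswith("model-") and not MODEL_LABEL.match(label):
            problems.append(f"malformed model label: {label}")
    return problems
```

A task labeled `stage-training`, `model-xgboost`, and `experiment` passes with no problems, while a task with only `experiment` is flagged for lacking a stage label.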

    ML columns

    | Column | Purpose |
    |---|---|
    | Backlog | Ideas, requirements |
    | Research | Problem analysis |
    | Data Prep | Data collection/cleaning |
    | Training | Model development |
    | Validation | Performance evaluation |
    | Staging | Pre-production |
    | Production | Deployed models |
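
One way to keep the board honest is to flag tasks whose column disagrees with their stage label. The sketch below assumes a mapping from stage labels to columns that the source does not state. The task dictionaries with `title`, `column`, and `stage` keys are hypothetical, not a GitScrum export format.

```python
# Columns in board order, as listed in the table above.
COLUMNS = ["Backlog", "Research", "Data Prep", "Training",
           "Validation", "Staging", "Production"]

# Assumed mapping from stage label to the columns where it belongs.
STAGE_TO_COLUMNS = {
    "stage-research": {"Research"},
    "stage-data": {"Data Prep"},
    "stage-training": {"Training"},
    "stage-evaluation": {"Validation"},
    "stage-deployment": {"Staging", "Production"},
}


def misplaced_tasks(tasks):
    """Return titles of tasks whose column does not match their stage label."""
    misplaced = []
    for task in tasks:
        if task["column"] not in COLUMNS:
            raise ValueError(f"unknown column: {task['column']}")
        allowed = STAGE_TO_COLUMNS.get(task["stage"])
        # Backlog items may carry any stage label before work starts.
        if task["column"] == "Backlog" or allowed is None:
            continue
        if task["column"] not in allowed:
            misplaced.append(task["title"])
    return misplaced
```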

    Experiment task template

    ## Experiment: [name]
    
    ### Hypothesis
    What we expect to learn/achieve
    
    ### Parameters
    - Model: [type]
    - Dataset: [name]
    - Features: [list]
    - Hyperparameters: [values]
    
    ### Results
    - Accuracy: [metric]
    - Other metrics: [values]
    - Observations: [notes]
    
    ### Conclusion
    Keep/discard and why
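
Filling in the template by hand invites inconsistencies, so a team might generate the task description from structured data. This is a minimal sketch under that assumption. The `Experiment` dataclass and `render_experiment` function are hypothetical names, and the output is plain text to paste into a task rather than a call to any GitScrum API.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    name: str
    hypothesis: str
    model: str
    dataset: str
    features: list
    hyperparameters: dict
    metrics: dict = field(default_factory=dict)
    observations: str = ""
    conclusion: str = ""


def render_experiment(exp: Experiment) -> str:
    """Render an experiment as text that follows the task template above."""
    hyperparams = ", ".join(f"{k}={v}" for k, v in exp.hyperparameters.items())
    lines = [
        f"## Experiment: {exp.name}",
        "",
        "### Hypothesis",
        exp.hypothesis,
        "",
        "### Parameters",
        f"- Model: {exp.model}",
        f"- Dataset: {exp.dataset}",
        f"- Features: {', '.join(exp.features)}",
        f"- Hyperparameters: {hyperparams}",
        "",
        "### Results",
    ]
    # One bullet per metric; empty until the experiment has run.
    lines += [f"- {name}: {value}" for name, value in exp.metrics.items()]
    lines.append(f"- Observations: {exp.observations or 'TBD'}")
    lines += ["", "### Conclusion", exp.conclusion or "TBD"]
    return "\n".join(lines)
```

Sections that have no content yet render as `TBD`, so the same function serves both when the task is created and when results are recorded.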
    

    NoteVault ML documentation

    | Document | Content |
    |---|---|
    | Model registry | All models, versions |
    | Data catalog | Datasets, features |
    | Experiment log | All experiments |
    | Model cards | Model documentation |
    | Deployment guides | How to deploy |

    Data science vs engineering

    | Team | Tasks |
    |---|---|
    | Data Science | Research, experiments, evaluation |
    | ML Engineering | Productionization, deployment |
    | Data Engineering | Data pipelines |
    | DevOps | Infrastructure, MLOps |

    Experiment tracking

    | Field | Record |
    |---|---|
    | Parameters | Model config |
    | Metrics | Performance measures |
    | Artifacts | Model files |
    | Code | Experiment version |
    | Data | Dataset version |
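
The five fields above can be captured in a single record that is attached to the experiment task or stored in NoteVault. The sketch below assumes the code version is a Git commit hash and the data version is a SHA-256 digest of the dataset file. Both choices, and all function names, are illustrative assumptions rather than anything GitScrum prescribes.

```python
import hashlib
import json
import subprocess


def current_commit():
    """Return the current Git commit hash (must run inside a Git repository)."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def dataset_fingerprint(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a dataset file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_record(parameters, metrics, artifacts, code_version, data_version):
    """Bundle the five tracked fields into one JSON string."""
    record = {
        "parameters": parameters,
        "metrics": metrics,
        "artifacts": artifacts,
        "code": code_version,
        "data": data_version,
    }
    return json.dumps(record, indent=2, sort_keys=True)
```

Hashing the file rather than recording its name means a silently changed dataset produces a different data version, which makes "lost" or irreproducible experiments easier to spot.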

    Model lifecycle stages

    | Stage | Tasks |
    |---|---|
    | Development | Experiments, prototypes |
    | Validation | A/B tests, holdout tests |
    | Canary | Limited production |
    | Production | Full deployment |
    | Retired | Replaced/removed |
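
The lifecycle stages above behave like a small state machine. The sketch below encodes them as an enum with a transition table. The stage names come from the table, but the allowed transitions (for example, that a canary can fall back to validation) are assumptions a team would adjust to its own process.

```python
from enum import Enum


class Stage(Enum):
    DEVELOPMENT = "Development"
    VALIDATION = "Validation"
    CANARY = "Canary"
    PRODUCTION = "Production"
    RETIRED = "Retired"


# Assumed transitions: forward one step, back one step on failure,
# or straight to Retired. Production models can only be retired.
ALLOWED = {
    Stage.DEVELOPMENT: {Stage.VALIDATION, Stage.RETIRED},
    Stage.VALIDATION: {Stage.CANARY, Stage.DEVELOPMENT, Stage.RETIRED},
    Stage.CANARY: {Stage.PRODUCTION, Stage.VALIDATION, Stage.RETIRED},
    Stage.PRODUCTION: {Stage.RETIRED},
    Stage.RETIRED: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the target stage if the move is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(
            f"cannot move model from {current.value} to {target.value}"
        )
    return target
```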

    MLOps workflow

    | Phase | GitScrum Tracking |
    |---|---|
    | CI | Training pipeline tasks |
    | CD | Deployment tasks |
    | CT | Continuous training tasks |
    | Monitoring | Performance tasks |

    Common ML project issues

    | Issue | Solution |
    |---|---|
    | Lost experiments | Experiment label, documentation |
    | Handoff failures | Linked tasks DS ↔ Eng |
    | Model drift | Monitoring tasks |
    | Data issues | Data quality tasks |
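
For the model drift row, a monitoring task could be opened whenever a tracked metric falls too far below its baseline. The sketch below shows only that decision. The `drift_task_title` helper, the `[monitoring]` title prefix, and the default tolerance of 0.02 are hypothetical, and creating the task in GitScrum is left to whatever integration the team uses.

```python
def drift_task_title(model, baseline, current, tolerance=0.02):
    """Return a task title if the metric fell more than `tolerance`
    below its baseline, otherwise None."""
    drop = baseline - current
    if drop > tolerance:
        return f"[monitoring] {model}: metric dropped {drop:.3f} below baseline"
    return None
```

A simple threshold like this catches only performance drops that are already measurable. It does not detect changes in the input data distribution.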

    ML team metrics

    | Metric | Track |
    |---|---|
    | Experiments run | Task count |
    | Models deployed | Deployment tasks |
    | Iteration time | Cycle time |
    | Model performance | NoteVault tracking |
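
The first three metrics above can be derived from a list of tasks. The sketch below assumes each task is a dictionary with `labels`, `started`, and an optional `finished` date. That shape is hypothetical and would need to be adapted to however task data is exported.

```python
from datetime import date
from statistics import mean


def team_metrics(tasks):
    """Summarize finished tasks: experiment count, deployments, cycle time."""
    done = [t for t in tasks if t.get("finished")]
    experiments = [t for t in done if "experiment" in t["labels"]]
    deployments = [t for t in done if "stage-deployment" in t["labels"]]
    # Cycle time here is calendar days from start to finish.
    cycle_days = [(t["finished"] - t["started"]).days for t in done]
    return {
        "experiments_run": len(experiments),
        "models_deployed": len(deployments),
        "avg_cycle_days": mean(cycle_days) if cycle_days else None,
    }
```

Unfinished tasks are excluded from every count, so the numbers reflect completed work only.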
