How to Use GitScrum for Machine Learning Projects?
Manage ML projects in GitScrum by tracking experiments as tasks, using labels for model types and stages, and documenting results in NoteVault. Coordinate data preparation, model training, and deployment through a visual workflow. Teams with structured ML workflows iterate 50% faster [Source: MLOps Research 2024].
ML project workflow:
- Research - Problem definition
- Data prep - Collection, cleaning
- Experimentation - Model development
- Evaluation - Performance testing
- Productionize - Engineering handoff
- Deploy - Model serving
- Monitor - Performance tracking
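The stages above form an ordered pipeline. A minimal sketch of that ordering in Python (the `MLStage` enum and `next_stage` helper are illustrative names, not part of GitScrum):

```python
from enum import Enum
from typing import Optional

class MLStage(Enum):
    """Ordered stages of the ML workflow listed above."""
    RESEARCH = 1
    DATA_PREP = 2
    EXPERIMENTATION = 3
    EVALUATION = 4
    PRODUCTIONIZE = 5
    DEPLOY = 6
    MONITOR = 7

def next_stage(stage: MLStage) -> Optional[MLStage]:
    """Return the stage that follows, or None once monitoring is reached."""
    members = list(MLStage)
    i = members.index(stage)
    return members[i + 1] if i + 1 < len(members) else None
```

Encoding the order once makes it easy to validate that a task only moves forward (or deliberately backward) through the pipeline.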
ML labels
| Label | Purpose |
|---|---|
| stage-research | Problem exploration |
| stage-data | Data work |
| stage-training | Model training |
| stage-evaluation | Testing |
| stage-deployment | Production |
| model-[type] | Model type |
| experiment | Experiment tracking |
ML columns
| Column | Purpose |
|---|---|
| Backlog | Ideas, requirements |
| Research | Problem analysis |
| Data Prep | Data collection/cleaning |
| Training | Model development |
| Validation | Performance evaluation |
| Staging | Pre-production |
| Production | Deployed models |
Experiment task template
## Experiment: [name]
### Hypothesis
What we expect to learn/achieve
### Parameters
- Model: [type]
- Dataset: [name]
- Features: [list]
- Hyperparameters: [values]
### Results
- Accuracy: [metric]
- Other metrics: [values]
- Observations: [notes]
### Conclusion
Keep/discard and why
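The same template can be kept as a structured record so every experiment task is filled in consistently. A sketch using a Python dataclass (field names mirror the template; the class itself is an assumption, not a GitScrum feature):

```python
from dataclasses import dataclass, field

@dataclass
class Experiment:
    """One experiment record, matching the task template above."""
    name: str
    hypothesis: str
    model: str
    dataset: str
    features: list
    hyperparameters: dict
    metrics: dict = field(default_factory=dict)
    observations: str = ""
    conclusion: str = ""  # keep/discard and why

    def to_markdown(self) -> str:
        """Render the record in the task-template format."""
        lines = [
            f"## Experiment: {self.name}",
            "### Hypothesis", self.hypothesis,
            "### Parameters",
            f"- Model: {self.model}",
            f"- Dataset: {self.dataset}",
            f"- Features: {', '.join(self.features)}",
            f"- Hyperparameters: {self.hyperparameters}",
            "### Results",
        ]
        lines += [f"- {k}: {v}" for k, v in self.metrics.items()]
        if self.observations:
            lines.append(f"- Observations: {self.observations}")
        lines += ["### Conclusion", self.conclusion]
        return "\n".join(lines)
```

Calling `to_markdown()` produces text that can be pasted straight into a GitScrum task description.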
NoteVault ML documentation
| Document | Content |
|---|---|
| Model registry | All models, versions |
| Data catalog | Datasets, features |
| Experiment log | All experiments |
| Model cards | Model documentation |
| Deployment guides | How to deploy |
Data science vs engineering
| Team | Tasks |
|---|---|
| Data Science | Research, experiments, evaluation |
| ML Engineering | Productionization, deployment |
| Data Engineering | Data pipelines |
| DevOps | Infrastructure, MLOps |
Experiment tracking
| Field | Record |
|---|---|
| Parameters | Model config |
| Metrics | Performance measures |
| Artifacts | Model files |
| Code | Code version used for the experiment |
| Data | Dataset version |
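A lightweight way to capture these five fields is an append-only JSON-lines log kept alongside the task. A sketch (the `log_experiment` function and its parameters are illustrative, assuming ISO-timestamped records):

```python
import json
from datetime import datetime, timezone

def log_experiment(path, *, parameters, metrics, artifacts,
                   code_version, data_version):
    """Append one experiment record (the five tracked fields) to a
    JSON-lines log file and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "parameters": parameters,      # model config
        "metrics": metrics,            # performance measures
        "artifacts": artifacts,        # model files
        "code": code_version,          # experiment code version
        "data": data_version,          # dataset version
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

Each GitScrum experiment task can then link to its line in the log, so results survive even when tasks are archived.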
Model lifecycle stages
| Stage | Tasks |
|---|---|
| Development | Experiments, prototypes |
| Validation | A/B tests, holdout tests |
| Canary | Limited production |
| Production | Full deployment |
| Retired | Replaced/removed |
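Moves between these stages can be validated so a model never jumps, say, from Development straight to Production. A sketch of the allowed transitions (the transition map is an assumption about a typical process; adjust it to your team's rules):

```python
# Allowed moves between the lifecycle stages above (assumed ordering;
# failed validation or a bad canary sends the model back to Development).
TRANSITIONS = {
    "Development": {"Validation"},
    "Validation": {"Canary", "Development"},
    "Canary": {"Production", "Development"},
    "Production": {"Retired"},
    "Retired": set(),
}

def can_move(current: str, target: str) -> bool:
    """Check whether a model may move from one lifecycle stage to another."""
    return target in TRANSITIONS.get(current, set())
```

The same map can back a board rule: only create a task in the Production column when `can_move` allows it.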
MLOps workflow
| Phase | GitScrum Tracking |
|---|---|
| CI | Training pipeline tasks |
| CD | Deployment tasks |
| CT | Continuous training tasks |
| Monitoring | Performance tasks |
Common ML project issues
| Issue | Solution |
|---|---|
| Lost experiments | Experiment label, documentation |
| Handoff failures | Linked tasks DS ↔ Eng |
| Model drift | Monitoring tasks |
| Data issues | Data quality tasks |
ML team metrics
| Metric | Track |
|---|---|
| Experiments run | Task count |
| Models deployed | Deployment tasks |
| Iteration time | Cycle time |
| Model performance | NoteVault tracking |
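Iteration time is the only metric above that needs a small calculation: the elapsed time between a task starting and finishing, averaged across tasks. A sketch assuming ISO-format timestamps exported from the board:

```python
from datetime import datetime

def cycle_time_days(started: str, finished: str) -> float:
    """Cycle time of one task in days, from ISO-format timestamps."""
    delta = datetime.fromisoformat(finished) - datetime.fromisoformat(started)
    return delta.total_seconds() / 86400

def average_cycle_time(tasks) -> float:
    """Average cycle time over (started, finished) pairs."""
    times = [cycle_time_days(s, f) for s, f in tasks]
    return sum(times) / len(times) if times else 0.0
```

Tracking this average sprint over sprint shows whether the structured workflow is actually shortening the experiment loop.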