# Plant Identification Core ML Model - Development Plan

## Overview

Build a plant knowledge base from a curated plant list, then source/create an image dataset to train the Core ML model for visual plant identification.

---

## Phase 1: Knowledge Base Creation from Plant List

**Goal:** Build structured plant knowledge from a curated plant list (CSV/JSON), enriching it with taxonomy and characteristics.

| Task | Description |
|------|-------------|
| 1.1 | Load and validate plant list file (CSV/JSON) |
| 1.2 | Normalize and standardize plant names |
| 1.3 | Create a master plant list with deduplicated entries |
| 1.4 | Enrich with physical characteristics (leaf shape, flower color, height, etc.) |
| 1.5 | Categorize plants by type (flower, tree, shrub, vegetable, herb, succulent) |
| 1.6 | Map common names to scientific names (binomial nomenclature) |
| 1.7 | Add regional/seasonal information from external sources |

**Deliverable:** Structured plant knowledge base (JSON/SQLite) with ~500-2000 plant entries
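
Tasks 1.1-1.3 can be sketched as below; the CSV columns (`common_name`, `scientific_name`, `type`) are hypothetical stand-ins for whatever schema the curated list actually uses:

```python
import csv
from io import StringIO

# Hypothetical plant-list columns; the real schema depends on the curated file.
SAMPLE = """common_name,scientific_name,type
Peace Lily ,Spathiphyllum wallisii,houseplant
peace  lily,Spathiphyllum wallisii,houseplant
Snake Plant,Dracaena trifasciata,succulent
"""

def load_plants(fileobj):
    """Load, normalize, and deduplicate a plant list (tasks 1.1-1.3)."""
    seen, plants = set(), []
    for row in csv.DictReader(fileobj):
        # Collapse whitespace and case so "Peace Lily " == "peace lily"
        common = " ".join(row["common_name"].strip().lower().split())
        sci = " ".join(row["scientific_name"].strip().split())
        key = sci.lower()  # dedupe on the scientific (binomial) name
        if key in seen:
            continue
        seen.add(key)
        plants.append({"common_name": common,
                       "scientific_name": sci,
                       "type": row["type"].strip()})
    return plants

plants = load_plants(StringIO(SAMPLE))
print([p["common_name"] for p in plants])  # ['peace lily', 'snake plant']
```

Deduplicating on the scientific name keeps one master entry per species; task 1.6 then maps any remaining common-name aliases onto these binomials.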

---

## Phase 2: Image Dataset Acquisition

**Goal:** Gather labeled plant images matching our knowledge base.

| Task | Description |
|------|-------------|
| 2.1 | Research public plant image datasets (PlantCLEF, iNaturalist, Pl@ntNet) |
| 2.2 | Cross-reference available datasets with Phase 1 plant list |
| 2.3 | Download and organize images by species/category |
| 2.4 | Establish minimum image count per class (target: 100+ images per plant) |
| 2.5 | Identify gaps - plants in our knowledge base without sufficient images |
| 2.6 | Source supplementary images for gap plants (Flickr API, Wikimedia Commons) |
| 2.7 | Verify image quality and label accuracy (remove mislabeled/low-quality) |
| 2.8 | Split dataset: 70% training, 15% validation, 15% test |

**Deliverable:** Labeled image dataset with 50,000-200,000 images across target plant classes
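
The 70/15/15 split in task 2.8 can be sketched with a seeded shuffle; a stratified variant (running this per class so every species keeps the same ratio) would simply wrap it in a per-label loop:

```python
import random

def split_dataset(items, train=0.70, val=0.15, seed=42):
    """Shuffle and split labeled items into train/val/test sets (task 2.8)."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded so splits are reproducible
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# Stand-in for a list of (image_path, label) records
train_set, val_set, test_set = split_dataset(range(1000))
print(len(train_set), len(val_set), len(test_set))  # 700 150 150
```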

---

## Phase 3: Dataset Preprocessing & Augmentation

**Goal:** Prepare images for training with consistent formatting and augmentation.

| Task | Description |
|------|-------------|
| 3.1 | Standardize image dimensions (e.g., 224x224 or 299x299) |
| 3.2 | Normalize color channels and handle various image formats |
| 3.3 | Implement data augmentation pipeline (rotation, flip, brightness, crop) |
| 3.4 | Create augmented variants to balance underrepresented classes |
| 3.5 | Generate image manifest files mapping paths to labels |
| 3.6 | Validate dataset integrity (no corrupted files, correct labels) |

**Deliverable:** Training-ready dataset with augmentation pipeline
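
The manifest generation in task 3.5 might look like this, assuming a class-per-directory layout (`root/<label>/<image>`); the layout is an assumption for the sketch, not a requirement of the plan:

```python
import json
import tempfile
from pathlib import Path

def build_manifest(root):
    """Map image paths to class labels (task 3.5), assuming a
    class-per-directory layout: root/<label>/<image>."""
    return [{"path": str(p), "label": p.parent.name}
            for p in sorted(Path(root).glob("*/*"))
            if p.suffix.lower() in {".jpg", ".jpeg", ".png"}]

# Tiny demo with a throwaway directory standing in for the real dataset root.
root = Path(tempfile.mkdtemp())
(root / "rosa_canina").mkdir()
(root / "rosa_canina" / "img_001.jpg").touch()
manifest = build_manifest(root)
print(json.dumps(manifest))  # one entry labeled "rosa_canina"
```

Writing the manifest out as JSON keeps the label mapping auditable, which helps with the integrity checks in task 3.6.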

---

## Phase 4: Model Architecture Selection

**Goal:** Choose and configure the optimal model architecture for on-device inference.

| Task | Description |
|------|-------------|
| 4.1 | Evaluate architectures: MobileNetV3, EfficientNet-Lite, ResNet50, Vision Transformer |
| 4.2 | Benchmark model size vs accuracy tradeoffs for mobile deployment |
| 4.3 | Select base architecture (recommend: MobileNetV3 or EfficientNet-Lite for iOS) |
| 4.4 | Configure transfer learning from ImageNet pretrained weights |
| 4.5 | Design classification head for our plant class count |
| 4.6 | Define target metrics: accuracy >85%, model size <50MB, inference <100ms |

**Deliverable:** Model architecture specification document
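
For task 4.2, a back-of-the-envelope size check already separates the candidates; the parameter counts below are approximate figures from the original papers and should be treated as ballpark numbers:

```python
def estimated_size_mb(num_params, bytes_per_weight=2):
    """Rough on-disk size after Float16 quantization (2 bytes/weight)."""
    return num_params * bytes_per_weight / 1e6

# Approximate backbone parameter counts (ballpark, from the original papers).
candidates = {"MobileNetV3-Large": 5.4e6,
              "EfficientNet-Lite0": 4.7e6,
              "ResNet50": 25.6e6}
for name, params in candidates.items():
    print(f"{name}: ~{estimated_size_mb(params):.1f} MB at Float16")
```

At Float16, ResNet50 alone (~51 MB) already exceeds the <50MB budget before the classification head is added, which supports the MobileNetV3/EfficientNet-Lite recommendation in task 4.3.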

---

## Phase 5: Initial Training Run

**Goal:** Train baseline model and establish performance benchmarks.

| Task | Description |
|------|-------------|
| 5.1 | Set up training environment (PyTorch/TensorFlow with GPU) |
| 5.2 | Implement training loop with learning rate scheduling |
| 5.3 | Train baseline model for 50 epochs |
| 5.4 | Log training/validation loss and accuracy curves |
| 5.5 | Evaluate on test set - document per-class accuracy |
| 5.6 | Identify problematic classes (low accuracy, high confusion) |
| 5.7 | Generate confusion matrix to find commonly confused plant pairs |

**Deliverable:** Baseline model with documented accuracy metrics
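
The learning-rate scheduling in task 5.2 could use a standard cosine decay with linear warmup; the hyperparameters shown (base LR, warmup steps) are illustrative, not tuned values:

```python
import math

def cosine_lr(step, total_steps, base_lr=1e-3, warmup=500):
    """Cosine learning-rate schedule with linear warmup (task 5.2)."""
    if step < warmup:
        return base_lr * step / warmup  # linear ramp-up
    progress = (step - warmup) / max(1, total_steps - warmup)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))

# LR ramps up during warmup, peaks at base_lr, then decays smoothly to ~0.
print(cosine_lr(250, 10_000), cosine_lr(500, 10_000), cosine_lr(10_000, 10_000))
```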

---

## Phase 6: Model Refinement & Iteration

**Goal:** Improve model through iterative refinement cycles.

| Task | Description |
|------|-------------|
| 6.1 | Address class imbalance with weighted loss or oversampling |
| 6.2 | Fine-tune hyperparameters (learning rate, batch size, dropout) |
| 6.3 | Experiment with different augmentation strategies |
| 6.4 | Add more training data for underperforming classes |
| 6.5 | Consider hierarchical classification (family -> genus -> species) |
| 6.6 | Implement hard negative mining for confused pairs |
| 6.7 | Re-train and evaluate until target accuracy achieved |
| 6.8 | Perform k-fold cross-validation for robust metrics |

**Deliverable:** Refined model meeting accuracy targets (>85% top-1, >95% top-5)
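
Task 6.1's weighted loss needs per-class weights; a common inverse-frequency scheme, sketched here, scales them so the average weight over all training samples is 1.0:

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights for a weighted loss (task 6.1).

    Scaled so the average weight over all training samples is 1.0.
    """
    counts = Counter(labels)
    total, n_classes = len(labels), len(counts)
    return {c: total / (n_classes * k) for c, k in counts.items()}

labels = ["rose"] * 90 + ["fern"] * 10
w = class_weights(labels)
print(w)  # the rare "fern" class is weighted 9x higher than "rose"
```

Ordered by class index, these weights plug into e.g. PyTorch's `CrossEntropyLoss(weight=...)`; oversampling the rare classes is the alternative named in the same task.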

---

## Phase 7: Core ML Conversion & Optimization

**Goal:** Convert trained model to Core ML format optimized for iOS.

| Task | Description |
|------|-------------|
| 7.1 | Export trained model to ONNX or saved model format |
| 7.2 | Convert to Core ML using coremltools |
| 7.3 | Apply quantization (Float16 or Int8) to reduce model size |
| 7.4 | Configure model metadata (class labels, input/output specs) |
| 7.5 | Test converted model accuracy matches original |
| 7.6 | Optimize for Neural Engine execution |
| 7.7 | Benchmark inference speed on target devices (iPhone 12+) |

**Deliverable:** Optimized `.mlmodel` or `.mlpackage` file
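
Tasks 7.2-7.4 with coremltools might look like the following sketch. It assumes a traced PyTorch model (`traced_model`) and the Phase 1 label list (`class_labels`) already exist, and the input name, shape, and scale are placeholders to match however Phase 3 preprocessed the images:

```python
import coremltools as ct

mlmodel = ct.convert(
    traced_model,  # torch.jit.trace output from the Phase 5/6 model
    inputs=[ct.ImageType(name="image", shape=(1, 3, 224, 224),
                         scale=1 / 255.0)],
    classifier_config=ct.ClassifierConfig(class_labels),  # Phase 1 labels
    compute_precision=ct.precision.FLOAT16,  # task 7.3: halves weight size
    convert_to="mlprogram",  # modern .mlpackage format
)
mlmodel.short_description = "Plant species classifier"
mlmodel.save("PlantClassifier.mlpackage")
```

Task 7.5 would then run the same held-out images through both the original and converted models and compare top-1 agreement before shipping the package.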

---

## Phase 8: iOS Integration Testing

**Goal:** Validate model performance in real iOS environment.

| Task | Description |
|------|-------------|
| 8.1 | Create test iOS app with camera capture |
| 8.2 | Integrate Core ML model with Vision framework |
| 8.3 | Test with real-world plant photos (not from training set) |
| 8.4 | Measure on-device inference latency |
| 8.5 | Test edge cases (partial plants, multiple plants, poor lighting) |
| 8.6 | Gather user feedback on identification accuracy |
| 8.7 | Document failure modes and edge cases |

**Deliverable:** Validated model with real-world accuracy report

---

## Phase 9: Knowledge Integration

**Goal:** Combine visual model with plant knowledge base for rich results.

| Task | Description |
|------|-------------|
| 9.1 | Link model class predictions to Phase 1 knowledge base |
| 9.2 | Design result payload (name, description, care tips, characteristics) |
| 9.3 | Add confidence thresholds and "unknown plant" handling |
| 9.4 | Implement top-N predictions with confidence scores |
| 9.5 | Create fallback for low-confidence identifications |

**Deliverable:** Complete plant identification system with rich metadata
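
Tasks 9.1 and 9.3-9.5 can be sketched together; the threshold value, payload fields, and knowledge-base entries are illustrative placeholders:

```python
def interpret(predictions, knowledge_base, threshold=0.60, top_n=3):
    """Map model output to knowledge-base entries (tasks 9.1, 9.3, 9.4).

    `predictions` maps class label -> softmax confidence;
    `knowledge_base` maps labels to Phase 1 metadata.
    """
    ranked = sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)
    ranked = ranked[:top_n]
    best_label, best_conf = ranked[0]
    if best_conf < threshold:
        # Low confidence: "unknown plant" fallback with suggestions (task 9.5)
        return {"status": "unknown", "suggestions": ranked}
    return {"status": "identified", "label": best_label,
            "confidence": best_conf,
            "info": knowledge_base.get(best_label, {}),
            "alternatives": ranked[1:]}

kb = {"rosa_canina": {"common_name": "dog rose", "care": "full sun"}}
result = interpret({"rosa_canina": 0.91, "rosa_rugosa": 0.06}, kb)
print(result["status"], result["info"]["common_name"])  # identified dog rose
```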

---

## Phase 10: Final Validation & Documentation

**Goal:** Comprehensive testing and production readiness.

| Task | Description |
|------|-------------|
| 10.1 | Run full test suite across diverse plant images |
| 10.2 | Document supported plant list with accuracy per species |
| 10.3 | Create model card (training data, limitations, biases) |
| 10.4 | Write iOS integration guide |
| 10.5 | Package final `.mlmodel` with metadata and labels |
| 10.6 | Establish model versioning and update strategy |

**Deliverable:** Production-ready Core ML model with documentation
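
The model card in task 10.3 can start as a simple structured document; every value below is a placeholder to be filled in from the actual training runs:

```python
import json

# All values are placeholders, to be filled in from the real training runs.
model_card = {
    "name": "PlantClassifier",
    "version": "1.0.0",
    "architecture": "MobileNetV3-Large (transfer learning from ImageNet)",
    "num_classes": None,  # 200-500 species per the target spec
    "training_data": ["PlantCLEF", "iNaturalist", "supplementary images"],
    "metrics": {"top1_accuracy": None, "top5_accuracy": None},
    "limitations": [
        "accuracy degrades in poor lighting and with partial plants",
        "species outside the supported list fall back to unknown",
    ],
}
print(json.dumps(model_card, indent=2))
```

Keeping the card as machine-readable JSON lets the versioning strategy in task 10.6 diff it between model releases.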

---

## Summary

| Phase | Focus | Key Deliverable |
|-------|-------|-----------------|
| 1 | Knowledge Base Creation | Plant knowledge base from plant list |
| 2 | Image Acquisition | Labeled dataset (50K-200K images) |
| 3 | Preprocessing | Training-ready augmented dataset |
| 4 | Architecture | Model design specification |
| 5 | Initial Training | Baseline model + benchmarks |
| 6 | Refinement | Optimized model (>85% accuracy) |
| 7 | Core ML Conversion | Quantized `.mlmodel` file |
| 8 | iOS Testing | Real-world validation report |
| 9 | Knowledge Integration | Rich identification results |
| 10 | Final Validation | Production-ready package |

---

## Key Insights

The plant list provides **structured plant data** (names, characteristics), but visual identification requires image training data. The plan therefore combines the plant knowledge base with external image datasets to create a complete plant identification system.

## Target Specifications

| Metric | Target |
|--------|--------|
| Plant Classes | 200-500 species |
| Top-1 Accuracy | >85% |
| Top-5 Accuracy | >95% |
| Model Size | <50MB |
| Inference Time | <100ms on iPhone 12+ |

## Recommended Datasets

- **PlantCLEF** - Annual plant identification challenge dataset
- **iNaturalist** - Community-sourced plant observations
- **Pl@ntNet** - Botanical research dataset
- **Oxford Flowers** - 102 flower categories
- **Wikimedia Commons** - Supplementary images

## Recommended Architecture

**MobileNetV3-Large** or **EfficientNet-Lite** for the optimal balance of:

- On-device performance
- Model size constraints
- Classification accuracy
- Neural Engine compatibility