
Plant Identification Core ML Model - Development Plan

Overview

Build a plant knowledge base from a curated plant list, then source/create an image dataset to train the Core ML model for visual plant identification.


Phase 1: Knowledge Base Creation from Plant List

Goal: Build structured plant knowledge from a curated plant list (CSV/JSON), enriched with taxonomy and characteristic data.

Tasks:
1.1 Load and validate plant list file (CSV/JSON)
1.2 Normalize and standardize plant names
1.3 Create a master plant list with deduplicated entries
1.4 Enrich with physical characteristics (leaf shape, flower color, height, etc.)
1.5 Categorize plants by type (flower, tree, shrub, vegetable, herb, succulent)
1.6 Map common names to scientific names (binomial nomenclature)
1.7 Add regional/seasonal information from external sources

Deliverable: Structured plant knowledge base (JSON/SQLite) with ~500-2000 plant entries
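The normalization and deduplication steps (1.2-1.3) can be sketched in Python; the PlantEntry shape and helper names here are illustrative, not part of the plan:

```python
from dataclasses import dataclass, field

@dataclass
class PlantEntry:
    scientific_name: str                 # binomial nomenclature, e.g. "Ficus lyrata"
    common_names: list = field(default_factory=list)
    category: str = "unknown"            # flower, tree, shrub, vegetable, herb, succulent
    characteristics: dict = field(default_factory=dict)

def normalize_name(name: str) -> str:
    """Collapse whitespace and apply binomial capitalization (Genus species)."""
    parts = name.strip().split()
    if not parts:
        return ""
    return " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])

def deduplicate(entries):
    """Merge entries whose scientific names normalize to the same binomial."""
    merged = {}
    for e in entries:
        key = normalize_name(e.scientific_name)
        if key in merged:
            merged[key].common_names = sorted(
                set(merged[key].common_names) | set(e.common_names))
        else:
            e.scientific_name = key
            merged[key] = e
    return list(merged.values())
```

The merge keeps one canonical entry per species while preserving every common name seen, which is what task 1.6's common-to-scientific mapping needs downstream.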


Phase 2: Image Dataset Acquisition

Goal: Gather labeled plant images matching our knowledge base.

Tasks:
2.1 Research public plant image datasets (PlantCLEF, iNaturalist, Pl@ntNet, Oxford Flowers)
2.2 Cross-reference available datasets with Phase 1 plant list
2.3 Download and organize images by species/category
2.4 Establish minimum image count per class (target: 100+ images per plant)
2.5 Identify gaps - plants in our knowledge base without sufficient images
2.6 Source supplementary images for gap plants (Flickr API, Wikimedia Commons)
2.7 Verify image quality and label accuracy (remove mislabeled/low-quality)
2.8 Split dataset: 70% training, 15% validation, 15% test

Deliverable: Labeled image dataset with 50,000-200,000 images across target plant classes
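Task 2.8's split should be stratified per class so rarer plants appear in all three sets; a minimal stdlib sketch (the function name and the 70/15/15 defaults mirror the plan, everything else is illustrative):

```python
import random
from collections import defaultdict

def stratified_split(samples, train=0.70, val=0.15, seed=42):
    """Split (path, label) pairs per class so each class keeps ~70/15/15 proportions."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    splits = {"train": [], "val": [], "test": []}
    for label, paths in by_label.items():
        rng.shuffle(paths)
        n_train = int(len(paths) * train)
        n_val = int(len(paths) * val)
        splits["train"] += [(p, label) for p in paths[:n_train]]
        splits["val"]   += [(p, label) for p in paths[n_train:n_train + n_val]]
        splits["test"]  += [(p, label) for p in paths[n_train + n_val:]]
    return splits
```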


Phase 3: Dataset Preprocessing & Augmentation

Goal: Prepare images for training with consistent formatting and augmentation.

Tasks:
3.1 Standardize image dimensions (e.g., 224x224 or 299x299)
3.2 Normalize color channels and handle various image formats
3.3 Implement data augmentation pipeline (rotation, flip, brightness, crop)
3.4 Create augmented variants to balance underrepresented classes
3.5 Generate image manifest files mapping paths to labels
3.6 Validate dataset integrity (no corrupted files, correct labels)

Deliverable: Training-ready dataset with augmentation pipeline
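Task 3.5's manifest can be generated by walking a `<root>/<label>/` folder layout; that layout and the helper below are assumptions, not a prescribed structure, and the unknown-directory check covers part of task 3.6's label validation:

```python
import json
from pathlib import Path

def build_manifest(dataset_root, known_labels, out_path="manifest.json"):
    """Walk <root>/<label>/<image>.jpg folders, map each image path to its label,
    and flag directories whose name is not in the knowledge base."""
    root = Path(dataset_root)
    records, unknown = [], []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        if class_dir.name not in known_labels:
            unknown.append(class_dir.name)   # label missing from Phase 1 output
            continue
        for img in sorted(class_dir.glob("*.jpg")):
            records.append({"path": str(img), "label": class_dir.name})
    Path(out_path).write_text(json.dumps(records, indent=2))
    return records, unknown
```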


Phase 4: Model Architecture Selection

Goal: Choose and configure the optimal model architecture for on-device inference.

Tasks:
4.1 Evaluate architectures: MobileNetV3, EfficientNet-Lite, ResNet50, Vision Transformer
4.2 Benchmark model size vs accuracy tradeoffs for mobile deployment
4.3 Select base architecture (recommend: MobileNetV3 or EfficientNet-Lite for iOS)
4.4 Configure transfer learning from ImageNet pretrained weights
4.5 Design classification head for our plant class count
4.6 Define target metrics: accuracy >85%, model size <50MB, inference <100ms

Deliverable: Model architecture specification document
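The <50MB size target in 4.6 can be sanity-checked from backbone parameter counts alone. The counts below are approximate, commonly cited ImageNet figures, and actual Core ML file sizes will vary with the classification head, quantization, and metadata:

```python
def model_size_mb(params_millions, bytes_per_weight):
    """Approximate on-disk size of a weights-only model."""
    return params_millions * 1e6 * bytes_per_weight / (1024 * 1024)

# Approximate backbone parameter counts, in millions (head excluded)
backbones = {
    "MobileNetV3-Large": 5.4,
    "EfficientNet-Lite0": 4.7,
    "ResNet50": 25.6,
}

for name, params in backbones.items():
    fp32 = model_size_mb(params, 4)   # Float32: 4 bytes per weight
    fp16 = model_size_mb(params, 2)   # Float16: 2 bytes per weight
    print(f"{name}: ~{fp32:.0f} MB fp32, ~{fp16:.0f} MB fp16")
```

Even at Float32, the mobile backbones sit comfortably under the 50MB budget, while ResNet50 only fits after quantization, which supports the recommendation in 4.3.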


Phase 5: Initial Training Run

Goal: Train baseline model and establish performance benchmarks.

Tasks:
5.1 Set up training environment (PyTorch/TensorFlow with GPU)
5.2 Implement training loop with learning rate scheduling
5.3 Train baseline model for 50 epochs
5.4 Log training/validation loss and accuracy curves
5.5 Evaluate on test set - document per-class accuracy
5.6 Identify problematic classes (low accuracy, high confusion)
5.7 Generate confusion matrix to find commonly confused plant pairs

Deliverable: Baseline model with documented accuracy metrics
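Tasks 5.5-5.7 need only the test-set labels and predictions; a stdlib sketch of per-class accuracy and confused-pair counting (helper names are illustrative):

```python
from collections import Counter

def per_class_accuracy(y_true, y_pred):
    """Fraction of correct predictions per class (task 5.5)."""
    totals, correct = Counter(y_true), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            correct[t] += 1
    return {label: correct[label] / totals[label] for label in totals}

def confusion_pairs(y_true, y_pred, top_n=3):
    """Count off-diagonal (true, predicted) pairs to surface the most
    commonly confused class pairs (task 5.7)."""
    pairs = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return pairs.most_common(top_n)
```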


Phase 6: Model Refinement & Iteration

Goal: Improve model through iterative refinement cycles.

Tasks:
6.1 Address class imbalance with weighted loss or oversampling
6.2 Fine-tune hyperparameters (learning rate, batch size, dropout)
6.3 Experiment with different augmentation strategies
6.4 Add more training data for underperforming classes
6.5 Consider hierarchical classification (family -> genus -> species)
6.6 Implement hard negative mining for confused pairs
6.7 Re-train and evaluate until target accuracy achieved
6.8 Perform k-fold cross-validation for robust metrics

Deliverable: Refined model meeting accuracy targets (>85% top-1, >95% top-5)
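Task 6.1's weighted loss needs per-class weights; inverse-frequency weighting is one common scheme among several (the helper name and normalization are illustrative):

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights for a weighted loss (task 6.1).
    Rare classes get proportionally larger weights, normalized so the
    sample-weighted mean weight is 1.0."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}
```

The resulting dict maps directly onto the per-class weight argument of a cross-entropy loss in either PyTorch or TensorFlow.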


Phase 7: Core ML Conversion & Optimization

Goal: Convert trained model to Core ML format optimized for iOS.

Tasks:
7.1 Export trained model to ONNX or saved model format
7.2 Convert to Core ML using coremltools
7.3 Apply quantization (Float16 or Int8) to reduce model size
7.4 Configure model metadata (class labels, input/output specs)
7.5 Test converted model accuracy matches original
7.6 Optimize for Neural Engine execution
7.7 Benchmark inference speed on target devices (iPhone 12+)

Deliverable: Optimized .mlmodel or .mlpackage file
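Tasks 7.2-7.4 map to a short coremltools script. This is a sketch that assumes a traced PyTorch model and a 224x224 input; the import is kept inside the function since coremltools is an external dependency:

```python
# Illustrative model metadata; values are placeholders, not project decisions.
metadata = {
    "author": "PlantGuide",
    "short_description": "On-device plant species classifier",
    "version": "1.0.0",
}

def convert_to_coreml(traced_model, class_labels, out_path="PlantClassifier.mlpackage"):
    """Convert a traced PyTorch model to a quantized Core ML classifier."""
    import coremltools as ct  # external dependency, installed separately
    mlmodel = ct.convert(
        traced_model,
        inputs=[ct.ImageType(name="image", shape=(1, 3, 224, 224), scale=1 / 255.0)],
        classifier_config=ct.ClassifierConfig(class_labels),  # task 7.4: labels
        convert_to="mlprogram",                   # produces an .mlpackage
        compute_precision=ct.precision.FLOAT16,   # task 7.3: ~2x size reduction
    )
    mlmodel.author = metadata["author"]
    mlmodel.short_description = metadata["short_description"]
    mlmodel.version = metadata["version"]
    mlmodel.save(out_path)
    return mlmodel
```

Task 7.5 then re-runs the test set through the saved model to confirm accuracy survived the precision change.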


Phase 8: iOS Integration Testing

Goal: Validate model performance in real iOS environment.

Tasks:
8.1 Create test iOS app with camera capture
8.2 Integrate Core ML model with Vision framework
8.3 Test with real-world plant photos (not from training set)
8.4 Measure on-device inference latency
8.5 Test edge cases (partial plants, multiple plants, poor lighting)
8.6 Gather user feedback on identification accuracy
8.7 Document failure modes and edge cases

Deliverable: Validated model with real-world accuracy report


Phase 9: Knowledge Integration

Goal: Combine visual model with plant knowledge base for rich results.

Tasks:
9.1 Link model class predictions to Phase 1 knowledge base
9.2 Design result payload (name, description, care tips, characteristics)
9.3 Add confidence thresholds and "unknown plant" handling
9.4 Implement top-N predictions with confidence scores
9.5 Create fallback for low-confidence identifications

Deliverable: Complete plant identification system with rich metadata
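Tasks 9.2-9.5 can be combined into a single lookup step; the payload shape, the 0.40 threshold, and the helper name below are all illustrative choices, not specified by the plan:

```python
UNKNOWN = {"status": "unknown_plant"}

def identify(probabilities, knowledge_base, top_n=3, min_confidence=0.40):
    """Map model class probabilities to knowledge-base entries (tasks 9.1-9.5).
    Returns top-N candidates, or an 'unknown plant' result when the best
    score falls below the confidence threshold."""
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best_score = ranked[0]
    if best_score < min_confidence:
        return UNKNOWN                       # task 9.5: low-confidence fallback
    return {
        "status": "identified",
        "candidates": [
            {"confidence": round(score, 3),
             **knowledge_base.get(label, {"name": label})}
            for label, score in ranked[:top_n]
        ],
    }
```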


Phase 10: Final Validation & Documentation

Goal: Comprehensive testing and production readiness.

Tasks:
10.1 Run full test suite across diverse plant images
10.2 Document supported plant list with accuracy per species
10.3 Create model card (training data, limitations, biases)
10.4 Write iOS integration guide
10.5 Package final .mlmodel with metadata and labels
10.6 Establish model versioning and update strategy

Deliverable: Production-ready Core ML model with documentation
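One possible convention for task 10.6's versioning strategy; the major/minor/patch semantics here are an assumption, not a mandate:

```python
def next_version(current, change):
    """Semantic-style model versioning:
    major = class list changed (app must update its label handling),
    minor = weights retrained on the same classes,
    patch = metadata-only change."""
    major, minor, patch = map(int, current.split("."))
    if change == "class_list":
        return f"{major + 1}.0.0"
    if change == "retrain":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

Embedding this version string in the model metadata (task 10.5) lets the app detect class-list changes before swapping in a downloaded model.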


Summary

| Phase | Focus | Key Deliverable |
|-------|-------|-----------------|
| 1 | Knowledge Base Creation | Plant knowledge base from plant list |
| 2 | Image Acquisition | Labeled dataset (50K-200K images) |
| 3 | Preprocessing | Training-ready augmented dataset |
| 4 | Architecture | Model design specification |
| 5 | Initial Training | Baseline model + benchmarks |
| 6 | Refinement | Optimized model (>85% accuracy) |
| 7 | Core ML Conversion | Quantized .mlmodel file |
| 8 | iOS Testing | Real-world validation report |
| 9 | Knowledge Integration | Rich identification results |
| 10 | Final Validation | Production-ready package |

Key Insights

The plant list provides structured plant data (names, characteristics), but visual identification requires image training data. The plan therefore combines the plant knowledge base with external image datasets to create a complete plant identification system.

Target Specifications

| Metric | Target |
|--------|--------|
| Plant Classes | 200-500 species |
| Top-1 Accuracy | >85% |
| Top-5 Accuracy | >95% |
| Model Size | <50MB |
| Inference Time | <100ms on iPhone 12+ |

Candidate Image Datasets

  • PlantCLEF - Annual plant identification challenge dataset
  • iNaturalist - Community-sourced plant observations
  • Pl@ntNet - Botanical research dataset
  • Oxford Flowers - 102 flower categories
  • Wikimedia Commons - Supplementary images

Recommended Architecture

MobileNetV3-Large or EfficientNet-Lite offers the best balance of:

  • On-device performance
  • Model size constraints
  • Classification accuracy
  • Neural Engine compatibility