Optimize AI generation speed and add richer insight data

Speed optimizations:
- Add session.prewarm() in InsightsViewModel and ReportsViewModel init
  to cut first-token latency by 40%
- Cap maximumResponseTokens on all 8 AI respond() calls (100-600 per use case)
- Add prompt brevity constraints ("1-2 sentences", "2 sentences")
- Reduce report batch concurrency from 4 to 2 to prevent device contention
- Pre-fetch health data once and share across all 3 insight periods
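A minimal sketch of the prewarm-plus-token-cap pattern the bullets above describe, using the FoundationModels API visible in the diff below; the view model shape, the instruction string, and the shortInsight helper are illustrative assumptions, not the actual implementation:

```swift
import FoundationModels

@MainActor
final class InsightsViewModel {
    private let session: LanguageModelSession

    init() {
        // Hypothetical instructions; the real ones live in the service layer.
        session = LanguageModelSession(instructions: "Summarize mood insights briefly.")
        // Loads model assets ahead of the first respond() call,
        // which is where the first-token latency win comes from.
        session.prewarm()
    }

    func shortInsight(for prompt: String) async throws -> String {
        // Combine both levers: a brevity constraint in the prompt and a
        // hard cap on output tokens so generation stops early either way.
        let response = try await session.respond(
            to: prompt + "\nAnswer in 1-2 sentences.",
            options: GenerationOptions(maximumResponseTokens: 200)
        )
        return response.content
    }
}
```

Capping tokens per use case (100-600) trades completeness for speed: short insight cards get tight caps, longer reports get looser ones.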

Richer insight data in MoodDataSummarizer:
- Tag-mood correlations: overall frequency + good day vs bad day tag breakdown
- Weather-mood correlations: avg mood by condition and temperature range
- Absence pattern detection: logging gap count with pre/post-gap mood averages
- Entry source breakdown: % of entries from App, Widget, Watch, Siri, etc.
- Update insight prompt to leverage tags, weather, and gap data when available
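The tag-mood breakdown above could be computed along these lines; the MoodEntry fields and the good/bad score thresholds are assumptions for illustration, not the shipped model:

```swift
// Hypothetical minimal types; the real MoodEntry carries more fields
// (weather, entry source, timestamps) used by the other summaries.
struct MoodEntry {
    let score: Double        // e.g. 1 (bad) ... 5 (good)
    let tags: [String]
}

enum MoodDataSummarizer {
    /// Buckets entries into good days (score >= 4) and bad days (score <= 2),
    /// then counts tag frequency per bucket, so the insight prompt can say
    /// things like "'work' appears far more often on bad days".
    static func tagBreakdown(
        _ entries: [MoodEntry]
    ) -> (good: [String: Int], bad: [String: Int]) {
        var good: [String: Int] = [:]
        var bad: [String: Int] = [:]
        for entry in entries {
            for tag in entry.tags {
                if entry.score >= 4 { good[tag, default: 0] += 1 }
                else if entry.score <= 2 { bad[tag, default: 0] += 1 }
            }
        }
        return (good, bad)
    }
}
```

Feeding pre-aggregated counts like these into the prompt keeps the context small while still letting the model cite concrete correlations.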

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: Trey t
Date: 2026-04-04 11:52:14 -05:00
Commit: 70400b7790 (parent 329fb7c671)
7 changed files with 302 additions and 53 deletions

@@ -28,14 +28,12 @@ class FoundationModelsReflectionService {
         mood: Mood
     ) async throws -> AIReflectionFeedback {
         let session = LanguageModelSession(instructions: systemInstructions)
         let prompt = buildPrompt(from: reflection, mood: mood)
         let response = try await session.respond(
             to: prompt,
-            generating: AIReflectionFeedback.self
+            generating: AIReflectionFeedback.self,
+            options: GenerationOptions(maximumResponseTokens: 200)
         )
         return response.content
     }