# Advanced Statistics — Deep Data Research

## Temporal Pattern Mining

### Mood Cycles & Seasonality
- **Weekly cycles** — not just "best/worst day" but the actual shape of the week (do they dip mid-week and recover Friday, or crash on Sunday night?)
- **Monthly cycles** — mood patterns across the month (beginning vs end, paycheck timing effects)
- **Seasonal patterns** — spring vs winter mood baselines. Weather data can separate "it's cold" from "it's January" effects
- **Time-of-day patterns** — `timestamp` (when they logged) vs `forDate`. Late-night loggers vs morning loggers may show different patterns. Logging time itself could correlate with mood.
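
The cycle ideas above all reduce to grouping entries by a calendar component and averaging. A minimal sketch follows, assuming a simplified `MoodSample` type (hypothetical, a stand-in for the real `MoodEntryModel`) with a numeric `mood` property; only `forDate` and `timestamp` are names taken from these notes.

```swift
import Foundation

// Hypothetical, simplified stand-in for MoodEntryModel.
struct MoodSample {
    let forDate: Date    // the day the mood applies to
    let timestamp: Date  // when the entry was actually logged
    let mood: Double     // assumed numeric value on the 5-point scale
}

/// Average mood grouped by one calendar component of the chosen date field.
/// Use .weekday for the shape of the week, .month for seasonality,
/// or .hour on `timestamp` for time-of-day logging patterns.
func averageMood(
    _ samples: [MoodSample],
    by component: Calendar.Component,
    of date: KeyPath<MoodSample, Date>,
    calendar: Calendar = .current
) -> [Int: Double] {
    let groups = Dictionary(grouping: samples) {
        calendar.component(component, from: $0[keyPath: date])
    }
    return groups.mapValues { group in
        group.map(\.mood).reduce(0, +) / Double(group.count)
    }
}
```

Calling it with `.weekday` and `\.forDate` gives one average per weekday (1 is Sunday in `Calendar`'s numbering); calling it with `.hour` and `\.timestamp` gives the logging-time view.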

### Trend Decomposition
Instead of just "improving/declining/stable", decompose the mood signal into:
- **Baseline** (long-term average that shifts slowly)
- **Trend** (is the baseline rising or falling over months?)
- **Volatility** (are swings getting wider or narrower over time?)

This gives users a real answer to "am I actually getting better?" that a simple average can't.
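
One possible way to compute the three components, sketched under the assumption that moods arrive as a date-ordered array of numbers with roughly one value per day. The function names and the trailing-window choice are illustrative, not existing app code.

```swift
import Foundation

/// Baseline: trailing moving average. Element i averages the last `window`
/// values up to and including i (fewer at the start of the series).
func movingAverage(_ values: [Double], window: Int) -> [Double] {
    precondition(window > 0, "window must be positive")
    return values.indices.map { i -> Double in
        let slice = values[max(0, i - window + 1)...i]
        return slice.reduce(0, +) / Double(slice.count)
    }
}

/// Trend: least-squares slope of the values against their index,
/// i.e. mood points gained or lost per entry.
func slope(_ values: [Double]) -> Double {
    guard values.count > 1 else { return 0 }
    let n = Double(values.count)
    let meanX = (n - 1) / 2
    let meanY = values.reduce(0, +) / n
    var numerator = 0.0
    var denominator = 0.0
    for (i, y) in values.enumerated() {
        let dx = Double(i) - meanX
        numerator += dx * (y - meanY)
        denominator += dx * dx
    }
    return numerator / denominator
}

/// Volatility: population standard deviation over the same trailing window.
func rollingVolatility(_ values: [Double], window: Int) -> [Double] {
    precondition(window > 0, "window must be positive")
    return values.indices.map { i -> Double in
        let slice = values[max(0, i - window + 1)...i]
        let mean = slice.reduce(0, +) / Double(slice.count)
        let variance = slice.map { ($0 - mean) * ($0 - mean) }.reduce(0, +) / Double(slice.count)
        return variance.squareRoot()
    }
}
```

Applying `slope` to the output of `movingAverage` rather than to the raw values answers "is the baseline rising?", while applying it to `rollingVolatility` answers "are swings getting wider?".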

---

## Cross-Signal Correlations

### Health × Mood (Per-User Correlation Ranking)
Nine health metrics are available. Instead of showing all of them, **rank which health signals matter most for THIS specific user**. Compute a per-user Pearson correlation between each health metric and mood:
- "Sleep is your #1 mood predictor (r=0.72)"
- "Steps have no significant correlation for you (r=0.08)"
- "Your HRV and mood are moderately linked (r=0.45)"

Personalized and genuinely useful — tells each user what to focus on.
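
A minimal sketch of the ranking step, assuming each metric has already been aligned day by day with the mood array (same length, same order). The metric names passed in are whatever keys the caller chooses. Two caveats: Pearson treats the 5-point mood scale as interval data, and whether a small r is "significant" depends on how many days are in the sample.

```swift
import Foundation

/// Pearson correlation coefficient. Returns nil when it is undefined:
/// mismatched lengths, fewer than two points, or zero variance in either series.
func pearson(_ xs: [Double], _ ys: [Double]) -> Double? {
    guard xs.count == ys.count, xs.count > 1 else { return nil }
    let n = Double(xs.count)
    let meanX = xs.reduce(0, +) / n
    let meanY = ys.reduce(0, +) / n
    var sxy = 0.0
    var sxx = 0.0
    var syy = 0.0
    for (x, y) in zip(xs, ys) {
        let dx = x - meanX
        let dy = y - meanY
        sxy += dx * dy
        sxx += dx * dx
        syy += dy * dy
    }
    guard sxx > 0, syy > 0 else { return nil }
    return sxy / (sxx * syy).squareRoot()
}

/// Ranks metrics by the absolute value of r against mood, strongest first.
/// Metrics whose correlation is undefined are dropped.
func rankMetrics(mood: [Double], metrics: [String: [Double]]) -> [(name: String, r: Double)] {
    var ranked: [(name: String, r: Double)] = []
    for (name, values) in metrics {
        if let r = pearson(values, mood) {
            ranked.append((name: name, r: r))
        }
    }
    return ranked.sorted { abs($0.r) > abs($1.r) }
}
```

Sorting by absolute value keeps strong negative links (for example, a high resting heart rate on low-mood days) near the top instead of burying them.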

### Weather × Mood (Beyond Averages)
Instead of just "sunny days = happier":
- **Temperature sweet spot** — fit a curve to find their optimal temperature range
- **Weather transitions** — does a sunny day *after* three rainy days hit differently than a sunny day in a sunny streak?
- **Humidity as a factor** — stored but not analyzed
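
As a simpler stand-in for fitting a curve, the temperature sweet spot can be approximated by bucketing days into fixed-width temperature bands and picking the band with the highest average mood. This is a sketch under assumptions: the bucket width, the minimum sample size, and all names are illustrative, and temperatures are in whatever unit the app stores.

```swift
import Foundation

/// Running total and count for one temperature band.
struct BucketStat {
    var total = 0.0
    var count = 0
    var mean: Double { total / Double(count) }
}

/// Groups moods into temperature bands of `width` degrees.
/// Keys are the lower bound of each band (e.g. 20 covers 20..<25 for width 5).
func temperatureBuckets(temps: [Double], moods: [Double], width: Double = 5) -> [Double: BucketStat] {
    precondition(temps.count == moods.count && width > 0)
    var buckets: [Double: BucketStat] = [:]
    for (t, m) in zip(temps, moods) {
        let lower = (t / width).rounded(.down) * width
        buckets[lower, default: BucketStat()].total += m
        buckets[lower, default: BucketStat()].count += 1
    }
    return buckets
}

/// Lower bound of the band with the highest average mood,
/// ignoring bands with fewer than `minCount` days.
func sweetSpot(_ buckets: [Double: BucketStat], minCount: Int = 3) -> Double? {
    return buckets
        .filter { $0.value.count >= minCount }
        .max(by: { $0.value.mean < $1.value.mean })?.key
}
```

The `minCount` guard keeps a single unusually good day in a rare temperature band from being reported as the optimum.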

### Tags × Health × Mood (Multivariate)
Cross-signal analysis:
- "On days tagged 'work' + sleep < 6hrs, your mood averages 1.8. On 'work' + sleep > 7hrs, it's 3.4" — sleep is a buffer against work stress
- "Exercise days tagged 'social' average 4.2, exercise days tagged 'solo' average 3.1" — social exercise matters more
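
Comparisons like these are conditional averages: filter days by a combined predicate, then average the mood. A minimal sketch, assuming a hypothetical `DayRecord` that joins one day's mood, tags, and sleep hours:

```swift
import Foundation

// Hypothetical per-day join of mood, AI-extracted tags, and HealthKit sleep.
struct DayRecord {
    let mood: Double
    let tags: Set<String>
    let sleepHours: Double?  // nil when no sleep data exists for that day
}

/// Mean mood and sample size for the days matching `predicate`,
/// or nil when no day matches.
func averageMood(
    _ days: [DayRecord],
    where predicate: (DayRecord) -> Bool
) -> (mean: Double, count: Int)? {
    let matches = days.filter(predicate)
    guard !matches.isEmpty else { return nil }
    let mean = matches.map(\.mood).reduce(0, +) / Double(matches.count)
    return (mean: mean, count: matches.count)
}
```

Returning the count alongside the mean matters here: each added condition shrinks the sample, so a two-way split can easily rest on only a handful of days.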

---

## Behavioral Pattern Analysis

### Logging Behavior as Signal
The *act of logging* contains information:
- **Entry source patterns** — do they use the widget more on bad days? Watch on good days? Could reveal avoidance patterns
- **Logging time drift** — are they logging later and later? This may correlate with declining mood, which is worth testing per user
- **Note length vs mood** — do they write more when upset or when happy? `notes?.count` is free data
- **Reflection completion rate** — do they bail on guided reflections for certain moods? Completing a negative reflection may itself be therapeutic

### Gap Analysis (Deeper)
Beyond simple gap tracking:
- **What predicts a gap?** Look at the 3 days before each gap — was mood declining? Were they on a negative streak?
- **Recovery patterns** — how long after returning does mood stabilize? Is there a "bounce" effect?
- **Gap frequency over time** — are they getting more or less consistent? Consistency trend is a health proxy
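
The "what predicts a gap?" question can be sketched as follows, assuming entries are reduced to a whole-day index plus a mood value. `DailyMood` and its `day` field are hypothetical, and the lookback here counts the last few logged entries rather than calendar days.

```swift
import Foundation

// Hypothetical reduced entry: `day` is whole days since an arbitrary reference date.
struct DailyMood {
    let day: Int
    let mood: Double
}

/// For every gap of at least `minGap` missed days, returns the average mood
/// of the `lookback` entries logged immediately before the gap.
func preGapAverages(_ entries: [DailyMood], minGap: Int = 2, lookback: Int = 3) -> [Double] {
    precondition(lookback > 0, "lookback must be positive")
    let sorted = entries.sorted { $0.day < $1.day }
    var result: [Double] = []
    for i in sorted.indices.dropFirst() {
        let missed = sorted[i].day - sorted[i - 1].day - 1
        guard missed >= minGap else { continue }
        let window = sorted[max(0, i - lookback)..<i]
        result.append(window.map(\.mood).reduce(0, +) / Double(window.count))
    }
    return result
}
```

Comparing these pre-gap averages with the user's overall average gives a first indication of whether low mood tends to precede a gap.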

---

## AI-Enriched Analysis

### Note/Reflection Sentiment Trends
- **Sentiment trajectory within a reflection** — does the user start negative and end positive (processing) or start positive and end negative (rumination)?
- **Topic evolution** — what themes are growing vs fading over months? "Work" mentions peaking = potential burnout signal
- **Gratitude frequency** — entries tagged "gratitude" tracked as a percentage over time. Some research links gratitude journaling to improved wellbeing, so showing users their own trend may be motivating

### Predicted Mood
With enough data (30+ entries), build a simple predictor:
- Given today's day of week, recent weather, recent sleep, and current streak — what mood is likely?
- Show as a "forecast" card: "Based on your patterns, Tuesdays after poor sleep tend to be tough — be gentle with yourself"
- Uses correlations already computed, just applied forward
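
One way to read "correlations applied forward" is a weighted average of conditional mood averages, with each factor weighted by the strength of its correlation. The sketch below assumes that reading; `Factor` and `forecastMood` are hypothetical names, and the weighting scheme is an illustration rather than a validated model.

```swift
import Foundation

// Hypothetical input: one already-computed pattern that applies to today.
struct Factor {
    let name: String
    let conditionalAverage: Double  // e.g. the user's average mood on Tuesdays
    let weight: Double              // e.g. |r| of that signal with mood
}

/// Weighted average of the conditional averages. Falls back to the user's
/// overall baseline when no factor carries any weight.
func forecastMood(baseline: Double, factors: [Factor]) -> Double {
    let totalWeight = factors.map(\.weight).reduce(0, +)
    guard totalWeight > 0 else { return baseline }
    let weightedSum = factors.map { $0.conditionalAverage * $0.weight }.reduce(0, +)
    return weightedSum / totalWeight
}
```

Because the result is a weighted mean of observed averages, it always stays inside the range of the mood scale, which keeps the forecast card from showing impossible values.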

---

## Comparative & Benchmark Insights

### Personal Bests & Records
- Longest positive streak ever (and when it was)
- Best week/month on record
- Most consistent month (lowest variance)
- "Your mood this March was your best March in 2 years"
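
The streak record can be found in a single pass. This sketch assumes one mood value per day in date order and treats a day as "positive" when its mood is at or above a threshold; the threshold of 4 is an assumption, not the app's definition.

```swift
import Foundation

/// Longest run of consecutive values at or above `threshold`.
/// Returns the run length and the index where it starts, or nil if no value qualifies.
/// On ties, the earliest run wins.
func longestPositiveStreak(
    _ moods: [Double],
    threshold: Double = 4
) -> (length: Int, startIndex: Int)? {
    var best: (length: Int, startIndex: Int)? = nil
    var runStart: Int? = nil
    for (i, mood) in moods.enumerated() {
        if mood >= threshold {
            let start = runStart ?? i
            runStart = start
            let length = i - start + 1
            if length > (best?.length ?? 0) {
                best = (length: length, startIndex: start)
            }
        } else {
            runStart = nil
        }
    }
    return best
}
```

The returned `startIndex` maps back to the entry's date, which supplies the "and when it was" part of the record.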

### Milestone Detection
- "You've logged 100 days"
- "Your 30-day average just hit an all-time high"
- "First month with no 'horrible' days"
- These are motivational and can drive retention

### Before/After Analysis
If a user starts a new habit (e.g., enables HealthKit, starts guided reflections, starts tagging), compare stats before vs after:
- "Since you started doing guided reflections 45 days ago, your average mood is up 0.6 points"
- "Since enabling Health tracking, your logging consistency improved 23%"
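
A minimal sketch of the before/after comparison, assuming a simplified entry type (`DatedMood` is hypothetical) and a known date on which the habit started. A raw difference of means does not control for an existing trend or for seasonality, so the narrative should be phrased as an observation rather than a proven effect.

```swift
import Foundation

// Hypothetical reduced entry: the date the mood applies to and its numeric value.
struct DatedMood {
    let forDate: Date
    let mood: Double
}

/// Mean mood on or after `changeDate` minus mean mood before it.
/// Returns nil when either side has no entries.
func beforeAfterDelta(_ entries: [DatedMood], changeDate: Date) -> Double? {
    let before = entries.filter { $0.forDate < changeDate }.map(\.mood)
    let after = entries.filter { $0.forDate >= changeDate }.map(\.mood)
    guard !before.isEmpty, !after.isEmpty else { return nil }
    let mean: ([Double]) -> Double = { $0.reduce(0, +) / Double($0.count) }
    return mean(after) - mean(before)
}
```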

---

## Feasibility Notes

All of this runs on data already collected. The compute is lightweight:
- Correlations are just `zip` + arithmetic on two arrays
- Cycle detection is grouping by `weekDay` / `Calendar.current.component(.month, from: date)` / hour-of-day
- Trend decomposition is a sliding window average
- Predictions are weighted averages of correlated factors
- No server needed — Foundation Models handles the narrative, Swift handles the math

The heavy lift is **visualization** (Swift Charts) and **narrative framing** (using Foundation Models to turn "r=0.72 for sleep" into "Sleep is your superpower — on nights you get 7+ hours, your mood jumps by a full point").

---

## Existing Data Points Available

### Per Entry (MoodEntryModel)
1. Date logged (`forDate`)
2. Mood value (5-point scale)
3. Entry type (10 sources: app, widget, watch, siri, etc.)
4. Timestamp created
5. Day of week
6. Text notes (optional)
7. Photo ID (optional)
8. Weather data — condition, temp high/low, humidity, location (optional)
9. Guided reflection responses (optional)
10. AI-extracted tags from 16 categories (optional)

### HealthKit (9 metrics)
- Steps, exercise minutes, active calories, distance
- Average heart rate, resting heart rate, HRV
- Sleep hours, mindful minutes

### Already Computed (MoodDataSummarizer)
- Mood distribution (counts, percentages, averages)
- Day-of-week averages, best/worst day, weekend vs weekday
- Trend direction and magnitude
- Streaks (current, longest, positive, negative)
- Mood stability score and swing count
- Tag-mood correlations (good-day tags, bad-day tags)
- Weather-mood averages (by condition, by temp range)
- Logging gap analysis (pre/post gap averages)
- Entry source breakdown

### Already Visualized
- Year heatmap + donut chart (YearView)
- AI-generated text insights (InsightsView)
- Weekly digest card (WeeklyDigestCardView)
- AI reports with PDF export (ReportsView)

### NOT Yet Visualized (Gaps)
- No trend line charts
- No health correlation charts
- No tag/theme visualizations
- No period comparisons
- No streak visualizations beyond a number
- No mood stability visualization
- No logging behavior analysis
- No predictive features