Mail Poller Z2 — Complete Implementation Index
Quick Start
- First time? Read `README.md` for an overview
- Setting up credentials? Follow `INSTALLATION.md`
- Deploying to Z2? Run `./deploy.sh`
- Testing locally? Use `LOCAL_TESTING.md`
- Verification after deploy? Check `DEPLOYMENT_CHECKLIST.md`
Files in This Directory
Executable Scripts
| File | Purpose | Usage |
|---|---|---|
| `mail-poller.py` | Main polling script | `python3 mail-poller.py [--account NAME] [--dry-run] [--verbose]` |
| `deploy.sh` | Deployment to ct102 | `./deploy.sh [--dry-run] [--test-only] [--verbose]` |
Documentation
| File | Audience | Content |
|---|---|---|
| `README.md` | Everyone | Overview, architecture, usage, monitoring, troubleshooting |
| `INSTALLATION.md` | DevOps/Setup | Step-by-step credential setup + Z2 deployment |
| `LOCAL_TESTING.md` | Developers | Testing mail-poller locally on Mac before Z2 deploy |
| `ARCHITECTURE.md` | Engineers | Deep dive: system design, protocols, error handling, scaling |
| `DEPLOYMENT_CHECKLIST.md` | QA/Verification | Pre/during/post-deployment verification steps (Phase 1–3) |
| `INDEX.md` | Navigator | This file |
Configuration
| File | Purpose | Format |
|---|---|---|
| `.gitignore` | Prevent credential commits | Standard .gitignore |
| `test-credentials-TEMPLATE.json` | Credential structure reference | JSON template + comments |
File Relationships
mail-poller.py
↑ (deployed by)
deploy.sh
↑ (uses info from)
INSTALLATION.md, DEPLOYMENT_CHECKLIST.md
↓ (references)
README.md, ARCHITECTURE.md
LOCAL_TESTING.md
↓ (tests before deploy)
mail-poller.py
Workflows
Setup From Scratch
1. Read README.md (overview)
2. Follow INSTALLATION.md Step 1 (prepare credentials on ct103)
3. Follow INSTALLATION.md Step 2 (run deploy.sh)
4. Use DEPLOYMENT_CHECKLIST.md Phase 2 (verify deployment)
Development/Testing
1. Modify mail-poller.py locally
2. Follow LOCAL_TESTING.md (test on Mac)
3. Run deploy.sh --dry-run (preview Z2 changes)
4. Run deploy.sh (deploy to Z2)
5. Manual test: ssh z2 'pct exec 102 -- /opt/hinata/mail-poller/mail-poller.py --verbose'
Troubleshooting
1. Check README.md § Troubleshooting
2. Review ARCHITECTURE.md § Error Handling
3. Check logs: ssh z2 'pct exec 102 -- journalctl -u hinata-mail-poller.service'
4. Manual test: ssh z2 'pct exec 102 -- /opt/hinata/mail-poller/mail-poller.py --verbose --account ACCOUNT'
Migration Monitoring (Phase 3)
1. Deploy to Z2 (Phase 2)
2. Monitor for 7–14 days using DEPLOYMENT_CHECKLIST.md Phase 3
3. Disable Mac poller
4. Verify Z2 remains stable for another 7 days
Key Concepts
Accounts
4 email accounts across 2 protocols:
| Account | Protocol | Credential Files |
|---|---|---|
| gmail | IMAP | `mail_imap_credential.json` |
| hotmail-michael-asolo | Graph API | `outlook-graph-credentials.json` + `outlook-tokens-hotmail-michael-asolo.json` |
| outlook-michael-nnamah | Graph API | `outlook-graph-credentials.json` + `outlook-tokens-outlook-michael-nnamah.json` |
| outlook-n-nnamah | Graph API | `outlook-graph-credentials.json` + `outlook-tokens-outlook-n-nnamah.json` |
State Tracking
- Gmail (IMAP): Track last UID per folder → only fetch newer
- Outlook (Graph API): Track last received datetime → only fetch newer
- Persistence: `state.json` on ct102 at `/opt/hinata/mail-poller/state.json`
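The incremental approach above can be sketched roughly as follows. This is an illustration only, not the script's actual code: the helper names (`load_state`, `save_state`, `newer_uids`, `graph_filter`) and the shape of the state dict are assumptions. Only the `state.json` file and the two cursor types (last UID, last received datetime) come from the notes above.

```python
import json
from pathlib import Path


def load_state(path: Path) -> dict:
    """Return the saved state, or an empty dict on the first run."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def save_state(path: Path, state: dict) -> None:
    """Write to a temp file first, then rename, so a crash cannot truncate state."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(state, indent=2, sort_keys=True), encoding="utf-8")
    tmp.replace(path)


def newer_uids(all_uids: list[int], last_uid: int) -> list[int]:
    """IMAP: keep only UIDs strictly greater than the last one seen."""
    return sorted(uid for uid in all_uids if uid > last_uid)


def graph_filter(last_received: str) -> str:
    """Graph API: OData $filter expression for messages received after a timestamp."""
    return f"receivedDateTime gt {last_received}"
```

A poller built this way would advance the cursor only after the new messages are safely archived, so a failed run is retried from the same point on the next timer tick.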
Archive
- Location: `/opt/hinata/mail-poller/archive/`
- Structure: `{account}/{YYYY}/{MM}/{message_hash}.json`
- Content: Email metadata + body (text + HTML)
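As a sketch of how the archive layout above could be derived for one message: the hash input (the Message-ID header) and the choice of SHA-256 are assumptions made for illustration, since the notes only say that `hashlib` is used for message hashing. The function names are hypothetical.

```python
import hashlib
from datetime import datetime
from pathlib import Path

# Archive root as documented above.
ARCHIVE_ROOT = Path("/opt/hinata/mail-poller/archive")


def message_hash(message_id: str) -> str:
    """Assumed scheme: hex SHA-256 of the Message-ID header."""
    return hashlib.sha256(message_id.encode("utf-8")).hexdigest()


def archive_path(account: str, received: datetime, message_id: str,
                 root: Path = ARCHIVE_ROOT) -> Path:
    """Build {account}/{YYYY}/{MM}/{message_hash}.json under the archive root."""
    return (
        root
        / account
        / f"{received:%Y}"
        / f"{received:%m}"
        / f"{message_hash(message_id)}.json"
    )
```

Keying the directory on the received date rather than the poll date keeps a later backfill in the same place a live poll would have used.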
Automation
- Trigger: systemd timer (every 15 minutes)
- Service: `hinata-mail-poller.service`
- Timer: `hinata-mail-poller.timer`
- Logs: journalctl (searchable by service name)
Development Notes
Stdlib Only
The script uses only the Python standard library; no pip packages are required.
Core modules:
- `imaplib`: IMAP client
- `email.parser`: RFC 2822 parsing
- `urllib`: HTTP requests (Graph API)
- `json`: State/credential I/O
- `hashlib`: Message hashing
- `logging`: Structured logging
- `pathlib`: Path operations
- `datetime`: ISO 8601 handling
Extensibility Points
To add new features:
New email account:
- Edit the `ACCOUNTS` dict (add account definition)
- Implement a `poll_*()` function (IMAP or Graph)
- Implement `extract_*_fields()` (field mapping)
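One way the three steps above might fit together is a protocol-keyed dispatch table. The real `ACCOUNTS` structure is not shown in this index, so the keys (`protocol`, `credentials`), the `POLLERS` table, and `poll_account` are all hypothetical, and the pollers are placeholders that return no messages.

```python
# Hypothetical account definitions; credential file names are from the table above.
ACCOUNTS = {
    "gmail": {
        "protocol": "imap",
        "credentials": ["mail_imap_credential.json"],
    },
    "hotmail-michael-asolo": {
        "protocol": "graph",
        "credentials": [
            "outlook-graph-credentials.json",
            "outlook-tokens-hotmail-michael-asolo.json",
        ],
    },
}


def poll_imap(name: str, config: dict) -> list[dict]:
    """Placeholder: a real poller would connect with imaplib and fetch new UIDs."""
    return []


def poll_graph(name: str, config: dict) -> list[dict]:
    """Placeholder: a real poller would call the Graph API via urllib."""
    return []


# Adding a protocol means adding one entry here plus its poller function.
POLLERS = {"imap": poll_imap, "graph": poll_graph}


def poll_account(name: str) -> list[dict]:
    """Look up the account and dispatch to the poller for its protocol."""
    config = ACCOUNTS[name]
    try:
        poller = POLLERS[config["protocol"]]
    except KeyError:
        raise ValueError(f"no poller for protocol {config['protocol']!r}") from None
    return poller(name, config)
```

With this shape, a new account on an existing protocol is a dict entry only; new code is needed just for a new protocol or a different field mapping.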
New protocol (ProtonMail, Yahoo, etc.):
- Add protocol handler in `ACCOUNTS`
- Implement poller function
- Update field extraction
Caching/deduplication:
- Load archive index on startup
- Check if `message_id` exists before archiving
- Skip duplicates with debug log
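A minimal sketch of the deduplication idea, assuming each archived JSON file stores the message's ID under a top-level `message_id` key (the field name appears above, but its exact location in the file is an assumption). `load_archive_index` and `should_archive` are hypothetical helper names.

```python
import json
import logging
from pathlib import Path


def load_archive_index(root: Path) -> set[str]:
    """Scan {account}/{YYYY}/{MM}/*.json once and collect known message IDs."""
    seen: set[str] = set()
    for path in root.glob("*/*/*/*.json"):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or corrupt file: ignore it for indexing
        if isinstance(record, dict) and record.get("message_id"):
            seen.add(record["message_id"])
    return seen


def should_archive(message_id: str, seen: set[str], log: logging.Logger) -> bool:
    """Return False (and log at debug level) when the message is already archived."""
    if message_id in seen:
        log.debug("skipping duplicate %s", message_id)
        return False
    seen.add(message_id)
    return True
```

Scanning the whole archive on every 15-minute run grows linearly with archive size, which is one reason the database backend below is listed as a follow-up.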
Database backend:
- Replace `state.json` with SQLite schema
- Replace `archive/` filesystem with SQL queries
- Add indices for fast lookups
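To make the database idea concrete, here is one possible schema using the stdlib `sqlite3` module. Every table, column, and function name below is an assumption; the notes only say that SQLite would replace `state.json` and the archive directory and that indices would be added.

```python
import sqlite3

# Hypothetical schema: one table for polling cursors, one for archived messages.
SCHEMA = """
CREATE TABLE IF NOT EXISTS poll_state (
    account   TEXT NOT NULL,
    folder    TEXT NOT NULL DEFAULT '',
    last_seen TEXT NOT NULL,
    PRIMARY KEY (account, folder)
);
CREATE TABLE IF NOT EXISTS messages (
    message_hash TEXT PRIMARY KEY,
    account      TEXT NOT NULL,
    message_id   TEXT,
    received_at  TEXT NOT NULL,
    payload      TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_messages_account_received
    ON messages (account, received_at);
CREATE INDEX IF NOT EXISTS idx_messages_message_id
    ON messages (message_id);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create tables and indices if they are missing."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def set_last_seen(conn: sqlite3.Connection, account: str, folder: str, value: str) -> None:
    """Store the cursor (last UID or last received datetime) for one folder."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO poll_state (account, folder, last_seen) VALUES (?, ?, ?)",
            (account, folder, value),
        )


def get_last_seen(conn: sqlite3.Connection, account: str, folder: str):
    """Return the stored cursor, or None if this folder has never been polled."""
    row = conn.execute(
        "SELECT last_seen FROM poll_state WHERE account = ? AND folder = ?",
        (account, folder),
    ).fetchone()
    return row[0] if row else None
```

The primary key on `message_hash` would also give deduplication for free, since a second insert of the same message fails or can be ignored.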
Testing Strategies
Unit Testing
Not implemented (no dependencies to mock). Instead, use:
- `--dry-run` flag (fetches but doesn't write)
- Manual credential testing (see INSTALLATION.md)
- State inspection (jq queries)
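The dry-run behaviour described above ("fetches but doesn't write") could be wired up along these lines. The three flags match the usage shown in the scripts table; the parser layout and the `write_archive` helper are illustrative assumptions rather than the script's real internals.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Flags as documented: --account NAME, --dry-run, --verbose."""
    parser = argparse.ArgumentParser(prog="mail-poller.py")
    parser.add_argument("--account", metavar="NAME", help="poll only this account")
    parser.add_argument("--dry-run", action="store_true", help="fetch but do not write")
    parser.add_argument("--verbose", action="store_true", help="enable debug logging")
    return parser


def write_archive(path: Path, payload: str, dry_run: bool) -> bool:
    """Write one archived message unless dry_run is set; return True if written."""
    if dry_run:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(payload, encoding="utf-8")
    return True
```

Routing every write (archive files and state) through a single guard like this keeps a dry run from advancing the polling cursor by accident.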
Integration Testing
- Local testing on Mac (LOCAL_TESTING.md)
- Test on Z2 ct102 (DEPLOYMENT_CHECKLIST.md Phase 2)
- 7-day smoke test (DEPLOYMENT_CHECKLIST.md Phase 3)
Performance Testing
- Time full run: `time python3 mail-poller.py`
- Check network latency: `ping graph.microsoft.com`
- Monitor ct102 resources: `htop`, disk I/O
Disaster Scenarios
Scenario: Credential Leaked
Action:
- Immediately regenerate password/tokens
- Update credential files on ct103
- Deploy to ct102: `./deploy.sh`
- Monitor logs for auth errors
Time to recover: 5 minutes
Scenario: Archive Corrupted
Action:
- Restore from backup (if available)
- Or delete `state.json` (refetch all messages)
Time to recover: 5–10 minutes
Scenario: ct102 Disk Full
Action:
- Check archive size: `du -sh /opt/hinata/mail-poller/archive`
- Clean old years (this example removes all of 2025 for every account): `rm -rf /opt/hinata/mail-poller/archive/*/2025`
- Restart service: `systemctl restart hinata-mail-poller.timer`
Time to recover: 5 minutes
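Before deleting anything, it can help to see which years actually hold the bulk of the data. The sketch below totals file sizes per year, relying only on the documented `{account}/{YYYY}/{MM}/` layout; `usage_by_year` is a hypothetical helper, not part of the script.

```python
from collections import defaultdict
from pathlib import Path


def usage_by_year(root: Path) -> dict[str, int]:
    """Sum archived file sizes in bytes per year, across all accounts."""
    totals: dict[str, int] = defaultdict(int)
    for path in root.glob("*/*/*/*.json"):
        # Relative parts are (account, YYYY, MM, file name).
        year = path.relative_to(root).parts[1]
        totals[year] += path.stat().st_size
    return dict(totals)
```

Run against `/opt/hinata/mail-poller/archive`, the result gives a quick basis for choosing which year directories to prune.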
Scenario: Token Refresh Failing
Action:
- Check token file: `cat /opt/itachi/credentials/outlook-tokens-{account}.json`
- Regenerate tokens (see INSTALLATION.md)
- Test: `python3 mail-poller.py --account hotmail-michael-asolo --verbose`
Time to recover: 10 minutes
Performance Baseline
Typical execution (4 accounts, ~5–10 new emails):
- Total time: 2–5 seconds
- IMAP polling: ~500ms
- Graph API polling: ~1.5s (3 accounts)
- Archive I/O: ~100ms
- Network latency: varies (0–2s)
With 100+ new messages:
- Total time: 5–15 seconds
- Bottleneck: IMAP download speed (depends on message size)
Operations Checklist
Daily
- [ ] Check logs for errors: `journalctl -u hinata-mail-poller.service --since "24 hours ago"`
- [ ] Verify archive growth: `find archive -newermt "24 hours ago" | wc -l`
Weekly
- [ ] Check state.json is advancing: `jq '.[] | .last_poll' state.json`
- [ ] Monitor disk usage: `du -sh archive/`
- [ ] Verify all 4 accounts ran successfully
Monthly
- [ ] Rotate tokens (if approaching expiry)
- [ ] Review logs for patterns (rate limits, timeouts)
- [ ] Backup state.json (keep copies of successful states)
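The "state.json is advancing" check could also be automated. The sketch below assumes, as the `jq` query in the weekly list implies, that each top-level entry in `state.json` carries a `last_poll` ISO 8601 timestamp. The 45-minute threshold (three missed 15-minute runs) and the `stale_accounts` name are assumptions.

```python
from datetime import datetime, timedelta, timezone


def stale_accounts(state: dict, now: datetime,
                   max_age: timedelta = timedelta(minutes=45)) -> list[str]:
    """Return accounts whose last_poll is missing or older than max_age."""
    stale = []
    for account, entry in state.items():
        raw = entry.get("last_poll")
        if raw is None:
            stale.append(account)
            continue
        # Accept a trailing "Z", which older fromisoformat versions reject.
        last = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if last.tzinfo is None:
            last = last.replace(tzinfo=timezone.utc)  # assume UTC when unspecified
        if now - last > max_age:
            stale.append(account)
    return sorted(stale)
```

An empty result means every account has polled recently; any name in the list is a candidate for the troubleshooting workflow.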
Connections to Other Systems
Upstream (Credentials)
- ct103/itachi: Stores credential files
- Mail-poller reads from here on each run
- Future: Migrate to Vaultwarden (#840058)
Downstream (Archive Usage)
- Studio API: Reads archive/ for email display (future)
- Heimerdinger classifier: Reads archive/ for classification (future)
- iCloud sync: Archive backed up via Sandpit → iCloud (future)
Version History
Current Version: 1.0 (2026-06-05)
- Initial implementation
- 4 accounts (Gmail IMAP, 3x Outlook Graph)
- Incremental polling with state persistence
- Systemd timer automation
Future Versions:
- v1.1: Add backfill mode (`--since DATE`)
- v2.0: Database backend (SQLite)
- v2.1: FastAPI endpoints for archive queries
- v3.0: Classification pipeline integration
Related Documents (in vault)
- `projects/brain/understanding_mail-poller-z2-migration-strategy.md`: Full migration plan
- `projects/infrastructure/reference_hinata-z2-repo-specification.md`: Z2 infrastructure design
- `projects/infrastructure/understanding_z2-sandpit-sync-migration-strategy.md`: Archive sync strategy
- `the-government/feedback/`: Governance + operational laws