
Tax Practice AI - Backlog

Last updated: 2024-12-28 (v0.14 - S13 Complete, 72% progress)

This document tracks priority items, technical debt, and pending decisions.


V1 Back-Office AI Companion (Tax Season 2025)

Status: Requirements Complete, Implementation In Progress
Target: January 2025 deployment for Tax Season 2025

V1 deploys as a back-office companion tool - staff use AI analysis alongside existing workflows without disrupting clients.

V1 Scope

| Feature | Status | Description |
|---|---|---|
| Quick Client Entry | Defined | Minimal form: name, tax year, legacy account link |
| Drag-Drop Upload | Defined | Upload documents directly into viewer |
| Folder Import | Defined | Import entire folder of documents |
| AI Classification | Defined | Automatic document type identification |
| AI Analysis | Defined | Prior year comparison, anomaly detection, missing docs |
| Q&A Assistant | Defined | Ask questions during review with source citations |
| Annotations | Defined | Notes, flags, questions on documents |
| Worksheet Export | Defined | PDF/Excel with full source citations |
| S3 Fallback | Defined | AI works when cloud storage unavailable |

V1 Documentation

| Document | Purpose |
|---|---|
| V1_COMPANION_REQUIREMENTS.md | Full requirements |
| V1_USE_CASES.md | Detailed use cases |
| V1_UI_CHANGES.md | UI specifications |
| ARCHITECTURE.md Section 16 | Deployment model |

V1 Testing

  • BDD feature files created (Gherkin syntax)
  • 16 BDD scenarios passing (quick client, document upload)
  • Test data generator available: python scripts/generate_test_data.py
  • Sample documents generated: W-2s, 1099s, bank statements, receipts, CSVs

V1 Philosophy

Augment, don't replace. Zero client disruption.

  • Clients continue using SmartVault for uploads
  • UltraTax remains the tax prep tool
  • Legacy system remains source of truth
  • New account numbers prefixed with 'A' (pending client confirmation)

Client Decisions (Resolved)

TAX-001: Tax Software Selection

Status: ✅ Resolved (2024-12-23)
Decision: UltraTax CS (Thomson Reuters)
Integration: Via SurePrep CS Connect bridge (UltraTax has no direct API)

TAX-002: Volume Projections

Status: ✅ Resolved (2024-12-23)
Decision: ~1,000 returns/year with 30% annual growth expectation

| Timeframe | Returns per Year |
|---|---|
| Year 1 | 1,000 |
| Year 2 | 1,300 |
| Year 3 | 1,700 |
| Year 5 | 2,850 |

Note: Growth may accelerate due to system efficiency gains.
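For reference, the projections in the table follow from compounding 30% growth on the Year 1 volume, with loose rounding in the later years:

```python
# Projection behind the table: 1,000 returns/year compounding at 30%.
projected = [round(1000 * 1.3 ** n) for n in range(5)]
# [1000, 1300, 1690, 2197, 2856] -- the table rounds Year 3 to 1,700
# and Year 5 to 2,850.
```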

TAX-003: Entity Type Mix

Status: ✅ Resolved (2024-12-23)
Decision: Equal priority for individuals and businesses. Business clients are primarily small businesses (advanced individuals). No differentiation needed in V1.

TAX-004: State Coverage

Status: ✅ Resolved (2024-12-23)
Decision: Florida + surrounding states (GA, AL, SC, NC) initially. Design for all 50 states from the start - full coverage expected soon.


Phase 0: Data Migration (Pre-Launch Prerequisite)

Data migration must be completed before go-live. Depends on Sequence 1 infrastructure.

MIG-001: Client Data Import Tool

Status: ✅ COMPLETE
Priority: P0 (Pre-Launch Blocker)

  • CLI tool: tax-migrate clients <file>
  • CSV/Excel parsing with column mapping
  • Duplicate detection and handling
  • Account number generation for imports
  • Dry-run mode for validation
  • Import summary report
  • Audit logging

Files:
  • scripts/tax_migrate.py - CLI entry point (per MIG-080)
  • src/migration/__init__.py - Module exports
  • src/migration/client_importer.py - Main import logic (MIG-001 through MIG-008)
  • src/migration/column_mapper.py - Flexible column mapping (MIG-002)
  • src/migration/duplicate_detector.py - Duplicate detection (MIG-010 through MIG-013)
  • src/migration/import_report.py - Report generation (MIG-021, MIG-110 through MIG-114)
  • tests/unit/test_migration_column_mapper.py - Unit tests (16 tests)

Requirements: See MIGRATION_REQUIREMENTS.md Section 2

MIG-002: Bulk Document Import Tool

Status: ✅ COMPLETE
Priority: P0 (Pre-Launch Blocker)

  • CLI tool: tax-migrate documents <folder> (with --preview mode)
  • Folder structure pattern matching (client-name-first, account-first, year-first)
  • Document classification by filename (W-2, 1099, 1098, K-1, identity docs, etc.)
  • Client matching with fuzzy name support (configurable threshold)
  • Malware scanning integration (placeholder for ClamAV)
  • Unmatched document quarantine (--quarantine-dir option)
  • Import report with match statistics (classification stats, match confidence)
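As an illustration of the folder structure pattern matching above, a minimal path parser might look like the sketch below. The pattern names match the list above, but the exact path layouts and return shape are assumptions, not taken from the actual tool:

```python
from pathlib import PurePosixPath

def parse_import_path(path: str, pattern: str) -> dict:
    """Split a relative document path into client/year/filename parts.

    Assumed layouts (hypothetical examples):
      client-name-first: Smith-John/2023/W2.pdf
      account-first:     A1042/2023/W2.pdf
      year-first:        2023/Smith-John/W2.pdf
    """
    parts = PurePosixPath(path).parts
    if len(parts) != 3:
        raise ValueError(f"expected 3 path segments, got {len(parts)}: {path}")
    if pattern == "client-name-first":
        client, year, filename = parts
        key = "client_name"
    elif pattern == "account-first":
        client, year, filename = parts  # first segment is an account number
        key = "account"
    elif pattern == "year-first":
        year, client, filename = parts
        key = "client_name"
    else:
        raise ValueError(f"unknown pattern: {pattern}")
    return {key: client, "year": int(year), "filename": filename}
```

The parsed client segment would then feed the fuzzy client matcher.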

Files:
  • src/migration/document_classifier.py - Document type classification (MIG-040, MIG-042, MIG-043)
  • src/migration/client_matcher.py - Client matching with fuzzy support (MIG-050 through MIG-054)
  • src/migration/document_importer.py - Main import logic (MIG-030 through MIG-036)
  • scripts/tax_migrate.py - CLI entry point (documents subcommand)
  • tests/unit/test_migration_document_classifier.py - Unit tests (22 tests)

Requirements: See MIGRATION_REQUIREMENTS.md Section 3

MIG-003: Historical Return Data Import

Status: ✅ COMPLETE
Priority: P0 (Pre-Launch Blocker)

  • CLI tool: tax-migrate history <file> (with --preview, --dry-run modes)
  • UltraTax export format support (per MIG-064)
  • Generic CSV format support (per MIG-065)
  • Prior year AGI import and client update (per MIG-060)
  • Filing status normalization (per MIG-061)
  • Refund/balance due import (per MIG-062)
  • Return history record creation (per MIG-070)
  • Client matching by external_id and SSN-4

Files:
  • src/migration/history_importer.py - History import logic (MIG-060 through MIG-070)
  • scripts/tax_migrate.py - CLI entry point (history subcommand)
  • tests/unit/test_migration_history_importer.py - Unit tests (27 tests)

Requirements: See MIGRATION_REQUIREMENTS.md Section 4

MIG-004: Migration Validation & Rollback

Status: ✅ COMPLETE
Priority: P1

  • Migration summary report generation (per MIG-110)
  • Source vs imported count comparison (per MIG-111)
  • Error and warning listing (per MIG-112)
  • Sample client report for spot-checking (per MIG-113)
  • Rollback preview (per MIG-140)
  • Rollback CLI: tax-migrate rollback <batch-id> (per MIG-142)
  • List recent migration batches: tax-migrate rollback --list
  • Validate batch: tax-migrate rollback <batch-id> --validate
  • Generate report: tax-migrate rollback <batch-id> --report

Files:
  • src/migration/migration_validator.py - Validation and rollback logic
  • scripts/tax_migrate.py - CLI entry point (rollback subcommand)
  • tests/unit/test_migration_validator.py - Unit tests (16 tests)

Requirements: See MIGRATION_REQUIREMENTS.md Sections 6, 8


Phase 1: Foundation (Complete)

FOUND-001: Project Structure Setup

Status: Complete
Priority: P0

  • Create ARCHITECTURE.md
  • Create CLAUDE.md
  • Create RUNBOOK.md
  • Create backlog.md
  • Create src/ directory structure
  • Create config.yaml with commented parameters
  • Set up requirements.txt with core dependencies
  • Create .env.example template

FOUND-002: Service Centralization Framework

Status: Complete
Priority: P0

  • Create src/services/base_service.py
  • Create src/services/__init__.py (ServiceRegistry)
  • Create src/config/settings.py
  • Create src/config/secrets.py (AWS Secrets Manager - future)

FOUND-003: Snowflake Service Implementation

Status: Not Started
Priority: P1

  • Create src/services/snowflake_service.py
  • Implement connection pooling
  • Add query logging for audit
  • Add health check method

FOUND-004: Aurora Service Implementation

Status: Complete
Priority: P1

  • Create src/services/aurora_service.py
  • Implement connection pooling
  • Add transaction support
  • Add health check method
  • Add database error mapping (ConflictError, ValidationError, etc.)

Phases 2+: Client, Document & Tax Preparation (Sequences 2-13, Complete)

Sequence 2: Client Identity (Complete)

  • S2-001: Client Self-Registration (Done)
  • S2-002: Identity Verification with Persona (Done)
  • S2-003: Returning Client Authentication (Done)
  • S2-004: Profile Management (Done)

Sequence 3: Engagement (Complete)

  • S3-001: Engagement Letter Generation (Done)
  • S3-002: E-Signature via Google Docs (Done)
  • S3-003: Form 7216 Consent Management (Done)

Sequence 4: Document Management (Complete)

  • S4-001: Document Upload via Portal (Done)
  • S4-002: Document Upload via Email (Done)
  • S4-003: Malware Scanning (Done)
  • S4-004: Document Classification and Extraction (Done)
  • S4-005: SmartVault Integration (Done)
  • S4-006: SurePrep Integration (Done)
  • S4-007: Document Checklist Management (Done)
  • S4-008: Manual Extraction Correction (Done)

Sequence 5: AI Analysis (Complete)

  • S5-001: Preliminary Return Analysis (Done)
  • S5-002: Prior Year Comparison (Done)
  • S5-003: Missing Document Detection (Done)
  • S5-004: AI-Powered Q&A (Done)
  • S5-005: Extraction Corrections (Done)
  • S5-006: Analysis Dashboard (Done)

Sequence 6: Tax Preparation Workflow (Complete)

  • S6-001: Workflow State Machine (Done)
  • S6-002: Preparer Assignment (Done)
  • S6-003: Reviewer Assignment (Done)
  • S6-004: Progress Tracking (Done)
  • S6-005: Dashboard Views (Done)
  • S6-006: Time Tracking (Done)
  • S6-007: Priority Management (Done)
  • S6-008: Batch Operations (Done)
  • S6-009: Analytics (Done)

Sequence 7: Preparer & Reviewer Interface (Complete)

  • S7-001: Interactive Review Interface (Done)
  • S7-002: AI Q&A Assistant (Done)
  • S7-003: Change Tracking (Done)
  • S7-004: Final Review Package (Done)

Sequence 8: Client Communication (Complete)

  • S8-001: Secure Portal Messaging (Done)
  • S8-002: Email Notifications (Done)
  • S8-003: SMS Notifications (Done)
  • S8-004: Callback Scheduling (Done)
  • S8-005: Notification Preferences (Done)

Sequence 9: Client Delivery (Complete)

  • S9-001: Tax Package Generation (Done)
  • S9-002: Google Workspace Signature Integration (Done)
  • S9-003: Payment Authorization (Done)

Sequence 10: E-Filing Status Tracking (Complete)

  • S10-001: E-File Status Monitoring (Done)
  • S10-002: Mark Return Ready for Filing (Done)
  • S10-003: Filing Ready Check (Done)
  • S10-004: Rejection Management (Done)

Sequence 11: Billing & Payments (Complete)

  • S11-001: Stripe Service Implementation (Done)
  • S11-002: Invoice Generation (Done)
  • S11-003: Payment Collection (Done)
  • S11-004: Payment Reminders (Done)

Sequence 12: Estimated Tax Management (Complete - 3 of 4 stories)

  • S12-001: Estimated Tax Calculation (Done)
  • S12-002: Voucher Generation (Done)
  • S12-003: Calendar Event Generation (Done)
  • S12-004: Estimated Tax Reminders (DEFERRED - see note below)

S12-004 Deferral Note: By sending estimated tax reminders, the firm implies responsibility for notifying clients. When emails or SMS are missed (spam filters, wrong number, etc.), clients blame the firm for their missed payments and penalties. This inappropriately shifts liability. Clients should use calendar events (S12-003) instead.

Implementation Files:
  • Domain: src/domain/estimated_tax.py
  • Repository: src/repositories/estimated_tax_repository.py
  • Workflows: src/workflows/estimated_tax/ (calculation, voucher, calendar)
  • API: src/api/routes/estimated_tax.py
  • Schemas: src/api/schemas/estimated_tax_schemas.py

Sequence 13: AI Chat (Complete)

  • S13-001: Chat Domain & Repository (Done)
  • S13-002: Chat Service with CLI/API Modes (Done)
  • S13-003: Chat API Routes (Done)
  • S13-004: Staff-App Chat UI (Done)
  • S13-005: Chat Integration Tests (Pending)

Features:
  • Tax-focused system prompt with off-topic redirection
  • Single-client scope enforcement (no cross-client queries)
  • CLI mode for local dev (uses developer's Claude subscription)
  • API mode for production (AWS Bedrock)
  • Extended context building (client, return, documents, prior year)
  • Floating drawer UI in staff-app ClientDetailPage
  • Token/cost tracking per session
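The single-client scope enforcement noted in the feature list could be as simple as funneling every context lookup through the session's client ID. A minimal sketch (names hypothetical; the real chat service likely wires this through its repositories):

```python
class ScopeViolationError(Exception):
    """Raised when a chat request references a client outside its session."""

def build_chat_context(session_client_id: str, requested_client_id: str,
                       loaders: dict) -> dict:
    """Load chat context (client, return, documents, ...) for ONE client.

    loaders maps a context name to a callable taking a client_id, so no
    code path can fetch data for any client other than the session's.
    """
    if requested_client_id != session_client_id:
        raise ScopeViolationError("cross-client queries are not allowed")
    return {name: load(requested_client_id) for name, load in loaders.items()}
```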

Implementation Files:
  • Domain: src/domain/chat.py
  • Repository: src/repositories/chat_repository.py
  • Service: src/services/chat_service.py
  • API: src/api/routes/chat.py
  • Schemas: src/api/schemas/chat_schemas.py
  • Frontend: frontend/apps/staff-app/src/components/chat/
  • Hook: frontend/apps/staff-app/src/hooks/useChat.ts

Future (Backlog):
  • S13-006: Client Portal Chat
  • S13-007: WebSocket Streaming
  • S13-008: Cross-Client Queries (after security/cost analysis)
  • S13-009: Opus Escalation ("Ask the Expert" button)


Technical Debt

TD-001: Java Build Configuration

Status: Not Started
Description: Need to establish Maven/Gradle configuration for Java components
Notes: Should mirror ingestion engine patterns for consistency

TD-002: CI/CD Pipeline

Status: Complete
Description: Set up GitHub Actions for automated testing and deployment
Implementation: .github/workflows/ci.yml

Pipeline jobs:
- [x] lint - Ruff linting and formatting checks
- [x] unit-tests - Fast tests with coverage reporting (~30s)
- [x] integration-tests - PostgreSQL service container (~2m)
- [x] e2e-tests - PostgreSQL + LocalStack service containers (~2m)

Triggers: Push to main, pull requests to main

TD-003: Testing Framework

Status: Complete
Description: Establish pytest structure for Python, JUnit for Java
Progress: Full test pyramid implemented with 1,522 tests passing

| Test Type | Count | Percent | Target |
|---|---|---|---|
| Unit | 1,290 | 85% | 80% |
| Integration | 182 | 12% | 15% |
| E2E | 50 | 3% | 5% |

Coverage:
  • Unit: Exceptions, middleware, domain entities, services, workflows (S2-S11)
  • Integration: Repositories (Client, Document, Engagement, Consent, Extraction, Checklist, Workflow, Review, Messaging, Delivery, EFiling, Invoice), S3Service with LocalStack
  • E2E: Client, Verification, Engagement, Consent, Document, Messaging, Delivery, EFiling, Invoice API endpoints

TD-004: UAT Script Creation

Status: Not Started
Priority: P1 (Pre-Launch)
Description: Create User Acceptance Testing scripts for client validation

Traceability:
- [ ] Create requirements traceability matrix (RTM) linking UAT scripts to USER_STORIES.md
- [ ] Each UAT test case references a specific story ID (e.g., S2-001, S10-003)
- [ ] Track coverage percentage against requirements
- [ ] Bug tracking with requirement linkage for root cause analysis

UAT Scripts:
- [ ] Define UAT scenarios for each sequence (S2-S18)
- [ ] Create step-by-step testing scripts with requirement references
- [ ] Define expected outcomes and acceptance criteria per story
- [ ] Create UAT reporting templates with pass/fail per requirement
- [ ] Document rollback procedures for failed UAT

TD-005: Test Data Generator (TDG-001)

Status: Complete (2024-12-26)
Priority: P1 (Pre-Launch)
Description: Comprehensive test data generator with realistic scenarios and document quality variations

User Personas (each with complete document sets):
- [x] Individual (Simple): Single W-2, standard deductions, single state
- [x] Individual (Heavy Investor): Multiple 1099-DIVs, 1099-Bs, K-1s
- [x] Business Owner: Schedule C, 1099-NECs, business expenses, mileage
- [x] Sub-Contractor: Multiple 1099-NECs, 1099-Ks, mileage logs
- [x] Complex Individual: Multi-state, K-1s, 1098 mortgage
- [x] S-Corp Owner: K-1 from S-Corp, W-2 from own company, distributions
- [x] Retiree: 1099-R, SSA-1099, investment income

Document Types Generated:
- [x] W-2s (single/multiple employers)
- [x] 1099 series (DIV, INT, B, NEC, MISC, R, SSA, K)
- [x] 1098 (mortgage interest)
- [x] K-1s (partnership 1065, S-Corp 1120S)
- [x] PDF bank statements
- [x] PDF credit card statements
- [x] Receipt images (PNG with quality variations)
- [x] Mileage logs
- [ ] Prior year tax returns (future enhancement)
- [ ] ID documents (future enhancement)

Image Quality Variations:
- [x] Excellent quality (clean generation)
- [x] Medium quality (slight blur)
- [x] Poor quality (blur, rotation)
- [x] Terrible quality (heavy blur, rotation, JPEG artifacts)

Multi-Batch Scenarios:
- [x] Batch assignment based on persona configuration
- [x] Delayed documents (K-1s, corrected 1099s) arrive in later batches
- [x] 1-4 batch scenarios per persona

Generator Implementation:
- [x] CLI tool: python scripts/generate_test_data.py --persona business_owner
- [x] Configurable output directory (--output)
- [x] Reproducible with seed (--seed 42)
- [x] Generate realistic but fake PII (987-65-xxxx SSN range)
- [x] 69 unit tests passing
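The seeded, reproducible fake-PII generation could be as simple as driving every random choice from one seeded RNG. A sketch (function name hypothetical; the generator's internals may differ), using the 987-65-xxxx range noted above for clearly-fake SSNs:

```python
import random

def fake_ssn(rng: random.Random) -> str:
    """Clearly-fake SSN in the 987-65-xxxx range used by the generator."""
    return f"987-65-{rng.randint(0, 9999):04d}"

# Same seed -> identical test data, so any test failure is reproducible.
rng = random.Random(42)
ssns = [fake_ssn(rng) for _ in range(3)]
```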

Files:
  • scripts/generate_test_data.py - CLI entry point
  • src/testing/data_generator.py - Core generation logic
  • src/testing/document_renderer.py - PDF/image rendering
  • src/testing/utils.py - PII generation utilities
  • src/testing/personas/personas.yaml - Persona configurations
  • tests/unit/testing/ - Unit tests

TD-006: Placeholder and Assumption Audit

Status: In Progress
Priority: P1 (Pre-Launch)
Description: Review and address "for now", "placeholder", and "TODO" items throughout the codebase
Plan: See docs/plans/TECH_DEBT_CLEANUP.md

Audit Complete (2024-12-24): Found 89 items, categorized as:
  • 35 Production Blockers: External service stubs (Email, SMS, Persona, SmartVault, SurePrep, Google)
  • 12 Development Conveniences: Acceptable placeholders (PDF generation, placeholder citations)
  • 42 Not Issues: Documentation/expected behavior (SQL placeholders, template variables)

Non-API Fixes Complete (2024-12-24): four planned items fixed without external API credentials, plus one incidental repository fix:
- [x] Consent route client lookup (fetches real client data from ClientRepository)
- [x] AI QA citation resolution (resolves document names to actual IDs via DocumentRepository)
- [x] EFiling ready checks (checks Form 8879 signatures and document checklist)
- [x] Placeholder PDF generation (ReportLab implementation)
- [x] Fixed ChecklistRepository abstract method implementation

Production blockers organized into 6 phases:
- [x] Phase 0: Audit complete
- [x] Phase 0b: Non-API fixes complete (4 items)
- [ ] Phase 1: EmailService + SMSService (5-7 hrs) - requires API credentials
- [ ] Phase 2: PersonaService (4-5 hrs) - requires API credentials
- [ ] Phase 3: SmartVaultService (6-8 hrs) - requires API credentials
- [ ] Phase 4: SurePrepService (8-10 hrs) - requires API credentials
- [ ] Phase 5: GoogleService (6-8 hrs) - requires API credentials
- [ ] Phase 6: Webhook Security (2-3 hrs)

Lesson learned: "Simpler" shortcuts that mask real requirements create hidden bugs. All assumptions should be explicitly documented and validated.

TD-007: Code Audit Findings (AUDIT-001 through AUDIT-008)

Status: Documented
Priority: P2 (Post-MVP)
Audit Date: 2024-12-27
Report: docs/audits/CODE_AUDIT_2024-12-27.md

Overall Rating: 8.5/10 - Production-ready architecture in src/, development-only duplication in local_api.py

Ratings by Area:
  • Architecture: 9/10 (excellent layering: services → repositories → domain)
  • Service Centralization: 9/10 (25 services inherit BaseService)
  • API Client Centralization: 10/10 (single api.ts for all calls)
  • Connection Pooling: 6/10 (src/ good; local_api.py needs work)
  • Maintainability: 8/10 (some large files need splitting)
  • Scalability: 8/10 (async patterns ready; needs pagination for large batches)

Action Items:
- [x] AUDIT-001: Document local_api.py as dev-only in ARCHITECTURE.md (15 min) ✅ 2024-12-27
- [ ] AUDIT-002: Add connection pooling to local_api.py if used for extended demos (2 hrs)
- [x] AUDIT-003: ~~Split repository files exceeding 1,000 lines~~ - Cancelled (single developer, no merge conflict risk)
- [ ] AUDIT-004: Add inline comments to complex workflow logic (2 hrs) - Comments improve Claude's accuracy and speed
- [ ] AUDIT-005: Move hardcoded values to environment variables (1 hr)
- [ ] AUDIT-006: Consider migrating local_api.py to use src/ modules (8 hrs)
- [ ] AUDIT-007: Add request/response logging middleware (2 hrs)
- [ ] AUDIT-008: Implement caching layer for read-heavy endpoints (4 hrs)

TD-008: Security Audit Findings (SEC-001 through SEC-016)

Status: Documented
Priority: P0 (Critical), P1 (High/Medium), P2 (Low)
Audit Date: 2024-12-27
Report: docs/audits/SECURITY_AUDIT_2024-12-27.md

Overall Security Rating: 7/10 - Solid foundation, needs hardening before production

Critical (P0 - Fix Before Production):
- [ ] SEC-001: XSS via template injection - src/services/template_service.py:276-301
- [ ] SEC-002: dangerouslySetInnerHTML without sanitization - TourOverlay.tsx:187-217
- [ ] SEC-003: Missing security headers - src/api/main.py
- [x] SEC-004: Dev API weak auth - Already documented as dev-only per AUDIT-001

High (P0 - Fix Before Production):
- [ ] SEC-005: ORDER BY SQL injection risk - base_repository.py:131
- [ ] SEC-006: CORS overly permissive - src/api/main.py:91-98
- [ ] SEC-007: Missing role enforcement on list endpoints - clients.py:66
- [ ] SEC-008: Stripe test keys in plain text - .env:44-45

Medium (P1 - Fix in V1.1):
- [ ] SEC-009: No token revocation mechanism - auth.py
- [ ] SEC-010: Payment bypass flag needs compliance doc - efiling.py:31-35
- [ ] SEC-011: Search parameter unbounded - clients.py:70
- [ ] SEC-012: Filename not sanitized in S3 path - local_api.py:879
- [ ] SEC-013: Request size limits not enforced - main.py, nginx.conf

Low (P2 - Address When Convenient):
- [ ] SEC-014: In-memory rate limiting (needs Redis for prod) - rate_limiting.py:68
- [ ] SEC-015: Error messages may leak info - local_api.py:1513
- [ ] SEC-016: Email query parameter not validated - registration.py:252

TD-009: Compliance Audit Findings (COMP-001 through COMP-022)

Status: Documented
Priority: P0 (Critical), P1 (High), P2 (Medium)
Audit Date: 2024-12-27
Report: docs/audits/COMPLIANCE_AUDIT_2024-12-27.md

Overall Compliance Rating: 70/100 - Strong foundation with critical implementation gaps

Critical (P0 - Fix Before Production):
- [ ] COMP-001: AI/cloud processing consent not implemented - Form 7216 violation risk
- [ ] COMP-002: Form 7216 consent not enforced at e-filing - efiling_workflow.py
- [ ] COMP-003: Authentication events not logged - audit_service.log_auth() never called
- [ ] COMP-004: Document/client access not logged - only modifications tracked
- [ ] COMP-005: Account lockout not implemented - SEC-005 design only
- [ ] COMP-006: Field-level encryption not implemented - ENC-004 design only
- [ ] COMP-007: Conflict of interest checks missing - Circular 230 requirement
- [ ] COMP-008: Form 2848 (POA) workflow missing - domain model only

High (P1 - Fix in V1.1):
- [ ] COMP-009: PTIN expiration not enforced in workflows
- [ ] COMP-010: Competency/credential tracking missing - CIR-004
- [ ] COMP-011: MFA not implemented - framework only
- [ ] COMP-012: Employee training tracking missing - WISP requirement
- [ ] COMP-013: Database immutability not enforced - REVOKE statements commented
- [ ] COMP-014: Incident detection/alerting not implemented
- [ ] COMP-015: Persona integration in dry-run mode
- [ ] COMP-016: Authorization denials not logged to audit

Medium (P2 - Address When Convenient):
- [ ] COMP-017: Legal hold not implemented - RET-004
- [ ] COMP-018: Secure deletion not implemented - RET-005
- [ ] COMP-019: Vendor security assessments missing
- [ ] COMP-020: Password policy not enforced
- [ ] COMP-021: IP/device context not auto-captured
- [ ] COMP-022: Consent table schema mismatch

Effort Estimate: 85-120 hours total (P0: 40-60 hrs, P1: 30-40 hrs, P2: 15-20 hrs)

TD-010: Remove Frontend Mock Mode Code

Status: Not Started
Priority: P3 (Low - Technical Debt)
Added: 2024-12-28

Description: Remove all DEV_MODE conditional branches and MOCK_* data arrays from frontend/packages/ui/src/lib/api.ts. Mock mode was disabled on 2024-12-28 in favor of using real API with seeded database.

Scope:
  • Remove ~30 if (DEV_MODE) { ... } conditional blocks
  • Remove MOCK_CLIENTS, MOCK_RETURNS, MOCK_DOCUMENTS, MOCK_ANALYSES, MOCK_CHAT_HISTORY arrays (~300 lines)
  • Remove DEV_MODE constant and related comments

Effort Estimate: 1 hour


P0 Master Priority List (Pre-Production Blockers)

Status: Consolidated 2024-12-27
Total Items: 15 (Security: 6, Compliance: 8, Code: 1 conditional)
Total Effort: 45-50 hours

Implementation order optimized for dependencies and quick wins:

Phase 1: Quick Wins (2.5 hrs) - Parallelize

| ID | Issue | Effort | Fix |
|---|---|---|---|
| SEC-003 | Missing security headers | 1 hr | Add middleware for X-Frame-Options, CSP, HSTS |
| SEC-005 | ORDER BY SQL injection risk | 1 hr | Whitelist allowed column names |
| SEC-006 | CORS overly permissive | 30 min | Restrict methods/headers in main.py |
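The SEC-005 fix amounts to never interpolating a caller-supplied column name into SQL; instead, map it through an explicit whitelist. A minimal sketch (column names assumed, not taken from base_repository.py):

```python
# Hypothetical whitelist of sortable columns; anything else is rejected
# before it can reach the SQL string.
ALLOWED_SORT_COLUMNS = {"created_at", "last_name", "status"}

def build_order_by(column: str, descending: bool = False) -> str:
    """Return a safe ORDER BY clause, or raise for unknown columns."""
    if column not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unsupported sort column: {column!r}")
    return f"ORDER BY {column} {'DESC' if descending else 'ASC'}"
```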

Phase 2: XSS Fixes (3 hrs) - Parallelize

| ID | Issue | Effort | Fix |
|---|---|---|---|
| SEC-001 | XSS via template injection | 1 hr | Use html.escape() or Jinja2 autoescape |
| SEC-002 | dangerouslySetInnerHTML | 2 hrs | Add DOMPurify or refactor to React components |
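For SEC-001, the core of the fix is escaping user-controlled values before they reach HTML output (or enabling Jinja2 autoescape globally). A minimal sketch, with a hypothetical placeholder name standing in for the real template variables:

```python
import html

def render_greeting(template: str, client_name: str) -> str:
    """Substitute a user-controlled value after HTML-escaping it."""
    return template.replace("{client_name}", html.escape(client_name))

safe = render_greeting("<p>Hello, {client_name}</p>",
                       "<script>alert(1)</script>")
# The injected tag is rendered inert as &lt;script&gt;...
```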

Phase 3: Access Control (5 hrs) - Sequential

| ID | Issue | Effort | Fix |
|---|---|---|---|
| SEC-007 | Missing role enforcement | 2 hrs | Filter list endpoints by user role |
| COMP-005 | Account lockout missing | 3 hrs | Add lockout fields, check before auth |

Phase 4: Audit Logging (5 hrs) - Sequential after Phase 3

| ID | Issue | Effort | Fix |
|---|---|---|---|
| COMP-003 | Auth events not logged | 2 hrs | Call audit_service.log_auth() in auth routes |
| COMP-004 | Access not logged | 3 hrs | Add log_access() to all GET endpoints |

Phase 5: Consent Enforcement (6 hrs) - Sequential

| ID | Issue | Effort | Fix |
|---|---|---|---|
| COMP-001 | AI processing consent missing | 4 hrs | Add USE_AI_PROCESSING, check in BedrockService |
| COMP-002 | E-filing consent not enforced | 2 hrs | Validate Form 7216 before mark_ready_for_filing |

Phase 6: New Workflows (16 hrs) - Parallelize

| ID | Issue | Effort | Fix |
|---|---|---|---|
| COMP-007 | COI checks missing | 8 hrs | COI table, check workflow, API, audit log |
| COMP-008 | Form 2848 POA missing | 8 hrs | Form generation, signature, access control |

Phase 7: Data Protection (8 hrs)

| ID | Issue | Effort | Fix |
|---|---|---|---|
| COMP-006 | Field encryption missing | 8 hrs | encryption_service.py with pgcrypto wrapper |

Conditional

| ID | Issue | Effort | Condition |
|---|---|---|---|
| AUDIT-002 | local_api.py connection pooling | 2 hrs | Only if used for extended demos |

Delegation Strategy

  • Phases 1-2: Can parallelize entirely (quick wins + XSS)
  • Phases 3-4: Must sequence (access control before audit logging)
  • Phase 5: Must sequence (consent type before e-filing check)
  • Phase 6: Can parallelize (COI and POA are independent)
  • Phase 7: Independent (can run anytime)

Expedited Analysis Pricing (PRICE-001)

Status: Backlog
Priority: P1 (Revenue Feature)
Target: Post-V1 Launch

Business Model

Tiered document analysis with freemium expedited processing:

| Processing Tier | Timing | Cost |
|---|---|---|
| Batch (Default) | Overnight | Included |
| Expedited | Immediate (~30 sec/doc) | Free quota, then $1.50/doc |

Freemium Model:
  • All clients receive 25 free expedited analyses per month
  • After quota is exhausted: $1.50 per document OR wait for the overnight batch
  • Counter resets on the 1st of each month

User Experience

Document Upload Flow:
  1. Uploaded documents show status "Pending Analysis"
  2. Banner displays: "3 documents pending. Overnight batch included, or analyze now."
  3. Show quota: "15 of 25 expedited analyses remaining this month"
  4. When quota is exhausted, the button changes to "Analyze Now ($1.50)" or "Queue for Overnight"

Recording Should Demonstrate:
  • Instant expedited analysis (within quota)
  • "Queued for overnight" state (quota exhausted or user choice)
  • Quota counter display

Implementation

Database Changes:
  • Add expedited_analyses_used counter to client/account table (resets monthly)
  • Add expedited_analyses_limit field (default: 25)
  • Add analysis_status enum to documents: pending → queued → processing → complete

Backend:
  • Quota check before expedited processing
  • Airflow DAG for overnight batch processing
  • Stripe integration for overage billing

Frontend:
  • Status badges on documents
  • Quota counter in header/sidebar
  • Expedited vs batch choice dialog
  • Overage payment confirmation

Files to Create:
  • src/domain/analysis_quota.py
  • src/repositories/analysis_quota_repository.py
  • src/workflows/document_analysis/batch_processor.py
  • dags/overnight_analysis_dag.py
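The quota check that gates expedited processing could be sketched as follows (field names mirror the database changes above; the decision shape is an assumption):

```python
from dataclasses import dataclass

@dataclass
class ClientQuota:
    expedited_analyses_used: int
    expedited_analyses_limit: int = 25  # default free quota per month

def expedited_decision(quota: ClientQuota) -> dict:
    """Decide whether an expedited analysis is free or billed at $1.50."""
    remaining = quota.expedited_analyses_limit - quota.expedited_analyses_used
    if remaining > 0:
        return {"allowed": True, "charge_usd": 0.0,
                "remaining_after": remaining - 1}
    # Quota exhausted: user may pay per document or queue for overnight batch.
    return {"allowed": True, "charge_usd": 1.50, "remaining_after": 0}
```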

Margin Analysis

At current AI costs (~$0.05/doc for analysis):
  • Expedited fee: $1.50/doc
  • Cost: $0.05/doc
  • Margin: ~97%

Free quota (25/month) costs ~$1.25/month per client - acceptable customer acquisition cost.


Duplicate Document Detection (DUP-001)

Status: ✅ Implemented (2024-12-27)
Priority: P2 (Data Quality)
Target: V1.1

Problem

Users may accidentally upload the same document multiple times, leading to:
  • Duplicate data in AI analysis
  • Confusion about which version is current
  • Wasted storage and processing

Solution

Detect duplicates on upload using file hash at two levels:

  1. On Upload: Calculate SHA-256 hash of file content
  2. Check Same Client: Query documents for this client with matching hash
  3. Check Cross-Client: Query all documents with matching hash (different client)
  4. Warn/Error: Show appropriate message based on match type

User Experience

Same Client Duplicate:

⚠️ Duplicate Document

"W2_Global_2023.pdf" matches a document uploaded on Dec 15, 2024.

[View Original]  [Cancel]
Note: No "Upload Anyway" option - re-uploading identical content has no benefit.

Cross-Client Match (likely wrong client selected):

🚨 Document Belongs to Another Client

This file is already linked to: John Smith
(uploaded Dec 10, 2024)

Did you select the wrong client?

[View in John Smith]  [Cancel]
Note: No auto-move - moving documents has cascading implications for both clients' returns.

Implementation (Completed)

Database Changes:
  • Added file_hash column to documents table (VARCHAR(64))
  • Added index: idx_documents_file_hash

API Changes:
  • Calculate SHA-256 hash on upload
  • Single query checks for an existing hash and returns the owning client_id
  • Returns 409 Conflict with duplicate info if found:
    • error: "duplicate_document"
    • is_same_client: boolean
    • existing_document: { id, client_id, client_name, filename, uploaded_at }

Files Modified:
  • scripts/local_api.py - Hash calculation and duplicate check
  • scripts/bootstrap.py - file_hash column and index
  • frontend/.../DocumentUploadPage.tsx - Duplicate dialog with View Original / Cancel
  • frontend/.../AnalysisDashboardPage.tsx - Inline drag-drop upload with the same duplicate detection

Tests Added:
  • tests/bdd/features/duplicate_detection.feature - 9 BDD scenarios
  • tests/bdd/step_defs/test_duplicate_detection.py - Step definitions
  • frontend/tests/features/duplicate-detection.feature - 12 Gherkin scenarios
  • frontend/tests/duplicate-detection.spec.ts - Playwright test stubs


Name Mismatch Detection (DUP-002)

Status: Backlog
Priority: P3 (Data Quality)
Target: V1.2

Problem

Document uploaded to wrong client - e.g., W-2 for "John Smith" uploaded to client "Jane Doe".

Solution

Post-processing check after document extraction:
  1. Extract name from document metadata (W-2, 1099, etc.)
  2. Fuzzy match against client name
  3. If mismatch, flag for review

Considerations

  • Requires OCR/extraction to complete first (not instant like hash check)
  • Fuzzy matching needed ("John Smith" vs "John A. Smith" vs "J. Smith")
  • Some docs have no name (receipts, bank statements) - skip check
  • Joint returns - both spouse names are valid matches
  • Threshold for fuzzy match confidence

User Experience

After document processing completes:

⚠️ Name Mismatch

This W-2 shows "John Smith" but client is "Jane Doe".

[View Document]  [Confirm Correct Client]  [Move to Different Client]


Concurrent Edit Handling (CONC-001)

Status: Planning Complete
Priority: P2 (Data Integrity)
Target: V1.1
Plan: docs/plans/OPTIMISTIC_LOCKING.md

Problem

Two users editing the same record simultaneously results in silent data loss (Last Write Wins). No conflict detection or user notification.

Solution: Optimistic Locking

Add version field to mutable entities. On update, verify version matches; if stale, return 409 Conflict with current data for user resolution.
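The version check described above could look like this sketch, with a dict standing in for a versioned table row (the real check would run inside a repository UPDATE ... WHERE version = ... statement):

```python
class StaleVersionError(Exception):
    """Maps to HTTP 409 Conflict; carries the current row for resolution."""
    def __init__(self, current: dict):
        super().__init__("stale version")
        self.current = current

def update_client(rows: dict, client_id: str, expected_version: int,
                  changes: dict) -> dict:
    """Apply changes only if the caller saw the latest version."""
    row = rows[client_id]
    if row["version"] != expected_version:
        raise StaleVersionError(current=row)  # 409 with current data
    row.update(changes)
    row["version"] += 1
    return row
```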

Scope

| Entity | Risk | Implementation |
|---|---|---|
| Clients | High | Version column, API check, conflict dialog |
| Tax Returns | High | Version column, API check, conflict dialog |
| Documents | Medium | Version column, API check, conflict dialog |
| Users | Low | Version column, API check |

Implementation Summary

| Component | Changes | Effort |
|---|---|---|
| Database | Add version column to 4 tables | 30 min |
| API | Version in responses, check on updates, 409 handling | 2 hrs |
| Frontend | State tracking, ConflictDialog component | 2 hrs |
| Tests | Unit (5), Integration (3), E2E (3), BDD (3) | 3 hrs |
| Total | | 8-10 hrs |

Delegation Strategy

  • Haiku: Database migration, API endpoint updates, unit tests
  • Sonnet: Frontend state management, conflict dialog, integration/E2E tests

Test Cases

  • Update with correct version succeeds, version increments
  • Update with stale version returns 409 Conflict
  • Missing version returns 400 Bad Request
  • Conflict response includes current data
  • Conflict dialog appears and functions correctly

Acceptance Criteria

  • All mutable entities have version column
  • All update endpoints require and verify version
  • 409 Conflict returned with current data on mismatch
  • Frontend displays conflict resolution dialog
  • User can discard changes or retry (overwrite)
  • All tests passing

Field-Level Merging (CONC-002)

Status: Planned Priority: P3 (UX Enhancement) Target: V1.2 Depends on: CONC-001 Plan: docs/plans/OPTIMISTIC_LOCKING.md Section 10

Problem

Record-level locking (CONC-001) forces all-or-nothing conflict resolution. If User A changes phone and User B changes email, they shouldn't conflict.

Solution

Track which fields were changed. Auto-merge non-overlapping changes. Only show conflict dialog for same-field edits.

Example - Auto-merge:

  • User A changes phone
  • User B changes email
  • Result: Both saved, no conflict

Example - Partial conflict:

  • User A changes phone + email
  • User B changes email
  • Result: Phone auto-merged, email conflict shown
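A sketch of the three-way merge, assuming the client keeps the originally loaded snapshot (`base`) to diff against:

```python
def merge_changes(base: dict, mine: dict, theirs: dict):
    """Three-way merge of field-level edits.

    base   - record as both users originally loaded it
    mine   - my edited copy (being saved now)
    theirs - copy already saved by the other user
    Returns (merged_record, conflicts); conflicts is empty when edits don't overlap.
    """
    merged = dict(theirs)
    conflicts = {}
    for field, value in mine.items():
        if value == base.get(field):
            continue  # I didn't change this field; keep theirs
        if theirs.get(field) in (base.get(field), value):
            merged[field] = value  # other user didn't touch it (or made the same edit)
        else:
            conflicts[field] = {"mine": value, "theirs": theirs.get(field)}
    return merged, conflicts
```

Only the fields in `conflicts` need the conflict dialog; everything in `merged` saves silently.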

Effort Estimate

4-6 hours (incremental on CONC-001)


Special Conflict Scenarios (CONC-003)

Status: Planned Priority: P3 (Data Integrity) Target: V1.2 Depends on: CONC-001 Plan: docs/plans/OPTIMISTIC_LOCKING.md Section 10

Scenarios

Delete vs Update: User A deletes record while User B is editing. → Block delete or warn "record was deleted"

Status Race: Two users change return status simultaneously. → State machine validates transitions

Assignment Collision: Two preparers claim same return. → "Already assigned to X" message

Bulk vs Single: Bulk update while individual edit in progress. → Per-record version check, report partial failures

Parent-Child Constraints: Delete client with active returns. → "Cannot delete: has N active returns"
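The status-race guard could be sketched as a transition table; the status names here are hypothetical - the real values live in the DATABASE_SCHEMA.sql enums:

```python
# Hypothetical status set; the real values live in the DATABASE_SCHEMA.sql enums.
VALID_TRANSITIONS = {
    "in_preparation": {"in_review"},
    "in_review": {"in_preparation", "ready_to_file"},
    "ready_to_file": {"filed", "in_review"},
    "filed": set(),  # terminal: no outbound transitions
}

def validate_transition(current: str, target: str) -> bool:
    # In a status race, the losing update's idea of 'current' is stale, so its
    # requested transition is checked against the winner's new state and rejected.
    return target in VALID_TRANSITIONS.get(current, set())
```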

Effort Estimate

6-8 hours


Client Actions (UI-001, UI-002)

UI-001: Send Portal Invite

Status: Backlog Priority: P2 (Client Experience) Target: V1.1

Send invitation email to client with portal access link. Triggers welcome email with secure login instructions.

UI-002: Create Engagement Letter

Status: Backlog Priority: P2 (Workflow) Target: V1.1

Generate engagement letter from template, pre-filled with client info. Integrates with S3-001/S3-002 (Engagement Letter Generation and E-Signature).


AI-Support Ticketing System (SUP-001)

Status: Backlog Priority: P2 (Client Experience) Target: V1.2

Problem

Client questions and issues currently handled via email/phone without centralized tracking. No AI assistance in triaging or responding to common questions.

Solution

Ticketing system with AI-powered triage and response assistance:

  1. Ticket Intake: Clients submit questions via portal, email, or phone (staff creates ticket)
  2. AI Triage: Auto-classify ticket type, priority, and route to appropriate staff
  3. AI Draft Response: Generate suggested response for common questions
  4. Staff Review: Staff reviews AI draft, edits if needed, and sends
  5. Resolution Tracking: Track time to resolution, client satisfaction
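The triage-to-routing step (2) might look like this sketch; the queue names and the shape of the triage output are illustrative assumptions, not the final schema:

```python
# Queue names and the triage-output shape are illustrative, not final design.
ROUTING = {
    "document_request": "admin_queue",
    "status_inquiry": "admin_queue",
    "tax_question": "preparer_queue",
    "technical_support": "support_queue",
    "billing_inquiry": "billing_queue",
    "appointment_request": "admin_queue",
}

def route_ticket(triage: dict) -> dict:
    """Turn the AI triage output into a queue assignment and priority."""
    # Frustrated or urgent sentiment bumps priority and flags senior review.
    escalate = triage.get("sentiment") in {"frustrated", "urgent"}
    return {
        "queue": ROUTING.get(triage["type"], "admin_queue"),
        "priority": "high" if escalate else triage.get("priority", "normal"),
        "needs_senior_review": escalate or triage["type"] == "tax_question",
    }
```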

Features

Ticket Types:

  • Document request (missing W-2, need copy of return)
  • Status inquiry (where is my refund, when will return be filed)
  • Tax question (can I deduct X, how does Y work)
  • Technical support (portal access, password reset)
  • Billing inquiry (invoice questions, payment issues)
  • Appointment request (schedule call, meeting)

AI Capabilities:

  • Auto-categorize incoming tickets by type and urgency
  • Suggest priority based on content analysis
  • Draft responses using client context (return status, documents, prior communications)
  • Flag tickets requiring senior staff attention
  • Detect sentiment (frustrated, urgent, routine)

Staff Interface:

  • Unified inbox with AI-suggested priorities
  • One-click approve AI draft or edit
  • Internal notes and escalation
  • Response templates with merge fields
  • Time tracking per ticket

Client Interface:

  • Submit new ticket from portal
  • View ticket status and history
  • Receive notifications on updates
  • Rate satisfaction after resolution

Implementation

Database:

  • tickets table (id, client_id, type, priority, status, created_at, resolved_at)
  • ticket_messages table (id, ticket_id, sender_type, content, ai_draft, created_at)
  • ticket_templates table (id, type, subject, body)

Services:

  • src/services/ticket_service.py - Core ticketing logic
  • src/services/ticket_ai_service.py - AI triage and response generation

API:

  • src/api/routes/tickets.py - CRUD endpoints
  • src/api/schemas/ticket_schemas.py - Request/response schemas

Frontend:

  • frontend/apps/staff-app/src/pages/TicketsPage.tsx - Staff ticket queue
  • frontend/apps/staff-app/src/components/tickets/TicketDetail.tsx - Individual ticket view
  • frontend/apps/client-portal/src/pages/SupportPage.tsx - Client ticket submission

AI Prompts:

  • src/prompts/tickets/triage.txt - Classify and prioritize
  • src/prompts/tickets/draft_response.txt - Generate response draft

Effort Estimate

  • Backend: 16-20 hours
  • Frontend: 12-16 hours
  • AI Integration: 8-12 hours
  • Testing: 8-10 hours
  • Total: 44-58 hours

Metrics

  • Average response time
  • First-contact resolution rate
  • AI draft acceptance rate
  • Client satisfaction scores
  • Tickets per client per season

Future Phases

PHASE-BKP: Bookkeeping Module

Status: Requirements Drafted Priority: Post-MVP Requirements: bookkeeping_requirements.md

Phased implementation:

  • Phase 1: Bank statement upload, AI categorization, QuickBooks export
  • Phase 2: Reconciliation, recurring transaction detection
  • Phase 3: Full bookkeeping (chart of accounts, P&L, Balance Sheet, two-way sync)

Pending Client Input:

  • Service level (tax-ready categorization vs full bookkeeping)
  • Target clients (business entities only vs all clients)
  • Pricing model (monthly retainer vs per-transaction)


AI Cost Optimization (SaaS Profitability)

Critical for SaaS business model. See COST_DETAIL.md for full analysis.

OPT-001: Metadata Caching

Status: ✅ Complete (2025-12-28) Priority: P0 (Core to SaaS margins) Estimated Impact: 60-75% token reduction, $0.24/return savings

Problem: Every AI query re-reads source documents. A W-2 referenced 10 times costs 10× the tokens.

Solution: Extract once, cache as markdown, reference forever.

Implementation:

  - [x] Create metadata MD file on first document scan (<350 lines each)
  - [x] Per-document: {client_id}/{return_year}/{doc_id}_metadata.md
  - [x] Per-return: {client_id}/{return_year}/return_summary.md
  - [x] Store in S3 alongside original documents
  - [x] Index metadata location in Aurora (document_metadata table)
  - [x] AI reads cache first; only re-scan if stale or confidence < 90%
  - [x] Refresh trigger when document updated (mark_document_stale)

Metadata file contents:

  • Document type, source, upload date, confidence score
  • Extracted values (wages, withholding, etc.) in structured tables
  • AI notes (prior year comparison, anomalies)
  • Flags and questions

Savings:

| Metric | Without Cache | With Cache |
|--------|---------------|------------|
| Tokens/return | 126,500 | ~50,000 |
| AI cost/return | $0.40 | $0.16 |
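The cache-first read can be sketched as follows; `meta_store` and `scanner` are illustrative stand-ins for the Aurora document_metadata index (plus S3 reads) and the AI extraction pipeline:

```python
def get_document_context(doc_id: str, meta_store, scanner) -> str:
    """Cache-first read: serve the cached metadata markdown unless stale or low-confidence.

    meta_store and scanner are illustrative stand-ins, not the real services.
    """
    meta = meta_store.get(doc_id)  # e.g. {"stale": bool, "confidence": float, "markdown": str}
    if meta and not meta["stale"] and meta["confidence"] >= 0.90:
        return meta["markdown"]  # cheap path: no document tokens spent
    markdown, confidence = scanner.extract(doc_id)  # full re-scan, then refresh the cache
    meta_store.put(doc_id, {"stale": False, "confidence": confidence, "markdown": markdown})
    return markdown
```

`mark_document_stale` would simply flip the `stale` flag so the next read falls through to a re-scan.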

OPT-002: Batch API Default

Status: Not Started Priority: P0 (Core to SaaS margins) Estimated Impact: 50-60% cheaper on batch-eligible tasks

Problem: Interactive API calls cost more than batch.

Solution: Default to batch processing with opt-in for real-time.

Implementation:

  - [ ] UX: "I'll have results ready tomorrow morning. [Start Live Session]"
  - [ ] Queue document processing overnight via Airflow
  - [ ] Pre-generate worksheets for morning review
  - [ ] Only Preparer Q&A requires real-time
  - [ ] Track batch vs interactive usage for cost analysis

Batch-eligible tasks:

  • Document classification and extraction
  • Prior year comparison
  • Missing document detection
  • Worksheet generation
  • Rejection analysis
  • Tax reminders

Savings: 80% batch acceptance → $0.40 → $0.28/return
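The batch-vs-interactive routing might be sketched as below; the task-type names are illustrative stand-ins for the batch-eligible list above:

```python
# Task names are illustrative; the real set maps to the batch-eligible list.
BATCH_ELIGIBLE = {
    "classification", "extraction", "prior_year_comparison",
    "missing_doc_detection", "worksheet_generation",
    "rejection_analysis", "tax_reminders",
}

def route_ai_task(task_type: str, user_requested_live: bool = False) -> str:
    """Default to the overnight batch queue; go interactive only when needed."""
    if user_requested_live or task_type not in BATCH_ELIGIBLE:
        return "interactive"  # e.g. preparer Q&A, or an explicit "Start Live Session"
    return "batch"  # queued for the overnight Airflow run
```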

OPT-003: Model Delegation Strategy

Status: Not Started Priority: P1 Estimated Impact: 38% AI cost reduction

Solution: Use cheapest model capable for each task.

SONNET (Orchestrator) → HAIKU (60% - extraction)
                      → SONNET (35% - analysis)
                      → OPUS (5% - expert review)

Implementation:

  - [ ] Haiku for document extraction and classification
  - [ ] Sonnet for Q&A, comparisons, worksheets
  - [ ] Opus only for tax code interpretation, audit risk, complex scenarios
  - [ ] Auto-escalation when confidence < 80% or complex entity types
  - [ ] User option: "Ask the expert" to force Opus

Savings: All-Sonnet ($0.64) → Delegation ($0.40) = 38% reduction

OPT-004: Combined Optimization Target

Priority: P0 Target: $0.12/return AI cost (vs $0.40 baseline)

Combined impact:

  1. Batch processing (80%): $0.40 → $0.28
  2. Metadata caching (75%): $0.28 → $0.12
  3. Model delegation: Further optimization within each tier

At scale (10,000 returns):

  • Baseline: $4,000/year AI cost
  • Optimized: $1,200/year AI cost
  • Annual savings: $2,800

OPT-005: Prompt Compression

Status: Not Started Priority: P2 (Quick Win) Estimated Impact: 10-20% additional token reduction

Problem: System prompts, tax code references, and instructions repeat on every call.

Solution: Use prompt compression tools (LLMLingua) to reduce prompt size while preserving meaning.

Implementation:

  - [ ] Evaluate LLMLingua for system prompt compression
  - [ ] Compress static instructions and tax code references
  - [ ] Benchmark quality vs compression ratio
  - [ ] A/B test compressed vs full prompts

OPT-006: Output Token Limits

Status: Not Started Priority: P2 (Quick Win) Estimated Impact: 5-15% output token reduction

Problem: AI generates verbose explanations when structured data is sufficient.

Solution: Force shorter, structured responses for extraction and classification tasks.

Implementation:

  - [ ] Set max_tokens limits per task type
  - [ ] Use structured JSON outputs for extraction
  - [ ] Reserve verbose mode for Q&A only
  - [ ] Track output token usage by task type

OPT-007: Semantic Caching

Status: Not Started Priority: P2 Estimated Impact: 15-30% reduction on repeated queries

Problem: Similar questions hit the AI repeatedly. "How do I report crypto?" and "Where do cryptocurrency gains go?" are the same question.

Solution: Cache responses by meaning, not exact match. Return cached answer for semantically similar queries.

Implementation:

  - [ ] Implement vector embeddings for query similarity
  - [ ] Set similarity threshold for cache hits
  - [ ] Focus on client Q&A and preparer questions
  - [ ] Track cache hit rate and quality
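A minimal sketch of the cache using a pure-Python cosine similarity; `embed` stands in for a real embedding model and the 0.92 threshold is an assumed starting point, not a validated setting:

```python
import math

def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

class SemanticCache:
    """Cache answers by meaning: a query whose embedding is close enough to a
    stored query's embedding reuses the stored answer instead of calling the AI."""
    def __init__(self, embed, threshold: float = 0.92):
        self.embed = embed            # stand-in for a real embedding model
        self.threshold = threshold    # assumed similarity cutoff; tune for quality
        self.entries = []

    def get(self, query: str):
        vec = self.embed(query)
        for stored_vec, answer in self.entries:
            if _cosine(vec, stored_vec) >= self.threshold:
                return answer         # semantic hit: no AI call needed
        return None

    def put(self, query: str, answer: str):
        self.entries.append((self.embed(query), answer))
```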

OPT-008: RAG Optimization

Status: Not Started Priority: P2 Estimated Impact: 20-30% context token reduction

Problem: Sending entire prior-year returns when only specific sections are relevant.

Solution: Retrieve only relevant document sections based on the question.

Implementation:

  - [ ] Chunk documents into semantic sections
  - [ ] Index chunks with vector embeddings
  - [ ] Retrieve top-k relevant chunks per query
  - [ ] Benchmark quality vs full-document approach

OPT-009: Confidence-Based Escalation

Status: Not Started Priority: P3 Estimated Impact: 10-20% model cost reduction

Problem: Pre-routing tasks to specific models may over-allocate expensive models.

Solution: Let Haiku attempt everything first, escalate dynamically based on confidence scores.

Implementation:

  - [ ] Define confidence thresholds per task type
  - [ ] Implement escalation pipeline (Haiku → Sonnet → Opus)
  - [ ] Track escalation rates and quality
  - [ ] Tune thresholds based on error rates
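The escalation pipeline could be sketched as below; the 0.80 thresholds mirror the delegation notes in OPT-003 but remain assumptions to tune per task type:

```python
LADDER = ["haiku", "sonnet", "opus"]
THRESHOLDS = {"haiku": 0.80, "sonnet": 0.80}  # assumed; Opus is the final tier, no threshold

def run_with_escalation(task, models: dict):
    """Cheapest-first: try each tier, escalate while confidence is below threshold.

    models maps tier name -> callable returning {"answer": ..., "confidence": float}.
    """
    for tier in LADDER:
        result = models[tier](task)
        if result["confidence"] >= THRESHOLDS.get(tier, 0.0):
            return tier, result
    return LADDER[-1], result  # Opus output stands even if still low-confidence
```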

OPT-010: Fine-Tuning (Future)

Status: Not Started Priority: P4 (Volume-dependent) Estimated Impact: 30-50% cost reduction on fine-tuned tasks

Problem: Using general-purpose models for repetitive, domain-specific tasks.

Solution: Train smaller models on our specific tasks (document classification, common Q&A).

Implementation:

  - [ ] Collect 6+ months of production data
  - [ ] Identify high-volume, repetitive tasks
  - [ ] Fine-tune Haiku or open-source model (LoRA/QLoRA)
  - [ ] A/B test fine-tuned vs general model

OPT-011: Knowledge Distillation (Future)

Status: Not Started Priority: P4 (Volume-dependent) Estimated Impact: 50-85% cost reduction on distilled tasks

Problem: Want Opus-quality responses at Haiku prices.

Solution: Use Opus to generate training data, teach Haiku to mimic Opus responses.

Implementation:

  - [ ] Generate "gold standard" responses with Opus
  - [ ] Create training dataset from Opus outputs
  - [ ] Train Haiku to reproduce Opus-quality responses
  - [ ] Deploy distilled model for production use


Design Decisions Log

| ID | Date | Decision | Rationale |
|----|------|----------|-----------|
| DD-001 | 2024-12-22 | Service centralization as core pattern | Single point of access for all external services - easier maintenance, testing, audit |
| DD-002 | 2024-12-22 | Mixed Java/Python architecture | Java for performance-critical processing, Python for API/orchestration/AI |
| DD-003 | 2024-12-22 | Shared config.yaml for both languages | Single source of truth, environment variable substitution |
| DD-004 | 2024-12-22 | Aurora-only starting architecture | Aurora handles all data including 7+ years retention (via table partitioning). Tiered storage (Athena/Snowflake) adds complexity not justified at small scale. S3 for document storage only, not as query layer. Add analytics tier only when Aurora partitioning insufficient. Both Aurora and Snowflake meet security/compliance requirements for tax data. |
| DD-005 | 2024-12-23 | Self-hosted Airflow orchestration | Use self-hosted Airflow on EC2 t3.medium (~$23/mo reserved) instead of MWAA ($360+/mo) or Step Functions. Full control, Python DAGs, built-in UI/monitoring. Acceptable maintenance overhead for small practice. |
| DD-006 | 2024-12-23 | Bookkeeping as separate requirements document | Bookkeeping has different cadence (monthly vs annual), different workflow, and could serve non-tax clients. Separate document allows independent prioritization and phasing. Shares infrastructure with tax system. |
| DD-007 | 2024-12-23 | Bookkeeping phased approach | Phase 1: tax-ready categorization + QuickBooks export. Phase 2: reconciliation. Phase 3: full bookkeeping. Start light, design for full. QuickBooks is system of record initially. |
| DD-008 | 2024-12-23 | V1 integration strategy: integrate, don't replace | Integrate with SmartVault (client portal), SurePrep (OCR/extraction), and UltraTax (via SurePrep CS Connect). Don't replace industry standard tools in V1. Future versions may replace SurePrep to capture per-return fees, but UltraTax lacks API (blocker). |
| DD-009 | 2024-12-23 | Dual integration pattern: Services + Skills | Services handle API calls (how to call). Skills provide AI context (how to understand). SmartVault, SurePrep, UltraTax each get skills for AI to interpret their data. UltraTax has skill but no service (no API). |
| DD-010 | 2024-12-23 | Multi-tenant SaaS: Separate databases per tenant | Deploy as SaaS with separate database per tenant firm within one Aurora cluster. Strongest isolation for tax compliance, shared tiered pricing, easy to migrate growing tenants to dedicated clusters. Requires tenant routing middleware and dynamic connection management. |
| DD-011 | 2024-12-24 | Frontend: React + Vite two-app architecture | Two separate apps (client-portal, staff-app) with shared component library (@tax-practice/ui). React 18, Vite, TypeScript, Tailwind, shadcn/ui, React Query, Zustand. Production-grade, sellable stack. HTMX considered but React chosen for richer UX and market perception. |

Completed Items

DOC-001: Pre-Implementation Documentation

Completed: 2024-12-23

  • DATABASE_SCHEMA.sql - Complete 35-table schema with 50 enums
  • DATA_MODEL.md - Logical data model with ER diagrams
  • API_SPECIFICATION.md - Full REST API contract
  • INTEGRATION_CONTRACTS.md - External service integrations
  • SECURITY_DESIGN.md - Security architecture and controls
  • PROCESS_FLOWS.md - State machines and workflows
  • USER_STORIES.md - 82 prioritized user stories (77 MVP, 5 post-MVP)

DOC-002: Documentation Reconciliation

Completed: 2024-12-23

All pre-implementation specifications validated and reconciled:

  • Enum values aligned across all documents
  • Missing endpoints added to API specification
  • ER diagram updated with all 35 entities
  • Webhook handlers aligned with integration contracts
  • Cross-references added between documents


Tax Reference System

Tax code reference system created 2024-12-27. Serves both Claude Code /tax skill and Bedrock chatbot (RAG).

Structure: docs/tax-reference/ with federal/, states/, common/ subdirectories organized by tax year.

Maintenance Model: Tax staff direct updates via conversation with Claude. Claude makes the actual file edits. No direct staff editing of markdown files.

TAX-REF-001: Senior Tax Administrator Role (RBAC)

Status: Backlog Priority: P2 Dependency: RBAC system implementation

Add RBAC role "Senior Tax Administrator" with permission to direct tax reference updates:

  • Can instruct Claude to update tax brackets, thresholds, rules
  • Can approve changes before Claude commits them
  • Audit log of all tax reference modifications
  • Year-over-year update workflow (annual review process)

TAX-REF-002: Firm-Specific Tax Skills/Overrides

Status: Backlog Priority: P2 Dependency: TAX-REF-001

Allow firms to customize tax reference with firm-specific guidance:

  • Override default rules with firm interpretations
  • Add firm-specific notes/cautions on tax positions
  • Custom worksheets and checklists per firm
  • Inheritance model: firm overrides layer on top of base tax reference
  • Storage: docs/tax-reference/overrides/{firm_id}/ or similar

TAX-REF-003: Bedrock Knowledge Base Integration

Status: Backlog Priority: P1 Dependency: BedrockService implementation

Integrate tax reference files with client-facing chatbot via hybrid approach:

Bedrock Knowledge Base (RAG):

  • Index docs/tax-reference/ into Bedrock Knowledge Base
  • Automatic chunking and embedding of markdown files
  • Semantic search for open-ended questions ("tell me about NY taxes")
  • Re-index on file updates (triggered by Claude commits)

Direct File Injection:

  • BedrockService.get_tax_context(jurisdiction, topic, year) method
  • Loads specific markdown file for deterministic context
  • More token-efficient for specific lookups ("NYC MFJ rate")
  • Routing logic maps question intent to file path

Implementation Steps:

  • Create Bedrock Knowledge Base with S3 data source pointing to docs/tax-reference/
  • Add sync trigger when tax reference files are updated
  • Implement get_tax_context() in BedrockService
  • Add intent detection to route between KB search vs direct injection
  • Include source citations in chatbot responses (file path + section)

Benefits:

  • Single source of truth (same files serve Claude Code skill and chatbot)
  • Granular file structure enables precise context injection
  • Citations provide audit trail for tax advice

TAX-REF-004: Per-Client Tax Context Metadata

Status: Backlog Priority: P1 Dependency: AI Analysis workflow, TAX-REF-003

Auto-generated per-client tax context that primes the chatbot with relevant jurisdictions, skills, and flags.

File Structure:

s3://tax-practice-documents/{client_id}/{return_year}/
├── tax_context.md           # Current state (~50 lines, always loaded)
├── tax_context_pending.md   # Chatbot-flagged updates awaiting preparer review
└── tax_context_history.md   # Full changelog/audit trail (on-demand)

tax_context.md Contents:

  • Jurisdictions (states, localities detected from documents)
  • Tax skills required (references to docs/tax-reference/ files)
  • Flags (part-year, multi-state, audit risks, special situations)
  • Document-derived values (W-2 states, 1099 sources, property locations)

Creation - Part of AI Analysis (automatic, not user-triggered):

  • Analysis prompt includes jurisdiction detection and skill mapping
  • Analysis output structure includes tax_context section
  • Workflow writes tax_context.md as required output
  • No analysis is "complete" without tax context file

Updates - Delta Detection:

  • New document uploaded → re-analyze, update if jurisdictions/flags change
  • Chatbot conversation reveals new info → flag in tax_context_pending.md
  • Preparer approves pending → applied to tax_context.md, logged to history

Chatbot Flagging:

  • Chatbot detects tax-relevant statements (new state, life events, corrections)
  • Creates entry in tax_context_pending.md with suggested changes
  • Pending items surface in preparer queue with badge count
  • Preparer approves/rejects/modifies before changes apply
  • Chatbot cannot directly modify tax_context.md (audit trail integrity)

Version History (tax_context_history.md):

  • Every change logged with: timestamp, trigger, source (doc ID or conversation ID), actor, diff
  • S3 versioning enabled as backup
  • Supports dispute resolution: "On Dec 30 you told us X, here's the conversation"

Chatbot Loading Strategy:

  • tax_context.md: Always injected as system context
  • tax_context_pending.md: Loaded when preparer is in session
  • tax_context_history.md: On-demand ("What did I say about my move date?")

Tiered Context Loading:

  • Layer 1: tax_context.md (always, ~50 lines)
  • Layer 2: Specific tax skill files from docs/tax-reference/ (on-demand based on question)
  • Layer 3: Full Bedrock KB search (if question exceeds listed skills)
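The tiered loading might be sketched as below; `load_file` and `detect_skills` are illustrative stand-ins for the S3 reads and the intent-to-skill mapping, and Layer 3 (Bedrock KB search) remains the caller's fallback:

```python
def build_chat_context(question: str, load_file, detect_skills) -> list[str]:
    """Tiered context loading for the chatbot.

    load_file and detect_skills are illustrative stand-ins, not real services.
    """
    context = [load_file("tax_context.md")]       # Layer 1: always loaded (~50 lines)
    for skill_path in detect_skills(question):    # Layer 2: only skills the question needs
        context.append(load_file(skill_path))
    return context                                # Layer 3 (KB search) handled by caller
```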

Templates: See docs/tax-reference/templates/ for file structure definitions.

Impact Analysis (2024-12-27):

Files to Modify:

  • src/workflows/analysis/preliminary_analysis_workflow.py - Add tax context generation after analysis
  • src/workflows/documents/classification_workflow.py - Add delta detection after classification
  • src/services/chat_service.py - Load tax context into chatbot, add flagging logic
  • src/workflows/review/ai_qa_workflow.py - Include tax context in Q&A prompts
  • frontend/apps/staff-app/src/pages/ReviewPage.tsx - Add pending review section
  • frontend/apps/staff-app/src/pages/ClientDetailPage.tsx - Add badge for pending updates
  • frontend/apps/staff-app/src/components/chat/ChatDrawer.tsx - Show flagged items indicator
  • src/services/audit_service.py - Log tax context events

New Files to Create (11):

  • src/services/tax_context_service.py - Core service: generate, update, manage tax context files
  • src/prompts/tax_context/generate_initial.txt - Prompt for initial tax context generation
  • src/prompts/tax_context/detect_delta.txt - Prompt for comparing new docs against existing context
  • src/prompts/tax_context/flag_statement.txt - Prompt for chatbot to detect tax-relevant statements
  • src/api/routes/tax_context.py - CRUD endpoints for tax context
  • src/api/schemas/tax_context_schemas.py - Request/response schemas
  • frontend/apps/staff-app/src/components/tax-context/PendingReviewPanel.tsx - Review/approve UI
  • frontend/apps/staff-app/src/components/tax-context/TaxContextViewer.tsx - Display current context
  • tests/unit/services/test_tax_context_service.py
  • tests/integration/test_tax_context_workflow.py
  • tests/e2e/test_tax_context_ui.py

No Changes Needed:

  • src/services/s3_service.py - Existing methods sufficient for read/write

Effort Estimate (Claude-only development with human guidance):

  • Backend: 15-20 hours of review/approval time
  • Frontend: 8-12 hours of review/approval time
  • Testing: 8-10 hours of review/approval time
  • Documentation: 2-3 hours of review/approval time
  • Total: 33-45 hours of human time (~1-1.5 weeks)
  • Note: Generic industry estimate for human developer would be 75-98 hours

Critical Path:

  1. TaxContextService creation (foundational)
  2. Prompt templates (generate_initial, detect_delta, flag_statement)
  3. Integration into preliminary_analysis_workflow (initial generation)
  4. Integration into classification_workflow (delta detection)
  5. ChatService integration (context loading + flagging)
  6. API routes creation
  7. UI components (PendingReviewPanel, TaxContextViewer)

Risk Areas:

  • Delta detection accuracy (false positives/negatives) - mitigate with prompt tuning
  • Chatbot flagging noise - add confidence thresholds, allow preparer to disable
  • S3 file conflicts on concurrent updates - use S3 versioning, queue-based updates


Notes

  • Tax requirements: client_facing_docs/tax_practice_ai_requirements.md
  • Bookkeeping requirements: bookkeeping_requirements.md
  • User stories: USER_STORIES.md (82 stories in 18 sequences)
  • Development timeline target: January 2027 filing season (per requirements Section 11)
  • Critical path items have lead times (vendor selection, account setup) - see Section 11.4 of requirements

Document Inventory

| Category | Document | Purpose |
|----------|----------|---------|
| Requirements | tax_practice_ai_requirements.md | Full business requirements |
| Requirements | bookkeeping_requirements.md | Bookkeeping module (post-MVP) |
| Requirements | MIGRATION_REQUIREMENTS.md | Data migration (pre-launch) |
| Planning | USER_STORIES.md | 82 prioritized implementation stories |
| Planning | backlog.md | Priority tracking and decisions |
| Technical | ARCHITECTURE.md | System design and patterns |
| Technical | DATABASE_SCHEMA.sql | PostgreSQL schema definition |
| Technical | DATA_MODEL.md | Logical data model |
| Technical | API_SPECIFICATION.md | REST API contract |
| Technical | INTEGRATION_CONTRACTS.md | External service integrations |
| Technical | SECURITY_DESIGN.md | Security controls |
| Technical | PROCESS_FLOWS.md | State machines and workflows |
| Operations | RUNBOOK.md | Operational procedures |