Tax Practice AI - Migration Requirements

Version: 1.0 Draft
Date: December 23, 2025
Status: Requirements Gathering


1. Overview

This document defines requirements for migrating existing practice data into the Tax Practice AI system. Migration is a prerequisite for go-live and must be completed before the first filing season.

1.1 Scope

Data Type | Source | Volume (Est.) | Priority
Client records | Excel/CSV, UltraTax export | ~1,000 clients | Must Have
Client documents | SmartVault folders, local file shares | ~10,000+ documents | Must Have
Prior year returns | SmartVault, local PDF storage | ~3,000 returns (3 years) | Must Have
Historical return data | UltraTax reports | AGI, filing status, dependents | Must Have

1.2 Migration Strategy

Approach: One-time bulk migration at launch, followed by a validation period.

┌─────────────────────────────────────────────────────────────────┐
│                    MIGRATION TIMELINE                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Phase 1: Client Data Import                                    │
│   ─────────────────────────────                                  │
│   • Import client records from CSV/Excel                         │
│   • Generate account numbers                                     │
│   • Validate and deduplicate                                     │
│                                                                  │
│   Phase 2: Document Migration                                    │
│   ───────────────────────────                                    │
│   • Bulk import from folder structure                            │
│   • Auto-classify by filename/folder                             │
│   • Link documents to clients                                    │
│                                                                  │
│   Phase 3: Historical Data                                       │
│   ─────────────────────────                                      │
│   • Import prior year AGI (Tier 1 auth)                          │
│   • Import filing history                                        │
│   • Generate client summaries                                    │
│                                                                  │
│   Phase 4: Validation                                            │
│   ───────────────────────                                        │
│   • Spot-check sample clients                                    │
│   • Verify document counts match                                 │
│   • Test client login with migrated data                         │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

2. Client Data Migration

2.1 Bulk Client Import

ID | Requirement | Priority
MIG-001 | System shall accept client data import from CSV/Excel files | Must Have
MIG-002 | System shall support column mapping for flexible source formats | Must Have
MIG-003 | System shall validate required fields before import (name, email or phone) | Must Have
MIG-004 | System shall generate unique account numbers for imported clients | Must Have
MIG-005 | System shall detect and flag potential duplicates during import | Must Have
MIG-006 | System shall provide import preview with row-level validation status | Should Have
MIG-007 | System shall log all import operations for audit | Must Have
MIG-008 | System shall support dry-run mode (validate without committing) | Should Have

Field Mapping Requirements:

Target Field | Required | Source Examples
name | Yes | "Client Name", "Full Name", "Name"
email | Yes* | "Email", "Email Address", "E-mail"
phone | Yes* | "Phone", "Phone Number", "Mobile"
address_line1 | No | "Address", "Street", "Address 1"
city | No | "City"
state | No | "State", "ST"
zip | No | "Zip", "Zip Code", "Postal Code"
ssn_last4 | No | "SSN4", "Last 4 SSN"
client_type | No | "Type", "Entity Type"

*At least one of email or phone is required.
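
The mapping and required-field rules above can be sketched as follows. This is a minimal illustration, not the system's implementation: the alias lists come from the table, while the function names `build_mapping` and `validate_row` and the case-insensitive header matching are assumptions.

```python
# Source header aliases per target field, taken from the field mapping table.
FIELD_ALIASES = {
    "name": ["Client Name", "Full Name", "Name"],
    "email": ["Email", "Email Address", "E-mail"],
    "phone": ["Phone", "Phone Number", "Mobile"],
    "address_line1": ["Address", "Street", "Address 1"],
    "city": ["City"],
    "state": ["State", "ST"],
    "zip": ["Zip", "Zip Code", "Postal Code"],
    "ssn_last4": ["SSN4", "Last 4 SSN"],
    "client_type": ["Type", "Entity Type"],
}


def build_mapping(headers):
    """Return {target_field: source_header} for headers that match an alias.

    Matching is case-insensitive (an assumption, not stated in the spec).
    """
    lookup = {header.strip().lower(): header for header in headers}
    mapping = {}
    for target, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            if alias.lower() in lookup:
                mapping[target] = lookup[alias.lower()]
                break
    return mapping


def validate_row(row, mapping):
    """Return a list of validation errors; an empty list means importable."""

    def value(field):
        header = mapping.get(field)
        return (row.get(header) or "").strip() if header else ""

    errors = []
    if not value("name"):
        errors.append("missing name")
    if not value("email") and not value("phone"):
        errors.append("missing email and phone (at least one required)")
    return errors
```

A row with a name and only a phone number passes validation; a row with neither email nor phone is rejected with a reason suitable for the error report.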

2.2 Duplicate Handling

ID | Requirement | Priority
MIG-010 | System shall identify duplicates based on configurable rules | Must Have
MIG-011 | System shall support duplicate resolution options: skip, merge, create anyway | Must Have
MIG-012 | System shall generate duplicate report for manual review | Must Have
MIG-013 | System shall preserve original source identifiers for traceability | Should Have

Duplicate Detection Rules:

Rule | Match Criteria | Confidence
Exact email match | email = email | High
SSN-4 + DOB + Name similarity | ssn_last4 + dob + fuzzy_name(0.8) | High
Phone + Name similarity | phone + fuzzy_name(0.8) | Medium
Name + Address match | exact_name + address_line1 | Medium
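
One way to evaluate these rules is sketched below. The spec does not define how `fuzzy_name(0.8)` is computed; this sketch assumes the similarity ratio from Python's standard `difflib.SequenceMatcher`, and the function `duplicate_confidence` and the case-insensitive field comparison are likewise illustrative assumptions.

```python
from difflib import SequenceMatcher


def fuzzy_name(a, b, threshold=0.8):
    """True when two names are at least `threshold` similar (assumed metric)."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return ratio >= threshold


def duplicate_confidence(new, existing):
    """Return 'high', 'medium', or None, checking rules in table order."""

    def same(field):
        x, y = new.get(field), existing.get(field)
        return bool(x) and bool(y) and x.strip().lower() == y.strip().lower()

    names_similar = fuzzy_name(new.get("name") or "", existing.get("name") or "")

    if same("email"):
        return "high"
    if same("ssn_last4") and same("dob") and names_similar:
        return "high"
    if same("phone") and names_similar:
        return "medium"
    if same("name") and same("address_line1"):
        return "medium"
    return None
```

Missing fields never count as a match, so two records that both lack an email are not flagged by the first rule.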

2.3 Import Error Handling

ID | Requirement | Priority
MIG-020 | System shall continue processing after row-level errors (log and skip) | Must Have
MIG-021 | System shall generate error report with row numbers and failure reasons | Must Have
MIG-022 | System shall support resumable imports for large datasets | Should Have
MIG-023 | System shall rollback entire import on critical errors (configurable) | Should Have
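
The log-and-skip behavior (MIG-020, MIG-021) combined with dry-run mode might look like the sketch below. The function name `import_rows`, its callback parameters, and the convention that the header counts as row 1 are assumptions made for illustration.

```python
import csv
import io


def import_rows(csv_text, validate, commit, dry_run=False):
    """Process every row, collecting row-level errors instead of aborting.

    `validate(row)` returns a list of failure reasons (empty if valid).
    `commit(row)` persists a valid row and is never called in dry-run mode.
    """
    accepted, errors = 0, []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row numbers count the header as row 1, as a spreadsheet would show them.
    for row_number, row in enumerate(reader, start=2):
        reasons = validate(row)
        if reasons:
            errors.append({"row": row_number, "reasons": reasons})
            continue  # log and skip; later rows are still processed
        if not dry_run:
            commit(row)
        accepted += 1
    return {"accepted": accepted, "errors": errors}
```

The returned `errors` list carries the row numbers and failure reasons an error report needs, and a dry run produces the same counts without writing anything.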

3. Document Migration

3.1 Bulk Document Import

ID | Requirement | Priority
MIG-030 | System shall import documents from folder structure | Must Have
MIG-031 | System shall support nested folder hierarchies | Must Have
MIG-032 | System shall match documents to clients based on folder name or metadata | Must Have
MIG-033 | System shall preserve original filenames and folder paths as metadata | Must Have
MIG-034 | System shall auto-classify documents by filename patterns | Should Have
MIG-035 | System shall scan for malware during import | Must Have
MIG-036 | System shall report unmatched documents (no client found) | Must Have

Folder Structure Patterns Supported:

Pattern 1: By Client Name
───────────────────────
/documents/
├── Smith, John/
│   ├── 2024/
│   │   ├── W-2_Employer.pdf
│   │   └── 1099-INT_Bank.pdf
│   └── 2023/
│       └── Tax Return.pdf
└── Jones, Mary/
    └── 2024/
        └── W-2.pdf

Pattern 2: By Account Number
────────────────────────────
/documents/
├── 20231201-0001/
│   └── 2024/
└── 20231201-0002/
    └── 2024/

Pattern 3: By Year First
────────────────────────
/documents/
├── 2024/
│   ├── Smith, John/
│   └── Jones, Mary/
└── 2023/
    └── Smith, John/
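
All three layouts can be handled by one parser that looks at the first two folder levels and treats whichever one is a four-digit year as the tax year. The sketch below is illustrative only: the function name is hypothetical, and the account-number format (eight digits, a hyphen, four digits) is inferred from the `20231201-0001` examples rather than stated anywhere in this document.

```python
import re
from pathlib import PurePosixPath

YEAR = re.compile(r"(19|20)\d{2}")
ACCOUNT = re.compile(r"\d{8}-\d{4}")  # assumed from the 20231201-0001 example


def parse_document_path(path, root="/documents"):
    """Split a document path into client key, key type, tax year, and filename.

    Returns None when the path does not fit any of the three patterns.
    """
    parts = PurePosixPath(path).relative_to(root).parts
    if len(parts) < 3:
        return None  # need at least two folder levels plus a filename
    first, second, filename = parts[0], parts[1], parts[-1]
    if YEAR.fullmatch(first):  # Pattern 3: year first
        tax_year, client_key = int(first), second
    elif YEAR.fullmatch(second):  # Patterns 1 and 2: client first
        tax_year, client_key = int(second), first
    else:
        return None
    key_type = "account_number" if ACCOUNT.fullmatch(client_key) else "client_name"
    return {
        "client_key": client_key,
        "key_type": key_type,
        "tax_year": tax_year,
        "filename": filename,
    }
```

Paths that return None are candidates for the unmatched-documents report (MIG-036).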

3.2 Document Classification

ID | Requirement | Priority
MIG-040 | System shall classify documents by filename keywords (W-2, 1099, etc.) | Must Have
MIG-041 | System shall use AI classification for ambiguous documents | Should Have
MIG-042 | System shall assign tax year based on folder structure or filename | Must Have
MIG-043 | System shall flag documents that cannot be classified for manual review | Must Have

Filename Classification Rules:

Pattern | Document Type | Tax Form
*W-2*, *W2* | Wage statement | W-2
*1099-INT*, *1099INT* | Interest income | 1099-INT
*1099-DIV*, *1099DIV* | Dividend income | 1099-DIV
*1099-NEC*, *1099NEC* | Non-employee compensation | 1099-NEC
*1099-MISC* | Miscellaneous income | 1099-MISC
*1098* | Mortgage interest | 1098
*K-1*, *K1* | Partnership/S-Corp income | K-1
*Tax Return*, *1040* | Prior year return | 1040
*Driver*License*, *ID* | Government ID | ID Document
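
These glob-style rules map directly onto Python's standard `fnmatch` module. The sketch below applies them in table order with the first match winning. The ordering, the case-sensitive matching, and the function name `classify_filename` are assumptions, since the table does not say how overlapping patterns are resolved.

```python
from fnmatch import fnmatchcase

# (patterns, document type, tax form), in the same order as the rules table.
RULES = [
    (("*W-2*", "*W2*"), "Wage statement", "W-2"),
    (("*1099-INT*", "*1099INT*"), "Interest income", "1099-INT"),
    (("*1099-DIV*", "*1099DIV*"), "Dividend income", "1099-DIV"),
    (("*1099-NEC*", "*1099NEC*"), "Non-employee compensation", "1099-NEC"),
    (("*1099-MISC*",), "Miscellaneous income", "1099-MISC"),
    (("*1098*",), "Mortgage interest", "1098"),
    (("*K-1*", "*K1*"), "Partnership/S-Corp income", "K-1"),
    (("*Tax Return*", "*1040*"), "Prior year return", "1040"),
    (("*Driver*License*", "*ID*"), "Government ID", "ID Document"),
]


def classify_filename(filename):
    """Return (document_type, tax_form), or None to flag for manual review."""
    for patterns, document_type, tax_form in RULES:
        if any(fnmatchcase(filename, pattern) for pattern in patterns):
            return document_type, tax_form
    return None
```

Because `*ID*` is very broad, it is checked last here and matched case-sensitively; a case-insensitive match would also catch names such as "Davidson".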

3.3 Client Matching

ID | Requirement | Priority
MIG-050 | System shall match documents to clients by folder name | Must Have
MIG-051 | System shall support configurable matching (name, account number, custom ID) | Must Have
MIG-052 | System shall use fuzzy name matching with configurable threshold | Should Have
MIG-053 | System shall create match report showing client → document counts | Must Have
MIG-054 | System shall quarantine unmatched documents for manual assignment | Must Have
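
Folder-to-client matching with a configurable threshold (MIG-052, MIG-054) could be sketched as below. The similarity measure (`difflib.SequenceMatcher`), the "Last, First" normalization, and both function names are assumptions. The default of 0.85 mirrors the `--match-threshold` value in the CLI examples.

```python
from difflib import SequenceMatcher


def normalize_name(name):
    """Lower-case a name and turn 'Last, First' into 'first last'."""
    name = name.strip().lower()
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first.strip()} {last.strip()}"
    return " ".join(name.split())


def match_client(folder_name, client_names, threshold=0.85):
    """Return the best-matching client name, or None to quarantine the folder."""
    target = normalize_name(folder_name)
    best_name, best_score = None, 0.0
    for candidate in client_names:
        score = SequenceMatcher(None, target, normalize_name(candidate)).ratio()
        if score > best_score:
            best_name, best_score = candidate, score
    return best_name if best_score >= threshold else None
```

A folder whose best score falls below the threshold returns None, which corresponds to quarantining its documents for manual assignment.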

4. Historical Return Data

4.1 Prior Year Data Import

ID | Requirement | Priority
MIG-060 | System shall import prior year AGI for Tier 1 identity verification | Must Have
MIG-061 | System shall import filing status (MFJ, Single, HoH, etc.) | Should Have
MIG-062 | System shall import refund/balance due amounts | Should Have
MIG-063 | System shall import dependent information | Should Have
MIG-064 | System shall support import from UltraTax CSV export | Must Have
MIG-065 | System shall support import from generic CSV format | Must Have

UltraTax Export Fields:

UltraTax Field | Target Field | Purpose
Client ID | external_id (for matching) | Match to imported client
SSN (last 4) | ssn_last4 | Client matching backup
Tax Year | tax_year | Year identification
Filing Status | filing_status | Historical record
AGI | prior_year_agi | Tier 1 verification
Refund/Balance Due | prior_year_result | Historical record
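
A row-level translation of this mapping is sketched below. The column headers are taken from the table and may not match a real UltraTax export exactly. The function name, the currency parsing, and the sign convention for `prior_year_result` (the value is stored as it appears in the source) are assumptions.

```python
from decimal import Decimal

# Source column -> target field, from the UltraTax export fields table.
ULTRATAX_FIELDS = {
    "Client ID": "external_id",
    "SSN (last 4)": "ssn_last4",
    "Tax Year": "tax_year",
    "Filing Status": "filing_status",
    "AGI": "prior_year_agi",
    "Refund/Balance Due": "prior_year_result",
}


def map_ultratax_row(row):
    """Convert one exported row (a dict of strings) into a target record."""
    record = {
        target: (row.get(source) or "").strip()
        for source, target in ULTRATAX_FIELDS.items()
    }
    record["tax_year"] = int(record["tax_year"])
    for money_field in ("prior_year_agi", "prior_year_result"):
        # Strip thousands separators and currency symbols before parsing.
        text = record[money_field].replace(",", "").replace("$", "")
        record[money_field] = Decimal(text) if text else None
    return record
```

`Decimal` is used instead of `float` so that AGI values compare exactly during Tier 1 verification.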

4.2 Return History

ID | Requirement | Priority
MIG-070 | System shall create return history records for imported prior years | Should Have
MIG-071 | System shall link prior year returns to imported PDF documents | Should Have
MIG-072 | System shall extract key data from prior year PDFs if structured data unavailable | Nice to Have

5. Migration Tools

5.1 Command Line Interface

ID | Requirement | Priority
MIG-080 | System shall provide CLI tool for client import: tax-migrate clients <file> | Must Have
MIG-081 | System shall provide CLI tool for document import: tax-migrate documents <folder> | Must Have
MIG-082 | System shall provide CLI tool for historical data: tax-migrate history <file> | Must Have
MIG-083 | System shall support --dry-run flag for all migration commands | Should Have
MIG-084 | System shall support --verbose flag for detailed progress output | Should Have
MIG-085 | System shall generate summary report at completion | Must Have

CLI Examples:

# Import clients from CSV
tax-migrate clients clients.csv \
  --mapping name="Full Name",email="Email",phone="Phone" \
  --duplicate-action=skip \
  --dry-run

# Import documents from folder
tax-migrate documents /path/to/smartvault/export \
  --pattern="client-name-first" \
  --match-threshold=0.85 \
  --classify-by-filename

# Import historical AGI data
tax-migrate history ultratax-export.csv \
  --format=ultratax \
  --years=2023,2022,2021
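
For illustration, the command shapes above could be parsed with Python's standard `argparse` as sketched here. This is not the tool's actual implementation. The subcommands and flags come from the requirements and examples, while the `create` value for the "create anyway" option, the helper names, and the comma-separated parsing of `--mapping` and `--years` are assumptions.

```python
import argparse


def parse_mapping(text):
    """Parse 'name=Full Name,email=Email' into {'name': 'Full Name', ...}."""
    mapping = {}
    for pair in text.split(","):
        target, _, source = pair.partition("=")
        mapping[target.strip()] = source.strip()
    return mapping


def parse_years(text):
    """Parse '2023,2022,2021' into [2023, 2022, 2021]."""
    return [int(year) for year in text.split(",")]


def build_parser():
    parser = argparse.ArgumentParser(prog="tax-migrate")
    # Flags shared by all migration commands (MIG-083, MIG-084).
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--dry-run", action="store_true")
    common.add_argument("--verbose", action="store_true")
    sub = parser.add_subparsers(dest="command", required=True)

    clients = sub.add_parser("clients", parents=[common])
    clients.add_argument("file")
    clients.add_argument("--mapping", type=parse_mapping, default={})
    # "create" is a hypothetical spelling of the "create anyway" option.
    clients.add_argument(
        "--duplicate-action", choices=["skip", "merge", "create"], default="skip"
    )

    documents = sub.add_parser("documents", parents=[common])
    documents.add_argument("folder")
    documents.add_argument("--pattern")
    documents.add_argument("--match-threshold", type=float)
    documents.add_argument("--classify-by-filename", action="store_true")

    history = sub.add_parser("history", parents=[common])
    history.add_argument("file")
    history.add_argument("--format")
    history.add_argument("--years", type=parse_years)
    return parser
```

After the shell removes the quotes, the `--mapping` value in the first example reaches the program as the single string `name=Full Name,email=Email,phone=Phone`, which is why splitting on commas is sufficient here.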

5.2 Admin UI

ID | Requirement | Priority
MIG-090 | System shall provide web UI for small-batch imports | Should Have
MIG-091 | System shall show import progress with real-time updates | Should Have
MIG-092 | System shall allow manual duplicate resolution via UI | Should Have
MIG-093 | System shall allow manual document-to-client assignment via UI | Should Have

6. Validation and Reporting

6.1 Pre-Migration Validation

ID | Requirement | Priority
MIG-100 | System shall validate source files before import begins | Must Have
MIG-101 | System shall report expected record counts | Must Have
MIG-102 | System shall identify missing required fields | Must Have
MIG-103 | System shall estimate storage requirements for documents | Should Have

6.2 Post-Migration Validation

ID | Requirement | Priority
MIG-110 | System shall generate migration summary report | Must Have
MIG-111 | System shall compare source counts to imported counts | Must Have
MIG-112 | System shall list all errors and warnings | Must Have
MIG-113 | System shall generate sample client report for spot-checking | Should Have
MIG-114 | System shall verify document counts per client match source | Should Have

Migration Summary Report:

Migration Summary Report
========================
Date: 2025-01-15 14:30:00
Duration: 2h 15m

CLIENTS
  Source records:     1,042
  Imported:           1,038
  Duplicates skipped:     4
  Errors:                 0

DOCUMENTS
  Source files:      12,456
  Imported:          12,389
  Matched to client: 12,102
  Unmatched:            287
  Classification:
    - W-2:           2,340
    - 1099-*:        3,120
    - Prior Returns: 2,890
    - Other:         4,039

HISTORICAL DATA
  Prior year records:  2,950
  AGI imported:        2,950

WARNINGS
  - 287 documents unmatched (see unmatched_documents.csv)
  - 4 duplicate clients skipped (see duplicates.csv)

NEXT STEPS
  1. Review unmatched_documents.csv and assign manually
  2. Spot-check 5% of clients for data accuracy
  3. Test client login with sample accounts
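
The source-versus-imported comparison (MIG-111) amounts to simple arithmetic on the report's counts. The sketch below is a hypothetical check with assumed names, run against the sample figures above. The client counts reconcile (1,038 + 4 + 0 = 1,042), while 12,456 - 12,389 = 67 source documents are not attributed to any category in the sample, which is the kind of gap such a check would surface.

```python
def reconcile(section):
    """Return human-readable discrepancies for one report section."""
    issues = []
    accounted = (
        section["imported"] + section.get("skipped", 0) + section.get("errors", 0)
    )
    gap = section["source"] - accounted
    if gap != 0:
        issues.append(f"{gap} source records unaccounted for")
    breakdown = section.get("breakdown")
    if breakdown is not None and sum(breakdown.values()) != section["imported"]:
        issues.append("breakdown does not sum to imported count")
    return issues


# Figures copied from the sample Migration Summary Report.
clients = {"source": 1042, "imported": 1038, "skipped": 4, "errors": 0}
documents = {
    "source": 12456,
    "imported": 12389,
    "breakdown": {"W-2": 2340, "1099-*": 3120, "Prior Returns": 2890, "Other": 4039},
}

print(reconcile(clients))    # []
print(reconcile(documents))  # ['67 source records unaccounted for']
```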

7. Security and Compliance

7.1 Data Protection During Migration

ID | Requirement | Priority
MIG-120 | System shall encrypt all data in transit during migration | Must Have
MIG-121 | System shall not store full SSNs during migration; only the last 4 digits are stored, if provided | Must Have
MIG-122 | System shall log all migration operations for audit | Must Have
MIG-123 | System shall restrict migration tools to authorized admin users | Must Have
MIG-124 | System shall securely delete temporary files after migration | Must Have

7.2 Audit Trail

ID | Requirement | Priority
MIG-130 | System shall log migration start/end with operator ID | Must Have
MIG-131 | System shall log source file hashes for traceability | Should Have
MIG-132 | System shall preserve audit trail of imported vs. original records | Must Have
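
Source file hashing (MIG-131) can be done with the standard library. The sketch below assumes SHA-256 and chunked reads so that large exports need not fit in memory; the spec does not name a hash algorithm, and the function name is illustrative.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recording this digest alongside the migration batch lets an auditor later confirm that a retained source file is the one that was imported.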

8. Rollback and Recovery

ID | Requirement | Priority
MIG-140 | System shall support full rollback of a migration batch | Should Have
MIG-141 | System shall tag all imported records with migration batch ID | Must Have
MIG-142 | System shall provide rollback CLI: tax-migrate rollback <batch-id> | Should Have
MIG-143 | System shall backup database before large migrations | Must Have
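
Tagging every imported record with a batch ID (MIG-141) is what makes a batch rollback (MIG-140) a simple delete. The sketch below uses an in-memory SQLite database purely as a stand-in for the production database. The table names and the `migration_batch_id` column are hypothetical.

```python
import sqlite3


def rollback_batch(conn, batch_id, tables=("documents", "clients")):
    """Delete all rows tagged with batch_id in a single transaction.

    Tables are processed in the given order so dependent rows (documents)
    are removed before the rows they reference (clients).
    """
    deleted = {}
    with conn:  # commits on success, rolls back if any statement fails
        for table in tables:
            # Table names come from the trusted tuple above, never from user input.
            cursor = conn.execute(
                f"DELETE FROM {table} WHERE migration_batch_id = ?", (batch_id,)
            )
            deleted[table] = cursor.rowcount
    return deleted
```

Records from other batches, and records created after go-live without a batch ID, are left untouched.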

9. Timeline and Dependencies

9.1 Prerequisites

Prerequisite | Dependency | Status
Aurora database deployed | S1-001 | Complete
S3 document storage configured | S1-002 | Complete
Client table schema finalized | S1-001 | Complete
AuditService implemented | S1-007 | Complete

9.2 Timeline

  1. Week 1: Client data import and validation
  2. Week 2: Document migration (can run in parallel batches)
  3. Week 3: Historical data import
  4. Week 4: Validation, spot-checks, error resolution

9.3 Go-Live Checklist

  • All clients imported and validated
  • All documents imported and linked
  • Prior year AGI imported for Tier 1 auth
  • Sample client logins tested
  • Unmatched documents resolved or documented
  • Migration audit log archived
  • Source data backup retained

Document History

Version | Date | Author | Changes
1.0 | 2024-12-23 | Don McCarty | Initial requirements