Tax Practice AI - Migration Requirements
Version: 1.0 Draft
Date: December 23, 2024
Status: Requirements Gathering
1. Overview
This document defines requirements for migrating existing practice data into the Tax Practice AI system. Migration is a prerequisite for go-live and must be completed before the first filing season.
1.1 Scope
| Data Type | Source | Volume (Est.) | Priority |
| Client records | Excel/CSV, UltraTax export | ~1,000 clients | Must Have |
| Client documents | SmartVault folders, local file shares | ~10,000+ documents | Must Have |
| Prior year returns | SmartVault, local PDF storage | ~3,000 returns (3 years) | Must Have |
| Historical return data | UltraTax reports | AGI, filing status, dependents | Must Have |
1.2 Migration Strategy
Approach: One-time bulk migration at launch, followed by a validation period.
┌──────────────────────────────────────────────────┐
│  MIGRATION TIMELINE                              │
│                                                  │
│  Phase 1: Client Data Import                     │
│  ───────────────────────────                     │
│  • Import client records from CSV/Excel          │
│  • Generate account numbers                      │
│  • Validate and deduplicate                      │
│                                                  │
│  Phase 2: Document Migration                     │
│  ───────────────────────────                     │
│  • Bulk import from folder structure             │
│  • Auto-classify by filename/folder              │
│  • Link documents to clients                     │
│                                                  │
│  Phase 3: Historical Data                        │
│  ────────────────────────                        │
│  • Import prior year AGI (Tier 1 auth)           │
│  • Import filing history                         │
│  • Generate client summaries                     │
│                                                  │
│  Phase 4: Validation                             │
│  ───────────────────                             │
│  • Spot-check sample clients                     │
│  • Verify document counts match                  │
│  • Test client login with migrated data          │
│                                                  │
└──────────────────────────────────────────────────┘
2. Client Data Migration
2.1 Bulk Client Import
| ID | Requirement | Priority |
| MIG-001 | System shall accept client data import from CSV/Excel files | Must Have |
| MIG-002 | System shall support column mapping for flexible source formats | Must Have |
| MIG-003 | System shall validate required fields before import (name, plus email or phone) | Must Have |
| MIG-004 | System shall generate unique account numbers for imported clients | Must Have |
| MIG-005 | System shall detect and flag potential duplicates during import | Must Have |
| MIG-006 | System shall provide import preview with row-level validation status | Should Have |
| MIG-007 | System shall log all import operations for audit | Must Have |
| MIG-008 | System shall support dry-run mode (validate without committing) | Should Have |
Field Mapping Requirements:
| Target Field | Required | Source Examples |
| name | Yes | "Client Name", "Full Name", "Name" |
| email | Yes* | "Email", "Email Address", "E-mail" |
| phone | Yes* | "Phone", "Phone Number", "Mobile" |
| address_line1 | No | "Address", "Street", "Address 1" |
| city | No | "City" |
| state | No | "State", "ST" |
| zip | No | "Zip", "Zip Code", "Postal Code" |
| ssn_last4 | No | "SSN4", "Last 4 SSN" |
| client_type | No | "Type", "Entity Type" |
*At least one of email or phone is required.
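The column mapping and required-field rule above can be sketched as follows. This is a minimal illustration, not the system's implementation: `FIELD_ALIASES`, `build_mapping`, and `validate_row` are hypothetical names, and the alias table covers only a subset of the "Source Examples" column.

```python
import csv
import io

# Hypothetical alias table built from the "Source Examples" column above.
FIELD_ALIASES = {
    "name": ["Client Name", "Full Name", "Name"],
    "email": ["Email", "Email Address", "E-mail"],
    "phone": ["Phone", "Phone Number", "Mobile"],
    "zip": ["Zip", "Zip Code", "Postal Code"],
}


def build_mapping(headers):
    """Map each target field to the first source header matching an alias."""
    lookup = {h.strip().lower(): h for h in headers}
    mapping = {}
    for target, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            if alias.lower() in lookup:
                mapping[target] = lookup[alias.lower()]
                break
    return mapping


def validate_row(row, mapping):
    """Return a list of validation errors for one source row (empty if valid)."""
    def value(field):
        return (row.get(mapping.get(field, ""), "") or "").strip()

    errors = []
    if not value("name"):
        errors.append("missing name")
    if not value("email") and not value("phone"):
        errors.append("missing email and phone (one is required)")
    return errors


# Illustrative data only; row numbers start at 2 because row 1 is the header.
sample = io.StringIO(
    "Full Name,Email Address,Mobile\n"
    "John Smith,john@example.com,\n"
    ",,555-0100\n"
)
reader = csv.DictReader(sample)
mapping = build_mapping(reader.fieldnames)
results = [(line, validate_row(row, mapping)) for line, row in enumerate(reader, start=2)]
```

Returning a list of errors per row, rather than raising on the first problem, lines up with the row-level validation status that MIG-006 asks for.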
2.2 Duplicate Handling
| ID | Requirement | Priority |
| MIG-010 | System shall identify duplicates based on configurable rules | Must Have |
| MIG-011 | System shall support duplicate resolution options: skip, merge, create anyway | Must Have |
| MIG-012 | System shall generate duplicate report for manual review | Must Have |
| MIG-013 | System shall preserve original source identifiers for traceability | Should Have |
Duplicate Detection Rules:
| Rule | Match Criteria | Confidence |
| Exact email match | email = email | High |
| SSN-4 + DOB + Name similarity | ssn_last4 + dob + fuzzy_name(0.8) | High |
| Phone + Name similarity | phone + fuzzy_name(0.8) | Medium |
| Name + Address match | exact_name + address_line1 | Medium |
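One possible reading of these rules is sketched below. It assumes that `fuzzy_name(0.8)` means a similarity ratio of at least 0.8 and uses Python's `difflib.SequenceMatcher` as a stand-in similarity measure; the document does not specify the algorithm. The helper names `same` and `duplicate_confidence` are hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_name(a, b, threshold=0.8):
    """True when two names are at least `threshold` similar (difflib ratio)."""
    ratio = SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()
    return ratio >= threshold


def same(a, b, field):
    """True when both records carry a non-empty, equal value for `field`."""
    return bool(a.get(field)) and a.get(field) == b.get(field)


def duplicate_confidence(new, existing):
    """Return 'High', 'Medium', or None, checking the rules in table order."""
    if same(new, existing, "email"):
        return "High"
    names_similar = fuzzy_name(new.get("name", ""), existing.get("name", ""))
    if same(new, existing, "ssn_last4") and same(new, existing, "dob") and names_similar:
        return "High"
    if same(new, existing, "phone") and names_similar:
        return "Medium"
    if same(new, existing, "name") and same(new, existing, "address_line1"):
        return "Medium"
    return None
```

Empty values never count as a match here, so two records that both lack an email are not flagged by the first rule.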
2.3 Import Error Handling
| ID | Requirement | Priority |
| MIG-020 | System shall continue processing after row-level errors (log and skip) | Must Have |
| MIG-021 | System shall generate error report with row numbers and failure reasons | Must Have |
| MIG-022 | System shall support resumable imports for large datasets | Should Have |
| MIG-023 | System shall roll back the entire import on critical errors (configurable) | Should Have |
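The log-and-skip behavior in MIG-020 and MIG-021, combined with the dry-run mode from MIG-008, could look roughly like this. It is a sketch under simplifying assumptions: `import_rows` and the `commit` callback are hypothetical, and the only validation shown is a required `name` field.

```python
def import_rows(rows, commit, dry_run=False):
    """Process every row; record failures instead of aborting the run."""
    report = {"processed": 0, "committed": 0, "errors": []}
    for row_number, row in enumerate(rows, start=1):
        report["processed"] += 1
        try:
            if not (row.get("name") or "").strip():
                raise ValueError("missing required field: name")
            if not dry_run:
                commit(row)
                report["committed"] += 1
        except ValueError as exc:
            # Row-level error: log and skip (MIG-020), keeping the row
            # number and failure reason for the error report (MIG-021).
            report["errors"].append({"row": row_number, "reason": str(exc)})
    return report
```

In dry-run mode the same validation runs and the same error list is produced, but `commit` is never called.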
3. Document Migration
3.1 Bulk Document Import
| ID | Requirement | Priority |
| MIG-030 | System shall import documents from folder structure | Must Have |
| MIG-031 | System shall support nested folder hierarchies | Must Have |
| MIG-032 | System shall match documents to clients based on folder name or metadata | Must Have |
| MIG-033 | System shall preserve original filenames and folder paths as metadata | Must Have |
| MIG-034 | System shall auto-classify documents by filename patterns | Should Have |
| MIG-035 | System shall scan for malware during import | Must Have |
| MIG-036 | System shall report unmatched documents (no client found) | Must Have |
Folder Structure Patterns Supported:
Pattern 1: By Client Name
─────────────────────────
/documents/
├── Smith, John/
│   ├── 2024/
│   │   ├── W-2_Employer.pdf
│   │   └── 1099-INT_Bank.pdf
│   └── 2023/
│       └── Tax Return.pdf
└── Jones, Mary/
    └── 2024/
        └── W-2.pdf

Pattern 2: By Account Number
────────────────────────────
/documents/
├── 20231201-0001/
│   └── 2024/
└── 20231201-0002/
    └── 2024/

Pattern 3: By Year First
────────────────────────
/documents/
├── 2024/
│   ├── Smith, John/
│   └── Jones, Mary/
└── 2023/
    └── Smith, John/
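A path parser covering the three patterns might look like the sketch below. The function name `parse_document_path` is hypothetical, and the account-number format (eight digits, a hyphen, four digits) is an assumption inferred from the Pattern 2 example, not a stated rule.

```python
import re
from pathlib import PurePosixPath

YEAR = re.compile(r"^(19|20)\d{2}$")
# Assumed account-number shape, inferred from "20231201-0001" in Pattern 2.
ACCOUNT = re.compile(r"^\d{8}-\d{4}$")


def parse_document_path(path, root="/documents"):
    """Split a document path into client key, key type, tax year, filename."""
    parts = PurePosixPath(path).relative_to(root).parts
    if len(parts) < 3:
        return None  # too shallow to carry both a client folder and a year
    first, second, filename = parts[0], parts[1], parts[-1]
    if YEAR.match(first):        # Pattern 3: year first
        client, year = second, first
    elif YEAR.match(second):     # Patterns 1 and 2: client folder first
        client, year = first, second
    else:
        return None              # unknown layout: leave for manual review
    key_type = "account_number" if ACCOUNT.match(client) else "name"
    return {
        "client": client,
        "key_type": key_type,
        "tax_year": int(year),
        "filename": filename,
    }
```

Paths that fit none of the patterns return `None`, so they can be reported as unmatched (MIG-036) rather than guessed at.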
3.2 Document Classification
| ID | Requirement | Priority |
| MIG-040 | System shall classify documents by filename keywords (W-2, 1099, etc.) | Must Have |
| MIG-041 | System shall use AI classification for ambiguous documents | Should Have |
| MIG-042 | System shall assign tax year based on folder structure or filename | Must Have |
| MIG-043 | System shall flag documents that cannot be classified for manual review | Must Have |
Filename Classification Rules:
| Pattern | Document Type | Tax Form |
| *W-2*, *W2* | Wage statement | W-2 |
| *1099-INT*, *1099INT* | Interest income | 1099-INT |
| *1099-DIV*, *1099DIV* | Dividend income | 1099-DIV |
| *1099-NEC*, *1099NEC* | Non-employee compensation | 1099-NEC |
| *1099-MISC* | Miscellaneous income | 1099-MISC |
| *1098* | Mortgage interest | 1098 |
| *K-1*, *K1* | Partnership/S-Corp income | K-1 |
| *Tax Return*, *1040* | Prior year return | 1040 |
| *Driver*License*, *ID* | Government ID | ID Document |
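These rules map naturally onto shell-style wildcard matching. The sketch below applies them in table order with the first match winning, using Python's `fnmatch.fnmatchcase`. Case-insensitive matching is an assumption (the table does not say), and `classify` is a hypothetical name.

```python
from fnmatch import fnmatchcase

# (patterns, tax form) pairs in the same order as the table above.
RULES = [
    (("*W-2*", "*W2*"), "W-2"),
    (("*1099-INT*", "*1099INT*"), "1099-INT"),
    (("*1099-DIV*", "*1099DIV*"), "1099-DIV"),
    (("*1099-NEC*", "*1099NEC*"), "1099-NEC"),
    (("*1099-MISC*",), "1099-MISC"),
    (("*1098*",), "1098"),
    (("*K-1*", "*K1*"), "K-1"),
    (("*Tax Return*", "*1040*"), "1040"),
    (("*Driver*License*", "*ID*"), "ID Document"),
]


def classify(filename):
    """Return the tax form for the first matching rule, or None if unclassified."""
    name = filename.upper()  # assumption: matching ignores case
    for patterns, tax_form in RULES:
        if any(fnmatchcase(name, pattern.upper()) for pattern in patterns):
            return tax_form
    return None  # flag for manual review (MIG-043)
```

Under case-insensitive matching, `*ID*` is very broad: a file named `Dividend_summary.pdf` would be classified as an ID document. Keeping that rule last, as the table does, limits the damage but does not remove it, so a stricter ID pattern may be worth considering.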
3.3 Client Matching
| ID | Requirement | Priority |
| MIG-050 | System shall match documents to clients by folder name | Must Have |
| MIG-051 | System shall support configurable matching (name, account number, custom ID) | Must Have |
| MIG-052 | System shall use fuzzy name matching with configurable threshold | Should Have |
| MIG-053 | System shall create match report showing client → document counts | Must Have |
| MIG-054 | System shall quarantine unmatched documents for manual assignment | Must Have |
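Fuzzy folder-to-client matching with a configurable threshold (MIG-052) could be sketched as follows. The similarity measure (`difflib.SequenceMatcher`), the "Last, First" normalization, and the names `normalize` and `match_folder` are all illustrative assumptions. The default of 0.85 mirrors the `--match-threshold` value in the CLI example.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, collapse spaces, and turn 'Last, First' into 'first last'."""
    name = name.strip().lower()
    if "," in name:
        last, first = (part.strip() for part in name.split(",", 1))
        name = f"{first} {last}"
    return " ".join(name.split())


def match_folder(folder_name, clients, threshold=0.85):
    """Return the best-matching client id, or None to quarantine the folder."""
    target = normalize(folder_name)
    best_id, best_score = None, 0.0
    for client_id, client_name in clients.items():
        score = SequenceMatcher(None, target, normalize(client_name)).ratio()
        if score > best_score:
            best_id, best_score = client_id, score
    return best_id if best_score >= threshold else None
```

A `None` result corresponds to the quarantine path in MIG-054: the folder's documents are held for manual assignment instead of being attached to the nearest name.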
4. Historical Return Data
4.1 Prior Year Data Import
| ID | Requirement | Priority |
| MIG-060 | System shall import prior year AGI for Tier 1 identity verification | Must Have |
| MIG-061 | System shall import filing status (MFJ, Single, HoH, etc.) | Should Have |
| MIG-062 | System shall import refund/balance due amounts | Should Have |
| MIG-063 | System shall import dependent information | Should Have |
| MIG-064 | System shall support import from UltraTax CSV export | Must Have |
| MIG-065 | System shall support import from generic CSV format | Must Have |
UltraTax Export Fields:
| UltraTax Field | Target Field | Purpose |
| Client ID | external_id (for matching) | Match to imported client |
| SSN (last 4) | ssn_last4 | Client matching backup |
| Tax Year | tax_year | Year identification |
| Filing Status | filing_status | Historical record |
| AGI | prior_year_agi | Tier 1 verification |
| Refund/Balance Due | prior_year_result | Historical record |
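The field mapping above can be expressed as a small transformation over CSV rows. In this sketch the column headers are taken verbatim from the table; whether a real UltraTax export uses exactly these headers is an assumption, and `to_history_record` is a hypothetical name.

```python
import csv
import io
from decimal import Decimal

# Source header -> target field, copied from the table above.
ULTRATAX_FIELDS = {
    "Client ID": "external_id",
    "SSN (last 4)": "ssn_last4",
    "Tax Year": "tax_year",
    "Filing Status": "filing_status",
    "AGI": "prior_year_agi",
    "Refund/Balance Due": "prior_year_result",
}


def to_history_record(row):
    """Rename source columns to target fields and parse year and AGI."""
    record = {
        target: (row.get(source) or "").strip()
        for source, target in ULTRATAX_FIELDS.items()
    }
    record["tax_year"] = int(record["tax_year"])
    # Decimal avoids float rounding on currency; strip thousands separators.
    agi = record["prior_year_agi"].replace(",", "")
    record["prior_year_agi"] = Decimal(agi) if agi else None
    return record


# Illustrative row only, not real client data.
export = io.StringIO(
    "Client ID,SSN (last 4),Tax Year,Filing Status,AGI,Refund/Balance Due\n"
    'UT-1001,1234,2023,MFJ,"85,250.00",1200\n'
)
records = [to_history_record(row) for row in csv.DictReader(export)]
```

Because prior-year AGI feeds Tier 1 verification, keeping it as an exact decimal rather than a float seems the safer choice.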
4.2 Return History
| ID | Requirement | Priority |
| MIG-070 | System shall create return history records for imported prior years | Should Have |
| MIG-071 | System shall link prior year returns to imported PDF documents | Should Have |
| MIG-072 | System shall extract key data from prior year PDFs if structured data unavailable | Nice to Have |
5.1 Command Line Interface
| ID | Requirement | Priority |
| MIG-080 | System shall provide CLI tool for client import: tax-migrate clients <file> | Must Have |
| MIG-081 | System shall provide CLI tool for document import: tax-migrate documents <folder> | Must Have |
| MIG-082 | System shall provide CLI tool for historical data: tax-migrate history <file> | Must Have |
| MIG-083 | System shall support --dry-run flag for all migration commands | Should Have |
| MIG-084 | System shall support --verbose flag for detailed progress output | Should Have |
| MIG-085 | System shall generate summary report at completion | Must Have |
CLI Examples:

# Import clients from CSV
tax-migrate clients clients.csv \
    --mapping name="Full Name",email="Email",phone="Phone" \
    --duplicate-action=skip \
    --dry-run

# Import documents from folder
tax-migrate documents /path/to/smartvault/export \
    --pattern="client-name-first" \
    --match-threshold=0.85 \
    --classify-by-filename

# Import historical AGI data
tax-migrate history ultratax-export.csv \
    --format=ultratax \
    --years=2023,2022,2021
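The command shape in these examples could be modeled with Python's `argparse` subcommands, as in the sketch below. It is one possible parser layout, not the tool's actual implementation. The `create` choice for "create anyway" and the `generic` format value are assumed spellings, and `parse_mapping` is a hypothetical helper.

```python
import argparse


def parse_mapping(text):
    """Turn 'name=Full Name,email=Email' into {'name': 'Full Name', ...}."""
    return dict(pair.split("=", 1) for pair in text.split(","))


def build_parser():
    parser = argparse.ArgumentParser(prog="tax-migrate")
    # Flags shared by every migration command (MIG-083, MIG-084).
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--dry-run", action="store_true")
    common.add_argument("--verbose", action="store_true")
    sub = parser.add_subparsers(dest="command", required=True)

    clients = sub.add_parser("clients", parents=[common])
    clients.add_argument("file")
    clients.add_argument("--mapping", type=parse_mapping, default={})
    # "create" is an assumed spelling for the "create anyway" option.
    clients.add_argument("--duplicate-action",
                         choices=["skip", "merge", "create"], default="skip")

    documents = sub.add_parser("documents", parents=[common])
    documents.add_argument("folder")
    documents.add_argument("--pattern", default="client-name-first")
    documents.add_argument("--match-threshold", type=float, default=0.85)
    documents.add_argument("--classify-by-filename", action="store_true")

    history = sub.add_parser("history", parents=[common])
    history.add_argument("file")
    history.add_argument("--format", choices=["ultratax", "generic"],
                         default="generic")
    history.add_argument("--years",
                         type=lambda s: [int(y) for y in s.split(",")])
    return parser
```

Sharing `--dry-run` and `--verbose` through a parent parser keeps the flags identical across all commands. The simple comma split in `parse_mapping` would break on source headers that themselves contain commas.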
5.2 Admin UI
| ID | Requirement | Priority |
| MIG-090 | System shall provide web UI for small-batch imports | Should Have |
| MIG-091 | System shall show import progress with real-time updates | Should Have |
| MIG-092 | System shall allow manual duplicate resolution via UI | Should Have |
| MIG-093 | System shall allow manual document-to-client assignment via UI | Should Have |
6. Validation and Reporting
6.1 Pre-Migration Validation
| ID | Requirement | Priority |
| MIG-100 | System shall validate source files before import begins | Must Have |
| MIG-101 | System shall report expected record counts | Must Have |
| MIG-102 | System shall identify missing required fields | Must Have |
| MIG-103 | System shall estimate storage requirements for documents | Should Have |
6.2 Post-Migration Validation
| ID | Requirement | Priority |
| MIG-110 | System shall generate migration summary report | Must Have |
| MIG-111 | System shall compare source counts to imported counts | Must Have |
| MIG-112 | System shall list all errors and warnings | Must Have |
| MIG-113 | System shall generate sample client report for spot-checking | Should Have |
| MIG-114 | System shall verify document counts per client match source | Should Have |
Migration Summary Report:
Migration Summary Report
========================
Date: 2025-01-15 14:30:00
Duration: 2h 15m

CLIENTS
  Source records: 1,042
  Imported: 1,038
  Duplicates skipped: 4
  Errors: 0

DOCUMENTS
  Source files: 12,456
  Imported: 12,389
  Matched to client: 12,102
  Unmatched: 287
  Classification:
    - W-2: 2,340
    - 1099-*: 3,120
    - Prior Returns: 2,890
    - Other: 4,039

HISTORICAL DATA
  Prior year records: 2,950
  AGI imported: 2,950

WARNINGS
  - 287 documents unmatched (see unmatched_documents.csv)
  - 4 duplicate clients skipped (see duplicates.csv)

NEXT STEPS
  1. Review unmatched_documents.csv and assign manually
  2. Spot-check 5% of clients for data accuracy
  3. Test client login with sample accounts
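The source-versus-imported comparison in MIG-111 amounts to a simple reconciliation: every source record should be imported, skipped, or reported as an error. A minimal sketch using the figures from the sample report follows; `reconcile` is a hypothetical helper.

```python
def reconcile(section, source, imported, skipped=0, errors=0):
    """Report how many source records are not explained by the other counts."""
    unaccounted = source - (imported + skipped + errors)
    return {"section": section, "unaccounted": unaccounted, "ok": unaccounted == 0}


# Figures taken from the sample report above.
clients = reconcile("clients", source=1042, imported=1038, skipped=4, errors=0)
documents = reconcile("documents", source=12456, imported=12389)
```

With the sample figures, the client counts reconcile exactly, while the document counts leave 67 source files unaccounted for. The report may therefore also need a line for rejected or failed documents (for example, files that fail the malware scan in MIG-035).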
7. Security and Compliance
7.1 Data Protection During Migration
| ID | Requirement | Priority |
| MIG-120 | System shall encrypt all data in transit during migration | Must Have |
| MIG-121 | System shall not store full SSNs during migration; only the last 4 digits may be stored, if provided | Must Have |
| MIG-122 | System shall log all migration operations for audit | Must Have |
| MIG-123 | System shall restrict migration tools to authorized admin users | Must Have |
| MIG-124 | System shall securely delete temporary files after migration | Must Have |
7.2 Audit Trail
| ID | Requirement | Priority |
| MIG-130 | System shall log migration start/end with operator ID | Must Have |
| MIG-131 | System shall log source file hashes for traceability | Should Have |
| MIG-132 | System shall preserve audit trail of imported vs. original records | Must Have |
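Source file hashing for MIG-131 can be done with Python's standard `hashlib`. The sketch below assumes SHA-256 (the document does not name an algorithm) and reads in chunks so that large exports do not need to fit in memory; `file_sha256` is a hypothetical name.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recording this digest with the operator ID and start time in the audit log makes it possible to show later which exact source file produced a given batch.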
8. Rollback and Recovery
| ID | Requirement | Priority |
| MIG-140 | System shall support full rollback of a migration batch | Should Have |
| MIG-141 | System shall tag all imported records with migration batch ID | Must Have |
| MIG-142 | System shall provide rollback CLI: tax-migrate rollback <batch-id> | Should Have |
| MIG-143 | System shall back up the database before large migrations | Must Have |
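Batch tagging (MIG-141) is what makes batch rollback (MIG-140) practical: if every imported row carries its batch ID, rollback reduces to a delete by that ID. The sketch below uses an in-memory SQLite database purely as a stand-in for the production database. The table layout, the `migration_batch_id` column name, and both function names are assumptions.

```python
import sqlite3
import uuid

# In-memory SQLite stands in for the real database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clients ("
    "id INTEGER PRIMARY KEY, name TEXT, migration_batch_id TEXT)"
)


def import_batch(conn, names):
    """Insert rows tagged with a fresh batch ID; return that ID."""
    batch_id = uuid.uuid4().hex
    with conn:  # commits on success, rolls back the transaction on error
        conn.executemany(
            "INSERT INTO clients (name, migration_batch_id) VALUES (?, ?)",
            [(name, batch_id) for name in names],
        )
    return batch_id


def rollback_batch(conn, batch_id):
    """Delete every row from one batch; return the number of rows removed."""
    with conn:
        cursor = conn.execute(
            "DELETE FROM clients WHERE migration_batch_id = ?", (batch_id,)
        )
    return cursor.rowcount
```

A real rollback would also need to cover documents, history records, and stored files tagged with the same batch ID. The pre-migration backup in MIG-143 remains the fallback if records were modified after import.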
9. Timeline and Dependencies
9.1 Prerequisites
| Prerequisite | Dependency | Status |
| Aurora database deployed | S1-001 | Complete |
| S3 document storage configured | S1-002 | Complete |
| Client table schema finalized | S1-001 | Complete |
| AuditService implemented | S1-007 | Complete |
9.2 Recommended Sequence
- Week 1: Client data import and validation
- Week 2: Document migration (can run in parallel batches)
- Week 3: Historical data import
- Week 4: Validation, spot-checks, error resolution
9.3 Go-Live Checklist
Document History
| Version | Date | Author | Changes |
| 1.0 | 2024-12-23 | Don McCarty | Initial requirements |