Credit Report Agent - Stage 2: Improved¶
Path: examples/a2a_credit_report_example/improved
Overview¶
Stage 2 demonstrates partial security improvements that are insufficient for production use. This stage teaches that "better" does not equal "secure" when handling sensitive financial data.
Security Rating: ⚠️ 4/10 - PARTIALLY SECURE
Status: ⚠️ Not Production Ready - Significant vulnerabilities remain
Key Learning Focus¶
This stage focuses on understanding why partial security measures fail and the importance of comprehensive, layered defenses.
What You'll Learn¶
- Why partial file validation is insufficient
- Gaps in incomplete PII encryption
- Weaknesses of basic authentication
- How attackers exploit remaining vulnerabilities
- The importance of defense-in-depth
Architecture¶
Client
↓ (uploads PDF)
Basic Auth Check ✅
↓
File Type Validation ⚠️
↓
PDF Parser (partial validation) ⚠️
↓
Storage (partial encryption) ⚠️
↓
Basic Access Control ⚠️
↓
Response (HTTPS) ✅
Components¶
- server.py: HTTP server with basic authentication
- auth.py: Simple password authentication (bcrypt)
- validator.py: Basic file type checking
- parser.py: PDF processor with some validation
- encryption.py: Partial field encryption
- storage.py: Mixed plaintext/encrypted storage
- access_control.py: Simple ownership checks
- config.py: Environment variables for secrets
✅ Improvements from Stage 1¶
1. Basic Authentication Added¶
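# ✅ Password verification with bcrypt
import bcrypt

def authenticate(username, password):
    user = users.get(username)
    if user is None:
        return False  # Unknown user: fail closed instead of raising KeyError
    stored_hash = user['password_hash']  # bcrypt hash, stored as bytes
    return bcrypt.checkpw(password.encode(), stored_hash)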
Benefit: Prevents anonymous access
2. File Extension Validation¶
# ✅ Check file extension
import os

ALLOWED_EXTENSIONS = {'.pdf', '.PDF'}

def validate_extension(filename):
    ext = os.path.splitext(filename)[1]
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError("Only PDF files allowed")
Benefit: Blocks obviously wrong file types
3. Basic Size Limits¶
# ✅ Reject files over 10MB
MAX_FILE_SIZE = 10 * 1024 * 1024

def validate_size(file):
    file.seek(0, 2)  # Seek to end of file
    size = file.tell()
    file.seek(0)  # Reset to start for later readers
    if size > MAX_FILE_SIZE:
        raise ValueError("File too large")
Benefit: Prevents basic DoS via huge files
4. Partial PII Encryption¶
# ✅ Encrypt SSN and account numbers
from cryptography.fernet import Fernet

def store_report(data):
    encrypted_data = {
        'ssn': encrypt(data['ssn']),                   # ✅ Encrypted
        'account_numbers': encrypt(data['accounts']),  # ✅ Encrypted
        'name': data['name'],                          # ❌ Still plaintext
        'address': data['address'],                    # ❌ Still plaintext
        'dob': data['dob'],                            # ❌ Still plaintext
        'credit_score': data['score']                  # ❌ Still plaintext
    }
    save_to_db(encrypted_data)
Benefit: Some PII protected, but inconsistent
5. HTTPS Enforced¶
Benefit: Encrypted data in transit
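The enforcement code for this step is not shown on this page. As a rough sketch only, one way to reject plain-HTTP requests is a small WSGI wrapper (Flask applications are WSGI callables). The names `require_https` and `demo_app` are hypothetical, and the sketch assumes the WSGI server, or a trusted proxy in front of it, sets `wsgi.url_scheme` correctly.

```python
def require_https(app):
    """Wrap a WSGI app so that plain-HTTP requests are rejected."""
    def middleware(environ, start_response):
        # wsgi.url_scheme is set by the WSGI server ("http" or "https").
        if environ.get("wsgi.url_scheme") != "https":
            body = b"HTTPS required"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return app(environ, start_response)
    return middleware


def demo_app(environ, start_response):
    """Hypothetical stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]
```

Many deployments terminate TLS at a reverse proxy and redirect HTTP to HTTPS there instead, which makes an in-app check like this a second layer, not the only one.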
6. Basic Access Control¶
# ✅ Check ownership
def get_report(report_id, requester_id):
    report = db.get(report_id)
    if report.owner_id != requester_id:
        raise PermissionError("Access denied")
    return report
Benefit: Users can't access each other's reports
⚠️ Remaining Vulnerabilities¶
Despite these improvements, more than 15 vulnerabilities remain. Ten of them are described below, with severities ranging from MEDIUM to CRITICAL:
1. Incomplete File Validation (CRITICAL)¶
# ⚠️ Only checks extension, not content
def validate_file(filename):
    if filename.endswith('.pdf'):
        return True  # .exe.pdf would pass!
    return False
Attack: Rename malicious files with .pdf extension
2. No Magic Byte Verification (CRITICAL)¶
# ❌ Doesn't verify actual PDF format
def is_pdf(file):
    return file.name.endswith('.pdf')  # Trust filename only
Attack: Upload executable with .pdf extension
Should be:
def is_pdf(file):
    magic = file.read(4)
    file.seek(0)
    return magic == b'%PDF'  # Verify actual file format
3. Inconsistent PII Encryption (HIGH)¶
# ⚠️ Only encrypts some fields
encrypted_fields = ['ssn', 'account_numbers']
plaintext_fields = ['name', 'address', 'dob', 'employer']
#                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#                   Still exposed in breach!
Attack: Breach exposes most PII despite "encryption"
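To make the contrast concrete, here is a minimal sketch of treating every PII field the same way. The `PII_FIELDS` tuple and the `encrypt_record` helper are hypothetical names, not part of the example code, and the `encrypt` callable is passed in so the sketch stays independent of any particular cipher.

```python
# Hypothetical field list: every field the examples above treat as PII.
PII_FIELDS = (
    "ssn", "account_numbers", "name", "address",
    "dob", "employer", "credit_score",
)


def encrypt_record(record, encrypt, pii_fields=PII_FIELDS):
    """Return a copy of record with every PII field passed through encrypt."""
    protected = {}
    for key, value in record.items():
        if key in pii_fields and value is not None:
            # Encrypt the string form so all fields are handled uniformly.
            protected[key] = encrypt(str(value))
        else:
            protected[key] = value
    return protected
```

With the `cryptography` package used earlier, `encrypt` could be something like `lambda s: Fernet(key).encrypt(s.encode())`, assuming `key` comes from `Fernet.generate_key()` or a secret store.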
4. No MFA (HIGH)¶
# ⚠️ Single-factor authentication
def login(username, password):
    if verify_password(username, password):
        return create_session(username)  # No 2FA
Attack: Stolen/weak passwords = full access
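For illustration, a second factor is often a time-based one-time password (TOTP, RFC 6238). The sketch below uses only the standard library. The function names `totp` and `verify_second_factor` are hypothetical, and a real deployment would also need secret provisioning, clock-drift tolerance, and replay protection, none of which are shown.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, at_time=None, step=30, digits=6):
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for a raw byte secret."""
    if at_time is None:
        at_time = time.time()
    counter = int(at_time) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify_second_factor(secret, submitted_code, at_time=None):
    """Compare the submitted code with the expected one in constant time."""
    return hmac.compare_digest(totp(secret, at_time), submitted_code)
```

A login flow would call `verify_second_factor` after the password check succeeds and create the session only when both pass.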
5. Weak Password Requirements (MEDIUM)¶
# ⚠️ No password complexity rules
def create_user(username, password):
    # Accepts "password123"
    password_hash = bcrypt.hashpw(password.encode(), bcrypt.gensalt())
    save_user(username, password_hash)
Attack: Brute force weak passwords
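A policy check before hashing would close this gap. The following is a minimal sketch under assumed rules: the 12-character minimum, the tiny deny-list, and the `password_problems` name are all illustrative choices, not values taken from the example code.

```python
# Hypothetical deny-list; a real one would be far larger.
COMMON_PASSWORDS = {"password", "password123", "123456", "qwerty", "letmein"}
MIN_LENGTH = 12  # assumed threshold for illustration


def password_problems(password):
    """Return a list of policy violations; an empty list means acceptable."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append(f"shorter than {MIN_LENGTH} characters")
    if password.lower() in COMMON_PASSWORDS:
        problems.append("appears in the common-password list")
    if len(set(password)) < 5:
        problems.append("too few distinct characters")
    return problems
```

`create_user` could call `password_problems(password)` first and refuse to create the account when the returned list is non-empty.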
6. No Rate Limiting (HIGH)¶
# ❌ Still unlimited requests
@app.route('/upload', methods=['POST'])
def upload():
    return process_file(request.files['file'])
Attack: Mass data extraction, credential stuffing
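As a sketch of what is missing, a per-client sliding-window limiter can be written with the standard library alone. The class name `SlidingWindowLimiter` is hypothetical, and this version keeps its state in process memory, so it assumes a single server process; a multi-process deployment would need shared storage.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per window_seconds for each client key."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # client key -> request timestamps

    def allow(self, key):
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

The upload handler could call `limiter.allow(user_id)` at the top and return HTTP 429 when it yields `False`. The same check on the login route would slow credential stuffing.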
7. Path Traversal Still Possible (HIGH)¶
# ⚠️ Incomplete sanitization
def sanitize_filename(filename):
    filename = filename.replace('../', '')  # Single pass only!
    return filename
Attack: Use ....// which becomes ../ after replacement
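The bypass is easy to reproduce, and a sturdier approach avoids string replacement altogether. The sketch below keeps the single-pass replacement for comparison, then resolves the final path and checks that it still lies under the upload directory. `naive_sanitize` and `safe_upload_path` are hypothetical names, and the sketch assumes uploads are stored flat in one directory.

```python
import os


def naive_sanitize(filename):
    """The single-pass replacement shown above, kept for comparison."""
    return filename.replace("../", "")


def safe_upload_path(upload_dir, filename):
    """Resolve filename inside upload_dir or raise ValueError."""
    # Normalise Windows separators, then keep only the final path component.
    name = os.path.basename(filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError("Invalid filename")
    base = os.path.realpath(upload_dir)
    target = os.path.realpath(os.path.join(base, name))
    # Second check: the resolved path must still sit under the upload directory.
    if os.path.commonpath([base, target]) != base:
        raise ValueError("Path escapes upload directory")
    return target
```

Here `naive_sanitize("....//....//....//etc/passwd")` yields `../../../etc/passwd`, while `safe_upload_path` reduces the same input to a file named `passwd` inside the upload directory. Generating a server-side random filename and ignoring the client's name entirely is another common option.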
8. No Virus Scanning (HIGH)¶
# ❌ No malware detection
def process_upload(file):
    # Directly processes uploaded files
    return parse_pdf(file)
Attack: Upload PDF with embedded malware
9. Incomplete Audit Logging (MEDIUM)¶
# ⚠️ Logs some but not all access
def get_report(report_id):
    report = db.get(report_id)
    # ❌ No log of who accessed what
    return report
Attack: Unauthorized access undetected
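A fuller audit trail records every access attempt, including denied ones. This is a minimal sketch using the standard `logging` module. The logger name `audit`, the helper names, and the assumption that `db` is dict-like and reports are dicts with an `owner_id` key are illustrative only.

```python
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("audit")  # hypothetical logger name


def log_access(user_id, report_id, action, allowed):
    """Emit one structured audit record per access attempt."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "report_id": report_id,
        "action": action,
        "allowed": allowed,
    }
    audit_log.info(json.dumps(entry, sort_keys=True))
    return entry


def get_report_audited(db, report_id, requester_id):
    """Ownership check with every outcome logged, allowed or denied."""
    report = db.get(report_id)
    allowed = report is not None and report["owner_id"] == requester_id
    log_access(requester_id, report_id, "read", allowed)
    if not allowed:
        raise PermissionError("Access denied")
    return report
```

Logging the denial before raising means failed probes show up in the trail, which is what makes bulk enumeration detectable.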
10. No Input Sanitization (MEDIUM)¶
# ⚠️ Doesn't sanitize extracted text
def extract_text(pdf):
text = pdf.extract_text()
return text # Could contain injection payloads
Attack: Injection via PDF content
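What sanitization means depends on where the text ends up. As one hedged example, the sketch below prepares extracted text for display in a web page: it normalises Unicode, drops control characters, caps the length, and escapes HTML metacharacters. The function name and the length cap are assumptions, and text headed for a SQL query or a shell command needs different handling (parameterised queries, no shell interpolation).

```python
import html
import unicodedata

MAX_TEXT_LENGTH = 100_000  # assumed cap for illustration


def sanitize_extracted_text(text):
    """Normalise PDF text and neutralise markup before display."""
    # NFKC folds look-alike forms (e.g. fullwidth "<") into their ASCII form.
    text = unicodedata.normalize("NFKC", text)[:MAX_TEXT_LENGTH]
    # Drop control characters except newline and tab.
    cleaned = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    # Escape HTML metacharacters so the text is inert in a web page.
    return html.escape(cleaned)
```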
Attack Success Matrix¶
| Attack Type | Stage 1 | Stage 2 | Stage 3 | Stage 4 |
|---|---|---|---|---|
| Unauthorized Access | ✅ Succeeds | ❌ Blocked | ❌ Blocked | ❌ Blocked |
| Magic Byte Bypass | ✅ Succeeds | ✅ Succeeds | ❌ Blocked | ❌ Blocked |
| Path Traversal | ✅ Succeeds | ✅ Succeeds | ❌ Blocked | ❌ Blocked |
| PII Exposure | ✅ Full | ⚠️ Partial | ❌ Protected | ❌ Protected |
| Credential Stuffing | N/A | ✅ Succeeds | ❌ Blocked | ❌ Blocked |
| Mass Extraction | ✅ Succeeds | ✅ Succeeds | ❌ Blocked | ❌ Blocked |
| Malware Upload | ✅ Succeeds | ✅ Succeeds | ❌ Blocked | ❌ Blocked |
| DoS (Large Files) | ✅ Succeeds | ⚠️ Harder | ❌ Blocked | ❌ Blocked |
Legend: ✅ = Attack succeeds, ⚠️ = Partially mitigated, ❌ = Attack blocked
Compliance Status¶
FCRA Compliance¶
| Requirement | Status | Notes |
|---|---|---|
| Access Control (§604) | ⚠️ Partial | Basic auth added, but weak |
| Security Procedures (§607) | ❌ Insufficient | Incomplete encryption |
| Audit Trail (§607) | ⚠️ Partial | Some logging, incomplete |
| Dispute Resolution (§611) | ❌ Missing | No mechanism implemented |
Verdict: Still violates FCRA
GDPR Compliance¶
| Requirement | Status | Notes |
|---|---|---|
| Data Minimization (Art. 5) | ⚠️ Partial | Still collects too much |
| Security (Art. 32) | ⚠️ Partial | Inconsistent encryption |
| Breach Notification (Art. 33) | ❌ Missing | No detection capability |
| Data Portability (Art. 20) | ❌ Missing | Not implemented |
Verdict: Still violates GDPR
Attack Demonstrations¶
Demo 1: Magic Byte Bypass¶
# Create fake.pdf (actually an executable)
with open('malware.exe', 'rb') as exe:
    content = exe.read()
with open('fake.pdf', 'wb') as pdf:
    pdf.write(content)
# Upload succeeds - only checked extension!
response = upload_file('fake.pdf')
print(response)  # "File uploaded successfully"
Demo 2: Double Path Traversal¶
# Bypass simple sanitization
filename = "....//....//....//etc/passwd"
# After single replace: "../../../etc/passwd"
upload_file(filename) # Writes outside uploads directory
Demo 3: Extract Plaintext PII¶
# Authenticate once
token = login('attacker', 'password')
stolen_pii = []
# Steal all reports - no rate limiting
for report_id in range(10000):
    report = get_report(report_id, token)
    # SSN encrypted, but name, address, DOB exposed
    stolen_pii.append({
        'name': report['name'],
        'address': report['address'],
        'dob': report['dob']
    })
Running the Example¶
Setup¶
Try the Attacks¶
# Terminal 1: Start server
python server.py
# Terminal 2: Run Stage 2 specific attacks
python ../demos/attack_stage2.py
What to Observe¶
- Some attacks blocked (good!)
- But many still succeed (bad!)
- Inconsistent protection
- False sense of security
Key Differences from Stage 1¶
| Feature | Stage 1 | Stage 2 | Improvement |
|---|---|---|---|
| Authentication | ❌ None | ✅ Password | +100% |
| File Validation | ❌ None | ⚠️ Extension only | +30% |
| PII Encryption | ❌ None | ⚠️ Partial | +50% |
| HTTPS | ❌ HTTP | ✅ Required | +100% |
| Size Limits | ❌ None | ✅ 10MB max | +100% |
| Access Control | ❌ None | ⚠️ Basic | +60% |
| Rate Limiting | ❌ None | ❌ None | 0% |
| Audit Logging | ❌ None | ⚠️ Partial | +40% |
| Overall Security | 0/10 | 4/10 | +40% |
Conclusion: Better, but still fails in production
Study Checklist¶
- Compare code with Stage 1 to see improvements
- Identify 15+ remaining vulnerabilities
- Run bypass demonstrations successfully
- Understand why partial encryption fails
- Recognize incomplete vs. comprehensive security
- Ready for Stage 3 production patterns
Key Takeaways¶
- Partial security creates false confidence: Some protection ≠ secure
- Inconsistent encryption is weak: All PII must be protected
- File validation must be comprehensive: Extension checks are insufficient
- Defense-in-depth is essential: Single layers fail
- Compliance requires completeness: Partial implementation still violates regulations
Next: Stage 3 (Secure)¶
Stage 3 implements production-grade security with comprehensive protections.
Additional protections in Stage 3:
- ✅ 8-layer validation framework
- ✅ Complete field-level encryption
- ✅ MFA enforcement
- ✅ Rate limiting and abuse detection
- ✅ Comprehensive audit logging
- ✅ Full FCRA/GDPR compliance
Time to Complete: 4-6 hours
Difficulty: ⭐⭐ Intermediate
Prerequisites: Stage 1 complete, understanding of encryption basics