Credit Report Agent - Stage 2: Improved¶

Path: examples/a2a_credit_report_example/improved

Overview¶

Stage 2 demonstrates partial security improvements that are insufficient for production use. This stage teaches that "better" does not equal "secure" when handling sensitive financial data.

Security Rating: ⚠️ 4/10 - PARTIALLY SECURE

Status: ⚠️ Not Production Ready - Significant vulnerabilities remain

Key Learning Focus¶

This stage focuses on understanding why partial security measures fail and the importance of comprehensive, layered defenses.

What You'll Learn¶

Why partial file validation is insufficient
Gaps in incomplete PII encryption
Weaknesses of basic authentication
How attackers exploit remaining vulnerabilities
The importance of defense-in-depth

Architecture¶

Client
  ↓ (uploads PDF)
Basic Auth Check ✅
  ↓
File Type Validation ⚠️
  ↓
PDF Parser (partial validation) ⚠️
  ↓
Storage (partial encryption) ⚠️
  ↓
Basic Access Control ⚠️
  ↓
Response (HTTPS) ✅

Components¶

server.py: HTTP server with basic authentication
auth.py: Simple password authentication (bcrypt)
validator.py: Basic file type checking
parser.py: PDF processor with some validation
encryption.py: Partial field encryption
storage.py: Mixed plaintext/encrypted storage
access_control.py: Simple ownership checks
config.py: Environment variables for secrets

✅ Improvements from Stage 1¶

1. Basic Authentication Added¶

# ✅ Password verification with bcrypt
import bcrypt

def authenticate(username, password):
    stored_hash = users[username]['password_hash']
    return bcrypt.checkpw(password.encode(), stored_hash)

Benefit: Prevents anonymous access

2. File Extension Validation¶

# ✅ Check file extension
ALLOWED_EXTENSIONS = {'.pdf', '.PDF'}

def validate_extension(filename):
    ext = os.path.splitext(filename)[1]
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError("Only PDF files allowed")

Benefit: Blocks obviously wrong file types

3. Basic Size Limits¶

# ✅ Reject files over 10MB
MAX_FILE_SIZE = 10 * 1024 * 1024

def validate_size(file):
    file.seek(0, 2)  # End of file
    size = file.tell()
    file.seek(0)     # Reset
    if size > MAX_FILE_SIZE:
        raise ValueError("File too large")

Benefit: Prevents basic DoS via huge files

4. Partial PII Encryption¶

# ✅ Encrypt SSN and account numbers
from cryptography.fernet import Fernet

def store_report(data):
    encrypted_data = {
        'ssn': encrypt(data['ssn']),              # ✅ Encrypted
        'account_numbers': encrypt(data['accounts']),  # ✅ Encrypted
        'name': data['name'],                     # ❌ Still plaintext
        'address': data['address'],               # ❌ Still plaintext
        'dob': data['dob'],                       # ❌ Still plaintext
        'credit_score': data['score']             # ❌ Still plaintext
    }
    save_to_db(encrypted_data)

Benefit: Some PII protected, but inconsistent

5. HTTPS Enforced¶

# ✅ Require TLS
if not request.is_secure:
    return "HTTPS required", 403

Benefit: Encrypted data in transit

6. Basic Access Control¶

# ✅ Check ownership
def get_report(report_id, requester_id):
    report = db.get(report_id)
    if report.owner_id != requester_id:
        raise PermissionError("Access denied")
    return report

Benefit: Users can't access each other's reports

⚠️ Remaining Vulnerabilities¶

Despite improvements, 15+ critical vulnerabilities remain:

1. Incomplete File Validation (CRITICAL)¶

# ⚠️ Only checks extension, not content
def validate_file(filename):
    if filename.endswith('.pdf'):
        return True  # .exe.pdf would pass!

Attack: Rename malicious files with .pdf extension

2. No Magic Byte Verification (CRITICAL)¶

# ❌ Doesn't verify actual PDF format
def is_pdf(file):
    return file.name.endswith('.pdf')  # Trust filename only

Attack: Upload executable with .pdf extension

Should be:

def is_pdf(file):
    magic = file.read(4)
    file.seek(0)
    return magic == b'%PDF'  # Verify actual file format

3. Inconsistent PII Encryption (HIGH)¶

# ⚠️ Only encrypts some fields
encrypted_fields = ['ssn', 'account_numbers']
plaintext_fields = ['name', 'address', 'dob', 'employer']
#                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#                    Still exposed in breach!

Attack: Breach exposes most PII despite "encryption"

4. No MFA (HIGH)¶

# ⚠️ Single-factor authentication
def login(username, password):
    if verify_password(username, password):
        return create_session(username)  # No 2FA

Attack: Stolen/weak passwords = full access

5. Weak Password Requirements (MEDIUM)¶

# ⚠️ No password complexity rules
def create_user(username, password):
    # Accepts "password123"
    hash = bcrypt.hashpw(password.encode(), bcrypt.gensalt())
    save_user(username, hash)

Attack: Brute force weak passwords

6. No Rate Limiting (HIGH)¶

# ❌ Still unlimited requests
@app.route('/upload', methods=['POST'])
def upload():
    return process_file(request.files['file'])

Attack: Mass data extraction, credential stuffing

7. Path Traversal Still Possible (HIGH)¶

# ⚠️ Incomplete sanitization
def sanitize_filename(filename):
    filename = filename.replace('../', '')  # Single pass only!
    return filename

Attack: Use ....// which becomes ../ after replacement

8. No Virus Scanning (HIGH)¶

# ❌ No malware detection
def process_upload(file):
    # Directly processes uploaded files
    return parse_pdf(file)

Attack: Upload PDF with embedded malware

9. Incomplete Audit Logging (MEDIUM)¶

# ⚠️ Logs some but not all access
def get_report(report_id):
    report = db.get(report_id)
    # ❌ No log of who accessed what
    return report

Attack: Unauthorized access undetected

10. No Input Sanitization (MEDIUM)¶

# ⚠️ Doesn't sanitize extracted text
def extract_text(pdf):
    text = pdf.extract_text()
    return text  # Could contain injection payloads

Attack: Injection via PDF content

Attack Success Matrix¶

Attack Type	Stage 1	Stage 2	Stage 3	Stage 4
Unauthorized Access	✅ Succeeds	❌ Blocked	❌ Blocked	❌ Blocked
Magic Byte Bypass	✅ Succeeds	✅ Succeeds	❌ Blocked	❌ Blocked
Path Traversal	✅ Succeeds	✅ Succeeds	❌ Blocked	❌ Blocked
PII Exposure	✅ Full	⚠️ Partial	❌ Protected	❌ Protected
Credential Stuffing	N/A	✅ Succeeds	❌ Blocked	❌ Blocked
Mass Extraction	✅ Succeeds	✅ Succeeds	❌ Blocked	❌ Blocked
Malware Upload	✅ Succeeds	✅ Succeeds	❌ Blocked	❌ Blocked
DoS (Large Files)	✅ Succeeds	⚠️ Harder	❌ Blocked	❌ Blocked

Legend: ✅ = Attack succeeds, ⚠️ = Partially mitigated, ❌ = Attack blocked

Compliance Status¶

FCRA Compliance¶

Requirement	Status	Notes
Access Control (§604)	⚠️ Partial	Basic auth added, but weak
Security Procedures (§607)	❌ Insufficient	Incomplete encryption
Audit Trail (§607)	⚠️ Partial	Some logging, incomplete
Dispute Resolution (§611)	❌ Missing	No mechanism implemented

Verdict: Still violates FCRA

Requirement	Status	Notes
Data Minimization (Art. 5)	⚠️ Partial	Still collects too much
Security (Art. 32)	⚠️ Partial	Inconsistent encryption
Breach Notification (Art. 33)	❌ Missing	No detection capability
Data Portability (Art. 20)	❌ Missing	Not implemented

Verdict: Still violates GDPR

Attack Demonstrations¶

Demo 1: Magic Byte Bypass¶

# Create fake.pdf (actually an executable)
with open('malware.exe', 'rb') as exe:
    content = exe.read()

with open('fake.pdf', 'wb') as pdf:
    pdf.write(content)

# Upload succeeds - only checked extension!
response = upload_file('fake.pdf')
print(response)  # "File uploaded successfully"

Demo 2: Double Path Traversal¶

# Bypass simple sanitization
filename = "....//....//....//etc/passwd"
# After single replace: "../../../etc/passwd"
upload_file(filename)  # Writes outside uploads directory

Demo 3: Extract Plaintext PII¶

# Authenticate once
token = login('attacker', 'password')

# Steal all reports - no rate limiting
for report_id in range(10000):
    report = get_report(report_id, token)
    # SSN encrypted, but name, address, DOB exposed
    stolen_pii.append({
        'name': report['name'],
        'address': report['address'],
        'dob': report['dob']
    })

Running the Example¶

Setup¶

cd examples/a2a_credit_report_example/improved
pip install -r requirements.txt
python server.py

Try the Attacks¶

# Terminal 1: Start server
python server.py

# Terminal 2: Run Stage 2 specific attacks
python ../demos/attack_stage2.py

What to Observe¶

Some attacks blocked (good!)
But many still succeed (bad!)
Inconsistent protection
False sense of security

Key Differences from Stage 1¶

Feature	Stage 1	Stage 2	Improvement
Authentication	❌ None	✅ Password	+100%
File Validation	❌ None	⚠️ Extension only	+30%
PII Encryption	❌ None	⚠️ Partial	+50%
HTTPS	❌ HTTP	✅ Required	+100%
Size Limits	❌ None	✅ 10MB max	+100%
Access Control	❌ None	⚠️ Basic	+60%
Rate Limiting	❌ None	❌ None	0%
Audit Logging	❌ None	⚠️ Partial	+40%
Overall Security	0/10	4/10	+40%

Conclusion: Better, but still fails in production

Study Checklist¶

Compare code with Stage 1 to see improvements
Identify 15+ remaining vulnerabilities
Run bypass demonstrations successfully
Understand why partial encryption fails
Recognize incomplete vs. comprehensive security
Ready for Stage 3 production patterns

Key Takeaways¶

Partial security creates false confidence: Some protection ≠ secure
Inconsistent encryption is weak: All PII must be protected
File validation must be comprehensive: Extension checks are insufficient
Defense-in-depth is essential: Single layers fail
Compliance requires completeness: Partial implementation still violates regulations

Next: Stage 3 (Secure)¶

Stage 3 implements production-grade security with comprehensive protections.

Additional protections in Stage 3: - ✅ 8-layer validation framework - ✅ Complete field-level encryption - ✅ MFA enforcement - ✅ Rate limiting and abuse detection - ✅ Comprehensive audit logging - ✅ Full FCRA/GDPR compliance

Time to Complete: 4-6 hours
Difficulty: ⭐⭐ Intermediate
Prerequisites: Stage 1 complete, understanding of encryption basics