Amin Wafi
AI & Machine Learning Engineer
I train and ship AI systems with a focus on how they behave in deployment. Across computer vision, on-device LLMs, and developer tools, one principle holds: the user's data should not leave the user's machine.
YOLOv8 · spaCy · on-device LLMs
About
I'm a final-year Computer Science student who trains and ships AI systems with a focus on how they behave in deployment. My work spans computer vision, on-device LLM tooling, and full-stack web applications. One principle runs through all of it: the user's data should not leave the user's machine.
My dissertation argues that GDPR is structurally incompatible with frontier-scale LLM training on web-scraped personal data, the same thread that runs through my AW Labs privacy tools. I also write on the intersection of AI systems, security, and privacy law.
BSc (Hons) Computer Science · University of Derby · Expected 2026
Core skills
Featured Projects
Training LLMs on Web-Scraped Personal Data
Argues GDPR is structurally incompatible with frontier-scale LLM training on web-scraped personal data — synthesising empirical ML memorisation research (Carlini et al.), doctrinal EU law, and Charter fundamental-rights theory.
Evaluated current AI-governance responses (EDPB Opinion 28/2024, CNIL 2025 guidance, EU AI Act Article 53) against a standard derived from CJEU Charter case law. Predicted the Court of Rome would annul the Italian Garante's €15M fine against OpenAI. It did, six weeks after submission.
Proposed a data-trusts governance model grounded in GDPR Article 80(2), specifying three technical primitives: cryptographic dataset provenance, a training-data registry, and a memorisation-audit interface using membership-inference methods.
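As a rough illustration of the first primitive, cryptographic dataset provenance could start with a Merkle root that commits to the exact records in a training corpus, so that a registry entry can later be checked against the data actually used. This is a minimal sketch under that assumption, not the dissertation's specification; `merkle_root` and the byte-string records are hypothetical.

```python
import hashlib


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(records: list[bytes]) -> str:
    """Return a hex digest committing to the ordered list of records."""
    if not records:
        return _sha256(b"").hex()
    # Prefix leaves and inner nodes differently so they cannot be confused.
    level = [_sha256(b"\x00" + record) for record in records]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            _sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

Changing, adding, or reordering any record changes the root, which is the property a training-data registry would rely on.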
Skills
Languages
AI / ML
Web & Backend
Data
Infra & Tooling
AI Governance
Formal Methods
More Work
01 redact
Developer Tools
Dual-engine PII redaction CLI: spaCy NER + Claude run in parallel, spans merged, compare mode shows where each engine fails.
- Dual-engine: spaCy NER + regex, and Claude with character-level offsets
- Overlapping spans deduplicated; label or block redaction styles
- Side-by-side compare command shows gap between engines
- Pipe-friendly: cat file.txt | redact, no temp files
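The span-merging step can be sketched roughly as follows. It assumes each engine returns `(start, end, label)` tuples with end-exclusive character offsets, and that overlapping spans are merged into their union with the earlier span's label kept; the tool's actual rules may differ. `merge_spans` and `redact_text` are hypothetical names.

```python
Span = tuple[int, int, str]  # (start, end_exclusive, label)


def merge_spans(spans: list[Span]) -> list[Span]:
    """Collapse overlapping spans from both engines into non-overlapping ones."""
    merged: list[Span] = []
    for start, end, label in sorted(spans):
        if merged and start < merged[-1][1]:  # overlaps the previous span
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_label)
        else:
            merged.append((start, end, label))
    return merged


def redact_text(text: str, spans: list[Span], style: str = "label") -> str:
    """Replace each merged span with its label, or with a block of '#'."""
    out, cursor = [], 0
    for start, end, label in merge_spans(spans):
        out.append(text[cursor:start])
        out.append(f"[{label}]" if style == "label" else "#" * (end - start))
        cursor = end
    out.append(text[cursor:])
    return "".join(out)
```

Sorting before merging keeps the pass linear after the sort, and merging before substitution means offsets from either engine can be applied to the original text without shifting.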
02 mocap-sandbox
AI/ML
Real-time skeletal pose tracking from webcam using MediaPipe PoseLandmarker with OpenCV skeleton overlay.
- MediaPipe PoseLandmarker (pose.task model, VIDEO mode, num_poses=1)
- Real-time OpenCV skeleton overlay with frame-indexed timestamp
- Graceful exit handling; runs on Python 3.13
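MediaPipe's VIDEO running mode expects a millisecond timestamp with each frame, and the timestamps must keep increasing. One way to get that from a frame counter is sketched below; the 30 fps default and the helper name `frame_timestamp_ms` are assumptions for illustration, not taken from the project.

```python
def frame_timestamp_ms(frame_index: int, fps: float = 30.0) -> int:
    """Derive a millisecond timestamp from a frame counter.

    For fps <= 1000 consecutive frames are at least 1 ms apart, so the
    integer timestamps are strictly increasing.
    """
    if frame_index < 0 or fps <= 0:
        raise ValueError("frame_index must be >= 0 and fps must be > 0")
    return int(frame_index * 1000 / fps)


# Inside a capture loop the value would be passed along with each frame,
# for example (assuming `landmarker` and `mp_image` already exist):
#   result = landmarker.detect_for_video(mp_image, frame_timestamp_ms(i))
```

Deriving the timestamp from the frame index rather than the wall clock keeps it deterministic even when frames arrive with jitter.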
03 ElegantDine
Web
Flask restaurant-management app (reservations, order status workflow, RBAC) deployed on Render with PostgreSQL.
- Role-based auth (Flask-Login)
- Order status state machine
- PostgreSQL/SQLite auto-fallback
- Deployed on Render
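An order status state machine of this kind can be sketched as a table of allowed transitions. The specific states and transitions below are hypothetical; the app's real workflow may use different ones.

```python
from enum import Enum


class OrderStatus(Enum):
    PENDING = "pending"
    PREPARING = "preparing"
    READY = "ready"
    SERVED = "served"
    CANCELLED = "cancelled"


# Hypothetical transition table: each status maps to the statuses it may move to.
ALLOWED = {
    OrderStatus.PENDING: {OrderStatus.PREPARING, OrderStatus.CANCELLED},
    OrderStatus.PREPARING: {OrderStatus.READY, OrderStatus.CANCELLED},
    OrderStatus.READY: {OrderStatus.SERVED},
    OrderStatus.SERVED: set(),
    OrderStatus.CANCELLED: set(),
}


def advance(current: OrderStatus, target: OrderStatus) -> OrderStatus:
    """Return the new status, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move order from {current.value} to {target.value}")
    return target
```

Keeping the rules in one table means a route handler only has to call `advance` and turn a `ValueError` into an error response.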
04 Pandemic Resilience System
Web
Multi-role Flask platform with encryption at rest, FHIR vaccination records, and purchase-limit enforcement.
- Encryption at rest for sensitive identifiers
- Werkzeug password hashing + RBAC
- FHIR-JSON vaccination records
- Full audit logging
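For context, a FHIR-JSON vaccination record is typically an `Immunization` resource. The sketch below assumes FHIR R4 and fills only its core elements (`status`, `vaccineCode`, `patient`, `occurrenceDateTime`); the helper name and the sample values are hypothetical, and the platform's actual records may carry more fields.

```python
import json


def immunization_record(patient_id: str, vaccine_name: str, occurred_on: str) -> dict:
    """Build a minimal FHIR R4 Immunization resource as a plain dict."""
    return {
        "resourceType": "Immunization",
        "status": "completed",
        # A CodeableConcept with free text only; a coded system could be added.
        "vaccineCode": {"text": vaccine_name},
        "patient": {"reference": f"Patient/{patient_id}"},
        "occurrenceDateTime": occurred_on,
    }


if __name__ == "__main__":
    record = immunization_record("example-123", "COVID-19 vaccine", "2021-03-01")
    print(json.dumps(record, indent=2))
```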
05 Security Management Reports
Security
Consultant-style authentication review (6 password managers, FIDO2, PCI-DSS) and the 2025 M&S ransomware post-mortem mapped to ISO 27001.
- Evaluated 6 password managers + 2 MFA bypass techniques
- Reconstructed Scattered Spider attack chain
- Mapped failures to ISO 27001 / NIST
- Costed remediation programme
06 Systems Programming
Systems
POSIX shared memory + semaphores in C, a Linux loadable kernel module, and LAPACK benchmarking across 4 environments with Docker.
- POSIX shared memory + named semaphores with clean teardown
- Loadable Linux kernel module
- LAPACK solver verified to machine epsilon
- Reproducible across 4 environments
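Checking a solver "to machine epsilon" usually means checking that the scaled residual of the computed solution is within a small multiple of the floating-point unit roundoff. The sketch below shows that check in pure Python, with a small Gaussian elimination routine standing in for the LAPACK call; it is an illustration of the idea, not the project's benchmark code, and the function names and the 2x2 system are hypothetical.

```python
import sys


def solve(a: list[list[float]], b: list[float]) -> list[float]:
    """Gaussian elimination with partial pivoting (stand-in for a LAPACK solve)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def relative_residual(a: list[list[float]], x: list[float], b: list[float]) -> float:
    """Infinity-norm residual ||b - Ax|| scaled by ||A|| ||x|| + ||b||."""
    residual = [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(a, b)]
    a_norm = max(sum(abs(v) for v in row) for row in a)
    x_norm = max(abs(v) for v in x)
    b_norm = max(abs(v) for v in b)
    return max(abs(v) for v in residual) / (a_norm * x_norm + b_norm)


if __name__ == "__main__":
    a, b = [[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0]
    x = solve(a, b)
    print(relative_residual(a, x, b) <= 4 * sys.float_info.epsilon)
```

Scaling the residual by the norms of the inputs makes the tolerance independent of the size of the numbers in the system.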
07 Formal Verification (CSP-M / FDR)
Formal Methods
CSP-M models in FDR proving a banking security breach and showing Dekker's algorithm failing safety under one compiler optimisation and liveness under another.
- Trace refinement proving a breach that testing could not catch
- Mutual exclusion, deadlock, divergence checks
- Safety failure (Opt A) vs liveness failure (Opt B)
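The idea behind a mutual exclusion check can be shown outside CSP-M with a small exhaustive search over interleavings. The sketch below is a Python stand-in, not the FDR models: it uses a simplified two-process flag protocol rather than full Dekker, and the `REORDERED` program is a hypothetical example of a load hoisted above a store, not necessarily the project's Opt A.

```python
from collections import deque

# Each process runs three atomic steps in order.
SAFE = ("store", "load", "enter")       # set own flag, then read the other's
REORDERED = ("load", "store", "enter")  # the read hoisted above the write


def step(state, i, program):
    """Run process i's next step; return the new state, or None if it backs off."""
    pcs, flags, seen = (list(part) for part in state)
    op = program[pcs[i]]
    if op == "store":
        flags[i] = True
    elif op == "load":
        seen[i] = flags[1 - i]
    elif op == "enter" and seen[i]:
        return None  # the other flag was seen set, so this process never enters
    pcs[i] += 1
    return tuple(pcs), tuple(flags), tuple(seen)


def mutual_exclusion_holds(program) -> bool:
    """Breadth-first search of every interleaving for a state with both inside."""
    start = ((0, 0), (False, False), (False, False))
    visited, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        pcs = state[0]
        if pcs == (len(program), len(program)):
            return False  # both processes are in the critical section
        for i in (0, 1):
            if pcs[i] < len(program):
                nxt = step(state, i, program)
                if nxt is not None and nxt not in visited:
                    visited.add(nxt)
                    queue.append(nxt)
    return True
```

Under these assumptions `mutual_exclusion_holds(SAFE)` finds no violating state, while `mutual_exclusion_holds(REORDERED)` finds the interleaving where both processes read before either writes.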
Contact
I'm actively targeting AI and software engineering roles and open to relocation. If you're building something serious with AI, I'd like to hear about it.
Send an email