SYSTEM ONLINE · LOC: FRESNO, CA · ROLE: SWE II

DUC THAN

I build high-throughput backend systems and AI retrieval pipelines — architecting services that stay fast and reliable at 10M+ requests a day.

SYS_PROFILE.LOG v2.6
handle: @ductienthan
focus: backend · ai · infra
stack: node · python · ts
uptime: 4+ yrs
p99: < 100 ms
status: open to talk
10M+
Daily requests served
<100ms
p99 response latency
95%
Repeat-query latency cut
3.88
GPA · CS, UT Arlington
01 / About

Engineer for systems under load.

I'm a software engineer who lives in the backend — the queues, caches, and data pipelines that have to stay fast when traffic spikes and stay correct when things fail.

At Samsung Electronics America, I lead a core authentication service handling 120K+ requests a day and a pricing engine serving 10M+. Lately I've gone deep on AI infrastructure: RAG pipelines, vector search, LLM tooling, and MCP agents that do real work in production.

I care about the unglamorous parts — sub-100ms p99s, fault-tolerant ingestion, observability you can actually debug with. The kind of engineering you only notice when it's missing.

Name: Duc Than
Role: Software Engineer II
Company: Samsung Electronics America
Based: Sunnyvale, California
Degree: B.S. Computer Science · 3.88 GPA
02 / Capabilities

Stack & systems.

[01]

Languages

JavaScript / Node.js · TypeScript · Python · SQL · Next.js
[02]

Backend & Data

FastAPI · Hapi · RabbitMQ · Celery · Redis · PostgreSQL · pgvector · Elasticsearch · OpenSearch · Pusher · NeonDB · Pydantic · AWS
[03]

AI / ML

RAG Pipelines · LLM Integration · MCP Agents · Vector Search · Embeddings · sentence-transformers · BM25 Hybrid Search · Cross-encoder Rerank · HNSW · tiktoken · EasyOCR · Isolation Forest
[04]

Systems & Tooling

Queue Clustering · Load Balancing · SQL Tuning · Async / Concurrency · CDN / Edge Caching · Transactional Outbox · RRF · Jenkins · Docker · Git · Kibana · Alembic · pytest
03 / Experience

Where I've shipped.

JUL 2022 — PRESENT
Software Engineer II
Samsung Electronics America — Mountain View, CA
  • Lead engineer for a core authentication service processing 120K+ requests/day — owned architecture, reliability, and incident response.
  • Architected a clustered SSO session-extension queue with load balancing and SQL tuning — eliminated CPU saturation at peak and held sub-100ms p99 latency globally.
  • Replaced legacy Java middleware with a Node.js + Hapi layer, cutting inter-service latency by ~300ms across all daily traffic.
  • Built Isolation Forest anomaly detection on latency/throughput signals, reducing mean time to detect (MTTD) by 5–10%; shipped an LLM tool for automated Git-diff root-cause analysis.
  • Engineered a high-throughput shopping cart with Akamai CDN + multi-layer Redis caching to offload origin traffic under sustained peaks.
  • Designed a combinatorial pricing engine serving 10M+ daily requests with sub-millisecond lookups; built OpenSearch observability dashboards.
Internal Platform

AI-Powered PTO Request Tool

  • Designed a FastAPI + MCP-agent system letting thousands of employees submit PTO via template-driven Adaptive Cards; decoupled the agent from the HR backend via async queue for resilience against UKG API timeouts.
  • Built a cron worker mapping employees to managers; used AI coding agents to cut delivery time by ~40%.
2021
B.S. Computer Science
University of Texas at Arlington — GPA 3.88
04 / Selected Work

Things I've built.

PRJ_01

BookRAG

AI-Powered Book Q&A System

A production-grade 5-layer RAG system with async ingestion, OCR fallback, and hierarchical chunking. Resumable embedding guarantees zero data loss on worker failure. An 8-step retrieval pipeline with cross-encoder reranking lifted answer quality by ~60%; TTL caching cut repeat-query latency by 95%+.

Python · pgvector · HNSW · BM25 (RRF) · Ollama · Docker
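
The BM25 (RRF) tag refers to Reciprocal Rank Fusion, a way to merge keyword and vector result lists without comparing their raw scores. The sketch below shows the general technique only; it is not taken from BookRAG itself. The function name `rrf_fuse`, the chunk ids, and the constant `k=60` (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each id earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top hit.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; tie-break on id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical chunk ids returned by two independent retrievers.
bm25_hits = ["c3", "c1", "c7"]    # keyword search, best first
vector_hits = ["c1", "c9", "c3"]  # vector search, best first

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Chunks that appear in both lists rise to the top, which is why fusion of this kind is often placed ahead of a more expensive cross-encoder rerank.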
PRJ_02

Movie Recommender

Async Embedding + LLM Re-rank

Celery workers ingest movies, generate embeddings, and persist to pgvector; an LLM agent re-ranks cosine-similarity candidates for natural-language quality. A Transactional Outbox guarantees no record loss between ingestion and vector storage under worker failure.

FastAPI · Celery · pgvector · Redis · LLM Rerank
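
A Transactional Outbox can be sketched in a few lines: the domain row and its "please embed this" event are written in one database transaction, and a separate relay delivers pending events afterwards. This is a minimal illustration, not the project's code. It assumes SQLite in place of PostgreSQL, a plain callable in place of a Celery task, and hypothetical table names, function names, and sample titles.

```python
import json
import sqlite3


def init_db(conn):
    conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "processed INTEGER NOT NULL DEFAULT 0)"
    )


def ingest_movie(conn, movie_id, title):
    # One transaction: the movie row and its outbox event commit together,
    # or both roll back if either insert fails.
    with conn:
        conn.execute("INSERT INTO movies (id, title) VALUES (?, ?)", (movie_id, title))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"movie_id": movie_id}),),
        )


def relay_once(conn, publish):
    """Deliver pending events; mark each processed only after publish succeeds."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE processed = 0 ORDER BY id"
    ).fetchall()
    delivered = 0
    for event_id, payload in rows:
        publish(json.loads(payload))  # if this raises, the row stays pending
        with conn:
            conn.execute("UPDATE outbox SET processed = 1 WHERE id = ?", (event_id,))
        delivered += 1
    return delivered


conn = sqlite3.connect(":memory:")
init_db(conn)
ingest_movie(conn, 1, "Sample Movie A")
ingest_movie(conn, 2, "Sample Movie B")

queue = []  # stand-in for the embedding worker's task queue
sent = relay_once(conn, queue.append)
print(sent, queue)
```

Because an event is marked processed only after it is published, a crash between the two steps leads to redelivery rather than loss, so the consumer should be idempotent.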
PRJ_03

Automated Trading CLI

Concurrent Real-Time Engine

A hybrid threading + asyncio model parallelizes real-time price streams across instruments. Event-driven per-instrument task queues enable low-latency RSI signal processing and automated order execution — without shared-state contention.

Python · AsyncIO · Threading · IG Markets API
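
The per-instrument queue idea can be illustrated with plain asyncio: each instrument gets its own queue and its own worker, and each worker owns its price window, so no state is shared between instruments. This sketch covers only the asyncio side, not the threading layer or order execution. The instrument names, the `period=3` window, and the simple-average RSI variant are assumptions for illustration, not the CLI's actual logic.

```python
import asyncio
from collections import deque


def simple_rsi(prices):
    """RSI over one price window, using plain sums of gains and losses."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        # No down moves: 100 if anything rose, else treat a flat window as neutral.
        return 100.0 if gains > 0 else 50.0
    rs = gains / losses  # ratio of sums equals ratio of averages over the same window
    return 100.0 - 100.0 / (1.0 + rs)


async def instrument_worker(queue, period):
    window = deque(maxlen=period + 1)  # state owned by this worker only
    signals = []
    while True:
        price = await queue.get()
        if price is None:  # sentinel: the stream for this instrument has closed
            return signals
        window.append(price)
        if len(window) == period + 1:
            signals.append(simple_rsi(list(window)))


async def run(ticks, instruments, period=3):
    queues = {name: asyncio.Queue() for name in instruments}
    workers = [
        asyncio.create_task(instrument_worker(queues[name], period))
        for name in instruments
    ]
    for name, price in ticks:  # route each tick to its instrument's queue
        await queues[name].put(price)
    for q in queues.values():
        await q.put(None)
    results = await asyncio.gather(*workers)
    return dict(zip(instruments, results))


# Hypothetical interleaved price stream for two instruments.
ticks = [
    ("EURUSD", 10.0), ("GBPUSD", 20.0),
    ("EURUSD", 12.0), ("GBPUSD", 19.0),
    ("EURUSD", 11.0), ("GBPUSD", 21.0),
    ("EURUSD", 13.0), ("GBPUSD", 20.0),
]
result = asyncio.run(run(ticks, ["EURUSD", "GBPUSD"]))
print(result)
```

Routing by instrument keeps each RSI window private to one task, so a slow or noisy instrument does not block or corrupt the others.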
PRJ_04 · LIVE

Zuika

Real-Time Web Application

Built with Next.js (SSR + App Router) and Pusher pub-sub for real-time bidirectional events — no polling overhead. Hybrid persistence via NeonDB + local storage, with CDN edge caching for sub-50ms global asset delivery.

Next.js · Pusher · NeonDB · CDN
05 / Journal

Notes & write-ups.

Things I've learned building systems under load — RAG internals, latency hunting, and architecture decisions.

// Posts coming soon — check back shortly.

06 / Contact
Let's build something fast.