Currently: Senior Software Engineer • 8+ years

Kasi Viswanath G

I build production distributed systems, and I am shifting my focus toward ML infrastructure and GPU performance engineering.

Email GitHub LinkedIn

Impact highlights

Event-driven systems with Kafka
Kubernetes-based production workloads
Seasonal peak scaling (US sale events)
Airflow + BigQuery data pipelines
Performance tuning across SQL + async processing

Distributed Systems • Performance • ML Infra (focus) • GPU efficiency (building)

Technical Foundation

8+ years building production distributed systems in fintech and high-volume e-commerce environments. Strong background in concurrency, event-driven architectures, data systems, and performance optimization.

Distributed Systems & Infrastructure

Designed and maintained event-driven architectures using Kafka for high-throughput order processing.
Deployed and operated containerized workloads on Kubernetes with scaling considerations for peak traffic.
Built resilient, idempotent workflows for correctness under high concurrency and retry scenarios.
Developed Airflow DAGs for orchestrating production data pipelines.

Performance & Data Systems

Optimized Azure SQL queries and indexing strategies for performance-critical workflows.
Improved processing throughput via partition tuning, batching, and concurrency adjustments.
Integrated analytics pipelines using BigQuery for operational insights.
Strong understanding of system bottlenecks: CPU, I/O, memory, and network constraints.

Java Ecosystem • Kafka • Kubernetes • Azure SQL • BigQuery • Airflow • Distributed Systems • Concurrency • Performance Tuning

Current Focus: ML Infrastructure & GPU Performance

Building deeper expertise in low-level systems and GPU compute to transition into ML infrastructure and performance engineering roles. My focus is on understanding compute efficiency from the hardware layer upward — memory hierarchy, parallelism, and bottleneck analysis.

C++ & Systems Programming

Strengthening fundamentals in memory management, data layout, cache behavior, and multithreading.
Studying performance tradeoffs between abstraction and low-level control.
Exploring lock-free and concurrency-oriented design patterns.

CUDA & GPU Compute

Learning the CUDA execution model: threads, warps, blocks, and grids.
Understanding the GPU memory hierarchy: global, shared, and constant memory, plus registers.
Profiling workloads to tell memory-bandwidth-bound bottlenecks from compute-bound ones.

Triton & ML Systems

Exploring Triton kernel development for high-performance tensor operations.
Studying distributed training internals (DDP, NCCL, scaling patterns).
Building toward reproducible benchmarking and profiling-driven optimization workflows.

C++ • CUDA • Triton • GPU Profiling • Parallel Computing • Distributed ML Systems • Performance Engineering

Experience

Senior Software Engineer

Present

Top e-commerce company (US) — Order Fulfillment

Kafka • Kubernetes • Azure SQL • Airflow • BigQuery • Performance

Built and maintained distributed order fulfillment workflows using Kafka and Kubernetes in a high-volume environment.
Designed scalable event-driven processing pipelines resilient to seasonal peak traffic spikes (US sale events).
Improved throughput and reduced processing latency via partition tuning, async batching, and concurrency optimization.
Developed Airflow DAGs powering analytics and operational workflows via BigQuery.
Optimized Azure SQL queries and indexing strategies for order state transitions and operational reporting.
Implemented idempotent, retry-safe patterns to ensure correctness and reliability under high concurrency.

Software Engineer

Past

Kotak Cherry (Fintech) — Digital Investment Platform

Backend • Reliability • Performance • Observability • Fintech

Contributed to backend services powering a digital investment platform used by a six-figure registered user base.
Built and optimized portfolio aggregation and transaction-related workflows.
Improved API performance and query efficiency for latency-sensitive user flows.
Strengthened observability and production monitoring to improve reliability and incident response.

Technical focus

I’m targeting ML infrastructure and performance roles, leveraging my background in distributed systems. Current focus areas: profiling, benchmarking, GPU utilization, and scalable training/inference systems.

Distributed systems • Event streaming • K8s workloads • Workflow orchestration • Data systems • Performance optimization • ML infra (focus) • GPU efficiency (building)

Contact

Email LinkedIn GitHub