Public CV

Saif Ryan Gangaram

Data Scientist 2 at Los Alamos National Laboratory and Computer Science Ph.D. student at the University of New Mexico focused on high performance computing, AI/ML, Big Data, signal processing, and reliable scientific software.

Education

University of New Mexico, Department of Computer Science

Ph.D. in Computer Science, expected 2028

Focus: high performance computing, Big Data, AI/ML, signal processing, scientific data systems

Research advisor: Dr. Amanda Bienz

University at Buffalo, Institute for AI and Data Science

Master of Professional Studies in Data Sciences and Applications, 2023

University at Buffalo, College of Arts and Sciences

Bachelor of Arts in Mathematics, Computing and Applied Mathematics, 2021

Research Interests

Professional Profile

My work sits at the intersection of scientific computing, Big Data, data-intensive software engineering, high performance computing, applied machine learning, and signal analysis. I have built production-oriented research tools for large sensor and experimental datasets, database-backed analysis interfaces, HPC-enabled workflows, streaming inference pipelines, generative AI and deep learning models, and validation tooling for reproducible technical analysis. My modeling work spans supervised and unsupervised learning, clustering, regression, and large-scale training on terabyte-class datasets, with models of up to millions of parameters.

Experience

Data Scientist 2, Los Alamos National Laboratory

2026 - Present
  • Develop data acquisition, analysis, automation, and validation workflows for engineering test and scientific computing environments.
  • Apply statistical modeling, time-frequency analysis, denoising, stationarity testing, and signal characterization to high-volume technical datasets.
  • Build Python, C++, Dask, Slurm, NetCDF, and ML-enabled workflows for scalable analysis, batch execution, microbatched inference, and reproducible deployment.
  • Develop Big Data, generative AI, supervised, and unsupervised modeling workflows for clustering, regression, anomaly detection, and deep learning on terabyte-scale datasets.
  • Harden large-file processing pipelines with chunked execution, manifest tracking, integrity validation, runtime instrumentation, and atomic output publishing.
  • Optimize inference and data movement for large collections of multi-terabyte technical files where high throughput and low latency are core requirements.
  • Improve cross-environment portability through local/HPC execution paths, environment-driven configuration, and reproducible workflow documentation.

Software Engineer, Automation and AI/ML, Space Dynamics Laboratory

2023 - 2026
  • Developed scientific data pipelines, database-backed web tools, and automation workflows for aerospace and remote-sensing research settings.
  • Integrated PostgreSQL, SQLite, HDF5, NetCDF, Python, C#, C++, Fortran, MATLAB, Flask, JavaScript, and dashboard tooling into analysis workflows.
  • Built searchable data interfaces with visualization, selection, export, compression, schema inspection, and reproducible model-input generation features.
  • Applied machine learning, anomaly detection, clustering, regression, simulation, image processing, and verification methods to large sensor, atmospheric, and telemetry-style datasets.
  • Designed multithreaded communication and automation components using REST, TCP/UDP interfaces, synchronization primitives, and testable configuration workflows.
  • Collaborated with multidisciplinary teams using Git, CI/CD practices, technical documentation, requirements analysis, integration testing, and iterative stakeholder feedback.

Teaching Assistant, University at Buffalo

2019 - 2020
  • Supported computer science labs and recitations for more than 45 students.
  • Assisted with grading, exam administration, student mentoring, and occasional lecture support.

Selected Projects

Hyperion: Self-Hosted Agentic AI Operations Console

Developed a local-first, open-source agentic harness using Deno, TypeScript, vanilla JavaScript, and WebSockets, with no frontend framework, bundler, or required cloud dependency beyond model APIs. Completed Phase 2 with parallel OpenAI and Anthropic agent sessions, token-by-token streaming, live tool and error events, file context injection, persistent cross-session memory, AI-assisted email drafting with tone control, and tmux session management with command suggestions based on live pane output. Added a full mock mode so the interface remains explorable without API keys. Phase 3 is planned to add multi-user authentication, CalDAV integration, MCP server support, and persistent sessions.

github.com/srgangaram-swe/Hyperion

Conductor: Agentic AI Dashboard for Software Development

Developing a personal agentic AI dashboard tool for software development and workflow optimization. The project focuses on coordinating development tasks, surfacing project state, supporting debugging and documentation workflows, and helping developers move from intent to verified implementation.

Parallel Principal Component Analysis with MPI

Designed distributed PCA with rank-local sharding, global mean and covariance reductions, eigenpair broadcast, local projection, optional whitening, metadata logging, and scalability benchmarks.
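
The reduction pattern named above (shard-local statistics, a global mean, a global covariance, then local projection) can be illustrated with a minimal single-process sketch. This is not the project's code: the hypothetical `allreduce_sum` helper stands in for an MPI all-reduce, the shards are plain Python lists, and power iteration for only the leading eigenpair is an assumption chosen to keep the example dependency-free.

```python
import math

def allreduce_sum(per_rank):
    """Element-wise sum of equal-length lists, one per rank (stand-in for an MPI all-reduce)."""
    return [sum(vals) for vals in zip(*per_rank)]

def local_count_and_sums(shard, d):
    """Row count followed by the d column sums for one rank's shard."""
    return [float(len(shard))] + [sum(row[i] for row in shard) for i in range(d)]

def local_scatter(shard, mean, d):
    """Flattened d x d sum of centered outer products for one rank's shard."""
    out = [0.0] * (d * d)
    for row in shard:
        c = [row[i] - mean[i] for i in range(d)]
        for i in range(d):
            for j in range(d):
                out[i * d + j] += c[i] * c[j]
    return out

def distributed_covariance(shards):
    """Two reductions: one for the global mean, one for the global covariance."""
    d = len(shards[0][0])
    total = allreduce_sum([local_count_and_sums(s, d) for s in shards])
    n, mean = total[0], [v / total[0] for v in total[1:]]
    flat = allreduce_sum([local_scatter(s, mean, d) for s in shards])
    cov = [[flat[i * d + j] / (n - 1) for j in range(d)] for i in range(d)]
    return mean, cov

def leading_eigenpair(cov, iters=200):
    """Power iteration for the top eigenpair; assumes cov is symmetric and nonzero."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return eigval, v

def project(shard, mean, v):
    """Each rank projects only its own rows onto the shared component."""
    return [sum((row[i] - mean[i]) * v[i] for i in range(len(v))) for row in shard]

# Toy data on the line y = 2x, dealt round-robin to 3 hypothetical ranks.
data = [[float(t), 2.0 * t] for t in range(-10, 11)]
shards = [data[r::3] for r in range(3)]
mean, cov = distributed_covariance(shards)
eigval, vec = leading_eigenpair(cov)
scores = [project(s, mean, vec) for s in shards]
```

In a real MPI run, each rank would hold only its own shard, the two `allreduce_sum` calls would become collective reductions, and the eigenpairs would be computed once and broadcast.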

Large-File Scientific ML Inference

Built overlap-aware slab processing and microbatching for large scientific data, improving memory behavior, output validation, manifest tracking, and reproducible inference with Dask, PyTorch, and NetCDF-oriented data products. The workflow is designed for fast inference across large collections of multi-terabyte files, including workloads that may require near-real-time throughput.
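
As a rough illustration of overlap-aware slab processing, the sketch below reads each slab with a halo of extra samples on both sides, runs a neighborhood operation on the padded slab, and keeps only the core region. The function names are hypothetical, and a centered moving average stands in for model inference; the real workflow's slab sizes, halo widths, and Dask/PyTorch integration are not shown here.

```python
def slab_bounds(n, slab, halo):
    """Yield (read_start, read_stop, core_start, core_stop) for each slab along one axis."""
    for core_start in range(0, n, slab):
        core_stop = min(core_start + slab, n)
        yield max(core_start - halo, 0), min(core_stop + halo, n), core_start, core_stop

def moving_average(x, radius):
    """Centered moving average with windows truncated at the array edges."""
    out = []
    for i in range(len(x)):
        lo, hi = max(i - radius, 0), min(i + radius + 1, len(x))
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def process_in_slabs(x, slab, radius):
    """Process x slab by slab; the halo equals the operation's radius."""
    out = [0.0] * len(x)
    for r0, r1, c0, c1 in slab_bounds(len(x), slab, halo=radius):
        padded = moving_average(x[r0:r1], radius)   # only this slab is in memory
        out[c0:c1] = padded[c0 - r0 : c1 - r0]      # discard the halo, keep the core
    return out

signal = [float((7 * i) % 11) for i in range(53)]
slabbed = process_in_slabs(signal, slab=8, radius=3)
whole = moving_average(signal, 3)
```

With the halo at least as wide as the operation's footprint, the slab-wise result matches processing the whole array at once, which is the property that makes output validation against a reference run possible.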

Signal Processing and Workflow Reliability

Developed public-safe signal characterization workflows using spectral analysis, spectrograms, power estimates, stationarity tests, runtime tracing, and validation reports to improve confidence in noisy scientific data products.

Real-Time Human Emotion Recognition

Developed a CNN-based computer vision pipeline using public image datasets, augmentation, regularization, feature extraction, and real-time video inference.

Financial Risk Modeling

Developed probability-of-default models with gradient boosting, neural networks, feature engineering, Bayesian hyperparameter optimization, stress testing, and error analysis.

Scientific Database and Web Tooling

Designed database-backed research tools for search, plotting, export, schema inspection, data selection, and reproducible modeling workflows.

Quantitative Finance and Portfolio Optimization

Built a Python-based financial analysis system using portfolio objects, numerical analysis, and optimization methods to evaluate performance metrics and risk-aware allocation strategies.

Epidemiological SIR and Network Modeling

Implemented SIR and network-based epidemic models using public health datasets, numerical integration, and error analysis to study spread dynamics at regional scales.
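
For readers unfamiliar with the SIR model, a minimal sketch of its numerical integration follows. It assumes population fractions, a classical fourth-order Runge-Kutta step, and made-up values for the transmission rate `beta` and recovery rate `gamma`; none of these choices are taken from the project itself, and the network-based variant is not shown.

```python
def sir_rhs(s, i, r, beta, gamma):
    """Right-hand side of the SIR equations for population fractions."""
    new_infections = beta * s * i
    recoveries = gamma * i
    return -new_infections, new_infections - recoveries, recoveries

def rk4_step(state, dt, beta, gamma):
    """Advance (s, i, r) by one classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))
    k1 = sir_rhs(*state, beta, gamma)
    k2 = sir_rhs(*shifted(k1, dt / 2), beta, gamma)
    k3 = sir_rhs(*shifted(k2, dt / 2), beta, gamma)
    k4 = sir_rhs(*shifted(k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(s0, i0, beta, gamma, days, dt=0.1):
    """Return the list of (s, i, r) states from day 0 to `days`."""
    state = (s0, i0, 1.0 - s0 - i0)
    trajectory = [state]
    for _ in range(round(days / dt)):
        state = rk4_step(state, dt, beta, gamma)
        trajectory.append(state)
    return trajectory

# Illustrative parameters only: beta / gamma = 3 gives a growing outbreak.
trajectory = simulate(s0=0.99, i0=0.01, beta=0.3, gamma=0.1, days=160)
peak_infected = max(i for _, i, _ in trajectory)
```

A simple error check for a sketch like this is that s + i + r stays equal to 1 at every step, since the three right-hand sides sum to zero.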

Recognition and Presentations

Technical Skills

Languages: Python, C++, C#, SQL, Bash, R, Java, JavaScript, HTML, CSS, Julia, MATLAB, Scala, Fortran, TypeScript, Rust

ML and Data Science: PyTorch, TensorFlow, Keras, Scikit-learn, NumPy, Pandas, generative AI, supervised learning, unsupervised learning, clustering, regression, CNNs, U-Net-style segmentation, LSTMs, Transformers, anomaly detection, forecasting

HPC, Big Data, and Systems: MPI, Dask, Slurm, distributed workflows, scalable analytics, multithreading, synchronization primitives, socket programming, Docker, Linux, Windows, Git, CI/CD

Data and Visualization: PostgreSQL, SQLite, BigQuery, HDF5, NetCDF, TDMS-style scientific data, terabyte-scale datasets, distributed data processing, Matplotlib, Seaborn, Tableau, Grafana, technical reporting

Analysis: Statistical inference, regression, Bayesian analysis, time series, signal processing, spectral methods, numerical computation, validation, error analysis

Software Practices: Requirements analysis, integration testing, workflow hardening, documentation, reproducible batch execution, LLM-assisted development with human validation

Public Information Note

This public CV intentionally omits sensitive access details, restricted system names, operational parameters, non-public datasets, and program-specific details.