Computational astronomy research investigating galaxy evolution, cosmic large-scale structure, and quasar physics using DESI and modern spectroscopic surveys
This organization produces research outputs in astronomy and data science, building Analysis-Ready Datasets (ARDs) from large public sources. The methodology was validated through the Steam Dataset 2025, a multi-modal gaming analytics ARD with strong engagement on both Kaggle and Zenodo, and is now being applied to DESI DR1 spectroscopic data.
Current work spans galaxy evolution in different cosmic environments, AGN feedback mechanisms, and ML-driven spectral analysis. The research runs on purpose-built infrastructure that enables reproducibility at scale.
This organization benefits from open source programs that provide tooling to qualifying public repositories. These sponsorships aren't just logos — they enable workflows that would otherwise be impractical for an independent research operation.
| Sponsor | Provides | Impact |
|---|---|---|
| Greptile | AI code review | PR review on every commit, enforcing git discipline across all repos |
| Atlassian | Jira, Confluence (Standard) | Project tracking, milestone management, documentation |
| Snyk | Security scanning | Dependency vulnerability detection across the organization |
The infrastructure foundation for all research workloads: a 7-node Proxmox cluster with 144 cores and 700GB+ of total RAM, including a dedicated GPU node with an NVIDIA RTX A4000 16GB. The repository documents the cluster, VM inventory, network architecture, and automation patterns. This is the platform that enables reproducible, scalable research across all projects.
Analyzing galaxy populations within cosmic voids using DESI Data Release 1 to investigate environmental quenching mechanisms. This project serves as the Analysis-Ready Dataset (ARD) factory for the organization, joining 9 Value-Added Catalogs into enriched data products that feed downstream research.
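As a rough illustration of what "galaxies within cosmic voids" can mean in practice, the sketch below flags galaxies that fall inside at least one void. It assumes voids are simplified to spheres with a center and radius in the same comoving coordinates as the galaxies; the function name `in_any_void` and all values are hypothetical, and the project's actual membership logic may differ.

```python
import math


def in_any_void(position, voids):
    """Return True if a 3D position lies inside at least one spherical void.

    position: (x, y, z) tuple in comoving coordinates.
    voids: iterable of (cx, cy, cz, radius) tuples in the same units.
    Assumption: voids are modeled as simple spheres.
    """
    return any(
        math.dist(position, (cx, cy, cz)) < radius
        for cx, cy, cz, radius in voids
    )


# Hypothetical toy values, not real catalog entries.
voids = [(0.0, 0.0, 0.0, 15.0), (100.0, 0.0, 0.0, 20.0)]
galaxies = {
    "g1": (5.0, 5.0, 5.0),
    "g2": (50.0, 0.0, 0.0),
    "g3": (110.0, 5.0, 0.0),
}

# One boolean void-membership flag per galaxy.
flags = {name: in_any_void(pos, voids) for name, pos in galaxies.items()}
```

A flag like this is the kind of derived column that could be attached to each galaxy before comparing void and non-void populations.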
Investigating AGN-driven outflows through semi-automated spectral fitting combined with Cloudy photoionization modeling. Developing automated pipelines to identify and characterize outflows in massive spectroscopic datasets.
ML-based anomaly detection across millions of quasar spectra. Implementing 1D convolutional variational autoencoders on Ray clusters to identify statistically unusual objects that may represent new physics or rare phenomena.
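The autoencoder itself is beyond a short snippet, but the final ranking step can be sketched. The example below assumes each spectrum already has a scalar reconstruction error from a trained model and flags unusually high errors with a robust z-score (median and MAD). This scoring rule, the function name `flag_anomalies`, and the threshold are illustrative assumptions, not the project's documented method.

```python
import statistics


def flag_anomalies(errors, threshold=3.5):
    """Return indices whose reconstruction error is an upper outlier.

    Uses a robust z-score: 0.6745 * (error - median) / MAD.
    Only unusually HIGH errors are flagged, since a poorly
    reconstructed spectrum is the candidate anomaly.
    """
    med = statistics.median(errors)
    mad = statistics.median([abs(e - med) for e in errors])
    if mad == 0:
        # All errors are (nearly) identical; nothing stands out.
        return []
    return [
        i for i, e in enumerate(errors)
        if 0.6745 * (e - med) / mad > threshold
    ]


# Hypothetical per-spectrum reconstruction errors.
errors = [0.10, 0.12, 0.11, 0.09, 0.13, 0.95, 0.10]
anomalous = flag_anomalies(errors)
```

Median-based statistics are used here because a handful of extreme spectra would otherwise inflate a mean and standard deviation and mask themselves.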
Independent validation and reanalysis of the RBH-1 hypervelocity SMBH candidate (van Dokkum et al. 2025) using Bayesian inference and GPU-accelerated computing.
Control plane that meshes VS Code Server, MetaMCP, and AI agents (Claude/Gemini) with audit-first ops. Centralizes agent orchestration and provides unified tooling across the research environment.
View Repository → (Coming soon)
Specification and methodology for building Analysis-Ready Datasets (ARDs) — pre-computed, enriched data products that eliminate repetitive preprocessing and enable immediate analysis. Domain-agnostic framework with reference implementations.
A Federated Knowledge Core for astronomical research — decoupling semantic meaning from structural relationships to enable expert-level RAG and autonomous Deep Research agents.
Tested recipes for Claude skills and hooks — methodology documentation, failure modes, and honest assessments. Not another awesome-list.
A collection of Docker Compose files for home lab use, aimed at learning IT technologies.
AI Model Wiki website presenting structured model card data for 160+ AI models, built with Astro, Tailwind, and TypeScript and hosted on Azure Static Web Apps.
Repository standardizing the structure and layout for all repositories in the RadioAstronomy.io GitHub organization.
2026 project sandbox covering AI, ML, agentic coding, RAG systems, cloud infrastructure, and the occasional side project. A space for experimentation and skill development across the full technology stack.
Grid Defense RL is a custom Gymnasium environment designed for training and visualizing reinforcement learning agents. A PPO agent via Stable-Baselines3 learns to place defensive walls on a 13×9 grid to block enemies moving toward a core.
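To make the wall-placement objective concrete, here is a minimal sketch of a reachability check on a 13×9 grid: a breadth-first search that reports whether an enemy can still reach the core given a set of wall cells. The orientation (13 wide, 9 tall), 4-connected movement, and the `spawn` and `core` positions are assumptions for illustration; this does not reproduce the actual environment or its reward logic.

```python
from collections import deque

WIDTH, HEIGHT = 13, 9  # assumed orientation of the 13x9 grid


def path_exists(walls, start, goal):
    """Breadth-first search over 4-connected cells, avoiding walls."""
    if start in walls or goal in walls:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        if (x, y) == goal:
            return True
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            in_bounds = 0 <= nxt[0] < WIDTH and 0 <= nxt[1] < HEIGHT
            if in_bounds and nxt not in walls and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical enemy spawn and core positions.
spawn, core = (0, 4), (12, 4)

open_grid = path_exists(set(), spawn, core)

# A full column of walls seals the grid; removing one cell reopens it.
full_wall = {(6, y) for y in range(HEIGHT)}
blocked = not path_exists(full_wall, spawn, core)
leaky = path_exists(full_wall - {(6, 0)}, spawn, core)
```

A check like this could serve as a simple diagnostic for whether a learned wall layout actually seals off the core.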
Our research consumes DESI Data Release 1 Value-Added Catalogs, materialized through PostgreSQL and distributed as Parquet files.
| VAC | Purpose | Scale |
|---|---|---|
| FastSpecFit | Stellar continuum modeling, emission line fluxes | 6.4M galaxies |
| PROVABGS | Bayesian SED fitting, stellar mass, SFH | BGS sample |
| DESIVAST | Void classifications (4 algorithms) | ~10.7K voids |
| Gfinder | Group catalog, halo mass estimates | Group members |
| AGN/QSO | Systemic redshifts, BAL flags, spectral classification | 1.4M QSOs |
| CIV Absorber | Intervening CIV absorption systems | Absorber catalog |
| MgII Absorber | Intervening MgII absorption systems | Absorber catalog |
| QMassIron | Black hole masses, bolometric luminosity | QSO subset |
| Stellar Mass/EmLine | CIGALE stellar masses, emission line properties | Full sample |
PostgreSQL serves as the materialization engine where VAC joins and derived computations occur. Final ARD products are exported to Parquet for distribution and analysis. The pipeline currently manages ~32GB of catalog data in PostgreSQL and ~108GB of spectral tiles in Parquet format.
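The join-then-materialize pattern described above can be sketched as follows. To stay self-contained, the example uses Python's built-in `sqlite3` as a stand-in for PostgreSQL; the table names, column names, and values are hypothetical. It assumes catalogs share the DESI `TARGETID` key and uses a left join so that objects missing from a smaller catalog are kept with nulls.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL materialization engine.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, heavily simplified stand-ins for two VACs.
    CREATE TABLE fastspecfit (targetid INTEGER PRIMARY KEY, halpha_flux REAL);
    CREATE TABLE provabgs (targetid INTEGER PRIMARY KEY, log_mstar REAL);

    INSERT INTO fastspecfit VALUES (1, 12.5), (2, 3.1), (3, 0.4);
    INSERT INTO provabgs VALUES (1, 10.2), (3, 9.7);

    -- Materialize the joined product once so downstream queries
    -- do not repeat the join.
    CREATE TABLE ard AS
    SELECT f.targetid, f.halpha_flux, p.log_mstar
    FROM fastspecfit AS f
    LEFT JOIN provabgs AS p ON f.targetid = p.targetid;
    """
)

rows = conn.execute(
    "SELECT targetid, halpha_flux, log_mstar FROM ard ORDER BY targetid"
).fetchall()
conn.close()
```

In a real pipeline the materialized table would then be exported to Parquet; that step is omitted here because it requires a third-party library.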
Production research platform running on a 7-node Proxmox cluster built from small form factor enterprise workstations.
| Resource | Value |
|---|---|
| Nodes | 7 |
| Total Cores | 144 |
| Total RAM | 704 GB |
| Total NVMe | 26 TB |
| Network Fabric | 10G LACP per node |
| GPU | RTX A4000 16GB |
We practice open science and open methodology — our version of "showing your work":
- Research methodologies are fully documented and repeatable
- Infrastructure configurations are version-controlled and automated
- Scripts and pipelines are published so others can learn, adapt, or improve them
- Learning processes are captured and shared for community benefit
Our hope is that these materials help someone facing similar challenges, or inspire collaboration that helps us. All projects operate under open source licenses (primarily MIT) to ensure maximum reproducibility.
Projects in this organization are licensed under MIT unless otherwise specified.
Computational astronomy research through open data, reproducible workflows, and enterprise infrastructure
















