svarshneysjsu/README.md

Hi there! I'm Saumya Varshney 👋

🎓 Master's in Applied Data Science | San Jose State University
📍 San Jose, California


About Me

I'm a passionate AI Engineer and Data Scientist currently working on real-world applications of Large Language Models (LLMs), Generative AI, and multi-agent systems. I specialize in designing cloud-based ML pipelines, building autonomous AI agents, and experimenting with Chain-of-Thought (CoT) and Chain-of-Draft (CoD) prompting techniques for enhanced reasoning in LLMs.

My work bridges the gap between cutting-edge research and scalable systems, whether it's building decision-making bots, fine-tuning transformer models, or enabling intelligent data pipelines on AWS and GCP.


🛠️ Technical Skills

🤖 AI & LLMs

  • LLMs & Agentic AI: GPT-4, Claude, OpenAI API, LangChain, CrewAI, Retrieval-Augmented Generation (RAG)
  • NLP & Transformers: Hugging Face Transformers, BERTScore, Generative AI, NLP, Explainability & Interpretability

📈 Machine Learning & Data Science

  • Core ML: Scikit-learn, TensorFlow, PyTorch, Predictive Modeling, Anomaly Detection, Reinforcement Learning, Time-Series Forecasting
  • Data Science: Feature Engineering, A/B Testing, Model Evaluation, Statistical Analysis, KPI Reporting

🧪 Data Engineering & ETL

  • ETL & Pipelines: Apache Airflow, AWS Glue, PySpark
  • Databases: MySQL, MongoDB, Google BigQuery, AWS RDS, Redshift
  • Query Languages: SQL, NoSQL

☁️ Cloud & MLOps

  • Cloud Platforms:
    • 🟧 AWS: S3, Lambda, Glue, RDS, Redshift, Step Functions
    • 🟦 GCP: Vertex AI Workflows, Cloud Functions
  • MLOps & DevOps: Docker, Git, GitHub Actions, CI/CD workflows

💻 Programming & Frameworks

  • Languages: Python, SQL, Bash, PowerShell
  • Frameworks: Flask, FastAPI

📊 Visualization & BI Tools

  • Tableau, Power BI, Microsoft Excel, Google Sheets, Google Apps Script

🚀 Deployment & Interfaces

  • Gradio, Hugging Face Model Hub & Spaces (ZeroGPU), REST APIs

🚀 Featured Projects

1. Go2Bot OpenAI Integration

Description:
Integrates the Unitree Go2 robot with OpenAI, developed during a summer research project. Features voice command processing and AI-driven task execution.

Technologies Used:
Python, OpenAI API, Robotics Integration

GitHub Repository:
Go2Bot-OpenAI-Integration


2. AWS-Enabled Data Pipeline for Weather Data Analysis

Description:
Developed an AWS-based data pipeline for real-time weather data analysis. The system automates ingestion, processing, storage, and analysis of NOAA datasets.

Technologies Used:
AWS (S3, Lambda, Glue, EC2), Python, Apache Airflow, Pandas, Matplotlib

GitHub Repository:
AWS-Enabled-Data-Pipeline-for-Weather-Data-Analysis


3. Paraphrase Detection

Description:
Developed a machine learning model that detects whether two sentences are paraphrases of each other, a building block for NLP applications that rely on text similarity.

Technologies Used:
Python, NLP, Scikit-Learn

GitHub Repository:
Paraphrase-Detection-with-Quora-Question-Pairs


📫 Let's Connect!


Thanks for visiting!

Pinned

  1. Go2Bot-OpenAI-Integration (Public)

    This repository showcases the integration of the Unitree Go2 robot with OpenAI, developed during a summer research project. It features voice command processing and AI-driven task execution. Built…

    Python · 8 stars · 1 fork

  2. AWS-Enabled-Data-Pipeline-for-Weather-Data-Analysis (Public)

    The "AWS-Enabled Data Pipeline for Weather Data Analysis" project is a sophisticated solution designed to streamline the collection, processing, and analysis of vast datasets related to weather pat…

    Jupyter Notebook

  3. Paraphrase-Detection-with-Quora-Question-Pairs (Public)

    In this project, an LSTM model is used to determine whether two Quora questions are similar. fastText and GloVe word embeddings are used to train the model.

    Jupyter Notebook

  4. VTA-Ridership-Forecast (Public)

    This project uses machine learning techniques to forecast ridership for the Valley Transportation Authority (VTA).

    Jupyter Notebook

  5. LinkedIn_JobPostingAnalysis_NoSQL (Public)

    LinkedIn_JobPostingAnalysis_Using_NoSQL

    Python

  6. Spotify-Data-Analysis-And-Visualization (Public)

    Forked from SreenidhiHayagreevan/Spotify-Data-Analysis-And-Visualization

    Jupyter Notebook