DMOSpeech 2 (Fork)

This repository is a fork of the original DMOSpeech2 repository. The original README has been renamed to original-README.md.

Prerequisites

uv - Python package installer and virtual environment manager

Setup

Activate the virtual environment:
```
source setup-source-me.sh
```
Run the setup script:
```
./scripts/setup.sh
```
Note: This downloads large model files (~500MB each) to the ckpts/ directory from HuggingFace.
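
To confirm the download finished, it can help to list what landed in ckpts/. The following is a minimal sketch that assumes only that the checkpoints are regular files somewhere under ckpts/. The function name is illustrative, and no specific checkpoint filenames are assumed.

```python
from pathlib import Path


def list_checkpoints(ckpt_dir="ckpts"):
    """Return (relative path, size in MB) for every file under ckpt_dir."""
    root = Path(ckpt_dir)
    if not root.is_dir():
        return []
    return sorted(
        (str(p.relative_to(root)), p.stat().st_size / 1e6)
        for p in root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    files = list_checkpoints()
    if not files:
        print("No files found in ckpts/; try re-running ./scripts/setup.sh")
    for name, size_mb in files:
        print(f"{name}: {size_mb:.1f} MB")
```

Files much smaller than the ~500MB noted above may indicate an interrupted download.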

Usage

HuggingFace Spaces Demo

Try the original DMOSpeech2 online without any setup:

HuggingFace Spaces (Original Repository): https://huggingface.co/spaces/yl4579/DMOSpeech2-demo

Google Colab (Cloud GPU)

For quick testing without local setup, use the Google Colab notebook:

- Open DOMSpeech2_gradio_colab_GPU.ipynb in Google Colab.
- Run all cells to set up the environment and launch the Gradio interface.
- Colab provides free GPU access for faster inference.

Local Development (Recommended)

For single-machine development and testing. Services bind to 127.0.0.1 (localhost only) for security.

FastAPI Server

```
python scripts/local-fastapi.py
```

Access API at: http://127.0.0.1:8000
API documentation: http://127.0.0.1:8000/docs
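
FastAPI applications serve a machine-readable schema at /openapi.json by default, alongside /docs. The sketch below lists a running server's routes from that schema. It assumes this fork has not disabled or moved the schema URL, and the helper names are illustrative.

```python
import json
import urllib.request


def routes_from_schema(schema):
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    return sorted(
        (method.upper(), path)
        for path, operations in schema.get("paths", {}).items()
        for method in operations
        if method in http_methods
    )


def fetch_routes(base_url="http://127.0.0.1:8000", timeout=10):
    """Download /openapi.json from a running server and list its routes."""
    url = f"{base_url}/openapi.json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return routes_from_schema(json.load(resp))


# Usage, with the FastAPI server already running:
#     for method, path in fetch_routes():
#         print(method, path)
```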

Gradio UI

```
python scripts/local-gradio.py
```

Access UI at: http://127.0.0.1:7860

Jupyter Lab

```
./scripts/jupyter-lab-local.sh
```

Access Jupyter at: http://127.0.0.1:8888

Jupyter Notebooks

Three notebook demos are available:

- src/serveDMO.ipynb - FastAPI demo
  - Run the cell to start the FastAPI server on port 8000
- src/gradio-test.ipynb - Gradio UI demo
  - Run the cell to start the Gradio interface on port 7860
- DOMSpeech2_gradio_colab_GPU.ipynb - Google Colab demo with GPU support
  - Run DMOSpeech2 in Google Colab with free GPU access
  - Includes all necessary setup and the Gradio interface

Remote Access (SSH Tunnels)

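The local scripts bind to 127.0.0.1 on the machine where they run. To reach them from a browser on another machine, use SSH port forwarding:

```
# Run on the machine with your browser; user@hostname is the machine running the services
ssh -L 7860:localhost:7860 -L 8000:localhost:8000 user@hostname

# Then open in the browser on that same machine:
# - Gradio UI: http://localhost:7860
# - FastAPI docs: http://localhost:8000/docs
```

Because the browser reaches the services at localhost, which browsers treat as a secure context, microphone access and the full UI work as they do locally, and SSH encrypts the traffic between the two machines.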
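
Before opening the browser, a quick check that the forwarded ports accept connections can save some guessing. This is a minimal sketch using only the standard library. The function names are illustrative, and the ports are the ones listed above.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost", ports=None):
    """Map each service name to whether its port accepts connections."""
    ports = ports or {"Gradio UI": 7860, "FastAPI": 8000}
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'reachable' if ok else 'not reachable'}")
```

A successful connection only shows that the local end of the tunnel is listening; the service on the other side may still be down, so a failed page load after a passing check points at the server rather than the tunnel.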

Network Access (Advanced)

⚠️ Security Warning: These scripts expose services to your local network. Only use on trusted networks behind firewalls.

FastAPI Server (Network)

```
python scripts/remote-fastapi.py
```

Access from any device on your network: http://YOUR_IP:8000

Gradio UI (Network)

```
python scripts/remote-gradio.py
```

Access from any device on your network: http://YOUR_IP:7860

Jupyter Lab (Network)

```
./scripts/jupyter-lab-remote.sh
```

Access from any device on your network: http://YOUR_IP:8888
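
If you are unsure what to use for YOUR_IP, the sketch below makes a best-effort guess at the server's LAN address. It asks the operating system which local interface it would route through; on machines with several interfaces or a VPN the answer may not be the address other devices can reach, so treat it as a starting point. The function name is illustrative.

```python
import socket


def lan_ip():
    """Best-effort guess at this machine's LAN IPv4 address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only selects the
        # local interface the OS would use to reach the given address.
        sock.connect(("192.0.2.1", 80))  # documentation-only address (TEST-NET-1)
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no usable route; fall back to loopback
    finally:
        sock.close()


if __name__ == "__main__":
    ip = lan_ip()
    print(f"FastAPI:    http://{ip}:8000")
    print(f"Gradio UI:  http://{ip}:7860")
    print(f"Jupyter:    http://{ip}:8888")
```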

API Usage Examples

REST API Example

```
# Initialize voice with reference audio
curl -X POST "http://127.0.0.1:8000/init_voice" \
  -F "audio_file=@reference.wav" \
  -F "reference_text=Your reference text here"

# Generate speech from text
curl -X POST "http://127.0.0.1:8000/generate_audio" \
  -F "target_text=This is the text I want synthesized." \
  --output generated_audio.wav
```

For network access, start the server with scripts/remote-fastapi.py and replace 127.0.0.1 with your server's IP address.
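
The same two calls can be made from Python. The sketch below mirrors the curl commands using only the standard library: the endpoint paths and form field names come from the example above, while the helper names, the timeout, and the assumption that /generate_audio returns raw audio bytes (as the curl --output flag suggests) are illustrative.

```python
import urllib.request
import uuid

BASE_URL = "http://127.0.0.1:8000"


def encode_multipart(fields, files=None):
    """Build a multipart/form-data body; return (body_bytes, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    for name, (filename, data) in (files or {}).items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def post_form(path, fields, files=None, base_url=BASE_URL, timeout=300):
    """POST form fields (and optional files) and return the response bytes."""
    body, content_type = encode_multipart(fields, files)
    request = urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def synthesize(reference_wav, reference_text, target_text,
               out_path="generated_audio.wav"):
    """Register a reference voice, then synthesize target_text to out_path."""
    with open(reference_wav, "rb") as fh:
        audio = fh.read()
    post_form("/init_voice", {"reference_text": reference_text},
              {"audio_file": ("reference.wav", audio)})
    wav_bytes = post_form("/generate_audio", {"target_text": target_text})
    with open(out_path, "wb") as fh:
        fh.write(wav_bytes)
    return out_path


# Usage, with the FastAPI server already running:
#     synthesize("reference.wav", "Your reference text here",
#                "This is the text I want synthesized.")
```

If the requests package is available in your environment, its files= and data= arguments can replace the hand-built multipart body.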

Security Considerations

Local Development (127.0.0.1)

✅ Secure: Services only accessible from the same machine
✅ Recommended: For development and testing
✅ Safe: No network exposure

SSH Tunnels

✅ Secure: Encrypted connection to remote services
✅ Flexible: Access remote services as if they were local
✅ Best Practice: For remote access to development servers

Network Access (0.0.0.0)

⚠️ Caution Required: Exposes services to local network
⚠️ Firewall Needed: Ensure proper network security
⚠️ No Authentication: Services have no built-in security
⚠️ HTTP Only: No encryption (consider HTTPS for production)

Production Deployment

For production use, consider:

- HTTPS/SSL certificates
- Authentication and authorization
- Rate limiting and monitoring
- Reverse proxy (nginx, Apache)
- Network security hardening

Troubleshooting

Common Issues

- Port already in use: Change port numbers in scripts if conflicts occur
- Permission denied: Ensure scripts are executable (chmod +x scripts/*.sh)
- Module not found: Verify virtual environment is activated
- CUDA errors: Check GPU availability and PyTorch installation
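
The last two checks can be run in one go. This is a minimal diagnostic sketch, not part of the repository: it reports whether a virtual environment is active, whether PyTorch can be found, and whether PyTorch sees a CUDA device. The function name and the report keys are illustrative.

```python
import importlib.util
import sys


def environment_report():
    """Collect basic facts that explain common startup failures."""
    report = {
        "python": sys.version.split()[0],
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "venv_active": sys.prefix != sys.base_prefix,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
        if report["cuda_available"]:
            report["gpu"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If venv_active is False, source setup-source-me.sh again before starting any of the scripts.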

Getting Help

- Check the original documentation in original-README.md
- Review error logs for specific issues
- Ensure all prerequisites are installed

Acknowledgments

Original DMOSpeech2 repository: DMOSpeech2
Additional codebase references: F5-TTS, DMD2, simple_GRPO

This fork aims to provide enhanced ease-of-use and seamless integration of DMOSpeech2 into broader workflows and user interfaces.
