Updated: 14 Feb 2025
In our latest blog, we explore Zyphra Zonos, an innovative open-source text-to-speech (TTS) model suite developed by Zyphra. Featuring both a transformer model and the first open-source state-space model (SSM) hybrid for TTS, Zyphra Zonos delivers impressive performance with reduced latency and memory usage. We’ll guide you through deploying Zyphra Zonos on Hyperstack, from setting up your virtual machine to optimising model performance.
What is Zyphra Zonos?
Zyphra Zonos is a text-to-speech (TTS) model suite developed by Zyphra, an AI company based in Palo Alto, California. The suite includes two 1.6-billion-parameter models: a transformer model and a state-space model (SSM) hybrid. The SSM hybrid is the first open-source SSM model available for TTS. Both models are released under the permissive Apache 2.0 license.
The Zonos-v0.1 models are trained using a straightforward autoregressive approach, aiming to predict a sequence of audio tokens based on provided text and audio tokens. These audio tokens are generated from raw speech waveforms using the Descript Audio Codec (DAC) autoencoder.
Compared to the transformer model, the Zyphra Zonos hybrid model offers lower latency and reduced memory usage. This is due to its Mamba2-based architecture, which minimises reliance on attention blocks.
How to Use Zyphra Zonos on Hyperstack
Let's walk through the step-by-step process to deploy Zyphra Zonos on Hyperstack:
Step 1: Accessing Hyperstack
- Visit the Hyperstack website and log in to your account.
- If you don't already have an account, you'll need to create one and set up your billing information. Check our documentation to get started with Hyperstack.
- Once you log in, you'll see the Hyperstack dashboard, which provides an overview of your resources and deployments.
Step 2: Deploying a New Virtual Machine
Initiate Deployment
- Navigate to the "Virtual Machines" section.
- Click "Deploy New Virtual Machine" to start the deployment process.
Select Hardware Configuration
- For regular usage, we recommend choosing the NVIDIA RTX A4000 x 1.
Choose the Operating System
- Select the "Ubuntu Server 22.04 LTS R550 CUDA 12.4 with Docker".
- This image comes pre-installed with Ubuntu 22.04 LTS, NVIDIA drivers (R550) and CUDA 12.4 with Docker, providing an optimised environment for AI workloads.
Select a Keypair
- Select one of the keypairs in your account. If you don't have a keypair yet, see our Getting Started tutorial for creating one.
Network Configuration
- Ensure you assign a Public IP to your virtual machine.
- This allows you to access your VM from the internet, which is crucial for remote management and API access.
Enable SSH Access
- Make sure to enable an SSH connection.
- You'll need this to connect and manage your VM securely.
Add Firewall Rules
- Open port "7860" to allow incoming traffic on this port.
Please note: This will open the port to the public internet, allowing anyone with the public IP address and port number to access the Web UI.
Configure Additional Settings
- Look for the "Configure Additional Settings" section and click on it.
- Here, you'll find a field for cloud-init scripts. Use the provided cloud-init script for deployment and ensure it is in bash syntax. This script is pre-configured to deploy the Zyphra Zonos model by default, which is optimised for speed. Click here to get the cloud-init script!
Review and Deploy the Script
- Double-check all your settings.
- Paste the cloud-init script into the initialisation section when deploying your VM. The script will automatically install the necessary dependencies, clone the repository, and set up the environment.
- Click the "Deploy" button to launch your virtual machine.
Step 3: Setting Up the Model
- The cloud-init script takes approximately 5 minutes to install libraries, download the models and set up the UI.
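Rather than guessing when setup has finished, you can poll the Web UI port from your local machine. The sketch below is a minimal example, not part of the official deployment: the function name `wait_for_port`, the default of 30 attempts and the 10-second delay are all assumptions, and it relies on `bash` and `timeout` being available locally.

```shell
# Poll a TCP port until it accepts connections or the attempts run out.
# Usage: wait_for_port HOST PORT [MAX_ATTEMPTS] [DELAY_SECONDS]
# (hypothetical helper; the defaults below are assumptions, not Hyperstack values)
wait_for_port() {
  local host="$1" port="$2" max_attempts="${3:-30}" delay="${4:-10}"
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # bash's /dev/tcp pseudo-device opens a TCP connection; timeout caps each try at 3s
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
      echo "Port ${port} on ${host} is reachable (attempt ${attempt})."
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "Port ${port} on ${host} not reachable after ${max_attempts} attempts." >&2
  return 1
}
```

For example, `wait_for_port 203.0.113.10 7860` checks roughly every 10 seconds; replace the placeholder address `203.0.113.10` with your VM's public IP. A reachable port only shows that something is listening, so the UI may still need a moment to finish loading the models.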
Step 4: Accessing Your VM
Once the initialisation is complete, you can access your VM:
Locate SSH Details
- In the Hyperstack dashboard, find your VM's details.
- Look for the public IP address, which you will need to connect to your VM with SSH.
Connect via SSH
- Open a terminal on your local machine.
- Use the command ssh -i [path_to_ssh_key] [os_username]@[vm_ip_address] (e.g. ssh -i /users/username/downloads/keypair_hyperstack ubuntu@0.0.0.0)
- Replace [path_to_ssh_key], [os_username] and [vm_ip_address] with the path to your keypair file and the details provided by Hyperstack.
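A common stumbling block at this step is key file permissions: OpenSSH ignores a private key that other users on your machine can read. The sketch below shows one way to check and tighten the key before connecting. The helper name `prepare_key` is hypothetical, and the path and address in the usage comment are placeholders.

```shell
# Verify that the private key exists and restrict it to the owner.
# Usage: prepare_key KEY_PATH
# (hypothetical helper for illustration)
prepare_key() {
  local key="$1"
  if [ ! -f "$key" ]; then
    echo "Key file not found: $key" >&2
    return 1
  fi
  # 600 = read/write for the owner only; ssh rejects keys readable by others
  chmod 600 "$key"
}

# Example (placeholder path and address):
#   prepare_key ~/Downloads/keypair_hyperstack && \
#     ssh -i ~/Downloads/keypair_hyperstack ubuntu@203.0.113.10
```

Chaining with `&&` means the `ssh` command only runs if the key was found and its permissions were set.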
Interacting with Zyphra Zonos
Once the deployment is complete, access the Zyphra Zonos Web UI by navigating to [public-ip]:7860 in your web browser to generate speech from text.
Please note: Anyone who has this address can access the Web UI. To restrict access, you can disable the Public IP and use SSH port forwarding instead:
- Disable the Public IP.
- Use SSH port forwarding together with your keypair, with this command:
ssh -i [path_to_ssh_key] -L 7860:localhost:7860 [os_username]@[vm_ip_address] # e.g: ssh -i /users/username/downloads/keypair_hyperstack -L 7860:localhost:7860 ubuntu@0.0.0.0
- After running the above command, access the demo at localhost:7860 in your browser.
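If local port 7860 is already in use, or you want a tunnel that does not open a remote shell, the forwarding command can be adjusted. The sketch below only prints the command so you can review it before running it. The function name `tunnel_command` and its argument order are assumptions for illustration, and it assumes the key path contains no spaces. `-N` and `-L` are standard OpenSSH options.

```shell
# Print an ssh port-forwarding command for the Web UI.
# Usage: tunnel_command KEY_PATH USERNAME VM_IP [REMOTE_PORT] [LOCAL_PORT]
# (hypothetical helper; REMOTE_PORT defaults to 7860, LOCAL_PORT to REMOTE_PORT)
tunnel_command() {
  local key="$1" user="$2" ip="$3" remote_port="${4:-7860}"
  local local_port="${5:-$remote_port}"
  # -N: do not run a remote command, just forward
  # -L: forward LOCAL_PORT on this machine to localhost:REMOTE_PORT on the VM
  printf 'ssh -i %s -N -L %s:localhost:%s %s@%s\n' \
    "$key" "$local_port" "$remote_port" "$user" "$ip"
}
```

For example, `tunnel_command ~/keypair_hyperstack ubuntu 203.0.113.10 7860 8080` prints a command that exposes the Web UI at localhost:8080 (the address is a placeholder). With `-N` the session stays in the foreground without a shell, and pressing Ctrl+C closes the tunnel.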
Troubleshooting Tips for Zyphra Zonos on Hyperstack
If you run into any issues, follow these steps:
- SSH into your machine.
- Run this command to see logs:
cat /var/log/cloud-init-output.log
- Debug any issues you see there.
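The cloud-init log can be long, so it may help to filter it for lines that look like failures first. The sketch below is one way to do that on the VM. The function name `show_log_errors` and the search pattern `error|failed|traceback` are assumptions, and the pattern may miss problems that are worded differently.

```shell
# Print log lines that look like failures, prefixed with their line numbers.
# Usage: show_log_errors [LOGFILE]
# (hypothetical helper; the search pattern is an assumption)
show_log_errors() {
  local logfile="${1:-/var/log/cloud-init-output.log}"
  if [ ! -r "$logfile" ]; then
    echo "Cannot read $logfile" >&2
    return 1
  fi
  # -i: case-insensitive, -n: show line numbers, -E: extended regular expression
  grep -inE 'error|failed|traceback' "$logfile" || echo "No matching lines in $logfile"
}
```

Calling `show_log_errors` with no argument reads the default cloud-init output log. While the script is still running, `tail -f /var/log/cloud-init-output.log` follows the log as new lines are written.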
Step 5: Hibernating Your VM
When you're finished with your current workload, you can hibernate your VM to avoid incurring unnecessary costs:
- In the Hyperstack dashboard, locate your virtual machine.
- Look for a "Hibernate" option.
- Click to hibernate the VM, which will stop billing for compute resources while preserving your setup.
To continue your work without repeating the setup process:
- Return to the Hyperstack dashboard and find your hibernated VM.
- Select the "Resume" or "Start" option.
- Wait a few moments for the VM to become active.
- Reconnect via SSH using the same credentials as before.
FAQs
What is Zyphra Zonos?
Zyphra Zonos is an open-source text-to-speech (TTS) model suite featuring a transformer model and a state-space model (SSM) hybrid. It offers efficient, high-quality TTS performance with reduced latency and memory usage.
What makes the Zyphra Zonos hybrid model unique?
The hybrid model leverages a Mamba2-based architecture, which reduces dependency on attention blocks, resulting in faster performance and lower memory consumption compared to traditional transformer models.
How are Zyphra Zonos models trained?
Zonos-v0.1 models are trained using an autoregressive approach to predict audio token sequences from text and audio tokens, using the Descript Audio Codec (DAC) autoencoder for converting raw speech waveforms into tokens.
What GPU should I choose on Hyperstack to run Zyphra Zonos?
For efficient performance with Zyphra Zonos, we recommend the NVIDIA RTX A4000 x1, which offers a balance of power and cost-effectiveness for TTS workloads.
How do I access the Zyphra Zonos Web UI on Hyperstack?
After deploying your VM and setting up Zyphra Zonos, access the Web UI by visiting [your_public_ip]:7860 in your browser. For secure access, use SSH port forwarding to localhost:7860.