Deploying and Using Pixtral Large Instruct 2411 on Hyperstack: A Quick Start Guide

Written by Sebastian Panman de Wit | Nov 20, 2024 2:10:24 PM

Mistral AI’s Pixtral Large is a cutting-edge 123B multimodal model that excels in image and text understanding. With a 128K context window capable of fitting over 30 high-res images, it outperforms competitors on benchmarks like MathVista, DocVQA and VQAv2. Pixtral Large surpasses GPT-4o and Gemini-1.5 Pro in complex visual reasoning and real-world multimodal tasks. The model is available for both research and commercial use.

To get started, read our latest tutorial below and deploy Pixtral Large Instruct 2411 on Hyperstack.

Why Deploy on Hyperstack?

Hyperstack is a cloud platform designed to accelerate AI and machine learning workloads. Here's why it's an excellent choice for deploying Pixtral Large Instruct 2411:

Availability: Hyperstack provides on-demand access to powerful GPUs such as the NVIDIA A100 and the NVIDIA H100 SXM, which are well suited to running large language models.
Ease of Deployment: With pre-configured environments and one-click deployments, setting up complex AI models becomes significantly simpler on our platform.
Scalability: You can easily scale your resources up or down based on your computational needs.
Cost-Effectiveness: You pay only for the resources you use with our cost-effective cloud GPU pricing.
Integration Capabilities: Hyperstack provides easy integration with popular AI frameworks and tools.

Deployment Process

Now, let's walk through the step-by-step process of deploying Pixtral Large Instruct 2411 on Hyperstack.

Step 1: Accessing Hyperstack

Go to the Hyperstack website and log in to your account.
If you're new to Hyperstack, you'll need to create an account and set up your billing information. Check our documentation to get started with Hyperstack.
Once logged in, you'll be greeted by the Hyperstack dashboard, which provides an overview of your resources and deployments.

Step 2: Deploying a New Virtual Machine

Initiate Deployment

Look for the "Deploy New Virtual Machine" button on the dashboard.
Click it to start the deployment process.

Select Hardware Configuration

In the hardware options, choose the "4xA100-80G-PCIe" flavour.
This configuration provides 4 NVIDIA A100 GPUs with 80GB memory each, connected via PCIe, offering exceptional performance for running Pixtral Large Instruct 2411.
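
As a rough sanity check on why four 80GB GPUs are a sensible fit, the arithmetic below compares the approximate size of the model weights with the total GPU memory of this flavour. It is a back-of-the-envelope sketch only: the 2 bytes per parameter figure assumes 16-bit (bf16/fp16) weights, which this guide does not state, and the remaining headroom has to cover the KV cache, activations and image inputs.

```shell
#!/usr/bin/env bash
# Back-of-the-envelope memory estimate (approximate figures only).
PARAMS_B=123        # parameters in billions, from the model description
BYTES_PER_PARAM=2   # ASSUMPTION: 16-bit (bf16/fp16) weights
GPUS=4              # GPUs in the 4xA100-80G-PCIe flavour
GPU_MEM_GB=80       # memory per GPU in GB

# 1 billion parameters at 2 bytes each is roughly 2 GB
weights_gb=$(( PARAMS_B * BYTES_PER_PARAM ))
total_gb=$(( GPUS * GPU_MEM_GB ))
headroom_gb=$(( total_gb - weights_gb ))

echo "weights: ~${weights_gb} GB, GPU memory: ${total_gb} GB, headroom: ~${headroom_gb} GB"
```

Under these assumptions the weights take roughly 246 GB of the 320 GB available, leaving about 74 GB for everything else.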

Choose the Operating System

Select the "Server 22.04 LTS R535 CUDA 12.2 with Docker" image.
This image comes with Ubuntu 22.04 LTS, NVIDIA drivers (R535), CUDA 12.2 and Docker pre-installed, providing an optimised environment for AI workloads.
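
Once the VM is running and you are connected to it (see Step 4), a quick check along these lines can confirm that the tools the image is described as shipping are actually on the PATH. The helper name check_tool is made up for this sketch; the nvidia-smi query flags and docker --version are standard options of those tools.

```shell
#!/usr/bin/env bash
# check_tool is a hypothetical helper: report whether a command is on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# List each GPU with its total memory, if the NVIDIA driver tools are present.
if check_tool nvidia-smi; then
    nvidia-smi --query-gpu=name,memory.total --format=csv
fi

# Print the Docker version, if Docker is present.
if check_tool docker; then
    docker --version
fi
```

On the 4xA100-80G-PCIe flavour you would expect the GPU query to list four devices.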

Select a keypair

Select one of the keypairs in your account. Don't have a keypair yet? See our Getting Started tutorial for creating one.

Network Configuration

Ensure you assign a Public IP to your virtual machine.
This allows you to access your VM from the internet, which is crucial for remote management and API access.

Enable SSH Access

Make sure to enable an SSH connection.
You'll need this to securely connect and manage your VM.

Configure Additional Settings

Look for an "Additional Settings" or "Advanced Options" section.
Here, you'll find a field for cloud-init scripts. This is where you'll paste the initialisation script. Click here to get the cloud-init script!
Ensure the script is in bash syntax. This script will automate the setup of your Pixtral Large Instruct 2411 environment.

To use this model, you will need gated access:

Create a Hugging Face token to access the gated model; see more info here.
Replace line 11 of the attached cloud-init file with your Hugging Face token.

Review and Deploy

Double-check all your settings.
Click the "Deploy" button to launch your virtual machine.

Step 3: Initialisation and Setup

After deploying your VM, the cloud-init script will begin its work. This process typically takes about 7 minutes. During this time, the script performs several crucial tasks:

Dependencies Installation: Installs all necessary libraries and tools required to run Pixtral Large Instruct 2411.
Model Download: Fetches the Pixtral Large Instruct 2411 model files from the specified repository.
API Setup: Configures the vLLM engine and sets up an OpenAI-compatible API endpoint on port 8000.

While waiting, you can prepare your local environment for SSH access and familiarise yourself with the Hyperstack dashboard.
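
Once you can SSH into the VM (Step 4), a small polling loop like the one below can tell you when the API on port 8000 has started answering. This is a sketch rather than part of the official setup: wait_for_api is a made-up helper name, and it assumes the vLLM OpenAI-compatible server exposes the usual /v1/models route.

```shell
#!/usr/bin/env bash
# wait_for_api is a hypothetical helper: poll a URL until it answers or we give up.
# Usage: wait_for_api URL MAX_TRIES DELAY_SECONDS
wait_for_api() {
    local url="$1" tries="$2" delay="$3" i
    for (( i = 1; i <= tries; i++ )); do
        # -s: silent, -f: treat HTTP errors as failures, --max-time: do not hang
        if curl -sf --max-time 5 -o /dev/null "$url"; then
            echo "API is ready"
            return 0
        fi
        sleep "$delay"
    done
    echo "API did not respond after $tries attempts" >&2
    return 1
}

# Example, run on the VM (checks every 30 seconds for up to 30 minutes):
#   wait_for_api http://localhost:8000/v1/models 60 30
#
# To watch the cloud-init script's progress in the meantime:
#   tail -f /var/log/cloud-init-output.log
```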

Step 4: Accessing Your VM

Once the initialisation is complete, you can access your VM:

Locate SSH Details

In the Hyperstack dashboard, find your VM's details.
Look for the public IP address, which you will need to connect to your VM with SSH.

Connect via SSH

Open a terminal on your local machine.
Use the command ssh -i [path_to_ssh_key] [os_username]@[vm_ip_address] (e.g. ssh -i /users/username/downloads/keypair_hyperstack ubuntu@0.0.0.0)
Replace path_to_ssh_key, os_username and vm_ip_address with the path to your private key and the details provided by Hyperstack.
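
To make the placeholders concrete, the sketch below prints the two SSH commands you are most likely to need: a plain login, and a login that also forwards local port 8000 to the API on the VM so you can call it from your own machine. The key path and IP address are illustrative values (203.0.113.10 is a documentation-only address), ssh_commands is a made-up helper, and the ubuntu username is taken from the example above.

```shell
#!/usr/bin/env bash
# ILLUSTRATIVE values: replace with your own key path and your VM's public IP.
KEY_PATH="${KEY_PATH:-$HOME/.ssh/keypair_hyperstack}"
VM_IP="${VM_IP:-203.0.113.10}"

# ssh_commands is a hypothetical helper that prints the commands instead of
# running them, so they can be reviewed before use.
ssh_commands() {
    local key="$1" ip="$2"
    # Interactive login as the "ubuntu" user
    echo "ssh -i $key ubuntu@$ip"
    # Same login, plus forwarding local port 8000 to port 8000 on the VM
    echo "ssh -i $key -L 8000:localhost:8000 ubuntu@$ip"
}

ssh_commands "$KEY_PATH" "$VM_IP"
```

If SSH refuses the key because its permissions are too open, running chmod 600 on the private key file usually resolves it.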

Interacting with Pixtral Large Instruct

To access and experiment with Mistral AI's latest model, SSH into your machine after completing the setup. If you are having trouble connecting with SSH, watch our recent platform tour video (at 4:08) for a demo. Once connected, use the API calls below on your machine to start using Pixtral Large Instruct 2411.

Check out the Image example below:

# Query the model using an image
IMAGE_URL="https://www.hyperstack.cloud/hs-fs/hubfs/deploy-vm-11-ecd8c53003182041d3a2881d0010f6c6-1.png?width=3352&height=1852&name=deploy-vm-11-ecd8c53003182041d3a2881d0010f6c6-1.png"
cat <<EOF > payload.json
{
    "model": "mistralai/Pixtral-Large-Instruct-2411",
    "messages": [
        {
            "role": "system",
            "content": "SYSTEM_PROMPT"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe this image in two sentences"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "${IMAGE_URL}"
                    }
                }
            ]
        }
    ]
}
EOF

# Use the JSON payload file in the curl command
curl -X POST http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d @payload.json

You should see a response similar to below:

{
  "id": "chatcmpl-2b41dd5f00c44692a647f27c6526f397",
  "object": "chat.completion",
  "created": 1732088507,
  "model": "mistralai/Pixtral-Large-Instruct-2411",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The image shows a virtual machine management interface from Hyperstack, where the \"genius-hubble\" VM is active. In the networking settings, the option to enable SSH access on port 22 is highlighted, and there are additional options for ICMP access and managing firewalls.",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 2357,
    "total_tokens": 2419,
    "completion_tokens": 62,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null
}

Check out the Text example below:

# Query the model only with text
cat <<EOF > payload.json
{
    "model": "mistralai/Pixtral-Large-Instruct-2411",
    "messages": [
        {
            "role": "system",
            "content": "SYSTEM_PROMPT"
        },
        {
            "role": "user",
            "content": "Hi. What can you do for me?"
        }
    ]
}
EOF

# Use the JSON payload file in the curl command
curl -X POST http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d @payload.json

You should see a response similar to below:

{
  "id": "chatcmpl-eaadc212af354361b46ed65366f9f7a7",
  "object": "chat.completion",
  "created": 1732088556,
  "model": "mistralai/Pixtral-Large-Instruct-2411",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I can assist you with a wide range of tasks and provide information on various topics. Here are some examples:\n\n1. **Answer Questions**: I can provide information based on the data I've been trained on, up until 2023.\n\n2. **Explain Concepts**: I can help break down complex ideas into simpler parts to make them easier to understand.\n\n3. **Provide Suggestions**: Whether it's a book to read, a movie to watch, or a recipe to cook, I can provide recommendations.\n\n4. **Help with Language**: I can help with language translation, definition, or grammar.\n\n5. **Perform Simple Tasks**: I can do simple calculations, conversions, and other basic tasks.\n\n6. **Engage in Dialogue**: I can participate in conversations on a wide range of topics.\n\nWhat specifically would you like help with?",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "total_tokens": 221,
    "completion_tokens": 201,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null
}
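
The image example earlier in this section points the model at a public URL. If your image lives on disk instead, one common approach is to send it inline as a base64 data URL. The helper below is a sketch under that assumption: image_data_url is a made-up name, and it relies on the vLLM OpenAI-compatible endpoint accepting data URLs in the image_url field.

```shell
#!/usr/bin/env bash
# image_data_url is a hypothetical helper: turn a local file into a data URL.
# Usage: image_data_url FILE [MIME_TYPE]   (MIME type defaults to image/png)
image_data_url() {
    local file="$1" mime="${2:-image/png}"
    # base64 may wrap its output across lines; strip newlines to get one string
    printf 'data:%s;base64,%s' "$mime" "$(base64 < "$file" | tr -d '\n')"
}

# Example: reuse the image payload from earlier, swapping in a local file.
#   IMAGE_URL="$(image_data_url ./screenshot.png)"
#   ...then build payload.json and run the same curl command as before.
```

Because the request body is read from payload.json with -d @payload.json, a large encoded image does not run into command-line length limits.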

If the API is not working after 20-30 minutes, please refer to our 'Troubleshooting Pixtral Large Instruct 2411' section below.

Troubleshooting Pixtral Large Instruct 2411

Step 5: Hibernating Your VM

When you're finished with your current workload, you can hibernate your VM to avoid incurring unnecessary costs:

In the Hyperstack dashboard, locate your virtual machine.
Look for a "Hibernate" option.
Click to hibernate the VM, which will stop billing for compute resources while preserving your setup.

To continue your work without repeating the setup process:

Return to the Hyperstack dashboard and find your hibernated VM.
Select the "Resume" or "Start" option.
Wait a few moments for the VM to become active.
Reconnect via SSH using the same credentials as before.
