Getting Started
A brief overview of the model card on Hugging Face, mentioning the model name "Miqu" and details about its dequantized version for easier use.
Technical Setup
- Running the Model: Discussion on the impracticality of running the model locally due to its size and the alternative of using a remote service.
- GPU Requirements: Introduction of a tool on Hugging Face to calculate GPU requirements for running the model, indicating the need for two GPUs.
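The GPU requirement above can be sanity-checked by hand: a 70B-parameter model stored in 16-bit precision needs roughly 140 GB for the weights alone, more than any single common GPU offers, which is why the calculator points at two cards. A minimal sketch (the 1.2x overhead factor for activations and KV cache is my assumption, not a figure from the video):

```python
def vram_needed_gb(params_billions: float, bytes_per_param: float,
                   overhead: float = 1.2) -> float:
    """Rough VRAM estimate: weight storage plus a fudge factor
    for activations and the KV cache (assumed ~20%)."""
    return params_billions * bytes_per_param * overhead

# 70B parameters at 2 bytes each (fp16/bf16)
estimate = vram_needed_gb(70, 2)
print(f"~{estimate:.0f} GB needed -> more than one 80 GB GPU")
```

The same formula explains why quantized variants (e.g. 4-bit, 0.5 bytes per parameter) can fit on far less hardware.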
Setting Up Remote GPUs
- Selecting a GPU Provider: Emmanuel chooses Runpod, an on-demand GPU provider, for the demonstration.
- Account and Funding: Steps to create an account on Runpod and fund it.
- Deploying VM: Detailed instructions on deploying a VM on Runpod using a template image and customizing deployment for the model.
Running the Model
- Downloading and Launching: Emmanuel walks through downloading the model weights, launching the vLLM server, and connecting to its API server.
- Testing the API: Demonstrating the API with curl commands in the terminal and with Python code using the OpenAI client.
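vLLM exposes an OpenAI-compatible endpoint, so the curl call and the OpenAI Python client both send the same JSON body. A stdlib-only sketch of that request, shown with `urllib` instead of the `openai` package so it runs without extra dependencies (the host, port, and model name are placeholders, and the final send is commented out because it needs a live server):

```python
import json
import urllib.request

# Placeholder: your Runpod pod's address; the server side would be started
# with something like: python -m vllm.entrypoints.openai.api_server --model <path>
BASE_URL = "http://localhost:8000/v1"

payload = {
    "model": "miqu-70b",  # placeholder: the name vLLM was launched with
    "messages": [{"role": "user", "content": "Say hello in French."}],
    "max_tokens": 64,
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server is up:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url)
```

With the `openai` client, the only change is pointing `base_url` at the same `/v1` endpoint; the request shape is identical.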
Conclusion and Alternatives
- Integration and Usage Tips: Tips on integrating Miqu into applications, with reminders that the leak is unofficial and unauthorized.
- Official Alternatives: Mention of signing up with Mistral AI for a legitimate API key and usage.
- Exploring Models on Airtrain: Encouragement to explore models on Airtrain for a direct comparison with other models like GPT-4 and LLaMA.
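For the official route mentioned above, Mistral AI's hosted API follows the same chat-completions shape, so only the base URL, API key, and model name change relative to the self-hosted setup. A stdlib sketch (the model name is an assumption; check Mistral's documentation for current identifiers, and set `MISTRAL_API_KEY` in your environment):

```python
import json
import os
import urllib.request

payload = {
    "model": "mistral-large-latest",  # assumed model id; verify against Mistral's docs
    "messages": [{"role": "user", "content": "Say hello in French."}],
}

req = urllib.request.Request(
    "https://api.mistral.ai/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        # Bearer auth with a key from the Mistral AI console
        "Authorization": f"Bearer {os.environ.get('MISTRAL_API_KEY', '')}",
    },
)

# Uncomment with a valid API key:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Keeping the request shape identical makes it easy to swap between the self-hosted server and the official API in application code.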
Closing Remarks
Emmanuel wraps up the video with a reminder to shut down the Runpod VM to avoid extra charges and a caution against using leaked models in production.