Getting Started with Microsoft's Phi-2 on your laptop

In this video we will explore how to run the Phi-2 model on your local machine.

Microsoft has changed the license terms for their model Phi-2; it's now under an MIT license, allowing commercial use. However, I encourage you to do your own research and evaluate the model independently. It has limitations, notably in instruction following, safety, and toxicity. With 2.7 billion parameters, it's relatively small, fitting easily on a laptop or phone, making it potentially ideal for text completion.

The model has a relatively short context length (2,048 tokens), making it better suited to short inputs and generations. It was trained mostly on synthetic data generated by GPT-3.5 and carefully selected web data evaluated by GPT-4, aiming for textbook-quality data.

In this video, I'll show you how to run this model on your local machine. We'll use Llama.CPP. First, go to the Llama.CPP GitHub repo, copy the URL, and clone it on your machine. Once inside the directory, run 'make' to build the executables. Support for Phi-2 was added three weeks ago. We'll use the raw model without fine-tuning. Next, pip install 'huggingface_hub' so you can download the model from the command line, then use the 'huggingface-cli download' command, making sure the files land in a local directory rather than only in the cache.
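Putting those steps together, something like the following should work (the local directory paths are just illustrative choices):

    # Clone and build llama.cpp
    git clone https://github.com/ggerganov/llama.cpp
    cd llama.cpp
    make

    # Install the Hugging Face hub client and download Phi-2 into a local directory
    pip install huggingface_hub
    huggingface-cli download microsoft/phi-2 --local-dir ./phi-2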

Next, convert the model to the GGUF format that Llama.CPP expects. In the Llama.CPP directory, there's a script called 'convert-hf-to-gguf.py'. Install the script's dependencies from the requirements file. Once installed, run the script pointing to the model directory.
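For example, assuming the paths from the earlier step (the requirements file name and the default output file can vary between llama.cpp versions):

    # Install the converter's dependencies, then convert the model to GGUF
    pip install -r requirements.txt
    python convert-hf-to-gguf.py ./phi-2
    # By default this writes a file such as ./phi-2/ggml-model-f16.gguf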

Now, run Llama.CPP, pointing to the model file, enabling color, and setting it to run interactively. Let's test it with some example prompts from Microsoft's model card. We'll try both the basic completion format and the QA format.
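A minimal invocation looks like this; the GGUF file name is whatever the conversion step produced, and the QA-format prompt is one of the examples from the model card:

    # Interactive session with colored output
    ./main -m ./phi-2/ggml-model-f16.gguf --color -i

    # One-shot QA-format prompt; -e makes llama.cpp process the \n escape
    ./main -m ./phi-2/ggml-model-f16.gguf -e \
      -p "Instruct: Write a detailed analogy between mathematics and a lighthouse.\nOutput:"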

The baseline model hasn't been fine-tuned. Let's explore existing fine-tunes for potentially better results. TheBloke is a prolific publisher of fine-tuned and quantized models; they have a DPO (Direct Preference Optimization) fine-tune of Phi-2. DPO is a lightweight way of incorporating human preferences directly into the model. Follow the same process to download and use this model.
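Since TheBloke's repos ship files already in GGUF format, no conversion step is needed. The repo and file names below are illustrative assumptions, so check the model card for the actual ones:

    # Download a quantized DPO fine-tune and run it directly
    huggingface-cli download TheBloke/phi-2-dpo-GGUF phi-2-dpo.Q4_K_M.gguf --local-dir ./phi-2-dpo
    ./main -m ./phi-2-dpo/phi-2-dpo.Q4_K_M.gguf --color -i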

Finally, test the model with your usual evaluation dataset or other prompts. As you can see, it's relatively easy to run this model locally.

The Airtrain AI YouTube channel

Subscribe now to learn about Large Language Models, stay up to date with AI news, and discover Airtrain AI's product features.
