Replit 3B Inference

Run inference on the replit-3B code instruct model using a CPU


Replit Code Instruct inference using CPU

Run inference on the Replit code instruct model using your CPU. The inference code uses a ggml-quantized model. To run the model we use ctransformers, a Python library with bindings to ggml.
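As a rough illustration of what loading a ggml model through ctransformers can look like, here is a minimal sketch. It is not the repository's inference.py, which may be structured differently. The model path passed to `load_model` is a placeholder, and `model_type="replit"` assumes the installed ctransformers version supports the Replit architecture.

```python
from typing import Callable


def load_model(model_path: str):
    """Load a ggml-quantized model for CPU inference via ctransformers."""
    # Imported lazily so this module can be imported even when
    # ctransformers is not installed yet.
    from ctransformers import AutoModelForCausalLM

    # model_type selects the ggml architecture (assumed here to be "replit").
    return AutoModelForCausalLM.from_pretrained(model_path, model_type="replit")


def complete(llm: Callable[..., str], prompt: str, max_new_tokens: int = 128) -> str:
    """Run one completion; `llm` is any callable that accepts a prompt
    and a max_new_tokens keyword, as a ctransformers model does."""
    return llm(prompt, max_new_tokens=max_new_tokens)
```

Once the weights are on disk, usage would be along the lines of `llm = load_model("models/model.bin")` followed by `print(complete(llm, "def fibonacci(n):"))`, with the path replaced by whatever file the download step produced.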

Demo:

Inference Demo

Requirements

Using Docker should make all of this easier. Minimum spec: a system with 8GB of RAM. Python 3.10 is recommended.
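If you are not using Docker, a quick pre-flight check of these requirements could look like the sketch below. It is only an illustration: it reads "8GB" as decimal gigabytes, relies on `os.sysconf`, which is not available on every platform (the check is skipped there), and all function names are hypothetical.

```python
import os
import sys
from typing import List, Optional, Sequence

MIN_RAM_BYTES = 8 * 1000**3      # "8GB" read as decimal gigabytes (assumption)
RECOMMENDED_PYTHON = (3, 10)     # recommended, not strictly required


def total_ram_bytes() -> Optional[int]:
    """Return physical memory in bytes, or None if it cannot be determined."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these keys are missing on some platforms.
        return None


def check_requirements(ram_bytes: Optional[int], python_version: Sequence[int]) -> List[str]:
    """Return human-readable warnings; an empty list means nothing to flag."""
    warnings = []
    if ram_bytes is not None and ram_bytes < MIN_RAM_BYTES:
        warnings.append(
            f"only {ram_bytes / 1000**3:.1f} GB of RAM detected; 8 GB is the stated minimum"
        )
    if tuple(python_version[:2]) != RECOMMENDED_PYTHON:
        found = ".".join(str(part) for part in python_version[:2])
        warnings.append(f"Python {found} detected; 3.10 is recommended")
    return warnings


if __name__ == "__main__":
    for message in check_requirements(total_ram_bytes(), sys.version_info):
        print("warning:", message)
```

The check only warns, since a different Python version may still work; the README recommends 3.10 without ruling out others.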

Tested working on

Will post some numbers for these two later.

  • AMD Epyc 7003 series CPU
  • AMD Ryzen 5950x CPU

Setup

First create a venv.

python -m venv env && source env/bin/activate

Next install dependencies.

pip install -r requirements.txt

Next download the quantized model weights (about 1.5GB).

python download_model.py
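For readers curious what a download step of this kind involves, here is a minimal standard-library sketch. It is not the contents of download_model.py, which may use a different mechanism. The `download_file` helper is hypothetical, and the URL and destination you pass in are your own. It streams the file in chunks, so the roughly 1.5GB of weights never has to fit in memory, and it skips the download if the destination already exists.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url: str, dest: Path, chunk_size: int = 1024 * 1024) -> Path:
    """Stream `url` to `dest`, skipping the download if `dest` already exists."""
    dest = Path(dest)
    if dest.exists():
        return dest

    dest.parent.mkdir(parents=True, exist_ok=True)

    # Write to a temporary ".part" file first so an interrupted download
    # is never mistaken for a complete model file.
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out, length=chunk_size)

    tmp.replace(dest)
    return dest
```

Writing to a temporary file and renaming it at the end means a rerun after a failed download starts cleanly instead of loading truncated weights.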

Ready to rock. Run inference.

python inference.py

Next, modify the prompt and generation parameters in the inference script.
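The sketch below shows one way the prompt and generation parameters might be organized so they are easy to change. It makes several assumptions: the Alpaca-style instruction template is a guess at the format the model expects, the default values are illustrative rather than taken from the repository, and `GenerationParams`, `build_prompt`, and `generate` are hypothetical names. The keyword names match those accepted when calling a ctransformers model.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Optional


@dataclass
class GenerationParams:
    """Sampling settings passed to the model call (illustrative defaults)."""
    max_new_tokens: int = 256
    temperature: float = 0.2        # lower values give more deterministic code
    top_p: float = 0.9
    top_k: int = 50
    repetition_penalty: float = 1.0


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in an Alpaca-style template (assumed format)."""
    return f"### Instruction:\n{instruction}\n### Response:\n"


def generate(
    llm: Callable[..., str],
    instruction: str,
    params: Optional[GenerationParams] = None,
) -> str:
    """Call `llm` with the formatted prompt and the parameters as keywords."""
    params = params or GenerationParams()
    return llm(build_prompt(instruction), **asdict(params))
```

With a loaded model `llm`, a call such as `generate(llm, "Write a function that reverses a string", GenerationParams(temperature=0.5))` would then run one completion with the adjusted temperature.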

Open Source Agenda is not affiliated with "Replit 3B Inference" Project. README Source: abacaj/replit-3B-inference
License
MIT