
MPT-30B inference code using CPU

Run inference on the latest MPT-30B model using your CPU. This inference code uses a GGML-quantized model. To run the model we'll use ctransformers, a library that provides Python bindings to GGML.
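A minimal sketch of what loading and calling the model through ctransformers looks like. The model path is an assumption; point it at wherever the downloaded weights actually live.

```python
# Assumed location of the GGML weights -- adjust to your download path.
MODEL_PATH = "models/mpt-30b.ggmlv0.q4_1.bin"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    # Import inside the function so this file can be read/tested without
    # ctransformers installed; model_type="mpt" selects the MPT architecture.
    from ctransformers import AutoModelForCausalLM

    llm = AutoModelForCausalLM.from_pretrained(MODEL_PATH, model_type="mpt")
    return llm(prompt, max_new_tokens=max_new_tokens)
```

Loading is lazy here on purpose: a 30B GGML model takes a while to map into memory, so you only pay that cost when you actually generate.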

Chat-style turns with history, on the latest commit:

Inference Chat
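The chat turns with history can be sketched as a small prompt-builder. The ChatML-style `<|im_start|>`/`<|im_end|>` markers are an assumption based on how MPT chat models are commonly prompted; check the model card for the exact format.

```python
def build_prompt(history, user_msg, system="You are a helpful assistant."):
    """Assemble a ChatML-style prompt from prior (user, assistant) turns.

    The delimiter tokens below are assumed; verify against the model card.
    """
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for user, assistant in history:
        parts.append(f"<|im_start|>user\n{user}<|im_end|>")
        parts.append(f"<|im_start|>assistant\n{assistant}<|im_end|>")
    # Open a fresh assistant turn for the model to complete.
    parts.append(f"<|im_start|>user\n{user_msg}<|im_end|>")
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)
```

Each new reply gets appended to `history`, so the model always sees the full conversation so far.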

Video of initial demo:

Inference Demo

Requirements

I recommend using Docker for this model; it will make everything easier. Minimum specs: a system with 32 GB of RAM. Python 3.10 is recommended.
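If you go the Docker route, a hypothetical Dockerfile could look like the sketch below (file names are assumptions based on the setup steps in this README):

```dockerfile
# Hypothetical Dockerfile sketch -- adjust to the repo's actual layout.
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Download the ~19GB weights at runtime (or bake them in with a RUN step).
CMD ["python", "inference.py"]
```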

Tested working on:

  • AMD Epyc 7003 series CPU
  • AMD Ryzen 5950x CPU

Will post some performance numbers for these two later.

Setup

First create a venv.

python -m venv env && source env/bin/activate

Next install dependencies.

pip install -r requirements.txt

Next download the quantized model weights (about 19GB).

python download_model.py

Ready to rock, run inference.

python inference.py

Finally, adjust the prompt and generation parameters in the inference script to suit your use case.
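The generation parameters you'd typically tune look something like this. The names match common ctransformers/GGML sampling knobs; the specific values are assumptions, not the repo's defaults.

```python
# Hypothetical sampling settings -- tune to taste.
generation_config = dict(
    max_new_tokens=256,      # cap on generated length
    temperature=0.8,         # higher = more random
    top_p=0.95,              # nucleus sampling cutoff
    top_k=40,                # sample from the top-k logits
    repetition_penalty=1.1,  # discourage repeated phrases
)
```

Lower `temperature` (e.g. 0.2) for more deterministic answers; raise it for more varied chat replies.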

README Source: abacaj/mpt-30B-inference