Grimoire is All You Need for Enhancing Large Language Models
💡 Enhance the capabilities of small language models using grimoires.
In-context learning (ICL) is one of the key methods for enhancing the performance of large language models on specific tasks by providing a set of few-shot question and answer examples. However, the ICL capability of different types of models varies significantly due to factors such as model architecture, the volume of training data, and the number of parameters. Generally, the larger the model and the more extensive its training data, the stronger its ICL capability. In this paper, we propose SLEICL (Strong LLM Enhanced ICL), a method in which a strong language model learns from the examples, then summarizes and transfers the learned skills to weak language models for inference and application.
This ensures the stability and effectiveness of ICL. Compared to directly enabling weak language models to learn from prompt examples, SLEICL reduces the difficulty of ICL for these models. Our experiments, conducted on up to eight datasets with five language models, demonstrate that weak language models achieve consistent improvement over their own zero-shot or few-shot capabilities using the SLEICL method. Some weak language models even surpass the performance of GPT4-1106-preview (zero-shot) with the aid of SLEICL.
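The two-stage idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the `LLM` callable interface, the function names `build_grimoire` and `answer_with_grimoire`, and the prompt wording are all assumptions made for this example.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a model is any function mapping a prompt to a completion.
LLM = Callable[[str], str]


def build_grimoire(strong_llm: LLM, task: str, examples: List[Tuple[str, str]]) -> str:
    """Ask the strong model to summarize the skills shown by the examples."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    prompt = (
        f"Task: {task}\n"
        f"Examples:\n{shots}\n"
        "Summarize the rules needed to solve this task."
    )
    return strong_llm(prompt)


def answer_with_grimoire(weak_llm: LLM, grimoire: str, question: str) -> str:
    """Let the weak model answer from the grimoire instead of raw examples."""
    prompt = f"Guide:\n{grimoire}\n\nQ: {question}\nA:"
    return weak_llm(prompt)


if __name__ == "__main__":
    # Stub models (assumptions) so the sketch runs without any API access.
    strong = lambda prompt: "Label a review positive if it praises the product."
    weak = lambda prompt: "positive" if "praises" in prompt else "unknown"
    grimoire = build_grimoire(
        strong, "sentiment classification", [("Great phone!", "positive")]
    )
    print(answer_with_grimoire(weak, grimoire, "I love it."))
```

In a real run, the two stubs would be replaced by calls to a strong and a weak model. The point of the design is that the weak model never sees the raw few-shot examples, only the summary written by the strong model.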
The project is organized into several key directories and modules. Here's an overview of the project structure:
.
├── archived    # Store the grimoire and hard samples used in our experiment.
├── assets      # Store project assets, such as images, diagrams, or any visual elements used to enhance the presentation and understanding of the project.
├── configs     # Store configuration files.
├── core        # Core codebase.
│   ├── data        # Data processing module.
│   ├── evaluator   # Evaluator module.
│   └── llm         # Load Large Language Models (LLMs) module.
├── data        # Store datasets and data processing scripts.
├── external    # Store the Grimoire Ranking model based on the classifier approach.
├── outputs     # Store experiment output files.
├── prompts     # Store text files used as prompts when interacting with LLMs.
├── stats       # Store experiment statistical results.
└── tests       # Store test code or unit tests.
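After cloning, the layout above can be checked with a short script. This is only a convenience sketch: the helper `missing_dirs` is hypothetical and not part of the repository, and some directories (for example `outputs`) may only appear once experiments have been run.

```python
from pathlib import Path
from typing import Iterable, List

# Directory names copied from the project structure listed above.
EXPECTED_DIRS = [
    "archived", "assets", "configs", "core", "core/data", "core/evaluator",
    "core/llm", "data", "external", "outputs", "prompts", "stats", "tests",
]


def missing_dirs(root: str, expected: Iterable[str] = EXPECTED_DIRS) -> List[str]:
    """Return the expected directories that do not exist under root."""
    base = Path(root)
    return [name for name in expected if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    print("All directories present." if not missing else f"Missing: {missing}")
```

Run it from the repository root; any name it reports is a directory from the listing that was not found.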
Clone the repository.
git clone https://github.com/IAAR-Shanghai/Grimoire.git && cd Grimoire
Prepare for the conda environment.
conda create -n grimoire python=3.8.18
conda activate grimoire
Install Python dependencies and process the data.
chmod +x setup.sh
./setup.sh
Configure
cp -r ./archived/.cache ./
Look into experiments.py to see how to run experiments.
Run analyst.py to analyze the results saved in outputs.
Note: Regarding the deployment of LLMs, we also provide some reference tutorials.
For any questions, feedback, or suggestions, please open a GitHub Issue.
setup.sh installs the Python dependencies and runs embed.py and compute_similarity.py.
huggingface
experiment.yaml
@article{Grimoire,
title={Grimoire is All You Need for Enhancing Large Language Models},
author={Ding Chen and Shichao Song and Qingchen Yu and Zhiyu Li and Wenjin Wang and Feiyu Xiong and Bo Tang},
journal={arXiv preprint arXiv:2401.03385},
year={2024},
}