How to play with the Eagle / RWKV model?

Online public demos

If you simply want to give RWKV Raven a try, you can use the following public demos.

TIP

Chat is disabled in the above public demo

If you are not familiar with Python or Hugging Face, you can install chat models locally with the following app

RWKV Runner Demo

Windows setup guide video

Prompting guidelines for RWKV

RWKV is more sensitive to prompt format than transformer-based models, due to its weaker ability to "look back" at earlier parts of the prompt.

As such, instead of using a format like the following

{{CONTEXT}}

{{INSTRUCTION}}

{{ANSWER}}

you should instead format your prompt as follows

{{INSTRUCTION}}

{{CONTEXT}}

{{ANSWER}}

For a human analogy, you can think of it as the instructions and inputs being read out loud to the model, without letting the model write them down. If the model is told the context before the instruction, it does not know what to do with the context, and may not remember the parts that are crucial to the instruction, as it has not yet been told what to do with them.

However, if you tell the model the instruction first and then the context, it will understand the instruction first and can use that knowledge to process the context.

For Q&A-with-context tasks, the optimal approach is to repeat the question both before and after the context, like the following

{{QUESTION}}

{{CONTEXT}}

{{QUESTION}}

{{ANSWER}}
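
For example, a filled-in version of this format might look like the following. The question and context here are purely illustrative, and the last line is the kind of answer the model would generate:

What year was Acme Corp founded?

Acme Corp is a fictional manufacturer of industrial widgets. It was founded in 1947 and is headquartered in Springfield.

What year was Acme Corp founded?

Acme Corp was founded in 1947.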

How to play with custom models instead?

If the above "guided" setups are not what you are looking for, and you want to experiment with different model sizes or quantization settings, the following is a general list of where to find the various things you may need.

Instruction-trained models download

Base Models download

TIP

It is strongly advised to try the Raven instruction-trained models, unless you are familiar with few-shot prompting with the base models
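
If you do decide to use a base model, few-shot prompting means prefixing your actual input with a handful of completed examples of the task, so the model continues the pattern rather than following an instruction. A minimal illustration (the Q/A pairs below are made up):

Q: What is the capital of France?
A: Paris

Q: What is the capital of Japan?
A: Tokyo

Q: What is the capital of Canada?
A: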

RWKV.cpp / RWKV-cpp-cuda projects

After downloading the desired model, you can quantize or convert it to run with the RWKV.cpp / RWKV-cpp-cuda projects
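
As a rough sketch of the conversion flow, the rwkv.cpp repository ships Python scripts for converting a downloaded .pth checkpoint and quantizing the result. The file paths below are placeholders, and the exact script names and supported quantization formats may vary between versions, so check the project's README:

# Convert the downloaded PyTorch checkpoint into the ggml format used by rwkv.cpp
python rwkv/convert_pytorch_to_ggml.py path/to/RWKV-model.pth path/to/rwkv-model-FP16.bin FP16

# Optionally quantize the FP16 model to a smaller format (e.g. Q5_1) to reduce memory use
python rwkv/quantize.py path/to/rwkv-model-FP16.bin path/to/rwkv-model-Q5_1.bin Q5_1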

These projects are designed to run locally, without needing Python or Hugging Face, and can run on CPU or GPU (or both), respectively

TIP

Despite the "cuda" name, RWKV-cpp-cuda does have Vulkan support, meaning it can run on AMD GPUs

RWKV mobile projects

Chat client projects

The official RWKV chat project can be found here

RWKV main repo

The main RWKV repo can be found here; use the v4neo code to run current models

TIP

For new users, unless you plan to finetune, it is recommended to use the RWKV.cpp project instead, due to the complexity involved with Python dependencies.

If you have NPM installed, you can interact with the model via the following CLI

RWKV cpp node (slightly out of date)

# Install globally, do not use NPX as it has known display issues
npm install -g rwkv-cpp-node

# Run the setup, and use the chat demo
rwkv-cpp-node