May 17, 2024
5 mins

ReFT: Representation Finetuning for Language Models

ReFT adapts an LLM by learning interventions on its hidden representations at selected layers, instead of updating weights as PEFT methods do. The paper reports better performance than PEFT baselines on several common benchmarks and tasks.

Key Takeaways

  • LLMs are expensive to fine-tune: Adapting large language models (LLMs) to specific tasks typically requires fine-tuning, which involves updating a massive number of parameters and can be computationally expensive.
  • Representation Finetuning (ReFT) offers a new approach: Instead of modifying model weights, ReFT methods, like LoReFT, learn to edit a small portion of the LLM's internal representations, achieving comparable or even superior performance with a fraction of the parameters.
  • Efficiency and effectiveness across tasks: LoReFT demonstrates remarkable performance in various NLP benchmarks, including commonsense reasoning, instruction following, and natural language understanding, showcasing its potential for diverse applications.
  • Potential for cost reduction and improved performance: By significantly reducing the number of trainable parameters, ReFT methods like LoReFT have the potential to lower the cost of LLM fine-tuning while potentially improving performance and generalizability.


Pretrained LLMs are frequently fine-tuned to adapt them to a new domain or task. Full fine-tuning updates all of the model's weights to produce use-case-specific behavior, so the entire model must be loaded into memory along with its gradients and optimizer state. That makes it expensive, especially for large models.

Background: PEFT

To address the challenges of full fine-tuning, researchers have explored Parameter-Efficient Fine-Tuning (PEFT) methods. These methods aim to adapt LLMs by updating only a small subset of weights, thereby reducing the computational and memory requirements. Popular PEFT approaches include adapters and Low-Rank Adaptation (LoRA), which have shown promising results in achieving comparable performance to full fine-tuning while using significantly fewer parameters. QLoRA further shows that full-precision adapters can be trained on top of reduced-precision models without sacrificing performance. Adapters are generally more efficient and effective than methods that introduce new model components, like prefix-tuning.
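To make the low-rank idea behind LoRA concrete, here is a minimal NumPy sketch rather than a reference implementation. The dimensions, variable names, and the `lora_forward` helper are illustrative assumptions, and the scaling factor that LoRA implementations usually apply to the update is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4  # illustrative sizes; r is the LoRA rank

# Frozen pretrained weight: never updated during adaptation.
W0 = rng.standard_normal((d_out, d_in))

# Trainable low-rank factors. B starts at zero, so the adapted layer
# initially behaves exactly like the frozen one.
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))


def lora_forward(x):
    """Frozen path plus the low-rank update: W0 x + B (A x)."""
    return W0 @ x + B @ (A @ x)


x = rng.standard_normal(d_in)
y = lora_forward(x)

full_params = W0.size           # what full fine-tuning would update
lora_params = A.size + B.size   # what LoRA trains instead
print(f"full: {full_params}, LoRA: {lora_params}")
```

With these toy sizes, LoRA trains 512 values in place of the 4,096 in the frozen matrix, and the gap widens as the layer grows while the rank stays small.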

ReFT: Representation Fine-tuning

While PEFT methods have made significant strides, they primarily focus on modifying model weights. However, recent research in LLM interpretability has revealed that the internal representations learned by these models encode rich semantic information. This insight leads us to Representation Finetuning (ReFT), a novel family of methods that operate on frozen base models and learn task-specific interventions on hidden representations instead of weights.


ReFT draws inspiration from interventional interpretability techniques used to understand the inner workings of LLMs. These techniques often involve manipulating representations to observe their causal effects on model behavior. Notably, the distributed alignment search (DAS) method has been successful in finding linear subspaces within representations that correspond to human-interpretable concepts. ReFT leverages this knowledge to steer model behavior towards solving downstream tasks efficiently.


Among the ReFT family, Low-rank Linear Subspace ReFT (LoReFT) stands out as a particularly strong and efficient method. LoReFT intervenes on hidden representations within a low-rank subspace, editing them to guide the model towards desired outputs. The intervention learns a low-rank projection matrix that defines the subspace, plus a linear projection with a bias that produces the values the representation should take inside that subspace. The rest of the representation is left untouched. In the paper's experiments, this approach allows for significant parameter reduction while maintaining or even surpassing the performance of PEFT methods.
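The LoReFT edit described in the paper is h + Rᵀ(Wh + b − Rh), where R is a low-rank projection with orthonormal rows, W is a linear projection, and b is a bias. The NumPy sketch below illustrates that formula on a single hidden vector. The sizes, the random (untrained) parameters, and the helper names `make_orthonormal_rows` and `loreft` are assumptions made for illustration; this is not the authors' library code.

```python
import numpy as np


def make_orthonormal_rows(r, d, rng):
    """Hypothetical helper: an r x d matrix whose rows are orthonormal."""
    q, _ = np.linalg.qr(rng.standard_normal((d, r)))  # q is d x r
    return q.T


def loreft(h, R, W, b):
    """Edit h inside the subspace spanned by the rows of R.

    Implements h + R^T (W h + b - R h): the component of h in the
    subspace is replaced by the learned target W h + b.
    """
    return h + R.T @ (W @ h + b - R @ h)


rng = np.random.default_rng(0)
d, r = 16, 4  # hidden size and subspace rank (toy values)

R = make_orthonormal_rows(r, d, rng)  # defines the subspace
W = rng.standard_normal((r, d))       # linear projection (untrained here)
b = rng.standard_normal(r)            # bias (untrained here)
h = rng.standard_normal(d)            # a hidden representation

h_new = loreft(h, R, W, b)

# Trainable parameters per intervention: R and W are r x d, b has r entries.
n_params = R.size + W.size + b.size
print(f"parameters per intervention: {n_params}")
```

Because the rows of R are orthonormal, projecting the edited vector back into the subspace gives exactly Wh + b, while the part of h orthogonal to the subspace is unchanged. In actual training the base model stays frozen and only R, W, and b are learned.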


To assess the effectiveness of LoReFT, the authors ran experiments across four benchmark areas, covering more than 20 datasets:

Commonsense Reasoning

LoReFT demonstrated state-of-the-art performance on eight commonsense reasoning datasets, outperforming the PEFT baselines by a substantial margin. This result suggests that interventions on representations can steer the commonsense knowledge an LLM already encodes.

Arithmetic Reasoning

LoReFT was less dominant on arithmetic reasoning: it trailed LoRA and adapters, but it still outperformed prefix-tuning and showed promising results, particularly with larger models. This suggests that further exploration and optimization could enhance LoReFT's capabilities in reasoning tasks that require multi-step calculations and logical deductions.


Instruction Following

LoReFT performed strongly in instruction following, surpassing even full fine-tuning and achieving a win rate close to that of GPT-3.5 Turbo. This result underscores LoReFT's potential for developing highly capable instruction-tuned LLMs with minimal parameter updates.

Natural Language Understanding

Evaluations on the GLUE benchmark demonstrated that LoReFT performs competitively with existing PEFT methods on tasks related to sentiment analysis, natural language inference, and other NLU challenges. This finding suggests that LoReFT's benefits extend beyond text generation and can be applied to improve LLM performance in various NLP domains.

Business Applications and Impact

The advancements brought forth by ReFT, particularly LoReFT, hold significant implications for businesses and industries leveraging LLMs. The potential for cost reduction and improved performance unlocks new possibilities:

Reduced Training Costs

The parameter efficiency of LoReFT translates to lower computational costs and faster training times, making it more accessible for businesses to fine-tune LLMs for specific applications.

Improved Generalizability

By focusing on representations rather than weights, LoReFT might enhance the generalizability of LLMs, enabling them to adapt to new tasks and domains more effectively.

Democratization of LLMs

LoReFT's efficiency opens doors for smaller businesses and organizations to leverage the power of LLMs without the need for extensive computational resources.

Enhanced LLM-powered Applications

ReFT could lead to the development of more efficient and effective LLM-powered applications in areas such as chatbots, machine translation, text summarization, and content creation.

In the long term, widespread adoption of ReFT methods could lead to the development of highly specialized and adaptable LLMs, catering to the unique needs of different industries and applications.


ReFT, exemplified by LoReFT, presents a compelling alternative to traditional fine-tuning methods for LLMs. Its ability to achieve state-of-the-art performance with a fraction of the parameters opens doors for more efficient, cost-effective, and generalizable LLM applications. As research in this area continues, we can anticipate further advancements and innovations that will unlock the full potential of LLMs and transform the landscape of artificial intelligence.

Why Does ReFT Work?

The effectiveness of ReFT methods like LoReFT raises intriguing questions about the nature of LLM representations and the mechanisms behind their success. While further investigation is needed, some potential explanations include:

  • Linearity of Concepts: The success of LoReFT aligns with the hypothesis that many concepts are encoded in linear subspaces within LLM representations. By intervening on these subspaces, LoReFT can effectively manipulate and control the activation of relevant concepts, guiding the model towards desired outputs.
  • Causal Pathways: ReFT interventions might create new causal pathways within the LLM's computation or modulate the strength of existing ones, thereby influencing the model's behavior in a targeted manner.
  • Abstraction and Generalization: ReFT's focus on representations, rather than specific weights, could contribute to improved abstraction and generalization capabilities, allowing the LLM to apply its knowledge to new tasks and domains more effectively.

