AI & ML interests

AGI, LLMs, Knowledge Graph, Palmyra, Domain Specific LLM

Articles

wassemgtk 
posted an update 10 months ago
I’ve been diving into the iRoPE architecture from Llama 4, a potential game-changer for long-context models! It interleaves local attention (with RoPE) for short contexts and global attention (with inference-time temperature scaling) for long-range reasoning, aiming for effectively infinite context. I’m going to try writing iRoPE; who wants to help?

Code: https://github.com/wassemgtk/iRoPE-try/blob/main/iRoPE.ipynb
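The notebook has the real attempt; below is only a minimal PyTorch sketch of the interleaving idea as I read it: even layers do windowed local attention with RoPE, odd layers do global attention with no positional encoding and a temperature factor applied to the scores. The even/odd split, the window size, and the `temp_scale` knob are my assumptions, not Llama 4's actual configuration.

```python
import torch
import torch.nn.functional as F

def rotate_half(x):
    # Standard RoPE helper: rotate the two halves of the last dimension.
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rope(q, k, positions, dim):
    # Classic rotary embedding applied to queries and keys.
    inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, 2).float() / dim))
    freqs = positions.float()[:, None] * inv_freq[None, :]
    emb = torch.cat((freqs, freqs), dim=-1)
    cos, sin = emb.cos(), emb.sin()
    return q * cos + rotate_half(q) * sin, k * cos + rotate_half(k) * sin

def attention(q, k, v, scale):
    scores = (q @ k.transpose(-2, -1)) * scale
    return F.softmax(scores, dim=-1) @ v

def irope_block(q, k, v, layer_idx, window=2048, temp_scale=1.0):
    """Toy iRoPE layer: even layers = local RoPE attention inside a window,
    odd layers = global attention without positional encoding (NoPE),
    with an inference-time temperature folded into the score scale."""
    seq_len, dim = q.shape
    base_scale = dim ** -0.5
    if layer_idx % 2 == 0:
        # Local attention: chunk the sequence and apply RoPE within each chunk.
        outs = []
        for start in range(0, seq_len, window):
            sl = slice(start, min(start + window, seq_len))
            pos = torch.arange(sl.stop - sl.start)
            qi, ki = apply_rope(q[sl], k[sl], pos, dim)
            outs.append(attention(qi, ki, v[sl], base_scale))
        return torch.cat(outs, dim=0)
    # Global attention: no positional encoding, temperature-scaled scores.
    return attention(q, k, v, base_scale * temp_scale)

# Toy usage: 8 tokens, 64-dim heads, layer 0 (local) and layer 1 (global).
q = k = v = torch.randn(8, 64)
print(irope_block(q, k, v, layer_idx=0, window=4).shape)
print(irope_block(q, k, v, layer_idx=1, temp_scale=1.2).shape)
```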
wassemgtk 
posted an update 10 months ago
For fun, a new project: SuperTokenizer! A byte-level BPE tokenizer trained on C4, aiming to beat GPT-4’s tokenizer. A100-powered and open-source. Messing around with tokens!
https://github.com/wassemgtk/SuperTokenizer
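Not the repo’s actual training script, just a rough sketch of the same recipe with the `tokenizers` and `datasets` libraries; the vocab size, sample count, and special token here are placeholder choices on my part.

```python
import os
from datasets import load_dataset
from tokenizers import ByteLevelBPETokenizer

def c4_texts(n_samples=100_000):
    # Stream C4 so the full corpus never has to be downloaded locally.
    ds = load_dataset("allenai/c4", "en", split="train", streaming=True)
    for i, example in enumerate(ds):
        if i >= n_samples:
            break
        yield example["text"]

tokenizer = ByteLevelBPETokenizer()
tokenizer.train_from_iterator(
    c4_texts(),
    vocab_size=100_000,          # placeholder; roughly cl100k-scale
    min_frequency=2,
    special_tokens=["<|endoftext|>"],
)

os.makedirs("supertokenizer", exist_ok=True)
tokenizer.save_model("supertokenizer")  # writes vocab.json and merges.txt
```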
wassemgtk 
posted an update 11 months ago
# GESAL: Real-Time Adaptation for LLMs


We’re excited to unveil **Graph-Enhanced Singular Adaptive Learning (GESAL)**, a framework that lets LLMs like meta-llama/Llama-3.2-1B adapt in real time using user feedback. Check out the code and white paper on GitHub!

🔗 **Code**: [https://github.com/writer/AI-Adaptive-Learning-GESAL](https://github.com/writer/AI-Adaptive-Learning-GESAL)

---

## Why GESAL?

Static LLMs struggle to adapt without heavy retraining. GESAL solves this with:
- **SVF**: Adapts weights via \( W' = U (\Sigma \cdot z) V^T \), training only a small vector of parameters per layer (see the sketch after this list).
- **Graph Memory**: Stores adaptations in nodes for scalability.
- **RL**: Updates via \( J(z) = \mathbb{E}[\log \pi_z(y|x) r] \) based on feedback.
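
Here is a minimal sketch of the SVF idea from the formula above (my reading, not the repo’s actual implementation): factor a frozen weight once via SVD, then train only the per-singular-value scaling vector z.

```python
import torch
import torch.nn as nn

class SVFLinear(nn.Module):
    """Frozen SVD factors plus a trainable scaling vector z:
    W' = U (Sigma * z) V^T."""
    def __init__(self, weight: torch.Tensor):
        super().__init__()
        U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
        # Frozen factors from the pretrained weight.
        self.register_buffer("U", U)
        self.register_buffer("S", S)
        self.register_buffer("Vh", Vh)
        # z is the only trainable tensor: one scalar per singular value.
        self.z = nn.Parameter(torch.ones_like(S))

    def forward(self, x):
        W_adapted = self.U @ torch.diag(self.S * self.z) @ self.Vh
        return x @ W_adapted.T

# Example with a random stand-in weight: only 2048 parameters train.
layer = SVFLinear(torch.randn(2048, 2048))
out = layer(torch.randn(4, 2048))
print(out.shape, sum(p.numel() for p in layer.parameters() if p.requires_grad))
```

Because z has just one scalar per singular value, the trainable footprint stays tiny relative to the frozen weight it adapts.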

---

## How It Works

Ask "How many R’s in ‘strawberry’?" If it says "2" and you say "no," GESAL learns to say "3" next time, avoiding repeats.

---

## Try It

Built with Hugging Face’s `transformers`:

    pip install transformers torch numpy
    python Adaptive_Learning_(GESAL).py

Needs a Hugging Face token for Llama-3.2-1B.

---

## Results

GESAL reaches 95% accuracy after 5 feedback interactions vs. LoRA’s 70%, while staying efficient (~0.5M trainable params) and scalable.
wassemgtk 
posted an update almost 2 years ago
The Writer team had the opportunity to run an eval for Mixtral-8x22b; the results were interesting.

| Benchmark | Score |
| --- | --- |
| #mmlu | 77.26 |
| #hellaswag | 88.81 |
| #truthfulqa | 52.05 |
| #arc_challenge | 70.31 |
| #winogrande | 84.93 |
| #gsm8k | 76.65 |
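
The post doesn’t say which harness produced these numbers; if you want to run the same task mix yourself, one common route is EleutherAI’s lm-evaluation-harness, roughly like this (the tooling and task names are my assumption, not necessarily the Writer team’s setup):

```python
# Sketch only: assumes lm-evaluation-harness (lm_eval >= 0.4) and enough GPU
# memory for Mixtral-8x22B; task names follow the harness's conventions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mistralai/Mixtral-8x22B-v0.1,dtype=bfloat16",
    tasks=["mmlu", "hellaswag", "truthfulqa_mc2",
           "arc_challenge", "winogrande", "gsm8k"],
    batch_size=8,
)
print(results["results"])
```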
wassemgtk 
posted an update almost 2 years ago
We are thrilled to announce the release of the OmniACT dataset! This dataset and benchmark pushes the limits of how virtual agents can automate everyday computer tasks. Imagine less clicking and typing, and more watching as your computer organizes schedules or makes travel arrangements on its own.

Check it out ➡️ [OmniACT Dataset on Hugging Face](https://huggingface.co/datasets/Writer/omniact)

For a deep dive, here’s the paper: [OmniACT Paper](https://arxiv.org/abs/2402.17553)
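
If you want to poke at the data, loading it through the `datasets` library should work along these lines (assuming the default config loads without extra arguments):

```python
from datasets import load_dataset

# Pull the OmniACT dataset from the Hub and take a quick look at its splits.
omniact = load_dataset("Writer/omniact")
print(omniact)
```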