Papers
Detecting Pretraining Data from Large Language Models (arXiv:2310.16789)
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models (arXiv:2310.13671)
AutoMix: Automatically Mixing Language Models (arXiv:2310.12963)
An Emulator for Fine-Tuning Large Language Models using Small Language Models (arXiv:2310.12962)
In-Context Pretraining: Language Modeling Beyond Document Boundaries (arXiv:2310.10638)
Zephyr: Direct Distillation of LM Alignment (arXiv:2310.16944)
Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model (arXiv:2310.09520)
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines (arXiv:2310.03714)
Efficient Streaming Language Models with Attention Sinks (arXiv:2309.17453)
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models (arXiv:2309.12284)
Chain-of-Verification Reduces Hallucination in Large Language Models (arXiv:2309.11495)
Knowledge Distillation of Large Language Models (arXiv:2306.08543)
A Repository of Conversational Datasets (arXiv:1904.06472)
SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine (arXiv:1704.05179)
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (arXiv:1908.10084)
Efficient Few-Shot Learning Without Prompts (arXiv:2209.11055)
Attention Is All You Need (arXiv:1706.03762)
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (arXiv:2005.11401)
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness (arXiv:2205.14135)
Textbooks Are All You Need (arXiv:2306.11644)
Direct Preference Optimization: Your Language Model is Secretly a Reward Model (arXiv:2305.18290)
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants (arXiv:2308.16884)
Retentive Network: A Successor to Transformer for Large Language Models (arXiv:2307.08621)
PockEngine: Sparse and Efficient Fine-tuning in a Pocket (arXiv:2310.17752)
Contrastive Decoding: Open-ended Text Generation as Optimization (arXiv:2210.15097)
Contrastive Decoding Improves Reasoning in Large Language Models (arXiv:2309.09117)
Efficient Memory Management for Large Language Model Serving with PagedAttention (arXiv:2309.06180)
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models (arXiv:2309.03883)
Controlled Decoding from Language Models (arXiv:2310.17022)
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT) (arXiv:2309.08968)
Learning From Mistakes Makes LLM Better Reasoner (arXiv:2310.20689)
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection (arXiv:2310.11511)
YaRN: Efficient Context Window Extension of Large Language Models (arXiv:2309.00071)
DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales (arXiv:2308.01320)
Shepherd: A Critic for Language Model Generation (arXiv:2308.04592)
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints (arXiv:2305.13245)
Improving Large Language Model Fine-tuning for Solving Math Problems (arXiv:2310.10047)
Dialogue Act Classification with Context-Aware Self-Attention (arXiv:1904.02594)
It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations (arXiv:2005.04364)
Question Rewriting? Assessing Its Importance for Conversational Question Answering (arXiv:2201.09146)
Can Question Rewriting Help Conversational Question Answering? (arXiv:2204.06239)
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling (arXiv:2311.00430)
arXiv:2310.20707
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale (arXiv:2309.04564)
FlashDecoding++: Faster Large Language Model Inference on GPUs (arXiv:2311.01282)
Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization (arXiv:2311.06243)
Prompt Cache: Modular Attention Reuse for Low-Latency Inference (arXiv:2311.04934)
Co-training and Co-distillation for Quality Improvement and Compression of Language Models (arXiv:2311.02849)
NetDistiller: Empowering Tiny Deep Learning via In-Situ Distillation (arXiv:2310.19820)
LLM in a flash: Efficient Large Language Model Inference with Limited Memory (arXiv:2312.11514)
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models (arXiv:2404.07839)