EmoBooks: Emotionally Intelligent Book Recommender (LoRA Adapter)
A fine-tuned Llama-3-8B-Instruct LoRA adapter for emotion-aware Sinhala book recommendations.
⚠️ This is a LoRA adapter (~168MB), not a full model. It must be loaded on top of the base model. See Architecture below.
Architecture
How Base Model + LoRA Adapter Works
```
┌───────────────────────────────────────────────────────┐
│                Full Inference Pipeline                │
│                                                       │
│   ┌───────────────────────────────────┐               │
│   │ Base Model (Frozen Weights)       │  ~5GB (4-bit) │
│   │ unsloth/llama-3-8b-instruct       │               │
│   │  - 32 Transformer layers          │               │
│   │  - 8B parameters (quantized)      │               │
│   │  - General language ability       │               │
│   └─────────────────┬─────────────────┘               │
│                     │ merge at runtime                │
│   ┌─────────────────┴─────────────────┐               │
│   │ LoRA Adapter (This Repo)          │  ~168MB       │
│   │ DiyRex/emobooks-llama3-lora       │               │
│   │  - Adds small weight deltas       │               │
│   │  - Targets 7 module types         │               │
│   │  - Rank 32, Alpha 64              │               │
│   │  - Emotion-aware behavior         │               │
│   └─────────────────┬─────────────────┘               │
│                     │                                 │
│                     ▼                                 │
│                EmoBooks Output                        │
│    (Empathetic, safety-filtered recommendations)      │
└───────────────────────────────────────────────────────┘
```
The base model provides general language understanding. It knows English, grammar, how to follow instructions, and conversational patterns.
The LoRA adapter teaches it EmoBooks-specific behavior: mood detection, empathetic acknowledgments, the match/switch protocol, book title formatting, and critical safety rules (never recommending dark books to sad users who want to feel better).
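As a rough sanity check on why the adapter is ~168MB (back-of-the-envelope arithmetic, assuming the standard Llama-3-8B shapes: hidden size 4096, GQA key/value dim 1024, MLP dim 14336, 32 layers): a rank-32 LoRA over all seven projection types adds about 84M parameters, which is ~168MB at 16-bit precision.

```python
# Rough size check for a rank-32 LoRA over q/k/v/o/gate/up/down projections.
r, hidden, kv, mlp, layers = 32, 4096, 1024, 14336, 32

per_layer = (
    r * (hidden + hidden) * 2   # q_proj, o_proj  (4096 -> 4096)
    + r * (hidden + kv) * 2     # k_proj, v_proj  (4096 -> 1024, GQA)
    + r * (hidden + mlp) * 3    # gate_proj, up_proj, down_proj
)
total = per_layer * layers
print(f"{total / 1e6:.1f}M params, ~{total * 2 / 1e6:.0f} MB at 2 bytes each")
# -> ~83.9M params, ~168 MB
```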
What's in This Repo
| File | Size | Purpose |
|---|---|---|
| `adapter_model.safetensors` | 168MB | LoRA weight deltas (the fine-tuned parameters) |
| `adapter_config.json` | 1KB | LoRA config (rank, alpha, target modules, base model reference) |
| `tokenizer.json` | 17MB | Tokenizer vocabulary (same as base, included for convenience) |
| `tokenizer_config.json` | 51KB | Tokenizer settings and chat template |
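If you want to confirm which base model and LoRA settings the adapter declares before downloading anything large, `adapter_config.json` is a standard PEFT adapter config and can be inspected directly. A minimal sketch (field names follow the PEFT adapter format):

```python
import json
from huggingface_hub import hf_hub_download

# Fetch only the ~1KB adapter config, not the weights.
path = hf_hub_download("DiyRex/emobooks-llama3-lora", "adapter_config.json")
with open(path) as f:
    cfg = json.load(f)

print(cfg["base_model_name_or_path"])  # base model this adapter must be loaded onto
print(cfg["r"], cfg["lora_alpha"])     # LoRA rank and alpha
print(cfg["target_modules"])           # modules that receive weight deltas
```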
Training Details (v9, Apr 2026)
| Parameter | Value |
|---|---|
| Base Model | unsloth/llama-3-8b-instruct-bnb-4bit |
| Method | QLoRA (4-bit quantized base + LoRA adapters) |
| LoRA Rank (r) | 32 |
| LoRA Alpha | 64 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training Steps | 3000 |
| Learning Rate | 1.0e-4 (cosine) |
| Effective Batch Size | 16 (per_device 2 Γ grad_accum 8) |
| Dataset | DiyRex/emobooks-dataset (`data/emobooks_chat_v3.jsonl`, 66,000 multi-turn rows) |
| Catalog | 607 Sinhala novels, English-language Sri Lankan authors removed (Carl Muller, Punyakante Wijenaike, etc.) |
| Anti-hallucination | Post-hoc `_enforce_catalog` guardrail in the runtime: every "X by Y" mention is validated against the catalog index; mismatches are rewritten or stripped |
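The `_enforce_catalog` guardrail lives in the EmoBooks runtime, not in this adapter repo. The sketch below only illustrates the idea under assumed data structures (a title-keyed catalog dict and a simple "Title by Author" regex); the real implementation may differ:

```python
import re

# Illustrative catalog index: normalized title -> canonical author.
# In practice this would be built from reference/reference_books.json.
CATALOG = {
    "madol doova": "Martin Wickramasinghe",
}

MENTION = re.compile(r"(?P<title>[A-Z][\w'’ \-]{2,60}?) by (?P<author>[A-Z][\w'’ .\-]{2,60})")

def enforce_catalog(reply: str) -> str:
    """Strip any 'Title by Author' mention that does not match the catalog."""
    def check(m: re.Match) -> str:
        author = CATALOG.get(m["title"].strip().lower())
        if author and author.lower() == m["author"].strip().lower():
            return m.group(0)  # valid catalog mention, keep as-is
        return "[recommendation removed: not in catalog]"
    return MENTION.sub(check, reply)
```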
How It Works
- User shares mood (e.g., "I feel lonely today") → the model acknowledges it empathetically.
- Natural Flow:
  - Explicit: the model asks "Match your mood or Switch?" when user intent is vague.
  - Implicit: the model infers intent from context (e.g., "Cheer me up" → Switch) and recommends directly.
  - Direct: the model honors specific requests (e.g., "Recommend a thriller") without unnecessary mood questioning.
- Greetings: the model handles "Hi/Hello" gracefully without forcing a recommendation.
- Single Recommendation: the model recommends exactly one book, with title, author, and description.
- Safety: when sad/anxious/angry users choose "Switch", ONLY uplifting books are recommended (see the sketch below).
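The safety rule boils down to a tiny policy. The following is a schematic of the behavior the adapter was trained toward, not code shipped with the model (mood and category labels are illustrative):

```python
NEGATIVE_MOODS = {"sad", "anxious", "angry", "lonely"}
UPLIFTING = {"humour", "feel-good", "adventure", "inspirational"}

def allowed_categories(mood: str, choice: str, catalog_categories: set[str]) -> set[str]:
    """'Switch' from a negative mood must only surface uplifting books;
    'Match' (or a neutral mood) leaves the full catalog available."""
    if mood in NEGATIVE_MOODS and choice == "switch":
        return UPLIFTING & catalog_categories
    return catalog_categories
```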
Quick Start (Inference)
Option A: Using Unsloth (Recommended, fastest)
```python
from unsloth import FastLanguageModel

# Step 1: Load base model + LoRA adapter in one call.
# Unsloth reads adapter_config.json, finds base_model_name_or_path,
# downloads llama-3-8b-instruct (~5GB), and loads the LoRA adapter on top.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="DiyRex/emobooks-llama3-lora",  # This repo
    max_seq_length=2048,
    load_in_4bit=True,  # 4-bit quantization for ~5GB VRAM usage
)
FastLanguageModel.for_inference(model)  # Enable 2x faster inference

# Step 2: Chat with the model
messages = [{"role": "user", "content": "I feel lonely today and I'm alone at home"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Option B: Using Transformers + PEFT (No Unsloth dependency)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Step 1: Load the base model with 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/llama-3-8b-instruct-bnb-4bit",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("DiyRex/emobooks-llama3-lora")

# Step 2: Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "DiyRex/emobooks-llama3-lora")
model.eval()

# Step 3: Inference (same as above)
messages = [{"role": "user", "content": "I feel lonely today"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Continue Training (Retraining from this Adapter)
You can resume fine-tuning from this checkpoint without starting from scratch:
```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Step 1: Load this adapter (LoRA layers are already attached)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="DiyRex/emobooks-llama3-lora",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Step 2: Load your new/updated dataset
dataset = load_dataset(
    "DiyRex/emobooks-dataset",
    data_files="data/emobooks_training_v6.jsonl",
    split="train",
)

# Step 3: Configure and run training
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=TrainingArguments(
        output_dir="./outputs",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=2,
        learning_rate=5e-5,
        fp16=True,
        logging_steps=10,
    ),
)
trainer.train()

# Step 4: Save and push the updated adapter
model.save_pretrained("./outputs/lora_adapter_v2")
model.push_to_hub("DiyRex/emobooks-llama3-lora")  # Updates main branch
```
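Note that the v9 rows are multi-turn chat conversations rather than plain text. Recent versions of trl detect a conversational `messages` column and apply the chat template automatically; with older versions you may need to render the text yourself, roughly as below (assumes a `messages` column; adjust to the actual dataset schema):

```python
def to_text(example):
    # Render the multi-turn conversation into one training string
    # using the tokenizer's Llama-3 chat template.
    return {"text": tokenizer.apply_chat_template(example["messages"], tokenize=False)}

dataset = dataset.map(to_text)
# ...then pass dataset_text_field="text" when constructing SFTTrainer.
```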
Merging into a Standalone Model (Fusing)
If you need a standalone model without requiring the base model separately (e.g., for GGUF export or production deployment):
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="DiyRex/emobooks-llama3-lora",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Merge LoRA weights into the base model (creates a ~16GB fp16 model)
model.save_pretrained_merged("./merged_model", tokenizer, save_method="merged_16bit")

# Or export directly to GGUF for llama.cpp / Ollama
model.save_pretrained_gguf("./gguf_model", tokenizer, quantization_method="q4_k_m")
```
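Once exported, the GGUF file can be served by any llama.cpp-based runtime. A quick smoke test with llama-cpp-python (the exact output file name depends on what Unsloth writes into ./gguf_model, so check the directory first):

```python
from llama_cpp import Llama

# Path is illustrative: use whichever .gguf file the export step produced.
llm = Llama(model_path="./gguf_model/model-q4_k_m.gguf", n_ctx=2048)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "I feel lonely today"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```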
Dataset Versions
Available at DiyRex/emobooks-dataset:
| Version | Samples | Focus |
|---|---|---|
| v3 | 5000 | Format compliance: single-book output, match/switch protocol |
| v4 | 5000 | Expanded prompts and conversational variety |
| v5 | 5000 | Category-aware descriptions |
| v6 | 5000 | Sentiment-safe: keyword shield, 5 moods, 100 prompt styles, unique descriptions |
| v7 | 6000 | Conversational: explicit, implicit, and direct intent detection |
| v8 | 6600 | Balanced: added neutral greetings to prevent "Model Aggression" |
| v9 (Apr 2026) | 66,000 | Cleaned Sinhala-only catalog (607 books, no English-language authors), readable Singlish transliteration, strict + soft-offer SYSTEM_PROMPT, 9 dialog arcs × 8 emotions, anti-hallucination guardrail at runtime |
The v9 chat file is `data/emobooks_chat_v3.jsonl` (the file name is kept for backwards compatibility with training scripts; the release tag is v9.0). The human-readable catalog reference lives at `reference/reference_books.{json,csv}` and `reference/curated_sinhala_novels.json`.
License
Apache 2.0