Turbine Qwen Marketing Adapters
Model summary
This repository contains a set of LoRA/PEFT adapters fine-tuned for B2B marketing and enablement copy in the workforce/apprenticeship domain, with optional persona specialization.
Adapters are trained on cleaned content extracted from a Turbine website codebase (MDX/MD and TSX pages), then shaped into persona-specific instruction-style examples.
Artifacts
- `generic` adapter: general Turbine marketing voice and product positioning.
- Persona adapters: `employers`, `sponsors`, `vocational_trainers`, `state_workforce_boards`
These adapters are designed to be loaded onto the same base model:
- Base: `marketeam/Qwen-Marketing`
Intended use
Primary intended use
Generate, rewrite, and structure procurement-safe and persona-relevant marketing copy such as:
- landing page sections (headlines, bullets, CTAs)
- outbound emails and follow-ups
- one-pagers and enablement summaries
- objection handling
- benefit framing by stakeholder type
Out-of-scope use
Not intended for:
- legal advice or regulatory determinations
- guarantees of compliance, eligibility, or funding
- generating factual claims or metrics not present in the provided context
- sensitive personal data processing
- automated decision-making in high-stakes contexts (eligibility, enforcement)
Persona behaviors
Persona adapters are optimized for differences in framing and tone:
- Employers: ROI, speed-to-value, operational outcomes, reduced admin overhead.
- Sponsors: audit readiness, evidence capture, lifecycle operations, reporting artifacts.
- Vocational trainers: curriculum alignment, competency tracking, placements, employer alignment.
- State workforce boards: procurement-safe language, governance, interoperability, outcomes reporting, equity framing; avoids hype.
Training data
Source
Training inputs were derived from: https://turbineworkforce.com
- `src/markdown/**` (MDX/MD content)
- `src/pages/**` (TSX/TS content)

Exclusions:
- paths containing `blogPosts`
- non-content markup and UI-only code
Cleaning and extraction
- HTML tags, MDX/JSX tags, and common TSX markup are removed.
- Frontmatter/imports/exports are stripped.
- Whitespace is normalized.
- Content is chunked into token windows for training.
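The cleaning and chunking steps above can be sketched roughly as follows; the regexes and the whitespace-token window are illustrative assumptions, not the exact pipeline:

```python
import re

def clean_source(text: str) -> str:
    """Strip frontmatter, imports/exports, and markup (illustrative regexes)."""
    text = re.sub(r"^---\n.*?\n---\n", "", text, flags=re.DOTALL)          # frontmatter
    text = re.sub(r"^(import|export)\b.*$", "", text, flags=re.MULTILINE)  # ES imports/exports
    text = re.sub(r"<[^>]+>", " ", text)                                   # HTML/MDX/JSX tags
    return re.sub(r"\s+", " ", text).strip()                               # normalize whitespace

def chunk(text: str, window: int = 512) -> list[str]:
    """Split cleaned text into fixed-size word windows (a stand-in for token windows)."""
    words = text.split()
    return [" ".join(words[i:i + window]) for i in range(0, len(words), window)]
```

In practice the windowing would use the base model's tokenizer rather than whitespace words, but the shape of the pass is the same.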
Persona dataset generation
Persona datasets are auto-generated from non-persona text by applying persona-specific instruction frames. The resulting examples are chat-style records:
- `system`: persona writing constraints
- `user`: transformation task (rewrite / one-pager / email / objections)
- `assistant`: grounded content derived from the source text (no invented metrics)
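A single training record might look like the following; the content is invented for illustration, and the field names assume the common chat-messages JSONL shape:

```python
import json

# Illustrative persona record in chat-messages JSONL form (invented content).
record = {
    "messages": [
        {"role": "system", "content": "Write as Turbine for sponsors: audit-ready, "
                                      "evidence-focused, no invented metrics."},
        {"role": "user", "content": "Rewrite the following section as a one-pager "
                                    "paragraph: <source text>"},
        {"role": "assistant", "content": "Grounded rewrite derived only from the "
                                         "source text."},
    ]
}
line = json.dumps(record)  # one record per line in train.jsonl
```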
Data files
Typical structure:
- `out/splits/train.jsonl`, `out/splits/val.jsonl` (non-persona)
- `out/persona_splits/<persona>/train.jsonl`, `out/persona_splits/<persona>/val.jsonl`
Training procedure
Method
- Parameter-efficient fine-tuning using LoRA (PEFT) on a causal LM.
Key hyperparameters (typical)
- LoRA:
  - `target_modules`: `["q_proj", "v_proj"]`
  - `r`: 4–16 (persona-dependent)
  - `lora_alpha`: usually `2 * r`
  - `lora_dropout`: 0.05–0.15 (higher for more conservative personas)
- Training:
  - `max_length`: ~512 tokens
  - `effective_batch_size`: ~16 (via gradient accumulation)
  - `learning_rate`: ~1e-4 (LoRA-appropriate)
  - mixed precision: `fp16`
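As a rough illustration, the scheme above (alpha tied to `2 * r`, dropout raised for conservative personas) can be expressed as a plain dict mirroring `peft.LoraConfig` field names; the helper itself is a sketch, not part of the training code:

```python
def make_lora_config(r: int = 8, conservative: bool = False) -> dict:
    """Illustrative LoRA hyperparameters; keys mirror peft.LoraConfig fields."""
    return {
        "target_modules": ["q_proj", "v_proj"],
        "r": r,
        "lora_alpha": 2 * r,                          # alpha tied to rank
        "lora_dropout": 0.15 if conservative else 0.05,
        "task_type": "CAUSAL_LM",
    }
```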
Output
Only adapter weights are produced (base model weights are unchanged), unless explicitly merged.
Evaluation
What to evaluate
This is a marketing adapter set; success is measured by:
- persona separation: different stakeholders yield different framing
- groundedness: no invented metrics or unsupported claims
- style consistency: Turbine voice, clarity, procurement-safety where required
- format compliance: headline/bullets/CTA structure
Recommended evaluation
- Track `eval_loss` during training for overfit detection.
- Run fixed prompt suites per persona:
- rewrite a paragraph → headline + bullets + CTA
- produce an email + objection handling
- procurement-safe one-pager section (boards)
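A fixed prompt suite plus a simple check of the headline/bullets/CTA structure might look like this; the suite contents and the check heuristics are my own assumptions:

```python
# Illustrative per-persona prompt suite (contents are placeholders).
PROMPT_SUITE = {
    "employers": "Rewrite this paragraph as a headline, three bullets, and a CTA: ...",
    "state_workforce_boards": "Draft a procurement-safe one-pager section about: ...",
}

def has_headline_bullets_cta(text: str) -> bool:
    """Heuristic format check: first non-empty line is the headline, at least
    one '-' bullet in the middle, and a non-bullet closing CTA line."""
    lines = [l.strip() for l in text.strip().splitlines() if l.strip()]
    if len(lines) < 3:
        return False
    has_bullets = any(l.startswith("-") for l in lines[1:-1])
    has_cta = not lines[-1].startswith("-")
    return has_bullets and has_cta
```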
Limitations
- Outputs may still contain generic marketing phrasing if persona data is small.
- The model can produce plausible-sounding claims if prompted to do so; users should provide source context and require citations/grounding in their pipeline if needed.
- Content quality depends on the source corpus quality; missing or thin pages lead to weaker examples.
- Not a compliance engine; it can mention regulations but should not be trusted to interpret them.
Safety and responsible use
- Do not use this model to fabricate metrics, outcomes, compliance guarantees, or legal statements.
- For state/board-facing outputs, enforce “procurement-safe” constraints and require that all claims be supported by provided context.
- Add a post-generation validation layer (rule-based or LLM-based) to flag:
- numeric claims not present in context
- absolute/guarantee language (“ensures”, “guaranteed”, “fully compliant”)
- hallucinated program names or statutes
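A rule-based pass can cover the first two checks; the patterns below are illustrative, not exhaustive (statute/program-name detection would need a curated list or an LLM judge):

```python
import re

# Absolute/guarantee language to flag (illustrative, extend as needed).
GUARANTEE_TERMS = re.compile(r"\b(ensures?|guaranteed?|fully compliant)\b", re.IGNORECASE)

def flag_claims(output: str, context: str) -> list[str]:
    """Return warnings for guarantee language and numbers absent from the context."""
    flags = []
    if GUARANTEE_TERMS.search(output):
        flags.append("guarantee language")
    context_numbers = set(re.findall(r"\d+(?:\.\d+)?%?", context))
    for num in re.findall(r"\d+(?:\.\d+)?%?", output):
        if num not in context_numbers:
            flags.append(f"numeric claim not in context: {num}")
    return flags
```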
How to use
Load base + adapter (Transformers + PEFT)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "marketeam/Qwen-Marketing"
ADAPTER = "out/adapters/sponsors"  # or employers, vocational_trainers, state_workforce_boards

tok = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True
)
model = PeftModel.from_pretrained(model, ADAPTER)

prompt = "Write a sponsor-facing one-pager section about audit-ready evidence capture."
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.7)
print(tok.decode(out[0], skip_special_tokens=True))
```