---
base_model: Xclbr7/Arcanum-12b
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
---
<img src="https://cdn-uploads.huggingface.co/production/uploads/66dcee3321f901b049f48002/Fpdr8qCx9Xx4RHWgptCGD.png" width="800"/>

# Aether-12b

Aether-12b is a large language model created by fine-tuning Arcanum-12b on the CleverBoi-Data-20k dataset.

## Model Details 📋
- Developed by: AIXON Lab
- Model type: Causal Language Model
- Language(s): English (primarily), may support other languages
- License: apache-2.0
- Repository: https://huggingface.co/aixonlab/Aether-12b
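
Aether-12b can be loaded with the Hugging Face `transformers` library. The snippet below is a minimal sketch rather than an official recipe: the prompt and generation settings are illustrative, and it assumes the tokenizer ships a chat template (typical for Mistral-family models).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aixonlab/Aether-12b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the precision used during fine-tuning
    device_map="auto",           # place the ~12B parameters on available devices
)

# Build a chat-formatted prompt from a single user turn.
messages = [{"role": "user", "content": "Explain why the sky is blue in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```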

## Model Architecture 🏗️
- Base model: Arcanum-12b
- Parameter count: ~12 billion
- Architecture specifics: Decoder-only transformer (Mistral family)

## Open LLM Leaderboard Evaluation Results
Coming soon!

## Training & Fine-tuning 🔄
Aether-12b was fine-tuned on the following dataset:
- Dataset: theprint/CleverBoi-Data-20k
- Fine-tuning method: TRL `SFTTrainer` with the AdamW optimizer, a cosine-decay learning-rate scheduler, and bfloat16 precision.
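
A minimal sketch of that setup with TRL is shown below. The hyperparameter values (learning rate, batch sizes, epochs) are illustrative assumptions, not the settings actually used for Aether-12b.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Dataset named on this card; the split and column layout are assumed.
dataset = load_dataset("theprint/CleverBoi-Data-20k", split="train")

config = SFTConfig(
    output_dir="aether-12b-sft",
    optim="adamw_torch",            # AdamW optimizer
    lr_scheduler_type="cosine",     # cosine decay LR scheduler
    bf16=True,                      # bfloat16 precision
    learning_rate=2e-5,             # illustrative, not the actual value
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
)

trainer = SFTTrainer(
    model="Xclbr7/Arcanum-12b",     # the Arcanum-12b base model
    train_dataset=dataset,
    args=config,
)
trainer.train()
```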

The CleverBoi-Data-20k dataset improved the model in the following ways:
1. Enhanced reasoning and problem-solving capabilities
2. Broader knowledge across a wide range of topics
3. Improved performance on writing and analysis tasks
4. Better contextual understanding and response generation

## Intended Use 🎯
Aether-12b is intended for use as a general-purpose assistant or as a chatbot serving a specific role.
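
For role-specific use, behaviour can be steered with a system-style instruction, as in the hypothetical sketch below. Note that some Mistral-derived chat templates do not accept a separate `system` role; if so, prepend the instruction to the first user message.

```python
# Hypothetical role prompt; pair it with the generation snippet shown above.
messages = [
    {"role": "system", "content": "You are a meticulous copy editor. Flag errors and suggest fixes."},
    {"role": "user", "content": "Their going to announce the results tomorrow."},
]
```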

## Ethical Considerations 🤔
As a fine-tuned model based on Arcanum-12b, this model may inherit biases and limitations from its parent model and the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.

## Acknowledgments 🙏
We acknowledge the contributions of:
- theprint for the amazing CleverBoi-Data-20k dataset