---
base_model: Xclbr7/Arcanum-12b
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
---
<img src="https://cdn-uploads.huggingface.co/production/uploads/66dcee3321f901b049f48002/Fpdr8qCx9Xx4RHWgptCGD.png" width="800"/>

# Aether-12b

Aether-12b is a large language model created by fine-tuning Arcanum-12b on the CleverBoi-Data-20k dataset.

## Model Details 📋
- Developed by: AIXON Lab
- Model type: Causal Language Model
- Language(s): English (primarily), may support other languages
- License: apache-2.0
- Repository: https://huggingface.co/aixonlab/Aether-12b
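
Aether-12b can be loaded with the Hugging Face `transformers` library. The snippet below is a minimal sketch rather than an official recipe: the prompt and generation settings are illustrative, and it assumes the tokenizer ships a chat template (typical for Mistral-family models).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aixonlab/Aether-12b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the precision used during fine-tuning
    device_map="auto",           # place the ~12B parameters on available devices
)

# Build a chat-formatted prompt from a single user turn.
messages = [{"role": "user", "content": "Explain why the sky is blue in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```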

## Model Architecture 🏗️
- Base model: Arcanum-12b
- Parameter count: ~12 billion
- Architecture specifics: Decoder-only transformer (Mistral family)

## Open LLM Leaderboard Evaluation Results
Coming soon!

## Training & Fine-tuning 🔄
Aether-12b was fine-tuned on the following dataset:
- Dataset: theprint/CleverBoi-Data-20k
- Fine-tuning method: TRL `SFTTrainer` with the AdamW optimizer, a cosine-decay learning-rate scheduler, and bfloat16 precision.
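
A minimal sketch of that setup with TRL is shown below. The hyperparameter values (learning rate, batch sizes, epochs) are illustrative assumptions, not the settings actually used for Aether-12b.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Dataset named on this card; the split and column layout are assumed.
dataset = load_dataset("theprint/CleverBoi-Data-20k", split="train")

config = SFTConfig(
    output_dir="aether-12b-sft",
    optim="adamw_torch",            # AdamW optimizer
    lr_scheduler_type="cosine",     # cosine decay LR scheduler
    bf16=True,                      # bfloat16 precision
    learning_rate=2e-5,             # illustrative, not the actual value
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
)

trainer = SFTTrainer(
    model="Xclbr7/Arcanum-12b",     # the Arcanum-12b base model
    train_dataset=dataset,
    args=config,
)
trainer.train()
```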

The CleverBoi-Data-20k dataset improved the model in the following ways:
1. Enhanced reasoning and problem-solving capabilities
2. Broader knowledge across a wide range of topics
3. Improved performance on writing and analysis tasks
4. Better contextual understanding and response generation

## Intended Use 🎯
Aether-12b is intended for use as a general-purpose assistant or as a chatbot serving a specific role.
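
For role-specific use, behaviour can be steered with a system-style instruction, as in the hypothetical sketch below. Note that some Mistral-derived chat templates do not accept a separate `system` role; if so, prepend the instruction to the first user message.

```python
# Hypothetical role prompt; pair it with the generation snippet shown above.
messages = [
    {"role": "system", "content": "You are a meticulous copy editor. Flag errors and suggest fixes."},
    {"role": "user", "content": "Their going to announce the results tomorrow."},
]
```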

## Ethical Considerations 🤔
As a fine-tuned model based on Arcanum-12b, this model may inherit biases and limitations from its parent model and the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.

## Acknowledgments 🙏
We acknowledge the contributions of:
- theprint for the amazing CleverBoi-Data-20k dataset