Instructions for using akemiH/JMLR with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use akemiH/JMLR with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="akemiH/JMLR")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("akemiH/JMLR")
model = AutoModelForCausalLM.from_pretrained("akemiH/JMLR")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use akemiH/JMLR with vLLM:
Install from pip and serve the model:
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "akemiH/JMLR"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "akemiH/JMLR",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker:
```shell
docker model run hf.co/akemiH/JMLR
```
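Once a server with an OpenAI-compatible API is running (vLLM above, or SGLang below on its own port), it can also be called from Python instead of curl. The sketch below is an assumption, not part of the model card: it builds the same `/v1/completions` request using only the standard library, and leaves the actual send commented out because it requires a live server.

```python
import json
from urllib import request

def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build an OpenAI-compatible /v1/completions request (does not send it)."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_completion_request("http://localhost:8000", "akemiH/JMLR", "Once upon a time,")

# To actually send the request (requires a running server):
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
```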
- SGLang
How to use akemiH/JMLR with SGLang:
Install from pip and serve the model:
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "akemiH/JMLR" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "akemiH/JMLR",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images:
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "akemiH/JMLR" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "akemiH/JMLR",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use akemiH/JMLR with Docker Model Runner:
```shell
docker model run hf.co/akemiH/JMLR
```
JMLR-13B For MedQA
Large Language Models (LLMs) have demonstrated remarkable potential in medical knowledge acquisition and question answering. However, LLMs can hallucinate and yield factually incorrect outputs, even with domain-specific pretraining. Retrieval-augmented generation (RAG) has previously had limited success in addressing these hallucinations. Unlike prior RAG methods, in which the retrieval model was trained separately from the LLM, we introduce JMLR (Jointly trained LLM and information Retrieval (IR)), which trains both components together during the fine-tuning phase. This synchronized training mechanism enhances JMLR's ability to retrieve clinical guidelines and leverage medical knowledge for reasoning and question answering, while reducing the demand for computational resources. We evaluated JMLR on medical question answering, an important clinical application. Our experimental results show that JMLR-13B (70.5%) outperforms a previous state-of-the-art open-source model trained with conventional pre-training and fine-tuning, Meditron-70B (68.9%), as well as Llama2-13B with RAG (54.9%), on a medical question-answering dataset. JMLR-13B (148 GPU hours) also trains much faster than Meditron-70B (42,630 GPU hours). Through this work, we provide a new and efficient knowledge-enhancement tool for healthcare, demonstrating the potential of integrating IR and LLM training for medical question-answering systems. The code, along with the selected retrieval data that can be made public, is included in the supplementary material and will be made publicly accessible under a CC-BY 4.0 license upon the paper's acceptance.
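The core idea of jointly training the LLM and the retriever can be illustrated with a RAG-style marginalized loss: the retriever's document probabilities weight the language model's answer likelihoods, so lowering the loss updates both components. The sketch below is purely illustrative with toy numbers; it is not the paper's actual objective or implementation.

```python
import math

def joint_nll(retriever_scores, lm_answer_probs):
    """Negative log-likelihood of the answer, marginalized over retrieved docs.

    retriever_scores: unnormalized relevance scores, one per document (toy values).
    lm_answer_probs: P(answer | question, doc) for each document (toy values).
    """
    # Softmax over retriever scores -> P(doc | question)
    exps = [math.exp(s) for s in retriever_scores]
    z = sum(exps)
    doc_probs = [e / z for e in exps]
    # Marginal answer likelihood: sum_d P(doc | question) * P(answer | question, doc)
    marginal = sum(pd * pa for pd, pa in zip(doc_probs, lm_answer_probs))
    return -math.log(marginal)

# A relevant guideline (score 2.0) makes the correct answer likely (0.9);
# an irrelevant document (score 0.0) does not (0.1).
loss = joint_nll([2.0, 0.0], [0.9, 0.1])
```

Under this kind of objective, the loss can fall in two ways: the retriever learns to up-weight documents under which the LM answers well, and the LM learns to answer better given the retrieved documents; synchronized training optimizes both at once.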
| Model | Parameter | Open Access | MedQA | Amboss | MMLU | MedMCQA | Average |
|---|---|---|---|---|---|---|---|
| GPT-4 | ? | No | 74.7 | 82.1 | 88.4 | 69.5 | 78.6 |
| ChatGPT | ? | No | 50.2 | 49.1 | 69.4 | 51.0 | 54.9 |
| Meditron | 70B | Yes | 60.7 | 76.4 | 73.6 | 65.1 | 68.9 |
| RAG | 13B | Yes | 59.9 | 76.9 | 69.9 | 64.2 | 67.7 |
| JMLR | 13B | Yes | 62.5 | 81.2 | 72.8 | 65.5 | 70.5 |
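The Average column is the unweighted mean of the four benchmark columns; for example, for JMLR-13B:

```python
# JMLR-13B per-benchmark accuracies from the table above
scores = {"MedQA": 62.5, "Amboss": 81.2, "MMLU": 72.8, "MedMCQA": 65.5}

average = sum(scores.values()) / len(scores)
print(round(average, 1))  # → 70.5, matching the reported average
```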