williamchangtw chenjiel committed on
Commit 52f42b9 · 0 Parent(s)

Duplicate from nvidia/Qwen3-Next-80B-A3B-Thinking-NVFP4


Co-authored-by: Chenjie Luo <chenjiel@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,37 @@
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
6
+ *.ftz filter=lfs diff=lfs merge=lfs -text
7
+ *.gz filter=lfs diff=lfs merge=lfs -text
8
+ *.h5 filter=lfs diff=lfs merge=lfs -text
9
+ *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
+ *.model filter=lfs diff=lfs merge=lfs -text
13
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
14
+ *.npy filter=lfs diff=lfs merge=lfs -text
15
+ *.npz filter=lfs diff=lfs merge=lfs -text
16
+ *.onnx filter=lfs diff=lfs merge=lfs -text
17
+ *.ot filter=lfs diff=lfs merge=lfs -text
18
+ *.parquet filter=lfs diff=lfs merge=lfs -text
19
+ *.pb filter=lfs diff=lfs merge=lfs -text
20
+ *.pickle filter=lfs diff=lfs merge=lfs -text
21
+ *.pkl filter=lfs diff=lfs merge=lfs -text
22
+ *.pt filter=lfs diff=lfs merge=lfs -text
23
+ *.pth filter=lfs diff=lfs merge=lfs -text
24
+ *.rar filter=lfs diff=lfs merge=lfs -text
25
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
26
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
28
+ *.tar filter=lfs diff=lfs merge=lfs -text
29
+ *.tflite filter=lfs diff=lfs merge=lfs -text
30
+ *.tgz filter=lfs diff=lfs merge=lfs -text
31
+ *.wasm filter=lfs diff=lfs merge=lfs -text
32
+ *.xz filter=lfs diff=lfs merge=lfs -text
33
+ *.zip filter=lfs diff=lfs merge=lfs -text
34
+ *.zst filter=lfs diff=lfs merge=lfs -text
35
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ model.safetensors.index.json filter=lfs diff=lfs merge=lfs -text
37
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,200 @@
1
+ ---
2
+ pipeline_tag: text-generation
3
+ base_model:
4
+ - Qwen/Qwen3-Next-80B-A3B-Thinking
5
+ license: apache-2.0
6
+ library_name: Model Optimizer
7
+ tags:
8
+ - nvidia
9
+ - ModelOpt
10
+ - Qwen3
11
+ - quantized
12
+ - NVFP4
13
+ - nvfp4
14
+ ---
15
+
16
+ # Model Overview
17
+
18
+ ## Description:
19
+ The NVIDIA Qwen3-Next-80B-A3B-Thinking NVFP4 model is the quantized version of Alibaba's Qwen3-Next-80B-A3B-Thinking model, which is an auto-regressive language model that uses an optimized transformer architecture. For more information, please check [here](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking). The NVIDIA Qwen3-Next-80B-A3B-Thinking NVFP4 model is quantized with [TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer).
20
+
21
+ This model is ready for commercial/non-commercial use. <br>
22
+
23
+ ## Third-Party Community Consideration
24
+ This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see link to Non-NVIDIA [(Qwen3-Next-80B-A3B-Thinking) Model Card](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking).
25
+
26
+ ### License/Terms of Use:
27
+ [Apache license 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md)
28
+
29
+ ### Deployment Geography:
30
+ Global <br>
31
+
32
+ ### Use Case:
33
+ Developers looking to deploy off-the-shelf, pre-quantized models in AI agent systems, chatbots, RAG systems, and other AI-powered applications. <br>
34
+
35
+ ### Release Date:
36
+ Hugging Face 12/29/2025 via https://huggingface.co/nvidia/Qwen3-Next-80B-A3B-Thinking-NVFP4 <br>
37
+
38
+ ## Model Architecture:
39
+ **Architecture Type:** Transformers <br>
40
+ **Network Architecture:** Qwen3NextForCausalLM <br>
41
+ **This model was developed based on:** [Qwen3-Next-80B-A3B-Thinking](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking) <br>
42
+ **Number of model parameters:** Undisclosed. <br>
43
+
44
+ ## Input:
45
+ **Input Type(s):** Text <br>
46
+ **Input Format(s):** String <br>
47
+ **Input Parameters:** 1D (One-Dimensional): Sequences <br>
48
+ **Other Properties Related to Input:** Context length 262,144 natively and extensible up to 1,010,000 tokens <br>
49
+
50
+ ## Output:
51
+ **Output Type(s):** Text <br>
52
+ **Output Format:** String <br>
53
+ **Output Parameters:** 1D (One-Dimensional): Sequences <br>
54
+ **Other Properties Related to Output:** N/A <br>
55
+
56
+ Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions. <br>
57
+
58
+ ## Software Integration:
59
+ **Runtime Engine(s):** <br>
60
+ * TensorRT-LLM <br>
61
+
62
+ **Supported Hardware Microarchitecture Compatibility:** <br>
63
+ * NVIDIA Blackwell <br>
64
+
65
+ **Preferred Operating System(s):** <br>
66
+ * Linux <br>
67
+
68
+ The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.
69
+
70
+ ## Model Version(s):
71
+ The model is quantized with nvidia-modelopt **v0.40.0** <br>
72
+
73
+ ## Training, Testing, and Evaluation Datasets:
74
+
75
+ ## Calibration Dataset:
76
+ **Link:** [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail), [Nemotron-Post-Training-Dataset-v2](https://huggingface.co/datasets/nvidia/Nemotron-Post-Training-Dataset-v2) <br>
77
+ **Data Collection Method by dataset:** Automated. <br>
78
+ **Labeling Method by dataset:** Automated. <br>
79
+
80
+ ## Training Dataset:
81
+ **Data Modality:** Undisclosed <br>
82
+ **Data Collection Method by dataset:** Undisclosed <br>
83
+ **Labeling Method by dataset:** Undisclosed <br>
84
+ **Properties:** Undisclosed
85
+
86
+ ## Testing Dataset:
87
+ **Data Collection Method by dataset:** Undisclosed <br>
88
+ **Labeling Method by dataset:** Undisclosed <br>
89
+ **Properties:** Undisclosed <br>
90
+
91
+ ## Evaluation Dataset:
92
+ **Datasets:** MMLU Pro, GPQA Diamond, LiveCodeBench V6, SciCode, AIME 2025 <br>
93
+ **Data Collection Method by dataset:** Hybrid: Automated, Human <br>
94
+ **Labeling Method by dataset:** Hybrid: Human, Automated <br>
95
+
96
+
97
+ ## Inference:
98
+ **Acceleration Engine:** TensorRT-LLM <br>
99
+ **Test Hardware:** B200 <br>
100
+
101
+ ## Post Training Quantization
102
+ This model was obtained by quantizing the weights and activations of Qwen3-Next-80B-A3B-Thinking to the NVFP4 data type, ready for inference with TensorRT-LLM. Only the weights and activations of the linear operators within the transformer blocks are quantized. This optimization reduces the number of bits per parameter from 16 to 4, cutting disk size and GPU memory requirements by approximately 3.3x.
103
+
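+ The exact quantization recipe used to produce this checkpoint is not reproduced in this repository, but as a rough sketch (assuming the current `nvidia-modelopt` post-training quantization API; `calibration_prompts` and the output directory below are placeholders, and the per-module exclusions and FP8 KV-cache settings used for this checkpoint are not shown), NVFP4 PTQ with TensorRT Model Optimizer typically looks like the following:
+
+ ```python
+ # Hedged sketch of NVFP4 post-training quantization with TensorRT Model Optimizer.
+ # API names follow recent nvidia-modelopt releases and may differ across versions.
+ import torch
+ import modelopt.torch.quantization as mtq
+ from modelopt.torch.export import export_hf_checkpoint
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "Qwen/Qwen3-Next-80B-A3B-Thinking"
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16", device_map="auto")
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+ # Placeholder calibration set; in practice, samples from the calibration datasets listed above.
+ calibration_prompts = ["The quick brown fox jumps over the lazy dog."]
+
+ def forward_loop(m):
+     # Run calibration data through the model so ModelOpt can collect activation statistics.
+     with torch.no_grad():
+         for prompt in calibration_prompts:
+             inputs = tokenizer(prompt, return_tensors="pt").to(m.device)
+             m(**inputs)
+
+ # Quantize the linear layers to NVFP4 and export a Hugging Face-style checkpoint.
+ model = mtq.quantize(model, mtq.NVFP4_DEFAULT_CFG, forward_loop)
+ export_hf_checkpoint(model, export_dir="Qwen3-Next-80B-A3B-Thinking-NVFP4")
+ ```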
104
+ ## Usage
105
+
106
+ ### Deploy with TensorRT-LLM
107
+
108
+ To deploy the quantized checkpoint with the [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) LLM API, follow the sample code below:
109
+
110
+ * LLM API sample usage:
111
+ ```python
112
+ from tensorrt_llm import LLM, SamplingParams
113
+ from tensorrt_llm.llmapi import KvCacheConfig
114
+
115
+
116
+ def main():
117
+
118
+ prompts = [
119
+ "Hello, my name is",
120
+ "The president of the United States is",
121
+ "The capital of France is",
122
+ "The future of AI is",
123
+ ]
124
+ sampling_params = SamplingParams(temperature=0.6, top_p=0.95)
125
+ kv_cache_config = KvCacheConfig(enable_block_reuse=False)
126
+
127
+ llm = LLM(model="nvidia/Qwen3-Next-80B-A3B-Thinking-NVFP4", tensor_parallel_size=4, kv_cache_config=kv_cache_config)
128
+
129
+ outputs = llm.generate(prompts, sampling_params)
130
+
131
+ # Print the outputs.
132
+ for output in outputs:
133
+ prompt = output.prompt
134
+ generated_text = output.outputs[0].text
135
+ print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
136
+
137
+
138
+ # The entry point of the program needs to be protected for spawning processes.
139
+ if __name__ == '__main__':
140
+ main()
141
+
142
+ ```
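+
+ The example above assumes four GPUs (`tensor_parallel_size=4`); adjust this to your hardware. Depending on your TensorRT-LLM version, the same checkpoint can typically also be exposed as an OpenAI-compatible endpoint (for example with `trtllm-serve`); refer to the TensorRT-LLM documentation for the exact command-line flags.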
143
+
144
+ ### Evaluation
145
+ The accuracy benchmark results are presented in the table below:
146
+ <table>
147
+ <tr>
148
+ <td><strong>Precision</strong>
149
+ </td>
150
+ <td><strong>MMLU Pro</strong>
151
+ </td>
152
+ <td><strong>GPQA Diamond</strong>
153
+ </td>
154
+ <td><strong>LiveCodeBench V6</strong>
155
+ </td>
156
+ <td><strong>SciCode</strong>
157
+ </td>
158
+ <td><strong>AIME 2025</strong>
159
+ </td>
160
+ </tr>
161
+ <tr>
162
+ <td>FP8
163
+ </td>
164
+ <td>0.823
165
+ </td>
166
+ <td>0.754
167
+ </td>
168
+ <td>0.714
169
+ </td>
170
+ <td>0.414
171
+ </td>
172
+ <td>0.879
173
+ </td>
174
+ </tr>
175
+ <tr>
176
+ <td>NVFP4
177
+ </td>
178
+ <td>0.822
179
+ </td>
180
+ <td>0.752
181
+ </td>
182
+ <td>0.708
183
+ </td>
184
+ <td>0.409
185
+ </td>
186
+ <td>0.862
187
+ </td>
188
+ </tr>
190
+ </table>
191
+
192
+ > Baseline: [Qwen3-Next-80B-A3B-Thinking-FP8](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking-FP8).
193
+ > Benchmarked with temperature=0.6, top_p=0.95, and max num tokens 81,920.
194
+
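+ To approximate these benchmark settings with the LLM API shown above, a sampling configuration along the following lines can be used (a sketch only; it assumes "max num tokens" refers to the maximum number of generated tokens and that your TensorRT-LLM version supports the `max_tokens` argument):
+
+ ```python
+ # Hedged sketch: sampling settings matching the benchmark note above.
+ from tensorrt_llm import SamplingParams
+
+ sampling_params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=81920)
+ ```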
195
+
196
+ ## Ethical Considerations
197
+
198
+ NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
199
+
200
+ Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://app.intigriti.com/programs/nvidia/nvidiavdp/detail).
added_tokens.json ADDED
@@ -0,0 +1,28 @@
1
+ {
2
+ "</think>": 151668,
3
+ "</tool_call>": 151658,
4
+ "</tool_response>": 151666,
5
+ "<think>": 151667,
6
+ "<tool_call>": 151657,
7
+ "<tool_response>": 151665,
8
+ "<|box_end|>": 151649,
9
+ "<|box_start|>": 151648,
10
+ "<|endoftext|>": 151643,
11
+ "<|file_sep|>": 151664,
12
+ "<|fim_middle|>": 151660,
13
+ "<|fim_pad|>": 151662,
14
+ "<|fim_prefix|>": 151659,
15
+ "<|fim_suffix|>": 151661,
16
+ "<|im_end|>": 151645,
17
+ "<|im_start|>": 151644,
18
+ "<|image_pad|>": 151655,
19
+ "<|object_ref_end|>": 151647,
20
+ "<|object_ref_start|>": 151646,
21
+ "<|quad_end|>": 151651,
22
+ "<|quad_start|>": 151650,
23
+ "<|repo_name|>": 151663,
24
+ "<|video_pad|>": 151656,
25
+ "<|vision_end|>": 151653,
26
+ "<|vision_pad|>": 151654,
27
+ "<|vision_start|>": 151652
28
+ }
chat_template.jinja ADDED
@@ -0,0 +1,86 @@
1
+ {%- if tools %}
2
+ {{- '<|im_start|>system\n' }}
3
+ {%- if messages[0].role == 'system' %}
4
+ {{- messages[0].content + '\n\n' }}
5
+ {%- endif %}
6
+ {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
7
+ {%- for tool in tools %}
8
+ {{- "\n" }}
9
+ {{- tool | tojson }}
10
+ {%- endfor %}
11
+ {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
12
+ {%- else %}
13
+ {%- if messages[0].role == 'system' %}
14
+ {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
15
+ {%- endif %}
16
+ {%- endif %}
17
+ {%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
18
+ {%- for message in messages[::-1] %}
19
+ {%- set index = (messages|length - 1) - loop.index0 %}
20
+ {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
21
+ {%- set ns.multi_step_tool = false %}
22
+ {%- set ns.last_query_index = index %}
23
+ {%- endif %}
24
+ {%- endfor %}
25
+ {%- for message in messages %}
26
+ {%- if message.content is string %}
27
+ {%- set content = message.content %}
28
+ {%- else %}
29
+ {%- set content = '' %}
30
+ {%- endif %}
31
+ {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
32
+ {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
33
+ {%- elif message.role == "assistant" %}
34
+ {%- set reasoning_content = '' %}
35
+ {%- if message.reasoning_content is string %}
36
+ {%- set reasoning_content = message.reasoning_content %}
37
+ {%- else %}
38
+ {%- if '</think>' in content %}
39
+ {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
40
+ {%- set content = content.split('</think>')[-1].lstrip('\n') %}
41
+ {%- endif %}
42
+ {%- endif %}
43
+ {%- if loop.index0 > ns.last_query_index %}
44
+ {%- if loop.last or (not loop.last and reasoning_content) %}
45
+ {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
46
+ {%- else %}
47
+ {{- '<|im_start|>' + message.role + '\n' + content }}
48
+ {%- endif %}
49
+ {%- else %}
50
+ {{- '<|im_start|>' + message.role + '\n' + content }}
51
+ {%- endif %}
52
+ {%- if message.tool_calls %}
53
+ {%- for tool_call in message.tool_calls %}
54
+ {%- if (loop.first and content) or (not loop.first) %}
55
+ {{- '\n' }}
56
+ {%- endif %}
57
+ {%- if tool_call.function %}
58
+ {%- set tool_call = tool_call.function %}
59
+ {%- endif %}
60
+ {{- '<tool_call>\n{"name": "' }}
61
+ {{- tool_call.name }}
62
+ {{- '", "arguments": ' }}
63
+ {%- if tool_call.arguments is string %}
64
+ {{- tool_call.arguments }}
65
+ {%- else %}
66
+ {{- tool_call.arguments | tojson }}
67
+ {%- endif %}
68
+ {{- '}\n</tool_call>' }}
69
+ {%- endfor %}
70
+ {%- endif %}
71
+ {{- '<|im_end|>\n' }}
72
+ {%- elif message.role == "tool" %}
73
+ {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
74
+ {{- '<|im_start|>user' }}
75
+ {%- endif %}
76
+ {{- '\n<tool_response>\n' }}
77
+ {{- content }}
78
+ {{- '\n</tool_response>' }}
79
+ {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
80
+ {{- '<|im_end|>\n' }}
81
+ {%- endif %}
82
+ {%- endif %}
83
+ {%- endfor %}
84
+ {%- if add_generation_prompt %}
85
+ {{- '<|im_start|>assistant\n<think>\n' }}
86
+ {%- endif %}
config.json ADDED
@@ -0,0 +1,370 @@
1
+ {
2
+ "architectures": [
3
+ "Qwen3NextForCausalLM"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 151643,
8
+ "decoder_sparse_step": 1,
9
+ "dtype": "bfloat16",
10
+ "eos_token_id": 151645,
11
+ "full_attention_interval": 4,
12
+ "head_dim": 256,
13
+ "hidden_act": "silu",
14
+ "hidden_size": 2048,
15
+ "initializer_range": 0.02,
16
+ "intermediate_size": 5120,
17
+ "layer_types": [
18
+ "linear_attention",
19
+ "linear_attention",
20
+ "linear_attention",
21
+ "full_attention",
22
+ "linear_attention",
23
+ "linear_attention",
24
+ "linear_attention",
25
+ "full_attention",
26
+ "linear_attention",
27
+ "linear_attention",
28
+ "linear_attention",
29
+ "full_attention",
30
+ "linear_attention",
31
+ "linear_attention",
32
+ "linear_attention",
33
+ "full_attention",
34
+ "linear_attention",
35
+ "linear_attention",
36
+ "linear_attention",
37
+ "full_attention",
38
+ "linear_attention",
39
+ "linear_attention",
40
+ "linear_attention",
41
+ "full_attention",
42
+ "linear_attention",
43
+ "linear_attention",
44
+ "linear_attention",
45
+ "full_attention",
46
+ "linear_attention",
47
+ "linear_attention",
48
+ "linear_attention",
49
+ "full_attention",
50
+ "linear_attention",
51
+ "linear_attention",
52
+ "linear_attention",
53
+ "full_attention",
54
+ "linear_attention",
55
+ "linear_attention",
56
+ "linear_attention",
57
+ "full_attention",
58
+ "linear_attention",
59
+ "linear_attention",
60
+ "linear_attention",
61
+ "full_attention",
62
+ "linear_attention",
63
+ "linear_attention",
64
+ "linear_attention",
65
+ "full_attention"
66
+ ],
67
+ "linear_conv_kernel_dim": 4,
68
+ "linear_key_head_dim": 128,
69
+ "linear_num_key_heads": 16,
70
+ "linear_num_value_heads": 32,
71
+ "linear_value_head_dim": 128,
72
+ "max_position_embeddings": 262144,
73
+ "mlp_only_layers": [],
74
+ "model_type": "qwen3_next",
75
+ "moe_intermediate_size": 512,
76
+ "norm_topk_prob": true,
77
+ "num_attention_heads": 16,
78
+ "num_experts": 512,
79
+ "num_experts_per_tok": 10,
80
+ "num_hidden_layers": 48,
81
+ "num_key_value_heads": 2,
82
+ "output_router_logits": false,
83
+ "partial_rotary_factor": 0.25,
84
+ "rms_norm_eps": 1e-06,
85
+ "rope_scaling": null,
86
+ "rope_theta": 10000000,
87
+ "router_aux_loss_coef": 0.001,
88
+ "shared_expert_intermediate_size": 512,
89
+ "tie_word_embeddings": false,
90
+ "transformers_version": "4.57.1",
91
+ "use_cache": true,
92
+ "use_sliding_window": false,
93
+ "vocab_size": 151936,
94
+ "quantization_config": {
95
+ "config_groups": {
96
+ "group_0": {
97
+ "input_activations": {
98
+ "dynamic": false,
99
+ "num_bits": 4,
100
+ "type": "float",
101
+ "group_size": 16
102
+ },
103
+ "weights": {
104
+ "dynamic": false,
105
+ "num_bits": 4,
106
+ "type": "float",
107
+ "group_size": 16
108
+ },
109
+ "targets": [
110
+ "Linear"
111
+ ]
112
+ }
113
+ },
114
+ "ignore": [
115
+ "lm_head",
116
+ "model.layers.0.linear_attn.conv1d",
117
+ "model.layers.0.linear_attn.in_proj_ba",
118
+ "model.layers.0.linear_attn.in_proj_qkvz",
119
+ "model.layers.0.mlp.gate",
120
+ "model.layers.0.mlp.shared_expert_gate",
121
+ "model.layers.1.linear_attn.conv1d",
122
+ "model.layers.1.linear_attn.in_proj_ba",
123
+ "model.layers.1.linear_attn.in_proj_qkvz",
124
+ "model.layers.1.mlp.gate",
125
+ "model.layers.1.mlp.shared_expert_gate",
126
+ "model.layers.10.linear_attn.conv1d",
127
+ "model.layers.10.linear_attn.in_proj_ba",
128
+ "model.layers.10.linear_attn.in_proj_qkvz",
129
+ "model.layers.10.mlp.gate",
130
+ "model.layers.10.mlp.shared_expert_gate",
131
+ "model.layers.11.mlp.gate",
132
+ "model.layers.11.mlp.shared_expert_gate",
133
+ "model.layers.11.self_attn.k_proj",
134
+ "model.layers.11.self_attn.q_proj",
135
+ "model.layers.11.self_attn.v_proj",
136
+ "model.layers.12.linear_attn.conv1d",
137
+ "model.layers.12.linear_attn.in_proj_ba",
138
+ "model.layers.12.linear_attn.in_proj_qkvz",
139
+ "model.layers.12.mlp.gate",
140
+ "model.layers.12.mlp.shared_expert_gate",
141
+ "model.layers.13.linear_attn.conv1d",
142
+ "model.layers.13.linear_attn.in_proj_ba",
143
+ "model.layers.13.linear_attn.in_proj_qkvz",
144
+ "model.layers.13.mlp.gate",
145
+ "model.layers.13.mlp.shared_expert_gate",
146
+ "model.layers.14.linear_attn.conv1d",
147
+ "model.layers.14.linear_attn.in_proj_ba",
148
+ "model.layers.14.linear_attn.in_proj_qkvz",
149
+ "model.layers.14.mlp.gate",
150
+ "model.layers.14.mlp.shared_expert_gate",
151
+ "model.layers.15.mlp.gate",
152
+ "model.layers.15.mlp.shared_expert_gate",
153
+ "model.layers.15.self_attn.k_proj",
154
+ "model.layers.15.self_attn.q_proj",
155
+ "model.layers.15.self_attn.v_proj",
156
+ "model.layers.16.linear_attn.conv1d",
157
+ "model.layers.16.linear_attn.in_proj_ba",
158
+ "model.layers.16.linear_attn.in_proj_qkvz",
159
+ "model.layers.16.mlp.gate",
160
+ "model.layers.16.mlp.shared_expert_gate",
161
+ "model.layers.17.linear_attn.conv1d",
162
+ "model.layers.17.linear_attn.in_proj_ba",
163
+ "model.layers.17.linear_attn.in_proj_qkvz",
164
+ "model.layers.17.mlp.gate",
165
+ "model.layers.17.mlp.shared_expert_gate",
166
+ "model.layers.18.linear_attn.conv1d",
167
+ "model.layers.18.linear_attn.in_proj_ba",
168
+ "model.layers.18.linear_attn.in_proj_qkvz",
169
+ "model.layers.18.mlp.gate",
170
+ "model.layers.18.mlp.shared_expert_gate",
171
+ "model.layers.19.mlp.gate",
172
+ "model.layers.19.mlp.shared_expert_gate",
173
+ "model.layers.19.self_attn.k_proj",
174
+ "model.layers.19.self_attn.q_proj",
175
+ "model.layers.19.self_attn.v_proj",
176
+ "model.layers.2.linear_attn.conv1d",
177
+ "model.layers.2.linear_attn.in_proj_ba",
178
+ "model.layers.2.linear_attn.in_proj_qkvz",
179
+ "model.layers.2.mlp.gate",
180
+ "model.layers.2.mlp.shared_expert_gate",
181
+ "model.layers.20.linear_attn.conv1d",
182
+ "model.layers.20.linear_attn.in_proj_ba",
183
+ "model.layers.20.linear_attn.in_proj_qkvz",
184
+ "model.layers.20.mlp.gate",
185
+ "model.layers.20.mlp.shared_expert_gate",
186
+ "model.layers.21.linear_attn.conv1d",
187
+ "model.layers.21.linear_attn.in_proj_ba",
188
+ "model.layers.21.linear_attn.in_proj_qkvz",
189
+ "model.layers.21.mlp.gate",
190
+ "model.layers.21.mlp.shared_expert_gate",
191
+ "model.layers.22.linear_attn.conv1d",
192
+ "model.layers.22.linear_attn.in_proj_ba",
193
+ "model.layers.22.linear_attn.in_proj_qkvz",
194
+ "model.layers.22.mlp.gate",
195
+ "model.layers.22.mlp.shared_expert_gate",
196
+ "model.layers.23.mlp.gate",
197
+ "model.layers.23.mlp.shared_expert_gate",
198
+ "model.layers.23.self_attn.k_proj",
199
+ "model.layers.23.self_attn.q_proj",
200
+ "model.layers.23.self_attn.v_proj",
201
+ "model.layers.24.linear_attn.conv1d",
202
+ "model.layers.24.linear_attn.in_proj_ba",
203
+ "model.layers.24.linear_attn.in_proj_qkvz",
204
+ "model.layers.24.mlp.gate",
205
+ "model.layers.24.mlp.shared_expert_gate",
206
+ "model.layers.25.linear_attn.conv1d",
207
+ "model.layers.25.linear_attn.in_proj_ba",
208
+ "model.layers.25.linear_attn.in_proj_qkvz",
209
+ "model.layers.25.mlp.gate",
210
+ "model.layers.25.mlp.shared_expert_gate",
211
+ "model.layers.26.linear_attn.conv1d",
212
+ "model.layers.26.linear_attn.in_proj_ba",
213
+ "model.layers.26.linear_attn.in_proj_qkvz",
214
+ "model.layers.26.mlp.gate",
215
+ "model.layers.26.mlp.shared_expert_gate",
216
+ "model.layers.27.mlp.gate",
217
+ "model.layers.27.mlp.shared_expert_gate",
218
+ "model.layers.27.self_attn.k_proj",
219
+ "model.layers.27.self_attn.q_proj",
220
+ "model.layers.27.self_attn.v_proj",
221
+ "model.layers.28.linear_attn.conv1d",
222
+ "model.layers.28.linear_attn.in_proj_ba",
223
+ "model.layers.28.linear_attn.in_proj_qkvz",
224
+ "model.layers.28.mlp.gate",
225
+ "model.layers.28.mlp.shared_expert_gate",
226
+ "model.layers.29.linear_attn.conv1d",
227
+ "model.layers.29.linear_attn.in_proj_ba",
228
+ "model.layers.29.linear_attn.in_proj_qkvz",
229
+ "model.layers.29.mlp.gate",
230
+ "model.layers.29.mlp.shared_expert_gate",
231
+ "model.layers.3.mlp.gate",
232
+ "model.layers.3.mlp.shared_expert_gate",
233
+ "model.layers.3.self_attn.k_proj",
234
+ "model.layers.3.self_attn.q_proj",
235
+ "model.layers.3.self_attn.v_proj",
236
+ "model.layers.30.linear_attn.conv1d",
237
+ "model.layers.30.linear_attn.in_proj_ba",
238
+ "model.layers.30.linear_attn.in_proj_qkvz",
239
+ "model.layers.30.mlp.gate",
240
+ "model.layers.30.mlp.shared_expert_gate",
241
+ "model.layers.31.mlp.gate",
242
+ "model.layers.31.mlp.shared_expert_gate",
243
+ "model.layers.31.self_attn.k_proj",
244
+ "model.layers.31.self_attn.q_proj",
245
+ "model.layers.31.self_attn.v_proj",
246
+ "model.layers.32.linear_attn.conv1d",
247
+ "model.layers.32.linear_attn.in_proj_ba",
248
+ "model.layers.32.linear_attn.in_proj_qkvz",
249
+ "model.layers.32.mlp.gate",
250
+ "model.layers.32.mlp.shared_expert_gate",
251
+ "model.layers.33.linear_attn.conv1d",
252
+ "model.layers.33.linear_attn.in_proj_ba",
253
+ "model.layers.33.linear_attn.in_proj_qkvz",
254
+ "model.layers.33.mlp.gate",
255
+ "model.layers.33.mlp.shared_expert_gate",
256
+ "model.layers.34.linear_attn.conv1d",
257
+ "model.layers.34.linear_attn.in_proj_ba",
258
+ "model.layers.34.linear_attn.in_proj_qkvz",
259
+ "model.layers.34.mlp.gate",
260
+ "model.layers.34.mlp.shared_expert_gate",
261
+ "model.layers.35.mlp.gate",
262
+ "model.layers.35.mlp.shared_expert_gate",
263
+ "model.layers.35.self_attn.k_proj",
264
+ "model.layers.35.self_attn.q_proj",
265
+ "model.layers.35.self_attn.v_proj",
266
+ "model.layers.36.linear_attn.conv1d",
267
+ "model.layers.36.linear_attn.in_proj_ba",
268
+ "model.layers.36.linear_attn.in_proj_qkvz",
269
+ "model.layers.36.mlp.gate",
270
+ "model.layers.36.mlp.shared_expert_gate",
271
+ "model.layers.37.linear_attn.conv1d",
272
+ "model.layers.37.linear_attn.in_proj_ba",
273
+ "model.layers.37.linear_attn.in_proj_qkvz",
274
+ "model.layers.37.mlp.gate",
275
+ "model.layers.37.mlp.shared_expert_gate",
276
+ "model.layers.38.linear_attn.conv1d",
277
+ "model.layers.38.linear_attn.in_proj_ba",
278
+ "model.layers.38.linear_attn.in_proj_qkvz",
279
+ "model.layers.38.mlp.gate",
280
+ "model.layers.38.mlp.shared_expert_gate",
281
+ "model.layers.39.mlp.gate",
282
+ "model.layers.39.mlp.shared_expert_gate",
283
+ "model.layers.39.self_attn.k_proj",
284
+ "model.layers.39.self_attn.q_proj",
285
+ "model.layers.39.self_attn.v_proj",
286
+ "model.layers.4.linear_attn.conv1d",
287
+ "model.layers.4.linear_attn.in_proj_ba",
288
+ "model.layers.4.linear_attn.in_proj_qkvz",
289
+ "model.layers.4.mlp.gate",
290
+ "model.layers.4.mlp.shared_expert_gate",
291
+ "model.layers.40.linear_attn.conv1d",
292
+ "model.layers.40.linear_attn.in_proj_ba",
293
+ "model.layers.40.linear_attn.in_proj_qkvz",
294
+ "model.layers.40.mlp.gate",
295
+ "model.layers.40.mlp.shared_expert_gate",
296
+ "model.layers.41.linear_attn.conv1d",
297
+ "model.layers.41.linear_attn.in_proj_ba",
298
+ "model.layers.41.linear_attn.in_proj_qkvz",
299
+ "model.layers.41.mlp.gate",
300
+ "model.layers.41.mlp.shared_expert_gate",
301
+ "model.layers.42.linear_attn.conv1d",
302
+ "model.layers.42.linear_attn.in_proj_ba",
303
+ "model.layers.42.linear_attn.in_proj_qkvz",
304
+ "model.layers.42.mlp.gate",
305
+ "model.layers.42.mlp.shared_expert_gate",
306
+ "model.layers.43.mlp.gate",
307
+ "model.layers.43.mlp.shared_expert_gate",
308
+ "model.layers.43.self_attn.k_proj",
309
+ "model.layers.43.self_attn.q_proj",
310
+ "model.layers.43.self_attn.v_proj",
311
+ "model.layers.44.linear_attn.conv1d",
312
+ "model.layers.44.linear_attn.in_proj_ba",
313
+ "model.layers.44.linear_attn.in_proj_qkvz",
314
+ "model.layers.44.mlp.gate",
315
+ "model.layers.44.mlp.shared_expert_gate",
316
+ "model.layers.45.linear_attn.conv1d",
317
+ "model.layers.45.linear_attn.in_proj_ba",
318
+ "model.layers.45.linear_attn.in_proj_qkvz",
319
+ "model.layers.45.mlp.gate",
320
+ "model.layers.45.mlp.shared_expert_gate",
321
+ "model.layers.46.linear_attn.conv1d",
322
+ "model.layers.46.linear_attn.in_proj_ba",
323
+ "model.layers.46.linear_attn.in_proj_qkvz",
324
+ "model.layers.46.mlp.gate",
325
+ "model.layers.46.mlp.shared_expert_gate",
326
+ "model.layers.47.mlp.gate",
327
+ "model.layers.47.mlp.shared_expert_gate",
328
+ "model.layers.47.self_attn.k_proj",
329
+ "model.layers.47.self_attn.q_proj",
330
+ "model.layers.47.self_attn.v_proj",
331
+ "model.layers.5.linear_attn.conv1d",
332
+ "model.layers.5.linear_attn.in_proj_ba",
333
+ "model.layers.5.linear_attn.in_proj_qkvz",
334
+ "model.layers.5.mlp.gate",
335
+ "model.layers.5.mlp.shared_expert_gate",
336
+ "model.layers.6.linear_attn.conv1d",
337
+ "model.layers.6.linear_attn.in_proj_ba",
338
+ "model.layers.6.linear_attn.in_proj_qkvz",
339
+ "model.layers.6.mlp.gate",
340
+ "model.layers.6.mlp.shared_expert_gate",
341
+ "model.layers.7.mlp.gate",
342
+ "model.layers.7.mlp.shared_expert_gate",
343
+ "model.layers.7.self_attn.k_proj",
344
+ "model.layers.7.self_attn.q_proj",
345
+ "model.layers.7.self_attn.v_proj",
346
+ "model.layers.8.linear_attn.conv1d",
347
+ "model.layers.8.linear_attn.in_proj_ba",
348
+ "model.layers.8.linear_attn.in_proj_qkvz",
349
+ "model.layers.8.mlp.gate",
350
+ "model.layers.8.mlp.shared_expert_gate",
351
+ "model.layers.9.linear_attn.conv1d",
352
+ "model.layers.9.linear_attn.in_proj_ba",
353
+ "model.layers.9.linear_attn.in_proj_qkvz",
354
+ "model.layers.9.mlp.gate",
355
+ "model.layers.9.mlp.shared_expert_gate",
356
+ "mtp.layers.0*"
357
+ ],
358
+ "quant_algo": "NVFP4",
359
+ "kv_cache_scheme": {
360
+ "dynamic": false,
361
+ "num_bits": 8,
362
+ "type": "float"
363
+ },
364
+ "producer": {
365
+ "name": "modelopt",
366
+ "version": "0.0.1.dev445+gae4ae22f9.d20260209"
367
+ },
368
+ "quant_method": "modelopt"
369
+ }
370
+ }
generation_config.json ADDED
@@ -0,0 +1,13 @@
1
+ {
2
+ "bos_token_id": 151643,
3
+ "do_sample": true,
4
+ "eos_token_id": [
5
+ 151645,
6
+ 151643
7
+ ],
8
+ "pad_token_id": 151643,
9
+ "temperature": 0.6,
10
+ "top_k": 20,
11
+ "top_p": 0.95,
12
+ "transformers_version": "4.57.0.dev0"
13
+ }
hf_quant_config.json ADDED
@@ -0,0 +1,255 @@
1
+ {
2
+ "producer": {
3
+ "name": "modelopt",
4
+ "version": "0.0.1.dev445+gae4ae22f9.d20260209"
5
+ },
6
+ "quantization": {
7
+ "quant_algo": "NVFP4",
8
+ "kv_cache_quant_algo": "FP8",
9
+ "group_size": 16,
10
+ "exclude_modules": [
11
+ "lm_head",
12
+ "model.layers.0.linear_attn.conv1d",
13
+ "model.layers.0.linear_attn.in_proj_ba",
14
+ "model.layers.0.linear_attn.in_proj_qkvz",
15
+ "model.layers.0.mlp.gate",
16
+ "model.layers.0.mlp.shared_expert_gate",
17
+ "model.layers.1.linear_attn.conv1d",
18
+ "model.layers.1.linear_attn.in_proj_ba",
19
+ "model.layers.1.linear_attn.in_proj_qkvz",
20
+ "model.layers.1.mlp.gate",
21
+ "model.layers.1.mlp.shared_expert_gate",
22
+ "model.layers.10.linear_attn.conv1d",
23
+ "model.layers.10.linear_attn.in_proj_ba",
24
+ "model.layers.10.linear_attn.in_proj_qkvz",
25
+ "model.layers.10.mlp.gate",
26
+ "model.layers.10.mlp.shared_expert_gate",
27
+ "model.layers.11.mlp.gate",
28
+ "model.layers.11.mlp.shared_expert_gate",
29
+ "model.layers.11.self_attn.k_proj",
30
+ "model.layers.11.self_attn.q_proj",
31
+ "model.layers.11.self_attn.v_proj",
32
+ "model.layers.12.linear_attn.conv1d",
33
+ "model.layers.12.linear_attn.in_proj_ba",
34
+ "model.layers.12.linear_attn.in_proj_qkvz",
35
+ "model.layers.12.mlp.gate",
36
+ "model.layers.12.mlp.shared_expert_gate",
37
+ "model.layers.13.linear_attn.conv1d",
38
+ "model.layers.13.linear_attn.in_proj_ba",
39
+ "model.layers.13.linear_attn.in_proj_qkvz",
40
+ "model.layers.13.mlp.gate",
41
+ "model.layers.13.mlp.shared_expert_gate",
42
+ "model.layers.14.linear_attn.conv1d",
43
+ "model.layers.14.linear_attn.in_proj_ba",
44
+ "model.layers.14.linear_attn.in_proj_qkvz",
45
+ "model.layers.14.mlp.gate",
46
+ "model.layers.14.mlp.shared_expert_gate",
47
+ "model.layers.15.mlp.gate",
48
+ "model.layers.15.mlp.shared_expert_gate",
49
+ "model.layers.15.self_attn.k_proj",
50
+ "model.layers.15.self_attn.q_proj",
51
+ "model.layers.15.self_attn.v_proj",
52
+ "model.layers.16.linear_attn.conv1d",
53
+ "model.layers.16.linear_attn.in_proj_ba",
54
+ "model.layers.16.linear_attn.in_proj_qkvz",
55
+ "model.layers.16.mlp.gate",
56
+ "model.layers.16.mlp.shared_expert_gate",
57
+ "model.layers.17.linear_attn.conv1d",
58
+ "model.layers.17.linear_attn.in_proj_ba",
59
+ "model.layers.17.linear_attn.in_proj_qkvz",
60
+ "model.layers.17.mlp.gate",
61
+ "model.layers.17.mlp.shared_expert_gate",
62
+ "model.layers.18.linear_attn.conv1d",
63
+ "model.layers.18.linear_attn.in_proj_ba",
64
+ "model.layers.18.linear_attn.in_proj_qkvz",
65
+ "model.layers.18.mlp.gate",
66
+ "model.layers.18.mlp.shared_expert_gate",
67
+ "model.layers.19.mlp.gate",
68
+ "model.layers.19.mlp.shared_expert_gate",
69
+ "model.layers.19.self_attn.k_proj",
70
+ "model.layers.19.self_attn.q_proj",
71
+ "model.layers.19.self_attn.v_proj",
72
+ "model.layers.2.linear_attn.conv1d",
73
+ "model.layers.2.linear_attn.in_proj_ba",
74
+ "model.layers.2.linear_attn.in_proj_qkvz",
75
+ "model.layers.2.mlp.gate",
76
+ "model.layers.2.mlp.shared_expert_gate",
77
+ "model.layers.20.linear_attn.conv1d",
78
+ "model.layers.20.linear_attn.in_proj_ba",
79
+ "model.layers.20.linear_attn.in_proj_qkvz",
80
+ "model.layers.20.mlp.gate",
81
+ "model.layers.20.mlp.shared_expert_gate",
82
+ "model.layers.21.linear_attn.conv1d",
83
+ "model.layers.21.linear_attn.in_proj_ba",
84
+ "model.layers.21.linear_attn.in_proj_qkvz",
85
+ "model.layers.21.mlp.gate",
86
+ "model.layers.21.mlp.shared_expert_gate",
87
+ "model.layers.22.linear_attn.conv1d",
88
+ "model.layers.22.linear_attn.in_proj_ba",
89
+ "model.layers.22.linear_attn.in_proj_qkvz",
90
+ "model.layers.22.mlp.gate",
91
+ "model.layers.22.mlp.shared_expert_gate",
92
+ "model.layers.23.mlp.gate",
93
+ "model.layers.23.mlp.shared_expert_gate",
94
+ "model.layers.23.self_attn.k_proj",
95
+ "model.layers.23.self_attn.q_proj",
96
+ "model.layers.23.self_attn.v_proj",
97
+ "model.layers.24.linear_attn.conv1d",
98
+ "model.layers.24.linear_attn.in_proj_ba",
99
+ "model.layers.24.linear_attn.in_proj_qkvz",
100
+ "model.layers.24.mlp.gate",
101
+ "model.layers.24.mlp.shared_expert_gate",
102
+ "model.layers.25.linear_attn.conv1d",
103
+ "model.layers.25.linear_attn.in_proj_ba",
104
+ "model.layers.25.linear_attn.in_proj_qkvz",
105
+ "model.layers.25.mlp.gate",
106
+ "model.layers.25.mlp.shared_expert_gate",
107
+ "model.layers.26.linear_attn.conv1d",
108
+ "model.layers.26.linear_attn.in_proj_ba",
109
+ "model.layers.26.linear_attn.in_proj_qkvz",
110
+ "model.layers.26.mlp.gate",
111
+ "model.layers.26.mlp.shared_expert_gate",
112
+ "model.layers.27.mlp.gate",
113
+ "model.layers.27.mlp.shared_expert_gate",
114
+ "model.layers.27.self_attn.k_proj",
115
+ "model.layers.27.self_attn.q_proj",
116
+ "model.layers.27.self_attn.v_proj",
117
+ "model.layers.28.linear_attn.conv1d",
118
+ "model.layers.28.linear_attn.in_proj_ba",
119
+ "model.layers.28.linear_attn.in_proj_qkvz",
120
+ "model.layers.28.mlp.gate",
121
+ "model.layers.28.mlp.shared_expert_gate",
122
+ "model.layers.29.linear_attn.conv1d",
123
+ "model.layers.29.linear_attn.in_proj_ba",
124
+ "model.layers.29.linear_attn.in_proj_qkvz",
125
+ "model.layers.29.mlp.gate",
126
+ "model.layers.29.mlp.shared_expert_gate",
127
+ "model.layers.3.mlp.gate",
128
+ "model.layers.3.mlp.shared_expert_gate",
129
+ "model.layers.3.self_attn.k_proj",
130
+ "model.layers.3.self_attn.q_proj",
131
+ "model.layers.3.self_attn.v_proj",
132
+ "model.layers.30.linear_attn.conv1d",
133
+ "model.layers.30.linear_attn.in_proj_ba",
134
+ "model.layers.30.linear_attn.in_proj_qkvz",
135
+ "model.layers.30.mlp.gate",
136
+ "model.layers.30.mlp.shared_expert_gate",
137
+ "model.layers.31.mlp.gate",
138
+ "model.layers.31.mlp.shared_expert_gate",
139
+ "model.layers.31.self_attn.k_proj",
140
+ "model.layers.31.self_attn.q_proj",
141
+ "model.layers.31.self_attn.v_proj",
142
+ "model.layers.32.linear_attn.conv1d",
143
+ "model.layers.32.linear_attn.in_proj_ba",
144
+ "model.layers.32.linear_attn.in_proj_qkvz",
145
+ "model.layers.32.mlp.gate",
146
+ "model.layers.32.mlp.shared_expert_gate",
147
+ "model.layers.33.linear_attn.conv1d",
148
+ "model.layers.33.linear_attn.in_proj_ba",
149
+ "model.layers.33.linear_attn.in_proj_qkvz",
150
+ "model.layers.33.mlp.gate",
151
+ "model.layers.33.mlp.shared_expert_gate",
152
+ "model.layers.34.linear_attn.conv1d",
153
+ "model.layers.34.linear_attn.in_proj_ba",
154
+ "model.layers.34.linear_attn.in_proj_qkvz",
155
+ "model.layers.34.mlp.gate",
156
+ "model.layers.34.mlp.shared_expert_gate",
157
+ "model.layers.35.mlp.gate",
158
+ "model.layers.35.mlp.shared_expert_gate",
159
+ "model.layers.35.self_attn.k_proj",
160
+ "model.layers.35.self_attn.q_proj",
161
+ "model.layers.35.self_attn.v_proj",
162
+ "model.layers.36.linear_attn.conv1d",
163
+ "model.layers.36.linear_attn.in_proj_ba",
164
+ "model.layers.36.linear_attn.in_proj_qkvz",
165
+ "model.layers.36.mlp.gate",
166
+ "model.layers.36.mlp.shared_expert_gate",
167
+ "model.layers.37.linear_attn.conv1d",
168
+ "model.layers.37.linear_attn.in_proj_ba",
169
+ "model.layers.37.linear_attn.in_proj_qkvz",
170
+ "model.layers.37.mlp.gate",
171
+ "model.layers.37.mlp.shared_expert_gate",
172
+ "model.layers.38.linear_attn.conv1d",
173
+ "model.layers.38.linear_attn.in_proj_ba",
174
+ "model.layers.38.linear_attn.in_proj_qkvz",
175
+ "model.layers.38.mlp.gate",
176
+ "model.layers.38.mlp.shared_expert_gate",
177
+ "model.layers.39.mlp.gate",
178
+ "model.layers.39.mlp.shared_expert_gate",
179
+ "model.layers.39.self_attn.k_proj",
180
+ "model.layers.39.self_attn.q_proj",
181
+ "model.layers.39.self_attn.v_proj",
182
+ "model.layers.4.linear_attn.conv1d",
183
+ "model.layers.4.linear_attn.in_proj_ba",
184
+ "model.layers.4.linear_attn.in_proj_qkvz",
185
+ "model.layers.4.mlp.gate",
186
+ "model.layers.4.mlp.shared_expert_gate",
187
+ "model.layers.40.linear_attn.conv1d",
188
+ "model.layers.40.linear_attn.in_proj_ba",
189
+ "model.layers.40.linear_attn.in_proj_qkvz",
190
+ "model.layers.40.mlp.gate",
191
+ "model.layers.40.mlp.shared_expert_gate",
192
+ "model.layers.41.linear_attn.conv1d",
193
+ "model.layers.41.linear_attn.in_proj_ba",
194
+ "model.layers.41.linear_attn.in_proj_qkvz",
195
+ "model.layers.41.mlp.gate",
196
+ "model.layers.41.mlp.shared_expert_gate",
197
+ "model.layers.42.linear_attn.conv1d",
198
+ "model.layers.42.linear_attn.in_proj_ba",
199
+ "model.layers.42.linear_attn.in_proj_qkvz",
200
+ "model.layers.42.mlp.gate",
201
+ "model.layers.42.mlp.shared_expert_gate",
202
+ "model.layers.43.mlp.gate",
203
+ "model.layers.43.mlp.shared_expert_gate",
204
+ "model.layers.43.self_attn.k_proj",
205
+ "model.layers.43.self_attn.q_proj",
206
+ "model.layers.43.self_attn.v_proj",
207
+ "model.layers.44.linear_attn.conv1d",
208
+ "model.layers.44.linear_attn.in_proj_ba",
209
+ "model.layers.44.linear_attn.in_proj_qkvz",
210
+ "model.layers.44.mlp.gate",
211
+ "model.layers.44.mlp.shared_expert_gate",
212
+ "model.layers.45.linear_attn.conv1d",
213
+ "model.layers.45.linear_attn.in_proj_ba",
214
+ "model.layers.45.linear_attn.in_proj_qkvz",
215
+ "model.layers.45.mlp.gate",
216
+ "model.layers.45.mlp.shared_expert_gate",
217
+ "model.layers.46.linear_attn.conv1d",
218
+ "model.layers.46.linear_attn.in_proj_ba",
219
+ "model.layers.46.linear_attn.in_proj_qkvz",
220
+ "model.layers.46.mlp.gate",
221
+ "model.layers.46.mlp.shared_expert_gate",
222
+ "model.layers.47.mlp.gate",
223
+ "model.layers.47.mlp.shared_expert_gate",
224
+ "model.layers.47.self_attn.k_proj",
225
+ "model.layers.47.self_attn.q_proj",
226
+ "model.layers.47.self_attn.v_proj",
227
+ "model.layers.5.linear_attn.conv1d",
228
+ "model.layers.5.linear_attn.in_proj_ba",
229
+ "model.layers.5.linear_attn.in_proj_qkvz",
230
+ "model.layers.5.mlp.gate",
231
+ "model.layers.5.mlp.shared_expert_gate",
232
+ "model.layers.6.linear_attn.conv1d",
233
+ "model.layers.6.linear_attn.in_proj_ba",
234
+ "model.layers.6.linear_attn.in_proj_qkvz",
235
+ "model.layers.6.mlp.gate",
236
+ "model.layers.6.mlp.shared_expert_gate",
237
+ "model.layers.7.mlp.gate",
238
+ "model.layers.7.mlp.shared_expert_gate",
239
+ "model.layers.7.self_attn.k_proj",
240
+ "model.layers.7.self_attn.q_proj",
241
+ "model.layers.7.self_attn.v_proj",
242
+ "model.layers.8.linear_attn.conv1d",
243
+ "model.layers.8.linear_attn.in_proj_ba",
244
+ "model.layers.8.linear_attn.in_proj_qkvz",
245
+ "model.layers.8.mlp.gate",
246
+ "model.layers.8.mlp.shared_expert_gate",
247
+ "model.layers.9.linear_attn.conv1d",
248
+ "model.layers.9.linear_attn.in_proj_ba",
249
+ "model.layers.9.linear_attn.in_proj_qkvz",
250
+ "model.layers.9.mlp.gate",
251
+ "model.layers.9.mlp.shared_expert_gate",
252
+ "mtp.layers.0*"
253
+ ]
254
+ }
255
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model-00001-of-00011.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:48076ded1e2667e580f65334135437fe365599e52a8d45f16f4844871e9c691b
3
+ size 5003036968
model-00002-of-00011.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9b9edf08dd268bddb1eac630fbdccdd0671cea9663d32bd78a00cffc98445f2c
3
+ size 5003483960
model-00003-of-00011.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:034f364254657f125f93ea5a85fbfcc5b81105823068870c4374f1bc5eb4973a
3
+ size 5003514400
model-00004-of-00011.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8b60f7f85d7724585a8d1b48543e12cb04c9e7ded1e50f74b121b0450dce7031
3
+ size 5003755712
model-00005-of-00011.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e81a4c7e05dc48ee2421053db5f3e0bccceb260293d1d41e4128b871b6a63819
3
+ size 5003581304
model-00006-of-00011.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6166dbf630027173c0bab8bb63c70908e8350ee3513d6984aeb29b1e3dc9a5f0
3
+ size 5003516056
model-00007-of-00011.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ea1ddbacaae76c0949e5d99139f0d73345cf415c5e54366df34af3aac8649e36
3
+ size 5003593008
model-00008-of-00011.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8ba17da6d76be3a695d8f6c5a060f0f63155f762fd6b2adddb27d03b06843b47
3
+ size 5003516072
model-00009-of-00011.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:27eb6a1822b655c831f0d3f6e6667f9a60b7239a77dc945ad4c88fb0488c81a6
3
+ size 5003744824
model-00010-of-00011.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2b365175bccbcca8420a972ad87d5c230f9e29174aefd5e44483ce7c89283205
3
+ size 5000330496
model-00011-of-00011.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e2b4a32aae8538581df2c16b4273e493ad02d926cd7dde207bab6e2892bf0637
3
+ size 725675520
model.safetensors.index.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f9c2f67f083110def2b0d727e2b90b9eeb5179b7cc400a450880da4d04abb47d
3
+ size 28463294
special_tokens_map.json ADDED
@@ -0,0 +1,25 @@
1
+ {
2
+ "additional_special_tokens": [
3
+ "<|im_start|>",
4
+ "<|im_end|>",
5
+ "<|object_ref_start|>",
6
+ "<|object_ref_end|>",
7
+ "<|box_start|>",
8
+ "<|box_end|>",
9
+ "<|quad_start|>",
10
+ "<|quad_end|>",
11
+ "<|vision_start|>",
12
+ "<|vision_end|>",
13
+ "<|vision_pad|>",
14
+ "<|image_pad|>",
15
+ "<|video_pad|>"
16
+ ],
17
+ "eos_token": {
18
+ "content": "<|im_end|>",
19
+ "lstrip": false,
20
+ "normalized": false,
21
+ "rstrip": false,
22
+ "single_word": false
23
+ },
24
+ "pad_token": "<|im_end|>"
25
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4
3
+ size 11422654
tokenizer_config.json ADDED
@@ -0,0 +1,239 @@
1
+ {
2
+ "add_bos_token": false,
3
+ "add_prefix_space": false,
4
+ "added_tokens_decoder": {
5
+ "151643": {
6
+ "content": "<|endoftext|>",
7
+ "lstrip": false,
8
+ "normalized": false,
9
+ "rstrip": false,
10
+ "single_word": false,
11
+ "special": true
12
+ },
13
+ "151644": {
14
+ "content": "<|im_start|>",
15
+ "lstrip": false,
16
+ "normalized": false,
17
+ "rstrip": false,
18
+ "single_word": false,
19
+ "special": true
20
+ },
21
+ "151645": {
22
+ "content": "<|im_end|>",
23
+ "lstrip": false,
24
+ "normalized": false,
25
+ "rstrip": false,
26
+ "single_word": false,
27
+ "special": true
28
+ },
29
+ "151646": {
30
+ "content": "<|object_ref_start|>",
31
+ "lstrip": false,
32
+ "normalized": false,
33
+ "rstrip": false,
34
+ "single_word": false,
35
+ "special": true
36
+ },
37
+ "151647": {
38
+ "content": "<|object_ref_end|>",
39
+ "lstrip": false,
40
+ "normalized": false,
41
+ "rstrip": false,
42
+ "single_word": false,
43
+ "special": true
44
+ },
45
+ "151648": {
46
+ "content": "<|box_start|>",
47
+ "lstrip": false,
48
+ "normalized": false,
49
+ "rstrip": false,
50
+ "single_word": false,
51
+ "special": true
52
+ },
53
+ "151649": {
54
+ "content": "<|box_end|>",
55
+ "lstrip": false,
56
+ "normalized": false,
57
+ "rstrip": false,
58
+ "single_word": false,
59
+ "special": true
60
+ },
61
+ "151650": {
62
+ "content": "<|quad_start|>",
63
+ "lstrip": false,
64
+ "normalized": false,
65
+ "rstrip": false,
66
+ "single_word": false,
67
+ "special": true
68
+ },
69
+ "151651": {
70
+ "content": "<|quad_end|>",
71
+ "lstrip": false,
72
+ "normalized": false,
73
+ "rstrip": false,
74
+ "single_word": false,
75
+ "special": true
76
+ },
77
+ "151652": {
78
+ "content": "<|vision_start|>",
79
+ "lstrip": false,
80
+ "normalized": false,
81
+ "rstrip": false,
82
+ "single_word": false,
83
+ "special": true
84
+ },
85
+ "151653": {
86
+ "content": "<|vision_end|>",
87
+ "lstrip": false,
88
+ "normalized": false,
89
+ "rstrip": false,
90
+ "single_word": false,
91
+ "special": true
92
+ },
93
+ "151654": {
94
+ "content": "<|vision_pad|>",
95
+ "lstrip": false,
96
+ "normalized": false,
97
+ "rstrip": false,
98
+ "single_word": false,
99
+ "special": true
100
+ },
101
+ "151655": {
102
+ "content": "<|image_pad|>",
103
+ "lstrip": false,
104
+ "normalized": false,
105
+ "rstrip": false,
106
+ "single_word": false,
107
+ "special": true
108
+ },
109
+ "151656": {
110
+ "content": "<|video_pad|>",
111
+ "lstrip": false,
112
+ "normalized": false,
113
+ "rstrip": false,
114
+ "single_word": false,
115
+ "special": true
116
+ },
117
+ "151657": {
118
+ "content": "<tool_call>",
119
+ "lstrip": false,
120
+ "normalized": false,
121
+ "rstrip": false,
122
+ "single_word": false,
123
+ "special": false
124
+ },
125
+ "151658": {
126
+ "content": "</tool_call>",
127
+ "lstrip": false,
128
+ "normalized": false,
129
+ "rstrip": false,
130
+ "single_word": false,
131
+ "special": false
132
+ },
133
+ "151659": {
134
+ "content": "<|fim_prefix|>",
135
+ "lstrip": false,
136
+ "normalized": false,
137
+ "rstrip": false,
138
+ "single_word": false,
139
+ "special": false
140
+ },
141
+ "151660": {
142
+ "content": "<|fim_middle|>",
143
+ "lstrip": false,
144
+ "normalized": false,
145
+ "rstrip": false,
146
+ "single_word": false,
147
+ "special": false
148
+ },
149
+ "151661": {
150
+ "content": "<|fim_suffix|>",
151
+ "lstrip": false,
152
+ "normalized": false,
153
+ "rstrip": false,
154
+ "single_word": false,
155
+ "special": false
156
+ },
157
+ "151662": {
158
+ "content": "<|fim_pad|>",
159
+ "lstrip": false,
160
+ "normalized": false,
161
+ "rstrip": false,
162
+ "single_word": false,
163
+ "special": false
164
+ },
165
+ "151663": {
166
+ "content": "<|repo_name|>",
167
+ "lstrip": false,
168
+ "normalized": false,
169
+ "rstrip": false,
170
+ "single_word": false,
171
+ "special": false
172
+ },
173
+ "151664": {
174
+ "content": "<|file_sep|>",
175
+ "lstrip": false,
176
+ "normalized": false,
177
+ "rstrip": false,
178
+ "single_word": false,
179
+ "special": false
180
+ },
181
+ "151665": {
182
+ "content": "<tool_response>",
183
+ "lstrip": false,
184
+ "normalized": false,
185
+ "rstrip": false,
186
+ "single_word": false,
187
+ "special": false
188
+ },
189
+ "151666": {
190
+ "content": "</tool_response>",
191
+ "lstrip": false,
192
+ "normalized": false,
193
+ "rstrip": false,
194
+ "single_word": false,
195
+ "special": false
196
+ },
197
+ "151667": {
198
+ "content": "<think>",
199
+ "lstrip": false,
200
+ "normalized": false,
201
+ "rstrip": false,
202
+ "single_word": false,
203
+ "special": false
204
+ },
205
+ "151668": {
206
+ "content": "</think>",
207
+ "lstrip": false,
208
+ "normalized": false,
209
+ "rstrip": false,
210
+ "single_word": false,
211
+ "special": false
212
+ }
213
+ },
214
+ "additional_special_tokens": [
215
+ "<|im_start|>",
216
+ "<|im_end|>",
217
+ "<|object_ref_start|>",
218
+ "<|object_ref_end|>",
219
+ "<|box_start|>",
220
+ "<|box_end|>",
221
+ "<|quad_start|>",
222
+ "<|quad_end|>",
223
+ "<|vision_start|>",
224
+ "<|vision_end|>",
225
+ "<|vision_pad|>",
226
+ "<|image_pad|>",
227
+ "<|video_pad|>"
228
+ ],
229
+ "bos_token": null,
230
+ "clean_up_tokenization_spaces": false,
231
+ "eos_token": "<|im_end|>",
232
+ "errors": "replace",
233
+ "extra_special_tokens": {},
234
+ "model_max_length": 1010000,
235
+ "pad_token": "<|im_end|>",
236
+ "split_special_tokens": false,
237
+ "tokenizer_class": "Qwen2Tokenizer",
238
+ "unk_token": null
239
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff