Open to Work

Taki WU

taki555

https://wutaiqiang.github.io/

AI & ML interests

None yet

Recent Activity

upvoted a collection about 10 hours ago

Qwen3.5

updated a model 4 days ago

taki555/Qwen3-30B-A3B-Instruct-2507-Art

updated a model 4 days ago

taki555/Qwen3-4B-Instruct-2507-Art

Organizations

authored 4 papers 5 days ago

SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving

Paper • 2505.23932 • Published May 29, 2025

MMFormalizer: Multimodal Autoformalization in the Wild

Paper • 2601.03017 • Published Jan 6 • 105

BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models

Paper • 2602.04163 • Published 27 days ago • 10

The Art of Efficient Reasoning: Data, Reward, and Optimization

Paper • 2602.20945 • Published 6 days ago • 5

authored a paper about 1 month ago

ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection

Paper • 2601.09195 • Published Jan 14 • 15

submitted a paper to Daily Papers about 1 month ago

ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection

Paper • 2601.09195 • Published Jan 14 • 15

authored 2 papers 5 months ago

Revisiting Model Interpolation for Efficient Reasoning

Paper • 2510.10977 • Published Oct 13, 2025 • 10

Timber: Training-free Instruct Model Refining with Base via Effective Rank

Paper • 2509.23595 • Published Sep 28, 2025 • 1

authored 5 papers 9 months ago

A Survey on the Honesty of Large Language Models

Paper • 2409.18786 • Published Sep 27, 2024 • 31

Autoregressive Models in Vision: A Survey

Paper • 2411.05902 • Published Nov 8, 2024 • 19

LiT: Delving into a Simplified Linear Diffusion Transformer for Image Generation

Paper • 2501.12976 • Published Jan 22, 2025

PhyX: Does Your Model Have the "Wits" for Physical Reasoning?

Paper • 2505.15929 • Published May 21, 2025 • 49

Shadow-FT: Tuning Instruct via Base

Paper • 2505.12716 • Published May 19, 2025 • 4

authored a paper about 1 year ago

LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models

Paper • 2411.06839 • Published Nov 11, 2024 • 1

authored 6 papers over 1 year ago

TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities

Paper • 2212.06385 • Published Dec 13, 2022

RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer

Paper • 2304.05659 • Published Apr 12, 2023

Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast

Paper • 2405.14507 • Published May 23, 2024

Mixture-of-Subspaces in Low-Rank Adaptation

Paper • 2406.11909 • Published Jun 16, 2024 • 3

Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models

Paper • 2404.02657 • Published Apr 3, 2024 • 2

Weight-Inherited Distillation for Task-Agnostic BERT Compression

Paper • 2305.09098 • Published May 16, 2023
