Xin Jin

Xin1118

https://github.com/JinXins

AI & ML interests

None yet

Recent Activity

upvoted a paper 10 days ago

EarlyTom: Early Token Compression Completes Fast Video Understanding

authored a paper 13 days ago

MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding

authored a paper 13 days ago

LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs

Organizations

None yet

upvoted a paper 10 days ago

EarlyTom: Early Token Compression Completes Fast Video Understanding

Paper • 2605.30010 • Published 11 days ago • 32

authored 3 papers 13 days ago

MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding

Paper • 2510.23479 • Published Oct 27, 2025 • 18

LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs

Paper • 2603.19217 • Published Mar 19 • 29

RankE: End-to-End Post-Training for Discrete Text-to-Image Generation with Decoder Co-Evolution

Paper • 2605.21195 • Published 19 days ago • 19

upvoted a paper 17 days ago

RankE: End-to-End Post-Training for Discrete Text-to-Image Generation with Decoder Co-Evolution

Paper • 2605.21195 • Published 19 days ago • 19

upvoted 2 papers 26 days ago

Fast Byte Latent Transformer

Paper • 2605.08044 • Published May 8 • 12

PASA: A Principled Embedding-Space Watermarking Approach for LLM-Generated Text under Semantic-Invariant Attacks

Paper • 2605.10977 • Published about 1 month ago • 10

upvoted 2 papers about 1 month ago

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

Paper • 2604.24764 • Published Apr 27 • 118

Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms

Paper • 2604.23775 • Published Apr 26 • 45

upvoted a collection 3 months ago

MLLMerging

Collection

23 items • Updated Nov 18, 2025 • 3

liked a dataset 3 months ago

KD-TAO/LVOmniBench

Updated Apr 3 • 809 • 9

upvoted 2 papers 3 months ago

LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs

Paper • 2603.19217 • Published Mar 19 • 29

The Trinity of Consistency as a Defining Principle for General World Models

Paper • 2602.23152 • Published Feb 26 • 202

upvoted 2 papers 4 months ago

Thinking with Drafting: Optical Decompression via Logical Reconstruction

Paper • 2602.11731 • Published Feb 12 • 36

dVoting: Fast Voting for dLLMs

Paper • 2602.12153 • Published Feb 12 • 23

upvoted a paper 5 months ago

OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding

Paper • 2512.23646 • Published Dec 29, 2025 • 15

liked a dataset 6 months ago

OpenRaiser/Envision

Updated Dec 2, 2025 • 1k • 49 • 26

upvoted 3 papers 6 months ago

Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights

Paper • 2512.01816 • Published Dec 1, 2025 • 95

OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models

Paper • 2511.14582 • Published Nov 18, 2025 • 19

Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward

Paper • 2511.20561 • Published Nov 25, 2025 • 33
