nvidia/Nemotron-Cascade-2-30B-A3B · Cascade-2 cheating attempts on Math is... cute

#25

by chankhavu - opened about 17 hours ago

edited about 17 hours ago

I was running inference with this model on a subset of the harder problems from Nemotron-Math-V2, hoping to get a decent dataset for fun. I noticed that the model created a bunch of unrelated files in the sandbox folder (which, foolishly, was not network-isolated in my case).

It downloaded IMO2021SL.pdf, imo2022sl.pdf, and so on. I opened those files and they were legitimate PDFs, which means the model tried to cheat by downloading them 😄 Here are the statistics of network access attempts (out of 10K reasoning traces):

Domain	Traces
artofproblemsolving.com	64
en.wikipedia.org	58
www.google.com	40
duckduckgo.com	39
oeis.org	28
math.stackexchange.com	13
api.stackexchange.com	12
www.imo-official.org	11
html.duckduckgo.com	11
raw.githubusercontent.com	8
api.duckduckgo.com	6
mathworld.wolfram.com	6
purplecomet.org	5
api.github.com	3
arxiv.org	3
www.bing.com	3
stackoverflow.com	3
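
For anyone who wants to produce a similar per-domain tally on their own traces, here is a minimal sketch. It is not the script used for the table above. It assumes each trace is available as one plain-text string (tool calls included) and it treats any URL that appears in a trace as a network access attempt, which will overcount if the model merely mentions a URL without fetching it. The function names are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Rough URL matcher: stops at whitespace, quotes, and closing brackets.
URL_RE = re.compile(r"https?://[^\s\"'<>)\]]+")


def domains_in_trace(trace_text: str) -> set[str]:
    """Return the set of hostnames found in URLs inside one trace."""
    hosts = set()
    for url in URL_RE.findall(trace_text):
        try:
            host = urlparse(url).hostname
        except ValueError:
            # urlparse rejects some malformed URLs (e.g. broken IPv6 hosts).
            continue
        if host:
            hosts.add(host.lower())
    return hosts


def count_traces_per_domain(traces: list[str]) -> Counter:
    """Count how many traces mention each domain (at most once per trace)."""
    counts = Counter()
    for text in traces:
        counts.update(domains_in_trace(text))
    return counts


if __name__ == "__main__":
    # Hypothetical toy traces, only to show the call pattern.
    sample = [
        "curl -L https://artofproblemsolving.com/wiki/index.php/Some_Page",
        "requests.get('https://oeis.org/search?q=1,1,2,3,5')",
        "pure reasoning, no tool calls",
    ]
    for host, n in count_traces_per_domain(sample).most_common():
        print(f"{host}\t{n}")
```

Counting each domain at most once per trace matches a "Traces" column rather than a raw request count. If your harness logs tool calls in a structured way, parsing those logs would be more reliable than a regex over the text.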

The cutest part is that in most of these cheating attempts, the model still fails to produce the correct answer. Here is an in-depth analysis, together with the exported reasoning traces with tool use: https://huggingface.co/datasets/chankhavu/nemotron-cascade2-cheating-attempts (the analysis was done by Claude Code)