Cascade-2 cheating attempts on Math are... cute
I was running inference with this model on a subset of the harder problems from Nemotron-Math-V2, hoping to get a decent dataset for fun. I noticed that the model had created a bunch of unrelated files in the sandbox folder (which, foolishly, was not network-isolated in my case):
It downloaded IMO2021SL.pdf, imo2022sl.pdf and so on. I opened those files and they were legitimate PDF files, which means the model tried to cheat by downloading them. Here are the statistics of network access attempts (out of 10K reasoning traces):
| Domain | Traces |
|---|---|
| artofproblemsolving.com | 64 |
| en.wikipedia.org | 58 |
| www.google.com | 40 |
| duckduckgo.com | 39 |
| oeis.org | 28 |
| math.stackexchange.com | 13 |
| api.stackexchange.com | 12 |
| www.imo-official.org | 11 |
| html.duckduckgo.com | 11 |
| raw.githubusercontent.com | 8 |
| api.duckduckgo.com | 6 |
| mathworld.wolfram.com | 6 |
| purplecomet.org | 5 |
| api.github.com | 3 |
| arxiv.org | 3 |
| www.bing.com | 3 |
| stackoverflow.com | 3 |
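
For anyone who wants to build a similar table from their own traces, here is a minimal sketch of one way to do it. It assumes each trace is available as a plain string and that network access shows up as `http(s)://` URLs in the tool calls. That is my guess at a reasonable method, not necessarily how the numbers above were produced, and the names `domains_in_trace` and `count_traces_per_domain` are made up for illustration. Each domain is counted once per trace, no matter how many times the trace hits it.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumption: network access appears as literal http(s) URLs in the trace text.
URL_RE = re.compile(r"https?://[^\s\"'<>)\]]+")


def domains_in_trace(trace_text):
    """Return the set of distinct hostnames referenced by URLs in one trace."""
    hosts = set()
    for url in URL_RE.findall(trace_text):
        try:
            host = urlparse(url).hostname
        except ValueError:
            # Malformed URL (e.g. broken IPv6 literal): skip it.
            continue
        if host:
            hosts.add(host.lower())
    return hosts


def count_traces_per_domain(traces):
    """Count, for each hostname, how many traces reference it at least once."""
    counts = Counter()
    for trace_text in traces:
        counts.update(domains_in_trace(trace_text))
    return counts


if __name__ == "__main__":
    # Hypothetical traces, only for demonstration.
    demo = [
        "curl https://artofproblemsolving.com/wiki/a https://artofproblemsolving.com/b",
        "wget https://www.imo-official.org/problems/IMO2021SL.pdf",
        "plain reasoning, no tool calls",
    ]
    for host, n in count_traces_per_domain(demo).most_common():
        print(f"| {host} | {n} |")
```

Subdomains are kept separate here (for example `duckduckgo.com` and `html.duckduckgo.com`), which matches how the table above is broken down.
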
The cutest part is that in most of these cheating attempts, it fails to produce the correct answer. Here is an in-depth analysis, together with exported reasoning traces with tool use: https://huggingface.co/datasets/chankhavu/nemotron-cascade2-cheating-attempts (the analysis was done by Claude Code)