Add code audit — 10 issues found with fixes
AUDIT.md
ADDED
@@ -0,0 +1,55 @@
# BokehFlow Code Audit — Issues Found and Fixed
## CRITICAL BUGS (Will cause training failure or incorrect results)
### 1. ❌ Compositing double-multiplication bug (render_bokeh, line ~825)
**Problem:** `output = output + blurred * visible / (blurred_mask + 1e-6) * visible`
This multiplies by `visible` twice: the expression evaluates left to right as `blurred * visible² / (blurred_mask + 1e-6)`, so each layer is weighted by `visible²` instead of `visible` and the alpha compositing is wrong.
**Fix:** `output = output + blurred / (blurred_mask + 1e-6) * visible`

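The difference between the two expressions in issue 1 can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and it assumes `blurred_mask` is the blurred layer mask used for normalization. The arithmetic works on plain floats as well as tensors.

```python
def composite_layer(output, blurred, blurred_mask, visible, eps=1e-6):
    """Add one layer, normalized by its blurred mask and weighted once by visibility."""
    normalized = blurred / (blurred_mask + eps)
    return output + normalized * visible


def composite_layer_buggy(output, blurred, blurred_mask, visible, eps=1e-6):
    """Original expression, kept only for comparison: `visible` is applied twice."""
    return output + blurred * visible / (blurred_mask + eps) * visible


# At half visibility the fixed version adds about 0.5 of the normalized layer,
# while the original expression adds only about 0.25 of it.
fixed = composite_layer(0.0, blurred=0.8, blurred_mask=0.8, visible=0.5)
buggy = composite_layer_buggy(0.0, blurred=0.8, blurred_mask=0.8, visible=0.5)
```

The error grows as `visible` shrinks, so partially occluded layers are affected most.
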
### 2. ❌ CoC map computation: focus-distance clamp can break gradients when `f` is learnable
**Problem:** When `D == S₁` (a pixel exactly at the focus distance), the CoC should be exactly 0. The formula computes `abs(D - S₁)`, which handles this case correctly. The issue is `S1.clamp(min=f+1.0)`: when `f` is a learnable parameter, the clamp bound carries gradient and can produce NaN gradients.
**Fix:** Detach `f` inside the clamp or use a fixed minimum.

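A minimal sketch of the detached clamp from issue 2 follows. It assumes the standard thin-lens CoC formula with all lengths in one unit (millimetres here), which may differ from the repository's exact expression; `coc_map` and its argument names are hypothetical.

```python
import torch


def coc_map(depth, focus_dist, focal_len, f_number):
    """Thin-lens circle-of-confusion diameter; all lengths share one unit (assumed mm)."""
    # Keep the focus distance beyond the focal length. The bound is detached,
    # so no gradient reaches focal_len through the clamp itself.
    s1 = torch.maximum(focus_dist, focal_len.detach() + 1.0)
    aperture = focal_len / f_number  # aperture diameter A = f / N
    return aperture * focal_len / (s1 - focal_len) * (depth - s1).abs() / depth


focal_len = torch.tensor(50.0, requires_grad=True)  # learnable f
depth = torch.tensor([1000.0, 2000.0, 4000.0])
focus = torch.tensor(2000.0)

coc = coc_map(depth, focus, focal_len, f_number=2.0)
coc.sum().backward()
```

The pixel at `depth == focus` gets a CoC of exactly zero, and `focal_len.grad` stays finite because the gradient flows only through the optical terms.
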
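### 3. ❌ BatchNorm in ConvStem is unreliable at batch_size=1 outside `eval()` mode
**Problem:** `nn.BatchNorm2d` normalizes with batch statistics in training mode. At batch_size=1 those statistics come from a single image, so they are noisy, and PyTorch raises an error outright when there is only one value per channel. Inference is safe only if the model has been switched to `eval()`, which uses the running statistics.
**Fix:** Use `nn.GroupNorm(num_groups=8, num_channels=...)` (the channel count must be divisible by 8) or `nn.InstanceNorm2d` instead; neither depends on the batch size.
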
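One way the GroupNorm variant from issue 3 could look is sketched below. `ConvStemGN` is a hypothetical stand-in, not the actual `ConvStem`; the kernel size, stride, channel count, and activation are assumptions.

```python
import torch
from torch import nn


class ConvStemGN(nn.Module):
    """Hypothetical stand-in for ConvStem with GroupNorm instead of BatchNorm2d."""

    def __init__(self, in_ch=3, out_ch=64, num_groups=8):
        super().__init__()
        if out_ch % num_groups != 0:
            raise ValueError("out_ch must be divisible by num_groups")
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1, bias=False)
        self.norm = nn.GroupNorm(num_groups=num_groups, num_channels=out_ch)
        self.act = nn.GELU()

    def forward(self, x):
        return self.act(self.norm(self.conv(x)))


stem = ConvStemGN()
x = torch.randn(1, 3, 32, 32)   # batch_size=1
y_train = stem.train()(x)       # statistics are computed per sample
y_eval = stem.eval()(x)         # no running statistics, so the result is unchanged
```

Because GroupNorm keeps no running statistics, the output is identical in training and evaluation mode.
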
## STABILITY ISSUES (May cause NaN/Inf during training)
### 4. ⚠️ No gradient clipping in the training config
**Problem:** The GatedDeltaNet recurrence chains matrix operations across time steps. Without gradient clipping, gradients can explode.
**Fix:** Add `max_grad_norm=1.0` to the training config.

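If the training loop is written by hand, the `max_grad_norm` setting from issue 4 corresponds to a call to `torch.nn.utils.clip_grad_norm_` between `backward()` and `step()`. The model, optimizer, and inflated loss below are placeholders chosen only to make the sketch self-contained.

```python
import torch
from torch import nn

MAX_GRAD_NORM = 1.0  # value proposed in the audit

torch.manual_seed(0)
model = nn.Linear(8, 1)  # placeholder for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

x, target = torch.randn(16, 8), torch.randn(16, 1)
loss = 1e3 * nn.functional.mse_loss(model(x), target)  # inflated to force large gradients

optimizer.zero_grad()
loss.backward()
# Rescales all gradients in place so that their global L2 norm is at most
# MAX_GRAD_NORM, and returns the norm measured before clipping.
norm_before = torch.nn.utils.clip_grad_norm_(model.parameters(), MAX_GRAD_NORM)
optimizer.step()
```

Logging `norm_before` during training is a cheap way to see how often clipping is actually active.
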
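### 5. ⚠️ Key L2-normalization: correct, but check the epsilon
**Problem:** `F.normalize(k, p=2, dim=-1)` divides by `max(‖k‖₂, eps)` with a default `eps=1e-12`, so an all-zero `k` maps to zeros in float32. That default is too small to be represented in float16, where the division can become 0/0 and produce NaN.
**Fix:** Pass an explicit eps: `k = F.normalize(k, p=2, dim=-1, eps=1e-8)`. Note that `1e-8` also underflows to zero in float16, so use a larger value (e.g. `1e-6`) if keys are normalized in half precision.
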
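The zero-key case from issue 5 is easy to check in float32. The tensor shape below (batch, heads, head_dim) is an assumption made for the example.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
k = torch.zeros(2, 4, 32)    # first sample: all-zero keys, the worst case
k[1] = torch.randn(4, 32)    # second sample: ordinary keys

# Each key is divided by max(||k||_2, eps), so zero keys stay zero.
k_hat = F.normalize(k, p=2, dim=-1, eps=1e-8)
```

Zero keys come out as zeros and all other keys have unit norm; the same check can be repeated in the training dtype to confirm that the chosen eps survives the cast.
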
### 6. ⚠️ State explosion risk
**Problem:** The state update `state = a_t * (state - b_t * (state @ kk_t)) + b_t * vk_t` applies a matrix product at every step. If α≈1 and β≈0 for many steps, the state is neither decayed nor corrected, and its norm can grow unbounded.
**Fix:** Add periodic state normalization every 256 steps: `state = state / (state.norm() + 1e-6).clamp(min=0.1)`. Note that this divides by `max(‖state‖, 0.1)`, so any state with norm above 0.1 is rescaled to unit norm, not only states that have grown large.

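As an alternative to unconditional renormalization in issue 6, the state norm can be capped only when it exceeds a threshold. The sketch below is not the repository's code: it assumes `kk_t` and `vk_t` are the outer products `k kᵀ` and `v kᵀ`, and `cap_state_norm` with `max_norm=10.0` is a hypothetical helper.

```python
import torch
import torch.nn.functional as F


def gated_delta_step(state, a_t, b_t, k_t, v_t):
    """One step of the quoted update; state is (d_v, d_k) and k_t is unit-norm."""
    kk_t = torch.outer(k_t, k_t)  # (d_k, d_k), assumed definition
    vk_t = torch.outer(v_t, k_t)  # (d_v, d_k), assumed definition
    return a_t * (state - b_t * (state @ kk_t)) + b_t * vk_t


def cap_state_norm(state, max_norm=10.0, eps=1e-6):
    """Rescale only when the Frobenius norm exceeds max_norm; smaller states pass through."""
    scale = (max_norm / (state.norm() + eps)).clamp(max=1.0)
    return state * scale


torch.manual_seed(0)
d = 8
state = torch.zeros(d, d)
for step in range(1, 1025):
    k = F.normalize(torch.randn(d), dim=-1)
    v = 5.0 * torch.randn(d)
    state = gated_delta_step(state, a_t=1.0, b_t=0.05, k_t=k, v_t=v)
    if step % 256 == 0:  # interval suggested in the audit
        state = cap_state_norm(state)
```

Unlike dividing by the norm every time, this leaves well-behaved states untouched, so the magnitude information stored in the state is preserved until the cap is reached.
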
### 7. ⚠️ Softplus depth output has no upper bound
**Problem:** `nn.Softplus()` can output arbitrarily large values, causing CoC explosion.
**Fix:** `depth = F.softplus(raw_depth).clamp(max=100.0)` (100 meters max).

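A small sketch of the bounded depth head from issue 7 is given here; `bounded_depth` and `MAX_DEPTH_M` are illustrative names, not identifiers from the repository.

```python
import torch
import torch.nn.functional as F

MAX_DEPTH_M = 100.0  # upper bound proposed in the audit


def bounded_depth(raw_depth):
    """Map unconstrained network output to a non-negative depth of at most MAX_DEPTH_M."""
    return F.softplus(raw_depth).clamp(max=MAX_DEPTH_M)


raw = torch.tensor([-5.0, 0.0, 3.0, 500.0], requires_grad=True)
depth = bounded_depth(raw)
depth.sum().backward()
```

One consequence of the hard clamp is that pixels already at the bound receive zero gradient (`raw.grad[-1]` is 0 here), so they cannot move back below the limit through this path alone.
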
## LOGICAL ISSUES
### 8. ⚠️ embed_dim and inner_dim for base variant (no mismatch, but heavy)
**Problem:** `num_heads=6, head_dim=32` gives `inner_dim=192`, which matches `embed_dim=192`, so the linear projection `to_qkv` maps 192→3*192=576. This is correct, and the output gate adds another 192→192 projection. No bug, but very heavy for the base variant.

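The projection sizes quoted in issue 8 can be reproduced with a few lines of arithmetic. Only weight matrices are counted; whether the layers carry biases is not stated in the audit, so biases are left out as an assumption.

```python
num_heads, head_dim, embed_dim = 6, 32, 192
inner_dim = num_heads * head_dim

# Weight counts per block (biases ignored as an assumption).
to_qkv_weights = embed_dim * 3 * inner_dim  # 192 -> 576 projection
gate_weights = embed_dim * inner_dim        # 192 -> 192 output gate

print(inner_dim == embed_dim, to_qkv_weights, gate_weights)  # True 110592 36864
```
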
### 9. ⚠️ Direction fusion uses outputs before normalization
**Problem:** The adaptive direction fusion `softmax(W_γ · [o_→;...])` operates on raw scan outputs, and only the fused result is LayerNorm'd. The scan outputs can have different scales per direction, so one direction may always dominate the softmax.
**Fix:** Apply LayerNorm to each scan output BEFORE fusion, or use a temperature in the softmax.

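The fix for issue 9 could be sketched as follows, combining both suggestions. The module name, the number of directions (4), the tensor layout `(batch, tokens, dim)`, and the use of a single linear layer for `W_γ` are all assumptions, since the audit does not show the original fusion code.

```python
import torch
from torch import nn


class NormalizedDirectionFusion(nn.Module):
    """Hypothetical fusion: LayerNorm per direction, then temperature-scaled softmax gating."""

    def __init__(self, dim, num_dirs=4, temperature=1.0):
        super().__init__()
        self.norms = nn.ModuleList([nn.LayerNorm(dim) for _ in range(num_dirs)])
        self.gate = nn.Linear(num_dirs * dim, num_dirs)  # plays the role of W_γ
        self.temperature = temperature

    def forward(self, scan_outputs):
        # scan_outputs: list of num_dirs tensors, each (batch, tokens, dim)
        normed = [norm(o) for norm, o in zip(self.norms, scan_outputs)]
        logits = self.gate(torch.cat(normed, dim=-1)) / self.temperature
        weights = logits.softmax(dim=-1)                # (batch, tokens, num_dirs)
        stacked = torch.stack(normed, dim=-1)           # (batch, tokens, dim, num_dirs)
        fused = (stacked * weights.unsqueeze(-2)).sum(dim=-1)
        return fused, weights


torch.manual_seed(0)
fusion = NormalizedDirectionFusion(dim=16)
# Directions with very different scales, as described in the problem statement.
outs = [scale * torch.randn(2, 5, 16) for scale in (0.1, 1.0, 10.0, 100.0)]
fused, weights = fusion(outs)
```

Returning the weights alongside the fused output makes it straightforward to monitor whether one direction still dominates after normalization.
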
### 10. ⚠️ TSP state shape mismatch
**Problem:** `self.S_init` has shape `(1, num_heads, head_dim, head_dim)`, but BiGDR returns a list of states (one per scan direction), not a single state. The propagate function iterates over `block_states`, whose elements are lists, not tensors.
**Fix:** S_init should match the per-direction state shape, and propagation should handle the list structure properly.

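A possible shape-consistent layout for issue 10 is sketched below. Everything here is hypothetical: the class and function names, the number of directions, and in particular the blending rule in `propagate`, which is only a placeholder because the audit does not describe the actual propagation rule.

```python
import torch
from torch import nn


class StateInit(nn.Module):
    """Hypothetical per-direction initial state that mirrors a list-of-states layout."""

    def __init__(self, num_dirs, num_heads, head_dim):
        super().__init__()
        self.S_init = nn.ParameterList(
            [nn.Parameter(torch.zeros(1, num_heads, head_dim, head_dim)) for _ in range(num_dirs)]
        )

    def initial_states(self, batch_size):
        # One (batch, heads, head_dim, head_dim) tensor per scan direction.
        return [s.expand(batch_size, -1, -1, -1) for s in self.S_init]


def propagate(prev_states, block_states, mix=0.5):
    """Placeholder rule: blend states direction by direction; both inputs are lists."""
    if len(prev_states) != len(block_states):
        raise ValueError("direction count mismatch")
    return [(1 - mix) * p + mix * b for p, b in zip(prev_states, block_states)]


init = StateInit(num_dirs=4, num_heads=6, head_dim=32)
states = init.initial_states(batch_size=2)
new_states = propagate(states, [torch.ones_like(s) for s in states])
```

Keeping the initial state and the propagated state in the same list-of-tensors form means the length check catches a direction mismatch early instead of failing deep inside a tensor operation.
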
## DATASET CONFIRMED COMPATIBLE ✅
RealBokeh has paired data:
- Input: `{split}/in/{id}_f22.JPG` (sharp, f/22)
- GT: `{split}/gt/{id}/{id}_f{fstop}.JPG` (variable bokeh)
- Metadata: `{split}/metadata/{id}.json` with focal_length, focus_plane_distance, target_avs

This maps directly onto BokehFlow's inputs: image, f_number, focal_length_mm, focus_distance_m.
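The path templates listed above can be turned into a small helper. The function name, the example sample ID, and the f-stop string format are assumptions; `fstop` is inserted verbatim because the audit does not say how it is formatted in file names.

```python
from pathlib import Path


def realbokeh_paths(root, split, sample_id, fstop):
    """Build the three file paths for one sample from the templates in the audit."""
    base = Path(root) / split
    return {
        "input": base / "in" / f"{sample_id}_f22.JPG",
        "gt": base / "gt" / sample_id / f"{sample_id}_f{fstop}.JPG",
        "metadata": base / "metadata" / f"{sample_id}.json",
    }


# Hypothetical sample ID and f-stop string, for illustration only.
paths = realbokeh_paths("RealBokeh", "train", "0001", "2.0")
```
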