Fixes to backend to support matmul
Includes enough backend fixes to support matmul:
- Compute block names correctly, so that phis and jumps between blocks inside and outside of fork-joins are handled properly
- Fix a bunch of one-off bugs in the backend
- Add an Xdot visualizer for schedule IR to make future debugging easier
Interestingly, this did not require adding a pass that coalesces forks or sequentializes outer forks. Technically, matmul generates invalid schedule IR: the reduce loop yields a reduction variable that does not dominate its use in the initialization of a reduction variable. When this is lowered to LLVM IR, however, the reduction variable gets pulled into a phi at the top of the loop, so the resulting LLVM IR is actually valid. This is a nice coincidence, but it won't work in general once we try to lower to parallel fork-joins. So fork coalescing and sequentialization at the Hercules IR level will still be necessary in the long term.
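To make the dominance argument concrete, here is a minimal standalone sketch (not code from this MR or from the Hercules backend). It assumes a toy four-block CFG for a sequential loop, numbered 0 = entry, 1 = loop header, 2 = loop body, 3 = exit. The names `dominators`, `dominates`, and `toy_loop_cfg` are hypothetical and exist only for illustration. It uses the textbook iterative dominator computation to show why a value defined at the loop header (where LLVM places the phi) dominates uses in the body and exit, while a value defined in the body does not.

```rust
use std::collections::BTreeSet;

/// Compute, for every block, the set of blocks that dominate it.
/// `preds[b]` lists the predecessors of block `b`. Block 0 is assumed to be
/// the entry, and the CFG is assumed to be non-empty.
fn dominators(preds: &[Vec<usize>]) -> Vec<BTreeSet<usize>> {
    let n = preds.len();
    let all: BTreeSet<usize> = (0..n).collect();
    // Start from "everything dominates everything" and shrink to a fixpoint.
    let mut dom = vec![all; n];
    dom[0] = BTreeSet::from([0]);
    let mut changed = true;
    while changed {
        changed = false;
        for b in 1..n {
            // Intersect the dominator sets of all predecessors of `b`.
            let mut acc: Option<BTreeSet<usize>> = None;
            for &p in &preds[b] {
                acc = Some(match acc {
                    None => dom[p].clone(),
                    Some(a) => a.intersection(&dom[p]).cloned().collect(),
                });
            }
            // Every block dominates itself.
            let mut new_set = acc.unwrap_or_default();
            new_set.insert(b);
            if new_set != dom[b] {
                dom[b] = new_set;
                changed = true;
            }
        }
    }
    dom
}

/// Does the block `def` dominate the block `use_block`?
fn dominates(dom: &[BTreeSet<usize>], def: usize, use_block: usize) -> bool {
    dom[use_block].contains(&def)
}

/// Hypothetical toy CFG: entry(0) -> header(1) -> body(2) -> header(1),
/// and header(1) -> exit(3).
fn toy_loop_cfg() -> Vec<Vec<usize>> {
    vec![
        vec![],     // 0: entry has no predecessors
        vec![0, 2], // 1: header is reached from entry and via the back edge
        vec![1],    // 2: body is reached from the header
        vec![1],    // 3: exit is reached from the header
    ]
}
```

With `let dom = dominators(&toy_loop_cfg());`, the call `dominates(&dom, 1, 2)` holds (header dominates body), whereas `dominates(&dom, 2, 3)` does not (body does not dominate exit). This is the property the sequential LLVM lowering happens to benefit from, and there is no reason to expect it to carry over to a parallel lowering.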
Edited by rarbore2