Tighten symbol types throughout the codebase#505
Conversation
# Conflicts: # hkmc2/shared/src/main/scala/hkmc2/codegen/BlockSimplifier.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/Lowering.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/ReflectionInstrumenter.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/StackSafeTransform.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/UsedVarAnalyzer.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/js/JSBuilder.scala # hkmc2/shared/src/test/mlscript/block-staging/Nested.mls # hkmc2/shared/src/test/mlscript/codegen/BlockPrinter.mls # hkmc2/shared/src/test/mlscript/codegen/FirstClassFunctionTransform.mls # hkmc2/shared/src/test/mlscript/codegen/Hygiene.mls # hkmc2/shared/src/test/mlscript/codegen/PlainClasses.mls # hkmc2/shared/src/test/mlscript/deforest/relaxedPrograms.mls # hkmc2/shared/src/test/mlscript/lifter/PrivateMutableFields.mls
# Conflicts: # hkmc2/shared/src/main/scala/hkmc2/codegen/HandlerLowering.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/Lifter.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/Lowering.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/ReflectionInstrumenter.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/StackSafeTransform.scala
# Conflicts: # hkmc2/shared/src/main/scala/hkmc2/codegen/Block.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/cpp/CodeGen.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/llir/Analysis.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/llir/Builder.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/llir/Interp.scala # hkmc2/shared/src/main/scala/hkmc2/codegen/llir/Llir.scala
If Copilot won't do it then I'll ask Claude to do it then! (I'll separately go through the changes manually - Just treat this as what Copilot was supposed to do). I also explicitly told Claude to ignore the LLIR changes - It's just a simple removal so I don't think there's much to review there. Code reviewReviewed locally on Correctness1.
case Assign(_: NoSymbol, r, rst) => rst.definedVars
case Assign(l: TermSymbol, r, rst) => rst.definedVars + l // ← treated as reachable
case Assign(l: LocalVarSymbol, r, rst) => rst.definedVars + lbut case Assign(_: NoSymbol, rhs, rest) => rhs.freeVars ++ rest.freeVars
case Assign(lhs: FreeSymbol, rhs, rest) => Set.single(lhs) ++ ...
Please either add an explicit 2. Acknowledged Lifter hack. In Other correctness-sensitive bits look right:
Code quality / conventions3. Duplicated type aliases across packages. 4. Snake_case leak. 5. Minor. Behavioral changes worth a second look
TestsGood coverage — VerdictStrong, coherent tightening. I'd block on #1 (the |
|
Impressive review by the slop-machine! Though, of course, it missed the forest for the tree, it made me realize It also got something plain wrong: |
|
PS: Feels funny to be complimented on "strong, coherent tightening" by the weak, incoherent code generator. |
|
@CAG2Mark Nothing too serious actually. I had a branch with much more extensive changes, but couldn't go the last mile, so I just backtracked. |

Well... this took me way longer than I would have thought.
We've had a bad habit of just using
Symbol(or its infamous aliasLocal) everywhere, which led to nobody really knowing whta kind of symbol was really valid where, which in turn led to a proliferation of hacks, workarounds, outright violations, and needlessly defensive programming.This PR is the first step towards pulling us out of this quagmire. I managed to tighten the symbol types of most things, although some "looser" types are still floating around in some parts, such as
ValueSymbol. Ideally, we'd remove most uses of this type.The lifter in particular caused me immense amounts of pain, as I was trying to figure out just what kind of symbol was allowed where and how everything somehow manages to magically fall back into place after traversing all these crazy mappings. I think there is a large potential for simplification here; I tried to achieve it for a while, but failed miserably. So, for now it uses quite loose type and a hack which is real code smell:
There is still something funny going on, which causes everything to comes crashing down if I try to improve this part. Hopefully @CAG2Mark can help with this (in a follow-up PR).
Oh, and I probably wasted much more time than I saved by trying to use Codex and Copilot to help with the refactoring. I think this is a good example of a task at which they are completely useless with: redesigning some complicated mechanisms that sit outside of their usual training data. Their "fixes" were always worthless workarounds that only piled more technical debt on the existing mess!