Bold change: Continue supporting LR's --array --ghc --coerce mode, abandon the rest #268

@sgraf812

The happy LR backend has 24 different modes, making it quite insane to maintain and extend. These 24 modes depend on

  • whether you want to provide your own lexer or simply have a [Token] available
  • whether you want to have a monadic parser
  • whether you want to generate a table-based parser (--array) or a recursive ascent parser with an explicit stack (i.e., generated Haskell code, the default). --array-based parsers tend to be introspectable (this is huge) and smaller, but a bit slower. (The speed difference seldom matters for an LR parser, because the language it parses will need name resolution and type checking as well.)
  • whether you want to target GHC (--ghc) for using unboxed integers and bytestrings (in --array mode)
    • and if so, whether you want to use unsafeCoerce for another performance boost (--coerce)

Realistic applications tend to use only --array --ghc --coerce. I suggest we abandon the non-array, non-ghc modes to get down to a more manageable 4 modes, not least because that also lets us drop the mandatory {-# LANGUAGE CPP #-} pragma that is currently emitted (#263). Ironically, that pragma means that even code generated without --ghc is GHC-dependent. Besides, I have no plans to make error resumption work in recursive ascent mode: the stack unwinding it requires would take a lot of work that is entirely orthogonal to the table-based mechanism.
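
For readers who want to check the counts, here is a minimal sketch that enumerates the option space described above. The types and constructor names are hypothetical and are not part of happy's source. The model assumes that the three target choices (no --ghc, --ghc, --ghc --coerce) combine freely with the other options, which is what reproduces the figure of 24.

```haskell
-- Hypothetical model of the LR backend's option space (illustrative only;
-- none of these names come from happy itself).
data Lexer       = TokenList | UserLexer        deriving (Show, Eq, Enum, Bounded)
data Monadicity  = Pure | Monadic               deriving (Show, Eq, Enum, Bounded)
data Backend     = RecursiveAscent | Array      deriving (Show, Eq, Enum, Bounded)
data Target      = Portable | Ghc | GhcCoerce   deriving (Show, Eq, Enum, Bounded)

data Mode = Mode Lexer Monadicity Backend Target
  deriving (Show, Eq)

-- Every combination of the four independent choices.
allModes :: [Mode]
allModes =
  [ Mode l m b t
  | l <- [minBound ..]
  , m <- [minBound ..]
  , b <- [minBound ..]
  , t <- [minBound ..]
  ]

-- The modes that remain if only --array --ghc --coerce is supported.
keptModes :: [Mode]
keptModes = [ md | md@(Mode _ _ Array GhcCoerce) <- allModes ]

main :: IO ()
main = do
  print (length allModes)   -- 2 * 2 * 2 * 3 = 24
  print (length keptModes)  -- 2 * 2 = 4
```

The four remaining modes differ only in the lexer and monad choices, which are the two options the proposal leaves untouched.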
