Skip to content

Latest commit

 

History

History
664 lines (532 loc) · 16.2 KB

renamer.md

File metadata and controls

664 lines (532 loc) · 16.2 KB

author: Sam Derbyshire, Well-Typed title: GHC's renamer subtitle: GHC Contributors' Workshop date: June 7th, 2023

Plan

  • First part: survey of the renamer and how information about identifiers flows through the compiler pipeline.
  • Second part: getting our hands dirty, fixing a bug in the renamer (interaction of COMPLETE sets of pattern synonyms and do notation).

What does the renamer do?

  1. Gives all names unique identifiers.
    • Resolve namespacing (type constructor vs variable, module qualification, record fields).
    • Handle shadowing.
  2. Usage & dependency analysis.
    • Redundant/unused imports, unused declarations, ...
    • Dependency analysis: determines order of typechecking.

Names

There are many ways that GHC can refer to an identifier depending on how much it knows about it.

:::{.element: class="fragment"}

  • OccName ::: :::{.element: class="fragment"}
  • Name ::: :::{.element: class="fragment"}
  • RdrName ::: :::{.element: class="fragment"}
  • GlobalRdrElt ::: :::{.element: class="fragment"}
  • Var, TyVar, Id

:::

OccName: occurrences with a NameSpace

:::{.element: class="fragment"}

data OccName = OccName
    { occNameSpace :: NameSpace
    , occNameFS    :: FastString }

data NameSpace
  = VarName
  | FldName { fldParent :: FastString }
  | DataName
  | TvName
  | TcClsName

newtype OccEnv a = MkOccEnv (FastStringEnv (UniqFM NameSpace a))

:::

Name: unique identifiers

:::{.element: class="fragment"}

data Name = Name
  { n_sort :: NameSort
  , n_occ  :: OccName
  , n_uniq :: Unique
  , n_loc  :: SrcSpan
  }

data NameSort
  = External Module
  | WiredIn Module TyThing BuiltInSyntax
  | Internal
  | System

:::

RdrName: umbrella datatype

:::{.element: class="fragment"}

data RdrName
  = Unqual OccName
  | Qual ModuleName OccName
  | Orig Module OccName
  | Exact Name

:::

Var and Id: variables

:::{.element: class="fragment"}

data Var -- slightly abridged
  = TyVar { varName    :: Name
          , varType    :: Kind }
  | TcTyVar { varName       :: Name
            , varType       :: Kind
            , tc_tv_details :: TcTyVarDetails }
  | Id { varName    :: Name
       , varType    :: Type
       , varMult    :: Mult
       , idScope    :: IdScope
       , id_details :: IdDetails
       , id_info    :: IdInfo
       }
data IdScope = GlobalId | LocalId ExportFlag
data ExportFlag = NotExported | Exported
data IdDetails = VanillaId | RecSelId {} | PrimOpId {} | CoVarId {} | ...

type TyVar = Var
type Id = Var

:::

Information about Names: TyThing and GREInfo

:::{.element: class="fragment"}

data TyThing
  = AnId     Id
  | AConLike ConLike
  | ATyCon   TyCon
  | ACoAxiom (CoAxiom Branched)

:::

:::{.element: class="fragment"}

data GREInfo
  = Vanilla
  | IAmTyCon    (TyConFlavour Name)
  | IAmConLike  ConInfo
  | IAmRecField RecFieldInfo

:::

GlobalRdrElt and the GlobalRdrEnv

The renamer primarily deals with GlobalRdrElt, which consists of a Name, information about how it's in scope in the renamer, and additional information that the renamer might need to know.

:::{.element: class="fragment"}

data GlobalRdrElt
  = GRE { gre_name :: Name
        , gre_par  :: Parent
        , gre_lcl  :: Bool
        , gre_imp  :: Bag ImportSpec
        , gre_info :: GREInfo }

type GlobalRdrEnv = OccEnv [GlobalRdrElt]

:::

The TcRn monad

Renaming and typechecking happens in a shared monad, TcM (also called TcRn or RnM).

:::{.element: class="fragment" data-fragment-index="1"}

type TcRnIf a b = IOEnv (Env a b)
type TcRn = TcRnIf TcGblEnv TcLclEnv
type TcM = TcRn
type RnM = TcRn

:::

:::{.element: class="fragment" data-fragment-index="2"} This is ReaderT over IO, with access to:

  • HscEnv – per-module options (e.g. flags passed to GHC) and environment
    e.g. UnitEnv, currently loaded modules;
  • TcGblEnv – generated during typechecking and passed on
    e.g. TypeEnv, InstEnv, GlobalRdrEnv;
  • TcLclEnv – changes as we move inside expressions
    e.g. SrcSpan, TcLevel, LocalRdrEnv.

    :::

Example: renaming a record field update

:::{.element: class="fragment"} Start by looking at rnExpr (RecordUpd {..}): :::

:::{.element: class="fragment"}

rnExpr (RecordUpd { rupd_expr = L l expr, rupd_flds = rbinds })
  = setSrcSpanA l $
    case rbinds of
      RegularRecUpdFields { recUpdFields = flds } ->
        do  { (e, fv_e) <- rnExpr expr
            ; (parents, flds, fv_flds) <- rnHsRecUpdFields flds
            ; ... }
      -- ...

:::

Looking up fields

After handling duplicates, rnHsRecUpdFields starts by calling lookupRecUpdFields to look up each field individually.

:::{.element: class="fragment"}

lookupRecUpdFields :: NE.NonEmpty (LHsRecUpdField GhcPs GhcPs)
                   -> RnM (NE.NonEmpty (HsRecUpdParent GhcRn))
lookupRecUpdFields flds
  = do { gre_env <- getGlobalRdrEnv
       ; fld1_gres NE.:| other_flds_gres <-
           mapM (lookupFieldGREs gre_env . getFieldUpdLbl) flds
       ; -- ...
       }

:::

:::{.element: class="fragment"} We retrieve GlobalRdrElts for each record field.
These have GREInfos which specify which constructors have that field.
We can use that to disambiguate.

:::

Disambiguating using constructors

:::{.element: class="fragment"}

data S = MkS1 { x, y :: Float } | MkS2 { x, y, z :: Float }
data T = MkT1 { x :: Word } | MkT2 { y :: Int }

foo r = r { x = 3, y = 4 }

:::

:::{.element: class="fragment"} Lookup: [(x, [MkS1, MkS2, MkT1]), (y, [MkS1, MkS2, MkT2])]. :::

:::{.element: class="fragment"} Intersect all the possible data constructors: [MkS1, MkS2]. :::

:::{.element: class="fragment"} Take parents (removing duplicates): [S]. ::: :::{.element: class="fragment"} There is a single parent: the record update is unambiguous!

:::

Flow through the compiler & trees that grow

:::{.element: class="fragment"}

type family IdP p
type LIdP p = XRec p (IdP p)
type family XRec p a

:::

:::{.element: class="fragment"}

type instance IdP (GhcPass p) = IdGhcP p
type family IdGhcP pass where
  IdGhcP 'Parsed      = RdrName
  IdGhcP 'Renamed     = Name
  IdGhcP 'Typechecked = Id

:::

:::{.element: class="fragment"}

type instance XRec (GhcPass p) a = GenLocated (Anno a) a

data GenLocated l e = L l e

type instance Anno RdrName = SrcSpanAnnN -- SrcSpan, but with
type instance Anno Name    = SrcSpanAnnN -- some extra stuff
type instance Anno Id      = SrcSpanAnnN -- for exact-print

:::

Lexing

We start off by lexing. See GHC.Parser.Lexer.x:

:::{.element: class="fragment"}

@varid  = $small $idchar*          -- variable identifiers
@conid  = $large $idchar*          -- constructor identifiers
@varsym = ($symbol # \:) $symbol*  -- variable (operator) symbol
@consym = \: $symbol*              -- constructor (operator) symbol

$small = [a-z \_]
$large = [A-Z]
$symbol = [\!\#\$\%\&\*\+\.\/\<\=\>\?\@\\\^\|\-\~\:]
$idchar = [$small $large $digit \']

:::

Parsing

Then parsing, in GHC.Parser.y.

  • Parse occurrences into `RdrName`.
    • Use `mkQual`/`mkUnqual` depending on qualification.
    • Resolve `NameSpace` using context (e.g. whether we are parsing a term or a type).
  • Fix up after the fact when necessary.

:::{.element: class="fragment"} Example: parsing a type constructor.

tyconsym :: { RdrName } -- (actually LocatedN RdrName)
  : CONSYM { mkUnqual tcClsName (getCONSYM $1) }
  | VARSYM { mkUnqual tcClsName (getVARSYM $1) }
  | ':'    { consDataCon_RDR }
  | '-'    { mkUnqual tcClsName (fsLit "-") }
  | '.'    { mkUnqual tcClsName (fsLit ".") }

:::

Quick example of fixing up

:::{.element: class="fragment"}

data D a b c d = Q a => b :+: c :*: d

:::

:::{.element: class="fragment"} Parse this as a type first. Then, after properly associating using the fixities, we find that :+: should be namespaced as a data constructor, and the others remain type constructors. :::

:::{.element: class="fragment"} We also need to resolve ambiguities between expression and patterns. See Note [Ambiguous syntactic categories] in GHC.Parser.PostProcess.

:::

The renamer & typechecker pipeline

:::{.element: class="fragment"} The main entry point is GHC.Tc.Module.tcRnModuleTcRnM. :::

  1. Rename the import list in `tcRnImports`.
  2. Rename and typecheck local declarations and the export list in `tcRnSrcDecls`.
    1. Rename local declarations in `rnTopSrcDecls`.
    2. Typecheck these declarations in `tcTopSrcDecls`.
    3. Rename the exports in `rnExportList`. This allows us to assemble a final `TcGblEnv` which contains everything provided by the module.
  3. Assemble the final typechecked module, to be passed onto the next stage of the compiler pipeline. Everything is extracted from the `TcGblEnv`.
    <p class="indicator">⭲</p>
    

Renaming imports

Entry point: GHC.Rename.Names.rnImportDecl.

  1. Load the module we're importing; this will load its interface file from disk, or directly load the information if it's available in memory.
  2. Figure out what we are importing, in `filterImports`.
    • For a blanket import, import everything.
    • For an explicit import list, accumulate everything that is mentioned.

      `import M( A(x, ..), B(..) )`

    • `import M hiding ( .. )`

      Works the same, except now we filter out instead.

  3. Add all the imported identifiers to the `GlobalRdrEnv`. We will look up in this environment when renaming the body of the module.
    <p class="indicator">⭲</p>
    

Renaming local definitions

:::{.element: class="fragment" data-fragment-index="1"} The main entry point to renaming local declarations is GHC.Rename.Module.rnSrcDecls. Control flow: :::

  1. Generate new `Name`s for all binders.
    1. "Non-value" binders `getLocalNonValBinders`.
      1. Type synonyms, type families, data declarations, data families, class declarations.
      2. Data family instances and class instances, including associated types and methods.
      3. Foreign imports.
    2. Pattern synonyms `extendPatSynEnv`.
    3. Generate new `Name`s and rename the LHS of top-level value bindings (`rnTopBindsLHS`).

  2. Rename declaration bodies.
    1. Type declarations `rnTyClDecls`.
      1. Type synonyms, type families, data declarations, data families, class declarations.
      2. Standalone kind signatures.
      3. Type family instances, data family instances, class instances.
      4. Role annotations.
    2. Top-level value bindings `rnValBindsRHS`.
    3. Everything else: `RULES`, foreign import/exports, default declarations...

Dependency analysis of type declarations

:::{.element: class="fragment"} rnTyClDecls

  • renames all types/classes defined in the module
  • uses this information to compute dependency groups (strongly-connected components). :::

:::{.element: class="fragment"}

rnTyClDecls :: [TyClGroup GhcPs] -> RnM ([TyClGroup GhcRn], FreeVars)
rnTyClDecls tycl_ds =
  do { tycls_w_fvs  <- mapM rnTyClDecl (tyClGroupTyClDecls tycl_ds)
     ; kisigs_w_fvs <- rnStandaloneKindSigs (tyClGroupKindSigs tycl_ds)
     ; instds_w_fvs <- mapM rnSrcInstDecl (tyClGroupInstDecls tycl_ds)
     ; let tycl_sccs = depAnalTyClDecls kisig_fv_env tycls_w_fvs
     ; .. }

:::

:::{.element: class="fragment"} See Note [Dependency analysis of type, class, and instance decls] in GHC.Rename.Module. :::


:::{.element: class="fragment"} This approach is known to have shortcomings, as it doesn't properly account for type instances when kind-checking.


See wiki page: Type-&-Class-Dependency-Analysis.

:::

Dependency analysis of value declarations

:::{.element: class="fragment"}

data HsBindLR idL idR
  = FunBind {..} -- used for function/variable bindings
  | PatBind {..} -- (pattern is never a simple variable)
  | VarBind {..} -- (introduced by the typechecker)
  | PatSynBind {..}
  | XHsBindsLR !(XXHsBindsLR idL idR) -- (introduced by the typechecker)

:::

:::{.element: class="fragment"} rnValBindsRHS

  • renames all value bindings
  • performs dependency analysis :::

:::{.element: class="fragment"}

rnValBindsRHS ctxt (ValBinds _ mbinds sigs)
  = do { (sigs', sig_fvs) <- renameSigs ctxt sigs
       ; binds_w_dus <- mapBagM (rnLBind (mkScopedTvFn sigs')) mbinds
       ; let !(anal_binds, anal_dus) = depAnalBinds binds_w_dus
       ; ... }

:::

Topics not covered

:::{.element: class="fragment"}

  • Template Haskell ::: :::{.element: class="fragment"}
  • Backpack :::

Slides available online: sheaf.github.io/ghc-renamer.