Loading a Cabal module in the GHC API

If you plan to build some Haskell tooling or any kind of static code analyzer, chances are you’ll need to use the GHC API at some point.

While loading a simple module into GHC’s API is quite trivial and well documented, loading complex modules (modules with C dependencies, specific options in the .cabal file, etc.) requires you to find the appropriate dynamic flags. These flags are usually retrieved and loaded into GHC by Cabal. Sadly for us, Cabal’s API does not seem to expose a direct way to get these flags.

Some people have developed solutions to work around this problem. In this post, we’ll explore two of them.

Revision: If you plan to use GHC >= 8.6.1, you might also want to check out GHC source plugins. Mpickering wrote about them. They will not be mentioned in this article.

GHC Plugin + “GHCi Wrapper”

This technique has been detailed by Edward Yang on his website.

The idea basically boils down to this:

A more reliable way to integrate a GHC API program with Cabal is inversion of control: have Cabal call your GHC API program, not the other way around!

[..]

What we will do is replace the GHC executable which passes through all commands to an ordinary GHC, except for ghc --interactive, which we will then pass to the GHC API program. Then, we will call Cabal repl/stack repl with our overloaded GHC, and where we would have opened a GHCi prompt, instead our API program gets run.

You’ll first need to write your GHC API program as a GHC frontend plugin. The dependencies will then be built by Cabal using a regular GHC, while your package will be loaded with your custom frontend plugin. You can then use GHC’s API without having to think about any dynamic flags: they have already been loaded for you by Cabal.

Pretty smart!
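To make the wrapper idea a bit more concrete, here is a minimal sketch of its decision logic, written in Haskell rather than as a shell script. The plugin module name `MyFrontend` and the package name `my-frontend-plugin` are hypothetical placeholders; `--frontend` and `-plugin-package` are the GHC flags used to select a frontend plugin. Only the argument rewriting is shown, not the actual process spawning.

```haskell
-- Sketch of the wrapper's decision logic. "MyFrontend" and
-- "my-frontend-plugin" are hypothetical names for your own plugin
-- module and the package that provides it.

-- | What the fake "ghc" should do with the arguments Cabal hands it.
data Invocation
  = PassThrough [String]   -- forward untouched to the real GHC
  | RunFrontend [String]   -- run the real GHC with our frontend plugin
  deriving (Eq, Show)

-- | Only an invocation containing @--interactive@ (what a repl command
-- ends up issuing) is redirected; everything else goes to the ordinary
-- compiler unchanged.
dispatch :: [String] -> Invocation
dispatch args
  | "--interactive" `elem` args =
      RunFrontend $
        ["--frontend", "MyFrontend", "-plugin-package", "my-frontend-plugin"]
          -- GHC accepts a single mode flag, so drop the original one.
          ++ filter (/= "--interactive") args
  | otherwise = PassThrough args
```

A real wrapper would then hand the resulting argument list to the ordinary ghc executable, for instance through the process package.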

You’ll still need to get the data extracted by your GHC API program home. Sadly, your GHC frontend plugin will be instantiated in another process, so you won’t have direct access to it. I guess you could use one of the usual suspects (stdout, a message queue, a DBMS, etc.) to bring everything home. You’ll also need a proper way to retrieve and manage errors.
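As an illustration of the stdout option, here is a small sketch of one possible convention: the plugin prefixes every result line with a marker, and the parent process keeps only the marked lines from the child's output. The marker string and the function names are made up for this example, and it assumes each result fits on a single line.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- | Hypothetical marker; any string unlikely to show up in normal
-- Cabal or GHC output would do.
marker :: String
marker = "@@ghc-api-result@@ "

-- | Plugin side: turn one extracted item into a line for stdout.
-- Assumes the item itself contains no newline.
encodeResult :: String -> String
encodeResult item = marker ++ item

-- | Parent side: keep only the marked lines from the child's stdout
-- and strip the marker, ignoring whatever else was printed.
decodeResults :: String -> [String]
decodeResults = mapMaybe (stripPrefix marker) . lines
```

The same framing could carry error reports by using a second marker, which keeps results and failures separable on the parent side.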

I’m not going to dive further into the details here: the [previously linked article](http://blog.ezyang.com/2017/02/how-to-integrate-ghc-api-programs-with-cabal/) contains some code examples and should be more than enough if you want to learn more about this trick.

Cabal Helper

cabal-helper is a library initially created for ghc-mod.

The idea here is a bit different: instead of having our GHC API program loaded by Cabal, we’re going to retrieve some build metadata left by Cabal during the package build process and parse it.

Here, you won’t need a GHC wrapper or a frontend plugin.

Let’s see how we can use this library and the GHC API in practice to retrieve a module’s exports.

Using cabal-helper: retrieving a module’s exports

First, we’ll need to let Cabal install the package’s dependencies and build the project.

cabal install --dependencies-only
cabal build

Cabal should have built the necessary artifacts for us to retrieve the dynamic flags. Let’s parse these artifacts and get the options passed to GHC.

let fp = "path to the dir containing the .cabal file"
    qe = mkQueryEnv fp (fp </> "dist")
-- One entry per component: the GHC options paired with the component name.
cs <- runQuery qe $ components $ (,) <$> ghcOptions

Your package may contain several Cabal components: a library, an application, a unit test suite, an integration test suite, etc. Each component is built differently, so each one has its own set of dynamic flags.

You’ll probably want to filter this list to extract the component you’re interested in. In this example, let’s say we are interested in the lib:

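-- Note the capitalized constructor: a lowercase name would be a variable
-- pattern and match every component.
getLib (_, ChLibName) = True
getLib _ = False
head $ filter getLib cs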

Altogether, you should have something like this.

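import Control.Monad.IO.Class (MonadIO, liftIO)
import qualified Distribution.Helper as H
import System.FilePath ((</>))

-- Ask cabal-helper for the GHC options of the package's library component.
-- Partial: 'head' fails if the package has no library.
getCabalDynFlagsLib :: (MonadIO m) => FilePath -> m [String]
getCabalDynFlagsLib fp = do
  let qe = H.mkQueryEnv fp (fp </> "dist")
  cs <- liftIO $ H.runQuery qe $ H.components $ (,) <$> H.ghcOptions
  pure . fst . head $ filter getLib cs
    where
      getLib (_,H.ChLibName) = True
      getLib _ = False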

Well, vaguely. You probably want to reflect some error cases in your API :)
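One way to surface those error cases is to replace the partial `head` with a lookup that returns an `Either`. The sketch below is self-contained, so it uses a simplified stand-in for cabal-helper's component name type; `ComponentName`, `FlagsError` and `libraryFlags` are hypothetical names, not part of the library.

```haskell
import Data.List (find)

-- | Stand-in for cabal-helper's component name type, reduced to what
-- this sketch needs. The real type is richer.
data ComponentName = LibName | ExeName String | TestName String
  deriving (Eq, Show)

-- | Hypothetical error type for the lookup.
data FlagsError
  = NoComponents                -- the query returned nothing at all
  | NoLibrary [ComponentName]   -- components exist, but none is a library
  deriving (Eq, Show)

-- | Total replacement for @fst . head . filter getLib@: picks the
-- options of the library component or explains why it could not.
libraryFlags :: [([String], ComponentName)] -> Either FlagsError [String]
libraryFlags [] = Left NoComponents
libraryFlags cs =
  case find ((== LibName) . snd) cs of
    Just (flags, _) -> Right flags
    Nothing         -> Left (NoLibrary (map snd cs))
```

Returning the list of components that were found makes the failure message more useful to whoever has to debug a package without a library.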

We can now load these flags into GHC.

dflags0 <- getSessionDynFlags
dflags1 <- getCabalDynFlagsLib pfp
(dflags, _, _) <- parseDynamicFlags dflags0 (noLoc <$> dflags1)
_ <- setSessionDynFlags dflags

Your GHC API environment is finally set up; you can now use it.

We still want to retrieve the exported symbols of a module: let’s write the corresponding GHC API code.

getModExports :: (MonadIO m) => FilePath -> ModuleName -> m [String]
getModExports pfp mn =
  liftIO . runGhc (Just libdir) $ do
    -- Load the flags Cabal used for the library component.
    dflags0 <- getSessionDynFlags
    dflags1 <- getCabalDynFlagsLib pfp
    (dflags, _, _) <- parseDynamicFlags dflags0 (noLoc <$> dflags1)
    _ <- setSessionDynFlags dflags
    -- Load the module, then run it through the pipeline up to desugaring.
    target <- guessTarget fileName Nothing
    setTargets [target]
    _ <- load LoadAllTargets
    modSum <- getModSummary $ mkModuleName modName
    desugared <- parseModule modSum >>= typecheckModule >>= desugarModule
    pure $ exportNames desugared
  where
    modName = intercalate "." $ components mn
    fileName = "./" <> toFilePath mn
    -- getAvName (not shown) renders one AvailInfo as a String.
    exportNames = fmap getAvName . mg_exports . dm_core_module

Aaaaaand, that’s it, we’re done. Again: you would probably want to add a bit more error handling to this.
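The `where` clause above derives two strings from the module name: the dotted name and a relative file path. I am assuming here that `components` and `toFilePath` come from Cabal's `Distribution.ModuleName`. The sketch below mimics them on a plain list of name segments so the result is easy to see; `dottedName` and `relativePath` are hypothetical helpers, and POSIX path separators are assumed.

```haskell
import Data.List (intercalate)

-- | Dotted module name, the form passed to 'mkModuleName'.
dottedName :: [String] -> String
dottedName = intercalate "."

-- | Relative source path without extension, as passed to 'guessTarget'.
-- Assumes "/" as the path separator.
relativePath :: [String] -> FilePath
relativePath parts = "./" ++ intercalate "/" parts
```

For the segments `["Data", "Foo", "Bar"]`, these give `Data.Foo.Bar` and `./Data/Foo/Bar`. Since the path is relative, the lookup depends on the directory the program is run from.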

As you can see, cabal-helper has a pretty simple API and is quite straightforward to use. I just had to ignore some upper bounds to use it in my project.

Under the hood, it’s a whole other story. Since Cabal’s artifacts are quite heavily version-dependent, cabal-helper will compile and run a small wrapper at runtime. This wrapper is in charge of parsing the artifacts and sending the results back to the main process your program is running in. Since the wrapper has been built with the same Cabal that built the artifacts, we are kinda sure it will be able to parse them.

In our example, this does not really matter, as we are sure the Cabal that built cabal-helper is the same one that built the module we are inspecting. However, this can be a killer feature if you are building an editor plugin.

Special thanks to [Edward Yang](http://ezyang.com) for writing an article detailing the ghc --interactive trick and [Dxld](http://darkboxed.org/) for writing cabal-helper, helping me figure out how to use it, and relaxing some upper bounds :)