If you plan to build some Haskell tooling or any kind of static code analyzer, chances are you’ll need to use the GHC API at some point.
While loading a simple module into GHC’s API is quite trivial and well documented, loading complex modules (modules having some C dependencies, some specific options in the .cabal file, etc.) will require you to find the appropriate dynamic flags. These flags are usually retrieved and loaded into GHC by Cabal. Sadly for us, Cabal’s API does not seem to expose a direct way to get these flags.
Some people have developed solutions to work around this problem. In this post, we’ll explore two of them.
Revision: If you plan to use GHC >= 8.6.1, you might also want to check out GHC source plugins. Mpickering wrote about them. They will not be mentioned in this article.
GHC Plugin + “GHCi Wrapper”
This technique has been detailed by Edward Yang on his website.
The idea basically boils down to this:
A more reliable way to integrate a GHC API program with Cabal is inversion of control: have Cabal call your GHC API program, not the other way around!
What we will do is replace the GHC executable which passes through all commands to an ordinary GHC, except for ghc --interactive, which we will then pass to the GHC API program. Then, we will call Cabal repl/stack repl with our overloaded GHC, and where we would have opened a GHCi prompt, instead our API program gets run.
You’ll first need to write your GHC API program as a GHC frontend plugin. Cabal will then build the dependencies using a regular GHC, while your own package will be loaded through your custom frontend plugin. You can then use GHC’s API without having to think about any dynamic flags: they have already been loaded for you by Cabal.
You’ll still need to get the data extracted by your GHC API program home. Sadly, your GHC frontend plugin will be instantiated in another process, so you won’t have direct access to it. I guess you could use one of the usual suspects (stdout, a message queue, a DBMS, etc.) to bring everything home. I guess you’ll also need a proper way to retrieve and manage the errors.
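To make the stdout option a bit more concrete, here is a minimal sketch using only base. It is not taken from the linked article: the marker string, the Report type and both function names are made up for illustration. The plugin prints one tagged line per module, carrying either an error message or the list of exports, and the parent process keeps only the tagged lines it can parse.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- | Hypothetical payload: a module name with either an error message
-- or the names it exports.
type Report = (String, Either String [String])

-- | Hypothetical marker tagging the lines written by our plugin, so the
-- parent can tell them apart from the rest of the build output.
marker :: String
marker = "@@ghc-api-report@@ "

-- | Plugin side: encode one report as a single tagged line.
-- 'show' escapes newlines, so the payload always fits on one line.
encodeReport :: Report -> String
encodeReport r = marker ++ show r

-- | Parent side: keep the tagged lines that parse, drop everything else.
decodeReports :: String -> [Report]
decodeReports = mapMaybe parseLine . lines
  where
    parseLine l = stripPrefix marker l >>= readMaybe
```

On the plugin side you would putStrLn (encodeReport ...) for each module; on the parent side you would feed the captured stdout of the Cabal process to decodeReports. Show/Read is only the simplest thing that works here, a real tool would probably prefer a sturdier format.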
I’m not going to dive into more details here: the previously linked article contains some code examples, which should be more than enough if you want to learn more about this trick.
Cabal helper is a library initially created for ghc-mod.
The idea here is a bit different: instead of having our GHC API program loaded by Cabal, we’re going to retrieve the build metadata left by Cabal during the package build process and parse it.
Here, you won’t need a GHC wrapper or a frontend plugin.
Let’s see how we can use this library and the GHC API in practice to retrieve a module’s exports.
Using Cabal-helper: retrieving a module’s exports
First, we’ll need to let Cabal install the package’s dependencies and build the project.
cabal install --dependencies-only
cabal build
Cabal should now have built the artifacts we need to retrieve the dynamic flags. Let’s parse these artifacts and get the options passed to GHC.
let fp = "path to the dir containing the .cabal file"
    qe = mkQueryEnv fp (fp </> "dist")
cs <- runQuery qe $ components $ (,) <$> ghcOptions
Your package may contain several Cabal components: a library, an application, a unit test suite, an integration test suite, etc. Each component is built differently, so each will have its own set of dynamic flags.
You’ll probably want to filter this list to extract the component you’re interested in. In this example, let’s say we are interested in the library:
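-- ChLibName is a constructor: written in lowercase it would be a
-- variable pattern and match every component.
getLib (_, ChLibName) = True
getLib _ = False

head $ filter getLib cs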
Altogether, you should have something like this.
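import Control.Monad.IO.Class (MonadIO, liftIO)
import qualified Distribution.Helper as H
import System.FilePath ((</>))

-- Returns the GHC options Cabal used to build the library component.
getCabalDynFlagsLib :: (MonadIO m) => FilePath -> m [String]
getCabalDynFlagsLib fp = do
  let qe = H.mkQueryEnv fp (fp </> "dist")
  cs <- liftIO $ H.runQuery qe $ H.components $ (,) <$> H.ghcOptions
  -- head is partial: this crashes if the package has no library.
  pure . fst . head $ filter getLib cs
  where
    getLib (_, H.ChLibName) = True
    getLib _ = False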
Well, vaguely. You probably want to reflect some error cases in your API :)
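As one possible way to surface those error cases, here is a small sketch that replaces the partial head with an Either. To stay self-contained it uses a stand-in ComponentName type instead of cabal-helper’s own component name type, and the libOptions name is made up for this example; treat it as an illustration of the shape, not as cabal-helper’s API.

```haskell
-- | Stand-in for cabal-helper's component name type, reduced to what
-- this sketch needs (an assumption made to keep the example standalone).
data ComponentName
  = LibName
  | ExeName String
  | TestName String
  deriving (Eq, Show)

-- | Pick the GHC options of the library component, reporting the
-- failure cases instead of crashing on an empty list.
libOptions :: [([String], ComponentName)] -> Either String [String]
libOptions cs =
  case [opts | (opts, LibName) <- cs] of
    [opts] -> Right opts
    []     -> Left "no library component found"
    _      -> Left "more than one library component found"
```

The caller can then decide what a missing library means: fall back to another component, or report the problem to the user.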
We can now load these flags into GHC.
dflags0 <- getSessionDynFlags
dflags1 <- getCabalDynFlagsLib pfp
(dflags, _, _) <- parseDynamicFlags dflags0 (noLoc <$> dflags1)
_ <- setSessionDynFlags dflags
Your GHC API environment is finally set up, and you can now use it.
We still want to retrieve the exported symbols of a module: let’s write the corresponding GHC API code.
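getModExports :: (MonadIO m) => FilePath -> ModuleName -> m [String]
getModExports pfp mn = liftIO . runGhc (Just libdir) $ do
  -- Load the flags Cabal used for the library into the session.
  dflags0 <- getSessionDynFlags
  dflags1 <- getCabalDynFlagsLib pfp
  (dflags, _, _) <- parseDynamicFlags dflags0 (noLoc <$> dflags1)
  _ <- setSessionDynFlags dflags
  -- Load the module and its dependencies.
  target <- guessTarget fileName Nothing
  setTargets [target]
  _ <- load LoadAllTargets
  modSum <- getModSummary $ mkModuleName modName
  -- Run the pipeline up to desugaring, then read the exports.
  exportNames <$> (parseModule modSum >>= typecheckModule >>= desugarModule)
  where
    modName = intercalate "." $ components mn
    fileName = "./" <> toFilePath mn
    -- exportNames is pure, hence the <$> above rather than >>=.
    -- getAvName is not defined in this post; it is expected to turn
    -- an AvailInfo into a String.
    exportNames = fmap getAvName . mg_exports . dm_core_module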
Aaaaaand, that’s it, we’re done. Again: you would probably want to add a bit more error handling to this.
As you can see, cabal-helper has a pretty simple API and is quite straightforward to use. I just had to ignore some upper bounds to use it in my project.
Under the hood, it’s a whole other story. Since Cabal’s artifacts are quite heavily version-dependent, cabal-helper will compile and run a small wrapper at runtime. This wrapper is in charge of parsing the artifacts and sending the results back to the main process your program is running in. Since the wrapper has been built by the same Cabal that built the artifacts, we are kinda sure it will be able to parse them.
In our example, this does not really matter, as we are sure the Cabal that built cabal-helper is the same one that built the module we are inspecting. However, this can be a killer feature if you are building an editor plugin.