Silver Searcher: Useful Regexes for a Haskell Code-Base

TL;DR

I use four Perl-compatible regex patterns most of the time when searching Haskell code:

Functions:    "\b<PATTERN>\b[ \t\n]+::"
Types:        "(data|newtype|type)(\ +)\b<PATTERN>\b"
TypeClasses:  "class(\ +)(.*)(=>)*(\ *)\b<PATTERN>\b"
Constructors: "\|[\t\ ]+\b<PATTERN>\b"

I am still looking for a better way to search for a data constructor; email me at picnoir at this domain if you have a better idea.

Ag

Exploring an unknown code-base is always tricky: you need to somehow translate a text-based representation of a piece of software into an accurate model in your own mind. Being able to search through the code efficiently shortens the read-search feedback loop and frees up a lot of headspace.

Unlike many languages, Haskell does not have a real IDE, so like many developers I use a traditional text-based search tool: the Silver Searcher (ag). It is blazing fast and supports Perl-compatible regexes.

After using it to search through my Haskell code for quite some time, I found myself using the same regex patterns all day long. I soon started to write some aliases to cover most of my searching needs. In this article, I’ll share my favorite ones.

In the following regexes, <PATTERN> stands for the symbol we’re looking for. In the Vim commands further down, the same slot is filled by Vim’s <args> placeholder.

Functions

Searching for functions is quite straightforward: in a type signature, the function name is always followed by the :: lexeme. The lexeme is separated from the symbol by spaces, tabs, newlines, or a mix of them.

\b<PATTERN>\b[ \t\n]+::
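
To sanity-check this pattern outside of ag, here is a minimal sketch using Python's re module, on the assumption that it treats these constructs (\b, character classes, +) the same way ag's PCRE engine does. The Haskell snippet and the helper names are hypothetical and exist only for illustration.

```python
import re

def function_regex(symbol):
    # Same shape as the ag pattern, with <PATTERN> replaced by the escaped symbol.
    return re.compile(r"\b" + re.escape(symbol) + r"\b[ \t\n]+::")

# Hypothetical Haskell source, used only for illustration.
SAMPLE = """\
parseConfig :: FilePath -> IO Config
parseConfig path = readConfig path

longName
  :: Int
  -> Int
longName = succ
"""

def finds_signature(symbol):
    # True when the symbol is followed by whitespace (possibly a newline) and "::".
    return function_regex(symbol).search(SAMPLE) is not None

if __name__ == "__main__":
    for name in ["parseConfig", "longName", "readConfig", "parse"]:
        print(name, finds_signature(name))
```

The \n in the character class is what lets the pattern catch signatures where :: sits on the line after the name, and the \b anchors keep parse from matching inside parseConfig.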

Types

When looking for a type, we usually like to search for any data, newtype or type declaration.

(data|newtype|type)(\ +)\b<PATTERN>\b

Here, we do not try to match constructors; a separate pattern handles those.
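
As a rough illustration of what the type pattern does and does not catch, the sketch below runs it line by line with Python's re module, assuming its behaviour matches PCRE for this pattern. The sample declarations and the helper names are made up for the example.

```python
import re

def type_regex(symbol):
    # Same shape as the ag pattern, with <PATTERN> replaced by the escaped symbol.
    return re.compile(r"(data|newtype|type)(\ +)\b" + re.escape(symbol) + r"\b")

# Hypothetical Haskell source, used only for illustration.
SAMPLE = """\
data Config = Config { port :: Int }
newtype UserId = UserId Int
type Name = String
mkConfig :: Int -> Config
"""

def matching_lines(symbol):
    # Return every sample line on which the pattern matches.
    regex = type_regex(symbol)
    return [line for line in SAMPLE.splitlines() if regex.search(line)]

if __name__ == "__main__":
    for name in ["Config", "UserId", "Name"]:
        print(name, matching_lines(name))
```

Searching for Config returns only the data declaration: the use of Config in the signature of mkConfig is ignored because no declaration keyword precedes it.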

TypeClass

Same trick as for types, except that here we match the class keyword.

Note that a class declaration may carry constraints (a context followed by =>) before the class name. The pattern has to skip over those as well.

class(\ +)(.*)(=>)*(\ *)\b<PATTERN>\b
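
The sketch below exercises the class pattern on a few hypothetical declarations, again using Python's re module as a stand-in for ag's regex engine (an assumption that should hold for these simple constructs). The helper names are illustrative only.

```python
import re

def class_regex(symbol):
    # Same shape as the ag pattern, with <PATTERN> replaced by the escaped symbol.
    return re.compile(r"class(\ +)(.*)(=>)*(\ *)\b" + re.escape(symbol) + r"\b")

# Hypothetical Haskell source, used only for illustration.
SAMPLE = """\
class Shape a where
class (Eq a, Show a) => Container a where
instance Shape Circle where
"""

def matching_lines(symbol):
    # Return every sample line on which the pattern matches.
    regex = class_regex(symbol)
    return [line for line in SAMPLE.splitlines() if regex.search(line)]

if __name__ == "__main__":
    for name in ["Shape", "Container", "Eq"]:
        print(name, matching_lines(name))
```

Both the plain and the constrained declarations are found, and the instance line is ignored. One thing this makes visible: because (.*) accepts anything, searching for Eq also returns the Container line, where Eq only appears in the context.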

Data Constructor

It’s sometimes useful to find the definition of a data constructor. The idea here is similar to the function regex: most of the time, the constructor symbol follows the | lexeme.

\|[\t\ ]+\b<PATTERN>\b

Unlike the previous patterns, I am not quite satisfied with this one: it also matches guards, misses the first constructor of a regular algebraic data type, and does not match constructors declared in GADT style. To be honest, I hesitated about including it in this article. It could be greatly improved.
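
The three limitations can be reproduced with a small sketch. It uses Python's re module in place of ag, on the assumption that both engines agree on this pattern, and a hypothetical Haskell snippet written for the occasion.

```python
import re

def constructor_regex(symbol):
    # Same shape as the ag pattern, with <PATTERN> replaced by the escaped symbol.
    return re.compile(r"\|[\t\ ]+\b" + re.escape(symbol) + r"\b")

# Hypothetical Haskell source, used only for illustration.
SAMPLE = """\
data Color = Red
           | Green
           | Blue

describe c
  | Blue == c = "cold"
  | otherwise = "warm"

data Expr a where
  IntLit :: Int -> Expr Int
"""

def matching_lines(symbol):
    # Return every sample line (stripped) on which the pattern matches.
    regex = constructor_regex(symbol)
    return [line.strip() for line in SAMPLE.splitlines() if regex.search(line)]

if __name__ == "__main__":
    for name in ["Green", "Red", "Blue", "IntLit"]:
        print(name, matching_lines(name))
```

Green is found as expected. Red, the first constructor, is missed because it follows = rather than |. Blue is reported twice, once for the declaration and once for the guard. IntLit, declared in GADT style, is not found at all.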

Vim Integration

Being able to jump directly to the matched string from within the editor is always nice. If you happen to be a heretic of some sort (i.e. not a vim user), this section will be of no use to you. Just use a plain shell alias or a specific feature of your favorite editor.

You’ll first need a vim silver searcher integration. I personally use ack.vim and will assume you also use it.

I use some short aliases for the patterns shown above:

  • :Aghf <PATTERN> Ag Haskell Function: looks for a function definition.
  • :Aght <PATTERN> Ag Haskell Type: looks for a type definition.
  • :Aghtc <PATTERN> Ag Haskell TypeClass: looks for a typeclass definition.
  • :Aghc <PATTERN> Ag Haskell Constructor: looks for a data constructor definition.

These commands are defined in my configuration using the command! directive:

command! -nargs=+ -complete=file Aghf Ack -G ".*\.hs" "\b<args>\b[ \t\n]+::"
command! -nargs=+ -complete=file Aght Ack -G ".*\.hs" "(data|newtype|type)(\ +)\b<args>\b"
command! -nargs=+ -complete=file Aghtc Ack -G ".*\.hs" "class(\ +)(.*)(=>)*(\ *)\b<args>\b"
command! -nargs=+ -complete=file Aghc Ack -G ".*\.hs" "\|[\t\ ]+\b<args>\b"

That’s pretty much all. These four simple aliases probably cover around 80% of my search needs. I still need to run a custom search from time to time, but overall I am quite pleased with this small setup.