I have been watching Jessica Mak's streams lately. She is an indie game developer, but most of all, she has a kick-ass overlay that shows what she is typing in real time.
I wanted the same one, so I made it using Haskell, Gloss and Chipmunk via the Hipmunk binding.
Long story short, it ended up looking like this.
The source code is available in this git repository.
Gloss, display library
Gloss is a graphics library built upon OpenGL. The API is high-level enough to hide a lot of low-level nastiness.
To keep it short, the API is based around two functions:
- An update function, which processes the input events and updates the world data structure.
- A display function, which renders the world as a Picture.
These two functions rely on two data structures:
- The world data structure. It can be whatever you want, you define it yourself.
- The Picture data structure. It represents something that can be drawn to the screen.
I really like this second data structure, its shape is highly interesting, let's dig into it.
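To make this concrete, here is a minimal sketch of that structure, not the actual keywar code. The World type and the render, handle and step names are mine, made up for the example. Note that Gloss's play function actually splits the update into two callbacks: one reacting to input events, one advancing the world in time. A Picture is a plain tree: leaves such as text, wrapped in transformations (translate, scale, color) and grouped with pictures.

```haskell
module Main where

import Graphics.Gloss.Interface.Pure.Game

-- Hypothetical world: the characters typed so far, newest first.
newtype World = World [Char]

-- Display function: render the world as a Picture.
-- Each letter is a leaf (text) wrapped in transformations,
-- and the letters are grouped into one Picture with pictures.
render :: World -> Picture
render (World cs) = pictures
  [ translate (x - 300) 0 (scale 0.3 0.3 (color white (text [c])))
  | (x, c) <- zip [0, 40 ..] (reverse cs)
  ]

-- Update function, first half: react to an input event.
handle :: Event -> World -> World
handle (EventKey (Char c) Down _ _) (World cs) = World (c : cs)
handle _ w = w

-- Update function, second half: advance the world by dt seconds.
-- Nothing moves in this sketch, so the world is returned unchanged.
step :: Float -> World -> World
step _ w = w

main :: IO ()
main =
  play (InWindow "gloss sketch" (640, 200) (10, 10)) -- window
       black                                         -- background color
       60                                            -- steps per second
       (World [])                                    -- initial world
       render handle step
```

Since the world is an ordinary value and both functions are pure, you can test them without opening a window.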
Hipmunk, physics library
Each object is composed of both a body and a shape. The body contains the physics information (mass, moment of inertia, elasticity, …) while the shape just describes the bounding box of the object.
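To illustrate the body/shape split without dragging in the real binding, here is a library-free sketch. None of these types or functions come from Hipmunk: Body, Shape, Object, discMoment, letter and stepObject are all hypothetical names, and the integration is a naive Euler step with a floor at y = 0, far cruder than what Chipmunk does.

```haskell
module Main where

-- Hypothetical types mirroring the body/shape split (NOT the Hipmunk API).
data Body = Body
  { mass     :: Double
  , moment   :: Double            -- moment of inertia
  , position :: (Double, Double)
  , velocity :: (Double, Double)
  } deriving (Eq, Show)

-- The shape only carries geometry; here, a circle given by its radius.
newtype Shape = Circle Double
  deriving (Eq, Show)

data Object = Object { body :: Body, shape :: Shape }
  deriving (Eq, Show)

-- Moment of inertia of a solid disc: m * r^2 / 2.
discMoment :: Double -> Double -> Double
discMoment m r = m * r * r / 2

-- Build a resting object of mass m and radius r at the given position.
letter :: Double -> Double -> (Double, Double) -> Object
letter m r pos = Object (Body m (discMoment m r) pos (0, 0)) (Circle r)

-- One explicit Euler step under gravity. When the circle crosses the
-- floor (y = 0) it is put back on it and its vertical speed is flipped
-- and scaled by the elasticity factor.
stepObject :: Double -> Double -> Object -> Object
stepObject elasticity dt (Object b s@(Circle r)) = Object b' s
  where
    g        = -9.81
    (x, y)   = position b
    (vx, vy) = velocity b
    vy1      = vy + g * dt
    x1       = x + vx * dt
    y1       = y + vy1 * dt
    hitFloor = y1 - r < 0 && vy1 < 0
    b' | hitFloor  = b { position = (x1, r), velocity = (vx, negate vy1 * elasticity) }
       | otherwise = b { position = (x1, y1), velocity = (vx, vy1) }

main :: IO ()
main = mapM_ (print . snd . position . body)
             (take 10 (iterate (stepObject 0.8 0.1) (letter 1 1 (0, 5))))
```

The renderer only needs the position from the body and the geometry from the shape, which is why the two are kept apart.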
Capturing the Input
Now that we can display funny bouncy letters, we still need to fix one problem: how do we capture the keyboard input? So far, we were directly using the Gloss input system for debugging purposes. Sadly, this only works when the keywar window is focused. Xorg only transfers the input events to the currently focused window, which, in our case, is a problem.
How could we capture all the input events, regardless of the focused window? I found 3 solutions:
- We could create an overlay window transferring the events to the underlying ones. This solution would mean re-implementing a subset of my window manager's features. We do not want to re-invent the wheel once again :).
- We could bypass Xorg and directly read Linux's /dev/input/eventX input events. This solution is tricky: not only do those events use a binary data format, which makes parsing them clumsy, but they are also raw, meaning no keymap has been applied to them. If we really wanted to read those events, we would have to write the appropriate parser and also support different keymaps.
- We could use a keylogger and tail -f its output.
The last solution was the simplest, so we used the logkeys keylogger.
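The tail -f part can also be done from Haskell itself. Here is a rough sketch under a few assumptions: the keylogger appends plain text to a log file whose path is given on the command line, and polling every 50 ms is good enough. The drain and follow names are made up, and the sketch does not try to interpret the logkeys output format at all.

```haskell
module Main where

import Control.Concurrent (threadDelay)
import Control.Monad (forever)
import System.Environment (getArgs)
import System.IO

-- Read every character currently available, stopping at end of file.
drain :: Handle -> IO String
drain h = do
  eof <- hIsEOF h
  if eof
    then pure []
    else (:) <$> hGetChar h <*> drain h

-- Poor man's "tail -f": skip what is already in the file, then
-- poll for appended data and hand each new chunk to the callback.
follow :: FilePath -> (String -> IO ()) -> IO ()
follow path onInput = withFile path ReadMode $ \h -> do
  hSeek h SeekFromEnd 0
  forever $ do
    new <- drain h
    if null new
      then threadDelay 50000 -- nothing new, wait 50 ms
      else onInput new

main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> do
      hSetBuffering stdout NoBuffering
      follow path putStr
    _ -> hPutStrLn stderr "usage: follow LOGFILE"
```

Reading character by character matters here: a keylogger does not necessarily write a newline after each key, so a line-based reader would sit on the keys until the next Enter.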
Side Note: Passwords
As mentioned in the GitHub readme, you need to be extra careful when typing passwords: you clearly do not want to leak them while streaming.
I added a system which displays question marks in place of the actual letters to mitigate that. The audience will still get the length of your password, but that is already the case if your keyboard is a bit noisy. Mine is noisy as hell, hence this is not really a problem for me.
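The masking itself boils down to something like the sketch below. The Mode type and the displayChar function are hypothetical names, and how the mode gets switched on and off is deliberately left out, since it depends on how you want to trigger it.

```haskell
module Main where

-- Hypothetical display mode: show the real keys or hide them.
data Mode = Clear | Masked
  deriving (Eq, Show)

-- Character shown on the overlay for a typed key.
displayChar :: Mode -> Char -> Char
displayChar Masked _ = '?'
displayChar Clear  c = c

main :: IO ()
main = do
  putStrLn (map (displayChar Clear)  "hello")   -- prints "hello"
  putStrLn (map (displayChar Masked) "hunter2") -- prints "???????"
```

Only the displayed character changes: one question mark still shows up per key, which is exactly why the length of the password leaks.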