Creating a regex-based Markdown parser in TypeScript
We will explore the limitations and benefits of using regular expressions to create a simple Markdown parser in TypeScript. Spoiler alert: it is not a good way to do it.
Motivation
Markdown is a markup language that has gained immense popularity in recent years. Besides being a convenient way to write content for full-blown static websites (via engines such as Gatsby.js and MarkBind), I have also started to see widespread use of Markdown in knowledge management systems such as Obsidian and Dendron.
I write articles like this one in Markdown, and this year I am actively exploring its use in the capacities mentioned above. As a result, I decided to dive deep into how Markdown works, which led to this article.
I realized that there are two extremes in software projects:
- the most popular/battle-tested/enterprise-grade projects that define the “standard” for a particular domain
  - e.g. for Markdown, it’s markdown-it and marked
- tutorial examples/toy projects for educational purposes
While the former are complex and production-ready, the latter are simple and easy to understand. The problem is that there is a huge gap between creating something simple and creating something complex. Should you want to bridge it, there’s less help…