FAQ


Why Babelmark 3?

This new version is developed by Alexandre Mutel and uses the same concept as Babelmark 2. The main differences are:

  • the babelmark3 project is now hosted on GitHub and accepts pull requests
  • the front-end (this site) is hosted on GitHub Pages; its repository is babelmark.github.io, a plain Jekyll website.
    • Modern look and feel
    • Added languages and links to the original projects
    • Support for ctrl-enter
    • Asynchronous AJAX queries instead of a form POST
  • the back-end babelmark-proxy is hosted on Azure and is a .NET application.
    • Improved performance: multithreaded queries to the markdown conversion servers
    • Normalization is performed on the server with NUglify
  • the babelmark registry contains the list of markdown conversion servers. This is where you can open a pull request to add a new markdown implementation to the list. See

The original text of this FAQ, from babelmark2, is copyrighted by John MacFarlane.

What is this for?

This is a tool for comparing the output of various implementations of John Gruber’s markdown syntax for plain text documents. The official markdown syntax documentation is silent or vague on many issues, and implementations have diverged in their interpretations of the syntax. Even when the interpretation of the syntax spec is not in question, implementations may have bugs. So it is useful to be able to see at a glance how implementations differ on various inputs. The hope is that this tool will promote discussion of how and whether certain vague aspects of the markdown spec should be clarified.

What are some examples of interesting divergences between implementations?

Lists

Inline markup

Raw HTML

Other

Why does it matter whether implementations agree?

Markdown is everywhere these days. If the same document is interpreted differently in different places, that is a problem. For example, suppose you have a nice piece of documentation on a github wiki, and you want to turn it into DocBook using pandoc. Your sublists are indented two spaces, and this works fine on the wiki, but when you run the document through pandoc, these items are interpreted as parts of the main list. If you are lucky, you notice this before distributing the DocBook file!

What are some big questions that the markdown spec does not answer?

  1. How much indentation is needed for a sublist? The spec says that continuation paragraphs need to be indented four spaces, but is not fully explicit about sublists. It is natural to think that they, too, must be indented four spaces, but not all implementations require that. (Note that none of the implementations that allow a one-space indentation to start a sublist are entirely consistent about that.) This is hardly a “corner case,” and divergences between implementations on this issue often lead to surprises for users in real documents. See this comment by John Gruber.

  2. Is a blank line needed before a block quote or header? Or can you have things like this:

    paragraph
    > block quote
    # header
    paragraph
    

    Most implementations do not require the blank line. However, this can lead to unexpected results in hard-wrapped text, and also to ambiguities in parsing (note that some implementations put the header inside the blockquote, while others do not). John Gruber has also spoken in favor of requiring the blank lines.

  3. Is blank space allowed between the [...] part and the (...) part of an inline link? That is, can you have a link like this:

    [my link] (/url)
    

    There is some relevant discussion here.

  4. What is the exact rule for determining when list items get wrapped in <p> tags? Can a list be partially loose and partially tight? What should we do with a list like this:

    1. one
    
    2. two
    3. three
    

    Or this?

    1.  one
    
        - a
    
        - b
    2.  two
    

    There are some relevant comments by John Gruber here.

  5. When list markers change from bullets to numbers, should we have two lists or one?

  6. Should two blockquotes with a blank line between them be treated as a single blockquote? Most implementations do this, but John Gruber has suggested that this be changed. (Waylan Limberg points out that the spec actually does seem to settle this question, in favor of a single blockquote.)

  7. Can reference link definitions occur embedded in block quotes and list items? (The spec says they can be “anywhere in the document,” but most implementations require that they be at the outer level.)

What was the previous Babelmark?

The original Babelmark at http://babelmark.bobtfish.net was written by Michel Fortin, author of PHP Markdown and PHP Markdown Extra. It is no longer accessible, and it used old versions of the implementations.

What was Babelmark 2?

John MacFarlane introduced a new version called babelmark2.

Instead of asking the Babelmark maintainer to install all the converters on his server, and keep them up to date, we use a decentralized model. Each implementer provides a small “dingus server” that accepts textual input and returns HTML. Babelmark 2 queries these dingus servers asynchronously and combines their outputs into a page of results. This system puts the burden on implementers to keep their servers up to date.

How can I add my markdown implementation to Babelmark 3?

Write a server app or CGI script that accepts GET queries, takes the text out of the text parameter, and returns a JSON object with the following fields: name (the name of the markdown processor), version (the version being run), and html (the result of converting the text to HTML).

Example:

$ curl 'http://johnmacfarlane.net/cgi-bin/pandoc-dingus?text=hi'
{"name":"Pandoc","html":"<p>hi</p>","version":"1.9.4.2"}

The script can, if desired, return an error if the input text exceeds 1000 characters.
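
As a rough illustration of the contract described above, here is a minimal sketch of a dingus server using only the Python standard library. Several details are assumptions rather than anything this FAQ prescribes: the processor name and version are placeholders, convert_to_html is a stand-in for your real markdown processor, and the 413 status with an "error" field is just one possible way to reject oversized input.

```python
import json
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

MAX_INPUT_CHARS = 1000  # the optional input limit mentioned in this FAQ


def convert_to_html(text):
    # Placeholder converter (assumption): replace with a call to your
    # actual markdown implementation.
    return "<p>" + escape(text) + "</p>"


def build_response(query_string):
    """Return (status_code, json_body) for a raw query string such as 'text=hi'."""
    params = parse_qs(query_string, keep_blank_values=True)
    text = params.get("text", [""])[0]
    if len(text) > MAX_INPUT_CHARS:
        # The error shape and status code are assumptions, not a Babelmark rule.
        return 413, json.dumps({"error": "input exceeds 1000 characters"})
    payload = {
        "name": "ExampleMarkdown",  # hypothetical processor name
        "version": "0.0.1",         # hypothetical version string
        "html": convert_to_html(text),
    }
    return 200, json.dumps(payload)


class DingusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Extract the query part of the URL and answer with JSON.
        status, body = build_response(urlparse(self.path).query)
        data = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    # Blocks until interrupted; the port number is an arbitrary choice.
    HTTPServer(("127.0.0.1", port), DingusHandler).serve_forever()
```

Calling serve() and then requesting http://127.0.0.1:8000/?text=hi with curl should return an object with the same three fields as the Pandoc example above.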

You can then fork the babelmark registry repository, add your server to the file registry.json, and open a pull request!

Why is there a 1000 character limit on input?

A thousand characters should be sufficient to test syntax features. We impose the limit to reduce load on the servers.

What determines the order in which the implementations are listed?

The calls to the individual dingus servers are made asynchronously and in different threads, and the results are added in the order they come in. However, one should not infer that the implementations that appear at the top are the fastest. The performance differences in converting small bits of markdown are swamped by other factors, such as server latency and script startup time. For example, RedCarpet has the fastest parser of all the implementations, but often appears last because of the slow startup time of Ruby scripts. And PHP Markdown is faster than Markdown.pl, but its dingus server is much farther away.
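
To make the ordering behaviour concrete, here is a small sketch in Python (the real back-end is a .NET application, so this is only an analogy). The dingus functions are fakes that sleep to simulate latency and startup time; their names and delays are invented for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def query_all(dinguses, text):
    """Call every dingus in its own thread; return (name, html) in completion order."""
    results = []
    with ThreadPoolExecutor(max_workers=len(dinguses)) as pool:
        # Map each submitted call back to the name of its implementation.
        futures = {pool.submit(fn, text): name for name, fn in dinguses.items()}
        for future in as_completed(futures):
            results.append((futures[future], future.result()))
    return results


def make_fake_dingus(delay_seconds):
    # Stand-in for a remote dingus server: the delay models network latency
    # plus script startup time, not parsing speed.
    def dingus(text):
        time.sleep(delay_seconds)
        return "<p>" + text + "</p>"
    return dingus


# Hypothetical implementations: the first has a fast parser but a slow start.
dinguses = {
    "fast-parser-slow-startup": make_fake_dingus(0.30),
    "slow-parser-nearby-server": make_fake_dingus(0.05),
}

for name, html in query_all(dinguses, "hi"):
    print(name, html)
```

The entry with the shorter total delay is listed first, whichever parser is faster, which is why position in the results says little about conversion speed.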