The Many Ways Markdown Conversion Toolkit for Microsoft Word Excels Over Pandoc
Writing great Markdown is important for the success of your project. A great README file helps your project stand out from the vast sea of open source projects on GitHub. But the need for great Markdown goes well beyond the README. Well-written documentation that describes how your product works and how the community can contribute to it is also vitally important.
That’s why it is important to choose the best possible tooling to create your Markdown. Without question, the best way to create Markdown is to use a proper, full-featured Word Processor. Word Processors are optimized to process paragraphs, pages, and entire papers, incorporating all the important elements: tables, images, and formatting features. Let’s not forget the easy navigation, spellcheckers, grammar checkers, and the ability to see the structure of your document in outline format. That is just the tip of the iceberg of what a Word Processor brings to the table when it comes to editing sophisticated and persuasive documents.
But how do you convert to Markdown from Word (.docx file)?
For years, the answer to that question has been Pandoc. But Pandoc is not a specialist: it excels at converting to and from a wide variety of formats, and it is not that great at going from Microsoft Word to Markdown.
Let’s explore the many ways that the Markdown Toolkit for Word excels over Pandoc for the purpose of converting docx files into Markdown format.
To begin, Pandoc does not properly deal with images
Everybody agrees with the cliché that a picture is worth a thousand words. That’s why some of the best technical authors take full advantage of the ability to embed graphics into their documents. Diagrams that depict architecture, concepts, and relationships between entities grab our attention, explain, and inspire. Clearly, humans are visual creatures, and a large percentage of the human brain dedicates itself to visual processing. Humans can process images at remarkable speed, with some research indicating that the human brain is able to recognize a familiar object within 100 ms.
The use of bright colors can capture our attention because our brains are wired to react to them. Our visual abilities are by far the most active of our senses. Throughout the evolution of humankind, our ability to identify a predator or identify food based on color and shape has been critical to our survival. And of course, Facebook and Instagram are built upon a foundation of visual imagery.
So, let’s begin by addressing images. Let’s compare how Pandoc and the Markdown Toolkit for Word each process images.
What we want to convert in Microsoft Word format
Here is a simple example. Below you can see a part of a Microsoft Word Document that contains an embedded image. Naturally, you would want your Markdown to include such an image when the conversion takes place.
Pandoc lacks support for images.
You can see the comparison below. Instead of converting the actual image, Pandoc simply places the dimensions of the image in the Markdown. It’s missing the step of creating the folder for your images and placing your images in that folder so they can be referenced by the Markdown.
The image itself is completely missing. There does not appear to be any attempt to properly incorporate the image from Microsoft Word into a Markdown document.
The Markdown Conversion Toolkit includes images
For an image to work properly in Markdown, a child folder needs to be created. Only then can the graphic image from the Word Document be placed into it for reference, as seen below.
The Markdown Conversion Toolkit generates the appropriate Markdown to reflect the image from Microsoft Word. You can see how the image is referenced directly from within the Markdown content below.
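To make the folder-plus-reference idea concrete, here is a minimal Python sketch. It is not the Toolkit's actual implementation, only an illustration under stated assumptions: it relies on the fact that a .docx file is a zip package that stores embedded pictures under word/media/, and the function name extract_images and the folder name "images" are made up for this example.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_images(docx_path, out_dir, folder="images"):
    """Copy embedded images out of a .docx and return Markdown references.

    Hypothetical helper for illustration; the folder name is an assumption.
    """
    target = Path(out_dir) / folder
    target.mkdir(parents=True, exist_ok=True)  # the child folder for images
    references = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in sorted(archive.namelist()):
            # Embedded pictures are stored under word/media/ in the package.
            if not name.startswith("word/media/") or name.endswith("/"):
                continue
            filename = PurePosixPath(name).name
            (target / filename).write_bytes(archive.read(name))
            # Markdown image syntax: ![alt text](relative/path)
            references.append(f"![{filename}]({folder}/{filename})")
    return references
```

Each returned string is a Markdown image reference that points into the child folder, which is the relationship the screenshots in this section illustrate.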
Tables in Microsoft Word
Tables are an effective means of communication. Organizing data into rows and columns allows readers to identify the main ideas and compare them to each other. Research indicates that organizing data into tables leads to more accurate answers to questions than bar charts or pie charts. Tables are excellent at simplifying complicated textual descriptions. Tables make the most sense when you want to place detailed information into categories.
Creating tables by hand in Markdown is difficult and error prone. A much better approach is to use a Word Processor and then convert to Markdown with the tool.
Pandoc does not render tables correctly for Markdown
The differences between Pandoc and the Markdown Conversion Toolkit are obvious. Pandoc-generated tables simply do not comply with Markdown syntax. As a result, as you can see in the lower left corner of the image below, the rendered table is very difficult to read. There is an unwanted horizontal line, and there are spaces separating the columns.
The Markdown Conversion Toolkit
The Markdown Conversion Toolkit properly formats tables from Microsoft Word into the syntax required by the Markdown language. As a result, the table renders properly in Markdown, as you can see below.
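For readers who have not written a Markdown table by hand, the sketch below shows the target syntax. It assumes the pipe-table form that GitHub renders, and the helper name to_pipe_table is hypothetical. It illustrates the output format only, not how the Toolkit produces it.

```python
def to_pipe_table(rows):
    """Render rows (the first row is the header) as a pipe table.

    Assumes at least a header row is supplied.
    """
    def line(cells):
        # Escape literal pipes so they do not split a cell.
        return "| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |"

    header, *body = rows
    # The separator row of dashes is what marks the first row as a header.
    separator = "| " + " | ".join("---" for _ in header) + " |"
    return "\n".join([line(header), separator, *(line(r) for r in body)])
```

Calling to_pipe_table with a header row and one or more data rows returns one line per row, plus the dashed separator line under the header.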
Microsoft Word Code
Perhaps one of the most important capabilities of Markdown is to display code, especially in the context of GitHub. Although displaying code may not be a typical use for a Word Processor, it is relatively easy to achieve. One approach is to use a single-cell Microsoft Word table that has the font set to Consolas. This is how the Markdown Toolkit for Word recognizes a code snippet and outputs the proper Markdown syntax for displaying code. Pandoc offers no such capability.
Although Microsoft Word does not natively support source code, the Markdown converter for Word provides some built-in keyboard shortcuts and macros to simplify the process of creating good-looking, well-formatted code. The conversion to Markdown conforms to the Markdown standard for how source code is represented.
The Markdown converter looks for all Microsoft Word tables in the document that have a single cell and are formatted in the Consolas font. Whenever the Markdown converter encounters such a table (a single cell with the font set to Consolas), it outputs the traditional Markdown syntax of a triple backtick (```) before and after the code. The Markdown converter Toolkit comes with prewritten Word templates that have built-in keyboard shortcuts to accelerate the authoring of Word and Markdown content.
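The detection rule just described (one cell, Consolas font) can be sketched in a few lines of Python. This is an illustrative model rather than the Toolkit's code: the WordTable class is a hypothetical stand-in for a table read from a .docx file, and the function names are assumptions.

```python
from dataclasses import dataclass

FENCE = "`" * 3  # the triple backtick, built here to keep this listing readable


@dataclass
class WordTable:
    """Hypothetical stand-in for a table read from a .docx file."""
    cells: list  # cell texts, in reading order
    font: str    # font name applied to the table text


def is_code_table(table):
    # The rule described above: exactly one cell, formatted in Consolas.
    return len(table.cells) == 1 and table.font.lower() == "consolas"


def code_table_to_markdown(table):
    """Return a fenced code block for a code table, or None otherwise."""
    if not is_code_table(table):
        return None
    return f"{FENCE}\n{table.cells[0].rstrip()}\n{FENCE}"
```

Any table that fails the check returns None, so a converter built this way would hand it to its ordinary table formatting instead.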
A keyboard shortcut (Ctrl-9) can make this quick and efficient. Simply hit Ctrl-9 to automatically create a single-cell table, paste the code into it, and make it appear as you see below.
Pandoc’s improper handling of programming source code
In the image below you can see that Pandoc fails to take advantage of the native abilities of Markdown to represent source code. In the top left part of the image you can see unusual borders represented by ASCII symbols (+-|). If you look closely at the image, you can spot the obvious shortcomings of Pandoc.
As mentioned before, the triple backtick (```) should be used to denote programming source code.
Word Document with Numbered and Bullet Lists
Nothing improves readability and comprehension more than a structured outline. Markdown supports both numbered and bulleted lists, in which you can include bold, italic, and strikethrough formats.
Pandoc will lose your text and ignore it
What is even more alarming is that Pandoc fails to transfer nested bullet lists, as you can see below. The following bullet points were completely lost in the translation:
· Easy navigation
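For reference, Markdown expresses a nested bullet by indenting it under its parent. The short sketch below shows that shape. The function name render_bullets and the two-space indent are assumptions made for this illustration, and it is not taken from either tool.

```python
def render_bullets(items, depth=0):
    """Render nested bullets; each item is a (text, children) pair."""
    lines = []
    for text, children in items:
        # Indent by two spaces per nesting level.
        lines.append("  " * depth + "- " + text)
        lines.extend(render_bullets(children, depth + 1))
    return lines
```

A nested item such as "Easy navigation" comes out indented beneath whatever parent bullet it belongs to, and that indentation is the structure a converter has to carry over.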
Beautiful, well-structured Markdown is key to the success of your GitHub repo. A Word Processor is the best possible tool for building this Markdown, and it also provides the most efficient means to get the job done.
The other big takeaway from this essay is that Microsoft Word has significant word processing tools built in that will make your Markdown far superior. One simple example can be seen below with the Editor suggestions. This can take your writing to a new level, as it finds many common errors that can be easily fixed to make your Markdown the best that it can possibly be.
You can learn more about the Markdown Toolkit for Word here: http://www.wordtomarkdown.com.