8 replies
September 2018

RLesur

Thanks Maëlle for your blog post! It reminds me of two approaches built into pandoc.

The first one is to use this pandoc Lua filter: https://pandoc.org/lua-filters.html#extracting-information-about-links
After merging all the .md files into a single all_files.md, one can run a command like this:

rmarkdown::pandoc_convert("all_files.md", to = "markdown", output = "count.md", options = "--lua-filter=count_links.lua")

where count_links.lua contains the Lua script from that link.

The second approach is to get a JSON version of the pandoc AST:

rmarkdown::pandoc_convert("all_files.md", to = "json", output = "allfiles_ast.json")

It is close to the XML version produced by commonmark.
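As a rough sketch of what one could then do with that JSON (not tested against every pandoc version): each AST node is an object with a "t" (type) field and usually a "c" (content) field, and for a Link node the last element of "c" is the target, whose first element is the URL. The function name `collect_links` and the small hand-written AST below are made up for illustration only.

```r
# Recursively collect link URLs from a pandoc AST held as nested R lists.
collect_links <- function(node) {
  if (!is.list(node)) return(character(0))
  if (identical(node[["t"]], "Link")) {
    content <- node[["c"]]
    target <- content[[length(content)]]  # target is list(url, title)
    return(target[[1]])
  }
  # Not a link: look inside every child element
  as.character(unlist(lapply(node, collect_links), use.names = FALSE))
}

# Hypothetical helper that builds a Link node, to keep the toy AST short
make_link <- function(text, url) {
  list(t = "Link", c = list(
    list("", list(), list()),        # attr: id, classes, key-value pairs
    list(list(t = "Str", c = text)), # link text
    list(url, "")                    # target: url, title
  ))
}

# Tiny hand-written AST standing in for the content of allfiles_ast.json
ast <- list(
  meta = list(),
  blocks = list(
    list(t = "Para", c = list(
      list(t = "Str", c = "See"),
      list(t = "Space"),
      make_link("pandoc", "https://pandoc.org")
    )),
    list(t = "Para", c = list(
      make_link("rOpenSci", "https://ropensci.org")
    ))
  )
)

links <- collect_links(ast)
print(links)
```

With the real file, and assuming the jsonlite package is installed, the AST could be read with `jsonlite::fromJSON("allfiles_ast.json", simplifyVector = FALSE)` and passed to the same function; `table(links)` would then give the count per URL.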

Regards,

Romain

September 2018 ▶ RLesur

maelle

Thanks Romain, this is very interesting! :ok_hand:

I am especially interested in the JSON approach since parsing JSON is well supported (e.g. rOpenSci has a jqr package!) and something one needs to learn for other applications anyway.

Merci again! :grinning:

September 2018 ▶ RLesur

noamross

In lieu of Lua, you can also do:

rmarkdown::pandoc_convert("all_files.md", to = "markdown", output = "count.md", options = "--filter=count_links.R")

Here count_links.R is an arbitrary script that reads the JSON AST on stdin and writes JSON to stdout. In this case it’s R, but it could also be, say, a .jq script.
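For what it’s worth, here is a minimal sketch of what such a count_links.R could look like. It is only an illustration: it assumes the jsonlite package is installed, the function names are made up, and the script has to be executable (or pandoc has to know which interpreter to use for its extension). It only counts links, so it reports the count on stderr and passes the original JSON through unchanged rather than re-serialising it.

```r
#!/usr/bin/env Rscript
# count_links.R: hypothetical sketch of a pandoc JSON filter.

# Count Link nodes in a pandoc AST held as nested R lists.
count_links <- function(node) {
  if (!is.list(node)) return(0L)
  here <- if (identical(node[["t"]], "Link")) 1L else 0L
  here + sum(vapply(node, count_links, integer(1)))
}

# Parse the JSON only to count links; return the text untouched.
filter_json <- function(json_text) {
  ast <- jsonlite::fromJSON(json_text, simplifyVector = FALSE)
  message("Number of links: ", count_links(ast))  # message() writes to stderr
  json_text
}

if (!interactive()) {
  con <- file("stdin")
  input <- paste(readLines(con, warn = FALSE), collapse = "\n")
  close(con)
  # Nothing to do when no document was piped in
  if (nzchar(input)) cat(filter_json(input))
}
```

Because stdout has to carry the JSON document back to pandoc, anything meant for the user goes to stderr. A filter that actually modifies the AST would need to write the modified structure back as JSON instead of echoing the input.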

September 2018 ▶ noamross

maelle

My mind is blown by all these nice ways to extract stuff from Markdown files :exploding_head:

September 2018

maelle

Just for info, I tried converting an R Markdown file to JSON and then writing it back to Markdown, and I didn’t get exactly the input file back. :cry: It was worth trying though.

September 2018 ▶ maelle

RLesur

I wonder whether it is possible to get the exact input file back. Many extensions are enabled by default (native_divs for instance), which could explain these differences. The default template for the markdown writer can also have an impact.

However, markdown to markdown conversion can be a useful trick, as explained in the pandoc wiki here: https://github.com/jgm/pandoc/wiki/Pandoc-Tricks#from-markdown-to-markdown

September 2018 ▶ RLesur

maelle

Thanks, will try again soon! :nerd_face:

September 2018

maelle

WIP package to modify (R)Markdown files without regex: https://github.com/ropenscilabs/tinkr