Directions for rchie package / Google Docs

I’ve begun work on the rchie package, which parses the NYT’s new ArchieML format.

The idea of ArchieML is to be able to include structured data in otherwise unstructured text documents. The main use case it seems to have been developed for is collaboration with non-technical writers. An example workflow: You want to create a graphic to accompany an article, so you collaborate in Google Docs on writing the article, which includes some hierarchical lists. When you parse the article with rchie::from_archie(), you get list objects you can use to make a figure with DiagrammeR.
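To make the idea concrete, here is a rough base-R sketch of the simplest case: pulling top-level `key: value` lines out of a document while skipping the prose around them. This is not how rchie works internally, and `parse_key_values` is a hypothetical helper named only for illustration. It ignores arrays, nested keys, and multi-line values, all of which the real format supports.

```r
# Minimal sketch: extract top-level "key: value" pairs from free text.
# Hypothetical helper, not rchie's implementation. Arrays, nested keys,
# and multi-line values are deliberately ignored.
parse_key_values <- function(text) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  # A key is letters, digits, underscores, or hyphens, followed by a colon.
  pattern <- "^[[:space:]]*([A-Za-z0-9_-]+)[[:space:]]*:[[:space:]]*(.*)$"
  hits <- regmatches(lines, regexec(pattern, lines))
  # Non-matching lines yield empty vectors; matches yield (full, key, value).
  hits <- hits[lengths(hits) == 3]
  values <- lapply(hits, function(h) trimws(h[3]))
  names(values) <- vapply(hits, function(h) h[2], character(1))
  values
}

doc <- "This sentence is ordinary prose and is skipped.
title: Budget by department
More prose from the writer.
units: USD"

parsed <- parse_key_values(doc)
str(parsed)
```

The result is a plain named list, which is the kind of object you could hand to a plotting or diagramming function.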

I’d like rchie to work well with Google Docs. The question is: Should I build a Google Drive API wrapper into the package? RGoogleDocs hasn’t been updated in several years and will no longer work after Google deprecates their old API in April. There’s also a nascent RGoogleDrive on GitHub, but it doesn’t look like it’s being actively developed and doesn’t do much of what I’d need. Thoughts? (Note this relates to @Louis’s question about Google Drive)

Other things I’m trying with rchie:

  • Pulling ArchieML in from Word docs, using rmarkdown::pandoc_convert
  • Extracting ArchieML from the text portions of Rmd files, and making this available to chunks of those files when knitting.
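For the Rmd case, one possible first step is separating the prose from the code chunks so that only the prose is handed to the ArchieML parser. The sketch below uses base R only; `rmd_prose` is a hypothetical helper, not something in rchie or knitr, and it assumes chunks are delimited by triple-backtick fences and that there is no YAML front matter to worry about.

```r
# Minimal sketch: keep only the prose lines of an R Markdown document by
# dropping fenced code chunks. Hypothetical helper; assumes triple-backtick
# fences and does not handle YAML front matter or inline code.
rmd_prose <- function(lines) {
  is_fence <- grepl("^[[:space:]]*```", lines)
  # After an odd number of fences we are inside a chunk.
  inside <- cumsum(is_fence) %% 2 == 1
  lines[!(inside | is_fence)]
}

rmd <- c(
  "headline: Rainfall this year",
  "```{r}",
  "x <- c(a = 1)",
  "```",
  "caption: Monthly totals"
)

prose <- rmd_prose(rmd)
print(prose)
```

The remaining lines could then be pasted together and parsed, with the resulting list made available to the chunks at knit time.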

Let me know if there are features or use-cases I should think about.


Seems like it would be better to build a separate Drive client and then depend on that. I highly doubt Duncan will update his :smile:

I’ll think about use cases

Before working on a Google Drive package, I would suggest speaking with Oliver Keyes [1] or Jenny Bryan [2], since both are working on related efforts.


I have a use-case demo for rchie up as a Shiny App. It shows how an R Markdown doc can pull in dynamic text as ArchieML data from a Google Doc (using @Ironholds’s driver). I throw in something with gspreadr, for fun, too.

Did you mean to link to the Shiny App? Instead of the Google doc?

Um, yes:

I dropped the gspreadr component because, for some reason, pinging the Google spreadsheet was very slow, but now the App and Google Doc also show how you can have structured numeric data in ArchieML.

cool, looking quite good

I’ve been thinking about the “mixed workflow” approach that I demo’d with the Shiny app. The main purpose of developing the mixed workflow approach, I think, is to enable collaboration between someone in a word processing / spreadsheet environment and someone in a text/R/git environment. Both can use their preferred approach, and you get a final product that’s a scientific paper or interactive web page.

There are a bunch of things one could do to improve this. One is to use the driver and git2r packages to import the version history of a Google Doc into a working git repository. Another is to improve the Shiny App, possibly using htmlwidgets so that text is imported into the final product live, and/or embedding a Google Doc directly into the Shiny App, so that both editing and output can be seen in one place.

My question is: Is it worth it to develop this over alternatives? For instance, if there were an app that allowed live, concurrent *.Rmd editing on the left and displayed compiled HTML on the right, would that be a better collaboration environment for most cases? What about live co-editing in RStudio Server? And of course there’s @Louis’s idea of using the Google Doc as R Markdown and syncing to a local file.

In what cases would the “mixed workflow” / ArchieML approach have an advantage? Right now I think the advantage is in (1) outputs more complex than *.Rmd documents, like Shiny apps and web pages with complex layouts, and (2) version control, because I think version control in Google Docs/Dropbox and in git have distinct advantages (ease of use and power, respectively) that can be combined with this approach.


wrt concurrent editing of .Rmd documents, perhaps something with the ace editor + knitr + shiny

There is the possibility of using the ace editor, e.g., shinyAce. See example 3.

Another example, with OpenCPU

The reason I put together RGoogleDocs in the first place was solely to be able to pull a shared Google Doc that a number of people had edited into R for compilation as markdown.

I did this with the specific intent of using slidify (this was before rmarkdown was a viable option).

The solution hinged on downloading HTML from Google Docs and relying on the document headings to make decisions on formatting, as well as a Python script that converts HTML to text… presumably one could use pandoc, but at the time I wasn’t aware of it. Here’s an example of my workflow; it’s not very complicated and could definitely be adjusted from slidify to whatever package you’re working with (rmarkdown).

With that said, the RGoogleDocs package isn’t being developed because it already did exactly what I needed it to do. If you have something else you would like to see, post it as an issue on the GitHub page and I’ll see if I can work it into my schedule over the coming weeks.