Directions for rchie package / Google Docs

noamross · March 9, 2015, 11:07pm

I’ve begun work on the rchie package, which parses the NYT’s new ArchieML format.

The idea of ArchieML is to be able to include structured data in otherwise unstructured text documents. The main use case it seems to have been developed for is collaboration with non-technical writers. An example workflow: you want to create a graphic to accompany an article, so you collaborate on the article in Google Docs, and the article includes some hierarchical lists. When you parse the article with rchie::from_archie(), you get list objects you can use to make a figure with DiagrammeR.
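
To make the idea concrete, here is a minimal sketch of the simplest part of the format: lines of the form `key: value` become data, and the surrounding prose is ignored. It is written in Python purely for illustration and is not how rchie works. The function name `parse_simple_archie` is hypothetical, and the sketch deliberately skips nested keys, arrays, and multi-line values.

```python
import re

# Matches a line such as "title: My figure". Keys are limited here to
# letters, digits, underscores, and hyphens; this is a simplification.
KEY_VALUE = re.compile(r"^\s*([A-Za-z0-9_-]+)\s*:\s*(.*?)\s*$")


def parse_simple_archie(text):
    """Collect single-line `key: value` pairs and ignore all other lines."""
    data = {}
    for line in text.splitlines():
        match = KEY_VALUE.match(line)
        if match:
            key, value = match.groups()
            data[key] = value  # a repeated key overwrites the earlier value
    return data


doc = """This paragraph is ordinary prose and is skipped.
title: Species counts
Another sentence the writer typed.
year: 2015
"""
result = parse_simple_archie(doc)
print(result)
```

All values come back as strings in this sketch; converting `year` to a number would be a separate step.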

I’d like rchie to work well with Google Docs. The question is: Should I build a Google Drive API wrapper into the package? RGoogleDocs hasn’t been updated in several years and will no longer work after Google deprecates their old API in April. There’s also a nascent RGoogleDrive on GitHub, but it doesn’t look like it’s being actively developed and doesn’t do much of what I’d need. Thoughts? (Note this relates to @Louis’s question about Google Drive)

Other things I’m trying with rchie:

Pulling ArchieML in from Word docs, using rmarkdown::pandoc_convert
Extracting ArchieML from the text portions of Rmd files, and making this available to chunks of those files when knitting.
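
The second idea amounts to separating the prose of an Rmd file from its code chunks before looking for ArchieML in the prose. Below is a rough sketch of that separation step only, in Python for illustration. The function name `rmd_prose` is hypothetical, and the sketch assumes that chunks open and close with a line starting with three backticks and that any YAML header is simply left in place.

```python
FENCE = "`" * 3  # three backticks, built this way to avoid nesting a literal fence


def rmd_prose(rmd_text):
    """Return the lines of an R Markdown document that lie outside code chunks."""
    prose = []
    in_chunk = False
    for line in rmd_text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_chunk = not in_chunk  # a fence line opens or closes a chunk
            continue                 # the fence line itself is not prose
        if not in_chunk:
            prose.append(line)
    return "\n".join(prose)


example = "\n".join([
    "intro: Text a co-author wrote.",
    FENCE + "{r}",
    "x <- c(a = 1)  # code, not ArchieML",
    FENCE,
    "caption: Figure one.",
])
print(rmd_prose(example))
```

The returned text could then be handed to an ArchieML parser, so that code lines containing colons are never mistaken for keys.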

Let me know if there are features or use-cases I should think about.

sckott · March 10, 2015, 5:29pm

Seems like building a separate Drive client and then depending on that would be better. I highly doubt Duncan will update his.

I’ll think about use cases

karthik · March 10, 2015, 6:25pm

Before working on a Google Drive package, I would suggest speaking with Oliver Keyes [1] or Jenny Bryan [2], since both are working on related efforts.

noamross · March 13, 2015, 3:19am

I have a use-case demo for rchie up as a Shiny App. It shows how an R Markdown doc can pull in dynamic text as ArchieML data from a Google Doc (using @Ironholds’ driver). I throw in something with gspreadr, for fun, too.

sckott · March 13, 2015, 3:41pm

Did you mean to link to the Shiny App? Instead of the Google doc?

noamross · March 13, 2015, 4:20pm

Um, yes: https://noamross.shinyapps.io/rchie/

I dropped the gspreadr component because for some reason pinging the Google spreadsheet was very slow, but now the App and Google Doc also show how you can have structured numeric data in ArchieML.

sckott · March 13, 2015, 4:32pm

cool, looking quite good

noamross · March 16, 2015, 5:34pm

I’ve been thinking about the "mixed workflow" approach that I demo’d with the Shiny app. The main purpose of developing the mixed workflow approach, I think, is to enable collaboration between someone in a word processing / spreadsheet environment and someone in a text/R/git environment. Both can use their preferred approach, and you get a final product that’s a scientific paper or interactive web page.

There are a bunch of things one could do to improve this. One is to use the driver and git2r packages to import the version history of a Google Doc into a working git repository. Another is to improve the Shiny App, possibly using htmlwidgets so that text is imported into the final product live, and/or embedding a Google Doc directly into the Shiny App, so that both editing and output can be seen in one place.

My question is: Is it worth it to develop this over alternatives? For instance, if there were an app that allowed live, concurrent *.Rmd editing on the left and displayed compiled HTML on the right, would that be a better collaboration environment for most cases? What about live co-editing in RStudio Server? And of course there is @Louis’s idea of using the Google Doc as R Markdown and syncing to a local file.

In what cases would the “mixed workflow” / ArchieML approach have an advantage? Right now I think the advantage is in (1) outputs more complex than *.Rmd documents, like Shiny apps and web pages with complex layouts, and (2) version control, because I think version control in Google Docs/Dropbox and in git have distinct advantages (ease and power, respectively) that can be combined with this approach.

Thoughts?

sckott · March 16, 2015, 7:05pm

wrt concurrent editing of .Rmd documents, perhaps something with the Ace editor + knitr + shiny

There is the possibility of using the Ace editor, e.g., shinyAce https://github.com/trestletech/shinyAce. See example 3 https://github.com/trestletech/shinyAce#03-knitr

Another example, with OpenCPU: https://www.opencpu.org/posts/knitr-markdown-opencpu-app/

1beb · March 21, 2015, 3:59am

The reason I put together RGoogleDocs in the first place was solely to be able to pull a shared Google Doc that a number of people had edited into R for compilation as markdown.

I did this with the specific intent of using slidify (this was before rmarkdown was a viable option).

The solution hinged on downloading HTML from Google Docs and relying on the document headings to make decisions on formatting, as well as on a Python script that converts HTML to text… presumably one could use pandoc, but at the time I wasn’t aware of it. Here’s an example of my workflow; it’s not very complicated and could definitely be adjusted from slidify to whatever package you’re working with (rmarkdown):

http://bertelsen.ca/programming/google-docs-to-slidify-directly-from-r/
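
As a rough illustration of the heading-driven conversion described above (this is not the script from the linked post), the sketch below uses Python's standard `html.parser` module to turn headings and paragraphs from an HTML export into Markdown-style lines. The class name `HeadingsToMarkdown` is hypothetical, and only `h1` to `h3` and `p` elements are handled; lists, tables, images, and inline formatting are ignored.

```python
from html.parser import HTMLParser


class HeadingsToMarkdown(HTMLParser):
    """Turn <h1>-<h3> and <p> elements into Markdown-style lines."""

    # Markdown prefix for each block-level tag this sketch understands.
    BLOCKS = {"h1": "# ", "h2": "## ", "h3": "### ", "p": ""}

    def __init__(self):
        super().__init__()
        self.lines = []
        self.current = None  # tag of the block being read, if any
        self.buffer = []     # text fragments collected for that block

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCKS:
            self.current = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)  # inline tags such as <span> are flattened

    def handle_endtag(self, tag):
        if tag == self.current:
            text = "".join(self.buffer).strip()
            if text:
                self.lines.append(self.BLOCKS[tag] + text)
            self.current = None


parser = HeadingsToMarkdown()
parser.feed("<h1>Slide one</h1><p><span>Edited by </span><span>several people.</span></p>")
parser.close()
markdown = "\n\n".join(parser.lines)
print(markdown)
```

In practice pandoc, mentioned above, would handle far more of the HTML than this sketch does.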

With that said, the RGoogleDocs package isn’t being developed because it already did exactly what I needed it to do. If you have something else you would like to see, post it as an issue on the GitHub page and I’ll see if I can work it into my schedule over the coming weeks.
