Generate pdf from xml

text-mining
r

#1

Dear forum,

I’d like to text-mine Science Direct papers with an R package called statcheck, but I am only able to reliably get XML through the API.

Is there any way to generate a PDF file from the XML? Statcheck only works on HTML and PDF formats.

Thank you,
Andrei


#2

Curious, how are you using the API?

Will get back to you on the XML-to-PDF thing.


#3

Thank you for the prompt reply.

With respect to the unsuccessful PDF download, I should add that I only get the first page when using the URL that they provide, so I didn’t try via the “fulltext” package.

I am having some trouble understanding how the PDF download works in “fulltext”. Can you please explain how I am supposed to get the PDF exactly? I ran some of the dummy examples, like:

res <- ft_get(x = '10.1101/012476')
res$biorxiv

but I don’t understand what I am to do with:

res$biorxiv$data$path
[[1]]
[1] "~/.fulltext/10.1101_012476.pdf"

thank you


#4

“~/.fulltext/10.1101_012476.pdf” is a path to the PDF file. More later, I’m at a conference right now.
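
In the meantime, here is a minimal sketch of what one could do with that path. It assumes the file was already downloaded by ft_get(), that the statcheck package is installed, and that its checkPDF() function accepts a PDF file path (depending on the statcheck version, it may also need an external PDF-to-text tool). The variable names pdf_path and pdf_full are just illustrative.

```r
# Path copied from the res$biorxiv$data$path output quoted in this thread
pdf_path <- "~/.fulltext/10.1101_012476.pdf"

# Expand "~" to the full home directory so other tools can locate the file
pdf_full <- path.expand(pdf_path)

if (file.exists(pdf_full) && requireNamespace("statcheck", quietly = TRUE)) {
  # Assumption: checkPDF() takes one or more PDF file paths
  results <- statcheck::checkPDF(pdf_full)
  print(results)
} else {
  message("File or statcheck package not found; expected the PDF at: ", pdf_full)
}
```

If the file is missing, re-running ft_get() for the same DOI should recreate it, and the path can then be passed on as above.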