Wrapping Elsevier's Sciencedirect/Scopus API?

Hi all,

I would like to ask if there would be interest in (and use cases for) developing a package that wraps the APIs of Elsevier’s Sciencedirect.com and Scopus database. (They divide their API into search, retrieval and metadata; link)

The reason is that I have searched for anything related to accessing the Sciencedirect & Scopus databases in R, but could not find anything that would give me direct access to (at least) some of their API. The closest was CITAN below, but it is used rather for post-processing of data (please correct me if I am wrong).

Any other tips are greatly appreciated. :raised_hand:

Thank you.

https://github.com/Rexamine/CITAN

hi @dmpe Good question! What are their authentication requirements? I believe I’ve tried the same web services, and found them to be too much trouble - would be hard to support as I believe they require API key auth as well as checking IP addresses requests come from, only allowing subscribing institutions to access. Is that wrong? I hope so :smile:

I haven’t used CITAN, but from the DESCRIPTION file https://github.com/Rexamine/CITAN/blob/master/DESCRIPTION it does appear the package doesn’t have any dependencies for making http requests, i.e., accessing these web services.

Hi @sckott,

Thanks for the reply. Yes, indeed, they require an API key, which I had no problem registering for, even with my non-university email. So the API key itself should not be a problem.

"Anyone can obtain an API Key and use the APIs free of charge"
http://dev.elsevier.com/about.html

There are quotas, naturally :-1: , and for some APIs you will need to contact them to get access (yet, I don’t think it requires :moneybag:). See http://dev.elsevier.com/api_key_settings.html

"Furthermore, full API access is only granted to clients that run within the networks of organizations with Sciencedirect and/or Scopus subscriptions."

Yeeh, but (at least here in Germany) almost every university has access to it. So, in case I do develop it (not yet, I must finish rbitly first), that should not be a problem either.

What is a complication, however, is their return format. Although it is JSON, it is this http://json-ld.org/ flavour, which (I) has no active libraries except for the JavaScript one (I mean, this says a lot to me :unamused: about the format) and
(II) has no direct packages in R (jsonlite will not work, or will be rather limited at best)

E.g. your response is https://gist.github.com/dmpe/76698d977c28314c02e3 (sorry for formatting issues). But they also offer XML (who knows if it is really better).

Hi @dmpe - Okay, they require an API key, but don’t they also check IP addresses? Or has that changed? That’s the part that makes it hard, since users have to be on a campus that has access or be on a VPN when off campus.

In terms of JSON-LD, right, there is no library specifically for handling JSON-LD, but in my experience you can still parse JSON-LD using jsonlite.
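
As a rough illustration of that point (just a sketch on a made-up payload, not a real Elsevier response; the keys and values below are assumptions), jsonlite reads JSON-LD as ordinary JSON and keeps the `@` and prefixed keys as plain names:

```r
library(jsonlite)

# Made-up JSON-LD-style payload; the keys and values are illustrative only
# and are not copied from an actual Elsevier response.
txt <- '{
  "@context": {"dc": "http://purl.org/dc/elements/1.1/"},
  "entry": [
    {"@id": "urn:example:1", "dc:title": "First article",  "dc:creator": "Doe J."},
    {"@id": "urn:example:2", "dc:title": "Second article", "dc:creator": "Roe R."}
  ]
}'

parsed <- fromJSON(txt)

# An array of flat objects is simplified into a data frame; the "@" and ":"
# characters in the keys survive verbatim as column names.
entries <- parsed$entry
names(entries)

# Non-syntactic names need [[ ]] (or backticks) rather than a bare $name
entries[["dc:title"]]
```

What you don’t get this way is any JSON-LD semantics (context expansion, compaction and so on); the `@context` block is just another list element.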

If you do go with XML, I’d suggest the xml2 library as it’s lighter weight than the XML package - it only parses XML (you can’t create XML), but that’s fine for your use case I think.
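
A minimal sketch of what that could look like with xml2, again on a made-up document (the element names and the namespace prefix are assumptions, not the real Elsevier schema):

```r
library(xml2)

# Made-up XML with a namespaced element; illustrative only,
# not the actual Elsevier response format.
txt <- '<search-results xmlns:dc="http://purl.org/dc/elements/1.1/">
  <entry><dc:title>First article</dc:title></entry>
  <entry><dc:title>Second article</dc:title></entry>
</search-results>'

doc <- read_xml(txt)

# xml_ns() collects the namespaces declared in the document so that the
# "dc:" prefix can be used in the XPath expression.
title_nodes <- xml_find_all(doc, ".//dc:title", xml_ns(doc))
titles <- xml_text(title_nodes)
titles
```

From there it is mostly a matter of writing one XPath per field you care about and binding the results into a data frame.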

Hi @sckott,

Yeeh, I have checked that with my university IP address and indeed they show different developer content based on which email and IP address you use. When registering with my university email, I have access to more documentation, e.g. on text mining. Yet I still need to contact my “librarian” to get a special API key only for text mining, which differs from the one you normally get.

:thumbsdown: to Elsevier.

Thanks for clarification. I imagine when you are at an IP address that doesn’t have subscription (e.g., at home) you only get open access content for text mining, which I imagine is quite small for this publisher.

@dmpe What do you think you’ll do wrt this project?

Hi @sckott,

(sorry for not giving an answer earlier)

I don’t know… I already put something on github https://github.com/dmpe/ElsevierR. Very hackerish, and just from looking at one data frame from the first function, it would be quite a lot of work to make it usable. But even with json-ld I think it should be possible, that is, by using jsonlite.

Looking very briefly at their xml, this would be a bit harder (a lot of parsing, I believe).

Hard to judge at this stage. But over the next 2-3 weeks it will be very hard for me to do anything on it (I have university work to do). Thus, if you (or anybody) begin something, I will try to contribute to it later.

I mean e.g. http://dev.elsevier.com/metadata.html#!/Abstract_Citation_Count/AbstractCitationCount could be quite interesting to do. But take a look at the return format. It’s jpeg. :pig:
I also think (?) that some of the functionality could also go to https://github.com/ropensci/alm too.

I don’t have any plans to work on this - I’ll check in on your project down the road.

I want to keep alm focused as an R client for Lagotto (API: http://alm.plos.org/docs/api) - which is an open source app to collect and serve article level metrics data

Good,
as said I will try to do something with Elsevier but just not now.

Thanks.

So I made something for this, which should work with ElsevieR

You need to install using

devtools::install_github("muschellij2/rscopus")

currently, but it should be on CRAN in the next few days/today. You need to get an API key from Elsevier, which is bound to an IP address set.

Hope that helps anyone!
John

Thanks @muschellij2 for sharing

I like that you made your package name all lowercase :smile:

I’m working in some Elsevier API access into rcrossref and fulltext as well, though those changes aren’t on CRAN yet