rOpenSci package or resource used
ghql
What did you do?
I had a Twitter bot summarize my 2020 contributions on GitHub. I was a bit surprised by the high number of commits I had made, so I decided to look into it. Using the GitHub GraphQL API via the ghql package, I first pulled all the repositories I had contributed to in the past and then, in a second step, got all my commits to those repositories. I then looked at the “smaller commits” to check whether “~30% of my commits were 1 line diffs”, which was my main hypothesis for the high number of commits (spoiler: no, only ~12%). I also looked at the files I changed in those small commits (using the REST API via gh).
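The check on the small commits boils down to simple arithmetic on the `additions` and `deletions` fields of each commit. Here is a minimal sketch in R, assuming a data frame `commits` with one row per commit. The numbers are made up for illustration, and counting a "1 line diff" as exactly one added or deleted line in total is an assumption, not necessarily the definition used in the analysis.

```r
# Toy data (made up): one row per commit, with the `additions` and
# `deletions` fields that the GitHub GraphQL API reports per commit.
commits <- data.frame(
  oid       = c("a1", "b2", "c3", "d4", "e5"),
  additions = c(1, 10, 0, 1, 3),
  deletions = c(0, 2, 1, 1, 0)
)

# Assumption: a "1 line diff" changes exactly one line in total.
commits$lines_changed <- commits$additions + commits$deletions
one_line <- commits$lines_changed == 1

# Share of commits that are 1 line diffs
share_one_line <- mean(one_line)
share_one_line
#> [1] 0.4
```

A different definition (for example, one addition and one deletion) would only change the line that computes `one_line`.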
In theory, and with some tweaks (mostly adding pagination to the first step), the data collection approach via ghql should be able to give you all your commits to the default branch across all repositories you have committed to.
URL or code snippet for your use case
For example, here’s the query for step 2 (get all commits to a specific repo):
query getCommits($name: String!, $owner: String!, $authorId: String!, $after: String)
{
  repository(name: $name, owner: $owner) {
    defaultBranchRef {
      target {
        ... on Commit {
          history(first: 100, author: {id: $authorId}, after: $after) {
            nodes {
              commitUrl
              deletions
              additions
              author {
                user {
                  login
                }
                email
                name
              }
              message
              messageBody
              changedFiles
              committedDate
              oid
              committedViaWeb
              pushedDate
            }
            pageInfo {
              hasNextPage
              hasPreviousPage
              endCursor
            }
            totalCount
          }
        }
      }
    }
  }
}
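The query returns at most 100 commits per call, so getting everything means following `endCursor` until `hasNextPage` is false. Below is a rough sketch of how that loop could look with ghql, not the exact code used here. The helpers `collect_pages()` and `make_fetcher()` are hypothetical names, `commits_query` is assumed to hold the query above as a string, and the token is assumed to live in the `GITHUB_PAT` environment variable.

```r
# Generic cursor loop. `fetch_page(cursor)` must return the `history`
# object of the query (with `nodes` and `pageInfo`).
collect_pages <- function(fetch_page) {
  pages <- list()
  cursor <- NULL
  repeat {
    history <- fetch_page(cursor)
    pages[[length(pages) + 1]] <- history$nodes
    # Stop when there is no further page (or no history at all)
    if (!isTRUE(history$pageInfo$hasNextPage)) break
    cursor <- history$pageInfo$endCursor
  }
  pages
}

# ghql-backed fetcher for one repository (hypothetical helper).
make_fetcher <- function(commits_query, owner, name, author_id,
                         token = Sys.getenv("GITHUB_PAT")) {
  con <- ghql::GraphqlClient$new(
    url = "https://api.github.com/graphql",
    headers = list(Authorization = paste0("Bearer ", token))
  )
  qry <- ghql::Query$new()
  qry$query("getCommits", commits_query)

  function(cursor) {
    vars <- list(owner = owner, name = name, authorId = author_id)
    # Only send `after` once there is a cursor; the first page has none
    if (!is.null(cursor)) vars$after <- cursor
    res <- jsonlite::fromJSON(
      con$exec(qry$queries$getCommits, variables = vars)
    )
    res$data$repository$defaultBranchRef$target$history
  }
}
```

A call would then look like `pages <- collect_pages(make_fetcher(commits_query, "ropensci", "ghql", author_id))`, where the owner and repository are placeholders and `author_id` is your GitHub node ID. The result is a list with one element per page. Keeping the loop separate from the API call makes it easy to try out with a fake `fetch_page` function before spending rate limit.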
Image
Sector
other
Field(s) of application
Ehm, social sciences maybe? More generally, meta-analysis of your own research patterns, maybe?
Comments
Thanks for the great ghql package!