search_tweets() doesn't include user_id? GitHub v0.7.0.9024

I started using the GitHub repo version of rtweet (0.7.0.9024) after a suggestion by Lluis. The first issue I’ve run into is that search_tweets() doesn’t seem to include an identifier for the user who posted each tweet (such as the account ID or even screen_name).

Since the tweets data and the users_data() output both have an “id_str” column, my initial assumption was that this was a way of connecting the two data frames, but they refer to the tweet ID and the user ID respectively.

I feel like there’s something I’m missing here, so figured I’d post here b4 creating an issue on github.

library(rtweet)
library(tidyverse)
#Bearer token input
auth_setup_default()

#Test run of my personal acct
test <- search_tweets2(q = "@ADrugResearcher", n=5)

#ss1 Retrieved 466 tweets
ss1 <- search_tweets(q = '"safe supply"', n = 600,
                    include_rts = TRUE, 
                    retryonratelimit = TRUE)

#Remove RT's (I need them for other purposes)
snrt <- ss1 %>%
  filter(!str_detect(full_text, "RT "))
#snrt returns 214 rows, 39 columns

u_data <- users_data(snrt)
#users_data returns 465 rows?

I have elevated access if that makes a difference (seems like search_tweets still runs on v1 though?)

The other idea I had was to try to include the author ID as an additional parameter:

ss2 <- search_tweets(q = '"safe supply"', n = 100,
                     include_rts = TRUE, 
                     retryonratelimit = TRUE,
                     env_name="author_id")
#or the same thing, but instead of env_name, using
ss2 <- search_tweets(q = '"safe supply"', n = 100,
                     include_rts = TRUE, 
                     retryonratelimit = TRUE,
                     tweet.fields="author_id")

But then I get,

Warning message:
Could not authenticate you. (32)

Which I think is due to rtweet using v1, but I’m unsure.

Sorry if this is convoluted, I’m primarily a qualitative researcher & thanks for any help!

id_str is either the status ID or the user’s identifier, depending on which data frame you look at. If the request is for tweets, id_str is what used to be status_id, and the user ID is the id_str of the users_data() of those tweets. If the endpoint requests data about users, id_str is the user ID, and the id_str of tweets_data() is the status ID. These IDs are not meant to match users to tweets.

user <- lookup_users("ADrugResearcher")
t_data <- tweets_data(user)
# t_data$id_str is the id of the latest status.
# user$id_str is the id of the user.
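
To make the distinction concrete, here is a minimal sketch with toy data frames standing in for a search result and its users data. All values are invented, and it assumes the users data has one row per tweet, in the same order as the tweets. Under that assumption the author of each tweet is attached by position, not by matching id_str:

```r
# Toy stand-ins (invented values) for a search result and its users data.
tweets <- data.frame(
  id_str = c("1001", "1002", "1003"),          # status IDs
  full_text = c("first", "RT @bob: second", "third"),
  stringsAsFactors = FALSE
)
users <- data.frame(
  id_str = c("42", "77", "42"),                # user IDs, one row per tweet
  screen_name = c("alice", "bob", "alice"),
  stringsAsFactors = FALSE
)

# Both frames call their key "id_str", so rename before combining.
names(tweets)[names(tweets) == "id_str"] <- "status_id"
names(users)[names(users) == "id_str"] <- "user_id"

# Assumption: row i of `users` is the author of row i of `tweets`.
stopifnot(nrow(tweets) == nrow(users))
tweets_with_author <- cbind(tweets, users[, c("user_id", "screen_name")])
```

With real data, tweets and users would come from the search result and users_data() respectively. The row-count check is worth keeping, since binding by position is only valid when the two have the same number of rows.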

When you filter ss1, you don’t filter the users_data() of that data frame, so the two end up with different numbers of rows. I recommend doing something like:

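# Logical mask: TRUE for tweets whose text contains "RT "
rt <- str_detect(ss1$full_text, "RT ")
# Apply the same mask to the tweets and to their users data
snrt <- ss1[!rt, ]
u_data_snrt <- users_data(ss1)[!rt, ]
# Reattach the filtered users data so users_data(snrt) stays aligned
attr(snrt, "users") <- u_data_snrt
u_data <- users_data(snrt)
stopifnot(nrow(u_data) == nrow(snrt))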
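
The same pattern can be tried without calling the API. This is a minimal sketch with a toy data frame (invented values) that carries its users data in a "users" attribute, mirroring the attr(snrt, "users") call above. Real rtweet objects may carry additional classes and attributes. It uses grepl("^RT @", ...) so that only texts that start as a retweet match. That pattern is a swapped-in choice, not the one used in this thread:

```r
# Toy tweets with the users data stored in a "users" attribute (invented values).
ss <- data.frame(
  id_str = c("1001", "1002", "1003"),
  full_text = c("first", "RT @bob: second", "third"),
  stringsAsFactors = FALSE
)
attr(ss, "users") <- data.frame(
  id_str = c("42", "77", "42"),
  screen_name = c("alice", "bob", "alice"),
  stringsAsFactors = FALSE
)

# One logical mask, applied to both the tweets and their users.
is_rt <- grepl("^RT @", ss$full_text)
kept <- ss[!is_rt, ]
attr(kept, "users") <- attr(ss, "users")[!is_rt, ]

# Both pieces now describe the same two tweets.
stopifnot(nrow(attr(kept, "users")) == nrow(kept))
```

Computing the mask once and reusing it is what keeps the rows aligned. Filtering the tweets with one expression and the users with another is where the two can drift apart.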

Yes, rtweet (still) uses v1 for all the requests.

The arguments tweet.fields and env_name are not Twitter API (v1) parameters for this endpoint. This is probably why the request fails to validate/authenticate. Please read the Twitter documentation for the endpoint (now linked in the References section at the end of the help page).

I hope this helps,
I’ll try to improve this to make it easier to subset and filter using the whole data returned.