I have a project in mind where I would download all the tweets sent to celebrities over the last year, run a sentiment analysis on them, and evaluate who has the most positive fans.

Then I discovered that the tweepy/Twitter API lets you retrieve Twitter mentions for the last 7 days at most. I searched the net but couldn't find any way to download tweets for the last year.

Anyway, I decided to do the project on the last 7 days of data only and wrote the following code:

    try:
        while 1:
            for results in tweepy.Cursor(twitter_api.search, q="@celebrity_handle").items(9999999):
                item = (results.text).encode('utf-8').strip()
                wr.writerow([item, results.created_at])  # write to a csv (tweet, date)
    except tweepy.TweepError as e:
        # the original snippet ended before its except clause
        print(e)

I am using the Cursor search API because the other (more accurate) way to get mentions is limited to retrieving only the last 800 tweets.

After running the code overnight, I was able to download only 32K tweets. Around 90% of them were retweets.

Is there a better, more efficient way to get mentions data?

Do keep in mind that:

- I want to do this for multiple celebrities (famous ones with millions of followers).
- I don't care about retweets.
- They have thousands of tweets sent to them per day.

Any suggestions would be welcome; at the moment I am out of ideas.

1 Answer

I would use the search API. I did something similar with the following code, and it appears to have worked as expected. I used it on a specific movie star and pulled 15568 tweets; on a quick scan, all of them appear to be @mentions of them. (I pulled from their entire timeline.)

In your case, for a search you'd want to run, say, daily, I'd store the id of the last mention you pulled for each user and set that value as "sinceId" each time you rerun the search.

As an aside, AppAuthHandler is much faster than OAuthHandler, and you won't need user authentication for these kinds of data pulls.

    import jsonpickle
    import tweepy

    # These are tweepy 3.x names. tweepy 4 renamed api.search to
    # api.search_tweets and TweepError to TweepyException, and dropped
    # wait_on_rate_limit_notify.
    auth = tweepy.AppAuthHandler(consumer_token, consumer_secret)
    auth.secure = True
    api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

    # This is what we're searching for. In your case I would make a list and
    # iterate through all of the usernames in each pass of the search run.
    searchQuery = '@username'

    # This filters out retweets. The leading space keeps the operator
    # separate from the username when the two strings are concatenated.
    retweet_filter = ' -filter:retweets'

Inside each api.search call below, I would put the following in as the query parameter:

    q=searchQuery + retweet_filter

The following code (and the API setup above) is from this link:

    tweetsPerQry = 100    # this is the max the API permits
    fName = 'tweets.txt'  # We'll store the tweets in a text file.

    # If results from a specific ID onwards are required, set sinceId to that ID.
    # Else default to no lower limit and go as far back as the API allows.
    sinceId = None

    # If only results below a specific ID are required, set max_id to that ID.
    # Else default to no upper limit and start from the most recent tweet
    # matching the search query.
    max_id = -1

    # However many you want to limit your collection to.
    # How much storage space do you have?
    maxTweets = 10000000

    tweetCount = 0
    print("Downloading max {0} tweets".format(maxTweets))
    with open(fName, 'w') as f:
        while tweetCount < maxTweets:
            try:
                if max_id <= 0:
                    if not sinceId:
                        new_tweets = api.search(q=searchQuery, count=tweetsPerQry)
                    else:
                        new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                                since_id=sinceId)
                else:
                    if not sinceId:
                        new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                                max_id=str(max_id - 1))
                    else:
                        new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                                max_id=str(max_id - 1),
                                                since_id=sinceId)
                if not new_tweets:
                    print("No more tweets found")
                    break
                for tweet in new_tweets:
                    f.write(jsonpickle.encode(tweet._json, unpicklable=False) + '\n')
                tweetCount += len(new_tweets)
                print("Downloaded {0} tweets".format(tweetCount))
                # Page backwards: the next request asks for tweets older than this one.
                max_id = new_tweets[-1].id
            except tweepy.TweepError as e:
                # Just exit if any error
                print("some error : " + str(e))
                break

    print("Downloaded {0} tweets, Saved to {1}".format(tweetCount, fName))
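The per-user "sinceId" bookkeeping suggested above could be sketched roughly as follows. This is only an illustration under assumptions: the helper names (`build_query`, `load_since_ids`, `save_since_ids`, `update_since_id`) and the JSON state file are hypothetical and not part of tweepy or the original code, and the sketch deliberately makes no API calls.

```python
import json
from pathlib import Path

# Search operator that excludes retweets; the leading space separates it
# from the handle when the two are joined.
RETWEET_FILTER = " -filter:retweets"


def build_query(handle):
    """Return a search query for mentions of `handle`, excluding retweets."""
    return "@" + handle.lstrip("@") + RETWEET_FILTER


def load_since_ids(path):
    """Load the {handle: newest tweet id seen} mapping, or {} on the first run."""
    state_file = Path(path)
    if not state_file.exists():
        return {}
    return json.loads(state_file.read_text())


def save_since_ids(path, since_ids):
    """Write the {handle: newest tweet id seen} mapping back to disk."""
    Path(path).write_text(json.dumps(since_ids, indent=2))


def update_since_id(since_ids, handle, tweet_ids):
    """Record the largest id in `tweet_ids` for `handle`, never moving backwards."""
    if tweet_ids:
        since_ids[handle] = max(max(tweet_ids), since_ids.get(handle, 0))
    return since_ids
```

In a daily run, one would presumably loop over the list of handles, pass `build_query(handle)` as `q` and `since_ids.get(handle)` as `since_id` to the search call, then call `update_since_id` with the ids of the downloaded tweets and `save_since_ids` once at the end.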
