1.6. Access Tweets

  • Author: Johannes Maucher

  • Last update: 2020-09-09

This notebook demonstrates how the Python library tweepy can be used to access data from Twitter.

#!pip install tweepy
import tweepy
import json
from slugify import slugify  # imported here, but not used in the cells of this notebook

The tweepy authentication process is described in detail in the tweepy authentication documentation. It is implemented in the following two code cells and requires personal Twitter credentials, which can be applied for at Twitter Apps.

My personal credentials are saved in the file twitterCredentials.json. The contents of this file are loaded in the following code cell.

with open("/Users/johannes/Dropbox (Privat)/twitterCredentials.json") as data_file:
    credentials = json.load(data_file)
# print(credentials)
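
The authentication cell of this notebook reads four keys from the credentials file. The following is a minimal sketch of a loader that checks for these keys up front, so that a typo in the file shows up before any request is sent. The key names are the ones used in this notebook; the function name `load_credentials` and the error message are illustrative assumptions, not part of tweepy.

```python
import json

# Key names as they are used by the authentication cell of this notebook.
REQUIRED_KEYS = ("consumerKey", "consumerSecret", "accessToken", "accessSecret")


def load_credentials(path):
    """Load the credentials file and fail early if a key is missing.

    Hypothetical helper for illustration, not part of tweepy.
    """
    with open(path, encoding="utf-8") as data_file:
        credentials = json.load(data_file)
    missing = [key for key in REQUIRED_KEYS if key not in credentials]
    if missing:
        raise KeyError("missing credential keys: " + ", ".join(missing))
    return credentials
```

The returned dictionary can be used in the same way as `credentials` in the cells of this notebook.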

Tweepy authentication:

auth = tweepy.OAuthHandler(credentials["consumerKey"], credentials["consumerSecret"])
auth.set_access_token(credentials["accessToken"],credentials["accessSecret"])
api = tweepy.API(auth)

The API object can now be used, for example, to read the home timeline of the authenticated account:

public_tweets = api.home_timeline(count=10)
for tweet in public_tweets:
    print("-"*10)
    print(tweet.author.name)
    print(tweet.text)
----------
Steffen Seibert
RT @coronawarnapp: Die #CoronaWarnApp braucht dich. Und dich. Und dich. 📲❌🌊

Danke dafür.👇 https://t.co/48aULZW90X
----------
VfB Stuttgart
Glückwunsch an Borna #Sosa und die kroatische Nationalmannschaft zur erfolgreichen WM-Qualifikation💪🔥 https://t.co/bIVKHstci6
----------
Steffen Seibert
RT @GermanyDiplo: Rumours that Germany is planning to send busses to pick up persons from #Belarus through #Poland to Germany are false. Wh…
----------
Steffen Seibert
RT @GermanyDiplo: ئەو دەنگۆیانەی باس لەوە دەکەن، گوایە ئەڵمانیا پاسی ناردوون بۆ هێنانی خەڵکی لە #بێلاروسیاوە لە ڕێی #پۆلۆنیا راست نین. ئەوە…
----------
Steffen Seibert
RT @BMVg_Bundeswehr: Wir trauern um die Toten von Krieg und Gewaltherrschaft. Der heutige #Volkstrauertag mahnt uns zu Versöhnung und Fried…
----------
VfB Stuttgart
Gute Besserung, Roberto! 

#VfB | #Massimo https://t.co/TEMMpNRSjt
----------
Hochschule der Medien (HdM)
Am 18. 11.2021 stellt die HdM Studieninteressierten verschiedene Bachelorstudienangebote online vor. Besucherinnen… https://t.co/5aCdsVWmrd
----------
VfB Stuttgart
Doppeltes Heimspiel-Feeling in ⚪🔴! Sichert euch jetzt eure Tickets gegen Mainz (26.11.) und Hertha (5.12.). Mit dab… https://t.co/W5lnp0AY6R
----------
Steffen Seibert
RT @Senat_Hamburg: Eine Impfung schützt am sichersten vor einer schweren Erkrankung an Corona. Daher: #ÄrmelHoch, Hamburg! Es gibt mobile I…
----------
VfB Stuttgart
Gewinnt mit etwas Glück einen der begehrten Plätze beim #VfB Meet & Greet 2.0 powered by #MercedesBenzBank mit Wald… https://t.co/E5l0N7kKTp
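
Several tweets in the output above end with "…" because the standard response truncates long texts, and retweets start with "RT @". The following minimal sketch shows two helpers for dealing with both cases. It assumes that the timeline was requested with `tweet_mode="extended"`, in which case tweepy status objects carry a `full_text` attribute; otherwise it falls back to `text`. The helper names are hypothetical, and the demo objects are made-up stand-ins for real status objects.

```python
from types import SimpleNamespace


def tweet_text(tweet):
    """Return the untruncated text if present, else the (possibly truncated) text."""
    return getattr(tweet, "full_text", None) or tweet.text


def is_retweet(tweet):
    """Heuristic check: a `retweeted_status` attribute or a leading 'RT @'."""
    return hasattr(tweet, "retweeted_status") or tweet_text(tweet).startswith("RT @")


# Stand-in objects that only mimic the attributes used above.
demo_tweets = [
    SimpleNamespace(text="RT @coronawarnapp: Die #CoronaWarnApp braucht dich."),
    SimpleNamespace(text="A long tweet that got cut…",
                    full_text="A long tweet that got cut off in the standard response."),
]
for tweet in demo_tweets:
    print(is_retweet(tweet), tweet_text(tweet))
```

With a real API object, the corresponding request would be `api.home_timeline(count=10, tweet_mode="extended")`.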

The API object can also be used to send a new tweet (a.k.a. status update):

#api.update_status("This is just for testing tweepy api")

The API provides access not only to content (tweets), but also to user, friendship and list data, and much more:

#user = api.get_user('realDonaldTrump')
user = api.get_user('RegSprecher')
user.id
234343491
print(user.description)
Sprecher der Bundesregierung und Chef des Bundespresseamtes (BPA). Tweets seiner Mitarbeiter/innen enden mit dem Kürzel (BPA).
user.followers_count
1002923
user.location
'Berlin'
user.friends_count
156
friends = user.friends()  # one request; returns only the first page of friends
for friend in friends:
    print(friend.screen_name)
print(len(friends))
endrifuga
Celik_Chn
coronawarnapp
barrierefrei
OurWorldInData
berthoppe
POTUS
DerekinBerlin
StefanLeifert
WhiteHouse
JoeBiden
Luisamneubauer
germanyintheeu
martinkaul
BMWi_Bund
BMISprecher
eucopresident
maithi_nk
BMJV_Bund
BMVg_Bundeswehr
20
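
The attributes queried one by one above can also be collected into a single dictionary, which is convenient for storing or comparing several accounts. A minimal sketch: `summarize_user` is a hypothetical helper, the attribute names are the ones used in the cells above, and the demo object only stands in for the result of `api.get_user`.

```python
from types import SimpleNamespace

# Attribute names taken from the cells above.
FIELDS = ("id", "screen_name", "description", "location",
          "followers_count", "friends_count")


def summarize_user(user):
    """Collect selected profile attributes; missing attributes become None."""
    return {field: getattr(user, field, None) for field in FIELDS}


# Stand-in object filled with the values printed above.
demo_user = SimpleNamespace(id=234343491, screen_name="RegSprecher",
                            location="Berlin", followers_count=1002923,
                            friends_count=156)
summary = summarize_user(demo_user)
print(summary)
```

Because the stand-in has no `description` attribute, that entry of `summary` is `None`.
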
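
The following cell saves the most recent tweets of several users, one file per user:

import os

dirname = "./TWEETS/"
os.makedirs(dirname, exist_ok=True)  # make sure the target directory exists
users = ["mxlearn", "reddit_python", "ML_NLP"]
numTweets = 300  # note: a single user_timeline request returns at most 200 tweets
for screen_name in users:
    print(screen_name)
    user_timeline = api.user_timeline(screen_name=screen_name, count=numTweets)
    filename = screen_name + ".json"  # holds one tweet text per line, not JSON
    with open(os.path.join(dirname, filename), "w", encoding="utf-8") as f:
        for tweet in user_timeline:
            f.write(tweet.text + "\n")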
mxlearn
reddit_python
ML_NLP
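
The loop above writes only the tweet text, one tweet per line, although tweets may themselves contain line breaks and the files carry a `.json` extension. One possible alternative is to store each tweet's complete record as one JSON document per line (JSON Lines). The sketch below assumes that each status object exposes its raw data as `_json`, as used with `json.dump` elsewhere in this notebook; the helper names and the `.jsonl` extension are illustrative assumptions.

```python
import json


def write_jsonl(tweets, path):
    """Write one JSON document per line and return the number of records."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for tweet in tweets:
            # json.dumps escapes line breaks inside strings, so every
            # record stays on a single line.
            f.write(json.dumps(tweet._json, ensure_ascii=False) + "\n")
            count += 1
    return count


def read_jsonl(path):
    """Read the records back as a list of dictionaries."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

A call such as `write_jsonl(user_timeline, "mxlearn.jsonl")` would then replace the inner loop, and `read_jsonl` restores the records including all metadata.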

An alternative way to access timelines and friend lists is tweepy's Cursor object:

with open("twitterTimeline.json", 'w+') as f:
    for status in tweepy.Cursor(api.home_timeline).items(1):
        json.dump(status._json,f)
for friend in list(tweepy.Cursor(api.friends).items()):
    print(friend.name)
Johannes Maucher
VfB Stuttgart
Steffen Seibert
Hochschule der Medien (HdM)
for tweet in tweepy.Cursor(api.user_timeline).items(1):
    print("-"*10)
    print(tweet.text)
----------
This is just for testing tweepy api
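
Conceptually, a cursor hides the paging of the underlying endpoint: it requests one page after another and yields the items until the requested number is reached or no more data is available. The following is a simplified illustration of that idea, not tweepy's actual implementation. `iterate_items` and `fake_fetch_page` are hypothetical names, and the fake fetcher replaces real API calls.

```python
def iterate_items(fetch_page, limit=None):
    """Yield items page by page until `limit` items or an empty page is reached."""
    page_number = 1
    yielded = 0
    while limit is None or yielded < limit:
        page = fetch_page(page_number)
        if not page:  # no more data available
            return
        for item in page:
            if limit is not None and yielded >= limit:
                return
            yield item
            yielded += 1
        page_number += 1


# Made-up data standing in for a timeline of seven tweets.
DATA = ["tweet %d" % i for i in range(1, 8)]


def fake_fetch_page(page_number, page_size=3):
    """Return one page of the made-up data; an empty list after the last page."""
    start = (page_number - 1) * page_size
    return DATA[start:start + page_size]


first_five = list(iterate_items(fake_fetch_page, limit=5))
everything = list(iterate_items(fake_fetch_page))
print(first_five)
print(len(everything))
```

In the cells above, the argument of `.items(...)` plays the role of `limit`, and calling `.items()` without an argument iterates over all available entries.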