Incident analysis of Paris RATP metro lines
This library can be used to generate the following figures, which illustrate the probability of incidents for each Paris metro/RER line during the last 30 days.
Which line is most likely to make you angry?
When (at which hour and on which day) should you avoid the metro?
API documentation
class ratpmetro.RATPMetroTweetsAnalyzer(api=None)
Class for analyzing Paris RATP metro line incidents, using their official Twitter accounts.
To download tweets, you need to obtain Twitter developer API keys (consumer_key, consumer_secret, access_key and access_secret). Be aware that it may not be possible to download all 14 lines in a row, because the Twitter API imposes usage limits.
Parameters: api (dict) – Dictionary containing the Twitter developer API keys: consumer_key, consumer_secret, access_key, access_secret
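As an illustration of the `api` argument, the sketch below assembles the dictionary from environment variables. The four dictionary keys are the ones listed above; the helper name `build_api_keys` and the `TWITTER_*` environment variable names are assumptions made for this example and are not part of the library.

```python
import os

# Key names expected in the `api` dictionary, as documented above.
REQUIRED_KEYS = ("consumer_key", "consumer_secret", "access_key", "access_secret")


def build_api_keys(env=os.environ):
    """Collect the four Twitter API keys from a mapping such as os.environ.

    Hypothetical helper: looks up TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET,
    TWITTER_ACCESS_KEY and TWITTER_ACCESS_SECRET (assumed variable names).
    """
    api = {}
    missing = []
    for key in REQUIRED_KEYS:
        value = env.get("TWITTER_" + key.upper())
        if value:
            api[key] = value
        else:
            missing.append(key)
    if missing:
        raise KeyError("missing Twitter API keys: " + ", ".join(missing))
    return api


# Usage (requires the ratpmetro package and real credentials):
# from ratpmetro import RATPMetroTweetsAnalyzer
# analyzer = RATPMetroTweetsAnalyzer(api=build_api_keys())
```

Keeping the credentials out of the source file avoids committing them by accident; any other way of building the same four-key dictionary should work equally well.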
incident_prob(year=None, loc=None)
Return the mean probability of incidents.
Parameters:
- year (int) – If year is given, only tweets from that year are used; otherwise all downloaded tweets are used
- loc (list of str) – Time period from loc[0] to loc[1]
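To make the roles of `year` and `loc` concrete, here is a minimal standalone sketch of the filtering idea. It does not reproduce the library's internals: the data layout (one boolean flag per time slot), the function name `mean_incident_prob`, and the use of ISO date strings for `loc` are all assumptions of this example.

```python
from datetime import datetime


def mean_incident_prob(samples, year=None, loc=None):
    """Mean incident probability over (datetime, bool) samples.

    Assumptions of this sketch: each sample is one time slot, flagged True
    when an incident was reported, and loc holds two ISO date strings.
    """
    if loc is not None:
        start = datetime.fromisoformat(loc[0])
        end = datetime.fromisoformat(loc[1])
    selected = []
    for when, incident in samples:
        if year is not None and when.year != year:
            continue  # outside the requested year
        if loc is not None and not (start <= when <= end):
            continue  # outside the requested period
        selected.append(bool(incident))
    if not selected:
        return None  # nothing left after filtering
    return sum(selected) / len(selected)


# Made-up data for illustration only.
samples = [
    (datetime(2017, 1, 1, 8), True),
    (datetime(2017, 1, 1, 9), False),
    (datetime(2018, 1, 1, 8), False),
    (datetime(2018, 1, 1, 9), False),
]
print(mean_incident_prob(samples))
print(mean_incident_prob(samples, year=2017))
print(mean_incident_prob(samples, loc=["2018-01-01", "2018-12-31"]))
```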
load(line, number_of_tweets=3200, folder_tweets='tweets', force_download=False)
Download the tweets from the official RATP Twitter account.
Some code is adapted from https://github.com/gitlaura/get_tweets
Parameters:
- line (int or str) – RATP metro line number (1 to 14), or "A", "B" for RER lines
- number_of_tweets (int) – Number of tweets to download; must be at most 3200 due to a limitation of the Twitter API
- folder_tweets (str) – Folder in which the downloaded tweets are stored as a .csv file
- force_download (bool) – If False, an already downloaded file is loaded directly without re-downloading it. Set force_download=True to force a new download
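The interplay between `folder_tweets` and `force_download` amounts to a download-once cache. The sketch below illustrates that pattern with the standard library only. The function `load_cached`, the `fetch` callback and the `line_<n>.csv` file naming are hypothetical and stand in for whatever the library does internally.

```python
import csv
import os


def load_cached(line, fetch, folder_tweets="tweets", force_download=False):
    """Return tweet rows for `line`, downloading only when needed.

    `fetch` is a hypothetical callable returning an iterable of
    (created_at, text) rows for the given line.
    """
    os.makedirs(folder_tweets, exist_ok=True)
    # Assumed file naming; the library's actual file names may differ.
    path = os.path.join(folder_tweets, "line_{}.csv".format(line))
    if force_download or not os.path.exists(path):
        rows = fetch(line)
        with open(path, "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(rows)
    # Always read back from disk so both code paths return the same shape.
    with open(path, newline="", encoding="utf-8") as f:
        return [tuple(row) for row in csv.reader(f)]
```

With this pattern, repeated calls for the same line do not consume Twitter API quota unless `force_download=True` is passed.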
plot_incident_cause(year=None, loc=None)
Plot frequencies of the main causes of incidents.
Parameters:
- year (int) – If year is given, only tweets from that year are used; otherwise all downloaded tweets are used
- loc (list of str) – Time period from loc[0] to loc[1]
plot_incident_prob(by='hour', year=None, loc=None, **kwargs)
Plot the (marginal) probability of operational incidents.
Parameters:
- by (str) – Can be "year", "month", "day", "weekday", "hour", or any two of them connected by a "-", like "hour-weekday"
- year (int) – If year is given, only tweets from that year are used; otherwise all downloaded tweets are used
- loc (list of str) – Time period from loc[0] to loc[1]
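The `by` argument selects the time components over which the probability is marginalized. As a rough illustration of what a value such as "hour-weekday" implies, the standalone sketch below groups flagged time slots by one or two components and computes the incident fraction per group. The function `incident_prob_by`, the sample layout, and the reading of "day" as day of the month are assumptions of this example; the library's own computation and plotting are not shown.

```python
from collections import defaultdict
from datetime import datetime

# Map each supported name to a function extracting that component.
COMPONENTS = {
    "year": lambda t: t.year,
    "month": lambda t: t.month,
    "day": lambda t: t.day,            # assumed to mean day of the month
    "weekday": lambda t: t.weekday(),  # Monday = 0
    "hour": lambda t: t.hour,
}


def incident_prob_by(samples, by="hour"):
    """Fraction of incident-flagged slots per group of (datetime, bool) samples."""
    names = by.split("-")
    if not 1 <= len(names) <= 2 or any(n not in COMPONENTS for n in names):
        raise ValueError("unsupported value for by: {!r}".format(by))
    totals = defaultdict(lambda: [0, 0])  # key -> [incident count, slot count]
    for when, incident in samples:
        key = tuple(COMPONENTS[n](when) for n in names)
        totals[key][0] += int(bool(incident))
        totals[key][1] += 1
    return {key: hits / count for key, (hits, count) in totals.items()}


# Made-up data for illustration only (2 January 2017 was a Monday).
samples = [
    (datetime(2017, 1, 2, 8), True),
    (datetime(2017, 1, 3, 8), False),
    (datetime(2017, 1, 2, 9), False),
]
by_hour = incident_prob_by(samples, by="hour")
by_hour_weekday = incident_prob_by(samples, by="hour-weekday")
```

A single component yields a one-dimensional profile (for example, probability per hour), while two components yield a grid that would suit a heat map.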
process()
Process the downloaded raw data frame (using Paris time zone, identifying incidents, resampling…)