President’s Day (As In: What Does President Trump Do With His Day?)
R
rtweet
ggplot2
Visualization
Author
Garrick Aden-Buie
Published
February 27, 2019
On February 3rd, Axios released President Trump’s daily schedule. As in many other areas of his political career, Trump has broken with tradition by hiding his schedule from public view.
In addition to a set of re-typed PDF files, Axios also created a Google Spreadsheet containing the president’s schedule and notes about the activities. If you’re interested in reading about how that task could be accomplished, I highly recommend Maëlle Salmon’s post on rectangling the tables in the PDF files.
The leak and subsequent release by Axios provide unique insight into Trump’s daily activities, which are dominated by a large block of time referred to as Executive Time. Reportedly, Trump hated following a strict daily schedule, so former chief-of-staff John Kelly introduced the concept of Executive Time: unstructured time when the president watches news, makes phone calls, and tweets.
I won’t comment extensively on what these schedules mean—for more on that angle, see reporting from Axios, Vox, Politico and others.
Instead, I’ll use this post to visualize the president’s work day and tweeting habits and to demonstrate how to use R, plot.ly and the tools of the tidyverse to create interactive and static visualizations that help make sense of what the president does on a daily basis.
The President’s Daily 8am to 5pm Schedule
Axios’ article on Trump’s private schedule includes an interactive view of the president’s workday schedule from 8 a.m. to 5 p.m. Here, I recreate the same visualization, with each activity colored according to the activity’s category. (Note these plots look best on desktop devices.)
Hover over any time slot to view more details about the activity at that time. You can also toggle activity categories. Try removing everything except Meetings, Lunches, and Events; it’s unbelievable.
View static image of the plot. Expand the section below for a behind-the-scenes look at this visualization.
How This Was Made…
The pipeline for building this visualization is a fairly standard loading and transformation of the source data with readr and dplyr, followed by building the visualization in ggplot2 and passing off to plotly for the interactive parts. On the other hand, I created a number of helper functions and constants that I reused throughout this post, so there’s quite a bit of code and preamble to get to the actual plot making.
To build the plot, I first created several helper functions for the plot labels, scales, and data filtering. I also set the global plot theme, and created a few constants that I used across several plots in this post.
Helper Functions
The first helper adds "am" or "pm" to an integer hour for easy-to-read labels on the x-axis time of day.
am_pm <- function(x) {
  x <- floor(x)
  y <- paste(x)
  y[x < 1] <- "12 am"
  # use >= 1 so that 1 o'clock is labelled "1 am" as well
  y[x >= 1 & x < 12] <- paste(y[x >= 1 & x < 12], "am")
  y[x == 12] <- "12 pm"
  y[x > 12] <- paste(x[x > 12] - 12, "pm")
  y
}

am_pm(seq(8, 17, 3))
[1] "8 am" "11 am" "2 pm" "5 pm"
The second helper function is copied directly from StackOverflow: Reverse datetime (POSIXct data) axis in ggplot, in one of the rare-but-beautiful moments when directly copying and pasting from SO works out perfectly. This gives me a rev_date() transformer function that can be passed to scale_y_continuous(trans = rev_date) to show the y-axis in chronological order with the earliest date starting at the top.
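The helper itself isn't reproduced in this post, so here is a minimal sketch of how such a transformer can be assembled with the scales package, assuming the composition approach from that StackOverflow answer. The name c_trans() is illustrative, and the exact code used for this post may differ.

```r
library(scales)

# Compose two scales transformations: `b` is applied first, then `a`.
# (Illustrative helper; not necessarily identical to the code used in the post.)
c_trans <- function(a, b, breaks = b$breaks, format = b$format) {
  a <- as.trans(a)
  b <- as.trans(b)
  trans_new(
    name = paste(a$name, b$name, sep = "-"),
    transform = function(x) a$transform(b$transform(x)),
    inverse = function(x) b$inverse(a$inverse(x)),
    breaks = breaks,
    format = format
  )
}

# Reverse a POSIXct axis: convert the time to a number, then negate it
rev_date <- c_trans("reverse", "time")

earlier <- as.POSIXct("2018-11-07 08:00:00", tz = "UTC")
later <- as.POSIXct("2019-02-02 17:00:00", tz = "UTC")

# Later dates map to smaller values, so they are drawn lower on the axis
rev_date$transform(later) < rev_date$transform(earlier)
```

Because the date-time is converted to a number and then negated, later dates end up further down the axis, which is what puts the earliest date at the top.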
A third helper function abstracts the code required to filter out the portion of the data set that belongs to the workday, which I use throughout this post. This function is a nice example of how tidyeval can be used for flexible dplyr wrapper functions.
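The helper isn't shown here, so the following is only a sketch of what a tidyeval-based wrapper along these lines could look like. It assumes the workday means 8 a.m. up to (but not including) 5 p.m., and the argument name time_col and the toy schedule tibble are made up for illustration.

```r
library(dplyr)
library(lubridate)

# Sketch of a workday filter: the time column is passed unquoted and
# forwarded to dplyr with the embrace operator {{ }}.
filter_workday <- function(data, time_col = time_start) {
  data %>%
    filter(
      hour({{ time_col }}) >= 8,
      hour({{ time_col }}) < 17
    )
}

# Toy data, not taken from the Axios schedule
schedule <- tibble(
  time_start = ymd_hm(c(
    "2018-11-07 07:30", "2018-11-07 08:00",
    "2018-11-07 16:55", "2018-11-07 17:00"
  ))
)

workday <- filter_workday(schedule)
workday
```

Defaulting time_col to time_start lets the function be dropped into a pipe as filter_workday(), while still allowing a different column, for example filter_workday(tweets, created_at).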
Finally, another helper function creates the plot caption credits.
credit_caption <- function(rtweet = FALSE) {
  paste0(
    "\nData: Based on White House schedules released by Axios. http://bit.ly/2UGM0fw",
    if (rtweet) "\n Tweets collected with {rtweet}. https://rtweet.info",
    "\nChart: @grrrck"
  )
}
Plot Themes
I used showtext and sysfonts to match the fonts in the plot to the font used on this blog (PT Mono).
sysfonts::font_add_google("PT Sans")
sysfonts::font_add_google("PT Sans Narrow")
sysfonts::font_add_google("PT Mono")
showtext::showtext_auto()
And I used hrbrthemes’ excellent theme_ipsum() as my theme’s starting point.
I created a few constants for later reference: one stores date breaks for the time period covered by the Axios schedule, with labels on each Monday; and the other two store the color palette and labels for the executive activity categories. The colors were hand-selected from a picture of the Donald (I was expecting there to be more orange).
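The constants aren't listed in this post. As a rough sketch, the date breaks could be built as below, assuming the breaks are stored as seconds since the epoch (the plot code converts dates with as.integer()) and that the schedule times are in UTC. The label format is a guess.

```r
# Mondays spanning the schedule period (Nov 7, 2018 through Feb 2, 2019)
mondays <- seq(as.Date("2018-11-05"), as.Date("2019-02-04"), by = "week")

# Breaks as integer seconds since 1970-01-01, at midnight UTC (assumed time zone)
plot_date_breaks <- as.integer(as.POSIXct(format(mondays), tz = "UTC"))

# Short labels such as "Nov 05" (assumed format; month names follow the locale)
plot_date_breaks_labels <- format(mondays, "%b %d")

head(data.frame(plot_date_breaks, plot_date_breaks_labels))
```

Storing the breaks as plain integers keeps them compatible with a numeric reversed scale such as scale_y_reverse().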
Finally, I pulled the schedule data and all of the above pieces together to build the interactive plot.
g <- exec_time %>%
  mutate(
    date = floor_date(time_start, "day"),
    date = as.integer(date)
  ) %>%
  filter_workday() %>%
  mutate_at(vars(time_start, time_end), in_hours) %>%
  ggplot() +
  geom_rect(
    aes(
      xmin = time_start,
      xmax = time_end,
      ymin = date - 3600 * 12,
      ymax = date + 3600 * 12,
      fill = top_category
    )
  ) +
  scale_x_continuous(
    breaks = seq(8, 17, 3),
    limits = c(8, 17),
    position = "top",
    labels = am_pm(seq(8, 17, 3)),
    expand = expansion(c(0.025, 0), 0)
  ) +
  scale_y_reverse(
    # trans = rev_date,
    breaks = plot_date_breaks,
    labels = plot_date_breaks_labels,
    expand = expansion(c(0.025, 0), 0)
  ) +
  scale_fill_manual(
    values = event_type_colors,
    labels = event_type_labels
  ) +
  labs(x = "Hour of the Day", y = NULL, fill = NULL) +
  ggtitle(
    "President Trump's Daily Schedule",
    "November 7, 2018 through February 2, 2019"
  ) +
  labs(caption = credit_caption(FALSE)) +
  guides(fill = guide_legend(nrow = 1, label.position = "bottom")) +
  theme(
    axis.text.y = element_text(family = "PT Mono", size = 10),
    plot.margin = margin(3, 0, 0, 0, unit = "line")
  )

plotly::ggplotly(g + aes(text = label), tooltip = "label") %>%
  plotly::layout(xaxis = list(side = "top", title = ""))
And that’s it! Jump back up to see the final product.
The President’s Daily Tweeting Schedule
Donald Trump sends about 5.1 tweets per working day within working hours. Why does this number feel so low? Then again, this represents 290 tweets published over 57 workdays and 128 tweets over the 24 weekend days in the same time period. Counting tweets sent outside of work hours as well, the average rises to 8.7 tweets per 24-hour workday, or a total of 546 tweets on workdays and 203 tweets on weekends.
Below, each tweet sent by the President is shown as a dot over his private schedule. Hover over a tweet’s dot to read the text of the tweet.
View static image of the plot. Expand the section below to learn more about how I gathered tweets from @realDonaldTrump, merged his tweets with the Axios schedules, and added them to the first plot.
Also, if you’re interested in exploring the timeline of tweets rendered as they appear on Twitter, Jonathan Sidi created an awesome Shiny app for exploring Trump’s tweets by category.
How This Was Made…
This visualization required the President’s tweets, which I downloaded using the excellent rtweet package. After matching the tweets with their corresponding activity, I modified the previous plot to soften the coloring of the presidential activities and added the tweets to the chart.
Download the President’s Tweets
Using the rtweet package makes downloading the tweets relatively straightforward. I only needed to add a loop to gather tweets beyond Twitter’s timeline API limits and ensure that I have all the tweets from the time period described in the leaked schedules.
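The loop isn't shown in this post, so here is a sketch of the general paging pattern in base R. The function fetch_page() is a hypothetical stand-in for a call such as rtweet::get_timeline() with its max_id argument, and the fake timeline below exists only so the example runs without API access.

```r
# Keep requesting older pages until the oldest tweet reaches the target date.
# `fetch_page(max_id)` is a hypothetical function returning a data frame with
# `status_id` and `created_at` columns, newest tweets first.
collect_tweets <- function(fetch_page, oldest_needed, max_pages = 20) {
  pages <- list()
  max_id <- NULL
  for (i in seq_len(max_pages)) {
    page <- fetch_page(max_id)
    if (nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    if (min(page$created_at) <= oldest_needed) break
    # Twitter treats max_id as inclusive, so the next page starts with
    # the oldest tweet seen so far; duplicates are dropped at the end.
    max_id <- page$status_id[which.min(page$created_at)]
  }
  all_tweets <- do.call(rbind, pages)
  all_tweets[!duplicated(all_tweets$status_id), ]
}

# Fake timeline of 10 tweets, returned at most 4 at a time
fake <- data.frame(
  status_id = as.character(10:1),
  created_at = as.Date("2019-02-10") - 0:9,
  stringsAsFactors = FALSE
)
fake_fetch <- function(max_id) {
  keep <- if (is.null(max_id)) {
    fake
  } else {
    fake[as.numeric(fake$status_id) <= as.numeric(max_id), ]
  }
  head(keep, 4)
}

tweets <- collect_tweets(fake_fetch, oldest_needed = as.Date("2019-02-04"))
nrow(tweets)
```

The max_pages cap is a safety net so that the loop still ends if the API keeps returning the same page.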
To align the tweets with the President’s schedule, I rounded (actually, floored) the time stamp of each tweet down to the nearest 5 minute interval. Then, I joined the tweets to the schedule using the 5 minute intervals I created while importing the schedule (see above). This gives me the category and scheduled activity of each tweet.
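As a small illustration of the flooring and joining step, here is a base R sketch. The two timestamps come from the tweet table in this post, but the toy schedule and its category are made up, and the helper name floor_5min() is an assumption. With lubridate, floor_date(x, "5 minutes") does the same flooring.

```r
# Floor a POSIXct vector down to the nearest 5-minute (300-second) mark
floor_5min <- function(x, tz = "UTC") {
  as.POSIXct(floor(as.numeric(x) / 300) * 300, origin = "1970-01-01", tz = tz)
}

tweets <- data.frame(
  created_at = as.POSIXct(
    c("2018-11-07 08:04:02", "2018-11-07 08:31:24"),
    tz = "UTC"
  ),
  text = c("tweet A", "tweet B"),
  stringsAsFactors = FALSE
)
tweets$time_inc <- floor_5min(tweets$created_at)

# Toy schedule with one row per 5-minute increment (category is illustrative)
schedule <- data.frame(
  time_inc = seq(
    as.POSIXct("2018-11-07 08:00:00", tz = "UTC"),
    by = "5 min",
    length.out = 12
  ),
  top_category = "Executive Time",
  stringsAsFactors = FALSE
)

# Keep every tweet, attaching the scheduled activity where one exists
joined <- merge(tweets, schedule, by = "time_inc", all.x = TRUE)
joined
```

Tweets that fall outside the schedule would keep a missing top_category, which is where the missing values in the joined data come from.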
At this point, there are a few versions of the tweets data set that I can use in different places. The first, djt_simple, is a basic, bare-bones tibble of tweets.
# A tibble: 758 × 4
created_at time_inc text statu…¹
<dttm> <dttm> <chr> <chr>
1 2018-11-07 01:27:01 2018-11-07 01:25:00 “There’s only been 5 times i… 106005…
2 2018-11-07 01:37:48 2018-11-07 01:35:00 ....unbelievably lucky to ha… 106005…
3 2018-11-07 01:49:40 2018-11-07 01:45:00 .@DavidAsmanfox “How do the… 106006…
4 2018-11-07 06:21:51 2018-11-07 06:20:00 Received so many Congratulat… 106013…
5 2018-11-07 06:55:35 2018-11-07 06:55:00 Ron DeSantis showed great co… 106013…
6 2018-11-07 07:07:51 2018-11-07 07:05:00 Those that worked with me in… 106014…
7 2018-11-07 07:36:28 2018-11-07 07:35:00 I will be doing a news confe… 106014…
8 2018-11-07 07:52:39 2018-11-07 07:50:00 To any of the pundits or tal… 106015…
9 2018-11-07 08:04:02 2018-11-07 08:00:00 If the Democrats think they … 106015…
10 2018-11-07 08:31:24 2018-11-07 08:30:00 In all fairness, Nancy Pelos… 106016…
# … with 748 more rows, and abbreviated variable name ¹status_id
The djt_joined_all variable holds the complete set of all tweets joined with the full schedule, meaning that there will be missing values where no tweets occurred in a 5 minute window or where the schedule data doesn’t cover a tweet.
# A tibble: 1,182 × 9
time_inc top_cat…¹ text event…² created_at liste…³ hour
<dttm> <fct> <chr> <int> <dttm> <chr> <dbl>
1 2018-10-24 08:35:00 <NA> Bria… NA 2018-10-24 08:35:15 <NA> 8.59
2 2018-11-02 09:45:00 <NA> Wow!… NA 2018-11-02 09:46:49 <NA> 9.78
3 2018-11-06 01:35:00 <NA> A fa… NA 2018-11-06 01:36:22 <NA> 1.61
4 2018-11-27 07:30:00 <NA> The … NA 2018-11-27 07:30:37 <NA> 7.51
5 2018-11-29 19:00:00 <NA> As R… NA 2018-11-29 19:04:23 <NA> 19.1
6 2018-12-15 11:15:00 <NA> The … NA 2018-12-15 11:15:38 <NA> 11.3
7 2018-12-22 20:45:00 <NA> Bret… NA 2018-12-22 20:48:23 <NA> 20.8
8 2018-12-23 11:55:00 <NA> I ju… NA 2018-12-23 11:59:22 <NA> 12.0
9 2019-01-13 22:15:00 <NA> ....… NA 2019-01-13 22:18:31 <NA> 22.3
10 2019-01-30 06:30:00 <NA> ....… NA 2019-01-30 06:34:31 <NA> 6.58
# … with 1,172 more rows, 2 more variables: date <dttm>, label <glue>, and
# abbreviated variable names ¹top_category, ²event_id, ³listed_title
And djt_workday contains the tweets within the period covered by the Axios schedules and during workday hours.
To build the second plot, I had to manually tweak the first plot to adjust the transparency of the geom_rect layer and then overlay the tweets as points.
# Modify geom_rect (only layer) to make it more transparent
g_subdued <- g
g_subdued$layers[[1]]$aes_params$alpha <- 0.6

# Add tweets to the plot
g_subdued <- g_subdued +
  ggtitle("President Trump's Daily Tweeting") +
  geom_point(
    data = djt_workday,
    aes(x = hour, y = as.integer(date + 3600), text = label),
    color = "#2c3741",
    size = 0.8
  )
At this point, the plot is almost ready to go, except for the fact that the tooltip text will appear for the underlying activity layers. Fortunately, once again StackOverflow comes to the rescue. To disable the tooltip, I have to save the plotly object and change the $hoverinfo value to "none" for each of the data layers of the activity categories.
gpltly <- plotly::ggplotly(g_subdued, tooltip = "label") %>%
  plotly::layout(xaxis = list(side = "top", title = ""))

# remove hover labels for time category layers (6 categories)
# thanks: https://stackoverflow.com/a/45802923/2022615
for (i in 1:6) {
  gpltly$x$data[[i]]$hoverinfo <- "none"
}
Head back to the visualization to view the final product and explore Trump’s delusional ranting tweets.
How much time is spent in Executive Time?
Looking at the above plots, it’s really striking how much time is unstructured Executive Time in Trump’s schedule. But how much of the day is spent in each activity group?
R code…
The first step is to calculate the total time as a percentage of the 8am to 5pm workday spent in each time category. This data frame will be used for several plots.
One tricky point is that there are pieces of the schedule that are explicitly marked as “Unknown” (or “no data”) in the Axios data, so I calculate the total percent of time spent in other categories and subtract this value from 1 to recover the complete unaccounted-for time.
exec_time_total <- exec_time %>%
  filter(between(hour(time_start), 8, 17), hour(time_start) < 17) %>%
  mutate(
    date = floor_date(time_start, "day"),
    n = difftime(time_end, time_start, units = "mins"),
    n = as.numeric(n)
  ) %>%
  group_by(date, top_category) %>%
  summarize(pct = sum(n) / (60 * 9)) %>%
  ungroup() %>%
  # Get unaccounted time for each date (unknown or unlabelled)
  nest(-date) %>%
  mutate(
    total = map_dbl(data, ~ {
      filter(., top_category != "Unknown") %>%
        pull(pct) %>%
        sum()
    }),
    unaccounted = 1 - total,
  ) %>%
  unnest() %>%
  spread(top_category, pct, fill = 0) %>%
  mutate(`Unknown` = unaccounted) %>%
  select(-total, -unaccounted) %>%
  gather("top_category", "pct", -date) %>%
  mutate(top_category = factor(top_category, rev(names(event_type_colors))))

exec_time_total %>%
  arrange(date, desc(pct))
The above tibble contains a summary of time use by day, but the first plot requires a full summary of all days in the schedule. The following code chunk calculates total hours spent in each group and creates a text label that is used to label the regions of the stacked bar in the plot.
exec_time_hours <- exec_time_total %>%
  # Only the days covered by the schedule
  filter(!(pct == 1 & top_category == "Unknown")) %>%
  group_by(top_category) %>%
  # pct is a share of the 9-hour workday, so pct * 9 gives hours
  summarize(hours = sum(pct * 9)) %>%
  arrange(desc(top_category)) %>%
  mutate(
    pct = hours / sum(hours),
    pct_upto = cumsum(pct),
    label = glue(
      "{top_category}\n",
      "{scales::percent(pct, accuracy = 1)}"
    ),
    label = if_else(top_category == "Unknown", "", paste(label))
  )

exec_time_hours
Finally, I create the plot using geom_col() to create a stacked bar chart, which I then rotate to be horizontal with coord_flip(). The bar labels are added as a text annotation, and I do a little adjustment to make sure the annotations fit in the plot and to hide the axes that aren’t relevant.
ggplot(exec_time_hours) +
  aes(
    x = 1,
    y = pct,
    fill = top_category
  ) +
  geom_col() +
  geom_text(
    aes(x = 0.35, y = pct_upto - pct / 2, label = label),
    color = "grey30",
    family = "PT Sans"
  ) +
  scale_fill_manual(
    values = event_type_colors,
    labels = rev(event_type_labels),
    guide = FALSE
  ) +
  scale_x_continuous(
    expand = expansion(0, c(0.2, 0))
  ) +
  scale_y_continuous(
    labels = scales::percent_format(accuracy = 10),
    expand = expansion(0, 0),
    position = "bottom"
  ) +
  coord_flip() +
  labs(
    x = NULL,
    y = "Percent of Workday Between 8am and 5pm",
    fill = NULL
  ) +
  ggtitle("What Does President Trump Do With His Time?") +
  labs(caption = credit_caption(FALSE)) +
  theme(
    axis.text.y = element_blank(),
    panel.grid.major = element_blank(),
    axis.ticks.x.top = element_line(color = "grey20")
  )
For 43 of the 51 workdays (that’s 84%) covered by the Axios schedules and for which there is schedule information, Trump spent 50% or more of his day in executive time.
In other words, there were only 8 days in about 10 work weeks where executive time was not the dominant activity.
When the above time-use summary is expanded into his daily schedule, it’s clear how unusual it is for Trump to spend a significant portion of his day in structured events.
R code…
exec_time_total %>%
  mutate(date = as.integer(date + 3600 * 12)) %>%
  ggplot() +
  aes(date, pct, fill = top_category) +
  geom_col(width = 3600 * 24) +
  scale_fill_manual(
    values = event_type_colors,
    labels = rev(event_type_labels)
  ) +
  scale_y_continuous(
    breaks = seq(0, 1, .25),
    labels = scales::percent_format(accuracy = 25),
    position = "right",
    expand = expansion(c(0.025, 0), 0)
  ) +
  scale_x_reverse(
    # trans = rev_date,
    breaks = plot_date_breaks,
    labels = plot_date_breaks_labels,
    expand = expansion(c(0.025, 0), 0)
  ) +
  coord_flip() +
  labs(
    x = NULL,
    y = "Percent of Workday Between 8am and 5pm",
    fill = NULL
  ) +
  guides(
    fill = guide_legend(nrow = 1, reverse = TRUE, label.position = "bottom")
  ) +
  ggtitle("What Does President Trump Do With His Time?") +
  labs(caption = credit_caption(FALSE))
In fact, the largest non-executive time blocks on the 8 days where executive time isn’t more than half of Trump’s workday are almost entirely travel related.
Date        Duration    Largest non-executive time block
(missing)   (missing)   Depart Washington, DC en route Buenos Aires, Argentina
2018-11-30  1.75 hours  G20 Leaders' dinner
2018-12-07  2.58 hours  Depart Washington, DC en route Kansas City, MO
2018-12-21  1.00 hours  Lunch
2019-01-10  4.17 hours  Depart Washington, DC en route McAllen, TX
2019-01-14  2.58 hours  Depart Washington, DC en route Kenner, LA
Travel seems to be the only activity capable of substantially affecting the amount of time the president spends on his executive time. My (completely speculative) guess is that this is in part due to travel being the only activity with a duration long enough to displace executive time, and also in part that travel probably most resembles executive time.
R code…
exec_time_total %>%
  # Drop "Unknown" time category, not that important
  filter(top_category != "Unknown") %>%
  # Spread top_category...
  spread(top_category, pct) %>%
  # ...and gather to leave Executive Time in own column
  gather(other_activity, pct, -date, -`Executive Time`) %>%
  # Ignore points where both groups are 0% (not informative)
  filter(pct + `Executive Time` > 0) %>%
  # Pipe into ggplot
  ggplot() +
  aes(`Executive Time`, pct, color = other_activity) +
  geom_point() +
  facet_wrap(~ other_activity, nrow = 1) +
  scale_x_continuous(
    breaks = c(0, 0.5, 1),
    labels = scales::percent_format(accuracy = 25),
    limits = c(0, 1)
  ) +
  scale_y_continuous(
    breaks = c(0, 0.5, 1),
    labels = scales::percent_format(accuracy = 25),
    limits = c(0, 1)
  ) +
  scale_color_manual(
    values = event_type_colors,
    labels = rev(event_type_labels),
    guide = FALSE
  ) +
  coord_flip() +
  labs(
    x = "Percent of Workday\nIn Executive Time",
    y = "Percent of Workday Spent in Activity",
    caption = credit_caption()
  ) +
  theme(
    axis.title.x = element_text(margin = margin(10)),
    axis.title.y = element_text(margin = margin(r = 20))
  )
Tweeter In Chief
At this point, I was very interested in exploring how Trump’s tweeting relates to his work schedule. The first question to answer is: when does he send most of his tweets? The answer: primarily on the weekends, during executive time, or before or after work hours.
R code…
# Start and end dates of Axios-published schedules
# which I called `exec_time` for some reason and am sticking with
exec_time_boundaries <- exec_time %>%
  summarize(min = min(time_start), max = max(time_end))

exec_time %>%
  # mutate(event_id = row_number()) %>%
  unnest() %>%
  filter(time_end != time_inc) %>%
  full_join(djt_simple, by = "time_inc") %>%
  select(event_id, time_inc, created_at, listed_title, top_category, text) %>%
  filter(
    !is.na(text),
    between(time_inc, exec_time_boundaries$min, exec_time_boundaries$max)
  ) %>%
  mutate(
    wday = wday(created_at, abbr = FALSE, week_start = 1),
    top_category = case_when(
      !is.na(top_category) ~ paste(top_category),
      wday > 5 ~ "Weekend",
      between(wday, 1, 5) & hour(created_at) < 6 ~ "Early Morning (before 6am)",
      between(wday, 1, 5) & hour(created_at) < 8 ~ "Morning (6-8 am)",
      between(wday, 1, 5) & hour(created_at) > 17 ~ "Evening (after 5pm)",
      is.na(top_category) ~ "Unknown",
      TRUE ~ paste(top_category)
    )
  ) %>%
  count(top_category) %>%
  arrange(n) %>%
  mutate(top_category = fct_inorder(top_category)) %>%
  ggplot() +
  aes(top_category, n) +
  geom_col(fill = "#445566") +
  scale_y_continuous(expand = c(0, 0, 0, 5)) +
  coord_flip() +
  theme(panel.grid.major.y = element_blank()) +
  labs(
    x = "Activity or Time of Day",
    y = paste(
      "Total Tweets Sent Between",
      strftime(exec_time_boundaries$min, "%F"),
      "to",
      strftime(exec_time_boundaries$max, "%F")
    ),
    title = "Trump Tweet Volume by Scheduled Activity",
    caption = credit_caption(rtweet = TRUE)
  )
We can get a sense of the timing of Trump’s tweeting activities by looking at the time of day of each tweet and the scheduled activity that’s going on at the time. The following plot shows each tweet as a vertical line and considers only workday tweeting and only for days covered by the Axios schedules.
ggplot(djt_joined_work_non_work) +
  aes(x = hour, y = 1, color = top_category) +
  geom_segment(aes(xend = hour, yend = 0), alpha = 0.6, size = 0.5) +
  facet_wrap(~ top_category, ncol = 1, strip.position = "left") +
  scale_x_continuous(
    position = "bottom",
    breaks = seq(0, 24, 4),
    limits = c(0, 24),
    labels = am_pm(seq(0, 24, 4)),
    expand = expansion(c(0.025, 0), 0)
  ) +
  scale_color_manual(
    values = event_type_colors_extra,
    labels = names(event_type_colors_extra)
  ) +
  scale_fill_manual(
    values = event_type_colors_extra,
    labels = names(event_type_colors_extra)
  ) +
  coord_cartesian(clip = "off") +
  labs(x = NULL, y = NULL, color = NULL) +
  guides(color = FALSE, fill = FALSE) +
  theme(
    panel.grid.major.y = element_blank(),
    axis.text.y = element_blank(),
    strip.text.y.left = element_text(
      angle = 0,
      margin = margin(r = 5, l = 25),
      hjust = 1
    ),
    panel.spacing.y = unit(0, "pt")
  ) +
  ggtitle(
    "What's On His Schedule When He's Tweeting?",
    "Each line represents a tweet, colored by the activity on his White House Schedule"
  ) +
  labs(caption = credit_caption(rtweet = TRUE))
Most of Trump’s tweeting happens between 7 and 9 am, but what’s striking is that it’s nearly impossible to tell the difference between early morning tweeting and the start of President Trump’s official workday at 8am.
R code…
djt_joined_work_non_work %>%
  filter(top_category %in% c("Non-Work Hours", "Executive Time")) %>%
  mutate(
    week = floor_date(created_at, "week"),
    week = strftime(week, "%F")
  ) %>%
  ggplot() +
  aes(x = hour, y = 1, color = top_category) +
  geom_segment(aes(xend = hour, yend = 0)) +
  facet_wrap(~ week, ncol = 1, strip.position = "left") +
  scale_x_continuous(
    position = "bottom",
    breaks = seq(0, 24, 2),
    limits = c(4, 12),
    labels = am_pm(seq(0, 24, 2)),
    expand = expansion(c(0.025, 0), 0)
  ) +
  scale_color_manual(
    values = event_type_colors_extra,
    labels = names(event_type_colors_extra)
  ) +
  scale_fill_manual(
    values = event_type_colors_extra,
    labels = names(event_type_colors_extra)
  ) +
  labs(x = NULL, y = NULL, color = NULL, caption = credit_caption(TRUE)) +
  guides(color = FALSE, fill = FALSE) +
  theme(
    panel.grid.major.y = element_blank(),
    axis.text.y = element_blank(),
    strip.text.y.left = element_text(angle = 0, margin = margin(r = 25)),
    panel.spacing.y = unit(0, "pt")
  ) +
  ggtitle(
    "When Do Official Work Hours Start?",
    "Morning tweets published over one week periods.\nAccording to the White House Schedule, \"Executive Time\" starts at 8am in the Oval Office."
  )
As we learned above, Trump sends about 5 tweets per working day within working hours. Naturally, I wondered if he tends to tweet more or less during the day when he has more executive or travel time available. Similarly, does he tweet less when he has more structured time, i.e. meetings, events, or lunches?
Somewhat unsurprisingly, the number of tweets sent during the workday is only very slightly correlated with the amount of unstructured time on Trump’s calendar. This makes sense: there is very little variation in the amount of the day spent in structured events, which is never more than half the day.
R code…
djt_joined %>%
  group_by(date) %>%
  count() %>%
  rename(tweets = n) %>%
  left_join(exec_time_total, ., by = "date") %>%
  replace_na(list(tweets = 0)) %>%
  filter(top_category %in% c("Executive Time", "Unknown", "Travel", "Lunch")) %>%
  spread(top_category, pct, fill = 0) %>%
  filter(!Unknown == 1) %>%
  mutate(pct = Unknown + `Executive Time` + Travel + Lunch) %>%
  ggplot() +
  aes(pct, tweets) +
  geom_smooth(
    method = "lm",
    color = event_type_colors["Executive Time"],
    fill = event_type_colors["Unknown"]
  ) +
  geom_point(color = event_type_colors["Executive Time"]) +
  scale_x_continuous(labels = scales::percent_format(10)) +
  labs(
    x = "Percent of Workday Dedicated to Downtime\n(Executive Time, Travel, Unknown)",
    y = "Number of Tweets",
    title = "Does Trump Tweet More When He Has More Downtime?",
    caption = credit_caption(rtweet = TRUE)
  )
R code…
djt_simple %>%
  mutate(date = floor_date(created_at, "day")) %>%
  group_by(date) %>%
  count() %>%
  rename(tweets = n) %>%
  left_join(exec_time_total, ., by = "date") %>%
  filter(top_category %in% c("Meeting", "Event", "Lunch")) %>%
  group_by(date) %>%
  summarize(pct = sum(pct), tweets = max(tweets)) %>%
  filter(pct > 0) %>%
  ggplot() +
  aes(pct, tweets) +
  geom_smooth(
    method = "lm",
    color = event_type_colors["Executive Time"],
    fill = event_type_colors["Unknown"]
  ) +
  geom_point(color = event_type_colors["Executive Time"]) +
  scale_x_continuous(labels = scales::percent_format(10)) +
  labs(
    x = "Percent of Workday Dedicated to Meetings, Lunches, or Events",
    y = "Number of Tweets",
    title = "Does Trump Tweet Less When He Does More \"Work\"?",
    caption = credit_caption(rtweet = TRUE)
  )
Finally, I wanted to explore the emotional valence of Trump’s day-time tweeting. Are his morning tweets angrier or more rant-driven? Are his event-related tweets more positive?
To this end, I ran Trump’s tweet text through the NRC sentiment dictionary using get_nrc_sentiment() from the syuzhet package. For each tweet, this function returns a count of the words associated with each of eight emotions, plus overall positive and negative sentiment.
# A tibble: 2,950 × 4
top_category text emotion value
<fct> <chr> <chr> <dbl>
1 Executive Time If the Democrats think they are going to waste … anger 1
2 Executive Time In all fairness, Nancy Pelosi deserves to be ch… anger 0
3 Executive Time According to NBC News, Voters Nationwide Disapp… anger 4
4 Executive Time We are pleased to announce that Matthew G. Whit… anger 1
5 Executive Time ....We thank Attorney General Jeff Sessions for… anger 1
6 Travel “Presidential Proclamation Addressing Mass Migr… anger 0
7 Travel .@BrianKempGA ran a great race in Georgia – he … anger 0
8 Travel You mean they are just now finding votes in Flo… anger 2
9 Travel As soon as Democrats sent their best Election s… anger 2
10 Travel Jeff Flake(y) doesn’t want to protect the Non-S… anger 2
# … with 2,940 more rows
# A tibble: 50 × 3
# Groups: top_category [5]
top_category emotion value
<fct> <fct> <dbl>
1 Executive Time anger 0.864
2 Executive Time anticipation 0.870
3 Executive Time disgust 0.525
4 Executive Time fear 0.772
5 Executive Time joy 0.654
6 Executive Time negative 1.36
7 Executive Time positive 2.01
8 Executive Time sadness 0.562
9 Executive Time surprise 0.537
10 Executive Time trust 1.52
# … with 40 more rows
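To make the two tables above more concrete, here is a toy sketch of dictionary-based scoring followed by averaging within each activity category. The two-emotion lexicon, the example texts, and the helper score_text() are all made up for illustration. This is not the NRC dictionary or the syuzhet implementation.

```r
# Toy lexicon: NOT the NRC dictionary, just two made-up word lists
toy_lexicon <- list(
  anger = c("waste", "fake", "witch"),
  joy   = c("great", "congratulations", "win")
)

# Count how many words of a text appear in each emotion's word list
score_text <- function(text, lexicon) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  vapply(lexicon, function(terms) sum(words %in% terms), numeric(1))
}

# Made-up texts and categories
tweets <- data.frame(
  top_category = c("Executive Time", "Executive Time", "Travel"),
  text = c("Great win, congratulations!", "Fake news is a waste", "A great race"),
  stringsAsFactors = FALSE
)

# One row per tweet, one column per emotion
scores <- t(vapply(
  tweets$text, score_text, numeric(2),
  lexicon = toy_lexicon, USE.NAMES = FALSE
))

# Long format: one row per tweet and emotion
long <- data.frame(
  top_category = rep(tweets$top_category, times = ncol(scores)),
  emotion = rep(colnames(scores), each = nrow(tweets)),
  value = as.vector(scores),
  stringsAsFactors = FALSE
)

# Average score per activity category and emotion
avg <- aggregate(value ~ top_category + emotion, data = long, FUN = mean)
avg
```

Because each score is a word count, longer tweets tend to score higher on every emotion, which is worth keeping in mind when comparing categories.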
emotions <- c(
  "positive", "joy", "trust", "surprise", "anticipation",
  "sadness", "anger", "fear", "disgust", "negative"
)

djt_sentiment %>%
  mutate(emotion = factor(emotion, rev(emotions))) %>%
  ggplot() +
  aes(y = value, x = emotion, fill = top_category) +
  # ggridges::geom_density_ridges() +
  geom_boxplot(alpha = 0.7, color = "grey20", outlier.shape = NA) +
  scale_fill_manual(values = event_type_colors) +
  guides(fill = FALSE, color = FALSE) +
  coord_flip() +
  facet_wrap(~top_category, scales = "free_y") +
  labs(
    y = "Sentiment Score",
    x = NULL,
    title = "Emotions Expressed in Trump Tweets",
    caption = credit_caption(rtweet = TRUE)
  )
The result provides something of a profile of Trump’s tweeting habits, but more analysis is needed to make sense of these sentiment values. I wanted to look further into how these tweets were categorized by the sentiment dictionary, but by this point this post is already far too long and has consumed too much of my evenings and weekends, so I’ll save it for another day.
Thanks for reading! I’d love to hear your thoughts or feedback. I’m @grrrck on Twitter.