***WHITE PAPER***

Abigail De Kosnik and Benjamin De Kosnik

23 July 2017

alpha60 Data and Analysis of BitTorrent Peers and Seeds

for HBO’s Game of Thrones Season 7 Episode 1

“alpha60” is a data science/computational humanities research project by Abigail De Kosnik (Associate Professor, University of California, Berkeley) and Benjamin De Kosnik (software engineer and artist). alpha60 is a data scraper that quantifies and maps BitTorrent traffic, that is, the number and location of downloaders (or “peers”) and of uploaders (or “seeds”), around specific digital media files (“torrents”). In other words, alpha60 is a “piracy ratings” system. Initial funding for this project was provided by the Berkeley Center for New Media through a Faculty Seed Grant.

During the week of Sunday 16 July through Saturday 22 July 2017, alpha60 tracked 72 torrents for the HBO fantasy series Game of Thrones (GoT), season 7, episode 1 (s07e01) (which HBO released on official broadcast and streaming platforms on 16 July). We sampled the total swarm of peers for this episode every four hours using three VPN locations (London, Sydney, and Toronto).
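
Because the swarm is sampled repeatedly (every four hours, from three VPN locations), the same peer can appear in more than one sample. This paper does not describe how alpha60 de-duplicates those observations, so the Python sketch below is only one plausible approach. It assumes a peer is identified by its IP address alone; the `count_unique_peers` helper and the sample data are hypothetical.

```python
from typing import Iterable


def count_unique_peers(samples: Iterable[Iterable[str]]) -> int:
    """Count distinct peers across many swarm snapshots.

    Assumption: a peer is identified by IP address alone. The paper does
    not state which identifier alpha60 actually uses.
    """
    seen = set()
    for snapshot in samples:
        seen.update(snapshot)
    return len(seen)


# Hypothetical snapshots, one per (VPN location, sampling time) pair.
# The addresses come from reserved documentation ranges, not real peers.
samples = [
    ["192.0.2.1", "192.0.2.2"],                    # e.g. London, first sample
    ["192.0.2.2", "198.51.100.7"],                 # e.g. Sydney, first sample
    ["192.0.2.1", "198.51.100.7", "203.0.113.9"],  # e.g. Toronto, second sample
]

unique_peers = count_unique_peers(samples)
print(unique_peers)
```

A set union of this kind counts each identifier once no matter how many samples or vantage points observed it, which is the property a "unique peers" figure needs.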

This white paper was written on 23 July 2017 and contains only preliminary findings.

Quant Ratings

The first set of alpha60 ratings that we will discuss are quantitative or “quant” ratings. Figure 1 (below) compares Nielsen ratings to alpha60 ratings for GoT s07e01.

Nielsen Same Day Live Viewers are the total number of viewers who watched the GoT s07e01 broadcast live. Nielsen Same Day Total Viewers are the total number of viewers who watched GoT s07e01 live, plus viewers who watched on DVR playback, plus viewers who watched on HBO’s official streaming platforms, HBO GO (a free streaming service available to HBO cable subscribers) and HBO NOW (a standalone subscription-based streaming service).

alpha60 considers the date of broadcast of an episode to be its “Release Day”; on that day, BitTorrent activity takes place over just a few evening hours, as the first torrents become available after the broadcast of the episode. Therefore, we consider the equivalent of Nielsen Same Day to be alpha60’s Release + 1, which comprises the first 24 hours after torrents are released for an episode. We call Release + 1 “Peak Day” because, across all of the network and cable series that we have analyzed so far, Release + 1 consistently shows the highest number of unique peers (downloads). Subsequent days of activity tracked by alpha60 are called “Release + 2,” “Release + 3,” and so on.
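
The day-naming convention above can be sketched in Python as follows. This is an illustration only: it assumes days are bucketed by calendar date relative to the broadcast date, as the dates in Figures 1 and 4 suggest, and it ignores time zones, which the paper does not specify. The `release_label` helper is hypothetical.

```python
from datetime import date

# Broadcast date of GoT s07e01, as stated in this paper.
RELEASE_DAY = date(2017, 7, 16)


def release_label(day: date, release_day: date = RELEASE_DAY) -> str:
    """Map a calendar date to "Release Day" or "Release + N".

    Assumption: bucketing is by calendar date; time zones are ignored.
    """
    offset = (day - release_day).days
    if offset < 0:
        raise ValueError("day precedes the release day")
    return "Release Day" if offset == 0 else f"Release + {offset}"


labels = [release_label(date(2017, 7, d)) for d in (16, 17, 20)]
print(labels)
```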

Nielsen Live + 3 is the total viewers on all platforms and services for the date of broadcast plus three days; alpha60 Release + 3 is the total peers for the release date plus three days.

Figure 1. Table showing GoT s07e01 Nielsen ratings[1] vs. alpha60 ratings.

| Metric | Value |
| --- | --- |
| Nielsen Same Day Live Viewers (Live Broadcast or “Linear” Viewers) (16 July 2017 only) | 10.1 million |
| Nielsen Same Day Total Viewers (Linear Viewers + Same Day DVR Playback + HBO GO/HBO NOW) (16 July 2017 only) | 16.1 million |
| alpha60 Release + 1 (“Peak Day”) Peers (day following torrent release: 17 July 2017) | 827,000 |
| alpha60 Release + 1 Peers as % of Nielsen Same Day Live Viewers | 8.19% |
| alpha60 Release + 1 Peers as % of Nielsen Same Day Total Viewers | 5.14% |
| Nielsen Live + 3 Viewers (broadcast date + 3 days: 16 July – 19 July 2017) | 12.2 million |
| alpha60 Cumulative Release + 2 Peers (Release Day + 2 days: 16 July – 18 July 2017) | 1.28 million (1,279,578) |
| alpha60 Cumulative Release + 4 Peers (Release Day + 4 days: 16 July – 20 July 2017) | 1.77 million (1,766,388) |
| alpha60 Release + 2 Peers as % of Nielsen Live + 3 Viewers | 10.49% |
| alpha60 Release + 4 Peers as % of Nielsen Live + 3 Viewers | 14.51% |

In Figure 1, we see that the alpha60 number of peers on Release + 1, or Peak Day, is 827,000, which is 8.19% of the Nielsen Same Day Live viewers count of 10.1 million, and 5.14% of the Nielsen Same Day Total Viewer count of 16.1 million.

The Nielsen Live + 3 count is 12.2 million viewers. (Nielsen’s Live + 3 is lower than Nielsen Same Day Total because it does not include the number of views on HBO GO/HBO NOW.[2]) When we compare this to the alpha60 Release + 2 count of 1.28 million peers and the Release + 4 count of 1.77 million peers, the percentage changes dramatically: the number of Release + 2 peers is 10.49% of the number of Live + 3 viewers, and the number of Release + 4 peers is 14.51% of the number of Live + 3 viewers. (We did not tally the cumulative number of unique peers at Release + 3; we are still adjusting the alpha60 system and have noted the need for Release + 3 cumulative numbers.) We will be eager to see whether this percentage increases when we compare Nielsen Live + 7 numbers to alpha60 Release + 7 numbers. If the Nielsen Live + 3 numbers are adjusted at a later date to account for streaming activity, then the alpha60 Release + 4 percentage could decrease.
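
The percentages quoted above are simple ratios of peers to viewers. The Python sketch below reproduces them from the rounded figures printed in Figure 1. Using the rounded values is an assumption about how the percentages were derived; substituting the exact cumulative count of 1,766,388 for 1.77 million would give roughly 14.48% rather than 14.51%. The `percent_of` helper is hypothetical.

```python
def percent_of(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Rounded figures as printed in Figure 1.
PEAK_DAY_PEERS = 827_000
RELEASE_PLUS_2_PEERS = 1_280_000   # 1.28 million (cumulative)
RELEASE_PLUS_4_PEERS = 1_770_000   # 1.77 million (cumulative)
NIELSEN_SAME_DAY_LIVE = 10_100_000
NIELSEN_SAME_DAY_TOTAL = 16_100_000
NIELSEN_LIVE_PLUS_3 = 12_200_000

ratios = {
    "Peak Day vs Same Day Live": percent_of(PEAK_DAY_PEERS, NIELSEN_SAME_DAY_LIVE),
    "Peak Day vs Same Day Total": percent_of(PEAK_DAY_PEERS, NIELSEN_SAME_DAY_TOTAL),
    "Release + 2 vs Live + 3": percent_of(RELEASE_PLUS_2_PEERS, NIELSEN_LIVE_PLUS_3),
    "Release + 4 vs Live + 3": percent_of(RELEASE_PLUS_4_PEERS, NIELSEN_LIVE_PLUS_3),
}

for name, value in ratios.items():
    print(f"{name}: {value}%")
```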

Figure 2. Comparison between Nielsen Live + 3 and alpha60 Release + 2 and Release + 4 in total number of viewers/peers gained and percentage increase in viewers/peers.

| Measure | Nielsen Live + 3 | alpha60 Release + 2 | alpha60 Release + 4 |
| --- | --- | --- | --- |
| Total # of viewers/peers | 12.2 million | 1.28 million (1,279,578) | 1.77 million (1,766,388) |
| Gain in # of viewers/peers | 2.08 million (2,081,000)[3] | 452,578 | 939,388 |
| % Gain in viewers/peers | 20.59% | 54.73% | 113.59% |

Figure 3. Visualization of the data in Figure 2 comparing the % gain in viewers/peers between Nielsen Live + 3, alpha60 Release + 2, and alpha60 Release + 4.

Figures 2 and 3 offer another way to understand the data in Figure 1. Nielsen ratings show a gain in viewers of 20.59% (roughly 21%) three days after broadcast. alpha60 ratings show a gain in peers of 54.73% two days after broadcast, and a gain in peers of 113.59% four days after broadcast. In other words, in the days following the initial broadcast of an episode, downloading grows at a faster rate than authorized viewing grows, at least in the case of GoT s07e01.
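
The gain figures can be checked with a short Python sketch. The baselines used here are inferred from the numbers rather than stated in the text: subtracting the Peak Day count of 827,000 from each cumulative alpha60 count yields exactly the gains listed in Figure 2, and the Nielsen gain is assumed to be measured against the Same Day Live count of 10.1 million. With that rounded Nielsen base the result is about 20.6%, close to the 20.59% in Figure 2; the small difference presumably comes from rounding of the base. The `percent_gain` helper is hypothetical.

```python
def percent_gain(gain: float, base: float) -> float:
    """Return `gain` as a percentage of the baseline `base`."""
    return round(100 * gain / base, 2)


# alpha60 counts from Figures 1 and 2.
PEAK_DAY_PEERS = 827_000          # assumed baseline (Release + 1)
CUMULATIVE_RELEASE_2 = 1_279_578
CUMULATIVE_RELEASE_4 = 1_766_388

gain_release_2 = CUMULATIVE_RELEASE_2 - PEAK_DAY_PEERS
gain_release_4 = CUMULATIVE_RELEASE_4 - PEAK_DAY_PEERS

# Nielsen figures from Figures 1 and 2.
NIELSEN_SAME_DAY_LIVE = 10_100_000  # assumed baseline, rounded
NIELSEN_GAIN = 2_081_000

pct_release_2 = percent_gain(gain_release_2, PEAK_DAY_PEERS)
pct_release_4 = percent_gain(gain_release_4, PEAK_DAY_PEERS)
pct_nielsen = percent_gain(NIELSEN_GAIN, NIELSEN_SAME_DAY_LIVE)

print(gain_release_2, pct_release_2)
print(gain_release_4, pct_release_4)
print(NIELSEN_GAIN, pct_nielsen)
```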

Figure 4. Table showing GoT s07e01 total unique peers (downloads) per day.

| Day | alpha60 Unique Peers on Single Day |
| --- | --- |
| Release Day (16 July 2017) | 283,011 |
| Release + 1 (“Peak Day”) (17 July 2017) | 827,000 |
| Release + 2 (18 July 2017) | 447,015 |
| Release + 3 (19 July 2017) | 351,534 |
| Release + 4 (20 July 2017) | 404,191 |

Figure 5. Visualization of data in Figure 4 (GoT s07e01 total unique peers (downloads) per day, Release Day through Release + 4).

Figure 4 gives the number of unique peers (downloads) counted by alpha60 on each day from Release Day through Release + 4. Figure 5 visualizes the data in Figure 4. It is clear that Release + 1 is, indeed, the Peak Day for downloads, as the number of downloads on that day exceeds the number on Release Day or any subsequent day. We also see the “long tail” of interest in GoT s07e01: peer activity does not taper off dramatically after Peak Day but remains strong. In fact, while the number of peers declines from Release + 2 (447,015) to Release + 3 (351,534), it rises again from Release + 3 to Release + 4 (404,191).
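
The observations above can be reproduced directly from the Figure 4 data. The following Python sketch finds the day with the most unique peers and the day-over-day change in peer counts; the variable names are illustrative only.

```python
# Unique peers per single day, from Figure 4.
daily_peers = {
    "Release Day": 283_011,
    "Release + 1": 827_000,
    "Release + 2": 447_015,
    "Release + 3": 351_534,
    "Release + 4": 404_191,
}

# The Peak Day is the day with the largest single-day count.
peak_day = max(daily_peers, key=daily_peers.get)

# Change in unique peers between each pair of consecutive days.
days = list(daily_peers)
changes = {
    f"{prev} -> {curr}": daily_peers[curr] - daily_peers[prev]
    for prev, curr in zip(days, days[1:])
}

print(peak_day)
for span, delta in changes.items():
    print(f"{span}: {delta:+,}")
```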

Figure 6. GoT Peak Day activity compared to Peak Day activity of other series

When we look at GoT s07e01’s Peak Day count (the day of maximum unique peers) of 827,000 peers (as seen in Figure 1 and Figure 4) beside the Peak Day counts of other television series that we have tracked with alpha60, we see that GoT s07e01 has, by far, the greatest volume of Peak Day activity. The Walking Dead (TWD) (AMC) s07e10 and s07e13 had slightly higher Nielsen Same Day Live ratings than GoT s07e01: TWD s07e10 had 11.1 million live viewers and TWD s07e13 had 10.7 million live viewers, compared to GoT s07e01’s 10.1 million live viewers.[4] However, GoT s07e01’s Peak Day showed approximately three times as much downloading activity as TWD s07e10’s and TWD s07e13’s Peak Days. Of all the television shows on which we have collected data with the alpha60 system, House of Cards (Netflix) s05 comes the closest to GoT s07e01 in Peak Day activity, but still falls short by approximately 327,000 downloads.

In other words, the peak pirate activity around GoT s07e01 far outstrips the peak pirate activity around other television series from the 2016-17 season, including series with comparable popularity and ratings such as TWD.

Geo Ratings

The second set of alpha60 ratings that we will discuss are geographic or “geo” ratings.

Figure 7. Animation showing geography of downloading activity for GoT s07e01 (Release Day through Release + 4, or 16 July through 20 July 2017). Available at: https://www.youtube.com/watch?v=ty-gQsOnjyQ.

 

As the Figure 7 animation shows, piracy of GoT takes place all over the world. The areas of downloading activity generally correspond to areas of population density. In other words, where there are people, there are pirates downloading GoT.

Figure 8. List of top 25 pirating cities for GoT s07e01

Rank City
1 Seoul, Rep. of Korea
2 Athens, Greece
3 São Paulo, Brazil
4 Guangzhou, China
5 Mumbai, India
6 Bangalore, India
7 Shanghai, China
8 Riyadh, Saudi Arabia
9 Delhi, India
10 Beijing, China
11 Pune, India
12 Cairo, Egypt
13 Dublin, Ireland
14 Chennai, India
15 Toronto, Canada
16 Kolkata, India
17 Rio De Janeiro, Brazil
18 Chengdu, China
19 Islamabad, Pakistan
20 Nanjing, China
21 Istanbul, Turkey
22 Dallas, USA
23 London, UK
24 Singapore
25 Amsterdam, Netherlands

A few observations we can make about the list of top 25 pirating cities for GoT s07e01:

  • The top 25 contains only four European cities (Athens, Dublin, London, and Amsterdam), and only two North American cities (Toronto and Dallas).
  • There are no Australian cities in the top 25, despite Australia’s often being mentioned as a heavily pirating nation.
  • Most of the top 25 pirating cities are in countries that rank outside the top 25 for Internet access, according to the 2015 IDI (ICT Development Index).[5] Brazil, Greece, China, India, Saudi Arabia, Egypt, Pakistan, and Turkey are all ranked lower than 25th in the world on the IDI, yet are among the top GoT downloading countries.
  • Some countries appear multiple times on the list: India has six cities in the top 25 (Mumbai, Bangalore, Delhi, Pune, Chennai, Kolkata), China has five (Guangzhou, Shanghai, Beijing, Chengdu, and Nanjing), and Brazil has two (São Paulo and Rio).

Figure 9. List of top 25 “overpirating” cities (peer activity normalized for population) for GoT s07e01.

Rank City
1 Dallas, USA
2 Brisbane, Australia
3 Chicago, USA
4 Riyadh, Saudi Arabia
5 Seattle, USA
6 Perth, Australia
7 Phoenix, USA
8 Toronto, Canada
9 Athens, Greece
10 Guangzhou, China
11 Seoul, Rep. of Korea
12 Melbourne, Australia
13 Houston, USA
14 Dubai, UAE
15 Dublin, Ireland
16 Sydney, Australia
17 Amsterdam, Netherlands
18 Shanghai, China
19 Auckland, New Zealand
20 Beijing, China
21 São Paulo, Brazil
22 Atlanta, USA
23 Nanjing, China
24 Bangalore, India
25 Belgrade, Serbia

We were able to normalize peer activity by population, and generate the list of top 25 “overpirating” cities for GoT s07e01 (Figure 9); in other words, this list ranks cities by how much the city’s volume of piracy is disproportionate to its population size.

  • In the top 25 overpirating cities, the USA makes a much stronger showing, with six cities on the list (Dallas, Chicago, Seattle, Phoenix, Houston, Atlanta).
  • Australia also shows up multiple times in the overpirating top 25, with four cities (Brisbane, Perth, Melbourne, Sydney).
  • India, which featured six cities on the top 25 overall pirating list, only has one (Bangalore) on the top 25 overpirating list.
  • Cities outside the U.S. and Australia that appear in the top 25 for overpirating but not in the top 25 overall are: Dubai, UAE; Auckland, New Zealand; and Belgrade, Serbia.
  • Cities that appear on both the top 25 overall list and the top 25 overpirating list are: Dallas, USA; Riyadh, Saudi Arabia; Toronto, Canada; Athens, Greece; Guangzhou, China; Seoul, Rep. of Korea; Dublin, Ireland; Amsterdam, Netherlands; Shanghai, China; Beijing, China; São Paulo, Brazil; Nanjing, China; and Bangalore, India.
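
This paper does not spell out the normalization used for Figure 9. One simple reading, sketched below in Python, is to divide each city's peer count by its population and rank by the result. The `rank_overpirating` helper is hypothetical, and the city names and numbers are placeholder values chosen for illustration, not alpha60 data.

```python
from typing import Dict, List, Tuple


def rank_overpirating(cities: Dict[str, Tuple[int, int]]) -> List[str]:
    """Rank cities by peers per capita, highest first.

    `cities` maps a city name to (peer count, population).
    Assumption: normalization is a plain peers / population ratio.
    """
    per_capita = {
        name: peers / population
        for name, (peers, population) in cities.items()
    }
    return sorted(per_capita, key=per_capita.get, reverse=True)


# Entirely hypothetical placeholder values, NOT alpha60 measurements.
cities = {
    "City A": (50_000, 10_000_000),  # most peers, largest population
    "City B": (20_000, 1_000_000),   # fewest peers, smallest population
    "City C": (30_000, 5_000_000),
}

ranking = rank_overpirating(cities)
print(ranking)
```

In this toy data, the city with the fewest peers ranks first once population is taken into account, which mirrors how the overall list (Figure 8) and the overpirating list (Figure 9) can diverge.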

We cannot determine to what extent either ranking (either the top 25 overall pirating list in Figure 8 or the top 25 overpirating list in Figure 9) is affected by geospoofing, that is, by users located outside of these cities using Virtual Private Networks to disguise their actual locations when they pirate media. Overpirating cities, especially, could be determined in large part by VPN subscribers using these cities as masks for their real locations.

For more information about the findings presented here or about the alpha60 research project, please email a.dekosnik@gmail.com.

[1] Sources for Nielsen and Nielsen Social Ratings: http://www.vulture.com/2017/07/game-of-throness-season-7-ratings-are-off-to-a-huge-start.html; http://tvbythenumbers.zap2it.com/daily-ratings/game-of-thrones-scores-series-best-audience-with-season-7-premiere/; http://www.multichannel.com/news/content/game-thrones-draws-massive-social-media-chatter/414018; http://www.nielsensocial.com/socialcontentratings/weekly/#SeriesSpecials; http://tvbythenumbers.zap2it.com/dvr-ratings/cable-live-3-ratings-for-july-10-16-2017/.

[2] http://www.spottedratings.com/2013/09/intro-to-nielsen-ratings-basics-and.html.

[3] http://tvbythenumbers.zap2it.com/dvr-ratings/cable-live-3-ratings-for-july-10-16-2017/.

[4] TWD ratings from http://tvbythenumbers.zap2it.com/daily-ratings/sunday-cable-ratings-feb-19-2017/ and http://tvbythenumbers.zap2it.com/daily-ratings/sunday-cable-ratings-march-12-2017/.

[5] http://www.itu.int/net4/ITU-D/idi/2015/.