Extracting YouTube Data With Python Using API

YouTube makes people go viral, and the platform itself has gone viral. It is often described as the second-largest search engine after Google.

Started in 2005, the platform now has over 2 billion monthly active users. It has helped content creators gain exposure and earn revenue.

In this article, we will use the YouTube API from Python. Doing so lets us extract data from YouTube that can be used in many projects.

Don’t worry if some of these terms are unfamiliar for now. They will all be explained as we move through the article. So, let us get started.

What is an API?

API is short for Application Programming Interface. It is a software interface that gives programs a well-defined way to communicate, for example a client with a server.

Developers often use APIs to build client-server applications. The client sends some data, and the API returns the requested information from the backend. APIs are not limited to returning information; they can also perform operations.

Here we will use the YouTube API, which is provided officially by YouTube itself. It lets developers retrieve various attributes of the channel or video they specify.
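To make the client-server idea concrete, here is a minimal sketch of the kind of request a client sends to the YouTube API. It only builds the request URL and does not contact the server. The helper name build_channel_request is our own, and "YOUR_API_KEY" is a placeholder rather than a working key.

```python
from urllib.parse import urlencode

# Endpoint of the YouTube Data API v3 for channel resources
BASE_URL = "https://www.googleapis.com/youtube/v3/channels"


def build_channel_request(channel_id, api_key, part="statistics"):
    """Build the URL a client would send to ask for one channel's data."""
    query = urlencode({"part": part, "id": channel_id, "key": api_key})
    return f"{BASE_URL}?{query}"


# "YOUR_API_KEY" is a placeholder, not a working key.
print(build_channel_request("UCr2dD3s19bdcw4qjuUTQKiQ", "YOUR_API_KEY"))
```

The Google API Client used later in this article builds and sends requests like this for us, so we never have to assemble the URL by hand.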

Suppose you wanted to build an application that shows the most liked video of a channel entered by the user. The API lets you retrieve that information in no time.

What an API can return depends on the platform and its owner. Let us now see what we can extract with the YouTube API.

Things You Can Extract From the YouTube API

There is plenty of data that you can extract from the YouTube API using Python. Some of the important attributes are listed below:

Channel’s Statistics: Returns important statistical information about the specified channel.

No. of Videos: The total number of videos uploaded by that YouTube channel.

Total Views: The total number of views across the channel (the viewCount field). Note that this is a view count, not watch time in minutes.

Total No. of Subscribers: As the name suggests, the number of subscribers of the channel.

Snippet: Lets you fetch basic details from the channel’s data, such as its title and description.

Logo: A link to the logo used by the channel, provided as a thumbnail image.

Content Details: Leads you to the channel’s uploads playlist, from which information about each video, such as its like count, can be collected. (Dislike counts are no longer public and are returned only to the video’s owner.)

There is a lot more that the YouTube API lets you extract. Now we will look at some Python code samples that we used to extract important information.

But before that, we will set up our machine and its environment. We used Jupyter Notebook to run this code, but you can use any other Python IDE.

Working on YouTube API with Python

1. Installation of Google API Client

The Google API Client provides the build function that we call below, so we need to install it first. We have provided three commands below for different setups.

For Windows:

pip install google-api-python-client

For Ubuntu:

sudo pip install google-api-python-client

For Anaconda:

conda install google-api-python-client
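To confirm that the installation worked, a small check like the one below can be run before continuing. This is only a sketch; the helper name is_installed is our own. The one fact it relies on is that google-api-python-client is imported under the module name googleapiclient.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# google-api-python-client is imported under the name "googleapiclient".
if is_installed("googleapiclient"):
    print("Google API Client is available")
else:
    print("Google API Client is missing; install it with one of the commands above")
```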

2. Importing Libraries

We need to import the build function to work with the YouTube API. Import it using the code below:

from googleapiclient.discovery import build

3. Creating Object

We will create an object to access YouTube data. To create it you need an API key, which you can get from here: Get API Key

After getting the API Key use the following code to create an object:

youTubeApiKey='YOUR_API_KEY'  # replace with your own API key (it must be a string)
youtube=build('youtube','v3',developerKey=youTubeApiKey)
channelId='UCr2dD3s19bdcw4qjuUTQKiQ'
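Hard-coding the key is fine for a quick test, but it is easy to leak it when sharing a notebook. One common alternative, sketched below, is to read the key from an environment variable. The variable name YOUTUBE_API_KEY and the helper load_api_key are our own choices for illustration, not something the API requires.

```python
import os


def load_api_key(env_var="YOUTUBE_API_KEY"):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

With the variable set, the object can be created with build('youtube','v3',developerKey=load_api_key()).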

Here we have used the channel ID of our YouTube channel, Rajni Sharma Maths Classes. You can use any channel to extract the data.

To get the Channel ID, just go to any YouTube channel and check the URL. You will find the Channel ID there:

Example: https://www.youtube.com/channel/UCr2dD3s19bdcw4qjuUTQKiQ

For the above YouTube URL, the Channel ID is UCr2dD3s19bdcw4qjuUTQKiQ 
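If you have many channel URLs, the ID can also be pulled out in code. The sketch below handles only URLs of the /channel/<id> form shown above; the helper name channel_id_from_url is our own. URLs such as /user/<name> contain a name rather than a channel ID, so the function returns None for them.

```python
from urllib.parse import urlparse


def channel_id_from_url(url):
    """Return the channel ID from a /channel/<id> URL, or None otherwise."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "channel":
        return parts[1]
    # URLs such as /user/<name> carry a name, not a channel ID.
    return None


print(channel_id_from_url("https://www.youtube.com/channel/UCr2dD3s19bdcw4qjuUTQKiQ"))
```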

4. Getting Statistics from YouTube API

The statistics include the channel’s view, subscriber, and video counts. We can get the channel statistics with the following code:

statdata=youtube.channels().list(part='statistics',id=channelId).execute()
stats=statdata['items'][0]['statistics']
stats

This returns a dictionary whose values we extract one by one, as shown below.

Output Screen:

Statistics With YouTube API

a) Total Number of Videos

videoCount=stats['videoCount']
videoCount

b) Total Views

viewCount=stats['viewCount']
viewCount

c) Total Number of Subscribers

subscriberCount=stats['subscriberCount']
subscriberCount

Output Screen:

Extracting Data from YouTube
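The counts above come back as strings, so they need converting before any arithmetic. The sketch below shows one way to do that on a made-up response that has the same shape as statdata. The numbers are invented, and the helper name parse_channel_stats is our own. It also guards against two cases that would otherwise raise a KeyError: a response with no items (when no channel matches the ID) and a missing subscriberCount (which we assume can happen when a channel hides its subscribers).

```python
# Hypothetical response in the shape returned by channels().list(part='statistics').
sample_statdata = {
    "items": [
        {
            "statistics": {
                "viewCount": "1234567",
                "subscriberCount": "8900",
                "hiddenSubscriberCount": False,
                "videoCount": "321",
            }
        }
    ]
}


def parse_channel_stats(statdata):
    """Convert the string counts of the first channel to integers."""
    items = statdata.get("items", [])
    if not items:
        return None  # no channel matched the given ID
    raw = items[0]["statistics"]
    return {
        "videos": int(raw.get("videoCount", 0)),
        "views": int(raw.get("viewCount", 0)),
        # subscriberCount may be absent when the channel hides it
        "subscribers": int(raw["subscriberCount"]) if "subscriberCount" in raw else None,
    }


print(parse_channel_stats(sample_statdata))
```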

5. Getting Snippet

Just like the statistics, the snippet contains various important information. We will first fetch it as a dictionary using the YouTube API with Python. Then we will extract the individual fields.

snippetdata=youtube.channels().list(part='snippet',id=channelId).execute()
snippetdata 

a) Title of YouTube Channel

title=snippetdata['items'][0]['snippet']['title']
title

The title is the name of the YouTube channel whose Channel ID you supplied.

b) YouTube Channel’s Description

description=snippetdata['items'][0]['snippet']['description']
description

The description includes the information that the channel owner has provided.

c) YouTube Channel’s Logo

logo=snippetdata['items'][0]['snippet']['thumbnails']['default']['url']
logo

The code above fetches the logo used by the channel owner. It gives you a link to the logo image rather than the image itself.

Output Screen:

Showing Data
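The 'default' key used above is one of several thumbnail sizes. If a larger logo is needed, a sketch like the following picks the widest one that is present. The sample dictionary and its URLs are placeholders, and the helper name largest_thumbnail_url is our own.

```python
# Hypothetical thumbnails dictionary; the URLs are placeholders.
sample_thumbnails = {
    "default": {"url": "https://example.com/logo_88.jpg", "width": 88, "height": 88},
    "medium": {"url": "https://example.com/logo_240.jpg", "width": 240, "height": 240},
    "high": {"url": "https://example.com/logo_800.jpg", "width": 800, "height": 800},
}


def largest_thumbnail_url(thumbnails):
    """Return the URL of the widest available thumbnail, or None if empty."""
    if not thumbnails:
        return None
    best = max(thumbnails.values(), key=lambda t: t.get("width", 0))
    return best["url"]


print(largest_thumbnail_url(sample_thumbnails))
```

With the real response, the dictionary to pass in is snippetdata['items'][0]['snippet']['thumbnails'].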

6. Getting Content Details

This is the most interesting part, and one of our favorites, of using the YouTube API. With the help of this option, we can collect information about all the videos of a specific channel.

We built a project in the past that showcased the most liked, most disliked, and most commented videos. The YouTube API was used to extract the data, and the code was written in Python.

Step 1 – Getting All the Video Details

# Look up the playlist that holds every upload of the channel
contentdata = youtube.channels().list(id=channelId, part='contentDetails').execute()
playlist_id = contentdata['items'][0]['contentDetails']['relatedPlaylists']['uploads']
videos = []
next_page_token = None

# Request up to 50 items per page until no further page token is returned
while True:
    res = youtube.playlistItems().list(playlistId=playlist_id,
                                       part='snippet',
                                       maxResults=50,
                                       pageToken=next_page_token).execute()
    videos += res['items']
    next_page_token = res.get('nextPageToken')

    if next_page_token is None:
        break

print(videos)

Output Screen:

Other usecases for API
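The loop in Step 1 follows a general pattern: keep requesting pages and passing back the nextPageToken until the response no longer contains one. The sketch below isolates that pattern so it can be tried without an API key. The helper collect_all_items and the fake two-page data are our own, for illustration only.

```python
def collect_all_items(fetch_page):
    """Call fetch_page(token) repeatedly until no nextPageToken is returned."""
    items = []
    token = None
    while True:
        res = fetch_page(token)
        items += res.get("items", [])
        token = res.get("nextPageToken")
        if token is None:
            break
    return items


# Fake two-page response used only to exercise the loop offline.
fake_pages = {
    None: {"items": ["video-1", "video-2"], "nextPageToken": "PAGE2"},
    "PAGE2": {"items": ["video-3"]},
}

print(collect_all_items(lambda token: fake_pages[token]))
```

With the real client, fetch_page would be a function that calls youtube.playlistItems().list(...) with pageToken=token and returns the result of execute().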

Step 2: Getting the Video ID for Each Video

video_ids = list(map(lambda x:x['snippet']['resourceId']['videoId'], videos))
video_ids

Step 3: Getting Statistics for Each Video

stats = []
# Request the statistics in batches of 40 video IDs per call
for i in range(0, len(video_ids), 40):
    res = youtube.videos().list(id=','.join(video_ids[i:i+40]), part='statistics').execute()
    stats += res['items']
print(stats)
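The video IDs are sent in batches because a single videos().list call accepts only a limited number of IDs (up to 50, as far as we know; the code above uses 40 to stay under it). The batching itself can be written as a small reusable helper, sketched below. The name chunked and the fake IDs are our own.

```python
def chunked(ids, size=50):
    """Yield consecutive slices of ids with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Hypothetical IDs, only to show the batch sizes produced.
fake_ids = [f"id{n}" for n in range(120)]
print([len(batch) for batch in chunked(fake_ids)])
```

Each batch can then be joined with ','.join(batch) and passed as the id parameter, exactly as in Step 3.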

Step 4: Collecting All the Information in a List:

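title = []
liked = []
disliked = []
views = []
url = []
comment = []

# Index the statistics by video ID so they stay aligned with `videos`
# even if the API left out some videos (for example private ones).
stats_by_id = {item['id']: item['statistics'] for item in stats}

for video in videos:
    video_id = video['snippet']['resourceId']['videoId']
    video_stats = stats_by_id.get(video_id, {})
    title.append(video['snippet']['title'])
    url.append("https://www.youtube.com/watch?v=" + video_id)
    # Some counts can be missing (hidden likes, disabled comments, and
    # dislikeCount is no longer public), so fall back to 0.
    liked.append(int(video_stats.get('likeCount', 0)))
    disliked.append(int(video_stats.get('dislikeCount', 0)))
    views.append(int(video_stats.get('viewCount', 0)))
    comment.append(int(video_stats.get('commentCount', 0)))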

Output Screen:
 

Further Details
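Once the lists are filled, rankings such as "most liked" or "most viewed" follow from a simple sort. The sketch below shows the idea on made-up data in the same shape as the title and liked lists. The helper name top_n and the sample values are our own.

```python
def top_n(titles, counts, n=3):
    """Return the n (title, count) pairs with the highest counts."""
    pairs = zip(titles, counts)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:n]


# Hypothetical data in the same shape as the title and liked lists.
sample_titles = ["Video A", "Video B", "Video C", "Video D"]
sample_liked = [10, 250, 75, 40]

print(top_n(sample_titles, sample_liked, n=2))
```

Calling top_n(title, views) or top_n(title, comment) on the real lists gives the most viewed and most commented videos in the same way.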

Step 5: Creating a Dataframe for the Collected Data

This step is not necessary, but it organizes the data in a better way. You will need to install the pandas library using the command given below (Windows):

pip install pandas

Now use the following code to create a DataFrame with the Python pandas library:

import pandas as pd
data={'title':title,'url':url,'liked':liked,'disliked':disliked,'views':views,'comment':comment}
df=pd.DataFrame(data)
df

Output Screen:

List of Videos and Their Stats from API

So here we have created a DataFrame to organize that information in a better way. This marks the end of this article on Extracting YouTube Data with Python Using the YouTube API.

Here is the GitHub link for the code.

We hope you liked the article. Do comment with your views on it, or if you have any doubts.

6 thoughts on “Extracting YouTube Data With Python Using API”

  1. I’m working with the API and I’ve come across an issue: when a channel has a name instead of a channel ID (example: user/irene9894), it returns a “Key Error”. I don’t understand why this is happening. Can it be solved?

  2. —————————————————————————
    NameError Traceback (most recent call last)
    in
    7 for i in range(len(videos)):
    8 title.append((videos[i])[‘snippet’][‘title’])
    —-> 9 url.append(“https://www.youtube.com/watch?v=”+(all_videos[i])[‘snippet’][‘resourceId’][‘videoId’])
    10 liked.append(int((stats[i])[‘statistics’][‘likeCount’]))
    11 disliked.append(int((stats[i])[‘statistics’][‘dislikeCount’]))

    NameError: name ‘all_videos’ is not defined

    Why?

  3. It means the variable “all_videos” is not defined. You just have to replace it with the correct one. If you copy/paste the code, it’s “videos”.

  4. #Get statistics for each video
    stats = []
    for i in range(0, len(video_ids), 40):
        res = (youtube).videos().list(
            id=','.join(video_ids[i:i+40]),
            part='statistics'
        ).execute()
        stats += res['items']

    title, views, comments = [], [], []

    for i in range(len(videos)):
        title.append((videos[i])['snippet']['title'])
        views.append(int((stats[i])['statistics']['viewCount']))
        comments.append(int((stats[i])['statistics']['commentCount']))

    I’m getting a key error for 'viewCount', any idea why?

  5. playlist_id = contentdata['items'][0]['contentDetails']['relatedPlaylists']['uploads']
    videos = []
    next_page_token = None

    while 1:
        res = youtube.playlistItems().list(
            playlistId=playlist_id,
            part='snippet',
            maxResults=50,
            pageToken=next_page_token
        ).execute()

        videos += res['items']
        next_page_token = res.get('nextPageToken')

        if next_page_token is None:
            break

    print(videos)

    #Get video ID for each video
    video_ids = list(map(lambda x:x['snippet']['resourceId']['videoId'], videos))

    #Get statistics for each video
    stats = []
    for i in range(0, len(video_ids), 40):
        res = youtube.videos().list(
            id=','.join(video_ids[i:i+40]),
            part='statistics'
        ).execute()
        stats += res['items']

    title, views, comments = [], [], []

    for i in range(len(videos)):
        title.append((videos[i])['snippet']['title'])
        views.append(int((stats[i])['statistics']['viewCount']))
        comments.append(int((stats[i])['statistics']['commentCount']))

    Getting a key error for viewCount, any idea why?

  6. I need to get data for only one particular YouTube video. I have the video link with me. How can I get data for just that one video and not for the whole channel?
