View the Project on GitHub ianubhavsisodia/Top_India_Youtubers_2024
📂We are pulling data from Kaggle, exploring and analysing it in Excel, cleaning and testing it in SQL, and visualizing it in Power BI.
To discover the top performing INDIA YouTubers to form marketing collaborations with throughout the year 2024.
To create a dashboard that provides insights into the top India YouTubers in 2024 that includes their
This will help the marketing team make informed decisions about which YouTubers to collaborate with for their marketing campaigns.
• The Head of Marketing has to find the top YouTube channels in the INDIA to run marketing campaigns with difficult. • She has performed online research but constantly bumps into overly complicated and conflicting insights • She has also held calls with different third-party providers, but they are all expensive options for underwhelming results • The BI reporting team lack the bandwidth to assist her with this assignment
We need data on the top UK YouTubers in 2024 that includes their
total videos uploaded
To understand what it should contain, we need to figure out what questions we need the dashboard to answer:
For now, these are some of the questions we need to answer, this may change as we progress down our analysis.
Some of the data visuals that may be appropriate in answering our questions include:
Tool | Purpose |
---|---|
Excel | Exploring the data |
SQL Server | Cleaning, testing, and analyzing the data |
Power BI | Visualizing the data via interactive dashboards |
GitHub | Hosting the project documentation and version control |
Mokkup AI | Designing the wireframe/mockup of the dashboard |
This is the stage where you have a scan of what’s in the data, anamolies, errors, inconcsistencies, missing data, etc
The aim is to refine our dataset to ensure it is structured and ready for analysis.
The cleaned data should meet the following criteria and constraints:
Below is a table outlining the constraints on our cleaned dataset:
Property | Description |
---|---|
Number of Rows | 100 |
Number of Columns | 4 |
And here is a tabular representation of the expected schema for the clean data:
Column Name | Data Type | Nullable |
---|---|---|
channel_name | VARCHAR | NO |
total_subscribers | INTEGER | NO |
total_views | INTEGER | NO |
total_videos | INTEGER | NO |
/*
# 1. Select the required columns
# 2. Extract the channel name from the 'NAME' column
*/
SELECT
CAST(SUBSTRING(NAME,1, Charindex('@',NAME)-1) as varchar(100)) AS channel_name,
total_subscribers,
total_views,
total_videos
FROM
top_india_youtubers_2024;
/*
# 1. Create a view to store the transformed data
# 2. Cast the extracted channel name as VARCHAR(100)
# 3. Select the required columns from the top_uk_youtubers_2024 SQL table
*/
-- 1.
CREATE VIEW view_india_youtubers_2024 AS
-- 2.
SELECT
CAST(SUBSTRING(NAME, 1, CHARINDEX('@', NAME) -1) AS VARCHAR(100)) AS channel_name, -- 2.
total_subscribers,
total_views,
total_videos
-- 3.
FROM
top_india_youtubers_2024;
Here are the data quality tests conducted:
/*
# Count the total number of records (or rows) are in the SQL view
*/
SELECT
COUNT(*) AS no_of_rows
FROM
view_india_youtubers_2024;
/*
# Count the total number of columns (or fields) are in the SQL view
*/
SELECT
COUNT(*) AS column_count
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_NAME = 'view_india_youtubers_2024'
/*
# Check the data types of each column from the view by checking the INFORMATION SCHEMA view
*/
-- 1.
SELECT
COLUMN_NAME,
DATA_TYPE
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_NAME = 'view_india_youtubers_2024';
/*
# 1. Check for duplicate rows in the view
# 2. Group by the channel name
# 3. Filter for groups with more than one row
*/
-- 1.
SELECT
channel_name,
COUNT(*) AS duplicate_count
FROM
view_india_youtubers_2024
-- 2.
GROUP BY
channel_name
-- 3.
HAVING
COUNT(*) > 1;
This shows the Top India Youtubers in 2024 so far.
Total Subscribers (M) =
VAR million = 1000000
VAR sumOfSubscribers = SUM(view_india_youtubers_2024[total_subscribers])
VAR totalSubscribers = DIVIDE(sumOfSubscribers,million)
RETURN totalSubscribers
Total Views (B) =
VAR billion = 1000000000
VAR sumOfTotalViews = SUM(view_india_youtubers_2024[total_views])
VAR totalViews = ROUND(sumOfTotalViews / billion, 2)
RETURN totalViews
Total Videos =
VAR totalVideos = SUM(view_india_youtubers_2024[total_videos])
RETURN totalVideos
Average Views per Video (M) =
VAR sumOfTotalViews = SUM(view_india_youtubers_2024[total_views])
VAR sumOfTotalVideos = SUM(view_india_youtubers_2024[total_videos])
VAR avgViewsPerVideo = DIVIDE(sumOfTotalViews,sumOfTotalVideos, BLANK())
VAR finalAvgViewsPerVideo = DIVIDE(avgViewsPerVideo, 1000000, BLANK())
RETURN finalAvgViewsPerVideo
Subscriber Engagement Rate =
VAR sumOfTotalSubscribers = SUM(view_india_youtubers_2024[total_subscribers])
VAR sumOfTotalVideos = SUM(view_india_youtubers_2024[total_videos])
VAR subscriberEngRate = DIVIDE(sumOfTotalSubscribers, sumOfTotalVideos, BLANK())
RETURN subscriberEngRate
Views Per Subscriber =
VAR sumOfTotalViews = SUM(view_india_youtubers_2024[total_views])
VAR sumOfTotalSubscribers = SUM(view_india_youtubers_2024[total_subscribers])
VAR viewsPerSubscriber = DIVIDE(sumOfTotalViews, sumOfTotalSubscribers, BLANK())
RETURN viewsPerSubscriber
For this analysis, we’re going to focus on the questions below to get the information we need for our marketing client -
Here are the key questions we need to answer for our marketing client:
Rank | Channel Name | Subscribers (M) |
---|---|---|
1 | T-Series | 270.00 |
2 | SET India | 175.00 |
3 | Zee Music Company | 109.00 |
4 | Goldmines | 98.60 |
5 | Sony SAB | 94.20 |
6 | ChuChu TV | 92.00 |
7 | Zee TV | 82.40 |
8 | Colors TV | 74.60 |
9 | SHemaroo Filmi Gaane | 69.00 |
10 | Tips Official | 67.40 |
Rank | Channel Name | Videos Uploaded |
---|---|---|
1 | ABP NEWS | 3,93,419 |
2 | Aaj Tak | 3,78,103 |
3 | IndiaTV | 3,11,028 |
4 | Zee News | 1,81,988 |
5 | Zee TV | 1,80,488 |
Rank | Channel Name | Total Views (B) |
---|---|---|
1 | T-Series | 262.00 |
2 | SET India | 167.00 |
3 | Sony SAB | 118.00 |
4 | Zee TV | 92.36 |
5 | Colors TV | 72.74 |
Rank | Channel Name | Views per Video |
---|---|---|
1 | Sidhu Moose Wala | 66574607.56 |
2 | ChuChu TV | 65000477.83 |
3 | Infobells- Hindi | 53538370.92 |
4 | Fun For Kids TV | 45619168.83 |
5 | Shorts Break | 45262322.39 |
Rank | Channel Name | Views per Subscriber |
---|---|---|
1 | Zee5 | 1296.04 |
2 | Sony SAB | 1252.65 |
3 | etvteluguindia | 1232.56 |
2 | Zee TV | 1120.92 |
3 | Mazhavil Manorama | 1105.69 |
Rank | Channel Name | Subscriber Engagement Rate |
---|---|---|
1 | CarryMinati | 215577.89 |
2 | Sidhu Moose Wala | 213793.11 |
3 | BB ki Vines | 137500 |
2 | Desi Music Factory | 136693.55 |
3 | ChuChu TV | 115577.89 |
For this analysis, we’ll prioritize analysing the metrics that are important in generating the expected ROI for our marketing client, which are the YouTube channels wuth the most
Campaign idea = product placement
Best option from category: T-Series
Campaign idea = sponsored video series
Best option from category: Zee TV
Campaign idea = Influencer marketing
Best option from category: T-Series
Campaign idea = Influencer marketing / Sponsored Videos
We discovered that
Based on our analysis, we beieve the best channels to advance a long-term partnership deal with to promote the client’s products are T-Series, Zee TV, CarryMinati, BB Ki Vines.
We’ll have conversations with the marketing client to forecast what they also expect from this collaboration. Once we observe we’re hitting the expected milestones, we’ll advance with potential partnerships with TOP-5 channels in the future.