Bellabeat Case Study: How Can a Wellness Technology Company Play It Smart?
In this case study, I analyzed public smart device datasets while working for a fictional company called "Bellabeat" in order to uncover insights that can be used to design and develop business and marketing strategies aimed at maximizing opportunities for company growth. Refer to slide presentation below for the insights uncovered in this case study, and the report below for a more detailed look into the data analysis using SQL and Tableau
Introduction:
Hello, my name is Aninon J. Egbejule, and welcome to my Bellabeat data analysis case study project. In this project I am working for a fictional company called Bellabeat, a high-tech manufacturer of health-focused products for women. Bellabeat is a successful small company, but they have the potential to become a larger player in the market. Urška Sršen, cofounder and Chief Creative Officer of Bellabeat, believes that analyzing smart device fitness data could help unlock new growth opportunities for the company. I have been asked to focus on one of Bellabeat’s products and analyze smart device data to gain insight into how consumers are using their smart devices. The insights I discover will then help guide marketing strategy for the company. I will present my analysis to the Bellabeat executive team along with my high-level recommendations for Bellabeat’s marketing strategy.
Case Study 2: How can a wellness technology company play it smart?
Products:
- Bellabeat app: The Bellabeat app provides users with health data related to their activity, sleep, stress, menstrual cycle, and mindfulness habits. This data can help users better understand their current habits and make healthy decisions. The Bellabeat app connects to their line of smart wellness products.
- Leaf: Bellabeat’s classic wellness tracker can be worn as a bracelet, necklace, or clip. The Leaf tracker connects to the Bellabeat app to track activity, sleep, and stress.
- Time: This wellness watch combines the timeless look of a classic timepiece with smart technology to track user activity, sleep, and stress. The Time watch connects to the Bellabeat app to provide you with insights into your daily wellness.
- Spring: This is a water bottle that tracks daily water intake using smart technology to ensure that you are appropriately hydrated throughout the day. The Spring bottle connects to the Bellabeat app to track your hydration levels. ○
- Bellabeat membership: Bellabeat also offers a subscription-based membership program for users. Membership gives users 24/7 access to fully personalized guidance on nutrition, activity, sleep, health and beauty, and mindfulness based on their lifestyle and goals.
Bellabeat products are available through a growing number of online retailers in addition to their own e-commerce channel on their website. The company has invested in traditional advertising media, such as radio, out-of-home billboards, print, and television, but focuses on digital marketing extensively. Bellabeat invests year-round in Google Search, maintaining active Facebook and Instagram pages, and consistently engages consumers on Twitter. Additionally, Bellabeat runs video ads on Youtube and display ads on the Google Display Network to support campaigns around key marketing dates.
Sršen knows that an analysis of Bellabeat’s available consumer data would reveal more opportunities for growth. She has asked the marketing analytics team to focus on a Bellabeat product and analyze smart device usage data in order to gain insight into how people are already using their smart devices. Then, using this information, she would like high-level recommendations for how these trends can inform Bellabeat marketing strategy.
Goal:
The goal is to identify opportunities for growth within Bellabeat, using public smart device consumer data, in order to design business and marketing strategies aimed at maximizing these growth opportunities. This analysis will be guided by the following questions:
1. What are some trends in smart device usage?
2. How could these trends apply to Bellabeat customers?
3. How could these trends help influence Bellabeat marketing strategy?
Business Task:
Analyze smart public device data of a non-Bellabeat product in order to identify trends and gain insight in to how customers are using their smart devices, and how these trends could help influence Bellebeat’s marketing strategy.
Data Used:
The dataset used for used for this analysis is: FitBit Fitness Tracker Data (CC0: Public Domain, dataset made available through Mobius): This Kaggle data set contains personal fitness tracker from thirty fitbit users. Thirty eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. It includes information about daily activity, steps, and heart rate that can be used to explore users’ habits. 18 datasets were downloaded from the FitBit Fitness Tracker Data in .csv file format and included long and wide formats. The datasets used in this analysis included a user count of 33 participants over a 31 day period.
The FitBit Fitness Tracker Data was collected in 2016 making the datasets outdated for current trend analysis. The data was collected during a 31 day period (04-12-2016 to 05-12-2016). Since the data only included instances over a 31 day period, the timeframe for a more insightful analysis is relatively small.
Moreover, the sample size itself could create a sample bias. While a sample size of 30 (our data later shows a sample size of 33) will hold up within the CLT theorem, a larger sample size will be more representative of the population and would increase the confidence interval. In addition, since there were no demographic information collected it will be hard to see if we have a true representation of a national or global population. This lack of demographic information will also limit recommendations on the target audience (including gender, location, age and job status) and where to market to them.
For the case study analysis the following datasets were chosen out of the 18 datasets downloaded from Kaggle:
Daily_Activity_Merged
Daily_Sleep_Merged
Hourly_Steps_Merged
Hourly_Intensity_Merged
Hourly_Calories_Merged
Data Preparation:
Data cleaning of the datasets was performed using Microsoft Excel. The following steps were taken during data cleaning within each dataset:
Checked for duplicate data using the 'duplicate data' tool in Excel. 3 rows of duplicate information was found within the Daily_Sleep_Merged file, and were deleted before further analysis.
Formatted date data into MM/DD/YY date format
Formatted all numerical data in number format and rounded up to either no decimals or 2 decimal places depending on the column
Separated Date Time values into two separate columns (date column and time column) when needed for further analysis.
Formatted any time data into 00:00:00 format for consistency.
Checked Id entries and other columns for LEN to make sure the data was correct and uniform in length
The clean datasets can be viewed here
Data Analysis & Visualization:
MySQL Workbench and Tableau were used to analyze and visualize the clean datasets. The datasets were imported into MySQL Workbench using the import wizard and contained within a single schema called ‘bellabeat_data’
Checked the number of participants in each dataset by counting the number of unique Ids:
-- Counting the number of participants in each dataset
SELECT COUNT(DISTINCT Id)
FROM dailyactivity_merged; -- 33 Participants
SELECT COUNT(DISTINCT Id)
FROM hourlycalories_merged; -- 33 Participants
SELECT COUNT(DISTINCT Id)
FROM hourlyintensities_merged; -- 33 Participants
SELECT COUNT(DISTINCT Id)
FROM hourlysteps_merged; -- 33 Participants
SELECT COUNT(DISTINCT Id)
FROM sleepday_merged; -- 24 Participants
Checked how many times each user utilized the Fitbit tracker, and created a ranking system based on the number of times participants used their trackers:
-- Counting how many times each user utilized the Fitbit Tracker
SELECT Id,
COUNT(Id) AS Count_of_Id
FROM dailyactivity_merged
GROUP BY Id;
-- Ranking the participants according to activity level
SELECT Id,
COUNT(Id) AS Total_Logged_Uses,
CASE
WHEN COUNT(Id) BETWEEN 25 AND 31 THEN 'Active User'
WHEN COUNT(Id) BETWEEN 15 and 24 THEN 'Moderate User'
WHEN COUNT(Id) BETWEEN 0 and 14 THEN 'Light User'
END Fitbit_Usage_Type
FROM dailyactivity_merged
GROUP BY Id
ORDER BY Fitbit_Usage_Type;
I made the following data transformations in order to ensure that the dates in the dataset were formatted correctly, and then I extracted the weekday from the date. I also calculated the difference between the earliest date and the latest date in order to confirm the timeframe for this study, which was 31 days
-- Modify date format for better understaning in daily_activity
UPDATE dailyactivity_merged
SET ActivityDate = STR_TO_DATE(ActivityDate, '%m/%d/%Y');
-- Exploring the total number of days included in the dataset
SELECT DATEDIFF(MAX(ActivityDate), MIN(ActivityDate))
FROM dailyactivity_merged;
-- Add day_0f_week column on daily_activities
ALTER TABLE dailyactivity_merged
ADD day_of_week nvarchar(50);
-- Extract datename from ActivityDate
UPDATE dailyactivity_merged
SET day_of_week = DAYNAME(ActivityDate);
I calculated the minimum, maximum, and average of the total steps, total distance, calories, and activity levels for each participant.
-- MIN, MAX, & AVG of total steps, total distance, calories and activity levels by Id.
SELECT Id,
MIN(TotalSteps) AS Min_Total_Steps,
MAX(TotalSteps) AS Max_Total_Steps,
ROUND(AVG(TotalSteps), 2) AS Avg_Total_Stpes,
MIN(TotalDistance) AS Min_Total_Distance,
MAX(TotalDistance) AS Max_Total_Distance,
ROUND(AVG(TotalDistance), 2) AS Avg_Total_Distance,
MIN(Calories) AS Min_Total_Calories,
MAX(Calories) AS Max_Total_Calories,
ROUND(AVG(Calories), 2) AS Avg_Total_Calories,
MIN(VeryActiveMinutes) AS Min_Very_Active_Minutes,
MAX(VeryActiveMinutes) AS Max_Very_Active_Minutes,
ROUND(AVG(VeryActiveMinutes), 2) AS Avg_Very_Active_Minutes,
MIN(FairlyActiveMinutes) AS Min_Fairly_Active_Minutes,
MAX(FairlyActiveMinutes) AS Max_Fairly_Active_Minutes,
ROUND(AVG(FairlyActiveMinutes), 2) AS Avg_Fairly_Active_Minutes,
MIN(LightlyActiveMinutes) AS Min_Lightly_Active_Minutes,
MAX(LightlyActiveMinutes) AS Max_Lightly_Active_Minutes,
ROUND(AVG(LightlyActiveMinutes), 2) AS Avg_Lightly_Active_Minutes,
MIN(SedentaryMinutes) AS Min_Sedentary_Minutes,
MAX(SedentaryMinutes) AS Max_Sedentary_Minutes,
ROUND(AVG(SedentaryMinutes), 2) AS Avg_Sedentary_Minutes
From dailyactivity_merged
Group BY Id;
I created a visualization to show the average of total minutes spent being either sedentary, lightly active, fairly active, or very active active. The results show users spend the most time being sedentary (999.2 minutes), followed by being lightly active (191.5 minutes), then very active (20.3 minutes), and then fairly active (13.3 minutes).
The CDC recommends 150 minutes of physical activity each week (https://www.cdc.gov/physicalactivity/basics/adults/index.htm)
Unfortunately, over the course of this data frame (31 days) our Fitbit users spend an average of 33.6 minutes being fairly and very active, which is much lower than the recommended physical activity amount.
I examined the average active minutes for each day of the week, and visualized the results in the chart below.
-- Average active minutes by week day
SELECT day_of_week,
ROUND (AVG(VeryActiveMinutes), 2) AS Avg_Very_Active_Minutes,
ROUND (AVG(FairlyActiveMinutes), 2) AS Avg_Fairly_Active_Minutes,
ROUND (AVG(LightlyActiveMinutes), 2) AS Avg_Lightly_Active_Minutes,
ROUND (AVG(SedentaryMinutes), 2) AS Avg_Sedentary_Minutes
FROM dailyactivity_merged
GROUP BY day_of_week;
The chart above shows no significant difference in the average active minutes for the participants based on the day of the week. It seems users are quite consistent with their average active minutes from day to day. Based on this indication, Bellabeat could leverage activity goals for users to meet and implement in-app features akin to the activity ring in the apple watch which encourages users to meet daily activity goals and provides badges or awards when such goals are met. Some users might already being trying to meet personal activity goals each day and Bellabeat could enourage higher activity goals to increase daily active minutes that are very active or fairly active.
Furthermore, I examined the total steps taken by users in relation to the calories they burned.
The scatter plot above shows the relationship between the total steps and the total calories of participants. The linear trend line shows the R-squared value = 0.349953 and the P-value < 0.0001.
Since P-value is less than 0.0001, correlation between calories and total steps is statistically significant.
There is a positive correlation between calories burned and total steps. This data could be utilized to help users meet their caloric goals by providing daily step goal recommendations relative to their caloric and weight loss goals.
A Healthline.com article (“How many steps do I need a day?”) written by Sara Lindberg in 2019 cited a 2011 study by Tudor-Locke et. al. titled “How many steps/day are enough? for adults” which found that 10,000 steps/day is a reasonable target for healthy adults.
Sources: Healthline Article | 2011 Study
Based on the referenced study I classified the users based on two categories: those that met an average of 10,000 steps and those that didn't meet an average of 10,000 over the course of the study.
-- Categorized users based on meeting an average of 10,000 steps
SELECT Id,
AVG(TotalSteps) AS Avg_Total_Steps,
CASE
WHEN AVG(TotalSteps) < 10000 THEN 'Does Not Meet Daily Target'
WHEN AVG(TotalSteps) >= 10000 THEN 'Meets Daily Target'
END User_Type
FROM dailyactivity_merged
GROUP BY Id
ORDER BY User_Type
The chart shows that based on the referenced study 35.23% of users meet an average of 10,000 steps over the course of this case study while 64.77% do not meet an average of 10,000 steps.
In addition, I examined the total steps taken, and total calories burned by users for each day of the week.
-- Average Total steps by day of the week
SELECT
DISTINCT day_of_week,
ROUND(AVG(TotalSteps), 0) AS AVG_TotalSteps,
ROUND(AVG(TotalDistance), 2) AS AVG_TotalDistance,
ROUND(AVG(Calories), 2) AS AVG_TotalCalories
FROM
dailyactivity_merged
GROUP BY
day_of_week
ORDER BY
AVG_TotalSteps DESC;
-- Exploring the total steps, total calories and total active minutes for each Id
SELECT Id,
SUM(TotalSteps) AS Sum_total_steps,
SUM(Calories) AS Sum_Calories,
SUM(VeryActiveMinutes + FairlyActiveMinutes) AS Sum_Active_Minutes
FROM dailyactivity_merged
GROUP BY Id;
-- Activities and calories comparison
Select Id,
SUM(TotalSteps) as total_steps,
SUM(VeryActiveMinutes) as total_very_active_mins,
Sum(FairlyActiveMinutes) as total_fairly_active_mins,
SUM(LightlyActiveMinutes) as total_lightly_active_mins,
SUM(Calories) as total_calories
From dailyactivity_merged
Group By Id;
The result shows that Saturday is the day that users are most active walking, followed by Tuesday. The result also indicates that Saturday and Tuesday are also the days that users burn the most calories on average, while Sunday is the day with the least amount of steps as well as calories burned.
-- Total steps by hour of the day
SELECT
ActivityTime,
SUM(StepTotal) AS Total_Steps_By_Hour
FROM hourlysteps_merged
GROUP BY ActivityTime
ORDER BY Total_Steps_By_Hour DESC;
I visualized the Total Steps taken by Hour to see what time of day our users were most active.
The top 5 hours of steps recorded were:
18:00:00 (6pm) – 542,848 steps
19:00:00 (7pm) – 528,552 steps
12:00:00 (12pm) – 505,848 steps
17:00:00 (5pm) – 498,511 steps
14:00:00 (2pm) – 497,813 steps
I explored the Sleep habits of the users and its relationship to activity level.
-- Analyzing the total steps, total calories, and total minutes asleep for each Id
SELECT da.Id,
AVG(da.TotalSteps) AS AvgTotalSteps,
AVG(da.Calories) AS AvgCalories,
AVG(ds.TotalMinutesAsleep) AS AvgTotalMinutesAsleep
FROM dailyactivity_merged da
JOIN sleepday_merged ds
ON da.Id = ds.Id
GROUP BY da.Id;
Although there is no correlation between average minutes asleep and average steps. The graphs shows that most users who got at least 5 hrs. asleep had higher step counts. With this said, most users were not averaging the recommended 10,000 steps a day as noted in the Healthline article cited above.
Key Findings:
This analysis revealed that 87.9% percent of users were 'active', 9.0% were 'moderate' and 3.0% were 'low' based on how much they utlized their Fitbit devices over the course of this case study. Being that majority of the users were active, there is an opportunity to cater to the needs of active smart health device users while simultaneously closing the gap between active users and moderate or low users.
Also, 65% of users did not meet the recommended 10,000 steps a day.
Furthermore, most steps taken were during lunch hours or the 5pm - 7pm time frame. This shows that most users have a consistent time frame during which they are increasing their steps either through a walk or potentially through a workout.
Lastly, the data showed that users who averaged 5 hours of sleep or more also had a higher average step total.
Recommendations:
I believe that in order for Bellabeat to fully identify and maximize opportunities for growth, the company should consider conduct their own data collection to gain further insights, or lean into its own existing marketing and user data or trends. In addition, using a larger sample size idoe further analysis will increase the confidence interval. Since the data utilized in this case study did not include demographic information, I'm unable to give a more detailed recommendation or ensure there was no sampling bias.
I'd recommend a daily notification at the end of day to 'charge up' the wearable so it's ready for tracking the next day.
Adding device notifications throughout the day as reminders to "get up & move" or to "complete your step goal for the day" may help increase total steps for users.
It may be beneficial to send an app notification recommending users 'start their walk' or 'start their workout' if the app notices that the user hasn't begun to do so at their usual time of increased steps.
It may be beneficial to help users increase their sleep time by sending them a notification to 'wind down for sleep' at a certain time based off their sleep habits.