Bellabeat Case Study: How Can a Wellness Technology Company Play It Smart?


 In this case study, I analyzed public smart device datasets while working for a fictional company called "Bellabeat" in order to uncover insights that can be used to design and develop business and marketing strategies aimed at maximizing opportunities for company growth. Refer to slide presentation below for the insights uncovered in this case study, and the report below for a more detailed look into the data analysis using SQL and Tableau

Case Study: How can a wellness technology company play it smart??

Introduction:

     Hello, my name is Aninon J. Egbejule, and welcome to my Bellabeat data analysis case study project. In this project I am working for a fictional company called  Bellabeat, a high-tech manufacturer of health-focused products for women. Bellabeat is a successful small company, but they have the potential to become a larger player in the market. Urška Sršen, cofounder and Chief Creative Officer of Bellabeat, believes that analyzing smart device fitness data could help unlock new growth opportunities for the company. I have been asked to focus on one of Bellabeat’s products and analyze smart device data to gain insight into how consumers are using their smart devices. The insights I discover will then help guide marketing strategy for the company. I will present my analysis to the Bellabeat executive team along with my high-level recommendations for Bellabeat’s marketing strategy.

Case Study 2: How can a wellness technology company play it smart?


Products:

-       Bellabeat app: The Bellabeat app provides users with health data related to their activity, sleep, stress, menstrual cycle, and mindfulness habits. This data can help users better understand their current habits and make healthy decisions. The Bellabeat app connects to their line of smart wellness products.

 

-       Leaf: Bellabeat’s classic wellness tracker can be worn as a bracelet, necklace, or clip. The Leaf tracker connects to the Bellabeat app to track activity, sleep, and stress.

 

-       Time: This wellness watch combines the timeless look of a classic timepiece with smart technology to track user activity, sleep, and stress. The Time watch connects to the Bellabeat app to provide you with insights into your daily wellness.

 

-       Spring: This is a water bottle that tracks daily water intake using smart technology to ensure that you are appropriately hydrated throughout the day. The Spring bottle connects to the Bellabeat app to track your hydration levels.

 

-       Bellabeat membership: Bellabeat also offers a subscription-based membership program for users. Membership gives users 24/7 access to fully personalized guidance on nutrition, activity, sleep, health and beauty, and mindfulness based on their lifestyle and goals.

       Bellabeat products are available through a growing number of online retailers in addition to their own e-commerce channel on their website. The company has invested in traditional advertising media, such as radio, out-of-home billboards, print, and television, but focuses on digital marketing extensively. Bellabeat invests year-round in Google Search, maintaining active Facebook and Instagram pages, and consistently engages consumers on Twitter. Additionally, Bellabeat runs video ads on Youtube and display ads on the Google Display Network to support campaigns around key marketing dates.

 

Sršen knows that an analysis of Bellabeat’s available consumer data would reveal more opportunities for growth. She has asked the marketing analytics team to focus on a Bellabeat product and analyze smart device usage data in order to gain insight into how people are already using their smart devices. Then, using this information, she would like high-level recommendations for how these trends can inform Bellabeat marketing strategy.


Goal:

 The goal is to identify opportunities for growth within Bellabeat, using public smart device consumer data, in order to design business and marketing strategies aimed at maximizing these growth opportunities. This analysis will be guided by the following questions:

 

1.     What are some trends in smart device usage?

2. How could these trends apply to Bellabeat customers?

3. How could these trends help influence Bellabeat marketing strategy?

 

Business Task:

    Analyze smart public device data of a non-Bellabeat product in order to identify trends and gain insight in to how customers are using their smart devices, and how these trends could help influence Bellebeat’s marketing strategy.

 

Data Used:

    The dataset used for used for this analysis is: FitBit Fitness Tracker Data (CC0: Public Domain, dataset made available through Mobius): This Kaggle data set contains personal fitness tracker from thirty fitbit users. Thirty eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. It includes information about daily activity, steps, and heart rate that can be used to explore users’ habits. 18 datasets were downloaded from the FitBit Fitness Tracker Data in .csv file format and included long and wide formats. The datasets used in this analysis included a user count of 33 participants over a 31 day period.

    The FitBit Fitness Tracker Data was collected in 2016 making the datasets outdated for current trend analysis. The data was collected during a 31 day period (04-12-2016 to 05-12-2016). Since the data only included instances over a 31 day period, the timeframe for a more insightful analysis is relatively small.

    Moreover, the sample size itself could create a sample bias. While a sample size of 30 (our data later shows a sample size of 33) will hold up within the CLT theorem, a larger sample size will be more representative of the population and would increase the confidence interval. In addition, since there were no demographic information collected it will be hard to see if we have a true representation of a national or global population. This lack of demographic information will also limit recommendations on the target audience (including gender, location, age and job status) and where to market to them.

    For the case study analysis the following datasets were chosen out of the 18 datasets downloaded from Kaggle:

 

Data Preparation:

   Data cleaning of the datasets was performed using Microsoft Excel. The following steps were taken during data cleaning within each dataset:

The clean datasets can be viewed here


Data Analysis & Visualization:

  MySQL Workbench and Tableau were used to analyze and visualize the clean datasets. The datasets were imported into MySQL Workbench using the import wizard and contained within a single schema called ‘bellabeat_data’


-- Counting the number of participants in each dataset 

SELECT COUNT(DISTINCT Id)

FROM dailyactivity_merged; -- 33 Participants


SELECT COUNT(DISTINCT Id)

FROM hourlycalories_merged; -- 33 Participants


SELECT COUNT(DISTINCT Id)

FROM hourlyintensities_merged; -- 33 Participants


SELECT COUNT(DISTINCT Id)

FROM hourlysteps_merged; -- 33 Participants


SELECT COUNT(DISTINCT Id)

FROM sleepday_merged; -- 24 Participants

-- Counting how many times each user utilized the Fitbit Tracker

SELECT Id,

COUNT(Id) AS Count_of_Id

FROM dailyactivity_merged

GROUP BY Id; 


-- Ranking the participants according to activity level

SELECT Id,

COUNT(Id) AS Total_Logged_Uses,

CASE

WHEN COUNT(Id) BETWEEN 25 AND 31 THEN 'Active User'

WHEN COUNT(Id) BETWEEN 15 and 24 THEN 'Moderate User'

WHEN COUNT(Id) BETWEEN 0 and 14 THEN 'Light User'

END Fitbit_Usage_Type

FROM dailyactivity_merged

GROUP BY Id

ORDER BY Fitbit_Usage_Type;

Tracker Usage Count by Id

I made the following data transformations in order to ensure that the dates in the dataset were formatted correctly, and then I extracted the weekday from the date. I also calculated the difference between the earliest date and the latest date in order to confirm the timeframe for this study, which was 31 days 

-- Modify date format for better understaning in daily_activity

UPDATE dailyactivity_merged

SET ActivityDate = STR_TO_DATE(ActivityDate, '%m/%d/%Y');


-- Exploring the total number of days included in the dataset

SELECT DATEDIFF(MAX(ActivityDate), MIN(ActivityDate))

FROM dailyactivity_merged;


-- Add day_0f_week column on daily_activities

ALTER TABLE dailyactivity_merged

ADD day_of_week nvarchar(50);


-- Extract datename from ActivityDate

UPDATE dailyactivity_merged

SET day_of_week = DAYNAME(ActivityDate);

I calculated the minimum, maximum, and average of the total steps, total distance, calories, and activity levels for each participant.

--  MIN, MAX, & AVG of total steps, total distance, calories and activity levels by Id.

SELECT Id,

MIN(TotalSteps) AS Min_Total_Steps,

MAX(TotalSteps) AS Max_Total_Steps, 

ROUND(AVG(TotalSteps), 2) AS Avg_Total_Stpes,

MIN(TotalDistance) AS Min_Total_Distance, 

MAX(TotalDistance) AS Max_Total_Distance, 

ROUND(AVG(TotalDistance), 2) AS Avg_Total_Distance,

MIN(Calories) AS Min_Total_Calories,

MAX(Calories) AS Max_Total_Calories,

ROUND(AVG(Calories),  2) AS Avg_Total_Calories,

MIN(VeryActiveMinutes) AS Min_Very_Active_Minutes,

MAX(VeryActiveMinutes) AS Max_Very_Active_Minutes,

ROUND(AVG(VeryActiveMinutes), 2) AS Avg_Very_Active_Minutes,

MIN(FairlyActiveMinutes) AS Min_Fairly_Active_Minutes,

MAX(FairlyActiveMinutes) AS Max_Fairly_Active_Minutes,

ROUND(AVG(FairlyActiveMinutes), 2) AS Avg_Fairly_Active_Minutes,

MIN(LightlyActiveMinutes) AS Min_Lightly_Active_Minutes,

MAX(LightlyActiveMinutes) AS Max_Lightly_Active_Minutes,

ROUND(AVG(LightlyActiveMinutes), 2) AS Avg_Lightly_Active_Minutes,

MIN(SedentaryMinutes) AS Min_Sedentary_Minutes,

MAX(SedentaryMinutes) AS Max_Sedentary_Minutes,

ROUND(AVG(SedentaryMinutes), 2) AS Avg_Sedentary_Minutes

From dailyactivity_merged

Group BY Id;

Daily Activity Min, Max & Avg

I created a visualization to show the average of total minutes spent being either sedentary, lightly active, fairly active, or very active active. The results show users spend the most time being sedentary (999.2 minutes), followed by being lightly active (191.5 minutes), then very active (20.3 minutes), and then fairly active (13.3 minutes). 

The CDC recommends 150 minutes of physical activity each week (https://www.cdc.gov/physicalactivity/basics/adults/index.htm)

Unfortunately, over the course of this data frame (31 days) our Fitbit users spend an average of 33.6 minutes being fairly and very active, which is much lower than the recommended physical activity amount.

I examined the average active minutes for each day of the week, and visualized the results in the chart below.

-- Average active minutes by week day 

SELECT day_of_week,

ROUND (AVG(VeryActiveMinutes), 2) AS Avg_Very_Active_Minutes,

ROUND (AVG(FairlyActiveMinutes), 2) AS Avg_Fairly_Active_Minutes,

ROUND (AVG(LightlyActiveMinutes), 2) AS Avg_Lightly_Active_Minutes,

ROUND (AVG(SedentaryMinutes), 2) AS Avg_Sedentary_Minutes

FROM dailyactivity_merged

GROUP BY day_of_week;

The chart above shows no significant difference in the average active minutes for the participants based on the day of the week. It seems users are quite consistent with their average active minutes from day to day. Based on this indication, Bellabeat could leverage activity goals for users to meet and implement in-app features akin to the activity ring in the apple watch which encourages users to meet daily activity goals  and provides badges or awards when such goals are met. Some users might already being trying to meet personal activity goals each day and Bellabeat could enourage higher activity goals to increase daily active minutes that are very active or fairly active.

Furthermore, I examined the total steps taken by users in relation to the calories they burned.

The scatter plot above shows the relationship between the total steps and the total calories of participants. The linear trend line shows the R-squared value = 0.349953 and the P-value < 0.0001.

Since P-value is less than 0.0001, correlation between calories and total steps is statistically significant.

There is a positive correlation between calories burned and total steps. This data could be utilized to help users meet their caloric goals by providing daily step goal recommendations relative to their caloric and weight loss goals.

A Healthline.com article (“How many steps do I need a day?”) written by Sara Lindberg in 2019 cited a 2011 study by Tudor-Locke et. al. titled “How many steps/day are enough? for adults” which found that 10,000 steps/day is a reasonable target for healthy adults.

Sources: Healthline Article | 2011 Study

Based on the referenced study I classified the users based on two categories: those that met an average of 10,000 steps and those that didn't meet an average of 10,000 over the course of the study.

-- Categorized users based on meeting an average of 10,000 steps

SELECT Id,

AVG(TotalSteps) AS Avg_Total_Steps,

CASE

WHEN AVG(TotalSteps) < 10000 THEN 'Does Not Meet Daily Target'

WHEN AVG(TotalSteps) >= 10000 THEN 'Meets Daily Target'

END User_Type

FROM dailyactivity_merged

GROUP BY Id

ORDER BY User_Type

avg_steps_by_targetmeet

The chart shows that based on the referenced study 35.23% of users meet an average of 10,000 steps over the course of this case study while 64.77% do not meet an average of 10,000 steps.

In addition, I examined the total steps taken, and total calories burned by users for each day of the week. 

-- Average Total steps by day of the week

SELECT 

  DISTINCT day_of_week,

  ROUND(AVG(TotalSteps), 0) AS AVG_TotalSteps,

  ROUND(AVG(TotalDistance), 2) AS AVG_TotalDistance,

  ROUND(AVG(Calories), 2) AS AVG_TotalCalories

FROM

  dailyactivity_merged

GROUP BY

  day_of_week

ORDER BY

AVG_TotalSteps DESC;

-- Exploring the total steps, total calories and total active minutes for each Id

SELECT Id, 

SUM(TotalSteps) AS Sum_total_steps,

SUM(Calories) AS Sum_Calories, 

SUM(VeryActiveMinutes + FairlyActiveMinutes) AS Sum_Active_Minutes

FROM dailyactivity_merged

GROUP BY Id;


-- Activities and calories comparison

Select Id,

SUM(TotalSteps) as total_steps,

SUM(VeryActiveMinutes) as total_very_active_mins,

Sum(FairlyActiveMinutes) as total_fairly_active_mins,

SUM(LightlyActiveMinutes) as total_lightly_active_mins,

SUM(Calories) as total_calories

From dailyactivity_merged

Group By Id;

The result shows that Saturday is the day that users are most active walking, followed by Tuesday. The result also indicates that Saturday and Tuesday are also the days that users burn the most calories on average, while Sunday is the day with the least amount of steps as well as calories burned.

-- Total steps by hour of the day

SELECT 

ActivityTime,

SUM(StepTotal) AS Total_Steps_By_Hour

FROM hourlysteps_merged 

GROUP BY ActivityTime

ORDER BY Total_Steps_By_Hour DESC;

I visualized the Total Steps taken by Hour to see what time of day our users were most active.

The top 5 hours of steps recorded were:


I explored the Sleep habits of the users and its relationship to activity level.

-- Analyzing the total steps, total calories, and total minutes asleep for each Id

SELECT da.Id,

AVG(da.TotalSteps) AS AvgTotalSteps,

AVG(da.Calories) AS AvgCalories,

AVG(ds.TotalMinutesAsleep) AS AvgTotalMinutesAsleep

FROM dailyactivity_merged da

JOIN sleepday_merged ds

ON da.Id = ds.Id

GROUP BY da.Id;

Although there is no correlation between average minutes asleep and average steps. The graphs shows that most users who got at least 5 hrs. asleep had higher step counts. With this said, most users were not averaging the recommended 10,000 steps a day as noted in the Healthline article cited above.

Key Findings:


Recommendations: