Nighthawks Usability Testing

Usability Testing: Design a Solution

Nighthawks

Story map

While creating our storyboard, we focused on our persona, Feel Good Franklin, and the things that drive him towards and away from exercising. We decided to prioritize a “rainy day” scenario, in which an obstacle arises that prevents our persona from working out. Our story follows Franklin as he plans to go for a run, but has to pivot to going to the gym due to weather. We considered what would push him toward the gym instead of just canceling, and we considered how we could prolong his feeling of satisfaction with his workout, so that he isn’t so easily discouraged in the future. 

Pre-Story Maps:

Story Map:

MVP Features

  • Social aspect (working out with people, competitions, etc.)
  • Alternative recommendations for rainy day scenarios (e.g., if someone doesn’t want to run in the rain, propose a similar indoor workout)
  • Robust notification system that nudges users to exercise
  • Exercise streaks similar to Duolingo’s
  • Specific recommendations based on preferences, weather, and logging (e.g., which muscle groups to target based on what was already worked the day before)
  • A fun logging/tracking system that prompts users to enter and reflect on their workout so they can be more intentional and visualize progress (and to help the recommendation algorithm)
  • Scheduling for accountability and a more robust, accessible social workout structure

System paths

When creating our system path diagram, we started with the overall journey of our main user. Our main system path illustrates the main tasks and ideas, and two auxiliary paths detail processes within our app: new user signups and scheduling with friends.

We define our main user broadly: someone who wants to work out more but isn’t sure where to start. This covers both of our original personas, Feel Good Franklin and Stubborn Sally. We also introduce a new persona, Passive Patrick, who will go to the gym when prompted by a friend but is not interested in tracking or improving his overall workout routine.

We first considered what would prompt a user to work out: either a notification or a routine. From here, a user can schedule a workout with friends. In our path to schedule a workout, the user can either invite a friend to work out right now or plan a workout for later. Planning for later would consist of filling out a schedule and later getting recommendations based on availability and preferred workouts.

We then considered how a user might work out. We have our “rainy day” scenario, where a user experiences an obstacle; they can get an alternative workout, reschedule, continue to work out anyway, or give up and exit the system. If a user begins a workout and they don’t know what to do, they can either have a bad workout or get recommendations. After a workout, users can reflect on their workout and log their progress. 

After a workout, a user can have a workout streak that continues each day. Users can also view a leaderboard, progress visualization, and stats on workout partners. We hope that the social aspect and reflection will help users make progress in their workouts, and that building routines and having workouts that encourage feeling good will make users willing to work out more often.

Bubble Map

We focused on breadth in our bubble map, but we recognize that we will likely have to narrow down features as time goes on and we conduct more testing. To create our big bubbles, we thought about what our app is at its core: a more socially focused Strava that emphasizes feelings and connection over Strava’s number-driven approach to exercise. We want users to improve their workout habits however they define improvement, which can be measured not only by quantitative workout data but, perhaps more importantly, by how they feel.

The large bubbles that we chose to highlight in our bubble map are Recommendations/alternative workouts, Scheduling, and Social/Leaderboards. We see Recommendations and alternative workout suggestions as a way to prevent issues like those Sally experiences: decision fatigue and a general lack of information about her options in the gym. Scheduling brings together people with wavering motivation so that they can motivate each other; when one person is feeling down or tired, the idea is that a friend would have the activation energy to inspire them to work out. Finally, Social/Leaderboards round out the whole experience and make the app a place where people want to meet friends and interact in fun and exciting ways in the name of improving health.

Assumption Map

Below is our assumption map diagram. In pink and bigger than the other stickies are the key assumptions that we want to test. Here is a link to our Miro board.

Assumption Tests

Below are the assumptions from our Miro board:

  • Assumption 1: People are actually willing to embrace change to improve their workout routines
    • Test: Have participants walk us through their ideal workout routine, then have them attempt to follow it for a week. We want to 1) observe whether they attempted it and how much effort they put in, and 2) ask for feedback on how much they enjoyed the experience, how difficult changing their routine was, and how feasible the new routine feels given their current schedule. (If participants did change their routine but said that it didn’t feel feasible, then time or some other factor may be the real constraint rather than willingness to change.)
  • Assumption 2: People are willing to work out in groups (at least for certain types of exercises)
    • Test: Have participants pick several different workouts they normally enjoy. Have them try each one alone, then with friends, and compare differences in feelings and workout satisfaction. (This could also yield insights about what types of exercise are more conducive to groups.)
  • Assumption 3: People could have time to work out if they make time
    • Test: Pick a day of the week on which the participant says they’re “too busy to work out.” Have them share their schedule with us so that together we can find a feasible time and type of exercise. See whether they’re able to follow through with the plan and, if so, how it made them feel.

Intervention Study

Background

We wanted to investigate whether teamwork, social connection, and workout reflection nudge people to exercise more. To test these motivators, we conducted a social exercise-focused intervention study over five days. We had n=8 total participants, whom we split into four teams of n=2. We added all n=8 participants to an iMessage group chat where we texted end-of-day updates on which team was in the lead and any other interesting trends. Each day, participants logged their exercise from that day, along with the duration of the workout, their workout satisfaction (1-10), their motivation level (1-10), and any extra notes they had. The raw results are here.

Results

Final leaderboard, workout minutes over the 5 days:

  1. 507 minutes: Team 2 // TB and LW
  2. 390 minutes: Team 3 // RB and MG
  3. 220 minutes: Team 4 // AM and KA
  4. 111 minutes: Team 1 // KZ and SB

Workout frequency since diary study: 

  • RB, TB increased their workout frequency
  • LW, KA, SB had the same workout frequency
  • AM decreased their workout frequency
  • MG, KZ were new participants 

By team, workout frequency since diary study: 

  • Team 1: KZ (new 1/5), SB (same 2/5)
  • Team 2: TB (increase 2/5 to 5/5), LW (same 5/5)
  • Team 3: RB (increase 1/5 to 3/5), MG (new 4/5)
  • Team 4: AM (decrease 3/5 to 0/5), KA (same 2/5)

Key Insights

An interesting insight is that not everyone who consistently exercises always feels motivated to do so. For example, LW, the most consistent performer across our studies, only had an average self-reported motivation level of 4.6/10 (5, 4, 3, 7, 4 across the 5 days). This means that on 4 out of those 5 days, she was more unmotivated than motivated, but she still chose to exercise despite the lack of motivation. Some reasons for low motivation include bad weather for outdoor runs and, more generally, anticipation of effort, and she has shown that these can be overcome with structure, internal drive, and consistency. (It’s easy to assume that some people are just more motivated and are therefore able to achieve fitness goals more easily. However, we see that this is not the case: even for the most consistent, it’s not always easy. If we could uncover the secrets of committing to exercise regardless of motivation, those who blame lack of exercise on lack of motivation may be able to achieve higher fitness goals.)
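The arithmetic behind LW’s numbers can be checked with a few lines. This is only a sketch using the five scores reported above; treating 5.5 as the midpoint of the 1-10 scale (so that a 5 counts as “more unmotivated than motivated”) is our assumption, not something participants were told.

```python
from statistics import mean

# LW's self-reported motivation (1-10) on each of the five study days.
lw_motivation = [5, 4, 3, 7, 4]

# Assumption: any score below the midpoint of a 1-10 scale (5.5) counts as
# "more unmotivated than motivated".
SCALE_MIDPOINT = 5.5

average = mean(lw_motivation)
low_days = sum(1 for score in lw_motivation if score < SCALE_MIDPOINT)

print(f"average motivation: {average:.1f}/10")
print(f"days below midpoint: {low_days} of {len(lw_motivation)}")
```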

Another interesting trend is that a majority of people seem very attached to a certain type of exercise. For example, except for LW, who did a little bit of everything, everyone who did cardio did exclusively cardio, and those who lifted focused heavily on lifting. TB, despite ankle issues that prevent him from running outside, still opted to run on a treadmill rather than do some form of exercise that did not involve the lower body. This may suggest that people find it easier or more comforting to stick with exercises they are familiar with.

Of the users that were in both our diary study and our intervention study, only two participants increased the amount of exercise that they did over the course of the week. One of the participants who increased their exercise frequency went from only exercising twice to exercising all five days. This participant was paired with someone who exercised every single day during both our baseline study and our intervention study. Our other user who logged increased activity went from exercising once to three times. This participant was with a new partner, who exercised four times.

Our four users who exercised the most days made up two teams, suggesting that the teams that were most competitive were also the most driven to exercise. In addition, one team had both participants exercise together, and they both noted the social aspect in their daily log. With only eight participants we cannot be certain, but this suggests that the social and competitive aspects of our study were effective.

While recruiting for our study, some of our participants said that they likely would only exercise a little this week due to immovable commitments. These participants were on the two worst-performing teams. Since this study was only five days, which is not long enough to show sustainable workout changes, and it took place when many of our users had midterms, we do not believe that the decreased performance of these teams was a direct result of our intervention. 

We allowed our participants to log whichever exercise they did for the day, and our competition was based on workout duration. Early in the intervention study, we received complaints from our participants that some types of exercise should be weighted more heavily in the competition than others. For example, doing yoga for an hour is much lower intensity than weight lifting for an hour, and going to the gym for an hour usually includes rest periods, whereas running for an hour usually does not. Since we initially told our participants that their scores would be based on workout duration, we felt it would be unfair to adjust our scoring metric partway through the study.

Implications

Our goal is to encourage users to be more active, and we understand that exercise looks different for everyone. To incorporate the feedback from our study, we will likely have different metrics that can be used in competition. For example, we may have separate leaderboards for daily workout streaks, average workout duration, and average workout intensity. This way, we can accommodate users with different activity levels and physical capabilities. 

Moreover, separate leaderboards and metrics would help alleviate issues with different types of workouts being weighted the same amount. Going back to the yoga versus weightlifting example, one can imagine that having more lifting minutes would rank someone higher in the workout intensity leaderboard than someone doing yoga. Conversely, yoga is a form of physical activity that is often performed daily, which would give someone who does yoga a leg up in the daily streak category over a frequent lifter. A byproduct of having different kinds of leaderboards is that they encourage users to experiment with different forms of exercise and build well-rounded routines.
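To make the idea of separate leaderboards concrete, here is a minimal sketch of how the three rankings could be computed from a workout log. Everything in it is hypothetical: the `Workout` record, the `INTENSITY` weights, the choice to score intensity as weighted minutes per workout, and the sample log are all assumptions for illustration, not data or decisions from our study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Workout:
    user: str
    day: int      # day number, 1-based
    kind: str     # e.g. "yoga", "lifting", "running"
    minutes: int


# Hypothetical intensity weights; real values would need research to justify.
INTENSITY = {"yoga": 1.0, "lifting": 2.0, "running": 2.5}


def longest_streak(days):
    """Length of the longest run of consecutive days in `days`."""
    best = current = 0
    previous = None
    for day in sorted(set(days)):
        current = current + 1 if previous is not None and day == previous + 1 else 1
        best = max(best, current)
        previous = day
    return best


def leaderboards(log):
    """Return one ranking of users per metric, best first."""
    by_user = defaultdict(list)
    for w in log:
        by_user[w.user].append(w)

    def rank(score):
        # Highest score first; ties broken alphabetically for a stable order.
        return sorted(by_user, key=lambda u: (-score(by_user[u]), u))

    return {
        "streak": rank(lambda ws: longest_streak(w.day for w in ws)),
        "avg_duration": rank(lambda ws: sum(w.minutes for w in ws) / len(ws)),
        "avg_intensity": rank(
            lambda ws: sum(INTENSITY.get(w.kind, 1.0) * w.minutes for w in ws) / len(ws)
        ),
    }


# Made-up log for illustration only.
log = [
    Workout("yogi", 1, "yoga", 30),
    Workout("yogi", 2, "yoga", 30),
    Workout("yogi", 3, "yoga", 30),
    Workout("lifter", 1, "lifting", 60),
    Workout("lifter", 3, "lifting", 60),
]
boards = leaderboards(log)
print(boards["streak"])         # the daily yoga user leads here
print(boards["avg_intensity"])  # the lifter leads here
```

In this toy log, the daily yoga user tops the streak board while the lifter tops the duration and intensity boards, which is the kind of trade-off described above.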

As mentioned earlier, the study took place during midterms, which likely led people to exercise less frequently. In our final solution, we need to make sure that the app can adapt to what people need at the moment. Specifically, rather than recommending or promoting extremely intense exercises and goals by default, we can make the application especially responsive to user reflection. If a user appears stressed, we can recommend more casual and holistic activities like stretching or walks. If the user is injured, we can recommend exercises that don’t involve the injured part as well as physical therapy exercises and stretches.
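One simple way to picture these reflection-responsive recommendations is as a small set of rules. The sketch below is only an illustration under our own assumptions: the `EXERCISES` catalogue, the body-part tags, the `CASUAL` list, and the `recommend` function are all hypothetical, and a real version would draw on the user’s logged reflections rather than two flags.

```python
# Hypothetical catalogue: each exercise lists the body parts it loads.
EXERCISES = {
    "treadmill run": {"legs", "ankle"},
    "bench press": {"chest", "arms"},
    "pull-ups": {"back", "arms"},
    "stretching": set(),
    "walk": {"legs", "ankle"},
}

# Assumed set of low-key options offered when the user reports stress.
CASUAL = ["stretching", "walk"]


def recommend(stressed=False, injured_part=None):
    """Return suggested activities for the user's current state."""
    if injured_part is not None:
        # Keep only exercises that avoid the injured part, then add rehab work.
        safe = [name for name, parts in EXERCISES.items() if injured_part not in parts]
        if stressed:
            safe = [name for name in safe if name in CASUAL]
        return safe + [f"physical therapy for {injured_part}"]
    if stressed:
        return list(CASUAL)
    return list(EXERCISES)


print(recommend(stressed=True))
print(recommend(injured_part="ankle"))
```

For a user like TB, whose ankle issues rule out outdoor runs, a rule like this would surface upper-body options and stretching instead of defaulting to the treadmill.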

A barrier to recommending exercises is that people don’t always like to explore new exercises, especially across the divide between weightlifting and cardio. When we interviewed users, weightlifters said that cardio is too tiring and would make them lose mass, which they want to avoid. Some of those who prefer cardio said that they don’t want to go to the gym, or don’t want to start as a beginner and have to plan exercises they’re not familiar with. LW is open to many types of exercise because she follows a regimen for a specific goal, so we may want to recommend exercises based on goals along with other factors, and frame each recommendation by explaining how it helps the user achieve their goals.
