Assumption Test: Brevi

Here’s our original post with our test cards!

Here are some of our results artifacts, and here is the form we used to collect responses for tests 2-4.

Recruitment

For each experiment we recruited 3 people who had NOT been involved with a prior version of our app, for a total of 12 test runs across the 4 tests. Most participants were students or early-career knowledge workers who do regular focused work on a laptop and sometimes struggle with low-productivity moments like doomscrolling, avoidance breaks, or difficulty getting back into a task.

We chose these participants because they are close to the kind of user we imagine for Brevi: people who want to use their time better, but whose breaks are not always intentional or helpful. For the survey and framing tests, we recruited people who could react quickly to the idea and tell us whether the problem and messaging felt believable. For the estimate vs reality and logging tests, we recruited people willing to do a short work session and reflect honestly on their break habits.

We also tried to include people with slightly different habits. Some were more structured and productivity-oriented, while others got distracted more easily or took more unplanned breaks. This helped us see whether the problem seemed narrow or whether it showed up across different working styles. This was a small convenience sample, mostly people we could reach easily, so the findings are useful for early learning but likely too narrow to generalize from strongly.

Learning Card 1: Pause blame survey

Insight Name: Pauses matter, but people do not always name them that way
Date of Learning: 3/7/26
Person Responsible: Pause team
Data Reliability: Low
Action Required: Yes

STEP 1: HYPOTHESIS

We believed that people see pauses as a major cause of low productivity.

STEP 2: OBSERVATION

We ran the verbal survey with 3 people. Two people gave fairly high agreement that pauses hurt productivity, but only 1 person put pauses in their top 3 causes when asked. The other answers that came up more often were stress, tiredness, lack of focus, and phone distractions.

So this test gave us only partial support. People seemed to agree with the idea once it was put in front of them, but it was not the first thing they named on their own.

STEP 3: LEARNINGS AND INSIGHTS

From that we learned that our problem may be real, but our language may be slightly off. People may not talk about “pauses” as a main issue, even if they do struggle with doomscrolling, avoidance, or drifting between tasks.

This seems important because it means our product should maybe not start by blaming pauses in a broad way. It may work better to talk about lost focus, avoidance, or unplanned break habits, which felt more concrete to people.

A limitation is scope. This was only 3 people, and it was a very short survey. It tells us more about first reactions to wording than about the full problem. It is also possible that people interpreted “pauses” differently.

STEP 4: DECISIONS AND ACTIONS

Therefore, we will keep the core problem area, but we will test clearer wording. We will talk less abstractly about “pauses” and more about specific behaviors like doomscrolling, avoidance breaks, and trouble restarting work. We should also run this with more people before treating it as a strong insight.

Learning Card 2: Unhelpful pause estimate vs reality

Insight Name: People spend more time in unhelpful pauses than they think
Date of Learning: 3/7/26
Person Responsible: Pause team
Data Reliability: Medium
Action Required: Yes

STEP 1: HYPOTHESIS

We believed that people underestimate how much time they spend in unhelpful pauses.

STEP 2: OBSERVATION

We ran a 30-minute work session with 3 people. All 3 estimated less pause time than we observed, with a median gap of roughly 25 to 30 percent. Two people said the result was surprising, and 2 people said they would want to change at least one break habit after seeing the breakdown.

This was our clearest result. People were not shocked in a dramatic way, but they did seem to notice a real mismatch between what they thought and what actually happened.

STEP 3: LEARNINGS AND INSIGHTS

From that we learned that awareness itself may be a big part of the value. People do not always need a lot of advice first. Sometimes they need a mirror.

This feels important for Brevi. If we can help users see their hidden pause patterns in a simple and non-judgmental way, that may create motivation better than just telling them to take better breaks.

A limitation is scope. This was a short observed session with only 3 people, so we still do not know if the same pattern holds over a full day or over multiple days. People may also behave a bit differently when they know they are being watched.

STEP 4: DECISIONS AND ACTIONS

We will prioritize features that show users their pause patterns back to them. We should keep testing lightweight tracking, summaries, and reflection moments. We also want to test this over a longer period, because the next question is not just whether people notice the problem, but whether that leads to behavior change.

Learning Card 3: Framing A/B on willingness

Insight Name: Performance recovery framing seems stronger than self-care framing
Date of Learning: 3/8/26
Person Responsible: Pause team
Data Reliability: Low
Action Required: Yes

STEP 1: HYPOTHESIS

We believed that framing breaks as performance recovery instead of self-care increases willingness to take breaks.

STEP 2: OBSERVATION

We tested 2 versions of the message with 3 people. The performance recovery version got a slightly better reaction. People described it as more practical and easier to justify during work. Self-reported willingness was a bit higher for the recovery framing, but click behavior was not very different.

No one strongly rejected the self-care version, but it felt softer and less urgent. One person said it sounded nice, but not like something they would really use every day.

STEP 3: LEARNINGS AND INSIGHTS

From that we learned that framing matters, but maybe not in a huge way yet. “Performance recovery” seems to fit the work context better. It gives people a reason to take breaks without making them feel lazy or indulgent.

At the same time, this was not a massive win. It was more of a gentle preference than a breakthrough. That means messaging helps, but messaging alone is probably not enough.

A limitation is scope. This was a tiny A/B test with only 3 people and a very lightweight prototype. It tells us about first impressions, not real repeated use. We also cannot say much about click-through with a sample this small.

STEP 4: DECISIONS AND ACTIONS

Therefore, we will use performance recovery framing in our next onboarding and landing page tests. We will still keep some warmth in the tone, but we will anchor the value more in focus, energy, and doing better work. We should test this again with more users and in a more realistic product flow.

Learning Card 4: Need-first vs do-first logging A/B

Insight Name: Need-first prompts may increase honesty, but they can also feel heavier
Date of Learning: 3/7-8/26
Person Responsible: Pause team
Data Reliability: Low
Action Required: Yes

STEP 1: HYPOTHESIS

We believed that a log prompt asking “What did you need?” would lead to more honest entries than asking “What did you do?”

STEP 2: OBSERVATION

We tested both versions with 3 people (1 each day). The need-first version did seem to produce slightly more honest and reflective answers. People were a bit more willing to admit they were anxious, stuck, or avoiding work. We also saw a little more mention of doomscrolling and avoidance.

But the downside was that the flow felt heavier. One person paused longer before answering, and one person said it felt a bit too personal for a simple logging step. Entry quality was also a little worse in the need-first version, judging simply by entry length and the level of personal detail.

STEP 3: LEARNINGS AND INSIGHTS

From that we learned that honesty is not free. Asking about needs can help people reflect more deeply, but it can also increase effort and emotional friction.

This means our idea was directionally right, but maybe too ambitious for the first step. People may open up more once they already trust the app, but not necessarily in a lightweight daily log from day 1.

A limitation is scope. This was only 3 people and a very early test, so we should be careful not to overread it. We also only tested a short interaction, not a real habit over time. The question may work differently after users have used Brevi for a few days.

STEP 4: DECISIONS AND ACTIONS

Therefore, we will not make need-first logging the default right away. Instead, we will explore lighter versions of it, maybe as an optional reflection prompt or as a follow-up after the basic log. We still think the concept has value, but we should introduce it more gently and test it later in the user journey.

Overall Synthesis

Across these assumption tests, we think Brevi is directionally solving a real problem, but we also learned that we need to be more careful about how we describe the problem, how we introduce the product, and how much reflection we ask from users at the start.

Overall, our tests suggest that Brevi’s best early value is helping people see and understand their unhelpful pauses in a light, useful way. The app should probably feel more like a practical focus tool than a deep self-reflection tool, at least at first.

A major limitation is scope. These were all very small, early assumption tests with only 3 people per experiment. Most were short interactions, not repeated product use over time. Because of that, we should treat these results as closer to sanity checks than to scientific proof.
