Product Sense Pushups: Crisis Management — Error States and Recovery

Slack is one of the two major players for communication platforms in the corporate world. It needs to offer real-time communication with minimal delays, reliable message retention, and consistent notifications. Without these features, the tool can hinder an organization’s productivity and cause an exodus from it. It’s important for Slack to provide transparent updates about the health of its systems and to maintain reliability even in the face of uncertainty or turbulence. Building this trust is crucial for its business.

Uber faces fierce competition (Lyft, Waymo, etc.) and maximizes customer satisfaction by minimizing foreseeable issues. For example, Uber provides accurate ride wait times, automatically rematches riders whenever drivers cancel, and recalculates routes when conditions become volatile. All of this is done to ensure user satisfaction, because any time a ride is canceled or a user doesn’t complete the transaction with Uber, that is lost revenue. There are many points at which a ride can be canceled, and Uber has to minimize cancellations at each of them. The core metric here, then, is ride completion: maximizing completed rides maximizes revenue.
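To make the ride completion metric concrete, here is a minimal sketch of how it could be computed from a list of ride outcomes. The outcome labels (`driver_cancelled`, `rider_cancelled`, `no_driver_found`) and the sample data are assumptions made up for illustration, not Uber’s actual event names or numbers.

```python
from collections import Counter

# Hypothetical outcome label; real systems will use their own event names.
COMPLETED = "completed"


def completion_rate(outcomes):
    """Share of requested rides that ended in a completed trip."""
    if not outcomes:
        return 0.0
    completed = sum(1 for outcome in outcomes if outcome == COMPLETED)
    return completed / len(outcomes)


def drop_off_breakdown(outcomes):
    """Count non-completed rides by the point where they were lost."""
    return Counter(outcome for outcome in outcomes if outcome != COMPLETED)


# Illustrative sample: 8 ride requests, 5 of which completed.
rides = [
    "completed", "driver_cancelled", "completed", "rider_cancelled",
    "completed", "completed", "no_driver_found", "completed",
]

print(completion_rate(rides))     # 0.625
print(drop_off_breakdown(rides))  # one lost ride per drop-off point
```

The breakdown matters as much as the headline rate: it shows which point in the ride is losing the most rides, and therefore where a recovery feature such as auto rematching would pay off first.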

As with Slack, maintaining reliability and access is extremely important for banking apps. That said, banking apps operate in the most delicate space of all: people’s life savings. They must provide quick and secure service for deposits, card usage history, account balances, transfers, and other transactions. To mitigate potential errors in these areas, it’s important to have a strong documentation system that is transparent to the user. It’s also important to have human operators available to manually review any mishaps. In the end, the goal is to maximize user trust. Banks need people to keep their money with them, and widespread errors could prompt a bank run.


