Move Fast, (Don't) Break Things

On Saturday night, October 4th 2025, Candle went down. This post covers how it happened and how we will prevent it from ever happening again.

Parth Chopra

Intro

On Saturday, October 4th, 2025, at approximately 8:58 PM PST, the Candle app went down. And this wasn't a small bug or something we could just patch in a few minutes (which we tend to do).

This was a full-service outage. The app would not open anymore. Not for us, the founders. Not for anyone.

Within a minute, we got an email - "Candle App Glitch, please fix." Within 5 minutes, our support email had over 20 emails from distraught Candle users who weren't able to open their app and use Candle with their partners and loved ones. Panicking, I called Alex (my cofounder), who was at home with his family celebrating his brother's birthday. I called him a few times to get his attention, and on the third ring:

"What's up dude?"

"Alex, our entire database just went down. The app isn't opening anymore. I can't breathe."

"Slow down. Walk me through what happened."

So what happened?

As someone who loves to code, I always get stoked whenever users ask us to add new features to make the Candle experience better. In fact, many of our most loved (and some upcoming!) features came directly from talking to our users; Canvas, Live Countdowns, and Audio prompts were all user-suggested features.

This time, we were building a referral system. A way for free users on Candle to experience the perks of Candle Premium by simply inviting a couple to the platform. The person who invited gets a free month, and so does the person who redeemed! A true win-win.

The code was basically done and we were polishing and testing it. At around 8:57 PM yesterday, I realized that we probably wanted to send a push notification to both couples telling them that they'd just been offered a free month of Premium, so I had to make a tiny code change to support the new notification logic. I wrote the 4 new lines of code and pushed them up to our server.

Oops.

In an honest oversight, I ended up deploying code from a different place than the code for our mobile app (deploying means that everyone using Candle on their phones gets access to this code right away). Sparing the technical details, this meant that our database (and all the optimizations around it that make Candle feel fast and snappy) got dropped. Just like that.

Wait, all the data got deleted?

Thankfully, no. Our database ensures that all the data sticks around if something like this happens. However, all of our database indexes got dropped. Think of database indexes as the letters you see poking out of bookshelves at a library. They make it a lot easier to slice and dice data and find exactly what you want to see. In Candle's case, this makes it easier to retrieve things like your streak stats, what questions you've answered already, and what questions you've saved for later.

It also means that our app can't live without them. Bringing all of these indexes back takes a while because of the sheer volume of users who use Candle every day and the all-time total number of questions answered. TL;DR - without the indexes, the app crashes instantly.
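To make the bookshelf analogy concrete, here is a minimal sketch of what an index changes. It uses an in-memory SQLite database purely as a stand-in; this post does not say which database Candle runs on, and the `answers` table, its columns, and the index name are all hypothetical. The query is the same before and after, and only the database's plan for running it differs.

```python
import sqlite3

# Hypothetical stand-in schema: Candle's real database and tables are not
# described in this post. SQLite is used only because it ships with Python.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE answers (user_id INTEGER, question_id INTEGER)")
conn.executemany(
    "INSERT INTO answers VALUES (?, ?)",
    [(user, question) for user in range(1000) for question in range(20)],
)

QUERY = "SELECT question_id FROM answers WHERE user_id = ?"


def query_plan(connection, sql, params):
    """Return SQLite's description of how it would execute the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


# Without an index, the database has to scan every row in the table.
plan_without_index = query_plan(conn, QUERY, (42,))

# With an index on user_id, it can jump straight to the matching rows.
conn.execute("CREATE INDEX idx_answers_user ON answers (user_id)")
plan_with_index = query_plan(conn, QUERY, (42,))

print(plan_without_index)
print(plan_with_index)
```

The first plan reports a scan of the whole table, while the second reports a search that uses `idx_answers_user`. On a small table the difference is invisible, but on a very large one the full scan is what makes each request slow.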

We started repopulating these indexes last night (October 4th) at 9:02 PM, 4 minutes after we realized what happened. Things looked great, but the repopulation unfortunately failed around midnight. So we tried again. This time, it looked more promising.

At this point, I knew I'd be up all night, so I asked my girlfriend to come hang out with me at the office in Berkeley, CA where I was working. We monitored the situation actively, set up an automatic email responder for support emails, made a Reddit post with constant updates, and posted a TikTok about what was going on. Alex and I took shifts fending off angry app reviews on the Google Play Store and App Store, telling couples that everything would be okay and that we were actively working on a fix.

BUT... THE STREAKS! WHAT ABOUT THE STREAKS?

We also made sure that all users knew that all lost streaks would be restored. We were already used to doing that by hand in the early days, so we knew our way around it.

Moving on and lessons learned

Besides simply being more careful when deploying code on a whim, we've taken a few measures at Candle to make sure that this doesn't happen again.

  1. Any code that drops a table will have extra warnings to make sure that the engineer knows exactly what they're doing and that this will likely take time to recover. If we had this configured, this mistake surely would not have occurred.

  2. We will work with our database team to understand the intricacies of these indexes, especially on super large tables, and how we can speed up the process drastically in the future if this ever happens again.

  3. We will work to create some sort of in-app messaging (likely in a future update) that runs as soon as the app is opened and allows us, the Candle team, to remotely send messages to users. We could have prevented a lot of angry emails if users knew that Candle was experiencing a server outage and that the only thing to do was wait.
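As a rough illustration of the first measure, a pre-deploy check could refuse to run any script containing a destructive statement unless the engineer explicitly confirms it. This is only a sketch under assumptions: the post does not describe Candle's deployment tooling, and the names `find_destructive_statements` and `check_deploy` are hypothetical.

```python
import re

# Statements treated as destructive in this sketch. A real list would depend
# on the database in use, which this post does not name.
DESTRUCTIVE = re.compile(r"\bDROP\s+(TABLE|INDEX|DATABASE)\b", re.IGNORECASE)


def find_destructive_statements(sql_script):
    """Return the statements in a script that drop a table, index, or database."""
    # Naive split on semicolons; good enough for a sketch, not for SQL that
    # contains semicolons inside string literals.
    statements = [s.strip() for s in sql_script.split(";") if s.strip()]
    return [s for s in statements if DESTRUCTIVE.search(s)]


def check_deploy(sql_script, confirmed=False):
    """Raise unless the engineer has explicitly confirmed destructive statements."""
    flagged = find_destructive_statements(sql_script)
    if flagged and not confirmed:
        raise RuntimeError(
            "Refusing to deploy; destructive statements need confirmation: "
            + "; ".join(flagged)
        )
    return flagged


# A harmless script passes without any confirmation.
check_deploy("CREATE TABLE referrals (id INTEGER);")

# A script that drops an index only passes when confirmed is set explicitly.
check_deploy("DROP INDEX idx_answers_user;", confirmed=True)
```

The point of the design is that the dangerous path requires a deliberate extra step, so a quick late-night push cannot drop anything by accident.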

And if this outage affected you, we are deeply sorry.

Thank you for being a Candle user.
