Sentinel

TL;DR: We're fully back. And a word about the updates.

Hello Fam,

We've missed you.

It's been a rough ride these past few days. In the Racksterly way, we'll be going over recent events, where we are now, and where we're going, telling it like it is. Now strap in, it's storytime.

Let's start with a story: why did we close our site for 3 days for maintenance?

Would you be surprised if we told you we didn't? We bet you would. But, we really didn't. Now get comfortable, this one's a freight train.

We knew there was trouble in paradise when we started receiving emails from several users by the hour on the 11th, saying they weren't paid for sharing, or they made a payment and still had the transaction pending, or even worse, got the much dreaded "This site can't be reached" message on their browsers. Some smart browsers like Chrome provided additional information like "racksterly.co took too long to respond". Now, a trained eye can tell all of these were symptoms of an unresponsive/overloaded web server. Or were they?

We knew we were in dire straits when, by midday, even we couldn't reach our website. What could be the problem? We had already gotten the biggest and most powerful server Namecheap had.

Starting midday, we got in touch with Namecheap. Asked for certain configuration changes. DDoS again, maybe? But we already had Cloudflare set up. Maybe it was a false positive - Cloudflare may have mistaken the traffic from our users for a DoS attack and throttled traffic. But, we figured, it wasn't like Cloudflare to make such a mistake, and even the stats looked good. No matter, we tweaked the config anyway. Then, we relaxed mod_security on the web server. We did everything we possibly could to speed up the server. Every switch and change and toggle had us believing we had found the solution to the poor server performance and everything was under control (we really should have made that post to update you earlier). Still nothing. What to do? Upgrade, now! By midnight, we were resigned to the fact that we needed a bigger server, several of them in fact. Or did we?

It wasn't supposed to take long. Provision some new servers here. Move all files and databases there. Update some DNS records somewhere. Easy peasy. Except, it wasn't easy peasy. We weren't just getting a bigger server to move to - we already had Namecheap's finest. And we'd grown tired of messaging them about our server slowing down once we reached a thousand plus users per second (it wasn't looking good either; when users begin to experience issues with a website, they tell others, who also try connecting and then tell others, ad infinitum, worsening the problem, although in our case this proved to be a blessing down the road). We were going to move somewhere we'd have full control over our servers, so we could tweak them to our heart's content, and not have to contact a remote technician whenever we had difficulty. Where? Destination DigitalOcean (for the uninitiated, they provide unmanaged, bad-ass servers).

Thus, we began this migration that was totally unplanned for. By about midnight on day two (the 12th), we had packed our bags on the old servers, so to speak, and were slowly and carefully moving camp, to ensure we didn't break anything. We hoped it'd be like the last time. Set up, then point everyone there via DNS. Boy, were we wrong! Configuring new, blank servers to match the environment Racksterly ran on and getting them to sync and work as a cluster took most of the day. We underestimated how long everything would take at every step of the way. Deciding to use a cluster we could always extend, however, was the best decision we could have ever made (in hindsight, it was just common sense). That's because it forced us to use a totally separate machine for the database (a cluster of database servers, in fact - everything had to have a backup now). And that proved pivotal.

We estimated we'd be done by the evening of the 12th. We were, sorta. When we began importing the database to the new database cluster (by about midnight, the hour of the "Red Bull" post, which was when we actually closed off access to the site), something happened that we dismissed as nothing. As the queries were executed by the much faster web servers, the database server slowed down. When we tried opening a separate configuration web page that relied on the database to load while the migration was in progress, the page took forever to load, then ended with the message, "This site can't be reached". Ring a bell? When the import ended, everything loaded fine. We should have known.

As the Racksterly site loaded with lightning speed in the early hours of the morning on Day 3, we felt we had successfully resolved the issues. But, we held off on making any announcements so we could monitor the servers and ensure they were stable. That turned out to be the right call. As users flooded in, our website slowed down again. What the heck?! Average server load was 0.1%. Something wasn't right.

If the web servers weren't breaking a sweat, and the site was slow, something was. And there was only one other place to go. Yeah, you got that right. It was the database server. We hadn't exactly taken the biggest of them all, so we thought, "maybe it's too small". We forked a new cluster, then destroyed that one (that's the reason you could sign in one moment, and couldn't sign in the next, getting the "Forking DB clusters. Please wait…" message instead).

The new cluster was massive, to put it mildly. It seemed our solution to every problem was "Get a bigger one!". As this was the largest we could get on DO, it had to work or else! Forking-Forking. Copying-Copying. Importing-Importing. Slowly and carefully. When it was ready to roll, and we pointed the web servers there and began hitting it, things looked okay. For a while. As we approached a thousand users connected per second, our site slowed down again. "This can't be happening!", we thought. Yet, there it was, unfolding right before our eyes.

Then we noticed something weird. The slow speed wasn't something we got all of the time. We had restarted the servers, and on browsing the website without signing in, it loaded incredibly fast. Once we tried signing in though (which is what everyone was doing, and which meant hitting the database), it made snails feel like Usain. We had a pattern. When we looked at the database logs, we noticed throughput averaging 14 million queries per second. We were pushing the database server hard, and for only about a thousand users connected at the time, something really nasty had to be going on.
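For the technically curious, here's roughly how you'd keep an eye on that throughput number from the outside. The sketch below assumes a MySQL-style database and the pymysql driver, plus a made-up host and credentials; none of that is a statement about our actual stack, it's just an illustration:

```python
# Rough sketch: estimate queries/sec by polling MySQL's global "Questions" counter.
# The host, credentials, and the choice of MySQL/pymysql are illustrative only.
import time
import pymysql

conn = pymysql.connect(host="db.internal.example", user="monitor",
                       password="not-our-real-password", database="mysql")

def total_queries():
    with conn.cursor() as cur:
        cur.execute("SHOW GLOBAL STATUS LIKE 'Questions'")
        _, value = cur.fetchone()
        return int(value)

previous = total_queries()
while True:
    time.sleep(5)
    current = total_queries()
    print(f"~{(current - previous) / 5:,.0f} queries/sec")
    previous = current
```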

So, we wrote a small program to log to a file whenever the database was queried. Lo and behold, there it was! Every time a user tried signing in, the program that set up his session would make a blood-curdling number of queries on the database. For each user!
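If you're wondering what that "small program" might look like, here's the gist. We're sketching with SQLAlchemy below purely for illustration (and the connection string and log path are placeholders); the post doesn't name our real stack, and neither does this:

```python
# Sketch of a query logger: append every SQL statement to a file so you can
# count how many queries a single sign-in triggers. SQLAlchemy and the paths
# here are illustrative choices, not Racksterly's real code.
from sqlalchemy import create_engine, event

engine = create_engine("mysql+pymysql://rack:not-real@db.internal.example/racksterly")

@event.listens_for(engine, "before_cursor_execute")
def log_query(conn, cursor, statement, parameters, context, executemany):
    # One line per statement; grep the file after a test sign-in to count queries.
    with open("/var/log/rack-queries.log", "a") as f:
        f.write(statement + "\n")
```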

With thousands of users waiting on each keystroke, and with fatigue dogging our steps, we proceeded to modify core parts of Racksterly. When you've been without sleep for long enough, your appetite goes out the door, and your mind begins to play tricks on you. Nothing keeps you going besides sheer will, and a refusal to stop until the job is done. Humans aren't designed to stay awake for so long, normally. But with thousands of users waiting with bated breath, sleep was a luxury we couldn't afford. This was the third day, and Red Bull was our crutch. Everyone had had so many cans the next one was having next to no effect. But there was too much riding on this - we simply couldn't quit or fail. We couldn't afford to let down the people who had trusted us with all their hearts. Or embarrass them before those they had convinced of our mettle. I guess when something is really important, you go for it with every breath in your body, no matter the odds.
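In case "modify core parts" sounds vague, the shape of the fix was roughly this: stop firing a tiny query for every little thing a fresh session needs, and pull it all in one go instead. The table names and the little `db` wrapper below are made up for illustration; the real code stays ours:

```python
# Illustration only: hypothetical tables and a hypothetical db wrapper,
# not Racksterly's real schema or code.

# Before: one round trip per item the new session needs (hundreds per sign-in).
def load_session_data_slow(db, user_id):
    posts = db.execute("SELECT id FROM posts WHERE user_id = %s", (user_id,)).fetchall()
    stats = []
    for (post_id,) in posts:
        stats.append(db.execute(
            "SELECT views FROM post_stats WHERE post_id = %s", (post_id,)).fetchone())
    return stats

# After: one combined query per sign-in.
def load_session_data_fast(db, user_id):
    return db.execute(
        "SELECT p.id, s.views FROM posts p "
        "JOIN post_stats s ON s.post_id = p.id "
        "WHERE p.user_id = %s",
        (user_id,),
    ).fetchall()
```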

By about 9pm on the 13th, we were done writing and testing. This was our last card. Our final roll of the dice. If this failed…well, better not to think about it. We re-opened login and held our breaths. 50 users, 100…300…500…as word spread that users could sign in, the number of simultaneous users climbed. 1000…1500…2500…3000…3500. Average server load had maxed out at 0.5%. Users were still logging on by the second. And the servers were still chewing through requests like nobody's business. If you were on 4G, you couldn't say your name before the site loaded (you still can't). We had dodged a bullet!

What was next? Pay everyone due for withdrawals. Except we couldn't. We had set things up in a way that made us accessible to everyone who needed access to us, so our bank had already sent officials the previous day to confirm that we weren't trying, or even planning to pull a Houdini. Obviously, the complaints had reached them. But when they saw the physical state of things around here, and went through our transaction records, it was clear as day that our hearts and souls revolved around one thing. And it wasn't "running away". By early evening on the third day, we received the news that our accounts had been unfrozen.

We still had an obstacle to surmount. Transfers on Paystack had been suspended, as we had found out the previous day. And we had received the email response we published in our previous post. There was nothing to do now but wait for the phone call from Paystack. It came in by about 7:30pm on Saturday, 14.12.2019, and lasted about 35 minutes. We could tell that this situation had caused them a real headache, with users emailing, tweeting, and probably calling, in droves. They asked many questions. We answered. They made it clear that there was a huge chance our business would be deactivated on Paystack due to the shitton of complaints they had received in the past two days or so. We dared to hope against hope. We still don't blame them. It's hard enough dealing with customer requests about one's own business. So, for just one of the businesses using their service to have given them so much of a headache couldn't have been funny. And it could have been worse if we'd actually had some funny business in our plans then or for the future.

We provided certain documents and particulars they requested after the call. But not before we were informed we'd get a response by Monday. Now, anyone who had initiated a withdrawal right before or after the server incident knows it didn't play out that way. By 7:45pm on Sunday, transfers to thousands of users were underway. By 9pm, everyone who had made a withdrawal before then had been paid. Order had been restored. We didn't know what happened, but we were grateful it had happened.

We've spent the past few days improving and fixing any bugs that may have remained.

Now, there have been some complaints after the fact. Like complaints about the new interface. So, let's discuss that for a bit. Why some of you would complain that your activity balance (now Ad Credits) was being debited, even though you'd never had an activity balance before now, your main balance was being credited in the usual fashion, and we told you we were still tweaking things, is beyond us. But again, there's nowt as strange as folk, eh?

Ad Credits are meant for publishing your own ads directly from your accounts. We noticed that most of us own or are involved in businesses on the side. And we thought to encourage and help you get some exposure by giving you some ad credits to start with. Alas, it was abused. We had posts of animals and bare chests and paper and legit products and mucky products and services and whatnot. Not exactly what we thought you'd do with it. The end result and experience for other users was an eyesore. So, we disabled ad publishing for a little while, and began cleaning things up.

By the time you read this, ad publishing will have been re-enabled. But before you make another post, there's something you need to know:

We're implementing a zero-tolerance policy on ads posted. There's a new "Report post" feature (tap the three dots under a post to see it). As the Stream is slowly becoming the biggest part of the Racksterly experience, everyone has a responsibility to ensure only high quality posts are allowed, for one another. So, if you see a post that shouldn't be there, report it. If a post is reported by enough people (we've set "enough people" to a small number), not only will the post be removed, but its owner's Racksterly account will also be disabled. Permanently. Shitty posts will not be tolerated. If you have a legitimate business or service to advertise, you have nothing to worry about. Just take the time to prepare a nice image or video (video ad publishing from your account is now available!), and write a nice description and title, and you're good to go. Oh, and don't come looking for a refund if your account is disabled for a truly shitty post. You won't get it. Post something useful, or post nothing at all.
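Under the hood, the rule is about as simple as it sounds. A rough sketch, where the helper names and the threshold are placeholders rather than our production code:

```python
# Placeholder sketch of the report-threshold rule described above.
REPORT_THRESHOLD = 5  # "enough people" (the real number stays private)

def handle_report(db, post_id, reporter_id):
    db.add_report(post_id, reporter_id)                 # hypothetical helpers
    if db.report_count(post_id) >= REPORT_THRESHOLD:
        db.remove_post(post_id)
        db.disable_account(db.post_owner(post_id))      # permanent, no refund
```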

By the way, speaking of refunds, when we saw the number of sliding tackles that had been made on our Facebook Page, we thought we'd have to make thousands of refunds. We haven't had up to 20 requests. Weird, no?

Everyone's ad credits have been reset to zero. When you want to publish a post, you can simply top up your ad credits (now you know what the top-up feature is for). Then, you can create a post from the ad credits in your ad credits balance.

There are some posts that aren't allowed on Racksterly. Let's address those:

Sexually explicit/violent content is not allowed. In fact, pretty much anything you wouldn't let your kid catch you looking at is not welcome.

Political ads are cash-cows for platforms that thrive on providing publicity. However, we won't be accepting political ads on our platform, for what should be obvious reasons. And if those reasons aren't obvious to you, then you've probably been asleep the past few years. Here's some cold water.

Spammy, scammy, or misleading content is not allowed. Neither is news. Please do not advertise news articles.

Products, businesses and services that pass the above tests are fine, until we next update the list (we'll let you know when we do).

We promised to compensate you for the days you lost while you couldn't use our service, and we have. It wasn't the most comfortable of decisions, but a promise is a promise. If you still haven't been compensated, kindly get in touch with us and we'll take care of it.

We're sorry for ever putting your integrity in doubt before those you introduced into our family. Even though we pulled through, you should never have been in that situation. We're really sorry.

Thank you for trusting and believing.

We love you. For always.

PS: We still don't have a Twitter handle.