
Server Performance Tuning

I believe I've fixed the backup situation too: it no longer takes nearly an hour while consuming resources. It now runs over about a 10-minute period and is then immediately offloaded to Amazon S3 for storage.
 
Was curious if any work was being done currently. (I'm assuming you can use the timestamp to know when I'm referring to.) Server is the slowest it's been for me in a long time. Not complaining, just providing info in case it's useful.
I've done some further monitoring for this, and the result is that I had to change more PHP settings. I have temporarily added Google Analytics to the site to measure the real-time traffic that gets through to the server after Cloudflare has blocked or refused the rest. What is happening is that there are constant peaks and troughs. In one five-minute period I will have 20 active users, then it will go to 300 or 500 for five minutes. When you start doing the math on page loading times, how long a page takes, and so on, it's a lot.
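For anyone who wants to see the shape of that math, here is a rough sketch. It only uses the user counts mentioned above; the 10 seconds between page loads and the 0.5 seconds per page are made-up illustrative numbers, not measurements from this server, and `busy_workers` is a hypothetical helper name.

```python
def busy_workers(active_users: int,
                 seconds_between_page_loads: float,
                 seconds_per_page: float) -> float:
    """Estimate how many PHP workers are busy at once.

    Arrival rate (pages per second) multiplied by the time each page
    keeps a worker busy gives the average number of workers in use.
    """
    requests_per_second = active_users / seconds_between_page_loads
    return requests_per_second * seconds_per_page


# ASSUMED values for illustration only (not measured on this site):
GAP = 10.0        # seconds between page loads per active user
PAGE_TIME = 0.5   # seconds a page keeps one worker busy

for users in (20, 300, 500):
    print(users, "users ->", busy_workers(users, GAP, PAGE_TIME), "busy workers")
```

With those assumed numbers, a trough of 20 users keeps about one worker busy while a peak of 500 keeps about 25 busy, which is why a small fixed pool struggles during peaks.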

Basically, I just changed PHP settings again to scale workers out further to handle peaks, but drop them off in troughs. It can sound counterproductive when fully explained, but it works. By default the server has about 5 workers allocated, so each one becomes heavily loaded and starts fighting with itself, and each will consume 60% - 80% CPU (300% - 400% in total), where 100% is 1 CPU core. It was actually worse than this. I scaled that out to 40, but it still wasn't enough. When you scale the workers you spread the load and remove the fighting, so basically each user online per second has their own worker (hopefully). The result is that whilst you have more workers, you end up with something more like 5% per worker (200% total, or 2 cores). I have scaled it out to 80 now, which will soak up the peaks better; as a result, each worker uses less CPU again, around 2% (160% total, or 1.6 CPU cores).
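The arithmetic behind those figures can be checked with a small sketch. The worker counts and per-worker percentages are the ones quoted above; the function names are just illustrative, and it assumes every worker in a pool uses roughly the same share of CPU.

```python
def total_cpu_percent(workers: int, per_worker_percent: float) -> float:
    """Total CPU across all workers, where 100% equals one CPU core."""
    return workers * per_worker_percent


def cores_used(total_percent: float) -> float:
    """Convert a total CPU percentage into a number of cores."""
    return total_percent / 100.0


# (workers, approximate CPU % per worker) as quoted in the post
scenarios = [
    (5, 60.0),   # default pool, low end of the 60% - 80% range
    (5, 80.0),   # default pool, high end of the range
    (40, 5.0),   # first scale-out
    (80, 2.0),   # current setting
]

for workers, per_worker in scenarios:
    total = total_cpu_percent(workers, per_worker)
    print(f"{workers} workers at {per_worker}% each = "
          f"{total}% total, about {cores_used(total)} cores")
```

More workers each doing less work ends up costing fewer cores in total here, going from 3 to 4 cores with 5 workers down to about 1.6 cores with 80.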

This is not a linear thing; you can't keep scaling out like that, because there is a sweet spot, and beyond it you have the opposite issue again. So I've manually overridden the settings I implemented a few days ago and scaled out further to try and limit CPU use during peaks.

Hopefully what you experienced, fingers crossed, has ended.

All of this has consequences for other areas, i.e. database load. When PHP becomes overloaded, it screws up database calls, the database then becomes loaded too, and the consequences roll through multiple areas. It's a balance, and I am trying to find it right now.

Singapore is currently on my shit list... the abuse coming from that country is just astounding me right now. I have it managed, but it's just mind-boggling.
 
Ok, I've upgraded the server to higher specs too, to help with additional load during peaks. As such, I have increased the donation goal to match most of the cost.
 
The stuff you find when you go looking. I have identified a large issue which is related to the new VPN blocking. I hope to have a fix in the next 24 hours for the issue it is creating: it loads a massive IP table with each page load, even though the table is in Redis cache.
 
I found the main culprit last night before bed and confirmed it this morning, so I could downgrade the server specs again, and thus the donation goal, as the primary issue is now handled. For those who understand pictures better, you can see what the server was doing before I made the change last night, and then the smooth operation while I was sleeping, which confirmed it. That is how a server should be: even when handling high amounts of traffic at once, it should be fairly smooth when there is nothing clogging the pipes.

Screenshot 2026-02-17 064150.webp
 
Been doing a bunch of stuff, which explains any brief downtime you experienced. End result: the server put through 170k users in one hour yesterday, and it barely flinched. Simply put, the server is now well tuned to handle massive traffic spikes and you should not experience any issues.
 
Somewhere in tuning and fault-finding a bunch of problems on the server, I screwed up email sending for this site. Fixed that just now, for those who expected email and weren't getting it.
 
Ok, the server ran out of memory overnight, killing Elasticsearch. I just upgraded the server from 16 GB to 20 GB of RAM so it has more room to deal with random things. Fingers crossed that is the end of server tuning.
 
Ok, I made a caching change. Please tell me if you notice anything weird, because caching is the place where weird things happen, like a page showing as a different user when it isn't. It won't breach privacy and it won't actually log anyone in as another user, as security goes beyond that, but caching can make a page look like another user's. It shouldn't, because users are set to bypass the cache... but shit happens with caching changes.
 
