I'm going to tell you right now: if you are building a system on MySQL, the storage engine you use is more important than you realize. I just found this out the hard way.
First off, if you've got tables which are being written to constantly from multiple angles, MyISAM is the worst engine for the job, since it locks the whole table while writing. It may not matter on a small app that's written to once a minute, but when you get to 5 to 10 writes a second from multiple "workers" or processes, the toll will be there. It will be small at the beginning, but when you factor in that some customers might be on the other side of the world, that 0.5-second load just went to 2 seconds. Sounds small, but then you get a burst of traffic, and that 2 seconds is now 4 to 6 seconds. And that's just to deliver the data to end users.
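You can watch MyISAM serialize writers yourself with two client sessions. A minimal sketch (the table name is made up for illustration):

```sql
-- Session 1: a long-running write on a MyISAM table
-- takes a table-level lock for its whole duration.
UPDATE jobs_myisam SET status = 'done' WHERE id < 1000000;

-- Session 2: even a simple read now sits in state
-- "Waiting for table level lock" (visible in SHOW PROCESSLIST)
-- until the UPDATE above finishes.
SELECT * FROM jobs_myisam WHERE id = 42;
```

Run the same pair of statements against an InnoDB copy of the table and the SELECT returns immediately: InnoDB locks rows, not tables, and plain reads see a consistent snapshot.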
You're not even factoring in the latency of crossing multiple networks, loading elements like images and JavaScript, or processing on your server's side AND on the end customer's end as well. It's an easy 15 seconds IF you are using the wrong table type.
Here is the thing about all this theory and processing speed: it's hard to visualize and understand what's going on without actual charts. So I'm going to let you peek into the backend of a SaaS I run and show how we monitor speed.
Every request is logged for data size, processing, and speed for the worker drones. But for users, we concentrate on one thing: speed. Let me show you some gradual screenshots of what speed looked like at the beginning, with MyISAM in use on one of the multiple tables users were hitting (users access multiple tables per request):
The chart covers the range from 200 days ago up to 60 days ago. What you are looking at is the "Loading Time in Seconds" chart, for requests to the main function of the SaaS. You'll notice at the green arrow we made some improvements that reduced time, once we saw things were getting a bit laggy.
We upgraded server RAM and CPU speed, upgraded data connections, and moved to bigger servers; that's the cause of that overall drop. But you will notice that when there were tons of people on the system, spikes of average loading times hit 5 seconds. Overall, though, they stay below 5 seconds after the green arrow.
But don't let that fool you; look closely... you can see another increase happening, and that's one of those "long term" increases you have to watch, since as data grows, your overall time to pull data grows. At least, so I thought. Here is a closer look at that gradual range:
That's going to be a problem, because it's not enough to keep upgrading servers and RAM to infinity; you have to optimize code. But to optimize code, you'll have to bend your mind around how to reduce overall pressure on the system, otherwise in 3 months you'll be at 10-second requests, and then your customers will start complaining that your system is slow. Remember, these are people in the digital age: 3 seconds is too long. Also take into consideration that we've got international customers, so while people in North America were seeing 1-2 second loads, the international crowd was seeing way above 10-second loading times in order to arrive at that 10-second average.
That's when I kept digging around the internet for ways to make a system faster. I had heard of InnoDB, but just thought, meh, it might cut down 1 or 2 seconds. But I was desperate, so I said fuck it and switched one table (not even the biggest table we were accessing, but the one getting hammered second hardest) to InnoDB and prayed for a stay of execution: give me at least 2 seconds less. Here were the results:
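For anyone wanting to try the same switch, the conversion itself is a one-liner. A hedged sketch (the table name is hypothetical; the ALTER rebuilds the whole table and can block writes for the duration on older MySQL versions, so do it in a quiet window with a backup):

```sql
-- Confirm what engine the table is on right now
SHOW TABLE STATUS LIKE 'requests_log';

-- Convert the hot table to InnoDB (full table rebuild)
ALTER TABLE requests_log ENGINE=InnoDB;
```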
Wow, do you see the difference? An even closer look:
Now a look at it on single day, hour by hour:
Average times for this one interface are now less than a second. That had never happened before at any level, and this table has more data than it's ever had. Simply switching a single table to InnoDB dropped ALL pressure on ALL requests to the system, and it's so fast no one even thinks about "speed" anymore. When you click, it loads. Period. When your customers aren't even in the realm of complaining about speed, when their old 10-second averages are down to 0.3-second averages, you've hit a new threshold.
Going forward, our mission was no longer about finding another tweak to stay the execution; we could focus on things like adding new features and continuing to improve the overall experience with other techniques. It's like going from a 56K modem to a 1 Gbps connection: speed is no longer a problem, and you've got it to a point where no one notices or can even complain, so now you can concentrate on other things. That one change, switching a single table to InnoDB, did all this. Imagine if I start using some voodoo like memcached or Redis...
What's ironic is that the only reason I was using MyISAM was that it was the default setup. From 5.5 onward, MySQL defaults to InnoDB... just double-check what your system is on, and whether it makes sense to alter tables to InnoDB. It made sense in our scenario.
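Checking takes seconds. Something like this (the schema name is a placeholder):

```sql
-- What engine new tables get by default on this server
SHOW VARIABLES LIKE 'default_storage_engine';

-- What engine each existing table is actually on
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_database';
```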
There is one thing I will say about this whole experiment: I would not have been able to do any of it if I didn't monitor data requests and the system as closely as I do. So if you are running a system where you can't even see what's going on, you might want to get things in place, even basics, so you know what your averages look like and what the outliers are. You can't improve what you do not measure.
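Even the basics can live in a single table. A minimal sketch of the kind of per-request log described above (all names are made up; the real system logs more):

```sql
-- One row per request, written by the application
CREATE TABLE request_log (
  id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  endpoint    VARCHAR(64)   NOT NULL,
  bytes_sent  INT UNSIGNED  NOT NULL,
  duration_s  DECIMAL(6,3)  NOT NULL,  -- wall-clock seconds
  created_at  TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  KEY idx_endpoint_time (endpoint, created_at)
) ENGINE=InnoDB;

-- Daily average and worst-case load time per endpoint:
-- the raw data behind a "Loading Time in Seconds" chart
SELECT DATE(created_at)  AS day,
       endpoint,
       AVG(duration_s)   AS avg_s,
       MAX(duration_s)   AS max_s
FROM request_log
GROUP BY DATE(created_at), endpoint
ORDER BY day;
```

With this in place, "is it getting slower?" is a query instead of a guess, and you can spot a gradual creep like the one in the charts above long before customers do.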
That's my contribution to the programming section.