Anand Sukumaran
Startup Founder, Software Engineer, Abstract thinker
Co-founder & CTO @ Engagespot (Techstars NYC '24)
Configurations to modify while vertically scaling a MySQL instance
Mar 06, 2023

Recently, at Engagespot, we came across a serious problem with the throughput of our notification delivery engine. The delivery engine is written in Node.js and uses AWS RDS for MySQL. The entire infrastructure is deployed on Amazon Elastic Container Service (ECS) with auto-scaling.
When the notification throughput of a single worker bottlenecked, we tried scaling horizontally by adding more workers. But that resulted in two things -
- We started getting limited by IOPS
- Throughput stopped increasing beyond a limit, no matter how many workers we added
We began investigating by watching the CPU/memory/IO usage metrics on the CloudWatch dashboard. Workers were performing normally, and even under high load, CPU, memory, and IO were under control. Even for the database, usage was pretty normal.
We monitored query execution times from the performance_schema tables. Queries were performing decently. But the bottleneck was with connections.
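For reference, the statement digest summary is one way to check this. Here is a rough sketch using the mysql2 driver; the columns available depend on your MySQL version, and the connection details below are placeholders.

import mysql from "mysql2/promise";

// Rough sketch: list the statement digests with the highest average latency.
// Connection details are placeholders; adjust them for your environment.
async function slowestStatements() {
  const conn = await mysql.createConnection({
    host: process.env.TYPEORM_HOST,
    user: process.env.TYPEORM_USERNAME,
    password: process.env.TYPEORM_PASSWORD,
  });

  // AVG_TIMER_WAIT is in picoseconds; divide by 1e9 to get milliseconds
  const [rows] = await conn.query(`
    SELECT DIGEST_TEXT, COUNT_STAR, AVG_TIMER_WAIT / 1e9 AS avg_ms
    FROM performance_schema.events_statements_summary_by_digest
    ORDER BY AVG_TIMER_WAIT DESC
    LIMIT 10
  `);
  console.table(rows);

  await conn.end();
}

slowestStatements().catch(console.error);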
The first bottleneck might be the MySQL connections
As we know, the number of concurrent connections in MySQL is limited by its max_connections parameter. By default, in a Docker instance, it might be set to around 100. However, keep in mind that increasing this also increases memory usage and CPU load.
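As a quick illustration (a sketch, not our exact tooling), you can compare the configured limit with the number of connections currently open. The mysql2 connection details below are placeholders.

import mysql from "mysql2/promise";

// Sketch: compare the configured connection limit with current usage.
async function checkConnectionLimit() {
  const conn = await mysql.createConnection({
    host: process.env.TYPEORM_HOST,
    user: process.env.TYPEORM_USERNAME,
    password: process.env.TYPEORM_PASSWORD,
  });

  // The server-side limit on concurrent connections
  const [limit] = await conn.query("SHOW VARIABLES LIKE 'max_connections'");
  // How many connections are open right now
  const [current] = await conn.query("SHOW STATUS LIKE 'Threads_connected'");
  console.log(limit, current);

  await conn.end();
}

checkConnectionLimit().catch(console.error);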
We had set this to a higher value to match our throughput requirements.
But that's not enough.
Whenever you open a new connection through a MySQL driver in your language (such as mysql2 in Node) to execute a query, there is overhead: authentication and authorization, network latency, memory allocation and so on all add up and slow your queries down.
That's why you should use a connection pool that keeps a number of connections open to the database, ready whenever a query needs to be executed. It's blazingly fast compared to initiating a new connection every time.
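For example, with the raw mysql2 driver a pool looks roughly like this; the connectionLimit and credentials below are illustrative, not our production values.

import mysql from "mysql2/promise";

// A pool keeps connections open and reuses them across queries,
// instead of paying the connection setup cost on every query.
const pool = mysql.createPool({
  host: process.env.TYPEORM_HOST,
  user: process.env.TYPEORM_USERNAME,
  password: process.env.TYPEORM_PASSWORD,
  database: process.env.TYPEORM_DATABASE,
  connectionLimit: 20, // illustrative; keep the total across workers below max_connections
});

export async function ping(): Promise<void> {
  // Each call borrows a connection from the pool and returns it when done
  await pool.query("SELECT 1");
}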
Defining a connection pool in TypeORM
TypeORM uses a connection pool by default, but the maximum number of connections it keeps open is 10. We should adjust this with the max_connections parameter of our MySQL server in mind. Always make sure not to set the pool size all the way up to max_connections, though: MySQL might be using several connections internally, so the connections actually available will be lower than the max_connections limit.
In the latest version of TypeORM, there is a database-agnostic property, poolSize, in the DataSource options that you should use to define the pool size.
import { DataSource, DataSourceOptions } from "typeorm";

const DataSourceConfig: DataSourceOptions = {
  type: "mysql",
  host: process.env.TYPEORM_HOST,
  port: Number(process.env.TYPEORM_PORT),
  username: process.env.TYPEORM_USERNAME,
  password: process.env.TYPEORM_PASSWORD,
  database: process.env.TYPEORM_DATABASE,
  // Keep this below the server's max_connections limit
  poolSize: 100,
};

export const AppDataSource = new DataSource(DataSourceConfig);
Other parameters you can tune to improve MySQL server performance
- innodb_buffer_pool_size - the memory area used to cache data and indexes from InnoDB tables. Increase this to allocate more memory to the buffer pool, which can help reduce disk I/O and improve performance.
- innodb_flush_log_at_trx_commit - determines how frequently changes to the database are written to the transaction log file.
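If you want to see what your server is currently running with, a query along these lines works. This is a sketch; on RDS these parameters are changed through the DB parameter group rather than at runtime, and the connection details below are placeholders.

import mysql from "mysql2/promise";

// Sketch: inspect the InnoDB settings discussed above.
// On RDS, change them via the DB parameter group, not SET GLOBAL.
async function showInnodbSettings() {
  const conn = await mysql.createConnection({
    host: process.env.TYPEORM_HOST,
    user: process.env.TYPEORM_USERNAME,
    password: process.env.TYPEORM_PASSWORD,
  });

  const [rows] = await conn.query(
    "SHOW VARIABLES WHERE Variable_name IN ('innodb_buffer_pool_size', 'innodb_flush_log_at_trx_commit')"
  );
  console.table(rows);

  await conn.end();
}

showInnodbSettings().catch(console.error);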
By performing these optimizations, we observed a 30% average increase in throughput for our delivery workers. However, the database continues to be a bottleneck unless we upgrade the VM size or do heavy in-memory caching to reduce the number of hits to the database.
In another experiment, with heavy use of in-memory caches and batched write queries, I observed a whopping 2500% increase in the throughput of a single worker. But this is still in the experimentation phase, and a lot of things need to be tackled before we implement it.
I’ll write more about it in another post.