The other day a coworker was remarking that their test suite ran much more slowly than mine. Much more slowly than the build server, too. And by the other day I mean the other day, last month, and probably a few other times.
Some Background On How We Test
All of our test suites start out by initializing the database model. This includes, among other things, emptying the test database and creating all new tables, procedures, views, etc. using sqlalchemy's model system.
Our core system includes hundreds of database objects. Our core products are supported by a single model library containing all the models for all of our projects.
Constructing all of these objects is not a big problem. They load relatively quickly. And in production the schema isn't being constantly modified.
After creating all of these tables all tests run on these tables and each test is responsible for ensuring the state in the tables it may be testing against is what it expects, including cleaning up the tables beforehand and adding any initial objects it needs.
Tests then proceed to call whatever functions (or web endpoints, or whatever) and returns are inspected for correctness. Too, the database state is inspected to check for any side effects.
Hysterical Raisins
Since I've been using the same development sandbox for a long time (and for a lot more than just development) my configuration has "evolved" (or degraded, depending on your perspective) over time.
One aspect that had seen numerous modifications was my PostgreSQL configuration. Over the years, I have tweaked and tuned the config here in order to speed up my own queries or to test optimizations that would ultimately end up in production or staging.
The result of this was that I had inadvertently made my tests run quickly! Even though my sandbox (we use virtualbox running on osx with ubuntu for our local development systems) had fewer allocated resources in terms of CPU and memory, careful optimizations had allowed it to run my tests significantly faster.
The Config
Here's a rundown of what the relevant configuration options were:
shared_buffers = 1GB # min 128kB work_mem = 32MB # min 64kB maintenance_work_mem = 256MB # min 1MB wal_level = minimal # minimal, archive, or hot_standby fsync = off # turns forced synchronization on or off synchronous_commit = off # synchronization level; on, off, or local full_page_writes = off # recover from partial page writes
shared_buffers: This
is the amount of data that PostgreSQL will store in its own
shared memory. This primarily contains cached data. Keeping
more data in cache generally speeds up all workloads, even
fairly write heavy ones.
work_mem: This
is the amount of memory that PostgreSQL will use to execute a
query. If a dataset can be loaded into memory and operated on
in this much memory, it will. For example, a quicksort can run
in memory, but all the data must be in memory. If the data fits
in memory, it will be loaded in memory and sorted there.
Otherwise, an on-disk merge sort may be performed.
maintenance_work_mem: This
is similar to the above configuration parameter, but for
maintenance type operations such as CREATE INDEX.
wal_level: This
determines how much information is written to the write ahead
log (WAL). Since we aren't making backups or performing
replication on our sandboxes, mininal is the best choice for
this.
fsync: This
parameter ensures that data is written to disk to prevent
corruption in the event of a crash. YOU SHOULD ONLY SET THIS TO
"off" IF YOU ACCEPT THE RISK OF CORRUPTION. Since our sandboxes
can be recreated from scratch fairly easily, using this is
safe.
synchronous_commit: This
parameter allows for certainty that a transaction was completed
and written to disk. When "off", a transaction may report as
committed but still may be lost in the event of a crash.
full_page_writes: This
parameter controls whether or not a full page is written to WAL
after a checkpoint. Not writing the page is a performance
improvement but can result in silent, unrecoverable corruption
in the event of a system failure.
But What About That Build Server?
The build server had benefited from sharing parts of our production config as well as having its data directory mounted on a ramdisk. Both of these helped the build server cruise along where my coworker's development sandbox crawled.
How We'll Avoid This In The Future
The above will be the settings recommended in our sandbox setup guide as well as in the vagrant setup script.