Digg.com has over 1 million users on their LAMP-based web application. Their architecture includes a load balancer that distributes web traffic across several web servers. Behind the web servers, a single master MySQL server replicates to several slave MySQL servers. Each web server runs PHP and connects to a random slave MySQL server.
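To make the master/slave layout concrete, here is a minimal sketch of how a web server might choose a database host. It is written in Python rather than Digg's PHP, the host names are made up, and it assumes that writes go to the master (replication only flows from master to slaves) while reads go to a randomly chosen slave. Treating every statement that starts with SELECT as a read is a simplification, not something the talk described.

```python
import random


class ReplicatedPool:
    """Routes writes to the single master and reads to a random slave."""

    def __init__(self, master, slaves):
        if not slaves:
            raise ValueError("at least one slave is required")
        self.master = master
        self.slaves = list(slaves)

    def host_for(self, sql, rng=random):
        # Assumption: anything that starts with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return rng.choice(self.slaves) if is_read else self.master


# Hypothetical host names, for illustration only.
pool = ReplicatedPool("db-master", ["db-slave-1", "db-slave-2", "db-slave-3"])
write_host = pool.host_for("INSERT INTO stories (title) VALUES ('hello')")
read_host = pool.host_for("SELECT title FROM stories")
```

Here `write_host` is always the master, while `read_host` can be any of the slaves, which spreads the read load across them.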
An important part of their infrastructure is memcached. With memcached, they cache content that is faster to retrieve from the cache than to query from the database. When a PHP page is executed, it queries memcached first and then renders the page. If memcached doesn’t have the requested content cached, PHP queries the MySQL servers and then stores the result in memcached for future lookups.
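The lookup flow described here is often called cache-aside. A rough sketch in Python follows, with a plain dict standing in for memcached and a hypothetical `fake_query` function standing in for the MySQL query. Neither is Digg's actual code.

```python
def get_or_load(cache, key, load_from_db):
    """Return the cached value, or load it, cache it, and return it."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)  # cache miss: fall back to the database
        cache[key] = value         # store the result for future lookups
    return value


db_calls = []  # records how often the "database" is actually hit


def fake_query(key):
    # Hypothetical stand-in for a MySQL query.
    db_calls.append(key)
    return "content for " + key


cache = {}  # stand-in for a memcached client
first = get_or_load(cache, "story:42", fake_query)
second = get_or_load(cache, "story:42", fake_query)
```

Only the first call reaches `fake_query`. The second is served from the cache, which is the whole point of the pattern.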
They have several servers running memcached to distribute the cache storage.
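The talk did not say how keys are spread across the memcached servers. Memcached clients generally hash the key to pick a server, so a simple modulo scheme is sketched below in Python. The server names and the choice of CRC32 are assumptions made for illustration.

```python
import zlib


def server_for(key, servers):
    """Pick a cache server by hashing the key (simple modulo scheme)."""
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]


# Hypothetical server list, for illustration only.
servers = ["cache-1:11211", "cache-2:11211", "cache-3:11211"]
chosen = server_for("story:42", servers)
```

The same key always maps to the same server, so every web server looks in the same place. One known drawback of the modulo approach is that adding or removing a server remaps most keys. Consistent hashing is the usual way to reduce that.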
They divided their MySQL slaves into shards. One of the shards of slaves handles searches. They have two other shards, but the presenter went through them too quickly for me to catch. There are several types of shards:
- Table-based – put tables on dedicated servers
- Range-based – put a range of users or topics on dedicated servers
- Date-based – partition the data by date across dedicated servers
- Hashed – users or topics are stored on a particular server and referenced via a lookup
- Partial sharding – not quite sure, but it sounds cool
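To illustrate the first four shard types in the list, here is a small Python sketch of how a query could be routed to a shard under each scheme. All shard names, table names, and boundaries are invented. The hashed variant computes a hash directly instead of consulting a lookup table, which is only one possible reading of that bullet. Partial sharding is left out because the talk notes do not define it.

```python
import bisect
import datetime
import zlib

# Table-based: each table lives on a dedicated server (hypothetical names).
TABLE_SHARDS = {"users": "shard-a", "stories": "shard-b"}

# Range-based: exclusive upper bounds of user-id ranges, in ascending order.
RANGE_BOUNDS = [1000, 2000]
RANGE_SHARDS = ["shard-a", "shard-b", "shard-c"]  # last shard takes the rest

# Hashed: the shard is derived from a hash of the id.
HASH_SHARDS = ["shard-a", "shard-b", "shard-c"]


def by_table(table):
    return TABLE_SHARDS[table]


def by_range(user_id):
    # bisect_right returns the index of the first bound greater than user_id.
    return RANGE_SHARDS[bisect.bisect_right(RANGE_BOUNDS, user_id)]


def by_date(day):
    # Date-based: one shard per calendar year in this sketch.
    return "shard-%d" % day.year


def by_hash(user_id):
    digest = zlib.crc32(str(user_id).encode("ascii"))
    return HASH_SHARDS[digest % len(HASH_SHARDS)]
```

For example, `by_range(999)` lands on `shard-a` and `by_range(1000)` on `shard-b`, while `by_date(datetime.date(2007, 4, 1))` yields `shard-2007`.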
Digg is not using any built-in MySQL partitioning, federated tables, or clustering. MySQL 5.1 has some mechanisms to handle these shards, but they didn’t exist when Digg built its setup, so Digg had to roll their own.
Their MySQL databases run on Debian-based Linux. Debian’s apt made it extremely easy to upgrade packages. They are running MySQL 5, but they didn’t notice a huge performance increase compared to MySQL 4.1. They use around 20 MySQL servers, and memcached runs on about 9 of those MySQL slave servers.
They have reached a level where they can no longer solve database problems by adding more RAM. Digg has begun rewriting queries to optimize I/O. Their database is around 30GB and constantly growing. They have used Cacti for monitoring their MySQL servers, but it can be painful to use in an ever-changing infrastructure.
For full-text searching, they use Apache Lucene on a couple of servers.
They store user images on XFS partitions. XFS is supposedly better at handling a large quantity of files and is very robust. From their testing, ext3 was slower and not as reliable.
Their presentation is available online at http://eliw.com.