Meet the RotatorContainer

Apr 25, 2008

I’ve been sitting on this code for a while and decided to clean it up and submit it to the Dojo JavaScript Toolkit.

The RotatorContainer cycles through dijit.layout.ContentPanes and provides navigation in the form of tabs or a pager. There are a number of timing settings you can adjust, as well as the layout of the controls.

Give it a try!

Here’s how you can use it:

<script type="text/javascript">
dojo.require("dojo.parser");
dojo.require("dijit.layout.ContentPane");
dojo.require("dojox.layout.RotatorContainer");
</script>

<div dojoType="dojox.layout.RotatorContainer" id="myRotator" showTabs="true"
      autoStart="true" transitionDelay="5000">
    <div dojoType="dijit.layout.ContentPane" title="1">
        Pane 1!
    </div>
    <div dojoType="dijit.layout.ContentPane" title="2">
        Pane 2!
    </div>
    <div dojoType="dijit.layout.ContentPane" title="3">
        Pane 3!
    </div>
</div>

NOTE: There is some CSS needed to make the RotatorContainer look correct. Include or copy the contents of the dojox/layout/resources/RotatorContainer.css into your CSS file.

The magic happens once you have some content. It takes a bit of time to tweak things, but you can pretty much do anything you can imagine.

The RotatorContainer can be controlled by a pager, which includes a play/pause button, next and previous buttons, and the current and total pane counts. You can also have as many pagers as you’d like. The pager code looks like this:

<script type="text/javascript">
dojo.require("dijit.form.Button");
</script>

<div dojoType="dojox.layout.RotatorPager" rotatorId="myRotator">
    <button dojoType="dijit.form.Button" dojoAttachPoint="previous">Prev</button>
    <button dojoType="dijit.form.ToggleButton" dojoAttachPoint="playPause"></button>
    <button dojoType="dijit.form.Button" dojoAttachPoint="next">Next</button>
    <span dojoAttachPoint="current"></span> / <span dojoAttachPoint="total"></span>
</div>

One thing I should note: if this widget finds its way into dojox, it’s possible the name or functionality will change.

If time had permitted, I would have liked to add additional transitions such as a left-to-right wipe. Oh well.

Enjoy!


Apache has a neat module called mod_dbd that allows your Apache modules to connect to a database. mod_dbd interfaces with apr_dbd, an Apache Portable Runtime (APR) abstraction layer around database specific drivers.

Back when Ubuntu 7.04 (feisty) was released, a MySQL driver was not bundled with Apache due to licensing concerns. So, in order to use mod_dbd to connect to a MySQL database, you need to get the MySQL driver source code from WebThing (apr_dbd_mysql.c) and manually recompile apr-utils.

You also need the source code for Apache 2.2.3 (which includes apr-utils 1.2.7) from the Ubuntu 7.04 repositories; then copy the apr_dbd_mysql.c file into the Apache source’s apr-utils/dbd directory. The Ubuntu guys included a nice INSTALL.MySQL file in apr-utils with some basic instructions.

What they don’t tell you is that you need to install the MySQL source. To make matters worse, once you install it, the apr-utils 1.2.7 configure script can’t find it, even if you tell it where it is.

<snip>
configure: checking for mysql in /usr/src/mysql-dfsg-5.0-5.0.38/include
checking mysql.h usability... no
checking mysql.h presence... no
checking for mysql.h... no
<snip>

This apparently was a known issue and was fixed in apr-utils 1.2.8.

Starting with apr-utils 1.2.11, the MySQL driver is bundled with it. Unfortunately, even Ubuntu 7.10 (gutsy) still ships with apr-utils 1.2.7. So, you are forced to download the source and compile.

Or you can wait a couple of days for Ubuntu 8.04 (hardy), which has Apache 2.2.8 and apr-utils 1.2.11. In theory, the MySQL driver will work out of the box.

As for me, I’ll be compiling Apache, PHP, MySQL, memcached, and <insert essential infrastructure software> from source like I should have done in the beginning.


MySQL Conference Day 4 Thoughts

Apr 17, 2008

Scaling out MySQL: Hardware Today and Tomorrow

Jeremy Cole and Eric Bergen over at Proven Scaling LLC gave a talk about the hardware side of MySQL. They covered pretty much every aspect of hardware.

For starters, Jeremy said to go with 64-bit hardware and a 64-bit operating system. For CPU, faster is better. The current versions of MySQL and InnoDB don’t take full advantage of 8-core servers, so unless you have the budget, Jeremy recommended a single quad-core or a dual dual-core setup. He also recommended getting as much RAM as possible. RAM is cheap, so go for 32GB, or at least 16GB.

For storage, Jeremy discussed the many options, including direct attached storage (DAS), SAN, NAS, and the various hard drive interfaces. From what I gathered, they prefer configuring each DB server with RAID 10. If the RAID controller has battery-backed cache, then you should use “write back”, otherwise “write through”. Write back offers faster performance since it caches the data and doesn’t make the system wait for the data to be written to disk; the battery-backed cache means you won’t lose the pending writes if the system loses power. There was a brief discussion of SATA vs. SAS. SAS offers faster drives (15,000 RPM), and the drives have processors to handle commands, just as SCSI drives do, which improves performance. SAS has another interesting feature: a single drive can be hooked up to two separate SAS controllers in case one controller becomes unavailable.

They buy all of their gear from Dell, but HP, Sun, and IBM are good too. Dell just happens to be significantly cheaper, especially when you go through a sales rep. They mentioned some of the smaller guys, including SuperMicro and Silicon Mechanics. I personally really like SuperMicro’s 6015T server because it has 2 nodes in a 1U chassis. This is actually denser than any blade server solution I’ve ever seen. Each node is capable of two quad-core processors and 32GB of RAM. The only downside is you can only have 2 hard drives, and both nodes share a single non-redundant power supply. So this would make a decent slave, but you would need to architect your application so it can quickly pick another slave if/when it goes down, or use MySQL Proxy.

For databases using InnoDB, they said the InnoDB buffer pool should be 2GB less than the total system memory, so 14GB on a 16GB system. Jeremy mentioned special hardware to speed things up, specifically Kickfire and Violin Memory. Kickfire is a SQL appliance that includes a special SQL chip to speed up operations significantly. Violin Memory’s 1010 memory appliance is sweet. For only $170k you can add 512GB of DRAM in 2U to your database server over a PCI-Express bus. It holds 84 x 6GB chips that can be hot swapped. You can lose 2 sticks before you’re screwed.

Jeremy concluded with high-speed interconnects, including InfiniBand and Dolphin Interconnect. InfiniBand is fast, and you can hook the nodes into a switch. Dolphin’s interconnect is also fast, with the nodes chained together in a loop similar to external SCSI devices, but you need to make sure they have a driver for your hardware.

I talked to Jeremy after his talk and asked him about diskless slaves, which would basically use a RAM drive for the data. While it would be fast, it would take memory that could otherwise be used by MySQL, and the slaves would be a pain to manage when they come online. So scratch that idea.

Helping InnoDB Scale on Servers with Many CPU Cores and Disks

One of the more popular talks was by Mark Callaghan at Google, who talked about ways they managed to get InnoDB to take advantage of systems with more than 4 cores and many disks. The primary change they made was to InnoDB’s mutex code, which is used to control concurrent reads/writes to pages.

They replaced the existing pthreads mutex code with a more efficient, platform-specific compare-and-swap (CAS) CPU instruction and got much better performance. He said they are hoping to get a patch out with their changes by the end of the year; they don’t want to release it until they know it is rock solid.

Scaling Heavy Concurrent Writes In Real Time

Dathan Pattishall, formerly with Flickr and now with RockYou.com, talked about an analytics system he helped build for Flickr. Flickr keeps track of each photo’s stats, including external links. Whenever someone directly embeds a picture from a Flickr Pro user, they record that information, then make those stats available in near real time.

The old design basically involved inserting records as they came in, but it was killing the servers, especially since those servers were also handling reads for people viewing the stats. Their solution was to create a separate Java daemon that queues up pending inserts. This means only a single thread is used on the MySQL server, and it doesn’t block the web servers from serving up the information.

They are inserting the stats into 3 tables: one each for daily, weekly, and monthly stats. To keep things in order, they tried a VARCHAR of the URL as the primary key, but ran into major performance issues. So instead they decided to store a hash as a BIGINT:

// php
$id = hexdec(substr(md5($url), 0, 16));

This code generates a 32-character MD5 of the URL, takes the first 16 characters, and converts them from a string of hex digits to a base-10 number. The resulting number fits perfectly in a BIGINT.
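For illustration, the same trick can be sketched in Python (the function name is my own):

```python
import hashlib

def url_to_id(url):
    # First 16 hex digits of the MD5 = 64 bits, parsed as a base-16 number.
    # Python ints are arbitrary precision, and the result fits a BIGINT UNSIGNED.
    return int(hashlib.md5(url.encode("utf-8")).hexdigest()[:16], 16)
```

Collisions are theoretically possible with a truncated hash, but at 64 bits they are vanishingly rare at this scale.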

He also mentioned using ibbackup for backing up the databases, but it is not a free solution.

Geo Distance Search with MySQL

Ever since the Google Maps API was released, I’ve had an interest in playing around with it. Alexander Rubin of MySQL talked about ways of querying for locations within a given distance of a lat/lng. He first abstracts the distance math into a user-defined function (UDF), then calls the UDF from within the query.
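The distance math inside such a UDF is typically the haversine great-circle formula; here is a sketch in Python (assuming a spherical Earth and kilometers):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two lat/lng points, in kilometers.
    r = 6371.0  # mean Earth radius, km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))
```

In SQL you would then filter with something like `WHERE distance(lat, lng, :lat, :lng) <= :radius`, ideally after a cheap bounding-box pre-filter so an index can be used.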

I’ve already played with geo searching before, so it was mostly review. He didn’t go into much depth on topics such as MySQL’s spatial extensions.

Dinner at the Tied House

We had a great turnout of around 18 people at the Tied House in Mountain View, including a number of people from MySQL, PrimeBase, and Facebook.

Thanks to the PrimeBase guys! They have a neat transactional storage engine that supports streaming blob data. Normal blobs in MySQL are held in memory during the transaction; the PBXT storage engine is designed to stream blob data in and out very efficiently.

I’d like to give a special thanks to Jay Pipes for getting me to come to the conference this year. I truly had a great time. Thanks!


MySQL Conference Day 3 Thoughts

Apr 16, 2008

Keynotes

The conference committee managed to get Rick Falkvinge of the Swedish Pirate Party to speak. I heard him speak at OSCON 2007. What I took away from his talk is that copyright is evil. Copyright is the excuse industries (e.g. the music industry) are using to justify monitoring all of your communications. Not only do they want to monitor you, but also to prohibit certain kinds of communication. What it comes down to is your privacy vs. copyright. It’s scary stuff.

The second part of the keynote was a panel consisting of representatives from MySQL, Sun, Flickr, Fotolog, Wikipedia, Facebook, and YouTube, discussing scaling at each of their sites. It was a great discussion, informative and funny. Paul Tuckfield of YouTube had a great saying: “Replication is the answer. You just need to rephrase the question.” Farhan “Frank” Mashraqi of Fotolog made an interesting observation: Sun SPARC Niagara 1 servers make great master servers due to their high speed, while Sun SPARC Niagara 2 servers make great slave servers due to their large concurrency.

Grand Tour of the information_schema

The information_schema database is a built-in database that contains metadata about your data, including tables, partitions, privileges, character sets, constraints, indexes, server settings, server status, and routines. It is an alternative to MySQL’s proprietary SHOW commands.

I see real utility in being able to query the information_schema database to check server status. Another interesting use is auto-generating schema documentation. I’m curious what kind of user metadata you can associate with objects.

Applied Partitioning and Scaling Your Database System

Phil Hildebrand gave his talk about the different ways of partitioning your data. The types are range, hash, key, and list. You can read more about partitioning types in MySQL’s documentation.

MySQL Performance Under a Microscope: The Tobias and Jay Show

This was an entertaining talk by MySQL’s Tobias Asplund and Jay Pipes. They showed the results of a few benchmarks comparing multiple ways to do something.

In the first test, they wanted to see the fastest way to get the total count of records. They tried a handful of approaches, and COUNT(*) with query caching enabled was the fastest.

One of the other interesting tests they did was DATETIME vs. INT UNSIGNED for storing a date. The best method was to use an INT UNSIGNED and do the date-to-int conversion on the application tier. In PHP, use the strtotime() function.
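The idea, sketched in Python (the PHP equivalent of the parsing step is strtotime()):

```python
import calendar
import time

def to_epoch(datestr):
    # Parse 'YYYY-MM-DD HH:MM:SS' (treated as UTC) into a Unix timestamp,
    # which fits in an INT UNSIGNED column.
    return calendar.timegm(time.strptime(datestr, "%Y-%m-%d %H:%M:%S"))

def from_epoch(ts):
    # Convert the stored integer back into a display string.
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(ts))
```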

The MySQL Query Cache

Query caching can bring huge performance gains to your web application. Baron Schwartz of Percona gave a talk describing why query cache rocks.

MySQL caches query results, not execution plans. It stores the results in a big hash table where the key is the query text. The key is case-sensitive and whitespace-sensitive. Only SELECT statement results are cached, since it doesn’t make a whole lot of sense to cache INSERT or UPDATE results. And only deterministic queries are cached: if the query contains a non-deterministic function call, such as a function that returns the current time, the results cannot be cached.

You can display the query cache information by executing the following:

SHOW GLOBAL STATUS LIKE 'qcache%';
SHOW GLOBAL VARIABLES LIKE 'query_cache%';

The way the query cache memory is allocated can cause fragmentation. You can get a feel for how bad it is by comparing the number of free blocks to the total number of blocks. If you are running out of free blocks, you have either filled your cache or you have bad fragmentation.

Grazr: Lessons Learned Building a Web 2.0 Application Using MySQL

The talk about Grazr was given by Patrick Galbraith and Michael Kowalchik. Patrick is one of the fellows who showed off some awesome memcached stuff at the tutorial and the BOF. Grazr filters feeds down to only the information it thinks you’d be interested in. This was a pretty general discussion, and they managed to get through their slides pretty quickly. Since the talk was winding down early, I headed over to Eli’s talk.

Help, My Website has been Hacked! Now What?

If you have a popular site, you are going to get people attempting to hack it. Eli White of Digg talked about some of the ways your site can get hacked.

One thing he pointed out that I hadn’t thought about is that you can’t just block someone’s IP address. If there is a proxy between the user and the web server, the IP address you get is the proxy’s, not the user’s. You need to check the x-forwarded-for HTTP header. If more than one proxy is involved, x-forwarded-for will contain a comma-separated list of addresses.
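A minimal sketch of that logic (Python; note that x-forwarded-for is client-supplied and trivially spoofed, so don’t trust it for security decisions):

```python
def client_ip(remote_addr, x_forwarded_for=None):
    # Each proxy appends to x-forwarded-for, so the left-most entry is
    # the original client; fall back to the socket peer address.
    if x_forwarded_for:
        return x_forwarded_for.split(",")[0].strip()
    return remote_addr
```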

I talked to Eli after his session, and he recommended blocking the IPs on the firewall instead of in the PHP code. This means less load on the app server, but unless you have a fancy firewall, I would be curious to know how often a particular IP is trying to attack me.

Performing MySQL Backups Using LVM Snapshots

The last session of the day was by Lenz Grimmer of MySQL. LVM snapshots can be a great way to backup your databases, especially InnoDB. The basic procedure is:

  • flush tables
  • flush tables with read lock
  • lvcreate -s
  • show master/slave status
  • unlock tables
  • mount snapshot, perform backup
  • unmount and discard the snapshot
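The sequence above can be sketched as an ordered command plan (Python; volume names like vg0/mysql are placeholders, and in practice the read lock must be held in a single open session while lvcreate runs, which is exactly what mylvmbackup takes care of):

```python
def lvm_backup_plan(vg="vg0", lv="mysql", snap="mysql-snap", mnt="/mnt/snap"):
    # Ordered steps for an LVM-snapshot backup of MySQL.
    return [
        "mysql: FLUSH TABLES",
        "mysql: FLUSH TABLES WITH READ LOCK",
        f"shell: lvcreate -s -L 1G -n {snap} /dev/{vg}/{lv}",
        "mysql: SHOW MASTER STATUS",  # record binlog position for replication
        "mysql: UNLOCK TABLES",
        f"shell: mount /dev/{vg}/{snap} {mnt}",
        f"shell: tar czf /backup/mysql.tar.gz -C {mnt} .",
        f"shell: umount {mnt}",
        f"shell: lvremove -f /dev/{vg}/{snap}",
    ]
```

The key property is that the table lock is held only for the instant it takes to create the snapshot, not for the duration of the backup itself.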

InnoDB ignores the “flush tables with read lock” step, but if you have any MyISAM tables, you’ll still need to do it. Flushing the tables does impact performance, especially while the snapshot is active. As soon as you mount the LVM partition snapshot, you can back it up and then unmount and discard the snapshot.

There is a Perl script called mylvmbackup which can help with these procedures.

An alternative to LVM snapshots for backups is to replicate to a slave server, stop the replication, perform the backup on the slave, then start replication again. The downside is that it requires an extra machine as the slave, on which MySQL can be stopped so that InnoDB tables can be properly flushed.

MySQL Quiz Show and Sun party

The quiz show was an absolute blast. The show was moderated by the infamous Jay Pipes. Facebook was kind enough to sponsor the quiz show this year. There was plenty of beer and popcorn to go around. People won a ton of books, and Sheeri Kritzer Cabral won the grand prize: an Apple iPhone. Lucky!

Everybody ended up coming out of the woodwork for the Sun after party. It was nice to finally get to meet Baron Schwartz. Everybody should go buy his book, High Performance MySQL!

I also had a chat with Brian Moon of dealnews.com. He claims PHP can be made to work with the Apache worker MPM. Hmm, looks like I have a new project!


Keynotes

The keynote was kicked off by Marten Mickos. If you’ve never met Marten, he is, on a personal note, one of the greatest CEOs I’ve ever met. The keynotes were especially interesting for me because it was the first time I’ve had the opportunity to listen to Jonathan Schwartz, the CEO of Sun Microsystems. Jonathan seems like a great guy who gives the impression he "gets it".

The last keynote was by Werner Vogels of Amazon. His talk covered Amazon’s growth and the new services they offer, including EC2. He announced that EC2 now supports persistent storage, which is a huge improvement but doesn’t quite solve all of the problems.

Testing PHP/MySQL Applications with PHPUnit/DbUnit

I’ve never been big into testing, but I’m trying to change that. Sebastian Bergmann, the author of the PHPUnit Pocket Reference (free online version), talked about PHPUnit and DbUnit and why I should use them. Installing PHPUnit is extremely simple if you have PEAR installed:

pear channel-discover pear.phpunit.de
pear install phpunit/PHPUnit

Once installed, just require PHPUnit:

// php
require_once 'PHPUnit/Framework.php';

He just scratched the surface of writing unit tests. One thing he pointed out was using CruiseControl for automated testing. What’s really cool is you can fire off CruiseControl from Subversion commit hooks. If the tests fail, CruiseControl can send an email with the results and who is to blame.

Practical MySQL for Web Applications

Domas Mituzas of MySQL and Wikipedia fame gave a good talk that covered practical design of web applications. The talk covered simple stuff, so I didn’t learn a whole lot. Nevertheless, Domas sometimes says some funny things that make the talk enjoyable.

EXPLAIN Demystified

Baron Schwartz gave a talk about the EXPLAIN statement. EXPLAIN is run by prepending the word EXPLAIN to your SELECT statements. It only works on SELECT statements. When the query is run, it outputs an execution plan.

After running through the output of the EXPLAIN statement, he showed us mk-visual-explain which is one of the tools in Maatkit. It is a neato command line tool that takes the EXPLAIN output and reformats it as a tree structure. It’s a great way to visualize the execution plan. Now if only there was a GUI version…

Upgrading to Elegant Versatile Database Architecture using PHP5 Data Objects

This talk was given by Sigurd Magnusson of SilverStripe and covered PDO. I already researched and used PDO, so it was mostly review.

After talking to some of the other people at the conference, I’ve been seriously thinking of moving away from PDO and using MySQL specific functions because they expose some *really* cool debugging and profiling information.

Exploring Amazon EC2 for Scale-out Applications

The thought of EC2 sounds really cool. The ability to create a server instance and host your stuff on it within minutes is sweet. Need more servers, no problem, add another instance. The speakers, Morgan Tocker of MySQL and Carl Mercier of Defensio, talked about their experience with EC2.

There are some serious data and management issues. Until the other day, there wasn’t any kind of persistent storage, meaning when the server went offline, you lost all your data. Now you can mount a drive that persists across restarts. But one issue for critical business transactions is how and when data is written to disk. Is the data written immediately to disk, or is it buffered in the kernel or in some RAID card’s cache?

Another issue they ran into is that when a new machine is created, there are remnants of the previous instance’s data. So they need to zero out the drive, which takes 5 hours on a single instance.

What I took away from the talk is that EC2 is great if your app is simple and relies on 3rd party services (e.g. Facebook, Google, etc.) that are more reliable than EC2.

Service Oriented Architecture with PHP and MySQL

Joe Stump, a PHP hacker at Digg, gave a talk about SOA. It wasn’t as much about “web services” as it was about managing tasks and processing them asynchronously.

After talking to Joe, he highly recommended Gearman to manage tasks. From the Gearman site: “Gearman is a system to farm out work to other machines, dispatching function calls to machines that are better suited to do work, to do work in parallel, to load balance lots of function calls, or to call functions between languages.”

So, if a user uploads an image, you can add the task of resizing the image to a backend processing mechanism. This allows for a responsive front-end for the user.
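The pattern in miniature, with a Python queue and thread standing in for Gearman and its workers:

```python
import queue
import threading

tasks = queue.Queue()

def worker():
    # Background worker: pulls jobs and does the slow work off the request path.
    while True:
        job = tasks.get()
        if job is None:  # shutdown sentinel
            break
        job["status"] = "resized"  # stand-in for the actual image resize
        tasks.task_done()

threading.Thread(target=worker, daemon=True).start()

# The web request just enqueues the job and returns immediately:
job = {"image": "photo.jpg", "status": "pending"}
tasks.put(job)
tasks.join()  # only for this demo; a real front-end would not wait
```

Gearman adds the important parts this sketch omits: the workers run on other machines, and jobs can be dispatched by function name across languages.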

Joe, along with Chris Goffinet, is working on Net_Gearman, a PEAR package for interfacing with Gearman.

Memcached Hackathon BOF

This was a birds of a feather session where a bunch of people informally got together to discuss all things memcached. Patrick Galbraith of Grazr showed off Memcached Functions for MySQL. This is super cool. It allows you to set and get data in memcached from within your SQL code via user-defined functions.

So instead of pulling data from the DB to the app and then pushing it to memcached, you can have a trigger or stored procedure store the value directly in memcached. One caveat: when you roll back a transaction, it won’t unset the value in memcached.

There was some discussion about the memcached MySQL storage engine. After listening to them discuss it, I have to wonder if it is really worth it. It acts like a distributed memory table, except that when a server in the cluster goes down, it affects all the other servers.


Memcached and MySQL

Apr 14, 2008

The second and last tutorial of Monday was Memcached and MySQL: Everything you need to know by Brian Aker and Alan Kasindorf.

The talk was mainly about memcached and libmemcached and less on MySQL. That’s OK since I have been meaning to learn more about memcached’s internals.

Alan and Brian discussed the slab allocator, the protocol, the internal hash table, LRU (least recently used) eviction, and threading. The slab allocator is memcached’s memory allocator, and the LRU keeps track of the age of each item. memcached clients use a consistent hashing algorithm so keys can be located quickly, and it supports dynamically adding new servers to the pool. One thing that is cool about the hash table is that it is pluggable: you can choose a different algorithm to meet your needs.
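A toy version of consistent hashing (Python; real clients such as libmemcached place many virtual points per server on the ring, and the details differ):

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, servers, points=100):
        # Place several virtual points per server on a numeric ring.
        self.ring = sorted(
            (self._hash(f"{server}-{i}"), server)
            for server in servers
            for i in range(points)
        )

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest()[:8], 16)

    def server_for(self, key):
        # A key maps to the first server point at or after its hash (wrapping).
        index = bisect.bisect(self.ring, (self._hash(key),)) % len(self.ring)
        return self.ring[index][1]
```

The payoff: when a server is added or removed, only the keys near its points move, instead of nearly all keys as with a naive hash(key) % N scheme.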

memcached’s protocol is ASCII-based, similar to HTTP; however, they are in the process of finishing up a binary protocol which should have less overhead and better performance.

The current architecture is threaded and scales OK on machines with 4-8 cores, but when 16-32 core servers come out, memcached will not scale as well. Future versions will improve threading support and scale better on larger machines.

Brian talked a bit about libmemcached, which is a memcached client library written in C. The API looks pretty easy, as it uses very similar concepts and naming as other higher-level client APIs.

One thing Brian said that stood out is that we should move away from synchronous actions and towards asynchronous events. It was a very interesting talk, and Brian definitely knows what he’s talking about.


The day of tutorials started out with All Bases Covered: A Hands-on Introduction to High-availability MySQL and DRBD by Florian Haas and Philipp Reisner.

After a brief introduction to DRBD, they started discussing the configuration file. There were a couple of settings that I had set incorrectly on my servers.

Since I have my two servers connected via a gigabit crossover cable, I had my synchronization rate set to 125MB. They recommended approximately 1/3 of your network and disk I/O capacity so that your applications don’t freeze up during synchronization. Their test system used 30MB, so I’ll give that a try too.

Another setting they configured differently was the activity log extents. All of the references I looked at said to set al-extents to 257. Actually, there’s an equation to find this value: E = (R x t) / 4, where E is the al-extents, R is the synchronization rate, and t is the target synchronization time (in seconds). If the sync rate is 30MB and the target sync time is 240 seconds, then the extent count would be 1800, which rounded to the nearest prime is 1801.
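The arithmetic, as a quick sanity check (helper names are mine; the rounding helper rounds up, which happens to also give the nearest prime here):

```python
def al_extents(sync_rate_mb, target_sync_secs):
    # E = (R * t) / 4
    return (sync_rate_mb * target_sync_secs) // 4

def next_prime(n):
    # Smallest prime >= n (al-extents is conventionally set to a prime).
    def is_prime(k):
        return k >= 2 and all(k % d for d in range(2, int(k**0.5) + 1))
    while not is_prime(n):
        n += 1
    return n
```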

Heartbeat is the cluster manager that detects when a node is unavailable. You should have at least 2 heartbeat connections between the two nodes. If eth0 is your public network and eth1 is your private network, you will want to configure Heartbeat to send heartbeats across the public network using multicast and across the private network using broadcast.

# /etc/ha.d/ha.cf
bcast eth1
mcast eth0 239.0.0.42 694 1 0

The version of Heartbeat they demonstrated was Heartbeat v2. I use the older v1, which isn’t as powerful but is much simpler to configure. It was also the first time I had seen the Heartbeat GUI. The GUI makes it easy as cake to manage Heartbeat resources and offers a level of monitoring. You can tell the GUI was written by a developer, since the usability could be improved greatly.

I specifically asked if DRBD has any issues with partitions larger than 2TB, and Florian basically said that if you can create the partition (meaning the driver supports it), then DRBD supports it. He mentioned something about how all SCSI devices use 32-bit integers for addressing, which limits you to 2TB. This was news to me; my SATA RAID card is technically seen by Linux as a SCSI device. I’m not sure if this is 100% accurate, but nevertheless there is an easy solution: if you have 4TB of space, you can chop it up into two 2TB partitions, then use either software RAID 0 (stripe) or LVM (linear or striped map mode).

I can’t wait to build my next HA cluster, but this time using Heartbeat v2.


2008 MySQL Conference and Expo

Apr 14, 2008

It’s that time of the year again and I’m in Santa Clara, CA, for the 2008 MySQL Conference & Expo. This year, the conference includes a lot more Web 2.0 topics.


The conference starts out with a full day of tutorials. The two tutorials I signed up for are:

  • All Bases Covered: A Hands-on Introduction to High-availability MySQL and DRBD
  • Memcached and MySQL: Everything you need to know

I’ve made it a point this time to learn about things that I wouldn’t normally go to. This includes sessions about testing, benchmarking, and MySQL’s information schema.

I’ll be blogging about some of the sessions that I’ll be attending, so stay tuned!