This is a continuation of the saga of helping a customer of ours with their MySQL on Windows issues.
The basic premise is that MySQL 5 running under Windows has problems with large numbers of connections/open files.
We initially presented our client with 2 choices for solving their problem:
- Setup MySQL on a different server running Linux
- Move their database backend from MySQL to Microsoft SQL Server
Option 1 would require a non-trivial capital outlay and time to setup and migrate the data over
Option 2 held more interest for them as a longer term solution as they were interested in some of the powerful reporting and analysis tools. However, the server that all of this was running on turned out to present a problem in the following way:
The system is a Dell PowerEdge 2950 with dual dual-core Xeons (with HyperThreading enabled, resulting in 8 logical cpus), 8GB RAM and running Windows 2003 Server x64. For a general ballpark estimate of pricing for MS SQL Server, I went to the Dell website and configured a similar machine and went to select a SQL Server as an add-on. Due to the way that Microsoft prices SQL Server for a web environment, you must license it for each processor you have installed in your server. A dual-processor licensed SQL Server 2005 running under Windows x64 turned out to be in excess of $14,000 on that page.
Needless to say, the customer had a bit of sticker shock. Not only would they have to plop down FOURTEEN grand, they would still need man-hours to work out kinks in transitioning the backend of the app from MySQL. Getting a fully loaded separate server for half that cost running Linux/MySQL was looking more attractive.
I was chatting with a friend about the customers dilemma and he had a brilliant suggestion: Run a Linux instance under VMWare on the existing hardware.
I broached the idea with the client and they were game to give it a shot. The server was at this point extremely underutilized and was only limited by the specific software implementations. They were already running MySQL here anyway, so putting a different wrapper around it in order to escape those limits was worth investigating.
The server needed some more disk space so they ordered 6 more of the 146GB 2.5″ SAS drive modules like the 2 that were already installed in a mirrored pair. These are great little devices for getting a decent number of spindles into a database server.
While they did that I installed the latest Release Candidate of VMWare Server 2.0 and went about setting up a Debian based Linux install with the latest MySQL 5.0 binary available. During testing we had some interesting processor affinity issues for the VM environment and after a very good experience and excellent responses in the VMWare community forums, I tweaked the config to have that VirtualMachine attach itself to processors 5-8 which correspond to all of the logical cpus on the second socket. This left the first socket’s worth of cpus for OS and IIS/ASP usage. Doing this would help avoid the cache issues that occur with multi-cpu systems and multi-threaded processes.
I ended up configuring the VM for 3.5GB RAM, 2 processors and tweaking the MySQL config to make good use of those resources for their INNOdb based tables.
The system has been running very reliably for over 2 months and is making very good use of the resources on that system. Load Average in the VM stays well under 1.0 and the cpus in the Windows system are easily handling all the different loads.
Overall I spent probably 8-10 hours installing and tweaking the config and helping them migrate their data. This was much less expensive to the other options and they got to make use of their existing hardware and software configurations.
A very positive experience and outcome.
Posted by Brian Blood as Database, Hardware, Linux, MySQL, Servers at 2:41 PM UTC
No Comments »
A customer of ours recently asked us to help them troubleshoot some performance problems they have been having with their ASP/MySQL based solution. They are running the latest version of MySQL 5 under Windows 2003 Server 64-bit edition. Their IIS/ASP based application is using an ODBC connection to connect to MySQL.
They have recently added a partner that is sending them sales leads through an API call. There are occasional peaks in this traffic where the incoming connections into IIS far exceeds what they have been able to handle in the concurrent open connections into MySQL. Tracking this down through the application stack we looked at several low-level file/network system items:
- Ephemeral port exhaustion
- Stale socket disposal
- Max Open Files
#1 and #2 has caused some issues for another customer in the past, but this time it turned out to be #3. Even though they had set an appropriate value for max-connections in their my.ini file:
max-connections = 4000
table-open-cache=512
MySQL was artificially reducing these settings at runtime to much lower than was needed to support this flash traffic. Upon startup, MySQL would emit this into the Application Event Log:
Changed limits: max_open_files: 2048 max_connections: 1910 table_cache: 64
According to this bug report on the MySQL site, the approximate formula is used to determine the maximum open files for MySQL:
table_cache*2+max_connections
This condition has been reported as far back as 2006. A fix that switched MySQL from using the C Runtime Library in the Windows binary for opening files to using the native Win32 API calls was completed in mid June 2008. However, this will only be rolled into the MySQL 6 line of development.
We had looked at seeing about increasing this limit to something higher than 2048, but according to this site, that value is hardcoded. This particular customer has at least one thing going for them, their exclusive use of the InnoDB engine. This engine uses native calls so the table cache value does not need to be anything very high and they can maximize the number of open descriptors for use in incoming connections.
Realistic Options:
- Artificially limit the maximum number of incoming connections to IIS so as to limit the passthrough the MySQL. This leaves serious amounts of processing power of this particular server untapped.
- Move to MySQL running on a non-Windows OS
- Move their backend to MS SQL Server.
MySQL.com documented differences.
Posted by Brian Blood as Database, MySQL, Servers, Web App Development at 3:42 PM UTC
No Comments »
I set up a replication slave at our office to a MySQL server running at our colo and the master server is pretty busy. So busy that even with the compressed protocol option turned on the stream was taking a good 60-70 kbps out of the available bandwidth of our T1. Since it isn’t critical that this data be real-time slaved, I made a small shell script that can take parameters for starting and stopping either the sql or io threads: mysqlslavectl.sh
#!/bin/sh -
#
#
USER=mysqladmin
PASSWD=adminpwd
SCRIPT="/usr/local/mysql/bin/mysql -u$USER -p$PASSWD -e "
StartService ()
{
case $1 in
sql ) $SCRIPT "START SLAVE SQL_THREAD" ;;
io ) $SCRIPT "START SLAVE IO_THREAD" ;;
* ) echo "$0: unknown Start argument: $1";;
esac
}
StopService ()
{
case $1 in
sql ) $SCRIPT "STOP SLAVE SQL_THREAD" ;;
io ) $SCRIPT "STOP SLAVE IO_THREAD" ;;
* ) echo "$0: unknown Stop argument: $1";;
esac
}
CheckCommand ()
{
case $1 in
start ) StartService "$2" ;;
stop ) StopService "$2" ;;
* ) echo "$0: unknown argument: $1";;
esac
}
CheckCommand "$1" "$2"
I can call this like so:
mysqlslavectl.sh start io
mysqlslavectl.sh stop sql
I setup a couple of crontab entries, one to start the io thread at 8pm and one to stop the io thread at 6AM.
The sql thread will always be running.
Posted by Brian Blood as Database, MySQL at 6:30 PM UTC
No Comments »
I had a customer submit a batch file to be processed in one of our systems and they handed me an Excel file with short dates (mm/dd/yy). Sheesh, this is 2007, can we not get full 4 digit years, please!
so, I wanted to work out a method for quickly converting those date strings within mysql
assume a table of:
- BirthDate char(16)
- BDMonth tinyint
- BDDay tinyint
- BDYear smallint
UPDATE MyDatesTable
BDMonth = SUBSTRING_INDEX(BirthDate, ‘/’, 1),
BDDay = SUBSTRING_INDEX( SUBSTRING_INDEX(BirthDate, ‘/’, 2 ) , ‘/’, -1 ),
BDYear = SUBSTRING_INDEX(BirthDate, ‘/’, -1)+1900,
UPDATE MyDatesTable
SET BirthDate = CONCAT(BDYear, ‘-’, BDMonth, ‘-’, BDDay)
ALTER TABLEĀ MyDatesTable MyDatesTable MyDatesTable datetime;
2 digits years are SOOO last century…
Posted by Brian Blood as Data Translation, Database, MySQL, Text Munging at 3:00 PM UTC
No Comments »
Quick FileMaker Pro tip:
if you are trying to implement some privilege set specific behavior and you are depending on the call to Get(PrivilegeSetName), be aware that if you set the “Run Script with Full Privileges” option on that script that that function does not return the Privilege Set Name for the current user, but instead returns [Full Access].
Took me about 10 minutes to figure out why my script was not executing. This actually turned out to be a good thing since it forced me to factor my script a bit more and thereby create a more flexible function.
One other thing that is a bit frustrating is that there is no way (other than setting at startup and referring to a global global) to abstract/define a constant for a privilege set name. If you needed to make heavy use of privilege set specific branching, picking good privilege set names in the beginning would be crucial.
Posted by Brian Blood as Database, FileMaker at 1:08 PM UTC
No Comments »
I had been working really hard on my post on our super duper mail server and at some point I started having some really weird interactions with the tinymce editor. I was switching back and forth between the raw HTML editor and all of a sudden I only had the middle 60% of my post. Stupidly I hit Save and lost a good chunk of my valuable words of wisdom. I was able to recover most of the text from the original email, but I was a bit perturbed there wasn’t a revert feature.
So, I added one:
ALTER TABLE `posts` ADD `post_content_bkp1` LONGTEXT AFTER `post_content` ,
ADD `post_content_bkp2` LONGTEXT AFTER `post_content_bkp1` ;
or actually two revisions.
I then altered the sql for updating posts like so:
post_content_bkp2 = post_content_bkp1,
post_content_bkp1 = post_content,
so, there may not be any interface for this, but at least if I or the software mangles a post while I’m editing, I should be able to go to the db and recover something.
Posted by Brian Blood as Database, General, Web Software at 12:39 AM UTC
1 Comment »
Today I added a new datum for the Users table for our mail server: Last Message Received
What prompted me to add this was I was trying to prune down the over 150 accounts we have in the macserve.net domain and I had no idea which email addresses were actually in use or when they last received an email. I needed a quick reference that would not necessitate a trip to the Recent Mail table.
So I added two new columns to the Users table: last msg recvd and last msg sent. For now I’m only dealing with the former as I haven’t implemented tracking of sent mail yet.
The big thing was figuring out a quick and easy way of getting the most recent datetime for a user from the Recent Mail table and updating that in the Users table.
I started writing a separate script for this, but realized it would be just fine to drop this process into our daily database maintenance script. I had started to write some PHP code that would loop through the Recent Mail table for entries at most a week old and figure out the most recent message and then make a call to update that record in the Users table.
In the course of creating a temporary table to go back to the archive of recent mail we keep, I realized I could simply use an SQL temporary table to hold that data (duh) and then simply run a joined update from that temp table into the Users table. It turned out to be a simple 3 statement SQL process like so:
CREATE TEMPORARY TABLE most_recent_email
SELECT recipient_id,MAX(recent_msgs.msg_when) as last_msg_when
FROM recent_msgs
WHERE (msg_when >= ‘$oneWeekAgo’) AND (recent_msgs.recipient_id > 0)
GROUP BY recent_msgs.recipient_id;
UPDATE site_users,most_recent_mail
SET last_msg_rcvd = last_msg_when
WHERE site_users.user_id = most_recent_email.recipient_id;
DROP TABLE most_recent_email;
Nice and simple and I let the database do all the work for me. I like it.
Posted by Brian Blood as Database, Mail Server at 11:47 PM UTC
No Comments »
An recent email inquiry I received:
> I saw you had posted a reply to my inquiry about large installs running ECM2.
This mail server is my baby, so if I gush a bit, please forgive me.
> At this point, we’ve totally outgrown EIMS (as you can understand),
> and ECM2 is defintely the front-runner as far as replacements go. I
> have looked a lot at AtMail, which is basically a commercial ECM2,
SquirrelMail is what we use now for customer webmail, but we are seriously considering using something different and @mail’s webmail system might do the trick. We have some pretty sophisticated customers and a better webmail system is definitely needed.
> but decided that I really think I’d rather have a firm foundation and
> be able to modify it myself, instead of relying yet again on a
> commercial developer’s whims.
I hear you. EIMS lasted us a very long time, but we had to bite the bullet and make a change. I built a whole bunch of applescripts and php import scripts to migrate accounts over and that data then fed into a program that did syncing of mail over from EIMS mailboxes to the new ECM based system that mostly kept “read/unread” flags on the email. THAT was a big deal.
It’s been about a year now running on our server and it’s totally kick butt. We see almost no spam now, and the manageability is orders of magnitude more than it was with EIMS.
Most emails we get to make changes for email accounts we simply reply back to the admin of that account with the admin pwd and a URL to the mailadmin site. It has saved us mountains of support time.
The great thing about exim is that I can actually PROGRAM each phase of the SMTP conversation and the delivery phase to however complicated or personalized for each of our domains/users it needs to be.
Our exim config is definitely one of the more complex I’ve seen on the net.
And exim runs it without any issue.
> That said, it sounds like your ECM2 installation is handling your
> traffic well. May I ask what architecture you’re running your server
> on, and what types of loads you see?
OK, you asked…..
We run it on a Dual 1.0 Ghz Xserve G4 (10.4.8) with 2GB of ram, 2×60GB for boot and 2 x300 for data. Here is the Daily Load Average graph for that box:

Those spikes are when I’m running a mysqldump.
Here is the monthly graph:

we have:
419 sites
542 domains
1835 users defined
1069 of those defined as a mailbox.
Now, I took the package that George built and highly modified it. ECMs database schema and exim config is actually based on vexim, so there were some things I found on that site that I did pull into the system.
The primary difference on our system is that I abstracted domains out of the schema. I’ve always like the way EIMS implemented domain aliases. It so easily and transparently overlaid onto a “site”. So that’s what I did:

so, the primary unit is the site, then all the users and then you can have any number of domains on that site. Those graphics represent a state of the system over a year ago, there are quite a bit more fields in the sites table now, but this gives you the basic outline. the user can use any domain and can even use the % hack in their login id as well for all three services: SMTP, POP3 and IMAP. There is still a PRIMARY domain that you define in the site preferences.
> I have heard conjecture that
> Exim isn’t all that great under load,
We haven’t seen that and we run it on “old” hardware and it runs like a champ for us. I’ve implemented it for two other customers (one on Mac, one on Debian Linux) and we have two more lined up who want it.
The one thing I didn’t want exim to do was to handle outbound mail. The reason is that every message for delivery would have incurred another database lookup which would have caused unnecessary load and slow performance. So, we use an instance of Postfix with a very light config to handle all that delivery. This could be setup on the same box by having Postfix listen on a different port, but we already had in place an existing system on a different server, so we just used that.
> but I’ve also heard version 4
> took care of a lot of that. I don’t think we’re huge load here, but
> we do do at least a couple thousand messages per hour.
ok, I looked at our graphs of connections:

and taking the hour of 14:00 Dallas time, which seemed the busiest…..
received mail: 587
received mail: 1172
Now, I was able to pull those numbers very quickly because we LOG all blocks and all accepted messages into mysql:
SELECT count(*) FROM `recent_mail` WHERE recvd BETWEEN ‘2007-03-19 19:00:00′ AND ‘2007-03-19 19:59:59′
SELECT count(*) FROM `block_log` WHERE `when_blocked` BETWEEN ‘2007-03-19 19:00:00′ AND ‘2007-03-19 19:59:59′
(We store ALL times in UTC and then adjust for the user when displaying through the web admin pages)
The admin of an site or even an individual account holder can now see with their own eyes what emails were blocked and why.
The block log:

The Recent mail log:

I also built into it:
- greylisting
- greylisting exceptions by:
- site (stable through whatever domains are on that site)
- source IP or IP range
- sender domain or sender email
- recipient domain or recipient email
- auto blacklist of ips:
When an incoming server tries to helo with one of my ips or one of my names, that IP address is automatically added to the blacklisted hosts table with an expiration for a month.
- spam assassin:
- sql based bayesing scoring
- sql based auto white listing (the more mail you get from a sender, the lower their email is rated for spam)
- global, site and user level based prefs
- connection logging/profiling
I keep track all every single IP address that connects to our server.
As they progress through a SMTP connection, I update certain values on that record:
ip cnxn_count, cnxn_first, cnxn_last, reverse_ok_count,
helo_ok_count, quit_count, bad_from_count, bad_rcpt_count,
ok_rcpt_count, dnsbl_block_count, last_dnsbl_time, last_dnsbl
- whitelisting (globally or per site):
- by sender
- by recipient
- by ip or ip range
I also implemented catchalls similar to the way that EIMS has them:
Through the use of a preferences table, I can also selectively turn on/off certain features of the mail server in real-time without touching the config:
allow_cnxns allow_trusted allow_authd allow_other greylist_on
Also I liked the options for users that EIMS had: mailbox, forward and both, but ours has expanded features:
we also now have TLS based SMTP, POP3 and IMAP using a self-signed certificate.
> Thanks for any input you may be able to provide!
Here is a screen shot of the admin interface. ( I completely rebuilt the ECM web admin interface to handle more features and deal with the changes in architecture)

The Blocks column is the number of blocks in the past 24 hours and the last hour.
The Recent column is the number of accepted msgs in the past 24 hours and the last hour.I also wrote a bunch of support scripts that tail through some of the logs and update the login times for those users in the database.
Things I haven’t implemented but are mostly already built, just need testing:
- logging of email sent by authenticated users.
- automatically feeding email sent to spam traps into Spam Assassin for bayesian scoring
- Archiving of email per site or per user for corporate entities required to do so.
So, there it is.
Hope that answers all your questions and more!
Posted by Brian Blood as Database, Mail Server at 12:03 AM UTC
No Comments »
We manage a lot of different servers, mostly web application servers running PHP and MySQL.
The physical layout of these systems, due the hardware involved, often is quite different from server to server. As a result the placement of the data repository and binary logs and other log files is not the the same.
I got tired of having to track down where each of these items were located, so I started using the following structure where possible. If not possible, these same items existed, but were merely symlinks to the actual locations.
/var/mysql
- mysql.sock – just a symlink to /tmp/mysql.sock. This was due to a minor config issue Apple had in OS X Server 10.4, that they finally corrected.
- binlogs – the binary logs generated if this system is intended to be a replication master
- data – link to the actual location of the datadir
/var/log/mysql
- error.log – the error log for the machine. I never liked how mysql named the error log after the hostname. Now, I’m telling it, use this name
- slowq.log – slow query log
Now all the items that I normally need to get to, have easy to remember, standardized “locations”.
I’m sure this is old hat for some. Sometimes it takes me a while to figure out the simple stuff.
Posted by Brian Blood as Database, Servers at 12:25 PM UTC
2 Comments »
We support the Hypersites development team in handling all their colocation and load balancing systems and occasionally doing web application consulting for them to help make their site better, faster, stronger and more agile. The Hypersites Application Builder is truly a marvelous piece of software. You should give it a spin for your next web project.
One of the underlying parts of their architecture that we advised them on long ago was to utilize the compression based encoding that most browsers support to reduce the actual amount of traffic sent over the internet to deliver a page. Another item was to build out versions of the pages that their system created and store those in a caching system of some sort. We had considered using memcache, which is a great way of storing that transient data that most web apps end up creating/using, but they decided on a much simpler (KISS) database table.
In that table are stored 3 versions of a page’s html: plain html, gzip and compress
The team recently made a change in their code so that instead of grabbing all three columns of data from the cache table, then choosing which version of the data to use, they chose which column to select before making the query.
The result: In about 70% of the calls to the cache table, the query result dropped to 10% of it’s original size.
By making a simple change to the logic in their code, they accelerated their software (at least that portion of the code) by TEN FOLD, something which no amount of reasonably-priced hardware upgrades would have accomplished.
Very cool and a good lesson.
Posted by Brian Blood as Database, Web App Development at 11:55 AM UTC
No Comments »