A really good client of ours has been colocating with us since late 2003. They’ve grown their web application from running on an Xserve 1.0GHz DP G4 to an Xserve 2.0GHz DP G5, and then moved their database off to a separate, big hardware-RAIDed Dell server.
They came to us about a year ago (May 2006) and said they were getting a big new client who wanted to run their entire site on their system, so they were going to need a load balanced system with plenty of power and scalability. Earlier that year (Feb 2006), Apple had introduced their second generation MacMini, which now sported the new Intel Core Duo chips along with Gigabit Ethernet. At that time, we were also concerned about increased power usage in our cage, so we picked up an Intel Mac Mini 1.66GHz Core Duo, had it upgraded to 2GB of RAM (the G4s could only handle 1GB) and started to really put it through its paces.
It turned out to be a real winner.
- In terms of power consumption, no matter how hard I tried I could not make that Mini draw more than 0.37A. I blasted that thing with multiple concurrent CPU and disk bound processes, getting really heavy loads and disk reads/writes.
- In terms of CPU, we ran a battery of tests to emulate this customer’s environment, which is a very complex PHP web application. The Mini really shone and had tremendous performance, even under load.
The final configuration we ended up with was:
MacMini Intel 1.66GHz Core Duo with 2GB RAM. We replaced the stock Seagate 80GB 5400rpm SATA 2.5″ notebook drive with the Hitachi E7K100 60GB 7200rpm SATA drive. These are the drives that IBM puts in its blade servers, as they are rated for 24/7 usage.
Total Cost: ~$1,000 each. (We bought the Apple RAM.)
We worked with the client to help factor their web application so that it could be properly load balanced. Changes were necessary in the following areas:
- session storage
- code updates (simultaneous CVS updates)
- media upload handling (you can no longer assume you have the resources you did when you only had a single server)
- host name abstraction to keep the Apache conf files nice and clean.
- centralized logging of web hits/visits.
After all the hardware and software was ready, we set up the content rules for the load balancer and turned it on. It was very gratifying to see the Minis perform very well even under adverse load conditions. (The big client sends out large email newsletter runs that bring flash crowds to the site.)
One of the more interesting experiences we’ve had with this system was when we migrated their Xserve G5 in as web server #4 in the load balanced group. Mind you this is not a puny box. We even upgraded the drive system in it to a hardware RAID 5 based set. Since that time we have had to periodically adjust the weighting rules on the load balancer to give more and more priority to sending hits to the Intel Minis instead of the Xserve G5. We are now at a 3:3:3:1 ratio and the Xserve is finally at a lower overall load average than the minis.
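To make the 3:3:3:1 ratio concrete, here is a minimal sketch of what those weights mean as a share of hits. The server names are hypothetical, and this is only the arithmetic behind the ratio, not the load balancer's actual scheduling algorithm.

```python
from fractions import Fraction


def traffic_shares(weights):
    """Return each server's share of hits for the given integer weights."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}


# Hypothetical server names; the weights are the 3:3:3:1 ratio from the text.
weights = {"mini1": 3, "mini2": 3, "mini3": 3, "xserve_g5": 1}
shares = traffic_shares(weights)

for name, share in shares.items():
    print(f"{name}: {float(share):.0%} of hits")
```

At these weights each Mini takes three hits for every one sent to the Xserve G5.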
Yes, you read that right: the Intel MacMini is somewhere between 2 and 3 times faster than an Xserve G5 in raw CPU performance.
And it uses one fifth the power. With a nice sliding rack tray, you can easily get 6 of them into a 2U space. (excuse the cabling mess)

We’ve also considered a different configuration whereby we set the Minis on their side and “stack” them horizontally. The Mini is right at 2 inches tall and a 6.5 inch square, so accounting for some space for air flow and cabling you could get 7, maybe 8 of them in a row, resulting in a 4U tall set. With the right mounting, you could get easily 2, maybe 3 of these rows on a sliding tray. You do have to account for the external power supply, but that separation actually works out as a major benefit as the cables could be run so that you have a single U of dedicated space for the power supplies and put some directional cooling air flow over them.
Result: With only 2 rows of Minis, you get:
- roughly the same CPU power as 28 Xserve G5s
- about the same power consumption as 5 Xserve G5s (8.4A vs 9A)
- 6-7 times less space (4-5U of Minis vs 28U)
- easily less than half the per unit cost ($1,000 vs ~$2-$3K)
The Intel Dual Xeon Xserve looks promising as well in terms of raw CPU performance, but I’ve seen reports that it suffers from the same high power consumption as any other Dual Xeon system: ~3A. That is roughly TEN TIMES the power consumption of a MacMini, so for a high density web farm, this is not a better solution. Better to utilize that Intel Xserve as a database server to take advantage of its greater RAM capacity (32GB), increased threading (dual dual-core) and 64-bit capability.
Related CPU performance anecdote: We have another client with a Compressor video encoding grid made up of Intel MacMinis and he found out (Apple also confirmed it) that he needed to remove the PowerMac G5 DP from that grid as it was the weakest link!
In summary, a MacMini based farm is a powerful solution for almost any web application. You get low cost, low power, and the ease of use and security of an OS X based system. A very compelling formula. Our client is very happy and is looking to add 2 more MacMinis in the near future. We’ve built this type of system for another client and they are extremely happy with the performance of their web application, too. If your organization is interested in a MacMini web farm, please contact us for a quote.
Some additional links regarding MacMinis:
- Installing Debian GNU/Linux on the Mac Mini
- 123Macmini.com - A Mac Mini User Community
- Mac Mini hacks Forum for the Apple Mac Mini
- Squeezing 2 MacMinis into a 1U custom case
I’ve posted a followup to this article.
Posted by Brian Blood as Content Networking, Servers, Web App Development at 9:54 AM CDT
I had been working really hard on my post on our super duper mail server and at some point I started having some really weird interactions with the TinyMCE editor. I was switching back and forth between the raw HTML editor and all of a sudden I only had the middle 60% of my post. Stupidly, I hit Save and lost a good chunk of my valuable words of wisdom. I was able to recover most of the text from the original email, but I was a bit perturbed there wasn’t a revert feature.
So, I added one:
ALTER TABLE `posts` ADD `post_content_bkp1` LONGTEXT AFTER `post_content` ,
ADD `post_content_bkp2` LONGTEXT AFTER `post_content_bkp1` ;
or actually two revisions.
I then altered the sql for updating posts like so:
post_content_bkp2 = post_content_bkp1,
post_content_bkp1 = post_content,
So, there may not be any interface for this, but at least if I or the software mangles a post while I’m editing, I should be able to go to the db and recover something.
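The rotation can be tried out in isolation. Below is a minimal sketch using Python's built-in sqlite3 module rather than MySQL; the `id` column and the `save_post` helper are assumptions made for illustration and are not taken from the blog software. One point worth knowing: MySQL generally evaluates the assignments of a single-table UPDATE from left to right, so the order shown above (bkp2 first, then bkp1, then the content itself) matters there, whereas SQLite reads all the old values before writing any of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"  # hypothetical key column
    " post_content TEXT,"
    " post_content_bkp1 TEXT,"
    " post_content_bkp2 TEXT)"
)
conn.execute("INSERT INTO posts (id, post_content) VALUES (1, 'draft 1')")


def save_post(conn, post_id, new_content):
    """Shift the two backups down one slot, then store the new content."""
    conn.execute(
        "UPDATE posts SET"
        " post_content_bkp2 = post_content_bkp1,"
        " post_content_bkp1 = post_content,"
        " post_content = ?"
        " WHERE id = ?",
        (new_content, post_id),
    )


save_post(conn, 1, "draft 2")
save_post(conn, 1, "draft 3")
row = conn.execute(
    "SELECT post_content, post_content_bkp1, post_content_bkp2"
    " FROM posts WHERE id = 1"
).fetchone()
print(row)  # current text first, then the two previous revisions
```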
Posted by Brian Blood as Database, General, Web Software at 12:39 AM CDT
Today I added a new datum for the Users table for our mail server: Last Message Received
What prompted me to add this was I was trying to prune down the over 150 accounts we have in the macserve.net domain and I had no idea which email addresses were actually in use or when they last received an email. I needed a quick reference that would not necessitate a trip to the Recent Mail table.
So I added two new columns to the Users table: last msg recvd and last msg sent. For now I’m only dealing with the former as I haven’t implemented tracking of sent mail yet.
The big thing was figuring out a quick and easy way of getting the most recent datetime for a user from the Recent Mail table and updating that in the Users table.
I started writing a separate script for this, but realized it would be just fine to drop this process into our daily database maintenance script. I had started to write some PHP code that would loop through the Recent Mail table for entries at most a week old and figure out the most recent message and then make a call to update that record in the Users table.
In the course of creating a temporary table to go back to the archive of recent mail we keep, I realized I could simply use an SQL temporary table to hold that data (duh) and then simply run a joined update from that temp table into the Users table. It turned out to be a simple 3 statement SQL process like so:
-- 1. Collect each recipient's most recent message from the past week
CREATE TEMPORARY TABLE most_recent_email
SELECT recipient_id, MAX(recent_msgs.msg_when) AS last_msg_when
FROM recent_msgs
WHERE (msg_when >= '$oneWeekAgo') AND (recent_msgs.recipient_id > 0)
GROUP BY recent_msgs.recipient_id;
-- 2. Joined update from the temp table into the Users table
UPDATE site_users, most_recent_email
SET last_msg_rcvd = last_msg_when
WHERE site_users.user_id = most_recent_email.recipient_id;
-- 3. Clean up
DROP TABLE most_recent_email;
Nice and simple and I let the database do all the work for me. I like it.
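For anyone who wants to try the pattern without a MySQL server, here is a rough sketch of the same three steps using Python's built-in sqlite3 module. The sample rows and the `refresh_last_received` helper are made up for illustration. SQLite has no MySQL-style multi-table UPDATE, so this sketch swaps in a correlated subquery for the joined update, and it fills the temp table with INSERT ... SELECT instead of CREATE ... SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site_users (user_id INTEGER PRIMARY KEY, last_msg_rcvd TEXT);
    CREATE TABLE recent_msgs (recipient_id INTEGER, msg_when TEXT);
    INSERT INTO site_users (user_id) VALUES (1), (2), (3);
    INSERT INTO recent_msgs VALUES
        (1, '2007-03-18 10:00:00'),
        (1, '2007-03-19 09:30:00'),
        (2, '2007-03-01 08:00:00'),  -- older than the cutoff, ignored
        (0, '2007-03-19 12:00:00');  -- recipient_id 0, ignored
""")


def refresh_last_received(conn, one_week_ago):
    """Copy each user's newest msg_when since the cutoff into site_users."""
    conn.execute(
        "CREATE TEMPORARY TABLE most_recent_email"
        " (recipient_id INTEGER, last_msg_when TEXT)"
    )
    conn.execute(
        "INSERT INTO most_recent_email"
        " SELECT recipient_id, MAX(msg_when)"
        " FROM recent_msgs"
        " WHERE msg_when >= ? AND recipient_id > 0"
        " GROUP BY recipient_id",
        (one_week_ago,),
    )
    # Correlated subquery standing in for MySQL's joined UPDATE.
    conn.execute(
        "UPDATE site_users SET last_msg_rcvd ="
        " (SELECT last_msg_when FROM most_recent_email"
        "  WHERE recipient_id = site_users.user_id)"
        " WHERE user_id IN (SELECT recipient_id FROM most_recent_email)"
    )
    conn.execute("DROP TABLE most_recent_email")


refresh_last_received(conn, "2007-03-13 00:00:00")
rows = conn.execute(
    "SELECT user_id, last_msg_rcvd FROM site_users ORDER BY user_id"
).fetchall()
print(rows)
```

Only user 1 has mail inside the window, so only that row gets a timestamp; users without recent mail keep whatever value they had before.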
Posted by Brian Blood as Database, Mail Server at 11:47 PM CDT
A recent email inquiry I received:
> I saw you had posted a reply to my inquiry about large installs running ECM2.
This mail server is my baby, so if I gush a bit, please forgive me.
> At this point, we’ve totally outgrown EIMS (as you can understand),
> and ECM2 is defintely the front-runner as far as replacements go. I
> have looked a lot at AtMail, which is basically a commercial ECM2,
SquirrelMail is what we use now for customer webmail, but we are seriously considering using something different and @mail’s webmail system might do the trick. We have some pretty sophisticated customers and a better webmail system is definitely needed.
> but decided that I really think I’d rather have a firm foundation and
> be able to modify it myself, instead of relying yet again on a
> commercial developer’s whims.
I hear you. EIMS lasted us a very long time, but we had to bite the bullet and make a change. I built a whole bunch of AppleScripts and PHP import scripts to migrate accounts over. That data then fed into a program that synced mail from the EIMS mailboxes to the new ECM based system and mostly kept the “read/unread” flags on the email. THAT was a big deal.
It’s been running on our server for about a year now and it totally kicks butt. We see almost no spam now, and the manageability is orders of magnitude better than it was with EIMS.
For most emails asking us to make changes to email accounts, we simply reply to the admin of that account with the admin password and a URL to the mailadmin site. It has saved us mountains of support time.
The great thing about exim is that I can actually PROGRAM each phase of the SMTP conversation and the delivery phase, making it as complicated or as personalized as it needs to be for each of our domains/users.
Our exim config is definitely one of the more complex I’ve seen on the net.
And exim runs it without any issue.
> That said, it sounds like your ECM2 installation is handling your
> traffic well. May I ask what architecture you’re running your server
> on, and what types of loads you see?
OK, you asked…..
We run it on a Dual 1.0GHz Xserve G4 (10.4.8) with 2GB of RAM, 2 x 60GB for boot and 2 x 300GB for data. Here is the Daily Load Average graph for that box:

Those spikes are when I’m running a mysqldump.
Here is the monthly graph:

we have:
419 sites
542 domains
1835 users defined
1069 of those defined as a mailbox.
Now, I took the package that George built and highly modified it. ECM’s database schema and exim config are actually based on vexim, so there were some things I found on that site that I did pull into the system.
The primary difference on our system is that I abstracted domains out of the schema. I’ve always liked the way EIMS implemented domain aliases. It so easily and transparently overlaid onto a “site”. So that’s what I did:

So, the primary unit is the site, then all the users, and then you can have any number of domains on that site. Those graphics represent the state of the system over a year ago; there are quite a few more fields in the sites table now, but this gives you the basic outline. The user can use any domain and can even use the % hack in their login id as well for all three services: SMTP, POP3 and IMAP. There is still a PRIMARY domain that you define in the site preferences.
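As a rough illustration of the site/domain split, the sketch below resolves a login id to a site and user. Everything here is hypothetical: the dictionaries stand in for the real sites, domains and users tables, and the function names are invented. On the real system the lookup is done in the exim config and the database, not in Python.

```python
def split_login(login_id):
    """Split 'user@domain' or 'user%domain' into (local_part, domain)."""
    for sep in ("@", "%"):
        if sep in login_id:
            local, _, domain = login_id.rpartition(sep)
            return local, domain.lower()
    raise ValueError("login id has no domain part")


# Hypothetical data: any number of domains map onto one site.
DOMAIN_TO_SITE = {"example.com": 1, "example.net": 1, "other.org": 2}
SITE_USERS = {(1, "alice"), (2, "bob")}


def resolve_user(login_id):
    """Return (site_id, local_part) if the login maps to a user, else None."""
    local, domain = split_login(login_id)
    site_id = DOMAIN_TO_SITE.get(domain)
    if site_id is not None and (site_id, local) in SITE_USERS:
        return site_id, local
    return None


print(resolve_user("alice@example.com"))  # same user via one domain of the site
print(resolve_user("alice%example.net"))  # % hack, another domain of the site
print(resolve_user("bob@example.com"))    # bob belongs to a different site
```

Because users hang off the site rather than off a domain, adding a domain alias means adding one row to the domain mapping and nothing else.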
> I have heard conjecture that
> Exim isn’t all that great under load,
We haven’t seen that and we run it on “old” hardware and it runs like a champ for us. I’ve implemented it for two other customers (one on Mac, one on Debian Linux) and we have two more lined up who want it.
The one thing I didn’t want exim to do was to handle outbound mail. The reason is that every message for delivery would have incurred another database lookup which would have caused unnecessary load and slow performance. So, we use an instance of Postfix with a very light config to handle all that delivery. This could be setup on the same box by having Postfix listen on a different port, but we already had in place an existing system on a different server, so we just used that.
> but I’ve also heard version 4
> took care of a lot of that. I don’t think we’re huge load here, but
> we do do at least a couple thousand messages per hour.
ok, I looked at our graphs of connections:

and taking the hour of 14:00 Dallas time, which seemed the busiest…..
received mail: 587
blocked mail: 1172
Now, I was able to pull those numbers very quickly because we LOG all blocks and all accepted messages into mysql:
SELECT count(*) FROM `recent_mail` WHERE recvd BETWEEN '2007-03-19 19:00:00' AND '2007-03-19 19:59:59'
SELECT count(*) FROM `block_log` WHERE `when_blocked` BETWEEN '2007-03-19 19:00:00' AND '2007-03-19 19:59:59'
(We store ALL times in UTC and then adjust for the user when displaying through the web admin pages)
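That is why the 14:00 Dallas hour shows up as 19:00 in the queries above. A small sketch of the conversion, assuming a fixed UTC-5 offset for Dallas on that date (daylight time); the helper name is made up for illustration:

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset stands in for Dallas daylight time (CDT).
CDT = timezone(timedelta(hours=-5), "CDT")


def local_hour_to_utc_range(local_start):
    """Return the (start, end) UTC strings covering one local hour."""
    start = local_start.astimezone(timezone.utc)
    end = start + timedelta(minutes=59, seconds=59)
    fmt = "%Y-%m-%d %H:%M:%S"
    return start.strftime(fmt), end.strftime(fmt)


utc_range = local_hour_to_utc_range(datetime(2007, 3, 19, 14, 0, tzinfo=CDT))
print(utc_range)
```

A fixed offset keeps the sketch self-contained; a real admin page would use a proper time zone database so the offset follows daylight saving changes.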
The admin of a site or even an individual account holder can now see with their own eyes what emails were blocked and why.
The block log:

The Recent mail log:

I also built into it:
- greylisting
- greylisting exceptions by:
  - site (stable through whatever domains are on that site)
  - source IP or IP range
  - sender domain or sender email
  - recipient domain or recipient email
- auto blacklist of IPs: when an incoming server tries to HELO with one of my IPs or one of my names, that IP address is automatically added to the blacklisted hosts table with an expiration of one month.
- SpamAssassin:
  - SQL based Bayesian scoring
  - SQL based auto whitelisting (the more mail you get from a sender, the lower their email is rated for spam)
  - global, site and user level based prefs
- connection logging/profiling: I keep track of every single IP address that connects to our server. As they progress through an SMTP connection, I update certain values on that record:
  ip, cnxn_count, cnxn_first, cnxn_last, reverse_ok_count,
  helo_ok_count, quit_count, bad_from_count, bad_rcpt_count,
  ok_rcpt_count, dnsbl_block_count, last_dnsbl_time, last_dnsbl
- whitelisting (globally or per site):
  - by sender
  - by recipient
  - by IP or IP range
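The auto blacklist rule is simple enough to sketch. The real thing lives in the exim config and a MySQL table; the version below is only an in-memory illustration, with hypothetical names and addresses and "a month" approximated as 30 days.

```python
from datetime import datetime, timedelta

# Hypothetical stand-ins for the server's own names and IP addresses.
MY_IDENTITIES = {"mail.example.com", "192.0.2.10"}
BLACKLIST_TTL = timedelta(days=30)  # "a month" approximated as 30 days

blacklist = {}  # remote ip -> expiry datetime


def check_helo(remote_ip, helo_name, now):
    """Blacklist a remote host that greets us with one of our own identities."""
    if helo_name.strip("[]").lower() in MY_IDENTITIES:
        blacklist[remote_ip] = now + BLACKLIST_TTL
        return False
    return True


def is_blacklisted(remote_ip, now):
    """True while the host has an unexpired blacklist entry."""
    expiry = blacklist.get(remote_ip)
    return expiry is not None and now < expiry


now = datetime(2007, 3, 19, 19, 0, 0)
print(check_helo("203.0.113.5", "mail.example.com", now))  # forged HELO
print(is_blacklisted("203.0.113.5", now + timedelta(days=29)))
print(is_blacklisted("203.0.113.5", now + timedelta(days=31)))
```

The expiry means a host that gets cleaned up is not punished forever, and the table does not grow without bound.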
I also implemented catchalls similar to the way that EIMS has them:
Through the use of a preferences table, I can also selectively turn on/off certain features of the mail server in real-time without touching the config:
allow_cnxns allow_trusted allow_authd allow_other greylist_on
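A sketch of how such switches can work, using Python's built-in sqlite3 module. The flag names are the ones listed above; the table layout and helper functions are assumptions for illustration, not the actual schema. The idea is that the flag is read from the table at decision time, so flipping a row takes effect on the next connection without a config reload.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per switch.
conn.execute("CREATE TABLE prefs (name TEXT PRIMARY KEY, enabled INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO prefs VALUES (?, ?)",
    [("allow_cnxns", 1), ("allow_trusted", 1), ("allow_authd", 1),
     ("allow_other", 1), ("greylist_on", 1)],
)


def pref_enabled(conn, name):
    """Look the flag up at decision time so a change takes effect at once."""
    row = conn.execute(
        "SELECT enabled FROM prefs WHERE name = ?", (name,)
    ).fetchone()
    return bool(row and row[0])


def set_pref(conn, name, enabled):
    """Flip a switch; unknown names are left untouched."""
    conn.execute(
        "UPDATE prefs SET enabled = ? WHERE name = ?", (int(enabled), name)
    )


print(pref_enabled(conn, "greylist_on"))
set_pref(conn, "greylist_on", False)
print(pref_enabled(conn, "greylist_on"))
```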
Also I liked the options for users that EIMS had: mailbox, forward and both, but ours has expanded features:
We also now have TLS based SMTP, POP3 and IMAP using a self-signed certificate.
> Thanks for any input you may be able to provide!
Here is a screen shot of the admin interface. (I completely rebuilt the ECM web admin interface to handle more features and deal with the changes in architecture.)

The Blocks column is the number of blocks in the past 24 hours and the last hour.
The Recent column is the number of accepted msgs in the past 24 hours and the last hour.
I also wrote a bunch of support scripts that tail through some of the logs and update the login times for those users in the database.
Things I haven’t implemented but are mostly already built, just need testing:
- logging of email sent by authenticated users.
- automatically feeding email sent to spam traps into Spam Assassin for bayesian scoring
- Archiving of email per site or per user for corporate entities required to do so.
So, there it is.
Hope that answers all your questions and more!
Posted by Brian Blood as Database, Mail Server at 12:03 AM CDT
For some reason, net-snmp, which the snmpd agent on OS X is based on, has been broken in several respects for a long while now.
First: there is some weird error that keeps coming up (sometimes) when you try to start snmpd:
nlist err: neither icmpstat nor _icmpstat found.
which I found the answer to fixing here.
Pretty simple, change this line in /System/Library/StartupItems/SNMP/SNMP from:
/usr/sbin/snmpd to /usr/sbin/snmpd -I -icmp
OK, that fixes that.
Second, I want to specify in the snmpd.conf file data for the “agentaddress”. I want to run snmpd on a different port and/or to restrict it to a backside interface.
so, every time I would put something like this into snmpd.conf:
agentaddress 16001
which is what the interactive snmpconf program put in there itself, I’d get this annoying error message in /var/log/snmpd.log
Error opening specified endpoint "16001"
Server Exiting with code 1
ugh.
I finally figured out how to make the stupid thing work. Don’t put the agentaddress specification into snmpd.conf; add it to the command that launches snmpd as it can be a command line option.
So finally to get snmpd on OS X to work:
/usr/sbin/snmpd -I -icmp
becomes:
/usr/sbin/snmpd -I -icmp 16001 or
/usr/sbin/snmpd -I -icmp 127.0.0.1:16001
So apparently the issue is somewhere in the code that picks up this information from the config file and not with snmpd in general.
Like I said: VERY annoying.
Update: John Welch of AFP548.com fame has a new article giving a primer on snmp and its use/setup in 10.5/Leopard Server. From what we’ve seen with Leopard Server, Apple has fixed some of the basic flaws in snmp. As in, you don’t have to do the nonsense above any longer.
Posted by Brian Blood as OS X Server, Servers at 4:22 PM CDT
We manage a lot of different servers, mostly web application servers running PHP and MySQL.
The physical layout of these systems, due to the hardware involved, is often quite different from server to server. As a result, the placement of the data repository, binary logs and other log files is not the same.
I got tired of having to track down where each of these items was located, so I started using the following structure where possible. Where that is not possible, these same items still exist, but are merely symlinks to the actual locations.
/var/mysql
- mysql.sock - just a symlink to /tmp/mysql.sock. This was due to a minor config issue Apple had in OS X Server 10.4, that they finally corrected.
- binlogs - the binary logs generated if this system is intended to be a replication master
- data - link to the actual location of the datadir
/var/log/mysql
- error.log - the error log for the machine. I never liked how mysql named the error log after the hostname. Now, I’m telling it, use this name
- slowq.log - slow query log
Now all the items that I normally need to get to have easy to remember, standardized “locations”.
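The layout above can be sketched as a small script. This is only an illustration under stated assumptions: it builds everything under a scratch directory instead of the real /var, the data directory path is a hypothetical example, and the `build_layout` helper is invented here. The symlinks are created even if their targets do not exist yet.

```python
import os
import tempfile


def build_layout(root, datadir, socket_path):
    """Create the standard names under root as symlinks to the real locations."""
    var_mysql = os.path.join(root, "var", "mysql")
    os.makedirs(os.path.join(var_mysql, "binlogs"))
    os.makedirs(os.path.join(root, "var", "log", "mysql"))
    os.symlink(datadir, os.path.join(var_mysql, "data"))
    os.symlink(socket_path, os.path.join(var_mysql, "mysql.sock"))
    return var_mysql


# Scratch directory so the sketch does not touch the real /var.
root = tempfile.mkdtemp()
# "/Volumes/Data/mysql" is a hypothetical datadir on a separate volume.
layout = build_layout(root, "/Volumes/Data/mysql", "/tmp/mysql.sock")

for name in sorted(os.listdir(layout)):
    path = os.path.join(layout, name)
    target = os.readlink(path) if os.path.islink(path) else "(directory)"
    print(f"{name} -> {target}")
```

Pointing the error log and slow query log at /var/log/mysql is then a matter of the server's own configuration, which this sketch does not cover.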
I’m sure this is old hat for some. Sometimes it takes me a while to figure out the simple stuff. 
Posted by Brian Blood as Database, Servers at 12:25 PM CST