August 4th, 2008

A Hi-Tech Method of Increasing Air Flow in a PowerMac G4

Some devices in our office server room can get a little warm, even with spot cooling, because the AC does not run full blast on nights and Sundays. To ensure better air flow, I’ve added a hi-tech piece of equipment to a PowerMac G4.

 

I normally would have added a PCI slot fan, but this server doesn’t have any available.

Posted by Brian Blood as Hardware, Servers at 10:52 AM CDT


July 29th, 2008

Root Server Hints file out of date in Mac OS X Server

With all the recent attention on the DNS exploit and Apple’s delay in responding to it, I thought I would undertake a deeper review of some of our DNS systems. We were already protected from the exploit, as none of our unpatched DNS servers provide recursive service.

Turns out that not only has Apple not patched the BIND install as of July 29, they haven’t really kept up with the installed config files, specifically the root servers hints file (/var/named/named.ca)

DNS is a distributed system, but a DNS resolver has to be given a place to start in its search for your name resolution request. That’s what the 13* root servers are about. They are the top of the tree, so to speak. They are strategically placed around the world, load balanced, and are THE critical part of the Internet infrastructure.

The hints file is populated with names/IP addresses of the 13 root servers preset like so:

A.ROOT-SERVERS.NET. 3600000 A 198.41.0.4

and so on, from A through M.

Occasionally, the IP address of one of the root servers will change. There have been two updates in the past 4 years: B and L.

Old
B.ROOT-SERVERS.NET. 3600000 A 128.9.0.107
L.ROOT-SERVERS.NET. 3600000 A 198.32.64.12

New:
; updated Jan 29, 2004
B.ROOT-SERVERS.NET. 3600000 A 192.228.79.201
; updated Nov, 1, 2007
L.ROOT-SERVERS.NET. 3600000 A 199.7.83.4

In Tiger Server, both B and L are out of date even with all updates and security patches applied

Leopard Server (up to 10.5.4) still shows the L root server out of date.

Update your /var/named/named.ca with the new entries preferably with this file:

ftp://rs.internic.net/domain/named.root

then stop/start the named process
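Before replacing the file, it can help to check whether a given server is affected at all. The function below is a minimal sketch, not something shipped with OS X Server: the name `stale_root_hints` is made up for illustration, and it only looks for the two old B and L addresses quoted above.

```shell
# stale_root_hints FILE: print any lines that still carry the old B or L
# root server addresses. Exit status 0 means stale entries were found.
# (Hypothetical helper; the two addresses are the "Old" ones listed above.)
stale_root_hints ()
{
    grep -F -e '128.9.0.107' -e '198.32.64.12' "$1"
}

# Usage:
#   stale_root_hints /var/named/named.ca && echo "hints file needs updating"
```

If it prints nothing, neither old address is present in that file.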

The hints file from internic will also include IPv6 information.

The guys at Renesys wrote a good article about the potential dangers of not querying the correct root servers.

Links:
http://www.root-servers.org/

* - The count is 13 because that’s the maximum number of root server records that can be crammed into a single 512-byte UDP DNS response.

Posted by Brian Blood as OS X Server, Servers at 12:47 PM CDT


July 21st, 2008

MySQL 5 limits on Windows OS - 2048 max open files

A customer of ours recently asked us to help them troubleshoot some performance problems they have been having with their ASP/MySQL based solution. They are running the latest version of MySQL 5 under Windows 2003 Server 64-bit edition. Their IIS/ASP based application is using an ODBC connection to connect to MySQL.

They have recently added a partner that is sending them sales leads through an API call. There are occasional peaks in this traffic where the incoming connections into IIS far exceed the number of concurrent open connections into MySQL they have been able to handle. Tracking this down through the application stack, we looked at several low-level file/network system items:

  1. Ephemeral port exhaustion
  2. Stale socket disposal
  3. Max Open Files

#1 and #2 have caused some issues for another customer in the past, but this time it turned out to be #3. Even though they had set an appropriate value for max-connections in their my.ini file:

max-connections = 4000
table-open-cache=512

MySQL was artificially reducing these settings at runtime to much lower than was needed to support this flash traffic. Upon startup, MySQL would emit this into the Application Event Log:

Changed limits: max_open_files: 2048 max_connections: 1910 table_cache: 64

According to this bug report on the MySQL site, the following approximate formula is used to determine the maximum open files for MySQL:

table_cache*2+max_connections
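Plugging the numbers from this post into that formula shows the gap. This is only a sketch of the arithmetic; the function name `open_files_needed` is made up for illustration.

```shell
# open_files_needed TABLE_CACHE MAX_CONNECTIONS
# Approximate descriptor count per the formula above:
#   table_cache*2 + max_connections
open_files_needed ()
{
    echo $(( $1 * 2 + $2 ))
}

# Requested in my.ini: table-open-cache=512, max-connections=4000
open_files_needed 512 4000    # prints 5024, well over the 2048 limit

# What MySQL fell back to: table_cache=64, max_connections=1910
open_files_needed 64 1910     # prints 2038
```

The reduced settings come to 2038, just under the 2048 ceiling. The remaining 10 descriptors are presumably held back for MySQL's own use, which would explain why the formula is described as approximate.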

This condition has been reported as far back as 2006. A fix that switched MySQL from using the C Runtime Library in the Windows binary for opening files to using the native Win32 API calls was completed in mid June 2008. However, this will only be rolled into the MySQL 6 line of development.

We looked into raising this limit to something higher than 2048, but according to this site, that value is hardcoded. This particular customer has at least one thing going for them: their exclusive use of the InnoDB engine. This engine uses native calls, so the table cache value does not need to be very high and they can maximize the number of open descriptors available for incoming connections.

Realistic Options:

Posted by Brian Blood as Database, MySQL, Servers, Web App Development at 3:42 PM CDT


Windows Server - Ephemeral Ports and Stale Sockets

We’ve been managing a mail server for a customer of ours for about 5 years now. We started them out, as their needs were very modest, on a Windows 2000 Server running ArgoSoft Mail Server and then early 2005, we migrated them to a newer system running Windows 2003 Server Web Edition running MDaemon from AltN.

We chose Web Edition as it was the least expensive of the Server product line and since we were adding third-party software, this system didn’t require any specific built-in Microsoft server components that are available with Standard, Enterprise or Small Business Server.

This system ran very nicely for quite some time as usage continued to grow.

Late 2006, the system had over 1500 active email addresses and over 500 active POP3 users. Over one particular weekend, many users started to see intermittent connection failures. After investigating the issue we came up with two particular issues that needed tweaking.

First, a bit of background material. When an application based on the TCP/IP protocol wants to communicate with another system, two pieces of information are required: the destination IP address and the port. Systems that provide a service listen on the Well Known Ports, since a convention is needed for the particular protocol. For example, Simple Mail Transfer Protocol (SMTP), the language that mail servers use to transfer mail, uses the well known port 25.

A system that is initiating a connection to another that is listening on a well known port (like an email program on a client machine to a mail server) must also use a particular port at the source so that packets sent from the receiving end can properly reach back to the initiating program. These ports are usually picked at random and are only used for a very short duration, and as such are called the Ephemeral Ports. On a mail server, such ports come into use when the mail server software runs tests for things such as spam processing, DNS blacklist checking, anti-virus updates, etc., so a system can quickly eat a large number of these ports as necessary for mail processing. Delivery also requires ephemeral ports, for the outgoing SMTP connections.

Second, sockets that are closed (communication between the two ends is closed gracefully) do not immediately go back into the pool of available ports. These are in what is called the TIME_WAIT state. The delay exists so that stray packets from the old connection can expire before the same port pair is reused; the IBM tuning notes add that during this time, reopening the connection to the client and server costs less than establishing a new connection. IBM: Windows Tuning

In the end, we decreased the TcpTimedWaitDelay value, so sockets could be recycled faster and we increased the MaxUserPort value, so we could have a larger pool of available sockets.
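A rough back-of-the-envelope sketch shows why both values matter. Each outgoing connection ties up one ephemeral port until its TIME_WAIT delay has passed, so the sustainable rate of new connections is roughly the pool size divided by the delay. The numbers here are assumptions for illustration only: the commonly cited defaults of MaxUserPort 5000 (ports 1025 to 5000) and TcpTimedWaitDelay 240 seconds, and example tuned values of 65534 and 30 that are not necessarily what we set on this server. The function names are made up.

```shell
# ports_available MAXUSERPORT
# Ephemeral ports run from 1025 up to MaxUserPort.
ports_available ()
{
    echo $(( $1 - 1024 ))
}

# max_new_connections_per_second MAXUSERPORT TIMEDWAITDELAY
# Integer estimate; ignores how long each connection actually stays open.
max_new_connections_per_second ()
{
    echo $(( ( $1 - 1024 ) / $2 ))
}

max_new_connections_per_second 5000 240     # assumed defaults: prints 16
max_new_connections_per_second 65534 30     # assumed tuned values: prints 2150
```

Under those assumptions, a mail server doing several DNS blacklist lookups per message could plausibly exhaust the default pool during a busy spell.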

What was curious was that the default values are essentially the same as on a Windows client version such as Windows XP. The base OS on this server was the “Web Edition”, so having to find and then tune these values on a server OS was a bit strange.

As they say, if you can’t measure it, you can’t manage it, so we set up two data sources and graphs in cacti to monitor the active and stale socket connections on this system:

Mail Server Active Connections

Mail Server Stale Connections
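The two numbers behind graphs like these can be derived from `netstat -an` output. The function below is a minimal sketch, not the actual data source we feed into cacti: the name `count_state` is made up, and it assumes the netstat output has been captured as text with the connection state as the last field on each line.

```shell
# count_state STATE
# Count lines on stdin whose last field is STATE (for example ESTABLISHED
# or TIME_WAIT). Trailing carriage returns from Windows output are stripped.
count_state ()
{
    awk -v state="$1" '{ sub(/\r$/, "") } $NF == state { n++ } END { print n + 0 }'
}

# Usage (netstat.txt is a hypothetical capture of "netstat -an"):
#   count_state ESTABLISHED < netstat.txt    # active connections
#   count_state TIME_WAIT   < netstat.txt    # stale connections
```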

Since making these tweaks, the server has been humming along smoothly.

Posted by Brian Blood as Servers at 3:30 PM CDT


June 24th, 2008

MySQL Replication Slave Control shell script

I set up a replication slave at our office to a MySQL server running at our colo and the master server is pretty busy. So busy that even with the compressed protocol option turned on the stream was taking a good 60-70 kbps out of the available bandwidth of our T1. Since it isn’t critical that this data be real-time slaved, I made a small shell script that can take parameters for starting and stopping either the sql or io threads: mysqlslavectl.sh

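#!/bin/sh -
#
# mysqlslavectl.sh - start or stop one of the replication slave threads
# usage: mysqlslavectl.sh start|stop sql|io
#

USER=mysqladmin
PASSWD=adminpwd
# Expanded unquoted below on purpose, so it word-splits into the command and its options
SCRIPT="/usr/local/mysql/bin/mysql -u$USER -p$PASSWD -e "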

StartService ()
{
    case $1 in
      sql  ) $SCRIPT "START SLAVE SQL_THREAD"   ;;
      io   ) $SCRIPT "START SLAVE IO_THREAD"    ;;
      *      ) echo "$0: unknown Start argument: $1";;
    esac
}

StopService ()
{
    case $1 in
      sql  ) $SCRIPT "STOP SLAVE SQL_THREAD"   ;;
      io   ) $SCRIPT "STOP SLAVE IO_THREAD"    ;;
      *      ) echo "$0: unknown Stop argument: $1";;
    esac
}

CheckCommand ()
{
    case $1 in
      start  ) StartService "$2"   ;;
      stop   ) StopService "$2"    ;;
      *      ) echo "$0: unknown argument: $1";;
    esac
}

CheckCommand "$1" "$2"

I can call this like so:

mysqlslavectl.sh start io

mysqlslavectl.sh stop sql

I set up a couple of crontab entries, one to start the io thread at 8 PM and one to stop the io thread at 6 AM.

The sql thread will always be running.
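For reference, the two crontab entries would look something like the output of the sketch below. The function name `print_slave_cron` and the install path in the usage comment are assumptions for illustration; use wherever the script actually lives.

```shell
# print_slave_cron SCRIPT_PATH
# Print two crontab lines: start the io thread at 20:00, stop it at 06:00.
print_slave_cron ()
{
    printf '0 20 * * * %s start io\n' "$1"
    printf '0 6 * * * %s stop io\n' "$1"
}

# Usage (path is hypothetical):
#   print_slave_cron /usr/local/bin/mysqlslavectl.sh
```

The printed lines can be pasted into the file opened by `crontab -e`.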

Posted by Brian Blood as Database, MySQL at 6:30 PM CDT


May 22nd, 2008

Converting PowerMac G4 to 2U Server

Many years ago, before there were Xserves, in an attempt to save rack space in our cabinets, we experimented with ripping the guts out of a PowerMac G4 and stuffing them into a 2U server case. Here is a photo of one of those attempts:

It worked out fairly well and we had a couple of these running for several years.

The main piece to the puzzle was finding the right L riser card to give us access to the AGP slot and 2 of the PCI slots, one for a secondary Ethernet card and another for an ATTO SCSI card.

Posted by Brian Blood as Colocation, Hardware, Servers at 10:15 AM CDT


February 25th, 2008

YouTube briefly offline due to Pakistani ISP

So apparently YouTube, according to a Pakistani telecommunications authority, carries content that is “deemed offensive to Islam.”

So the ISP, either purposefully or accidentally, added custom routing configurations to its routers to block YouTube. The unfortunate side-effect was that these BGP announcements were propagated to a large part of the world’s routers, taking YouTube offline for a good number of people.

This fellow has a pretty good summation/chronology of what happened:

Renesys Blog: Pakistan hijacks YouTube

Posted by Brian Blood as Routers and Firewalls, Soap Box at 11:25 AM CST


January 17th, 2008

Converting disks from Apple Software RAID version 1 to version 2

We have a few servers that have been running, through successive upgrades, since the 10.2 and 10.3 days. Most all are running Tiger Server, with one or two running Leopard.

Since all our Xserve G4s run with dual mirrored pairs, we have quite a few of these software RAID sets.

The trick is that if a mirror pair becomes degraded, your server is now vulnerable because 10.4 disk utilities will not allow you to rebuild a v.1 raid set. You MUST convert the RAID set to v2 before it can be restored.

Unfortunately, the convertRAID verb of the command line diskutil had some issues. Specifically, if your drives had the OS 9 drivers installed on them, or there wasn’t enough room to shrink the current partition, the convertRAID operation would destroy the partition map of your disk.

As a result, the only way to get these volumes converted to v2 was to take the volume offline and run a Disk Utility Restore operation from the v1 pair/disk to a new v2 pair/disk.

Since we have a handful of v1 RAID pairs that are the boot volume, being able to take a server down long enough to perform this operation is sometimes difficult.

The fine folks at SoftRAID have added a new feature to their latest version that allows you to convert RAID sets from the Apple RAID to a SoftRAID version and back. We’ve tested converting from v1 to SoftRAID format then to v2 and it works well. We had some strange behavior from the partition maps, but Mark James and the engineering staff gave us some tips on what to look for and this cleared those issues up. If you can’t run hardware RAID, get yourself a copy of SoftRAID.

As a lark, we also booted the server we used to test all of this with Leopard Server and tried the diskutil convertRAID command to see if Apple had fixed that operation and, hallelujah, it worked!

It even turned a single disk degraded v1 raid into a single member v2 raid set that could easily have another drive added to it for bringing it back to full redundancy. Good news this is, as we won’t have to have a server with a boot volume that needs converting down for longer than it takes to boot from Leopard (an external FireWire drive of course), run the conversion, then reboot.

If you are running a server and do not have fully redundant (RAID is NOT backup) boot and data partitions, get thee to a store and buy another drive and add it in. The diskutil enableRAID command also works very well on a single disk.

Posted by Brian Blood as OS X Server, Servers at 10:40 AM CST


January 9th, 2008

Ubuntu/Debian on an Intel MacMini

In our previous adventures with Mac Minis as “blade” servers, I thought we might try installing Ubuntu/Debian on an Intel MacMini and seeing how the system performed against an OS X client based system.

Well, we did that and about a week later we wiped the machine and imaged off one of the other Minis and set it back up under OS X.

We had one of our techs scour all he could find on the net about installing Linux on an Intel MacMini and the biggest hurdle was getting something working in the EFI realm.

We ended up using rEFIt, a project on sourceforge, to allow us to dual boot into either Debian or Tiger. This had some issues, but in the end it worked out ok.

The USB Ethernet adapter also worked rather well right out of the box.

No, the real kicker was the on-board gigabit ethernet, which is used on the backside primarily for database access. The Mini uses the Yukon based chipset for its GigE port, and this combination with the default ethernet driver installed by Debian induces a flow-control hang under certain loads.

Marcus Bointon hinted as much in comment #6 to my original article, so when the Debian Mini developed problems communicating over that interface, I was pretty sure where to look.

Debian by default picks the “sky2” driver for that PHY and it wasn’t cutting the mustard. Apparently this bug has been around for a couple of years (the chipset is also used on some other system boards) and the “workaround” is to compile a different ethernet driver into the kernel, which solves the issue. Since running Debian on this system was merely a trial, we decided to punt instead of sinking more time into tinkering with it.
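For anyone checking their own machine, the driver bound to an interface can be read from `ethtool -i`. The function below is a small sketch for pulling the driver name out of that output; the name `nic_driver` is made up, and the interface name eth0 in the usage comment is an assumption.

```shell
# nic_driver
# Read "ethtool -i <iface>" output on stdin and print the driver name.
nic_driver ()
{
    awk -F': *' '$1 == "driver" { print $2 }'
}

# Usage (interface name is an assumption):
#   ethtool -i eth0 | nic_driver
```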

Under Debian, the Mini did actually perform about 10% better than when it was running Tiger. Ultimately, the OS turned out to not be the biggest factor in getting more performance out of the load balanced system as a whole. Tuning Apache and making some other improvements to the web application proved to be far more useful.

Posted by Brian Blood as Hardware, Linux, Servers at 11:57 PM CST


OS X - Server Monitor crazy tech note

Server Monitor is an application that allows you to monitor the health of several Xserves over the network:

Server Monitor

Sometimes the application gets a bit cranky about the connections it makes to the servers and reports that it can’t communicate or as you see here in the picture “reply not understood”. So we don’t really use it for serious monitoring other than as a cursory glance usually to check some items.

However, Apple really takes the cake with this knowledge-base article:

Xserve: Server Monitor does not authenticate with server over subnet

in which they claim that the way to fix the problems with their SOFTWARE, is to:

  1. Make the necessary changes to the username or password using Server Monitor.
  2. Quit Server Monitor.
  3. Shut down the Xserve that is the target of these changes.
  4. Remove the power cord from the back of the Xserve.
  5. Wait 30 seconds and plug the power cord back in.
  6. Power the server back on.

This sounds suspiciously similar to something an old tech friend of mine once told me:

There are sound scientifically proven reasons why one must sometimes sacrifice a chicken in order to get a SCSI chain to work.

Ugh.

Posted by Brian Blood as Colocation, Hardware, OS X Server, Servers, Soap Box at 11:33 PM CST
