Press "Enter" to skip to content

Network Jack Posts

Debian 6.0 Squeeze on Xserve G5 with 4TB

As far as Apple servers go, Xserve G5s are now in a tight spot.

  1. Good: Reasonably fast CPUs; certainly powerful enough for most any internet-based web site
  2. Good: They can take 8/16GB of RAM
  3. Good: They support SATA disks so at least you can buy modern replacements.
  4. Bad: Mac OS X Server support halted at Leopard
  5. Good/Bad: Hardware RAID works, for the most part.
  6. Good: FireWire 800
  7. Bad: Only one power supply
  8. Bad: Only 2 slots for expansion cards. No built-in video.
  9. Good/Bad: Market value is pretty low right now.

I’m not one to let extra server hardware lie around. I’ll find a use for it. I still have Xserve G4s in production. However, I’d like to see a more up-to-date, leaner OS run on it, and Debian keeps a very good PowerPC port up to date. With the latest rev, 6.0, just released, I thought I would combine the two and see what results. My main goal is to be able to continue to use these machines for certain specific tasks and not have to rely on Apple to keep the OS up to date, as Leopard support will surely be dropped pretty soon.

Some uses I can think of immediately:

  1. Dedicated MySQL replication slave – with enough disk space and RAM, I can create multiple instances of MySQL configured to replicate from our different Master Servers and perform mysqldumps for backup purposes on the slaves instead of the masters.
  2. Dedicated SpamAssassin, ClamAV scanners.
  3. Bulk mail relay/mailing list server.
  4. DNS resolver
  5. Bulk File Delivery/FTP Server
  6. Bulk Backup storage.
  7. iSCSI target for some of the Xen-based virtualization we have been doing. Makes it easy to back up the logical volume for a domU: just mount the iSCSI target from within dom0 and dd the domU’s LV over to an image file on the Xserve G5.
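
For that last item, the backup itself is just a dd from within dom0. A minimal sketch, assuming a domU logical volume at /dev/vg0/domU1-disk and the Xserve’s iSCSI-backed volume already mounted at /mnt/xserve (both paths are made up):

# copy the (ideally shut-down) domU's LV to an image file on the mounted volume
dd if=/dev/vg0/domU1-disk of=/mnt/xserve/domU1-disk.img bs=4M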

First goal is to determine how easy it is to install/manage this kind of setup. Second is to define how well the system performs under load.

As for configuration, the main thing I’m curious about is whether the Hardware RAID PCI card works and is manageable from within Debian. I would likely choose to not use that card, as doing so would require more stringent/expensive disk options and would take another PCI slot. In the end, I’ll likely lean towards a small SSD to use for the Boot volume and then software RAID mirror 2 largish SATA drives in the other bays. I don’t expect to use this system for large amounts of transaction work, so something reliable and large is the goal as we want to extend the life of this system another couple of years.

Partitioning and Basic Install

Using just a plain 80GB SATA disk in Bay 1, I was able to install from the NetInstall CD without issue. The critical item appears to be creating a set of partitions correctly to make sure the OpenFirmware will boot the system correctly:

  1. 32k Apple partition map – created automatically by the partitioner
  2. 1MB partition for the yaboot boot loader
  3. 200MB partition for /boot
  4. 78GB partition for /
  5. 1.9GB partition for swap

Things installed smoothly and without issue. Worked like a normal Debian install should. System booted fairly quickly; shutdown and cold-boot worked as expected.

This partitioning setup comes from: XserveHowTo

Hardware RAID PCI card compatibility

I dropped in one of these cards and a set of 3 Apple-firmware drives I had lying around and booted the system off the CD. Unfortunately, I immediately started getting spurious keyboard/video/network failures. No big loss: keeping that card with bigger drives would require buying expensive Apple-firmware drives anyway. No thanks. This is a simple bulk data server, so I pulled the card out, which leaves me a slot free for another NIC or some other card.

Software RAID and the SSD Boot disk

Linux also has a software RAID 5 capability, so the goal will be to use 2TB SATA disks in each of the three drive bays, then use software RAID5 to create a 4TB array. One important thing to note when putting these newer, bigger disks into the Xserve: make sure to put the jumper on the drive’s pins to force SATA I (1.5Gbps) mode. Otherwise the SATA bus on the Xserve will not recognize the drive. Your tray will simply have a continuously lit activity LED.

With the 3 drive bays occupied by the 2TB drives, instead of configuring and installing the OS on the RAID5 array, I thought I would be clever and put a simple little 2.5″ SSD into a caddy that replaces the optical drive and that would serve as my boot drive.

The optical drive in the Xserve G5 is an IDE model, but no worry, you can purchase a caddy with an IDE host interface and a SATA disk interface. The caddy has an IDE/SATA bridge built into it.

I happened to have a 32GB IDE 2.5″ SSD, so I got a straight IDE/IDE caddy. Ultimately, you will want to have that drive in place when you run the installer, which turns out not to be so easy, but it is doable.

The general outline for this install is: perform a hard drive media install with an HFS-formatted SATA disk in Bay 1. Install to the SSD in the optical caddy, then set up the MD RAID5 device comprising the 3 x 2TB disks AFTER you get the system set up and running on the SSD. Because of the peculiarities of OpenFirmware and the yaboot boot loader, it’s much simpler to get the system installed on a setup that will be the final configuration.

Hard Disk Based Install

Basic outline:

  1. Format new HFS volume on a SATA disk
  2. Copy the initrd.gz, vmlinux, yaboot and yaboot.conf files onto the disk (a minimal yaboot.conf sketch follows this list).
  3. Place disk into ADM tray and insert into bay 1.
  4. Have a USB stick with the broadcom firmware deb package on it plugged into the server.
  5. Boot the machine into OpenFirmware (Cmd-Apple-O-F)
  6. issue command: boot hd:3,yaboot; If that doesn’t work try: boot hd:4,yaboot
  7. choose “install” from yaboot screen
  8. perform standard Debian installation
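
For reference, a minimal yaboot.conf sketch for booting the installer in step 2; the paths and sizes here are assumptions, so adjust them to match wherever you copied the installer files on the HFS volume:

# minimal yaboot.conf for the hd-media install (paths/sizes are assumptions)
default=install
timeout=100

image=/vmlinux
        label=install
        initrd=/initrd.gz
        initrd-size=10240
        read-only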

From this point on, there aren’t really any differences between this system and any other Debian install.

RAID and LVM

After installation is complete, shut the system down and insert the three 2TB disks and boot back up.

Install MDADM and LVM packages:

apt-get install mdadm lvm2

Basic steps for creating the RAID 5 array:

  1. Setting up a partition table and a single Linux RAID partition on each 2TB drive
  2. Creating the RAID5 array with:   mdadm --create /dev/md0 --level=raid5 --raid-devices=3 /dev/sda2 /dev/sdb2 /dev/sdc2
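
Once the array is created it will spend several hours building in the background, so a couple of standard mdadm housekeeping commands are worth running:

cat /proc/mdstat                                  # watch the RAID5 build progress
mdadm --detail /dev/md0                           # confirm the array layout
mdadm --detail --scan >> /etc/mdadm/mdadm.conf    # persist the array definition for boot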

Logical Volume Manager setup:

The idea here is to grab the entire 4TB logical disk and designate it as a “Physical Volume”, upon which we will put a single Volume Group for LVM to manage. That way we can create separate Logical Volumes within the VG for different purposes. iSCSI target support will want to use an LV, so we can easily carve out a 1TB section of the Volume Group and do the same for other purposes as needed.

Basic LVM setup commands:

  1. pvcreate /dev/md0
  2. vgcreate vg1 /dev/md0

Here are the resulting disks and volume group:

root@debppc:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda4              26G  722M   24G   3% /
tmpfs                 492M     0  492M   0% /lib/init/rw
udev                  486M  204K  486M   1% /dev
tmpfs                 492M     0  492M   0% /dev/shm
/dev/hda3             185M   31M  146M  18% /boot
root@debppc:~# vgs
 VG   #PV #LV #SN Attr   VSize VFree
 vg1    1   0   0 wz--n- 3.64t 3.64t

Then:

lvcreate --size=1T --name iSCSIdiskA vg1

So to recap the layers involved here:

  1. 3 physical 2TB disks, sda, sdb, sdc
  2. Linux RAID type partition on each, sda2, sdb2, sdc2 (powerpc partitioning likes to put a 32K Apple partition at the start)
  3. Software RAID5 combining the three RAID partitions into a single multi-disk device: md0
  4. LVM taking the entire md0 device as a Physical Volume for LVM
  5. A single Volume Group: vg1 built on that md0 Physical Volume
  6. A single 1TB Logical Volume carved out of the 4TB Volume Group
  7. iSCSI could then share that Logical Volume out as a “disk”, which is seen as a block-level device to be mounted on another computer, which can format/partition it as it sees fit. Even a Mac with an iSCSI initiator driver, which could be in another country since it’s mounted over an IP network.
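
For item 7, here is a hedged sketch of what a target definition looks like with the iscsitarget (IET) package mentioned in the summary below; the IQN is made up and the config file location varies by version:

# /etc/iet/ietd.conf -- export the Logical Volume as a block device
Target iqn.2011-02.net.example:xserveg5.iscsidiska
        Lun 0 Path=/dev/vg1/iSCSIdiskA,Type=blockio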

Performance

Used to be that running RAID5 in software was “inconceivable!”, but the Linux folks have latched onto the great SIMD engines that the chip manufacturers have put into their products over the years and are using that hardware directly to support RAID xor/parity operations. From dmesg:

[   14.298984] raid6: int64x1   1006 MB/s
[   14.370982] raid6: int64x2   1598 MB/s
[   14.442974] raid6: int64x4   1769 MB/s
[   14.514972] raid6: int64x8   1697 MB/s
[   14.586965] raid6: altivecx1  2928 MB/s
[   14.658965] raid6: altivecx2  3631 MB/s
[   14.730951] raid6: altivecx4  4550 MB/s
[   14.802961] raid6: altivecx8  3859 MB/s
[   14.807759] raid6: using algorithm altivecx4 (4550 MB/s)
[   14.816033] xor: measuring software checksum speed
[   14.838951]    8regs     :  5098.000 MB/sec
[   14.862951]    8regs_prefetch:  4606.000 MB/sec
[   14.886951]    32regs    :  5577.000 MB/sec
[   14.910951]    32regs_prefetch:  4828.000 MB/sec
[   14.915087] xor: using function: 32regs (5577.000 MB/sec)

So, running software RAID 5 should have minimal effect on the overall performance of this machine.

Summary

I’ve been running the system for a couple of weeks and a couple of observations:

  1. the software RAID 5 has not been a big deal as far as I can tell.
  2. after installing the hfsplus debian package I was able to attach and mount a firewire drive with data I wanted to move over quickly from a Mac OS X Server.
  3. I installed and compiled in the iscsitarget kernel module and started creating iscsi target volumes for use on some other servers. very nice.
  4. I configured my network interfaces using some clever ip route statements I found to attach and dedicate a different gigabit NIC for iSCSI purposes, even though both interfaces are on the same subnet (a rough sketch follows this list).
  5. The performance on the system is adequate, but not stupendous. I was copying about 300GB worth of MySQL tables from a database server using an iSCSI target volume and the load on the Xserve stayed around 8 for a couple of hours. Whether that’s good/bad I’m not sure.
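
The routing trick in item 4 boils down to source-based policy routing. A rough sketch, assuming eth1 is the iSCSI-dedicated NIC at 192.168.1.12 on a shared 192.168.1.0/24 subnet (the addresses and table name are made up):

echo "100 iscsi" >> /etc/iproute2/rt_tables            # create a dedicated routing table
ip route add 192.168.1.0/24 dev eth1 src 192.168.1.12 table iscsi
ip rule add from 192.168.1.12 table iscsi              # traffic sourced from eth1's IP stays on eth1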

Overall, it’s been an interesting exercise and I’m really glad I could repurpose the machine into such a useful item.

Differences in Hardware/Software for an Email Server

One of our customers is running our ECMSquared Email server solution and recently decided they had outgrown the platform it was installed on. Mailbox access was slow, webmail was slow and it felt constantly overloaded.

When planning for an upgrade like this, you have to budget not only for the hardware but for the expert’s time, and this customer was on a tight budget. They decided that spending money on our services to make sure the transition went smoothly was a higher priority than getting the biggest, fanciest hardware rig. After all, this is email: a service that may not seem critical, but it’s the first thing people notice is not functioning correctly. So we put together a proposal for the migration.

Old system: Apple Xserve G5 – 2x 2.0Ghz G5 – 6GB RAM – 3 x 250GB SATA H/W RAID 5 running Tiger Server.

Upgrading the OS on the system from Tiger to Leopard Server should have yielded some performance gains, especially with the finer-grained kernel locking introduced in Leopard, but with the main issue being slow mailbox access, we felt that the file system was going to continue to be the biggest bottleneck. HFS+ doesn’t handle 1000s of files in a single directory very efficiently, and having to enumerate a directory like that on every delivery and every POP3/IMAP access was taking its toll. Also, with Apple discontinuing PPC support along with the demise of the Xserve, the longevity of this hardware was assessed as low.

The decision was made to go to a Linux based system running ext3 as the file system. Obviously this opened up the hardware choices quite a bit.

A mail server is very much like a database server in that the biggest bottleneck is almost always disk throughput, not CPU or network. Based on the customer’s budget concerns, we wanted to get them the biggest, fastest drive array the budget allowed in the eventual system. There aren’t a lot of choices when it comes to bigger/faster hard drives within a reasonable budget, so we ended up choosing 3 x 146GB 10k rpm SCSI drives in a RAID 5 array.

New System: Dell PowerEdge 1750 – 2x 3.2Ghz Xeon – 8GB RAM – 3x 146GB NEW SCSI drives in HW RAID 5

Obviously this is relatively old hardware, but we were able to get everything procured along with some spare drives for ~$600

We installed Debian Lenny and a custom-compiled version of Exim onto the system and ran several days of testing.
Then we migrated their system over late one night and everything went smoothly.

The change in that hardware/OS/file system stack produced the following graphic for the Load Average for the system:

[Graph: Load Average, before and after the migration]

You can see how dramatic the difference is in how loaded the server was before versus after. The customer is very happy with the snappiness of the system now.

Even though the server hardware is a bit older, applying the right resources in the right spot is what makes things run very smoothly.

We expect many more years of usage from this system.

First Dead MacMini Power Supply

I was at the datacenter we are moving out of this evening, rearranging power connections for some straggler customers so I could free up some power feeds. As part of that process I was unplugging and replugging power supplies in a load-balanced set of MacMinis, and when I went to turn one of them back on, it would not power on. Turns out the power supply must have died, perhaps from a spike. I’ve worked with MacMinis of varying designs for many years now, easily over 50 of them, and this is the first one I’ve had die.

I guess that’s a good track record.

Ubuntu Server 10.04 on a Dell PowerEdge 2450

We have a Dell PowerEdge 2450 lying around doing nothing, and my friend asked me to set up a server for him so he has a dedicated system to do some Drupal work. I said, no problem….. Boy was I in for it.

I downloaded the Server ISO and burned it. After upgrading the RAM from 1GB to 2GB and setting up the 3 x 18GB 10k rpm SCSI disks in a RAID 5, I booted from the fresh disc. The Ubuntu installer came up, but when it dropped into the Debian base installation and tried to load components from the CD, it would get stuck about 17% of the way through, saying it could not read the CD-ROM any longer. So, I tried burning another copy…. Same thing.

OK, this system is pretty old, so I swap out the older CD-ROM for a tray-load DVD-ROM. Same thing, but at 21%. Grrr.

I try a THIRD CD burn in a different burner and it still halts at 21%. I pop into the pseudo-shell in the installer and try to do an ls on the /cdrom directory. I get some Input/Output error lines for docs, isolinux and some other items, but I do get some output lines from that directory….

OK, now I’m wondering if my ISO didn’t perhaps get corrupted in the initial download. Unfortunately, Ubuntu does NOT provide MD5 checksums on their ISO images, at least not directly on the website where you download it.
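
If you can track down a published MD5SUMS file for the release, the check itself is trivial (the ISO filename here is an assumption):

md5sum ubuntu-10.04-server-i386.iso    # compare against the matching line in MD5SUMS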

Let’s ask the Google. Apparently others have had the same issue since at least the 7.0 series. The Minimal CD works, but there doesn’t seem to be a way to install the Server version from that.

I finally find a post (see link below) where success was had by using a SECOND copy of the Installer in a USB connected CD-ROM drive. The system boots off the internal CD but pulls all the material off the CD on the USB drive.

It is finishing the install as I type this.

Wow, what a Rabbit-hole!

Just another example of: “Linux is free if your time is worth nothing.”

Speaking at Dallas TechFest 2010

If you will be in the DFW area at the end of July, please come see the talk I will be giving at the 3rd session of the PHP track on  Building Scalable PHP Web Applications.

The conference will be at the University of Texas at Dallas on July 30.

http://dallastechfest.com/Tracks/PHP/tabid/74/Default.aspx

Brian

The Great Leap Beyond One – Creating Scalable PHP Web Applications

I gave a presentation to the Dallas PHP user group on May 11, 2010 on Creating Scalable PHP Web Applications.

Download the presentation in PDF.

Here is a basic outline:

  • Introduction
    • Traditional Single Server and Dedicated DB-2 Server data flows.
    • What does it mean to be Scalable, Available and Redundant?
  • Planning your Delivery Architecture.
    • Delivery Resource Types – html/image/pdf/email/rss
    • URL types and origins for main text/html, static images, user generated media
  • Delivery Architecture Components
    • Web Servers
    • Database Systems
    • Load Balancers
    • Caching systems
    • PHP Application Code
  • Web Server systems
    • Make fast and agile and identical
    • Key concept: Web systems must be thought of as being Disposable.
    • Storage of source and non-source delivery resources
    • Deployment of web servers – OS/PHP Load, Code Deployment/Updates
  • Database systems
    • Hardest to Scale, throw money at this problem
    • Replication and caching layers can extend life/performance of primary database.
    • Make a plan to deal with Primary Failure – what in site will/won’t work.
    • Make a plan to deal with Primary Recovery
    • TEST THAT PLAN
    • Redundant MySQL Overview
    • Caching Layers Overview
  • Load Balancers
    • Hardware/Software List
    • Primary Features
    • Secondary Features
    • Example Service/Content rule Pseudo-Config
  • PHP Component Code changes
    • Sessions
      • Custom Session Data Handler
      • Basics and Gotchas
      • Example Session SQL Table
    • Non-Source File & Data Storage
      • Uploaded images/documents (avatars/photos)
      • System generated files (chart images for emails)
      • System Generated Data (calculated results data)
      • Data pulled from external system (RSS feed cache)
      • Store into shared system accessible by all front-ends
      • Admin system for forced pushes/cleanouts. Monitoring.
    • Resource Delivery
      • Simple and complex examples.
      • Code for abstracting URL generation – App::URL(‘logo.jpg’, ‘branded’)
      • Example of complex URL structures.
      • Delivery issues with CSS and JavaScript
      • Serving SSL protected content with references to external static media; needs SSL too!
      • Using ErrorDocument to create a Just-In-Time delivery system to media servers.
    • Periodic Process Execution
      • Using Queues and Semaphores to control execution.

Leopard Server Upgrade – postfix not logging or delivering

We have a development server for a client that was recently upgraded from Tiger Server to Leopard Server. This system holds the Subversion repository and the staging sites for their hosted application. One of the configured pieces is that whenever someone commits to the SVN repository, a post-commit hook sends a message to all the developers with the information from the revision commit. Email on this system is handled by the Apple built-in Postfix. When the system was upgraded, we noticed that we no longer received our SVN commit messages. Investigating this, I found two things that needed fixing.

My first problem was that the logging Postfix was sending to syslogd was very sparse, so I checked through all the settings twice in Server Admin and directly in the main.cf and master.cf files. It took me a while, but I finally looked at the /etc/syslog.conf file and found that the facility entry for mail was set to mail.warn. I then checked the Server Admin setting for the SMTP log level and set it to Debug.
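
For reference, after raising the level, the mail facility line in /etc/syslog.conf should end up reading something like this (the stock Leopard log path, but treat it as an assumption):

# the mail facility line; "mail.warn" was discarding normal delivery logging
mail.debug                                      /var/log/mail.log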

Second problem: now that logging was fixed, I could see that the relayhost set in the config was rejecting the messages. So not only were the original messages being rejected, the bounce messages were being bounced. Essentially anything being sent was dying a quick death. I fixed the relayhost setting, tried another message and BAM, message delivered.
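
If you’d rather fix the relayhost by hand instead of going through Server Admin, it’s a one-liner with postconf (the hostname is a placeholder):

postconf relayhost                              # show the current setting
postconf -e 'relayhost = [smtp.example.com]'    # point it at the correct relay
postfix reload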

Upgrading from Tiger to Leopard is an important step to take, but as with all upgrades, you really must go through all your settings once again to verify their correctness.

Site to Site VPN with Mac OS X Server and a NetScreen

A client needs to have a Site to Site VPN between a server at their office and a NetScreen at their colo.

I did a fresh new install of Leopard Server, fully and cleanly updated to 10.5.8, running on a G4 MacMini to make sure I can configure both sides properly.
My test server is on a clean public static IP address on the built-in Ethernet.
The secondary Ethernet, using a USB Ethernet adapter, serves the private side of the network.

System has no issues until…..

I used the s2svpnadmin cli tool to create a new shared-secret IPSec tunnel to a NetScreen at our colo.
Very basic setup, nothing fancy (not like the tool lets you do anything fancy.)

After creating the config I start to get these entries in my system.log:

Mar 10 12:55:56 test1 vpnd[1614]: Server 'TestColo' starting...
Mar 10 12:55:56 test1 TestColo[1614]: 2010-03-10 12:55:56 CST    Server 'TestColo' starting...
Mar 10 12:55:56 test1 vpnd[1614]: Listening for connections...
Mar 10 12:55:56 test1 TestColo[1614]: 2010-03-10 12:55:56 CST    Listening for connections...
Mar 10 12:55:57 test1 ReportCrash[1615]: Formulating crash report for process vpnd[1614]
Mar 10 12:55:57 test1 com.apple.launchd[1] (TestColo[1614]): Exited abnormally: Bus error
Mar 10 12:55:57 test1 com.apple.launchd[1] (TestColo): Throttling respawn: Will start in 9 seconds
Mar 10 12:55:57 test1 ReportCrash[1615]: Saved crashreport to /Library/Logs/CrashReporter/vpnd_2010-03-10-125556_MacServe-Test1.crash using uid: 0 gid: 0, euid: 0 egid: 0

and looking at the crash report:

Process:         vpnd [1614]
Path:            /usr/sbin/vpnd
Identifier:      vpnd
Version:         ??? (???)
Code Type:       PPC (Native)
Parent Process:  launchd [1]

Date/Time:       2010-03-10 12:55:56.252 -0600
OS Version:      Mac OS X Server 10.5.8 (9L34)
Report Version:  6
Anonymous UUID:  7E25DC5D-7D93-42B5-8F69-F7C823244418

Exception Type:  EXC_BAD_ACCESS (SIGBUS)
Exception Codes: KERN_PROTECTION_FAILURE at 0x0000000000000000
Crashed Thread:  0

Thread 0 Crashed:
0   ???                               0000000000 0 + 0
1   vpnd                              0x0000444c accept_connections + 1280
2   vpnd                              0x00002a08 main + 1572
3   vpnd                              0x00001a48 start + 68
4   ???                               0000000000 0 + 0

Thread 0 crashed with PPC Thread State 32:
srr0: 0x00000000  srr1: 0x4200f030   dar: 0x000513b0 dsisr: 0x42000000

…. etc. etc.

I do NOT have the VPN service “running”.

I did find this post on Apple discussions:

http://discussions.apple.com/thread.jspa?threadID=1491028#7116067

and followed the poster’s directions for manually starting the tunnel.
I still get a bit of fussing, but no crash.
I checked the IPSec SA/SPD info with setkey -PD, ran some basic pings across the network, and the tunnel is active.

The crashing doesn’t seem to be CPU-arch dependent, as my system is PPC and the OP on the Apple board is using an x86 machine.

Kind of a bummer. It looks like there is probably some really simple issue here as the crash apparently happens very early in the setup process: “accept_connections”.

Hopefully this will help someone in the future.

Oh and FYI:

Leopard Server IPSec parameters for a Shared Secret based VPN:

Phase 1: DiffieHellman Group 2, 3DES, MD5, lifetime: 28800

Phase 2: No Perfect Forward Secrecy; Encapsulated Packet (no AH); AES128 encryption; SHA1 hash; lifetime: 3600; Compression: Deflate (this is optional)
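
Those parameters line up with ScreenOS’s predefined proposals on the NetScreen side. A rough sketch from memory, so treat the gateway name, peer IP, secret and exact syntax (which varies by ScreenOS version) as assumptions:

set ike gateway "to-leopard" address 203.0.113.10 main preshare "sharedsecret" proposal "pre-g2-3des-md5"
set vpn "to-leopard-vpn" gateway "to-leopard" proposal "nopfs-esp-aes128-sha"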

Optimizing a NetScreen 5GT as a Transparent Firewall

We have some Windows-based servers that we colocate for some clients.

We’ve always insisted that those devices sit behind some sort of protection, and for a long time we’ve used a Cisco 2621 as a screening router for a smaller subnet of our main address space. Any traffic that wanted to reach the protected IPs was routed through this device, and we applied access-list screening both inbound and outbound.

Over time, this device became unable to handle the traffic that was pushed through it, and we decided to replace it. We had a 10-user model NetScreen 5GT that was untasked, and since we had only a handful of devices on that protected subnet, we found a new home for the 5GT as a transparent firewall for those systems.

The protected subnet was compartmentalized with the use of a non-tagging VLAN on our main Cisco customer-attach switch, so segregation of a broadcast domain was not an issue. We merely needed to configure the 5GT into Layer 2 mode and set up the right policies for both directions of traffic.

I like to filter Bogons on our network, so I started there. In this context, any traffic originating from the Untrusted side with a source IP that exists on the Trusted side can also be considered a Bogon, so I made sure that rule was in place as well. Since Defined Addresses must be defined in terms of a security zone, I had to set up our protected IPs in both zones so I could define the correct policy.

One problem I did run up against is the way sessions are handled in ScreenOS. The maximum number of sessions that can be tracked by this model of NetScreen is 2064, and in a busy period after installing the device we did get close to reaching that limit. The solution was to drop the timeout value for POP3 (one of the servers is a mail server) and HTTP/HTTPS in the Predefined Services section down to a very low value. This ensures a faster turnover of entries in the session table and keeps it further away from the limit. It does mean a bit more work for the CPU, but the NetScreen’s ASICs are up to the challenge.
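
From the ScreenOS CLI, the session check and the timeout tweak look roughly like this (the 5-minute value is just an example, and syntax may vary by ScreenOS version):

get session info                    # shows allocated vs. maximum sessions
set service "HTTP" timeout 5        # drop the predefined HTTP service timeout to 5 minutes
set service "POP3" timeout 5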

It has turned out to be a very good switch to better hardware, and managing access policies in the ScreenOS Web management is much easier than the Cisco ACL approach. My main gripe there is that to make a change to an access list, I have to remove it from the interface, remove it from the router, add the new access list back to the router, then reapply it to the interface. A very tedious chore.

noatime for Mac OS X Server boot disk

The new G4 MacMini with the SSD is running beautifully. However, there is one little detail I’d like to take care of to help prolong the life of the SSD: disable the atime updating in the file system.

When we build out Linux servers, one of the configuration changes we always make is to add a noatime flag to the mount options for the file systems. atime is the Last Access Timestamp and really is useless in a server environment.
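
On Linux that’s just a mount option in /etc/fstab, for example (the device and mount point here are examples):

/dev/md0    /srv    ext3    defaults,noatime    0    2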

After some empirical testing…..

Under Tiger:

# mount -uvw -o noatime /
/dev/disk0s3 on / (local, journaled)

No effect. It even produced this entry in the system.log:

Jan 8 14:19:27 vpn KernelEventAgent[34]: tid 00000000 received
unknown event (256)

Leopard:

# mount -vuw -o noatime /
/dev/disk4 on / (hfs, local, journaled, noatime)

where it looks to be supported…

The test is to check the last access time with ls -lu, then simply cat the file, then run ls -lu again.
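
The sequence, for any file you like (the path is an example):

ls -lu /Users/admin/testfile.txt              # note the access time
cat /Users/admin/testfile.txt > /dev/null
ls -lu /Users/admin/testfile.txt              # an unchanged access time means noatime is working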

I guess I’ll need to upgrade the Mini to Leopard Server!