...sine propero notiones


The TS7300 in my test rig, showing its own
power consumption. Ignore the temperature
data, I disabled those sensors.

High Availability and the TS7300 Single-Board Computer

Rec 19-jan-2008 09:42

A classic challenge in network monitoring is this: when you design a system to monitor another system's availability, the former must be more reliable/available than the latter. Say, if you are monitoring a bunch of redundant servers that yield an overall >99.99% availability, the monitoring system itself must be even more available/stable/reliable than that, say, >99.999%.
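
To put those percentages in perspective, here is a minimal sketch that converts an availability figure into allowed downtime per year. The function name and the 365-day year are my own assumptions for illustration; the point is simply that 99.99% leaves roughly 53 minutes of downtime a year, while 99.999% leaves only about 5.

```python
# Convert an availability percentage into allowed downtime per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; assumes a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes per year a system may be down at the given availability."""
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    for pct in (99.99, 99.999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% -> {minutes:.2f} min/year")
```

So the monitoring box gets a downtime budget ten times smaller than the servers it watches.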

Because of that, when I was choosing the hardware for my monitoring system, I wanted a physically small and rugged computer that had good internal redundancy and outstanding reliability. Essentially, a computer that "doesn't break". The most common causes of failure in PCs are: coolers, power supplies and hard disks. Let's start by replacing the hard disks with flash memory, which, with no moving parts, is way more reliable.

I also want it to stay alive even in the face of a 48-hour power loss. Not many UPS systems can support ordinary PCs for that long, and the ones that do cost a small fortune. If we can't scale up the UPS, let's scale down the power requirements: let's try a computer that draws less than 3W of power. (Ordinary PCs draw about 100W -- see this page to get a rough estimate for your case).

Yet, I want a computer powerful enough to run a full Debian Linux setup -- the monitoring software, like most of our production software, runs on this platform. The performance requirements are quite modest, though. We will be running lots of Perl scripts to poll services and count how many minutes each one has been operating in full redundant mode, in the various degraded modes, and offline. There will be about 30 or 40 such monitors.
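
The minute-counting idea can be sketched as follows. This is not the actual monitoring code (which is a set of Perl scripts); it is an illustrative Python sketch, and the state names, the `classify` rule and the `MinuteCounter` class are all hypothetical. It assumes each monitor samples once a minute how many redundant replicas of a service are answering.

```python
from collections import Counter

# Hypothetical state names: the text only mentions full redundancy,
# various degraded modes, and offline.
FULL, DEGRADED, OFFLINE = "full", "degraded", "offline"


def classify(replicas_up: int, replicas_total: int) -> str:
    """Map the number of healthy replicas to an availability state."""
    if replicas_up <= 0:
        return OFFLINE
    if replicas_up < replicas_total:
        return DEGRADED
    return FULL


class MinuteCounter:
    """Accumulates how many minutes a service spent in each state."""

    def __init__(self) -> None:
        self.minutes: Counter = Counter()

    def record(self, state: str, minutes: int = 1) -> None:
        self.minutes[state] += minutes

    def availability_percent(self) -> float:
        """Share of observed minutes in which the service was not offline."""
        total = sum(self.minutes.values())
        if total == 0:
            return 0.0
        return 100.0 * (total - self.minutes[OFFLINE]) / total


if __name__ == "__main__":
    # Simulated once-a-minute samples of "replicas up" out of 3.
    samples = [3, 3, 2, 2, 0, 3, 3, 3, 1, 3]
    counter = MinuteCounter()
    for up in samples:
        counter.record(classify(up, 3))
    print(dict(counter.minutes))
    print(f"{counter.availability_percent():.1f}% available")
```

A real monitor would replace the simulated samples with an actual poll of the service and would persist the counters across restarts.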

This computer will double as an emergency access station so we can access the servers' consoles over their serial ports. This means we need at least 8 serial ports. I also want USB ports so we can connect a GSM modem for emergency access in case of Internet link failure and to send SMS.

We also need 2 Ethernet ports to connect to two switches in a redundant configuration. 100BaseT is more than enough; no need for Gigabit speeds.

There is at least one computer that does fit this bill perfectly: the TS-7300 from Technologic Systems. It features a 200MHz ARM CPU with 128MiB of RAM and 2 SD card slots, which act like the hard disks in a conventional computer. The performance is modest: it is about as fast as a 166MHz Pentium classic. However, for this application, it is more than enough. This also means the CPU doesn't even get appreciably hot, dispensing with coolers altogether.

This ARM CPU is pretty much the same one you have in many mobile phones these days. In other words: if mobile phones had two Ethernet ports, I could just as well be using a mobile phone instead.

The included SD card comes with a full Debian-ARM distro, so there are almost no additional complexities in the software part.

One oddity with the included software stack is that the root filesystem is ext2. That the bootloader requires this is understandable, for simplicity reasons. But using ext2 for the full Debian distro seems to me a bad idea. First, I've always heard that for flash-based storage we should use flash-optimized filesystems like jffs2 or yaffs -- perhaps the old limitation of ~100,000 writes to the same page that most flash devices used to have has been fixed by now and I haven't heard of that? I did read somewhere that SD card firmwares have spare sectors and the ability to remap them just like ordinary modern hard disks do, but I have no idea how effective those actually are in real-life workloads.

Second, any crash and you'll have to go through the lengthy fsck process at startup (this is the reason why I've been a happy reiserfs user for many, many years). I'll try to figure out a way to create new partitions with better filesystems and see how all that interacts with the boot process and assess the reliability results.

The TS7300 boot process is kinda cool. The ROM code can checksum part of the SD card contents (including its serial number) and refuse to boot if it doesn't check out. This makes it a bit harder to hack your TS7300 simply by inserting another SD card. If the checksum goes fine, it loads the kernel image and initrd, potentially dropping you to a shell a mere 1.7 seconds after startup. The default initrd has busybox and other goodies and the linuxrc script has an option to pivot to another root system with the full Debian system -- which takes a lot longer to boot because of the initscripts thing. I wonder how fast we could get a full system boot using init replacements such as runit along with daemontools and minimalistic DJB-style services.

An intriguing feature of the TS7300 is that many peripherals -- such as the second Ethernet, the video card, 8 (of 10) serial ports and two GPIO blocks -- are implemented in an FPGA (now that's something your average mobile phone doesn't have... yet). This means we can change them -- I'm planning to delete the video card and replace it with 8 more UARTs and add a daughterboard with a few MAX232 level converters and D-Sub9 connectors. In the end, this will get me 18 serial ports (the 10 existing ones plus the 8 new ones), enough for a full rack of servers.

I look forward to the not so distant day when ordinary PCs will have not one, but possibly several, large FPGAs inside -- this will open up fantastic possibilities in terms of high performance computing.

In the picture you see the TS7300 in my test rig with a power supply I designed that measures how much power the system is using. 9V is the input power from a wall-wart; the supply converts this to the regulated 5V the TS7300 requires. We see its current draw is about 480mA in idle mode at 200MHz. I measured that when the CPU goes to 100%, it takes as much as 600mA. Scaling the clock down to 42MHz brings the draw down to 380mA in idle mode. When I set the CPU clock to 14MHz it drew even less, but the Ethernet port stopped working.
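
To compare these readings against the 3W target, they can be converted to watts. The sketch below assumes the currents were measured on the regulated 5V rail, which the text does not state explicitly; if they were measured on the 9V input instead, the wattages would be correspondingly higher. The names are mine, chosen for illustration.

```python
# Assumption: the currents quoted in the text were measured on the 5 V rail.
SUPPLY_VOLTS = 5.0

# Current draw in mA, as reported in the text.
MEASUREMENTS_MA = {
    "idle @ 200 MHz": 480,
    "100% CPU @ 200 MHz": 600,
    "idle @ 42 MHz": 380,
}


def power_watts(current_ma: float, volts: float = SUPPLY_VOLTS) -> float:
    """Electrical power in watts for a current given in milliamps."""
    return volts * current_ma / 1000.0


if __name__ == "__main__":
    for label, ma in MEASUREMENTS_MA.items():
        print(f"{label}: {power_watts(ma):.2f} W")
```

Under that assumption the board sits at about 2.4W when idle and reaches about 3W only with the CPU fully loaded.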

I'm using this data to design the power supply + multiserial daughterboard. Technologic Systems does offer a battery backup system called TS-BAT3, but at 1000mAh capacity, it would keep the system alive for only 2 hours. I plan to use two banks of 6 ordinary Duracell MN1300 non-rechargeable 15000mAh alkaline D-cells to exceed the desired 48-hour mark.
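
A back-of-the-envelope check of these battery figures, as a sketch under several assumptions of mine: the load is the 480mA idle draw at 5V, the 6 cells in each bank are wired in series with the two banks in parallel, each cell delivers its quoted 15000mAh at a nominal 1.5V, and the 9V-to-5V converter is 80% efficient. Real alkaline cells sag well below 1.5V as they discharge, so treat the result as an optimistic upper bound.

```python
def runtime_hours_simple(capacity_mah: float, load_ma: float) -> float:
    """Runtime when capacity and load are expressed at the same voltage."""
    return capacity_mah / load_ma


def runtime_hours_pack(
    cells_in_series: int,
    banks_in_parallel: int,
    cell_volts: float,
    cell_mah: float,
    load_watts: float,
    efficiency: float,
) -> float:
    """Runtime of a series/parallel pack feeding a load through a converter."""
    pack_volts = cells_in_series * cell_volts
    pack_ah = banks_in_parallel * cell_mah / 1000.0
    usable_wh = pack_volts * pack_ah * efficiency
    return usable_wh / load_watts


if __name__ == "__main__":
    # TS-BAT3: 1000 mAh against the 480 mA idle draw.
    print(f"TS-BAT3: {runtime_hours_simple(1000, 480):.1f} h")

    # Two banks of six D-cells; 1.5 V nominal and 80% efficiency are assumptions.
    hours = runtime_hours_pack(6, 2, 1.5, 15000, load_watts=2.4, efficiency=0.8)
    print(f"D-cell pack: {hours:.0f} h")
```

The first figure agrees with the roughly 2 hours quoted for the TS-BAT3. The second suggests the D-cell pack has comfortable headroom over 48 hours, even after derating the nominal cell voltage.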

I tried to plug a TS9989i GSM minimodem into the USB port but it did not work. I guess the minimodem takes more current than the USB port can provide. I will make a powered USB cable to test that. Oh, boy, it already seems I'll be needing even more batteries.

And the good thing is that the TS7300 is quite affordable -- the test one I bought went for about USD$420, including the SD card with the software, a few extras and the outrageously expensive $100 shipping UPS charged me. There are cell phones that cost more than that.

Creative Commons License   The content of this site is made available under the terms of a Creative Commons License, except where otherwise noted.