filesystem for 5TB server


acaeti

Member
Mar 7, 2006
103
0
0
The best I found was an eight-bay rackmount enclosure. I can't find it at the moment, but here is an 8-unit tower based on a port multiplier for $700 [Link]. I also found a 3U rackable piece that you need to buy your own PSU and port multiplier for [Link2].
 

Brazen

Diamond Member
Jul 14, 2000
4,259
0
0
Drag, any reason you would go with OCFSv2 instead of GFS? I personally would go with GFS, just because it has been around longer and has a solid reputation.
 

drag

Elite Member
Jul 4, 2002
8,708
0
0
OCFSv2 is in the kernel. That's all.

I use Debian; both OCFSv2 and GFS are available via apt-get. I don't know which one is better, no idea at all.

There are 2 things that I am interested in that I don't know much about, besides performance/stability details of GFS vs OCFSv2...

First thing I can think of is iSCSI versus GNBD. iSCSI is a standard for SCSI commands encapsulated in TCP. SCSI has been around for a long time and iSCSI is a standard that is implemented by many different groups and is compatible with many different operating systems. Microsoft provides a no-cost iSCSI initiator for Windows, for example (although there is no non-freaking-expensive iSCSI target software for Windows, nor is there any clustering file system support from Microsoft, only from very expensive third-party companies).

GNBD is GFS's network block device. This allows you to share a block device over the network to another Linux machine. It effectively does the same thing as iSCSI, which is to allow you to access a block device directly.
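
For reference, sharing a device with GNBD is only a couple of commands. A minimal sketch from memory (the device path and export name here are made up, and gnbd_serv has to be running on the server first):

# on the server: export a block device under the name "store5tb"
gnbd_export -d /dev/md0 -e store5tb

# on the client: import everything that server exports
gnbd_import -i fileserver

The imported device then shows up locally as /dev/gnbd/store5tb and can be used like any other block device.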

So iSCSI vs GNBD? Which is better? Faster, more stable? Is it easy to boot off of GNBD? (because it's not easy with Open-iSCSI)
If you're only using Linux and GNBD is faster, then it would be attractive.

Booting off a network share like that would be very kick-ass. It would be possible to make a dead-quiet machine for your living room, for example. Very small also.


One note though.. You'll need local storage for the swap. Swap over the network simply _does_not_work_. So swap on a flash device would be nice; it would also provide a convenient boot partition so you can avoid the netboot stuff.
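
Setting up the flash swap is trivial, something like this (a sketch, assuming the flash key shows up as /dev/sda1):

# one-time setup of the swap area on the flash device
mkswap /dev/sda1
swapon /dev/sda1

# /etc/fstab entry so it comes back on every boot
/dev/sda1   none   swap   sw   0   0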




Then the second part is CLVM. CLVM is LVM with clustering support added. With it you can export the entire RAID array as a single iSCSI or GNBD share and then manage the LVM volumes from many different machines. This would be very nice.
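
On paper the setup doesn't look too bad, roughly like this (a sketch; it assumes the cluster manager side of the Red Hat cluster suite is already running and that /dev/gnbd/store5tb is the shared block device):

# on every node: set locking_type = 3 in /etc/lvm/lvm.conf, then start the cluster LVM daemon
clvmd

# on one node: create a clustered volume group on the shared device
pvcreate /dev/gnbd/store5tb
vgcreate -c y vg_shared /dev/gnbd/store5tb
lvcreate -L 500G -n media vg_shared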


But I don't know how mature it is.
 

Goosemaster

Lifer
Apr 10, 2001
48,775
3
81
As people have said, perhaps lean towards using a completely separate drive for the OS.

I would probably use something small: ext2 for /boot, ext3 for /home, and have the LVM pool divided as you wish...

or set one of the pools for /home....

The key is to keep system data segregated from the storage area... so that if the system keels over, all you have to do is nuke a volume/partition (depending on your setup), since if you have backed up everything correctly an image restore or a quick reinstall might be painless.
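
As a rough sketch of what that layout could look like (sizes and device names are just an example):

/dev/sda1   100 MB   ext2   /boot
/dev/sda2   10 GB    ext3   /       (the OS, safe to nuke and reinstall)
/dev/sda3   1 GB     swap
/dev/md0    ~5 TB    LVM    volume group for /home and the storage pools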

Also consider using FreeNAS as it supports LVM
 

Brazen

Diamond Member
Jul 14, 2000
4,259
0
0
Booting off of iSCSI is easy. All you need is one of these: http://shopper.cnet.com/adapte...0_9-32537540.html?ar=o It's $800 by the way, if you don't feel like clicking the link. That's pretty crazy, but it's comparable to a fibre channel HBA, IIRC. If you google for "iSCSI HBA" maybe there are cheaper ones, but Qlogic is what we use for our FC HBAs (really just because that is what Dell gave us).

GNBD is not going to be any different. You are still going to need something to load a network stack and initialize the GNBD LUN, so you will need a GNBD HBA. As long as you are going over ethernet, why not just use something more standardized such as PXE (I know little to nothing about PXE, if that just sounded stupid)? I tried to find some info on GNBD and I couldn't find anything that says it even uses TCP/IP. The only advantage I could see for it over iSCSI is if it implements its own transport protocol (something more efficient for transporting block devices) like fibre channel does; then it would indeed be faster than iSCSI.

I also did look into OCFSv2. It looks to be much simpler to implement (or maybe just better documented), but the feature set is almost identical to VMFS (the file system used by VMware ESX Server). For me this sounds great as I would want to install virtual machines on this thing, but if it was going to be used directly as a file share or the like, then I would probably still use GFS, which has a feature set very similar to ext3.
 

drag

Elite Member
Jul 4, 2002
8,708
0
0
Ya, but that's 800 dollars. I'd rather spend that much money on an extra TB or two.

It's possible to use iSCSI without any special hardware, and it's actually quite fast. I benchmarked it against things like NFS and SMB and it works out very well.

I've done it using PXE boot.

Software-wise you need iSCSI Enterprise Target on the server with dedicated disks or logical volumes, a DHCP server that supports netboot, a TFTP server, the pxelinux PXE bootloader, and a custom Linux kernel and initrd image.


You need a server (or servers) with:
A DHCP server that supports advanced configurations (ISC dhcpd)
A TFTP server; atftpd or tftpd-hpa (TFTP is tricky, it's a UDP protocol, and you may find that one works better than the other when dealing with particular hardware)
A NIC on the client that supports PXE (it's possible to use the built-in ROM on some cards to do netboot, or to netboot from floppy, CD-ROM, or flash key)
Pxelinux, which is the PXE-specific bootloader from the syslinux family of bootloaders.
iSCSI Enterprise Target on the file server with dedicated disks or logical volumes (a minimal target config is sketched right after this list).
A custom initramfs initrd that you've added the Open-iSCSI initiator to.
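
The target side is just a couple of lines of config, something like this (a sketch; the IQN, logical volume, and credentials are made up):

# /etc/ietd.conf on the file server
Target iqn.2007-04.lan.home:storage.disk1
    Lun 0 Path=/dev/vg0/client_root,Type=fileio
    # optional CHAP authentication
    IncomingUser someuser somepassword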


How PXE generally works is like this:
The PXE ROM boots ---> asks the DHCP server for a lease
---> you have the DHCP server preconfigured for that particular MAC address, and it responds with an address plus the name and location of the bootloader
---> the card downloads pxelinux and its *.cfg file and executes it
---> pxelinux looks through the configuration and finds an entry that matches the system
---> the vmlinuz and initrd images are downloaded and executed
---> Linux launches, runs the init script in the initrd image, and sets up the network and the various system services needed to access the remote share
---> after everything is set up, the root pivot is done and you're now running on your remote file system.
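
The DHCP and pxelinux side of that looks roughly like this (a sketch; the addresses, MAC, and file names are made up, and the initrd has to contain the Open-iSCSI bits described next):

# /etc/dhcp3/dhcpd.conf (the netboot part)
host diskless-desktop {
    hardware ethernet 00:11:22:33:44:55;
    fixed-address 192.168.1.50;
    next-server 192.168.1.10;      # the tftp server
    filename "pxelinux.0";
}

# /var/lib/tftpboot/pxelinux.cfg/default
DEFAULT linux
LABEL linux
    KERNEL vmlinuz-custom
    APPEND initrd=initrd-iscsi.img ip=dhcp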

The tricky part is writing the bash scripts to set up the Open-iSCSI software. The sucky part is that with the Open-iSCSI daemon there is no way to hand off control to another daemon.. you have to be able to kill the old daemon and launch a new one once you're out of the initrd image, but you can't (at least not with the version I was using). So it works, but you lose control over the daemon since its context is off. The Open-iSCSI folks have this as an issue and something they want to fix, but I don't know if they've gotten around to it by now. I haven't kept up.
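
The initrd script basically has to do the discovery and login by hand, which with Open-iSCSI boils down to something like this (a sketch; the portal IP and target IQN are made up):

# start the daemon, discover what the server offers, and log in
iscsid
iscsiadm -m discovery -t sendtargets -p 192.168.1.10
iscsiadm -m node -T iqn.2007-04.lan.home:storage.disk1 -p 192.168.1.10 --login
# the LUN then shows up as a normal /dev/sdX device for the rest of the boot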

I ran my desktop like that for a year or so.

The major downside to any network booting (iSCSI, GNBD, NFS.. it's fundamental to all forms of network booting) is the lack of local storage for swap. Imagine you're accessing an iSCSI LUN and on it you have a partition set up for swap space. If you exhaust your physical memory you need to allocate some from swap to keep the machine running. But in order to write memory out to the swap partition you have to traverse the entire network stack, store packets, and be able to analyze and respond.. all of which requires (you guessed it) more RAM. So in order to allocate swap space on a RAM-exhausted system you need to allocate more RAM, which means you need to allocate more swap space, which means you've hit a deadlock.

You can work around it a bit.. you can keep some RAM in a special reserve you save just for this occasion, or you can be very careful that your workload never exceeds your RAM, but it's still there waiting to bite you.

So the real solution is to have a local disk for swap, whether or not you actually plan on booting off of it.

I am thinking that since GNBD support is Linux-native, it may be easier to deal with as far as booting goes. It may also be faster, since being Linux-native it doesn't have to deal with the very complex iSCSI protocol.

Keep in mind that there are a variety of network block devices for Linux. The one I am talking about specifically is GNBD; the other ones are not suitable for this sort of thing.


I just need to get my server back up and play around with GNBD, CLVM, and GFS. Hrm... have to order new drives.

For VM stuff I've used Qemu/Kqemu/KVM and Xen. They are much more dependent on other projects for the various features that you get with ESX.
 

Brazen

Diamond Member
Jul 14, 2000
4,259
0
0
Right now I plan to just use the free VMware Server for virtualization. I've tried Qemu and Xen in the past, like way way past, and I like VMware better. KVM is supposed to have the ability to do live migrations soon, similar to VMotion on ESX Server, and then I might look into it again. I do hope there are good remote administration options for headless virtual host servers with KVM, though, but I haven't heard or seen anything.

Also, getting an HBA is the only solution to the remote swap problem AFAIK, either with iSCSI or GNBD. But yeah, save your $800 and stick some flash storage permanently inside the case.
 

drag

Elite Member
Jul 4, 2002
8,708
0
0
KVM is a hacked Qemu designed to work with the in-kernel kvm.ko module. It effectively turns the Linux kernel into a hypervisor similar to what is used in Xen or ESX (in the sense that they are hypervisors, not that they both depend on Linux). Their plan is to integrate the changes back into the original Qemu project so that users can transparently use the same front end.

So you can use straight Qemu, with no acceleration. Also, Kqemu was released as open source. Kqemu is an 'accelerator' for Qemu; it uses a kernel module to make the execution of code faster.

For KVM vs Kqemu it's a matter of taste. KVM is probably the ultimately better solution, but KVM depends on hardware-based virtualization features present in Intel's VT or AMD's SVM stuff. So unless you're using relatively new hardware you can't use it. For people with hardware that doesn't support that sort of thing, you can use Kqemu. Kqemu has been slightly more stable in my experience, but KVM is under rapid development. In practice Kqemu is about as fast as KVM.
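
Checking whether a box can do KVM at all is quick (a sketch):

# any output here means the CPU has Intel VT (vmx) or AMD SVM (svm)
egrep '(vmx|svm)' /proc/cpuinfo

# load the matching module
modprobe kvm-intel    # Intel VT
modprobe kvm-amd      # AMD SVM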


For administration there are a few options for the Qemu family of virtualization. If you're running a *nix system in the VM then it's easy to set up a serial console, and headless administration is then the same as it would be for a real server. For Linux this would require the lilo bootloader with serial output and then a serial port configured for console access.

Qemu can be launched so that it goes GUI-less and hooks the terminal you're in into the serial port of the system you're running. This is probably the best option since it gives you control over the entire boot-up process from lilo on up, but without the need for the GUI.
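
For example, something like this gives you the guest's serial console right in your terminal (a sketch; the disk image name is made up, and the guest still needs console=ttyS0 and serial output configured in lilo):

qemu -hda debian.img -m 256 -nographic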

So for headless management of multiple *nix systems you can just run screen, launch the various VMs from within that, and access it all through ssh. Just like you would if you were setting up a bunch of real headless servers.

Also, for *nix and Windows you can tell Qemu to output to a VNC server. Then you can use whatever VNC client you'd like, even have it running in a browser if that's what you want. Using HTTPS you can then use SSL/TLS to encrypt it, or otherwise use some other sort of VPN.
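
The VNC route is just as simple (a sketch; the image name is made up):

# the guest's display is exported on VNC display :1 (TCP port 5901)
qemu -hda winxp.img -m 512 -vnc :1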


With Xen it's a bit easier since it has a bunch of little utilities for issuing commands to shut down, start, or pause VMs and for accessing the console.

 

silverpig

Lifer
Jul 29, 2001
27,703
12
81
I'll throw in a quick plug for JFS here. In my experimenting with Linux (different distros, Gentoo and making it fast and unstable, overclocking, etc.) I've had a few unstable systems and a few rock-solid ones. I've played with the major filesystems as well, and the only one I haven't had any data loss on is JFS. I had one partition corruption with each of XFS and ext3, and two corruptions (one massive, completely unrecoverable one) with reiserfs. JFS has never given me any problems, and it's what I currently use on every drive I own.

At least take a look at it. I'd like to hear what you find.
 

bwatson283

Golden Member
Jul 16, 2006
1,062
0
0
Two tiny drives in RAID 0 for the OS, then RAID 5 the beast....
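
In mdadm terms that's something like this (a sketch; device names and disk counts are made up):

# two small drives striped for the OS
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sda1 /dev/sdb1

# the beast: RAID 5 across the big drives
mdadm --create /dev/md1 --level=5 --raid-devices=8 /dev/sd[c-j]1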


and I must say........HOLY CRAP DUDE!.......in a home environment...........
 

LeadMagnet

Platinum Member
Mar 26, 2003
2,348
0
0
A good journaling file system and LVM, like OpenSolaris using ZFS with RAID-Z, is the way to go.
 

DaiShan

Diamond Member
Jul 5, 2001
9,617
1
0
Are you set on Linux? ZFS (Zettabyte File System) from Sun would be perfect for this setup, and with RAID-Z you will almost certainly benefit from the dynamic stripe size and (optional) double parity (as the size of your array grows, so too does the chance of a double failure). It's one of the few areas where Sun can keep me interested. ZFS has been ported to Linux under FUSE, but currently the incompatibility of the CDDL and GPL is preventing its uptake into Linus' kernel.
 

LeadMagnet

Platinum Member
Mar 26, 2003
2,348
0
0
Originally posted by: DaiShan
Are you set on Linux? ZFS (Zettabyte File System) from Sun would be perfect for this setup, and with RAID-Z you will almost certainly benefit from the dynamic stripe size and (optional) double parity (as the size of your array grows, so too does the chance of a double failure). It's one of the few areas where Sun can keep me interested. ZFS has been ported to Linux under FUSE, but currently the incompatibility of the CDDL and GPL is preventing its uptake into Linus' kernel.

I would definitely look into OpenSolaris running ZFS with RAID-Z2 double parity. It is so easy to use it's ridiculous.
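
For what it's worth, the whole thing really is a one-liner on OpenSolaris (a sketch; the pool name and disk names are made up):

# 8 disks in a double-parity RAID-Z2 pool, automatically mounted at /tank
zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0
zpool status tank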
 