I am an extreme moderate

January 22, 2012

Snappy compression added to BTRFS

Filed under: Uncategorized — niezmierniespokojny @ 2:55 pm

Recently, Intel’s Andy Kleen published a set of patches that add Snappy compression support to BTRFS.
My first reaction was:
What for?
BTRFS already has LZO, which is stronger and compresses faster. Decompression is slower, but in all real cases the bottleneck will be the disk, so higher strength should lead to lower disk use and better performance with reads too.
So what reason does Andy provide?

“snappy is a faster compression algorithm that provides similar compression as LZO, but generally better performance.”

Hmm.
I benchmarked these algorithms on several machines, and ran more benchmarks that I never published on several others. LZO was definitely faster during compression (but slower during decompression).
In an older post, Andy shared some details:

“This has been tested with various test workloads and also in a real distribution. One challenge with the benchmarks is that a lot of of benchmarks only read/write zeroes, which are a uncommon good case for IO compression. However even with modified workloads that use different patterns gains are seen.

In general snappy performs better with a 64bit CPU, but also generally outperforms LZO on a 32bit Atom system on IO workloads.

The disk space usage is similar to LZO (within 0.2%)”

I have a suspicion.
He probably tested on largely incompressible data. In my tests, the size delta was in the range of 0.8-1.5%, so way bigger than his. And it happens that on incompressible data, Snappy is indeed faster than LZO.
But if your data is incompressible why use compression at all?

It would be nice if Andy clarified what data he used and maybe gave more details about the test setup.

I’ve been thinking about it for a while and I just don’t see a really compelling use case for Snappy in BTRFS. There are some cases, like a mix of highly compressible and incompressible data, where it would be better than LZO, but only slightly.
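
To make the disk-bound argument from this post concrete, here is a rough back-of-the-envelope model. It assumes that reading and decompression overlap perfectly, so read throughput is the smaller of the two limits. The function name and all the numbers are made up for illustration; they are not measurements of LZO or Snappy.

```python
def effective_read_mb_s(disk_mb_s, stored_fraction, decompress_mb_s):
    """Uncompressed MB/s the application sees when reading compressed data.

    disk_mb_s       -- raw read speed of the disk
    stored_fraction -- compressed size / original size (0 < f <= 1)
    decompress_mb_s -- speed at which the decompressor produces output
    """
    # Every MB read from disk expands to 1 / stored_fraction MB of data.
    disk_limited = disk_mb_s / stored_fraction
    # Assumes I/O and decompression are pipelined: the slower stage wins.
    return min(disk_limited, decompress_mb_s)


# Hypothetical figures, chosen only to show the shape of the argument.
stronger = effective_read_mb_s(disk_mb_s=100, stored_fraction=0.50, decompress_mb_s=400)
faster = effective_read_mb_s(disk_mb_s=100, stored_fraction=0.51, decompress_mb_s=800)
print(stronger, faster)
```

Under this model, as long as both decompressors outrun the disk, the extra decompression speed buys nothing and the codec with the smaller output wins on reads as well.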

January 14, 2012

A quick look at Windows 8 storage virtualisation

Filed under: Uncategorized — niezmierniespokojny @ 7:15 am

A couple of days ago, Microsoft’s Steven Sinofsky blogged about some cool new Windows 8 storage features that could be valuable for just about anyone.
For home users, Win 8 offers a nicely implemented data protection scheme where you have multiple drives in the system and in case some of them die, you don’t lose any of your photos, music, emails or whatever you find valuable.
OK, Windows could do it for a while with RAID. But the new ‘Storage Spaces’ are much more flexible and usable. In fact, it looks like the best implementation that I’ve seen.

Here’s how it goes:
First, you have a pool of drives. You can add a new drive at any time and your available capacity (and performance) will increase.
On top of it you create ‘Spaces’. They are much like partitions, but:

  • They are thinly provisioned. The total size of all spaces can be bigger than your drives can hold. In fact, even a single space can be bigger. Obviously, you won’t be able to store this much, but it’s useful, because you don’t have to decide up front how to carve your drives into separate partitions. And when you add drives, you can leave your partitions untouched; the system will just start using the new drives.
  • They provide RAID-like protections. You can use RAID 1, 5 and 6-like schemes.
  • I just love the data protection implementation. It’s different because it’s not at the drive level. Having just 1 drive pool, you can protect different data in different ways. For example, temp doesn’t need any protection. For system, RAID-1 is probably the best because it offers the best performance. And for important data I would use RAID 6 because it’s very secure and rather cheap.
    I don’t think there is anything as good available anywhere, and I’m sure there isn’t for home users. The closest thing is provided by ZFS, but it’s less flexible:
      • You have a similar separation of pools and filesystems.
      • You can have any RAID-like protection at the pool level.
      • You can have RAID-1-like protection at the filesystem level.

    This means that if you want to use the cheap protections (RAID 5-6) in ZFS, the way to do it is by having it at the pool level and no protection at the filesystem level. So you must have it for all filesystems in the pool or for none. Also, it prevents you from adding drives one by one. You have to add an entire RAID group at a time.
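
As a rough illustration of why parity protection is "rather cheap" next to mirroring, here is the usual RAID capacity arithmetic applied to one pool with a different scheme per kind of data. This is generic textbook math, not a description of how Storage Spaces or ZFS actually lay out blocks; the function name, scheme names and drive count are hypothetical.

```python
def raw_needed(data_tb, scheme, drives):
    """Raw pool capacity (TB) consumed to store data_tb of user data."""
    if scheme == "none":
        return data_tb
    if scheme == "mirror":           # RAID-1-like: two full copies
        return data_tb * 2
    if scheme == "single_parity":    # RAID-5-like: 1 drive's worth of parity per stripe
        return data_tb * drives / (drives - 1)
    if scheme == "double_parity":    # RAID-6-like: 2 drives' worth of parity per stripe
        return data_tb * drives / (drives - 2)
    raise ValueError(f"unknown scheme: {scheme}")


# One pool of 6 drives, each kind of data protected differently.
plan = [("temp", 1, "none"), ("system", 1, "mirror"), ("photos", 4, "double_parity")]
for name, data_tb, scheme in plan:
    print(name, scheme, raw_needed(data_tb, scheme, drives=6))
```

With these made-up numbers, 4 TB of photos cost 6 TB of raw space under double parity, where mirroring them would cost 8 TB.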

    OTOH, Microsoft’s thin provisioning implementation is quite ugly. For most uses you don’t want to have any partition size at all. It should take just as much space as it needs, w/out any limits. Or, better, there should be limits, but ones that you can increase freely. That’s how ZFS implements it.
    [ADD] Microsoft explained in their FAQ that you can increase the size of a Space. So it’s practically as good as the others in this regard. [/ADD]
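
For readers who haven’t met thin provisioning before, here is a toy model of the idea described above: the declared size of a space is only a promise, and physical capacity matters only when data is actually written. It ignores redundancy entirely, and the class and method names are invented for this sketch; it is not any real Windows or ZFS API.

```python
class Pool:
    """Toy model of a thinly provisioned drive pool (all sizes in GB)."""

    def __init__(self):
        self.capacity = 0    # physical space contributed by all drives
        self.used = 0        # physical space actually written
        self.spaces = {}     # name -> [declared_size, used]

    def add_drive(self, size):
        self.capacity += size

    def create_space(self, name, declared_size):
        # No check against capacity: the declared size is only a promise.
        self.spaces[name] = [declared_size, 0]

    def resize_space(self, name, declared_size):
        if declared_size < self.spaces[name][1]:
            raise ValueError("cannot shrink below the data already stored")
        self.spaces[name][0] = declared_size

    def write(self, name, amount):
        declared, used = self.spaces[name]
        if used + amount > declared:
            raise OSError("space is full (declared size reached)")
        if self.used + amount > self.capacity:
            raise OSError("pool is out of physical space, add a drive")
        self.spaces[name][1] += amount
        self.used += amount


pool = Pool()
pool.add_drive(500)
pool.create_space("photos", 2000)   # bigger than the whole pool, which is allowed
pool.write("photos", 400)
try:
    pool.write("photos", 200)       # would need 600 GB of physical space
except OSError as err:
    print(err)
pool.add_drive(1000)                # the space itself is left untouched
pool.write("photos", 200)           # now it fits
```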

    OK, I covered 2 options available for home users. There’s also a 3rd one, BTRFS.
    How does it stand against them when it comes to virtualisation features? Not well, but it could be worse. There’s only RAID-1, and it sits at a level I’d describe as being between the pool and the filesystems: all files use the same protection, but you can add drives of different sizes one by one. Thin provisioning works like on ZFS.

    There’s one virtualisation feature here that ZFS has and others don’t – hybrid storage pools. You put both HDDs and SSDs in a single pool and the system uses them according to their capabilities. I don’t see any reason for MS not to add it at some point though.

    There’s also one thing that is not virtualisation in itself, but is an important feature that can be implemented only at the level that provides RAID: protection from silent data corruption. ZFS and BTRFS offer it. Microsoft doesn’t talk about it, so I guess they don’t. But it’s likely just a matter of time. In Steven’s blog post I see nothing standing in the way.

    January 8, 2012

    Thoughts on moderation

    Filed under: Uncategorized — niezmierniespokojny @ 11:47 am

    I’ve been thinking about moderation today. Moderation of internet forums, comments, films and everything that people let others write. Or in short: of communication.
    What purpose does it serve?
    Sadly, too often it is used to control what people say or do. Silence criticism. Prevent talk about the competition. Remove all information that the moderator finds sensitive. I think this approach is wrong because it goes against the main purpose of moderated mediums: to let people communicate. And the more important the thing that you try to stifle is, the higher the likelihood that users will communicate it anyway, just in a different place. So it brings little benefit for you and hurts your users, who are often the most important thing about your site.

    In my opinion, moderation should be viewed differently. As a service provided to users, meant to improve their experiences.
    It’s hard to get right. Different people have different needs. I’d like to mention 2 approaches…

    One extreme is Frost. It’s a censorship-resistant, anonymous, unmoderated forum. Total freedom. And pure chaos. Some like it. I spent literally minutes on it and couldn’t stand all the threads about child abuse and all the spam.

    On the other side of the spectrum are the Ubuntu Forums. I doubt they are the most heavily moderated site on the internet, and if I wanted to, I could surely find a stricter one. But still, their goal of being safe for children forces them to put a lot of limitations on how they let their users act.

    For me, both approaches are wrong. In fact, all approaches that I’ve seen are wrong.

    That’s because different people have different needs, yet everybody tries to shoehorn all their users into 1 category. I totally see uses for family-friendly sites and I think that being one is good for Ubuntu. Nevertheless, I would like the place better if they didn’t remove emotional but unkind posts. The place could be much more lively.

    Why does no forum let users choose their level of moderation?
    A couple of checkboxes, ‘spam’, ‘hate speech’, ‘curses’, etc.

    It doesn’t seem hard to carry out and would work much better.
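
A minimal sketch of the checkbox idea, assuming moderators label posts instead of deleting them and each reader’s ticked checkboxes are applied when the thread is displayed. All names here (`Post`, `Preferences`, `visible_posts`) are invented for illustration and do not come from any real forum software.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    text: str
    labels: set = field(default_factory=set)   # e.g. {"spam"}, attached by moderators


@dataclass
class Preferences:
    hidden: set = field(default_factory=set)   # one entry per ticked checkbox


def visible_posts(posts, prefs):
    """Keep the posts that carry none of the labels this user chose to hide."""
    return [p for p in posts if not (p.labels & prefs.hidden)]


thread = [
    Post("Great article!"),
    Post("Buy cheap pills", {"spam"}),
    Post("This $#@! patch is useless", {"curses"}),
]
family_friendly = Preferences(hidden={"spam", "hate speech", "curses"})
thick_skinned = Preferences(hidden={"spam"})

print(len(visible_posts(thread, family_friendly)))
print(len(visible_posts(thread, thick_skinned)))
```

The point of the design is that nothing is removed from the forum itself; two readers with different checkboxes simply see different subsets of the same thread.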

    And I have 1 more thought.
    The service does not have to be provided by the same person / organisation that administers the medium. A mechanism that lets anybody become a moderator and provide their own view of the medium would be a great choice. Don’t like the default moderation? Provide your own. This is surely not an option for those who think they are entitled to control what people talk about. But it’s something that would make your site more valuable to your users. There will always be some who don’t like the way you moderate, and alternatives could serve them better…and, oh, providing such a feature is only a capital expense (though probably quite a large one); the operational costs would be carried by the users themselves.
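
One way the "anybody can be a moderator" mechanism could look, as a hedged sketch: each moderator publishes their own labels for posts, and a reader picks whose labels to trust. The data layout, the moderator names and the `view` function are all assumptions made up for this example.

```python
# moderator name -> {post id -> labels that moderator attached}
moderators = {
    "site_default": {2: {"spam"}, 3: {"curses"}},
    "alice": {2: {"spam"}},   # a hypothetical volunteer with a lighter touch
}

posts = {
    1: "Great article!",
    2: "Buy cheap pills",
    3: "This $#@! patch is useless",
}


def view(posts, moderator, hidden):
    """Posts as seen through one moderator's labels and one reader's hidden set."""
    labels = moderators.get(moderator, {})
    return [text for pid, text in posts.items()
            if not (labels.get(pid, set()) & hidden)]


hidden = {"spam", "curses"}
print(view(posts, "site_default", hidden))
print(view(posts, "alice", hidden))
```

Switching moderators changes what the reader sees without touching the posts, and the site only has to store and serve the label sets that volunteers maintain.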
