darkwater

To the OP: nice karma trick, posting the URL with the anchor to bypass the HN duplicate detector. Dang & co, this is a bug and it should be fixed.

I know because I stumbled on the same page by following links from the blog of the author of another post that made the front page yesterday (https://news.ycombinator.com/item?id=45589156). I liked the TernFS concept, submitted it, and got redirected to https://news.ycombinator.com/item?id=45290245

president_zippy

Could anybody with applicable experience tell me how this filesystem compares in the real world to Lustre?

If it is decisively better than Lustre, I am happy to make the switch over at my sector in Argonne National Lab where we currently keep about 0.7 PB of image data and eventually intend to hold 3-5 PB once we switch over all 3 of our beamlines to using Dectris X-Ray detectors.

Contrary to what the non-computer scientists insist, we only need about 20Gb/s of throughput in either direction, so robustness and simplicity are the only concerns we have.

roadbuster

> all reads and writes go through the leader

One of the pain points of scaling ZooKeeper is that all writes must go to the leader (reads can be fulfilled by followers). I understand this is "leader of a shard" and not a "global leader," but it still means a skewed write load on a shard has to run through a single leader instance.

> given that horizontal scaling of metadata requires no rebalancing

This means a skewed load cannot be addressed via horizontal scaling (provisioning additional shards). To their credit, they acknowledge this later in the (very well-written) article:

> This design decision has downsides: TernFS assumes that the load will be spread across the 256 logical shards naturally.
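
To make the skew point concrete, here is a rough sketch. The only figure taken from the article is the 256 logical shards; the modulo placement rule, the round-robin assignment of shards to servers, and the 90/10 workload are assumptions of mine for illustration, not how TernFS actually maps anything.

```python
from collections import Counter

NUM_LOGICAL_SHARDS = 256  # figure quoted from the article


def shard_of(directory_id: int) -> int:
    # Hypothetical placement rule (an assumption, not TernFS's real mapping).
    return directory_id % NUM_LOGICAL_SHARDS


def leader_load(writes, num_servers):
    """Count writes per server, with logical shards assigned round-robin to servers."""
    load = Counter()
    for directory_id in writes:
        load[shard_of(directory_id) % num_servers] += 1
    return load


# Skewed workload: 9000 writes hit one hot directory, 1000 are spread over others.
writes = [42] * 9000 + list(range(1000))

for servers in (4, 16, 256):
    busiest = max(leader_load(writes, servers).values())
    print(f"{servers:>3} servers: busiest leader handles {busiest} of {len(writes)} writes")
```

However many servers you provision, the leader that owns the hot directory's shard still takes at least the 9000 hot writes. More hardware only thins out the cold traffic around it.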

poppafuze

Great default license.

anon-3988

Isn't this literally what ZFS is designed for? What is ZFS lacking that makes this necessary?

semessier

Should post again when it has 5% of the features of the other parallel file systems, starting with RDMA. It's not even clear whether this FS stripes, that is, whether it is a parallel file system at all.