πŸ³οΈβ€πŸŒˆ Chloe is a user on octodon.social. You can follow them or interact with them if you have an account anywhere in the fediverse. If you don't, you can sign up here.
Holy shit am I tempted to buy this and use it to replace my current VM host.

Factory-refurbished workstation with a Xeon E3-1240v2 and 16GB of RAM for like $350.
@kurisu Still want it. My current VM host has 8GB ram and an i5-3550p
@zeta @quad you can always get some if required
@kurisu @zeta Who would anyway? Costs way too much just to protect against those magical solar flares crashing my seedbox.
@kurisu @zeta Honestly the worst part about zfs is that fucking nobody seems to know how ram allocation works.

If you try to google how much ram you need for zfs feature x, you'll either find no answer, or 10 people who say 100 MB per 1 TB along with 10 others who say 2GB per 1 TB.

Nobody fucking knows. The solution to ZFS always seems to just be "Run it. If it wants RAM, you put more RAM in. If more RAM doesn't fit in your system, you move to a new system. If you can't afford these two things, too bad, sucks for you."
@quad @kurisu @zeta that's not true... and you don't need any special amount of ram for any feature except deduplication.

"In practice with FreeBSD, based on empirical testing and additional reading, it's closer to 5GB [of RAM] per TB [of storage]." -freebsd wiki

You could also just ask the ZFS inventors themselves. I could speak to one this week if you have a question you can't find an answer to. I believe Matthew Ahrens is here at BSDCan
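For what it's worth, if dedup is the feature in question, you can get a rough RAM estimate for an existing pool before turning it on. A minimal sketch, assuming a pool named "tank" (the pool name and the ~320-bytes-per-entry figure are just the commonly cited ballpark, not gospel):

```
# Simulate dedup on the existing data and print a DDT histogram;
# this is read-only and doesn't change anything on disk.
zdb -S tank

# Take the total allocated block count from the histogram and multiply
# by roughly 320 bytes per in-core DDT entry, e.g.:
#   1 TB of data in 128K records ~= 8M blocks * 320 B ~= 2.5 GB of DDT
```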
@feld @kurisu @zeta Well yes, but from my experience ZFS performance starts to tank pretty fast once you don't have enough RAM.

You don't *need* RAM for it to have basic functionality. But it won't be pleasant.
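If you want to check whether ARC starvation is actually what's hurting, the counters are exposed via sysctl on FreeBSD. A quick sketch using the stock kstat names (a low hit ratio under real load is the usual tell):

```
# Current ARC size and its configured ceiling
sysctl kstat.zfs.misc.arcstats.size
sysctl kstat.zfs.misc.arcstats.c_max

# Cache hits vs. misses; compare the two over time under real load
sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses
```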
@quad @kurisu @zeta you can always limit ARC with sysctl / loader.conf on FreeBSD to prevent bad things from happening.

The truth is that we are all playing catch-up with Solaris because they did a lot of work in their memory management to ensure ZFS was flawless. Linux, FreeBSD, etc. are still learning how to adapt our kernels to handle the edge cases and deal with extreme memory pressure.
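Concretely, the FreeBSD knob is vfs.zfs.arc_max. A minimal sketch (the 4 GiB value is arbitrary; size it to your box):

```
# /boot/loader.conf -- cap the ARC at boot time (4 GiB, in bytes)
vfs.zfs.arc_max="4294967296"

# Then, from a shell, verify what's in effect:
sysctl vfs.zfs.arc_max
```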
@feld @quad @zeta Linux isn't really learning how to adapt the kernel to ZFS though, is it? Because the licence isn't GPL-compatible.

Linux is going off and making its own GPL filesystems with blackjack and hookers.
@kurisu @feld @zeta Frankly all I want is a filesystem with block-level deduplication of some kind. With an official tool that I know will be maintained. And that doesn't require a fortune of RAM.

Comically enough, Windows+NTFS mostly works fine in this regard. We have like 10TB deduplicated on a server with 8GB RAM here (granted only like 3GB are free after the bloated applications/OS used to serve all those files) and performance is more than decent enough on 8x 10K SAS drives and about 30-50 users accessing it.

More than I'm pissed that I can't dedup a filesystem on my potato Linux server, I'm just embarrassed that Windows of all things has us beat on something filesystem-related.
@quad @feld @zeta windows does dedupe?

And technically it's extent-based deduplication, but same thing really. Modern filesystems don't think of blocks much.
@quad @feld @zeta

> When new files are added to the volume, they are not optimized right away. Only files that have not been changed for a minimum amount of time are optimized. (This minimum amount of time is set by user-configurable policy.)

So Windows has offline dedupe. That's good. Looks like all you really want is a well-designed, policy-based deduplication daemon for Linux. No new kernel work required.
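And the "user-configurable policy" bit in that quote maps to cmdlet parameters, e.g. minimum file age. A sketch, assuming a data volume G: (the 3 days is just an example value):

```
# Show the current dedup settings for the volume
Get-DedupVolume -Volume "G:" | Format-List

# Only optimize files that haven't changed in at least 3 days
Set-DedupVolume -Volume "G:" -MinimumFileAgeDays 3
```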
@kurisu @feld @zeta Yeah, it's offline.

It's also Server-only, so it's not too surprising you haven't heard of it. But it's also piss easy: just install the deduplication feature from Server Manager, then run "Enable-DedupVolume G:" in PowerShell.
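Roughly the whole PowerShell route, for reference (G: standing in for whatever data volume you're targeting):

```
# Install the feature (same as ticking "Data Deduplication" in Server Manager)
Install-WindowsFeature -Name FS-Data-Deduplication

# Enable dedup on the volume for general file-server use
Enable-DedupVolume -Volume "G:" -UsageType Default

# Later: check progress and space savings
Get-DedupStatus -Volume "G:"
```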
πŸ³οΈβ€πŸŒˆ Chloe @chloe

@quad @kurisu @feld @zeta I wonder why it’s not enabled by default.

@chloe @zeta @feld @kurisu Probably because of I/O strain.

Windows 10 already strains I/O ridiculously hard by itself. Also, depending on how often the users edit files, you could be looking at many hours' worth of deduping, assuming it's on slower drives.