[Ksummit-2008-discuss] Suggested topic: possible

Chris Mason chris.mason at oracle.com
Wed Aug 6 12:35:02 PDT 2008


On Wed, 2008-08-06 at 18:48 +0100, David Woodhouse wrote:
> On Wed, 2008-08-06 at 20:22 +0300, Eyal Shani wrote:
> > Sectors will end up stored according to their context, and expected
> > life cycle.
> 
> All of which information is plucked out of the ether, presumably, since
> the file system isn't allowed to be involved? And when we move stuff
> from one to the other, we don't consult that file system about that
> either, or let it do its own defragmentation or whatever other
> housekeeping it might want to do at the same time?
> 
> > The innovation curve in SSDs will soar, I hope, 
> 
> I believe that innovation will always be easier, cheaper and more
> reliable when the software can see what's going on and get involved. 
> 

To me, this is a pretty simple matter of complexity vs reward vs
throughput.

Even with spinning media, surely the filesystems could do better some of
the time if we controlled decisions all the way down to the disk head.

Some of the time, on some of the devices, with some of the filesystems.
With full specs about each device, updated every time they release, and
every time the firmware is updated.  And people think the XFS allocator
is complex today....

I believe our time is better spent working with the SSD engineers on
layering between the FS and the SSD.  The trim command is one example;
there may be others where the performance benefit is sufficient to push
it all the way up to the FS.

> Take this 'trim' thing, for example -- we've been saying that we need it
> for over a decade. The only reason we hadn't bothered to do it in Linux
> before is because we don't _care_ much about FTL; we don't believe that
> pretending to be a disk drive is the most effective way to use flash.
> 
> It took a day or two to do it in software, when we decided we could be
> bothered. And we _still_ don't know when we'll finally actually be able
> to lay our hands on hardware which supports it.
> 
> Or look at the result of our powerfail testing on SSD devices
> (admittedly a while ago now; I'm _hoping_ things have got better).
> Because it was all internal, we couldn't just fix the bugs and move on
> -- we couldn't even _diagnose_ the bugs. The only option was to throw
> the affected devices in the bin and declare them not suitable for use.
> 

True, but that's hardware.  We don't control spinning media and somehow
it has managed to safely store our bank accounts and payrolls for quite
some time.  The answer to bad storage hardware is to talk to the
hardware makers.

> And I don't even want to _think_ about the reports I've heard of devices
> which assume you'll be storing FAT16 on them and automatically do the
> equivalent of 'trim' when you write to the sectors which it thinks
> should be the FAT, and it thinks you're marking certain clusters as
> free...
> 
> Of _course_ software can keep pace with developments in the underlying
> hardware. Linux is _very_ good at that.
> 

The market reality here is that the drives we see are going to be
presented as the FTL.  Just because we can do it doesn't mean we're going
to have the opportunity.

This doesn't mean raw access doesn't matter; there are tons of uses
for it.  But each of those is a specialized application, not the mass
market.

-chris



