[Ksummit-2012-discuss] PCI breakout session

Benjamin Herrenschmidt benh at kernel.crashing.org
Mon Jun 18 10:38:35 UTC 2012


On Mon, 2012-06-18 at 13:28 +0300, Michael S. Tsirkin wrote:
> On Mon, Jun 18, 2012 at 08:19:45PM +1000, Benjamin Herrenschmidt wrote:
> > On Mon, 2012-06-18 at 11:08 +0300, Michael S. Tsirkin wrote:
> > 
> > > > > The "right" approach is to probably move the resource assignment to
> > > > > between the initial scan pass and the "adding" of devices (in our PCI
> > > > > stack terminology, adding -> register with the driver model), and
> > > > > naturally have the final fixups called right before the latter. But this
> > > > > will have some sort of impact on all archs so we probably want to do
> > > > > quite a bit of code auditing first.
> > > 
> > > An interesting related problem is hot-plugging complex bridged hierarchies - esoteric
> > > on real hardware but easy and sometimes useful in a VM.
> > > It would seem that at least for that space, adding support
> > > for moving resources around could be useful. For things like IO it
> > > only seems possible to do with cooperation from the drivers - and at least for
> > > VMs that's not such a big deal as we only emulate a handful of devices.
> > > Or do we need to try and solve it generally straight away?
> > 
> > I'd say don't do it :-) Or rather try to give each device in the VM its
> > own dedicated virtual host bridge & bus :-)
> 
> This only makes the problem of resource allocation worse.
> For example for IO this means you waste 4K of space per device. Since
> x86 only uses 16 bit IO addresses, this allows up to 16 devices.

On the other hand, do we really care about trying to hotplug hundreds of
devices that require IO space? SR-IOV doesn't have IO, and pretty much
every device produced since the last ice age can be operated entirely
without IO, etc...

I.e., I wouldn't waste time, energy and code maintainability on that :-)

Now if you're thinking of virtio, well, I argue that having used IO
space as the basis for virtio communication is a HUGE DESIGN BUG :-) I
know it's marginally faster to emulate than MMIO, but that isn't an
excuse to perpetuate that historical abortion.

If not already (I haven't checked recently), virtio drivers should start
accepting an MMIO BAR (using iomap makes it trivial to not care about
the BAR type).
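
To illustrate the point about iomap: a minimal sketch of a probe path
that doesn't care whether BAR 0 is IO or MMIO. pci_iomap(), ioread32()
and friends are the standard Linux PCI helpers; the BAR index and
register offset here are purely illustrative, not virtio's actual
layout:

	#include <linux/pci.h>

	static int example_probe(struct pci_dev *pdev,
				 const struct pci_device_id *id)
	{
		void __iomem *regs;
		u32 val;
		int err;

		err = pci_enable_device(pdev);
		if (err)
			return err;

		/* pci_iomap() returns a cookie that ioread*/iowrite*
		 * decode correctly for both IORESOURCE_IO and
		 * IORESOURCE_MEM BARs, so the driver never has to
		 * check the BAR type itself.
		 */
		regs = pci_iomap(pdev, 0, 0);
		if (!regs) {
			pci_disable_device(pdev);
			return -ENOMEM;
		}

		val = ioread32(regs + 0x00); /* hypothetical register */
		(void)val;

		pci_iounmap(pdev, regs);
		pci_disable_device(pdev);
		return 0;
	}
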
 
> > There are pros and cons to
> > the approach of course, if it's pass-through and the devices are in a
> > single isolation group they must be shown at once, but in that case,
> > I'd say show up the whole group as one host bridge with the stuff below
> > it. We've been spending quite a bit of time on that for KVM/ppc so let's
> > add that to the agenda for the discussion if you want.
> > 
> > (We can start on the linux-pci list if you prefer).
> 
> If you have any patches that are half way tolerable, pls post them.
> I don't :(

Not yet, we're focusing on VFIO initially and getting that to work the 
way Alex envisions it, and we'll resume work on some "alternate" way
later.

It's easier for us because we don't have things like IO space limits or
bus number limits; we can have an arbitrary number of domains and they
are all independent.

Anyways, let's move that discussion elsewhere (or to KS corridors or
mini summit).

Cheers,
Ben.

> > Cheers,
> > Ben.
> > 
> > > > > 
> > > > > Of course that's scary since PCI is so prone to regressions, especially
> > > > > on x86 ...
> > > > > 
> > > > > I have some specific issues with resource allocation on bridges that
> > > > > segment the MMIO space in interesting ways (for error handling) that I
> > > > > want to discuss and get feedback on how to best deal with.
> > > > > 
> > > > > Do we start writing an agenda ?
> > > > 
> > > > Could we do it on the second mini-summit day?  I suspect most of the
> > > > arch maintainers will want to be there, since PCI is bound into all our
> > > > architectures in some quite unique ways.
> > > > 
> > > > James




More information about the Ksummit-2012-discuss mailing list