[Ksummit-2012-discuss] PCI breakout session

Michael S. Tsirkin mst at redhat.com
Mon Jun 18 11:11:56 UTC 2012


On Mon, Jun 18, 2012 at 08:38:35PM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2012-06-18 at 13:28 +0300, Michael S. Tsirkin wrote:
> > On Mon, Jun 18, 2012 at 08:19:45PM +1000, Benjamin Herrenschmidt wrote:
> > > On Mon, 2012-06-18 at 11:08 +0300, Michael S. Tsirkin wrote:
> > > 
> > > > > > The "right" approach is to probably move the resource assignment to
> > > > > > between the initial scan pass and the "adding" of devices (in our PCI
> > > > > > stack terminology, adding -> register with the driver model), and
> > > > > > naturally have the final fixups called right before the latter. But this
> > > > > > will have some sort of impact on all archs so we probably want to do
> > > > > > quite a bit of code auditing first.
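For reference, the ordering described above maps onto calls the PCI
core already exports; here's a rough sketch of how an arch might
open-code it (my_host_bridge_setup and its arguments are placeholders,
this is not actual tree code):

#include <linux/pci.h>

/* Sketch only: scan first, assign resources second, then register
 * the devices with the driver model.  Final fixups would run right
 * before pci_bus_add_devices().
 */
static void my_host_bridge_setup(struct device *parent, int busnr,
				 struct pci_ops *ops, void *sysdata)
{
	struct pci_bus *bus;

	bus = pci_scan_bus_parented(parent, busnr, ops, sysdata);
	if (!bus)
		return;

	pci_bus_assign_resources(bus);	/* resource sizing + assignment */
	pci_bus_add_devices(bus);	/* the "adding" step */
}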
> > > > 
> > > > An interesting related problem is hot-plugging complex bridged
> > > > hierarchies - esoteric on real hardware, but easy and sometimes
> > > > useful in a VM. At least for that use case, adding support for
> > > > moving resources around could be useful. For things like IO that
> > > > only seems possible with cooperation from the drivers - and at
> > > > least for VMs that's not such a big deal, as we only emulate a
> > > > handful of devices. Or do we need to try to solve it generally
> > > > straight away?
> > > 
> > > I'd say don't do it :-) Or rather try to give each device in the VM its
> > > own dedicated virtual host bridge & bus :-)
> > 
> > This only makes the problem of resource allocation worse.
> > For example, for IO this means you waste 4K of space per device.
> > Since x86 only uses 16-bit IO addresses, this allows up to 16
> > devices.
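
To make the arithmetic concrete - my own back-of-the-envelope check,
not something from the thread: 16-bit port IO gives a 64K space, and
PCI-to-PCI bridge IO windows have 4K granularity, so one bridge per
device caps out at 64K / 4K = 16.

#include <stdio.h>

/* Back-of-the-envelope check of the 16-device claim above. */
int main(void)
{
	unsigned int io_space = 1u << 16;	/* x86 port IO space: 64K */
	unsigned int window = 4096;		/* min bridge IO window */

	printf("devices with a dedicated bridge each: %u\n",
	       io_space / window);		/* prints 16 */
	return 0;
}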
> 
> On the other hand, do we really care about trying to hotplug hundreds
> of devices that require IO space? SR-IOV doesn't have IO, and pretty
> much every device produced after the last ice age can be operated
> entirely without IO, etc...
> 
> IE. I wouldn't waste time, energy and code maintainability for that :-)
>
> Now if you're thinking of virtio, well, I argue that having used IO
> space as the basis for virtio communication is a HUGE DESIGN BUG :-) I
> know it's marginally faster to emulate than MMIO, but that isn't an
> excuse to perpetuate that historical abortion.

Unfortunately it's significantly faster on x86 :(. A PIO exit hands KVM
the port number, size and direction directly, while an MMIO exit means
fetching and decoding the guest instruction before it can be emulated.

> If not already (I haven't checked recently), virtio drivers should start
> accepting an MMIO BAR (using iomap makes it trivial to not care about
> the BAR type).

Yes, this works with a Linux guest but breaks non-Linux ones.
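
For reference, the iomap trick Ben mentions looks roughly like this in
a driver probe path (a sketch with made-up names like MY_REG and
my_probe, not the actual virtio-pci code):

#include <linux/pci.h>
#include <linux/io.h>

#define MY_REG	0x10	/* hypothetical register offset */

static int my_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	void __iomem *base;
	int err;

	err = pci_enable_device(pdev);
	if (err)
		return err;

	/* pci_iomap() hands back a cookie that the ioread/iowrite
	 * helpers accept whether BAR 0 is port IO or MMIO, so the
	 * driver never has to look at the BAR type.
	 */
	base = pci_iomap(pdev, 0, 0);
	if (!base) {
		pci_disable_device(pdev);
		return -ENOMEM;
	}

	iowrite32(1, base + MY_REG);	/* same accessor either way */

	/* a real driver would stash base and keep using it */
	return 0;
}

pci_iounmap() is similarly type-agnostic on the teardown side.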

> > > There are pros and cons to the approach, of course; if it's
> > > pass-through and the devices are in a single isolation group, they
> > > must be shown at once, but in that case I'd say expose the whole
> > > group as one host bridge with the stuff below it. We've been
> > > spending quite a bit of time on that for KVM/ppc, so let's add that
> > > to the agenda for the discussion if you want.
> > > 
> > > (We can start on the linux-pci list if you prefer).
> > 
> > If you have any patches that are halfway tolerable, pls post them.
> > I don't :(
> 
> Not yet, we're focusing on VFIO initially and getting that to work the 
> way Alex envisions it, and we'll resume work on some "alternate" way
> later.
> 
> It's easier for us because we don't have things like IO space limits
> or bus number limits; we can have an arbitrary number of domains and
> they are all independent.
> 
> Anyways, let's move that discussion elsewhere (or to KS corridors or
> mini summit).
> 
> Cheers,
> Ben.
> 
> > > Cheers,
> > > Ben.
> > > 
> > > > > > 
> > > > > > Of course that's scary since PCI is so prone to regressions, especially
> > > > > > on x86 ...
> > > > > > 
> > > > > > I have some specific issues with resource allocation on bridges
> > > > > > that segment the MMIO space in interesting ways (for error
> > > > > > handling); I want to discuss those and get feedback on how best
> > > > > > to deal with them.
> > > > > > 
> > > > > > Do we start writing an agenda ?
> > > > > 
> > > > > Could we do it on the second mini-summit day?  I suspect most of the
> > > > > arch maintainers will want to be there, since PCI is bound into all our
> > > > > architectures in some quite unique ways.
> > > > > 
> > > > > James
> 

