[Desktop_architects] Making X more responsive Was: Linux desktop, the fun thread

Wed Dec 14 07:22:44 PST 2005

On Wednesday, 14 December 2005 07:38, Linus Torvalds wrote:

Thread Hijacking ahead....

Hi Linus,

thanks for starting a technical discussion to help solve the
responsiveness issues of the Linux desktop. For the near and mid-term
future this means an X11-based desktop.

I remember that you once promised to make every reasonable kernel change
that would help us desktop people, so I appreciate that you took the time
to look into possible reasons for the problem.

> > And
> > Linus, once my mouse button stops jerking when I copy big files, I'll
> > add a GUI for you.  ;-)

> The solution to that was for the X server to update the cursor position in
> the signal handler (SIGIO on the mouse device), and it suddenly became a
> _lot_ smoother. Smooth enough that now you have to do something slightly
> special to get back the good (bad) old jerky mouse cursor.

Using SIGIO is often a suitable solution if you want to avoid the complexity 
of threads. I use it rather often in order to make legacy software more 
responsive and in order to add some asynchronous behaviour.
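A minimal sketch of the SIGIO pattern mentioned above, not taken from the X
server. The names enable_sigio, on_sigio and input_pending are made up for
illustration. The handler only sets a flag, so it stays async-signal-safe;
the main loop is expected to check the flag and do the real work.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* for O_ASYNC on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag: set by the handler, checked and cleared by the main loop. */
static volatile sig_atomic_t input_pending = 0;

/* Only async-signal-safe work is done here: set a flag and return. */
static void on_sigio(int signo)
{
    (void)signo;
    input_pending = 1;
}

/* Ask the kernel to send SIGIO to this process when fd becomes readable.
 * Returns 0 on success, -1 on error. */
static int enable_sigio(int fd)
{
    struct sigaction sa;
    int flags;

    assert(fd >= 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;    /* restart interrupted system calls */
    if (sigaction(SIGIO, &sa, NULL) < 0)
        return -1;

    /* Direct the signal for this descriptor to our own process. */
    if (fcntl(fd, F_SETOWN, getpid()) < 0)
        return -1;

    flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;

    /* O_ASYNC turns on signal-driven I/O; O_NONBLOCK keeps later reads
     * from stalling once the data has been consumed. */
    if (fcntl(fd, F_SETFL, flags | O_ASYNC | O_NONBLOCK) < 0)
        return -1;
    return 0;
}
```

The main loop would test input_pending after each wakeup, clear it, and
then read from the descriptor until nothing is left.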

> returns from the signal handler, and will return to the main select loop.
>
> So far so good.
>
> Now it's not idle any more. The "select()" will have exited, but X now has
> an agenda: it needs to inform all the clients that the mouse has moved.
> That basically ends up looping over the internal X client list, and for
> each client that shows interest (which tends to be most of them ;), X
> does:

Keith: I assume that under congestion it is not really desirable to inform
all clients about every small mouse movement. It might be useful to
introduce more relaxed semantics when congestion(*) happens.

That is, when X detects congestion (I assume this can even be done in a
signal-safe way within the signal handler), it should delay and aggregate
the notifications.
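To make the "delay and aggregate" idea concrete, here is a rough sketch
under my own assumptions. struct client_motion, motion_event, flush_motion
and send_motion are hypothetical names and do not correspond to the X
server's real data structures. While congested, only the latest pointer
position is remembered per client; it is sent once congestion clears.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-client state for pointer motion notifications. */
struct client_motion {
    bool has_pending;     /* a coalesced position is waiting to be sent */
    int  x, y;            /* latest position seen while congested */
    int  sent_count;      /* notifications actually sent (for illustration) */
    int  last_x, last_y;  /* position carried by the last notification */
};

/* Stand-in for the real write to the client connection. */
static void send_motion(struct client_motion *c, int x, int y)
{
    c->last_x = x;
    c->last_y = y;
    c->sent_count++;
}

/* Called for every pointer movement the client is interested in. */
static void motion_event(struct client_motion *c, int x, int y, bool congested)
{
    assert(c != NULL);
    if (congested) {
        /* Overwrite any older deferred position: only the newest matters. */
        c->x = x;
        c->y = y;
        c->has_pending = true;
        return;
    }
    /* A fresh position supersedes anything that was deferred earlier. */
    c->has_pending = false;
    send_motion(c, x, y);
}

/* Called once congestion has cleared. */
static void flush_motion(struct client_motion *c)
{
    assert(c != NULL);
    if (c->has_pending) {
        c->has_pending = false;
        send_motion(c, c->x, c->y);
    }
}
```

With this scheme a burst of movements during congestion costs one
notification per client instead of one per movement. Whether clients
tolerate the dropped intermediate positions is exactly the semantic
question that needs testing.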

> The problem is
> the memory allocations inherent in both the mallocs and internally in the
> kernel in the writev() itself.

Can these problems with memory allocation be somewhat dampened by using some 
smarter memory pooling algorithm which can claim memory from preallocated 
space?
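As an illustration of what I mean by claiming memory from preallocated
space, here is a minimal fixed-size block pool. The block size, the block
count and the pool_* names are assumptions chosen for the example. It only
covers user-space allocations and does nothing about allocations made
inside the kernel during writev().

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCK_SIZE 64    /* assumed maximum size of a small allocation */
#define POOL_BLOCKS     128   /* assumed number of preallocated blocks */

union pool_block {
    union pool_block *next;                  /* valid only while the block is free */
    unsigned char payload[POOL_BLOCK_SIZE];  /* valid only while it is allocated */
};

static union pool_block pool_storage[POOL_BLOCKS];  /* reserved up front */
static union pool_block *pool_free_list;

/* Thread all blocks onto the free list. Writing the link into each block
 * also touches its memory once, before the pool is first used. */
static void pool_init(void)
{
    size_t i;

    pool_free_list = NULL;
    for (i = 0; i < POOL_BLOCKS; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Pop a block from the free list; returns NULL when the pool is exhausted.
 * No system call and no general-purpose malloc() is involved. */
static void *pool_alloc(size_t size)
{
    union pool_block *b = pool_free_list;

    assert(size <= POOL_BLOCK_SIZE);
    if (b == NULL)
        return NULL;
    pool_free_list = b->next;
    return b->payload;
}

/* Push a block back onto the free list. */
static void pool_free(void *p)
{
    union pool_block *b = (union pool_block *)p;

    if (b == NULL)
        return;
    b->next = pool_free_list;
    pool_free_list = b;
}
```

A caller would fall back to malloc() when pool_alloc() returns NULL, so the
pool only needs to be large enough for the common case.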

Linus: Can the kernel help so that small allocations which are likely to
be freed again soon stay fast even while large file I/O is in progress?
This would basically also boil down to some per-process pooling of memory.

> They will occasionally block, because the 
> big file copy has dirtied a lot of buffers that we need to write out to
> make room for more.

> They don't block for a long time, but when it happens, what do you think
> goes on? A mouse event comes in, and the kernel immediately sends a SIGIO.
> But the process it sent the SIGIO to (X) is _blocked_ on the memory
> allocations it does. So even though the kernel sent the signal as soon as
> the mouse moved, it won't be _acted_ upon, because X is busy doing
> something else.

Are the SIGIOs queued or aggregated by the kernel in case they can not be 
immediately delivered? 

So far, in all my programming I have assumed that signals are aggregated
and that I have to loop in userspace to get all the data that might have
accumulated in the meantime.
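The userspace loop I am referring to looks roughly like the sketch below.
As far as I know, standard (non-realtime) signals such as SIGIO are not
queued on Linux, so several events can collapse into one delivery and the
reader has to drain the descriptor until it would block. drain_fd and
set_nonblocking are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Read everything currently available on a non-blocking fd.
 * Returns the number of bytes consumed, or -1 on a real error.
 * A real program would process the data instead of discarding it. */
static long drain_fd(int fd)
{
    char buf[256];
    long total = 0;

    assert(fd >= 0);
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);

        if (n > 0) {
            total += n;          /* more may be waiting, keep going */
            continue;
        }
        if (n == 0)
            return total;        /* end of file */
        if (errno == EINTR)
            continue;            /* interrupted by a signal, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;        /* nothing left for now */
        return -1;
    }
}
```

One signal delivery then triggers one call to drain_fd(), no matter how
many events piled up while the process was busy.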

> See? It's _exactly_ the same situation as before, except using signals
> means that the "busy doing something else" is now no longer any of the
> normal user space loops, it's system calls that can't be interrupted (the
> writev() blocking on interruptible IO would be interrupted, but memory
> allocations aren't interruptible).

Keith: Please comment on whether some memory pooling within the X server
could be a remedy. Since kernel round trips are expensive, such pooling
might speed things up significantly in the general case as well. Qt has
been doing this kind of pooling very successfully for many years (e.g.
QString).

> at all. We were blocked on them for other reasons (sending the events to
> the clients is obviously a _result_ of the mouse moving, but it's totally
> independent of actually updating the screen with a new mouse pointer).

So relaxing the notification of the clients in case of congestion should 
really help. 

Without proper testing I don't know if relaxing the notification will break 
existing semantics (e.g. Linus' focus-follows-mouse (**) usage). But I am 
rather confident that it could be made right.

> And SIGIO is very much a fake thread, and because it's fake, it ends up
> having these silly cases where the "signal thread" isn't executable
> because the "main thread" is doing something else.

> Keith may be able to tell us more.

> flaky. For example, if X uses siglongjmp() or something like that in a
> signal handler (which it may well do), taking a signal in the "idle
> thread" and then longjumping into somewhere else would do some seriously
> horribly bad things.

I assume that bad things happen anyway when siglongjmp is used within a
signal handler, e.g. when the signal interrupts a function that is not
async-signal-safe.

(*) I think that congestion is detectable by the X server via some simple 
timestamp heuristic.
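A possible shape for such a heuristic, purely as a sketch: compare the time
an input event was read with the time the server gets around to
dispatching it, and call it congestion when the lag exceeds a threshold.
The function name is_congested and the idea of passing the threshold as a
parameter are my assumptions. Timestamps are treated as 32-bit millisecond
counters that may wrap, as X timestamps do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* event_ms: timestamp taken when the input event was read
 *           (for example inside the SIGIO handler).
 * now_ms:   timestamp taken when the event is about to be dispatched.
 * Returns true when the dispatch lag exceeds threshold_ms. */
static bool is_congested(uint32_t event_ms, uint32_t now_ms,
                         uint32_t threshold_ms)
{
    uint32_t lag;

    /* The wraparound-safe comparison only works for thresholds that are
     * well below half the range of the counter. */
    assert(threshold_ms < UINT32_MAX / 2);

    /* Unsigned subtraction gives the right lag even if the counter
     * wrapped between the two timestamps. */
    lag = now_ms - event_ms;
    return lag > threshold_ms;
}
```

A sensible threshold would have to be found by measurement; something on
the order of one or two screen refresh intervals is only a guess.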

Yours,
-- martin

-- 
From the 'Handbook of Corporate Slang':
- to protect prior investment (phrase):
     describes the inability to revert a wrong decision made
     in the past, expresses willingness of throwing 
     good money after bad. (q.v. Fiorina, C.)