I've started to read about Direct2D and I must say it's pretty exciting.
Moving the rendering to the graphics card will not only increase performance through hardware acceleration, it will also free the CPU for other computations.
Stuff like tracking multiple points might be able to run in real time, which would be awesome.
The drawback is portability to other operating systems (although I've seen something about X support…) and to older Windows versions. That might mean having to maintain and release two versions, but it would be a good excuse to finally try some of the new features of .NET 3.5, like the graphics tablet API.
Anyway, yes, it would be good to use this opportunity to refactor the player screen and decoding/rendering flow.
I have added a diagram of the current architecture here to help.
I like the idea of buffering the incoming images. It won't be necessary when the images are already extracted to memory, but when decoding directly from the file it will help smooth out variations in decoding time.
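To make the buffering idea a bit more concrete, here is a minimal sketch in Python (used only for illustration, not the project's language). It assumes a decoder thread that fills a bounded queue ahead of playback while the player pulls frames from the other end. The names `decode_frames`, `fill_buffer`, `play` and the `capacity` parameter are hypothetical, and the "frames" are just strings standing in for decoded images.

```python
import queue
import threading

END_OF_STREAM = object()  # sentinel telling the player that decoding is finished


def decode_frames(frame_count):
    """Hypothetical stand-in for the real decoder: yields fake frames in order."""
    for index in range(frame_count):
        yield f"frame-{index}"


def fill_buffer(buffer, frame_count):
    """Producer: decode ahead of playback; put() blocks while the buffer is full."""
    for frame in decode_frames(frame_count):
        buffer.put(frame)
    buffer.put(END_OF_STREAM)


def play(frame_count, capacity=8):
    """Consumer: pull frames in order from the bounded buffer."""
    buffer = queue.Queue(maxsize=capacity)
    decoder = threading.Thread(target=fill_buffer, args=(buffer, frame_count))
    decoder.start()

    shown = []
    while True:
        frame = buffer.get()  # blocks only if the decoder has fallen behind
        if frame is END_OF_STREAM:
            break
        shown.append(frame)  # a real player would hand the frame to the renderer here

    decoder.join()
    return shown
```

The bounded capacity is the design point: a slow frame only stalls playback once the frames already sitting in the buffer are used up, and the decoder cannot run arbitrarily far ahead and eat memory.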
On the other side of the flow, when outputting to screen, the ideal way is, as you say, independent modules that take the image and data and draw them on their own surface, be it GDI+, Direct2D or whatever.
(An "output modules" architecture would make even more sense when we want to add back support for document rendering: we would just have to implement a PDF or ODF renderer.)
Passing the list of items to the renderer would completely change the approach for drawing though… As of now, each drawing has a Draw method, and is responsible for drawing itself on the GDI+ surface according to its own internal data. It's convenient and good encapsulation.
Abstracting the output would mean completely redesigning this aspect. This needs to be thought through carefully.
For instance, the exact same flow is used when saving a snapshot or a video. We just pass a new Bitmap to draw on instead of the control's Graphics.
Another way would be to abstract the renderers and pass the selected concrete renderer to the Draw method of each object.
The objects would draw themselves using universal primitives, and these would be translated to actual drawing calls by the concrete renderer.
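As a rough sketch of that second option, assuming nothing about the actual classes in the code base: the example below is in Python for illustration only, and `Renderer`, `RecordingRenderer`, `LineDrawing` and `render_all` are all hypothetical names. The recording renderer simply logs the primitives it receives; a real backend would map them to GDI+ or Direct2D calls instead.

```python
from abc import ABC, abstractmethod


class Renderer(ABC):
    """Abstract output surface exposing a few universal drawing primitives."""

    @abstractmethod
    def draw_line(self, start, end, color):
        ...

    @abstractmethod
    def draw_text(self, text, position, color):
        ...


class RecordingRenderer(Renderer):
    """Stand-in for a concrete backend: it only records the primitives it gets."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def draw_line(self, start, end, color):
        self.calls.append(("line", start, end, color))

    def draw_text(self, text, position, color):
        self.calls.append(("text", text, position, color))


class LineDrawing:
    """A drawing that still owns its data and knows how to draw itself."""

    def __init__(self, start, end, label, color="red"):
        self.start = start
        self.end = end
        self.label = label
        self.color = color

    def draw(self, renderer):
        # Only abstract primitives are used here, never a backend-specific call.
        renderer.draw_line(self.start, self.end, self.color)
        renderer.draw_text(self.label, self.end, self.color)


def render_all(drawings, renderer):
    """Same flow for screen, snapshot or video: only the renderer changes."""
    for drawing in drawings:
        drawing.draw(renderer)
```

This keeps the encapsulation of the current design (each drawing has its own Draw method) while the choice of output surface moves into whichever renderer is passed in.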
What do you think?