Hardening consulting

Bits from the cyberspace

Multi-monitor support in RDP

A post written after some recent inquiries with multi-monitor in firerds (so server side). It looked quite easy when I started working on this, but as usual with RDP I had lots of surprises (bad of course ;)

Testing multi-monitor

To begin, you need a test platform, the easiest way is to just plug 2 screens on your host and run xfreerdp:

# xfreerdp /v:myserver /multimon /f

But as we're aiming at doing server-side, it's interesting to test with the official client, our beloved mstsc and its numerous versions. You can proceed the same way and plug 2 screens on a host running windows. Anyway, here my window hosts generally run in virtual machines, so... Browsing the internet I found how to emulate multiple screens with QEMU/KVM: you must configure a SPICE display, then you will add as many video card in QXL mode as you want some extra monitors.

The hint is to use remote-viewer to connect on the VM and not the display of virt-manager:

# remote-viewer spice://

When launching remote-viewer, you will have a window per configured monitor. Really useful !

Multi-monitor at the protocol level

Now that we have a test platform, let's look at what changes at the protocol level when you come with multiple monitors configured.

First the monitor layout is sent by the client in the GCC packet (MCS connect request) at the very beginning of the negociation. This packet contains the coordinates of the 4 corners of the monitor, so that gives us their positions and sizes.

The server will answer later with an optional Server Monitor Layout PDU, it contains the monitor layout seen from the server-side. We'll see later that despice what is said in the specification that packet is mandatory if you want the client to do multi-monitor. Note that we're not supposed to send that PDU if the client has not announced that it was supporting it in the earlyCapabilityFlags.

Let's do it

The backends

At first look, the most complicated seemed to be the changes on the firerds backends to support multiple monitors. Because all our backend were supposing that there only one monitor, with all the bias that go with it. So a good amount of work have been done on these.

Hopefully X11 and Qt are already multi-monitor ready, it has been just a matter of correctly interacting with the API. Knowing the position of screens allows to do some smarter window placements (not overlapping on 2 monitors) and of course to correctly display in fullscreen.

RDP clients

I've used xfreerdp as client to modify the backends and make then multi-monitor ready. It was working great.

Of course the troubles came with mstsc. First it connects and sets a DesktopWidth / DesktopHeight that is the size of main monitor instead of the extents of all the monitors. And with mstsc the Server Monitor Layout PDU is mandatory, for it it's the multi-monitor signal otherwise you end up with a big window.

When you're in the configuration on the right, things become a little more complex. Some monitor coordinates will be negative. It's correct as according to the spec, the main monitor should be in (0,0) and others positionned relative to that one. mstsc takes some liberty with the spec and sometime the main monitor is not in (0,0) at all.

In the example if monitor 1 is the main one, the screen 2 and 3 should have negative coordinates. Of course we start with the complex case with monitors of different sizes and weird positions.

After some changes:

  • renegociating at the right size;
  • handling negative coordinates (because backends don't expect negative numbers;

it was working great.

egfx channel

Then comes the egfx channel... It was working when the negociated codec mode was planar or remoteFx, but as soon as the egfx channel was used there were some weird behaviours. Depending on the version of mstsc, windows or the monitor layout, we ended up with the mstsc window not going fullscreen on all monitors. Or when you had exited fullscreen you couldn't set it fullscreen again and so you had to use the sliders to navigate in the window.

I did many tests and finally I've decided to see how an official server was behaving. I took the settings that are sent by mstsc and I've modified sfreerdp-client, the sample FreeRDP client, to connect on a windows server with the same configuration as mstsc (at least what looked pertinent regarding multi-monitor). I've seen interesting things:

  • first, a windows server doesn't resize with a reactivation sequence when egfx is used, it uses the RDPGFX_RESET_GRAPHICS_PDU which allows to setup the size of the graphical buffer:

As you can see the message contains the size of the graphical buffer, but also the layout of the monitors.

  • I also figured that when in multi-monitor, the server creates a [RDPGFX_CREATE_SURFACE_PDU][surface] per monitor instead of a big surface covering all monitors. It seems that each surface has its own codec context (for H264 example).

So a large amount of work has been done to treat the repaint area by monitors instead of doing it globally:

  • split the damage area by monitor. Some troubles appeared with coordinates alignment: monitor aren't always on 64 boundary !
  • Of course when doing this you must rebase most of the coordinates with the origin of the surface;
  • this must work in parallel of the "standard mode" when egfx is not used;
  • when I've worked on limiting the number of monitors, it led me to rewrite almost completely the part handling resizes. When you come with 3 monitors and the limit is 2, the expected behaviour is that you should be resized to the size of the main monitor;
  • I'm not even talking of shadowing: when a multi-monitored spy do an assistance session, we have to disable the multi-monitor when it is shadowing and restore it when exiting. So lots of weird cases.

To conclude

The project took longer than initialy planed. Reminder for next projects: take a big safety delay when a proprietary software is involved, there's always bad surprises to expect.

In the current implementation we have decided to not do H264 in the multimonitor case, as it would require some significant changes to handle the context per surface. Anyway that would be interesting.