Pixel Streaming and WebXR Development in Unreal

Recently, Xbox made headlines when Game Pass became available on the Amazon Fire TV Stick.

Ouch! That hit me right in the nostalgia! You see, I grew up with an Xbox console, and those memories are still fresh in my mind. So the idea of Xbox leaving the console market genuinely shocked me.

As I was trying not to get my panties in a bunch, I thought about the broader implications of cloud gaming and looked for some reason for its existence that wasn't deeply offensive to me. Cloud gaming for Augmented Reality (AR) games, I figured, might be somewhat useful. After all, I'm not old enough to feel any nostalgia for that sort of game.

AR, for those who might not know, is a technology that superimposes 3D objects onto the real world, enhancing our experience with multi-sensory stimuli. Cloud gaming could, in theory, allow much more powerful games to run on my phone, which could be kinda cool.

While working with Unreal, I quickly became aware of Pixel Streaming, a topic constantly discussed by engine representatives and influencers touting that this sort of thing will take over regular YouTube streaming any day now. Pixel Streaming uses streaming technology, namely WebRTC, to stream rendered content from a computer to a TV, a mobile phone, or even another computer, allowing high-quality content rendered on a powerful machine to be delivered to less powerful devices.

So, I started experimenting with Pixel Streaming in Unreal. As a proof of concept, I wanted to see if I could create a simple AR app that takes advantage of rendering done on my computer.

Augmented Reality Application Template

I have worked with Virtual Reality in the past using the Oculus Rift, and I knew from the outset that Android was the lingua franca in this realm. My initial thought was to get Pixel Streaming working on my mobile device. So after many painful steps setting up Android Studio, USB debugging, and network connectivity, I launched the Augmented Reality Application Template on my mobile device.

Pixel Streaming is typically done from an Unreal app, and clients usually receive the streamed images in their web browser. I was wondering whether there was any way to receive a pixel stream inside another Unreal application instead. Doing this would allow me to take advantage of Unreal's AR functionality.

I came across an experimental feature of Pixel Streaming called the “Pixel Streaming Player”, which adds the ability to receive streamed pixels into a render texture. The idea was to overlay the streamed pixels onto the camera feed shown in the mobile app. To quickly validate this, I placed a plane in front of the AR camera.

Following the Pixel Streaming Player documentation, setting up the Blueprint was fairly straightforward.

I should have started with a simpler scenario! I quickly realized that, as it stood, there were serious problems with Pixel Streaming connectivity on mobile devices. Worst of all, when testing a standalone application on my PC, I realized that the Pixel Streaming Player also suffered from latency, even when the client application is run on the same machine as the Pixel Streaming application. Second-guessing the approach, I also realized that there was no easy way to communicate back to the Unreal app from the Pixel Streaming Player. As shown in the Pixel Streaming “Get Started” demo, the virtual joystick input in the web media player relies on “emitUIInteraction” to send arbitrary strings back to Unreal. There is no such API available in the Unreal Pixel Streaming Player as of yet.
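
For reference, this is roughly how the standard web frontend exposes that capability (a minimal sketch assuming the 5.5 frontend package used elsewhere in this project; the joystick payload is just an illustrative descriptor):

// Sketch: sending arbitrary data back to Unreal from the standard web frontend.

import { Config, PixelStreaming } from '@epicgames-ps/lib-pixelstreamingfrontend-ue5.5';

const config = new Config({ useUrlParams: true });
const stream = new PixelStreaming(config);

// emitUIInteraction serializes the descriptor and sends it to the streamer,
// where it can be handled on the Unreal side.
stream.emitUIInteraction({ joystick: { x: 0.5, y: -0.25 } });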

The straw that broke the camel's back was realizing that having the player install an app defeats the point of cloud gaming in the first place. Come on, get with the times, man!

AR in the WebPlayer with WebXR

As expected, regular Pixel Streaming into the web player works flawlessly on a mobile device connected over the same network.

So, I started digging around and discovered that the Unreal developers were already thinking way ahead of me! There were already XR facilities in the web player: it leverages WebXR, available on ARCore-compatible devices, for stereoscopic rendering. WebXR could potentially handle things such as hit detection and light estimation as well, which I was really excited about! However, I didn't have any fancy headset or AR glasses at home, and there was no functionality for straight-up AR compatible with my Samsung Galaxy tablet as of yet.
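
For what it's worth, checking whether a given browser and device can even start an AR session is a one-liner with the standard navigator.xr API (a small sketch; checkArSupport is just an illustrative helper name):

// Sketch: verifying that the browser/device can create an immersive-ar session.

async function checkArSupport(): Promise<boolean> {
    if (!('xr' in navigator) || !navigator.xr) {
        return false;
    }
    // Resolves to true on WebXR-capable browsers, e.g. Chrome on an
    // ARCore-compatible Android device.
    return navigator.xr.isSessionSupported('immersive-ar');
}

checkArSupport().then((supported) => console.log(`immersive-ar supported: ${supported}`));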

I quickly realized that I would have to dust off my web developer hat to get this working. I decided to write a new CustomARController that would let me initiate the right type of WebXR session: the provided WebXRController starts a session with immersive-vr, which is not suited to my application. While reading the documentation, I found everything I needed to initiate my session, as well as guidance on the reference space used to obtain tracking information.

// CustomARController.ts

import { Logger } from '@epicgames-ps/lib-pixelstreamingcommon-ue5.5';
import { WebRtcPlayerController } from '../WebRtcPlayer/WebRtcPlayerController';
import { XRGamepadController } from '../Inputs/XRGamepadController';
import { XrFrameEvent } from '../Util/EventEmitter'
import { Flags } from '../pixelstreamingfrontend';

export class CustomARController {
    public gl: WebGL2RenderingContext;
    public xrSession: XRSession;
    private xrRefSpace: XRReferenceSpace;
    private xrViewerPose : XRViewerPose = null;
    // Used for comparisons to ensure two numbers are close enough.
    private EPSILON = 0.0000001;

    private webRtcController: WebRtcPlayerController;
    private xrGamepadController: XRGamepadController;

    onSessionStarted: EventTarget;
    onSessionEnded: EventTarget;
    onFrame: EventTarget;    

    constructor(webRtcPlayerController: WebRtcPlayerController) {
        this.xrSession = null;
        this.webRtcController = webRtcPlayerController;
        this.xrGamepadController = new XRGamepadController(
            this.webRtcController.streamMessageController
        );
        this.onSessionEnded = new EventTarget();
        this.onSessionStarted = new EventTarget();
        this.onFrame = new EventTarget();
    }

    public startSession(gl: WebGL2RenderingContext) {
        if (!this.xrSession) 
        {
            this.gl = gl;
            navigator.xr
                /* Request an immersive-ar session with 'local' tracking required
                   and hit testing as an optional extra. */
                .requestSession('immersive-ar', {
                    optionalFeatures: ['hit-test'],
                    requiredFeatures: ['local']
                })
                .then((session: XRSession) => {
                    this.onXrSessionStarted(session);
                });
        } else 
        {
            this.xrSession.end();
        }
    }
    ...
}
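
The rest of the class is elided above. For context, the session-start handler invoked from startSession ends up doing roughly the following (a sketch under my assumptions; the event names and exact ordering are my own simplification, but XRWebGLLayer and requestReferenceSpace are the standard WebXR way to wire this up):

// CustomARController.ts (continued) — sketch of the elided session-start logic.

    private onXrSessionStarted(session: XRSession) {
        this.xrSession = session;

        // Clean up when the session ends (e.g. the user leaves AR).
        session.addEventListener('end', () => {
            this.xrSession = null;
            this.onSessionEnded.dispatchEvent(new Event('xrSessionEnded'));
        });

        // Use our WebGL2 context as the session's rendering layer.
        session.updateRenderState({
            baseLayer: new XRWebGLLayer(session, this.gl)
        });

        // 'local' matches the required feature requested in startSession and
        // provides a stable origin near the device's starting position.
        session.requestReferenceSpace('local').then((refSpace: XRReferenceSpace) => {
            this.xrRefSpace = refSpace;
            // Start the per-frame loop that forwards poses back to Unreal.
            session.requestAnimationFrame((time, frame) => this.onXrFrame(time, frame));
        });

        this.onSessionStarted.dispatchEvent(new Event('xrSessionStarted'));
    }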

WebRTC and WebGL2

There are three components to the web player: receiving Pixel Streaming frames, displaying the camera feed through the WebXR session, and transmitting tracking information back to Unreal.

I needed to find a way to overlay the Pixel Streaming frames on top of the WebGL2RenderingContext used by the WebXR session. Thankfully this wasn't too difficult, because it is precisely what the WebXRController demonstrates. The idea is to obtain the framebuffer used by the xrSession and draw on top of it using gl.bindFramebuffer, gl.viewport, and gl.drawArrays.

// PixelStreaming.ts

    _onFrame(time: DOMHighResTimeStamp, frame: XRFrame) {
        Logger.Log(Logger.GetStackTrace(), 'PixelStream: _onFrame');

        // Upload the latest decoded video frame into the texture sampled by the shader.
        this._updateVideoTexture();

        // Bind the framebuffer to the base layer's framebuffer
        const glLayer = this.customArController.xrSession.renderState.baseLayer;
        this._gl.bindFramebuffer(this._gl.FRAMEBUFFER, glLayer.framebuffer);

        // Set the relevant portion of clip space
        this._gl.viewport(0, 0, glLayer.framebufferWidth, glLayer.framebufferHeight);

        // Draw the rectangle we will show the video stream texture on
        this._gl.drawArrays(this._gl.TRIANGLES /*primitiveType*/, 0 /*offset*/, 6 /*count*/);
    }
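
The call to _updateVideoTexture above is where each decoded video frame is copied into the texture sampled by the shader. That method is not shown in the original listing; a minimal version, assuming a _videoTexture created and configured during initialization, could look like this:

// PixelStreaming.ts — sketch of the video-to-texture upload (assumes this._videoTexture
// was created during initialization).

    _updateVideoTexture() {
        const video = this._webRtcController.videoPlayer.getVideoElement();
        if (!video || video.readyState < video.HAVE_CURRENT_DATA) {
            // No decoded frame available yet; skip this frame.
            return;
        }

        this._gl.bindTexture(this._gl.TEXTURE_2D, this._videoTexture);
        // Passing an HTMLVideoElement copies its current frame into the bound texture.
        this._gl.texImage2D(
            this._gl.TEXTURE_2D,
            0,                      // mip level
            this._gl.RGBA,          // internal format
            this._gl.RGBA,          // source format
            this._gl.UNSIGNED_BYTE, // source type
            video
        );
    }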

With that, I was able to draw the Pixel Streaming frames on top of the camera feed. At first I thought the xrSession camera feed had disappeared entirely, but by drawing the streamed frames only intermittently I was able to confirm that the camera feed was still there underneath.

Now, this raises the question: how would I isolate the subject of my AR application so that it shows up in the context of the camera feed? Pixel Streaming relies on WebRTC, a peer-to-peer technology in which communication happens directly between the two machines, and it is WebRTC that transmits the encoded video frames over that connection.

While reading the Unreal documentation and WebRTC forums, I quickly discovered that WebRTC doesn't support an alpha channel in video streaming with any of the supported codecs (H.264, VP8, VP9). So even if I managed to draw the scene with an alpha channel, it would not be transmitted to the web player.

Create Masks With the Custom Depth Buffer

I was left thinking about a tutorial I followed a while back, when I was trying to implement selection highlighting for item pickup. Using custom depth and a post-process material, I could simulate a green-screen effect: everything except the subject gets rendered green, and the green can then be discarded on the receiving end.

In my post-process material, PostProcessInput0 refers to the color of the scene before the full-screen post-process effect is applied. Make sure to also add a Post Process Volume to the level and set its extent to “Infinite” (Unbound).

The next step is to enable “Render CustomDepth Pass” on the static mesh. You will also need to go into the Project Settings and enable the Custom Depth-Stencil Pass.

Custom WebGL2 Viewport

As mentioned, WebRTC cannot transmit an alpha channel, so we will need a custom viewport that discards the green color sent over via the pixel stream.

// PixelStreaming.ts

    _initShaders() {

        // shader source code
        const vertexShaderSource: string =
        `
        attribute vec2 a_position;
        attribute vec2 a_texCoord;

        // varyings
        varying vec2 v_texCoord;

        void main() {
           gl_Position = vec4(a_position.x, a_position.y, 0, 1);
           // pass the texCoord to the fragment shader
           // The GPU will interpolate this value between points.
           v_texCoord = a_texCoord;
        }
        `;

        const fragmentShaderSource: string =
        `
        precision mediump float;

        // our texture
        uniform sampler2D u_image;

        // the texCoords passed in from the vertex shader.
        varying vec2 v_texCoord;

        void main() {
            // gl_FragColor = texture2D(u_image, v_texCoord);
            vec4 color = texture2D(u_image, v_texCoord);
            // checking if the green component of the color is significantly higher than the red and blue components
            if (color.g > 0.6 && color.r < 0.4 && color.b < 0.4) {
                discard;
            } else {
                gl_FragColor = color;
            }

        }
        `;

        // setup vertex shader
        const vertexShader = this._gl.createShader(this._gl.VERTEX_SHADER);
        this._gl.shaderSource(vertexShader, vertexShaderSource);
        this._gl.compileShader(vertexShader);
        if (!this._gl.getShaderParameter(vertexShader, this._gl.COMPILE_STATUS)) {
            console.error('ERROR compiling vertex shader!', this._gl.getShaderInfoLog(vertexShader));
            return;
        }

        // setup fragment shader
        const fragmentShader = this._gl.createShader(this._gl.FRAGMENT_SHADER);
        this._gl.shaderSource(fragmentShader, fragmentShaderSource);
        this._gl.compileShader(fragmentShader);
        if (!this._gl.getShaderParameter(fragmentShader, this._gl.COMPILE_STATUS)) {
            console.error('ERROR compiling fragment shader!', this._gl.getShaderInfoLog(fragmentShader));
            return;
        }

        // setup GLSL program
        const shaderProgram = this._gl.createProgram();
        this._gl.attachShader(shaderProgram, vertexShader);
        this._gl.attachShader(shaderProgram, fragmentShader);
        this._gl.linkProgram(shaderProgram);
        if (!this._gl.getProgramParameter(shaderProgram, this._gl.LINK_STATUS)) {
            console.error('ERROR linking program!', this._gl.getProgramInfoLog(shaderProgram));
            return;
        }

        this._gl.useProgram(shaderProgram);

        // look up where vertex data needs to go
        this._positionLocation = this._gl.getAttribLocation(
            shaderProgram,
            'a_position'
        );
        this._texcoordLocation = this._gl.getAttribLocation(
            shaderProgram,
            'a_texCoord'
        );
    }
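
The attribute locations looked up above are only useful once the full-screen quad's vertex data has been uploaded. That buffer setup is not shown in the listing; it boils down to two small buffers feeding drawArrays(TRIANGLES, 0, 6) (a sketch, with _initQuadBuffers as an illustrative name, assumed to run once during initialization):

// PixelStreaming.ts — sketch of the quad geometry behind drawArrays(TRIANGLES, 0, 6).

    _initQuadBuffers() {
        // Two triangles covering all of clip space.
        const positions = new Float32Array([
            -1, -1,   1, -1,   -1,  1,
            -1,  1,   1, -1,    1,  1
        ]);
        // Texture coordinates flipped vertically so the video appears upright;
        // adjust if your stream shows up inverted.
        const texCoords = new Float32Array([
            0, 1,   1, 1,   0, 0,
            0, 0,   1, 1,   1, 0
        ]);

        const positionBuffer = this._gl.createBuffer();
        this._gl.bindBuffer(this._gl.ARRAY_BUFFER, positionBuffer);
        this._gl.bufferData(this._gl.ARRAY_BUFFER, positions, this._gl.STATIC_DRAW);
        this._gl.enableVertexAttribArray(this._positionLocation);
        this._gl.vertexAttribPointer(this._positionLocation, 2, this._gl.FLOAT, false, 0, 0);

        const texCoordBuffer = this._gl.createBuffer();
        this._gl.bindBuffer(this._gl.ARRAY_BUFFER, texCoordBuffer);
        this._gl.bufferData(this._gl.ARRAY_BUFFER, texCoords, this._gl.STATIC_DRAW);
        this._gl.enableVertexAttribArray(this._texcoordLocation);
        this._gl.vertexAttribPointer(this._texcoordLocation, 2, this._gl.FLOAT, false, 0, 0);
    }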

You will also need to make sure to enable gl.BLEND. WebGL's default behavior is to clear the alpha channel to 1.0 (fully opaque), so if you want to see through the parts of the canvas where you've discarded fragments, blending has to be enabled with an appropriate blend function.

// PixelStreaming.ts
    _initGL() {
        
        const video = this._webRtcController.videoPlayer.getVideoElement();

        this._canvas = document.createElement('canvas');
        this._canvas.id = 'customCanvas';
        this._canvas.style.width = '100%';
        this._canvas.style.height = '100%';
        this._canvas.style.position = 'absolute';
        this._canvas.style.pointerEvents = 'all';

        this._canvas.width = video.videoWidth;
        this._canvas.height = video.videoHeight;
        this._gl = this._canvas.getContext('webgl2', {
            xrCompatible: true
        });
        this._gl.clearColor(0.0, 0.0, 0.0, 1);
  
        video.parentElement.appendChild(this._canvas);

        // WebGL's default behavior is to clear the alpha channel to 1.0 (fully opaque).
        // To see through the parts of the canvas where fragments are discarded, blending must be enabled.
        this._gl.enable(this._gl.BLEND);
        // If depth testing is enabled, the pixel stream might be occluded by the XRWebGLLayer even if it’s rendered afterwards
        this._gl.disable(this._gl.DEPTH_TEST);
        this._gl.blendFunc(this._gl.SRC_ALPHA, this._gl.ONE_MINUS_SRC_ALPHA);
    }

WebXR Position Tracking

Now that I have the pixel stream properly shown on top of the camera feed, it is time to send the camera tracking back to Unreal so that the virtual camera in the engine can be moved in relation to how the tablet moves.

Thankfully, a lot of relevant code can be reused from the WebXRController.ts sample. The only hurdle was determining the proper reference space to use. I found that initializing my XR session with 'local' as a required feature and then requesting the reference space named 'local' was the right way to go.

// CustomARController.ts
export class CustomARController {
    ...

    sendXRDataToUE() {
        
        const trans = this.xrViewerPose.transform.matrix;

        // We don't need the individual eye views to be sent; just send the viewer transform
        this.webRtcController.streamMessageController.toStreamerHandlers.get('CustomArTransform')([
            // 4x4 transform
            trans[0], trans[4], trans[8],  trans[12],
            trans[1], trans[5], trans[9],  trans[13],
            trans[2], trans[6], trans[10], trans[14],
            trans[3], trans[7], trans[11], trans[15],
        ]);
    }

    onXrFrame(time: DOMHighResTimeStamp, frame: XRFrame) {
        
        Logger.Log(Logger.GetStackTrace(), 'XR onXrFrame');

        this.xrViewerPose = frame.getViewerPose(this.xrRefSpace);
        
        if (this.xrViewerPose) {
            Logger.Log(Logger.GetStackTrace(), 'XR xrViewerPose');
            this.sendXRDataToUE();
        }

        ...

        this.xrSession.requestAnimationFrame(
            (time: DOMHighResTimeStamp, frame: XRFrame) =>
                this.onXrFrame(time, frame)
        );

        this.onFrame.dispatchEvent(new XrFrameEvent({ time, frame }));
    }
}

You will also need to register your custom WebRTC message so that it does not conflict with existing Unreal functionality.

// WebRtcPlayerController.ts

this.streamMessageController.registerMessageHandler(
    MessageDirection.ToStreamer,
    'CustomArTransform',
    (data: Array<number | string>) =>
        this.sendMessageController.sendMessageToStreamer(
            'CustomArTransform',
            data
        )
);

Receiving Data Back in Unreal

I have experience writing my own camera rigs, so modifying a camera transform via code was not too much of a hassle. I created a custom Pawn with a Camera attached to it. The solution simply consists of mapping the received transform onto the Pawn's transform, and voilà! Job done.

// Fill out your copyright notice in the Description page of Project Settings.


#include "PixelStreamAr/PSARPawn.h"

#include "PixelStreamingInputComponent.h"
#include "IPixelStreamingInputHandler.h"
#include "IPixelStreamingModule.h"
#include "IPixelStreamingInputModule.h"
#include "IPixelStreamingStreamer.h"
#include "Camera/PlayerCameraManager.h"


// Called when the game starts or when spawned
void APSARPawn::BeginPlay()
{
	Super::BeginPlay();

    if (auto PlayerController = Cast<APlayerController>(GetController()))
    {
        CameraManager = PlayerController->PlayerCameraManager;

        typedef EPixelStreamingMessageTypes EType;
        FPixelStreamingInputProtocol::ToStreamerProtocol.Add("CustomArTransform", FPixelStreamingInputMessage(110, {	// 4x4 Transform
        EType::Float, EType::Float, EType::Float, EType::Float,
        EType::Float, EType::Float, EType::Float, EType::Float,
        EType::Float, EType::Float, EType::Float, EType::Float,
        EType::Float, EType::Float, EType::Float, EType::Float,
            }));


        auto& PixelStreamingModule = IPixelStreamingModule::Get();
        auto Streamers = PixelStreamingModule.GetStreamerIds();
        if (Streamers.Num() != 0)
        {
            if (auto Streamer = PixelStreamingModule.FindStreamer(Streamers[0]))
            {
                if (auto InputHandler = Streamer->GetInputHandler().Pin())
                {
                    InputHandler->RegisterMessageHandler("CustomArTransform", [this](FString SourceId, FMemoryReader Ar) { HandleOnARTransform(Ar); });
                }
            }
        }
    }	
}

FMatrix ExtractWebXRMatrix(FMemoryReader& Ar)
{
    FMatrix OutMat;
    for (int32 Row = 0; Row < 4; ++Row)
    {
        float Col0 = 0.0f, Col1 = 0.0f, Col2 = 0.0f, Col3 = 0.0f;
        Ar << Col0 << Col1 << Col2 << Col3;
        OutMat.M[Row][0] = Col0;
        OutMat.M[Row][1] = Col1;
        OutMat.M[Row][2] = Col2;
        OutMat.M[Row][3] = Col3;
    }
    OutMat.DiagnosticCheckNaN();
    return OutMat;
}

FTransform WebXRMatrixToUETransform(FMatrix Mat)
{
    // Rows and columns are swapped between raw mat and FMat
    FMatrix UEMatrix = FMatrix(
        FPlane(Mat.M[0][0], Mat.M[1][0], Mat.M[2][0], Mat.M[3][0]),
        FPlane(Mat.M[0][1], Mat.M[1][1], Mat.M[2][1], Mat.M[3][1]),
        FPlane(Mat.M[0][2], Mat.M[1][2], Mat.M[2][2], Mat.M[3][2]),
        FPlane(Mat.M[0][3], Mat.M[1][3], Mat.M[2][3], Mat.M[3][3]));
    // Extract & convert translation
    FVector Translation = FVector(-UEMatrix.M[3][2], UEMatrix.M[3][0], UEMatrix.M[3][1]) * 100.0f;
    // Extract & convert rotation
    FQuat RawRotation(UEMatrix);
    FQuat Rotation(-RawRotation.Z, RawRotation.X, RawRotation.Y, -RawRotation.W);
    return FTransform(Rotation, Translation, FVector(Mat.GetScaleVector(1.0f)));
}

void APSARPawn::HandleOnARTransform(FMemoryReader Ar)
{
    // The `Ar` buffer contains the transform matrix stored as 16 floats
    FTransform Transform = WebXRMatrixToUETransform(ExtractWebXRMatrix(Ar));       
    Transform.SetScale3D(FVector(1, 1, 1));
    SetActorTransform(Transform);
}

I forgot to mention it earlier, but as described in the Pixel Streaming getting-started guide, you will need to enable certain settings to enable streaming and to receive messages from another peer over the connection.

Pixel Streaming AR Demo

I now have a working AR app in my mobile device's browser, with the visuals drawn in Unreal. This could be taken a lot further using things like hit detection and light estimation. There are obviously concerns with latency and with the green-screen approach, which may very well be deal breakers.

Over the development of this project, I found it very useful to use the Google Chrome DevTools to establish a connection between my mobile device and my local machine. Its port forwarding feature was the only way I found to temporarily bypass the HTTPS requirement mandated by WebXR. Details are available here.

The idea of using a green screen to bypass WebRTC's lack of alpha-channel support came from this article: Using streaming to render high-fidelity graphics in AR | by Jam3 | Medium

While a little cryptic, the article was a huge help throughout development.

Anyway, that's it for me today!

Courage! Keep on building

Source Code Available

perrauo/PixelStreamingInfrastructure: The official Pixel Streaming servers and frontend. (github.com)

https://github.com/perrauo/pixelstream-ar-ue-demo.git
