Marker Based AR Collaboration
Overview
The Marker Based AR Collaboration sample demonstrates how to build augmented reality use cases based on the detection of markers (QR Codes or ArUco).
In the first example scene, we show how a marker can be used in a colocation scenario where multiple users are physically in the same room.
When one user detects a marker, the application creates a network anchor linked to that marker. This network anchor is positioned in 3D space at the marker's real-world location.
When a second user detects the same marker, the application associates it with the existing network anchor. The second user's scene is then realigned so that the newly detected anchor matches the previously created network anchor.
Because each user is aligned to the same marker synchronized over the network, all users become colocated without needing to scan the room beforehand.
The second scenario illustrates how two users in different physical locations can collaborate effectively using a marker.
This is particularly relevant in the case of an on-site technician requesting technical support from a remote engineer.
The on-site technician is in a location with equipment that has a marker attached. First, the technician performs a calibration so the position of the marker on the real object can be determined.
After calibration is saved, each time the marker is detected, the system can automatically determine the technician's exact position relative to the equipment.
When the remote support engineer connects, they can precisely see the technician's location with respect to the equipment.
Additionally, a giant mode lets the user change scale to obtain an overall bird's-eye view of the scene.
Each user can also stream the video captured by their headset for richer remote assistance.

Technical Info
This sample uses the Shared Authority topology. The project has been developed with Unity 6 & Fusion 2 and tested with the following packages:
- Meta XR Core SDK 78.0.0 : com.meta.xr.sdk.core
- Meta MR Utility Kit 78.0.0 : com.meta.xr.mrutilitykit
- Unity OpenXR Meta 2.1.1 : com.unity.xr.meta-openxr
- OpenCV for Unity 2.6.6 (optional)
Headset firmware versions: v79 & v81
The video broadcast is done using Photon Video SDK v2.58. Please note that a specific patch has been applied here because the official Video SDK v2.58 preview resolution does not match the resolution of the video stream. This will be supported in an upcoming version of the Video SDK.
Compilation: the SubsampledLayoutDesactivation editor script automatically disables the Meta XR Subsampled Layout option; please remove it if this is not the desired behaviour.
Before you start
To run the sample:
- Create a Fusion AppId in the PhotonEngine Dashboard and paste it into the App Id Fusion field in Real Time Settings (reachable from the Fusion menu).
- Create a Voice AppId in the PhotonEngine Dashboard and paste it into the App Id Voice field in Real Time Settings.
Download
Version | Release Date | Download
--- | --- | ---
2.0.7 | October 10, 2025 | Fusion Marker Based AR Collaboration 2.0.7

Version | Release Date | Download
--- | --- | ---
2.0.7 | October 10, 2025 | Fusion Marker Based AR Collaboration Without Video SDK 2.0.7
Folder Structure
- The main folder /Marker_Based_AR_Collaboration contains all elements specific to this sample.
- The /Photon folder contains the Fusion and Photon Voice/Video SDK.
- The /Photon/FusionAddons folder contains the Industries Addons used in this sample.
- The /Photon/FusionAddons/FusionXRShared folder contains the rig and grabbing logic coming from the VR shared sample, creating a FusionXRShared light SDK that can be shared with other projects.
- The /XR folder contains configuration files for virtual reality.
Architecture overview
The Marker Based AR Collaboration sample is based on the same code base as that described in the VR Shared page, notably for the rig synchronization.
Aside from this base, the sample, like the other Industries samples, contains some extensions to the Industries Addons, to handle some reusable features like synchronized rays, locomotion validation, touching, teleportation smoothing or a gazing system.
For both scenarios in this sample, the architecture is very similar.
Marker tracking
Marker types
The project is compatible with two types of visual markers:
- QR Codes
- ArUco
QR code detection is based on the Meta MR Utility Kit (MRUK), which now supports Trackables of QR code type as an experimental API. Please see the "Track QR Codes in MR Utility Kit" documentation page on the Meta website for the current status, limitations, and how to enable it.
For ArUco marker detection and tracking, a modified version of the Quest ArUco Marker Tracking project has been used. It requires the OpenCV for Unity asset available on the Unity Asset Store.
The project can operate with a single marker type (QR code or ArUco) or both simultaneously.
Note:
- We do not distribute OpenCV for Unity with the project. You must purchase it yourself and add it to the project. Once installed, it is automatically detected, enabling ArUco marker support.
- Using ArUco markers requires compiling the application, whereas QR codes work directly in the Unity editor.
- If QR code detection doesn't work with a compiled application, try enabling the experimental mode with the adb command adb shell setprop debug.oculus.experimentalEnabled 1.
- Due to temporary technical constraints with the Video SDK, ArUco marker tracking is disabled when a video stream is started.
Marker management
The management of markers is divided into several layers:
The first layer, IRLAnchorTracking, is responsible for the visual detection of the markers located in the user's room in order to generate the associated anchors. It notifies listeners through events as soon as a change occurs on an anchor and computes a stabilized position for the anchors.
A second layer then handles the processing of this information. For the first colocation scenario, this is done by the IRLRoomManager class, whereas for the remote-assistance scenario it is done by the AnchorBasedObjectSynchronization class. These two classes implement the IRLAnchorTracking.IIRLAnchorTrackingListener interface so that they are notified as soon as a change has been detected on an anchor.
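As a simplified illustration of this layering, the sketch below defines a hypothetical listener interface and a component implementing it. The interface shape and member names are assumptions for illustration only, not the actual IIRLAnchorTrackingListener API.

```csharp
using UnityEngine;

// Hypothetical, simplified analogue of the sample's layering: the tracking layer raises
// events when an anchor changes, and higher layers implement a listener interface to
// react. All names below are illustrative assumptions, not the sample's exact API.
public interface IAnchorTrackingListener
{
    void OnAnchorDetected(string anchorId, Vector3 position, Quaternion rotation);
    void OnAnchorUpdated(string anchorId, Vector3 position, Quaternion rotation);
    void OnAnchorLost(string anchorId);
}

public class AnchorLoggingListener : MonoBehaviour, IAnchorTrackingListener
{
    public void OnAnchorDetected(string anchorId, Vector3 position, Quaternion rotation)
        => Debug.Log($"Anchor {anchorId} detected at {position}");

    public void OnAnchorUpdated(string anchorId, Vector3 position, Quaternion rotation)
        => Debug.Log($"Anchor {anchorId} updated to {position}");

    public void OnAnchorLost(string anchorId)
        => Debug.Log($"Anchor {anchorId} lost");
}
```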
For each detected marker, several prefabs are associated:
- detectedIrlAnchorTagPrefab: the current position of the marker reported by the detection mechanism (MRUK for QR codes or OpenCV for ArUco markers). This prefab is not intended to be displayed except for debugging purposes, in particular when using ArUco markers, because the current position may show large variations depending on the user's head movements.
- stabilizedIrlAnchorTagPrefab: a stabilized version of the marker position, calculated as an average of the previous positions (the length of the position history used in the calculation is configurable).
- representationIrlAnchorTagPrefab: the visual element displayed when an anchor has remained stabilized at the same position for a sufficient amount of time.
The information related to an anchor is grouped in the IRLAnchorInfo class, while the visual aspect of the anchors is handled by the AnchorTag class, which represents a point detected in space: it can be used to track marker detection results, or to visualize a stabilized version of those detected positions.
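The sketch below illustrates the stabilization idea: average the last N reported positions and consider the anchor stable once the average has stayed within a small radius for a given duration. It is a minimal sketch; the field names (historySize, stabilityRadius, stabilityDuration) are illustrative assumptions, not the addon's actual parameters.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Minimal sketch of anchor stabilization: average the recent detections to smooth out
// jitter, and report "stable" once the average stops moving for a configurable duration.
public class AnchorStabilizer
{
    private readonly Queue<Vector3> history = new Queue<Vector3>();
    private readonly int historySize;
    private readonly float stabilityRadius;
    private readonly float stabilityDuration;
    private float stableSince = -1f;

    public AnchorStabilizer(int historySize = 30, float stabilityRadius = 0.01f, float stabilityDuration = 2f)
    {
        this.historySize = historySize;
        this.stabilityRadius = stabilityRadius;
        this.stabilityDuration = stabilityDuration;
    }

    public Vector3 StabilizedPosition { get; private set; }
    public bool IsStable { get; private set; }

    // Call this every time the detection mechanism reports a new marker position.
    public void Report(Vector3 detectedPosition, float time)
    {
        history.Enqueue(detectedPosition);
        if (history.Count > historySize) history.Dequeue();

        // Average of the recent detections smooths out per-frame jitter.
        Vector3 sum = Vector3.zero;
        foreach (var p in history) sum += p;
        Vector3 average = sum / history.Count;

        // Restart the stability timer whenever the averaged position moves too much.
        if (Vector3.Distance(average, StabilizedPosition) > stabilityRadius)
            stableSince = time;

        StabilizedPosition = average;
        IsStable = stableSince >= 0f && (time - stableSince) >= stabilityDuration;
    }
}
```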
Network Connection
The network connection is managed by the Meta building blocks [BuildingBlock] Network Manager and [BuildingBlock] Auto Matchmaking.
[BuildingBlock] Auto Matchmaking sets the room name and the Fusion topology (Shared mode).
[BuildingBlock] Network Manager contains Fusion's NetworkRunner. The UserSpawner component, placed on it, spawns the user prefab when the user joins the room and handles the Photon Voice connection.
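As an illustration of this pattern, here is a minimal Shared Mode spawner sketch placed on the same game object as the NetworkRunner. It is not the sample's actual UserSpawner (which also handles the Photon Voice connection); the userPrefab field is an illustrative assumption.

```csharp
using Fusion;
using UnityEngine;

// Minimal sketch of a Shared Mode user spawner: each client spawns its own user prefab
// when it joins the session. Not the sample's UserSpawner implementation.
public class UserSpawnerSketch : SimulationBehaviour, IPlayerJoined
{
    [SerializeField] private NetworkObject userPrefab;

    public void PlayerJoined(PlayerRef player)
    {
        // In Shared Mode, each client only spawns the prefab for its own local player.
        if (player == Runner.LocalPlayer)
        {
            Runner.Spawn(userPrefab, Vector3.zero, Quaternion.identity, player);
        }
    }
}
```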
Network parameters
The user can change the network settings using the radial menu button under the left hand.

This can be useful to select a specific room or region, or to use a local server.
Please note that a watch is attached to the user's HardwareRig, so the user can interact with the watch and open the network settings menu even when not connected to the network.
Watch Interaction
The watch menu opens when the user looks at the watch. Buttons appear, and the user can trigger actions by pressing them.


For both scenarios, the buttons displayed above the watch trigger actions directly, while the settings buttons below the watch open configuration windows.
See Watch Menu Addon for more details.
Please note that the prefab spawned for each user contains 2 watches:
- one for the hand model driven by the controllers
- one for the hand model driven by the finger tracking
The RigPartVisualizer enables/disables the watches according to the current hand-tracking mode.

Colocation scenario
Regarding user colocation, the following cases are covered by the Anchors addon used here:
- Multiple people located in a single room, with one or more markers in the room.
- Several groups of people located in different rooms. Each room has one or more markers. A person does not change rooms.
- Multiple people moving from room to room in the same building, each room equipped with a marker.
In this sample, we illustrate the first case: the goal in the demo scene is to correctly position people located in the same room.
For this, at least one marker must be visible in the room where the participants are located.
As soon as a person connects and detects a marker, a network anchor is created at the marker's position as detected by the headset.
When another person connects, thanks to real-time network synchronization, they detect that another user has already detected this marker.
They are then repositioned in the scene so that their position corresponds to their actual location in the room.
The Marker_Based_Colocation scene is very simple, because passthrough is enabled and there is no 3D environment.
Each user who connects is represented by an avatar.
Before colocation is performed, the position of the other users' avatars does not match their actual position in the room.
Once colocation is completed, each user's avatar should correspond to their real-world position in the room.
Note: the same scene can in fact also be used for the second case listed above (several groups of people in several rooms), with no change.
Colocalization logic
The detailed mechanism of the colocalization is described in the colocalization chapter of the Anchors add-on documentation.
To summarize, several classes support this feature:
NetworkIRLRoomMember:
- This component, located on the network rig, manages a user's presence in a room.
- When a user connects, a random room identifier (RoomId) is assigned to them and registered in the IRLRoomManager.
NetworkIRLRoomAnchor:
- The markers detected by a user in a room are represented by this class. The AnchorId parameter corresponds to the marker's payload. The RoomId is based on the room id of the user who detects it (NetworkIRLRoomMember.RoomId).
IRLRoomManager:
- The scene has a room manager game object with this class, to manage all users and anchors detected in the room by those users.
- It implements the IRLAnchorTracking.IIRLAnchorTrackingListener interface in order to be notified whenever a marker is detected by the headset.
- It tracks all real-life rooms detected, and stores which anchors and members are present in each of them.
- It takes care of triggering colocation when a member sees an anchor with an anchor id previously detected in another room by another user: the two rooms are merged, and the networked anchors and members in the merged room are moved to make the real-life anchor match its pre-existing virtual counterpart's position.
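To give an intuition of the realignment step, the sketch below computes the yaw-and-translation transform that maps a locally detected marker pose onto the pre-existing networked anchor pose and applies it to a root transform. This is only an illustration of the math under the assumption of a yaw-only correction; it is not the add-on's implementation, which moves the networked anchors and members of the merged room instead.

```csharp
using UnityEngine;

// Minimal sketch of the alignment math: given the pose of the marker as detected
// locally and the pose of the pre-existing networked anchor for the same marker,
// compute the rigid transform that maps one onto the other and apply it to a root
// transform (here a hypothetical "rigRoot").
public static class ColocationAlignment
{
    public static void Align(Transform rigRoot, Pose locallyDetected, Pose networkedAnchor)
    {
        // Only rotate around the vertical axis so the floor stays level.
        float yawDelta = networkedAnchor.rotation.eulerAngles.y - locallyDetected.rotation.eulerAngles.y;
        Quaternion rotation = Quaternion.Euler(0f, yawDelta, 0f);

        // Rotate the rig around the detected marker, then translate it so that the
        // detected marker lands exactly on the networked anchor position.
        Vector3 pivotToRig = rigRoot.position - locallyDetected.position;
        rigRoot.position = networkedAnchor.position + rotation * pivotToRig;
        rigRoot.rotation = rotation * rigRoot.rotation;
    }
}
```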
Tracking settings
The user can modify the tracking settings using the radial-menu button located under the left hand.
The tracking settings UI is managed by the TrackingSettingsMenu class.
In this menu, they can choose which type of marker will be used for tracking. Note that the button related to ArUco markers will be interactable only if OpenCV is installed in the project.
The Marker Stability parameter sets the amount of time required for a marker to be considered stable at a fixed position (the expectedDetectedAnchorsStabilityDuration variable of IRLAnchorTracking).
Finally, for ArUco markers, it is necessary to specify the size of the markers being used. If the selected size does not match the actual marker, the anchor will appear either in front of or behind the real marker.

Scene mapping
In addition to colocalization, another use case for visual markers is to map virtual elements onto the real-world scene observed in AR (for example, to change the room's décor).
If the position of a marker in the real environment is known, then once the headset detects that marker we can deduce the user's position in the room and precisely overlay graphical elements.
A simple way to test this feature is to place an IRLRoomAnchor directly in the Unity scene and attach the associated visual as a child, so it depends on the position of that marker.
For more complex scenarios that use multiple markers, where it is difficult to measure their exact real-world positions in advance, you need to implement a calibration mechanism, which consists of:
- Placing the virtual elements in the real room at the desired positions.
- Detecting the positions of the markers with the headset.
- Saving the associated information (the positions of the virtual elements relative to the markers) so that later, when the associated marker is detected, the virtual element can be positioned correctly.
A similar calibration mechanism is used in the remote collaboration scenario.
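The core of such a calibration is storing the pose of the virtual element expressed in the marker's local frame, then replaying it whenever the marker is detected. The sketch below illustrates that math only; the type and method names are illustrative assumptions, not the sample's API. The saved struct could, for instance, be serialized with JsonUtility so it survives an application restart.

```csharp
using UnityEngine;

// Minimal sketch of marker-relative calibration: save the virtual element's pose in
// the marker frame, and reapply it later from any newly detected marker pose.
public static class MarkerCalibration
{
    [System.Serializable]
    public struct CalibrationData
    {
        public Vector3 localPosition;    // virtual element position in the marker frame
        public Quaternion localRotation; // virtual element rotation in the marker frame
    }

    // Called once, when the calibration is saved.
    public static CalibrationData Save(Pose markerPose, Transform virtualElement)
    {
        var inverseRotation = Quaternion.Inverse(markerPose.rotation);
        return new CalibrationData
        {
            localPosition = inverseRotation * (virtualElement.position - markerPose.position),
            localRotation = inverseRotation * virtualElement.rotation
        };
    }

    // Called every time the marker is detected afterwards.
    public static void Apply(Pose markerPose, Transform virtualElement, CalibrationData data)
    {
        virtualElement.position = markerPose.position + markerPose.rotation * data.localPosition;
        virtualElement.rotation = markerPose.rotation * data.localRotation;
    }
}
```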
Remote Collaboration scenario
The remote collaboration scenario is illustrated by the Marker_Based_AR_Collaboration scene.
This scenario offers several features to enable effective collaboration when users are in different locations:
- Real-time 3D object repositioning using a marker, so that a remote participant can know the on-site person's position relative to the equipment with the marker.
- Giant mode, which allows changing scale to gain an overall bird's-eye view of the scene, helping users understand the environment.
- The 'World Move' feature, which lets the remote user correctly position themselves in the scene relative to the on-site user.
- Streaming of the video captured by the headset camera.
Tracking settings
Like for the colocation scenario, the user can modify the tracking settings using the radial-menu button located under the left hand.
The tracking settings UI is managed by the RepositioningTrackingSettingsMenu class.
In this menu, they can choose which type of marker will be used for tracking. Note that the button related to ArUco markers will be interactable only if OpenCV is installed in the project.
For ArUco markers, it is necessary to specify the size of the markers being used. If the selected size does not match the actual marker, the anchor will appear either in front of or behind the real marker.
The Marker Stability parameter sets the amount of time required for a marker to be considered stable at a fixed position (the expectedDetectedAnchorsStabilityDuration variable of IRLAnchorTracking).
To perform the calibration, a 3D model representing the real equipment is spawned when the window opens (see the ModelManager section below to change the 3D model).
It is possible to adjust its transparency or even completely disable the visual.
The size of the object is also configurable, mainly for debugging purposes when the object is very large.

Calibration

The calibration process simply consists of positioning the virtual object at the same location as the real object and pressing the Save Calibration button.
This button is enabled only when at least one marker is visible to the headset camera (if the button is never interactable, check that at least one tracking system, QR code or ArUco, is enabled).
Once the calibration is saved, it does not need to be repeated on the next launch of the application: as soon as the marker used during the calibration is detected by the headset, the 3D model is spawned and repositioned according to the detected marker's position and the calibration data.
ModelManager
The ModelManager centralizes all information related to the 3D model that must be spawned during calibration or whenever the marker associated with a calibration is detected.
Any changes the user makes to the 3D model through the tracking settings interface are stored in this component (model scale, visibility & transparency).

To change the 3D model, simply modify the AnchoredModelSettings scriptable object reference.
The scriptable object allows you to specify:
- the prefab to spawn
- its scale
- the material's transparency
- the spawn position relative to the user's head position
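As an illustration, a settings asset of this kind could look like the sketch below. The field names mirror the list above but are assumptions, not the sample's exact AnchoredModelSettings definition.

```csharp
using UnityEngine;

// Hypothetical sketch of a settings ScriptableObject for the anchored model.
[CreateAssetMenu(menuName = "Samples/AnchoredModelSettingsSketch")]
public class AnchoredModelSettingsSketch : ScriptableObject
{
    [Tooltip("Prefab spawned during calibration or when the calibrated marker is detected")]
    public GameObject modelPrefab;

    [Tooltip("Uniform scale applied to the spawned model")]
    public float scale = 1f;

    [Range(0f, 1f), Tooltip("Transparency applied to the model's material")]
    public float transparency = 0.5f;

    [Tooltip("Spawn offset relative to the user's head position")]
    public Vector3 spawnOffsetFromHead = new Vector3(0f, -0.2f, 0.8f);
}
```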

The prefab of the 3D model object should contain the following components:
- Transform & NetworkTransform to synchronize the object position,
- Grabbable & NetworkGrabbable to grab the object (required during the calibration process),
- NetworkVisibilty to change & synchronize the model visibility (enabled or disabled),
- ModelScaleChanger to change the model scale (the SyncScale option in the NetworkTransform component should be enabled),
- ModelPositionChanger to control the object position & rotation.
In addition, it can be useful to add MagnetCoordinator and AttractableMagnet components so that the object can be easily positioned horizontally (on the floor or any other object with an AttractorMagnet).
Object repositioning
If a calibration has been done, when a marker is detected, the application will spawn the 3D model and reposition it according to the detected marker's position and the calibration data.
This repositioning is managed by the AnchorBasedObjectSynchronization class.

To prevent the model from being repositioned every time the anchor shifts slightly, a threshold is defined.
Depending on the size of the detected object, it may be necessary to adjust the minPositionChangeForUpdate parameter.
For example, a precision of 1 cm might be appropriate for small objects, whereas a precision of 10 cm may be sufficient for larger ones.
Please note that the repositioning algorithm is currently optimized to detect a static object and to compensate for head movements using a stabilization algorithm.
This is especially important when tracking ArUco markers, where marker positions, reported at a high frequency, can vary significantly.
It is less necessary with QR code detection through the Meta API, because the reported position is already stabilized.
To track a moving object, the history used to calculate a stabilized anchor position would need to be reduced, or even removed, so the system can react more quickly.
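The threshold test itself is simple: only reposition the model when the stabilized anchor pose has drifted further than minPositionChangeForUpdate from the pose used for the last repositioning. The sketch below illustrates this; the class and member names other than minPositionChangeForUpdate are illustrative assumptions.

```csharp
using UnityEngine;

// Minimal sketch of the repositioning threshold: ignore small anchor drifts and only
// report a change when the stabilized position has moved beyond the configured distance.
public class RepositioningThresholdSketch : MonoBehaviour
{
    [SerializeField] private float minPositionChangeForUpdate = 0.05f; // meters
    private Vector3 lastAppliedAnchorPosition;
    private bool hasAppliedOnce;

    public bool ShouldReposition(Vector3 stabilizedAnchorPosition)
    {
        if (!hasAppliedOnce ||
            Vector3.Distance(stabilizedAnchorPosition, lastAppliedAnchorPosition) > minPositionChangeForUpdate)
        {
            lastAppliedAnchorPosition = stabilizedAnchorPosition;
            hasAppliedOnce = true;
            return true; // caller moves the 3D model to match the new anchor pose
        }
        return false;
    }
}
```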
Giant mode
This feature allows the user to scale up and view the entire scene from a towering perspective.
When the user presses the 'Giant mode' button, it calls the Swap() method of the ChangeScale component (see the ScaleChanger game object in the scene).
The scale of the ScaleChanger game object in the scene defines the target scale when the feature is enabled.
To synchronize the user's scale for remote participants, the SyncScale option in the NetworkTransform component of the network rig is enabled.
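A minimal sketch of such a scale swap is shown below, assuming a rig root whose local scale is toggled between its normal value and a target "giant" scale. It stands in for the sample's ChangeScale component; network replication via NetworkTransform's SyncScale is omitted here.

```csharp
using UnityEngine;

// Minimal sketch of a "giant mode" toggle: swap the rig root scale between normal and
// a configurable giant scale each time Swap() is called.
public class ScaleSwapSketch : MonoBehaviour
{
    [SerializeField] private Transform rigRoot;
    [SerializeField] private Vector3 giantScale = new Vector3(10f, 10f, 10f);
    private readonly Vector3 normalScale = Vector3.one;
    private bool isGiant;

    public void Swap()
    {
        isGiant = !isGiant;
        rigRoot.localScale = isGiant ? giantScale : normalScale;
    }
}
```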
Locomotion
To effectively assist the on-site user, the remote user may need to move to a specific position, especially to perform hand-gesture interactions.
However, because users are in augmented reality, using traditional virtual reality style teleportation can feel strange, since their real environment does not visually move.
That's why we developed a different approach here: instead of the person moving themselves, the remote user moves the world.
But from the on-site user's perspective, the remote person appears to move normally.
When the 'World Move' feature is enabled, a grid is displayed (see the DisplayWorldLocomotionAnchor class) and the user simply needs to grab in front of them and move their hand to initiate movement.
They will have the visual impression of moving the world (hence the name of the feature), but technically, this action moves the user within the scene.
The distance the user moves is proportional to the velocity of their hand movements (see the SelfLocomotionGrabbable class for more details).
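The sketch below captures the core idea: while grabbing, the rig is moved by the opposite of the hand displacement (optionally amplified), which reads as dragging the world past you. It assumes the hand transform is a child of the rig root; the names and gain factor are illustrative assumptions, not the SelfLocomotionGrabbable implementation.

```csharp
using UnityEngine;

// Minimal "move the world" locomotion sketch: grabbing and moving the hand translates
// the rig in the opposite direction, so the world appears to be dragged by the hand.
public class WorldMoveSketch : MonoBehaviour
{
    [SerializeField] private Transform rigRoot;
    [SerializeField] private Transform hand;
    [SerializeField] private float gain = 2f; // amplification of the hand motion
    private Vector3 lastHandLocalPosition;
    private bool grabbing;

    public void BeginGrab()
    {
        grabbing = true;
        lastHandLocalPosition = rigRoot.InverseTransformPoint(hand.position);
    }

    public void EndGrab() => grabbing = false;

    private void Update()
    {
        if (!grabbing) return;
        // Track the hand in rig-local space so moving the rig does not feed back
        // into the measured hand displacement.
        Vector3 handLocal = rigRoot.InverseTransformPoint(hand.position);
        Vector3 handDeltaWorld = rigRoot.TransformVector(handLocal - lastHandLocalPosition);
        rigRoot.position -= handDeltaWorld * gain;
        lastHandLocalPosition = handLocal;
    }
}
```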
Camera Streaming
Each user can decide to start/stop streaming their camera to the other users using the watch radial menu.
When streaming is active, in addition to the text on the watch, visual feedback informs the user that they are sharing their camera.
Other remote users can then view this video stream on the screen that appears when the stream is launched.
This screen is located in front of the user sharing their camera, giving the impression of having a kind of portal onto their real environment.
By pressing the watch streaming button again, you can place the screen at a fixed position in the scene, avoiding having a screen that is always moving.
The Meta camera and streaming resolution can be changed using the settings button under the left hand.
Note that several users can share their camera simultaneously.
To stream the camera, we use the Photon Video SDK. It is a special version of the Photon Voice SDK, including support for video streaming.
As the requirement is very similar to screen sharing, we are reusing the developments that were included in the ScreenSharing Addon.
Notably, being able to use Android video surfaces alongside XR single pass rendering requires a specific shader, that is included in the add-on, as well as some additional components handling those specific textures.
Camera Permission
In order to access the Meta Quest camera, the application must request the required permissions.
This is managed by the WebCamTextureManager & PassthroughCameraPermissions components located on the WebCamTextureManagerPrefab game object.
Both scripts are provided by Meta in the Unity-PassthroughCameraAPISamples.
The WebCamTextureManagerPrefab scene game object is disabled by default and is automatically activated by the VoiceConnectionFadeManager when the Photon Voice connection is established. This is required to prevent running several authorization requests at the same time.
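For reference, a bare-bones runtime permission request could look like the sketch below. The sample relies on Meta's PassthroughCameraPermissions script instead; the horizonos.permission.HEADSET_CAMERA string is the permission used by the Passthrough Camera API samples and should be verified against the current SDK documentation.

```csharp
using UnityEngine;
using UnityEngine.Android;

// Minimal sketch of a runtime permission request for the passthrough camera on Quest.
public class CameraPermissionSketch : MonoBehaviour
{
    private const string HeadsetCameraPermission = "horizonos.permission.HEADSET_CAMERA";

    private void Start()
    {
        if (!Permission.HasUserAuthorizedPermission(HeadsetCameraPermission))
        {
            // Triggers the system permission dialog on device.
            Permission.RequestUserPermission(HeadsetCameraPermission);
        }
    }
}
```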
Camera Video Emission
Camera streaming is managed by the WebCamEmitter class of the ScreenSharing addon.
The user can start/stop the camera streaming using the watch button.
In addition, the streaming user can decide whether the screen can be anchored in the scene, which avoids having a screen that is always moving.
For this, the streaming watch button calls the WatchUIManager ToggleEmitting() method when the user touches the watch.
This toggles between the following 3 states:
- no streaming,
- streaming screen following the user's head,
- streaming screen at a fixed position.
To start or stop the camera streaming, the NetworkWebcam component located on the user prefab calls the WebCamEmitter ToggleEmitting() method.
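A minimal sketch of this three-state cycle is shown below. The enum values and class simply mirror the states listed above and are illustrative assumptions, not the sample's actual WatchUIManager / NetworkWebcam implementation.

```csharp
// Minimal sketch of the three-state streaming toggle.
public enum StreamingState
{
    NotStreaming,        // camera is off
    ScreenFollowingHead, // streaming, screen follows the user's head
    ScreenAnchored       // streaming, screen fixed in the scene
}

public class StreamingToggleSketch
{
    public StreamingState State { get; private set; } = StreamingState.NotStreaming;

    // Called each time the user presses the streaming watch button.
    public StreamingState ToggleEmitting()
    {
        State = State switch
        {
            StreamingState.NotStreaming => StreamingState.ScreenFollowingHead,
            StreamingState.ScreenFollowingHead => StreamingState.ScreenAnchored,
            _ => StreamingState.NotStreaming
        };
        return State;
    }
}
```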
Camera Video Reception
The ScreensharingReceiver scene game object manages the reception of the camera stream.
It waits for new voice connections, on which the video stream will be transmitted. Upon such a connection, it creates a video user and a material containing the video user texture.
Then, with EnablePlayback(), it passes it to the ScreenSharingScreen, which manages the screen renderer visibility: the screen will then change its renderer material to this new one.
The special case to be managed in this sample, compared with the default ScreenSharing addon, is that there is one video reception screen per user.
So, the ScreenSharingScreen and NetworkWebcam components are located on the networked user prefab, and the ScreenSharingEmitter passes the network object Id of the object containing the target screen in the communication channel user data. This way, the ScreensharingReceiver looks for this object instead of using a default static screen.
The NetworkWebcam component:
- references itself on the WebCamEmitter as the networkScreenContainer,
- configures the screen mode (whether or not the video stream is displayed for the local user),
- determines whether users stream the camera by default when they join the room.
Camera Resolution
The user can change the Meta camera resolution at runtime using the streaming settings menu (button under the left hand).

A UI is then displayed, showing the different resolutions supported by the Meta Camera API.
The streaming resolution is automatically adapted according to the Meta camera resolution setting thanks to the WebcamEmitter InitializeRecorder() method, which is called when a new stream is started.
If a stream is in progress when the user changes the resolution, the transmission is stopped and then automatically restarted.
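The restart behavior follows a simple stop / apply / restart pattern, sketched below. The class and method names are illustrative assumptions standing in for the sample's WebCamEmitter / InitializeRecorder logic.

```csharp
using UnityEngine;

// Minimal sketch of restarting the stream when the camera resolution changes.
public class ResolutionChangeSketch : MonoBehaviour
{
    public bool IsStreaming { get; private set; }

    public void OnResolutionSelected(int width, int height)
    {
        bool wasStreaming = IsStreaming;
        if (wasStreaming) StopStreaming();
        ApplyCameraResolution(width, height);
        if (wasStreaming) StartStreaming(); // recorder would be re-initialized here
    }

    private void ApplyCameraResolution(int width, int height)
        => Debug.Log($"Requesting camera resolution {width}x{height}");

    private void StartStreaming() => IsStreaming = true;
    private void StopStreaming() => IsStreaming = false;
}
```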
Important notes about application configuration and deployment
The Video SDK is incompatible with some options that the Meta tooling might suggest you activate (see the included PhotonVoice/readme-video.md for more details on the supported configurations).
To configure the project properly, you need:
- Graphics jobs disabled
- Multithreading rendering disabled
Used XR Addons & Industries Addons
To make it easy for everyone to get started with their 3D/XR project prototyping, we provide a comprehensive list of reusable addons.
See Industries Addons for more details.
Here are the addons we've used in this sample.
XRShared
The XRShared addon provides the base components to create an XR experience compatible with Fusion.
It is in charge of the users' rig parts synchronization, and provides simple features such as grabbing and teleport.
See XRShared for more details.
Anchors
We use the Anchors addon to detect visual markers.
See Anchors Addon for more details.
Watch Menu
We use the Watch menu addon to allow the user to access the features provided by the sample.
Note that we use subclasses of RadialMenuButtonAction & RadialMenuButtonWindows so that the buttons related to streaming are displayed only if the Photon Video SDK is installed.
See Watch Menu Addon for more details.
Voice Helpers
We use the VoiceHelpers addon for the voice integration.
See VoiceHelpers Addon for more details.
Screen Sharing
We use the ScreenSharing addon to stream the Meta Quest camera.
See ScreenSharing Addon for more details.
Meta Core Integration
We use the MetaCoreIntegration addon to synchronize users' hands.
See MetaCoreIntegration Addon for more details.
XRHands synchronization
The XR Hands Synchronization addon shows how to synchronize the hand state of hands tracked by the XR Hands package (including finger tracking), with high data compression.
See XRHands synchronization Addon for more details.
Feedback
We use the Feedback addon to centralize the sounds used in the application and to manage haptic & audio feedback.
See Feedback Addon for more details.
3rd Party Assets and Attributions
The sample is built around several awesome third party assets:
- Oculus Integration
- Oculus Lipsync
- Oculus Sample Framework hands
- Meta Unity Passthrough Camera API Samples: "Copyright Meta Platform Technologies, LLC and its affiliates. All rights reserved."
- Meta Unity MRUK Sample: "Copyright Meta Platform Technologies, LLC and its affiliates. All rights reserved."
- QuestArUcoMarkerTracking: MIT License, Copyright (c) 2025 Takashi Yoshinaga
- OpenCV for Unity
- Sounds