When we set out to build Cosmos, we knew that creating truly immersive digital spaces for remote teams and educational institutions would require solving complex technical challenges. Traditional video conferencing platforms are designed for scheduled meetings with clear start and end times. Cosmos, however, needed to support an entirely different paradigm: always-on spatial environments where people can move around, form spontaneous conversations, and collaborate naturally.
In this technical deep-dive, we'll explain the architecture and innovations that make Cosmos uniquely suited for all-day use, even on standard hardware.
The Challenge: Instant Spatial Communication
The fundamental technical challenge we faced was speed. In physical spaces, you can walk up to a colleague and start talking instantly. Online, a typical video call can take 30 seconds or more to set up, which disrupts the natural flow of interaction.
For Cosmos to replicate the feeling of physical presence, we needed to reduce this to milliseconds. When you approach someone in a Cosmos space, the video connection must be established in under 50 ms to create the sense of genuine spatial interaction.
This requirement shaped our entire technical approach.

Core Architecture
Cosmos operates on a dual-server architecture that separates environmental state from video handling:
Game Servers: The Digital Space
Our game servers maintain the complete state of each virtual environment - tracking user positions, interactions, and space configurations. These lightweight servers:
- Process user movements and proximity in real-time
- Determine when users are close enough to trigger voice/video connections
- Manage the visibility and interactive elements of the space
- Support customisable layouts for different use cases (offices, campuses, events)
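The proximity check at the heart of this list can be illustrated with a minimal sketch. It assumes 2D positions and a fixed trigger distance; the `CONNECT_RADIUS` value and the function name are hypothetical, and the brute-force pairwise comparison stands in for whatever spatial indexing a production server would use.

```python
import math
from itertools import combinations

# Hypothetical value: the actual trigger distance is not stated in this post.
CONNECT_RADIUS = 3.0


def pairs_in_range(positions, radius=CONNECT_RADIUS):
    """Return the set of user pairs whose distance is within `radius`.

    `positions` maps a user id to an (x, y) coordinate in the space.
    Each pair is returned once, with the ids in sorted order.
    """
    in_range = set()
    for (a, pos_a), (b, pos_b) in combinations(sorted(positions.items()), 2):
        if math.dist(pos_a, pos_b) <= radius:
            in_range.add((a, b))
    return in_range


positions = {"ana": (0.0, 0.0), "ben": (2.0, 1.0), "chen": (10.0, 10.0)}
print(pairs_in_range(positions))  # {('ana', 'ben')}
```

Comparing every pair costs O(n²) per update, which is fine for illustration but would likely be replaced by a grid or similar structure in a busy space.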
Video Relay Servers: Dynamic Connections
Our globally distributed video servers handle the audio/video connections between users. Unlike traditional conferencing platforms that establish a single large call, Cosmos:
- Dynamically activates and deactivates video feeds based on proximity
- Prioritises connections for users currently in conversation
- Maintains low-latency audio even when video quality needs adjustment
- Scales connections fluidly as users move through the space
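One way to picture proximity-driven activation is as a set difference between the connections that are live now and the pairs that are currently in range. The sketch below is an illustration under that assumption, with hypothetical names; it does not describe the actual relay protocol.

```python
def plan_connection_changes(active, in_range):
    """Compare live connections with the pairs that are now in range.

    Both arguments are sets of (user_a, user_b) tuples.
    Returns (to_connect, to_disconnect).
    """
    to_connect = in_range - active      # newly close enough: activate feeds
    to_disconnect = active - in_range   # moved apart: deactivate feeds
    return to_connect, to_disconnect


active = {("ana", "ben"), ("ben", "chen")}
in_range = {("ana", "ben"), ("ana", "dev")}
to_connect, to_disconnect = plan_connection_changes(active, in_range)
print(to_connect)     # {('ana', 'dev')}
print(to_disconnect)  # {('ben', 'chen')}
```

Pairs that appear in both sets are left untouched, so an ongoing conversation is not interrupted when other users move around it.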
Both server types are deployed across multiple regions (North America, Latin America, Europe, Asia, Japan, and Australia) to ensure low-latency experiences for teams worldwide.
Video Quality Optimisation
Advanced Codec Selection
We use VP9 as our default video codec on every platform that can encode it, which provides:
- Approximately 50% higher compression efficiency than VP8
- Better quality at lower bitrates
- Reduced bandwidth requirements without sacrificing visual clarity
For browsers that don't support VP9 encoding (such as Firefox), we automatically fall back to VP8. This codec-switching happens seamlessly behind the scenes.
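The fallback rule can be sketched as a simple preference-ordered lookup. This is a minimal illustration, assuming each side reports the MIME types it can handle; the function name and argument shapes are hypothetical. In a browser, such lists would typically come from the WebRTC capability APIs (for example `RTCRtpSender.getCapabilities("video")`).

```python
# Preference order taken from the text: VP9 first, VP8 as the fallback.
PREFERRED_CODECS = ("video/VP9", "video/VP8")


def choose_codec(sender_can_encode, receiver_can_decode):
    """Pick the first preferred codec that both sides support.

    Both arguments are iterables of MIME types such as "video/VP9".
    Returns None when there is no common codec.
    """
    encode = {c.lower() for c in sender_can_encode}
    decode = {c.lower() for c in receiver_can_decode}
    for codec in PREFERRED_CODECS:
        if codec.lower() in encode and codec.lower() in decode:
            return codec
    return None


everything = ["video/VP8", "video/VP9"]
print(choose_codec(everything, everything))     # video/VP9
print(choose_codec(["video/VP8"], everything))  # video/VP8
```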
Intelligent Resource Management
Cosmos continuously monitors system performance and makes real-time adjustments:
- Frame Rate Monitoring: We target 30 frames per second for smooth movement in spatial environments
- Automatic Low Resource Mode: When rendering falls below 20 FPS, we trigger optimisations:
  - Reduce the number of concurrent video streams
  - Lower the resolution of peripheral videos
  - Prioritise screen shares at higher quality
  - Compress world textures and assets
  - Adjust frame rates for optimal performance
This adaptive approach allows Cosmos to run smoothly even when:
- Running on older hardware
- Handling large numbers of participants
- Operating in low-bandwidth environments
- Being used for all-day sessions
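The low-resource trigger described above can be sketched as a rolling frame-rate monitor. The 30 FPS target and 20 FPS threshold come from the text; the averaging window and the recovery threshold (a hysteresis margin that stops the mode from flapping on and off) are assumptions added for illustration, as are all names.

```python
from collections import deque

TARGET_FPS = 30        # from the text
LOW_RESOURCE_FPS = 20  # from the text
RECOVERY_FPS = 26      # assumption: leave the mode only once clearly recovered
WINDOW = 60            # assumption: number of recent frames to average over


class FrameRateMonitor:
    def __init__(self):
        self.frame_times = deque(maxlen=WINDOW)  # seconds spent on each frame
        self.low_resource_mode = False

    def average_fps(self):
        total = sum(self.frame_times)
        return len(self.frame_times) / total if total > 0 else float(TARGET_FPS)

    def record_frame(self, seconds):
        """Record one frame duration and update the low-resource flag."""
        self.frame_times.append(seconds)
        fps = self.average_fps()
        if not self.low_resource_mode and fps < LOW_RESOURCE_FPS:
            self.low_resource_mode = True
        elif self.low_resource_mode and fps >= RECOVERY_FPS:
            self.low_resource_mode = False
        return self.low_resource_mode


monitor = FrameRateMonitor()
for _ in range(60):
    monitor.record_frame(1 / 30)  # healthy rendering
print(monitor.low_resource_mode)  # False
for _ in range(60):
    monitor.record_frame(1 / 15)  # sustained slowdown
print(monitor.low_resource_mode)  # True
```

Averaging over a window means a single slow frame in an otherwise healthy stream does not trigger the mode; only a sustained drop does.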

Hardware Considerations
To help teams prepare for optimal experiences, we recommend these hardware configurations:
Minimum Requirements
- Dual-core processor
- 2 GB memory
Suitable for:
- Up to 20 participants in meetings
- Presenting non-video content
Recommended Specifications
- Modern CPU:
  - 10th Gen Intel i3, i5, or i7 (Ice Lake and up)
  - AMD 3000 series Ryzen 5 or 7
- 4 GB memory
Suitable for:
- High-quality video and audio
- Presenting content with videos and animations
- Using visual camera effects
- Multitasking during meetings
Optimal Configuration
- High-performance CPU:
  - 11th Gen Intel i5 or i7
  - AMD 5000 series Ryzen 5 or 7
  - Apple Silicon M1
- 1080p camera
- Graphics card with WebGL 2.0 support
Suitable for:
- Full HD quality for camera video and presentations
- Access to all visual effects in the highest quality
- Heavy multitasking during use
Pop-out Mode: The Secret to All-Day Performance
One of our most innovative features is the pop-out mode in our desktop applications. When users minimise Cosmos to focus on other work, the application:
- Clears most assets from memory
- Stops rendering the 3D space
- Maintains only essential connection information
- Keeps a small, efficient interface for status changes and incoming conversations
- Automatically expands when interaction is needed
- Shows screen shares without requiring the full application to open
This approach dramatically reduces CPU, GPU, and memory usage during periods of lower activity while maintaining presence in the space.
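The pop-out behaviour can be modelled as a small state change. The sketch below is a toy model only: the class, the asset names, and the choice of which assets count as essential are all hypothetical, chosen to show the release-and-restore pattern and not the real desktop client.

```python
class DesktopClient:
    """Toy model of a desktop client with a pop-out mode (all names hypothetical)."""

    ESSENTIAL_ASSETS = {"status_icons"}  # assumption: what the small interface needs
    FULL_ASSETS = {"status_icons", "world_textures", "avatar_models"}

    def __init__(self):
        self.popped_out = False
        self.rendering = True
        self.loaded_assets = set(self.FULL_ASSETS)

    def pop_out(self):
        """Release heavy assets and stop rendering the space."""
        self.popped_out = True
        self.rendering = False
        self.loaded_assets &= self.ESSENTIAL_ASSETS

    def expand(self):
        """Reload assets and resume rendering."""
        self.loaded_assets = set(self.FULL_ASSETS)
        self.rendering = True
        self.popped_out = False

    def on_incoming_conversation(self):
        """Expand automatically when someone starts a conversation."""
        if self.popped_out:
            self.expand()


client = DesktopClient()
client.pop_out()
print(client.rendering, sorted(client.loaded_assets))  # False ['status_icons']
client.on_incoming_conversation()
print(client.rendering, len(client.loaded_assets))     # True 3
```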

The Result: Natural Digital Presence
These technical optimisations work together to create an experience that feels remarkably different from traditional video conferencing:
- Instantaneous connections - Conversations start in under 50 ms when you approach someone
- Flexible group formation - Teams can split and merge conversations naturally
- All-day sustainability - Low resource usage during inactive periods
- Consistent quality - Automatic adjustments based on network and hardware conditions
- Universal accessibility - Works across a wide range of devices and connection types

By solving these fundamental technical challenges, Cosmos creates digital spaces that support the natural ebb and flow of teamwork and learning - whether you're running a remote company, teaching a distributed class, or building an online community.