visionOS: Making 3D objects tappable

This article shows how to add a .usdz 3D model to an immersive space in the user's room, and how to detect a tap on that model (the user looks at it and pinches to select).

Before you start: if you do not know how to display 3D assets in the user's space, read my other article first: VisionOS app game development 101 (2D view, Immersive Space, tappable 3D objects, WindowGroup)

Let's assume our starting point is a spatial view for a visionOS application that displays a 3D robot asset:

import SwiftUI
import RealityKit

struct ContentSpace: View {
    
    @State private var loaded3DAsset: Entity? = nil
    
    var body: some View {
        
        RealityView { content in
            
            loaded3DAsset = await loadFromRealityComposerProject(nodeName: "robot_walk_idle",
                                                                 sceneFileName: "robot_walk_idle.usdz")
            loaded3DAsset?.name = "robot"

            loaded3DAsset?.scale = .init(x: 0.1, y: 0.1, z: 0.1)
            loaded3DAsset?.position = .init(x: 0, y: 0, z: -3)
            
            // TODO: Collision
            // TODO: Allow user to tap on it
            // TODO: add lighting
            
            guard let loaded3DAsset else {
                return
            }
            
            content.add(loaded3DAsset)
            
        }
        
    }
    
}

The view above is placed in an immersive space, which can be opened or closed from our 2D SwiftUI view:

import SwiftUI

@main
struct VisionOSDemoApp: App {
    
    var body: some Scene {
        
        WindowGroup {
            ContentView()
        }
        .windowStyle(.plain)
        
        ImmersiveSpace(id: "robotSpace") {
            ContentSpace()
        }
        
    }
    
}

In our SwiftUI view, we can use the id of that immersive space, robotSpace, together with the following environment values to open or close it:

@Environment(\.openImmersiveSpace) private var openImmersiveSpace

@Environment(\.dismissImmersiveSpace) private var dismissImmersiveSpace
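As a rough illustration of how these two environment values might be used, here is a minimal sketch of a 2D view with one button that toggles the space. The ContentView shown here is hypothetical (the real one is not part of this article), and the isSpaceOpen flag and button labels are my own assumptions.

```swift
import SwiftUI

// Hypothetical 2D window content; the article's real ContentView is not shown.
struct ContentView: View {

    @Environment(\.openImmersiveSpace) private var openImmersiveSpace
    @Environment(\.dismissImmersiveSpace) private var dismissImmersiveSpace

    // Assumed bookkeeping flag: whether we believe the space is currently open.
    @State private var isSpaceOpen = false

    var body: some View {
        Button(isSpaceOpen ? "Hide robot" : "Show robot") {
            Task {
                if isSpaceOpen {
                    await dismissImmersiveSpace()
                    isSpaceOpen = false
                } else {
                    // The id must match the ImmersiveSpace declared in the App.
                    let result = await openImmersiveSpace(id: "robotSpace")
                    if case .opened = result {
                        isSpaceOpen = true
                    }
                }
            }
        }
        .padding()
    }
}
```

Both actions are asynchronous, which is why they are awaited inside a Task. Opening can fail or be cancelled by the user, so the sketch only flips the flag when the result is .opened.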

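To let the user tap the object, first allow input on the entity before you add it to the scene:

loaded3DAsset?.components[InputTargetComponent.self] = InputTargetComponent(allowedInputTypes: .all)

Gestures are only delivered to entities that also have a collision shape. If your loading code does not already add one, you can generate it from the model's meshes:

loaded3DAsset?.generateCollisionShapes(recursive: true)

Tapping here means the user looks at the object and pinches their fingers to select it.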

If you want the user to physically touch the object with their hands, you need collision detection and must also represent the user's hands as entities. That is not covered in this article!

Then attach a tap gesture to the RealityView. This works much like tap detection in 2D views, except that the gesture is targeted at entities:

    .gesture(TapGesture()
        .targetedToAnyEntity()
        .onEnded({ tap in
            let tappedNode = tap.entity
            
        }))

Now the view for our 3D immersive space contains the following code:

import SwiftUI
import RealityKit

struct ContentSpace: View {
    
    @State private var loaded3DAsset: Entity? = nil
    
    var body: some View {
        
        RealityView { content in
            
            loaded3DAsset = await loadFromRealityComposerProject(nodeName: "robot_walk_idle",
                                                                 sceneFileName: "robot_walk_idle.usdz")
            loaded3DAsset?.name = "robot"

            loaded3DAsset?.scale = .init(x: 0.1, y: 0.1, z: 0.1)
            loaded3DAsset?.position = .init(x: 0, y: 0, z: -3)
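            loaded3DAsset?.components[InputTargetComponent.self] = InputTargetComponent(allowedInputTypes: .all)
            // Gestures need a collision shape to hit; skip this if your loader already adds one.
            loaded3DAsset?.generateCollisionShapes(recursive: true)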
            
            guard let loaded3DAsset else {
                return
            }
            
            content.add(loaded3DAsset)
            
        }
        .gesture(TapGesture()
            .targetedToAnyEntity()
            .onEnded({ tap in
                let tappedNode = tap.entity
                
            }))
        
    }
    
}

Since one node can contain many child nodes, the user may tap a child node instead of the robot node itself, so we need to walk through the hierarchy until we reach the node we are looking for.

For example, when the user taps the robot we added to the 3D space, they may actually hit the robot's head, which is its own child node. The helper functions below look up a node by name. Only findEntity is needed for taps; the first two do the same lookup for collision events.

/// Checks whether a collision event contains one of a list of named entities.
func eventHasTargets(event: CollisionEvents.Began, matching names: [String]) -> Entity? {
    for targetName in names {
        if let target = eventHasTarget(event: event, matching: targetName) {
            return target
        }
    }
    return nil
}

/// Checks whether a collision event contains an entity that matches the name you supply.
func eventHasTarget(event: CollisionEvents.Began, matching targetName: String) -> Entity? {
    // Check entity A first, then entity B.
    findEntity(within: event.entityA, matching: targetName)
        ?? findEntity(within: event.entityB, matching: targetName)
}

/// Finds an entity with the given name, trying the parent lookup first and the descendant lookup second.
func findEntity(within: Entity, matching targetName: String) -> Entity? {
    within[parentMatching: targetName] ?? within[descendentMatching: targetName]
}
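The helpers above use parentMatching and descendentMatching subscripts on Entity. As far as I know these are not part of RealityKit itself, so the code needs an extension that provides them. Below is a minimal sketch of what such an extension could look like. It assumes an exact name comparison and that the entity itself counts as a match; the extension the helpers were written against may behave differently (for example, by matching on a substring).

```swift
import RealityKit

// Assumed extension: these subscripts are not RealityKit API.
extension Entity {

    /// Returns this entity, or its nearest ancestor, whose name equals `targetName`.
    subscript(parentMatching targetName: String) -> Entity? {
        if name == targetName {
            return self
        }
        // Walk up the hierarchy; stops with nil when there is no parent left.
        return parent?[parentMatching: targetName]
    }

    /// Returns this entity, or its first descendant in depth-first order,
    /// whose name equals `targetName`.
    subscript(descendentMatching targetName: String) -> Entity? {
        if name == targetName {
            return self
        }
        for child in children {
            if let match = child[descendentMatching: targetName] {
                return match
            }
        }
        return nil
    }
}
```

With this in place, a tap that lands on a child such as the robot's head resolves to the robot through the parent lookup, because the robot is an ancestor of the head.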

Remember that we named the robot entity robot? We can check for that name:

    .gesture(TapGesture()
        .targetedToAnyEntity()
        .onEnded({ tap in
            let tappedNode = tap.entity
            if let tappedRobotEntity = findEntity(within: tappedNode, matching: "robot") {
                // tapped the robot
                // TODO
            }
        }))
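What goes in place of the TODO depends on your app. As one possible reaction, here is a small sketch that toggles the robot between its normal size and a slightly larger one each time it is tapped. The function name toggleHighlight and both scale values are hypothetical; the default of 0.1 simply mirrors the scale set earlier.

```swift
import RealityKit

/// Hypothetical tap reaction: switches the entity between a normal and an enlarged uniform scale.
func toggleHighlight(on robot: Entity,
                     normalScale: Float = 0.1,
                     highlightedScale: Float = 0.15) {
    // Compare against the midpoint so small floating-point differences do not matter.
    let midpoint = (normalScale + highlightedScale) / 2
    let isHighlighted = robot.scale.x > midpoint
    let newScale = isHighlighted ? normalScale : highlightedScale
    robot.scale = SIMD3<Float>(repeating: newScale)
}
```

Inside the gesture handler you could then call toggleHighlight(on: tappedRobotEntity) where the TODO is.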

And that's how you detect a tap on 3D objects.