How to Build a Therapeutic Alliance Through a Screen

This is an excerpt from SWTP CEUs’ 3-CE Telehealth in Social Work Practice course.

The first time you meet a new client via video, you’ll notice something feels different. Not worse, not better—just different. You can’t read their full body language. You can’t gauge how they fill the physical space of your office. You don’t have those first few minutes of small talk while they hang up their coat or settle into the chair. Instead, you have a face in a box on your screen, a background that may or may not reveal something about their life, and the subtle awareness that you’re both performing your attention for a camera.

Therapeutic alliance—that sense of connection, trust, and collaborative partnership—absolutely forms through video. Research consistently shows that clients can and do develop strong therapeutic relationships in telehealth. But the process looks different, and understanding those differences helps you adapt effectively.

What changes on video is mostly what you can see and how you read it. In your office, you’d notice if someone sat on the edge of the chair, clutched their bag, or positioned themselves near the door. Those body cues communicate anxiety, defensiveness, or readiness to flee. On video, you see shoulders and head. Everything below that is invisible. When Danielle felt anxious in office sessions, she’d pull her knees to her chest and wrap her arms around them—a clear signal to slow down and ground. On video, her face would go blank and still. Her social worker had to learn new cues: rapid blinking, shoulders rising toward her ears, jaw tightening. Same anxiety, different visible manifestation.

This means you work harder at reading faces. Microexpressions matter more. You pay closer attention to tone of voice because you’re missing postural information. You name what you observe more explicitly: “I notice your shoulders just tensed up when we started talking about your mother. What’s happening for you right now?” In the office, you might let body language speak for itself. On video, you verbalize your observations to check accuracy.

Eye contact becomes a technical puzzle rather than a natural behavior. When you look at your client’s eyes on your screen, you’re looking away from your camera. When you look at your camera to simulate eye contact, you can’t see your client’s face. There’s no perfect solution. Most clinicians position their camera as close to eye level as possible and as close as possible to where the client’s face appears on screen, then glance at the camera periodically during meaningful moments: when expressing empathy, during important disclosures, or when they want to emphasize connection. Otherwise, focus on the client’s face so you can read their responses.

Some clients find video eye contact easier than in-person. Autistic clients sometimes report that video feels less intense because they can look at the screen without feeling pressured to maintain constant eye contact. Socially anxious clients may relax slightly because video provides a bit of comfortable distance. Others find it alienating and impersonal. Ask clients about their experience. Don’t assume video is inferior; it’s different, and different works better for some people.

Your visible environment matters more than you might think. In your office, clients see your degree on the wall, your bookshelf, your professional space designed to communicate competence and warmth. At home, they might see your kitchen, your laundry, your family photos, or your blank white wall. All of these send messages.

Neutral and professional works best. That doesn’t mean sterile: a partly visible bookshelf or a plant adds warmth and conveys professionalism while feeling human. But you probably want to avoid having your teenager’s soccer trophy visible, political posters behind you, or cluttered spaces that distract from clinical focus. Virtual backgrounds often look artificial and can be distracting; many clients report that they look odd and make it harder to focus on you. A real, simple, uncluttered background usually works better.

Position your camera at eye level, about an arm’s length away. Laptops on desks often create an unflattering angle looking up your nose. Stack books under your laptop or use a separate webcam positioned correctly. Face a light source—a window or lamp in front of you—rather than backlighting yourself into a silhouette. These technical details affect whether clients can see your face clearly enough to read your expressions.

Silence feels longer on video. You know those therapeutic pauses where you sit with a client and let them process? On video, clients can’t tell if you’re thinking deeply or if the connection froze. A fifteen-second pause that feels natural in person can feel awkward and disconnecting remotely.

You adapt by naming the silence: “I’m going to pause here and give you space to think about that.” Check in earlier than you might in-person: “Take your time—I’m right here with you.” Use slight nonverbal gestures during silence—nodding, leaning in slightly, maintaining soft eye contact with the camera—to convey presence. You’re not rushing to fill the silence, just making it clear the silence is intentional and you’re still engaged.

When Jennifer paused for fifteen seconds during an in-person session, you’d watch her shoulders drop and tears well in her eyes. You’d see the emotion building and moving through her body. On video, you see her face freeze into a neutral expression and you wonder if she’s dissociating, thinking, or losing connection. Learning to tolerate that ambiguity while staying present takes practice. Sometimes you check in: “You got quiet. What’s happening right now?” Sometimes you wait, trusting the process. Developing that clinical judgment through a screen requires intentionality.

Telepresence—the felt sense of being present with someone remotely—takes conscious cultivation. Your attention wanders more easily when you’re staring at a screen for hours. You might glance at another notification, check the time, or mentally drift in ways that wouldn’t happen in your office where another human is physically present. Clients sense this. They know when you’re fully engaged versus partially distracted.

Minimize your own distractions ruthlessly. Close email. Silence your phone. Put other browser tabs away. Hide your self-view (you don’t stare at yourself during in-person therapy, so why do it on video?). Use speaker view for individual sessions rather than gallery view. Create the conditions for your own presence, because presence is what enables connection.

On the client side, help them optimize presence too. When Keisha joined sessions from her kitchen with family members visible behind her, moving around, making noise, her social worker noticed her divided attention. “I wonder if we could find a more private, quiet space for our sessions? I want to make sure you can really focus without distractions.” Keisha moved to her bedroom with the door closed, and her engagement immediately deepened.

Research on telehealth alliance is reassuring. Multiple studies examining therapeutic outcomes for depression, anxiety, PTSD, and other common presenting issues show no significant difference between video therapy and in-person treatment. A 2022 meta-analysis of over 50 studies found that client satisfaction with telehealth is high once people actually try it—initial skepticism often resolves within a few sessions. Dropout rates are comparable to in-person therapy. Alliance ratings from clients and clinicians are equivalent.

What this means practically: when Amber expressed nervousness about starting therapy via video because “it won’t feel as real,” her social worker could respond with confidence. “I appreciate you sharing that concern. What research shows is that most people find video therapy just as helpful as in-person once they try it. The relationship we build is real regardless of whether we’re in the same room. Let’s try it for a few sessions and check in about how it’s feeling for you.”

The clinical skills that create alliance in-person translate to video—empathy, genuineness, unconditional positive regard, collaborative goal-setting, cultural responsiveness. You’re not learning a completely different way to practice. You’re adapting how you express those core relational capacities through a different medium.

Small adaptations enhance connection when technology creates distance. Slow your pace slightly—audio delays can make rapid exchanges feel disjointed. Over-express empathy nonverbally since body language doesn’t translate as fully. Check in more frequently: “I’m wondering what that brings up for you?” Use the chat feature strategically—drop a grounding exercise, coping skill, or resource link mid-session. Share your screen to collaboratively review documents, safety plans, or worksheets. These adaptations compensate for the reduced nonverbal information and create active, dynamic engagement.

Consider Anthony, who volunteers at a community center and talks about feeling isolated. In-person, you might lean forward, make warm eye contact, and let a moment of connection land. On video, you do that, but you also verbalize it more: “What you just described—showing up for others even when you’re struggling yourself—that takes real strength. I want you to take that in.” You’re not being patronizing or overly instructional. You’re making explicit what might be communicated entirely nonverbally in your office.

Case in point: A trauma client you’ve been seeing in-person for eight months requests switching to video permanently because commuting has become difficult. You’ve noticed during trauma processing that she sometimes dissociates—you can see it in how her body goes still, how her gaze becomes unfocused, how she seems to leave the room mentally even while sitting in front of you. You’re concerned about managing dissociation remotely where you can’t physically ground her or read her body as completely.

You address this directly: “I want to talk through what switching to video would mean for our trauma work. When you dissociate, I’ve been able to notice and help ground you by handing you something textured, having you press your feet into the floor, adjusting the lights. On video, I can’t do those things in the same way. I can coach you through grounding, but you’d need to do it yourself. How do you think about managing that?”

This is a real conversation about how alliance and clinical effectiveness translate—or don’t—to telehealth. Maybe she’s developed enough internal grounding capacity that remote coaching works fine. Maybe you pilot video sessions while keeping the option to return to in-person if needed. Maybe you determine trauma processing stays in-person while maintenance sessions can be remote. The key is transparent discussion about how the format affects the work, not assuming video is automatically inferior or pretending no adaptation is needed.

Therapeutic alliance through a screen is real. It requires intentionality, adaptation, and sometimes explicit conversation about what’s different. But when you establish presence, read clients carefully within the constraints of video, and maintain the core relational stance that defines good social work practice, connection happens.

Continue the course for credit at SWTP CEUs, an ASWB ACE-approved CE provider. To get this and every course on the site with one purchase, use the Unlimited CE Pass.