CTI Integration with SAP Sales Cloud V2: A Technical Guide
Architecture · 10 min read

Sofiene Karaja

SAP Integration Consultant, Spadoom AG

Your sales team picks up the phone. The caller’s name, company, and last interaction should appear on screen before they say hello. That’s CTI: Computer Telephony Integration. Simple concept, surprisingly complex implementation.

SAP Sales Cloud V2 doesn’t ship with a built-in CTI adapter. It gives you the APIs and the UI shell. You build (or buy) the integration layer. We’ve done both. Our Engage CTI product handles this for multiple customers already. Here’s what the architecture looks like and where things tend to go sideways.

TL;DR: Sales reps spend only 28% of their time actually selling (Salesforce, 2024). CTI integration with SAP Sales Cloud V2 eliminates manual caller lookup: the screen pop delivers caller context in under 2 seconds. The architecture: telephony provider → CTI middleware on BTP (Node.js + WebSocket) → Sales Cloud V2 APIs → client-side widget. Key decisions: WebSocket over polling, E.164 phone normalisation, Redis lookup cache. We’ve deployed this for Cisco, Genesys, RingCentral, and Teams.

Figure: Screen Pop Latency Budget for an inbound call. Target: ring-to-screen-pop under 2 seconds.

  • Telephony event delivery: <100 ms
  • Phone number lookup: 200–500 ms
  • Contact enrichment: 300–800 ms
  • WebSocket push: <200 ms
  • Total target: <2 seconds

Source: Spadoom Engage CTI project data (2025)
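The budget above can be checked with simple arithmetic. The following is a minimal sketch that sums the worst-case figure of each stage; the constant and function names are illustrative only, and the "under" figures are treated as upper bounds.

```typescript
// Worst-case latency per stage in milliseconds, taken from the budget above.
const screenPopStages: Array<{ stage: string; worstCaseMs: number }> = [
  { stage: "telephony event delivery", worstCaseMs: 100 },
  { stage: "phone number lookup", worstCaseMs: 500 },
  { stage: "contact enrichment", worstCaseMs: 800 },
  { stage: "WebSocket push", worstCaseMs: 200 },
];

const SCREEN_POP_TARGET_MS = 2000;

// Sum the worst case of every stage and return what is left of the target.
function remainingHeadroomMs(
  stages: Array<{ stage: string; worstCaseMs: number }>,
  targetMs: number
): number {
  const total = stages.reduce((sum, s) => sum + s.worstCaseMs, 0);
  return targetMs - total;
}

const headroomMs = remainingHeadroomMs(screenPopStages, SCREEN_POP_TARGET_MS);
```

With every stage at its worst case the total is 1,600 ms, which leaves 400 ms of headroom for network jitter and rendering.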

What Is CTI and Why Does It Matter for Sales Cloud V2?

Sales reps spend only 28% of their time actually selling. The rest goes to admin tasks, data entry, and searching for information (Salesforce, 2024). CTI kills one of the biggest time sinks: manually looking up who’s calling.

A CTI integration with Sales Cloud V2 has four components.

Telephony provider. Your PBX or cloud telephony system: Cisco, Genesys, RingCentral, Teams Phone, or any SIP-based provider. This is where calls actually happen.

CTI middleware. A server-side component that bridges the telephony provider and Sales Cloud V2. It translates telephony events (incoming call, call connected, call ended) into CRM actions (screen pop, create activity, log call). In our architecture, this runs on SAP BTP.

Sales Cloud V2 APIs. REST APIs for looking up contacts by phone number, creating phone call activities, and retrieving account context. V2’s API-first design makes this neat and clean. 82% of developers now prioritise API-first approaches (Postman, 2024), and V2 was built with exactly that thinking.

Client-side widget. A UI component embedded in the Sales Cloud V2 shell showing call controls (answer, hold, transfer, end) and caller information. This runs as a side-by-side extension using V2’s shell plug-in framework.

How Does an Inbound Call Flow Through the System?

The entire flow from ring to screen pop takes under 2 seconds. Anything slower and users lose trust in the system. Here’s exactly what happens:

  1. Call arrives at the telephony system. The PBX sends a call event to the CTI middleware via WebSocket or webhook.
  2. Middleware extracts the caller’s phone number and queries the Sales Cloud V2 API: GET /sap/c4c/api/v1/phone-call-collection?$filter=phone eq '{number}'. (In practice, we search across accounts, contacts, and individual customers.)
  3. If a match is found, middleware pushes caller context (name, account, open opportunities, recent interactions) to the client-side widget via WebSocket.
  4. The widget triggers a screen pop, navigating Sales Cloud V2 to the matched contact or account record.
  5. When the call ends, middleware creates a phone call activity in V2 with duration, direction, participants, and notes.
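The five steps above can be sketched as two small handlers. This is an illustration under stated assumptions, not the Engage CTI implementation: every type and function name here is hypothetical, and the lookup, widget push, and activity creation are passed in as dependencies so the flow itself stays independent of any particular API.

```typescript
// All names are illustrative; they are not the Engage CTI or Sales Cloud V2 API.
interface CallerMatch {
  entityId: string;
  displayName: string;
  accountName: string;
}

interface CallFlowDeps {
  // Step 2: resolve a caller number to a CRM record (null when nothing matches).
  lookupByPhone(callerNumber: string): Promise<CallerMatch | null>;
  // Steps 3-4: push a message to the rep's widget, which performs the screen pop.
  pushToWidget(userId: string, message: { type: string; caller?: CallerMatch }): void;
  // Step 5: persist the phone call activity.
  createCallActivity(activity: { entityId: string | null; durationSeconds: number }): Promise<void>;
}

// Steps 1-4: a call event arrived for `userId`; look up the caller and notify the widget.
async function onIncomingCall(
  deps: CallFlowDeps,
  userId: string,
  callerNumber: string
): Promise<CallerMatch | null> {
  const match = await deps.lookupByPhone(callerNumber);
  if (match !== null) {
    deps.pushToWidget(userId, { type: "screen-pop", caller: match });
  }
  return match;
}

// Step 5: the call ended; log it against the matched record (or unmatched).
async function onCallEnded(
  deps: CallFlowDeps,
  match: CallerMatch | null,
  durationSeconds: number
): Promise<void> {
  await deps.createCallActivity({
    entityId: match === null ? null : match.entityId,
    durationSeconds,
  });
}
```

Injecting the dependencies also makes the latency of each step easy to measure in isolation, which matters when the whole flow has a 2-second budget.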

That 2-second target isn’t arbitrary. It’s the threshold where reps stop asking “who’s calling?” and start greeting callers by name. Miss it, and adoption drops fast. We’ve seen it happen: a 4-second screen pop is effectively the same as no screen pop at all. People just ignore it.

What Are the Key Technical Decisions?

The contact center software market reached $38.2 billion in 2024, growing at 21.4% CAGR (Grand View Research, 2024). With that growth comes more telephony options and more integration decisions. Three choices matter most.

Should You Use WebSocket or Polling?

The client-side widget needs real-time call events. Polling the middleware every second creates unnecessary load and adds latency. WebSocket connections deliver events in milliseconds.

We use a WebSocket server on BTP (Node.js) that maintains persistent connections with each active Sales Cloud V2 session. When a call event arrives from the telephony system, it’s pushed to the right user’s WebSocket connection instantly.

Why not Server-Sent Events (SSE)? They work for one-way streaming, but CTI needs bidirectional communication: the widget sends commands back (hold, transfer, end call). WebSocket handles both directions on a single connection. Cleaner.
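To make the bidirectional point concrete, here is a minimal sketch of the two message directions and a per-user connection registry. The message names and the `SocketLike` shape are assumptions for illustration; the only thing assumed about a socket is a `send(string)` method, which the browser WebSocket and common Node.js WebSocket libraries provide.

```typescript
// Server -> widget events (illustrative names).
type ServerEvent =
  | { type: "incoming-call"; callId: string; callerNumber: string }
  | { type: "call-ended"; callId: string };

// Widget -> server commands (illustrative names).
type WidgetCommand =
  | { type: "hold"; callId: string }
  | { type: "transfer"; callId: string; target: string }
  | { type: "end"; callId: string };

// Minimal socket shape so the sketch does not depend on a specific library.
interface SocketLike {
  send(data: string): void;
}

class UserConnectionRegistry {
  private sockets = new Map<string, SocketLike>();

  register(userId: string, socket: SocketLike): void {
    this.sockets.set(userId, socket);
  }

  unregister(userId: string): void {
    this.sockets.delete(userId);
  }

  // Push an event to one user's connection; false means the user is offline.
  push(userId: string, event: ServerEvent): boolean {
    const socket = this.sockets.get(userId);
    if (socket === undefined) return false;
    socket.send(JSON.stringify(event));
    return true;
  }
}

// Validate a raw message from the widget before acting on it.
function parseWidgetCommand(raw: string): WidgetCommand | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { type, callId, target } = data as { type?: unknown; callId?: unknown; target?: unknown };
  if (typeof callId !== "string") return null;
  if (type === "hold") return { type: "hold", callId };
  if (type === "end") return { type: "end", callId };
  if (type === "transfer" && typeof target === "string") return { type: "transfer", callId, target };
  return null;
}
```

Both directions share one connection: `push` sends events down, and whatever arrives on the same socket goes through `parseWidgetCommand` before it reaches the telephony adapter.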

How Do You Handle Phone Number Matching?

This sounds simple. It isn’t. Phone numbers come in many formats: +41 44 123 45 67, 044 123 45 67, 0041441234567. Your middleware needs to normalise numbers before searching.

We normalise to E.164 format (+41441234567) and compare normalised values only. Sales Cloud V2 stores phone numbers as entered by users, which means wildly inconsistent formats. Our middleware therefore handles normalisation on both sides: it normalises the incoming caller number and it normalises the numbers stored in V2 before comparing them. Nota bene: if you skip normalisation on the stored side, you’ll get phantom misses that drive your support team mad.
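A minimal normalisation sketch for the three formats shown above. It assumes Swiss-style dialling conventions: `00` as the international prefix, a single leading `0` as the national trunk prefix, and `41` as the default country code. Those conventions do not hold everywhere, so a production system would more likely rely on a dedicated phone number library; the function name is illustrative.

```typescript
// Normalise a user-entered phone number to E.164, or return null if it
// cannot be classified. Assumes "00" = international prefix and a single
// leading "0" = national trunk prefix (true for Switzerland, not universal).
function normalizeToE164(raw: string, defaultCountryCode: string = "41"): string | null {
  // Drop the optional "(0)" often written after the country code.
  const trimmed = raw.trim().replace(/\(0\)/g, "");
  const digits = trimmed.replace(/\D/g, "");

  let candidate: string;
  if (trimmed.charAt(0) === "+") {
    candidate = digits; // already international: +41 44 123 45 67
  } else if (digits.substring(0, 2) === "00") {
    candidate = digits.substring(2); // 0041441234567
  } else if (digits.charAt(0) === "0") {
    candidate = defaultCountryCode + digits.substring(1); // 044 123 45 67
  } else {
    return null; // ambiguous: no prefix tells us how to read it
  }

  // E.164 allows at most 15 digits and the country code never starts with 0.
  if (!/^[1-9]\d{1,14}$/.test(candidate)) return null;
  return "+" + candidate;
}
```

All three example formats resolve to the same key, which is what makes an exact-match lookup possible.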

Build a phone number index. Querying V2’s API with wildcard phone searches on every call is slow. We maintain a lightweight lookup cache (Redis on BTP) that maps normalised phone numbers to V2 entity IDs. The cache refreshes every 15 minutes and on entity update events. Without this cache, lookup latency jumps from 200ms to 1.5+ seconds. That blows your 2-second budget entirely.
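The index described above can be sketched with an in-memory map standing in for Redis. The class and method names are hypothetical, and the entity types simply mirror the three record kinds mentioned earlier; with Redis the same operations would map onto plain key reads, writes, and deletes.

```typescript
type CrmEntityType = "account" | "contact" | "individualCustomer";

interface PhoneIndexEntry {
  entityId: string;
  entityType: CrmEntityType;
}

// In-memory stand-in for the Redis lookup cache (illustrative only).
class PhoneLookupIndex {
  private entries = new Map<string, PhoneIndexEntry>();

  // Periodic full refresh: build the new map first, then swap it in,
  // so lookups never see a half-filled index.
  replaceAll(snapshot: Array<{ e164: string; entry: PhoneIndexEntry }>): void {
    const next = new Map<string, PhoneIndexEntry>();
    snapshot.forEach((row) => {
      next.set(row.e164, row.entry);
    });
    this.entries = next;
  }

  // Entity update event: drop the entity's old number and index the new one.
  applyEntityUpdate(previousE164: string | null, currentE164: string | null, entry: PhoneIndexEntry): void {
    if (previousE164 !== null) {
      const old = this.entries.get(previousE164);
      // Only delete if the old key still belongs to this entity.
      if (old !== undefined && old.entityId === entry.entityId) {
        this.entries.delete(previousE164);
      }
    }
    if (currentE164 !== null) {
      this.entries.set(currentE164, entry);
    }
  }

  // Exact-match lookup on a normalised number; null means fall back to the API.
  lookup(e164: string): PhoneIndexEntry | null {
    const hit = this.entries.get(e164);
    return hit === undefined ? null : hit;
  }
}
```

Keys are normalised numbers, so the lookup is a single exact-match read rather than a wildcard search.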

How Does Authentication Work?

The middleware needs to call Sales Cloud V2 APIs on behalf of users. We use OAuth 2.0 with SAP IAS (Identity Authentication Service) as the identity provider. The widget handles the initial OAuth flow; the middleware uses refresh tokens for API calls.

For telephony-to-middleware authentication, it depends on the provider. Cisco and Genesys use API keys. Cloud providers like RingCentral and Teams use OAuth. The middleware abstracts this away, so adding a new telephony provider means implementing one adapter interface. The V2 integration stays the same.
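A sketch of what such an adapter interface might look like. The internal event format and the `ExampleProviderEvent` payload are invented for illustration; they do not describe the event model of any real provider or of Engage CTI.

```typescript
// Provider-independent event format used inside the middleware (illustrative).
interface NormalizedCallEvent {
  callId: string;
  kind: "incoming" | "connected" | "ended";
  callerNumber: string;
  agentId: string;
}

// One implementation per telephony provider.
interface TelephonyAdapter<ProviderEvent> {
  readonly provider: string;
  // Map a provider event to the internal format; null for events we ignore.
  toCallEvent(raw: ProviderEvent): NormalizedCallEvent | null;
}

// Hypothetical provider payload, made up for this example.
interface ExampleProviderEvent {
  id: string;
  state: "RINGING" | "ANSWERED" | "HANGUP" | "DTMF";
  from: string;
  agent: string;
}

const exampleAdapter: TelephonyAdapter<ExampleProviderEvent> = {
  provider: "example-pbx",
  toCallEvent(raw: ExampleProviderEvent): NormalizedCallEvent | null {
    let kind: NormalizedCallEvent["kind"];
    if (raw.state === "RINGING") kind = "incoming";
    else if (raw.state === "ANSWERED") kind = "connected";
    else if (raw.state === "HANGUP") kind = "ended";
    else return null; // e.g. DTMF tones are irrelevant for screen pop
    return { callId: raw.id, kind, callerNumber: raw.from, agentId: raw.agent };
  },
};
```

Everything downstream of `toCallEvent` sees only `NormalizedCallEvent`, which is why the V2 side does not change when a provider is added.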

What Gets Logged When a Call Ends?

CRM delivers $3.10 for every dollar spent, with time savings driving 51% of that return (Nucleus Research, 2024). Automatic call logging is where CTI delivers that return: no more manual activity creation after every call.

Every call creates a phone call activity in Sales Cloud V2. We log:

  • Direction: inbound, outbound, missed
  • Duration: start time, end time, talk time
  • Participants: caller, called party, any transferred parties
  • Account context: which account/contact was matched
  • Notes: reps can add notes during or after the call via the widget
  • Recording link: if the telephony system records calls, we store the recording URL (not the file)

The activity is created via POST /sap/c4c/api/v1/phone-call-collection. V2’s API accepts all these fields natively. No custom objects needed. Crisp.
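As an illustration of how the logged fields fit together, here is a sketch that assembles an activity record from a finished call. The field names are hypothetical placeholders, not the actual V2 payload schema, which should be taken from the API documentation.

```typescript
// Summary of a finished call as the middleware might hold it (illustrative).
interface CallSummary {
  direction: "inbound" | "outbound" | "missed";
  startedAt: string; // ISO 8601 timestamps
  answeredAt: string | null; // null for missed calls
  endedAt: string;
  participants: string[];
  matchedEntityId: string | null;
  notes: string;
  recordingUrl: string | null; // link only, never the audio file
}

// Build the record to send to the CRM. Field names are placeholders.
function buildCallActivity(call: CallSummary) {
  // Talk time runs from answer to hang-up; a missed call has none.
  const talkTimeSeconds =
    call.answeredAt === null
      ? 0
      : Math.round((Date.parse(call.endedAt) - Date.parse(call.answeredAt)) / 1000);

  return {
    direction: call.direction,
    startTime: call.startedAt,
    endTime: call.endedAt,
    talkTimeSeconds,
    participants: call.participants,
    matchedEntityId: call.matchedEntityId,
    notes: call.notes,
    recordingUrl: call.recordingUrl,
  };
}
```

Deriving talk time from the answer timestamp rather than the ring timestamp keeps ringing time out of the reported duration.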

Which Telephony Providers Are Supported?

Our Engage CTI product currently supports five provider categories:

Provider | Connection Type | Notes
Cisco CUCM/UCCX | JTAPI / CTI Server | On-premise; requires network connectivity to BTP
Genesys Cloud | WebSocket API | Cloud-native; fastest to integrate
RingCentral | REST + WebSocket | Cloud-native; good API documentation
Microsoft Teams | Graph API + Bot Framework | Requires Teams Phone licence; more complex setup
SIP-based PBX | SIP events via SRTP/WebSocket | Generic adapter for smaller providers

Adding a new provider typically takes 2-4 weeks of development. The adapter pattern means most of the work is mapping the provider’s event model to our internal format. The V2 integration stays identical regardless of which telephony provider you run.

What Are the Most Common CTI Pitfalls?

67% of sales reps don’t expect to meet their quota (Salesforce, 2024). A broken CTI system that adds friction instead of removing it makes that worse. Here are the six pitfalls we see most often.

Latency kills adoption. If the screen pop appears after the rep has already asked “who’s calling?”, nobody will use it. Target under 2 seconds. Test with real call volumes, not demos.

Phone number data quality. If your V2 data has phone numbers in 15 different formats, matching fails. Clean your data before going live. Run a phone number normalisation script across all accounts and contacts. Boring work. Non-negotiable.

WebSocket stability. WebSocket connections drop. Corporate proxies, VPNs, and network switches interrupt them. Implement automatic reconnection with exponential backoff. Show a clear “disconnected” indicator in the widget so reps know when CTI isn’t active.
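A minimal sketch of the backoff calculation, assuming a 500 ms base delay, a 30-second cap, and random jitter so that many widgets do not reconnect at the same instant after an outage. The function name and the default values are illustrative choices, not figures from our deployments.

```typescript
// Delay before reconnect attempt number `attempt` (0-based).
// Doubles each time up to `maxMs`, then picks a random point in the
// upper half of that window so clients spread out their retries.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 500,
  maxMs: number = 30000,
  random: () => number = Math.random
): number {
  const exponential = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  return Math.round(exponential / 2 + random() * (exponential / 2));
}
```

In the widget, the socket's close handler would schedule the next connection attempt with this delay and reset the attempt counter to zero once a connection opens successfully.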

Multi-tab handling. Sales reps open multiple browser tabs. The CTI widget should only be active in one. We use a leader election pattern (BroadcastChannel API) to make sure screen pop happens in exactly one tab. Without this, reps get duplicate popups or none at all. Both are a mess.
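One simple way to elect a leader, sketched under the assumption that every tab broadcasts a periodic heartbeat carrying its own ID and that the tab with the lowest live ID wins. That rule, the class name, and the 3-second timeout are illustrative choices; the election logic is kept free of browser APIs so it can be tested on its own.

```typescript
// Tracks heartbeats from other tabs and decides whether this tab leads.
class TabElection {
  readonly tabId: string;
  private timeoutMs: number;
  private lastSeen = new Map<string, number>();

  constructor(tabId: string, timeoutMs: number = 3000) {
    this.tabId = tabId;
    this.timeoutMs = timeoutMs;
  }

  // Record a heartbeat received from another tab.
  heartbeatFrom(otherTabId: string, now: number): void {
    this.lastSeen.set(otherTabId, now);
  }

  // This tab leads if no live tab has a smaller ID. Tabs whose last
  // heartbeat is older than the timeout are treated as closed.
  isLeader(now: number): boolean {
    let leader = true;
    this.lastSeen.forEach((seenAt, otherTabId) => {
      if (now - seenAt <= this.timeoutMs && otherTabId < this.tabId) {
        leader = false;
      }
    });
    return leader;
  }
}

// Browser wiring (not executed here):
//   const channel = new BroadcastChannel("cti-leader");
//   channel.onmessage = (e) => election.heartbeatFrom(e.data.tabId, Date.now());
//   setInterval(() => channel.postMessage({ tabId: election.tabId }), 1000);
```

A freshly opened tab considers itself leader until the first heartbeat from an older tab arrives, so it makes sense to wait one heartbeat interval before acting on `isLeader`.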

Call transfer context. When a call is transferred, the context should follow. The second agent should see the same screen pop. This requires tracking call sessions, not individual call legs.
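A sketch of session-level tracking, assuming the telephony adapter reports each leg with its own ID and tells the middleware which existing leg a transfer originated from. The class and field names are hypothetical.

```typescript
interface CallerSnapshot {
  entityId: string;
  displayName: string;
}

// Maps every call leg to one session so context survives transfers.
class CallSessionTracker {
  private sessionByLeg = new Map<string, string>();
  private contextBySession = new Map<string, CallerSnapshot>();

  // First leg of a call: open a session and attach the caller context.
  startSession(firstLegId: string, context: CallerSnapshot): string {
    const sessionId = "session-" + firstLegId;
    this.sessionByLeg.set(firstLegId, sessionId);
    this.contextBySession.set(sessionId, context);
    return sessionId;
  }

  // Transfer: the new leg joins the session of the leg it came from.
  addTransferLeg(existingLegId: string, newLegId: string): boolean {
    const sessionId = this.sessionByLeg.get(existingLegId);
    if (sessionId === undefined) return false;
    this.sessionByLeg.set(newLegId, sessionId);
    return true;
  }

  // Context for any leg of the call, used to screen-pop the second agent.
  contextForLeg(legId: string): CallerSnapshot | null {
    const sessionId = this.sessionByLeg.get(legId);
    if (sessionId === undefined) return null;
    const context = this.contextBySession.get(sessionId);
    return context === undefined ? null : context;
  }
}
```

Because the context hangs off the session rather than the leg, the second agent's screen pop needs no second lookup.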

Compliance. Call recording and logging have legal requirements that vary by jurisdiction. In Switzerland, both parties must consent to recording. Your CTI solution needs configurable recording controls. This isn’t optional; it’s the law.

How Do You Deploy CTI on SAP BTP?

With 55% of ASUG members now using SAP BTP (ASUG, 2025), BTP is the natural home for CTI middleware. Our architecture uses four BTP services:

  • Node.js application with Express for the REST API and WebSocket server
  • Redis for the phone number lookup cache and session management
  • SAP Integration Suite for reliable event delivery from on-premise telephony systems
  • XSUAA for authentication and tenant isolation

The widget is deployed as a Sales Cloud V2 shell plug-in: a small JavaScript application that loads inside the V2 shell frame.

For multi-tenant deployments (multiple customers on one middleware instance), we use XSUAA tenant isolation. Each customer’s telephony events are routed to their tenant only. Data isolation isn’t negotiable here. Get this wrong and you have a proper incident on your hands.
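The routing rule can be sketched as a registry keyed by tenant first and user second, so that a user ID from one tenant can never resolve to a socket in another. This is an illustration only: the names are hypothetical, and in a real deployment the tenant ID would come from the validated access token, never from the event payload or the client.

```typescript
interface TenantSocket {
  send(data: string): void;
}

// Connections are grouped per tenant; delivery never crosses that boundary.
class TenantScopedRouter {
  private socketsByTenant = new Map<string, Map<string, TenantSocket>>();

  register(tenantId: string, userId: string, socket: TenantSocket): void {
    let tenantSockets = this.socketsByTenant.get(tenantId);
    if (tenantSockets === undefined) {
      tenantSockets = new Map<string, TenantSocket>();
      this.socketsByTenant.set(tenantId, tenantSockets);
    }
    tenantSockets.set(userId, socket);
  }

  // Deliver to a user within the given tenant only; false if not connected.
  deliver(tenantId: string, userId: string, payload: object): boolean {
    const tenantSockets = this.socketsByTenant.get(tenantId);
    if (tenantSockets === undefined) return false;
    const socket = tenantSockets.get(userId);
    if (socket === undefined) return false;
    socket.send(JSON.stringify(payload));
    return true;
  }
}
```

Two customers can reuse the same user ID or extension without any risk of one receiving the other's screen pop.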


Want to connect your phone system to SAP Sales Cloud V2? Our Engage CTI product is production-ready. Supports Cisco, Genesys, RingCentral, and Teams. Get in touch.

Frequently Asked Questions

How long does a CTI integration take?

A standard integration with one telephony provider takes 6-8 weeks: 2 weeks for middleware setup and telephony adapter development, 2 weeks for V2 widget configuration and API integration, and 2-4 weeks for testing with real call flows. The timeline depends heavily on the telephony provider. Genesys Cloud is fastest (cloud-native WebSocket API). On-premise Cisco CUCM requires additional network setup for BTP connectivity and can push you closer to 8 weeks.

Does CTI work with Sales Cloud V2 on mobile?

The client-side CTI widget is designed for the desktop web client where reps handle calls at their workstations. Mobile V2 users can view call activities logged by CTI (those are standard V2 activities), but the real-time screen pop and call controls require the desktop shell plug-in framework. For field reps who primarily use mobile, CTI call logs give them solid context for follow-ups without needing the real-time widget.

What happens if the CTI middleware goes down?

Calls still work. The telephony system is independent of the middleware. Reps just don’t get screen pops. Our middleware runs with automatic restart policies on BTP Cloud Foundry, and we have health checks that alert ops teams within 60 seconds of downtime. The Redis cache persists independently, so a middleware restart doesn’t require rebuilding the phone number index.

Can CTI handle high call volumes?

Our Engage CTI middleware handles 500+ concurrent WebSocket connections and processes call events in under 50ms per event. For contact centres with higher volumes, we scale horizontally on BTP Cloud Foundry with sticky sessions to maintain WebSocket affinity. The Redis lookup cache handles thousands of queries per second, so phone number matching doesn’t become a bottleneck even at peak volumes.

Is call recording data stored in Sales Cloud V2?

No. And it shouldn’t be. Call recordings stay in the telephony system’s storage. We store only the recording URL as a link on the phone call activity in V2. This avoids storage costs, keeps sensitive audio data in the telephony system’s compliance-managed environment, and respects jurisdiction-specific data residency requirements. Reps click the link to access the recording through the telephony system’s own player.
