
SAS for Multi-Session & Multi-Touchpoint Customer Journeys

Started 09-29-2022
Modified 10-25-2022

As we enter Q4 of 2022, SAS continues to address and innovate around the business challenge of managing customer journeys. As many of us know, a typical use case involves multiple customer interactions with a brand's digital presence, spanning numerous sessions and touchpoints. Behind customer journeys, opportunities exist to take advantage of the different types of data that can be collected, contextualized & analyzed to derive actionable insight. These learnings frequently motivate brands to launch experimental testing, behavioral targeting, personalization or recommendation systems to optimize the experience. SAS recognizes that optimization within customer journeys never ends. It proceeds iteratively, requiring brands to explore new analytical approaches to support CX that can self-learn, improve and deliver results.


Image 1: Managing Customer Journeys


Although this article is skewed toward transparent technology demonstrations, I should disclose that SAS Customer Intelligence 360 & SAS Viya provide comprehensive DataOps, ModelOps & CX orchestration in an integrated platform. There are no constraints on data access (cloud/hybrid/on-premises), dashboarding, visualization, segmentation (rule-based or algorithmic) & self-service usability, with varying degrees of automation infused by design for do-it-for-me (DIFM) & do-it-yourself (DIY) software users.


Image 2: Technology User Personas


Let's provide a demo preview:


  • Chapter 1 will showcase a customer journey that is triggered by real-time user behavior. SAS Customer Intelligence 360 will detect this user event by contextualizing streaming semi-structured clickstream data and igniting a customer journey activity map.
  • Next, in Chapter 2, as the customer browses a retail brand's website, we will show various examples of behavioral targeting. A recently released OOTB system connector between SAS Customer Intelligence 360 and SAS Intelligent Decisioning on Viya simplifies the deployment of customizable business rules & DIFM/DIY machine learning that will ultimately lead the user to convert onsite.
  • Finally, Chapter 3 will pivot from acquisition to upsell/cross-sell within the journey, nurtured by additional touchpoint interactions across an instrumented mobile app and email with analytically driven personalized offers. But what drove those offers? Propensity scores, clusters or CDP black box magic? Actually, we will use an application of reinforcement learning and show how SAS optimizes offers orchestrated through SAS Customer Intelligence 360. Although experiment designs like A/B/n & MVT (Multivariate) can be used, we will demonstrate the use of RL Contextual Bandits natively available in SAS Visual Data Mining & Machine Learning on Viya to support aspects of self-learning that do not exist in the aforementioned testing methods.


Image 3: Decisioning Within Customer Journey Interactions


Chapter 1: Customer Journey Activation


In SAS Customer Intelligence 360, events enable users to enhance their ability to understand, target, and interact with customers in a meaningful way. Events are used to track user behavior and to provide input conditions for other items such as spots, segments, tasks, data views, and activities. For example, events are used in some of these ways:


  • Record when customers click a spot on a web page or in a mobile app
  • Collect data that is entered in a form
  • Trigger the start of a customer journey activity

Tracking user interactions is one of the core features that enables a brand to respond in real time to users based on their profile, origin, browsing behavior, and so on. When a visitor interacts with a website or app (for example, through a click event), that event is processed by the run-time environment and can be used to control many of the branded site's or app's features. The interactions that are monitored are the driving force behind many of the features that SAS Customer Intelligence 360 offers.
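To make the idea of an event acting as a journey trigger more concrete, here is a minimal, generic sketch in Python. It is not the SAS Customer Intelligence 360 API: the `Trigger` and `EventRouter` classes, the event field names (`type`, `category`, `visitor`) and the journey name are all hypothetical, chosen only to illustrate an add-to-cart trigger of the kind shown in the Chapter 1 demo.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]  # hypothetical event payload, e.g. parsed clickstream


@dataclass
class Trigger:
    """A named condition that starts a journey when an event satisfies it."""
    name: str
    condition: Callable[[Event], bool]
    journey: str


@dataclass
class EventRouter:
    """Checks each incoming event against every registered trigger."""
    triggers: List[Trigger] = field(default_factory=list)

    def handle(self, event: Event) -> List[str]:
        # Return the journeys started by this event (possibly none).
        return [t.journey for t in self.triggers if t.condition(event)]


def white_wine_added(event: Event) -> bool:
    # Illustrative rule: an add-to-cart event for a white wine product.
    return event.get("type") == "add_to_cart" and event.get("category") == "white_wine"


router = EventRouter([
    Trigger(name="wine_added_to_cart",
            condition=white_wine_added,
            journey="wine_journey"),
])

started = router.handle(
    {"type": "add_to_cart", "category": "white_wine", "visitor": "v-123"}
)
ignored = router.handle({"type": "pageview", "visitor": "v-123"})
```

In this sketch, only the first event matches the rule, so only it would start the (hypothetical) journey. The page view is observed but triggers nothing.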


The Chapter 1 demo video below will be in the context of the retail industry. After a brief introduction, a customer journey will be exemplified beginning on the SAS Store website, and the behavioral trigger event will occur when a white wine product item is added to an ecommerce cart. We will pivot and show how the event was configured that will serve as the start to a series of targeted journey-based interactions.



Chapter 2: Behavioral Monitoring, Supervised Learning & Decisioning


Making the most of real-time customer interaction and streaming digital data requires real-time processing. Decisioning is best used to drive real-time actions in three contexts:


  • To drive the ideal next interaction that a customer or potential customer should have with your brand within their journey.
  • As part of a cross-channel marketing initiative that unifies experience across customer-facing channels.
  • To enable personalization that delivers customized messages based on visitor profiles or observed behaviors.

SAS Intelligent Decisioning provides user access to a comprehensive set of ModelOps capabilities. SAS recognizes the importance of removing friction from activating sophisticated decisioning in martech, and released an embedded decisioning connector within SAS Customer Intelligence 360 in the July 2022 software release. Our vision is to provide brands with open access to rich digital data and sophisticated decisions, using customizable business rules and real-time analytical model calculations for timely insights and advanced personalization to accelerate conversions. Previously, users had to configure at least one agent in order to integrate with an external system. Now, the SAS Intelligent Decisioning native connector provides the ability to map SAS Customer Intelligence 360 data as inputs to published decisions through a no-code GUI.


With the connector configured, users can auto-trigger a decision based on customer behavioral events. Decision outputs are automatically made available to any SAS Customer Intelligence 360 supported channel/task (web, mobile app, email, etc.) for targeting and personalization. For example, you can invoke a published decision based on a user action (pageview, click or micro-goal achievement). Then you can use the customer outcome of that targeted decision (such as a next-best-experience recommendation) in a task or activity map for downstream personalization as that user's journey matures.
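As a rough illustration of what a decision that combines business rules with a model score might look like, here is a small Python sketch. It does not reflect how SAS Intelligent Decisioning represents or publishes decisions. The function names, feature names (`opted_in`, `cart_items`, `sessions_last_30d`), offer names and logistic weights are all assumptions made up for this example.

```python
import math
from typing import Dict, Optional


def propensity_score(features: Dict[str, float]) -> float:
    """Toy logistic model; the weights are invented for illustration only."""
    z = (-2.0
         + 0.8 * features.get("cart_items", 0.0)
         + 0.5 * features.get("sessions_last_30d", 0.0))
    return 1.0 / (1.0 + math.exp(-z))


def decide(event: Dict[str, str], features: Dict[str, float]) -> Dict[str, Optional[object]]:
    """Apply a business rule first, then use the model score to pick an offer."""
    # Business rule: visitors who have opted out never receive an offer.
    if not features.get("opted_in", 1.0):
        return {"offer": None, "score": None, "trigger": event["type"]}

    # Analytical step: score the visitor and map the score to an offer.
    score = propensity_score(features)
    offer = "free_shipping" if score >= 0.5 else "newsletter_signup"
    return {"offer": offer, "score": score, "trigger": event["type"]}


# A behavioral event invokes the decision; its output can then feed
# downstream personalization (for example, which creative to show).
result = decide({"type": "add_to_cart"},
                {"cart_items": 2.0, "sessions_last_30d": 3.0})
```

The returned dictionary plays the role of the decision output: a downstream task would read `offer` to choose what to present to the visitor.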


The Chapter 2 demo video below will pick up where Chapter 1 left off to show how sophisticated decisioning can assist in converting customer purchases.



Chapter 3: Offer Optimization, Reinforcement Learning & Contextual Bandits


Along with supervised and unsupervised learning, reinforcement learning is one of the major branches of machine learning. Each branch differs from the others in the kind of feedback that the model receives during training. Supervised learning methods require data to be labeled. Labeled data enables supervised learning methods to compare the model’s output to the target output in order to improve the model. Supervised learning models are often used for prediction or classification tasks. Unsupervised learning methods do not require data to be labeled. Instead of comparing a model's output to a target, unsupervised learning methods look within unlabeled data to find useful patterns or groupings.


Reinforcement learning methods rely on a reward signal to train a policy. The reward signal is often a scalar function that indicates the goodness or badness of the agent’s decisions. Reinforcement learning models attempt to learn a policy that maximizes a long-term reward that is accumulated over a sequence of time steps. Despite the differences between unsupervised, supervised, and reinforcement learning, the three approaches are complementary. Each discipline plays a vital role in solving complicated tasks.
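One common way to formalize "long-term reward accumulated over a sequence of time steps" is the discounted return, in which a reward received k steps in the future is weighted by gamma to the power k. The article does not say which formulation SAS uses, so treat the discount factor `gamma` and the function below as a textbook-style sketch rather than a description of the product.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.9) -> float:
    """Sum of rewards where the k-th future reward is weighted by gamma**k."""
    total = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Three steps, each with reward 1.0, discounted by one half per step.
example = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

With `gamma=1.0` the function reduces to a plain sum of rewards. Smaller values of `gamma` make the agent favor rewards that arrive sooner.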


The use of a reward signal as a feedback mechanism leads to several unique qualities of reinforcement learning.


  • The reward signal enables learning to occur without the need for a supervisor. Therefore, when a method for determining a reward exists, an agent can learn directly from interactions with an environment. Learning by exploration enables an agent to gather its own data rather than rely on a supervisor to provide labeled input. Furthermore, an agent can explore without any prior knowledge of how an environment works, which might lead to superior performance.
  • An agent’s observation of the environment at one instance of time is often highly correlated to its observation at the next instance. For example, consider a walking robot that receives as input a frame-by-frame video of its environment. Unless the robot is moving incredibly fast relative to the frame rate, each frame should be only a slight change from the previous frame. Because the RL agent’s input data is sequential, it is not independent and identically distributed. However, many traditional machine learning methods are suitable only for independent and identically distributed data.


Reinforcement learning methods can be grouped into two categories based on how the agent receives training data. In online reinforcement learning methods (online RL), an agent interacts with an environment, receives a reward, and updates its policy iteratively. Agent training occurs alongside data collection. In contrast, batch reinforcement learning methods (batch RL) decouple data collection from optimization. This separation enables batch RL methods to train an agent on a history of interactions with the environment.
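The difference between the two categories can be sketched with tabular Q-learning on a toy environment. Everything here is an assumption made for illustration: the four-state corridor, the learning rate, the discount factor and the function names. None of it is taken from SAS software. The online learner updates its estimates while it collects data, whereas the batch learner only replays a fixed log of past transitions.

```python
import random
from typing import Dict, List, Tuple

Transition = Tuple[int, int, float, int, bool]  # state, action, reward, next_state, done
QTable = Dict[Tuple[int, int], float]
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GOAL = 3           # corridor states are 0, 1, 2, 3; state 3 ends the episode


def step(state: int, action: int) -> Tuple[int, float, bool]:
    """Toy environment: reward 1.0 only when the goal state is reached."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def new_q() -> QTable:
    return {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}


def q_update(q: QTable, t: Transition, alpha: float = 0.5, gamma: float = 0.9) -> None:
    """Standard Q-learning update, shared by both training styles."""
    s, a, r, s2, done = t
    target = r if done else r + gamma * max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (target - q[(s, a)])


def choose(q: QTable, s: int, rng: random.Random, epsilon: float) -> int:
    """Epsilon-greedy action choice with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[(s, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(s, a)] == best])


def train_online(episodes: int, rng: random.Random, epsilon: float = 0.2):
    """Online RL: interact, observe the reward, update, repeat."""
    q, log = new_q(), []
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = choose(q, s, rng, epsilon)
            s2, r, done = step(s, a)
            t = (s, a, r, s2, done)
            q_update(q, t)   # learning happens alongside data collection
            log.append(t)    # keep the history for later batch training
            s = s2
    return q, log


def train_batch(log: List[Transition], sweeps: int) -> QTable:
    """Batch RL: no environment access, only a fixed history of transitions."""
    q = new_q()
    for _ in range(sweeps):
        for t in log:
            q_update(q, t)
    return q


def greedy_policy(q: QTable) -> Dict[int, int]:
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}


q_online, history = train_online(200, random.Random(0))
q_batch = train_batch(history, sweeps=50)
```

Both learners use the same update rule. What changes is where the transitions come from: live interaction in `train_online`, a stored history in `train_batch`.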


SAS offers a variety of native reinforcement learning algorithms for users to leverage. The following table summarizes the main characteristics of each algorithm.


Algorithm Name | Description
(name not shown) | Policy gradient algorithm that learns the value function in order to compute the policy gradient by using the advantage function.
Deep Q-Network (DQN) | Q-learning algorithm that uses two neural networks to implement policy iteration. DQN uses experience replay to decouple correlated time-steps.
Fitted Q-Network (FQN) | Q-learning algorithm that uses two neural networks to implement policy iteration. FQN uses a fixed data set of experiences.
(name not shown) | Policy gradient algorithm that computes policy gradients by using sub-trajectory sampling with a baseline.
(name not shown) | Policy gradient algorithm that computes the policy gradient by using sampling with a baseline and accounting for causality.
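The experience replay mentioned for DQN addresses the correlation between consecutive observations described earlier: transitions are stored in a buffer and sampled at random, so a training batch is not a run of near-identical consecutive steps. The following is a generic sketch of such a buffer, not the SAS implementation. The class name, capacity and transition layout are assumptions.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

Transition = Tuple[int, int, float, int, bool]  # state, action, reward, next_state, done


class ReplayBuffer:
    """Fixed-size store of past transitions with uniform random sampling."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        # A bounded deque silently discards the oldest item when full.
        self._items: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, transition: Transition) -> None:
        self._items.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Sampling without replacement breaks up runs of consecutive steps.
        return self._rng.sample(list(self._items), batch_size)

    def __len__(self) -> int:
        return len(self._items)


buffer = ReplayBuffer(capacity=5)
for i in range(10):
    # Consecutive, highly correlated transitions: state i leads to state i + 1.
    buffer.add((i, 1, 0.0, i + 1, False))

batch = buffer.sample(3)
```

After ten insertions into a buffer of capacity five, only the five most recent transitions remain, and each sampled batch draws from them in random order.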


If your brand develops personalization of customer experiences for your websites, mobile apps, email campaigns, etc., but reinforcement learning is a new/untested concept within your organization, contextual bandits can help. Using contextual bandits, brands can choose which content to display to the user, rank advertisements, select the best image to show on a screen, and much more. You can think about contextual bandits as an extension of multi-armed bandits, or as a simplified version of reinforcement learning.


In either scenario, the problem involves an exploration/exploitation tradeoff. Contextual bandits strike a balance between “exploration” (trying new tactics) and “exploitation” (presenting customers with the currently best-known tactic). This problem has many real-world applications, including website/app optimization, clinical trials, adaptive routing and financial portfolio design. In martech, you can think about it as smarter A/B testing.
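The exploration/exploitation tradeoff is easiest to see in a plain (context-free) multi-armed bandit. The sketch below uses epsilon-greedy, one of several common strategies. The class, the simulated conversion rates and the value of `epsilon` are assumptions for illustration and say nothing about which strategy SAS uses.

```python
import random
from typing import List


class EpsilonGreedyBandit:
    """Keeps a running average reward per arm and mostly plays the best one."""

    def __init__(self, n_arms: int, epsilon: float = 0.1, seed: int = 0) -> None:
        self.epsilon = epsilon
        self.counts: List[int] = [0] * n_arms
        self.values: List[float] = [0.0] * n_arms
        self._rng = random.Random(seed)

    def select(self) -> int:
        if self._rng.random() < self.epsilon:
            # Explore: try a random arm, even one that currently looks worse.
            return self._rng.randrange(len(self.counts))
        # Exploit: play the arm with the best estimate so far.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm: int, reward: float) -> None:
        self.counts[arm] += 1
        # Incremental mean: new_avg = old_avg + (reward - old_avg) / n
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_rates: List[float], rounds: int, seed: int = 1) -> EpsilonGreedyBandit:
    """Simulated visitors convert with a fixed (made-up) probability per arm."""
    env = random.Random(seed)
    bandit = EpsilonGreedyBandit(len(true_rates))
    for _ in range(rounds):
        arm = bandit.select()
        reward = 1.0 if env.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


# Two offers with hypothetical conversion rates of 10% and 60%.
bandit = simulate([0.1, 0.6], rounds=5000)
```

Unlike a fixed 50/50 A/B split, the bandit shifts most of its traffic to the better-performing arm while the experiment is still running, and keeps a small share for exploration.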


Image 5: RL Contextual Bandits


A traditional multi-armed bandit algorithm, of the kind typically found in martech software offerings that support testing, outputs an action but doesn’t use any information about the state of the environment (context). For example, if you use a multi-armed bandit to choose whether to display puppy images or kitten images to the user of your website, it will make the same decision for every user, even if you know something about that user's preferences. The RL contextual bandit extends this model by making the decision conditional on the state of the environment.


With such an approach, you not only optimize decisions based on previous customer observations, but you also personalize decisions for every situation. The algorithm observes a context, makes a decision by choosing one action from a number of alternatives, and observes an outcome of that decision. An outcome defines a reward. The goal is to maximize average reward (not simply likelihood of conversion).
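The observe-context, choose-action, observe-reward loop can be sketched with the puppy/kitten example. This is a deliberately simple tabular version that keeps a separate epsilon-greedy estimate per context. It is not the algorithm in SAS Visual Data Mining & Machine Learning, and the context labels, click rates and class name are invented for the illustration. Practical contextual bandits usually replace the lookup table with a model that generalizes across contexts.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical click probabilities for each (context, action) pair.
CLICK_RATES: Dict[Tuple[str, str], float] = {
    ("dog_lover", "puppy"): 0.6, ("dog_lover", "kitten"): 0.1,
    ("cat_lover", "puppy"): 0.1, ("cat_lover", "kitten"): 0.6,
}


class ContextualEpsilonGreedy:
    """Epsilon-greedy bandit whose reward estimates are conditioned on context."""

    def __init__(self, actions: List[str], epsilon: float = 0.1, seed: int = 0) -> None:
        self.actions = list(actions)
        self.epsilon = epsilon
        self._rng = random.Random(seed)
        self.counts: Dict[Tuple[str, str], int] = defaultdict(int)
        self.values: Dict[Tuple[str, str], float] = defaultdict(float)

    def best_action(self, context: str) -> str:
        return max(self.actions, key=lambda a: self.values[(context, a)])

    def select(self, context: str) -> str:
        if self._rng.random() < self.epsilon:
            return self._rng.choice(self.actions)   # explore
        return self.best_action(context)            # exploit for this context

    def update(self, context: str, action: str, reward: float) -> None:
        key = (context, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


def run_simulation(rounds: int, seed: int = 1) -> ContextualEpsilonGreedy:
    env = random.Random(seed)
    agent = ContextualEpsilonGreedy(["puppy", "kitten"])
    for _ in range(rounds):
        context = env.choice(["dog_lover", "cat_lover"])   # observe a context
        action = agent.select(context)                     # make a decision
        clicked = env.random() < CLICK_RATES[(context, action)]
        agent.update(context, action, 1.0 if clicked else 0.0)  # observe the outcome
    return agent


agent = run_simulation(rounds=4000)
```

With these made-up rates, both images have the same average click rate across all visitors (35%), so a context-free bandit has no reason to prefer either one. Conditioning on context is what lets the agent learn a different best image for each group.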


The final Chapter 3 demo video below will explain how SAS Customer Intelligence 360 and SAS Viya work together to enable upsell/cross-sell offers across multiple outbound channels using RL contextual bandits.



We look forward to what the future brings in our development process – as we enable marketing technology users to access all of the most recent SAS analytical developments. Learn more about how SAS can be applied for customer analytics, journey personalization and integrated marketing here.
