BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lydiawawa
Lapis Lazuli | Level 10

Hi All,

I'm hoping to determine user activity by how often they switch from one screen to another. I have the known columns Name, Activity, Screen, Value and is having trouble code for Switch:

 

Name

Activity

Screen

Value

Switch

Mary

Home_test_entry

Home

2

No

Mary

Home_test_exist

Home

2

No

Mary

Landing_test_entry

Landing

1

One_Two

Mary

Login_test_entry

Login

1

Switch_within

Mary

Login_test_entry

Login

2

Switch_within

Mary

Login_test_exit

Login

1

Switch_within

Mary

Login_test_exit

Login

2

Switch_within

Mary

Login_test_time

Login

08:09:12

Switch_time

Mary

Candy_test_entry

Candy

2

No

Mary

Candy_test_exit

Candy

2

No

Tom

Home_test_entry

Home

2

Switch_within

Tom

Home_test_exist

Home

1

Switch_within

Tom

Landing_test_entry

Landing

2

Two_One

Tom

Landing_test_exit

Landing

1

Two_One

 

Condition for Switch:

After sorting, Activity "Landing_test_entry" is not always followed by "Landing_test_exit"   on screen "Landing", thus switch is determined by any screen visited by the same user with activity followed by "text_exit". If "xxx_text_exit", xxx = screen, value = 2 and "Landing_test_entry" = 1, then switch = one_two, and vice versa, switch = two_one. For any other screen visited by that user, if value does not change within a screen, then Switch = No, else Switch = Switch_within. If value equal a time value, then Switch = switch_time. An user can visit many possible screens, the table is only a briefed version of the actual dataset.

 

Thank you!

 

Following is my attempted code:

 

proc sort data=have; by name activity; run;

data want;
set have;
length pre_name $62. pre_value $250. switch $10.; by name activity;
 retain pre_activity pre_value; 
     if first.name then do; pre_activity = ""; 
       pre_value = ""; pre_screen = ""; 
        end; 
  pre_activity= lag(activity); 
  pre_value = lag(value); 
  pre_screen = lag(screen); 
      if activity = "Landing_test_entry" and value = "1" then do; 
         /*do not know how to code "Landing_test_exit" as the following item */ 
        activity = "Landing_test_exit" and value = "2" then do; 
         Switch = "One_Two";
       retain pre_screen;
         if screen_name = pre_screen then switch = "One_Two"; ...... run;

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

Next is my final tested code.

I have added a test line for landing entry without exit.

Results fits your posted output.

data have;
   length name $6 activity $20 screen $8 value $8;
   infile cards;
   input name $ activity $ screen $ valuex $ ;
   value_time = 0; /* flag */
   if strip(valuex) in ("1" "2") 
      then value = input(strip(valuex),1.);
      else do;
           value = input(valuex,hhmmss.); 
           value_time = 1;
      end;
   timestamp = _N_; /* dummy variable to save original order */
cards;
Mary Home_test_entry Home 2
Mary Home_test_exit  Home 2
Mary Landing_test_entry Landing 1
Mary Landing_test_exit  Landing 2
Mary Login_test_entry Login 1
Mary Login_test_entry Login 2
Mary Login_test_exit Login 1
Mary Login_test_exit Login 2
Mary Login_test_time Login 08:09:12
Mary Candy_test_entry Candy 2
Mary Candy_test_exit Candy 2
Tom Home_test_entry Home 2
Tom Home_test_exit  Home 1
Mary Landing_test_entry Landing 2
Mary Landing_test_exit  Landing 1
Xxxx Landing_test_entry Landing 1
;
run;

proc sort data=have; by name timestamp; run;  
data temp;
 set have;
  by name;  
     length switch $15;
     retain pre_activity pre_value;           
     pre_activity= lag(activity);
     pre_value = lag(value);
     pre_screen = lag(screen);
     drop pre_activity pre_value pre_screen ;
     
     if first.name then do;
        pre_activity = "";
        pre_value = "";
        pre_screen = "";
     end;
     
     if screen = "Landing" then do;
        if activity = "Landing_test_exit" then do;
           if value ne pre_value then do;
              if value=2 
                 then switch = "One-Two";
                 else switch = "Two-One";
           end;
        /* else treat while value = pre_value ??? */
       end;
     end;
	 
	 if screen ne "Landing" and 
        screen = pre_screen then do;
        if value_time = 1
           then switch = "Switch_time"; else
        if value = pre_value 
		   then switch = "No";
           else switch = "Switch_within";
end;
        
run;  

proc sort data=temp; 
  by name descending timestamp; 
run;  

data want;
 set temp;
  by name; 
     length pre_switch $15;
     retain pre_activity pre_value;           
     pre_activity= lag(activity);
     pre_value = lag(value);
     pre_screen = lag(screen);
	 pre_switch = lag(switch);
     drop pre_activity pre_value pre_screen pre_switch;
    
     if first.name then do;
        pre_activity = "";
        pre_value = "";
        pre_screen = "";
     end;
     
     if screen = "Landing" then do;
        if activity = "Landing_test_entry" then do;
           if value ne pre_value then do;
              if value=1 then switch = "One-Two";
              else switch = "Two-One";
           end;
        /* else treat while value = pre_value ??? */ 
       end;
     end;
    
	if screen ne "Landing" and
       /*screen ne pre_screen and*/
       switch = " " 
    then switch = pre_switch;    
run;  
proc sort data=want; by name timestamp; run;  

 

View solution in original post

16 REPLIES 16
andreas_lds
Jade | Level 19

Please post test-data in usable form.

 

Also note:

  • String-comparison is case-sensitive, so "Landing_test_entry" is not the same as "landing_test_entry".
  • Conditionally executing the retain-statement is not possible.
  • Use lag or retain
  • You reset the lagged variables on first name, but afterwards set the variables to the lagged values.
  • To test if a value is a valid time-value, you can use isTime = not missing(input(Value, ?? time.));

 

lydiawawa
Lapis Lazuli | Level 10

Hi Andreas,

 

That is a typo, it should be "Landing_test_entry", "L" is always capitalized. I will edit the original post.

Shmuel
Garnet | Level 18

Change your code to start as:

proc sort data=have; by name activity; run;  /* to be run if sort is needed */
data want;
 set have;
       by name;  
           retain pre_activity pre_value;
pre_activity= lag(activity); pre_value = lag(value); pre_screen = lag(screen); if first.name then do; pre_activity = ""; pre_value = ""; pre_screen = ""; end;
if activity = "landing_test_entry" and value = "1" then do; /* what about screen change ??? */ if pre_activity = "Landing_test_exit" and value = "2" then switch = "One-Two"; else switch = ... end; ..... run;

Now you need to examine all possible permutations of Activity*Screen*Value for a Name and their previous values,

 by filling next table:

         current           activity          screen       value

     vs previous

         activity

         screen             NOTE: each row contains multiple lines !

          value

 

In case some combination is not valid assign the cell by N/A.

When you finish to fulfill that table convert it into code.

Run the code and come back if any issue. 

 

      

lydiawawa
Lapis Lazuli | Level 10

Hi Shumel,

 

since "Landing_test_entry" is always followed by "Landing_test_exit" after sorting ( "Landing_test_exit"  always after "Landing_test_entry" ) :

 

 if activity = "landing_test_entry" and value = "1" then do; /* what about screen change ??? */
if pre_activity = "Landing_test_exit" and value = "2"


The statement above will treat "Landing_test_exit" as the previous entry, which is not correct. We should treat it as the following entry?

 

Could you give some more hint on coding for the second part of your statement?

Shmuel
Garnet | Level 18

1.

Looking at both lines:

Mary

Landing_test_entry

Landing

1

One_Two

Mary

Landing_test_exit

Landing

2

One_Two

it seems that at first line you compare the value to the following line,

while at second line to compare the level to previous line.

 

It can't be done in one step. There is no function to get value from the next observation.

 

Is there any other variable, like timestamp, to use for sorting the data.

Otherwise I shall add a dummy variable to save original order.

 

2.

Your code contains by name activity;

You'll get error when reading Mary Candy after Mary Login - as data is not sorted properly.

Can Candy be treated before Login,  even if it happen later ?

 

3. Is screen always first word of activity ? (screen = scan(activity,1,'_'); 

 

lydiawawa
Lapis Lazuli | Level 10

Hi Shumel,

The following are answers to your questions:

 

1. There is no variable to sort the order, I think a dummy variable can be added within each name group to sort out the order.

2. I did not include it in my code when I posted this thread, the data was sorted prior and I have edited my post. Sort has been included.

3. No, screen name position should be determined by index() since it is not always the first word.

 

Thank you so much!

 

lydiawawa
Lapis Lazuli | Level 10

Hi Shumel,
I ran into couple cases just now, and I revised the original condition for Switch in my first post. It turns out Activity "Landing_test_entry" is not always followed by "Landing_test_exit" . Some users do not have "Landing_test_exit". Sorry about the confusion.

Shmuel
Garnet | Level 18

On my first post I noticed:

examine all possible permutations of Activity*Screen*Value for a Name and their previous values,

 by filling next table:

         current           activity          screen       value

     vs previous

         activity

         screen             NOTE: each row contains multiple lines !

          value

try to check your data and complete that table before coding.

It is called a "Decisions Table".

Shmuel
Garnet | Level 18

Next code is not complete, trying to deal with landing only, Check it and try to complete it.

 

step 1 reads row data and adds dummy varible named TIMESTAMP assigned withe the _N_

to save original order.

 

step 2 sort the data (ascending TIMESTAMP)

step 3 deals with second line in a pair.

 

step 4 sort the data (descending TIMESTAMP)

step 5 deals with second line, which is originally first line in a pair.

 

data have;
   length name $6 activity $20 screen $8 value $8;
   infile cards;
   input name $ activity $ screen $ valuex $ ;
   if strip(valuex) in ("1" "2") 
      then value = input(strip(valuex),1.);
      else value = input(valuex,hhmmss.); 
   timestamp = _N_; /* dummy variable to save original order */
cards;
Mary Home_test_entry Home 2
Mary Home_test_exit  Home 2
Mary Landing_test_entry Landing 1
Mary Landing_test_exit  Landing 2
Mary Login_test_entry Login 1
Mary Login_test_entry Login 2
Mary Login_test_exit Login 1
Mary Login_test_exit Login 2
Mary Login_test_time Login 08:09:12
Mary Candy_test_entry Candy 2
Mary Candy_test_exit Candy 2
Tom Home_test_entry Home 2
Tom Home_test_exit  Home 1
Mary Landing_test_entry Landing 2
Mary Landing_test_exit  Landing 1
;
run;

proc sort data=have; by name timestamp; run;  
data temp;
 set have;
  by name;  
     retain pre_activity pre_value;           
     pre_activity= lag(activity);
     pre_value = lag(value);
     pre_screen = lag(screen);
    
     if first.name then do;
        pre_activity = "";
        pre_value = "";
        pre_screen = "";
     end;
     
     if screen = "Landing" then do;
        if activity = "Landing_test_exit" then do;
           if value ne pre_value then do;
              if value=2 then switch = "One-Two";
              else switch = "Two-One";
           end;
        /* else treat while value = pre_value ??? */
       end;
     end;
        
run;  

proc sort data=have; by name descending timestamp; run;  

data want;
 set temp;
  by name;  
     retain pre_activity pre_value;           
     pre_activity= lag(activity);
     pre_value = lag(value);
     pre_screen = lag(screen);
     drop pre_activity pre_value pre_screen;
    
     if first.name then do;
        pre_activity = "";
        pre_value = "";
        pre_screen = "";
     end;
     
     if screen = "Landing" then do;
        if activity = "Landing_test_entry" then do;
           if value ne pre_value then do;
              if value=1 then switch = "One-Two";
              else switch = "Two-One";
           end;
        /* else treat while value = pre_value ??? */ 
       end;
     end;
        
run;  
Shmuel
Garnet | Level 18
my last posy was created and sent before reading your last remarks
lydiawawa
Lapis Lazuli | Level 10
Yes, I'm aware of that. Thank you for the reminder.
lydiawawa
Lapis Lazuli | Level 10
Hi Shmuel, could you give some suggestions on how to populate the "Switch_within" and "No" values for Switch. Your code only populate part of my question. And this is based on before my most recent revision.
Shmuel
Garnet | Level 18

Based on your requirement:

For any other screen visited by that user, if value does not change within a screen, then Switch = No, else Switch = Switch_within. If value equal a time value, then Switch = switch_time.

I would add to step-3 (ascending "timestamp" keeps sort by name activity😞

if screen ne "Landing" and 
   screen = pre_screen then do;
    if value > 2  then switch = "Switch_time"; else
    if value = pre_value then switch = "No";
                         else switch = "Switch_within";
end;

that deals within screen (exclude Landing) for second line on.

 

Next code add to step-4 (descending "timestamp"):

if screen ne "Landing" and
   screen ne pre_screen and
   switch = " " 
then switch = pre_switch;    /* add PRE_SWITCH to RETAIN and assign LAG_SWITCH on top of the step*/

 

I did not test that added code. Try it. I hope it will give you what you want.

Shmuel
Garnet | Level 18
If time is 00:00:01 or 00:00:02 above code may be erroneous.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 16 replies
  • 1804 views
  • 5 likes
  • 3 in conversation