Hi everyone,
**POST EDITED: AUG24**
1) I'm using EG at my work and I was wondering if there's a simple way for me to do path analysis?
I want to understand the path (channel) customers take to get to their first channel = 3 transaction. I only want to look at transactions with Rev>=0, and exclude customers whose first transaction is channel = 3 i.e. ID 2
ie. ID 1 => 1 -> 3, PATH = '1 only'; ID 6=> 2->1->2->1->3, PATH = '1 & 2'.
Data set: HAVE
ID | TRANS_DATE | TRANS_KEY | UNITS | REV | CHANNEL |
1 | 11JUL2015 | 1068 | 2 | 1939.98 | 1 |
1 | 14JUL2015 | 1256 | 1 | 31.99 | 1 |
1 | 17JUL2015 | 9305 | 2 | 63.98 | 1 |
1 | 20JUL2015 | 1207 | 2 | 2279.98 | 1 |
1 | 20JUL2015 | 1477 | -2 | -1939.98 | 1 |
1 | 08SEP2015 | 5113 | 2 | 1539.99 | 1 |
1 | 24FEB2016 | 8835 | 1 | 299.99 | 3 |
2 | 01SEP2015 | 5178 | 1 | 139.99 | 3 |
2 | 28SEP2015 | 2368 | 1 | 214.99 | 1 |
3 | 30JUL2015 | 1504 | 1 | 189.95 | 3 |
4 | 22DEC2015 | 627 | 2 | 819.98 | 3 |
4 | 02JAN2016 | 1271 | -2 | -819.98 | 1 |
5 | 12AUG2015 | 9140 | 2 | 649.98 | 3 |
5 | 12AUG2015 | 3423 | -1 | -199.99 | 1 |
5 | 12AUG2015 | 3424 | 2 | 545.93 | 1 |
6 | 27DEC2015 | 0947 | 2 | 39.98 | 2 |
6 | 25JAN2016 | 2110 | 1 | 119.99 | 1 |
6 | 30JAN2016 | 3212 | 1 | 149.99 | 2 |
6 | 30JAN2016 | 3320 | 1 | 149.99 | 2 |
6 | 02FEB2016 | 3156 | 0 | -11 | 2 |
6 | 04FEB2016 | 2395 | -1 | -149.99 | 1 |
6 | 04FEB2016 | 2399 | -1 | -137.56 | 1 |
6 | 04FEB2016 | 431 | 1 | 99.99 | 3 |
7 | 05MAR2016 | 8426 | 3 | 1502.97 | 3 |
8 | 11AUG2015 | 4689 | 1 | 469.99 | 3 |
8 | 11AUG2015 | 8172 | 1 | 14.95 | 1 |
9 | 18SEP2015 | 1243 | 2 | 477.58 | 1 |
9 | 26SEP2015 | 5399 | -2 | -477.58 | 1 |
9 | 26SEP2015 | 5491 | 2 | 359.99 | 1 |
9 | 26DEC2015 | 552 | 1 | 699.99 | 1 |
9 | 27DEC2015 | 5969 | -1 | -699.99 | 1 |
9 | 27DEC2015 | 1018 | 1 | 999.99 | 1 |
9 | 03JAN2016 | 6506 | -1 | -24.99 | 1 |
9 | 04JAN2016 | 8623 | -1 | -999.99 | 1 |
9 | 10JAN2016 | 6097 | 1 | 610 | 1 |
9 | 16JAN2016 | 212 | 1 | 829.99 | 3 |
9 | 27JAN2016 | 6767 | 1 | 39.99 | 2 |
10 | 13MAR2016 | 6892 | 1 | 149.99 | 2 |
10 | 08MAY2016 | 5901 | 1 | 109.99 | 3 |
10 | 08MAY2016 | 124 | 3 | 7.97 | 1 |
10 | 08MAY2016 | 3683 | 4 | 1849.99 | 1 |
10 | 09MAY2016 | 5963 | 1 | 160 | 1 |
11 | 30AUG2015 | 7927 | 1 | 369.99 | 3 |
11 | 30AUG2015 | 4370 | 0 | 80 | 1 |
Data set: WANT
i want a variable PATH with 3 categories: 1 only, 2 only, 1 & 2
ID | PATH |
1 | 1 only |
6 | 1 & 2 |
9 | 1 only |
10 | 2 only |
2) i also have a side question....Is there a simple way to delete all ID rows where first transaction is channel =3?
Thanks for your help, much appreciated.
Hi,
The article shows steps for doing path analysis in SAS EM, not EG. Unfortunately my EM isn't working at the moment...
Here is for your second question. data have; infile cards expandtabs truncover; input ID TRANS_DATE : $20. TRANS_KEY UNITS REV CHANNEL; cards; 1 11JUL2015 1068 2 1939.98 1 1 14JUL2015 1256 1 31.99 1 1 17JUL2015 9305 2 63.98 1 1 20JUL2015 1207 2 2279.98 1 1 20JUL2015 1477 -2 -1939.98 1 1 08SEP2015 5113 2 1539.99 1 1 24FEB2016 8835 1 299.99 3 2 01SEP2015 5178 1 139.99 3 2 28SEP2015 2368 1 214.99 1 3 30JUL2015 1504 1 189.95 3 4 22DEC2015 627 2 819.98 3 4 02JAN2016 1271 -2 -819.98 1 5 12AUG2015 9140 2 649.98 3 5 12AUG2015 3423 -1 -199.99 1 5 12AUG2015 3424 2 545.93 1 6 27DEC2015 0947 2 39.98 2 6 25JAN2016 2110 1 119.99 1 6 30JAN2016 3212 1 149.99 2 6 30JAN2016 3320 1 149.99 2 6 02FEB2016 3156 0 -11 2 6 04FEB2016 2395 -1 -149.99 1 6 04FEB2016 2399 -1 -137.56 1 6 04FEB2016 431 1 99.99 3 7 05MAR2016 8426 3 1502.97 3 8 11AUG2015 4689 1 469.99 3 8 11AUG2015 8172 1 14.95 1 9 18SEP2015 1243 2 477.58 1 9 26SEP2015 5399 -2 -477.58 1 9 26SEP2015 5491 2 359.99 1 9 26DEC2015 552 1 699.99 1 9 27DEC2015 5969 -1 -699.99 1 9 27DEC2015 1018 1 999.99 1 9 03JAN2016 6506 -1 -24.99 1 9 04JAN2016 8623 -1 -999.99 1 9 10JAN2016 6097 1 610 1 9 16JAN2016 212 1 829.99 3 9 27JAN2016 6767 1 39.99 2 10 13MAR2016 6892 1 149.99 1 10 08MAY2016 5901 1 109.99 3 10 08MAY2016 124 3 7.97 1 10 08MAY2016 3683 4 1849.99 1 10 09MAY2016 5963 1 160 1 11 30AUG2015 7927 1 369.99 3 11 30AUG2015 4370 0 80 1 ; run; data want; do until(last.id); set have; by id; if first.id and CHANNEL=3 then flag=1; end; do until(last.id); set have; by id; if not flag then output; end; drop flag; run;
Thanks Xia, the program worked for me.
I have one follow-up question...do you know how i can delete rows after the first time channel=3 comes up?
or is there a way for me to flag all the rows up to and including the first channel=3 line? not sure which one is easier...
thanks for your help.
i.e. HAVE
ID | TRANS_DATE | TRANS_KEY | CHANNEL |
12 | 29FEB2016 | 9165 | 2 |
12 | 07MAR2016 | 5414 | 2 |
12 | 29APR2016 | 9977 | 2 |
12 | 30APR2016 | 4642 | 3 |
12 | 30APR2016 | 7072 | 1 |
12 | 30APR2016 | 4711 | 1 |
12 | 30APR2016 | 5578 | 3 |
12 | 30APR2016 | 5579 | 1 |
data WANT:
ID | TRANS_DATE | TRANS_KEY | CHANNEL |
12 | 29FEB2016 | 9165 | 2 |
12 | 07MAR2016 | 5414 | 2 |
12 | 29APR2016 | 9977 | 2 |
12 | 30APR2016 | 4642 | 3 |
data have; infile cards expandtabs truncover; input ID TRANS_DATE : $20. TRANS_KEY UNITS REV CHANNEL; cards; 1 11JUL2015 1068 2 1939.98 1 1 14JUL2015 1256 1 31.99 1 1 17JUL2015 9305 2 63.98 1 1 20JUL2015 1207 2 2279.98 1 1 20JUL2015 1477 -2 -1939.98 1 1 08SEP2015 5113 2 1539.99 1 1 24FEB2016 8835 1 299.99 3 2 01SEP2015 5178 1 139.99 3 2 28SEP2015 2368 1 214.99 1 3 30JUL2015 1504 1 189.95 3 4 22DEC2015 627 2 819.98 3 4 02JAN2016 1271 -2 -819.98 1 5 12AUG2015 9140 2 649.98 3 5 12AUG2015 3423 -1 -199.99 1 5 12AUG2015 3424 2 545.93 1 6 27DEC2015 0947 2 39.98 2 6 25JAN2016 2110 1 119.99 1 6 30JAN2016 3212 1 149.99 2 6 30JAN2016 3320 1 149.99 2 6 02FEB2016 3156 0 -11 2 6 04FEB2016 2395 -1 -149.99 1 6 04FEB2016 2399 -1 -137.56 1 6 04FEB2016 431 1 99.99 3 7 05MAR2016 8426 3 1502.97 3 8 11AUG2015 4689 1 469.99 3 8 11AUG2015 8172 1 14.95 1 9 18SEP2015 1243 2 477.58 1 9 26SEP2015 5399 -2 -477.58 1 9 26SEP2015 5491 2 359.99 1 9 26DEC2015 552 1 699.99 1 9 27DEC2015 5969 -1 -699.99 1 9 27DEC2015 1018 1 999.99 1 9 03JAN2016 6506 -1 -24.99 1 9 04JAN2016 8623 -1 -999.99 1 9 10JAN2016 6097 1 610 1 9 16JAN2016 212 1 829.99 3 9 27JAN2016 6767 1 39.99 2 10 13MAR2016 6892 1 149.99 1 10 08MAY2016 5901 1 109.99 3 10 08MAY2016 124 3 7.97 1 10 08MAY2016 3683 4 1849.99 1 10 09MAY2016 5963 1 160 1 11 30AUG2015 7927 1 369.99 3 11 30AUG2015 4370 0 80 1 ; run; data want; do i=1 by 1 until(last.id); set have; by id; if not flag and CHANNEL=3 then do;flag=1;k=i;end; end; do j=1 by 1 until(last.id); set have; by id; if j le k then output; end; drop flag i j k; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.