SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
ken2
Obsidian | Level 7

I have some problem regarding pattern detecting and matching in SAS.

Imagine I have some clients, at the point of beginning, 2014Q4, all the clients belong to Segment ST0.

Then one year later at 2015Q4, some belong to ST0, some to ST, some empty, some belong to Segment A.

Then in the following four quarters in 2016, the segment further changed.

 

The starting table looks like:

data TEST1;
input ClientID SEG_2014Q4 $ SEG_2015Q4 $ SEG_2016Q1 $ SEG_2016Q2 $ SEG_2016Q3 $ SEG_2016Q4 $;
cards;
1001 ST0 ST0 ST0 ST0 ST0 ST0
1002 ST0 ST0 ST , , ,
1003 ST0 ST0 , , , ,
1004 ST0 ST0 A A , ,
1005 ST0 A A A A A
1006 ST0 , BX BX , ,
1007 ST0 A A A A BD
1008 ST0 , , , , ,
1009 ST0 C C ST0 ST0 ,
;
run;

 

What I want to do is to find out the End Segment at 2015Q4 (Seg_2015q4). But I have reason to believe that the values under SEG_2015Q4 are not completely updated. Therefore I will also look into the following 4 quarters in 2016 to see if the Segnent has changed in 2016, and use that information to update Seg_2015Q4 retroactively.

 

A client always start from Seg_2014Q4 with a initial Segment=ST0. At 2015Q4 and afterwards, that client can either retain ST0, or acquired by mistake something like ST (where 0 was dropped in the delivery, but they are the same). A client can also temporarily or permanently disappeared from the list, so the Segment can be empty, missing.

 

If a client has acquired a new segment at 2015Q4, such as A, B or whatever, this segment information will stick, no matter what it may change into in the following quarters.

 

The problem lies at, for those clients having same ST0, or ST, or missing at 2015Q4, this information can not be trusted. One must look into the future to see if this client has acquired any segment other than ST0, ST or missing. If it does, the first segment information will be used to replace the old ones for 2015Q4. 

 

All the newly mapped information will be stored in SEG_2015Q4_NEW.

 

 

 

The result should look like this:

 

data TEST2;
input ClientID SEG_2014Q4 $ SEG_2015Q4_NEW $;
cards;
1001 ST0 ST0
1002 ST0 ST
1003 ST0 ST0
1004 ST0 A
1005 ST0 A
1006 ST0 BX
1007 ST0 A
1008 ST0 ,
1009 ST0 C
;
run;

 

Results:

Client SEG_2014Q4  SEG_2015Q4_NEW

1001 ST0 ST0
1002 ST0 ST
1003 ST0 ST0
1004 ST0 A
1005 ST0 A
1006 ST0 BX
1007 ST0 A
1008 ST0 ,
1009 ST0 C

 

I can use a lot of if-else logic to program this. But I am more willing to use pattern match functions in SAS, to detect those patterns not in (ST0, ST, missing) and take their value instead.

Thanks for any help.

 

Ken

2 REPLIES 2
PaigeMiller
Diamond | Level 26

While I understand why you have letter codes for each quarter ... would it be easier if you kept the entire sequence of letter codes as a single string for the purposes of this pattern matching? Then, the FIND or FINDW command can determine if there is the letter/word A somewhere in the string; etc.

--
Paige Miller
Astounding
PROC Star

This ought to work:

 

data want;

set have;

length seg_2015Q4_New $ 8;

do seg_2015Q4_new = SEG_2015Q4, SEG_2016Q1, SEG_2016Q2, SEG_2016Q3, SEG_2016Q4 

until (seg_2015Q4_new in : ('A', 'B', 'C'));

end;

run;

 

If SAS complains about the syntax, we might have to put the variables into an array instead of naming them individually here.  But I'm expecting the combination of a list with DO UNTIL will work.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 998 views
  • 0 likes
  • 3 in conversation