I have the following dataset:
Match ID Comments Date Survivor
1 Market Only
1 Different
1 Different
2 Market Only
2 Different
3 Same 10-07-2015
3 Same 11-08-2015
3 Same 12-09-2015
What I need to do is compare the observations within a given Match ID like follows:
- If there are two 'different' observations within a given match id, then Survivor=' ' (eg. where match id is 1)
-if there is one 'different' observation and others are 'market only' , then Survior='Y' for the case where it is 'different'
-if all are 'same' within a given match id, then survivor='Y' for the case with latest created date.
Essentially, the output should be as follows:
Match ID Comments Date Survivor
1 Market Only
1 Different
1 Different
2 Market Only
2 Different 'Y'
3 Same 10-07-2015
3 Same 11-08-2015
3 Same 12-09-2015 'Y'
Any help is greatly appreciated. Thank you
data HAVE;
infile cards missover;
input MATCH_ID 1. COMMENTS $12. DATE mmddyy10. ;
format DATE date9.;
cards;
1 Market Only
1 Different
1 Different
2 Market Only
2 Different
3 Same 10-07-2015
3 Same 11-08-2015
3 Same 12-09-2015
run;
proc sort ;
by MATCH_ID descending COMMENTS DATE;
data WANT;
set HAVE;
by MATCH_ID;
if first.MATCH_ID then call missing(NB_DIFFERENT, NB_SAME, NB_ID);
if COMMENTS='Different' then NB_DIFFERENT+1;
if COMMENTS='Same' then NB_SAME +1;
NB_ID+1;
if last.MATCH_ID then do;
if NB_DIFFERENT=1 then SURVIVOR='Y';
if NB_SAME=NB_ID then SURVIVOR='Y';
end;
run;
Below code works with your sample data.
data have;
infile datalines dsd dlm=',' truncover;
input Match_ID $ Comments :$15. Date :ddmmyy10.;
format date date9.;
datalines;
1,Market Only
1,Different
1,Different
2,Market Only
2,Different
3,Same,10-07-2015
3,Same,11-08-2015
3,Same,12-09-2015
;
run;
proc format;
value $comments (default=1)
'Market Only' = 'A'
'Different' = 'B'
'Same' = 'C'
;
run;
data inter;
set have;
_com=put(Comments,$comments.);
run;
proc transpose data=inter out=inter(drop=_name_) prefix=_comment;
by Match_ID;
var _com;
run;
data want(drop=_:);
merge have inter;
by Match_id;
array com {*} _comment:;
if last.match_id then
do;
call sortc(of com[*]);
_string=cats(of com[*]);
if prxmatch('/^A+B$/',strip(_string))>0 then Survivor='Y';
if prxmatch('/^C+$/',strip(_string))>0 then Survivor='Y';
end;
run;
True if the string starts with one or several 'A' and ends with exactly one 'B'
if prxmatch('/^A+B$/',strip(_string))>0 then Survivor='Y';
True if the string only contains one or several 'C'
if prxmatch('/^C+$/',strip(_string))>0 then Survivor='Y';
data HAVE;
infile cards missover;
input MATCH_ID 1. COMMENTS $12. DATE mmddyy10. ;
format DATE date9.;
cards;
1 Market Only
1 Different
1 Different
2 Market Only
2 Different
3 Same 10-07-2015
3 Same 11-08-2015
3 Same 12-09-2015
run;
proc sort ;
by MATCH_ID descending COMMENTS DATE;
data WANT;
set HAVE;
by MATCH_ID;
if first.MATCH_ID then call missing(NB_DIFFERENT, NB_SAME, NB_ID);
if COMMENTS='Different' then NB_DIFFERENT+1;
if COMMENTS='Same' then NB_SAME +1;
NB_ID+1;
if last.MATCH_ID then do;
if NB_DIFFERENT=1 then SURVIVOR='Y';
if NB_SAME=NB_ID then SURVIVOR='Y';
end;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.