Hi every one I wanted to know how do i find the new data entry of the week ?
For Example My data looks like this
Input :
Week Patient
W1 P1
W1 P2
W1 P3
W2 P1
W2 P2
W2 P4
W3 P3
W3 P4
W3 P5
Desired output
Week Patient Flag
W1 P1
W1 P2
W1 P3
W2 P1
W2 P2
W2 P4 New
W3 P3
W3 P4
W3 P5 New
Basically if i am finding the new patient in w6 then I have to look back all possible week and compare how many of these patients have visited me before W6 and if there a new which was not there till w6 then I have to mark that one as a new patient
The weekly data set I have is huge I there is no way i can write individual subset code for everyone
any help will be much appreciated
Thanks in advance
It looks like you have a second, unwritten rule here: Don't mark anybody in W1 as "New".
To get the result you are asking for:
proc sort data=have;
by patient week;
run;
data want;
set have;
by patient week;
if first.patient and week ne 'W1' then flag='New';
run;
There is a hidden pitfall here. This program assumes that the weeks go as high as 9, but no higher. Otherwise the sorted order would be wrong:
W1
W10
W11
W2
If you have more than 9 weeks, you would need to extract a numeric variable from WEEK:
weeknum = input( substr(week,2), 2.);
Then use WEEKNUM instead of WEEK for sorting.
A quick follow-up here ... SAS provides a way that you can sort character values but make "W9" sort before "W10". Here's the revised solution:
proc sort data=have sortseq=linguistic (numeric_collation=on);
by patient week;
run;
data want;
set have;
by patient;
if first.patient and week ne 'W1' then flag='New';
run;
Note that the second BY statement has changed as well.
data have;
input (Week Patient) ($) ;
datalines;
W1 P1
W1 P2
W1 P3
W2 P1
W2 P2
W2 P4
W3 P3
W3 P4
W3 P5
;
data want;
if _n_=1 then do;
if 0 then set have;
dcl hash h();
h.definekey('patient');
h.definedone();
end;
set have end=last;
by Week Patient;
length Flag $10;
call missing(Flag);
if first.week then _n+1;
if _n>1 then do;
if h.check() ne 0 then Flag='NEW';
end;
h.replace();
drop _:;
run;
Check this out.
Data have;
infile datalines dlm=" ";
input Week $ Patient $;
datalines;
W1 P1
W1 P2
W1 P3
W2 P1
W2 P2
W2 P4
W3 P3
W3 P4
W3 P5
;
run;
PROC SORT DATA=HAVE;
BY WEEK;
RUN;
%LET Patient_List=PATIENTS;
DATA WANT;
SET HAVE ;
IF FIND(SYMGET("Patient_List"),STRIP(PATIENT))=0
THEN DO;
CALL SYMPUT("Patient_List",CATX(',',SYMGET("Patient_List"),PATIENT));
FLAG="NEW";
END;
IF WEEK="W1" THEN FLAG="";
RUN;
arrays:
data have;
input (Week Patient) ($) ;
datalines;
W1 P1
W1 P2
W1 P3
W2 P1
W2 P2
W2 P4
W3 P3
W3 P4
W3 P5
;
data want;
do _n_=1 by 1 until(eof);
set have end=eof;
by Week Patient;
array temp(1000)$ _temporary_ ;/*arbitrary subscript value*/
length Flag $10;
call missing(Flag);
if first.week then _n+1;
if _n=1 then temp(_n_)=Patient;
else if _n>1 and patient not in temp then do; Flag="NEW";temp(_n_)=Patient;end;
output;
end;
drop _:;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.