Hello. My task is this: "The internship consists of 18 lessons. Create new variable named "present" and set to “Y” if student was present in that lesson or set to “N” if student was absent. "
I tried this code but it didn't work. Can you help me correct it?
data temp;
length name $20.;
length present $15.;
Present="Y";
do visit=1,2,3,4,8,10,15;
name='Namea';
output;
end;
do visit=2,15;
name='Nameb';
output;
end;
do visit=1,2,5,7,10,11,12,13,14,15;
name='Namec';
output;
end;
do visit=1,2,10,11,14,15;
name='Named';
output;
end;
do visit=1 to 15;
name='Namee';
output;
end;
run;
proc sort data=temp out=temps nodupkey;
by name;
run;
data dummy;
set temps(keep=name);
do lesson=1 to 18;
output;
end;
run;
proc sort data=dummy out=dummys nodup;
by name;
run;
data merged;
merge temps(in=a) dummys(in=b);
by name;
if a=b then present="Y";
else present="N";
run;
The sorts and the merge have to be by both name and visit, and lesson needs to be renamed to visit so that the BY variables match. Something like this:
data temp;
length name $20.;
length present $15.;
Present="Y";
do visit=1,2,3,4,8,10,15;
name='Namea';
output;
end;
do visit=2,15;
name='Nameb';
output;
end;
do visit=1,2,5,7,10,11,12,13,14,15;
name='Namec';
output;
end;
do visit=1,2,10,11,14,15;
name='Named';
output;
end;
do visit=1 to 15;
name='Namee';
output;
end;
run;
proc sort data=temp out=temps nodupkey;
by name visit;
run;
data dummy;
set temps(keep=name);
do lesson=1 to 18;
output;
end;
run;
proc sort data=dummy out=dummys nodup;
by name lesson;
run;
data merged;
merge temps(in=a) dummys(in=b rename=(lesson=visit));
by name visit;
if a=b then presentCheck="Y";
else presentCheck="N";
run;
Note that this A=B trick:
merge A(in=a) B(in=b) ;
by ...;
if a=b then ...
will only work when you are merging exactly two datasets.
It works with two datasets because only three of the four possible combinations of the two binary flags can occur: A and B cannot both be zero (false), since then there would be no observation at all.
But once you introduce another source dataset, both A and B can be false, in which case their values are equal and the test is true even though neither dataset contributed.
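The three-dataset case can be sketched with a few throwaway rows. This is only an illustration: the datasets A, B and C, the key ID, and the output variables FROM_A, FROM_B, EQUAL_FLAGS and BOTH_PRESENT are all made up for the example and are not part of the original question.

```sas
/* Three small hypothetical datasets keyed by ID */
data a; do id = 1, 2; output; end; run;
data b; do id = 2, 3; output; end; run;
data c; id = 4; output; run;

data check;
  merge a(in=a) b(in=b) c(in=c);
  by id;
  /* IN= variables are not written to the output, so copy them */
  from_a = a;
  from_b = b;
  equal_flags  = (a = b);    /* 1 when the flags agree, including 0 = 0 */
  both_present = (a and b);  /* 1 only when A and B both contribute */
run;

proc print data=check noobs; run;
```

For ID=4, which exists only in C, both flags are 0, so EQUAL_FLAGS is 1 while BOTH_PRESENT is 0. For ID=2, which is in both A and B, the two tests agree.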
It is much clearer to just use
if a and b then ...
If you want to save typing two letters you could use & to represent AND.
if a & b then ...
but that is also going to cause the programmer reviewing the code to slow down and remember that & means AND.
I would write the condition like this:
present = ifc(a and b,"Y","N");
So if you have this list of NAME/VISIT combinations:
data temp;
infile cards truncover ;
input name $ visit @ ;
do while(visit ne .);
output;
input visit @;
end;
cards;
A 1 2 3 4 8 10 15
B 2 15
C 1 2 5 7 10 11 12 13 14 15
D 1 2 10 11 14 15
E 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
;
And you want to generate data that has 18 visits for every NAME, you can just use a step like this:
data dummy;
set temp;
by name;
if first.name ;
do visit=1 to 18;
output;
end;
run;
Now you can merge them and use the IN= flag from the original dataset (TEMP here) to create your PRESENT variable.
Personally I find 0/1 variables easier to work with than N/Y variables.
data want ;
merge dummy temp(in=actual);
by name visit;
present=actual;
run;
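One reason a 0/1 flag can be easier to work with is that it can be summed and averaged directly. A minimal sketch, assuming the WANT dataset from the step above; the output dataset ATTENDANCE and the variables LESSONS_ATTENDED, ATTENDANCE_RATE and PRESENT_YN are hypothetical names chosen for the example.

```sas
/* Sum of a 0/1 flag = lessons attended; mean = share of the 18 lessons */
proc means data=want noprint nway;
  class name;
  var present;
  output out=attendance sum=lessons_attended mean=attendance_rate;
run;

/* If the assignment insists on Y/N, derive it from the 0/1 flag */
data want_yn;
  set want;
  length present_yn $1;
  present_yn = ifc(present, 'Y', 'N');
run;
```

Keeping the numeric flag and deriving the character version only at the end means the summary step needs no recoding.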
Data TEMP, as you constructed it, is already sorted by name. So:
data want;
array classes {18} _temporary_ ;
set temp;
by name;
if first.name then call missing(of classes{*});
classes{visit}=1;
if last.name;
do visit=lbound(classes) to hbound(classes);
present=ifc(classes{visit}=1,'Y','N');
output;
end;
run;