data one;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
1 a 3
2 a 6
2 a 12
3 a 3
3 a 9
;
Dear,
I need help to derive the logic.
if a subject has a record in any 3 interval (3,6,9,12,15,18 ) then i need to create variable want='Yes' for the id and par and ady . If no record then want='No'. for the all the records until last record.
For example, id=1, has max ady is 12 and has all interval up to 12
id=2 has max ady 12 and has missed 9 so i create record with N
id=3 has max ady 9 and has missed 6 so i create record with N
output needed
id par ady want
1 a 3 Y
1 a 6 Y
1 a 9 Y
1 a 12 Y
2 a 3 Y
2 a 6 Y
2 a 9 N
2 a 12 Y
3 a 3 Y
3 a 6 N
3 a 9 Y
;Thank you
data one;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
1 a 3
2 a 6
2 a 12
3 a 3
3 a 9
;
proc sql;
select b.id,b.par,b.ady,coalescec(b.want,'N') as want
from (select * from (select distinct id from one),(select distinct ady from one)) as a
right join
(select *,'Y' as want from one) as b on a.id=b.id and a.ady=b.ady;
quit;
HI
I really like the SQL code. But it is not giving me the output i need. Please suggest. I got help from other people in group. But i interested to know SQL code.
Thank you
Every time you need to build (or work with) sequences, the data step is the tool of choice. SQL is not really good at this, unless you work with a flavor that has non-ANSI extensions.
using a SAS data step:
data have;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
2 a 3
2 a 6
2 a 12
3 a 3
3 a 9
;
data want(drop=_:);
set have;
by id ady;
_start=sum(lag(ady),3);
if first.id then _start=ady;
_stop=ady;
do ady=_start to _stop by 3;
if ady = _stop then want='Y';
else want='N';
output;
end;
run;
proc print;
run;
See this:
data have;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
1 a 3
2 a 6
2 a 12
3 a 3
3 a 9
;
proc sort data=have nodupkey;
by id par ady;
run;
data want;
retain id par ady want; /* only for settting variable order */
want = 'Y';
do until (last.par);
set have;
by id par;
if ady - prevady > 3 and not first.par then want = 'N';
prevady = ady;
end;
do until (last.par);
set have (rename=(ady=_ady));
by id par;
if not first.par then do ady = prevady + 3 to _ady by 3;
output;
end;
else do ady = 3 to _ady by 3;
output;
end;
prevady = _ady;
end;
drop prevady _ady;
run;
proc print data=want noobs;
run;
Result:
id par ady want 1 a 3 Y 1 a 6 Y 1 a 9 Y 1 a 12 Y 2 a 3 N 2 a 6 N 2 a 9 N 2 a 12 N 3 a 3 N 3 a 6 N 3 a 9 N
HI Thank you very much for the help. I am adding my input dataset.
I think some modification is needed in your suggestion. It is not giving me correct output. I need to populate want=N only if there is no record. (missing ). F
data one;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
2 a 6
2 a 12
3 a 3
3 a 9
;
proc sort data=one;
by id ady;
run;
or subjects 2 and 3 i am having all N. Please help. Thank you
output needed
id par ady want
1 a 3 Y
1 a 6 Y
1 a 9 Y
1 a 12 Y
2 a 3 N
2 a 6 Y
2 a 9 N
2 a 12 Y
3 a 3 Y
3 a 6 N
3 a 9 Y
output getting
id par ady want
1 a 3 Y
1 a 6 Y
1 a 9 Y
1 a 12 Y
2 a 3 N
2 a 6 N
2 a 9 N
2 a 12 N
3 a 3 N
3 a 6 N
3 a 9 N
To me this looks like you want fill out a planned set of records.
It is not clear where you are getting the set of possible ADY values. Is it just from the values that already appear in ONE?
If so then this type of query pattern will let you get all possible combinations of ID+PAR and ADY.
select *,'N' as EXTRA
from (select distinct id,par from one)
, (select distinct ady from one)
;
If not then you need to first make a little dataset that have one observation per value of ADY.
data ady_list;
do ady=3 to 12 by 3;
output;
end;
run;
proc sql
select *,'N' as EXTRA
from (select distinct id,par from one)
, ady_list
;
quit;
Either way now that you have the full skeleton merge it with the actual data.
proc sql ;
create table want(drop=actual extra) as
select *
, coalesce(ACTUAL,EXTRA) as WANT
from (select *,'Y' as ACTUAL from one) a
natural full join
(
select *,'N' as EXTRA
from (select distinct id,par from one)
, (select distinct ady from one)
) b
;
quit;
proc print;
run;
Obs id par ady WANT 1 1 a 3 Y 2 1 a 6 Y 3 1 a 9 Y 4 1 a 12 Y 5 2 a 3 N 6 2 a 6 Y 7 2 a 9 N 8 2 a 12 Y 9 3 a 3 Y 10 3 a 6 N 11 3 a 9 Y 12 3 a 12 N
That makes the logic simpler, as we don't need the double do until loop, instead doing everything with counting iterations:
data have;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
1 a 3
2 a 6
2 a 12
3 a 3
3 a 9
;
proc sort data=have nodupkey;
by id par ady;
run;
data want;
set have (rename=(ady=_ady));
by id par;
_prev = lag(_ady);
if first.par
then do ady = 3 to _ady - 3 by 3; /* fill from the start */
want = 'N';
output;
end;
else do ady = _prev + 3 to _ady - 3 by 3; /* fill a hole */
want = 'N';
output;
end;
/* write the current obs */
want = 'Y';
ady = _ady;
output;
drop _:;
run;
proc print data=want noobs;
run;
Result:
id par ady want 1 a 3 Y 1 a 6 Y 1 a 9 Y 1 a 12 Y 2 a 3 N 2 a 6 Y 2 a 9 N 2 a 12 Y 3 a 3 Y 3 a 6 N 3 a 9 Y
This is a case where you compare the record-in-hand to the prior record.
If there are any gaps (comparing _PRIOR_ADY to current ADY-3), loop through the gap (incrementing ADY by 3's from the prior_ady), outputting "extra" records with WANT='N'. The loop will end up with ADY equal the current actual value. So set want='Y' and output the actual record:
data one;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
2 a 6
2 a 12
3 a 3
3 a 9
;
data want;
set one ;
by id;
_prior_ady=ifn(first.id,0,lag(ady));
if _prior_ady^=ady-3 then do ady=_prior_ady +3 to ady-3 by 3;
want='N';
output;
end;
want='Y';
output;
run;
But my comments above didn't say what to do about starting and ID with, say ADY=6. This is solved by setting the _PRIOR_ADY value to 0 when at the beginning of an ID. Only if the ID starts with ADY=3 there is no gap.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.