Re: Help in my pgm logic

knveraraju91 · Posted 02-18-2020 11:10 PM

data one;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
1 a 3
2 a 6
2 a 12
3 a 3
3 a 9
;

Dear,

I need help to derive the logic.

if a subject has a record in any 3 interval (3,6,9,12,15,18 ) then i need to create variable want='Yes' for the id and par and ady . If no record then want='No'. for the all the records until last record.

For example, id=1, has max ady is 12 and has all interval up to 12

id=2 has max ady 12 and has missed 9 so i create record with N

id=3 has max ady 9 and has missed 6 so i create record with N

output needed

id par ady want

1 a 3 Y
1 a 6 Y
1 a 9 Y
1 a 12 Y
2 a 3 Y
2 a 6 Y

2 a 9 N
2 a 12 Y
3 a 3 Y

3 a 6 N
3 a 9 Y

;Thank you

Ksharp · Posted 02-19-2020 01:15 AM

data one;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
1 a 3
2 a 6
2 a 12
3 a 3
3 a 9
;

proc sql;
select b.id,b.par,b.ady,coalescec(b.want,'N') as want
from (select * from (select distinct id from one),(select distinct ady from one)) as a
right join
(select *,'Y' as want from one) as b on a.id=b.id and a.ady=b.ady;
quit;

knveraraju91 · Posted 02-19-2020 07:56 AM

HI

I really like the SQL code. But it is not giving me the output i need. Please suggest. I got help from other people in group. But i interested to know SQL code.

Thank you

Kurt_Bremser · Posted 02-19-2020 09:39 AM

Every time you need to build (or work with) sequences, the data step is the tool of choice. SQL is not really good at this, unless you work with a flavor that has non-ANSI extensions.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Patrick · Posted 02-19-2020 02:08 AM

using a SAS data step:

data have;
  input id par $ ady;
  datalines;
1 a 3
1 a 6
1 a 9
1 a 12
2 a 3
2 a 6
2 a 12
3 a 3
3 a 9
;

data want(drop=_:);
  set have;
  by id ady;
  _start=sum(lag(ady),3);
  if first.id then _start=ady;
  _stop=ady;
  do ady=_start to _stop by 3;
    if ady = _stop then want='Y';
    else want='N';
    output;
  end;
run;

proc print;
run;

Kurt_Bremser · Posted 02-19-2020 03:05 AM

See this:

data have;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
1 a 3
2 a 6
2 a 12
3 a 3
3 a 9
;

proc sort data=have nodupkey;
by id par ady;
run;

data want;
retain id par ady want; /* only for settting variable order */
want = 'Y';
do until (last.par);
  set have;
  by id par;
  if ady - prevady > 3 and not first.par then want = 'N';
  prevady = ady;
end;
do until (last.par);
  set have (rename=(ady=_ady));
  by id par;
  if not first.par then do ady = prevady + 3 to _ady by 3;
    output;
  end;
  else do ady = 3 to _ady by 3;
    output;
  end;
  prevady = _ady;
end;
drop prevady _ady;
run;

proc print data=want noobs;
run;

Result:

id	par	ady	want
1	a	3	Y
1	a	6	Y
1	a	9	Y
1	a	12	Y
2	a	3	N
2	a	6	N
2	a	9	N
2	a	12	N
3	a	3	N
3	a	6	N
3	a	9	N

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

knveraraju91 · Posted 02-19-2020 10:34 AM

HI Thank you very much for the help. I am adding my input dataset.

I think some modification is needed in your suggestion. It is not giving me correct output. I need to populate want=N only if there is no record. (missing ). F

data one;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
2 a 6
2 a 12
3 a 3
3 a 9
;
proc sort data=one;
by id ady;
run;

or subjects 2 and 3 i am having all N. Please help. Thank you

output needed

id par ady want

1 a 3 Y
1 a 6 Y
1 a 9 Y
1 a 12 Y
2 a 3 N
2 a 6 Y

2 a 9 N
2 a 12 Y
3 a 3 Y

3 a 6 N
3 a 9 Y

output getting

id par ady want

1 a 3 Y
1 a 6 Y
1 a 9 Y
1 a 12 Y
2 a 3 N
2 a 6 N

2 a 9 N
2 a 12 N
3 a 3 N

3 a 6 N
3 a 9 N

Tom · Posted 02-19-2020 12:03 PM

To me this looks like you want fill out a planned set of records.

It is not clear where you are getting the set of possible ADY values. Is it just from the values that already appear in ONE?

If so then this type of query pattern will let you get all possible combinations of ID+PAR and ADY.

select *,'N' as EXTRA 
from (select distinct id,par from one)
   , (select distinct ady from one)
;

If not then you need to first make a little dataset that have one observation per value of ADY.

data ady_list;
 do ady=3 to 12 by 3;
    output;
  end;
run;
proc sql
select *,'N' as EXTRA 
from (select distinct id,par from one)
   , ady_list
;
quit;

Either way now that you have the full skeleton merge it with the actual data.

proc sql ;
create table want(drop=actual extra) as
select *
     , coalesce(ACTUAL,EXTRA) as WANT
from (select *,'Y' as ACTUAL from one) a
natural full join 
(
select *,'N' as EXTRA 
from (select distinct id,par from one)
   , (select distinct ady from one)
) b
;
quit;
proc print;
run;

Obs    id    par    ady    WANT

  1     1     a       3     Y
  2     1     a       6     Y
  3     1     a       9     Y
  4     1     a      12     Y
  5     2     a       3     N
  6     2     a       6     Y
  7     2     a       9     N
  8     2     a      12     Y
  9     3     a       3     Y
 10     3     a       6     N
 11     3     a       9     Y
 12     3     a      12     N

Kurt_Bremser · Posted 02-20-2020 01:31 AM

That makes the logic simpler, as we don't need the double do until loop, instead doing everything with counting iterations:

data have;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
1 a 3
2 a 6
2 a 12
3 a 3
3 a 9
;

proc sort data=have nodupkey;
by id par ady;
run;

data want;
set have (rename=(ady=_ady));
by id par;
_prev = lag(_ady);
if first.par
then do ady = 3 to _ady - 3 by 3; /* fill from the start */
  want = 'N';
  output;
end;
else do ady = _prev + 3 to _ady - 3 by 3; /* fill a hole */
  want = 'N';
  output;
end;
/* write the current obs */
want = 'Y';
ady = _ady;
output;
drop _:;
run;

proc print data=want noobs;
run;

Result:

id    par    ady    want

 1     a       3     Y  
 1     a       6     Y  
 1     a       9     Y  
 1     a      12     Y  
 2     a       3     N  
 2     a       6     Y  
 2     a       9     N  
 2     a      12     Y  
 3     a       3     Y  
 3     a       6     N  
 3     a       9     Y

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

mkeintz · Posted 02-19-2020 02:21 PM

This is a case where you compare the record-in-hand to the prior record.

If there are any gaps (comparing _PRIOR_ADY to current ADY-3), loop through the gap (incrementing ADY by 3's from the prior_ady), outputting "extra" records with WANT='N'. The loop will end up with ADY equal the current actual value. So set want='Y' and output the actual record:

data one;
input id par $ ady;
datalines;
1 a 3
1 a 6
1 a 9
1 a 12
2 a 6
2 a 12
3 a 3
3 a 9
;
data want;
  set one ;
  by id;
  _prior_ady=ifn(first.id,0,lag(ady));
  if _prior_ady^=ady-3 then do ady=_prior_ady +3 to ady-3 by 3;
    want='N';
    output;
  end;
  want='Y';
  output;
run;

But my comments above didn't say what to do about starting and ID with, say ADY=6. This is solved by setting the _PRIOR_ADY value to 0 when at the beginning of an ID. Only if the ID starts with ADY=3 there is no gap.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Registration is open

SAS Training: Just a Click Away