I have a file with a case number that is sporadically filled in. The file is 500k plus rows, If the next time a case # is filled in happens to be the same case # as before I want to fill in the blanks. I thought of using lag() but some of the missing rows are pretty large. If the next case # populated is different the rows between should be left blank.
data have;
infile cards dsd;
input case$ ;
cards;
500
,
,
,
500
,
600
,
,
700
700
,
700
;
run;
data want;
infile cards dsd;
input case$ ;
cards;
500
500
500
500
500
,
600
,
,
700
700
700
700
;
run;
Try this
data have;
infile cards dsd;
input case ;
cards;
500
,
,
,
500
,
600
,
,
700
700
,
700
;
run;
data want;
do _N_ = 1 by 1 until (case);
if _N_ = 1 then c = case;
set have;
end;
flag = (c = case);
do _N_ = 1 to _N_;
set have;
if flag then case = c;
output;
end;
run;
There is also an ID field,
I can use:
data want;
set have;
by id;
retain _case;
if not missing(case_id) then _case = case_id;
run;
that gets me part of the way, but it fills in the last rows when the next value will not be the same. I suppose I could sort the other way and delete some.
Try this
data have;
infile cards dsd;
input case ;
cards;
500
,
,
,
500
,
600
,
,
700
700
,
700
;
run;
data want;
do _N_ = 1 by 1 until (case);
if _N_ = 1 then c = case;
set have;
end;
flag = (c = case);
do _N_ = 1 to _N_;
set have;
if flag then case = c;
output;
end;
run;
this looks great, can up modify this for case as a character?
I see the issue, case is before the set statement so it's a number. I added format case $500.; after the data step and this works. thanks for the help.
Next code can create a dataset where cases are in next format:
case1 case2 case3
case2 case3 case4
case3 case4 case5
...
case(n-2) case(n-1) case (n)
case (n-1) case(n)
case(n) /* <<< the last one */
The code is not tested:
data temp;
merge have (rename=(case=case_prev)) /* firstobs=1 */
have (firstobs=2 rename=(case=case_current))
have (firstobs=3 reanme=(case=case_next);
by;
run;
data want (keep=case);
set temp end=eof;
retain previous_case;
if _N_=1 then do;
case = case_prev; output;
if case_next = case_curr then case=.; /* missing value as blank */
else case = case_curr;
output;
previous_case = case;
end; else do;
if case_curr = previous_case then case=.;
else case = case_curr;
output;
previous_case = case;
end;
if eof then do; case = case_prev; output; end;
run;
Just for fun .
data have;
infile cards dsd;
n+1;
input case ;
cards;
500
,
,
,
500
,
600
,
,
700
700
,
700
;
run;
data have;
set have;
retain x1;
if not missing(case) then x1=case;
run;
proc sort data=have;by descending n;run;
data have;
set have;
retain x2;
if not missing(case) then x2=case;
run;
proc sort data=have;by n;run;
data want;
set have;
if x1=x2 then case=x1;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.