Hi Community,
Few values are missing in my dataset. I need to put them in sequence as given below.
Source Dataset:
| Subject | Sequence |
| 101 | 1 |
| 101 | |
| 101 | 2 |
| 101 | 2 |
| 101 | |
| 101 | |
| 101 | |
| 101 | 3 |
| 101 | 3 |
Output Dataset:
| Subject | Sequence |
| 101 | 1 |
| 101 | 1.1 |
| 101 | 2 |
| 101 | 2 |
| 101 | 2.1 |
| 101 | 2.2 |
| 101 | 2.3 |
| 101 | 3 |
| 101 | 3 |
Thank you in advance..
This is a good case for using the RETAIN statement, applied to a temporarily needed" variable:
data have;
infile datalines missover;
input subject sequence;
datalines;
101 1
101
101 2
101 2
101
101
101
101 3
101 3
run;
data want (drop=_:);
set have;
retain _sequence;
/* Replace by next line: _sequence=coalesce(sequence,sum(_sequence,0.1)); */
_sequence=round(coalesce(sequence,sum(_sequence,0.1)),0.1);
sequence=_sequence;
run;
The coalesce function returns the first non-missing argument.
Editted note: I incorporated a change to accommodate the precision issue noted by @FreelanceReinh.
If there are 10 or more blank values consecutively for variable SEQUENCE, then what? Can this actually happen?
/*You should know for numeric variable,
10.1 and 10.10 is the same thing*/
data have;
infile datalines missover;
input subject sequence;
datalines;
101 1
101
101 2
101 2
101
101
101
101 3
101 3
;
data want(drop=temp n);
set have;
by subject ;
retain temp n;
if first.subject then call missing(temp,n);
if not missing(sequence) then do;temp=sequence;want=sequence;n=0;end;
else do;n+1;want=sum(temp,n*0.1);end;
run;
Hi @gahlot1999,
Just a minor addition to the solutions you've already received:
If you want to ensure that the computed sequence numbers are equal to what they look like, i.e., if you want to avoid later "surprises" as in
168 data _null_; 169 if 1.1+0.1 ne 1.2 then put 'Surprised?'; 170 if 1+7*0.1 ne 1.7 then put 'Surprised?'; 171 run; Surprised? Surprised? NOTE: DATA statement used (Total process time):
you should either round the final result, e.g.,
sequence=round(_sequence,0.1);
want=round(sum(temp,n*0.1),0.1);
or work with integers as long as possible:
want=sum(temp*10,n)/10;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.