- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
say I have something like:
data have;
infile datalines delimiter=' ';
input group count n;
datalines;
9 1 400
9 2 297
9 3 344
9 4 32
9 5 45
9 6 6
9 7 54
9 8 34
9 9 21
10 1 32
10 2 3
10 3 57
10 4 43
10 5 45
10 6 45
10 7 34
10 8 32
10 11 5
;
but for group 10 I need to fill in missing count values per group, so I want count=9 and 10, while for n just fill in 0. Like below:
data have;
infile datalines delimiter=' ';
input group count n;
datalines;
9 1 400
9 2 297
9 3 344
9 4 32
9 5 45
9 6 6
9 7 54
9 8 34
9 9 21
10 1 32
10 2 3
10 3 57
10 4 43
10 5 45
10 6 45
10 7 34
10 8 32
10 9 0
10 10 0
10 11 5
;
I'm thinking there has to be a complete function or similar. Help is appreciated!!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @richart I believe there might be some inbuilt procedures within the SAS ETS module should you have time series dates in your real data to fill. Of course, otherwise you can always rely on a datastep-
data have;
infile datalines delimiter=' ';
input group count n;
datalines;
9 1 400
9 2 297
9 3 344
9 4 32
9 5 45
9 6 6
9 7 54
9 8 34
9 9 21
10 1 32
10 2 3
10 3 57
10 4 43
10 5 45
10 6 45
10 7 34
10 8 32
10 11 5
;
data want ;
do until (last.group) ;
set have curobs = _n ;
by group ;
if _c and dif(count) > 1 then do ;
do count = _c + 1 to count - 1 ;
n = 0 ;
output ;
end ;
set have point = _n ;
output ;
end ;
else output ;
_c =count ;
end ;
drop _c ;
run ;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @richart I believe there might be some inbuilt procedures within the SAS ETS module should you have time series dates in your real data to fill. Of course, otherwise you can always rely on a datastep-
data have;
infile datalines delimiter=' ';
input group count n;
datalines;
9 1 400
9 2 297
9 3 344
9 4 32
9 5 45
9 6 6
9 7 54
9 8 34
9 9 21
10 1 32
10 2 3
10 3 57
10 4 43
10 5 45
10 6 45
10 7 34
10 8 32
10 11 5
;
data want ;
do until (last.group) ;
set have curobs = _n ;
by group ;
if _c and dif(count) > 1 then do ;
do count = _c + 1 to count - 1 ;
n = 0 ;
output ;
end ;
set have point = _n ;
output ;
end ;
else output ;
_c =count ;
end ;
drop _c ;
run ;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello @richart ,
The SAS/ETS way as @novinosrin is suggesting :
PROC TIMESERIES data=have out=want;
by group;
id count interval=day
accumulate=median
setmiss=0 ;
var n;
format count 5.;
run;
/* end of program */
Koen
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Assuming that max possible value of count is known, you can do something like:
data want;
array temp[100] _temporary_ ; /* set it big enough */
set have;
by group;
temp[count] = n;
_max_ = max(count, _max_);
if last.group then
do;
do count = 1 to _max_;
n = coalesce(temp[count], 0);
output;
end;
call missing(of temp[*], _max_);
end;
drop _max_;
run;
proc print;
run;
Bart
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug
"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings
SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here another coding variant that should work.
data want;
set have;
by group count;
output;
_lag_count=coalesce(lag(count),1);
do while(coalesce(count,1) - _lag_count > 1);
count=count-1;
n=0;
output;
end;
drop _lag_count;
run;
proc sort data=want;
by group count;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
infile datalines delimiter=' ';
input group count n;
datalines;
9 1 400
9 2 297
9 3 344
9 4 32
9 5 45
9 6 6
9 7 54
9 8 34
9 9 21
10 1 32
10 2 3
10 3 57
10 4 43
10 5 45
10 6 45
10 7 34
10 8 32
10 11 5
;
data want;
merge have have(firstobs=2 keep=group count rename=(group=_group count=_count));
output;
if group=_group then do;
do count=count+1 to _count-1;
n=0;output;
end;
end;
drop _group _count;
run;