Adubhai
Obsidian | Level 7

I have a time series dataset with 418 variables. For every variable, I want to fill each missing value with the previous available value. If no previous value is available (for example, when the first value of the series is missing), the missing value should be filled with the next available value instead. With this many variables I cannot fill them one by one, so I need a single piece of code that processes all variables together.

japelin
Rhodochrosite | Level 12

How about this code?

 

data sample;
  length a b c $10 d e f 8;
  infile datalines dsd missover;
  input a b c d e f;
datalines;
aa,bb,  ,1, ,3
aa,  ,cc,1,2,3
dd,x ,  , ,6,  
  ,  ,bc,9, ,7
  ,  ,cc, , ,
;
run;

/* remember number of variables */
data _null_;
  set sashelp.vtable;
  where memname='SAMPLE';
  call symputx('cvars',num_character);
  call symputx('nvars',num_numeric);
run;

data want(drop=i);
  set sample;
  array _n{&nvars.} _numeric_;
  array _c{&cvars.} _character_;
  array _rn{&nvars.} 8 _temporary_;/* variables for keeping data of previous observation */
  array _rc{&cvars.} $ 10 _temporary_;/* previous character values; a bare $ defaults to length 8 and would truncate the $10 variables */
  retain _rc: _rn:;
  do i=1 to dim(_n);
    if missing(_n{i}) then _n{i}=_rn{i};
    _rn{i}=_n{i};
  end;
  do i=1 to dim(_c);
    if missing(_c{i}) then _c{i}=_rc{i};
    _rc{i}=_c{i};
  end;
run;
Adubhai
Obsidian | Level 7
Sorry, but I am quite new to SAS and was unable to adapt this code to my dataset. If I understand correctly, I don't need to run the first few lines because I already have a dataset, so I copied everything from "/* remember number of variables */" to the end. But I am not sure where in the code to put the name of my dataset.
japelin
Rhodochrosite | Level 12

OK.

You need to modify the two statements below.

If your dataset is temp.original, then use

where libname=upcase('temp') and memname=upcase('original');

and

set temp.original;

That's all.

 

 

 

/* remember number of variables */
data _null_;
  set sashelp.vtable;
  where libname=upcase('WORK') and  memname=upcase('SAMPLE');/* modify this line to your library and dataset name */
  call symputx('cvars',num_character);
  call symputx('nvars',num_numeric);
run;

data want(drop=i);
  set work.sample;/* modify your dataset name */
  array _n{&nvars.} _numeric_;
  array _c{&cvars.} _character_;
  array _rn{&nvars.} 8 _temporary_;/* variables for keeping data of previous observation */
  array _rc{&cvars.} $ 10 _temporary_;/* previous character values; set the length to your longest character variable (a bare $ defaults to 8) */
  retain _rc: _rn:;
  do i=1 to dim(_n);
    if missing(_n{i}) then _n{i}=_rn{i};
    _rn{i}=_n{i};
  end;
  do i=1 to dim(_c);
    if missing(_c{i}) then _c{i}=_rc{i};
    _rc{i}=_c{i};
  end;
run;

 

 

Adubhai
Obsidian | Level 7
Thanks for clearing that up. The following errors appeared in the log after I ran the code:

ERROR: Invalid dimension specification for array _c. The upper bound of an array dimension is smaller than its corresponding lower bound.
ERROR: Too few variables defined for the dimension(s) specified for the array _c.

and

ERROR: Invalid dimension specification for array _rc. The upper bound of an array dimension is smaller than its corresponding lower bound.
japelin
Rhodochrosite | Level 12

I think that is because there are no character variables in your dataset.

Run the code below. It checks whether numeric and character variables exist before declaring the arrays.

 

data want(drop=i);
  set work.sample;/* modify your dataset name */
  
  %if &nvars>0 %then %do;
    array _n{&nvars.} _numeric_;
    array _rn{&nvars.} 8 _temporary_;/* variables for keeping data of previous observation */
    retain _rn:;
    do i=1 to dim(_n);
      if missing(_n{i}) then _n{i}=_rn{i};
      _rn{i}=_n{i};
    end;
  %end;
  %if &cvars>0 %then %do;
    array _c{&cvars.} _character_;
    array _rc{&cvars.} $ 10 _temporary_;/* previous character values; set the length to your longest character variable (a bare $ defaults to 8) */
    retain _rc:;
    do i=1 to dim(_c);
      if missing(_c{i}) then _c{i}=_rc{i};
      _rc{i}=_c{i};
    end;
  %end;
run;
Adubhai
Obsidian | Level 7

Yes, you are right, I don't have any character variables. Sorry, I should have mentioned that before. The new code produces the same errors for the character variable section. It also shows some new errors:

ERROR: The %IF statement is not valid in open code

and

ERROR: The %END statement is not valid in open code.

 

Thanks for patiently replying to all my queries. 
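
For reference, the "%IF statement is not valid in open code" error arises because %IF/%DO blocks are, as far as I know, only accepted outside a macro definition from SAS 9.4M5 onward (and with restrictions there). On earlier releases the macro statements are rejected, so the array statements presumably still compile with &cvars equal to 0, which would explain why the array errors persisted. A minimal sketch of one workaround is to wrap the step in a macro. The macro name fill_forward, its in= and out= parameters, and the character length of 200 are assumptions made for illustration, and &nvars and &cvars must already have been set by the "remember number of variables" step:

```sas
/* fill_forward is a hypothetical macro name; &nvars and &cvars must already
   exist (created by the "remember number of variables" step) */
%macro fill_forward(in=work.sample, out=want);
  data &out.(drop=i);
    set &in.;
    %if &nvars. > 0 %then %do;
      array _n{&nvars.} _numeric_;
      array _rn{&nvars.} 8 _temporary_; /* last non-missing numeric values */
      do i=1 to dim(_n);
        if missing(_n{i}) then _n{i}=_rn{i};
        _rn{i}=_n{i};
      end;
    %end;
    %if &cvars. > 0 %then %do;
      array _c{&cvars.} _character_;
      array _rc{&cvars.} $ 200 _temporary_; /* 200 is an assumed maximum length */
      do i=1 to dim(_c);
        if missing(_c{i}) then _c{i}=_rc{i};
        _rc{i}=_c{i};
      end;
    %end;
  run;
%mend fill_forward;

/* modify in= to your library and dataset name */
%fill_forward(in=work.sample, out=want)
```

No RETAIN statement appears in this sketch because elements of a _temporary_ array keep their values across observations automatically.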

japelin
Rhodochrosite | Level 12

OK, this version is for numeric variables only.

 

data want(drop=i);
  set work.sample;/* modify your dataset name */
  
  array _n{&nvars.} _numeric_;
  array _rn{&nvars.} 8 _temporary_;/* variables for keeping data of previous observation */
  retain _rn:;
  do i=1 to dim(_n);
    if missing(_n{i}) then _n{i}=_rn{i};
    _rn{i}=_n{i};
  end;
run;

 

Adubhai
Obsidian | Level 7
Thanks a lot. This works partially. It replaced the missing values with the previous values.
But when previous values are not available, then it did not take the next available value to replace. Is that possible?
japelin
Rhodochrosite | Level 12

Basically, you can do this by sorting in descending order, because the carry-forward logic in a data step only sees values from earlier observations.
The procedure is to fill forward once, sort in descending order, fill the remaining missing values the same way, and then re-sort into the original order.

To make the re-sort possible, keep the observation number as a key variable, using _n_ in the first data step.

 


/* remember number of variables */
data _null_;
  set sashelp.vtable;
  where libname=upcase('WORK') and  memname=upcase('SAMPLE');/* modify this line to your library and dataset name */
  call symputx('nvars',num_numeric+1);/* +1:for key variable */
run;

data want(drop=i);
  set work.sample;/* modify your dataset name */
  key=_n_;
  array _n{&nvars.} _numeric_;
  array _rn{&nvars.} 8 _temporary_;/* variables for keeping data of previous observation */
  retain _rn:;
  do i=1 to dim(_n);
    if missing(_n{i}) then _n{i}=_rn{i};
    _rn{i}=_n{i};
  end;
run;

/* reverse the order so that the same fill logic picks up the next available value */
proc sort data=want out=sort;
  by descending key;
run;

data want(drop=i);
  set work.sort;
  array _n{&nvars.} _numeric_;
  array _rn{&nvars.} 8 _temporary_;/* variables for keeping data of previous observation */
  retain _rn:;
  do i=1 to dim(_n);
    if missing(_n{i}) then _n{i}=_rn{i};
    _rn{i}=_n{i};
  end;
run;

proc sort data=want out=want(drop=key);
  by key;
run;
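
One quick way to confirm that no gaps remain is to count the missing values per variable. This is only a sketch and assumes the final dataset is named want, as in the code above:

```sas
/* NMISS should be 0 for every variable that has at least one non-missing value */
proc means data=want n nmiss;
  var _numeric_;
run;
```

A variable that is missing in every observation cannot be filled by either pass, so it would still show a nonzero NMISS.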

 

 

Adubhai
Obsidian | Level 7

Thank you so much. This should be the accepted solution, but I had already chosen the previous one. This worked perfectly.
