Hi
I have variables ending either with _LP or _HP. There are some observations need to be replaced their values in _LP (were missing) to the value in _HP. For example, conditons_LP was missing and needs to be replaced with the value shown in conditions_HP. Any suggestions? Thank you so much.
Tell us more about your data. Are the LP all adjacent in the data set? Meaning if you run proc contents the column numbers are in sequence? or if you look in a table are the column headers together? Is the order of the HP variables in the data set the same?
If the variables of a common suffix are together and the order of the base variable names is the same such as:
var1_lp var2_lp var3_lp … var25_lp <other variables perhaps> var1_hp var2_hp var3_hp … var25_hp
Then something like this will work:
data want;
   set have;
   array l   var1_lp -- var25_lp;
   array h   var1_hp -- var25_hp;
   do i=1 to dim(l);
      if missing ( l[i]) then l[i] = h[i];
   end;
run;
The key bit is to get the names of the matching variables in the array l and array h definitions to align exactly.
If scattered or intermingled how much is needed to get a properly ordered list of the two sets of variables is where the work actually goes.
Note that the variable list using 2 dashes is for variables with adjoining or sequential columns and says to use all of the variables in the interval. The Array definition will fail if all of the variables are not of the same type. In which case we need a lot more information about the actual contents of the data set.
Hi @CHL0320 Can you post a sample of the structure i.e some 5 vars
is it _LP...... series first followed by _HP or the order is random?
Tell us more about your data. Are the LP all adjacent in the data set? Meaning if you run proc contents the column numbers are in sequence? or if you look in a table are the column headers together? Is the order of the HP variables in the data set the same?
If the variables of a common suffix are together and the order of the base variable names is the same such as:
var1_lp var2_lp var3_lp … var25_lp <other variables perhaps> var1_hp var2_hp var3_hp … var25_hp
Then something like this will work:
data want;
   set have;
   array l   var1_lp -- var25_lp;
   array h   var1_hp -- var25_hp;
   do i=1 to dim(l);
      if missing ( l[i]) then l[i] = h[i];
   end;
run;
The key bit is to get the names of the matching variables in the array l and array h definitions to align exactly.
If scattered or intermingled how much is needed to get a properly ordered list of the two sets of variables is where the work actually goes.
Note that the variable list using 2 dashes is for variables with adjoining or sequential columns and says to use all of the variables in the interval. The Array definition will fail if all of the variables are not of the same type. In which case we need a lot more information about the actual contents of the data set.
The order is all the variables ending with _LP first and then the ones with_HP later.
@CHL0320 wrote:
The order is all the variables ending with _LP first and then the ones with_HP later.
If the ORDER between the _LP and _HP match then the code example I provided above should work. The first variable in each list would be the first (left most or lowest column number from proc contents) and the last would the last (right most / largest column number) for each set of variables.
There is absolutely no way we can show code with your variables if you don't share the names with us if that is what you are waiting for.
You should probably use metadata to generate the code.
proc contents data=have noprint out=contents; run;
proc sql noprint;
create table pairs as 
select a.name as LP
     , b.name as HP
from contents a 
inner join contents b
on upcase(substr(a.name,length(a.name)-2))
 = upcase(substr(b.name,length(b.name)-2))
and upcase(a.name) like '%^_LP' escape '^'
and upcase(b.name) like '%^_HP'  escape '^'
order by a.varnum
;
select lp,hp into :lp separated by ' ', :hp separated by ' '
from pairs
;
quit;
data want ;
  set have;
  array lp &lp;
  array hp &hp;
  do i=1 to dim(lp);
    lp(i)=coalesce(lp(i),hp(i));
  end;
run;
This might do, if your variables are numeric:
data have;
input a_lp a_hp b_lp b_hp c_lp;
datalines;
. 1 2 3 4
5 . . . .
. 6 . 7 8
;
proc sql;
select cats(name, "=coalesce(", name, ", ", substr(name,1,length(name)-3), "_hp)")
into :assign separated by "; "
from dictionary.columns 
where libname="WORK" and memname="HAVE" and trim(upcase(name)) like "%_LP";
quit;
data want;
set have;
&assign.;
run;
proc print data=want noobs; run;It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
