BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
CHL0320
Obsidian | Level 7

Hi 

 

I have variables ending either with _LP or _HP.  There are some observations need to be replaced their values in _LP (were missing) to the value in _HP.  For example, conditons_LP was missing and needs to be replaced with the value shown in conditions_HP. Any suggestions? Thank you so much. 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Tell us more about your data. Are the LP all adjacent in the data set? Meaning if you run proc contents the column numbers are in sequence? or if you look in a table are the column headers together? Is the order of the HP variables in the data set the same?

 

If the variables of a common suffix are together and the order of the base variable names is the same such as:

 

var1_lp var2_lp var3_lp … var25_lp     <other variables perhaps> var1_hp var2_hp var3_hp … var25_hp

 

Then something like this will work:

data want;
   set have;
   array l   var1_lp -- var25_lp;
   array h   var1_hp -- var25_hp;
   do i=1 to dim(l);
      if missing ( l[i]) then l[i] = h[i];
   end;
run;

The key bit is to get the names of the matching variables in the array l and array h definitions to align exactly.

If scattered or intermingled how much is needed to get a properly ordered list of the two sets of variables is where the work actually goes.

 

Note that the variable list using 2 dashes is for variables with adjoining or sequential columns and says to use all of the variables in the interval. The Array definition will fail if all of the variables are not of the same type. In which case we need a lot more information about the actual contents of the data set.

 

View solution in original post

8 REPLIES 8
CHL0320
Obsidian | Level 7
276 variables with the suffix ending with _LP and the other 276 variables with _HP. Plus one ID variable. There is a total of 553 variables so far. 
 
Thank you.
novinosrin
Tourmaline | Level 20

Hi @CHL0320   Can you post a sample of the structure i.e some 5 vars

 

is it _LP...... series first followed by _HP or the order is random?

ballardw
Super User

Tell us more about your data. Are the LP all adjacent in the data set? Meaning if you run proc contents the column numbers are in sequence? or if you look in a table are the column headers together? Is the order of the HP variables in the data set the same?

 

If the variables of a common suffix are together and the order of the base variable names is the same such as:

 

var1_lp var2_lp var3_lp … var25_lp     <other variables perhaps> var1_hp var2_hp var3_hp … var25_hp

 

Then something like this will work:

data want;
   set have;
   array l   var1_lp -- var25_lp;
   array h   var1_hp -- var25_hp;
   do i=1 to dim(l);
      if missing ( l[i]) then l[i] = h[i];
   end;
run;

The key bit is to get the names of the matching variables in the array l and array h definitions to align exactly.

If scattered or intermingled how much is needed to get a properly ordered list of the two sets of variables is where the work actually goes.

 

Note that the variable list using 2 dashes is for variables with adjoining or sequential columns and says to use all of the variables in the interval. The Array definition will fail if all of the variables are not of the same type. In which case we need a lot more information about the actual contents of the data set.

 

CHL0320
Obsidian | Level 7

The order is all the variables ending with _LP first and then the ones with_HP later.

ballardw
Super User

@CHL0320 wrote:

The order is all the variables ending with _LP first and then the ones with_HP later.


If the ORDER between the _LP and _HP match then the code example I provided above should work. The first variable in each list would be the first (left most or lowest column number from proc contents) and the last would the last (right most / largest column number) for each set of variables.

 

There is absolutely no way we can show code with your variables if you don't share the names with us if that is what you are waiting for.

Tom
Super User Tom
Super User

You should probably use metadata to generate the code.

proc contents data=have noprint out=contents; run;
proc sql noprint;
create table pairs as 
select a.name as LP
     , b.name as HP
from contents a 
inner join contents b
on upcase(substr(a.name,length(a.name)-2))
 = upcase(substr(b.name,length(b.name)-2))
and upcase(a.name) like '%^_LP' escape '^'
and upcase(b.name) like '%^_HP'  escape '^'
order by a.varnum
;
select lp,hp into :lp separated by ' ', :hp separated by ' '
from pairs
;
quit;
data want ;
  set have;
  array lp &lp;
  array hp &hp;
  do i=1 to dim(lp);
    lp(i)=coalesce(lp(i),hp(i));
  end;
run;
PGStats
Opal | Level 21

This might do, if your variables are numeric:

 

data have;
input a_lp a_hp b_lp b_hp c_lp;
datalines;
. 1 2 3 4
5 . . . .
. 6 . 7 8
;

proc sql;
select cats(name, "=coalesce(", name, ", ", substr(name,1,length(name)-3), "_hp)")
into :assign separated by "; "
from dictionary.columns 
where libname="WORK" and memname="HAVE" and trim(upcase(name)) like "%_LP";
quit;

data want;
set have;
&assign.;
run;

proc print data=want noobs; run;
PG

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2326 views
  • 0 likes
  • 6 in conversation