BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yoyong
Obsidian | Level 7

Hi.

 

How do I merge two columns like these two:

 

         Obs        Height1        Height2

            1             150

            2                                 160

            3             170              170 

            4             180               190

            5              NA                NA

 

The result should look like 

      

         Obs        Height1        Height2               HeightA         HeightB

            1             150                                         150                 150 

            2                                 160                     160                 160

            3             170              170                     170                 170

            4             180              190                     180                 190 

            5              NA                NA                     NA                  NA  

 

Thanks.

        

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Perhaps a simple approach is best:

 

data want;

set have;

if Height1 > ' ' then HeightA = Height1;

else HeightA = Height2;

If Height2 > ' ' then HeightB = Height2;

else HeightB = Height1;

run;

 

At least consider where the proposed solutions are the same and where they are different, especially observation 4.

 

Actually, this should be equivalent to a slight variation on a previous proposal:

 

data want;

set have;

HeightA = coalesce(Height1, Height2);

HeightB = coalesce(Height2, Height1);

run;

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20

heighta=coalesce(height1,height2);

heightb=coalesce(height1,height2);

 

yoyong
Obsidian | Level 7

Thank you @novinosrin @hashman @Astounding

hashman
Ammonite | Level 13

@novinosrin has the right idea, however it has to be wrapped into something executable and furnished with some details to guard against truncation in case the lengths of the two H-variables are different. Note that I made the lengths of H1 and H2 different on purpose.

 

data have  ;                
  input H1:$3. H2:$5. ;     
  cards ;                   
130  .                      
.    140.5                  
150  150                    
.    160.5                  
170  .                      
180  180                    
NA   NA                     
;                           
run ;                       
                            
data want ; 
set have ;
HA = H1 || H2 ;
HB = HA ;
HA = coalesceC (H1, H2) ;
HB = HA ;
run ;

The first two statements after SET set the lengths of HA and HB to the sum of the system length of H1 and H2. Otherwise, they would be assigned length $200, which could be way too long or two short (resulting in truncation) depending on the actual lengths of H1 and H2; and it's better to be safe than sorry. Then the COALESCE* function (coalesceC here since it is the DATA step, not SQL, and the variables are character) does what @novinosrin has suggested.

 

It would be still better to assign HA and HB the length equal to the maximum of the lengths of H1 and H2. Though this requires an extra step, it's a very quick one, run time wise:

data _null_ ;                                       
  set have (obs=0) ;                                
  call symputx ("L", vlength (H1) <> vlength (H2)) ;
  stop ;                                            
run ;                                               
                                                    
data want ;                                         
  set have ;                                        
  HA = put (coalesceC (H1, H2), $&L..) ;            
  HB = HA ;                                         
run ;                                               

HTH

Paul D.

 

 

Astounding
PROC Star

Perhaps a simple approach is best:

 

data want;

set have;

if Height1 > ' ' then HeightA = Height1;

else HeightA = Height2;

If Height2 > ' ' then HeightB = Height2;

else HeightB = Height1;

run;

 

At least consider where the proposed solutions are the same and where they are different, especially observation 4.

 

Actually, this should be equivalent to a slight variation on a previous proposal:

 

data want;

set have;

HeightA = coalesce(Height1, Height2);

HeightB = coalesce(Height2, Height1);

run;

hashman
Ammonite | Level 13

Your approach will work if the original variables are of the same lengths.

However, it could result in truncation if one variable is shorter/longer than the other. 

If both are shorter than $200, it could be alleviated by:

data want ;                                       
  set have ;                                      
  HeightA = ifC (Height1 > '', Height1, Height2) ;
  HeightB = ifC (Height2 > '', Height2, Height1) ;
run ;                                             

But that would set HeightA and HeightB to the default $200, which is not good for space management. If the original variables should happen to be longer than $200, then there again could be problems with truncation. The same is true for the COALESCE approach, not to mention that in the DATA step, this function is numeric and won't work in this case - COALESCEC is needed instead.

 

Failing to size up receiving character variables properly is a common coding problem in SAS with its fixed character lengths. I don't see how including a simple safeguard (e.g., using the concatenation - or, better, a short preliminary extra DATA step to get the max length of the two columns) is more complex (or even much lengthier code-wise) than the approach you consider "simple". Oftentimes, apparent simplicity incurs a considerable cost as far as the validity of the result is concerned.

 

Paul D.       

Astounding
PROC Star

Paul D.,

 

All wise things to consider.  Now that you mention COALESCEC, I may need to sit down for a week and read about all the functions that SAS has added over the years.  Who knows, I may even learn parsing functions.

 

Here's the "it depends" part.  In practice, I try to gear my answers toward the person who asks.  Depending on (my perception of) their level of SAS knowledge, I will either add to or remove from my replies.  So rightly or wrongly, I am prejudging how much a poster can absorb.  I have seen far too much evidence that convinces me this is the right approach ... posters accepting wrong answers, posters leaving out key information that a problem requires, posters requiring that a solution use a certain tool even if they don't know how to solve the problem, posters accepting solutions that they clearly don't understand.  For my benefit, I'm happy to see your concatenation approach.

hashman
Ammonite | Level 13

@Astounding,

 

Your approach is sensible and economical. Adding "btw"s, more context, and pointing at underwater reefs is an old habit from SAS-L of yore inherited from Ian Whitlock when he reigned there. In fact, I've learned more from those extras than from direct pointed answers to questions of the "how to do this" type. An extra consideration behind a more, uh, developed reply is that it is read not only by that who posed the question but by others, and the reply can be augmented or contradicted. More than once, straightened up by folks who knew far more than I did, I realized that I'd formed a wrong idea, and those lessons have seared in my memory. In the short time I've been around in this space, it's happened, too, and I'm looking forward to such friendly contradictions.

 

Best

Paul D.    

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2350 views
  • 2 likes
  • 4 in conversation