BookmarkSubscribeRSS Feed
Solvej
Obsidian | Level 7

Dear SAS experts

 

I hope somebody can help solve a data management problem. I have to datasets (have1 and have2) that I want to merge based on two variables (record_id and number), however I dont want the first dataset have1 to be overwritten but instead data to be added where missing. I hope this makes sense. Where data have been added in the want dataset it has been marked with bold.

 

Kind regards Solvej

 

 

Have1                
record_idnumberA1B1C1A2B2C2         
1171cc8tt         
22               
318ff            
                 
Have2                
record_idnumberA1B1C1A2B2C2A3B3C3      
1172bw4sh6kkgg      
223sx5sg         
314xx            
                 
Want                
record_idnumberA1B1C1A2B2C2A3B3C3A4B4C4A3B3C3
1171cc8tt2bw4sh6kkgg
223sx5sg         
318ff4xx         
                 
3 REPLIES 3
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Post test data in the form of a datastep:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

As such, I will just give examples.

You can use update:

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202975.htm

need to expand the first dataset.

 

Or you can normalise the data - something you should really do anyway - set the two together, renumber, then transpose again.

 

Or you can use arrays and renames.

 

Solvej
Obsidian | Level 7

Hi again

 

Sorrry about the missing data. Here it is:

 


data WORK.HAVE1;
infile datalines dsd truncover;
input record_id:BEST. number:BEST. A1:BEST. B1:$1. C1:$1. A2:BEST. B2:$1. C2:$1.;
format record_id BEST. number BEST. A1 BEST. A2 BEST.;
label record_id="record_id" number="number" A1="A1" B1="B1" C1="C1" A2="A2" B2="B2" C2="C2";
datalines;
1 17 1 c c 8 t t
2 2 . .
3 1 8 f f .

 


data WORK.HAVE2;
infile datalines dsd truncover;
input record_id:BEST. number:BEST. A1:BEST. B1:$1. C1:$1. A2:BEST. B2:$1. C2:$1. A3:BEST. B3:$2. C3:$2.;
format record_id BEST. number BEST. A1 BEST. A2 BEST. A3 BEST.;
label record_id="record_id" number="number" A1="A1" B1="B1" C1="C1" A2="A2" B2="B2" C2="C2" A3="A3" B3="B3" C3="C3";
datalines;
1 17 2 b w 4 s h 6 kk gg
2 2 3 s x 5 s g .
3 1 4 x x . .

Astounding
PROC Star

For data management purposes, I don't think you found the proper structure to the data.  Here is an alternative that keeps all the original information but is much more flexible for programming purposes.  (I'm illustrating just for one of the RECORD_IDs, and you may need to learn some programming techniques to deal with this form of the data):

 

Record_ID  Number  Source  Record  A   B   C

 

3             1    have1      1    8   f   f

3             1    have2      2    4   x   x

 

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 755 views
  • 0 likes
  • 3 in conversation