BookmarkSubscribeRSS Feed
Solvej
Obsidian | Level 7

Dear SAS experts

 

I hope somebody can help solve a data management problem. I have to datasets (have1 and have2) that I want to merge based on two variables (record_id and number), however I dont want the first dataset have1 to be overwritten but instead data to be added where missing. I hope this makes sense. Where data have been added in the want dataset it has been marked with bold.

 

Kind regards Solvej

 

 

Have1                
record_idnumberA1B1C1A2B2C2         
1171cc8tt         
22               
318ff            
                 
Have2                
record_idnumberA1B1C1A2B2C2A3B3C3      
1172bw4sh6kkgg      
223sx5sg         
314xx            
                 
Want                
record_idnumberA1B1C1A2B2C2A3B3C3A4B4C4A3B3C3
1171cc8tt2bw4sh6kkgg
223sx5sg         
318ff4xx         
                 
3 REPLIES 3
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Post test data in the form of a datastep:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

As such, I will just give examples.

You can use update:

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202975.htm

need to expand the first dataset.

 

Or you can normalise the data - something you should really do anyway - set the two together, renumber, then transpose again.

 

Or you can use arrays and renames.

 

Solvej
Obsidian | Level 7

Hi again

 

Sorrry about the missing data. Here it is:

 


data WORK.HAVE1;
infile datalines dsd truncover;
input record_id:BEST. number:BEST. A1:BEST. B1:$1. C1:$1. A2:BEST. B2:$1. C2:$1.;
format record_id BEST. number BEST. A1 BEST. A2 BEST.;
label record_id="record_id" number="number" A1="A1" B1="B1" C1="C1" A2="A2" B2="B2" C2="C2";
datalines;
1 17 1 c c 8 t t
2 2 . .
3 1 8 f f .

 


data WORK.HAVE2;
infile datalines dsd truncover;
input record_id:BEST. number:BEST. A1:BEST. B1:$1. C1:$1. A2:BEST. B2:$1. C2:$1. A3:BEST. B3:$2. C3:$2.;
format record_id BEST. number BEST. A1 BEST. A2 BEST. A3 BEST.;
label record_id="record_id" number="number" A1="A1" B1="B1" C1="C1" A2="A2" B2="B2" C2="C2" A3="A3" B3="B3" C3="C3";
datalines;
1 17 2 b w 4 s h 6 kk gg
2 2 3 s x 5 s g .
3 1 4 x x . .

Astounding
PROC Star

For data management purposes, I don't think you found the proper structure to the data.  Here is an alternative that keeps all the original information but is much more flexible for programming purposes.  (I'm illustrating just for one of the RECORD_IDs, and you may need to learn some programming techniques to deal with this form of the data):

 

Record_ID  Number  Source  Record  A   B   C

 

3             1    have1      1    8   f   f

3             1    have2      2    4   x   x

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 694 views
  • 0 likes
  • 3 in conversation