Dear SAS experts
I hope somebody can help solve a data management problem. I have to datasets (have1 and have2) that I want to merge based on two variables (record_id and number), however I dont want the first dataset have1 to be overwritten but instead data to be added where missing. I hope this makes sense. Where data have been added in the want dataset it has been marked with bold.
Kind regards Solvej
Have1 | ||||||||||||||||
record_id | number | A1 | B1 | C1 | A2 | B2 | C2 | |||||||||
1 | 17 | 1 | c | c | 8 | t | t | |||||||||
2 | 2 | |||||||||||||||
3 | 1 | 8 | f | f | ||||||||||||
Have2 | ||||||||||||||||
record_id | number | A1 | B1 | C1 | A2 | B2 | C2 | A3 | B3 | C3 | ||||||
1 | 17 | 2 | b | w | 4 | s | h | 6 | kk | gg | ||||||
2 | 2 | 3 | s | x | 5 | s | g | |||||||||
3 | 1 | 4 | x | x | ||||||||||||
Want | ||||||||||||||||
record_id | number | A1 | B1 | C1 | A2 | B2 | C2 | A3 | B3 | C3 | A4 | B4 | C4 | A3 | B3 | C3 |
1 | 17 | 1 | c | c | 8 | t | t | 2 | b | w | 4 | s | h | 6 | kk | gg |
2 | 2 | 3 | s | x | 5 | s | g | |||||||||
3 | 1 | 8 | f | f | 4 | x | x | |||||||||
Post test data in the form of a datastep:
As such, I will just give examples.
You can use update:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202975.htm
need to expand the first dataset.
Or you can normalise the data - something you should really do anyway - set the two together, renumber, then transpose again.
Or you can use arrays and renames.
Hi again
Sorrry about the missing data. Here it is:
data WORK.HAVE1;
infile datalines dsd truncover;
input record_id:BEST. number:BEST. A1:BEST. B1:$1. C1:$1. A2:BEST. B2:$1. C2:$1.;
format record_id BEST. number BEST. A1 BEST. A2 BEST.;
label record_id="record_id" number="number" A1="A1" B1="B1" C1="C1" A2="A2" B2="B2" C2="C2";
datalines;
1 17 1 c c 8 t t
2 2 . .
3 1 8 f f .
data WORK.HAVE2;
infile datalines dsd truncover;
input record_id:BEST. number:BEST. A1:BEST. B1:$1. C1:$1. A2:BEST. B2:$1. C2:$1. A3:BEST. B3:$2. C3:$2.;
format record_id BEST. number BEST. A1 BEST. A2 BEST. A3 BEST.;
label record_id="record_id" number="number" A1="A1" B1="B1" C1="C1" A2="A2" B2="B2" C2="C2" A3="A3" B3="B3" C3="C3";
datalines;
1 17 2 b w 4 s h 6 kk gg
2 2 3 s x 5 s g .
3 1 4 x x . .
For data management purposes, I don't think you found the proper structure to the data. Here is an alternative that keeps all the original information but is much more flexible for programming purposes. (I'm illustrating just for one of the RECORD_IDs, and you may need to learn some programming techniques to deal with this form of the data):
Record_ID Number Source Record A B C
3 1 have1 1 8 f f
3 1 have2 2 4 x x
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.