Hello:
I tried to combine multiple tables using loop, say i have 20 tables: t20, t19, t18, ......for each year. So far, I have tried join or union from proc sql, but don't know how to do it. Maybe data step, but don't know how. Here are the details about the tables:
Each table has a common "id" field and two other variables. e.g. tables look like
T20: T19: T18: T17 ...................................
id a20 b20 id a19 b19 id a18 b18 ........................................
id1 0 0 id1 1 0 id4 1 1 .......................................
id2 0 1 id4 0 0 id5 0 0
id3 1 1
I would like to combine them together to have first column as "id", it will contain all the ids appearing in all the tables. Then, all the "a" and "b" variables will be after the "id". If certain "id" were not in certain year, the result table will be a missing value. The result table look like:
id a20 b20 a19 b19 a18 b18 .......
id1 0 0 1 0 . .
id2 0 1 . . . .
id3 1 1 . . . .
id4 . . 0 0 1 1
id5 . . . . 0 0
I have put some code to create 3 tables as examples to start.
Thanks for any help!
data t20;
input id $ a20 b20;
cards;
id1 0 0
id2 0 1
id3 1 1
;
run;
data t19;
input id $ a19 b19;
cards;
id1 1 0
id4 0 0
;
run;
data t18;
input id $ a18 b18;
cards;
id4 1 1
id5 0 0
;
run;
Remember that to Merge with BY statement all of the sets need to be sorted, or be in the sort order, of the by variable.
With a lot of sequentially numbered data sets you can use syntax like:
data want;
merge t20-t01;
by ID;
run;
or Merge t20-t1 . You didn't mention how the numbers less than 10 might be. Use of this syntax does require all of the set numbers to be present. If you were to be missing t11 for example that would throw an error.
data want;
merge t18 t19 t20;
by ID;
run;
That's all you should need here, given what you've posted.
Output:
Obs id a18 b18 a19 b19 a20 b20
1 id1 . . 1 0 0 0
2 id2 . . . . 0 1
3 id3 . . . . 1 1
4 id4 1 1 0 0 . .
5 id5 0 0 . . . .
@sasecn wrote:
Hello:
I tried to combine multiple tables using loop, say i have 20 tables: t20, t19, t18, ......for each year. So far, I have tried join or union from proc sql, but don't know how to do it. Maybe data step, but don't know how. Here are the details about the tables:
Each table has a common "id" field and two other variables. e.g. tables look like
T20: T19: T18: T17 ...................................
id a20 b20 id a19 b19 id a18 b18 ........................................
id1 0 0 id1 1 0 id4 1 1 .......................................
id2 0 1 id4 0 0 id5 0 0
id3 1 1
I would like to combine them together to have first column as "id", it will contain all the ids appearing in all the tables. Then, all the "a" and "b" variables will be after the "id". If certain "id" were not in certain year, the result table will be a missing value. The result table look like:
id a20 b20 a19 b19 a18 b18 .......
id1 0 0 1 0 . .
id2 0 1 . . . .
id3 1 1 . . . .
id4 . . 0 0 1 1
id5 . . . . 0 0
I have put some code to create 3 tables as examples to start.
Thanks for any help!
data t20; input id $ a20 b20; cards; id1 0 0 id2 0 1 id3 1 1 ; run; data t19; input id $ a19 b19; cards; id1 1 0 id4 0 0 ; run; data t18; input id $ a18 b18; cards; id4 1 1 id5 0 0 ; run;
Hello Reeza:
Thanks for your reply. I guess that the part i didn't figure out was that how to use loop to merge because I have many tables. Or, any shortcut to list all the table names?
Here is a reference that illustrates how to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html
In your example these would all work
Merge t18-t20; *expects a continuous range;
merge t18--t20; *does not expect a continuous range;
merge t: ; *does anything starting with T;
Thanks. My real tables don't have the same naming pattern. But i guess i can rename them, then use your method to merge.
Remember that to Merge with BY statement all of the sets need to be sorted, or be in the sort order, of the by variable.
With a lot of sequentially numbered data sets you can use syntax like:
data want;
merge t20-t01;
by ID;
run;
or Merge t20-t1 . You didn't mention how the numbers less than 10 might be. Use of this syntax does require all of the set numbers to be present. If you were to be missing t11 for example that would throw an error.
Thanks for the comments. I do need sort in my real tables.
If they’re all in a library or some other pattern you can usually generate the list into a macro variable fairly easily.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.