Hello,
I'm having a problem with joining multiple tables to 'fill-in' missing cells. I will provide a rough idea of my procedure below:
Table A has 3 variables:
unique_no date region
A 20100126 North
B 20110718 South
C 20120819 East
D 20140211 West
E 20130919 North
F 20121109 East
....and so on.
I have tables for each region --
Table 'West' has 5 variables:
date region walk_speed run_speed sleep
20140211 West 12 11 6
20131211 West 15 19 7
20100917 West 9 26 8
20110325 West 10 14 6
Tables 'East', 'North', 'South' look exactly the same (in terms of variable names) as Table 'West'.
My final table should look like below, after joining all the tables (Table A + addition of the three run/walk/sleep variables from Table 'West', 'North', 'East', 'South') :
unique_no date region walk_speed run_speed sleep
A 20100126 North 12 20 6
B 20110718 South 13 15 6
C 20120819 East 8 16 8
D 20140211 West 9 17 10
E 20130919 North 7 26 9
F 20121109 East 10 35 5
....and so on
Here's my attempt at creating this final table.
Step 1) proc sql;
create table t1 as
select a.*, b.*
from Table_A as a
Table_North as b
on (a.date=b.date) and (a.region=b.region);
quit;
This seemed to have correctly merged Table A and Table North (cells that are regions other than North have "." in them).
Step 2) proc sql;
create table t2 as
select a.*, b.*
from t1 as a
Table_West as b
on (a.date=b.date) and (a.region=b.region);
quit;
This second step had no effect; t2 looks exactly the same as t1.... so this is where I stopped.
**Please ignore my numerical data values as they are just random.
Thank you very much in advance for any help!
The "in=" dataset option creates a temporary boolean variable (will not be included in the output) that is set to 1 when a record from this dataset is present.
Since we merge by region and date, duplicate dates are no problem, as long as no duplicate combinations region-date are present.
Your only join condition is region and date? Are those unique in your dataset, I wouldn't assume they are by default and that's going to cause issues...
I'm basically trying to add the three variables (walk_speed, run_speed, sleep) onto Table A.
Table A has multiple combinations of different dates and regions (N=10,000 rows).
I would appreciate any alternative procedure that you have in mind.
Combine the 4 regional tables into one, and keep the table name:
data regions;
set
east
west
north
south
;
run;
Then sort and merge:
proc sort data=table_a;
by date region;
run;
proc sort data=regions;
by date region;
run;
data want;
merge
table_a (in=a)
regions
;
by date region;
if a;
run;
Hi Kurt,
Thanks so much for your help! However, I want to make sure 2 things...
Table A has random dates ranging from 1/1/2009 to 12/31/2015 (not every date is represented, i.e. July 22, 2013 might not be in there). Thus, for my region tables, I set them up so that each region table has data for EACH date ranging from 1/1/2009 to 12/31/2015 (only the dates that "match" with Table A will be filled in the columns). Will your method still work (merging the 4 region tables into 1, we would have duplicate dates (i.e. 1/1/2009 for west, north, south, east)) ?
Also, what is meant by "in=a" and "if a" ? I haven't used the "in=" function before so please clarify for me.
Thanks again!
The "in=" dataset option creates a temporary boolean variable (will not be included in the output) that is set to 1 when a record from this dataset is present.
Since we merge by region and date, duplicate dates are no problem, as long as no duplicate combinations region-date are present.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.