Hello,
I'm having a problem with joining multiple tables to 'fill-in' missing cells. I will provide a rough idea of my procedure below:
Table A has 3 variables:
unique_no date region
A 20100126 North
B 20110718 South
C 20120819 East
D 20140211 West
E 20130919 North
F 20121109 East
....and so on.
I have tables for each region --
Table 'West' has 5 variables:
date region walk_speed run_speed sleep
20140211 West 12 11 6
20131211 West 15 19 7
20100917 West 9 26 8
20110325 West 10 14 6
Tables 'East', 'North', 'South' look exactly the same (in terms of variable names) as Table 'West'.
My final table should look like below, after joining all the tables (Table A + addition of the three run/walk/sleep variables from Table 'West', 'North', 'East', 'South') :
unique_no date region walk_speed run_speed sleep
A 20100126 North 12 20 6
B 20110718 South 13 15 6
C 20120819 East 8 16 8
D 20140211 West 9 17 10
E 20130919 North 7 26 9
F 20121109 East 10 35 5
....and so on
Here's my attempt at creating this final table.
Step 1) proc sql;
create table t1 as
select a.*, b.*
from Table_A as a
Table_North as b
on (a.date=b.date) and (a.region=b.region);
quit;
This seemed to have correctly merged Table A and Table North (cells that are regions other than North have "." in them).
Step 2) proc sql;
create table t2 as
select a.*, b.*
from t1 as a
Table_West as b
on (a.date=b.date) and (a.region=b.region);
quit;
This second step had no effect; t2 looks exactly the same as t1.... so this is where I stopped.
**Please ignore my numerical data values as they are just random.
Thank you very much in advance for any help!
The "in=" dataset option creates a temporary boolean variable (will not be included in the output) that is set to 1 when a record from this dataset is present.
Since we merge by region and date, duplicate dates are no problem, as long as no duplicate combinations region-date are present.
Your only join condition is region and date? Are those unique in your dataset, I wouldn't assume they are by default and that's going to cause issues...
I'm basically trying to add the three variables (walk_speed, run_speed, sleep) onto Table A.
Table A has multiple combinations of different dates and regions (N=10,000 rows).
I would appreciate any alternative procedure that you have in mind.
Combine the 4 regional tables into one, and keep the table name:
data regions;
set
east
west
north
south
;
run;
Then sort and merge:
proc sort data=table_a;
by date region;
run;
proc sort data=regions;
by date region;
run;
data want;
merge
table_a (in=a)
regions
;
by date region;
if a;
run;
Hi Kurt,
Thanks so much for your help! However, I want to make sure 2 things...
Table A has random dates ranging from 1/1/2009 to 12/31/2015 (not every date is represented, i.e. July 22, 2013 might not be in there). Thus, for my region tables, I set them up so that each region table has data for EACH date ranging from 1/1/2009 to 12/31/2015 (only the dates that "match" with Table A will be filled in the columns). Will your method still work (merging the 4 region tables into 1, we would have duplicate dates (i.e. 1/1/2009 for west, north, south, east)) ?
Also, what is meant by "in=a" and "if a" ? I haven't used the "in=" function before so please clarify for me.
Thanks again!
The "in=" dataset option creates a temporary boolean variable (will not be included in the output) that is set to 1 when a record from this dataset is present.
Since we merge by region and date, duplicate dates are no problem, as long as no duplicate combinations region-date are present.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.