Hi,
Please help me in solving this....
HAVE table is what we have and we need to get the corresponding departments for each IN-OUT pair from the LOOKUP table.
HAVE table can have missing date-times sometimes....for either the IN or OUT in a pair!!!!
basically we need to check to see where out IN-OUT pair falls in the LOOKUP table and pick the corresponding department
sometimes a person can be having IN from one department and OUT from other department ........
Thanks
HAVE
ID IN1 OUT1 IN2 OUT2 IN3 OUT3
101 06AUG2013:13:55 06AUG2013:17:14 16AUG2013:17:02 17AUG2013:07:00 29AUG2013:12:28 .
102 14AUG2013:11:40 27AUG2013:12:03
LOOKUP TABLE
ID DEPT IN OUT
101 SO 06AUG13:06:51 06AUG13:14:02
101 SC 06AUG13:14:02 09AUG13:19:18
101 ST 09AUG13:19:18 12AUG13:03:14
101 SM 12AUG13:03:14 13AUG13:12:17
101 ST 13AUG13:12:17 16AUG13:16:54
101 SC 16AUG13:16:54 19AUG13:11:04
101 PN 19AUG13:11:04 22AUG13:07:37
101 PE 22AUG13:07:37 09SEP13:15:20
102 SM 14AUG13:12:08 28AUG13:13:48
WANT
ID IN1 OUT1 DEPT_IN DEPT_OUT
101 06AUG2013:13:55 06AUG2013:17:14 SO SC
101 16AUG2013:17:02 17AUG2013:07:00 SC SC
101 29AUG2013:12:28 . SC .
102 14AUG2013:11:40 27AUG2013:12:03 SM SM
Hi Robert,
I don't have enough time to write the full code but here's a data step approach to do so. However, I still stand by the fact that you should take the time to learn proc sql; It is highly transferable and fairly simple. I managed to self-learn most of all I know reading SAS documentation in a morning at work.
If I recall Reeza had managed to get you to look into hash tables in another thread so I'll assume you at least know what they're about.
Your desired output can be done within a single data step using 2 arrays with variable ranges, in: and out: (to allow you to loop and output one record for each horizontal entry). The lookup can be achieved with a hash table, where I would rename IN and OUT from your lookup table so as to free the 2 variable names for desired output. You can then either use data step output or a second hash table to store your desired result and use the output method of hash to convert it to a data set.
Example of logic construction not tested:
data want(rename=(wantin=in wantout=out));
length ID $3. LUPIN 8. LUPOUT 8. DEPT $2.;
if _N_=1 then do;
declare hash lookup(dataset: "work.lookuptable(rename=(IN=LUPIN OUT=LUPOUT))", multidata: "YES");
lookup.defineKey("ID");
lookup.defineData("DEPT", "LUPIN", "LUPOUT");
lookup.defineDone();
end;
set have;
array in {*} in: ;
array out {*} out: ;
do i=1 to dim(in);
if lookup.find()=0 then do;
if lupin <= in(i) < lupout then do;
wantin = in(i);
dept_in = dept;
end;
if lupin <= out(i) < lupout then do;
wantout = out(i);
dept_out = dept;
end;
do until lookup.findnext() NE 0;
if lupin <= in(i) < lupout then do;
wantin = in(i);
dept_in = dept;
end;
if lupin <= out(i) < lupout then do;
wantout = out(i);
dept_out = dept;
end;
end;
end;
output;
end;
keep id wantin wantout dept_in dept_out;
run;
Since your lookup table has all fixed intervals with no overlap, this should do the work. All missing in: and out: will naturally not find an appropriate interval through the find_next() loop so they will be left as missing for output. It is far from optimized though since it is looping on all multidata element of a given hash key rather than break the loop once both in and out are found.
Vince
Hi,
I would suggest transposing HAVE into a more normalized format like:
ID Seq Type DateTime
101 1 In 06Aug2013:13:55
101 1 Out 06Aug2013:17:14
101 2 In 16Aug2013:17:02
101 2 Out 17Aug2013:07:00
101 3 In 29Aug2013:12:28
102 1 In 14Aug2013:11:40
102 1 Out 27Aug2013:12:03
This should make the lookup easier, since your date-times are in one column. You could probably do it with a SQL join WHERE Lookup.IN<=HAVE.datetime<Lookup.OUT
Not sure I've quite got the logic you want. But normalizing the data should improve the situation.
HTH,
--Q.
Hi Quentin,
Thanks for the reply.
I want it to be done in simple SAS steps. I have no idea on SQL !!!
could you help me frame the code in basic steps??
Thanks
Hi Robertrao,
In which format have data is present ?
If it is excel use data step to import it and double transpose it..
If it is in notepad you can use double trailing method of input to get the data into a single column or variable ..
Please give the format of Have data.. based on i can help to write the code for it..
--
Durga
Hi Dr. Majeti,
Thanks for the response.
LOOK UP Table and HAVE table are present as SAs datasets.
I appreciate your help in this regards
Thanks
Hi Robertrao,
data have_trans(drop=i IN1 OUT1 IN2 OUT2 IN3 OUT3); | |
array in {3} in1 in2 in3; | |
array out {3} out1 out2 out3; | |
do until (last.id); | |
set have; | |
by id notsorted; | |
format in_ out_ datetime16.; | |
do i = 1 to 3; | |
In_=in(i); | |
out_=out(i); | |
output; | |
end; | |
end; |
run;
this step might help you to get have dataset as lookup dataset.
HI,
BUt the main question is that how we are going to match these tables using basic SAS (since there are many to many????
i Know SQL can do it but I would need simple steps
thanks
when we transpose have data set and we do combine lookup and have DS but it is not possible based on only id variable as you said many to many ..
are there any other variables possible..
i merged it using merge statement.. check the whole code i written till now..
data have; | ||
input ID (IN1 OUT1 IN2 OUT2 IN3 OUT3) (datetime16.); | ||
Format IN1 OUT1 IN2 OUT2 IN3 OUT3 datetime16.; | ||
datalines; | ||
101 06AUG2013:13:55 06AUG2013:17:14 16AUG2013:17:02 17AUG2013:07:00 29AUG2013:12:28 | . | |
102 14AUG2013:11:40 27AUG2013:12:03 | ||
; | ||
data lookup; | ||
input ID DEPT$ (IN OUT) (datetime16.); | ||
format in out datetime16.; | ||
datalines; | ||
101 SO 06AUG13:06:51 06AUG13:14:02 | ||
101 SC 06AUG13:14:02 09AUG13:19:18 | ||
101 ST 09AUG13:19:18 12AUG13:03:14 | ||
101 SM 12AUG13:03:14 13AUG13:12:17 | ||
101 ST 13AUG13:12:17 16AUG13:16:54 | ||
101 SC 16AUG13:16:54 19AUG13:11:04 | ||
101 PN 19AUG13:11:04 22AUG13:07:37 | ||
101 PE 22AUG13:07:37 09SEP13:15:20 | ||
102 SM 14AUG13:12:08 28AUG13:13:48 | ||
; | ||
data have_trans(drop=i IN1 OUT1 IN2 OUT2 IN3 OUT3); | ||
array in {3} in1 in2 in3; | ||
array out {3} out1 out2 out3; | ||
do until (last.id); | ||
set have; | ||
by id notsorted; | ||
format in_ out_ datetime16.; | ||
do i = 1 to 3; | ||
In_=in(i); | ||
out_=out(i); | ||
output; | ||
end; | ||
end; | ||
run; | ||
data both; | ||
merge have_trans lookup; | ||
by id ; |
run;
i think merge like this is wrong but i done for checking.
Hi Robert,
I don't have enough time to write the full code but here's a data step approach to do so. However, I still stand by the fact that you should take the time to learn proc sql; It is highly transferable and fairly simple. I managed to self-learn most of all I know reading SAS documentation in a morning at work.
If I recall Reeza had managed to get you to look into hash tables in another thread so I'll assume you at least know what they're about.
Your desired output can be done within a single data step using 2 arrays with variable ranges, in: and out: (to allow you to loop and output one record for each horizontal entry). The lookup can be achieved with a hash table, where I would rename IN and OUT from your lookup table so as to free the 2 variable names for desired output. You can then either use data step output or a second hash table to store your desired result and use the output method of hash to convert it to a data set.
Example of logic construction not tested:
data want(rename=(wantin=in wantout=out));
length ID $3. LUPIN 8. LUPOUT 8. DEPT $2.;
if _N_=1 then do;
declare hash lookup(dataset: "work.lookuptable(rename=(IN=LUPIN OUT=LUPOUT))", multidata: "YES");
lookup.defineKey("ID");
lookup.defineData("DEPT", "LUPIN", "LUPOUT");
lookup.defineDone();
end;
set have;
array in {*} in: ;
array out {*} out: ;
do i=1 to dim(in);
if lookup.find()=0 then do;
if lupin <= in(i) < lupout then do;
wantin = in(i);
dept_in = dept;
end;
if lupin <= out(i) < lupout then do;
wantout = out(i);
dept_out = dept;
end;
do until lookup.findnext() NE 0;
if lupin <= in(i) < lupout then do;
wantin = in(i);
dept_in = dept;
end;
if lupin <= out(i) < lupout then do;
wantout = out(i);
dept_out = dept;
end;
end;
end;
output;
end;
keep id wantin wantout dept_in dept_out;
run;
Since your lookup table has all fixed intervals with no overlap, this should do the work. All missing in: and out: will naturally not find an appropriate interval through the find_next() loop so they will be left as missing for output. It is far from optimized though since it is looping on all multidata element of a given hash key rather than break the loop once both in and out are found.
Vince
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.