BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
robertrao
Quartz | Level 8

Hi,

Please help me in solving this....

HAVE table is what we have and we need to get the corresponding departments for each IN-OUT  pair from the LOOKUP table.

HAVE table can have missing date-times sometimes....for either the IN or OUT in a pair!!!!

basically we need to check to see where out IN-OUT pair falls in  the LOOKUP table and pick the corresponding department

sometimes a person can be having IN from one department and OUT from other department ........

Thanks

HAVE

       ID             IN1                       OUT1                IN2                               OUT2                       IN3                   OUT3 

101  06AUG2013:13:55  06AUG2013:17:14  16AUG2013:17:02  17AUG2013:07:00   29AUG2013:12:28        .

102   14AUG2013:11:40  27AUG2013:12:03

LOOKUP TABLE
ID    DEPT        IN                      OUT
101   SO      06AUG13:06:51      06AUG13:14:02
101   SC      06AUG13:14:02      09AUG13:19:18
101  ST       09AUG13:19:18      12AUG13:03:14
101   SM      12AUG13:03:14      13AUG13:12:17
101  ST       13AUG13:12:17      16AUG13:16:54
101   SC      16AUG13:16:54      19AUG13:11:04
101   PN      19AUG13:11:04      22AUG13:07:37
101   PE      22AUG13:07:37      09SEP13:15:20
102   SM      14AUG13:12:08      28AUG13:13:48

    WANT
   ID                   IN1                   OUT1              DEPT_IN   DEPT_OUT
101    06AUG2013:13:55  06AUG2013:17:14   SO               SC
101    16AUG2013:17:02  17AUG2013:07:00   SC              SC
101    29AUG2013:12:28            .                      SC               .
102    14AUG2013:11:40  27AUG2013:12:03   SM            SM

1 ACCEPTED SOLUTION

Accepted Solutions
Vince28_Statcan
Quartz | Level 8

Hi Robert,

I don't have enough time to write the full code but here's a data step approach to do so. However, I still stand by the fact that you should take the time to learn proc sql; It is highly transferable and fairly simple. I managed to self-learn most of all I know reading SAS documentation in a morning at work.

If I recall Reeza had managed to get you to look into hash tables in another thread so I'll assume you at least know what they're about.

Your desired output can be done within a single data step using 2 arrays with variable ranges, in: and out: (to allow you to loop and output one record for each horizontal entry). The lookup can be achieved with a hash table, where I would rename IN and OUT from your lookup table so as to free the 2 variable names for desired output. You can then either use data step output or a second hash table to store your desired result and use the output method of hash to convert it to a data set.

Example of logic construction not tested:

data want(rename=(wantin=in wantout=out));

     length  ID $3. LUPIN 8. LUPOUT 8.    DEPT $2.;

     if _N_=1 then do;

          declare hash lookup(dataset: "work.lookuptable(rename=(IN=LUPIN OUT=LUPOUT))", multidata: "YES");

          lookup.defineKey("ID");

          lookup.defineData("DEPT", "LUPIN", "LUPOUT");

          lookup.defineDone();

     end;

     set have;

     array in {*} in: ;

     array out {*} out: ;

     do i=1 to dim(in);

          if lookup.find()=0 then do;

               if lupin <= in(i) < lupout then do;

                    wantin = in(i);

                    dept_in = dept;

               end;

               if lupin <= out(i) < lupout then do;

                    wantout = out(i);

                    dept_out = dept;

               end;

               do until lookup.findnext() NE 0;

                    if lupin <= in(i) < lupout then do;

                         wantin = in(i);

                         dept_in = dept;

                    end;

                    if lupin <= out(i) < lupout then do;

                         wantout = out(i);

                         dept_out = dept;

                    end;

               end;

           end;

          output;

     end;

     keep id wantin wantout dept_in dept_out;

run;

Since your lookup table has all fixed intervals with no overlap, this should do the work. All missing in: and out: will naturally not find an appropriate interval through the find_next() loop so they will be left as missing for output. It is far from optimized though since it is looping on all multidata element of a given hash key rather than break the loop once both in and out are found.

Vince

View solution in original post

8 REPLIES 8
Quentin
Super User

Hi,

I would suggest transposing HAVE into a more normalized format like:

ID     Seq    Type     DateTime

101     1      In     06Aug2013:13:55

101     1      Out     06Aug2013:17:14

101     2      In     16Aug2013:17:02

101     2      Out     17Aug2013:07:00

101     3      In     29Aug2013:12:28

102      1     In     14Aug2013:11:40

102     1     Out     27Aug2013:12:03

This should make the lookup easier, since your date-times are in one column.  You could probably do it with a SQL join WHERE  Lookup.IN<=HAVE.datetime<Lookup.OUT

Not sure I've quite got the logic you want.  But normalizing the data should improve the situation.

HTH,

--Q.

BASUG is hosting free webinars Next up: Jane Eslinger presenting PROC REPORT and the ODS EXCEL destination on Mar 27 at noon ET. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
robertrao
Quartz | Level 8

Hi Quentin,

Thanks for the reply.

I want it to be done in simple SAS steps. I have no idea on SQL !!!

could you help me frame the code in basic steps??

Thanks

DR_Majeti
Quartz | Level 8

Hi Robertrao,

In which format have data is present ?

If it is excel use data step to import it and double transpose it..

If it is in notepad you can use double trailing method of input to get the data into a single column or variable ..

Please give the format of Have data.. based on i can help to write the code for it..

--

Durga

robertrao
Quartz | Level 8

Hi Dr. Majeti,

Thanks for the response.

LOOK UP Table and HAVE table are present as SAs datasets.

I appreciate your help in this regards

Thanks

DR_Majeti
Quartz | Level 8

Hi Robertrao,

data have_trans(drop=i IN1 OUT1 IN2 OUT2 IN3 OUT3);                                                                                                
array in {3} in1 in2 in3;                                                                                                                          
array out {3} out1 out2 out3;                                                                                                                      
do until (last.id);                                                                                                                                
set have;                                                                                                                                          
by id notsorted;                                                                                                                                   
format in_ out_ datetime16.;                                                                                                                       
do i = 1 to 3;                                                                                                                                     
   In_=in(i);                                                                                                                                      
   out_=out(i);                                                                                                                                    
output;                                                                                                                                            
end;                                                                                                                                               
end;                                                                                                                                               

run;

this step might help you to get have dataset as lookup dataset.

robertrao
Quartz | Level 8

HI,

BUt the main question is that how we are going to match these tables using basic SAS (since there are many to many????

i Know SQL can do it but I would need simple steps

thanks

DR_Majeti
Quartz | Level 8

when we transpose have data set  and we do combine lookup and have DS but it is not possible based on only id variable as you said many to many ..

are there any other variables possible..

i merged it using merge statement.. check the whole code i written till now..

data have;                                                                                                                                         
input ID (IN1 OUT1 IN2 OUT2 IN3 OUT3) (datetime16.);                                                                                               
Format IN1 OUT1 IN2 OUT2 IN3 OUT3 datetime16.;                                                                                                     
datalines;                                                                                                                                         
101 06AUG2013:13:55 06AUG2013:17:14 16AUG2013:17:02 17AUG2013:07:00 29AUG2013:12:28    .                                                       
102 14AUG2013:11:40 27AUG2013:12:03                                                                                                                
;                                                                                                                                                  
                                                                                                                                                   
data lookup;                                                                                                                                       
input ID DEPT$ (IN OUT) (datetime16.);                                                                                                             
format in out datetime16.;                                                                                                                         
datalines;                                                                                                                                         
101 SO  06AUG13:06:51  06AUG13:14:02                                                                                                               
101 SC  06AUG13:14:02  09AUG13:19:18                                                                                                               
101 ST  09AUG13:19:18  12AUG13:03:14                                                                                                               
101 SM  12AUG13:03:14  13AUG13:12:17                                                                                                               
101 ST  13AUG13:12:17  16AUG13:16:54                                                                                                               
101 SC  16AUG13:16:54  19AUG13:11:04                                                                                                               
101 PN  19AUG13:11:04  22AUG13:07:37                                                                                                               
101 PE  22AUG13:07:37  09SEP13:15:20                                                                                                               
102 SM  14AUG13:12:08  28AUG13:13:48                                                                                                               
;                                                                                                                                                  
                                                                                                                                                   
data have_trans(drop=i IN1 OUT1 IN2 OUT2 IN3 OUT3);                                                                                                
array in {3} in1 in2 in3;                                                                                                                          
array out {3} out1 out2 out3;                                                                                                                      
do until (last.id);                                                                                                                                
set have;                                                                                                                                          
by id notsorted;                                                                                                                                   
format in_ out_ datetime16.;                                                                                                                       
do i = 1 to 3;                                                                                                                                     
   In_=in(i);                                                                                                                                      
   out_=out(i);                                                                                                                                    
output;                                                                                                                                            
end;                                                                                                                                               
end;                                                                                                                                               
run;                                                                                                                                               
                                                                                                                                                   
data both;                                                                                                                                         
merge have_trans lookup;                                                                                                                           
by id ;                                                                                                                                            

run;

i think merge like this is wrong but i done for checking.

Vince28_Statcan
Quartz | Level 8

Hi Robert,

I don't have enough time to write the full code but here's a data step approach to do so. However, I still stand by the fact that you should take the time to learn proc sql; It is highly transferable and fairly simple. I managed to self-learn most of all I know reading SAS documentation in a morning at work.

If I recall Reeza had managed to get you to look into hash tables in another thread so I'll assume you at least know what they're about.

Your desired output can be done within a single data step using 2 arrays with variable ranges, in: and out: (to allow you to loop and output one record for each horizontal entry). The lookup can be achieved with a hash table, where I would rename IN and OUT from your lookup table so as to free the 2 variable names for desired output. You can then either use data step output or a second hash table to store your desired result and use the output method of hash to convert it to a data set.

Example of logic construction not tested:

data want(rename=(wantin=in wantout=out));

     length  ID $3. LUPIN 8. LUPOUT 8.    DEPT $2.;

     if _N_=1 then do;

          declare hash lookup(dataset: "work.lookuptable(rename=(IN=LUPIN OUT=LUPOUT))", multidata: "YES");

          lookup.defineKey("ID");

          lookup.defineData("DEPT", "LUPIN", "LUPOUT");

          lookup.defineDone();

     end;

     set have;

     array in {*} in: ;

     array out {*} out: ;

     do i=1 to dim(in);

          if lookup.find()=0 then do;

               if lupin <= in(i) < lupout then do;

                    wantin = in(i);

                    dept_in = dept;

               end;

               if lupin <= out(i) < lupout then do;

                    wantout = out(i);

                    dept_out = dept;

               end;

               do until lookup.findnext() NE 0;

                    if lupin <= in(i) < lupout then do;

                         wantin = in(i);

                         dept_in = dept;

                    end;

                    if lupin <= out(i) < lupout then do;

                         wantout = out(i);

                         dept_out = dept;

                    end;

               end;

           end;

          output;

     end;

     keep id wantin wantout dept_in dept_out;

run;

Since your lookup table has all fixed intervals with no overlap, this should do the work. All missing in: and out: will naturally not find an appropriate interval through the find_next() loop so they will be left as missing for output. It is far from optimized though since it is looping on all multidata element of a given hash key rather than break the loop once both in and out are found.

Vince

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1297 views
  • 3 likes
  • 4 in conversation