BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sagulolo
Quartz | Level 8

Hi all,

 

i have 2 table which is Table_A and Table_B, Table_A  extra number will always sufficient to cover the short number at Table_B. The logic will be use the extra number from Table A to knock off the first short number from Table B with to the same group(Sg_a = Sg_b) until zero then follow by second short number from the same group. Please refer below:-

 

data Table_A;
  infile datalines dlm='|';
  input  Sg_a $ Key_a $ extra;
;
datalines;
A |KEY1  |50
A |KEY2  |80
B |KEY6  |1000
C |KEY9  |10
C |KEY10 |5
C |KEY11 |500
;

data Table_B;
  infile datalines dlm='|';
  input  Sg_b $ Key_b $ short;
;
datalines;
A |KEY3  |10
A |KEY4  |25
A |KEY5  |35
B |KEY7  |200
B |KEY8  |700
C |KEY12 |20
C |KEY13 |400
C |KEY14 |10
C |KEY15 |50
;

proc sql;
	create table rightjoin as
	select a.*, b.*
	from Table_A a right join
	Table_B b
	on a.Sg_a = b.Sg_b;
quit;

So far, I have tried to right join table A to table B, but have no idea how to evaluate and keep track the actual_transfer, below is my desire output.

 

Sg_aKey_aKey_bactual_transfer
AKEY1KEY310
AKEY1KEY425
AKEY1KEY515
AKEY2KEY520
BKEY6KEY7200
BKEY6KEY8700
CKEY9KEY1210
CKEY10KEY125
CKEY11KEY125
CKEY11KEY13400
CKEY11KEY1410
CKEY11KEY1550

 

 

Appreciate your time and help on this matter.

1 ACCEPTED SOLUTION

Accepted Solutions
hashman
Ammonite | Level 13

@sagulolo:

A curious puzzle. Frankly, I've spent about 90% of my time having a crack at it by trying to understand your spec. However, it's commendable that you've tried harder every next iteration, and finally it's helped. Once the pattern was clear, writing the code was fairly easy. One selection that had to be made was to decide on the type of a lookup table for the info from Table_A, and I have finally opted, somewhat contrary to my moniker, for using arrays - primarily since they are simple to update in place. 

 

Here's the code (the input tables are assumed sorted by SG_A and SG_B, respectively):

data Table_A ;                                                                 
  input  Sg_a $ Key_a $ extra ;                                                
  cards ;                                                                      
A KEY1    50                                                                   
A KEY2    80                                                                   
B KEY6  1000                                                                   
C KEY9    10                                                                   
C KEY10   05                                                                   
C KEY11  500                                                                   
run ;                                                                          
                                                                               
data Table_B;                                                                  
  input  Sg_b $ Key_b $ short ;                                                
  cards ;                                                                      
A KEY3   10                                                                    
A KEY4   25                                                                    
A KEY5   35                                                                    
B KEY7  200                                                                    
B KEY8  700                                                                    
C KEY12  20                                                                    
C KEY13 400                                                                    
C KEY14  10                                                                    
C KEY15  50                                                                    
run ;                                                                          
                                                                               
proc sql noprint ;                                                             
  select max (d) into :d from (select count (*) as d from table_a group sg_a) ;
quit ;                                                                         
                                                                               
data want (keep = sg_a key_a key_b balance_extra actual_transfer) ;            
  array x [&d]   8 _temporary_ ;                                               
  array k [&d] $ 8 _temporary_ ;                                               
  do _i = 1 by 1 until (last.sg_a) ;                                           
    set table_a (in = _a) table_b (rename = sg_b = sg_a);                      
    by sg_a ;                                                                  
    if _a then do ;                                                            
      x [_i] = extra ;                                                         
      k [_i] = key_a ;                                                         
      _im = _i ;                                                               
    end ;                                                                      
    else do _i = 1 by 0 until (short = 0 or _i > _im) ;                        
      actual_transfer = x [_i] min short ;                                     
      if actual_transfer > 0 then do ;                                         
        key_a = k [_i] ;                                                       
        balance_extra = x [_i] - actual_transfer ;                             
        output ;                                                               
      end ;                                                                    
      x [_i] +- actual_transfer ;                                              
      short  +- actual_transfer ;                                              
      if x [_i] = 0 then _i + 1 ;                                              
    end ;                                                                      
  end ;                                                                        
run ;                                                                          

The preliminary SQL step is used to size their upper bound at the expense of an extra pass through Table_A. Alternatively, you can omit the step and instead of setting the upper bounds of the arrays X and K to &D, set them to something "big enough" like 99999. The only requirement to its magnitude is that it must be equal to or exceed the size of the largest BY group by SG_A in Table_A. Yet another alternative is to use a hash table instead of the arrays; but in this case methinks that compared to the arrays, it would introduce an extra level of complexity with regard to accessing the lookup items and updating them in place.

 

On the different note, I don't see how this nut can be cracked without procedural programming and by using SQL alone.

 

Kind regards

Paul D.      

View solution in original post

18 REPLIES 18
KachiM
Rhodochrosite | Level 12

The logic will be use the extra number from Table A to knock off the first short number from Table B with to the same group(Sg_a = Sg_b) until zero then follow by second short number from the same group. Please refer below:-

 

Can you explain how you got "actual_transfer"  using sg_a and sg_b? I am bad to understand your term "to knock off".

sagulolo
Quartz | Level 8

Hi datasp,

 

Thank for your time and sorry for my bad english,  please refer below:-

 

A KEY1(table A) --> extra 50  - A KEY3(table B) short 10  (50 - 10, can offset 10), so i get below:-

 

so i output A KEY1(table A)  A KEY3(table B) 10, then A KEY1(table A) -> still have extra $40

 

A KEY1(table A) --> extra 40 - A KEY4(table B) short 25 (40 - 25, can offset 25)

 

so i output A KEY1(table A)  A KEY4(table B) 25, then A KEY1(table A) -> still have extra 15

 

A KEY1(table A) --> extra 15 - A KEY5(table B) short 35 (15 - 35, only can offset 15)

 

so i output A KEY1(table A)  A KEY5(table B) 15, then next i refer to A KEY2 80 for the balance of  A KEY5(table B) which is 20

 

A KEY2 (table A) -->extra  80 - A KEY5(table B) short 20 ( 80 - 20, can offset 20)

 

so i output A KEY2 (table A)  A KEY5(table B) short 20

 

hope the above clarifies.

 

 

sagulolo
Quartz | Level 8

Hi datasp,

Apologize as didn't really answer your question.
"actual_transfer" is refer to sg_b if table A extra more than sg_b short. Else will be sg_a balance left.

Once the table B short all deducted then move to next B then C.

KachiM
Rhodochrosite | Level 12

I understand for the first A of Table 1 and you cross compare  EXTRA with SHORT. This goes like 10, 25, and final residual (15) replaces 35 of SHORT. 

Once the table B short all deducted then move to next B then C.

What is happening the next A of Table1? Is it to be ignored and go to B of Table 1?

 

You show the output on your revised specification. In your output append comment in the form of Balance_Extra = 50 - 10, 40 - 25 , 15 - 15 for each row. It will be easy to know what you want. 

sagulolo
Quartz | Level 8

Hi datasp,

 

Apologize for late reply.

 

What is happening the next A of Table1?

no action require as Table_B no more A group

 

Is it to be ignored and go to B of Table 1?

Yes, next will be cross match between Table_A and Table_B for B group.

 

Added balance_extra into table for more understanding, thank you

 

Sg_aKey_aKey_bBalance_Extraactual_transfer
AKEY1KEY350 -10 = 4010
AKEY1KEY440 - 25 = 1525
AKEY1KEY515 - 15 = 0(Key 1 no more number, next use Key 2)15
AKEY2KEY580 - 20,  stop as table_B no more group for A20
BKEY6KEY71000 - 200 = 800200
BKEY6KEY8800 - 700 = 100, stop as table_B no more group for B700
CKEY9KEY1210 - 10 = 0, (Key 9 no more number, next use Key 10)10
CKEY10KEY125 - 5 = 0, (Key 10 no more number, next use Key 11)5
CKEY11KEY12500 - 5 = 4955
CKEY11KEY13495 - 400 = 95400
CKEY11KEY1495 - 10 = 8510
CKEY11KEY1585 - 50 = 35,  stop for as table_B no more group for C50
KachiM
Rhodochrosite | Level 12

It is good now. I want the 'desired output'. Place the revised output. 

sagulolo
Quartz | Level 8

Hi datasp,

 

The desire output will be as per below:-

 

Sg_aKey_aKey_bactual_transfer
AKEY1KEY310
AKEY1KEY425
AKEY1KEY515
AKEY2KEY520
BKEY6KEY7200
BKEY6KEY8700
CKEY9KEY1210
CKEY10KEY125
CKEY11KEY125
CKEY11KEY13400
CKEY11KEY1410
CKEY11KEY1550

 

Thank you

KachiM
Rhodochrosite | Level 12
Sg_a Key_a Key_b Balance_Extra actual_transfer
A KEY1 KEY3 50 -10 = 40 10
A KEY1 KEY4 40 - 25 = 15 25
A KEY1 KEY5 15 - 15 = 0(Key 1 no more number, next use Key 2) 15
A KEY2 KEY5 80 - 20,  stop as table_B no more group for A 20
B KEY6 KEY7 1000 - 200 = 800 200
B KEY6 KEY8 800 - 700 = 100, stop as table_B no more group for B 700
C KEY9 KEY12 10 - 10 = 0, (Key 9 no more number, next use Key 10) 10
C KEY10 KEY12 5 - 5 = 0, (Key 10 no more number, next use Key 11) 5
C KEY11 KEY12 500 - 5 = 495 5
C KEY11 KEY13 495 - 400 = 95 400
C KEY11 KEY14 95 - 10 = 85 10
C KEY11 KEY15 85 - 50 = 35,  stop for as table_B no more group for C 50

 

5th row in 500 - 5, how do you get this 5? See In Table_B, k12 is 20. 

 

sagulolo
Quartz | Level 8

Hi datasp,

 

5th row in 500 - 5, how do you get this 5? See In Table_B, k12 is 20.

I get the output 5 is because

20 - 10 - 5 (KEY12 - KEY9 - KEY10) = 5, so need another 5 from KEY 11 to become zero.

 

Thank you

KachiM
Rhodochrosite | Level 12

@sagulolo 

 

Do you insist on SQL solution? I have no sql-head. I am trying Data Step approach. 

 

There are problems in getting :

A Key2 Key5 20 as Key5 has been used in the previous row of output.

Similarly, with:

C Key9 Key12 10

C Key10 Key12 5

C Key11 Key12 5

 

I am looking into this. Be patient.

sagulolo
Quartz | Level 8

Hi datasp,

 

I'm okay with the data step approach and really thank for looking into this matter.

 

Thank you so much

hashman
Ammonite | Level 13

@sagulolo:

A curious puzzle. Frankly, I've spent about 90% of my time having a crack at it by trying to understand your spec. However, it's commendable that you've tried harder every next iteration, and finally it's helped. Once the pattern was clear, writing the code was fairly easy. One selection that had to be made was to decide on the type of a lookup table for the info from Table_A, and I have finally opted, somewhat contrary to my moniker, for using arrays - primarily since they are simple to update in place. 

 

Here's the code (the input tables are assumed sorted by SG_A and SG_B, respectively):

data Table_A ;                                                                 
  input  Sg_a $ Key_a $ extra ;                                                
  cards ;                                                                      
A KEY1    50                                                                   
A KEY2    80                                                                   
B KEY6  1000                                                                   
C KEY9    10                                                                   
C KEY10   05                                                                   
C KEY11  500                                                                   
run ;                                                                          
                                                                               
data Table_B;                                                                  
  input  Sg_b $ Key_b $ short ;                                                
  cards ;                                                                      
A KEY3   10                                                                    
A KEY4   25                                                                    
A KEY5   35                                                                    
B KEY7  200                                                                    
B KEY8  700                                                                    
C KEY12  20                                                                    
C KEY13 400                                                                    
C KEY14  10                                                                    
C KEY15  50                                                                    
run ;                                                                          
                                                                               
proc sql noprint ;                                                             
  select max (d) into :d from (select count (*) as d from table_a group sg_a) ;
quit ;                                                                         
                                                                               
data want (keep = sg_a key_a key_b balance_extra actual_transfer) ;            
  array x [&d]   8 _temporary_ ;                                               
  array k [&d] $ 8 _temporary_ ;                                               
  do _i = 1 by 1 until (last.sg_a) ;                                           
    set table_a (in = _a) table_b (rename = sg_b = sg_a);                      
    by sg_a ;                                                                  
    if _a then do ;                                                            
      x [_i] = extra ;                                                         
      k [_i] = key_a ;                                                         
      _im = _i ;                                                               
    end ;                                                                      
    else do _i = 1 by 0 until (short = 0 or _i > _im) ;                        
      actual_transfer = x [_i] min short ;                                     
      if actual_transfer > 0 then do ;                                         
        key_a = k [_i] ;                                                       
        balance_extra = x [_i] - actual_transfer ;                             
        output ;                                                               
      end ;                                                                    
      x [_i] +- actual_transfer ;                                              
      short  +- actual_transfer ;                                              
      if x [_i] = 0 then _i + 1 ;                                              
    end ;                                                                      
  end ;                                                                        
run ;                                                                          

The preliminary SQL step is used to size their upper bound at the expense of an extra pass through Table_A. Alternatively, you can omit the step and instead of setting the upper bounds of the arrays X and K to &D, set them to something "big enough" like 99999. The only requirement to its magnitude is that it must be equal to or exceed the size of the largest BY group by SG_A in Table_A. Yet another alternative is to use a hash table instead of the arrays; but in this case methinks that compared to the arrays, it would introduce an extra level of complexity with regard to accessing the lookup items and updating them in place.

 

On the different note, I don't see how this nut can be cracked without procedural programming and by using SQL alone.

 

Kind regards

Paul D.      

sagulolo
Quartz | Level 8
@hashman

Thank you so much for the coding, this is what I need and I never though of array can solve this puzzle. Now I have to test it at my real data, nevertheless apologize for my bad english. I will update here again after I test it with the real data.

@KachiM

Really appreciate your giving comments on the lack of logic explanations and guiding me to write a explanation to get the desire result. I also don't mind if you share others solution as I still new and lack of logical mind to resolve issue/puzzle like this.

Best regards
KachiM
Rhodochrosite | Level 12

@hashman 

 

I have spent several hours on solving this problem. I am still testing another data step approach. With  struggles I had, I will say yours is a program of excellence. It must be a case study to every aspiring Data Step Programmers.

 

@sagulolo 

 

Initially I thought it is simply a many to many and one to many merging problem. There is a twist. In the middle of Table_B, we need to go to Table_A to get some info and return to the place in Table_B to continue. This jump_return-continue  is the crucial part of programming. This is first time for me.

 

Enjoy the array solution given by Paul D. If you have Data step debugger and if you like, understand the flow of the program and visualize  the  values of variables. 

 

best wishes

DataSP

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 18 replies
  • 1307 views
  • 6 likes
  • 3 in conversation