DATA Step, Macro, Functions and more

"Classes Join"

Reply
Occasional Contributor
Posts: 9

"Classes Join"

[ Edited ]

Hi,

I have a problem with the assigment to a value.

Thank you so much!

 

 

TABLE A:  Categorical Variable + Cumulative (from 0 to 1)

 

VAR_A_1     VAR_A_2

A                    0,5

B                    0,7

C                    0,9

D                    1

 

TABLE B: Categorical variable + Random Variable (from 0 to 1)

 

VAR_B_1     VAR_B_2

X1                 0,43

X2                 0,76

X1                 0,98

X1                 0,23

 

 

 

TABLE C: Result

from table B : Categorial Variable, Random Variable,

from Table A: Categorical variable that corresponds to the upper of the class within which is the value of VAR_B_2

VAR_B_1     VAR_B_2           VAR_A_1     VAR_A_2      

X1                 0,43                    A                        0,5 

X2                 0,76                    C                        0,9

X1                 0,98                    D                          1

X1                 0,23                    A                        0,5   

 

I tried with 2 Macros but is very slow because Table A and Table B are very long.

I thought to use formats but do not know how.

 

Thank you

 

Super User
Posts: 7,758

Re: "Classes Join"

So, table A gives you ranges. How many observations (records) does A have?

The most elegant solution would be to dynamically create a value format with ranges from A and apply that to B.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 9

Re: "Classes Join"

Posted in reply to KurtBremser

Thank you so much

Table A 10.000 rows

Table B 13.000 rows

Super User
Super User
Posts: 7,942

Re: "Classes Join"

Post your test data as a datastep so we don't have to type it in and guess what your data looks like.  Follow this topic if you need help:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

It looks like you are merging table A onto table B, based on smallest VAR_A_2 which is greater than VAR_B_2.  The question is, what if you have multiple rows which have the same value?  So for row one you take A as 0.43 < 0.5 but what if B was 0.5.

 

Something like this will work, but not if you have multiples:

proc sql;
  select B.*,
         A.*
  from   TABLE_B B
  left join (select * 
               from (select * 
                       from TABLE_A 
                       where VAR_A_2 > B.VAR_VAR_B_2)
               group by VAR_A_2
               having VAR_A_2=min(VAR_A_2)) A
  on 1=1;
quit; 

Untested!!

Trusted Advisor
Posts: 1,553

Re: "Classes Join"

You can use formats as:

 

proc format lib=work;

     value vara1f

    0 - 0.5    = "A"

    0.5 - 0.7 = "B"

    0.7 - 0.9 = "C"

    0.9 - 1.0 = "D"

    ;

     value vara2f

    0 - 0.5    = 0.5

    0.5 - 0.7 = 0.7

    0.7 - 0.9 = 0.9

    0.9 - 1.0 = 1

    ;

; run;

 

data C;

  set B;

        var_A_1 = put(var_B_2, vara1f.);

         var_A_2 = put(var_B_2,vara2f.);

run;

Occasional Contributor
Posts: 9

Re: "Classes Join"

Thank you so much. Is not possible the same value for different row in Table A

 

Super Contributor
Posts: 340

Re: "Classes Join"

[ Edited ]

Data A;

Length Var_A_1 $2.;

Input VAR_A_1 $ VAR_A_2;

Datalines;

A 0.5

B 0.7

C 0.9

D 1

;

Data B;

Length Var_B_1 $2.;

Input VAR_B_1 $ VAR_B_2;

Datalines;

X1 0.43

X2 0.76

X1 0.98

X1 0.23

;

Data D (Drop=rc);

Length Var_A_1 $2. Var_A_2 8.;

If _N_ eq 1 Then Do;

Declare Hash H (Dataset:'A',Ordered:'y');

Declare HIter HI ('H');

H.Definekey('Var_A_2');

H.Definedata(All:'y');

H.Definedone();

Call Missing (Var_A_1,Var_A_2); *!;

End;

Set B;

rc=HI.First();

Do While (not rc);

If Var_A_2 gt Var_B_2 Then Leave;

rc=HI.Next();

End;

Run;

Occasional Contributor
Posts: 9

Re: "Classes Join"

Posted in reply to user24feb

I try the solutions!

 

thank you all

Super User
Posts: 10,018

Re: "Classes Join"

Data A;
  Length Var_A_1 $2.;
  Input VAR_A_1 $ VAR_A_2;
  Datalines;
A 0.5
B 0.7
C 0.9
D 1
;

Data B;
  Length Var_B_1 $2.;
  Input VAR_B_1 $ VAR_B_2;
  Datalines;
X1 0.43
X2 0.76
X1 0.98
X1 0.23
;
proc sql;
select *
 from a,b
  where  VAR_A_2 gt  VAR_B_2
   group by b. VAR_B_1,b.VAR_B_2
    having  VAR_A_2- VAR_B_2=min( VAR_A_2- VAR_B_2);
quit;

Ask a Question
Discussion stats
  • 8 replies
  • 419 views
  • 2 likes
  • 6 in conversation