Re: "Classes Join"

lo88lo · Posted 09-27-2016 03:35 AM

Hi,

I have a problem with the assigment to a value.

Thank you so much!

TABLE A: Categorical Variable + Cumulative (from 0 to 1)

VAR_A_1 VAR_A_2

A 0,5

B 0,7

C 0,9

D 1

TABLE B: Categorical variable + Random Variable (from 0 to 1)

VAR_B_1 VAR_B_2

X1 0,43

X2 0,76

X1 0,98

X1 0,23

TABLE C: Result

from table B : Categorial Variable, Random Variable,

from Table A: Categorical variable that corresponds to the upper of the class within which is the value of VAR_B_2

VAR_B_1 VAR_B_2 VAR_A_1 VAR_A_2

X1 0,43 A 0,5

X2 0,76 C 0,9

X1 0,98 D 1

X1 0,23 A 0,5

I tried with 2 Macros but is very slow because Table A and Table B are very long.

I thought to use formats but do not know how.

Thank you

Kurt_Bremser · Posted 09-27-2016 04:34 AM

So, table A gives you ranges. How many observations (records) does A have?

The most elegant solution would be to dynamically create a value format with ranges from A and apply that to B.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

lo88lo · Posted 09-27-2016 05:15 AM

Thank you so much

Table A 10.000 rows

Table B 13.000 rows

RW9 · Posted 09-27-2016 04:39 AM

Post your test data as a datastep so we don't have to type it in and guess what your data looks like. Follow this topic if you need help:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

It looks like you are merging table A onto table B, based on smallest VAR_A_2 which is greater than VAR_B_2. The question is, what if you have multiple rows which have the same value? So for row one you take A as 0.43 < 0.5 but what if B was 0.5.

Something like this will work, but not if you have multiples:

proc sql;
  select B.*,
         A.*
  from   TABLE_B B
  left join (select * 
               from (select * 
                       from TABLE_A 
                       where VAR_A_2 > B.VAR_VAR_B_2)
               group by VAR_A_2
               having VAR_A_2=min(VAR_A_2)) A
  on 1=1;
quit;

Untested!!

Shmuel · Posted 09-27-2016 05:10 AM

You can use formats as:

proc format lib=work;

value vara1f

0 - 0.5 = "A"

0.5 - 0.7 = "B"

0.7 - 0.9 = "C"

0.9 - 1.0 = "D"

;

value vara2f

0 - 0.5 = 0.5

0.5 - 0.7 = 0.7

0.7 - 0.9 = 0.9

0.9 - 1.0 = 1

;

; run;

data C;

set B;

var_A_1 = put(var_B_2, vara1f.);

var_A_2 = put(var_B_2,vara2f.);

run;

lo88lo · Posted 09-27-2016 07:18 AM

Thank you so much. Is not possible the same value for different row in Table A

user24feb · Posted 09-27-2016 05:09 AM

Data A;

Length Var_A_1 $2.;

Input VAR_A_1 $ VAR_A_2;

Datalines;

A 0.5

B 0.7

C 0.9

D 1

;

Data B;

Length Var_B_1 $2.;

Input VAR_B_1 $ VAR_B_2;

Datalines;

X1 0.43

X2 0.76

X1 0.98

X1 0.23

;

Data D (Drop=rc);

Length Var_A_1 $2. Var_A_2 8.;

If _N_ eq 1 Then Do;

Declare Hash H (Dataset:'A',Ordered:'y');

Declare HIter HI ('H');

H.Definekey('Var_A_2');

H.Definedata(All:'y');

H.Definedone();

Call Missing (Var_A_1,Var_A_2); *!;

End;

Set B;

rc=HI.First();

Do While (not rc);

If Var_A_2 gt Var_B_2 Then Leave;

rc=HI.Next();

End;

Run;

lo88lo · Posted 09-27-2016 05:19 AM

I try the solutions!

thank you all

Ksharp · Posted 09-27-2016 05:32 AM

Data A;
  Length Var_A_1 $2.;
  Input VAR_A_1 $ VAR_A_2;
  Datalines;
A 0.5
B 0.7
C 0.9
D 1
;

Data B;
  Length Var_B_1 $2.;
  Input VAR_B_1 $ VAR_B_2;
  Datalines;
X1 0.43
X2 0.76
X1 0.98
X1 0.23
;
proc sql;
select *
 from a,b
  where  VAR_A_2 gt  VAR_B_2
   group by b. VAR_B_1,b.VAR_B_2
    having  VAR_A_2- VAR_B_2=min( VAR_A_2- VAR_B_2);
quit;

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away