I want to merge eq2 and eq3 with final look like eq.
my code works. but I want to know a more efficient way by sql or data step to do it.
Thanks
data eq;
input ID y v x z w;
cards;
1 1 1 27 40 8
1 2 3 . 29 37
1 3 5 30 . 25
1 4 7 38 38 23
2 1 1 23 45 19
2 2 3 32 20 .
2 3 5 67 . .
2 4 7 . 27 .
3 1 1 33 23 46
3 2 3 21 12 56
3 3 5 78 . 34
3 4 7 13 45 .
4 1 1 56 45 23
4 2 3 67 13 67
4 3 5 . 35 13
4 4 7 48 35 56
;
run;
data eq2;
input ID y x z w;
cards;
1 1 27 40 8
1 2 . 29 37
1 3 30 . 25
1 4 38 38 23
2 1 23 45 19
2 2 32 20 .
2 3 67 . .
2 4 . 27 .
3 1 33 23 46
3 2 21 12 56
3 3 78 . 34
3 4 13 45 .
4 1 56 45 23
4 2 67 13 67
4 3 . 35 13
4 4 48 35 56
;
run;
data eq3;
input y v ;
cards;
1 1
2 3
3 5
4 7
;
run;
proc sql;
create table m as
select * from eq2 as a
full join
eq3 as b
on a.y=b.y;
quit;
proc sort data=m ;
by id;
run;
data n;
format id y v x z w;
set m;
run;
Perfect. Here's an easy way:
data eq;
array v_values {600} _temporary_;
if _n_=1 then do until (done);
set eq3 end=done;
v_values{y} = v;
end;
set eq2;
v=.;
if (1 <= y <= 600) then v = v_values{y};
format id y v x z w;
run;
proc sort data=eq;
by id;
run;
***** EDITED to add a small correction for safety's sake.
What's the first table for, EQ, since your query only uses the last 2 tables, eq2 and eq3?
Since it's a many to one I would recommend a format as the fastest approach.
@xiangpang wrote:
I want to merge eq2 and eq3 with final look like eq.
my code works. but I want to know a more efficient way by sql or data step to do it.
Thanks
data eq; input ID y v x z w; cards; 1 1 1 27 40 8 1 2 3 . 29 37 1 3 5 30 . 25 1 4 7 38 38 23 2 1 1 23 45 19 2 2 3 32 20 . 2 3 5 67 . . 2 4 7 . 27 . 3 1 1 33 23 46 3 2 3 21 12 56 3 3 5 78 . 34 3 4 7 13 45 . 4 1 1 56 45 23 4 2 3 67 13 67 4 3 5 . 35 13 4 4 7 48 35 56 ; run; data eq2; input ID y x z w; cards; 1 1 27 40 8 1 2 . 29 37 1 3 30 . 25 1 4 38 38 23 2 1 23 45 19 2 2 32 20 . 2 3 67 . . 2 4 . 27 . 3 1 33 23 46 3 2 21 12 56 3 3 78 . 34 3 4 13 45 . 4 1 56 45 23 4 2 67 13 67 4 3 . 35 13 4 4 48 35 56 ; run; data eq3; input y v ; cards; 1 1 2 3 3 5 4 7 ; run; proc sql; create table m as select * from eq2 as a full join eq3 as b on a.y=b.y; quit; proc sort data=m ; by id; run; data n; format id y v x z w; set m; run;
I want to create a table same as EQ
Ok, well, PROC FORMAT is likely the fastest method. You can find a comparison of the different approaches, proc format, hash, sql, data step or DS2 on lexjansen.com.
how to do merge in proc format? could you provide the code?
Thanks
@xiangpang wrote:
how to do merge in proc format? could you provide the code?
Thanks
> I want to create a table same as EQ
That's not at all obvious when looking at your post.
Good questions bring good answers.
Using formats will be faster if table EQ3 is not too large, as this conserves the table order and avoids sorting the table again. Thousands of values is fine. If you have millions of values don't bother.
To avoid re-sorting, a hash table is the next choice if you have many values.
It is a matter of volume.
Otherwise, your existing code should be faster if you remove the proc sort and add order by instead.
thanks for your reply
Here is an example of code and benchmark result.
The results will vary depending on your data and your hardware.
data F;
retain FMTNAME 'QE';
do START=1 to 1e7;
LABEL=put(START,z10.);
output;
end;
run;
proc format cntlin=F;
run;
%* real time 26.92 seconds;
data W_FMT;
do START=1 to 1e7;
LABEL=put(START,qe.);
output;
end;
run;
%* real time 10.07 seconds ;
%* total 37 seconds ;
data W_HASH;
dcl hash H(dataset:'F');
H.definekey('START');
H.definedata('LABEL');
H.definedone();
LABEL=' ';
drop RC;
do START=1 to 1e7;
RC=H.find();
output;
end;
run;
%* real time 18.39 seconds ;
proc sql;
create table W_SQL as
select a.START, b.LABEL from F a, F b where a.START=b.START order by 1;
quit;
%* real time 22.96 seconds ;
Thanks for the reply. I am happy to learn from you. Thanks again
Relevant questions for some methods (some of which are much simpler):
Is Y always an integer?
How many different values do you have for Y?
Thanks for your reply.
ID Y V is always an integer.
For Y, it has about 500 different value.
Perfect. Here's an easy way:
data eq;
array v_values {600} _temporary_;
if _n_=1 then do until (done);
set eq3 end=done;
v_values{y} = v;
end;
set eq2;
v=.;
if (1 <= y <= 600) then v = v_values{y};
format id y v x z w;
run;
proc sort data=eq;
by id;
run;
***** EDITED to add a small correction for safety's sake.
Thanks. But I have two more questions.
Although "format id y v x z w;" is used, why the column sequence is "y v id x z w;"?
when y is character (letter+number), how will you change the code?
It's possible that the FORMAT statement is intended to reorder the columns. If so, getting the right order would require a small change to the program. Also notice the small correction I added.
If Y is character, a lot depends on the range of values. In some cases, a wider array could be used, with a formula to translate from letter+number to a unique integer value. But that's really what a hash table is for. An informat based on eq3 would also be possible, to be created using a CNTLIN= data set to PROC FORMAT. Sorry to say, you would need to look up some of the details there. But the DATA step would be easy:
data eq;
set eq2;
v = input(y, myform.);
run;
In this case, MYFORM. is the name of the informat that your program creates using a CNTLIN= data set to PROC FORMAT.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.