Hi Everyone,
I want to run the below macro to create individual file for each record.
However, I dont know how to tell SAS to run "to the end of file".
Could anyone help me out?
So many thanks,
HHC
data have;
input time start;
datalines;
1 1
2 1
3 .
4 -1
5 .
6 1
7 -1
8 .
9 .
10 9
11 2
12 6
1 1
2 1
3 .
4 -1
5 .
6 1
7 -133
8 .
9 .
10 933
11 233
12 633
1 1
2 1
3 .
4 -1
5 .
6 1
7 -1
8 .
9 .
10 9
11 2
12 6
1 1
2 1
3 .
4 -1
5 .
6 1
7 -133
8 .
9 .
10 933
11 233
12 633
;
run;
%macro split;
/*add this 4 line of code and it will create the &nobs;
dont need to make any adjustment on nobs
this below run exactly 4 rounds*/
%local dsid nobs rc;
%let dsid = %sysfunc(open(condition));
%let nobs = %sysfunc(attrn(&dsid, nlobs));
%let rc = %sysfunc(close(&dsid));
%do i=1 %to &nobs;
data temp; set have;
if (&i-1)*10<_N_<&i*10;
start2=start*2;
run;
%if &i=1 %then %do;
data final; set temp;run;
%end;
%else %do;
data final; set final temp; run;
%end;
%end;
%mend;
%split;
%macro cond;
%do i=1 %to 4; *HOW TO MAKE IT: i=1 to END OF FILE ;
data single_condition_&i; set condition;
if _N_=&i;
run;
MACRO CODE BODY
MACRO CODE BODY
%end;
%mend;
%cond;run;
You could of course save yourself the i/o and temp file management by using call execute:
data condition;
input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;
data fulldata;
input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;
data _null_;
set condition end=last;
if _n_=1 then call execute('data want; set fulldata;');
call execute('if ',strip(cond1),'=',strip(value1),' and ',strip(cond2),'=',strip(value2),' then condition_id=',strip(condition_id),';');
if last then call execute(' if condition_id >= 1 then output; run;');
run;
However the above is a total faff caused by bad structure of the data - i.e. using transposed data. If you normalise the data then you don't need code generation at all, its is just a merge:
/* This is the problem dataset, modify this */
data condition;
input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;
data condition (keep=chk:);
set condition end=last;
retain chk_a chk_b chk_c chk_d;
if cond1="a" then chk_a=value1;
if cond1="c" then chk_b=value1;
if cond2="b" then chk_c=value2;
if cond2="d" then chk_d=value2;
if last then output;
run;
data fulldata;
input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;
proc sql;
create table WANT as
select A.*
from FULLDATA A
left join CONDITION B
on 1=1
where (A=CHK_A and B=CHK_B)
or (C=CHK_C and D=CHK_D);
quit;
Structure of data is pivotal in how complex/messy code is.
Here's a simple way of making it work, using the numeric attribute nlobs (number of logical observations (to ignore deleted observations - usually a good idea)) from the dataset:
option mprint notes;
data condition;
input condition_ID cond1 $ value1 ;
datalines;
1 a 4
2 c 1
3 a 1
4 f 40
;
run;
%macro cond;
%local dsid nobs rc;
%let dsid = %sysfunc(open(condition));
%let nobs = %sysfunc(attrn(&dsid, nlobs));
%let rc = %sysfunc(close(&dsid));
%do i=1 %to &nobs;
data single_condition_&i;
set condition(firstobs=&i obs=&i);
run;
%end;
%mend cond;
%cond;
Note that I've used firstobs and obs instead of _n_ - if condition is large, this will be more efficient.
I'm interested - what's the problem you're trying to solve?
If you're asking this question to resolve the revision to your request Merging data based on (variable names from 1 data file) and (variables value from another data file, then making the macro loop you contemplate is not an efficient solution. Instead use a PUT statement in the first DATA step to write out neccessary code, and then %include that code in a second data step:
data condition;
input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;run;
data FULLDATA;
input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;run;
filename tmp temp;
data _null_;
set condition end=last_cond;
file tmp;
put 'if ' cond1 '=' value1 'and ' cond2 '=' value2
' then condition_id=' condition_id ';' @;
if last_cond=0 then put 'else';
run;
options source2;
data want;
set fulldata;
%include tmp;
if condition_id >=1;
run;
Both of your codes work fine.
However, for the sake of knowledge, could you tell me how to put this End_of_File number into my Macro.
Thank you.
HHC
Well, you don't need to - my code picks the number of observations up from the metadata for the dataset. It interrogates the number of observations and always gets it right (as long as the dataset is a SAS one), saving you the bother of putting it in as a parameter.
You may use the "nobs=" parameter of the set statement, as in:
filename tmp temp;
data _null_;
set condition end=last_cond nobs=nconds;
file tmp;
put 'if ' cond1 '=' value1 'and ' cond2 '=' value2
' then condition_id=' condition_id ' out of ' nconds ';' @;
if last_cond=0 then put 'else';
else putlog "there are a total of " nconds " conditions." ;
run;
%put &=cond;
You could of course save yourself the i/o and temp file management by using call execute:
data condition;
input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;
data fulldata;
input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;
data _null_;
set condition end=last;
if _n_=1 then call execute('data want; set fulldata;');
call execute('if ',strip(cond1),'=',strip(value1),' and ',strip(cond2),'=',strip(value2),' then condition_id=',strip(condition_id),';');
if last then call execute(' if condition_id >= 1 then output; run;');
run;
However the above is a total faff caused by bad structure of the data - i.e. using transposed data. If you normalise the data then you don't need code generation at all, its is just a merge:
/* This is the problem dataset, modify this */
data condition;
input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;
data condition (keep=chk:);
set condition end=last;
retain chk_a chk_b chk_c chk_d;
if cond1="a" then chk_a=value1;
if cond1="c" then chk_b=value1;
if cond2="b" then chk_c=value2;
if cond2="d" then chk_d=value2;
if last then output;
run;
data fulldata;
input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;
proc sql;
create table WANT as
select A.*
from FULLDATA A
left join CONDITION B
on 1=1
where (A=CHK_A and B=CHK_B)
or (C=CHK_C and D=CHK_D);
quit;
Structure of data is pivotal in how complex/messy code is.
Hi RW9,
It seems fine but I dont know why SAS gives error notice "The EXECUTE subroutine call has too many arguments."
HHC
Yes, sorry, too used to using cat() functions. This:
call execute('if ',strip(cond1),'=',strip(value1),' and ',strip(cond2),'=',strip(value2),' then condition_id=',strip(condition_id),';');
Should read:
call execute('if '||strip(cond1)||'='||strip(value1)||' and '||strip(cond2)||'='||strip(value2)||' then condition_id='||strip(condition_id)||';');
The code works now.
Since for each condition, I want to get all records meet it, the final file will have some duplicate due to the fact that 1 record satistfy multiple conditions.
I modify your code a bit and get it done.
What I wonder is the role of "if condition_id>=1".
In the log file, the SAS understand the code as:
data want2; set fulldata;
if a=4 and b=5 then do; condition_id=1;output;end;
if c=1 and d=2 then do; condition_id=2;output;end;
if c=9 and d=9 then do; condition_id=3;output;end;
if condition_id >= 1;
run;
In this case having or not having the "if condition_id>=1" doesn't make any difference.
So what is the role of that "if condition_id>=1" in your code and mkeintz code. Clearly, I miss an important point here.
if last then call execute(' if condition_id >= 1 ; run;');
By the way, the code I finally got is below.
Thanks alot,
HHC
data _null_;
set condition end=last;
if _n_=1 then call execute('data want2; set fulldata;');
call execute('if '||strip(cond1)||'='||strip(value1)||' and '||strip(cond2)||'='||strip(value2)||' then do; condition_id='||strip(condition_id)|| ';output;end;');
if last then call execute(' if condition_id >= 1 ; run;');
run;
data _null_;
set condition end=last;
if _n_=1 then call execute('data want2; set fulldata;');
call execute('if '||strip(cond1)||'='||strip(value1)||' and '||strip(cond2)||'='||strip(value2)||' then do; condition_id='||strip(condition_id)|| ';output;end;');
if last then call execute(' ; run;');
run;
@hhchenfx wrote:
The code works now.
Since for each condition, I want to get all records meet it, the final file will have some duplicate due to the fact that 1 record satistfy multiple conditions.
I modify your code a bit and get it done.
What I wonder is the role of "if condition_id>=1".
In the log file, the SAS understand the code as:
I believe I told you to remove the "if condition_id>1" statement once the explicit OUTPUT statements were introduced in the other topic you started that is confounded with this one.
Yes you did.
Thanks,
HHC
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.