BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hhchenfx
Rhodochrosite | Level 12

Hi Everyone,

 

I want to run the below macro to create individual file for each record.

However, I dont know how to tell SAS to run "to the end of file".

Could anyone help me out?

So many thanks,

HHC


data have;
input time start;
datalines;
1 1
2 1
3 .
4 -1
5 .
6 1
7 -1
8 .
9 .
10 9
11 2
12 6
1 1
2 1
3 .
4 -1
5 .
6 1
7 -133
8 .
9 .
10 933
11 233
12 633
1 1
2 1
3 .
4 -1
5 .
6 1
7 -1
8 .
9 .
10 9
11 2
12 6
1 1
2 1
3 .
4 -1
5 .
6 1
7 -133
8 .
9 .
10 933
11 233
12 633
;
run;

%macro split;

	/*add this 4 line of code and it will create the &nobs; 
		dont need to make any adjustment on nobs 
		this below run exactly 4 rounds*/

%local dsid nobs rc;
%let dsid = %sysfunc(open(condition));
%let nobs = %sysfunc(attrn(&dsid, nlobs));
%let rc = %sysfunc(close(&dsid));


	%do i=1 %to &nobs;
		data temp; set have;
		if (&i-1)*10<_N_<&i*10;
		
		start2=start*2;
		run;

		%if &i=1 %then %do;
			data final; set temp;run;
			%end;
		%else %do;
			data final; set final temp; run;
			%end;
	%end;
%mend;

%split;

 

 

 

%macro cond;
%do i=1 %to 4; *HOW TO MAKE IT: i=1 to END OF FILE ;
data single_condition_&i; set condition;
if _N_=&i;
run;

MACRO CODE BODY
MACRO CODE BODY

%end;
%mend;
%cond;run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

You could of course save yourself the i/o and temp file management by using call execute:

data condition; 
  input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;
data fulldata; 
  input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;

data _null_;
  set condition end=last;
  if _n_=1 then call execute('data want; set fulldata;');
  call execute('if ',strip(cond1),'=',strip(value1),' and ',strip(cond2),'=',strip(value2),' then condition_id=',strip(condition_id),';');
  if last then call execute(' if condition_id >= 1 then output; run;');
run;

However the above is a total faff caused by bad structure of the data - i.e. using transposed data.  If you normalise the data then you don't need code generation at all, its is just a merge:

/* This is the problem dataset, modify this */
data condition; 
  input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;

data condition (keep=chk:);
  set condition end=last;
  retain chk_a chk_b chk_c chk_d;
  if cond1="a" then chk_a=value1;
  if cond1="c" then chk_b=value1;
  if cond2="b" then chk_c=value2;
  if cond2="d" then chk_d=value2;
  if last then output;
run;

data fulldata; 
  input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;

proc sql;
  create table WANT as
  select  A.*
  from    FULLDATA A
  left join CONDITION B
  on      1=1
  where   (A=CHK_A and B=CHK_B) 
    or    (C=CHK_C and D=CHK_D);
quit; 

Structure of data is pivotal in how complex/messy code is.

View solution in original post

11 REPLIES 11
LaurieF
Barite | Level 11

Here's a simple way of making it work, using the numeric attribute nlobs (number of logical observations (to ignore deleted observations - usually a good idea)) from the dataset:

 

option mprint notes;

data condition; 
input condition_ID cond1 $ value1 ;
datalines;
1 a 4 
2 c 1
3 a 1
4 f 40
;
run;

%macro cond;
%local dsid nobs rc;
%let dsid = %sysfunc(open(condition));
%let nobs = %sysfunc(attrn(&dsid, nlobs));
%let rc = %sysfunc(close(&dsid));
	%do i=1 %to &nobs;	
		data single_condition_&i; 
		set condition(firstobs=&i obs=&i);
		run;
	%end;
%mend cond;
%cond;

Note that I've used firstobs and obs instead of _n_ - if condition is large, this will be more efficient.

 

I'm interested - what's the problem you're trying to solve?

mkeintz
PROC Star

If you're asking this question to resolve the revision to your request Merging data based on (variable names from 1 data file) and (variables value from another data file, then making the macro loop you contemplate is not an efficient solution.  Instead use a PUT statement in the first DATA step to write out neccessary code, and then %include that code in a second data step:

 

data condition; 
input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;run;
data FULLDATA; 
input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;run;

filename tmp temp;
data _null_;
  set condition end=last_cond;
  file tmp;
  put 'if ' cond1 '=' value1 'and ' cond2 '=' value2  
      ' then condition_id=' condition_id ';' @;
  if last_cond=0  then put 'else';
run;

options source2;
data want;
  set fulldata;
  %include tmp;
  if condition_id >=1;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
hhchenfx
Rhodochrosite | Level 12

Both of your codes work fine.

However, for the sake of knowledge, could you tell me how to put this End_of_File number into my Macro.

Thank you.

HHC

 

LaurieF
Barite | Level 11

Well, you don't need to - my code picks the number of observations up from the metadata for the dataset. It interrogates the number of observations and always gets it right (as long as the dataset is a SAS one), saving you the bother of putting it in as a parameter.

mkeintz
PROC Star

You may use the "nobs=" parameter of the set statement, as in:

 

filename tmp temp;
data _null_;
  set condition end=last_cond  nobs=nconds;
  file tmp;
  put 'if ' cond1 '=' value1 'and ' cond2 '=' value2 
      ' then condition_id=' condition_id ' out of ' nconds ';' @;
  if last_cond=0  then put 'else';

  else putlog "there are a total of " nconds " conditions." ;
run;

 

%put &=cond;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
RW9
Diamond | Level 26 RW9
Diamond | Level 26

You could of course save yourself the i/o and temp file management by using call execute:

data condition; 
  input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;
data fulldata; 
  input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;

data _null_;
  set condition end=last;
  if _n_=1 then call execute('data want; set fulldata;');
  call execute('if ',strip(cond1),'=',strip(value1),' and ',strip(cond2),'=',strip(value2),' then condition_id=',strip(condition_id),';');
  if last then call execute(' if condition_id >= 1 then output; run;');
run;

However the above is a total faff caused by bad structure of the data - i.e. using transposed data.  If you normalise the data then you don't need code generation at all, its is just a merge:

/* This is the problem dataset, modify this */
data condition; 
  input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;

data condition (keep=chk:);
  set condition end=last;
  retain chk_a chk_b chk_c chk_d;
  if cond1="a" then chk_a=value1;
  if cond1="c" then chk_b=value1;
  if cond2="b" then chk_c=value2;
  if cond2="d" then chk_d=value2;
  if last then output;
run;

data fulldata; 
  input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;

proc sql;
  create table WANT as
  select  A.*
  from    FULLDATA A
  left join CONDITION B
  on      1=1
  where   (A=CHK_A and B=CHK_B) 
    or    (C=CHK_C and D=CHK_D);
quit; 

Structure of data is pivotal in how complex/messy code is.

hhchenfx
Rhodochrosite | Level 12

Hi RW9,

 

It seems fine but I dont know why SAS gives error notice  "The EXECUTE subroutine call has too many arguments."

 

HHC

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Yes, sorry, too used to using cat() functions.  This:

  call execute('if ',strip(cond1),'=',strip(value1),' and ',strip(cond2),'=',strip(value2),' then condition_id=',strip(condition_id),';');

Should read:

  call execute('if '||strip(cond1)||'='||strip(value1)||' and '||strip(cond2)||'='||strip(value2)||' then condition_id='||strip(condition_id)||';');

 

hhchenfx
Rhodochrosite | Level 12

The code works now. 

Since for each condition, I want to get all records meet it, the final file will have some duplicate due to the fact that 1 record satistfy multiple conditions. 

I modify your code a bit and get it done.

 

What I wonder is the role of "if condition_id>=1".

In the log file, the SAS understand the code as:

 

 

data want2; set fulldata;
if a=4 and b=5 then do; condition_id=1;output;end;
if c=1 and d=2 then do; condition_id=2;output;end;
if c=9 and d=9 then do; condition_id=3;output;end;
if condition_id >= 1;
run;

In this case having or not having the "if condition_id>=1" doesn't make any difference.

 

So what is the role of that "if condition_id>=1" in your code and mkeintz code. Clearly, I miss an important point here.

 

 

if last then call execute(' if condition_id >= 1 ; run;');

 

 

 

By the way, the code I finally got is below.

 

Thanks alot,

HHC

 

data _null_;
  set condition end=last;
  if _n_=1 then call execute('data want2; set fulldata;');
  call execute('if '||strip(cond1)||'='||strip(value1)||' and '||strip(cond2)||'='||strip(value2)||' then do; condition_id='||strip(condition_id)|| ';output;end;');

if last then call execute(' if condition_id >= 1 ; run;');
run;



data _null_;
  set condition end=last;
  if _n_=1 then call execute('data want2; set fulldata;');
  call execute('if '||strip(cond1)||'='||strip(value1)||' and '||strip(cond2)||'='||strip(value2)||' then do; condition_id='||strip(condition_id)|| ';output;end;');

if last then call execute(' ; run;');
run;

 

 

mkeintz
PROC Star

@hhchenfx wrote:

The code works now. 

Since for each condition, I want to get all records meet it, the final file will have some duplicate due to the fact that 1 record satistfy multiple conditions. 

I modify your code a bit and get it done.

 

What I wonder is the role of "if condition_id>=1".

In the log file, the SAS understand the code as:

 


I believe I told you to remove the "if condition_id>1" statement once the explicit OUTPUT statements were introduced in the other topic you started that is confounded with this one.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
hhchenfx
Rhodochrosite | Level 12

Yes you did.

Thanks,

HHC

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2771 views
  • 0 likes
  • 4 in conversation