BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hhchenfx
Barite | Level 11

Hi Everyone,

 

I want to run the below macro to create individual file for each record.

However, I dont know how to tell SAS to run "to the end of file".

Could anyone help me out?

So many thanks,

HHC


data have;
input time start;
datalines;
1 1
2 1
3 .
4 -1
5 .
6 1
7 -1
8 .
9 .
10 9
11 2
12 6
1 1
2 1
3 .
4 -1
5 .
6 1
7 -133
8 .
9 .
10 933
11 233
12 633
1 1
2 1
3 .
4 -1
5 .
6 1
7 -1
8 .
9 .
10 9
11 2
12 6
1 1
2 1
3 .
4 -1
5 .
6 1
7 -133
8 .
9 .
10 933
11 233
12 633
;
run;

%macro split;

	/*add this 4 line of code and it will create the &nobs; 
		dont need to make any adjustment on nobs 
		this below run exactly 4 rounds*/

%local dsid nobs rc;
%let dsid = %sysfunc(open(condition));
%let nobs = %sysfunc(attrn(&dsid, nlobs));
%let rc = %sysfunc(close(&dsid));


	%do i=1 %to &nobs;
		data temp; set have;
		if (&i-1)*10<_N_<&i*10;
		
		start2=start*2;
		run;

		%if &i=1 %then %do;
			data final; set temp;run;
			%end;
		%else %do;
			data final; set final temp; run;
			%end;
	%end;
%mend;

%split;

 

 

 

%macro cond;
%do i=1 %to 4; *HOW TO MAKE IT: i=1 to END OF FILE ;
data single_condition_&i; set condition;
if _N_=&i;
run;

MACRO CODE BODY
MACRO CODE BODY

%end;
%mend;
%cond;run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

You could of course save yourself the i/o and temp file management by using call execute:

data condition; 
  input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;
data fulldata; 
  input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;

data _null_;
  set condition end=last;
  if _n_=1 then call execute('data want; set fulldata;');
  call execute('if ',strip(cond1),'=',strip(value1),' and ',strip(cond2),'=',strip(value2),' then condition_id=',strip(condition_id),';');
  if last then call execute(' if condition_id >= 1 then output; run;');
run;

However the above is a total faff caused by bad structure of the data - i.e. using transposed data.  If you normalise the data then you don't need code generation at all, its is just a merge:

/* This is the problem dataset, modify this */
data condition; 
  input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;

data condition (keep=chk:);
  set condition end=last;
  retain chk_a chk_b chk_c chk_d;
  if cond1="a" then chk_a=value1;
  if cond1="c" then chk_b=value1;
  if cond2="b" then chk_c=value2;
  if cond2="d" then chk_d=value2;
  if last then output;
run;

data fulldata; 
  input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;

proc sql;
  create table WANT as
  select  A.*
  from    FULLDATA A
  left join CONDITION B
  on      1=1
  where   (A=CHK_A and B=CHK_B) 
    or    (C=CHK_C and D=CHK_D);
quit; 

Structure of data is pivotal in how complex/messy code is.

View solution in original post

11 REPLIES 11
LaurieF
Barite | Level 11

Here's a simple way of making it work, using the numeric attribute nlobs (number of logical observations (to ignore deleted observations - usually a good idea)) from the dataset:

 

option mprint notes;

data condition; 
input condition_ID cond1 $ value1 ;
datalines;
1 a 4 
2 c 1
3 a 1
4 f 40
;
run;

%macro cond;
%local dsid nobs rc;
%let dsid = %sysfunc(open(condition));
%let nobs = %sysfunc(attrn(&dsid, nlobs));
%let rc = %sysfunc(close(&dsid));
	%do i=1 %to &nobs;	
		data single_condition_&i; 
		set condition(firstobs=&i obs=&i);
		run;
	%end;
%mend cond;
%cond;

Note that I've used firstobs and obs instead of _n_ - if condition is large, this will be more efficient.

 

I'm interested - what's the problem you're trying to solve?

mkeintz
PROC Star

If you're asking this question to resolve the revision to your request Merging data based on (variable names from 1 data file) and (variables value from another data file, then making the macro loop you contemplate is not an efficient solution.  Instead use a PUT statement in the first DATA step to write out neccessary code, and then %include that code in a second data step:

 

data condition; 
input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;run;
data FULLDATA; 
input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;run;

filename tmp temp;
data _null_;
  set condition end=last_cond;
  file tmp;
  put 'if ' cond1 '=' value1 'and ' cond2 '=' value2  
      ' then condition_id=' condition_id ';' @;
  if last_cond=0  then put 'else';
run;

options source2;
data want;
  set fulldata;
  %include tmp;
  if condition_id >=1;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
hhchenfx
Barite | Level 11

Both of your codes work fine.

However, for the sake of knowledge, could you tell me how to put this End_of_File number into my Macro.

Thank you.

HHC

 

LaurieF
Barite | Level 11

Well, you don't need to - my code picks the number of observations up from the metadata for the dataset. It interrogates the number of observations and always gets it right (as long as the dataset is a SAS one), saving you the bother of putting it in as a parameter.

mkeintz
PROC Star

You may use the "nobs=" parameter of the set statement, as in:

 

filename tmp temp;
data _null_;
  set condition end=last_cond  nobs=nconds;
  file tmp;
  put 'if ' cond1 '=' value1 'and ' cond2 '=' value2 
      ' then condition_id=' condition_id ' out of ' nconds ';' @;
  if last_cond=0  then put 'else';

  else putlog "there are a total of " nconds " conditions." ;
run;

 

%put &=cond;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
RW9
Diamond | Level 26 RW9
Diamond | Level 26

You could of course save yourself the i/o and temp file management by using call execute:

data condition; 
  input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;
data fulldata; 
  input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;

data _null_;
  set condition end=last;
  if _n_=1 then call execute('data want; set fulldata;');
  call execute('if ',strip(cond1),'=',strip(value1),' and ',strip(cond2),'=',strip(value2),' then condition_id=',strip(condition_id),';');
  if last then call execute(' if condition_id >= 1 then output; run;');
run;

However the above is a total faff caused by bad structure of the data - i.e. using transposed data.  If you normalise the data then you don't need code generation at all, its is just a merge:

/* This is the problem dataset, modify this */
data condition; 
  input condition_ID cond1 :$1. value1 cond2 :$1. value2;
datalines;
1 a 4 b 5
2 c 1 d 2
;
run;

data condition (keep=chk:);
  set condition end=last;
  retain chk_a chk_b chk_c chk_d;
  if cond1="a" then chk_a=value1;
  if cond1="c" then chk_b=value1;
  if cond2="b" then chk_c=value2;
  if cond2="d" then chk_d=value2;
  if last then output;
run;

data fulldata; 
  input date a b c d ;
datalines;
11 4 5 1 1
12 4 5 2 5
13 4 1 2 6
14 8 3 1 2
;
run;

proc sql;
  create table WANT as
  select  A.*
  from    FULLDATA A
  left join CONDITION B
  on      1=1
  where   (A=CHK_A and B=CHK_B) 
    or    (C=CHK_C and D=CHK_D);
quit; 

Structure of data is pivotal in how complex/messy code is.

hhchenfx
Barite | Level 11

Hi RW9,

 

It seems fine but I dont know why SAS gives error notice  "The EXECUTE subroutine call has too many arguments."

 

HHC

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Yes, sorry, too used to using cat() functions.  This:

  call execute('if ',strip(cond1),'=',strip(value1),' and ',strip(cond2),'=',strip(value2),' then condition_id=',strip(condition_id),';');

Should read:

  call execute('if '||strip(cond1)||'='||strip(value1)||' and '||strip(cond2)||'='||strip(value2)||' then condition_id='||strip(condition_id)||';');

 

hhchenfx
Barite | Level 11

The code works now. 

Since for each condition, I want to get all records meet it, the final file will have some duplicate due to the fact that 1 record satistfy multiple conditions. 

I modify your code a bit and get it done.

 

What I wonder is the role of "if condition_id>=1".

In the log file, the SAS understand the code as:

 

 

data want2; set fulldata;
if a=4 and b=5 then do; condition_id=1;output;end;
if c=1 and d=2 then do; condition_id=2;output;end;
if c=9 and d=9 then do; condition_id=3;output;end;
if condition_id >= 1;
run;

In this case having or not having the "if condition_id>=1" doesn't make any difference.

 

So what is the role of that "if condition_id>=1" in your code and mkeintz code. Clearly, I miss an important point here.

 

 

if last then call execute(' if condition_id >= 1 ; run;');

 

 

 

By the way, the code I finally got is below.

 

Thanks alot,

HHC

 

data _null_;
  set condition end=last;
  if _n_=1 then call execute('data want2; set fulldata;');
  call execute('if '||strip(cond1)||'='||strip(value1)||' and '||strip(cond2)||'='||strip(value2)||' then do; condition_id='||strip(condition_id)|| ';output;end;');

if last then call execute(' if condition_id >= 1 ; run;');
run;



data _null_;
  set condition end=last;
  if _n_=1 then call execute('data want2; set fulldata;');
  call execute('if '||strip(cond1)||'='||strip(value1)||' and '||strip(cond2)||'='||strip(value2)||' then do; condition_id='||strip(condition_id)|| ';output;end;');

if last then call execute(' ; run;');
run;

 

 

mkeintz
PROC Star

@hhchenfx wrote:

The code works now. 

Since for each condition, I want to get all records meet it, the final file will have some duplicate due to the fact that 1 record satistfy multiple conditions. 

I modify your code a bit and get it done.

 

What I wonder is the role of "if condition_id>=1".

In the log file, the SAS understand the code as:

 


I believe I told you to remove the "if condition_id>1" statement once the explicit OUTPUT statements were introduced in the other topic you started that is confounded with this one.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
hhchenfx
Barite | Level 11

Yes you did.

Thanks,

HHC

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1558 views
  • 0 likes
  • 4 in conversation