Hi experts,
I need help to understand a step.It would be great help if some one help me understand a line.
The query is: produce 2 datasets from an existing dataset.
infile datalines;
input name $ content $;
datalines;
RE_OP Sydney
RE_OP Brussels
RE_OP Paris
RE_TRF Newyork
RE_TRF Washington
RE_TRF Houston
RE_TRF Dayton
;
run;
proc sql noprint;
select distinct name into : x separated by ' ' from try;
quit;
%macro create;
%do i = 1 %to %eval(%sysfunc(count(&x, %str( )))+1);
data %scan(&x,&i.);
set try;
if name = "%scan(&x,&i.)";
run;
%end;
%mend;
%create
I would like to understand how the statement %eval(%sysfunc(count(&x, %str( )))+1); works?
I have tried to break it down to understand it in bits but struck at a point. %str()+1 how does this help in building 2 new datasets here?My understanding so far is ---> in the ist step of loop &x resolves to RE_OP, then %str() probably gives 1 and later +1 makes it 2 . Don't know if I'm right so far.
Would be of a great help, if someone understands it please explain.
%do i = 1 %to %eval(
%sysfunc(
count(&x, %str( ))
)
+1);
You're reading it wrong, it's not %str() + 1, reevaluate where the parenthesis are in the code.
%SYSFUNC() applies to COUNT() function.
Count is counting the number of spaces which is the delimiter in the macro string X.
You need to add 1 because if you have three items you only have two spaces so for the correct number of items in the list you need to add 1.
Replacing spaces with underscores the example below illustrates the issue.
Item1_Item2_Item3
This is an older method of doing this, using the automatic macro variable &SQLOBS is a better method today.
proc sql noprint;
select distinct name into : x separated by ' ' from try;
quit;
%let nCount = &sqlObs.;
%macro create;
%do i = 1 %to &nCount;
data %scan(&x,&i.);
set try;
if name = "%scan(&x,&i.)";
run;
%end;
%mend;
%create;
Ultimately, in SAS, splitting your data sets is never a recommended method. It leads you down a road of macro loops and is inefficient and cumbersome. SAS has BY group processing, CALL EXECUTE and several other ways of splitting analysis that make this unnecessary 99% of the time.
This blog post covers several other methods of doing this and echos the general sentiment of "don't do this".
https://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/
@sahoositaram555 wrote:
Hi experts,
I need help to understand a step.It would be great help if some one help me understand a line.
The query is: produce 2 datasets from an existing dataset.
I have referred to a solution by @AmitRathoredata try;infile datalines;
input name $ content $;
datalines;
RE_OP Sydney
RE_OP Brussels
RE_OP Paris
RE_TRF Newyork
RE_TRF Washington
RE_TRF Houston
RE_TRF Dayton
;
run;
proc sql noprint;
select distinct name into : x separated by ' ' from try;
quit;
%macro create;
%do i = 1 %to %eval(%sysfunc(count(&x, %str( )))+1);
data %scan(&x,&i.);
set try;
if name = "%scan(&x,&i.)";
run;
%end;
%mend;
%create
I would like to understand how the statement %eval(%sysfunc(count(&x, %str( )))+1); works?
I have tried to break it down to understand it in bits but struck at a point. %str()+1 how does this help in building 2 new datasets here?My understanding so far is ---> in the ist step of loop &x resolves to RE_OP, then %str() probably gives 1 and later +1 makes it 2 . Don't know if I'm right so far.
Would be of a great help, if someone understands it please explain.
One, it is poor style to let macro variables just fall into the middle of macro code. This is done with the &X variable which is a space separate list of the city names (and flawed as this will fail for towns that have a name with spaces like "New York".)
Add a %put Macro variable X is: &x. ; after that proc SQL step to see what is in the variable. Hint: none of RE_OP or similar appear.
Instead of %eval(%sysfunc(count(&x, %str( )))+1); which basically counts spaces in the list,
would be to use
%sysfunc(countw(&x)) which counts the words.
The %scan(&x,&i.); pulls out one of those city names for each iteration of the loop. So that is used in two places, creating a data set name and the condition to find matching specific values.
Note that the cause of City names with spaces will yield unexpected results as Data New York; will create two data sets, one name New and the other named York in the work library.
A great tool for seeing what is happening with macro coding is to add:
OPTIONS MPRINT;
before the macro function call. Then the code generated by the macro will be shown in the log. If you include the options MLOGIC and SYMBOLGEN you will details about how logic constructs are used and the way macro variables values are created.
%do i = 1 %to %eval(
%sysfunc(
count(&x, %str( ))
)
+1);
You're reading it wrong, it's not %str() + 1, reevaluate where the parenthesis are in the code.
%SYSFUNC() applies to COUNT() function.
Count is counting the number of spaces which is the delimiter in the macro string X.
You need to add 1 because if you have three items you only have two spaces so for the correct number of items in the list you need to add 1.
Replacing spaces with underscores the example below illustrates the issue.
Item1_Item2_Item3
This is an older method of doing this, using the automatic macro variable &SQLOBS is a better method today.
proc sql noprint;
select distinct name into : x separated by ' ' from try;
quit;
%let nCount = &sqlObs.;
%macro create;
%do i = 1 %to &nCount;
data %scan(&x,&i.);
set try;
if name = "%scan(&x,&i.)";
run;
%end;
%mend;
%create;
Ultimately, in SAS, splitting your data sets is never a recommended method. It leads you down a road of macro loops and is inefficient and cumbersome. SAS has BY group processing, CALL EXECUTE and several other ways of splitting analysis that make this unnecessary 99% of the time.
This blog post covers several other methods of doing this and echos the general sentiment of "don't do this".
https://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/
@sahoositaram555 wrote:
Hi experts,
I need help to understand a step.It would be great help if some one help me understand a line.
The query is: produce 2 datasets from an existing dataset.
I have referred to a solution by @AmitRathoredata try;infile datalines;
input name $ content $;
datalines;
RE_OP Sydney
RE_OP Brussels
RE_OP Paris
RE_TRF Newyork
RE_TRF Washington
RE_TRF Houston
RE_TRF Dayton
;
run;
proc sql noprint;
select distinct name into : x separated by ' ' from try;
quit;
%macro create;
%do i = 1 %to %eval(%sysfunc(count(&x, %str( )))+1);
data %scan(&x,&i.);
set try;
if name = "%scan(&x,&i.)";
run;
%end;
%mend;
%create
I would like to understand how the statement %eval(%sysfunc(count(&x, %str( )))+1); works?
I have tried to break it down to understand it in bits but struck at a point. %str()+1 how does this help in building 2 new datasets here?My understanding so far is ---> in the ist step of loop &x resolves to RE_OP, then %str() probably gives 1 and later +1 makes it 2 . Don't know if I'm right so far.
Would be of a great help, if someone understands it please explain.
@Reeza says:
Ultimately, in SAS, splitting your data sets is never a recommended method. It leads you down a road of macro loops and is inefficient and cumbersome. SAS has BY group processing, CALL EXECUTE and several other ways of splitting analysis that make this unnecessary 99% of the time.
Bravo! Excellent! Great advice!
@sahoositaram555 , please listen to Reeza
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.