DATA Step, Macro, Functions and more

SAS query: How to split one dataset to many

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 14
Accepted Solution

SAS query: How to split one dataset to many

I got the below interesting codes (How to split one dataset to many) in the web and have added some codes on it (with ADDED). I have got the following message when i run the below program.

 

Invalid argument 5 to function CAT. Missing values may be generated.

Further warning from this call to CAT will be suppressed.

 

Anyone can help? or have a better way?

 

 


%let TABLE=sashelp.cars;
%let COLUMN=origin;

proc sql;
select distinct ORIGIN into :valList separated by ',' from SASHELP.CARS;
quit;

proc sql;
/* build a mini program for each value */
/* create a table with valid chars from data value */
select distinct
cat("DATA out_",compress(&COLUMN.,,'kad'),
"; set &TABLE.(where=(&COLUMN.='", &COLUMN.,
"'));
IF TYPE EQ 'SUV' THEN CHECK = INVOICE*1.06;                      ****ADDED***
IF TYPE EQ 'Sports' THEN CHECK = INVOICE*1.1;                    ***ADDED***
IF TYPE EQ 'Truck' THEN CHECK = INVOICE*1.12;                    ****ADDED***
IF TYPE EQ 'Sedan' THEN CHECK = INVOICE*1.14;                 ****ADDED***
run;") into :allsteps separated by ';'
from &TABLE.;
quit;

/* macro that includes the program we just generated */
%macro runSteps;
&allsteps.;
%mend;

/* and...run the macro when ready */
%runSteps;


Accepted Solutions
Solution
‎10-25-2015 01:02 AM
Super User
Super User
Posts: 6,497

Re: SAS query: How to split one dataset to many

[ Edited ]

The default length for a character variable is 200.  Add LENGTH=32767 just after the CATS() function and before the the INTO clause.

Note: Instead of posting a picture of the error message it is usually easier to others if you copy and paste the actual text of the message.

 

594 %let table=sashelp.cars;
595 %let column=origin;
596 proc sql noprint;
597 select distinct
598 cats( "DATA out_" , compress(&COLUMN,,'kad'), ";"
599 , "SET &TABLE;"
600 , "WHERE &COLUMN='", &COLUMN , "';"
601 , "IF TYPE EQ 'SUV' THEN CHECK = INVOICE*1.06;"
602 , "ELSE IF TYPE EQ 'Sports' THEN CHECK = INVOICE*1.1;"
603 , "ELSE IF TYPE EQ 'Truck' THEN CHECK = INVOICE*1.12;"
604 , "ELSE IF TYPE EQ 'Sedan' THEN CHECK = INVOICE*1.14;"
605 , "RUN;"
606 ) length=32767
607 into :allsteps separated by ' '
608 from &TABLE
609 ;
610 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.08 seconds
cpu time 0.09 seconds


611 &allsteps;

NOTE: There were 158 observations read from the data set SASHELP.CARS.
WHERE origin='Asia';
NOTE: The data set WORK.OUT_ASIA has 158 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds

 

NOTE: There were 123 observations read from the data set SASHELP.CARS.
WHERE origin='Europe';
NOTE: The data set WORK.OUT_EUROPE has 123 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

 

NOTE: There were 147 observations read from the data set SASHELP.CARS.
WHERE origin='USA';
NOTE: The data set WORK.OUT_USA has 147 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

View solution in original post


All Replies
Super User
Super User
Posts: 6,497

Re: SAS query: How to split one dataset to many

Why did you insert a series of data step IF statements into the middle of the CAT() function call in the PROC SQL step? If you want them to be included in the code that is being generated into the macro variable ALLSTEPS then they need be quoted so that they become text strings instead of statements. The same way that the strings that are used to generate the DATA and SET statements are quoted.

Occasional Contributor
Posts: 14

Re: SAS query: How to split one dataset to many

I have tried to get it quoted but it can't work for me...maybe i don't understand it as i am new in SAS.

 

Would you mind to show me?

Valued Guide
Posts: 3,208

Re: SAS query: How to split one dataset to many

http://libguides.library.kent.edu/SAS/data/subset

There are a lot of samples on the web. Finding them is mosty using correct words.   

---->-- ja karman --<-----
Occasional Contributor
Posts: 14

Re: SAS query: How to split one dataset to many

Thanks for the comment.  The reason why I used these codes is when the ORIGIN (in this case is 3) could be 5 today and 10 dataset tomorrow; or in other situations that I might need to split to 25 datasets.

 

 

Valued Guide
Posts: 3,208

Re: SAS query: How to split one dataset to many

Do not use proc sql as you are accessing all data the data-step is more eays and far faster. Only accessing ca 10% -20% of the original data you are winning with SQL.

---->-- ja karman --<-----
Super User
Super User
Posts: 6,497

Re: SAS query: How to split one dataset to many

[ Edited ]

If you are new to SAS then just concentrate on normal SAS statements. Wait to begin trying to generate SAS code with macro language or CAT() function calls until you know what statements you want to generate. For your given problem there are only three distinct values of ORIGIN in the table SASHELP.CARS.  So just write the program once for ORIGIN='Asia' and copy the code twice changing the parts that need to change.  The total lines of code in your program will be less than what you posted in the question.

DATA out_ASIA ;
  SET SASHELP.CARS (where=(ORIGIN='Asia'));
  IF TYPE EQ 'SUV' THEN CHECK = INVOICE*1.06;                      
  ELSE IF TYPE EQ 'Sports' THEN CHECK = INVOICE*1.1;
  ELSE IF TYPE EQ 'Truck' THEN CHECK = INVOICE*1.12;
  ELSE IF TYPE EQ 'Sedan' THEN CHECK = INVOICE*1.14;
RUN; 

 

Occasional Contributor
Posts: 14

Re: SAS query: How to split one dataset to many

The reason why I used these codes is when the ORIGIN (in this case is 3) could more than 5 or in other situations that I might need to split to 25 datasets

Super User
Super User
Posts: 6,497

Re: SAS query: How to split one dataset to many

Using this method to generate code into a character variable and then into a macro variable will place limits on the amount of code you can generate. The character variable is limited to 32,767 characters and a macro variable is limited to 65,534 characters.  You might want to use a method that eliminates those limits. You could create a macro that processes one subset and then call it once per distinct value.  Then you only need to generate the macro call and not the full set of SAS code that the macro could generate.  Or you could use a DATA step to write the code to a file and then %INCLUDE the file.  This method also has the advantage of being much easier to debug.

Super User
Super User
Posts: 6,497

Re: SAS query: How to split one dataset to many

[ Edited ]

Let's look at the SELECT statement that is trying to generate code into a macro variable.  (Note that there is a limit to the length of a macro variable so this technique will not work when there are a large number of distinct values of your 'split' variable).  The basic idea is that the CATS() function call is combining all of its arguments into a single string. The DISTINCT keyword makes it take just one copy of each generated string, which really means just one copy for each distinct value of the variable &COLUMN.

 

Formatting the code will make it much easier to see what the parts are doing. If you write each of the generated statements on a different line of code then it is easier for the programmer to verify that they have proper SAS statements.  Also it is easier to verify that the quotes and parentheses are balanced.  I have simplified the generated code a little. Switched from the WHERE= dataset option to a separate WHERE statement. Added in the ELSE statements that your added series of IF statements should have.  Also note that I have left the three arguments used for the DATA and WHERE statements on one line so that the source code looks a little more like the generated program would.  Note: Editted to add the LENGTH= option to allow the CATS() function to generate more than 200 characters.

 

proc sql noprint;
select distinct
  cats( "DATA out_" , compress(&COLUMN,,'kad'), ";"
      , "SET &TABLE;"
      , "WHERE &COLUMN='", &COLUMN , "';"
      , "IF TYPE EQ 'SUV' THEN CHECK = INVOICE*1.06;"
      , "ELSE IF TYPE EQ 'Sports' THEN CHECK = INVOICE*1.1;"
      , "ELSE IF TYPE EQ 'Truck' THEN CHECK = INVOICE*1.12;"
      , "ELSE IF TYPE EQ 'Sedan' THEN CHECK = INVOICE*1.14;"
      , "RUN;"
      ) length=32767
into :allsteps separated by ' '
from &TABLE
;
quit;

The extra macro that the code defines is not needed.  You can execute the statements that have been stored in the macro variable ALLSTEPS by just expanding it.

&allsteps;

 

Occasional Contributor
Posts: 14

Re: SAS query: How to split one dataset to many

When I run the code, the same error message still appear.  Attached is the file.


Cat Error Msg.JPG
Solution
‎10-25-2015 01:02 AM
Super User
Super User
Posts: 6,497

Re: SAS query: How to split one dataset to many

[ Edited ]

The default length for a character variable is 200.  Add LENGTH=32767 just after the CATS() function and before the the INTO clause.

Note: Instead of posting a picture of the error message it is usually easier to others if you copy and paste the actual text of the message.

 

594 %let table=sashelp.cars;
595 %let column=origin;
596 proc sql noprint;
597 select distinct
598 cats( "DATA out_" , compress(&COLUMN,,'kad'), ";"
599 , "SET &TABLE;"
600 , "WHERE &COLUMN='", &COLUMN , "';"
601 , "IF TYPE EQ 'SUV' THEN CHECK = INVOICE*1.06;"
602 , "ELSE IF TYPE EQ 'Sports' THEN CHECK = INVOICE*1.1;"
603 , "ELSE IF TYPE EQ 'Truck' THEN CHECK = INVOICE*1.12;"
604 , "ELSE IF TYPE EQ 'Sedan' THEN CHECK = INVOICE*1.14;"
605 , "RUN;"
606 ) length=32767
607 into :allsteps separated by ' '
608 from &TABLE
609 ;
610 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.08 seconds
cpu time 0.09 seconds


611 &allsteps;

NOTE: There were 158 observations read from the data set SASHELP.CARS.
WHERE origin='Asia';
NOTE: The data set WORK.OUT_ASIA has 158 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds

 

NOTE: There were 123 observations read from the data set SASHELP.CARS.
WHERE origin='Europe';
NOTE: The data set WORK.OUT_EUROPE has 123 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

 

NOTE: There were 147 observations read from the data set SASHELP.CARS.
WHERE origin='USA';
NOTE: The data set WORK.OUT_USA has 147 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

Occasional Contributor
Posts: 14

Re: SAS query: How to split one dataset to many

Thanks Tom.

Respected Advisor
Posts: 4,641

Re: SAS query: How to split one dataset to many

Splitting a SAS dataset is rarely a good idea, unless you are dealing with inhomogeneous data.

That said, here is a simple way to do it with call execute.

 

proc sql;
create table origins as
select distinct origin from SASHELP.CARS;
quit;

data invoiceFactors;
input type $ factor;
DATALINES;
SUV    1.06
Sports 1.1
Truck  1.12
Sedan  1.14
;

data _null_;
set origins end=done;
if _n_ = 1 then
    call execute("proc sql;");
line = catt(
    "create table Out_", origin,
    " as select a.*, invoice*factor", 
    " as check from sashelp.cars as a inner join invoiceFactors as b",
    " on a.type=b.type where origin=""", origin, """;");
call execute(line);
if done then call execute("quit;");
run;
PG
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 513 views
  • 1 like
  • 4 in conversation