Splitting a dataset macro by unique values of a variable

Super Contributor
Posts: 275

Hi,

I am using a macro program originally published in SUGI paper 069-2012. I am trying to adapt it to my scenario but am running into issues:

%macro split (dataset=lkbkreq.ftmonlkbk_trn_match, varname= alert_id, outlib=WORK );
  %local sel sets;
  %if %INDEX(&dataset,.) eq 0 %then %do;
    %let libname = WORK ;
    %let setname = &dataset ;
  %end;
  %else %do;
    %let libname = %SUBSTR(&dataset,1,%INDEX(&dataset,.)-1);
    %let setname = %SUBSTR(&dataset,%INDEX(&dataset,.)+1) ;
  %end;

  proc sql noprint;
    select type into :vartype
    from sashelp.vcolumn
    where libname = upcase("&libname")
      and memname = upcase("&setname")
      and memtype = "DATA"
      and upcase(name) = upcase("&varname")
    ;
  quit;

  data _null_;
    retain sel sets;
    format sel sets $20000.;
    if _n_=1 then do;
      dcl hash members (ordered: 'a');
      rc = members.definekey("&varname");
      rc = members.definedone();
      sel="select (&varname);";
      sets='';
    end;
    set &dataset (keep=&varname) end = eof;
    rc=members.add();
    if rc eq 0 then do;
      flag+1;
      sets=trim(sets)||"&outlib..set"||trim(left(&varname.));
      if "&vartype" eq "num" then
        sel=trim(sel)||" when ("||trim(left(&varname.))||
            ") output &outlib..set"||trim(left(&varname.))||";";
      else
        sel=trim(sel)||" when ('"||trim(left(&varname.))||
            "') output &outlib..set"||trim(left(&varname.))||";";
    end;
    if eof then call symputx('sel',trim(sel)||' otherwise; end;','L');
    if eof then call symputx('sets',trim(sets),'L');
  run;

  data &sets;
    set &dataset;
    &sel.
  run;

  proc export data=&sets
     outfile='/.../&sets..txt'
     dbms=dlm;
     delimiter='|';
  run;

  proc export data=&sets
     outfile='/..../&sets..csv'
     dbms=csv;
  run;
%mend split;

%split

I have two issues: the macro needs to be modified to fit a new requirement (1), and it currently fails with an error (2).

1. if an alert_id has >50000 txn, the output csv or text files need to be multiple in order to fit only 50000 txn

2. the macro variable sets is giving an error:

WORK.set0000057174FIBI120101 WORK.set0000090969FIBI120101 WORK.set0000164483FIBI120101 WORK.set0000522144EIBI120101
                                        ____________________________
                                        22
                                        76
NOTE: The SAS System stopped processing this step because of errors.
NOTE: Line generated by the macro variable "SETS".

Thanks,

thecoolking

Super User
Posts: 11,343

Re: Splitting a dataset macro by unique values of a variable

Run your code with options mprint symbolgen;

to see where in the generated code the error is occurring.

I think one issue is that the macro variable &sets contains multiple dataset names, but you are using it in places that only allow a single dataset, such as PROC EXPORT.

You may want to consider CALL EXECUTE code to perform the same steps for each data set name.
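One way to read the CALL EXECUTE suggestion, as an untested sketch rather than a definitive fix: loop over the split datasets and generate one PROC EXPORT step per dataset. It assumes the split datasets live in WORK and that their names start with SET (as in the log above); any other WORK dataset whose name starts with SET would be picked up too. The macro variable outdir and its value are placeholders for illustration, not something from the original post.

```sas
/* Hypothetical output folder: replace with a real path */
%let outdir = /tmp;

data _null_;
  /* One row per WORK dataset whose name starts with SET (assumption) */
  set sashelp.vtable (where=(libname="WORK" and memname like 'SET%'));
  length cmd $500;
  /* Build one complete PROC EXPORT step for this dataset only */
  cmd = 'proc export data=work.' || strip(memname) ||
        " outfile=""&outdir/" || strip(lowcase(memname)) || '.csv"' ||
        ' dbms=csv replace; run;';
  /* Queue the step; it runs after this DATA step finishes */
  call execute(cmd);
run;
```

Each generated step names exactly one dataset, and the file name is built from the member name alone, so no libref ends up in the path.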

You should expand on this:

1. if an alert_id has >50000 txn, the output csv or text files need to be multiple in order to fit only 50000 txn

I don't see alert_id anywhere, so I have no idea where to address it. I also do not understand what 50000 txn refers to.

Super User
Posts: 7,997

Re: Splitting a dataset macro by unique values of a variable

Try something along the lines of (note, not tested so may need some tweaking):

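data _null_;
  /* the dictionary view is SASHELP.VTABLE */
  set sashelp.vtable (where=(libname="SASHELP" and memname="CLASS"));
  /* ceil rather than floor, so the last partial chunk is exported too */
  do i=1 to ceil(nobs/5000);
    /* parentheses around (i-1) and <= on the upper bound give chunks of exactly 5000 rows */
    call execute('data tmp; set sashelp.class; if ('||strip(put(i,best.))||'-1)*5000 < _n_ <= '||strip(put(i,best.))||'*5000 then output; run;');
    /* dbms= takes an unquoted keyword */
    call execute('proc export data=tmp outfile="c:\xyz_'||strip(put(i,best.))||'.csv" dbms=csv; run;');
  end;
run;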

Super Contributor
Posts: 275

Re: Splitting a dataset macro by unique values of a variable

Thanks RW9 for the response.

Super User
Posts: 7,863

Re: Splitting a dataset macro by unique values of a variable

You can't use &sets in the proc export, as sets contains multiple dataset names and proc export only processes one dataset at a time. Also the libname in the dataset names may/will cause issues when used in a filename for a flat file.
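As an untested sketch of how both points could be handled together: export each dataset named in &sets on its own, drop the libref from the file name, and write at most 50000 rows per file using the FIRSTOBS= and OBS= dataset options. The macro names export_chunks and export_all and the outdir parameter are made up for this illustration; the 50000 limit comes from the original question.

```sas
/* Hypothetical helper: export one dataset in chunks of &chunk rows */
%macro export_chunks(ds=, outdir=, chunk=50000);
  %local dsid nobs rc nchunks i first last name;
  /* Count the observations in &ds */
  %let dsid = %sysfunc(open(&ds));
  %let nobs = %sysfunc(attrn(&dsid, nlobs));
  %let rc   = %sysfunc(close(&dsid));
  /* Number of files needed, rounded up */
  %let nchunks = %sysevalf(&nobs / &chunk, ceil);
  /* Member name without the libref, for use in the file name */
  %let name = %scan(&ds, -1, .);
  %do i = 1 %to &nchunks;
    %let first = %eval((&i - 1) * &chunk + 1);
    %let last  = %eval(&i * &chunk);
    proc export data=&ds (firstobs=&first obs=&last)
        outfile="&outdir/&name._&i..csv"
        dbms=csv replace;
    run;
  %end;
%mend export_chunks;

/* Hypothetical driver: walk the blank-separated list of dataset names */
%macro export_all(sets=, outdir=);
  %local k ds;
  %let k  = 1;
  %let ds = %scan(&sets, &k, %str( ));
  %do %while(%length(&ds) > 0);
    %export_chunks(ds=&ds, outdir=&outdir)
    %let k  = %eval(&k + 1);
    %let ds = %scan(&sets, &k, %str( ));
  %end;
%mend export_all;
```

Because sets is local to %split, a call such as %export_all(sets=&sets, outdir=/your/output/folder) would have to sit inside that macro, in place of the two PROC EXPORT steps. Note the double quotes around the outfile value: macro variables do not resolve inside single quotes. Rows are written in their stored order, so sort the data first if the chunk boundaries matter.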
