DATA Step, Macro, Functions and more

issue in subset dataset with macro

Reply
Contributor
Posts: 24

issue in subset dataset with macro

Hi, Can anyone please help me on below scenario. battrid test HEMA HB HEMA WBC CHEM SGPT CHEM SGOT I have this dataset where i have distinct groups. I want to create new datasets according to distinct groups. For Example, if group is HEMA then new dataset should be HEMA having correspoding records only and if group name is CHEM then new datset name should be CHEM with corresponding records. I have programmed a code , but that is not working as expected. Rajesh.
Attachment
Super User
Posts: 6,945

Re: issue in subset dataset with macro

[ Edited ]

Try this:

proc sql noprint;
select distinct(battrid) into :groups separated by " " from lbload;
quit;

data _null_;
groups = "&groups";
call execute("data ");
do i = 1 to countw(groups);
  call execute(scan(groups,i)!!" ");
end;
call execute("; set lbload; select(battrid);");
do i = 1 to countw(groups);
  call execute('when("'!!scan(groups,i)!!'") output '!!scan(groups,i)!!';');
end;
call execute('end; run;');
run;

This is of course completely untested and depends on the number of distinct battrid's to work; with high cardinality, you might run into memory problems. If you only have 8, that should work like a charm.

Another method would be to use call execute to create a single data step for each group, but that would multiply the scans through lbload, therefore be less performant.

 

Edit: added "end" in final call execute

Worked fine with test data from

data lbload;
input battrid $ value;
cards;
HEMA 1
HB 1
HEMA 2
WBC 1
CHEM 1
SGPT 1
CHEM 2
SGOT 1
;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 7,405

Re: issue in subset dataset with macro

Why do you not use an industry standard data structure such as CDISC LAB, LB, ADLB models?  You will find your programming much easier to use normalised data (i.e. test - result per row, rather than in columns) as you can then use by group processing - i.e. no need to create separate files, know what columns are called etc.

https://www.cdisc.org/standards/foundational/lab

For example, to print your multiple datasets you would need a step for each datastep, for me using normalised data and by groups:

proc print data=have; 
  by parcat;
  title "Category=#byval1";
run;

Will produce print output of the data separated by the parameter categories, far easier to maintain.

Ask a Question
Discussion stats
  • 2 replies
  • 168 views
  • 0 likes
  • 3 in conversation