BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PeterClemmensen
Tourmaline | Level 20

Been searching around the internet a while for a solution to this one.. Lets say I have a dataset with the variable ID as

 

data have;
	input ID $ var1 var2;
datalines;
one 100 200
three 300 400
three 500 200
three 200 100
two 400 200
two 300 166
one 300 100
three 400 400
;

Now, what if I want to dynamically split up this dataset into three datasets called one two and three? And I want to input the unique variable names of ID into a macro variable such as macrovar = one two three 🙂

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
	input ID $ var1 var2;
datalines;
one 100 200
one 100 200
three 300 400
three 500 200
three 200 100
two 400 200
two 300 166
one 300 100
three 400 400
;

proc sort data=have;by id;run;
data _null_;
 if _n_=1 then do;
   if 0 then set have;
   declare hash h(dataset:'have(obs=0)',multidata:'y');
   h.definekey(all:'y');
   h.definedata(all:'y');
   h.definedone();
 end;
do until(last.id);
 set have;
 by id;
 h.add();
end;
h.output(dataset:id);
h.clear();
run;

View solution in original post

13 REPLIES 13
lakshmi_74
Quartz | Level 8
data one two three;
set have;
select(ID);
when('one') output one;
when('two') output two;
when('three') output three;
end;
run;
PeterClemmensen
Tourmaline | Level 20

@lakshmi_74 thank you but I am looking for a dynamic solution, which would also be applicable if I had a fourth variable 🙂

Kurt_Bremser
Super User

@PeterClemmensen wrote:

Been searching around the internet a while for a solution to this one.. Lets say I have a dataset with the variable ID as

 

data have;
	input ID $ var1 var2;
datalines;
one 100 200
three 300 400
three 500 200
three 200 100
two 400 200
two 300 166
one 300 100
three 400 400
;

Now, what if I want to dynamically split up this dataset into three datasets called one two and three? And I want to input the unique variable names of ID into a macro variable such as macrovar = one two three 🙂

 


Don't store in a macro variable, store in a dataset:

proc sort
  data=have (keep=id)
  out=lookup
  nodupkey
;
by id;
run;

Now you can split in a dynamically created data step:

data _null_;
call execute("data ");
do until (eof1);
  set lookup end=eof1;
  call execute(trim(id)!!" ");
end;
call execute("; set have;");
do until (eof2);
  set lookup end=eof2;
  call execute('if id = "'!!trim(id)!!'" then output '!!trim(id)!!';');
end;
call execute("run;")
stop;
run;

But you only should do this if by-group processing is not a viable option.

Also keep in mind that the values in id need to be valid SAS names.

 

Edit: added "set have;" in the call execute.

Astounding
PROC Star

I've seen this done with hashing, but I can't point you toward a sample solution.

 

Of course your ID values all have to be legitimate variable names.

 

One approach:

 

proc sql noprint;

select distinct id into : id_list separated by ' ' from have;

quit;

 

Then inside a macro you can process the list.  For example:

 

data &id_list;

set have;

select (id);

   %do i=1 %to %sysfunc(countw(&id_list));

        %let next_dataset = %scan(&id_list, &i, %str( ));

         when ("&next_dataset") output &next_dataset;

   %end;

end;

run;

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Exactly the same way as the output to Excel we were talking about in the thread earlier today, just change the proc export part to a datastep.  Also, its really not recommend to put "data" in macro variables, especially as lists.  It will add layers of complexity to your code.

Ksharp
Super User
data have;
	input ID $ var1 var2;
datalines;
one 100 200
one 100 200
three 300 400
three 500 200
three 200 100
two 400 200
two 300 166
one 300 100
three 400 400
;

proc sort data=have;by id;run;
data _null_;
 if _n_=1 then do;
   if 0 then set have;
   declare hash h(dataset:'have(obs=0)',multidata:'y');
   h.definekey(all:'y');
   h.definedata(all:'y');
   h.definedone();
 end;
do until(last.id);
 set have;
 by id;
 h.add();
end;
h.output(dataset:id);
h.clear();
run;
rogerjdeangelis
Barite | Level 11
WPS/SAS: DOSUBL solution for splitting a SAS dataset

WPS does not support DOSUBL

SOAPBOX ON

You may want to use this in non-production jobs.
If many of us use it, there is less of a chance
SAS and/or Corporate IT will  discourage, weaken or remove
this powerful  'subroutine' function  in 9.4M4.
We do not want this to go the route of
the store command, window, %window, pmenu, command macros,  libanme, 
passthru to excel/access , command line, function keys... Note you almost never see SAS supporting the lost functionality
above but you see SAS often supporting SAS studio and EG. It would help if SAS will unequivocally state that it will support and enhance all current DOSUBL functionality on all
interfaces (editors not just EG and SAS Studio) and all platforms. I don't understand why SAS employees are not promoting DOSUBL by using it to respond to posts like this one. I have experienced push back. SOAPBOX OFF Unlike the HASH this does not require a sort. Unlike 'call execute' and macro solutions we can stop the generation of code as soon as a 'user defined' or system warning/error occurs. This post https://goo.gl/aHXBoZ https://communities.sas.com/t5/Base-SAS-Programming/Splitting-up-dataset-based-on-unique-variable/m-p/347689 HAVE ==== Up to 40 obs WORK.HAVE total obs=9 Obs ID VAR1 VAR2 1 one 100 200 2 one 100 200 3 three 300 400 4 three 500 200 5 three 200 100 6 two 400 200 7 two 300 166 8 one 300 100 9 three 400 400 WANT ==== WORK.ONE has 3 obs WORK.TWO has 2 obs WORK.THREE has 4 obs Up to 40 obs from WORK.ONE total obs=3 Obs ID VAR1 VAR2 1 one 300 100 2 one 100 200 3 one 100 200 Up to 40 obs from WORK.TWO total obs=2 Obs ID VAR1 VAR2 1 two 400 200 2 two 300 166 Up to 40 obs from WORK.THREE total obs=4 Obs ID VAR1 VAR2 1 three 500 200 2 three 200 100 3 three 300 400 4 three 400 400 WORKING CODE ============ WPS/SAS declare hash h(dataset:'have(obs=0)',multidata:'y', ordered: "A"); h.output(dataset:id); * _ _ _ _ __ ___ __ _| | _____ __| | __ _| |_ __ _ | '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` | | | | | | | (_| | < __/_____| (_| | (_| | || (_| | |_| |_| |_|\__,_|_|\_\___| \__,_|\__,_|\__\__,_| ; data have; input ID $ var1 var2; datalines; one 100 200 three 300 400 three 500 200 three 200 100 two 400 200 two 300 166 one 300 100 three 400 400 ;;;; run;quit; %symdel dsn sqlobs id / nowarn; data _null_; If _n_=0 then do; rc=%sysfunc(dosubl(' proc sql; select distinct quote(trim(id)) into :id separated by "," from have;quit; ')); end; length id $5; do id=&id; call symputx('dsn',id); rc=dosubl(' proc sql; create table &dsn as select * from have where id="&dsn" ;quit; '); if &sqlobs=0 then put "dataset " id " failed with 0 observations stoppin code generation"; end; run;quit; NOTE: Table WORK.ONE created, with 2 rows and 3 columns. NOTE: PROCEDURE SQL used (Total process time): real time 0.01 seconds cpu time 0.00 seconds NOTE: Table WORK.THREE created, with 4 rows and 3 columns. NOTE: PROCEDURE SQL used (Total process time): real time 0.01 seconds cpu time 0.01 seconds NOTE: Table WORK.TWO created, with 2 rows and 3 columns. NOTE: PROCEDURE SQL used (Total process time): real time 0.00 seconds cpu time 0.00 seconds NOTE: DATA statement used (Total process time): real time 2.21 seconds cpu time 0.28 seconds
Astounding
PROC Star

To be totally fair when noting pros and cons, the DOSUBL solution makes a separate pass through the entire data set for each subset being created.

rogerjdeangelis
Barite | Level 11
Thanks for the review and insight.

You cannot check completion codes for each interaction with one passthru.
If you want one passthru then use this code.

%symdel dsns ifout id / nowarn;
data _null_;

  If _n_=0 then do;
      rc=%sysfunc(dosubl('
      proc sql; select distinct quote(trim(id)) into :id separated by ","  from have;quit;
    '));
  end;

  length id $5;
  length dsns $2000;
  length ifout $2000;

  do id=&id;

    dsns  =  catx(' ',dsns,id);
    ifout =  catx(' ',ifout,cats('when ("',id,'") output'),id,';');

  end;

  call symputx('dsns',dsns);
  call symputx('ifout',ifout);

  rc=dosubl('
    data &dsns;
      set have;
      select (id);
        &ifout
      end;
    run;quit;

  ');
   stop;

run;quit;

NOTE: There were 8 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.ONE has 2 observations and 3 variables.
NOTE: The data set WORK.THREE has 4 observations and 3 variables.
NOTE: The data set WORK.TWO has 2 observations and 3 variables.



rogerjdeangelis
Barite | Level 11
It amazes me how flexible DOSUBL is

10 lines of code

data have;
      input ID $ var1 var2;
datalines;
one 100 200
one 300 100
three 300 400
three 500 200
three 200 100
three 400 400
two 400 200
two 300 166
;;;;
run;quit;


data _null_;
  set have(keep=id);
  by id;
  if last.id then do;
     call symputx('id',id);
     rc=dosubl('
      data &id;
         set have(where=(id="&id"));
      run;quit;
     ');
  end;
run;quit;

rogerjdeangelis
Barite | Level 11
Even more flexibility, unsorted have

data have;
  input ID $ var1 var2;
cards4;
one 100 200
three 300 400
three 500 200
three 200 100
two 400 200
two 300 166
one 300 100
three 400 400
;;;;
run;quit;

proc datasets lib=work;
delete one two three;
run;quit;

data _null_;
  set have;
  by id notsorted;
  call symputx('id',id);
  if first.id then call symputx('obsbeg',put(_n_,3.));
  if last.id then  call symputx('obsend',put(_n_,3.));
  if last.id then
  rc=dosubl('
     proc append base=&id data=have(firstobs=&obsbeg obs=&obsend);
     run;quit;
  ');
run;quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 20315 views
  • 9 likes
  • 8 in conversation