DATA Step, Macro, Functions and more

split a sinlgle data set into 3 dataset at a time

Reply
Frequent Contributor
Posts: 76

split a sinlgle data set into 3 dataset at a time

name   math   english  science  divison
ram     40       70          30          2nd
hari     20       20          49          3rd
gopal   90       80          80          1st


I have to spit studentdetail dataset in 3 data set like Ram ,Hari , Gopal

Ram
subject  mark    divison
math      40        2nd
english  70         2nd
science  90         2nd

Hari
subject  mark    divison
math      20        3rd
english  20         3rd
science  49         3rd

Gopal
subject  mark    divison
math      90        1st
english  80         1st
science  80        1st


Kindly help me

Thanks ,
Regards,
Ashwini

Super Contributor
Posts: 1,636

Re: split a sinlgle data set into 3 dataset at a time

Hi Ashwini,

If you don't know Macro, you can use the code below to get what you want. but macro is much earier.

data have;

input name $  math   english  science  divison $;

name=trim(left(lowcase(name)));

cards;

ram     40       70          30          2nd

hari     20       20          49          3rd

gopal   90       80          80          1st

;

data have (keep=name subject mark divison);

    set have;

      array s(*) math--science;

      do i=1 to dim(s);

    subject=vname(s(i));

      mark=s(i);

      output;

      end;

  run;

data ram hari  gopal other ;

set have;

select (name);

      when ('ram')  output ram ;

      when ('hari')  output hari;

      when ('gopal')  output gopal;

      otherwise output other;

      end;

    drop name;

run;


PROC Star
Posts: 7,364

split a sinlgle data set into 3 dataset at a time

I would take a different approach namely:

proc transpose data=have name=subject

  out=need (rename=(col1=score));

  by name notsorted divison;

run;

data need (index = (name));

  set need;

  label subject='subject';

run;

data _null_ ;

  dcl hash hh   (             ) ;

  hh.definekey  ('k'          ) ;

  hh.definedata ('name', 'divison', 'subject', 'score') ;

  hh.definedone () ;

  do k = 1 by 1 until ( last.name ) ;

    set need;

    by name;

    hh.add () ;

  end ;

  hh.output (dataset: name) ;

run ;

Super Contributor
Posts: 1,636

split a sinlgle data set into 3 dataset at a time

Thank you Art! I need to improve my HASH skill.

Super User
Posts: 9,687

Re: split a sinlgle data set into 3 dataset at a time

Art. But The hash code is I wrote before.  Smiley Happy

But  I would code like this:

data have;
input name $  math   english  science  divison $;
cards;
ram     40       70          30          2nd
hari     20       20          49          3rd
gopal   90       80          80          1st
;
run;
data _null_;
 if _n_ eq 1 then do;
   if 0 then set have;
   length mark 8 subject $ 20;
   declare hash ha(ordered:'Y');
    ha.definekey('obs');
    ha.definedata('subject','mark','divison');
    ha.definedone();
  end;
set have; obs=0;
array _a{*} _numeric_;
do i=1 to dim(_a);
 obs+1;subject=vname(_a{i});mark=_a{i};
 if subject not in ('mark' 'obs') then ha.add();
end;
ha.output(dataset: name);
ha.clear();
run;

 


Ksharp

Regular Contributor
Posts: 171

Re: split a sinlgle data set into 3 dataset at a time

The solution I'm about to propose is not as efficient as the hash solution by Ksharp, I just think that it's a fun way to approach the problem.  If the data set is large and efficiency is a concern, then I would have also used a hash solution.  Before executing my solution, you have to first create a user-defined function with PROC FCMP, which is a one time endeavor.  I call this the SQL function.  After it is created, you could use this function in any program.

To create the function, you need to first save the following macro in your autocall library:

%macro sqlfunc;

     %let sqlselect = %sysfunc(dequote(&sqlselect));

     %let sqlresults = ;

     proc sql noprint;

          create view _TempView_ as &sqlselect;

          select * into: sqlresults separated by " " from _TempView_;

          drop view _TempView_;

     quit;

     %let sqlresults = &sqlresults;

%mend;

Next run the following FCPM procedure:

proc fcmp outlib=sasuser.funcs.general;

     function sql(sqlselect $) $;

     length sqlresults $ 32767; /* NOTE THAT IF YOU ARE USING SAS 9.2, THE LARGEST LENGTH YOU CAN USE HERE IS 260 */

     rc=run_macro('sqlfunc', sqlselect, sqlresults);

     return(sqlresults);

     endsub;

quit;

Now you need to set the CMPLIB= option to tell sas where to look for previously compiled functions.  I recommend doing this in your config file.

options cmplib=sasuser.funcs;

Now that the function has been defined, here is my solution to the problem:

data %sysfunc(sql(select distinct name from have));

     set have;

     array subjects{*} _numeric_;

     do i=1 to dim(subjects);

          subject=vname(subjects{i});

          mark=subjects{i};

          %sysfunc(sql(select distinct catx(' ', 'if name =', quote(strip(name)), 'then output', name, ';') from have));

     end;

     keep subject mark divison;

run;

Note that you could also use the SQL function in an expression or an assignment statement within the DATA step, without using %SYSFUNC.  

Ask a Question
Discussion stats
  • 5 replies
  • 173 views
  • 1 like
  • 5 in conversation