BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
NKormanik
Barite | Level 11

Oligolas was kind to provide the following code, handling multiple datasets in a library, and doing the same procedure(s) on each dataset:

 

data _NULL_;
   set sashelp.vtable;
   where libname eq upcase('YourLib');
   call execute('
      PROC SORT data='||strip(libname)||'.'||strip(memname)||' out=tmp;
         BY variable_rank variable_continuous;
      RUN;

      DATA yourPermLib.'||strip(memname)||'_divisions;
         SET tmp;
         BY variable_rank;
         if last.variable_rank;
      RUN;

      PROC DATASETS lib=work noprint;
         delete tmp;
      RUN;QUIT;');
run;

Works wonderfully.

 

Here's now a follow-up question, if I may:

 

The above code deals with one variable pair -- variable_continuous, and it's rank counterpart, variable_rank.

 

The follow-up issue is that there are 200 such variable pairs in my dataset.  EACH of these continuous, with rank counterpart.

 

The above code results in the following partial output dataset:

 

  •   i_20901 i_20902 i_20903 i_20904 i_20905 i_20906 rg7_20901 rg7_20902 rg7_20903 rg7_20904 rg7_20905 rg7_20906
      5.2998 0.8653 9.7555 -0.4079     4 6 4 1    
      15.686 0.4401 6.3543 0.0869 4.3429 -0.1955 6 4 3 3 0 0
      1.7026 -0.169 11.5141 -0.7457 6.1116 0.0722 1 2 5 0 1 4
      0.6265 0.0777 2.2246 -0.1288 7.4253 -0.2075 0 3 0 2 2 0
      5.6289 0.9559 2.0766 -0.3934 9.0273 -0.3257 4 6 0 1 3 0
      3.1862 -0.2157 4.8757 -0.0159 11.6997 -0.0549 2 2 2 3 4 2
      7.7334 -0.1715 9.3654 -0.1501 14.9105 -0.0166 5 2 4 2 5 2
      2.3106 0.3048 6.3218 -0.6706 25.3866 0.011 2 4 3 0 6 3

 

The above dataset creation run focused on continuous variable i_20905, and the corresponding rank variable rg7_20905.

 

I'll need to delete all the other variables, using SAS 'keep'.  The result would be as follows:

 

  •   i_20905 rg7_20905
         
      4.3429 0
      6.1116 1
      7.4253 2
      9.0273 3
      11.6997 4
      14.9105 5
      25.3866 6

 

Each and every variable pair would require it's own unique output file such as that above.

 

All the files would then be combined using SAS Match-Merging, using rank as the common column.

 

That would be the final result desired.  At least for now.

 

So, what question am I presently asking?

 

Simply, Using Oligolas' code above, is there a way to handle all the variables in one go?

 

If not, I will simply copy-paste x200, changing the specifics on variable for each of the 200 runs.  Perhaps that would be the best way anyway.  As at least I will be able to understand what's happening.

 

Thanks!

 

Nicholas Kormanik

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Oligolas
Barite | Level 11

Hi,

 

you're the architect and if I understood your requirements right you could do this:

data test;
input i_20901
i_20902
i_20903
i_20904
i_20905
i_20906
rg7_20901
rg7_20902
rg7_20903
rg7_20904
rg7_20905   
rg7_20906;
cards;
5.2998 0.8653 9.7555 -0.4079 0 0 4 6 4 1 0 0    
15.686 0.4401 6.3543 0.0869 4.3429 -0.1955 6 4 3 3 0 0 
1.7026 -0.169 11.5141 -0.7457 6.1116 0.0722 1 2 5 0 1 4 
0.6265 0.0777 2.2246 -0.1288 7.4253 -0.2075 0 3 0 2 2 0 
5.6289 0.9559 2.0766 -0.3934 9.0273 -0.3257 4 6 0 1 3 0 
3.1862 -0.2157 4.8757 -0.0159 11.6997 -0.0549 2 2 2 3 4 2 
7.7334 -0.1715 9.3654 -0.1501 14.9105 -0.0166 5 2 4 2 5 2 
2.3106 0.3048 6.3218 -0.6706 25.3866 0.011 2 4 3 0 6 3
;
run;

DATA vcolumn;
   length pairID $20;
   set sashelp.vcolumn(keep=libname memname name); 
   where libname eq 'WORK' and memname eq 'TEST';
   pairID=scan(name,2,'_');
RUN;

PROC SORT data=vcolumn; 
   by libname memname pairID descending name; 
RUN;

DATA vcolumn;
   length byVar $40;
   set vcolumn;
   retain byVar;
   by libname memname pairID descending name; 
   if first.pairID then byVar=strip(name);
   else do;
      byVar=strip(byVar)||' '||strip(name);
      output;
   end;
RUN;

DATA _NULL_;
   set vcolumn;
   call execute('
      PROC SORT data='||strip(libname)||'.'||strip(memname)||' out=tmp;
         BY '||strip(byVar)||';
      RUN;

      DATA WORK.'||strip(memname)||'_'||strip(pairID)||'_divisions;
         SET tmp;
         BY '||strip(byVar)||';
         if last.'||strip(scan(byVar,1))||';
      RUN;

      PROC DATASETS lib=work noprint;
         delete tmp;
      RUN;QUIT;');
RUN;

But I'm not sure if creating 100 files x 6 pair variables = 600 datasets is the thing you really want.

________________________

- Cheers -

View solution in original post

5 REPLIES 5
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Yes, there is a way to do it, and there are many topics on here which you can find using the search, in fact I have probably posted such many times.  Call execute accepts a string - any string - and inserts that into the compiler after the datastep has finished running, therefore any Base SAS code you can write can be put into a call execute, much like you could put it into a macro, the benefit being that you use the datastep as a loop rather than handling that with macro code.  So write your code once so that it works, then identify the pieces which will change, then put that in a call execute.

However, that being said, as in my other post you are really fighting the SAS system.  It is really not a good idea to split data up into little bits.  What will you do after you have copied the code 200 times or put it all in a macro, everything from that point on then requires the same process, not to mention that running each bit will take longer and use more resources.  Three examples for you of how to work with the data as is:

arrays: you can use arrays to refer to groups of variables

shorthand: the other method is shorthand, so to sum up all i_ variables sum(of i_:) - how complicated would that be if it was all in different datasets?

Normalisiation: having the data go down the page rather than across e.g:

PARAM  VALUE

i_20901  xyz

...

You can then refer to observations rather than splitting that out into a new dataset - i.e. where clausing.

ChrisBrooks
Ammonite | Level 13

I'd like to echo the comments of @RW9 - I've seen this many, many times and I'd bet a pound to a penny that this data originally came from spreadsheets. This "short and wide" layout works really well for copying formulae in, say, Excel but doesn't work so well in other environments for the reasons given. If you have the data "tall and thin" as suggested you'll find your programs are more robust (less changing of code), easier to work with (much less typing and reading lines of lines of repeated variable names and simpler coding using By Groups) and they'll actually run faster as well. You should definitley consider converting this data to "long and thin" with Proc Transpose.

Oligolas
Barite | Level 11

Hi,

 

you're the architect and if I understood your requirements right you could do this:

data test;
input i_20901
i_20902
i_20903
i_20904
i_20905
i_20906
rg7_20901
rg7_20902
rg7_20903
rg7_20904
rg7_20905   
rg7_20906;
cards;
5.2998 0.8653 9.7555 -0.4079 0 0 4 6 4 1 0 0    
15.686 0.4401 6.3543 0.0869 4.3429 -0.1955 6 4 3 3 0 0 
1.7026 -0.169 11.5141 -0.7457 6.1116 0.0722 1 2 5 0 1 4 
0.6265 0.0777 2.2246 -0.1288 7.4253 -0.2075 0 3 0 2 2 0 
5.6289 0.9559 2.0766 -0.3934 9.0273 -0.3257 4 6 0 1 3 0 
3.1862 -0.2157 4.8757 -0.0159 11.6997 -0.0549 2 2 2 3 4 2 
7.7334 -0.1715 9.3654 -0.1501 14.9105 -0.0166 5 2 4 2 5 2 
2.3106 0.3048 6.3218 -0.6706 25.3866 0.011 2 4 3 0 6 3
;
run;

DATA vcolumn;
   length pairID $20;
   set sashelp.vcolumn(keep=libname memname name); 
   where libname eq 'WORK' and memname eq 'TEST';
   pairID=scan(name,2,'_');
RUN;

PROC SORT data=vcolumn; 
   by libname memname pairID descending name; 
RUN;

DATA vcolumn;
   length byVar $40;
   set vcolumn;
   retain byVar;
   by libname memname pairID descending name; 
   if first.pairID then byVar=strip(name);
   else do;
      byVar=strip(byVar)||' '||strip(name);
      output;
   end;
RUN;

DATA _NULL_;
   set vcolumn;
   call execute('
      PROC SORT data='||strip(libname)||'.'||strip(memname)||' out=tmp;
         BY '||strip(byVar)||';
      RUN;

      DATA WORK.'||strip(memname)||'_'||strip(pairID)||'_divisions;
         SET tmp;
         BY '||strip(byVar)||';
         if last.'||strip(scan(byVar,1))||';
      RUN;

      PROC DATASETS lib=work noprint;
         delete tmp;
      RUN;QUIT;');
RUN;

But I'm not sure if creating 100 files x 6 pair variables = 600 datasets is the thing you really want.

________________________

- Cheers -

ChrisBrooks
Ammonite | Level 13

I've seen systems like that - they're a support nightmare.....

NKormanik
Barite | Level 11

Not really sure what you did here, Oligolas, but you're still the winner.

 

Thanks very much!

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 882 views
  • 5 likes
  • 4 in conversation