BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
caveman529
Calcite | Level 5

Dear All:

I have a library with hundreds of datasets like:

cq_19970108.sas7bdat

cq_19970109.sas7bdat

cq_19970110.sas7bdat

cq_19970115.sas7bdat

How to use macro to extract 1% of the sample stratified by, say, variable A, B, and C and then put them in separate files (keep the original naming convention but in a different library)?  I have the general idea that I need to use the proc contents procedure and then somehow plug the result of the proc contents into a proc surveyselect macro and repeat the process.  Not quite sure how to google this around to know to do this in the right way.  I think this got to be common practice.

Thank you!

  • HAVE:

library name: HAVE

datasets:

cq_19970108.sas7bdat

cq_19970109.sas7bdat

cq_19970110.sas7bdat

cq_19970115.sas7bdat

  • WANT:

library name: WANT

datasets:

cq_19970108.sas7bdat

cq_19970109.sas7bdat

cq_19970110.sas7bdat

cq_19970115.sas7bdat

Procedure to be repeated proc surveyslect at 1% of the original data observations.

/* 1. Get a list of datsets */

libname have  'E:\wrds\taq\taq.1996\taq96d\sasdata';

ods output "Library Members" = have_table;

proc datasets library=have memtype=data ;

run;

ods output close;



proc sql;

    create table names as

          select distinct Name

              from have_table

                  where MemType = 'DATA';

quit;

/* 2. Just one datset (Not how to plug the name dataset into the Step 3 do loop) */

%let dn = cq_19960105;

proc surveyselect data=have.&dn method = urs sampsize = 1000

     rep=1 seed=12345 out=want.&dn out=outhits;

     id symbol date time;

run;

/* 3. Looping over the entire list of datasets */

%macro sampleselect();

    %do i = 1 %to &N;

        %put &&deal_no_&i;

    %end;

%mend sampleselect();

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Generating many macro variables is probably much more work and potentially too confusing for what you want.

Creating a macro and using a tool like CALL EXECUTE to generate multiple calls to it is probably easier to code and understand.

%macro sample1(dn);

proc surveyselect data=have.&dn method = urs sampsize = 1000

     rep=1 seed=12345 out=want.&dn out=outhits;

     id symbol date time;

run;

%mend sample1 ;

libname have ... ;

libname want ... ;

proc contents data=have._all_ noprint out=contents;

run;

data _null_;

  set contents;

  by memname ;

  if first.memname;

  call execute(cats('%sample1(',memname,')'));

run;

View solution in original post

9 REPLIES 9
Tom
Super User Tom
Super User

Here is one approach.

Step 1 is write the code to do one file.

Step 2 is probably to convert that into a macro with a single parameter, the name of the dataset (the member name).

Step 3 is generate calls to the macro for every member in the input library.

Start by posting your solution to Step 1.

caveman529
Calcite | Level 5

Hi, Tom:

I specified the main steps I hope to go through and uploaded a sample data for just one dataset in the library.  But I haven't find a nice guide how to plug a list (in the form of a sas data) to a do until macro loop to get the job done.  Could you offer me some guidance on this?  Some of the examples I searched online is difficult to understand since I'm not sure what they want to achieve in the first place and not sure the data structure they are referencing to.  Thanks -

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

There are many posts I have put out for this subject:

libname temp "path to where you have your datasets"; /* Note you can exclude this if you already have your libname */

data _null_;

     set sashelp.vtable (where=(libname="TEMP"));  /* Gets all datasets in that libname */

     call execute('data my_new_l.'||strip(memname)||';

                              set temp.'||strip(memename)||';

                          run;');

run;

The above will run the datastep in the execute block for every dataset in the given libname.

jakarman
Barite | Level 11

I do not really understand what you are wanting.
It could be make selection in a collection in some datasets .... analyze those grouped.


A simple  approach with two datasets. First I just creating them. Note: only 1 datastep is needed.

data dataset1 dataset2 ;
  Do i= 1 to 20 ;         date=i ; output dataset1;   end;
  Do i= 100 to 150 ;    date=i ; output dataset2;   end;
run;

Combine-union off all datasets and select (stratified as you like) record can be done as:

data select ;                      /* a single dataset as result */

  length datain $41 ;         /* names in datain can be "libary.member" total 41 long */

  set dataset: indsname=_datain ;    /* select all datasets in the list abbreviated by the : symbol

                                                             - here just 2 but it is variable SAS(R) 9.4 Language Reference: Concepts, Third Edition */ 

    if ( ranuni(-1) < 0.5 ) ;                   /* the selection replace that with yours.  This is simple 50% random */

  datain=_datain;                              /* keeping the name of the original dataset as known from before the combine-union */

  put datain date _n_ ;     /* just showing what is done, remove with real processing */

run;

The only pre-req is that all members are having similar structure (variables).

Needing to split up can be done with a datastep. At that moment all name must be prepared  (data namelist  ) followed  by a select/output on that datain.

More often the combined dataset is the one needed for analyzing.

---->-- ja karman --<-----
caveman529
Calcite | Level 5

Hi, Jaap:

Sorry that I confused you and thank you for the example you just presented!

All I want to do is to first get a list of the datasets within the HAVE library.  And then plug this list into some kind of do loop.

The loop is to select 1% of the sample out of the datasets in the HAVE library (or even proc sort to make things simple).  The loop work on the procedures repeatedly until it reaches the end of the list.  I now know how to use proc datasets to get the list of datasets in a library, but I'm not sure how to plug that into a sas do until loop/macro.

Could you point me in the right direction?  Thank you -

Reeza
Super User

Write a macro and then use call execute to invoke it once for each data set.

jakarman
Barite | Level 11

The sample RW9 has given is using the list of all datasets using sashelp.vtable I think in this case that is better than using proc datasets.
Having that list is generating new commands to be executed with that name.  Probably the most easy quick and clean approach.

I think you should try the examples. No the old same stoneage anymore. We are adding some bronze to it. 

The IT world is changing, some new pony tricks to add. 
  

---->-- ja karman --<-----
caveman529
Calcite | Level 5

Thank you, Jaap and Tom.  I'll implement it right now.  Get tied up in recent days due to deadlines.  Yes, I need to improve my toolkits Smiley Happy 

Enjoy a great weekend ~

Tom
Super User Tom
Super User

Generating many macro variables is probably much more work and potentially too confusing for what you want.

Creating a macro and using a tool like CALL EXECUTE to generate multiple calls to it is probably easier to code and understand.

%macro sample1(dn);

proc surveyselect data=have.&dn method = urs sampsize = 1000

     rep=1 seed=12345 out=want.&dn out=outhits;

     id symbol date time;

run;

%mend sample1 ;

libname have ... ;

libname want ... ;

proc contents data=have._all_ noprint out=contents;

run;

data _null_;

  set contents;

  by memname ;

  if first.memname;

  call execute(cats('%sample1(',memname,')'));

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 15740 views
  • 1 like
  • 5 in conversation