BookmarkSubscribeRSS Feed
shrikrishna
Calcite | Level 5

I need a program to loop through dataset1, which will select 10 usernames from dataset1 filter on those usernames in dataset2 and export that data to the .xlsx file, then select next set of 10 usernames.

I have set of 10000 usernames.

I am new to SAS, will really appreciate all the help.

11 REPLIES 11
Reeza
Super User
Why? Why not just sort the dataset2 filtered by data set1 and then export all records with 10 by groups at a time?
shrikrishna
Calcite | Level 5

One file should contain all data for particular username. limiting number of usernames is just to maintain manageable excel file size.

Kurt_Bremser
Super User

@shrikrishna wrote:

One file should contain all data for particular username. limiting number of usernames is just to maintain manageable excel file size.


Which means you have to create 10000 files? REALLY??

PaigeMiller
Diamond | Level 26

@shrikrishna wrote:

I need a program to loop through dataset1, which will select 10 usernames from dataset1 filter on those usernames in dataset2 and export that data to the .xlsx file, then select next set of 10 usernames.

I have set of 10000 usernames.


Splitting up data sets like this is rarely a good thing to do. Unless it is an absolute unchangeable requirement for your project, don't do it. Better is to explain what you are doing and why you need this — we need you to explain the big picture, not the programming that you are trying to do. Very likely there are much better ways to handle this data.

--
Paige Miller
shrikrishna
Calcite | Level 5

The requirement is that I need all data sorted in specified way (taken care of) in excel file. There is no way around it. people I am sending requirement to like excel format way too much.

10000 usernames are unique, and data cannot be grouped as they need day by day each instance in the file. (Legal requirement)

PaigeMiller
Diamond | Level 26

So you want to create and then send 1,000 Excel files, no way around it. Sorry to hear that.

 

The code from @ballardw will create one Excel file with 1,000 tabs. Is that not good enough?

--
Paige Miller
Patrick
Opal | Level 21

@shrikrishna wrote:

The requirement is that I need all data sorted in specified way (taken care of) in excel file. There is no way around it. people I am sending requirement to like excel format way too much.

10000 usernames are unique, and data cannot be grouped as they need day by day each instance in the file. (Legal requirement)


I dare to doubt that you've got a requirement that leads to an Excel file with 1000 tabs or even to 1000 Excel files. That's just not manageable. 

People normally like all the data within a single tab with columns that allow to apply filters. Or even data surfaced as pivot table in a single tab. 

 

SAS can load data in any sort order into an Excel tab as long as there is some logic to it.

 

If you really believe you're asked to create 1000 tabs or 1000 separate Excel files then I'd strongly recommend you confirm this with your requestor once more before you proceed spending time on such code.

PaigeMiller
Diamond | Level 26

@Patrick wrote:

If you really believe you're asked to create 1000 tabs or 1000 separate Excel files then I'd strongly recommend you confirm this with your requestor once more before you proceed spending time on such code.


I wish I had said that!

--
Paige Miller
ballardw
Super User

@shrikrishna wrote:

The requirement is that I need all data sorted in specified way (taken care of) in excel file. There is no way around it. people I am sending requirement to like excel format way too much.

10000 usernames are unique, and data cannot be grouped as they need day by day each instance in the file. (Legal requirement)


I am afraid that last sentence sort of contradicts your requirement.

At this point you need to provide ALL the rules involved. This sounds like the same names need to be in a similar file every day. If that is the case then the rules for how those are done need some explanation as your "sorted in a specified way" may break that requirement or that ordering needs to be explained because the way that you order is likely to impact the next steps.

 

I have had a hard enough time getting some data users to read 100 lines in a single file that I have a hard time envisioning 1000 files produced every day ever being read. Is there supposed to be some naming convention for these 1000's of files so you can tell which names are in which file? If so, early in the process is when to discuss this.

ballardw
Super User

How are the names to be "selected"?

How many unique names are there in the data?

What if there are not 10 names available after processing the first XXX number of groups?

 

If you sort your dataset2 by name you can group them by any number of names desired. Here is a short example with a data set you should have available, SASHELP.CLASS provided by SAS.

 

proc sort data=sashelp.class out=work.class;
   by name;
run;

data work.nameseq;
   set work.class;
   by name;
   if first.name then seq+1;
   group = int((seq-1)/10); /* the 10 groups records in to values of 10 based on sequence in the data set)
run;

The sort will get the names together if there are multiple instances.

The BY statement in the second data step adds automatic variables to indicate the first or last of a group of same names. These are accessed using First.variable or Last.variable and have values of 1 when true and 0 when false.

The Seq+1 implies retaining the value across data steps and means that each name gets a unique sequence number. This set does not have repeats but you can test with your data.

The Group calculation will group the names into groups of 10 (except the last group may be smaller if there aren't enough names).

 

You would use the Group variable with a BY statement for most things after that.

An example of writing each group to a different sheet in excel: (change the drive/path to something on your system)

ods excel file="x:\text.xlsx" options(sheet_interval='BYGROUP');

proc print data=work.nameseq noobs;
   by group;
   var name sex age height weight;
run;

ods excel close;

You don't say what the XLSX is to be used for. It may be easier to do it ONCE in SAS with a grouped data set before any export.

yabwon
Onyx | Level 15

It's "inconvenient" business setup you have, but hopefully this example will help you:

libname mylib (work); /* <- set your data directory */

/* example data */
data mylib.dataset1(keep=username) mylib.dataset2;
  call streaminit(42);
  do _N_ = 1 to 100;
    username = "NAME " !! put(_N_, z6.);
    output mylib.dataset1;
    do someValue1=1 to rand("integer",10);
      someValue2 = rand("integer", 100,200);
      output mylib.dataset2;
    end;
  end;
run;

/* add index for sql join */
proc datasets lib=mylib nolist;
  modify dataset2;
    index create username;
  run;
quit;

/* macro for export */
%macro selectAndExport(start, end, path4excel);
  proc sql;
  create table temp as
    select a.*
    from
      mylib.dataset2 as a
      inner join
      mylib.dataset1(firstobs=&start. obs=&end.) as b
      on 
        a.username = b.username
    ;
  quit;

  proc export data=temp
    file="&path4excel./file_for_obs_from_&start._to_&end..xlsx"
    dbms=xlsx /* "xlsx" or "excel" depending on your setup */
    replace
    ;
  run;
%mend selectAndExport;


/* code execution */
data _null_;
 set mylib.dataset1 curobs=c;
 i+1;
 retain start 0;
 if 1 = i then start=c;
 if 10 = i then
  do;
    call execute('%nrstr(%selectAndExport('!! start !!','!! c !!', C:/path/for/exel/files/)'); /*<- change path here! */
    i=0;
  end;
run;

Remember to adjust data sets names ("dataset1" and "dataset2") and libraries ("lib").

 

It's not the "most optimal" in terms of I/O but since you don't have to much SAS experience I figure it will be easier to maintain/modify.

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 747 views
  • 6 likes
  • 7 in conversation