BookmarkSubscribeRSS Feed
GregG
Quartz | Level 8

Our Oracle system doesn't allow lists of greater than 1,000 items. When using a cohort of students, and wanting to retrieve more about them from Oracle, we use Proc SQL to select into list groups of 1,000 items each as such:

 

proc sql noprint;
	select VariableToGroup into :group_1 separated by ','
    from SASdataset (where=(count > 0 and COUNT <= 1000)) ;

	select VariableToGroup into :group_2 separated by ','
    from SASdataset (where=(count > 1000 and COUNT <= 2000)) ;

    select VariableToGroup into :group_3 separated by ','
    from SASdataset (where=(count > 2000 and COUNT <= 3000)) ;
			
    select VariableToGroup into :group_4 separated by ','
    from SASdataset (where=(count > 3000 and COUNT <= 4000)) ;

	select VariableToGroup into :group_5 separated by ','
    from SASdataset (where=(count > 4000 and COUNT <= 5000)) ;

	select VariableToGroup into :group_6 separated by ','
    from SASdataset (where=(count > 5000 and COUNT <= 6000)) ;              
quit;

We then create a corresponding where statement:

			WHERE	(
					ID in (&GROUP_1)
					OR ID in (&GROUP_2)
					OR ID in (&GROUP_3)
					OR ID in (&GROUP_4)
					OR ID in (&GROUP_5)
					OR ID in (&GROUP_6)
					OR ID in (&GROUP_7)
					OR ID in (&GROUP_8)
					OR ID in (&GROUP_9)
					OR ID in (&GROUP_10)
					)


How do I do this dynamically?

 

I may not have the best way, but I can get the number of observations in the VariableToGroup, divide that by 1,000 and then use a ceiling to get the number of groups to create. But I can't figure out how to use this to my advantage to dynamically create groups.

 

This is my first failed attempt:

%macro sqlloop(last=);
proc sql noprint;
	%do i=1 %to &last;
   	select variableToGroup into :group_&i separated by ','
        from SASdataset (where=(count > ((&i - 1)*1000) and COUNT <= (&i*1000))) ;
		    /*first time through, ^this^ should be 0*/
	%end;
quit;
%mend sqlloop;
%sqlloop(last=1012);

There is probably quite a bit wrong with it, but it isn't working. It gives no error when calling it, but it doesn't create group_1, group_2, etc.

 

Thank you.

 

8 REPLIES 8
ballardw
Super User

One would ask about why using lists at all.

 

Something like:

/* after making connections to oracle*/

proc sql;
   create table want as
   select OracleStudentData.*  /*or appropriate variables*/ f
   from YourDataSetOfStudentIds left join OracleStudentData
      on   YourDataSetOfStudentIds.IdVariable =  OracleStudentData.IdVariable;
quit;

Hopefully your variable VariableToGroup is an Id variable similar to what I called IdVariable that you can match to the Oracle variable containing id values. No reason to use Macro variable and long where statements.

 

GregG
Quartz | Level 8

As of right now, we have to do quite a bit with Oracle to bring data to us. Because of this, we have switched to using passthrough (instead of libname), which has increased our query runtimes significantly.

 

The studentIds often originate from previous queries involving the Oracle database, then using SAS to indentify cohorts (rather than some rather involved inline views or nested subqueries within SQL) - but the studentIds also come from other sources, most notably imported Excel sheets.

 

I have tried but was unable to join a sasdataset with an Oracle table within Proc SQL passthrough - is this possible?

Reeza
Super User

No, you can't use a SAS dataset with a pass through query.

 

Can you create temp tables on your Oracle DB? If so, use a libname method to create the table on your DB and then join on it there. 

 

 

Tom
Super User Tom
Super User

I would use the dataset options FIRSTOBS and OBS to control gettng the groups. I would use the automatic macro variables SQLOBS to control stopping the loop.

First lets make a sample dataset.

data ids;
  do id=1 to 25; output; end;
run;

Now let's make a simple macro to build a series of macro variables.

%macro loop(size) ;
%global i ;
%let i=1;
%let sqlobs=0;
proc sql noprint;
%do %until(&sqlobs=0);
  %global group&i ;
  %let group&i=;
  select id into :group&i separated by ','
  from ids (firstobs=%eval(&size*(&i-1)+1) obs=%eval(&size*&i) )
  ;
  %if (&sqlobs) %then %do; 
    %put &=i (&&group&i) ;
    %let i=%eval(&i + 1);
  %end;
  %else %let i=%eval(&i - 1);
%end;
%mend loop;

Now try calling it with different values for SIZE.

%loop(10);
%put &=i ;

Then when you want to generate your query you can use another %DO loop to reference your macro variables.

%do j=1 to &i ;
 or ID in (&&group&j)
%end;
GregG
Quartz | Level 8

Tom, that works great with the sample dataset you provided (even increasing the id count from 25 to 3105).

 

But it causes an issue when I try it on my actual dataset (with 3105 actual ids, which are 18 characters long). It continually gives me the "must clear log" pop-up and will not continue until I choose an option.

 

I should mention that there is a separate variable for the count, but you are using sqlobs so I don't know how that plays with this.

 

The ids are unique, but 18 characters long. They already have the requisite single quotes ( '00000000ID00000000' ) around them, since they are character.

 

Here is the full popup I am receiving:

 

Log WINDOW FULL

Window is full and must be cleared. Select

F to File,

P to Print,

S to Save or 

C to Clear window without saving

 

I must hand select one of the above before it continues for a few seconds, and then I have to select it again. I did this 10 times before I canceled the submitted statements. 

 

 Editted to add: I am processing the same exact list of ids in the first set of code without these popups.

 

Tom
Super User Tom
Super User

You probably do NOT want the %PUT statement. 🙂 

Reeza
Super User

@GregG wrote:

Tom, that works great with the sample dataset you provided (even increasing the id count from 25 to 3105).

 

But it causes an issue when I try it on my actual dataset (with 3105 actual ids, which are 18 characters long). It continually gives me the "must clear log" pop-up and will not continue until I choose an option.

 

I should mention that there is a separate variable for the count, but you are using sqlobs so I don't know how that plays with this.

 

The ids are unique, but 18 characters long. They already have the requisite single quotes ( '00000000ID00000000' ) around them, since they are character.

 

Here is the full popup I am receiving:

 

Log WINDOW FULL

Window is full and must be cleared. Select

F to File,

P to Print,

S to Save or 

C to Clear window without saving

 

I must hand select one of the above before it continues for a few seconds, and then I have to select it again. I did this 10 times before I canceled the submitted statements. 

 

 Editted to add: I am processing the same exact list of ids in the first set of code without these popups.

 


When you run into this issue check what's printing to your log. If it's unavoidable, use PROC PRINTTO to redirect your log. You can also consider option nonotes though I rarely find that to be a good idea. 

Tom
Super User Tom
Super User

It is even easier to do with a DATA step, mainly because the DO statement is much more flexible than the %DO statement.

data _null_;
  if eof then call symputx('ngroups',i);
  i+1;
  length group $32767 ;
  do j=1 to 10 while (not eof) ;
    set ids end=eof;
    group = catx(',',group,id);
  end;
  call symputx(cats('group',i),group);
run;
%put &=ngroups (&group2);

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 5445 views
  • 1 like
  • 4 in conversation