DATA Step, Macro, Functions and more

Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Reply
Contributor
Posts: 50

Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

I've got an interesting situation due to an interesting request. I've googled for a while but can't find the simple solution I think I'm looking for. I want to use nested "do over" loops to go through all possibilities of 2 arrays inside a proc sql statement.

I've got a query that returns results exactly how I need them I just need to iterate through every possible value of two variables underlined and highlighted bold below. I've tried using entirely SQL by using sub-selects but performance is horrible due to table size. The query cannot change much as I need it to sort and limit by 1 to find the first occurance of an so_pk for each unit_pk and service_cat_desc. I've highlighted in bold what I need to iterate through.

I also need to understand how to load an array using proc sql (so I can actually use the two arrays needed). I somewhat understand how to load using a data step but I will only have 1 column of observations that needs to be loaded into the array. I feel like I need to transpose the results of SQL so they can be loaded into an array.

Here is the query that needs to be parameterized and looped through using nested array loops:


select
orig.unit_pk,
cb_cntr_service_category.service_cat_desc,
orig.so_pk,
min(sub_prod_pckg_pk) as pk

from
cb_subs_prod_pckg as orig
left outer join cb_cntr_service_type on
orig.service_type_code = cb_cntr_service_type.service_type_code
left outer join cb_cntr_service_category on
cb_cntr_service_category.service_cat_code = cb_cntr_service_type.service_cat_code

where
orig.so_pk is not null and
orig.unit_pk = 54 and
cb_cntr_service_category.service_cat_desc = 'Video'

group by
orig.unit_pk,
cb_cntr_service_category.service_cat_desc,
orig.so_pk

order by
orig.unit_pk,
cb_cntr_service_category.service_cat_desc,
pk

limit 1


To summarize I'm looking for the following:
1) How to load an array using a single column of data returned by proc sql.
2) How to loop through a proc sql statement with arrays and store the results in a table.

Thanks!
Respected Advisor
Posts: 4,173

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Posted in reply to StephenOverton
Not sure if I fully understand your problem. Hopefully the following helps anyway:

As you use "group by" you will get one observation for each distinct combination of orig.unit_pk,cb_cntr_service_category.service_cat_desc,orig.so_pk.
The value for pk will be the minimum value per distinct combination of these 3 columns.

As you select in your where close orig.unit_pk = 54 and cb_cntr_service_category.service_cat_desc = 'Video' only rows with these two values are selected and joined.
(actually: I'm not sure whether the joins happen first and then the where clause subsets the result or the other way round. You might want to try and have the selections already applied during the joins in the "on.." part to improve performance).

The "order by" is not necessary, "group by" sorts already.

select
orig.unit_pk,
cb_cntr_service_category.service_cat_desc,
orig.so_pk,
min(sub_prod_pckg_pk) as pk

from
cb_subs_prod_pckg as orig
left outer join cb_cntr_service_type on
orig.service_type_code = cb_cntr_service_type.service_type_code
left outer join cb_cntr_service_category on
cb_cntr_service_category.service_cat_code = cb_cntr_service_type.service_cat_code

where
orig.so_pk is not null and
orig.unit_pk = 54 and
cb_cntr_service_category.service_cat_desc = 'Video'

group by
orig.unit_pk,
cb_cntr_service_category.service_cat_desc,
orig.so_pk



"....find the first occurance of an so_pk ..."
As you use "group by" you will only get one row per combination. As the only other variable is a minimum it doesn't matter whether this is "the first one" or not (the "first one" wouldn't be possible).
HTH
Patrick
Contributor
Posts: 50

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

orig.unit_pk = 54 and cb_cntr_service_category.service_cat_desc = 'Video'

These are the two values I need to iterate through. There are unit_pk's that range from 1 to a big number and service_cat_desc that are about 10 different text values.

As you suggest, I could move these parameters to the ON clause to improve performance but this doesn't solve my problem of needing to cycle through different values. I'm thinking use nested loops to iterate through every combination of unit_pk and service_cat_desc.
SAS Super FREQ
Posts: 8,861

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Posted in reply to StephenOverton
Hi:
You requirement to cycle through a process multiple times and only vary a few pieces of the process points to using SAS macro coding techniques. To me. As far as I know, you can't use arrays, per se, with SQL, but you can create numbered macro variables, and within a macro program, you could iteratively cycle through your SQL code and only change your unit_pk value and your service_cat_desc value.

I would start where you are at: you already have a working SAS program. Now you need to learn more about SAS macro processing. The SAS Macro facility is a method that you can use to actually -generate- code -- think of the SAS Macro facility as a big, behind-the-scenes, typewriter, which can type pieces of code or whole programs for you and then send those programs to the regular SAS compile/execute phases. So the Macro facility isn't doing any execution of code -- just resolving macro program invocations and macro variable references in order to generate code.

So, briefly, if you had:
[pre]
** above SQL step;
%let wantpk = 54;
%let wantsvc=Video;

then in the SQL step had:
orig.unit_pk = &wantpk and cb_cntr_service_category.service_cat_desc = "&wantsvc"
[/pre]

what would happen is that the macro facility word scanner would detect the &wantpk reference and would substitute or type the number 54 into the resolved code. Next, the macro word scanner would encounter the macro reference &wantsvc and would type or substitute the value Video into the quoted string (note the need for double quotes in the SQL statement in order for the macro variable reference to resolve).

Once you get a macro variable reference like this working for your program, you are now ready to look into the use of a macro program and a macro %DO loop to iterate through a numbered list of macro variables.

If you have never used the SAS macro facility, then, buckle your seat belt because this is a new SAS ride for you in the SAS park. A good place to start is this paper -- that covers the basics of the Macro facility in 9 steps:
http://www2.sas.com/proceedings/sugi28/056-28.pdf
and this one: http://www2.sas.com/proceedings/sugi29/243-29.pdf
and this is a nice blog about using %DO loops
http://scott.sherrillmix.com/blog/programmer/sas-macros-letting-sas-do-the-typing/
you'll have to search for more papers by SAS users on macro processing. Here's one that explicitly talks about using PROC SQL to create your macro variables (such as the numbered list you'll need for your task)
http://support.sas.com/resources/papers/proceedings09/200-2009.pdf

Hopefully, this will get you started.

cynthia
Contributor
Posts: 50

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Posted in reply to Cynthia_sas
Thanks Cynthia, you've been very helpful as always!

Using the select into clause is exactly what I was looking for. I was not aware of macro lists either.

I think I've got two lists loaded for my two parameters. Now I've just got to figure out how to iterate through them.

One question in the meantime: You mentioned a "numbered list of macro variables". Should I be loading the two parameters in any special way??

Here is what I've done to load the lists so far:

/**Load service categories into macro variable*/
proc sql noprint;
select servicecat into Smiley Frustratedervices separated by ' '
from ServiceCategories;
quit;
/**Load units into macro variable*/
proc sql noprint;
select unit_pk into :Units separated by ' '
from PossibleUnits;
quit;


ServiceCategories and PossibleUnits are tables with distinct values of what I want to iterate through.
Super Contributor
Super Contributor
Posts: 3,174

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Posted in reply to StephenOverton
Also, in your SELECT, you may want to use DISTINCT() so that there is only one macro variable for each unique value. And, depending on the number of "separated" values generated, you may exceed the size limit of a SAS macro variable -- not sure about your unique value count though.

Scott Barry
SBBWorks, Inc.
Contributor
Posts: 50

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Thanks, I've got the distinct covered in the SQL that loads the temp tables.

I think I am exceeding the size of macro variables. My list of unit_pks can have about 70000 integers starting from 1 to 70000.
Super Contributor
Super Contributor
Posts: 3,174

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Posted in reply to StephenOverton
As a general-rule, consider always working with a SAS-generated macro variable list, if you must use PROC SQL, rather than using a DATA step approach. Here's a Google advanced search argument which will reveal some useful SAS-hosted DOC and technical/conference reference material on the topic:

proc sql select into macro variable iterate site:sas.com

Scott Barry
SBBWorks, Inc.
Contributor
Posts: 50

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

So instead of a single list, I could use something like this, right?

SELECT
INTO : -
:
FROM < table>

I think this is what Cynthia was talking about when she mentions "numbered list".
Super Contributor
Super Contributor
Posts: 3,174

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Posted in reply to StephenOverton
Yes, but the high-value (macro variable with count of unique values) would be derived from an initial SELECT, followed by the SELECT that populates your declared range of values. Then, you would iterate through the list of macro variables in a %DO / %END code paragraph, generating your SAS code as needed. You will find several examples on the SAS.COM support site with example coding techniques.

Scott Barry
SBBWorks, Inc.
Contributor
Posts: 50

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Sounds good, I found a good example with your search suggestion. I'll post the code when I'm done.

In the meantime, this PDF covers a lot of what I'm doing: http://support.sas.com/resources/papers/proceedings09/061-2009.pdf
Contributor
Posts: 50

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Now I'm stuck with this error:

ERROR: The %DO statement is not valid in open code.

This is the full code of what I have right now:

LIBNAME CONX BASE "/sas_perm_data/Enlighten_Landing" ;
%global s;

/** Passthrough SQL to load distinct service category types ***/
proc sql;
connect to odbc as myconn (dsn=Enlighten);
create table ServiceCategories as
select *
from connection to myconn (
select distinct
cb_cntr_service_category.service_cat_desc as ServiceCat
from
cb_subs_prod_pckg as orig
left outer join cb_cntr_service_type on
orig.service_type_code = cb_cntr_service_type.service_type_code
left outer join cb_cntr_service_category on
cb_cntr_service_category.service_cat_code = cb_cntr_service_type.service_cat_code
);
quit;

/** Passthrough SQL to load possible units ***/
proc sql;
connect to odbc as myconn (dsn=Enlighten);
create table PossibleUnits as
select *
from connection to myconn (
select distinct unit_pk from cb_so_unit
);
quit;

/**Load service categories into numbered macro variables*/
proc sql noprint;
select servicecat into Smiley Frustratedervices1 - Smiley Frustratedervices20
from ServiceCategories;
quit;

%let cntservices=&sqlobs; /*count of services */

/**Load units into numbered macro variables*/
proc sql noprint;
select unit_pk into :Units1 - :Units70000
from PossibleUnits;
quit;

%let cntunits=&sqlobs; /* Count of units */

/*Create table to hold returned values*/
proc sql noprint;
create table test (unit_pk num, service_cat_desc char(30), so_pk num, pk num);
quit;

/**Loop through all possible units and service categories and insert into table*/
proc sql;
%do s=1 %to &cntservices; /* just through services for now for testing*/
insert into test
select
orig.unit_pk,
orig.service_cat_desc,
orig.so_pk,
min(orig.sub_prod_pckg_pk) as pk
from
conx.subscriber_services as orig
where
orig.so_pk is not null and
orig.unit_pk = 54 and
orig.service_cat_desc = &&Services&s
group by
orig.unit_pk,
orig.service_cat_desc,
orig.so_pk
order by
orig.unit_pk,
orig.service_cat_desc,
pk
;
%end;
quit;


I think I'm just a few lines away from getting this problem solved. more

Message was edited by: SteveONCSU
SAS Super FREQ
Posts: 8,861

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Posted in reply to StephenOverton
Hi:
Sounds like you're on the right track. However, I highly recommend that you read the "Nine Steps" paper -- it will explain why you cannot use %DO or %IF in open code and why you have to put create a macro program using %MACRO and %MEND.

Once you read the "Nine Steps" paper, I think you'll have a much better idea of what to do next.

cynthia
Contributor
Posts: 50

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Posted in reply to Cynthia_sas
So I've got the macro written now, thanks to a little old fashioned reading. But I keep getting a segmentation error when running it:

Segmentation Violation In Task [ SQL (2) ]
Fault and traceback information not available
Task Traceback

The macro runs and inserts values into the table correctly, it just doesn't finish which does not give me the entire dataset I need.

Here is the macro, any suggestions would be appreciated:

%macro iterate;
proc sql noprint;
%do u=1 %to &cntunits;
%do s=1 %to &cntservices;
insert into InitialTickets
select
orig.unit_pk,
orig.service_cat_desc,
orig.so_pk,
min(orig.sub_prod_pckg_pk) as pk
from
conx.subscriber_services as orig
where
orig.so_pk is not null and
orig.unit_pk = &&Units&u and
orig.service_cat_desc = &&Services&s
group by
orig.unit_pk,
orig.service_cat_desc,
orig.so_pk
;
%end;
%end;
quit;
%mend iterate;
SAS Super FREQ
Posts: 8,861

Re: Parameterizing Proc SQL with Arrays and Iterating Using Nested Loops

Posted in reply to StephenOverton
Hi:
I don't know....segmentation violation sounds pretty serious. I wonder if you are somehow reaching a liimit on the INSERTs that you can generate or have exceeded some other limit. How high can your %DO loops go???
You might want to work with Tech Support on this one.
cynthia
Ask a Question
Discussion stats
  • 21 replies
  • 317 views
  • 0 likes
  • 6 in conversation