Create a loop in a SAS datatable / dataset and execute a store procedu... - Page 2

Patrick · Posted 09-15-2022 12:59 AM

If you can't share any code for us to understand where you are at right now then it's really hard to provide concrete advice.

Having said that: It's going to be something pretty close to what @Tom just shared as code. ...but again: You need to talk "code" with us for us to be able to amend/extend your code to what you need.

tassnh · Posted 09-15-2022 01:30 AM

Dear Patrick, my answers in red:

If you can't share any code for us to understand where you are at right now then it's really hard to provide concrete advice.

Having said that: It's going to be something pretty close to what @Tom just shared as code. ...but again: You need to talk "code" with us for us to be able to amend/extend your code to what you need.

ok, could you indicate me what part of the coding you need me to post?

My SP? My select from my db/table? The SP call?

Please check my reply to Tom. Maybe that would sort things out a bit.

Kind regards,

tassnh

Tom · Posted 09-14-2022 09:11 AM

That system_id is kept in a different table which (after I run my SP) I will select it's value from that table (single table, single column, single row value). Let's call this system_id_table

So that might be enough to make some pseudo code to show to process.

Sounds like you are doing something like this:

1) Connect to database.

2) run procedure in database.

3) query table in database.

4) upload data to database.

Let's assume you already have a connection to the database defined by the libref MYDB.

So your single instance program might look like:

proc sql;
connect using mydb ;
execute by mydb
  ( run_my_stored_procedure )
;
create table new_id as select * from connection to mydb
(select system_id from system_id_table)
;
quit;
data for_upload ;
  merge new_id new_data;
run;
proc append base=mydb.target_table data=for_upload;
run;

What we cannot tell from your description is exactly what it is you want to pull form SYSTEM_ID_TABLE, how to find the "new" id that the process generated, etc etc.

I also do not see how the source dataset you wanted to loop over applies to the process.

Is it just that you need to call the stored process once for every observation in the dataset?

Or are there values in the source dataset that need to serve as inputs to the stored process?

In general a simply way to repeat multiple steps for every observation in a dataset is to make a macro that does the steps that takes as input the information that is the dataset. Then just generate one call to the macro for every observation in the dataset.

So if the process needs to take the value of NAME from the input dataset, let's call the dataset HAVE, then macro a macro that has a NAME parameter and you can then call it once per observation.

data _null_;
  set have ;
  call execute(cats('%nrstr(%mymacro)(name=',quote(trim(name),"'"),')'));
run;

tassnh · Posted 09-15-2022 01:27 AM

Good morning Tom,

I will try to answer in detail and be as thorough as I can. Please find my replies in red:

Sounds like you are doing something like this:

1) Connect to database.

2) run procedure in database.

3) query table in database.

4) upload data to database.

You might want to revert 2 and 3 between one each other. First I select the value of the table, then run my SP (in which the value gets updated) and get the value of the table again. This has to loop for as many times as my rows. Each time I select the value I get a new system_id. At the end, I need to run it once more so that the app will use it for its own purposes.

Between 3 and 4 there is another (if not more) step. I need to update the value I got from my table in my dt (in DI), at the first field. As stated before, this has to loop for as many times as my line in my dt. Then comes step "4", in which I append my dt in my other table (Document Management Table).

Let's assume you already have a connection to the database defined by the libref MYDB. I do, with a specific username / password at a specific instance / db

Are you asking for this?:

Proc sql;

CONNECT TO ODBC(DATAsrc=H USER=sas_h PASSWORD="{SAS102}xxxxxxxxxxxxxxxxxxxxxx" );

EXECUTE ( EXEC [H].[dbo].[sp_next_system_id] ) by ODBC;

DISCONNECT FROM ODBC;

quit;

What we cannot tell from your description is exactly what it is you want to pull form SYSTEM_ID_TABLE, how to find the "new" id that the process generated, etc etc. As stated before, this is a single column, single row table, which holds the next system_id that will be given by the system (integer). So, in other words, I want to select the value (let's say that the value is 1000) and assign this value to my first field in my dt. I will then need to execute my SP, the SP will increment the value of the system_id table by 1 (so the new value in the table will be 1001) and select it again (for as many times as my rows in my dt) and reassign it to my 2nd line, 1st field of my dt.

For instance:

Proc sql;

CONNECT TO ODBC(DATAsrc=H USER=sas_h PASSWORD="{SAS102}xxxxxxxxxxxxxxxxxxxxxx" );

EXECUTE ( EXEC [H].[dbo].[sp_nkey] ) by ODBC;

DISCONNECT FROM ODBC;

quit;

I also do not see how the source dataset you wanted to loop over applies to the process.

This is not a source ds. This is a dt / ds created (generated) from various selections both from DWH and my DB in SQL. From all fields retrieved, the only thing left behind is the system_id which (since this dt will be appended to a production app) need to be given to the row at the very last second and is a new value (not an existing one).

To describe it in more detail;

I am constructing let's say a customer, based on his ID.

I need to get his name from the table that holds the names, based on his ID.
I need to get his date of birts from the table that holds the dates of birth, based on his ID.
I need to get his address from the table that holds the addresses, based on his ID.
I need to get his teleplhone from the table that holds the telephone numbers, based on his ID
I construct a table (and map it on another table which is a whole different story).
I also append an empty column on this table.
Now, for each line (which is for each client), I need to assign a system_id, based on my app (document management app). So, I need to loop through my rows, select the value of system_id from system_id table, execute the SP to increment the value of the system_id by 1 and go to the next line

The only thing missing is step "7".

Is it just that you need to call the stored process once for every observation in the dataset?

Yes and get the value from the system_id table (as explained in detail above)

Or are there values in the source dataset that need to serve as inputs to the stored process?

No. Just the 1st one

In general a simply way to repeat multiple steps for every observation in a dataset is to make a macro that does the steps that takes as input the information that is the dataset. Then just generate one call to the macro for every observation in the dataset.

Could you please elaborate a bit in more detail? Please bear in mind that I’m not that experienced in DI.

Thank you for all the effort you are putting into this.

Please let me know if there is something else you need me to clarify or coding you might think it will help more.

Kind regards,

tassnh

Tom · Posted 09-15-2022 10:36 AM

Trying to identify the forest from the bumps on the leaves of the trees your answer is showing:

It sounds like you have a dataset that is a list of customers. The dataset has variables like

ID, NAME, DOB, ADDRESS, PHONE

You now want to add a new variable, SYSTEM_ID, by using the stored process in your foreign database. Each procedure call only produces one new value. So if you have 100 customers in your dataset you will want to call the procedure 100 times.

One way to do that is to call the procedure 100 times and extract the new system_id values and save them into a SAS dataset. Then you just need to combine the dataset with the list of 100 customers with the dataset with 100 new system_id values.

Example (wallpaper code):

* Make an empty dataset for storing new system_id values ;
data new_system_ids;
  length system_id 8;
  stop;
run;
proc sql;
  CONNECT TO ODBC (...);
  EXECUTE BY ODBC ( EXEC [H].[dbo].[sp_nkey] ) ;
  insert into new_system_ids(system_id)
    select system_id from connection to odbc
     (select system_id from system_id)
  ;
  EXECUTE BY ODBC ( EXEC [H].[dbo].[sp_nkey] ) ;
  insert into new_system_ids(system_id)
    select system_id from connection to odbc
     (select system_id from system_id)
  ;
  EXECUTE BY ODBC ( EXEC [H].[dbo].[sp_nkey] ) ;
  insert into new_system_ids(system_id)
    select system_id from connection to odbc
     (select system_id from system_id)
  ;
...
disconnect from odbc;
quit;
data want ;
   set have;
   set new_system_ids;
run;

Now to generate that "wallpaper" code you could just count how many ids you need by counting how many customers are in your HAVE dataset. You could then use code generation to generate code. SAS macro language is one way to generate code.

%macro system_ids(n);
%local i;
%do i=1 to &n ;
  EXECUTE BY ODBC ( EXEC [H].[dbo].[sp_nkey] ) ;
  insert into new_system_ids(system_id)
    select system_id from connection to odbc
     (select system_id from system_id)
  ;
%end;
%mend;

data _null_;
  call symputx('nobs',nobs);
  stop;
  set have nobs=nobs;
run;

* Make an empty dataset for storing new system_id values ;
data new_system_ids;
  length system_id 8;
  stop;
run;
proc sql;
  CONNECT TO ODBC (...);
  %system_ids(&nobs);
  disconnect from odbc;
quit;
data want ;
   set have;
   set new_system_ids;
run;

NOTE: If the value is a BIGINT in your foreign database you might need to store it as a string in your SAS dataset since SAS can only represent integers with 15 decimal digits exactly. In that case in the passthru query to pull the new value of system_id cast it as a string first so that you can preserve the uniqueness of the value.

PS How can the system you described actually work? What happens if two users call the stored procedure at almost the exact same time so that it executes twice before the query to receive the number can run. So when they go to get the value both will take the same value. This is not how this type of autoincrement features work in other datasets I have seen. For example in Oracle it is a FUNCTION that you call to get the next value. So it can be returned as part of a query. Then there is no window of time between generating the value and retrieving it that could cause the race condition I just described.

tassnh · Posted 09-20-2022 12:44 AM

Dear Tom,

First of all I am sorry for the extremely long delay but I was sick at home and thank you very much for your reply.

Please find my answers in red:

Trying to identify the forest from the bumps on the leaves of the trees your answer is showing:

It sounds like you have a dataset that is a list of customers. The dataset has variables like

ID, NAME, DOB, ADDRESS, PHONE

You now want to add a new variable, SYSTEM_ID, by using the stored process in your foreign database. Each procedure call only produces one new value. So if you have 100 customers in your dataset you will want to call the procedure 100 times.

You nailed it!

One way to do that is to call the procedure 100 times and extract the new system_id values and save them into a SAS dataset. Then you just need to combine the dataset with the list of 100 customers with the dataset with 100 new system_id values.

The number of the customers varies each day. It could be 50, 55 or even a 100. This is the reason I need to have a loop in my ds and not just repeat a procedure for X times

Now to generate that "wallpaper" code you could just count how many ids you need by counting how many customers are in your HAVE dataset. You could then use code generation to generate code. SAS macro language is one way to generate code.

This code sounds about right. I will try this solution and get back to you asap.

Thank you for everything,

Regards,

tassnh

RacheLGomez123 · Posted 11-09-2022 04:32 AM

Loops are fundamental to programming because they enable you to repeat a computation for various values of parameters. Different languages use different keywords to define the iteration statement. The most well-known statement is the "for loop," which is used by C/C++, MATLAB, R, and other languages. Older languages, such as FORTRAN and SAS, call the iteration statement a "do loop," but it is exactly the same concept.
DO loops in the DATA step
The basic iterative DO statement in SAS has the syntax DO value = start TO stop. An END statement marks the end of the loop, as shown in the following example:
data A;
do i = 1 to 5;
y = i**2; /* values are 1, 4, 9, 16, 25 */
output;
end;
run;
By default, each iteration of a DO statement increments the value of the counter by 1, but you can use the BY option to increment the counter by other amounts, including non-integer amounts. For example, each iteration of the following DATA step increments the value i by 0.5:
data A;
do i = 1 to 5 by 0.5;
y = i**2; /* values are 1, 2.25, 4, ..., 16, 20.25, 25 */
output;
end;
run;
You can also iterate "backwards" by using a negative value for the BY option: do i=5 to 1 by -0.5.

Regards,

Rachel Gomez

Re: Create a loop in a SAS datatable / dataset and execute a store procedure each time

Re: Create a loop in a SAS datatable / dataset and execute a store procedure each time

Re: Create a loop in a SAS datatable / dataset and execute a store procedure each time

Re: Create a loop in a SAS datatable / dataset and execute a store procedure each time

Re: Create a loop in a SAS datatable / dataset and execute a store procedure each time

Re: Create a loop in a SAS datatable / dataset and execute a store procedure each time

Re: Create a loop in a SAS datatable / dataset and execute a store procedure each time

Catch up on SAS Innovate 2026