BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
persephone
Fluorite | Level 6

I use SAS Enterprise Guide 8.3, with Linux server and Windows SAS EG client.  We recently moved from Netezza to Snowflake/Azure.  This program expands the membership date spans into member months and has worked after we moved to the cloud however 2 months later (no changes to data) the program churns and I have to kill it.  Does Snowflake not like the syntax for the do while program? Any ideas why this did work after move to Snowflake and now 1 month later it churns?


SAMPLE DATA
data member_spans;
infile datalines delimiter= ',';
input patient_id $ startElig:date9. endElig:date9. transfer;
format startElig endElig date9. transfer dollar12.2;
datalines;
P12345678,01JAN2023,31OCT2023, 4000
Q23456789,01JAN2023,30JUN2023,461.43
R34567891,01JAN2023,31MAR2023,544.07
S45678912,01JAN2023,31AUG2023,3678.69
S45678912,01SEP2023,31OCT2023,923.46
T56789123,01OCT2023,31OCT2023,18930
U67891234,01JAN2023,31MAR2023,-1269.85
U67891234,01APR2023,31OCT2023,-3019.41
V78912345,01JAN2023,18JUL2023,-3196.49
V78912345,19JUL2023,14OCT2023,-1413.52
V78912345,15OCT2023,31OCT2023,-273.07
run;

/*Expand into mms*/
data expand_2_mms;
set Member_Spans ;
by patient_id startElig endElig ;
counter=1;
if transfer ne .;
do while(startElig <= endElig);
output;

startElig=intnx('month',startElig,1);
counter=counter+1;
if startElig >= endElig then leave;
end;

format startElig date9.;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
SASJedi
Ammonite | Level 13

DATA step code does not run in Snowflake, so the techniques you use for DATA step processing aren't an issue. Here are some considerations:

1. You really shouldn't use a subsetting IF statement when reading database data in SAS if you can get the same result using a WHERE statement. To improve the efficiency of your code, replace:

if transfer ne .;

 with:

where transfer is not null;

This will minimize the amount of data pulled into SAS from Snowflake for processing.

2. Is it possible you have some bad records in your data where the startElig value is missing? If so, your code will go into an infinite loop.
You can avoid this situation by adding code to avoid reading records with missing startElig values:

 

where transfer is not null and startElig is not null;

 

or add code to impute a value for a missing startElig before entering the loop:

 

startElig=colaesce(startElig, endElig-1);
do while(startElig <= endElig);
   *... more statement s ...;
end;

 

 

3. Is the dataset you are reading growing in size? If so, expect longer run times as the size of the data grows.

 

Check out my Jedi SAS Tricks for SAS Users

View solution in original post

5 REPLIES 5
ballardw
Super User

You don't show any connection to Snowflake so I am guessing that perhaps Snowflake stores dates differently than Netezza and your connection with Snowflake isn't doing the date conversion "correctly".

 

Perhaps you could share exactly how you connect to Snowflake and the code submitted with some LOG entry to provide more details that someone familiar with Snowflake can diagnose.

SASJedi
Ammonite | Level 13

DATA step code does not run in Snowflake, so the techniques you use for DATA step processing aren't an issue. Here are some considerations:

1. You really shouldn't use a subsetting IF statement when reading database data in SAS if you can get the same result using a WHERE statement. To improve the efficiency of your code, replace:

if transfer ne .;

 with:

where transfer is not null;

This will minimize the amount of data pulled into SAS from Snowflake for processing.

2. Is it possible you have some bad records in your data where the startElig value is missing? If so, your code will go into an infinite loop.
You can avoid this situation by adding code to avoid reading records with missing startElig values:

 

where transfer is not null and startElig is not null;

 

or add code to impute a value for a missing startElig before entering the loop:

 

startElig=colaesce(startElig, endElig-1);
do while(startElig <= endElig);
   *... more statement s ...;
end;

 

 

3. Is the dataset you are reading growing in size? If so, expect longer run times as the size of the data grows.

 

Check out my Jedi SAS Tricks for SAS Users
persephone
Fluorite | Level 6

Thanks Jedi!  You are spot on. 
I do a validation of the data prior to this data step so I knew that wasn't the issue.
I implemented the 'where variables are not nulll) for both transfer and startElig and it ran successfully in .26 seconds.

Very much appreciate your prompt reply and assistance!

Quentin
Super User

Can you share the log from running the step (and after waiting a while, killing it)?  There should be some helpful clues there.

 

To detect the potential infinite loop problem, you could add something like:

counter=1;
do while(startElig <= endElig);
  output;

  startElig=intnx('month',startElig,1);
  if startElig >= endElig then leave;
  counter=counter+1;
  if counter  > 1000 then do;
    put "ERROR: infinite loop!" patientID= startElig= endElig=;
    leave;
  end;
end;

I agree with SAS Jedi's suggestion that a missing value for StartElig would be one easy way for your current code to trigger an infinite loop.

persephone
Fluorite | Level 6

Thanks for your prompt response and suggestion to identify an infinite loop Quentin.
Will definitely use this in the future!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1899 views
  • 2 likes
  • 4 in conversation