SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
halkyos406
Fluorite | Level 6

I am in a SAS coding workgroup and I am trying to figure out some practical uses of DO-WHILE and DO-UNTIL for demonstrating while presenting on DO loops. These are two functions of the loop that I have never found a use for. We work in a health department, so I would especially be interested in an example applicable to health data if anyone can come up with something. Here is what I have so far: (Data note: the values are all randomly generated numbers I threw together in Excel, these are not reflective of a real program)

DATA sample;
	INPUT coach $ patid $ intake lbs_s1-lbs_s10 @@;
	DATALINES;
Alex	6914	265	265	266	265	266	264	261	262	264	263	265
Blake	7506	234	236	235	232	233	233	236	236	239	241	243
Cindy	2172	202	199	202	203	200	201	999	200	198	199	202
Alex	4100	191	193	192	191	194	195	193	191	193	195	196
Blake	9348	216	215	216	999	216	213	210	211	209	212	999
Cindy	8537	274	272	272	274	272	271	269	270	999	267	265
Alex	8659	248	245	999	248	247	250	251	254	253	252	999
Blake	1986	232	231	231	230	231	228	231	234	234	237	238
Cindy	6340	287	286	289	286	287	288	291	293	291	999	292
Alex	4590	285	288	291	293	294	295	294	291	289	289	286
Blake	7197	275	276	276	277	276	999	273	272	273	999	999
Cindy	1738	317	318	315	314	314	314	315	317	316	313	314
Alex	8250	280	277	276	276	275	999	274	277	276	274	273
Blake	4230	228	228	226	999	224	225	224	225	226	223	222
Cindy	7662	281	282	279	279	276	273	272	271	270	272	273
Alex	3675	297	999	999	293	293	290	288	289	290	293	296
Blake	3576	211	213	216	218	215	212	210	213	214	214	212
Cindy	7782	225	227	229	231	234	234	237	236	233	233	236
Alex	1941	226	225	226	228	225	223	226	223	225	227	225
Blake	1586	260	258	255	257	260	259	262	261	259	257	258
Cindy	4466	233	232	231	228	226	999	222	222	223	222	223
Alex	8180	211	211	213	216	213	211	212	209	207	205	203
Blake	6577	202	201	204	201	204	205	205	206	206	209	210
Cindy	3164	208	208	209	210	999	211	210	208	210	211	999
Alex	8028	345	345	346	343	999	342	999	339	341	344	346
Blake	9577	265	265	267	264	261	264	261	263	263	264	263
Cindy	1767	187	188	189	187	186	189	192	191	999	999	190
Alex	2709	246	999	243	245	242	239	237	239	240	240	242
Blake	5686	271	268	270	268	269	269	999	272	273	273	273
Cindy	1285	268	268	999	266	265	267	269	267	269	270	272
	;
RUN;
PROC PRINT DATA = sample NOOBS;
RUN;
PROC MEANS DATA = sample N MIN MAX MEAN MEDIAN;
	TITLE "Variation of weight for each session";
	VAR lbs_s1-lbs_s10;
RUN;

*PROBLEM: Missing weights were coded as 999, which is messing up our results.
SOLUTION: Recode all 999 values a missing.;

DATA sample;
	SET sample;
	ARRAY mothman (10) lbs_s1-lbs_s10;
		DO i = 1 to 10;
			IF mothman(i) = 999 then mothman(i) = .;
		END; *NOTE: All do-loops require and end.;
	DROP i;
RUN;
PROC PRINT DATA = sample NOOBS;
RUN;
PROC MEANS DATA = sample N MIN MAX MEAN MEDIAN;
	TITLE "Variation of weight for each session";
	VAR lbs_s1-lbs_s10;
RUN;

*PROBLEM: We want to include data on the location of each of these participants before sending a copy to the CDC.
SOLUTION: We know where each of these coaches works and so we can add columns for the hospital, city, and zipcode.;

*** This is what it would look like without a do-loop ***;
DATA sample_x;
	SET sample;
	LENGTH hospital $ 25 city $ 8 zipcode $ 5;
	IF coach = "Alex" THEN hospital = "St. Peter's Health";
		ELSE IF coach = "Blake" THEN hospital = "Intermountain St. Vincent";
		ELSE IF coach = "Cindy" THEN hospital = "Providence St. Patrick";
	IF coach = "Alex" THEN city = "Helena";
		ELSE IF coach = "Blake" THEN city = "Billings";
		ELSE IF coach = "Cindy" THEN city = "Missoula";
	IF coach = "Alex" THEN zipcode = "59601";
		ELSE IF coach = "Blake" THEN zipcode = "59101";
		ELSE IF coach = "Cindy" THEN zipcode = "59802";
RUN;
PROC SORT DATA = sample_x;
	BY coach;
RUN;
PROC PRINT DATA = sample_x;
	Title "Sample output 1";
RUN;
*This works, but can be cumbersome if there are a lot of actions to execute and a lot of sites.
Even moreso if a staffing change occurs and we need to update the coding for future datasets;

****This is what it looks like with a do-loop executing multiple actions***;
DATA sample;
	SET sample;	
	LENGTH hospital $ 25 city $ 8 zipcode $ 5;
	IF coach = "Alex" THEN DO;
		hospital = "St. Peter's Health";
		city = "Helena";
		zipcode = "59601";
	END;	*Notice that each loop has a paired end.;
	IF coach = "Blake" THEN DO;
		hospital = "Intermountain St. Vincent";
		city = "Billings";
		zipcode = "59101";
	END;
	IF coach = "Cindy" THEN DO;
		hospital = "Providence St. Patrick";
		city = "Missoula";
		zipcode = "59802";
	END;
RUN;
PROC SORT DATA = sample;
	BY coach;
RUN;
PROC PRINT DATA = sample;
	Title "Sample output 2";
RUN;

*DO-UNTIL and DO-WHILE: These are conditional loops that checks for a criteria
for when to stop advancing. DO-UNTIL checks for the condition to be met at the
bottom of the loop (after completing all steps), and DO-WHILE checks for the
condition to be met at the top of the loop.;
1 ACCEPTED SOLUTION

Accepted Solutions
quickbluefish
Barite | Level 11

WHILE and UNTIL are useful when the underlying data are going to be used to determine when to stop the iteration (looping) rather than having it continue until a fixed point (as you would generally do in a DO loop).  The following is a little bit manufactured because I don't really know anything about your data or objective:

data test;
set sample;
array lb {*} lbs_:;
* lets say we want to find the first point at which the person 
has had 3 consecutive seasons (?) with an average weight<250 ;
s=3;
under250_3seas=0; * initialize this binary variable to 0 for this person ;
do while (under250_3seas=0 and s<dim(lb));
    if mean(lb[s-2], lb[s-1], lb[s])<250 then under250_3seas=1;
    else s+1;
end;
if under250_3seas=0 then s=.;
run;

So the resulting dataset will have a 0/1 indicator of whether the person met the condition, and if they did meet the condition, then the value of S will be the first point at which the condition was reached.  

 

There are definitely more complicated things you can do with WHILE - that's just a simple, contrived example.  

 

I would say that in general, UNTIL is used less than WHILE, at least in my experience, but it's just in the case that you want to run through the steps once even if the conditions are not met.  

 

A lot of times (not always), you could do the same thing with a regular DO loop, but it might be slightly more awkward:

data test;
set sample;
array lb {*} lbs_:;
* lets say we want to find the first point at which the person 
has had 3 consecutive seasons (?) with an average weight<250 ;
under250_3seas=0; * initialize this binary variable to 0 for this person ;
do s=3 to dim(lb);
    if mean(lb[s-2], lb[s-1], lb[s])<250 then do;
        under250_3seas=1;
        LEAVE;
    end;
end;
if under250_3seas=0 then s=.;
run;

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

Suppose you have an iterative procedure, it keeps going until some threshold is crossed.

 

do while (quantity >= threshold);

 

or

 

do until (quantity <= threshold);

 

--
Paige Miller
quickbluefish
Barite | Level 11

WHILE and UNTIL are useful when the underlying data are going to be used to determine when to stop the iteration (looping) rather than having it continue until a fixed point (as you would generally do in a DO loop).  The following is a little bit manufactured because I don't really know anything about your data or objective:

data test;
set sample;
array lb {*} lbs_:;
* lets say we want to find the first point at which the person 
has had 3 consecutive seasons (?) with an average weight<250 ;
s=3;
under250_3seas=0; * initialize this binary variable to 0 for this person ;
do while (under250_3seas=0 and s<dim(lb));
    if mean(lb[s-2], lb[s-1], lb[s])<250 then under250_3seas=1;
    else s+1;
end;
if under250_3seas=0 then s=.;
run;

So the resulting dataset will have a 0/1 indicator of whether the person met the condition, and if they did meet the condition, then the value of S will be the first point at which the condition was reached.  

 

There are definitely more complicated things you can do with WHILE - that's just a simple, contrived example.  

 

I would say that in general, UNTIL is used less than WHILE, at least in my experience, but it's just in the case that you want to run through the steps once even if the conditions are not met.  

 

A lot of times (not always), you could do the same thing with a regular DO loop, but it might be slightly more awkward:

data test;
set sample;
array lb {*} lbs_:;
* lets say we want to find the first point at which the person 
has had 3 consecutive seasons (?) with an average weight<250 ;
under250_3seas=0; * initialize this binary variable to 0 for this person ;
do s=3 to dim(lb);
    if mean(lb[s-2], lb[s-1], lb[s])<250 then do;
        under250_3seas=1;
        LEAVE;
    end;
end;
if under250_3seas=0 then s=.;
run;
halkyos406
Fluorite | Level 6
These are really good and help to demonstrate the features of the programming. Thanks for the ideas! In my mind, s was sessions, instead of seasons, but with this being entirely fictional data it can really be anything.
quickbluefish
Barite | Level 11
...occurs to me that in the DO WHILE statement above, that should be:
...s <= dim(lb)
...rather than...
...s < dim(lb)
Tom
Super User Tom
Super User

With that data you might want to loop across all of those replicated variables.  Using WHILE() or UNTIL() will let you stop once you have found the answer.

 

For example you might want to make a flag that detects if the subject ever exceeded 250 pounds.

data want;
  set sample;
  array lbs_s[10];
  do i=1 to dim(lbs_s) until(over250);
    if (999 ne lbs_s[i]) then over250 = lbs_s[i]>250;
  end;
  drop i;
run;
SASKiwi
PROC Star

DO WHILE  and DO UNTIL are often useful when there is also a counter involved like - DO I  = 1 TO 100 UNTIL  - where the counter maximum is set for safety's sake but you also want to stop when the UNTIL condition is met.

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 878 views
  • 2 likes
  • 5 in conversation