turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Dow Loop-Basics

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

10-05-2012 01:37 PM

The example is from page 2 of this paper.

http://analytics.ncsu.edu/sesug/2010/BB13.Dorfman.pdf

"If VAR is missing CONTINUE passes the control straight to the bottom of the loop".

What is the meaning of bottom of the loop? Is it not gona calculate Mcount Prod and SUm for that variable and goes to the next variable but still with in the loop?Is that the meaning ?

Secondly,

"PROD and COUNT are

set to 1, and the non-retained SUM, MEAN, and MCOUNT are set to missing by the default action of the

implied loop (program control at the top of the implied loop)."

This sentence says that SUM Mean and Count are set to missing . But what lines in the code sets these values to missing??

Any help is greatly appreciated possibly with an example would be great

Thanks

Data B ( Keep = Id Prod Sum Count Mean) ;

Prod = 1 ;

Do Count = 1 By 1 Until ( Last.Id ) ;

Set A ;

By Id ;

If Missing (Var) Then Continue ;

Mcount = Sum (Mcount, 1) ;

Prod = Prod * Var ;

Sum = Sum (Sum, Var) ;

End ;

Mean = Sum / Mcount ;

Run ;

Message was edited by: Karun Diri Also I feel the keep statement would have been good with the set statement rather than the Data B Any suggestions??

Accepted Solutions

Solution

10-05-2012
03:59 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to robertrao

10-05-2012 03:59 PM

MyBad! I had two step number 4s. I meant for you to run the following in one run .. not try to run it in parts:

data a;

input id var;

cards;

1 10

1 .

1 20

2 .

2 20

2 30

;

Data B ( Keep = Id Prod Sum Count Mean) ;

Prod = 1 ;

Do Count = 1 By 1 Until ( Last.Id ) ;

Set A ;

By Id ;

putlog "(1)" _all_;

If Missing (Var) Then Continue ;

putlog "(2)" _all_;

Mcount = Sum (Mcount, 1) ;

Prod = Prod * Var ;

putlog "(3)" _all_;

Sum = Sum (Sum, Var) ;

putlog "(4)" _all_;

End ;

Mean = Sum / Mcount ;

putlog "(5)" _all_;

Run ;

You should end up reading all six records, but only reaching step 5 twice and only outputting two records.

However, I agree with Rick, that this might be too complex for you at this point. If you can answer your own questions after running the above code (that is to say, as is, not in parts), and answer your own questions by looking at what the log shows, you will at least learned a valuable lesson.

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to robertrao

10-05-2012 02:47 PM

Karun,

I think the best way to learn what is going on is to attempt the solution with some data, add some putlog statements in the code so that you can see what is happening within the PDV, and then attempt to answer your own questions.

E.g., try the following and let us know if you can answer your own questions:

data a;

input id var;

cards;

1 10

1 .

1 20

2 .

2 20

2 30

;

Data B ( Keep = Id Prod Sum Count Mean) ;

Prod = 1 ;

Do Count = 1 By 1 Until ( Last.Id ) ;

Set A ;

By Id ;

putlog "(1)" _all_;

If Missing (Var) Then Continue ;

putlog "(2)" _all_;

Mcount = Sum (Mcount, 1) ;

Prod = Prod * Var ;

putlog "(3)" _all_;

Sum = Sum (Sum, Var) ;

putlog "(4)" _all_;

End ;

Mean = Sum / Mcount ;

putlog "(5)" _all_;

Run ;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

10-05-2012 03:41 PM

Sorry about that.

I figured it out.

I first ran it until putlog "(1)" _all_;

WARNING: The variable Sum in the DROP, KEEP, or RENAME list has never been referenced.

WARNING: The variable Mean in the DROP, KEEP, or RENAME list has never been referenced.

(1)Prod=1 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 _ERROR_=0 _N_=1

(1)Prod=1 Count=2 Last.Id=0 id=1 var=. FIRST.id=0 _ERROR_=0 _N_=1

(1)Prod=1 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 _ERROR_=0 _N_=1

(1)Prod=1 Count=1 Last.Id=0 id=2 var=. FIRST.id=1 _ERROR_=0 _N_=2

(1)Prod=1 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 _ERROR_=0 _N_=2

(1)Prod=1 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 _ERROR_=0 _N_=2

Compared that output with running until putlog "(2)" _all_;

WARNING: The variable Sum in the DROP, KEEP, or RENAME list has never been referenced.

WARNING: The variable Mean in the DROP, KEEP, or RENAME list has never been referenced.

(2)Prod=1 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 _ERROR_=0 _N_=1

(2)Prod=1 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 _ERROR_=0 _N_=1

(2)Prod=1 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 _ERROR_=0 _N_=2

(2)Prod=1 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 _ERROR_=0 _N_=2

Putlog(2) output will not have var=. because continue passes to the bottom of the loop directly if the var=.

So **of the 6** obs we now have **only 4 observation**

**IS that right?**

**Thanks**

Solution

10-05-2012
03:59 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to robertrao

10-05-2012 03:59 PM

MyBad! I had two step number 4s. I meant for you to run the following in one run .. not try to run it in parts:

data a;

input id var;

cards;

1 10

1 .

1 20

2 .

2 20

2 30

;

Data B ( Keep = Id Prod Sum Count Mean) ;

Prod = 1 ;

Do Count = 1 By 1 Until ( Last.Id ) ;

Set A ;

By Id ;

putlog "(1)" _all_;

If Missing (Var) Then Continue ;

putlog "(2)" _all_;

Mcount = Sum (Mcount, 1) ;

Prod = Prod * Var ;

putlog "(3)" _all_;

Sum = Sum (Sum, Var) ;

putlog "(4)" _all_;

End ;

Mean = Sum / Mcount ;

putlog "(5)" _all_;

Run ;

You should end up reading all six records, but only reaching step 5 twice and only outputting two records.

However, I agree with Rick, that this might be too complex for you at this point. If you can answer your own questions after running the above code (that is to say, as is, not in parts), and answer your own questions by looking at what the log shows, you will at least learned a valuable lesson.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

10-05-2012 04:09 PM

Thats great . I was running in parts just to see what happens at each step.

I already noted the difference between step1 and step2.

I agree with you both on this stuff looking complex to me. But i dont know for good or bad I have the attitude of going into the very depths even though its not required for me right now.

I see that most of the people post Answers on Dow loop. So I wanted to learn this stuff.

**Thanks for making me walk through the basic concepts**.....

Cheers

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to robertrao

10-05-2012 05:27 PM

Good! When you are ready, post the answers to your original questions. BTW, I find that taking advantage of the putlog _all_ statement is an excellent way to gain an understanding of how the datastep functions .. especially when one adds the complexities of a dow-type loop to the process.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

10-05-2012 05:34 PM

Yes I will. Thanks once again

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

10-05-2012 05:47 PM

Thanks a lot . I will frame the answer but it will be based on your example........and i will explain the put log with what is happening in every step. Will post it later today.

Cheers

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

10-08-2012 12:32 PM

*The following is the log for the dow loop by using putlog in the code and knowing what happens at each stage.

Arthurs example dataset above is used for analysis

Please let me know if I went wrong

I am worried abt the explanation I gave for Putlog(3).......PLease correct

data a;

input id var;

cards;

1 10

1 .

1 20

2 .

2 20

2 30

;

**Putlog 1**

***For the datastep the PDV is set up at the time of compilation with all the information for the variables as well as the automatic variables(_n_ and _error_). Here missing variables are also included.**

20 Data B ( Keep = Id Prod Sum Count Mean) ;

21

22 Prod = 1 ;

23

24 Do Count = 1 By 1 Until ( Last.Id ) ;

25

26 Set A ;

27

28 By Id ;

29

30 putlog "(1)" _all_;

31 END;

32 RUN;

(1)Prod=1 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 _ERROR_=0 _N_=1

(1)Prod=1 Count=2 Last.Id=0 id=1 var=. FIRST.id=0 _ERROR_=0 _N_=1

(1)Prod=1 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 _ERROR_=0 _N_=1

(1)Prod=1 Count=1 Last.Id=0 id=2 var=. FIRST.id=1 _ERROR_=0 _N_=2

(1)Prod=1 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 _ERROR_=0 _N_=2

(1)Prod=1 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 _ERROR_=0 _N_=2

**Putlog 2**

**In this part of the code below since the continue is used if var is missing the contol transfrs the loop to the end statement.**

**That is why in this stage we have all the information for only the variables with non missing var;**

Data B ( Keep = Id Prod Sum Count Mean) ;

34

35 Prod = 1 ;

36

37 Do Count = 1 By 1 Until ( Last.Id ) ;

38

39 Set A ;

40

41 By Id ;

42

43

44

45

If Missing (Var) Then Continue ;

47

48 putlog "(2)" _all_;

49 END;

50 RUN;

(2)Prod=1 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 _ERROR_=0 _N_=1

(2)Prod=1 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 _ERROR_=0 _N_=1

(2)Prod=1 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 _ERROR_=0 _N_=2

(2)Prod=1 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 _ERROR_=0 _N_=2

**Putlog 3**

**When we use a by group with in the DOW loop SAS keeps track of the DATA within each BY group.**

**Meaning prod value in this step below is retained in the group now when it has to have a value of 1 (we assigned at the beginning and is supposed to retain usually since this variable is being created in this datastep and not coming from dataset A) each time...!!!!!!!!!**

51 Data B ( Keep = Id Prod Sum Count Mean) ;

52

53 Prod = 1 ;

54

55 Do Count = 1 By 1 Until ( Last.Id ) ;

56

57 Set A ;

58

59 By Id ;

60

61 If Missing (Var) Then Continue ;

62 Mcount = Sum (Mcount, 1) ;

63

64 Prod = Prod * Var ;

65

66 putlog "(3)" _all_;

67 END;

68 RUN;

(3)Prod=10 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 Mcount=1 _ERROR_=0 _N_=1

(3)Prod=200 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 Mcount=2 _ERROR_=0 _N_=1

(3)Prod=20 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 Mcount=1 _ERROR_=0 _N_=2

(3)Prod=600 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 Mcount=2 _ERROR_=0 _N_=2

**Putlog 4**

**Now in each group the sum value has the cumulative of var because of the stmnt**

**sum=sum(sum,var);**

69 Data B ( Keep = Id Prod Sum Count Mean) ;

70

71 Prod = 1 ;

72

73 Do Count = 1 By 1 Until ( Last.Id ) ;

74

75 Set A ;

76

77 By Id ;

78

79 If Missing (Var) Then Continue ;

80 Mcount = Sum (Mcount, 1) ;

81

82 Prod = Prod * Var ;

83 Sum = Sum (Sum, Var) ;

84

85 putlog "(4)" _all_;

86 END;

87 RUN;

(4)Prod=10 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 Mcount=1 Sum=10 _ERROR_=0 _N_=1

(4)Prod=200 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 Mcount=2 Sum=30 _ERROR_=0 _N_=1

(4)Prod=20 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 Mcount=1 Sum=20 _ERROR_=0 _N_=2

(4)Prod=600 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 Mcount=2 Sum=50 _ERROR_=0 _N_=2

**Putlog 5**

**In the step below 30 is the final sum for ID 1 group and 50 is trhe sum for ID 2 group;**

**since our data is divided into 2 groups by the BY ID statement**

**At this point we went Outside the loop and said Mean = Sum / Mcount ;**

**Final sum's for each BY variable is divided by the final Mcount and finallly end up with the 2 observations having value of **

**Mean=15 for group ID1**

**Mean=25 for group ID2**

88 Data B ( Keep = Id Prod Sum Count Mean) ;

89

90 Prod = 1 ;

91

92 Do Count = 1 By 1 Until ( Last.Id ) ;

93

94 Set A ;

95

96 By Id ;

97

98 If Missing (Var) Then Continue ;

99 Mcount = Sum (Mcount, 1) ;

100

101 Prod = Prod * Var ;

102 Sum = Sum (Sum, Var) ;

103 END;

104

105 /*notice that the mean stmnt is after the END stmnt*/

106

107 Mean = Sum / Mcount ;

108

109 putlog "(5)" _all_;

110

111 Run ;

(5)Prod=200 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 Mcount=2 Sum=30 Mean=15 _ERROR_=0 _N_=1

(5)Prod=600 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 Mcount=2 Sum=50 Mean=25 _ERROR_=0 _N_=2