BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
robertrao
Quartz | Level 8

The example is from page 2 of this paper.

http://analytics.ncsu.edu/sesug/2010/BB13.Dorfman.pdf

"If VAR is missing CONTINUE passes the control straight to the bottom of the loop".

What is the meaning of bottom of the loop? Is it not gona calculate Mcount Prod and SUm for that variable and goes to the next variable but still with in the loop?Is that the meaning ?

Secondly,

"PROD and COUNT are

set to 1, and the non-retained SUM, MEAN, and MCOUNT are set to missing by the default action of the

implied loop (program control at the top of the implied loop)."

This sentence says that SUM Mean and Count are set to missing . But what lines in the code sets these values to missing??

Any help is greatly appreciated possibly with an example would be great

Thanks

Data B ( Keep = Id Prod Sum Count Mean) ;

  Prod = 1 ;

   Do Count = 1 By 1 Until ( Last.Id ) ;

   Set A ;

    By Id ;

       If Missing (Var) Then Continue ;

      Mcount = Sum (Mcount, 1) ;

       Prod = Prod * Var ;

       Sum = Sum (Sum, Var) ;

       End ;

       Mean = Sum / Mcount ;

Run ;

Message was edited by: Karun Diri Also I feel the keep statement would have been good with the set statement rather than the Data B Any suggestions??

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

MyBad!  I had two step number 4s.  I meant for you to run the following in one run .. not try to run it in parts:

data a;

  input id var;

  cards;

1 10

1 .

1 20

2 .

2 20

2 30

;

Data B ( Keep = Id Prod Sum Count Mean) ;

  Prod = 1 ;

  Do Count = 1 By 1 Until ( Last.Id ) ;

    Set A ;

    By Id ;

putlog "(1)" _all_;

    If Missing (Var) Then Continue ;

putlog "(2)" _all_;

    Mcount = Sum (Mcount, 1) ;

    Prod = Prod * Var ;

putlog "(3)" _all_;

    Sum = Sum (Sum, Var) ;

putlog "(4)" _all_;

  End ;

  Mean = Sum / Mcount ;

putlog "(5)" _all_;

Run ;

You should end up reading all six records, but only reaching step 5 twice and only outputting two records.

However, I agree with Rick, that this might be too complex for you at this point.  If you can answer your own questions after running the above code (that is to say, as is, not in parts), and answer your own questions by looking at what the log shows, you will at least learned a valuable lesson.

View solution in original post

8 REPLIES 8
art297
Opal | Level 21

Karun,

I think the best way to learn what is going on is to attempt the solution with some data, add some putlog statements in the code so that you can see what is happening within the PDV, and then attempt to answer your own questions.

E.g., try the following and let us know if you can answer your own questions:

data a;

  input id var;

  cards;

1 10

1 .

1 20

2 .

2 20

2 30

;

Data B ( Keep = Id Prod Sum Count Mean) ;

  Prod = 1 ;

  Do Count = 1 By 1 Until ( Last.Id ) ;

    Set A ;

    By Id ;

putlog "(1)" _all_;

    If Missing (Var) Then Continue ;

putlog "(2)" _all_;

    Mcount = Sum (Mcount, 1) ;

    Prod = Prod * Var ;

putlog "(3)" _all_;

    Sum = Sum (Sum, Var) ;

putlog "(4)" _all_;

  End ;

  Mean = Sum / Mcount ;

putlog "(5)" _all_;

Run ;

robertrao
Quartz | Level 8

Sorry about that.

I figured it out.

I first ran it until putlog "(1)" _all_;

WARNING: The variable Sum in the DROP, KEEP, or RENAME list has never been referenced.

WARNING: The variable Mean in the DROP, KEEP, or RENAME list has never been referenced.

(1)Prod=1 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 _ERROR_=0 _N_=1

(1)Prod=1 Count=2 Last.Id=0 id=1 var=. FIRST.id=0 _ERROR_=0 _N_=1

(1)Prod=1 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 _ERROR_=0 _N_=1

(1)Prod=1 Count=1 Last.Id=0 id=2 var=. FIRST.id=1 _ERROR_=0 _N_=2

(1)Prod=1 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 _ERROR_=0 _N_=2

(1)Prod=1 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 _ERROR_=0 _N_=2

Compared that output with running until putlog "(2)" _all_;

WARNING: The variable Sum in the DROP, KEEP, or RENAME list has never been referenced.

WARNING: The variable Mean in the DROP, KEEP, or RENAME list has never been referenced.

(2)Prod=1 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 _ERROR_=0 _N_=1

(2)Prod=1 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 _ERROR_=0 _N_=1

(2)Prod=1 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 _ERROR_=0 _N_=2

(2)Prod=1 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 _ERROR_=0 _N_=2

Putlog(2) output will not have var=. because continue passes to the bottom of the loop directly if the var=.

So of the 6 obs we now have only 4 observation

IS that right?

Thanks

art297
Opal | Level 21

MyBad!  I had two step number 4s.  I meant for you to run the following in one run .. not try to run it in parts:

data a;

  input id var;

  cards;

1 10

1 .

1 20

2 .

2 20

2 30

;

Data B ( Keep = Id Prod Sum Count Mean) ;

  Prod = 1 ;

  Do Count = 1 By 1 Until ( Last.Id ) ;

    Set A ;

    By Id ;

putlog "(1)" _all_;

    If Missing (Var) Then Continue ;

putlog "(2)" _all_;

    Mcount = Sum (Mcount, 1) ;

    Prod = Prod * Var ;

putlog "(3)" _all_;

    Sum = Sum (Sum, Var) ;

putlog "(4)" _all_;

  End ;

  Mean = Sum / Mcount ;

putlog "(5)" _all_;

Run ;

You should end up reading all six records, but only reaching step 5 twice and only outputting two records.

However, I agree with Rick, that this might be too complex for you at this point.  If you can answer your own questions after running the above code (that is to say, as is, not in parts), and answer your own questions by looking at what the log shows, you will at least learned a valuable lesson.

robertrao
Quartz | Level 8

Thats great . I was running in parts just to see what happens at each step.

I already noted the difference between step1 and step2.

I agree with you both on this stuff looking complex to me. But i dont know for good or bad I have the attitude of going into the very depths even though its not required for me right now.

I see that most of the people post Answers on Dow loop. So I wanted to learn this stuff.

Thanks for making me walk through the basic concepts.....

Cheers

art297
Opal | Level 21

Good!  When you are ready, post the answers to your original questions.  BTW, I find that taking advantage of the putlog _all_ statement is an excellent way to gain an understanding of how the datastep functions .. especially when one adds the complexities of a dow-type loop to the process.

robertrao
Quartz | Level 8

Yes I will. Thanks once again

robertrao
Quartz | Level 8

Thanks a lot . I will frame the answer but it will be based on your example........and i will explain the put log  with what is happening in every step. Will post it later today.

Cheers

robertrao
Quartz | Level 8

*The following is the log for the dow loop by using putlog in the code and knowing what happens at each stage.

Arthurs example dataset above is used for analysis

Please let me know if I went wrong

I am worried abt the explanation I gave for Putlog(3).......PLease correct

data a;

  input id var;

  cards;

1 10

1 .

1 20

2 .

2 20

2 30

;

Putlog 1

*For the datastep the PDV is set up at the time of compilation with all the information for the variables as well as the automatic variables(_n_ and _error_). Here missing variables are also included.


20   Data B ( Keep = Id Prod Sum Count Mean) ;
21
22     Prod = 1 ;
23
24     Do Count = 1 By 1 Until ( Last.Id ) ;
25
26       Set A ;
27
28       By Id ;
29
30   putlog "(1)" _all_;
31   END;
32   RUN;

(1)Prod=1 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 _ERROR_=0 _N_=1
(1)Prod=1 Count=2 Last.Id=0 id=1 var=. FIRST.id=0 _ERROR_=0 _N_=1
(1)Prod=1 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 _ERROR_=0 _N_=1
(1)Prod=1 Count=1 Last.Id=0 id=2 var=. FIRST.id=1 _ERROR_=0 _N_=2
(1)Prod=1 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 _ERROR_=0 _N_=2
(1)Prod=1 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 _ERROR_=0 _N_=2

Putlog 2

In this part of the code below since the continue is used if var is missing the contol transfrs the loop to the end statement.

That is why in this stage we have all the information for only the variables with non missing var;


Data B ( Keep = Id Prod Sum Count Mean) ;
34
35     Prod = 1 ;
36
37     Do Count = 1 By 1 Until ( Last.Id ) ;
38
39       Set A ;
40
41       By Id ;
42
43
44
45
     If Missing (Var) Then Continue ;
47
48   putlog "(2)" _all_;
49   END;
50   RUN;

(2)Prod=1 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 _ERROR_=0 _N_=1
(2)Prod=1 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 _ERROR_=0 _N_=1
(2)Prod=1 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 _ERROR_=0 _N_=2
(2)Prod=1 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 _ERROR_=0 _N_=2

Putlog 3

When we use a by group with in the DOW loop SAS keeps track of the DATA within each BY group.

Meaning prod value in this step below  is retained in the group now  when it has to have a value of 1 (we assigned at the beginning and is supposed to retain usually since this variable is being created in this datastep and not coming from dataset A) each time...!!!!!!!!!


51   Data B ( Keep = Id Prod Sum Count Mean) ;
52
53     Prod = 1 ;
54
55     Do Count = 1 By 1 Until ( Last.Id ) ;
56
57       Set A ;
58
59       By Id ;
60
61   If Missing (Var) Then Continue ;
62   Mcount = Sum (Mcount, 1) ;
63
64       Prod = Prod * Var ;
65
66   putlog "(3)" _all_;
67   END;
68   RUN;

(3)Prod=10 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 Mcount=1 _ERROR_=0 _N_=1
(3)Prod=200 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 Mcount=2 _ERROR_=0 _N_=1
(3)Prod=20 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 Mcount=1 _ERROR_=0 _N_=2
(3)Prod=600 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 Mcount=2 _ERROR_=0 _N_=2

Putlog 4

Now in each group the sum value has the cumulative of var because of  the stmnt

sum=sum(sum,var);


69   Data B ( Keep = Id Prod Sum Count Mean) ;
70
71     Prod = 1 ;
72
73     Do Count = 1 By 1 Until ( Last.Id ) ;
74
75       Set A ;
76
77       By Id ;
78
79   If Missing (Var) Then Continue ;
80   Mcount = Sum (Mcount, 1) ;
81
82       Prod = Prod * Var ;
83   Sum = Sum (Sum, Var) ;
84
85   putlog "(4)" _all_;
86   END;
87   RUN;

(4)Prod=10 Count=1 Last.Id=0 id=1 var=10 FIRST.id=1 Mcount=1 Sum=10 _ERROR_=0 _N_=1
(4)Prod=200 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 Mcount=2 Sum=30 _ERROR_=0 _N_=1
(4)Prod=20 Count=2 Last.Id=0 id=2 var=20 FIRST.id=0 Mcount=1 Sum=20 _ERROR_=0 _N_=2
(4)Prod=600 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 Mcount=2 Sum=50 _ERROR_=0 _N_=2

Putlog 5

In the step below 30 is the final sum for ID 1 group and 50 is trhe sum for ID 2 group;

since our data is divided into 2 groups by the BY ID statement

At this point we went Outside the loop and said  Mean = Sum / Mcount ;

Final sum's for each BY variable is divided by the final Mcount  and finallly end up with the 2 observations having value of

Mean=15 for group ID1

Mean=25  for group ID2

88   Data B ( Keep = Id Prod Sum Count Mean) ;
89
90     Prod = 1 ;
91
92     Do Count = 1 By 1 Until ( Last.Id ) ;
93
94       Set A ;
95
96       By Id ;
97
98   If Missing (Var) Then Continue ;
99   Mcount = Sum (Mcount, 1) ;
100
101      Prod = Prod * Var ;
102  Sum = Sum (Sum, Var) ;
103  END;
104
105                 /*notice that the mean stmnt is after the END stmnt*/
106
107    Mean = Sum / Mcount ;
108
109  putlog "(5)" _all_;
110
111  Run ;

(5)Prod=200 Count=3 Last.Id=1 id=1 var=20 FIRST.id=0 Mcount=2 Sum=30 Mean=15 _ERROR_=0 _N_=1
(5)Prod=600 Count=3 Last.Id=1 id=2 var=30 FIRST.id=0 Mcount=2 Sum=50 Mean=25 _ERROR_=0 _N_=2

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1179 views
  • 0 likes
  • 2 in conversation