BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jkl123
Obsidian | Level 7
Hello everyone,

I have some questions about the following code and its answer choice. This is a problem of sas practice for base exam.
The right answer is A.

(1) In the user-defined format, the right side of "=" is some date formats, which are put inside
of []. Is this true for all the cases of any format on the right?

(2) I thought this program subsets all the observations by year 1980 and later, not 1991.
The second where statement overwrites the first one. Any comments? Thanks.


title; proc format; value yrgroup low - '31dec1991'd = [year.] '01jan1992'd - high = [monyy.] ; run; proc print data=SASHelp.retail noobs; where date > '01Jan1991'd; var date sales; format date yrgroup.; where date > '01Jan1980'd; run;
Which statement describes the results?
 
Select one:
The report generated by the PRINT procedure will summarize sales by year through 1991 and after by month and year.
The FORMAT procedure will fail but the PRINT procedure will summarize by date disregarding the FORMAT statement.
The FORMAT procedure will fail to create a format and thus the PRINT procedure will fail as the format cannot be found.
The PRINT procedure will fail as there are conflicting WHERE statements.
1 ACCEPTED SOLUTION

Accepted Solutions
jimbarbour
Meteorite | Level 14

The answer key appears to be just plain wrong.

 

If two WHERE clauses are specified, SAS takes the most recent one and replaces any earlier WHERE clauses.

 

I just ran your code and got this message in my log:

36         proc print data=SASHelp.retail noobs;
37            where date > '01Jan1991'd;
38            var date sales;
39            format date yrgroup.;
40            where date > '01Jan1980'd;
NOTE: WHERE clause has been replaced.
41         run;

In other words, your data will be subsetted by 01JAN1980 not 01JAN1991.

 

Jim

 

Results are below.  Correct me if I am wrong, but I believe that I see data from January 1980 onward.

Practice_Exam_WHERE_Clause_Results_2020-09-15_17-46-26.jpg

View solution in original post

4 REPLIES 4
jimbarbour
Meteorite | Level 14

The answer key appears to be just plain wrong.

 

If two WHERE clauses are specified, SAS takes the most recent one and replaces any earlier WHERE clauses.

 

I just ran your code and got this message in my log:

36         proc print data=SASHelp.retail noobs;
37            where date > '01Jan1991'd;
38            var date sales;
39            format date yrgroup.;
40            where date > '01Jan1980'd;
NOTE: WHERE clause has been replaced.
41         run;

In other words, your data will be subsetted by 01JAN1980 not 01JAN1991.

 

Jim

 

Results are below.  Correct me if I am wrong, but I believe that I see data from January 1980 onward.

Practice_Exam_WHERE_Clause_Results_2020-09-15_17-46-26.jpg

jkl123
Obsidian | Level 7
Thank you so much, Jim. This is also what I got before. It's unusual to see
something wrong in the official practice. But anyway, it's a good idea to
run the program.

jimbarbour
Meteorite | Level 14

Yes, when in doubt, run the code.  Actual code trumps all answer keys.  🙂

 

I suppose this topic is now "solved," at least in the sense that we know what's going on.  Far better to know what's going on than to try to talk yourself into accepting the answer key's solution and thereby mis-learning SAS.

 

Jim

Tom
Super User Tom
Super User

The code as posted runs without errors. There are some NOTES produced by both steps.

188   proc format;
189      value yrgroup low - '31dec1991'd = [year.]
NOTE: The YEAR (in)format was specified on the right-hand side of an equal sign, but without a length
      specification. PROC FORMAT will assume a default length of at least 40 for the format being
      generated. If this is an insufficient width, you can rerun PROC FORMAT with an explicit width for
      the YEAR (in)format, or provide a sufficient DEFAULT= option.
190                    '01jan1992'd - high = [monyy.]
NOTE: The MONYY (in)format was specified on the right-hand side of an equal sign, but without a length
      specification. PROC FORMAT will assume a default length of at least 40 for the format being
      generated. If this is an insufficient width, you can rerun PROC FORMAT with an explicit width for
      the MONYY (in)format, or provide a sufficient DEFAULT= option.
191   ;
NOTE: Format YRGROUP has been output.
192   run;

NOTE: PROCEDURE FORMAT used (Total process time):
      real time           0.09 seconds
      cpu time            0.03 seconds


193
194   proc print data=SASHelp.retail noobs;
195      where date > '01Jan1991'd;
196      var date sales;
197      format date yrgroup.;
198      where date > '01Jan1980'd;
NOTE: WHERE clause has been replaced.
199   run;

NOTE: There were 57 observations read from the data set SASHELP.RETAIL.
      WHERE date>'01JAN1980'D;
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.21 seconds
      cpu time            0.06 seconds

Let's look at each answer and see how it stacks up to reality.  Then perhaps we can pick a least bad answer.

Let's do in reverse order since they claim you should pick A.

D The PRINT procedure will fail as there are conflicting WHERE statements.

There is no such things as conflicting WHERE statements. The last one cancels the earlier ones. You can see that in the notes in the SAS log.

C The FORMAT procedure will fail to create a format and thus the PRINT procedure will fail as the format cannot be found.

The is no obvious error with the format step.  The only issue is that the nested format specifications do not have widths specified. But as you can see from the notes in the log that does not stop the format from being created.  And a data step will only fail if the format is not found if you have set the FMTERR option. They do not mention anything about that being set.

B The FORMAT procedure will fail but the PRINT procedure will summarize by date disregarding the FORMAT statement.

So this is like C but they say that PRINT will ignore the missing format.  So that part is right.  But the first part is wrong.  But the PROC PRINT step does not "summarize" by anything.  Perhaps if you included a SUM statement for one or more of the variables in the VAR statement it might show totals for that variable.  Or perhaps if you add the N option to the PROC PRINT statement it would show a count of the number of observations.  

A The report generated by the PRINT procedure will summarize sales by year through 1991 and after by month and year.

This one is a little closer to the truth.  The PROC PRINT output will DISPLAY the date values as just the YEAR for dates before 1992 and as 3 letter month and year for the later dates.  But again the PROC PRINT is not "summarizing" anything.

 

Part of the output. You can see that there appears to be four observations for each year. Based on the later values apparently each on is for a date that is in a different month of that year.

 

image.png

 

So A is the best answer as it is the only one that describes what the format does (or even attempts to describe it).  So if I had to pick one I would choose A.  Is this strictly a multiple choice test? Or are you allowed to explain why you picked a given answer?

 

If you wanted to summarize by DATE then use another procedure.  Like PROC MEANS.

proc means data=sashelp.retail ;
   where date > '01Jan1990'd; 
   where also date < '01JAN1993'd;
   format date yrgroup.;
   class date;
   var sales;
run;
The MEANS Procedure

           Analysis Variable : SALES Retail sales in millions of $

            N
DATE      Obs    N           Mean        Std Dev        Minimum        Maximum
------------------------------------------------------------------------------
1990        3    3    709.3333333     37.6873100    674.0000000    749.0000000

1991        4    4    736.7500000     48.0303723    703.0000000    807.0000000

JAN1992     1    1    692.0000000              .    692.0000000    692.0000000

APR1992     1    1    797.0000000              .    797.0000000    797.0000000

JUL1992     1    1    826.0000000              .    826.0000000    826.0000000

OCT1992     1    1    889.0000000              .    889.0000000    889.0000000
------------------------------------------------------------------------------

 

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1433 views
  • 1 like
  • 3 in conversation