miriam93
Fluorite | Level 6

Dear SAS community,

 

I installed SAS University Edition last week and am now working through the SAS online course no. 2.

 

The following code (activity 2.04), which is both my solution and the official solution, is supposed to produce a table with 5 rows, one for each value of Basin (the row with the highest MaxWindMPH).

 

However, if I run this, I get a table with 54 rows and multiple entries for identical values of 'Basin'.

 

 

***********************************************************;
*  Activity 2.04                                          *;
*  1) Change the WHERE statement to a subsetting IF       *;
*     statement and submit the program. How many rows are *;
*     included in the output table?                       *;
*  2) Move the subsetting IF statement just before the    *;
*     RUN statement and submit the program. How many rows *;
*     are included in the output table?                   *;
*  3) Consider the sequence of the statements in the      *;
*     execution phase. Where is the optimal placement of  *;
*     the subsetting IF statement?                        *;
***********************************************************;
proc sort data=pg2.storm_2017 out=storm2017_sort;
	by Basin MaxWindMPH;
run;

data storm2017_max;
    set storm2017_sort;
    by Basin;
    if last.Basin=1;
    StormLength=EndDate-StartDate;
    MaxWindKM=MaxWindMPH*1.60934;
run;

 

Now, my question: Is there simply a mistake in this solution (which, just by chance, I also made in my solution) or has there been an update in SAS which changed the behaviour/usage of last.myvariable/first.myvariable?

I naively expected that this code sorts the input data by Basin and then by descending wind speed, and that last.Basin=1 keeps only the last entry for each basin, the one with the highest wind speed.

 

I am using SAS Studio for my studies.

 

Thanks for your help,

 

Miriam

 

 

Accepted Solutions
ballardw
Super User

If you want the data sorted by descending values, you need to add the DESCENDING keyword before that variable in the BY statement of PROC SORT. Otherwise the data sorts from smallest to largest by default, and the last row for each Basin has the greatest MaxWindMPH.

 

proc sort data=pg2.storm_2017 out=storm2017_sort;
	by Basin descending MaxWindMPH;
run;

But that does not match the apparent requirement if you use LAST.Basin. With descending MaxWindMPH, the greatest MaxWindMPH value in each group sits on the FIRST.Basin row, so you would want FIRST.Basin instead.
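To make the pairing concrete, here is a minimal sketch of the two consistent combinations: ascending sort with LAST.Basin, or descending sort with FIRST.Basin. The input table and variables are the ones from the question; the output table names (storm2017_asc, storm2017_desc, storm2017_max_a, storm2017_max_b) are made up for illustration.

```sas
/* Option A: default (ascending) sort, keep the LAST row of each Basin */
proc sort data=pg2.storm_2017 out=storm2017_asc;
    by Basin MaxWindMPH;
run;

data storm2017_max_a;
    set storm2017_asc;
    by Basin;
    if last.Basin;    /* subsetting IF: only the last row per Basin continues */
    StormLength = EndDate - StartDate;
    MaxWindKM = MaxWindMPH * 1.60934;
run;

/* Option B: descending sort on MaxWindMPH, keep the FIRST row of each Basin */
proc sort data=pg2.storm_2017 out=storm2017_desc;
    by Basin descending MaxWindMPH;
run;

data storm2017_max_b;
    set storm2017_desc;
    by Basin;
    if first.Basin;   /* the highest MaxWindMPH is now first in each group */
    StormLength = EndDate - StartDate;
    MaxWindKM = MaxWindMPH * 1.60934;
run;
```

Both versions should give one row per Basin. If several storms in a basin tie for the top wind speed, the two versions may keep different rows among the tied ones.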

Show the LOG of the code you actually ran. If the source data set and your output both have 54 records, I would suspect that the subsetting IF statement was not used correctly.
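If it helps with the debugging, one way to see what FIRST.Basin and LAST.Basin look like on each row is to copy them into ordinary variables and write them to the log. This is only a sketch: it assumes the sorted table storm2017_sort from the question already exists, and the names IsFirst and IsLast are made up for illustration.

```sas
data _null_;                 /* _null_: no output table, log messages only */
    set storm2017_sort;
    by Basin;
    /* FIRST. and LAST. variables are temporary and are not written to   */
    /* output tables, so copy them into regular variables for display.   */
    IsFirst = first.Basin;
    IsLast  = last.Basin;
    putlog Basin= MaxWindMPH= IsFirst= IsLast=;
run;
```

Each Basin group should show IsFirst=1 on exactly one row and IsLast=1 on exactly one row. A subsetting IF on last.Basin keeps only the rows where IsLast=1.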

 

 

miriam93
Fluorite | Level 6

Hello 🙂

Oh yes yes of course, that makes sense!
It now worked.

 

Thanks again,

Miriam
