Dear SAS community,
I installed the University edition of SAS last week and am now working through the SAS online course no. 2.
The following code (activity 2.04), which is my solution and also the official solution, is supposed to produce a table with 5 rows, one for each value of basin (the one with the highest MaxWindMPH).
However, if I run this, I get a table with 54 rows and multiple entries for identical values of 'Basin'.
***********************************************************;
*  Activity 2.04                                          *;
*  1) Change the WHERE statement to a subsetting IF       *;
*     statement and submit the program. How many rows are *;
*     included in the output table?                       *;
*  2) Move the subsetting IF statement just before the    *;
*     RUN statement and submit the program. How many rows *;
*     are included in the output table?                   *;
*  3) Consider the sequence of the statements in the      *;
*     execution phase. Where is the optimal placement of  *;
*     the subsetting IF statement?                        *;
***********************************************************;

proc sort data=pg2.storm_2017 out=storm2017_sort;
    by Basin MaxWindMPH;
run;

data storm2017_max;
    set storm2017_sort;
    by Basin;
    if last.Basin=1;
    StormLength=EndDate-StartDate;
    MaxWindKM=MaxWindMPH*1.60934;
run;
Now, my question: Is there simply a mistake in this solution (which, just by chance, I also made in my solution) or has there been an update in SAS which changed the behaviour/usage of last.myvariable/first.myvariable?
I naively expected that this code sorts the input data by Basin and then by descending wind speed, and that last.Basin=1 keeps only the last entry for each basin, the one with the highest wind speed.
I am using SAS Studio for my studies.
Thanks for your help,
Miriam
If you want the data sorted by descending values, you need to add DESCENDING before the variable in the BY statement of the sort. Otherwise the data sorts from smallest to largest by default, and the last row for each Basin holds the greatest MaxWindMPH.
proc sort data=pg2.storm_2017 out=storm2017_sort;
    by Basin descending MaxWindMPH;
run;
But a descending sort does not match the apparent requirement if you use LAST.Basin. With descending MaxWindMPH, the greatest value is on the first row of each group, so you would want FIRST.Basin instead.
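To make the pairing concrete, here is a minimal sketch of the two combinations that should each keep the row with the highest MaxWindMPH per Basin. The input table pg2.storm_2017 and the variable names come from the question; the output table names (storm2017_asc, storm2017_desc, storm2017_max_a, storm2017_max_b) are made up for illustration.

```sas
/* Option A: default ascending sort, keep the LAST row of each Basin group */
proc sort data=pg2.storm_2017 out=storm2017_asc;   /* hypothetical output name */
    by Basin MaxWindMPH;
run;

data storm2017_max_a;                              /* hypothetical output name */
    set storm2017_asc;
    by Basin;
    if last.Basin;      /* subsetting IF: only the last row per Basin continues */
    StormLength=EndDate-StartDate;
    MaxWindKM=MaxWindMPH*1.60934;
run;

/* Option B: descending sort, keep the FIRST row of each Basin group */
proc sort data=pg2.storm_2017 out=storm2017_desc;  /* hypothetical output name */
    by Basin descending MaxWindMPH;
run;

data storm2017_max_b;                              /* hypothetical output name */
    set storm2017_desc;
    by Basin;           /* grouping only needs Basin; the sort order decides which row is first */
    if first.Basin;     /* the highest wind speed is now on the first row */
    StormLength=EndDate-StartDate;
    MaxWindKM=MaxWindMPH*1.60934;
run;
```

Both variants place the subsetting IF before the assignment statements, so the two calculations run only for the rows that are kept. Mixing the options (a descending sort with last.Basin) would keep the lowest wind speed per Basin instead.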
Please show the LOG of the code you actually ran. If the source data set and your output both have 54 records, the likely cause is that the subsetting IF statement was not used correctly.
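If the log alone does not explain the row count, one possible check (a sketch, assuming the sorted table storm2017_sort and the output table storm2017_max from the question exist in the WORK library) is to write the FIRST. and LAST. flags to the log and then count the output rows per Basin:

```sas
/* Write the BY-group flags for every row to the log; no output table is created */
data _null_;
    set storm2017_sort;
    by Basin;
    putlog _N_= Basin= MaxWindMPH= first.Basin= last.Basin=;
run;

/* Count the rows per Basin in the output table; each Basin should appear once */
proc freq data=storm2017_max;
    tables Basin / nocum nopercent;
run;
```

A Basin that has more than one row in the PROC FREQ output would suggest that the subsetting IF did not run as intended in the step that created storm2017_max.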
Hello 🙂
Oh yes yes of course, that makes sense!
It now worked.
Thanks again,
Miriam