Dear SAS community,
I installed the University edition of SAS last week and am now working through the SAS online course no. 2.
The following code (activity 2.04), which is my solution and also the official solution, is supposed to produce a table with 5 rows, one for each value of basin (the one with the highest MaxWindMPH).
However, if I run this, I get a table with 54 rows and multiple entries for identical values of 'Basin'.
***********************************************************;
*  Activity 2.04                                          *;
*  1) Change the WHERE statement to a subsetting IF       *;
*     statement and submit the program. How many rows are *;
*     included in the output table?                       *;
*  2) Move the subsetting IF statement just before the    *;
*     RUN statement and submit the program. How many rows *;
*     are included in the output table?                   *;
*  3) Consider the sequence of the statements in the      *;
*     execution phase. Where is the optimal placement of  *;
*     the subsetting IF statement?                        *;
***********************************************************;

proc sort data=pg2.storm_2017 out=storm2017_sort;
    by Basin MaxWindMPH;
run;

data storm2017_max;
    set storm2017_sort;
    by Basin;
    if last.Basin=1;
    StormLength=EndDate-StartDate;
    MaxWindKM=MaxWindMPH*1.60934;
run;
Now, my question: Is there simply a mistake in this solution (which, just by chance, I also made in my solution) or has there been an update in SAS which changed the behaviour/usage of last.myvariable/first.myvariable?
I naively expected that this code sorts the input data by Basin and then by descending wind speed, and that last.Basin=1 keeps only the last entry for each basin, the one with the highest wind speed.
I am using SAS Studio for my studies.
Thanks for your help,
Miriam
If you want the data sorted by descending values, you need to add the DESCENDING keyword in the BY statement of the sort. Otherwise the data sorts in the default ascending order (smallest to largest), so the last observation in each Basin group has the greatest MaxWindMPH.
proc sort data=pg2.storm_2017 out=storm2017_sort;
    by Basin descending MaxWindMPH;
run;
But that does not match the apparent requirement if you use LAST.Basin. With descending MaxWindMPH, the greatest value in each Basin group comes first, so you would want FIRST.Basin instead.
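To make the pairing of sort order and FIRST./LAST. concrete, here is a minimal sketch on made-up data. The data set work.storms_demo and its five rows are hypothetical and are not the course table pg2.storm_2017. Under that assumption, both variants should keep the same row per Basin, the one with the greatest MaxWindMPH.

```sas
/* Hypothetical toy data, NOT the course table pg2.storm_2017 */
data work.storms_demo;
    input Basin $ MaxWindMPH;
    datalines;
NA 100
EP 85
NA 155
EP 120
NA 60
;
run;

/* Variant A: ascending sort, so the greatest value is the LAST row per Basin */
proc sort data=work.storms_demo out=work.demo_asc;
    by Basin MaxWindMPH;
run;

data work.max_via_last;
    set work.demo_asc;
    by Basin;
    if last.Basin;      /* subsetting IF: keep only the last row of each group */
run;

/* Variant B: descending sort, so the greatest value is the FIRST row per Basin */
proc sort data=work.storms_demo out=work.demo_desc;
    by Basin descending MaxWindMPH;
run;

data work.max_via_first;
    set work.demo_desc;
    by Basin;
    if first.Basin;     /* subsetting IF: keep only the first row of each group */
run;
```

Either variant leaves one row per Basin. Mixing them (descending sort with last.Basin, or ascending sort with first.Basin) would keep the smallest value instead.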
Show the LOG of the code you actually ran. If the source data set and your output both have 54 records, one would suspect that the subsetting IF statement was not used correctly.
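One way to see what the subsetting IF is working with is to write the automatic FIRST./LAST. flags to the log. This is only a diagnostic sketch; it assumes the sorted table storm2017_sort from the original post already exists in the WORK library.

```sas
/* Diagnostic only: DATA _NULL_ creates no output table */
data _null_;
    set storm2017_sort;
    by Basin;
    /* Writes one line per observation to the log, including both flags */
    put Basin= MaxWindMPH= first.Basin= last.Basin=;
run;
```

If last.Basin is 1 on exactly one line per Basin value, the BY-group processing is behaving as expected and the problem lies elsewhere in the submitted code. The notes that SAS writes to the log after each step also report how many observations were read and written, which is why posting the log helps.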
Hello 🙂
Oh yes yes of course, that makes sense!
It now worked.
Thanks again,
Miriam