Hello,
I am getting the right answers from my program, but I do not understand what the last.code statement is actually doing.
In the program pictured below, last.code; is included.
The output is:
In the second program, I omitted the last.code; statement. The output is as follows:
I do not understand why removing just the last.code; statement means the output is no longer grouped by code. I would appreciate help in understanding how last.var works. TIA!!
@SASsusrik wrote:
Also, if I did not have an if last.code statement, I am presuming it automatically outputs the last variable values from the first.code statement. I hope I am correct in assuming this. TIA!
The way you have written this, and if I am understanding you properly, I would say the answer is NO. You are not correct. The command
if last.date;
causes only the last record for each date to be output. If you leave this out, then every record for every date is output. It works independently of any first.date command.
And it's not 'last.code', and it is not 'last.var', which won't work here; it is last.date.
if last.code;
This IF statement tells the DATA step which records to output to the data set: only records for which last.code is true are written. All other records are not output.
Which record(s) get output? The last record for each combination of values of the variables ID and CODE. In this case, that last record contains the number of observations and the total miles for that combination of ID and CODE.
If you omit this IF statement, then all records are output.
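To make this concrete, here is a minimal sketch of the pattern being described. The original program was posted as a picture, so the data set names (trips, totals), the variable names n_obs and total_miles, and the sample rows are all assumptions made up for illustration.

```sas
/* Hypothetical input: one row per trip, with ID, CODE and MILES. */
data trips;
  input id code $ miles;
  datalines;
1 A 10
1 A 15
1 B 7
2 A 20
2 A 5
;
run;

/* BY-group processing requires the data to be sorted by the BY variables. */
proc sort data=trips;
  by id code;
run;

data totals;
  set trips;
  by id code;
  if first.code then do;   /* first row of an ID/CODE group: reset the counters */
    n_obs = 0;
    total_miles = 0;
  end;
  n_obs + 1;               /* sum statements: values carry over to the next row */
  total_miles + miles;
  if last.code;            /* subsetting IF: keep only the last row of each group */
  drop miles;
run;
```

With this sample data, totals has three rows, one per ID/CODE combination, each holding the final count and total. If the if last.code; line is removed, all five input rows are written, each showing the running count and running total at that point in its group.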
Oh awesome, Thanks Paige,
I have another follow-up question regarding the initialization of variables in the DATA step:
This is the code:
and the partial output (which is what I am looking for):
However, if I explicitly use a var statement and an assignment, it prints booksales as empty values. The total sales is correct, however. I presume I have initialized the variables in the if first.date conditional. Can you please shed some light on why booksales is showing as empty? TIA!!
partial output
This construct creates cumulative sums, where on every record SALES is added to the value of BOOKSALES from the previous record.
booksales + sales;
It keeps adding sales into booksales on every record. But you don't want that, you want to set the booksales back to zero for each date.
But since you don't have a variable named SALES, but you do have a variable named BOOKSALES, I don't think the code as written would work. I think you want
sales+booksales;
Then last.date outputs the final cumulative sales to the data set.
Lastly, you are struggling with coding that is much simpler in PROC SUMMARY. SAS has already done the hard work of figuring out how to compute sums across many variables, and PROC SUMMARY lets you compute the total sales for each DATE (or for any other variable you want, such as COUNTRY or STATE) with much less programming and much less logic to trip you up. I suggest you learn and use PROC SUMMARY for this task. Note that if you had 10 variables instead of these 4, you would just add the variables to the VAR statement; you don't need to set each one to zero and then do the summation on each variable.
proc summary data=datetypesort nway;
class date;
var booksales cardsales persales totalsales;
output out=daysales sum=;
run;
Thanks Paige, appreciate your help in understanding the concepts and how they work.
Sure, I am on this chapter trying to learn these concepts and was working out some homework problems. I will look up PROC SUMMARY too. TY!
Also I am sorry the problem does ask for the booksales. I did not share the problem statement. But I do get the idea of what you are trying to convey. TY Paige!
booksales = booksales + sales;
is a simple assignment statement, with a calculation on the right side of the equal sign.
booksales + sales;
is a Sum statement.
A Sum statement implies an automatic RETAIN and will treat missing arguments as zero.
The normal assignment does neither, and so your booksales is set to missing at the second observation of a group.
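The difference is easy to see side by side. The following is a small sketch, not the original program: the data set names (demo, compare), the variables total_assign and total_sum, and the sample values are invented for the illustration.

```sas
/* Hypothetical input: two dates with two sales rows each. */
data demo;
  input date :date9. sales;
  format date date9.;
  datalines;
01JAN2020 10
01JAN2020 20
02JAN2020 5
02JAN2020 7
;
run;

data compare;
  set demo;
  by date;
  if first.date then do;   /* reset both totals at the start of each date */
    total_assign = 0;
    total_sum = 0;
  end;
  /* Plain assignment: total_assign is not retained, so it is missing
     again at the top of every iteration unless it was just reset.    */
  total_assign = total_assign + sales;
  /* Sum statement: total_sum is retained automatically, and missing
     arguments are treated as zero.                                   */
  total_sum + sales;
run;
```

In compare, total_sum holds 10, 30, 5, 12, while total_assign holds 10, missing, 5, missing: on the second row of each date it starts the iteration as missing, and missing plus a number is missing. Adding retain total_assign; to the step would make the plain assignment behave like the sum statement here.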
Do this:
proc summary data=datetypesort nway;
class date merchtype;
var sales;
output out=daysales sum()=;
run;
If you want the wide structure, you can transpose this or use PROC REPORT with ACROSS.
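As a rough sketch of the transpose route, the step below turns the long daysales output into one row per date. The sample rows, the merchtype values book and card, and the output name daysales_wide are assumptions for illustration, and the OUTPUT statement is written in the sum= form used earlier in the thread.

```sas
/* Hypothetical long-format input: one row per sale. */
data datetypesort;
  input date :date9. merchtype $ sales;
  format date date9.;
  datalines;
01JAN2020 book 10
01JAN2020 book 20
01JAN2020 card 5
02JAN2020 book 7
02JAN2020 card 3
;
run;

/* One row per DATE and MERCHTYPE, with SALES summed. */
proc summary data=datetypesort nway;
  class date merchtype;
  var sales;
  output out=daysales sum=;
run;

/* Wide structure: one row per DATE, one column per MERCHTYPE value.
   SUFFIX= appends "sales" to each ID value, giving booksales, cardsales. */
proc transpose data=daysales out=daysales_wide(drop=_name_) suffix=sales;
  by date;
  id merchtype;
  var sales;
run;
```

This relies on the merchtype values being usable as variable names. The output of PROC SUMMARY with CLASS and NWAY is already ordered by date, so the BY statement in PROC TRANSPOSE needs no extra sort.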
Thanks Kurt, will read up on those links! Appreciate it!
I STRONGLY suggest that you study the documentation (Maxim 1!) of the BY Statement and BY-Group Processing in the DATA Step.
A statement such as if last.code; is a subsetting IF.
If the condition does not evaluate to true, the rest of the code is skipped for the current iteration of the DATA step, and execution starts for the next iteration. Or, if you want to say so, the program pointer jumps to the "top" of the DATA step.
first. and last. variables are created for every variable included in the BY statement. The last. is set to true when the last observation for the current group of the particular BY variable is read.
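Because the first. and last. variables are automatic and are not written to the output data set, one way to look at them is to copy them into ordinary variables. This is only an illustrative sketch; the data set names (have, flags) and the sample rows are made up.

```sas
/* Hypothetical input, already sorted by ID and CODE. */
data have;
  input id code $;
  datalines;
1 A
1 A
1 B
2 B
;
run;

data flags;
  set have;
  by id code;
  /* Copy the automatic flags so they appear in the output. */
  first_id   = first.id;
  last_id    = last.id;
  first_code = first.code;
  last_code  = last.code;
run;

proc print data=flags noobs;
run;
```

Each flag is 1 (true) or 0 (false). On the fourth row the value of code is still B, yet first.code is 1, because a change in an earlier BY variable (id) also starts a new group for every BY variable that follows it.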