I am trying to find the last row per year in a data set. Adding an extra column is not possible in the real case. I have created a small example, which obviously does not work. Is there any smooth way?
DATA test;
infile cards dlm=',' dsd;
INPUT DATE1 :DATE9. count: 1.;
FORMAT DATE1 DDMMYY10. count 1.;
CARDS;
11MAY2009, 1
12MAY2009, 1
13MAY2009, 1
14MAY2009, 5
11MAY2010, 1
16MAY2010, 1
17MAY2010, 1
19MAY2010, 7
;
RUN;
data test1 ;
set test ;
by year(date1); /* GROUP PER YEAR */
if last.year;
run;
Yes, use a format and the GROUPFORMAT option on the BY statement.
Please paste your code in a text box, and make sure it runs before posting. Your data step did not work correctly for me.
DATA test;
infile cards dlm=',' dsd;
informat date1 date9. count 8.;
INPUT DATE1 count ;
FORMAT DATE1 DDMMYY10. count 8.;
CARDS;
11MAY2009, 1
12MAY2009, 1
13MAY2009, 1
14MAY2009, 5
11MAY2010, 1
16MAY2010, 1
17MAY2010, 1
19MAY2010, 7
;
RUN;
data test1 ;
set test ;
by date1 groupformat; /* GROUP PER YEAR */
format date1 year4.;
if last.date1;
run;
@Jeg123 wrote:
I am trying to find the last row per year in a data set. Adding an extra column is not possible in the real case. I have created a small example, which obviously does not work. Is there any smooth way?
DATA test;
infile cards dlm=',' dsd;
INPUT DATE1 :DATE9. count: 1.;
FORMAT DATE1 DDMMYY10. count 1.;
CARDS;
11MAY2009, 1
12MAY2009, 1
13MAY2009, 1
14MAY2009, 5
11MAY2010, 1
16MAY2010, 1
17MAY2010, 1
19MAY2010, 7
;
RUN;data test1 ;
set test ;
by year(date1); /* GROUP PER YEAR */
if last.year;
run;
Adding an extra column is not possible in the real case.
Wow, that's quite a strong restriction. SAS makes no such restriction. I can't even imagine why you would not be able to add the extra column in a temporary data set and then the desired BY statement works properly.
@Jeg123 wrote:
Strange that SAS does not allow in, while it works in PROC SQL
Please show an example of BY processing in Proc SQL. "GROUP BY" is not the same as By processing. Group By collects records on output, BY indicates an order and grouping on input from data.
@Jeg123 wrote:
Strange that SAS does not allow in, while it works in PROC SQL
Most of SAS is not SQL, it doesn't have to allow the same things as SQL. If you want most of SAS to behave the same as SQL, you are going to be disappointed and struggle with SAS.
But you haven't addressed the key question, can you create a temporary data set and then add a column to the temporary data set? Of course you can do this, why would this be not allowed?
You can't use a function in a BY statement. But there are a couple of workarounds. If you don't consider this cheating, you could add YEAR to a view and process the view:
data test2 / view=test2;
set test;
year = year(date1);
run;
data want;
set test2;
by year;
if last.year;
drop year;
run;
Technically, you never actually save YEAR values in a data set.
Alternatively, you can use:
data want;
set test end=done;
prior_date = lag(date1);
drop prior_date;
if done then output;
if year(date1) ne year(prior_date) then do;
_n_ = _n_ - 1;
if _n_ > 0;
set test point=_n_;
output;
end;
if done then output;
run;
If you can't change the original table, then take a copy of it, and change the copy?
Yes, use a format and the GROUPFORMAT option on the BY statement.
Please paste your code in a text box, and make sure it runs before posting. Your data step did not work correctly for me.
DATA test;
infile cards dlm=',' dsd;
informat date1 date9. count 8.;
INPUT DATE1 count ;
FORMAT DATE1 DDMMYY10. count 8.;
CARDS;
11MAY2009, 1
12MAY2009, 1
13MAY2009, 1
14MAY2009, 5
11MAY2010, 1
16MAY2010, 1
17MAY2010, 1
19MAY2010, 7
;
RUN;
data test1 ;
set test ;
by date1 groupformat; /* GROUP PER YEAR */
format date1 year4.;
if last.date1;
run;
@Jeg123 wrote:
I am trying to find the last row per year in a data set. Adding an extra column is not possible in the real case. I have created a small example, which obviously does not work. Is there any smooth way?
DATA test;
infile cards dlm=',' dsd;
INPUT DATE1 :DATE9. count: 1.;
FORMAT DATE1 DDMMYY10. count 1.;
CARDS;
11MAY2009, 1
12MAY2009, 1
13MAY2009, 1
14MAY2009, 5
11MAY2010, 1
16MAY2010, 1
17MAY2010, 1
19MAY2010, 7
;
RUN;data test1 ;
set test ;
by year(date1); /* GROUP PER YEAR */
if last.year;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.