Hello
I have a dataset, the number of observations in it will vary frrom day to day. I would like to be able to create a dataset from it where only the last 5 observations are output to a new dataset. I will not have numerical column I could use. The below is representative of the data set I wish to subset. What would be the most efficient way to create a new data set with just the last 5 observations. Thank you.
DATA WORK.SET_1;
INPUT YYMM $8. VAR2 $3. VAR3 $3.;
DATALINES;
2016-01 A1 B1
2016-01 A2 B2
2016-01 A3 B3
2016-01 A4 B4
2016-02 A5 B5
2016-03 A6 B6
2016-04 A7 B7
2016-04 A8 B8
2016-04 A9 B9
run;
DATA WORK.SET_1;
INPUT YYMM $8. VAR2 $3. VAR3 $3.;
DATALINES;
2016-01 A1 B1
2016-01 A2 B2
2016-01 A3 B3
2016-01 A4 B4
2016-02 A5 B5
2016-03 A6 B6
2016-04 A7 B7
2016-04 A8 B8
2016-04 A9 B9
;;;;
run;
proc print;
run;
data last5;
do point=nobs-4 to nobs;
set set_1 point=point nobs=nobs;
output;
end;
stop;
run;
proc print;
run;
DATA WORK.SET_1;
INPUT YYMM $8. VAR2 $3. VAR3 $3.;
DATALINES;
2016-01 A1 B1
2016-01 A2 B2
2016-01 A3 B3
2016-01 A4 B4
2016-02 A5 B5
2016-03 A6 B6
2016-04 A7 B7
2016-04 A8 B8
2016-04 A9 B9
;;;;
run;
proc print;
run;
data last5;
do point=nobs-4 to nobs;
set set_1 point=point nobs=nobs;
output;
end;
stop;
run;
proc print;
run;
Thank you very much, quick and simple. I greatly appreciate it!!
I think my code is faster than John King . DATA WORK.SET_1; INPUT YYMM $8. VAR2 $3. VAR3 $3.; DATALINES; 2016-01 A1 B1 2016-01 A2 B2 2016-01 A3 B3 2016-01 A4 B4 2016-02 A5 B5 2016-03 A6 B6 2016-04 A7 B7 2016-04 A8 B8 2016-04 A9 B9 ;;;; run; data want; set set_1 nobs=nobs; if _n_=nobs-4 or _n_=nobs-3 or _n_=nobs-2 or _n_=nobs-1 or _n_=nobs then output; run;
Thank you. While you state his code is faster, you did show me a use of _n_ I was not aware of. This is very valuable!! I actually ended up going wiht a PROC SQL step to get the count, then used the count in a macro, then the macro to limit my outputs.
PROC SQL;
CREATE TABLE WORK.OBS_MACRO AS
SELECT count(Month_Day) AS MAX_CT
FROM WORK.IFF_IN_OUT_CHANGE4;
QUIT;
*make macro for last 5 obs;
DATA _NULL_;
SET WORK.OBS_MACRO;
CALL SYMPUTX ('MAXCT',MAX_CT,'g');
RUN;
%put &MAXCT;
DATA WORK.IN_OUT_LAST5;
SET WORK.IFF_IN_OUT_CHANGE4;
IF _N_ >= (&MAXCT-4) THEN OUTPUT;
RUN;
You can code as simple as : DATA WORK.IN_OUT_LAST5; SET WORK.IFF_IN_OUT_CHANGE4 NOBS=NOBS; IF _N_ >= (NOBS-4) THEN OUTPUT; RUN;
This has truely been a great leanring experience for me. I had not thought of the ability to use the NOBS in that manner. I greatly appreciate your expertice!!
Never mind. John's code is faster than me after testing .
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.