Hello
I have a dataset, the number of observations in it will vary frrom day to day. I would like to be able to create a dataset from it where only the last 5 observations are output to a new dataset. I will not have numerical column I could use. The below is representative of the data set I wish to subset. What would be the most efficient way to create a new data set with just the last 5 observations. Thank you.
DATA WORK.SET_1;
INPUT YYMM $8. VAR2 $3. VAR3 $3.;
DATALINES;
2016-01 A1 B1
2016-01 A2 B2
2016-01 A3 B3
2016-01 A4 B4
2016-02 A5 B5
2016-03 A6 B6
2016-04 A7 B7
2016-04 A8 B8
2016-04 A9 B9
run;
DATA WORK.SET_1;
INPUT YYMM $8. VAR2 $3. VAR3 $3.;
DATALINES;
2016-01 A1 B1
2016-01 A2 B2
2016-01 A3 B3
2016-01 A4 B4
2016-02 A5 B5
2016-03 A6 B6
2016-04 A7 B7
2016-04 A8 B8
2016-04 A9 B9
;;;;
run;
proc print;
run;
data last5;
do point=nobs-4 to nobs;
set set_1 point=point nobs=nobs;
output;
end;
stop;
run;
proc print;
run;
DATA WORK.SET_1;
INPUT YYMM $8. VAR2 $3. VAR3 $3.;
DATALINES;
2016-01 A1 B1
2016-01 A2 B2
2016-01 A3 B3
2016-01 A4 B4
2016-02 A5 B5
2016-03 A6 B6
2016-04 A7 B7
2016-04 A8 B8
2016-04 A9 B9
;;;;
run;
proc print;
run;
data last5;
do point=nobs-4 to nobs;
set set_1 point=point nobs=nobs;
output;
end;
stop;
run;
proc print;
run;
Thank you very much, quick and simple. I greatly appreciate it!!
I think my code is faster than John King . DATA WORK.SET_1; INPUT YYMM $8. VAR2 $3. VAR3 $3.; DATALINES; 2016-01 A1 B1 2016-01 A2 B2 2016-01 A3 B3 2016-01 A4 B4 2016-02 A5 B5 2016-03 A6 B6 2016-04 A7 B7 2016-04 A8 B8 2016-04 A9 B9 ;;;; run; data want; set set_1 nobs=nobs; if _n_=nobs-4 or _n_=nobs-3 or _n_=nobs-2 or _n_=nobs-1 or _n_=nobs then output; run;
Thank you. While you state his code is faster, you did show me a use of _n_ I was not aware of. This is very valuable!! I actually ended up going wiht a PROC SQL step to get the count, then used the count in a macro, then the macro to limit my outputs.
PROC SQL;
CREATE TABLE WORK.OBS_MACRO AS
SELECT count(Month_Day) AS MAX_CT
FROM WORK.IFF_IN_OUT_CHANGE4;
QUIT;
*make macro for last 5 obs;
DATA _NULL_;
SET WORK.OBS_MACRO;
CALL SYMPUTX ('MAXCT',MAX_CT,'g');
RUN;
%put &MAXCT;
DATA WORK.IN_OUT_LAST5;
SET WORK.IFF_IN_OUT_CHANGE4;
IF _N_ >= (&MAXCT-4) THEN OUTPUT;
RUN;
You can code as simple as : DATA WORK.IN_OUT_LAST5; SET WORK.IFF_IN_OUT_CHANGE4 NOBS=NOBS; IF _N_ >= (NOBS-4) THEN OUTPUT; RUN;
This has truely been a great leanring experience for me. I had not thought of the ability to use the NOBS in that manner. I greatly appreciate your expertice!!
Never mind. John's code is faster than me after testing .
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.