Hello,
I have a problem to keep ALL data in the most recent quarter for each measure.
Characters in my data;
1. The most recent quarter depends on the measure you select
2. The most recent quarter changes every month for each measure
My data look like:
ID Measure Quarter
1 A Q1 2020
2 A Q2 2020
3 B Q2 20202
4 B Q3 2020
5 B Q3 2020
I like to keep ID 2, 4 and 5, since ID 2 is in the most recent quarter of measure A and ID 4&5 are in the most recent quarter of measure B.
Any advice will be appreciated.
Thank you!
data have;
input ID $ Measure $ Quarter :yyq6.;
format quarter yyq6.;
datalines;
1 A 2020Q1
2 A 2020Q2
3 B 2020Q2
4 B 2020Q3
5 B 2020Q3
;
proc sql;
create table want as
select id, measure, quarter
from have
group by measure
having quarter = max(quarter)
order by measure, id
;
quit;
data have;
input ID $ Measure $ Quarter :yyq6.;
format quarter yyq6.;
datalines;
1 A 2020Q1
2 A 2020Q2
3 B 2020Q2
4 B 2020Q3
5 B 2020Q3
;
proc sql;
create table want as
select id, measure, quarter
from have
group by measure
having quarter = max(quarter)
order by measure, id
;
quit;
data have;
input ID $ Measure $ Quarter :yyq6.;
format quarter yyq6.;
datalines;
1 A 2020Q1
2 A 2020Q2
3 B 2020Q2
4 B 2020Q3
5 B 2020Q3
;
data want;
do _n_=1 by 1 until(last.measure);
set have;
by measure;
_q=_q max quarter;
end;
do _n_=1 to _n_;
set have;
if _q=quarter then output;
end;
drop _:;
run;
proc print noobs;run;
The SAS System |
ID | Measure | Quarter |
---|---|---|
2 | A | 2020Q2 |
4 | B | 2020Q3 |
5 | B | 2020Q3 |
Since the sample is sweetly sorted by measure qtr, this is a nice case to experiment Duplicate:replace HASH option and make it seem a solution requiring one pass san loading the MAX qtr i.e the most recent qtr. The insertion order of the HASH mirrors a sequential read and duplicate key is replaced by the latest insert.
So for what it's worth:
data have;
input ID $ Measure $ Quarter :yyq6.;
format quarter yyq6.;
datalines;
1 A 2020Q1
2 A 2020Q2
3 B 2020Q2
4 B 2020Q3
5 B 2020Q3
;
data want ;
if _n_=1 then do;
if 0 then set have;
dcl hash H (dataset:'have',duplicate: "r") ;
h.definekey ("measure") ;
h.definedata ("measure","quarter") ;
h.definedone () ;
end;
do _n_=1 by 1 until(last.measure);
set have(rename=(quarter=_q));
by measure;
if _n_=1 then h.find();
if quarter=_q then output;
end;
drop _q;
format quarter yyq6.;
run;
proc print noobs;run;
ID | Measure | Quarter |
---|---|---|
2 | A | 2020Q2 |
4 | B | 2020Q3 |
5 | B | 2020Q3 |
@huhuhu wrote:
Hello,
I have a problem to keep ALL data in the most recent quarter for each measure.
Characters in my data;
1. The most recent quarter depends on the measure you select
2. The most recent quarter changes every month for each measure
My data look like:
ID Measure Quarter
1 A Q1 2020
2 A Q2 2020
3 B Q2 20202
4 B Q3 2020
5 B Q3 2020
I like to keep ID 2, 4 and 5, since ID 2 is in the most recent quarter of measure A and ID 4&5 are in the most recent quarter of measure B.
Any advice will be appreciated.
Thank you!
I would suggest that instead of the phrase "most recent" when you describe this data that you use "last quarter". Especially since many of your values as of March 2020, when this question is asked, are actually in the future and hence are not "recent" which typically refers to "up to a date", "of late" , "latest".
So "recent quarter" would more often be either 1) the current calendar quarter or optionally 2) the calendar quarter preceding the current quarter, and which explicitly stated.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.