csetzkorn
Lapis Lazuli | Level 10

I have a dataset which contains the columns Date, GadgetId and SomeMeasurement.

 

I would like to calculate the median of SomeMeasurement for every month whilst considering the retrospective/previous data. Example:

 

Date       GadgetId  SomeMeasurement
31-Jan-15  A1        5
26-Jan-15  A1        3
26-Jan-15  A1        3
26-Jan-15  A1        3
03-Feb-15  A1        5
07-Feb-15  A1        5
07-Feb-15  A1        5
07-Feb-15  A1        4
02-Feb-15  A1        5
02-Feb-15  A1        5
03-Feb-15  A1        5
02-Feb-15  A1        5
07-Feb-15  A1        4
03-Feb-15  A1        5

 

In month Jan 2015 one would consider the values of this month only to calculate the median. In month Feb 2015 one would consider the values in Jan 2015 and Feb 2015, in Dec 2017 one would consider the data for Dec 2017 and all the previous months etc.

 

Please note that each dataset contains several GadgetIds, so a BY GadgetId would be required, I suppose. Also, each GadgetId has a different number of samples/dates (some may only have 1 year's worth of data whereas others may have several years' worth of data).

14 REPLIES
Reeza
Super User

PROC EXPAND. 

 

csetzkorn
Lapis Lazuli | Level 10

Thanks. I think this is SAS/ETS, which we do not have )-:

RW9
Diamond | Level 26

How many "and so on"s are we talking? I mean, you could keep all the values in an array, for instance, then take the median on each row.

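data want;
  set have;
  /* holds up to 100 measurements; increase the size if there are more rows */
  array vals{100} 8;
  /* keep the collected values and the counter from row to row */
  retain vals:;
  retain num;
  num=ifn(_n_=1,1,num+1);
  vals{num}=somemeasurement;
  /* median of every value seen so far (empty slots are missing and ignored) */
  result=median(of vals{*});
run;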

That assumes at most 100 observations.

 

I am not sure I quite see the logic here though: why a rolling median? Would a monthly or yearly one not be appropriate?

csetzkorn
Lapis Lazuli | Level 10

Thanks. It could be 3-4 years' worth of data, so in month 12 of year 4 I have to use the data of all 4 years to get the median. Please also note that I have to use a BY for different gadgets. Each gadget can have 3-4 years, but the amount of data is dynamic, i.e. it depends on the gadget.

novinosrin
Tourmaline | Level 20

Can you please provide a more complete sample data with gadgets and the rest?

csetzkorn
Lapis Lazuli | Level 10
Done - sorry if it was not clear enough ...
Reeza
Super User

You only included one gadget; he asked for a few.

The second solution I posted deals with BY groups - see the BY and IF FIRST statement that resets things.

 


@csetzkorn wrote:
Done - sorry if it was not clear enough ...

 

novinosrin
Tourmaline | Level 20

Helps when you provide complete and comprehensive samples and details 

 

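data have;
  /* the sample rows are tab-separated, so accept tab or blank as delimiter */
  infile datalines dlm='0920'x;
  input Date : date9. gadgetid $ SomeMeasurement;
  format date date9.;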
datalines;
31-Jan-15	A1	5
26-Jan-15	A1	3
26-Jan-15	A1	3
26-Jan-15	A1	3
03-Feb-15	A1	5
07-Feb-15	A1	5
07-Feb-15	A1	5
07-Feb-15	A1	4
02-Feb-15	A1	5
02-Feb-15	A1	5
03-Feb-15	A1	5
02-Feb-15	A1	5
07-Feb-15	A1	4
03-Feb-15	A1	5
31-Jan-15	B1	5
26-Jan-15	B1	3
26-Jan-15	B1	3
26-Jan-15	B1	3
03-Feb-15	B1	5
07-Feb-15	B1	5
07-Feb-15	B1	5
07-Feb-15	B1	4
02-Feb-15	B1	5
02-Feb-15	B1	5
03-Feb-15	B1	5
02-Feb-15	B1	5
07-Feb-15	B1	4
03-Feb-15	B1	5
;
run;
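data temp;
  set have;
  by gadgetid;
  /* restart the month counter for each gadget */
  if first.gadgetid then grp=0;
  formatted_date=date;
  /* new group whenever the month changes; assumes rows of the same month are contiguous within a gadget */
  if month(date) ne lag(month(date)) then grp+1;
  format formatted_date monyy7.;
run;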
data want;
_k=_n_;
_c=0;
array t(20) _temporary_ ;/*array subscript arbitrary,should assign a big one to hold*/
call missing(median,of t(*));
do  until(last.gadgetid);
do  until(last.grp);
set temp;
by gadgetid grp;
_c+1;
t(_c)=SomeMeasurement;
if last.grp then do; median=median(median,of t(*));output;end;
end;
end;
drop _:;
run;
novinosrin
Tourmaline | Level 20

slight correction to the data want step:

 

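data want;
  _k=_n_;
  _c=0; /* number of values collected so far for the current gadget */
  array t(20) _temporary_; /* array size is arbitrary; make it large enough to hold every row of a gadget */
  call missing(median,of t(*)); /* clear the array before each gadget */
  do until(last.gadgetid);
    do until(last.grp);
      set temp;
      by gadgetid grp;
      _c+1;
      t(_c)=SomeMeasurement;
      /* at the end of each month, take the median of everything collected so far */
      if last.grp then do; median=median(of t(*)); output; end;
    end;
  end;
  drop _: grp;
run;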
csetzkorn
Lapis Lazuli | Level 10
Thanks. Does "should assign a big one" mean that I can assign one which is bigger than what is needed, just in case?
novinosrin
Tourmaline | Level 20

@csetzkorn Yes, a bigger subscript makes sure the values (elements) don't go out of range.

For example, if you believe there could be 10000 records per gadgetid, make the array at least that large.
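
If guessing an upper bound feels fragile, the array size can also be taken from the data. The sketch below is one possible way to do that, assuming the TEMP data set built earlier in this thread (with GADGETID and GRP); the macro variable name maxn and the alias counts are made up for illustration.

```sas
/* Assumption: TEMP is the data set created earlier in the thread. */
proc sql noprint;
  /* largest number of rows held by any single gadget */
  select max(n) into :maxn trimmed
  from (select count(*) as n from temp group by gadgetid) as counts;
quit;

data want;
  /* exactly large enough for the gadget with the most rows */
  array t(&maxn) _temporary_;
  _c = 0;
  call missing(median, of t(*));   /* clear the array before each gadget */
  do until (last.gadgetid);
    set temp;
    by gadgetid grp;
    _c + 1;
    t(_c) = SomeMeasurement;
    if last.grp then do;
      /* median of everything collected so far for this gadget */
      median = median(of t(*));
      output;
    end;
  end;
  drop _: grp;
run;
```

The extra pass over the data costs a little time, but the array can then never be too small, and no memory is reserved for slots that are never used.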

Reeza
Super User

And for the temporary array method, make your array 31 elements long to hold a full month of data. If you have repeated measurements for a month, are they considered the same? I noticed you had two observations for month=1 and 1 for month=3. If you have a variable number per month, you may want to standardize or aggregate that somehow first.

 

https://gist.github.com/statgeek/27e23c015eae7953eff2

 

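data want;
  set sashelp.stocks;
  by stock notsorted;

  /* circular buffer holding the last 31 observations */
  array p{0:30} _temporary_;

  /* start with an empty buffer for each stock */
  if first.stock then call missing(of p{*});
  p{mod(_n_,31)} = open;
  /* note: despite its name, LOWEST holds the median of the buffer */
  lowest = median(of p{*});
  highest = max(of p{*});
run;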
csetzkorn
Lapis Lazuli | Level 10

Yes, there could be several values per day, as indicated in the example.

Ksharp
Super User

If you have SAS 9.4:

 

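data have;
  /* the sample rows are tab-separated, so accept tab or blank as delimiter */
  infile datalines dlm='0920'x;
  input Date : date9. gadgetid $ SomeMeasurement;
  /* first day of the month, used to compare months */
  new_date=intnx('month',date,0);
  format date new_date date9.;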
datalines;
31-Jan-15	A1	5
26-Jan-15	A1	3
26-Jan-15	A1	3
26-Jan-15	A1	3
03-Feb-15	A1	5
07-Feb-15	A1	5
07-Feb-15	A1	5
07-Feb-15	A1	4
02-Feb-15	A1	5
02-Feb-15	A1	5
03-Feb-15	A1	5
02-Feb-15	A1	5
07-Feb-15	A1	4
03-Feb-15	A1	5
31-Jan-15	B1	5
26-Jan-15	B1	3
26-Jan-15	B1	3
26-Jan-15	B1	3
03-Feb-15	B1	5
07-Feb-15	B1	5
07-Feb-15	B1	5
07-Feb-15	B1	4
02-Feb-15	B1	5
02-Feb-15	B1	5
03-Feb-15	B1	5
02-Feb-15	B1	5
07-Feb-15	B1	4
03-Feb-15	B1	5
;
run;
proc sql;
  create table want as
  select *,
         (select median(SomeMeasurement)
            from have
            where gadgetid=a.gadgetid and new_date<=a.new_date) as median
  from have as a;
quit;
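
The query above repeats the cumulative median on every input row. If one row per gadget and month is enough, a variation along these lines might do. It is only a sketch: it assumes the HAVE data set and NEW_DATE column from this post, and the names monthly, month_start and cum_median are made up for illustration.

```sas
proc sql;
  create table monthly as
  select distinct
         a.gadgetid,
         a.new_date as month_start format=monyy7.,
         /* median over this month and all earlier months of the same gadget */
         (select median(b.SomeMeasurement)
            from have as b
            where b.gadgetid = a.gadgetid
              and b.new_date <= a.new_date) as cum_median
  from have as a
  order by gadgetid, month_start;
quit;
```

Worked by hand on the sample data, the January values 5, 3, 3, 3 have a median of 3, and the 14 values through February have a median of 5, so those are the numbers to check against for both A1 and B1.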

