BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Lady
Calcite | Level 5

Hi, all!

 

I was hoping you could help me out on this query. I am able to do this in Excel using a combination of median and if function (somehow mimicking the averageif built-in function) but would like to have it translated in SAS due to the big amount of data I am working on.

 

For illustration purposes, I have this sample dataset. 

 

Month      Day      Volume

1               1           20

1               2           X

2               1           28

2               2           29

3               1           30

3               2           27

1               1           40

1               2           56

2               1           28

2               2           30

1               1           Y

1               2           30

 

In the dataset there are missing values represented by X and Y here. I need to fill out these missing values by getting the median of all the same month and same day as that of the missing field. For example, in order to get X, i get the median of all those with month = 1 and day = 2. Similarly for Y, i get the median of all those with month = 1 and day= 1.

 

Appreciate your help!  Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Here's an example via code:

 


data have;
infile datalines truncover;
input Month      Day      Volume;
datalines;
1               1           20
1               2           
2               1           28
2               2           29
3               1           30
3               2           27
1               1           40
1               2           56
2               1           28
2               2           30
1               1           
1               2           30
;
run;
proc sort data=have;
by month day;
run;

proc stdize data=have out=want missing=median reponly;
by month day;
var volume;
run;

proc print data=want;
run;

View solution in original post

8 REPLIES 8
Reeza
Super User

Look at proc STDIZE with the MISSING option to replace the missing values. 

 

The month and Hour are your BY Groups

Reeza
Super User

Here's an example via code:

 


data have;
infile datalines truncover;
input Month      Day      Volume;
datalines;
1               1           20
1               2           
2               1           28
2               2           29
3               1           30
3               2           27
1               1           40
1               2           56
2               1           28
2               2           30
1               1           
1               2           30
;
run;
proc sort data=have;
by month day;
run;

proc stdize data=have out=want missing=median reponly;
by month day;
var volume;
run;

proc print data=want;
run;
Lady
Calcite | Level 5

Thank you, Reeza! This was very helpful. Smiley Happy

Lady
Calcite | Level 5

by the way, what if instead of having missing values we have zero values instead? how do we get the median?

Reeza
Super User

Then the solutions above wouldn't work. I would replace the 0 with missing and use the solution above. 

Lady
Calcite | Level 5

What if we need to add some criteria, say we need to have this logic:

 

if Volume < 20 then Adjusted Volume = median of the same month and day

 

Month      Day      Volume   Adjusted Volume

1               1           20

1               2          10

2               1           28

2               2           29

3               1           30

3               2           27

1               1           40

1               2           56

2               1           28

2               2           30

1               1           10

1               2           30

Reeza
Super User

Then use the SQL code. You can add a case statement with conditional logic. 

 

If if you don't have SAS 9.4 then you'll have to either use proc means to calculate median, merge it in and execute conditional logic, or conditionally set values to missing ( volume<20) and use proc stdize. 

Reeza
Super User

And one more option, BUT this only works if you're on SAS 9.4+. In previous version, SAS SQL does not calculate the median correctly.

 

proc sql;
create table want2 as
select *, median(volume) as med_volume, coalesce(volume, calculated med_volume) as Volume_Replaced
from have
group by month, day
order by month, day;
quit;

proc print data=want2;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 2618 views
  • 1 like
  • 2 in conversation