deltaskipper
Fluorite | Level 6

Problem statement: I have a flight data set that consists of these variables:

[1] origin [2] destination [3] carrier [4] delay

I want to replace the missing values of the variable delay with the average delay for the specific route (origin-destination) and carrier.

If I use PROC STDIZE for imputation, it replaces the missing values with the overall average of delay, which is not what I want.

Any suggestions or help would be appreciated.

 

My code:

proc stdize data=flight missing=mean reponly  out=missing_flight;
var air_time;/* delay*/
by  origin dest carrier notsorted ;
run;

 

It shows this warning:

WARNING: At least one of the scale and location estimators of variable air_time can not be computed. Variable air_time will not be standardized.
 

After that I checked the missing values in the output data set missing_flight. They are still missing, so the imputation was not done.

What am I doing wrong here?

1 ACCEPTED SOLUTION

Reeza
Super User

Add a BY statement to PROC STDIZE to group by source/destination. 

 



5 REPLIES

Reeza
Super User
FYI, if you respond by editing your original post, no one knows what was changed or what is new, and it won't come across as a new post. You're better off replying than editing your original question.
deltaskipper
Fluorite | Level 6
Yes, I will keep that in mind from next time.
And thanks for the advice.
PGStats
Opal | Level 21

Maybe you misinterpreted what NOTSORTED means in the BY statement. NOTSORTED means that the data is grouped by the origin, dest, and carrier variables, but that they are not in ascending or descending order. Try sorting your data prior to calling STDIZE without the NOTSORTED keyword.

PG
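
A minimal sketch of the sort-then-impute suggestion above, reusing the data set and variable names from the original post (flight, missing_flight, air_time, origin, dest, carrier). The intermediate data set name flight_sorted is an assumption made for illustration. With NOTSORTED, SAS starts a new BY group every time the BY values change between consecutive observations, so unsorted data is split into many small groups. A group whose air_time values are all missing has no mean, which is the likely cause of the warning and of the values staying missing.

```sas
/* Sort so that each origin/dest/carrier combination forms one contiguous BY group. */
/* flight_sorted is a hypothetical name for the sorted copy.                         */
proc sort data=flight out=flight_sorted;
    by origin dest carrier;
run;

/* REPONLY replaces missing values without standardizing;      */
/* MISSING=MEAN uses the mean of the current BY group.         */
proc stdize data=flight_sorted out=missing_flight missing=mean reponly;
    var air_time;
    by origin dest carrier;
run;

/* Check the result: NMISS should be 0 for every group that had */
/* at least one nonmissing air_time value before imputation.    */
proc means data=missing_flight n nmiss;
    class origin dest carrier;
    var air_time;
run;
```

A route and carrier combination with no nonmissing air_time values at all will still be missing after this step, because there is no group mean to impute from.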
deltaskipper
Fluorite | Level 6
That works.
Thank you for that.
