Hi,
I'm working with a dataset that looks similar to the following:
Construction_date Material
1991 Cement
1972 Wood
1991 Cement
1991
2002 Wood
1983 Wood
1991 Wood
I would like to impute missing material data through the following condition:
if there are more than one identical construction_date and the material from at least two of those dates match, then make it so that all other identical missing construction_date have the same material.
From the little I know, I would first have to sort my construction_date and material variables, and then I get lost from that point on.
proc sort data=data;
by construction_date material;
run;
Note that I am a very new to coding language, and this is a large dataset.
Would appreciate any and all help!
What's the imputed material here?
Construction_date Material
1991 Cement
1972 Wood
1991 Cement
1991 ?
2002 Wood
1983 Wood
1991 Wood
1991 Wood
your data contains 4 records with constructionb_date = 1991:
2 of then with material = cement, 1 with wood and one missing value.
what output and how many records do you expect for this date 1991 ?
My dataset contains over 10,000 dates, and so there will be a variety of date repititions. 1991 will be one of the many repeating dates, and I'm hoping to fill in the missing material variable with the most common material for that construction_date.
Does that answer your question?
what will be your preferences if you have N times material A and same number of times material B -
what would you put to replace missing values?
the situation needs analyze before defining the algoritm to bo done by SAS
Good question! In that case I would just leave the missing as missing.
in such case you need:
1) count frequency of each material per construction_date
2) select those dates with one most common material
3) update your data where material is missing using 2ns step output
to count frequencies you may use severeal methods:
either count by SQL, group by construction_date and material
or use proc FREQ with OUTPUT statement.
or use data step and retain counters
Would you like to struggle it by yourself?
I would love to have some immediate help!
Thank you!
I do have at least a thousand dates though, and was hoping for a more automated process if possible?
run the attached program and follow the steps, as I designed previously.
try to understand each step what it is doing, how and why.
I hope you will be able to addapt it to your needs.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.