BookmarkSubscribeRSS Feed
Estelle
Calcite | Level 5

Hi Sas Community,

 

I haven't come across this before, but I have multiple observations per person, and then for observations that are expected to be repeated (like sex) they are only listed once per person and then the rest are listed as missing. A way simplified version of the data is like this:

Person     Sex

1                2

1                .

1                . 

2                1

2                .

2                .

I need those variables to not be missing. I can be basically 100% certain that that is exactly what that person's sex will be at the next observation. Sooooo I've read about single or deterministic regression imputation, and it sounds like that'd be a good way to go. I haven't found any code anywhere for this though? Does anyone have any code to offer to help me go about this? 

 

Thanks,

 

Estelle

8 REPLIES 8
PGStats
Opal | Level 21

Imputation is about replacing unknown values with probable/representative values. All you need is to replace missing information in a deterministic way.

One way to do this is to invoque SQL automatic remerging and use the fact that SAS missing values are inferior to non-missing values:

 

data have;
input Person Sex :$1. Age x;
datalines;
1    2 30 0.1
1    . . 0.2
1    .  . 0.3
2    1 32 1.1
2    . . 1.2
2    . . 1.3
;

Proc sql;
create table want as
select 
    Person, 
    max(sex) as sex,
    max(age) as age,
    x
from have
group by person;
select * from want;
quit;
PG
Estelle
Calcite | Level 5

I'm a new to SQL so that's maybe a problem here. I'm submitting this:

 

Proc sql;

create table imputed as

select

nomen,

max (sexeps) as sexeps,

max (tage) as tage,

from mergeinca2

group by nomen;

quit;

And it tells me in the log "expecting a "FROM" (and underlines "from mergeinca2"). Mergeinca2 is the relevant dataset.

 

Estelle
Calcite | Level 5

This procedure actually omits all repeated observations (because the repeated observations are from other variables). I think I need to "keep" the other variables in my dataset while running proc sql perhaps? 

 

 

I am submitting this code and I think maybe my "group by" line is now in the wrong spot?

 

Proc sql;

create table imputed as

select nomen,

max (sexeps) as sexeps2,

max (tage) as tage2

from mergeinca2 (keep= sexeps codgr tage qte_nette nomen qte_brute nbjcc);

group by nomen;

quit;

 

I get the message "Statement is not valid or used out of proper order" with group by nomen underlined in the log. Also the table has all sex set to 2 and all age categories set to 8 for all 541,526. Do I have to put that group by variable somewhere else if I want to keep other variables from the mergeinca2 dataset?

 

Thanks!

 

Estelle

Reeza
Super User

Try this: 

1. Remove extra semicolon

2. Add * to beginning of query

 

Proc sql;
create table imputed as
select *, nomen,
max (sexeps) as sexeps2,
max (tage) as tage2
from mergeinca2 
group by nomen;
quit;

PGStats
Opal | Level 21

You are very close, 

 

Proc sql;
create table imputed as
select 
	nomen,
	codgr, qte_nette, qte_brute, nbjcc,
	max (sexeps) as sexeps,
	max (tage) as tage
from mergeinca2
group by nomen;
quit;
PG
Ksharp
Super User
data have;
input Person     Sex;
cards;
1                2
1                .
1                . 
2                1
2                .
2                .
;
run;
data want;
 merge have(drop=sex) have(keep=person sex where=(sex is not missing)) ;
 by person;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1106 views
  • 0 likes
  • 4 in conversation