Help using Base SAS procedures

Need to Impute Demographic Data (present for one obs/person, expect it to correlated to unique ID)

Reply
New Contributor
Posts: 3

Need to Impute Demographic Data (present for one obs/person, expect it to correlated to unique ID)

[ Edited ]

Hi Sas Community,

 

I haven't come across this before, but I have multiple observations per person, and then for observations that are expected to be repeated (like sex) they are only listed once per person and then the rest are listed as missing. A way simplified version of the data is like this:

Person     Sex

1                2

1                .

1                . 

2                1

2                .

2                .

I need those variables to not be missing. I can be basically 100% certain that that is exactly what that person's sex will be at the next observation. Sooooo I've read about single or deterministic regression imputation, and it sounds like that'd be a good way to go. I haven't found any code anywhere for this though? Does anyone have any code to offer to help me go about this? 

 

Thanks,

 

Estelle

Respected Advisor
Posts: 4,641

Re: Need to Impute Demographic Data (present for one obs/person, expect it to correlated to unique I

[ Edited ]

Imputation is about replacing unknown values with probable/representative values. All you need is to replace missing information in a deterministic way.

One way to do this is to invoque SQL automatic remerging and use the fact that SAS missing values are inferior to non-missing values:

 

data have;
input Person Sex :$1. Age x;
datalines;
1    2 30 0.1
1    . . 0.2
1    .  . 0.3
2    1 32 1.1
2    . . 1.2
2    . . 1.3
;

Proc sql;
create table want as
select 
    Person, 
    max(sex) as sex,
    max(age) as age,
    x
from have
group by person;
select * from want;
quit;
PG
New Contributor
Posts: 3

Re: Need to Impute Demographic Data (present for one obs/person, expect it to correlated to unique I

I'm a new to SQL so that's maybe a problem here. I'm submitting this:

 

Proc sql;

create table imputed as

select

nomen,

max (sexeps) as sexeps,

max (tage) as tage,

from mergeinca2

group by nomen;

quit;

And it tells me in the log "expecting a "FROM" (and underlines "from mergeinca2"). Mergeinca2 is the relevant dataset.

 

Respected Advisor
Posts: 4,641

Re: Need to Impute Demographic Data (present for one obs/person, expect it to correlated to unique I

Remove the comma after tage.

PG
Respected Advisor
Posts: 4,641

Re: Need to Impute Demographic Data (present for one obs/person, expect it to correlated to unique I

Remove the comma after tage.

PG
New Contributor
Posts: 3

Re: Need to Impute Demographic Data (present for one obs/person, expect it to correlated to unique I

This procedure actually omits all repeated observations (because the repeated observations are from other variables). I think I need to "keep" the other variables in my dataset while running proc sql perhaps? 

 

 

I am submitting this code and I think maybe my "group by" line is now in the wrong spot?

 

Proc sql;

create table imputed as

select nomen,

max (sexeps) as sexeps2,

max (tage) as tage2

from mergeinca2 (keep= sexeps codgr tage qte_nette nomen qte_brute nbjcc);

group by nomen;

quit;

 

I get the message "Statement is not valid or used out of proper order" with group by nomen underlined in the log. Also the table has all sex set to 2 and all age categories set to 8 for all 541,526. Do I have to put that group by variable somewhere else if I want to keep other variables from the mergeinca2 dataset?

 

Thanks!

 

Estelle

Super User
Posts: 17,750

Re: Need to Impute Demographic Data (present for one obs/person, expect it to correlated to unique I

Try this: 

1. Remove extra semicolon

2. Add * to beginning of query

 

Proc sql;
create table imputed as
select *, nomen,
max (sexeps) as sexeps2,
max (tage) as tage2
from mergeinca2 
group by nomen;
quit;

Respected Advisor
Posts: 4,641

Re: Need to Impute Demographic Data (present for one obs/person, expect it to correlated to unique I

You are very close, 

 

Proc sql;
create table imputed as
select 
	nomen,
	codgr, qte_nette, qte_brute, nbjcc,
	max (sexeps) as sexeps,
	max (tage) as tage
from mergeinca2
group by nomen;
quit;
PG
Super User
Posts: 9,662

Re: Need to Impute Demographic Data (present for one obs/person, expect it to correlated to unique I

data have;
input Person     Sex;
cards;
1                2
1                .
1                . 
2                1
2                .
2                .
;
run;
data want;
 merge have(drop=sex) have(keep=person sex where=(sex is not missing)) ;
 by person;
run;

Ask a Question
Discussion stats
  • 8 replies
  • 309 views
  • 0 likes
  • 4 in conversation