Help for SAS_Longitudinal data recoding

Reply
Occasional Contributor
Posts: 6

Help for SAS_Longitudinal data recoding

 

Hi,

 

I'd appreciate if someone can provide some help for my following question:

 

Data description: longitudinal data, edu is a dummy which 1 represents to have education in the corresponding tax year; 0 means no education.

 

I would like to get the duration of the waiting period before a person gets education, and how many years one participates at school.

 

For example, I would like to have a variable to show the first person waits for 1 year to start education, and stays in school for 2 years; the 2nd person waits for 0 year and stays in school for 3 years; the 4th person waits for 2 years and has been in school for 2 years. For the 3rd person, he has multiple entries for education but I would like to count only the first time (or 1 year of education).

 

In other words, I wonder if it is possible to create a variable for only those who have attended school consecutively. 

I would also like to see if the mean of waiting time and years of education can be calculated directly.

 

Many thanks!!!

 

 

Sample data:

 

id            tax_year              edu                       

1              1                              0                             

1              2                              1                             

1              3                              1                             

1              4                              0             

1              5                              0

 

2              1                              1             

2              2                              1             

2              3                              1             

2              4                              0             

2              5                              0

 

 

3              1                              0             

3              2                              1             

3              3                              0             

3              4                              1             

3              5                              0

 

4              1                              0             

4              2                              0             

4              3                              1             

4              4                              1

4              5                              0             

Super User
Posts: 10,497

Re: Help for SAS_Longitudinal data recoding

Can you show what you would like the output to look like for the example data?

Occasional Contributor
Posts: 6

Re: Help for SAS_Longitudinal data recoding

Hi,

 

If possible, the output should be:

1) SAS code for how to create a duration variable in such case and condition;

2) SAS code for how to create an education variable that only takes consecutive years of education once, and if there are random multiple entries for educational attainments at different years, we will just count the first entry once. 

3) SAS code for calcuating the mean of the newly created duration and education variables.

 

Thanks a lot!

Super User
Posts: 10,497

Re: Help for SAS_Longitudinal data recoding

Show the actual output similar to your input data layout. It helps to clarify what the rules actually mean.

And it should show a reasonable number of case/conditions.

Occasional Contributor
Posts: 6

Re: Help for SAS_Longitudinal data recoding

Hi,

 

Thanks for the prompt reply.

 

I am a new SAS user and I am still at the early stage of learning this program.

 

I am not sure how to create what I want using SAS that's why there's nothing to show.

 

Would "array" work? 

 

Thanks again.

Super User
Posts: 10,497

Re: Help for SAS_Longitudinal data recoding

Example output can be text just like your input but add the expected values for that set of data. maybe something like:

 

id            tax_year              edu                wait        participation           

1              1                              0                             

1              2                              1                 1            

1              3                              1                             2

1              4                              0             

1              5                              0

 

2              1                              1             0

2              2                              1             

2              3                              1                             3

2              4                              0             

2              5                              0

 

 

3              1                              0             

3              2                              1             1

3              3                              0                            1

3              4                              1             

3              5                              0

 

Or maybe      

id             wait        participation           

1              1             2

2               0            3

3               1            1

 

DO you have a SAS dataset with your starting data?

 

Since you are looking a values across records then RETAIN for the wait and participation variables is likely. You may want a flag to set after the first time you set participation duration to say that you are done.

Since you need to reset the process for each ID then First.ID type processing is a likely start.

Hint to get started: for the first record in each ID then you want to set an appropriate initial value for your wait and participation variables.

You may want to look at the LAG function to compare the current value of edu with the previous one within your ID to see if the current record should increase the count for wait or participation.

 

How you want the output to look like will change some other options. One, if you have gotten to a point where you have completed the initial participation cycle, you might want to explicitly output the data at that point only.

 

Occasional Contributor
Posts: 6

Re: Help for SAS_Longitudinal data recoding

Hi,

 

Thank you for the useful comments. 

 

Yes, based on my example data, my example output will look like

 

Id  Duration Edu Year
1   1             2
2   0             3
3   1             1

4   2             2

 

I don't have SAS at home but I can test it later at school.

 

I tried to create some codes and not sure if they are right. The part in red is the part I think will be the key (and they are probably not right at the moment):

 

Data;

set id;

by tax_year;

retain edu flag;

if first.tax_year then do;

edu=result;

flag=0;

end;

 

if 0 < result < edu then edu = result;

if (result+edu) >1 then flag =1;

if last.tax_year and flag = 1 then output;

 

run;

 

Thank you!

Super User
Posts: 9,676

Re: Help for SAS_Longitudinal data recoding

data have;
input id            tax_year              edu  ;
cards;                     
1              1                              0                             
1              2                              1                             
1              3                              1                             
1              4                              0             
1              5                              0
2              1                              1             
2              2                              1             
2              3                              1             
2              4                              0             
2              5                              0
3              1                              0             
3              2                              1             
3              3                              0             
3              4                              1             
3              5                              0 
4              1                              0             
4              2                              0             
4              3                              1             
4              4                              1
4              5                              0  
;
run;
proc summary data=have;
by id edu notsorted;
output out=temp;
run;
data temp1;
 set temp;
 by id;
 retain found;
 length name $ 10;
 name=ifc(edu=0,'Duration','EduYear');
 if first.id then call missing(found);
 if not found then output;
 if edu=1 then found=1;
run;
proc transpose data=temp1 out=temp2(drop=_name_);
 by id;
 id name;
 var _freq_;
run;
proc stdize data=temp2 out=want reponly missing=0;run;
Occasional Contributor
Posts: 6

Re: Help for SAS_Longitudinal data recoding

Thank you, Keshan.

 

I will give it a try and get back to you =)

Super User
Posts: 17,816

Re: Help for SAS_Longitudinal data recoding

It isn't an issue with the program, we don't know what your problem is. Please illustrate it with sample input data AND sample output data that matches your input data.
Occasional Contributor
Posts: 6

Re: Help for SAS_Longitudinal data recoding

 

Thank you. Please see my example output and code above.

 

 

Ask a Question
Discussion stats
  • 10 replies
  • 513 views
  • 0 likes
  • 4 in conversation