Do loop to assign across variables

Reply
Frequent Contributor
Posts: 101

Do loop to assign across variables

[ Edited ]

Hi,

 

I've a dataset with start year and end year variables (think of them like disease onset date and death date) and a set of variables (IN1995-IN2015) that define if they have actual residence between start and end years. 

 

I'd like to assign two sets of status variables, based on start year, end year and IN year status variables:

A. xIN1995-xIN2015 - Assign 1 for each year between start year and end year and if they have actual residence in that year. 

B. zIN1995-zIN2015 - Assign zIN variables a value of 1 between start year and end year and the preceding 2 years (w/n the range of start year and end year) a value of 1. That is, if a person is IN in 2015 (INyear2015=1) then I'd assume their status variables between 2013 and 2015 are all 1 (ID=11). If the person is IN in 2010 (INyear2010=1) then I'd assume their status variables between 2008 and 2010 are all 1 (provided 2008 is still within the start year and end year range)

 

The cod for set (A) is done. I'm looking for help with set (B).

 

Data for CSV file and an Excel file showing grpahically are attached. SAS code is as below. Have and want data are below.

 

proc import datafile="c:\ ... \1. Have data.txt" out=have dbms=dlm replace;
delimiter=',';  getnames=yes;
run;

*end year 9999 for those who have not ended;
*Code to assign xIN values between start year and end year and if their corresponding IN variables values are 1.
%let lastyear=2015;
data have2; set have;
**Assignment 1: Status variables = 1 btwn start year and ending year ***;
array in[1995:&lastyear] 3 IN1995-IN&lastyear;
array xIN[1995:&lastyear] 3 xIN1995-xIN&lastyear;
do yr = 1995 to &lastyear;
xIN[yr] = 0;
if StartYear <= yr <= EndYear then xIN[yr] = 1;
if in[yr]=. then xIN[yr]=0;

if xIN[yr]=0 then xIN[yr]=.; *assign all 0 as missing, for earier read;
end;
drop yr;
run;
proc print noobs; var id startyear endyear xIN:; run;

 

 

This is how the source data looks like

 

ID StartYear EndYear IN1995 IN1996 IN1997 IN1998 IN1999 IN2000 IN2001 IN2002 IN2003 IN2004 IN2005 IN2006 IN2007 IN2008 IN2009 IN2010 IN2011 IN2012 IN2013 IN2014 IN2015
1 2000 2003 1 1 1 1 1 1     1 . . . . . . . . . . . .
2 1995 1997 1   1 . . . . . . . . . . . . . . . . . .
3 2005 2010 1 1 1 1   1   1 1 1 1         1 . . . . .
4 2009 9999 1 1   1   1 1   1   1 1 1 1 1   1   1 1 1
5 1995 2001 1       1 1 1 . . . . . . . . . . . . . .
6 2000 9999 . . . . . 1 1 1                       1 1
7 1995 1997 1   1 . . . . . . . . . . . . . . . . . .
8 2014 9999 . . . . . . . . . . . 1 1 1 1 1 1 1 1 1 1
9 1998 2008 1 1 1 1   1     1 1     1 1 . . . . . . .
10 1999 9999 1   1 1 1 1                             1
11 1997 2001 1 1 1       1 . . . . . . . . . . . . . .

 

 

This one assigned 1 based on start year and end year  and their IN status

 

ID StartYear EndYear xIN1995 xIN1996 xIN1997 xIN1998 xIN1999 xIN2000 xIN2001 xIN2002 xIN2003 xIN2004 xIN2005 xIN2006 xIN2007 xIN2008 xIN2009 xIN2010 xIN2011 xIN2012 xIN2013 xIN2014 xIN2015  
1 2000 2003           1     1                          
2 1995 1997 1   1                                      
3 2005 2010                     1         1            
4 2009 9999                             1   1   1 1 1  
5 1995 2001 1       1 1 1                              
6 2000 9999           1 1 1                       1 1  
7 1995 1997 1   1                                      
8 2014 9999                                       1 1  
9 1998 2008       1   1     1 1     1 1                
10 1999 9999         1 1                             1  
11 1997 2001     1       1                              

 

This is the data I truly want: Assign zIN variables a value of 1 between start year and end year and the preceding 2 years (w/n the range of start year and end year) a value of 1 

ID StartYear EndYear zIN1995 zIN1996 zIN1997 zIN1998 zIN1999 zIN2000 zIN2001 zIN2002 zIN2003 zIN2004 zIN2005 zIN2006 zIN2007 zIN2008 zIN2009 zIN2010 zIN2011 zIN2012 zIN2013 zIN2014 zIN2015
1 2000 2003           1 1 1 1                        
2 1995 1997 1 1 1                                    
3 2005 2010                     1     1 1 1          
4 2009 9999                             1   1   1 1 1
5 1995 2001 1   1 1 1 1 1                            
6 2000 9999           1 1 1                 1 1 1 1 1
7 1995 1997 1 1 1                                    
8 2014 9999                                       1 1
9 1998 2008       1 1 1 1 1 1 1 1 1 1 1              
10 1999 9999         1 1                         1 1 1
11 1997 2001     1   1 1 1                            

 

Super User
Posts: 10,535

Re: Do loop to assign across variables

It would help if you could provide what you may want for the actual output for few lines of the data.

Since you mention Following 7 years specific records to provide output for would be ID = 6, 8 and 10.

Super User
Posts: 17,905

Re: Do loop to assign across variables

You declare an array but don't really use it. 

Is there a specific reason you choose to use macros here, it doesn't seem required. 

Frequent Contributor
Posts: 101

Re: Do loop to assign across variables

Almost certainly just bad programming on my part. And I'll post "want data" in a few minutes (maybe using a simpler example, with much fewer variables".

Super User
Posts: 10,535

Re: Do loop to assign across variables

Since your data seems to have a noticeable difference in the number of values to process it doesn't hurt to include the long version so that suggestions can incorporate the variability.

Frequent Contributor
Posts: 101

Re: Do loop to assign across variables

I edited my original post, with data attached (in txt). Hope it makes more sense. Many thanks for anyone who can help.

Super User
Posts: 17,905

Re: Do loop to assign across variables

Based only on your initial requirements here's how an array solution would work for A and B. If the logic isn't exact hopefully it's enough to get you started. 

 

PS. If in the future you need to provide data, please use the format of my HAVE data step code, including the data/input/cards statement. It makes the process faster. If you don't know how to generate such a file, see the link that was posted by a different user. 

 

data have;
	infile cards dsd dlm=',';
	input ID StartYear EndYear IN1995 IN1996 IN1997 IN1998 IN1999 IN2000 IN2001 
		IN2002 IN2003 IN2004 IN2005 IN2006 IN2007 IN2008 IN2009 IN2010 IN2011 IN2012 
		IN2013 IN2014 IN2015;
	cards;
1,2000,2003,1,1,1,1,1,1,,,1,.,.,.,.,.,.,.,.,.,.,.,.
2,1995,1997,1,,1,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.
3,2005,2010,1,1,1,1,,1,,1,1,1,1,,,,,1,.,.,.,.,.
4,2009,9999,1,1,,1,,1,1,,1,,1,1,1,1,1,,1,,1,1,1
5,1995,2001,1,,,,1,1,1,.,.,.,.,.,.,.,.,.,.,.,.,.,.
6,2000,9999,.,.,.,.,.,1,1,1,,,,,,,,,,,,1,1
7,1995,1997,1,,1,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.
8,2014,9999,.,.,.,.,.,.,.,.,.,.,.,1,1,1,1,1,1,1,1,1,1
9,1998,2008,1,1,1,1,,1,,,1,1,,,1,1,.,.,.,.,.,.,.
10,1999,9999,1,,1,1,1,1,,,,,,,,,,,,,,,1
11,1997,2001,1,1,1,,,,1,.,.,.,.,.,.,.,.,.,.,.,.,.,.
;
run;

data want;
	set have;
	array p_in(1995:2015) in1995-in2015;
	array xIN(1995:2015) xIN1995-xIN2015;
	array zIN(1995:2015) zIN1995-zIN2015;

	do year=startYear to min(2015, endYear);
		xIN(year)=1;
	end;

	do year=max(2005, year-2) to min(2015, endYear);
		zIN(year)=1;
	end;
run;

 

 

Super User
Posts: 10,535

Re: Do loop to assign across variables

[ Edited ]

Please explain the role of endyear = 9999. The last data example do not seem consistent for ID 6 8 and 10.

 

Also what ever source you pasted from has issues with spaces or something and is hard to write code to read it. If you have a SAS data set you can use instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to create datastep code you can paste or attach so we have exactly the same data as you used.

Ask a Question
Discussion stats
  • 7 replies
  • 495 views
  • 1 like
  • 3 in conversation