DATA Step, Macro, Functions and more

Create one variable from 3 - based on unique numeric ordering

Reply
Contributor
Posts: 47

Create one variable from 3 - based on unique numeric ordering

Hi,

I'm struggling with what seemed like a simple task - based on the documentation:

"2001-Wife/"Wife"'s employment status ER17786, ER17787, ER17788

In order to create an equivalent employment status variable for
Heads/Wives/"Wives" using the above mentions, a user must consider the code
values of all three mentions and prioritize them in order as follows:
2,1,3,4,5,7,6,8, 9. For example, if Head had two mentions for employment
status, and the first mention was code 1 with the second mention being code 2,
then Head's overall employment status would become code 2. If Head had three
mentions for employment status with codes of 5, 6, 3, then Head's overall
employment status would become code 3."

I've tried 2 different arrays - but think I now need to order the 3 employment 'parts' by the odd numeric order before putting them through the array, maybe. I'll include my 2 attempts below. The problem is if the new variable is assigned a '2' it can later become a '7' in the code below - which I don't want.

Thank you.

Attempt 1 at creating a WIFE employment status variable:

data test2 (drop=i);
set trial1;

* rank employment vars;
head_status_01 = .;

array m{3} ER17216-ER17218;

do i=1 to 3 while (head_status_01 = .);

if m{i} = 2 then head_status_01 = 2;
else if m{i} = 1 then head_status_01=1;
else if m{i} = 3 then head_status_01=3;
else if m{i} = 4 then head_status_01=4;
else if m{i} = 5 then head_status_01=5;
else if m{i} = 7 then head_status_01=7;
else if m{i} = 6 then head_status_01=6;
else if m{i} = 8 then head_status_01=8;
else if m{i} = 9 then head_status_01=9;

end;
run;

*****************************

attempt 2

data test2 (drop=i);
set trial1 (obs=1000);

* rank employment vars;
head_status_01 = .;

array m{3} ER17216-ER17218;
array const{9} (2 1 3 4 5 7 6 8 9);
do j=1 to 9;

do i=1 to 3 while (head_status_01 ne .);
if m{i} = const{j} then head_status_01 = const{j};

end;
end;
run;
Occasional Contributor
Posts: 6

Re: Create one variable from 3 - based on unique numeric ordering

Hi,

Could you send the dataset trial1 or a link to it ?

I´m not familiar with these documentation, but maybe I could help you with these algorithm.
Are you using the default PSID ?

I have an ideia and I´d like to make some test´s first.

Kassim
Brazil
Contributor
Posts: 47

Re: Create one variable from 3 - based on unique numeric ordering

Hi,

Great - I can send the dataset if you provide an email.

Yes, this is from the PSID.
Super Contributor
Super Contributor
Posts: 3,174

Re: Create one variable from 3 - based on unique numeric ordering

Also, consider sending a post-reply with a DATA step that creates an example WORK.TRIAL1 SAS file, using instream DATALINES and an INPUT statement. That way you keep focus on the forum subscribers by submitting "sample data" for your original post.

Scott Barry
SBBWorks, Inc.
Contributor
Posts: 47

Re: Create one variable from 3 - based on unique numeric ordering

Thank you for the sugggestion. Hope this works.


data trial1;
input ER17216 ER17217 ER17218;
datalines;
. . .
. 2 .
1 0 0
1 3 0
1 4 0
1 5 6
1 6 0
1 7 0
1 7 6
1 8 0
2 0 0
2 6 0
3 0 0
3 6 0
3 7 0
4 0 0
4 1 0
4 3 0
4 5 0
4 6 0
5 0 0
5 1 0
5 2 0
5 3 0
5 4 0
5 4 6
5 6 0
5 7 0
6 0 0
6 3 0
6 4 0
6 5 0
6 7 0
6 7 3
6 8 0
7 0 0
7 1 0
7 6 0
8 0 0
8 1 0
8 3 0
8 5 0
8 6 0
;
run;
Super Contributor
Super Contributor
Posts: 3,174

Re: Create one variable from 3 - based on unique numeric ordering

Your interior DO / END loop is executing 9 times, regardless of the fact that you have assigned your new variable. I would suggest setting one of the following:

1) use the LEAVE statement to exit your DO loops, when appropriate.
2) assign both of your DO/END looping variables when you want to continue / iterate to the next observations, using the DIM(i) function in your assignment statement (better than assigning an explicit max value).

Also, adding some PUTLOG "" / _ALL_; statements will help you see the processing -- consider setting up multiple PUTLOGs and use a unique "nn" value for each distinct statement, for easier identification.

Scott Barry
SBBWorks, Inc.
Contributor
Posts: 47

Re: Create one variable from 3 - based on unique numeric ordering

Hello,

Thanks -

I considered the leave variable - but it seemed to require the huge if/then combo I was trying to avoid. That is, when do you leave? You don't know until the last read if you have a 'better' variable.

I do use put statements to help ... and don't quite understand your second suggestion. If you could expand ...
Super Contributor
Super Contributor
Posts: 3,174

Re: Create one variable from 3 - based on unique numeric ordering

With the 2nd code example, the LEAVE "statement" (not variable) will allow you to exit the two nested DO loops immediately, once you have an assigned value for your current observation. The implied understanding is that you are inputting and outputting once per observation. Using the PUTLOG with a suffix, allows you to code multiple PUTs (or PUTLOGs) and know which is being executed with you are displaying _ALL_ variables to the log. I would expect you need to analyze the processing both inside and outside the DO loops to totally understand what's going on in your DATA step.

Scott Barry
SBBWorks, Inc.

Sample "very simplified" SAS program to demonstrate using PUTLOG for diagnosis:

OPTIONS SOURCE SOURCE2 MACROGEN SYMBOLGEN MLOGIC;
DATA _NULL_;
DO I=1 TO 10;
PUTLOG '>DIAG-01' / _ALL_;
DO J=1 TO 3;
IF J = 2 THEN PUTLOG '>DIAG-02' / _ALL_;
END;
END;
PUTLOG '>DIAG-99' / _ALL_;
RUN;
Contributor
Posts: 47

Re: Create one variable from 3 - based on unique numeric ordering

Hi,

Thanks to my brother for this idea - works well. Basically recode the odd importance of variables to a consecutive order.

data test2 (drop=head_status_1 - head_status_3 head_status);
set trial1 ;

* rank employment vars;
head_status = .;
head_status_1 = .;
head_status_2 = .;
head_status_3 = .;

array weight{2,9} (2 1 3 4 5 7 6 8 9 9 8 7 6 5 4 3 2 1);

do j=1 to 9 until (head_status_1 - head_status_3 ne .) ;

if ER17216 = weight(1,j) then head_status_1 = weight(2,j);
if ER17217 = weight(1,j) then head_status_2 = weight(2,j);
if ER17218 = weight(1,j) then head_status_3 = weight(2,j);
if head_status_1 or head_status_2 or head_status_3 = 9 then leave;
end;

head_status = max(head_status_1, head_status_2, head_status_3);

head_status_01 = .;

do j=1 to 9 until (head_status_01 ne .);

if head_status = weight(2,j) then head_status_01 = weight(1,j);
end;
run;
Ask a Question
Discussion stats
  • 8 replies
  • 209 views
  • 0 likes
  • 3 in conversation