DATA Step, Macro, Functions and more

Find the first occurence of value inside group of variables

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

Find the first occurence of value inside group of variables

Hi,

 

I have multiple indicator variables and I need to find the first occurence of specific value among them and replace the next values with another value. 

 

Sample of my data looks like 

id year sex a1 a2 a3 a4 a5 
1 1994  1   1  1  2  2  2 
2 1992  1   1  4  4  4  4
3 1992  2   1  2  3  3  4
4 1991  1   1  1  4  4  4

What I'm trying to is to detect if variable a1--a5 has a value 4 before value 2 or 3 have occurenced, and replace the first occurence of value for with 2 and the next values with 3. 

 

I have tried to do that with arrays, but I haven't find out how to detect the first occurence of value and replace it with another value.. 

 

So the data should look like 

 

id year sex a1 a2 a3 a4 a5 
1 1994  1   1  1  2  2  2 
2 1992  1   1  2  3  3  3
3 1992  2   1  2  3  3  4
4 1991  1   1  1  2  3  3

Any help is highly appreciated! 

 


Accepted Solutions
Solution
‎09-24-2017 10:45 AM
Super Contributor
Posts: 320

Re: Find the first occurence of value inside group of variables

Posted in reply to Laura_123

Better understood now

data have;
input id year sex a1 a2 a3 a4 a5;
datalines; 
1 1994  1   1  1  2  2  2 
2 1992  1   1  4  4  4  4
3 1992  2   1  2  3  3  3
4 1991  1   1  2  4  4  4
5 1992  2   1  2  2  4  4
6 1985  2   1  3  4  4  4 
7 1988  2   1  3  3  3  3 
;
run;

data need;
   array k[5] _temporary_;
   do until(eof);
      set have end = eof;
      array a a1--a5;
      do i = 1 to dim(k);
         k[i] = a[i];
      end;
      /** Process the conditions **/
     do i = 1 to dim(k);
      if k[i] = 4 then do;
      
         if k[i-1] = 2 then do;
            do j = i to dim(k);
               k[j] = 3;
            end;
         end;
      
         else if k[i-1] = 3 then do;
            do j = i to dim(k);
               k[j] = 2;
            end;
         end;
      
         else if k[i-1] = 1 then do;
            k[i] = 2;
            do j = i+1 to dim(k);
               k[j] = 3;
            end;
         end;
      end;
   end;
   do i = 1 to dim(k);
      a[i] = k[i];
   end;
   output;
end;
drop i j;
run;


Obs	id	year	sex	a1	a2	a3	a4	a5
1	1	1994	1	1	1	2	2	2
2	2	1992	1	1	2	3	3	3
3	3	1992	2	1	2	3	3	3
4	4	1991	1	1	2	3	3	3
5	5	1992	2	1	2	2	3	3
6	6	1985	2	1	3	2	2	2
7	7	1988	2	1	3	3	3	3

.

It is easier to copy a1 - a5 into a _temporary_ array first.

Then process the array for 4, and apply the conditions.

Then copy the elements of array to a1 - a5 and write the output.

 

View solution in original post


All Replies
Super Contributor
Posts: 320

Re: Find the first occurence of value inside group of variables

[ Edited ]
Posted in reply to Laura_123

Here is array way.

Hope this is acceptable.

As you are just new to the Community, please see a wheel on the top right. Click the arrow. Mark my answer as you please.

data have;
input id year sex a1 a2 a3 a4 a5;
datalines; 
1 1994  1   1  1  2  2  2 
2 1992  1   1  4  4  4  4
3 1992  2   1  2  3  3  4
4 1991  1   1  1  4  4  4
;
run;

data want;
   set have;
   array a[5] a1 a2 a3 a4 a5;

   retain flag 0;
   retain a4pos;
   do i = 1 to 5;
      if a[i] = 2 or a[i] = 3 then do;
         flag = 1;
         if i > 3 then leave;
      end;
      if flag = 1 and a[i] = 4 then do; a[i] = 2; a4pos = i + 1; leave; end;
   end;
   if flag = 1 and a4pos ^= . then do i = a4pos to 5;
      a[i] = 3;
   end;
   a4pos = .;
drop i flag a4pos;
run;

Obs	id	year	sex	a1	a2	a3	a4	a5
1	1	1994	1	1	1	2	2	2
2	2	1992	1	1	2	3	3	3
3	3	1992	2	1	2	3	3	4
4	4	1991	1	1	1	2	3	3

 

Occasional Contributor
Posts: 5

Re: Find the first occurence of value inside group of variables

[ Edited ]

Hi! 

 

Thanks for your reply. This works perfectly, but then I noticed that in my data exists couple of subjects (my data consist 20 000 rows) who has obs like this 

7 1994  1   1  3  4  4  4

 So now those rows become 

7 1994  1   1  3  2  3  3

 And what I want them to become is 

7 1994  1   1  3  2  2  2

 

Didin't realize when asking the question that there might be also obs like this.. 

 

(So that there would be some kind of logic why I want create rows like this, is that those numbers indicate subjects marital status in year(1 to 5). So that 1=single 2=married 3=child and this number 4 is married & child. And I want separate that value. If subject has value 4 after being single, then it should be first married and then child (that's how it done right now). But if subject has first child and then has a value number 4 then it should be coded that first child(=3) then married(=2).)

Super Contributor
Posts: 320

Re: Find the first occurence of value inside group of variables

Posted in reply to Laura_123

Better you place an input data set with the revised requirements and show how you want the output to looklike. This will worth 1000 words.

Occasional Contributor
Posts: 5

Re: Find the first occurence of value inside group of variables

[ Edited ]

Yeah, maybe you're right. 

Here's example rows of all kind of combinations that exists in my data 

 

id year sex a1 a2 a3 a4 a5 
1 1994  1   1  1  2  2  2 
2 1992  1   1  2  3  3  3
3 1992  2   1  2  3  3  3
4 1991  1   1  1  2  3  3
5 1990 2 1 2 2 4 4
6 1985 2 1 2 4 4 4
7 1992 2 1 3 4 4 4
8 1993 1 1 4 4 4 4

 So basically the idea is to replace value for depenging what value occurs before that. 

If value before 4 is 2 then number 4 should be replaced with 3. 

If value before 4 is 3 then number 4 should be replaced with 2 

if value before 4 is 1 then number 4 and the next one should be replaced with 2 & 3. 

 

And the desired output:

 

id year sex a1 a2 a3 a4 a5 
1 1994  1   1  1  2  2  2 
2 1992  1   1  2  3  3  3
3 1992  2   1  2  3  3  3
4 1991  1   1  1  2  3  3
5 1990 2 1 2 2 3 3
6 1985 2 1 2 3 3 3
7 1992 2 1 3 2 2 2
8 1993 1 1 2 3 3 3

 

Occasional Contributor
Posts: 5

Re: Find the first occurence of value inside group of variables

[ Edited ]

Yeah, you're right. 

 

So here's all kind of combinations that exists my data 

 

id year sex a1 a2 a3 a4 a5 
1 1994  1   1  1  2  2  2 
2 1992  1   1  4  4  4  4
3 1992  2   1  2  3  3  3
4 1991  1   1  2  4  4  4
5 1992 2 1 2 2 4 4
6 1985 2 1 3 4 4 4
7 1988 2 1 3 3 3 3

 And the data what I want:

id year sex a1 a2 a3 a4 a5 
1 1994  1   1  1  2  2  2 
2 1992  1   1  2  3  3  3
3 1992  2   1  2  3  3  3
4 1991  1   1  2  3  3  3
5 1992 2 1 2 2 3 3
6 1985 2 1 3 2 2 2
7 1988 2 1 3 3 3 3

 So if number 4 exists, then I want to detect which value occurs before that. 

 

If value before 4 is 2 then number 4 should be replaced with 3 

If value before 4 is 3 then number 4 should be replaced with 2

If value before 4 is 1 then number 4 and the next one should be replaced with 2 and 3 

Solution
‎09-24-2017 10:45 AM
Super Contributor
Posts: 320

Re: Find the first occurence of value inside group of variables

Posted in reply to Laura_123

Better understood now

data have;
input id year sex a1 a2 a3 a4 a5;
datalines; 
1 1994  1   1  1  2  2  2 
2 1992  1   1  4  4  4  4
3 1992  2   1  2  3  3  3
4 1991  1   1  2  4  4  4
5 1992  2   1  2  2  4  4
6 1985  2   1  3  4  4  4 
7 1988  2   1  3  3  3  3 
;
run;

data need;
   array k[5] _temporary_;
   do until(eof);
      set have end = eof;
      array a a1--a5;
      do i = 1 to dim(k);
         k[i] = a[i];
      end;
      /** Process the conditions **/
     do i = 1 to dim(k);
      if k[i] = 4 then do;
      
         if k[i-1] = 2 then do;
            do j = i to dim(k);
               k[j] = 3;
            end;
         end;
      
         else if k[i-1] = 3 then do;
            do j = i to dim(k);
               k[j] = 2;
            end;
         end;
      
         else if k[i-1] = 1 then do;
            k[i] = 2;
            do j = i+1 to dim(k);
               k[j] = 3;
            end;
         end;
      end;
   end;
   do i = 1 to dim(k);
      a[i] = k[i];
   end;
   output;
end;
drop i j;
run;


Obs	id	year	sex	a1	a2	a3	a4	a5
1	1	1994	1	1	1	2	2	2
2	2	1992	1	1	2	3	3	3
3	3	1992	2	1	2	3	3	3
4	4	1991	1	1	2	3	3	3
5	5	1992	2	1	2	2	3	3
6	6	1985	2	1	3	2	2	2
7	7	1988	2	1	3	3	3	3

.

It is easier to copy a1 - a5 into a _temporary_ array first.

Then process the array for 4, and apply the conditions.

Then copy the elements of array to a1 - a5 and write the output.

 

Occasional Contributor
Posts: 5

Re: Find the first occurence of value inside group of variables

Thank you! That solved my problem. Smiley Happy 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 299 views
  • 0 likes
  • 2 in conversation