Hi,
I'm trying to assign a marital status to different individuals looking at their history over the last few years (survey data). I'm only interested in those whose status have remained same all along. So the first individual, was married the first two years and divorced later...and gets dropped out.
Individual 3 gets a status because he entered the survey in 2012...etc.
Person | M_2009 | M_2010 | M_2011 | M_2012 | D_2009 | D_2010 | D_2011 | D_2012 | STATUS |
1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | NA |
2 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | Married |
3 | 1 | 1 | Married | ||||||
4 | 1 | 0 | 0 | 1 | NA | ||||
5 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | Divorced |
6 | 0 | 0 | 0 | 1 | 1 | 1 | Divorced | ||
7 | 1 | 1 | 0 | 0 | Married |
How do I get at this quickly? I tried some clauses in the data steps, but cannot seem to include every possible combination. I have several thousand records.
Thanks in advance!
set up two arrays, one for M and one for D. Then see if the max=min of the array is the same and then you know they haven't changed over the survey period.
array marriage(4) m_2009-m_2012;
array divorce(4) d_2009-d_2012;
if max(of marriage(*))=min(of marriage(*)) and max(of divorce(*))=min(of divorce(*)) and max(of marriage(*))=1 then do;
if max(of marriage(*))=1 then status='Married';
else status='Divorced';
end;
else status='NA';
Unless the example data is incomplete the D variables are completely superflous for this as they are the complement of the M variable.
If sum (of m_2:) = n (of m_2:) then Status='Married';
else if sum (of m_2:)=0 then Status = "Divorced;
else status='NA';
data have;
infile cards truncover expandtabs;
input Person M_2009 M_2010 M_2011 M_2012 D_2009 D_2010 D_2011 D_2012;
cards;
1 1 1 0 0 0 0 1 1
2 1 1 1 1 0 0 0 0
3 . . . 1 . . . 1
4 . . 1 0 . . 0 1
5 0 0 0 0 1 1 1 1
6 . 0 0 0 . 1 1 1
7 . 1 1 . . 0 0 .
;
run;
data want ;
set have;
array mm{*} m:;
array dd{*} d:;
if (whichn(0,of mm{*})=0 or whichn(1,of mm{*})=0 ) and
(whichn(0,of dd{*})=0 or whichn(1,of dd{*})=0 )
then do;
if whichn(1,of mm{*}) then STATUS='Married ';
else if whichn(1,of dd{*}) then STATUS='Divorced';
end;
else STATUS='NA';
run;
Thank you all. Reeza, I tried your code and tweaked it a bit to have correct entries for all three categories (it was giving me only NA and Divorced initially).
KSharp, will try your code today!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.