Hello everyone,
My dataset is the following:
Person | Person_num | Year | Firstyear | Company |
A | 1 | 2005 | 1 | aa |
A | 1 | 2006 | 0 | aa |
A | 1 | 2007 | 0 | aa |
A | 1 | 2008 | 0 | aa |
A | 1 | 2008 | 0 | bb |
A | 1 | 2009 | 0 | aa |
A | 1 | 2009 | 0 | bb |
B | 2 | 2008 | 1 | cc |
B | 2 | 2009 | 0 | dd |
B | 2 | 2009 | 0 | cc |
B | 2 | 2010 | 0 | dd |
B | 2 | 2011 | 0 | dd |
B | 2 | 2012 | 0 | dd |
C | 3 | 2002 | 1 | ee |
C | 3 | 2002 | 1 | ff |
C | 3 | 2003 | 0 | ee |
C | 3 | 2003 | 0 | ff |
C | 3 | 2004 | 0 | ff |
C | 3 | 2008 | 0 | ff |
C | 3 | 2009 | 0 | ff |
C | 3 | 2010 | 0 | ff |
C | 3 | 2011 | 0 | ff |
C | 3 | 2012 | 0 | ff |
C | 3 | 2012 | 0 | gg |
C | 3 | 2013 | 0 | ff |
C | 3 | 2014 | 0 | ff |
D | 4 | 2005 | 1 | hh |
D | 4 | 2006 | 0 | hh |
D | 4 | 2010 | 0 | ii |
D | 4 | 2011 | 0 | ii |
D | 4 | 2012 | 0 | ii |
D | 4 | 2014 | 0 | jj |
D | 4 | 2015 | 0 | jj |
The expected outcome is the following:
Person | Person_num | Year | Firstyear | Company | Save | Want |
A | 1 | 2005 | 1 | aa | aa | 1 |
A | 1 | 2006 | 0 | aa | aa | 0 |
A | 1 | 2007 | 0 | aa | aa | 0 |
A | 1 | 2008 | 0 | aa | aa | 0 |
A | 1 | 2008 | 0 | bb | aa | 0 |
A | 1 | 2009 | 0 | aa | aa | 0 |
A | 1 | 2009 | 0 | bb | aa | 0 |
B | 2 | 2008 | 1 | cc | cc | 1 |
B | 2 | 2009 | 0 | dd | cc | 0 |
B | 2 | 2009 | 0 | cc | cc | 0 |
B | 2 | 2010 | 0 | dd | dd | 1 |
B | 2 | 2011 | 0 | dd | dd | 0 |
B | 2 | 2012 | 0 | dd | dd | 0 |
C | 3 | 2002 | 1 | ee | {ee, ff} | 1 |
C | 3 | 2002 | 1 | ff | {ee, ff} | 1 |
C | 3 | 2003 | 0 | ee | {ee, ff} | 0 |
C | 3 | 2003 | 0 | ff | {ee, ff} | 0 |
C | 3 | 2004 | 0 | ff | ff | 0 |
C | 3 | 2008 | 0 | ff | ff | 0 |
C | 3 | 2009 | 0 | ff | ff | 0 |
C | 3 | 2010 | 0 | ff | ff | 0 |
C | 3 | 2011 | 0 | ff | ff | 0 |
C | 3 | 2012 | 0 | ff | ff | 0 |
C | 3 | 2012 | 0 | gg | ff | 0 |
C | 3 | 2013 | 0 | ff | ff | 0 |
C | 3 | 2014 | 0 | ff | ff | 0 |
D | 4 | 2005 | 1 | hh | hh | 1 |
D | 4 | 2006 | 0 | hh | hh | 0 |
D | 4 | 2010 | 0 | ii | ii | 1 |
D | 4 | 2011 | 0 | ii | ii | 0 |
D | 4 | 2012 | 0 | ii | ii | 0 |
D | 4 | 2014 | 0 | jj | jj | 1 |
D | 4 | 2015 | 0 | jj | jj | 0 |
The logic behind is the following:
1. If firstyear = 1 then want = 1;
2. If firstyear = 1, save = {companies}. That is, save include all companies when firstyear = 1. (Is it possible to save multiple variables into the save variable?)
3. If firstyear ne 0 and company = lag(save), then save = lag(save) and want = 0;
else if the company in save then save = lag(save) and want = 0;
else if, among the same year observations, there is an observation whose company is equal to the current observation's save variable, then save = lag(save) and want = 0;
else save = code and want = 1;
My attempt is the following:
data have;
input Person $ Person_num Year Firstyear Company $;
datalines;
A 1 2005 1 aa
A 1 2006 0 aa
A 1 2007 0 aa
A 1 2008 0 aa
A 1 2008 0 bb
A 1 2009 0 aa
A 1 2009 0 bb
B 2 2008 1 cc
B 2 2009 0 dd
B 2 2009 0 cc
B 2 2010 0 dd
B 2 2011 0 dd
B 2 2012 0 dd
C 3 2002 1 ee
C 3 2002 1 ff
C 3 2003 0 ee
C 3 2003 0 ff
C 3 2004 0 ff
C 3 2008 0 ff
C 3 2009 0 ff
C 3 2010 0 ff
C 3 2011 0 ff
C 3 2012 0 ff
C 3 2012 0 gg
C 3 2013 0 ff
C 3 2014 0 ff
D 4 2005 1 hh
D 4 2006 0 hh
D 4 2010 0 ii
D 4 2011 0 ii
D 4 2012 0 ii
D 4 2014 0 jj
D 4 2015 0 jj
;
data want; set have;
do i = 1 to 4;
if person_num = i then do;
if firstyear = 1 then want = 1 and save = code; *How can assign multiple variables into the save variable when Person = C;
if firstyear ne 0 and company = lag(save) then save = lag(save) and want = 0;
else if company in save then save = lag(save) and want = 0;
else if; *I have no idea in this part.;
else want = 1;
end;
end;
run;
Hello,
I am stuck to solve the difficulty.
I need your knowledge..
The dataset and the expected result("Want variable" is the following:
Person | Year | Company | Want |
A | 2005 | aa | 1 |
A | 2006 | aa | 0 |
A | 2007 | aa | 0 |
A | 2008 | aa | 0 |
A | 2008 | bb | 0 |
A | 2009 | aa | 0 |
A | 2009 | bb | 0 |
B | 2008 | cc | 1 |
B | 2009 | dd | 0 |
B | 2009 | cc | 0 |
B | 2010 | dd | 1 |
B | 2011 | dd | 0 |
B | 2012 | dd | 0 |
C | 2002 | ee | 1 |
C | 2002 | ff | 1 |
C | 2003 | ee | 0 |
C | 2003 | ff | 0 |
C | 2004 | ff | 0 |
C | 2008 | ff | 0 |
C | 2009 | ff | 0 |
C | 2010 | ff | 0 |
C | 2011 | ff | 0 |
C | 2012 | ff | 0 |
C | 2012 | gg | 0 |
C | 2013 | ff | 0 |
C | 2014 | ff | 0 |
D | 2005 | hh | 1 |
D | 2006 | hh | 0 |
D | 2010 | ii | 1 |
D | 2011 | ii | 0 |
D | 2012 | ii | 0 |
D | 2014 | jj | 1 |
D | 2015 | jj | 0 |
The logic to determine the "want variable" is like this:
1. if first.person then want = 1; if year = lag(year) and lag(want) = 1 then want = 1;
2. When the want variable =1, I would like to save the company as a new "save variable". (Is it possible to save multiple variables into the save variable?)
3. If the company of current observation equals to the save variable, then want = 0; else if there is an observation which has the same company with the save variable in the same year, then want = 0; else want = 1;
4. And everytime the want variable changes from 0 to 1, I would like to update the save variable to the company with the corresponding value of 1.
If you know an easy way to solve this problem, please let me know.
Thanks for your help.
I believe the bit you will be able to do for us:
Provide a fully working SAS data step posted via the running man icon above which creates the sample data and also includes at least your first condition: if first.person then want = 1;
Why is your Want=1 for below case. Which one of your rules applies here?
B | 2010 | dd | 1 |
I would like to save the company as a new "save variable".
Then please add also a new column save_want to your table with the desired result so we can see what you expect to get.
Here are instructions on how to provide sample data as a data step:
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.