Hello All,
I have raw data as shown below. I do not know of how many times rescreening will be done. so, first set has 2 entries and second set has 1 entry and third set has 4 entries.
subj prevsubj
/*** First set - 2 times ****/
1001
1002 1001
1003 1002
/***Second set - 1 time ****/
1011
1012 1011
/***Third set - 4 times ****/
2010
2011 2010
2012 2011
2013 2012
2014 2013
The final dataset should look like:
subj prevsubj newsubj
1001 1003
1002 1001 1003
1003 1002 1003
1011 1012
1012 1011 1012
2010 2014
2011 2010 2014
2012 2011 2014
2013 2012 2014
2014 2013 2014
Can you please let me know of how can this be achieved?
Thank You.
The first part, combining the datasets, is easy using a SET statement
1. Logic is to get the newsubj (last screened subject).
In the example data given:
first set, 1001->1002->1003. And so newsubj should be 1003.
second set, 1011->1012. And so newsubj should be 1012.
third set, 2010->2011->2012->2013->2014. And so newsubj should be 2014.
2. I tried merging datasets by splitting subj and prevsubj but this only works for one level.
I am not trying to combine datasets, trying to see if there is a way to loop around each subject and see if we can get the expected solution.
Do you compute prevsubj yourself? And why does the first data set not have 3 obs (1001, 1002 and 1003) ?
No, prevsubj is entered in the way i have shown in the post. I created this test data for testing.
first set has 3 records....Not sure i understand your question. Can you please clarify?
Can you explain more what is the process that is generating this data?
Looks like the pattern is that when you have a missing value of PREVSUBJ you want to find the largest value of SUBJ in that block. Might be easiest to first generate the group numbers.
data have;
input subj prevsubj;
cards;
1001 .
1002 1001
1003 1002
1011 .
1012 1011
2010 .
2011 2010
2012 2011
2013 2012
2014 2013
;
data step1;
set have ;
group + missing(prevsubj);
run;
proc sql;
create table want as
select group, subj, prevsubj, max(subj) as newsubj
from step1
group by group
order by group, subj
;
quit;
Result
Obs group subj prevsubj newsubj 1 1 1001 . 1003 2 1 1002 1001 1003 3 1 1003 1002 1003 4 2 1011 . 1012 5 2 1012 1011 1012 6 3 2010 . 2014 7 3 2011 2010 2014 8 3 2012 2011 2014 9 3 2013 2012 2014 10 3 2014 2013 2014
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.