I have data on about 30000 records, each record can have up to 17 release dates (release1-release17) and up to 17 codes decribing the release type (street1-street17- for whether or not the immate was released to the streets).
So now I want to create 60 new variables: Month1-Month60 covering the 60 months of the data of Jan 2011-Dec 2015, where Month1 is January 2011 and Month60 is December 2015. If the variable street1-street17 is Y that means if was a "true" release and this new variable (MonthY) will be the difference in days for that month and a predefined date until the next release. If multiple release dates occur on the same month, then I need to concatenate these differences. Only 5 release dates per month will be concatenated. Dates are in this format mm/dd/yyy and the predefined date is 12/28/2035.
I have 2 problems with my code- the second release is being put on the wrong month (for example it should be month33 but it is showing on Month 34). And the other problem is that it is not concatenating multiple releases on the same month.
any help will be appreciated, thanks,
malena
here is my code:
*Create a record for each month based on release date;
data have_monthly;
length cat $24.;
length cat2 $24.;
set have_flipped;
N_interval = intck('month', date_type, date_release);
/*If NOT MISSING(date_admission) then do;*/
if N_interval <> . then do;
Month=intnx('month', date_release, i-1, 'b');
Status=1;
cat=catx(',',cat,date_type);
cat2=compress(cat, ',');
if i=(N_interval+1) and release_type='Y' then status=0;
output;
end;
format month yymon7.;
keep studyID month status cat2;
run;
/*This section of code creates the empty data set required for the final data structure
one record per month, per CHIA_random_ID
*/
*Create months;
data empty_months;
start = '01Dec2010'd;
do i=1 to 60;
Month=intnx('month', start, i, 'b');
Monthly_Number = catt("Month", put(i, z2.));
output;
end;
format month yymon7.;
keep month Monthly_Number;
run;
*Get list of IDs;
proc sql;
create table ids as
select distinct id from have;
quit;
*Create monthly by ID table with no status;
proc sql;
create table empty_status as
select a.id, b.*
from empty_months as b
cross join ids as a
order by 1, 2;
quit;
*Add in status from previous calculation;
proc sql;
create table status as
select a.*, b.cat2
from empty_status as a
left join have_monthly as b
on a.id=b.id
and a.month=b.month
order by 1,2;
quit;
*fill down values so values are retained across time;
data status_filled;
length status 8.;
set status;
by id;
retain final_status;
if first.id then final_status=status;
else if not missing(status) then final_status=status;
status=final_status;
if missing (status) then status=0;
drop final_status;
run;
proc sort data=status_filled out= status_filled2 nodupkey equals;
by id monthly_number;
run;
*Transpose back to final wide data structure;
proc transpose data=status_Filled2 out=want(drop= _name_);
by id;
var cat2;
id monthly_number;
idlabel month;
run;
proc sort data=incarcerated out= incarcerated2 nodupkey equals;
by id;
run;
data data_final;
set want;
DOC1=Month01;
DOC2=Month02;
DOC3=Month03;
DOC4=Month04;
DOC5=Month05;
DOC6=Month06;
DOC7=Month07;
DOC8=Month08;
DOC9=Month09;
DOC10=Month10;
DOC11=Month11;
DOC12=Month12;
DOC13=Month13;
DOC14=Month14;
DOC15=Month15;
DOC16=Month16;
DOC17=Month17;
DOC18=Month18;
DOC19=Month19;
DOC20=Month20;
DOC21=Month21;
DOC22=Month22;
DOC23=Month23;
DOC24=Month24;
DOC25=Month25;
DOC26=Month26;
DOC27=Month27;
DOC28=Month28;
DOC29=Month29;
DOC30=Month30;
DOC31=Month31;
DOC32=Month32;
DOC33=Month33;
DOC34=Month34;
DOC35=Month35;
DOC36=Month36;
DOC37=Month37;
DOC38=Month38;
DOC39=Month39;
DOC40=Month40;
DOC41=Month41;
DOC42=Month42;
DOC43=Month43;
DOC44=Month44;
DOC45=Month45;
DOC46=Month46;
DOC47=Month47;
DOC48=Month48;
DOC49=Month49;
DOC50=Month50;
DOC51=Month51;
DOC52=Month52;
DOC53=Month53;
DOC54=Month54;
DOC55=Month55;
DOC56=Month56;
DOC57=Month57;
DOC58=Month58;
DOC59=Month59;
DOC60=Month60;
run;
data Data_final;
merge incarcerated2 DOC_final;
by studyid;
keep id gender age race DOC1-DOC60 times;
run;
You didnt provide any data (or at least I didn't see it), but did see a problem with your code. You have a statement early on:
Month=intnx('month', date_release, i-1, 'b');
but haven't defined the variable i
Art, CEO, AnalystFinder.com
sorry, I defined that earlier... here is a file with a few observations
If you have data in a SAS data set it is much better to post as a DATASTEP. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
If we have to write a method to read your data we may end up with characteristics that do not actually match your data and provide a solution that will not quite work. Also since most of us are unpaid volunteers the amount of time involved with writing and debugging code to read you file may exceed the time some of us are willing to spend.
thanks- will try this to import myfile... here is a quick screenshot
id | Release1 | Release2 | Release3 | Release4 | Release5 | Street1 | Street2 | Street3 | Street4 | Street5 | date_proxy1 | date_proxy2 | date_proxy3 | date_proxy4 | date_proxy5 |
1205 | 1/6/2011 | 9/4/2013 | N | Y | 3461 | 2489 | |||||||||
1402 | 3/16/2011 | 11/18/2011 | 3/4/2013 | N | Y | N | 3392 | 3145 | 2673 | ||||||
2354 | 12/12/2011 | 11/17/2015 | Y | Y | 3121 | 1685 | |||||||||
5810 | 8/31/2011 | 6/7/2012 | 8/8/2012 | 3/29/2013 | 4/5/2013 | Y | Y | Y | Y | Y | 3224 | 2943 | 2881 | 2648 | 2641 |
1018 | 1/13/2012 | 6/14/2012 | 9/19/2012 | 12/3/2012 | 10/15/2013 | Y | Y | Y | Y | Y | 3089 | 2936 | 2839 | 2764 | 2448 |
5850 | 6/17/2013 | 10/21/2013 | 5/1/2014 | 12/5/2014 | 12/10/2014 | N | N | Y | Y | Y | 2568 | 2442 | 2250 | 2032 | 2027 |
Screen shots really a poor. We would have to 1) copy the data and paste into something to reformat it(sometimes problematical), 2) write a data step or other code to import it. At that point we have to make choices that may result in a data set with different characteristics for your variables and a solution that relies on those differences.
sorry, the dateproxy are the calculated values of the difference between release date and inputdate = '06/28/2020'. So for the 17 releases this has already been done.
Now I need to use these data to come up with a dataset in this way- only when Street1-Street17 is Y then put the date_proxy1 in the corresponding month (DOC1-DOC60). When 2 releases or more occu on the same month, concatenate the date proxy in the corresponding month
I don't have the software to upload the sas file here- but here is sample input file
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.