DATA Step, Macro, Functions and more

Concatenation difference in days

Reply
Contributor
Posts: 40

Concatenation difference in days

I have data on about 30000 records, each record can have up to 17 release dates (release1-release17) and up to 17 codes decribing the release type (street1-street17- for whether or not the immate was released to the streets).  

 

So now I want to create 60 new variables: Month1-Month60 covering the 60 months of the data of Jan 2011-Dec 2015, where Month1 is January 2011 and Month60 is December 2015. If the variable street1-street17 is Y that means if was a "true" release and this new variable (MonthY) will be the difference in days for that month and a predefined date until the next release. If multiple release dates occur on the same month, then I need to concatenate these differences. Only 5 release dates per month will be concatenated. Dates are in this format mm/dd/yyy and the predefined date is 12/28/2035.

 

I have 2 problems with my code- the second release is being put on the wrong month (for example it should be month33 but it is showing on Month 34). And the other problem is that it is not concatenating multiple releases on the same month.

 

any help will be appreciated, thanks,

malena

 

 

here is my code:

 

 


*Create a record for each month based on  release date;

data have_monthly;
length cat $24.;
length cat2 $24.;

set have_flipped;

N_interval = intck('month', date_type, date_release);

/*If NOT MISSING(date_admission) then do;*/

if N_interval <> . then do;

Month=intnx('month', date_release, i-1, 'b');
Status=1;
cat=catx(',',cat,date_type);
cat2=compress(cat, ',');

if i=(N_interval+1) and release_type='Y' then status=0;

output;
end;

format month yymon7.;

keep studyID month status cat2;
run;

/*This section of code creates the empty data set required for the final data structure
one record per month, per CHIA_random_ID
*/

*Create months;
data empty_months;
start = '01Dec2010'd;
do i=1 to 60;
Month=intnx('month', start, i, 'b');
Monthly_Number = catt("Month", put(i, z2.));
output;
end;
format month yymon7.;
keep month Monthly_Number;
run;

*Get list of IDs;
proc sql;
create table ids as
select distinct id from have;
quit;

*Create monthly by ID table with no status;
proc sql;
create table empty_status as
select a.id, b.*
from empty_months as b
cross join ids as a
order by 1, 2;
quit;

*Add in status from previous calculation;
proc sql;
create table status as
select a.*, b.cat2
from empty_status as a
left join have_monthly as b
on a.id=b.id
and a.month=b.month
order by 1,2;
quit;


*fill down values so values are retained across time;
data status_filled;
length status 8.;
set status;
by id;
retain final_status;
if first.id then final_status=status;
else if not missing(status) then final_status=status;
status=final_status;

if missing (status) then status=0;
drop final_status;
run;


proc sort data=status_filled out= status_filled2 nodupkey equals;
by id monthly_number;
run;

*Transpose back to final wide data structure;
proc transpose data=status_Filled2 out=want(drop= _name_);
by id;
var cat2;
id monthly_number;
idlabel month;
run;

proc sort data=incarcerated out= incarcerated2 nodupkey equals;
by id;
run;

 

 

data data_final;
set want;
DOC1=Month01;
DOC2=Month02;
DOC3=Month03;
DOC4=Month04;
DOC5=Month05;
DOC6=Month06;
DOC7=Month07;
DOC8=Month08;
DOC9=Month09;
DOC10=Month10;
DOC11=Month11;
DOC12=Month12;
DOC13=Month13;
DOC14=Month14;
DOC15=Month15;
DOC16=Month16;
DOC17=Month17;
DOC18=Month18;
DOC19=Month19;
DOC20=Month20;
DOC21=Month21;
DOC22=Month22;
DOC23=Month23;
DOC24=Month24;
DOC25=Month25;
DOC26=Month26;
DOC27=Month27;
DOC28=Month28;
DOC29=Month29;
DOC30=Month30;
DOC31=Month31;
DOC32=Month32;
DOC33=Month33;
DOC34=Month34;
DOC35=Month35;
DOC36=Month36;
DOC37=Month37;
DOC38=Month38;
DOC39=Month39;
DOC40=Month40;
DOC41=Month41;
DOC42=Month42;
DOC43=Month43;
DOC44=Month44;
DOC45=Month45;
DOC46=Month46;
DOC47=Month47;
DOC48=Month48;
DOC49=Month49;
DOC50=Month50;
DOC51=Month51;
DOC52=Month52;
DOC53=Month53;
DOC54=Month54;
DOC55=Month55;
DOC56=Month56;
DOC57=Month57;
DOC58=Month58;
DOC59=Month59;
DOC60=Month60;


run;

data Data_final;
merge incarcerated2  DOC_final;
by studyid;
keep id gender age race DOC1-DOC60 times;
run;

 

PROC Star
Posts: 7,468

Re: Concatenation difference in days

You didnt provide any data (or at least I didn't see it), but did see a problem with your code. You have a statement early on:

 

Month=intnx('month', date_release, i-1, 'b');

 

but haven't defined the variable    i

 

Art, CEO, AnalystFinder.com

 

Contributor
Posts: 40

Re: Concatenation difference in days

[ Edited ]

sorry, I defined that earlier... here is a file with a few observations

Super User
Posts: 11,343

Re: Concatenation difference in days

If you have data in a SAS data set it is much better to post as a DATASTEP. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

If we have to write a method to read your data we may end up with characteristics that do not actually match your data and provide a solution that will not quite work. Also since most of us are unpaid volunteers the amount of time involved with writing and debugging code to read you file may exceed the time some of us are willing to spend.

 

Contributor
Posts: 40

Re: Concatenation difference in days

thanks- will try this to import myfile... here is a quick screenshot

 

id Release1 Release2 Release3 Release4 Release5 Street1 Street2 Street3 Street4 Street5 date_proxy1 date_proxy2 date_proxy3 date_proxy4 date_proxy5
1205 1/6/2011 9/4/2013       N Y       3461 2489      
1402 3/16/2011 11/18/2011 3/4/2013     N Y N     3392 3145 2673    
2354 12/12/2011 11/17/2015       Y Y       3121 1685      
5810 8/31/2011 6/7/2012 8/8/2012 3/29/2013 4/5/2013 Y Y Y Y Y 3224 2943 2881 2648 2641
1018 1/13/2012 6/14/2012 9/19/2012 12/3/2012 10/15/2013 Y Y Y Y Y 3089 2936 2839 2764 2448
5850 6/17/2013 10/21/2013 5/1/2014 12/5/2014 12/10/2014 N N Y Y Y 2568 2442 2250 2032 2027
Super User
Posts: 11,343

Re: Concatenation difference in days

Screen shots really a poor. We would have to 1) copy the data and paste into something to reformat it(sometimes problematical), 2) write a data step or other code to import it. At that point we have to make choices that may result in a data set with different characteristics for your variables and a solution that relies on those differences.

Contributor
Posts: 57

Re: Concatenation difference in days

Can you share above .txt file into csv format.
Thanks
Contributor
Posts: 40

Re: Concatenation difference in days

Posted in reply to lakshmi_74
Contributor
Posts: 57

Re: Concatenation difference in days

I am bit confusion what you are looking for. as your data set is having date_proxy variables.Please can you explain what you want get the output from the test.csv as a result.can you give brief explain? Thanks
Contributor
Posts: 40

Re: Concatenation difference in days

[ Edited ]
Posted in reply to lakshmi_74

sorry, the dateproxy are the calculated values of the difference between release date and inputdate = '06/28/2020'. So for the 17 releases this has already been done.

 

Now I need to use these data to come up with a dataset in this way- only when Street1-Street17 is Y then put the date_proxy1 in the corresponding month (DOC1-DOC60). When 2 releases or more occu on the same month, concatenate the date proxy in the corresponding month

Contributor
Posts: 40

Re: Concatenation difference in days

I don't have the software to upload the sas file here- but here is sample input file

Ask a Question
Discussion stats
  • 10 replies
  • 180 views
  • 0 likes
  • 4 in conversation