BookmarkSubscribeRSS Feed
malena
Calcite | Level 5

I have data on about 30000 records, each record can have up to 17 release dates (release1-release17) and up to 17 codes decribing the release type (street1-street17- for whether or not the immate was released to the streets).  

 

So now I want to create 60 new variables: Month1-Month60 covering the 60 months of the data of Jan 2011-Dec 2015, where Month1 is January 2011 and Month60 is December 2015. If the variable street1-street17 is Y that means if was a "true" release and this new variable (MonthY) will be the difference in days for that month and a predefined date until the next release. If multiple release dates occur on the same month, then I need to concatenate these differences. Only 5 release dates per month will be concatenated. Dates are in this format mm/dd/yyy and the predefined date is 12/28/2035.

 

I have 2 problems with my code- the second release is being put on the wrong month (for example it should be month33 but it is showing on Month 34). And the other problem is that it is not concatenating multiple releases on the same month.

 

any help will be appreciated, thanks,

malena

 

 

here is my code:

 

 


*Create a record for each month based on  release date;

data have_monthly;
length cat $24.;
length cat2 $24.;

set have_flipped;

N_interval = intck('month', date_type, date_release);

/*If NOT MISSING(date_admission) then do;*/

if N_interval <> . then do;

Month=intnx('month', date_release, i-1, 'b');
Status=1;
cat=catx(',',cat,date_type);
cat2=compress(cat, ',');

if i=(N_interval+1) and release_type='Y' then status=0;

output;
end;

format month yymon7.;

keep studyID month status cat2;
run;

/*This section of code creates the empty data set required for the final data structure
one record per month, per CHIA_random_ID
*/

*Create months;
data empty_months;
start = '01Dec2010'd;
do i=1 to 60;
Month=intnx('month', start, i, 'b');
Monthly_Number = catt("Month", put(i, z2.));
output;
end;
format month yymon7.;
keep month Monthly_Number;
run;

*Get list of IDs;
proc sql;
create table ids as
select distinct id from have;
quit;

*Create monthly by ID table with no status;
proc sql;
create table empty_status as
select a.id, b.*
from empty_months as b
cross join ids as a
order by 1, 2;
quit;

*Add in status from previous calculation;
proc sql;
create table status as
select a.*, b.cat2
from empty_status as a
left join have_monthly as b
on a.id=b.id
and a.month=b.month
order by 1,2;
quit;


*fill down values so values are retained across time;
data status_filled;
length status 8.;
set status;
by id;
retain final_status;
if first.id then final_status=status;
else if not missing(status) then final_status=status;
status=final_status;

if missing (status) then status=0;
drop final_status;
run;


proc sort data=status_filled out= status_filled2 nodupkey equals;
by id monthly_number;
run;

*Transpose back to final wide data structure;
proc transpose data=status_Filled2 out=want(drop= _name_);
by id;
var cat2;
id monthly_number;
idlabel month;
run;

proc sort data=incarcerated out= incarcerated2 nodupkey equals;
by id;
run;

 

 

data data_final;
set want;
DOC1=Month01;
DOC2=Month02;
DOC3=Month03;
DOC4=Month04;
DOC5=Month05;
DOC6=Month06;
DOC7=Month07;
DOC8=Month08;
DOC9=Month09;
DOC10=Month10;
DOC11=Month11;
DOC12=Month12;
DOC13=Month13;
DOC14=Month14;
DOC15=Month15;
DOC16=Month16;
DOC17=Month17;
DOC18=Month18;
DOC19=Month19;
DOC20=Month20;
DOC21=Month21;
DOC22=Month22;
DOC23=Month23;
DOC24=Month24;
DOC25=Month25;
DOC26=Month26;
DOC27=Month27;
DOC28=Month28;
DOC29=Month29;
DOC30=Month30;
DOC31=Month31;
DOC32=Month32;
DOC33=Month33;
DOC34=Month34;
DOC35=Month35;
DOC36=Month36;
DOC37=Month37;
DOC38=Month38;
DOC39=Month39;
DOC40=Month40;
DOC41=Month41;
DOC42=Month42;
DOC43=Month43;
DOC44=Month44;
DOC45=Month45;
DOC46=Month46;
DOC47=Month47;
DOC48=Month48;
DOC49=Month49;
DOC50=Month50;
DOC51=Month51;
DOC52=Month52;
DOC53=Month53;
DOC54=Month54;
DOC55=Month55;
DOC56=Month56;
DOC57=Month57;
DOC58=Month58;
DOC59=Month59;
DOC60=Month60;


run;

data Data_final;
merge incarcerated2  DOC_final;
by studyid;
keep id gender age race DOC1-DOC60 times;
run;

 

10 REPLIES 10
art297
Opal | Level 21

You didnt provide any data (or at least I didn't see it), but did see a problem with your code. You have a statement early on:

 

Month=intnx('month', date_release, i-1, 'b');

 

but haven't defined the variable    i

 

Art, CEO, AnalystFinder.com

 

malena
Calcite | Level 5

sorry, I defined that earlier... here is a file with a few observations

ballardw
Super User

If you have data in a SAS data set it is much better to post as a DATASTEP. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

If we have to write a method to read your data we may end up with characteristics that do not actually match your data and provide a solution that will not quite work. Also since most of us are unpaid volunteers the amount of time involved with writing and debugging code to read you file may exceed the time some of us are willing to spend.

 

malena
Calcite | Level 5

thanks- will try this to import myfile... here is a quick screenshot

 

id Release1 Release2 Release3 Release4 Release5 Street1 Street2 Street3 Street4 Street5 date_proxy1 date_proxy2 date_proxy3 date_proxy4 date_proxy5
1205 1/6/2011 9/4/2013       N Y       3461 2489      
1402 3/16/2011 11/18/2011 3/4/2013     N Y N     3392 3145 2673    
2354 12/12/2011 11/17/2015       Y Y       3121 1685      
5810 8/31/2011 6/7/2012 8/8/2012 3/29/2013 4/5/2013 Y Y Y Y Y 3224 2943 2881 2648 2641
1018 1/13/2012 6/14/2012 9/19/2012 12/3/2012 10/15/2013 Y Y Y Y Y 3089 2936 2839 2764 2448
5850 6/17/2013 10/21/2013 5/1/2014 12/5/2014 12/10/2014 N N Y Y Y 2568 2442 2250 2032 2027
ballardw
Super User

Screen shots really a poor. We would have to 1) copy the data and paste into something to reformat it(sometimes problematical), 2) write a data step or other code to import it. At that point we have to make choices that may result in a data set with different characteristics for your variables and a solution that relies on those differences.

lakshmi_74
Quartz | Level 8
Can you share above .txt file into csv format.
Thanks
malena
Calcite | Level 5
 
lakshmi_74
Quartz | Level 8
I am bit confusion what you are looking for. as your data set is having date_proxy variables.Please can you explain what you want get the output from the test.csv as a result.can you give brief explain? Thanks
malena
Calcite | Level 5

sorry, the dateproxy are the calculated values of the difference between release date and inputdate = '06/28/2020'. So for the 17 releases this has already been done.

 

Now I need to use these data to come up with a dataset in this way- only when Street1-Street17 is Y then put the date_proxy1 in the corresponding month (DOC1-DOC60). When 2 releases or more occu on the same month, concatenate the date proxy in the corresponding month

malena
Calcite | Level 5

I don't have the software to upload the sas file here- but here is sample input file

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2603 views
  • 0 likes
  • 4 in conversation