BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sathish_jammy
Lapis Lazuli | Level 10

Dear Experts,

 

I have to get the Cancer_onset_Date(Date column) using the (Age) month day year specified in the text column.

I shared the sample dataset for your reference.

 

data aaa;
Input ID DOB mmddyy10. Comments$90.;
format DOB mmddyy10.;
cards;
183556 03/13/1965 Cancer since 6 Months detected at the age of  53 Years 5 Month 13 Days in the yr 2018
452890 08/02/1950 Cancer since 25 years. Detected at the age of 43 years in the year 1994.
run;

All I need is to find the date of the IDs which fall under that age month days year.

If the month/date is not specified, Replace the month/date as same as from the IDs DOB.

 

Thanks in advance!

 

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @Sathish_jammy 

 

Here is an attempt to solve your problem:

data aaa2;
	set aaa;
	format Cancer_onset_Date mmddyy10.;

	/* Retrieve year of detection*/	
	if prxmatch('/^(.*)(\d{4})(.*)$/i',Comments) then 
		comments_year_detect  = prxchange('s/(.*)(\d{4})(.*)/$2/',1,Comments);

	/* Retrieve age of detection year, month and day*/
		/*NB: the following research is case insensitive*/
		/*Assumption 1: the Year is preceded by 'detected at the age of'*/
		/*Some spelling variants have been considered: e.g. Years|Yrs|Year|Yr*/
		
	if prxmatch('/^(.*)(detected at the age of)\s+(\d|\d{2})(\sYear)(.*)$/i',Comments) then 
		Cancer_onset_Date_y = prxchange('s/^(.*)(detected at the age of)\s+(\d|\d{2})(\sYear)(.*)$/$3/i',1,Comments);
	if prxmatch('/^(.*)(Years|Yrs|Year|Yr)\s+(\d|\d{2})\s+(Month)(.*)$/i',Comments) then 
		Cancer_onset_Date_m = prxchange('s/^(.*)(Years|Yrs|Year|Yr)\s+(\d|\d{2})\s+(Month)(.*)$/$3/i',1,Comments);
	if prxmatch('/^(.*)(Months|Mths|Month|Mth)\s+(\d|\d{2})\s+(Day)(.*)$/i',Comments) then 
		Cancer_onset_Date_d = prxchange('s/^(.*)(Months|Mths|Month|Mth)\s+(\d|\d{2})\s+(Day)(.*)$/$3/i',1,Comments);

	/* Compute Cancer_onset_Date. If the month/date is not specified, replace the month/date as same as from the IDs DOB */
	
	Cancer_onset_Date = intnx('year',DOB,Cancer_onset_Date_y,"s");
	if Cancer_onset_Date_m ne "" then Cancer_onset_Date = intnx('month',Cancer_onset_Date,Cancer_onset_Date_m,"s");
	if Cancer_onset_Date_d ne "" then Cancer_onset_Date = intnx('day',Cancer_onset_Date,Cancer_onset_Date_d,"s");

run;

 Capture d’écran 2020-01-06 à 09.59.48.png

 

View solution in original post

3 REPLIES 3
KachiM
Rhodochrosite | Level 12

@Sathish_jammy ,

 

The COMMENT is split into parts for Year, Month and Days. Here is one way with explicit extraction.

 

data want;
   set aaa;
   length str $90 strleft $30;
   str = substr(Comments, find(Comments, 'age'));

   /* Find Age */
   k = find(upcase(str), 'YEARS');
   strleft = substr(str,1, k-1);
   str = substr(str, k+5);
   Age = compress(strleft,' ','kd');

   /* find Month */
   k = find(upcase(str), 'MONTH');
   if k then do;
      strleft = substr(str,1, k-1);
      Month = compress(strleft,' ','kd');
      str = substr(str, k+6);
   end;
   else Month = put(month(DOB),8.);

   /* find Day */
   k = find(upcase(str), 'DAYS');
   if k then do;
      strleft = substr(str,1, k-1);
      Day = compress(strleft,' ','kd');
   end;
   else Day = put(day(DOB),8.);

   *put Age = Month = Day =;
keep ID DOB Age Month Day;
run;

If this does not meet your need, please clarify.

ed_sas_member
Meteorite | Level 14

Hi @Sathish_jammy 

 

Here is an attempt to solve your problem:

data aaa2;
	set aaa;
	format Cancer_onset_Date mmddyy10.;

	/* Retrieve year of detection*/	
	if prxmatch('/^(.*)(\d{4})(.*)$/i',Comments) then 
		comments_year_detect  = prxchange('s/(.*)(\d{4})(.*)/$2/',1,Comments);

	/* Retrieve age of detection year, month and day*/
		/*NB: the following research is case insensitive*/
		/*Assumption 1: the Year is preceded by 'detected at the age of'*/
		/*Some spelling variants have been considered: e.g. Years|Yrs|Year|Yr*/
		
	if prxmatch('/^(.*)(detected at the age of)\s+(\d|\d{2})(\sYear)(.*)$/i',Comments) then 
		Cancer_onset_Date_y = prxchange('s/^(.*)(detected at the age of)\s+(\d|\d{2})(\sYear)(.*)$/$3/i',1,Comments);
	if prxmatch('/^(.*)(Years|Yrs|Year|Yr)\s+(\d|\d{2})\s+(Month)(.*)$/i',Comments) then 
		Cancer_onset_Date_m = prxchange('s/^(.*)(Years|Yrs|Year|Yr)\s+(\d|\d{2})\s+(Month)(.*)$/$3/i',1,Comments);
	if prxmatch('/^(.*)(Months|Mths|Month|Mth)\s+(\d|\d{2})\s+(Day)(.*)$/i',Comments) then 
		Cancer_onset_Date_d = prxchange('s/^(.*)(Months|Mths|Month|Mth)\s+(\d|\d{2})\s+(Day)(.*)$/$3/i',1,Comments);

	/* Compute Cancer_onset_Date. If the month/date is not specified, replace the month/date as same as from the IDs DOB */
	
	Cancer_onset_Date = intnx('year',DOB,Cancer_onset_Date_y,"s");
	if Cancer_onset_Date_m ne "" then Cancer_onset_Date = intnx('month',Cancer_onset_Date,Cancer_onset_Date_m,"s");
	if Cancer_onset_Date_d ne "" then Cancer_onset_Date = intnx('day',Cancer_onset_Date,Cancer_onset_Date_d,"s");

run;

 Capture d’écran 2020-01-06 à 09.59.48.png

 

andreas_lds
Jade | Level 19

With the data you have posted, the following steps seems to extract the data you need:

 

data want;
   set aaa;

   length 
      rx_year rx_month_day 8
      year month day 8
   ;

   rx_year = prxparse('/(year|yr) (\d+)/i');
   rx_month_day = prxparse('/years (\d+) month (\d+) days/i');

   comments = compbl(comments);

   if prxmatch(rx_year, comments) then do;
      year = input(prxposn(rx_year, 2, comments), ? 4.);
   end;

   if prxmatch(rx_month_day, comments) then do;
      month = input(prxposn(rx_month_day, 1, comments), ? 2.);
      day = input(prxposn(rx_month_day, 2, comments), ? 2.);
   end;
   else do;
      month = month(DOB);
      day = day(dob);
   end;


   drop rx_:;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 434 views
  • 3 likes
  • 4 in conversation