Copying the values from SAS variable, a whole column as text into a data step

Accepted Solution Solved
Reply
Contributor Kc2
Contributor
Posts: 46
Accepted Solution

Copying the values from SAS variable, a whole column as text into a data step

[ Edited ]

Hello,

 

I have a dataset with 2 variables, id and reason. the dataset has 240 rows. The issue I have is that the variable reason has date values in it but those values are not consistent in format:

e.g  HAVE

ID   REASON

1) Multiple calls 14May2018, 15May2018, 16May2018, 17May2018, 18May2018.

2) No contact made, participant had surgery on 20 Oct 2017

3) No response

4) no answer  18-19Jan18

5) no info not contactable 16/11/17.

 

I need to create a third variable and extract the date from the variable "reason" when available.

WANT

1)14May2018

1)15May2018

1)16May2018

1)17May2018

1)18May2018

2)20Oct17

3)

4)18Jan2018

4)19Jan2018

5)16Nov2017.

 

if there a way to copy  the content of the 240 rows of the variable reason in a subsequent data step in order to extract the date and assign he dates into a new variable date. The issue I am having sis that the dates are not in an identical format sometimes it is 16/11/17 , sometimes 12 dec, other times 18-19 jan


Accepted Solutions
Solution
3 weeks ago
Super User
Super User
Posts: 9,599

Re: Copying the values from SAS variable, a whole column as text into a data step

Manually may be the only option, consider the case:

5) no info not contactable 08/06/17

Is this 08Jun or 06Aug?  There are a fair few other examples out there, different formats methods of putting dates etc.

 

You could do some things in code:

start=substr(reason,findc(reason,"","d"));

For instance would find the first numeric in the string, if they are all like your example, that would trim it down a bit.  Even so, by the time you have programmed each check, it would be quicker to do it manually.  This is why free text fields are frowned upon on stats and databasing, next to impossible to get anything useful out of it.

View solution in original post


All Replies
Regular Contributor
Posts: 164

Re: Coying the values from SAS variable, a whole column as text into a data step

i don't understand your final sentence ie your Q. You just have to write code eg substr and index functions to extract the date data and then derive the new date variable

--------------
blog: papersandprograms.com
Super User
Posts: 23,776

Re: Copying the values from SAS variable, a whole column as text into a data step

There isn’t an automated method to do this. For 240 records I’d almost consider doing it manually, probably about an hour worth of work. 

 

To do it programming - which is better in the long run - first isolate all your different type of entry. 

Handle the simplest first and work through them. 

 

Use SCAN to isolate each word in the statement and then test if it’s a date using INPUT() with the ANYDTE informat. That’ll probably take care of most of your cases. 

 

PROC FREQ, and text functions will be useful. 

Find, index, translate, compress, substr, scan, countw are some that you’ll need. 

 


@Kc2 wrote:

Hello,

 

I have a dataset with 2 variables, id and reason. the dataset has 240 rows. The issue I have is that the variable reason has date values in it but those values are not consistent in format:

e.g  HAVE

ID   REASON

1) Multiple calls 14May2018, 15May2018, 16May2018, 17May2018, 18May2018.

2) No contact made, participant had surgery on 20 Oct 2017

3) No response

4) no answer  18-19Jan18

5) no info not contactable 16/11/17.

 

I need to create a third variable and extract the date from the variable "reason" when available.

WANT

1)14May2018

1)15May2018

1)16May2018

1)17May2018

1)18May2018

2)20Oct17

3)

4)18Jan2018

4)19Jan2018

5)16Nov2017.

 

if there a way to copy  the content of the 240 rows of the variable reason in a subsequent data step in order to extract the date and assign he dates into a new variable date. The issue I am having sis that the dates are not in an identical format sometimes it is 16/11/17 , sometimes 12 dec, other times 18-19 jan


 

 

Solution
3 weeks ago
Super User
Super User
Posts: 9,599

Re: Copying the values from SAS variable, a whole column as text into a data step

Manually may be the only option, consider the case:

5) no info not contactable 08/06/17

Is this 08Jun or 06Aug?  There are a fair few other examples out there, different formats methods of putting dates etc.

 

You could do some things in code:

start=substr(reason,findc(reason,"","d"));

For instance would find the first numeric in the string, if they are all like your example, that would trim it down a bit.  Even so, by the time you have programmed each check, it would be quicker to do it manually.  This is why free text fields are frowned upon on stats and databasing, next to impossible to get anything useful out of it.

Super User
Posts: 10,787

Re: Copying the values from SAS variable, a whole column as text into a data step

It is very difficulty ,you need consider  all the pattern of DATE.

 

data have;
infile cards truncover;
input id $ reason $80.;
cards;
1)  Multiple calls 14May2018, 15May2018, 16May2018, 17May2018, 18May2018.
2)  No contact made, participant had surgery on 20 Oct 2017
3)  No response
4)  no answer  18-19Jan18
5)  no info not contactable 16/11/17
;
run;

data want;
 set have;
 pid=prxparse('/\d+\s*[a-z]{3}\s*\d{2,}|\d+\-\d+[a-z]{3}\d+|\d+\/\d+\/\d+/i');
 s=1;
 e=length(reason);
 call prxnext(pid,s,e,reason,p,l);
 do while(p>0);
   want=substr(reason,p,l);output;
   call prxnext(pid,s,e,reason,p,l);
 end;
 drop s e pid p l;
run;

proc print noobs;run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 140 views
  • 1 like
  • 5 in conversation