BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
LFern
Obsidian | Level 7

Hello, I have the following:

data have;
    input original_string $1-35;
    datalines;
I was 12 yearsold I am 50 yearsold
;
run;

I want to achieve the following:

data want;
    input desired_age1 desired_age2;
    datalines;
12 50
;
run;

 

I figured I need to use "SCAN" like below, except I need to figure out how to loop it so that I can capture multiple ages. 

data want;
set have;
do i=1 to countw(original_string,"");
	if scan(original_string,i,"")="yearsold" then do;
	  desired_age1=scan(original_string,i-1,"");
	end;
end;
run;


Any thoughts? I appreciate any help.

1 ACCEPTED SOLUTION

Accepted Solutions
Amir
PROC Star

Hi @LFern,

 

Does the following do what you require?

 

/* set up input data */
data have;
   infile datalines truncover;
   input original_string $1-100;
   datalines;
I was 12 yearsold I am 50 yearsold
I was 22 yearsold
I am 33 yearsold
I was 44 yearsold I am 55 yearsold tomorrow I'm 56 yearsold
;


/* output each age to a new observation */
data want1;
   set have;

   do i = 1 to countw(original_string,"");
   	if scan(original_string,i,"") eq "yearsold" then
      do;
         desired_age = scan(original_string,i-1,"");
         output;
   	end;
   end;
run;


/* transpose desired_age by original_string */
proc transpose data   = want1
               out    = want2
               name   = original_string
               prefix = desired_age
               ;
   by original_string notsorted;
   var desired_age;
run;

 

Kind regards,

Amir.

View solution in original post

8 REPLIES 8
Shmuel
Garnet | Level 18

For the sample you have posted your code should work. Have you tried it?

To get a better solution we need to see more test data and possible combinations.

LFern
Obsidian | Level 7

Hi Shmuel,

My posted code only yields "Desired_age1=50".

But I also need a variable called "Desired_age2=12" and I dont know how to get that second age.

Shmuel
Garnet | Level 18

You are right. I have fixed your code:

data want;
set have;
    array age {*} desired_age1 - desired_age2;
	do i=1 to countw(original_string,"");
		if scan(original_string,i,"")="yearsold" then do;
		  if i  le dim(age) then
		     age(i)=scan(original_string,i-1,"");
		end;
	end;
run;
LFern
Obsidian | Level 7

Hello again,

Sorry to bother. The code you provided displays missing values.

SAS_want.PNG

I'm not familiar with arrays but I did some quick research and I think the reason your code isn't working is because "Desired_Age1" & "Desired_Age2" are not variables in the "HAVE" dataset.

Shmuel
Garnet | Level 18

@LFern , you are right. I fixed the code and tested it:

data have;
    input original_string $1-35;
    datalines;
I was 12 yearsold I am 50 yearsold
;
run;
data want;
set have;
    array age {*} desired_age1 - desired_age2;
    j=0;
	do i=1 to countw(original_string);
		if scan(original_string,i)="yearsold" then do;
		  j+1;
		  if j le dim(age) then 
		     age(j)=input(scan(original_string,i-1),best2.); 
		end;
	end;
	drop i j;
run;
/* space is a default delimiter of scan() function */


Amir
PROC Star

Hi @LFern,

 

Does the following do what you require?

 

/* set up input data */
data have;
   infile datalines truncover;
   input original_string $1-100;
   datalines;
I was 12 yearsold I am 50 yearsold
I was 22 yearsold
I am 33 yearsold
I was 44 yearsold I am 55 yearsold tomorrow I'm 56 yearsold
;


/* output each age to a new observation */
data want1;
   set have;

   do i = 1 to countw(original_string,"");
   	if scan(original_string,i,"") eq "yearsold" then
      do;
         desired_age = scan(original_string,i-1,"");
         output;
   	end;
   end;
run;


/* transpose desired_age by original_string */
proc transpose data   = want1
               out    = want2
               name   = original_string
               prefix = desired_age
               ;
   by original_string notsorted;
   var desired_age;
run;

 

Kind regards,

Amir.

LFern
Obsidian | Level 7

This was fantastic, thank you so much!

Ksharp
Super User
data have;
   infile datalines truncover;
   input original_string $1-100;
   datalines;
I was 12 yearsold I am 50 yearsold
I was 22 yearsold
I am 33 yearsold
I was 44 yearsold I am 55 yearsold tomorrow I'm 56 yearsold
;

data temp;
 set have;
 id+1;
 pid=prxparse('/\d+\s*(?=yearsold)/i');
 s=1;e=length(original_string);
 call prxnext(pid,s,e,original_string,p,l);
 do while(p>0);
  want=substr(original_string,p,l);output;
  call prxnext(pid,s,e,original_string,p,l);
 end;
 drop s e p l pid;
run;
proc transpose data=temp out=want(drop=_:);
by id original_string;
var want;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 664 views
  • 0 likes
  • 4 in conversation