DATA Step, Macro, Functions and more

Search and retrieve data

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 92
Accepted Solution

Search and retrieve data

Hi There,

 

I need help to search a character and retrieve the data that match the character.

 

data have;

input ID $ data1 $ data2 $ data3 $ data4 $;
datalines;
1 ?_B C_D A_C B_D
2 A_A D_? A_A ?_B
3 B_B C_C C_C D_D
4 R_T P_? C_? K_A
;
run;

 

I want to output data that have "?" in it. I have more than 100 variables to search for.

 

I want the output like this:

ID data1 data2 data3 data4
1 ?_B      
2   D_?   ?_B
4   P_? C_?  

 

Thanks in advance!


Accepted Solutions
Highlighted
Solution
‎01-05-2018 04:06 PM
Super User
Super User
Posts: 9,193

Re: Search and retrieve data

This is a common problem, and one caused by the "Excel way of thinking" which seems to permeate throughout SAS programming nowadays.  You have X numbers of fixed columns, and Y number of non-fixed columns, and are now trying to find a way to process them logically.  A simple mindset change to work "In the programmers way of thinking" makes this kind of issue go away.  There are two main data structures, transposed - which is what you have and is useful for reporting out for humans to read, and normalised - which is useful for storage, programming etc.  

So from a storage point of view you have 12 cells to contain 5 data items, which is a waste.  From a programming point of view you have to work out how many elements, scan through each, and of course do this each time you want to use it.  

A far simpler storage would be:

data have;
  input ID $ data1 $ data2 $ data3 $ data4 $;
datalines;
1 ?_B C_D A_C B_D 
2 A_A D_? A_A ?_B 
3 B_B C_C C_C D_D 
4 R_T P_? C_? K_A 
;
run;

proc transpose data=have out=want;
  by id;
  var data:;
run;

data want;
  set want;
  where index(col1,"?")>0;
run;

So normalise the data, then its simply a matter of where clauses and such like to access the data you want.  Compare that to all the array codes and such like given before, how much simpler?

And if you need the transposed results at the end for a report, use another proc transpose to go up again.

View solution in original post


All Replies
PROC Star
Posts: 1,294

Re: Search and retrieve data

[ Edited ]

here you go-

 

data have;

input ID $ data1 $ data2 $ data3 $ data4 $;
datalines;
1 ?_B C_D A_C B_D
2 A_A D_? A_A ?_B
3 B_B C_C C_C D_D
4 R_T P_? C_? K_A
;
run;

data want;
set have;
array t(*) data1-data4;
do _n_=1 to dim(t);
if index(t(_n_),'?')>0 then continue;
else call missing(t(_n_));
end;
if cmiss(of t(*))=dim(t) then delete;
run;

PROC Star
Posts: 1,294

Re: Search and retrieve data

Posted in reply to novinosrin

Notes:

1. 100s of variables with the same pattern with a numeric suffix is easy to list as array elements using variable lists like var1-var100 and so on. I trust you can do that.

 

2. either using variable lists is the best short cut method to specify and compile with the array statement

3. if all vars after the conditional test happens to be missing , would have to be equal to total number of elements in the array to delete the observation

 

Hope that helps

 

Frequent Contributor
Posts: 92

Re: Search and retrieve data

Posted in reply to novinosrin

thanks for the speedy response!

 

it's almost there.

 

as you see that ID 3 does not have any "?" in all variables, so I do not need to retrieve ID 3.

 

again this is what I want:

ID data1 data2 data3 data4
1 ?_B      
2   D_?   ?_B
4   P_? C_?  
PROC Star
Posts: 1,294

Re: Search and retrieve data

[ Edited ]

yes i have edited the code to delete the 3rd obs later. sorry. Please notice the edit. Thank you

 

Please notice this addition in the edit:

 

if cmiss(of t(*))=dim(t) then delete;

Frequent Contributor
Posts: 92

Re: Search and retrieve data

Posted in reply to novinosrin

Very good!

 

I just realize that not all the variables have "?" data, I would like not to retrieve the variables that have no "?".

 

data have;

input ID $ data1 $ data2 $ data3 $ data4 $;
datalines;
1 ?_B C_D A_C B_D
2 A_A D_? A_A ?_B
3 B_B C_C C_C D_D
4 R_T P_? C_J K_A
;

run;

 

the output should look like this:  -- no data3

ID data1 data2 data4
1 ?_B    
2   D_? ?_B
4   P_?  
PROC Star
Posts: 1,294

Re: Search and retrieve data

Do you mean, you want to drop the column from the result if all values of a column is blank?

Frequent Contributor
Posts: 92

Re: Search and retrieve data

Posted in reply to novinosrin

yes, please .

PROC Star
Posts: 1,294

Re: Search and retrieve data

Read through this document https://www.lexjansen.com/nesug/nesug13/90_Final_Paper.pdf

 

while i try something simpler meanwhile

Frequent Contributor
Posts: 92

Re: Search and retrieve data

Posted in reply to novinosrin

Thank you very much!

PROC Star
Posts: 1,294

Re: Search and retrieve data

data have;

input ID $ data1 $ data2 $ data3 $ data4 $;

datalines;

1 ?_B C_D A_C B_D

2 A_A D_? A_A ?_B

3 B_B C_C C_C D_D

4 R_T P_? C_J K_A

;

run;

%macro op_ursula;

data want1;

set have nobs=nobs end=last;

array t(*) data1-data4;

array t1(*) _data1-_data4;

do _n_=1 to dim(t);

if index(t(_n_),'?')>0 then continue;

else call missing(t(_n_));

if missing(t(_n_)) then t1(_n_)+1;

if last then do;

if t1(_n_)=nobs then do;

call symputX('POS'||left(_n_),vname(t(_n_)));

_c=_n_;

end;

end;

end;

call symputX('count',_c);

run;

 

data final_want;

set want1;

drop  %do i=1 %to &count;

%if %symexist(pos&i) %then  &&pos&count;

%end; _: ;

array t(*)      data1-data4;

if cmiss(of t(*))=dim(t) then delete;

run;

 

%mend;

%op_ursula

Frequent Contributor
Posts: 92

Re: Search and retrieve data

Posted in reply to novinosrin

thank you so much for your help.

I wonder why it does not work on my real data, still retrieve all columns even though there are no "?" mark in them.

I do not really understand the codes, but it works on the sample data.

 

I would look into it later.

 

Thank you again.

PROC Star
Posts: 1,294

Re: Search and retrieve data

Go through the code statement by statement thoroughly until you understand. I am gonna sleep now. If you need any help on this thread, I'll look into your requirement when i wake

Highlighted
Solution
‎01-05-2018 04:06 PM
Super User
Super User
Posts: 9,193

Re: Search and retrieve data

This is a common problem, and one caused by the "Excel way of thinking" which seems to permeate throughout SAS programming nowadays.  You have X numbers of fixed columns, and Y number of non-fixed columns, and are now trying to find a way to process them logically.  A simple mindset change to work "In the programmers way of thinking" makes this kind of issue go away.  There are two main data structures, transposed - which is what you have and is useful for reporting out for humans to read, and normalised - which is useful for storage, programming etc.  

So from a storage point of view you have 12 cells to contain 5 data items, which is a waste.  From a programming point of view you have to work out how many elements, scan through each, and of course do this each time you want to use it.  

A far simpler storage would be:

data have;
  input ID $ data1 $ data2 $ data3 $ data4 $;
datalines;
1 ?_B C_D A_C B_D 
2 A_A D_? A_A ?_B 
3 B_B C_C C_C D_D 
4 R_T P_? C_? K_A 
;
run;

proc transpose data=have out=want;
  by id;
  var data:;
run;

data want;
  set want;
  where index(col1,"?")>0;
run;

So normalise the data, then its simply a matter of where clauses and such like to access the data you want.  Compare that to all the array codes and such like given before, how much simpler?

And if you need the transposed results at the end for a report, use another proc transpose to go up again.

Frequent Contributor
Posts: 92

Re: Search and retrieve data

Thanks so much for the simpler codes!

 

yes, it works!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 15 replies
  • 257 views
  • 4 likes
  • 4 in conversation