BookmarkSubscribeRSS Feed
Anita_n
Pyrite | Level 9

Dear all, 

if I have a dataset that looks like this:

 

 

data have;
	infile datalines;
	input var1 $ var2 $ var3 $ var4 $ ;
	datalines;
Y  Y  N  N 
N  Y  N  Y
N  N  N  N
N  Y  Y  Y
N  Y  N  Y
Y  Y  Y  Y
; run;

I want to select only the N values of each var. Can anyone help?

 

 

This is how the result should look like

data want;
	infile datalines truncover;
	input var1 $ var2 $ var3 $ var4 $ ;
	datalines;
      N  N
N     N   
N  N  N  N
N         
N     N   
          
;
run;

 

11 REPLIES 11
PeterClemmensen
Tourmaline | Level 20

Try this

 

data have;
	infile datalines;
	input var1 $ var2 $ var3 $ var4 $ ;
	datalines;
Y  Y  N  N 
N  Y  N  Y
N  N  N  N
N  Y  Y  Y
N  Y  N  Y
Y  Y  Y  Y
;
run;

data want;
   set have;
   array v var:;
   do over v;
      if v = 'Y' then v = '';
   end;
run;

 

Result:

 

var1  var2  var3  var4
            N     N
N           N 
N     N     N     N
N   
N           N 

Anita_n
Pyrite | Level 9

@PeterClemmensen , thanks for the quick reply. I have a question does this select all variables whose corresponding to values of N or does this just delect the value Y.  I want to know this because I have over 1000 variables in the dataset and I just need only the values of all variables having var1-var4 = N

PeterClemmensen
Tourmaline | Level 20

@Anita_n This code goes through all variables with the prefix 'var' and sets the value to ' ' when the original value is 'Y'.

Anita_n
Pyrite | Level 9

@PeterClemmensen okay, I tested that and realise it just delects Y but I want it to select N and its corresponding values like for example if this were to be my data have

 

data have;
	infile datalines;
	input var1 $ num1 var2 $ num2 var3 $ num3 var4 $  num4;
	datalines;
Y 1 Y 7 N 1 N 8
N 2 Y 2 N 2 Y 7
N 3 N 1 N 5 N 6
N 4 Y 2 Y 5 Y 7
N 5 Y 3 N 4 Y 4
Y 6 Y 4 Y 3 Y 9
;
run;

My data want should be like this:

 

data want;
	infile datalines truncover;
	input var1 $ num1 var2 $ num2 var3 $ num3 var4 $  num4;
	datalines;
        N 1 N 8
N 2     N 2    
N 3 N 1 N 5 N 6
N 4            
N 5     N 4    
               
;
run;

ed 

ballardw
Super User

@Anita_n wrote:

@PeterClemmensen okay, I tested that and realise it just delects Y but I want it to select N and its corresponding values like for example if this were to be my data have


If a variable is in a SAS data set at all then there will be a value for it, which means missing with what you show for a "want".

The variable exists so is must have something.

It is much easier to set to missing for the unwanted than to attempt to "keep" the desired values.

data want;
   set have;
   array v (*)  var1- var4;
   array nn (*)  num1- num4;
   do i=1 to dim(v);
      if v[i]='Y' then call missing(v[i],nn[i]);
   end;
   drop i;
run;

 

Anita_n
Pyrite | Level 9

Is there no way to do that using proc sql. I don't know if arrays can be used in proc sql. I know using OR or AND doesn't work but aybe there is a specific way doing that

proc sql;
create table want as select * from have where var1="N" or var2="N" or var3="N" and var4="N";
quit;

 

 

ballardw
Super User

@Anita_n wrote:

Is there no way to do that using proc sql. I don't know if arrays can be used in proc sql. I know using OR or AND doesn't work but aybe there is a specific way doing that

proc sql;
create table want as select * from have where var1="N" or var2="N" or var3="N" and var4="N";
quit;

 

 


SQL does not support arrays.

You would need to write 8 Case statements, one for each variable involved, testing the Var variable to set the value of the Num variable.

If you had a nice variable for Joining on you might get away with 4 bits like (select id, var1,num1 from set where Var1='N').

Multiple variables requiring similar treatment easily is not an SQL strength.

Anita_n
Pyrite | Level 9

okay, thankyou

LinusH
Tourmaline | Level 20

May I ask what the actual application is?

I hope that the real data has som kind of row identifier.

If, so you could transpose the data, and then jsut do the transformation on one column, or just add a format that hides the "Y"

Data never sleeps
Anita_n
Pyrite | Level 9

@LinusH Is all about patients who received certain treatments on each visit (let's say visits 1 to 10),  But there a certain patients who took part in a clinical trial who should be excluded from this project. Now for each visit there is a variable which indicate if they took part in a clinical trial at that specific visit (if yes then the variable= Y and if no then it's =N) 

 

That was what I tried to depict above with var1 to var4 . Now I need to select all patients who did not take take in a clinical trial for all visits 1 to 10  (that means var1 to var10 has to be N, there shouldn't be any Y's  in the dataset) 

All this selection should include all data of the patients like the id, age, sex, diagnosis date, deathdate, type of treatment etc.....

 

I did this with sas macro that looks like this and later merged them: 

%macro test (num);
proc sql,
create table want&num as select id, therapy&num, visit&num, clintrial&num where clintrial&num="N";
quit;
%mend;
%test(1); %test(2); %test(3); %test(4); %test(5);%test(6); %test(7); %test(8); %test(9); %test(10);

*merge all 10 datasets;
data want,
merge want:;
by id;
run;

My problem here is I can't select all variables, since if I do that  for test1 for example then I will have test2 to test10 with  undesired values in there. So I was thinking there is a better way doing that. But it seems to be complicated

 

LinusH
Tourmaline | Level 20

Not sure if removing the value "Y" will help you with this logic (if understand your use case correcrtly).

If your flags were 0/1 instead a simple

 if sum(of var1-var10) = 0 then delete;

suffice.

I guess array as described earlier in the thread is a valid way of doing it, but just add some logic to delete the record if all elements are "N".

Data never sleeps

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1256 views
  • 0 likes
  • 4 in conversation