SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Conditional deleting observation from a dataset

Reply
Frequent Contributor
Posts: 84

Conditional deleting observation from a dataset

I have a data set with three variables type x y

type   x   y

1 12 13                                                                                                                                

1 14 15                                                                                                                                

1 12 13                                                                                                                                

2 14 25                                                                                                                                

2 15 26                                                                                                                                

2 26 23                                                                                                                                

3 25 26                                                                                                                                

3 26 35                                                                                                                                

3 36 38

if i want to have data set only for type 1 then I could use following code

if type ne 1 then delete;                                                                                                               

run;

Now I want to do this using do loop, I mean that for first iteration I want data set for type 1 only, for second iteration I want data set for type 2 only and for third iteration I want data set for type 3 only

so that i can have three separate data sets

I am using following code that is not working

do j=1 to 3;                                                                                                                           

if type ne j then delete ;                                                                                                                

output;

end;

pls help

Super User
Posts: 19,770

Re: Conditional deleting observation from a dataset

Why do you want to do this, it's usually not recommended.

Here's a manual way:

data type1 type2 type3;

set have;

if type=1 then output type1;

else if type=2 then output type2;

else if type=3 then output type3;

run;

Frequent Contributor
Posts: 84

Re: Conditional deleting observation from a dataset

Thank you I found your reply very useful

Super User
Posts: 11,343

Re: Conditional deleting observation from a dataset

OR use dataset options in the next step the data will be used for:

proc freq data=mydata (where=(type = 1));

run;

Then if you get to analysis you don't have to spend time finding/remember which set to use. Also all the types are there when you want to compare based on their values or a subset: (where= (type in (1 , 3, 5)) for example.

Contributor
Posts: 53

Re: Conditional deleting observation from a dataset

Okay, the straightforward answers have already been given. But. If you want to completely automate it -

proc sort data = a out = b nodupkey; by type;  <- Here a is your original dataset with all the values.

data _null_;

set b end=eof;

if eof then call symput('types',_N_);

proc sort data=b; by type;

%macro types;

%do i = 1 %to &types;

data btype;

set b (where = (_N_ = &i) );

data dat&i;

merge a (in=a) btype(in=b); by type;

if a and b;

%end;

%mend types;

%types;

This code helps when more types are added, you don't have to make changes to your code. Also helps, when type is not a straightforward 1 or 2  or 3.. but some complex number or character.

Respected Advisor
Posts: 3,156

Re: Conditional deleting observation from a dataset

Posted in reply to PoornimaRavishankar

If we are talking about splitting data, one of the proven method will be Hash, very efficient(one pass to the data if presorted, no need to know how many 'types' in advance) and fully dynamic.

data have;

     input type   x   y;

     cards;

1 12 13

1 14 15

1 12 13

2 14 25

2 15 26

2 26 23

3 25 26

3 26 35

3 36 38

;

data _null_;

     declare hash h();

     h.definekey('_n_');

     h.definedata('type', 'x', 'y');

     h.definedone();

     do _n_=1 by 1 until (last.type);

           set have;

           by type;

           rc=h.replace();

     end;

     rc=h.output(dataset:cats('want',type));

run;

Frequent Contributor
Posts: 84

Re: Conditional deleting observation from a dataset

Thanks I used this its works fine

Respected Advisor
Posts: 4,173

Re: Conditional deleting observation from a dataset

The advantage of the hash method as suggests is that you don't need to know the number of discrete type values in advance. The disadvantage is imho that your source data set must be pre-sorted by type, that it only works for data sets small enough that all the data for a single type fits into memory and that the code is more complicated (maintenance).

If you know the type values in advance or you anyway want to create new data sets only for selected type values then I would go for the approach suggested.

Contributor
Posts: 53

Re: Conditional deleting observation from a dataset

Hai.kuo

In rc=h.output(dataset:cats('want',type));

what does

cats('want',type)

mean?

I went through SAS manuals but ddn't find a satisfactory explanation?

Respected Advisor
Posts: 3,156

Re: Conditional deleting observation from a dataset

Posted in reply to PoornimaRavishankar

cats('want',type) is to generate a new data set name, start with 'want', tail with the value of 'type', as we use 'type' to split data. You should easily find docs on cats().

if not using cats(), you can do the same thing by using;
'want'||strip(type)

Haikuo

Frequent Contributor
Posts: 83

Re: Conditional deleting observation from a dataset

Rasheed,

Why not automate it through SQL to give you the different type names and the number of different types and then pass that back through a data step? The nice thing about this method is that if the type variable ever was a character it also would collect it properly, however this method tends to fail if the character string created as &typen exceeds a character limit (however I can't remember what that limit is).

%macro qk;

proc sql noprint;

    select distinct type into: typen separated by '*' from have;

quit;

%let typenn = &sqlobs;

%do i = 1 %to &typenn;

%let typ = %scan(&typen,&i,'*');

data type_&i;

    set have;

    where type = &typ;

run;

%end;

%mend;

%qk;

Robert

Ask a Question
Discussion stats
  • 10 replies
  • 1059 views
  • 1 like
  • 7 in conversation