Help using Base SAS procedures

Subsetting the dataset based on the observations

Contributor
Posts: 57

What are the best ways to subset a dataset based on its observations? I would like to keep only a few observations from the original dataset.

Super Contributor
Posts: 441

Re: Subsetting the dataset based on the observations

Posted in reply to mantubiradar19

Hi Mahantesh19,

 

Please elaborate, preferably with samples of what you have and what the result should look like. This would greatly improve your chances of getting a good answer quickly (as the area of subsetting is well documented). Help us help you.

 

Regards,

- Jan.

Contributor
Posts: 57

Re: Subsetting the dataset based on the observations

Posted in reply to jklaverstijn

My dataset looks like the following:

 

Id  SNP
1   rs1234
2   rs2345
3   rs7364
4   rs7373
5   rs63634839
6   rs63479303

 

So I want to keep only the observations corresponding to rs7373, rs7364, etc. Along with these two variables, I have other variables as well!

Super Contributor
Posts: 441

Re: Subsetting the dataset based on the observations

Posted in reply to mantubiradar19

Although it is still a bit vague ("etc."  could mean everything), something like this will work:

 

data subset;
    set complete;
    /* keep only rows whose SNP value starts with rs7 */
    where snp like 'rs7%';
run;

 

Using SQL would be another approach. You could create a view instead of a table for convenience and efficiency.
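
As a rough sketch of the SQL approach, assuming the same input table complete and variable snp as in the DATA step (the view name subset_v is made up for illustration):

```sas
proc sql;
    /* A view stores only the query; rows are selected when the view is read */
    create view subset_v as
    select *
    from complete
    where snp like 'rs7%';
quit;

/* The view can then be used like a data set */
proc print data=subset_v;
run;
```

Replacing "create view" with "create table" would store the selected rows as a physical table instead.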

 

The WHERE statement can also be used in any procedure, if that's what you need:

proc print data=complete;
    where snp like 'rs7%';
run;
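
If "etc." really means a fixed list of SNPs rather than a pattern, the IN operator may fit better than LIKE. A minimal sketch, assuming the two values from your post are the whole list (extend it as needed):

```sas
data subset;
    set complete;
    /* keep only rows whose SNP is one of the listed values (exact match) */
    where snp in ('rs7364', 'rs7373');
run;
```

All other variables in the table are kept; the WHERE statement only filters rows.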

 

These alternatives are not exhaustive by a long shot. Considering the fundamental skills required for this type of task, I suggest diving further into the books and other sources of education on SAS programming. That would save you (and us) a lot of time and help you become self-sufficient much more quickly than asking basic questions for every challenge you face.

 

Hope this helps,

- Jan.

Contributor
Posts: 57

Re: Subsetting the dataset based on the observations

Posted in reply to jklaverstijn

Thank you!

Contributor
Posts: 57

Re: Subsetting the dataset based on the observations

Posted in reply to jklaverstijn

In fact I'm getting the following note:

 

NOTE: Data file ORIGINAL is in a format that is native to another host, or the
file encoding does not match the session encoding. Cross Environment Data Access will be
used, which might require additional CPU resources and might reduce performance.

 

And 0 observations were read from ORIGINAL.

Super Contributor
Posts: 441

Re: Subsetting the dataset based on the observations

Posted in reply to mantubiradar19

The NOTE is exactly that: a note. It is not an error. It does point out a situation that you may need to fix (or at least be aware of), but it is not likely the cause of your 0 observations (although it may be, if some weird transcoding bit you). You have posted about this encoding issue on this forum recently, so you know where to look ;-) . You may want to fix this first. I know I would.
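
To see what the note is about, something along these lines may help. It is only a sketch: mylib is a hypothetical libref standing in for wherever ORIGINAL lives, and original_native is a made-up name.

```sas
/* Encoding of the current SAS session (written to the log) */
%put NOTE: Session encoding is &sysencoding;

/* PROC CONTENTS reports the data set's encoding and data representation */
/* mylib is a hypothetical libref; substitute your own                  */
proc contents data=mylib.original;
run;

/* Rewriting the data set in the current session normally produces a copy */
/* in the session's native format, which should avoid the CEDA note       */
data work.original_native;
    set mylib.original;
run;
```

If the encodings differ, check a few character values after the copy to make sure nothing was altered during transcoding.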

 

Most likely the WHERE clause you specified does not match any rows in the input table. Double-check your data and code. For example, the LIKE operator is case-sensitive. And after triple-checking it, post your entire log, including the code and sample data from the actual table.
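
A quick way to check this, sketched under the assumption that the table is ORIGINAL (as in your log) and the variable is snp:

```sas
/* Count the rows that match when case and leading/trailing blanks are ignored */
proc sql;
    select count(*) as n_match
    from original
    where upcase(strip(snp)) like 'RS7%';
quit;

/* Look at the values that are actually stored in the first few rows */
proc print data=original(obs=10);
    var snp;
run;
```

If this count is above zero while the original WHERE returns nothing, the cause is most likely a difference in case or leading blanks in the stored values.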

 

Hope this helps,

- Jan.

Contributor
Posts: 57

Re: Subsetting the dataset based on the observations

Posted in reply to jklaverstijn
Thank you very much!