Subsetting Dataset using Proc sql

Occasional Contributor
Posts: 5

Hi,

I am trying to subset a dataset based on price ranges. I have done the subsetting with a DATA step as follows:

 

DATA lt_15k lt_20k lt_30k lt_40k gt_40k;
set sas1.cars;
if price le 15 then Output lt_15k;
if (15<price<=20) then Output lt_20k;
if (20<price<=30) then Output lt_30k;
if (30<price<=40) then Output lt_40k;
if (price>40) then output gt_40k;

run;

 

I want the same results using PROC SQL. Is it possible to create multiple datasets using PROC SQL?


Accepted Solutions
Solution
‎08-15-2017 07:23 AM
New Contributor
Posts: 3

Re: Subsetting Dataset using Proc sql

Posted in reply to Harmandeep

It is not possible to produce multiple tables or views in a single sql query.

 

The best way could be:

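proc sql;
/* BETWEEN includes both endpoints, so explicit bounds keep each price in exactly one table */
create table lt_15k as select * from sas1.cars where price le 15 ;
create table lt_20k as select * from sas1.cars where price > 15 and price <= 20 ;
create table lt_30k as select * from sas1.cars where price > 20 and price <= 30 ;
create table lt_40k as select * from sas1.cars where price > 30 and price <= 40 ;
create table gt_40k as select * from sas1.cars where price > 40 ;
quit;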
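
To check that the price ranges from the DATA step neither overlap nor leave gaps, one option is to count the rows per range in a single query. This is only a sketch: it relies on SAS evaluating a comparison to 1 or 0, and the column aliases (n_total, n_lt_15k, and so on) are made up for illustration.

```sas
proc sql;
  /* Each comparison evaluates to 1 (true) or 0 (false), so SUM counts matching rows */
  select count(*)                        as n_total,
         sum(price le 15)                as n_lt_15k,
         sum(price > 15 and price <= 20) as n_lt_20k,
         sum(price > 20 and price <= 30) as n_lt_30k,
         sum(price > 30 and price <= 40) as n_lt_40k,
         sum(price > 40)                 as n_gt_40k
  from sas1.cars;
quit;
```

If the ranges form a clean partition, the five range counts should add up to n_total.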

 



All Replies
Super User
Posts: 7,955

Re: Subsetting Dataset using Proc sql

Posted in reply to Harmandeep

You cannot in SQL. I would, however, question the benefit of splitting the data in the first place. First, you increase the total storage, because each dataset carries its own header block, so this method takes more space. It is also harder to program with, as you need to know the dataset names and write code for each of them. A simpler approach is to add a grouping variable to the data and then use that grouping. Say you want to print each of those groups on a different page. With split datasets you need one step per dataset:

DATA lt_15k lt_20k lt_30k lt_40k gt_40k;
set sas1.cars;
if price le 15 then Output lt_15k;
if (15<price<=20) then Output lt_20k;
if (20<price<=30) then Output lt_30k;
if (30<price<=40) then Output lt_40k;
if (price>40) then output gt_40k;
run;
title "Group1"; proc print data=lt_15k; run;
title "Group2"; proc print data=...

Or, and this should look simpler:

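data want;
  set sas1.cars;
  /* One grouping variable instead of five separate datasets */
  if price le 15 then group="lt_15k";
  if (15<price<=20) then group="lt_20k";
  if (20<price<=30) then group="lt_30k";
  if (30<price<=40) then group="lt_40k";
  if (price>40) then group="gt_40k";
run;

/* BY-group processing needs the data sorted by the BY variable */
proc sort data=want;
  by group;
run;

proc print data=want;
  by group;
  title "Group: #byval1";
run;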
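
Since the question was about PROC SQL, the same grouping idea can probably be written there with a CASE expression. This is a minimal sketch, not tested against sas1.cars. The column name price_group is my own choice, picked to avoid the SQL keyword GROUP, and the output table name want_sql is likewise made up.

```sas
proc sql;
  /* One table with a grouping column, instead of one table per price range */
  create table want_sql as
  select *,
         case
           when price le 15 then "lt_15k"  /* WHEN branches are tested in order */
           when price le 20 then "lt_20k"  /* reached only when price > 15 */
           when price le 30 then "lt_30k"
           when price le 40 then "lt_40k"
           else "gt_40k"
         end as price_group
  from sas1.cars
  order by price_group;  /* sorted, so later BY-group steps work directly */
quit;

proc print data=want_sql;
  by price_group;
  title "Group: #byval1";
run;
```

Because the WHEN branches are evaluated in order, each row lands in exactly one group, and a missing price falls into lt_15k just as it does with the DATA step's "price le 15" test.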