Subsetting data with proc sql

Reply
Occasional Contributor
Posts: 5

Subsetting data with proc sql

Hi SAS-experts

I have sales and assets data for different companies from 2000-2005 (year is shown by a dummy variable). Not all companies have data for each year. To make it completely clear, here is an example of data:

Company_code Sales Assets Yr0 Yr1 Yr2 Yr3 Yr4 Yr5

1 150 250 1 0 0 0 0

1 950 150 0 0 0 1 0

1 1000 200 0 0 0 0 1

2 50 50 0 1 0 0 0

2 75 100 0 0 1 0 0

78 1500 1000 0 0 0 0 1

I am now interested in data just from years 00 and 05, but only for companies which have data for both of these years. So in the above example, the new data set would be:

company_code sales assets yr0 yr01 yr02 yr03 yr04 yr05

1 150 250 1 0 0 0 0

1 950 150 0 0 0 1 0

Can this be done with proc sql? The assignment is for a university project, and I'm really struggling with it, so any help is much appreciated.

Thanks in advance!

Super Contributor
Posts: 644

Re: Subsetting data with proc sql

You can do this in SQL or in a data step.  In either case the statement you want to include is

Where yr00 = 1 and yr05 =1

Richard

Occasional Contributor
Posts: 5

Re: Subsetting data with proc sql

Hi Richard

So something like:

Data newdata;

Set oldset;

By company_code;

Where yr00=1 and yr05=1;

run;

?

Super Contributor
Posts: 644

Re: Subsetting data with proc sql

Looks good.  What purpose is the By statement serving?

Why not try the SQL as well?

Richard

Respected Advisor
Posts: 4,644

Re: Subsetting data with proc sql

To create a new dataset :

proc sql;

create table want as

select * from have where yr0 OR yr5;

quit;

but in most cases you can simply add a WHERE condition to the procedure that uses the data, for example:

proc reg data=have(where=(yr0 OR yr5));

model ...;

run;

PG

Message was edited by: PG

PG
Occasional Contributor
Posts: 5

Re: Subsetting data with proc sql

Thanks a lot for the answers. However, I'm not quite sure, you understand what I am trying to do. I want data for companies that have data for BOTH years 00 and 05. With your (PG's) code, I get these companies, but also get companies with data for only 00 or 05. Any suggestions to avoid these last group of companies?

Respected Advisor
Posts: 4,644

Re: Subsetting data with proc sql

Ho! My wrong, sorry. Use instead :

proc sql;

create table want as

select *

from have

where yr1 or yr5

group by company_code

having max(yr1)>0 and max(yr5)>0;

quit;

PG

PG
Ask a Question
Discussion stats
  • 6 replies
  • 218 views
  • 0 likes
  • 3 in conversation