BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Yegen
Pyrite | Level 9

I know how to get around this issue by writing an additional code, but I would like to know whether I can select distinct observations based on the two variables "cusip" and "fyear" but still keep the variable "shrout" in the final dataset by the code below. The dataset work.outstanding_shares_1 contains shrout, cusip and fyear. 

 

 

 

proc sql;
	create table work.outstanding_shares_2 as
	select distinct cusip, fyear 
	from work.outstanding_shares_1;
quit;


Thanks for any suggestions. 

 

1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

Maybe this is what you want.

 

data work.outstanding_shares_1;
infile cards expandtabs truncover;
input cusip	fyear shrout date : yymmdd10.;
format date yymmdd10.;
cards;
20910	1992	9284	19920831
20910	1992	9362	19921030
20910	1988	9002	19880331
20910	1988	9037	19881031
20910	1988	9023	19880729
36020	2006	12342	20061130
36020	2003	12541	20030930
36020	2006	12336	20060630
36020	2003	13031	20030131
36020	2003	12813	20031031
;
run;
proc print;
   run;
proc summary nway;
   class cusip fyear;
   output out=shrout(drop=_:)
      idgroup(max(date) out(shrout)=);
   run;
proc print;
   run;

2017-06-01_15-49-22.png 

View solution in original post

12 REPLIES 12
art297
Opal | Level 21

It would help if you can post example have and want datasets. On first glance it seems like you should be able to get what you want by simply including shrout in your select statement. If you are looking to have single records for each "cusip" and "fyear" combination, but with multiple values of "shrout", I'd still do the same, but then apply a proc transpose to create the wide records.

 

Art, CEO, AnalystFinder.com

 

Yegen
Pyrite | Level 9

Hi @art297, thanks for your reply.

Here is a sample of the data:

 

data work.outstanding_shares_1;
infile cards expandtabs truncover;
input cusip	fyear shrout date : yymmdd10.;
format date yymmdd10.;
cards;
20910	1992	9284	19920831
20910	1992	9362	19921030
20910	1988	9002	19880331
20910	1988	9037	19881031
20910	1988	9023	19880729
36020	2006	12342	20061130
36020	2003	12541	20030930
36020	2006	12336	20060630
36020	2003	13031	20030131
36020	2003	12813	20031031
;
run;


I want to select the most recent (and distinct) shrout for a given cusip and fyear. So the output will look like:

cusipfyearshrout
2091019889037
2091019929362
36020200312813
36020200612342

 

 

 

data_null__
Jade | Level 19
I don't understand why you picked two obs for FY 2003 for cusip=36020
Yegen
Pyrite | Level 9

Sorry that was a typo, @data_null__. I will update the output table now. Thanks for pointing that out. 

data_null__
Jade | Level 19

Maybe this is what you want.

 

data work.outstanding_shares_1;
infile cards expandtabs truncover;
input cusip	fyear shrout date : yymmdd10.;
format date yymmdd10.;
cards;
20910	1992	9284	19920831
20910	1992	9362	19921030
20910	1988	9002	19880331
20910	1988	9037	19881031
20910	1988	9023	19880729
36020	2006	12342	20061130
36020	2003	12541	20030930
36020	2006	12336	20060630
36020	2003	13031	20030131
36020	2003	12813	20031031
;
run;
proc print;
   run;
proc summary nway;
   class cusip fyear;
   output out=shrout(drop=_:)
      idgroup(max(date) out(shrout)=);
   run;
proc print;
   run;

2017-06-01_15-49-22.png 

Yegen
Pyrite | Level 9

Exactly! That is what I was looking for. Thanks!

@data_null__ (or even @art297) do you know whether one can do the same by just using SQL? I was thinking of the having and group clause but got stuck. I wanted to try whether one can actually do this by just using SQL. 

PBsas
Obsidian | Level 7

Alternative SQL Method:

proc sql;
create table want as
	select distinct cusip, fyear, shrout 
	from work.outstanding_shares_1 
	where date in (select max(date) from work.outstanding_shares_1 group by cusip,fyear) ;
quit;
Yegen
Pyrite | Level 9

@PBsas I just run your code, but the output does not look correct. I am having still duplicates for a given cusip & fyear. 

Do you think one can do it with the "having" clause? 

PBsas
Obsidian | Level 7

Your sample data produced right output using this code. Please provide sample data that produced duplicates.

Yegen
Pyrite | Level 9

You are right when I use your code on the inputted data, I am getting the correct result. However, when I use it on the actual data, I am getting a result as follows:

 

Screen Shot 2017-06-01 at 5.47.17 PM.png

Here is the log:

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 61         
 62         proc sql;
 63         create table want as
 64         select distinct cusip, fyear, shrout
 65         from work.outstanding_shares_1
 66         where date in (select max(date) from work.outstanding_shares_1 group
 66       !  by cusip,fyear) ;
 NOTE: Table WORK.WANT created, with 168203 rows and 3 columns.
 
 67         quit;
 NOTE: PROCEDURE SQL used (Total process time):
       real time           0.20 seconds
       cpu time            0.34 seconds
       
 
 68         
 69         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 82         



The only difference of the actual dataset and the sample I provided is the date format:

Screen Shot 2017-06-01 at 5.48.53 PM.png

Ksharp
Super User
data work.outstanding_shares_1;
infile cards expandtabs truncover;
input cusip	fyear shrout date : yymmdd10.;
format date yymmdd10.;
cards;
20910	1992	9284	19920831
20910	1992	9362	19921030
20910	1988	9002	19880331
20910	1988	9037	19881031
20910	1988	9023	19880729
36020	2006	12342	20061130
36020	2003	12541	20030930
36020	2006	12336	20060630
36020	2003	13031	20030131
36020	2003	12813	20031031
;
run;

proc sql;
select cusip,fyear,shrout
 from outstanding_shares_1
  group by cusip,fyear
   having date=max(date);
quit;
Yegen
Pyrite | Level 9

Thanks, @Ksharp. Your help is much appreciated! 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 4541 views
  • 2 likes
  • 5 in conversation