BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jtennyson
Calcite | Level 5

I am trying to create a column that will show a count of how many times a variable (deal_id) appears in a column.

 

Using the below example I would want deal_id's  100477 and 101797 to have a 2 and 102913 to have a one.  The 2 items that broken out need to be that way due to LGD diving factors downstream.

 

I am using PROC sql to create my tables and have tried using count(deal_id) and count(distinct deal_id) but they both returned a 1 value for all rows.

 

This is just a portion of what the entire table consists of

 

reporting_datesource_system_codedeal_idNotional_USDLGD
04/30/2017ACBS100477251019.460.31684
04/30/2017ACBS100477100000
04/30/2017ACBS1017975506337.480.31684
04/30/2017ACBS101797500000
04/30/2017ACBS1029136142612.780.17466
04/30/2017ACBS10307411416126.660.31684
1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21
proc sql noprint;
  create table want as
    select *,count(deal_id) as count
      from have
        group by deal_id
  ;
quit;

Art, CEO, AnalystFinder.com

View solution in original post

9 REPLIES 9
ballardw
Super User

Is the desired output a data set or a report? If a data set are you merging the results back onto the original data so the same value appears on each record?

 

It really helps to show exactly what the actual output should be.

jtennyson
Calcite | Level 5

I am creating a data set to be used for reporting

 

I am combining a few tables to be used for reporting ( source data is very large.)  My problem arose when I tried to combine another table to pull in an interest column (which has deal_id and interest no duplicates) and saw that values were being duplicated due to the deal_id being on multiple rows.  

 

I have been spinning my wheels on how to get this dat to pull once and my thought is to generate a count of deal_id and divide by that to remedy this.

 

 

porting_datesource_system_codedeal_idNotional_USDLGDCOUNT
4/30/2017ACBS100477251019.460.31682
4/30/2017ACBS1004771000002
4/30/2017ACBS1017975506337.480.31682
4/30/2017ACBS1017975000002
4/30/2017ACBS1029136142612.780.17471
4/30/2017ACBS10307411416126.70.31681
art297
Opal | Level 21
proc sql noprint;
  create table want as
    select *,count(deal_id) as count
      from have
        group by deal_id
  ;
quit;

Art, CEO, AnalystFinder.com

Quentin
Super User

What SQL code did you try?

 

I would try something like:

data have;
input deal_id Notional_USD;
cards;
100477 251019.46
100477 10000
101797 5506337.48
101797 50000
102913 6142612.78
103074 11416126.66
;

proc sql;
  select * , count(*) as DealCount
    from have 
    group by deal_id
  ;
quit;

Returns:

 

 deal_id  Notional_USD  DealCount
  100477         10000          2
  100477      251019.5          2
  101797       5506337          2
  101797         50000          2
  102913       6142613          1
  103074      11416127          1
The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in January 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.
jtennyson
Calcite | Level 5

I think my issue arises from my data being much larger than my example. I have attached an example (actual details have over 20k rows) of what I am trying to accomplish along with the SQL I wrote.

 

I either would like to count the times a variable appears in my table and do some simple math, or pull data from the AIR table only once.

 

When I try adding Count(Deal_id) or Count (distinct deal_id) I get a value of 1 in return.

 

 

Quentin
Super User
Number of rows shouldn't be a problem. Suggest you add data step code to make sample data with 10-20 example rows (using CARDS statement or whatever), and the PROC SQL code you ran. That will make it easier for people to help you. Many people won't open attachments (Excel etc), due to virus concerns etc.
The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in January 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.
art297
Opal | Level 21

You're grouping by a large collection of variables rather than just the one(s) you're interested in. Also, I didn't find a file in your attachments that showed the raw data.

 

Art, CEO, AnalystFinder.com

 

jtennyson
Calcite | Level 5

That was it! I limited the grouping and now I'm getting the result I was looking for!

 

Thank you everyone!

ballardw
Super User

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

 

Better would be to provide some example data and what the final report should look like. If you haven't explored Procs Report and Tabulate you might be surprised what is possible though all of the variables of interest do need to be in one data set for those.

Paste code into a code box using the forum {i} menu icon.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 31607 views
  • 0 likes
  • 4 in conversation