I am trying to create a column that will show a count of how many times a variable (deal_id) appears in a column.
Using the below example I would want deal_id's 100477 and 101797 to have a 2 and 102913 to have a one. The 2 items that broken out need to be that way due to LGD diving factors downstream.
I am using PROC sql to create my tables and have tried using count(deal_id) and count(distinct deal_id) but they both returned a 1 value for all rows.
This is just a portion of what the entire table consists of
reporting_date | source_system_code | deal_id | Notional_USD | LGD |
04/30/2017 | ACBS | 100477 | 251019.46 | 0.31684 |
04/30/2017 | ACBS | 100477 | 10000 | 0 |
04/30/2017 | ACBS | 101797 | 5506337.48 | 0.31684 |
04/30/2017 | ACBS | 101797 | 50000 | 0 |
04/30/2017 | ACBS | 102913 | 6142612.78 | 0.17466 |
04/30/2017 | ACBS | 103074 | 11416126.66 | 0.31684 |
proc sql noprint; create table want as select *,count(deal_id) as count from have group by deal_id ; quit;
Art, CEO, AnalystFinder.com
Is the desired output a data set or a report? If a data set are you merging the results back onto the original data so the same value appears on each record?
It really helps to show exactly what the actual output should be.
I am creating a data set to be used for reporting
I am combining a few tables to be used for reporting ( source data is very large.) My problem arose when I tried to combine another table to pull in an interest column (which has deal_id and interest no duplicates) and saw that values were being duplicated due to the deal_id being on multiple rows.
I have been spinning my wheels on how to get this dat to pull once and my thought is to generate a count of deal_id and divide by that to remedy this.
porting_date | source_system_code | deal_id | Notional_USD | LGD | COUNT |
4/30/2017 | ACBS | 100477 | 251019.46 | 0.3168 | 2 |
4/30/2017 | ACBS | 100477 | 10000 | 0 | 2 |
4/30/2017 | ACBS | 101797 | 5506337.48 | 0.3168 | 2 |
4/30/2017 | ACBS | 101797 | 50000 | 0 | 2 |
4/30/2017 | ACBS | 102913 | 6142612.78 | 0.1747 | 1 |
4/30/2017 | ACBS | 103074 | 11416126.7 | 0.3168 | 1 |
proc sql noprint; create table want as select *,count(deal_id) as count from have group by deal_id ; quit;
Art, CEO, AnalystFinder.com
What SQL code did you try?
I would try something like:
data have;
input deal_id Notional_USD;
cards;
100477 251019.46
100477 10000
101797 5506337.48
101797 50000
102913 6142612.78
103074 11416126.66
;
proc sql;
select * , count(*) as DealCount
from have
group by deal_id
;
quit;
Returns:
deal_id Notional_USD DealCount 100477 10000 2 100477 251019.5 2 101797 5506337 2 101797 50000 2 102913 6142613 1 103074 11416127 1
I think my issue arises from my data being much larger than my example. I have attached an example (actual details have over 20k rows) of what I am trying to accomplish along with the SQL I wrote.
I either would like to count the times a variable appears in my table and do some simple math, or pull data from the AIR table only once.
When I try adding Count(Deal_id) or Count (distinct deal_id) I get a value of 1 in return.
You're grouping by a large collection of variables rather than just the one(s) you're interested in. Also, I didn't find a file in your attachments that showed the raw data.
Art, CEO, AnalystFinder.com
That was it! I limited the grouping and now I'm getting the result I was looking for!
Thank you everyone!
Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.
Better would be to provide some example data and what the final report should look like. If you haven't explored Procs Report and Tabulate you might be surprised what is possible though all of the variables of interest do need to be in one data set for those.
Paste code into a code box using the forum {i} menu icon.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.