New SAS User

Completely new to SAS or trying something new with SAS? Post here for help getting started.
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
basic_sas
Calcite | Level 5

I import a list of strings stored in Excel in column A to search:

TATA
TCS
Accenture

 

I also import the below data set:

 

REPORT_CATTRANS_AMOUNTEXPEND_COMMENTVENDOR
Sal Dist & Contractor5177.36AP: TATA CONSULTANCY SERVICES LTD : INV USNI219089615 :  : MF   85217900TATA CONSULTANCY SERVICES LTD
Sal Dist & Contractor-10811FB:TATA:Oct-18:Accrual WO 16079non-AP
Sal Dist & Contractor10811FB:TATA:Nov-18:Accrual WO 16079non-AP
IT Hardware52000AP: ACCENTURE LLP : INV 1100457766 :  : MF   79310800ACCENTURE LLP
Sal Dist & Contractor1023.31Internal:Inventory Management:Dec-18:project distributionnon-AP
Sal Dist & Contractor-50000FB:Accenture:Aug-18:Accrual WO 16389non-AP

 

I want to use the imported list as strings to search in column EXPEND_COMMENT and Vendor

 

When there is a match, I want to sum by string on the TRANS_AMOUNT column and have a breakout by string and REPORT_CAT.

 

The table below would provide the desired output. Since one of the rows doesn't contain any of the strings, it's grouped in Other:

STRINGREPORT_CATTRANS_AMOUNT
TATASal Dist & Contractor5177.36
AccentureIT Hardware52000
AccentureSal Dist & Contractor-50000
OtherSal Dist & Contractor1023.31

 

Thank you a lot for any help you provide.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
5 REPLIES 5
basic_sas
Calcite | Level 5

In the data set, there are only two companies that have multiple iterations:

Tata Consultancy Services is listed as Tata and TCS, the ideal scenario would be that these strings are grouped together under Tata. In a single row, Tata and uppercase (TCS) are never listed together and Tata and uppercase (TCS) are unique in the data such that they won't make up another word such as Tatanica or Logistcs.

 

Ernst and Young is listed as Ernst and Young and Ernst. Similar to above, ideally, these would be grouped under Ernst. And similar to the above, they are not found as part of a word.

 

The strings I will be looking for are unique. Any row that is not matched to a string will be grouped as Other.

ballardw
Super User

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

I am not going to try to dummy up data for something like this.

 

Your example data should be small enough you can manually indicate the result.

 

Not need whatsoever to "make a list in Excel" to get values into SAS:

data search;
   informat string $25.;
   input string;
datalines;
TATA 
TCS 
Accenture 
;
run;

Are your search for values in the "list" case sensitive? Do you want to consider TATA found if the value searched is "FB:Tata:Oct-18:Accrual WO 16079"

basic_sas
Calcite | Level 5

Thanks for the detailed response, and sorry I didn't know about the forum etiquette. I attached a sample data set of 50 observations.

 

In these particular observations, I only see one string I would want to search for: TATA, the rest would be grouped into other. The full data set is about 100,000 rows and grows by about 30,000 rows per month. There are about 500 strings I will want to search for and return String, REPORT_CAT, and Sum of (TRANS_AMOUNT), grouping by REPORT_CAT.

 

All the strings will not be case sensitive, except in the case of EY and TCS which will be case sensitive. Therefore, I will consider TATA to be found if Tata or tata or TAta. I will only consider EY to be found if EY, and TCS to be found if TCS, which are the only two case sensitive strings I will be looking for.

basic_sas
Calcite | Level 5

Looks like it can't be done. Thanks everyone for the help!

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1209 views
  • 0 likes
  • 3 in conversation