BookmarkSubscribeRSS Feed
sms1891
Quartz | Level 8

Hi all,

I am trying to create a new categirical variable (Chain_Pharmacy=1/0 for Yes/ No) based on a string variable call Pharmacy_Name.

This variable Pharmacy_Name has several pharmacies, like Wal-Mart, Walgreens, Costco, CVS, etc. I have > 800,000 records in the datasheet . I need to define Chain_Pharmacy if any string under Pharmacy_Name variable occurs more than 10 times. The issue with this variable is I have >800,000 records to search for and I can only Identify few pharmacies such as Wal-Mart, Walgreens, Costco, CVS, Rite-Aid, etc, but the databse will have many other chain pharmacies and mom and pop stores. I need to identify chains which have more than 10 pharmacies with same name (irrespective of what that chain may be) versus mom and pop/ individual pharmacy stores.

 

Can I do this by count? Is there any better way other than count for creating my new variable?

Thanks!

 

See below:

CARRS PHARMACY 1739

CARRS PHARMACY 1805

CARRS PHARMACY 1812

FRED MEYER PHARMACY #701485

FRED MEYER PHARMACY #701668

FRED MEYER PHARMACY #701671

FRED MEYER PHARMACY 701011

FRED MEYER PHARMACY 701018

FRED MEYER PHARMACY 701071

FRED MEYER PHARMACY 701224

FRED MEYER PHARMACY 701653

FRED MEYER PHARMACY 701656

FRED MEYER PHARMACY 701671

GENOA HEALTHCARE #146

NORTH POLE PRESCRIPTION LABORATORY

PRESCRIPTION CENTER

SAFEWAY PHARMACY #3410

SAFEWAY PHARMACY 0548

SAFEWAY PHARMACY 1821

SAFEWAY PHARMACY 2728

SAFEWAY PHARMACY 2754

SAFEWAY PHARMACY 3410

SOLDOTNA PROFESSIONAL PHARMACY

TARGET PHARMACY #2371

TARGET PHARMACY 2339

ULMER DRUG AND HARDWARE

WAL-MART PHARMACY 10-2070 #102070

WAL-MART PHARMACY 10-2070 102070

WAL-MART PHARMACY 10-2071 #102071

WAL-MART PHARMACY 10-2071 102071

WAL-MART PHARMACY 10-2074 102074

1 REPLY 1
kiranv_
Rhodochrosite | Level 12

may be something like this

proc sql;
select * from yourtable
group by prxchange('s/(\d+|#).*$//i',-1, pharmacy)
having count(prxchange('s/(\d+|#).*$//i',-1, pharmacy))>10;

 to check what multiple store have, along with its parent name

proc sql;

select *, prxchange('s/(\d+|#).*$//i',-1, pharmacy) as parentcompanyname from have

group by parentcompanyname

having count(parentcompanyname)>10;

 

or to just companies  with parent name

 

proc sql;

select prxchange('s/(\d+|#).*$//i',-1, pharmacy) as parentcompanyname from ac

group by parentcompanyname

having count(parentcompanyname)>10;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 624 views
  • 1 like
  • 2 in conversation