Hi all,
I am trying to create a new categirical variable (Chain_Pharmacy=1/0 for Yes/ No) based on a string variable call Pharmacy_Name.
This variable Pharmacy_Name has several pharmacies, like Wal-Mart, Walgreens, Costco, CVS, etc. I have > 800,000 records in the datasheet . I need to define Chain_Pharmacy if any string under Pharmacy_Name variable occurs more than 10 times. The issue with this variable is I have >800,000 records to search for and I can only Identify few pharmacies such as Wal-Mart, Walgreens, Costco, CVS, Rite-Aid, etc, but the databse will have many other chain pharmacies and mom and pop stores. I need to identify chains which have more than 10 pharmacies with same name (irrespective of what that chain may be) versus mom and pop/ individual pharmacy stores.
Can I do this by count? Is there any better way other than count for creating my new variable?
Thanks!
See below:
CARRS PHARMACY 1739 |
CARRS PHARMACY 1805 |
CARRS PHARMACY 1812 |
FRED MEYER PHARMACY #701485 |
FRED MEYER PHARMACY #701668 |
FRED MEYER PHARMACY #701671 |
FRED MEYER PHARMACY 701011 |
FRED MEYER PHARMACY 701018 |
FRED MEYER PHARMACY 701071 |
FRED MEYER PHARMACY 701224 |
FRED MEYER PHARMACY 701653 |
FRED MEYER PHARMACY 701656 |
FRED MEYER PHARMACY 701671 |
GENOA HEALTHCARE #146 |
NORTH POLE PRESCRIPTION LABORATORY |
PRESCRIPTION CENTER |
SAFEWAY PHARMACY #3410 |
SAFEWAY PHARMACY 0548 |
SAFEWAY PHARMACY 1821 |
SAFEWAY PHARMACY 2728 |
SAFEWAY PHARMACY 2754 |
SAFEWAY PHARMACY 3410 |
SOLDOTNA PROFESSIONAL PHARMACY |
TARGET PHARMACY #2371 |
TARGET PHARMACY 2339 |
ULMER DRUG AND HARDWARE |
WAL-MART PHARMACY 10-2070 #102070 |
WAL-MART PHARMACY 10-2070 102070 |
WAL-MART PHARMACY 10-2071 #102071 |
WAL-MART PHARMACY 10-2071 102071 |
WAL-MART PHARMACY 10-2074 102074 |
may be something like this
proc sql;
select * from yourtable
group by prxchange('s/(\d+|#).*$//i',-1, pharmacy)
having count(prxchange('s/(\d+|#).*$//i',-1, pharmacy))>10;
to check what multiple store have, along with its parent name
proc sql;
select *, prxchange('s/(\d+|#).*$//i',-1, pharmacy) as parentcompanyname from have
group by parentcompanyname
having count(parentcompanyname)>10;
or to just companies with parent name
proc sql;
select prxchange('s/(\d+|#).*$//i',-1, pharmacy) as parentcompanyname from ac
group by parentcompanyname
having count(parentcompanyname)>10;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.