I am wanting to convert a series of ID values that are formatted as text to numeric using
Data SASCHART.Date_Fix_Data (drop = Contact_Person_ID); Retain Day_mon Case_ID County Day_mon1 Contact_Person_ID County_T; Set SASCHART.Date_Fix_Data; Contact_ID = input(Contact_Person_ID, 8.); run;
The vast majority of the entries in Contact_Person_ID field are all numbers like
54632784
53627928
64538272 and so on.
But there are some entries that begin with an alpha like
A324577
A564732
How do I remove the alpha character in the first place so I am left with just the numbers and can perform the input conversion?
Thank you for your help.
wlierman
if anyalpha(Contact_Person_ID)=1 then Contact_Person_ID=substr(Contact_Person_ID,2);
if anyalpha(Contact_Person_ID)=1 then Contact_Person_ID=substr(Contact_Person_ID,2);
Thank you for the function anyalpha
Why do you want an Id variable to be numeric? Are you going to do arithmetic with it? If you aren't doing arithmetic then it is probably better to leave the variable as character.
Here is one example of removing alphabetic characters from a variable.
data example; input Contact_Person_ID $ ; contact_person_id = compress(Contact_Person_ID,,'a'); datalines; A324577 A564732 ;
I need the numeric form so that I can use the first. last. method to count all individual cases and contacts by day_month over the time frame that Public Health has data from 2020 to current. I then hope to create a line graph of cases and contacts using SAS Graph.
Thank you for the method to remove alpha character.
wlierman
Using BY group processing does not require numeric variables.
proc sort data=sashelp.class out=class;
by sex age ;
run;
data want;
set class;
by sex;
if first.sex then youngest=name;
retain youngest;
if last.sex;
oldest = name;
keep sex youngest oldest;
run;
proc print;
run;
Thank you.
That looks to be a simpler solution to the counting up of cases and contacts.
wlierman
@wlierman wrote:
I need the numeric form so that I can use the first. last. method to count all individual cases and contacts by day_month over the time frame that Public Health has data from 2020 to current. I then hope to create a line graph of cases and contacts using SAS Graph.
You can use the DISTINCT keyword in SQL code. So if you have one record per contact that includes both a DATE and a CASE id then you should be able to use COUNT(*) for number of contacts and COUNT(DISTINCT CASE) as number of cases. Not sure what a day month is, perhaps you want "01-01" for January first?
proc sql;
create want as
select substr(put(date,yymmdd10.),6) as MMDD
, count(distinct cases) as n_cases
, count(*) as n_contacts
from have
group by MMDD
;
quit;
Your DAY_MON1 variable looks like a date, but apparently it is NOT a date. You can convert it to a date using the INPUT() function. Why is it called "DAY_MON" if it includes day, month and year? Note that if you want to use character strings for dates it is best to use YYMMDD. format instead of others because then lexicographic order will also be chronological order.
So use something like:
create Table SASCHART.Setup_Case as
select input(day_mon1,date9.) as date format=yymmdd10.
, count(distinct CaseID) as n_cases
/*, count(*) as n_contacts*/
from SASCHART.Date_Fix_Data
group by 1
;
To count at the individual date level.
If you want to aggregate to MONTH of other time division use INTNX() function to all of the dates in the same interval to the same value.
create Table SASCHART.Setup_Case as
select intnx('month',input(day_mon1,date9.),0)
as month format=yymmd7.
, count(distinct CaseID) as n_cases_per_month
/*, count(*) as n_contacts*/
from SASCHART.Date_Fix_Data
group by 1
;
Or if you are looking to see months from some base date perhaps you want to use intck() function instead?
create Table SASCHART.Setup_Case as
select intck('month','01OCT2020'd,input(day_mon1,date9.))
as month_num
, count(distinct CaseID) as n_cases_per_month
/*, count(*) as n_contacts*/
from SASCHART.Date_Fix_Data
where input(day_mon1,date9.) >= '01OCT2020'd
group by 1
;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.