BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
wlierman
Lapis Lazuli | Level 10

I have data that I am trying to transform in order to create a SAS visual. There are several steps before arriving there.  I have exported a few lines of data as an xlsx file and attached here.

 

A typical line of data is like (I have chosen just the important fields)

Case              Date               Contact_ID       Date1

234243       01MAR2020      51954633      03SEP2020

716432       02MAR2020      53426754      03SEP2020

 

and so on

The date values are formatted as date9.  I want to split then by day, month, year so I tried the following

Data SASCHART.Date_Fix;
  set SASCHART.Cases_Contacts_Want;
  Yr_1 = substr(Date,6,4); 
  Mon_1= substr(Date,3,5);
  Day_1 = substr(Date,1,2);
  Daymon_1=trim(Day_1||Mon_1);

  Yr_2 = substr(Date1,6,4); 
  Mon_2= substr(Date1,3,5);
  Day_2 = substr(Date1,1,2);
  Daymon_2=trim(Day_2||Mon_2);
run;

which didn't produce the result I need which is

      Daymon_1                Daymon_2

      01MAR                        03SEP

 

and so on.  So I looked in some older SAS Communities posts and tried something like this

Data SASCHART.Date_Fix;
  set SASCHART.Cases_Contacts_Want;
  new_date_1 = mdy(substr(Date,3,5), substr(Date,1,2), substr(Date,6,4));
  format new_date_1 YYMMDD8.;
run;

That didn't work either.

 

I want to count the number of Cases for each daymon_1  and the number of contact_ID by daymon_2.  It may be something like           cases                                             contacts

                                                                date          count                            date             count

                                                               01MAR         1                              01SEP                38

                                                                     .                                                   .

                                                                     .                                                   .

                                                               26OCT        834                           16OCT               543

 

Then I will use the dates on an x-axis and the counts on the y-axis to create an overlaid line graph in SAS.   The cases started earlier than contact-tracing and are several days ahead of the contact counts, since contacting_tracing is slowing down for several reasons.

 

First, I need to split the dates into substrings properly. 

Your expertise is very much appreciated.

 

wlierman

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Run a PROC CONTENTS on your SAS dataset (not on your Excel output) and see what TYPE of variable you have.

 

If DATE is a numeric variable that is being displayed using the DATE9. format then it contains the number of days since 1960.  Do not use string functions like SUBSTR() on numeric variables.  SAS will first have to convert the number of days into a string and it will not look like the strings that the DATE9. format prints. If you want to get a string that just has the first 5 characters when printed using the DATE9. format then use the PUT() function with the DATE format but set the width to only 5 characters.

daymon = put(date,date5.);

And if you want the truncated ddMON strings to display in printouts and reports then don't bother the make any new variables, just use the one you already have with the different display format.

format date date5.;

If DATE is a character variable that has strings that look like they were dates printed using the DATE9. format then just use the substr() function directly on the character variable.

daymon = substr(date,1,5);

Or you can also use a format to have it only print the first 5 characters.

format date $5. ;

If you do have a character variable you can use the INPUT() function to generate a date value from it.  Which you can then use with any of the various functions that operate on date values or formats that display date values.

datenum=input(date,date9.);
day_of_month=day(datenum);
month_of_year=month(datenum);
calendar_year=year(datenum);
format datenum yymmdd10.;
format datenum mmddyy10.;

The PUTN() function makes it possible to pass in the format specification as a string so you can take a look at how a number of different formats will display the same value.

779   data _null_;
780     length format display $50 ;
781     date=today();
782     do format='yymmdd10.','mmddyy10.','ddmmyy10.','date5.','date7.','date9.','date11.','monyy7.' ;
783        display = putn(date,format);
784        put format $12. display ;
785     end;
786   run;

yymmdd10.   2021-11-02
mmddyy10.   11/02/2021
ddmmyy10.   02/11/2021
date5.      02NOV
date7.      02NOV21
date9.      02NOV2021
date11.     02-NOV-2021
monyy7.     NOV2021

View solution in original post

7 REPLIES 7
PaigeMiller
Diamond | Level 26

In SAS data set SASCHART.Cases_Contacts_Want, are the dates numeric or text? What does PROC CONTENTS say?

--
Paige Miller
wlierman
Lapis Lazuli | Level 10
The dates are numeric from Proc Contents.



| Alphabetic List of Variables and Attributes |
| Variable | Type | Len | Format | Informat | Label |   |
| CaseID | Num | 8 | BEST. |   | CaseID |   |
| CaseStatus | Char | 11 | $11. | $11. | CaseStatus |   |
| CaseStatus1 | Char | 11 | $11. | $11. | CaseStatus1 |   |
| Contact_Person_ID | Char | 11 | $11. | $11. | Contact Person ID |   |
| County | Char | 10 | $10. | $10. | County |   |
| County_T | Char | 41 |   |   | County_T |   |
| Date | Num | 8 | DATE9. |   | Date |   |
| Date1 | Num | 8 | DATE9. |   | Date1 |   |






Reeza
Super User
Use date functions, YEAR(), MONTH(), DAY() to extract the components or look at the SAS formats.
Or you need to convert it to text if you need it.
ballardw
Super User

If a value is a DATE you can get the day of the month using the DAY function  : dayofmonth= day(datevariable).

Same with the Month and Year. The values will be numbers, which means the results will sort properly.

 

However if you want to Count something in ddMON appearance use Proc Freq and the format DATE5.

Groups created by formats will be honored for most analysis, reporting and graphing procedures.

 

An example:

data example;
   input date :Date9.;
   format date date9.;
datalines;
01Jan2021
05Jan2021
19Jan2021
01Feb2021
01Feb2021
01Feb2020
;

proc freq data=example;
   tables date;
   format date date5.;
run;

Please note that I included a value of the same date in a separate year intentionally. You have to consider year somewhere if your data includes more than one year.

 

This "does not work" because when date is numeric SAS will convert the number to a character value using the BEST12 format resulting in many leading blanks and the substr function has nothing to work with for most of the bits. Plus the MDY function would be attempting to create a new date value, not a string. If Date started as string that might work if starting at the 3rd position the 5 following characters consisted of spaces and or 2 consecutive digits to indicate month but since you have subst(date,6,4) one suspects what ever you borrowed this from was either different or looking a very odd "date" values as character.

  new_date_1 = mdy(substr(Date,3,5), substr(Date,1,2), substr(Date,6,4));

 

 

BTW Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

Tom
Super User Tom
Super User

Run a PROC CONTENTS on your SAS dataset (not on your Excel output) and see what TYPE of variable you have.

 

If DATE is a numeric variable that is being displayed using the DATE9. format then it contains the number of days since 1960.  Do not use string functions like SUBSTR() on numeric variables.  SAS will first have to convert the number of days into a string and it will not look like the strings that the DATE9. format prints. If you want to get a string that just has the first 5 characters when printed using the DATE9. format then use the PUT() function with the DATE format but set the width to only 5 characters.

daymon = put(date,date5.);

And if you want the truncated ddMON strings to display in printouts and reports then don't bother the make any new variables, just use the one you already have with the different display format.

format date date5.;

If DATE is a character variable that has strings that look like they were dates printed using the DATE9. format then just use the substr() function directly on the character variable.

daymon = substr(date,1,5);

Or you can also use a format to have it only print the first 5 characters.

format date $5. ;

If you do have a character variable you can use the INPUT() function to generate a date value from it.  Which you can then use with any of the various functions that operate on date values or formats that display date values.

datenum=input(date,date9.);
day_of_month=day(datenum);
month_of_year=month(datenum);
calendar_year=year(datenum);
format datenum yymmdd10.;
format datenum mmddyy10.;

The PUTN() function makes it possible to pass in the format specification as a string so you can take a look at how a number of different formats will display the same value.

779   data _null_;
780     length format display $50 ;
781     date=today();
782     do format='yymmdd10.','mmddyy10.','ddmmyy10.','date5.','date7.','date9.','date11.','monyy7.' ;
783        display = putn(date,format);
784        put format $12. display ;
785     end;
786   run;

yymmdd10.   2021-11-02
mmddyy10.   11/02/2021
ddmmyy10.   02/11/2021
date5.      02NOV
date7.      02NOV21
date9.      02NOV2021
date11.     02-NOV-2021
monyy7.     NOV2021
wlierman
Lapis Lazuli | Level 10
Proc contents reveals that the data variables are numeric of format date9. So I will apply you other suggestions to move forward.

Thank you.

wlierman
yabwon
Onyx | Level 15

You could do it with user define format, 

When you input date data are text strings:

data have2;
  input Case Date : $ 9. Contact_ID Date1 : $ 9.;
cards;
234243       01MAR2020      51954633      03SEP2020
716432       02MAR2020      53426754      03SEP2020
;
run;
proc print;
run;

do it like this:

proc format;
picture mydate (default=5)
          other='%d%b' (datatype=date LANGUAGE=English);
;
run;

[edit:] or use suggested date5. format

and then

data want2;
  set have2;
  length Daymon_1 Daymon_2 $ 5;
  Daymon_1 = put(input(Date,date9.), mydate. /* date5. */);
  Daymon_2 = put(input(Date1,date9.),mydate. /* date5. */);
run;
proc print;
run;

Bart

 

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1507 views
  • 0 likes
  • 6 in conversation