- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi there
I was hoping anyone could help me to generate within the Enterprise Guide recoded column function, the zodiac sign base on dd-mm-yyyy birthdates.
Should i breakdown the dates in Month(x), Day(x)? How can generate a code within the generate new column options?
Aries March 21 - April 19
Taurus April 20 - May 20
Gemini May 21 - June 20
Cancer June 21 - July 22
Leo July 23 - August 22
Virgo August 23 - September 22
Libra September 23 - October 22
Scorpio October 23 - November 21
Sagittarius November 22 - December 21
Capricorn December 22 - January 19
Aquarius January 20 - February 18
Pisces February 19 - March 20
Thanks in advanced
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Cesar,
That's why you need to recode the dates for use with the format -- take the Year out of the equation. Zodiac signs don't change from year-to-year (do they?).
In the Query Builder, compute a new column based on the date value, with the month, day, and a fixed year. I used Year 2000 because it was a leap year and accounts for Feb 29. Here's an expression you can use:
mdy(month(t1.date),day(t1.date),2000)
Then apply the sign. format in the column properties.
Following: complete program that shows it in action.
/* SAS format for zodiac signs in a given year */
proc format lib=work;
value sign
'21Mar2000'd - '19Apr2000'd = 'Aries'
'20Apr2000'd - '20May2000'd = 'Taurus'
'21May2000'd - '20Jun2000'd = 'Gemini'
'21Jun2000'd - '22Jul2000'd = 'Cancer'
'23Jul2000'd - '22Aug2000'd = 'Leo'
'23Aug2000'd - '22Sep2000'd = 'Virgo'
'23Sep2000'd - '22Oct2000'd = 'Libra'
'23Oct2000'd - '21Nov2000'd = 'Scorpio'
'22Nov2000'd - '21Dec2000'd = 'Sagittarious'
/* split Capricorn to make two valid ranges */
/* that don't span the calendar boundary */
'22Dec2000'd - '31Dec2000'd = 'Capricorn'
'01Jan2000'd - '19Jan2000'd = 'Capricorn'
'20Jan2000'd - '18Feb2000'd = 'Aquarius'
'19Feb2000'd - '20Mar2000'd = 'Pisces'
other = 'Unknown';
run;
proc sql;
select air, date format=date9.,
mdy(month(date),day(date),2000) as sign format=sign.
from sashelp.air;
quit;
Output:
Chris
(updated with image of results)
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
In this blog post I show an approach that uses SAS formats to map dates to Zodiac signs:
What’s our sign? - The SAS Dummy
It doesn't translate exactly to Query Builder, but you could:
- run the code to define the format
- use Query Builder to recode the date to the range the format expects (Mon, Day, Year=2000)
- apply the format to the new column
Chris
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks! I'm starting to understand how to generate the dates, however, i deal with an infinite amount of birthdate years, which should be taken into account in the formula and don't have knowledge of.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Cesar,
That's why you need to recode the dates for use with the format -- take the Year out of the equation. Zodiac signs don't change from year-to-year (do they?).
In the Query Builder, compute a new column based on the date value, with the month, day, and a fixed year. I used Year 2000 because it was a leap year and accounts for Feb 29. Here's an expression you can use:
mdy(month(t1.date),day(t1.date),2000)
Then apply the sign. format in the column properties.
Following: complete program that shows it in action.
/* SAS format for zodiac signs in a given year */
proc format lib=work;
value sign
'21Mar2000'd - '19Apr2000'd = 'Aries'
'20Apr2000'd - '20May2000'd = 'Taurus'
'21May2000'd - '20Jun2000'd = 'Gemini'
'21Jun2000'd - '22Jul2000'd = 'Cancer'
'23Jul2000'd - '22Aug2000'd = 'Leo'
'23Aug2000'd - '22Sep2000'd = 'Virgo'
'23Sep2000'd - '22Oct2000'd = 'Libra'
'23Oct2000'd - '21Nov2000'd = 'Scorpio'
'22Nov2000'd - '21Dec2000'd = 'Sagittarious'
/* split Capricorn to make two valid ranges */
/* that don't span the calendar boundary */
'22Dec2000'd - '31Dec2000'd = 'Capricorn'
'01Jan2000'd - '19Jan2000'd = 'Capricorn'
'20Jan2000'd - '18Feb2000'd = 'Aquarius'
'19Feb2000'd - '20Mar2000'd = 'Pisces'
other = 'Unknown';
run;
proc sql;
select air, date format=date9.,
mdy(month(date),day(date),2000) as sign format=sign.
from sashelp.air;
quit;
Output:
Chris
(updated with image of results)
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks Chris@SAS, i'll create the format as you suggest.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Chris@SAS I'm creating the format to apply the zodiac signs but i'm having trouble getting SAS EG to approve it.
Here i show the specifications i'm using, i though i was writing the dates correctly. I've tried two ways: 01Jan2000'd - 20Jan2000'd and '01Jan2000'd - '20Jan2000'd, but SAS EG is no approving neither of them.
César
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Look at Chris' quotation marks compared to yours.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I also did the double quotation marks, still.. no approval of my format.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I don't know what is it that i'm doing wrong
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You could also defined a character format (Call it $MD2SIGN.) to convert from MMDD values to Zodiak sign.
Then you could calculate the sign using this simple statement.
sign = put(put(dob,mmddyy4.),$md2sign.);
Here is SAS code to generate the format.
proc format lib=work;
value $md2sign
'0321' - '0419' = 'Aries'
'0420' - '0520' = 'Taurus'
'0521' - '0620' = 'Gemini'
'0621' - '0722' = 'Cancer'
'0723' - '0822' = 'Leo'
'0823' - '0922' = 'Virgo'
'0923' - '1022' = 'Libra'
'1023' - '1121' = 'Scorpio'
'1122' - '1221' = 'Sagittarious'
'1222' - '1231'
,'0101' - '0119' = 'Capricorn'
'0120' - '0218' = 'Aquarius'
'0219' - '0320' = 'Pisces'
other = 'Unknown'
;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I know very little about EG but, I seem to recall, having a similar problem with applying date ranges. While it is totally against everything I've learned in SAS, try it WITHOUT THE QUOTES AND WITHOUT THE ENDING d.
That SHOULDN'T work, but I seem to recall that is how I got it to work.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Another way to go about this would be to populate a calculated column using a case statement. It would work in much the same way as the method Chris described above but does not require the creation of a format. The following is a case statement that *should* produce identical results:
CASE WHEN t1.date BETWEEN '21Mar2000' AND '19Apr2000' THEN 'Aries'
WHEN t1.date BETWEEN '20Apr2000' AND '20May2000' THEN 'Taurus'
WHEN t1.date BETWEEN '21May2000' AND '20Jun2000' THEN 'Gemini'
WHEN t1.date BETWEEN '21Jun2000' AND '22Jul2000' THEN 'Cancer'
WHEN t1.date BETWEEN '23Jul2000' AND '22Aug2000' THEN = 'Leo'
WHEN t1.date BETWEEN '23Aug2000' AND '22Sep2000' THEN 'Virgo'
WHEN t1.date BETWEEN '23Sep2000' AND '22Oct2000' THEN 'Libra'
WHEN t1.date BETWEEN '23Oct2000' AND '21Nov2000' THEN 'Scorpio'
WHEN t1.date BETWEEN '22Nov2000' AND '21Dec2000' THEN 'Sagittarious'
WHEN t1.date BETWEEN '22Dec2000' AND '31Dec2000' THEN 'Capricorn'
WHEN t1.date BETWEEN '01Jan2000' AND '19Jan2000' THEN 'Capricorn'
WHEN t1.date BETWEEN '20Jan2000' AND '18Feb2000' THEN 'Aquarius'
WHEN t1.date BETWEEN '19Feb2000' AND '20Mar2000' THEN 'Pisces'
ELSE 'Unknown'
END
I did not include the 'd behind my date values because in my input table the date column is defined as a string instead of a date. You may have to change it depending on the format of your input. Also, this raises a question that I do not know the answer to (I am a fairly new SAS EG user myself). What are the performance differences between running this case statement vs building a format like Chris described? If I were to run this function on say millions of rows would one run faster than the other? Just curious =]
Hope this helps!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
SAS Formats are pretty fast, and I'm not sure you would see a big difference in a SAS data table. But if your data source is in a database, PROC SQL can push the logic into the database process and perform the operation there, and the result could come back a lot quicker.
For example, suppose you want to run a query on a database (Oracle, let's say) of 1 million records - names and birthdays. You want to know the freq distribution of the Zodiac signs across this population -- counts, essentially.
With a SAS format, PROC SQL has to pull all records from Oracle to your SAS session, apply the format, and then count the record into one of the 12 "sign" groups. But if your SQL code can run in the Oracle process, then the database evaluates the CASE and adds the result to the count (assuming you are doing a GROUP BY Sign). At the end, only 12 records come over into SAS -- one for each Zodiac sign and its aggregated count.
You've saved lots of I/O processing, moving 12 records instead of 1 million. If there is a difference in the speed of SAS formats and a CASE statement, it's most likely negligible compared to the I/O cost .
Note: there *is* a way for you to "install" SAS formats into a database by using SAS In-Database Technologies. That helps you to get the best of both worlds: flexibility and power of Formats, with the "local" processing in the database.
Chris
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
That was very informative. Thanks for the quick reply!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks Chris@SAS it all worked correctly.