- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Posted 06-21-2007 09:32 AM
(2315 views)
I have a table that may or may not have data in certain fields. In my PROC SQL, when I select the field that might / might not have data, my subsequent PROC REPORT displays the field as 'Error' when data is not present.
How might I at least dummy fill that field with something like a quoted space (' ') or the words 'N/A', to avoid seeing that 'Error' in the field? Would I do that in the PROC SQL or in the PROC REPORT?
How might I at least dummy fill that field with something like a quoted space (' ') or the words 'N/A', to avoid seeing that 'Error' in the field? Would I do that in the PROC SQL or in the PROC REPORT?
15 REPLIES 15
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
That is very odd. I have never had PROC REPORT display the word ERROR on a report when the data value is missing. Although you can sometimes get ERROR messages in the log when you try to use a variable the wrong way or apply the wrong format.
Try this program in a code node (NOT as a stored process). In my program, I am creating a file of 5 observations. Then I run a series of PROC SQL steps showing how to select rows based on whether NAME is missing, GRP is missing or NUMVAR is missing or whether they're NOT missing. Then I run a PROC REPORT on the same data.
[pre]
data makedata;
infile datalines dsd;
input name $ grp $ numvar;
return;
datalines;
allan, aaa, 10
bob,,20
carl,ccc,30
dave,ddd,.
,eee,50
;
run;
title 'name is missing';
proc sql;
select name, grp, numvar
from makedata
where name is missing;
quit;
title 'grp is missing';
proc sql;
select name, grp, numvar
from makedata
where grp is missing;
quit;
title 'numvar is missing';
proc sql;
select name, grp, numvar
from makedata
where numvar is missing;
quit;
title 'all have values';
proc sql;
select name, grp, numvar
from makedata
where name is not missing and
grp is not missing and
numvar is not missing;
quit;
title 'Simple Proc Report';
proc report data=makedata nowd;
column name grp numvar;
run;
proc format;
value $namefmt ' ' = 'Not Applicable';
value $grpfmt ' ' = 'Not Applicable';
value nvfmt . = 'Not Applicable';
run;
title 'Format with Proc Report';
proc report data=makedata nowd ;
column name grp numvar;
define name / display f=$namefmt.;
define grp / display f=$grpfmt.;
define numvar / sum f=nvfmt.;
run;
title;
[/pre]
The first PROC REPORT does not use any format for the missing values, but the second PROC REPORT does use a format to put "Not Applicable" in the columns with missing values.
If you are having trouble with a specific type of data or with PROC REPORT or your PROC SQL, your best bet might be to contact Tech Support because they can look at your data and at your code and help you resolve the problem.
cynthia
Results from the Proc Reports:
[pre]
Simple Proc Report
name grp numvar
allan aaa 10
bob 20
carl ccc 30
dave ddd .
eee 50
Format with Proc Report
name grp numvar
allan aaa 10
bob Not Applicable 20
carl ccc 30
dave ddd Not Applicable
Not Applicable eee 50
[/pre]
Try this program in a code node (NOT as a stored process). In my program, I am creating a file of 5 observations. Then I run a series of PROC SQL steps showing how to select rows based on whether NAME is missing, GRP is missing or NUMVAR is missing or whether they're NOT missing. Then I run a PROC REPORT on the same data.
[pre]
data makedata;
infile datalines dsd;
input name $ grp $ numvar;
return;
datalines;
allan, aaa, 10
bob,,20
carl,ccc,30
dave,ddd,.
,eee,50
;
run;
title 'name is missing';
proc sql;
select name, grp, numvar
from makedata
where name is missing;
quit;
title 'grp is missing';
proc sql;
select name, grp, numvar
from makedata
where grp is missing;
quit;
title 'numvar is missing';
proc sql;
select name, grp, numvar
from makedata
where numvar is missing;
quit;
title 'all have values';
proc sql;
select name, grp, numvar
from makedata
where name is not missing and
grp is not missing and
numvar is not missing;
quit;
title 'Simple Proc Report';
proc report data=makedata nowd;
column name grp numvar;
run;
proc format;
value $namefmt ' ' = 'Not Applicable';
value $grpfmt ' ' = 'Not Applicable';
value nvfmt . = 'Not Applicable';
run;
title 'Format with Proc Report';
proc report data=makedata nowd ;
column name grp numvar;
define name / display f=$namefmt.;
define grp / display f=$grpfmt.;
define numvar / sum f=nvfmt.;
run;
title;
[/pre]
The first PROC REPORT does not use any format for the missing values, but the second PROC REPORT does use a format to put "Not Applicable" in the columns with missing values.
If you are having trouble with a specific type of data or with PROC REPORT or your PROC SQL, your best bet might be to contact Tech Support because they can look at your data and at your code and help you resolve the problem.
cynthia
Results from the Proc Reports:
[pre]
Simple Proc Report
name grp numvar
allan aaa 10
bob 20
carl ccc 30
dave ddd .
eee 50
Format with Proc Report
name grp numvar
allan aaa 10
bob Not Applicable 20
carl ccc 30
dave ddd Not Applicable
Not Applicable eee 50
[/pre]
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This time, my bad. I didn't tell you that I had defined a format as follows:
proc format;
picture reportdate other='%0m/%0d/%0Y' (datatype=datetime);
run;
And then used it in a Proc SQL as follows:
EMPLOYEE.CURRENT_HIRE_DATE FORMAT=REPORTDATE.,
EMPLOYEE.ORIGINAL_HIRE_DATE FORMAT=REPORTDATE.,
EMPLOYEE.ADJUSTED_SERVICE_DATE FORMAT=REPORTDATE.,
EMPLOYEE.TERMINATION_DATE FORMAT=REPORTDATE.,
It's in my follow-on Proc Report, when there is nothing for the employee's termination date that I get a display of the word 'Error' if there is nothing for the employee's termination date.
I defined my own date format because I did not see a MM/DD/YY format in SAS's date formats. So....I'm thinking there might be something related to my date format definition that caused the "Error" to show when a field is empty.
Any thoughts?
proc format;
picture reportdate other='%0m/%0d/%0Y' (datatype=datetime);
run;
And then used it in a Proc SQL as follows:
EMPLOYEE.CURRENT_HIRE_DATE FORMAT=REPORTDATE.,
EMPLOYEE.ORIGINAL_HIRE_DATE FORMAT=REPORTDATE.,
EMPLOYEE.ADJUSTED_SERVICE_DATE FORMAT=REPORTDATE.,
EMPLOYEE.TERMINATION_DATE FORMAT=REPORTDATE.,
It's in my follow-on Proc Report, when there is nothing for the employee's termination date that I get a display of the word 'Error' if there is nothing for the employee's termination date.
I defined my own date format because I did not see a MM/DD/YY format in SAS's date formats. So....I'm thinking there might be something related to my date format definition that caused the "Error" to show when a field is empty.
Any thoughts?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi...Ah, that explains it. How about the mmddyy8. or mmddyy10. formats???
Although picture formats are wonderful, I generally don't use OTHER as the way to format values because MISSING can fall into the OTHER category. If I were using a PICTURE format, I always code LOW-HIGH as the range instead of OTHER.
However, I would probably do this if I needed to have some string for missing values. The use of mmddyy10. in square brackets tells SAS to use the regular mmddyy10. format for any dates between the lowest date in the file up to an including 12/31/2007 -- then you can see how missing is handled versus a future date (handled with OTHER):
[pre]
proc format;
value bfmt low-'31Dec2007'd = [mmddyy10.]
. = 'Missing Date'
other='Future Date';
run;
data testdate;
infile datalines;
input name $ bday : mmddyy10.;
otherdate = bday;
otherdate2 = bday;
datalines;
alan 12/31/2007
bob 11/15/1984
carl .
dave 07/07/2016
;
run;
proc print data=testdate;
format bday bfmt. otherdate mmddyy8.
otherdate2 worddate.;
run;
[/pre]
When you run the code in an EG code node, you should be able to compare the column that uses the BFMT format versus the one that uses just mmddyy8. or worddate. format. If you used the following options statement:[pre]
options missing = ' ';[/pre]
you would not see a . when the date was missing.
cynthia
Although picture formats are wonderful, I generally don't use OTHER as the way to format values because MISSING can fall into the OTHER category. If I were using a PICTURE format, I always code LOW-HIGH as the range instead of OTHER.
However, I would probably do this if I needed to have some string for missing values. The use of mmddyy10. in square brackets tells SAS to use the regular mmddyy10. format for any dates between the lowest date in the file up to an including 12/31/2007 -- then you can see how missing is handled versus a future date (handled with OTHER):
[pre]
proc format;
value bfmt low-'31Dec2007'd = [mmddyy10.]
. = 'Missing Date'
other='Future Date';
run;
data testdate;
infile datalines;
input name $ bday : mmddyy10.;
otherdate = bday;
otherdate2 = bday;
datalines;
alan 12/31/2007
bob 11/15/1984
carl .
dave 07/07/2016
;
run;
proc print data=testdate;
format bday bfmt. otherdate mmddyy8.
otherdate2 worddate.;
run;
[/pre]
When you run the code in an EG code node, you should be able to compare the column that uses the BFMT format versus the one that uses just mmddyy8. or worddate. format. If you used the following options statement:[pre]
options missing = ' ';[/pre]
you would not see a . when the date was missing.
cynthia
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
And hello, again! I tried the mmddyy8. and mmddyy10. formats, but all I got on my output screen was abunch of asterisks; with my picture definition, at least I could see actual dates. Why would I get asterisks with the mmddyy8. or mmddyy10. formats?
I used the OTHER based on samples I saw on the web.
I'll play with your suggestion and see what happens.
Thanks again for your help and for your knowledge.
I used the OTHER based on samples I saw on the web.
I'll play with your suggestion and see what happens.
Thanks again for your help and for your knowledge.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Asterisks usually appear in a format when the value overflows the display capacity of the structure.
I have a hunch you are using a date format with a datetime value. Is that possible?
Kind regards
David
I have a hunch you are using a date format with a datetime value. Is that possible?
Kind regards
David
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
dkvi,
Yes, my date data comes in in a datetime value, however, shouldn't the MMDDYY8. or MMDDYY10. resolve that, so that all I see is something like 10/23/04 ?
Thanks!
Yes, my date data comes in in a datetime value, however, shouldn't the MMDDYY8. or MMDDYY10. resolve that, so that all I see is something like 10/23/04 ?
Thanks!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
No, as you can see in the output below, MMDDYY10. does NOT format a date/time number the way you want. However, there IS a function (DATEPART) you can use to extract the Date portion of a date/time number. Note how the unformatted value of TEMP is NOT the number of days since Jan 1, 1960, but is the number of seconds (I think) from Jan 1, 1960 -- so it's really NOT a date number that can be formatted with MMDDYY10 format.
cynthia
[pre]
data testdate;
infile datalines;
input name $ temp : datetime.;
bday = datepart(temp);
otherdate = datepart(temp);
errdate = temp;
datalines;
alan 31DEC2007:02:35:43
bob 15NOV1984:08:55:41
carl .
dave 07JUL2016:10:45:31
;
run;
proc print data=testdate;
format bday bfmt. otherdate mmddyy10.
errdate mmddyy10.;
run;
[/pre]
The output
[pre]
Obs name temp bday otherdate errdate
1 alan 1514687743 12/31/2007 12/31/2007 **********
2 bob 784976141 11/15/1984 11/15/1984 **********
3 carl . Missing Date . .
4 dave 1783507531 Future Date 07/07/2016 **********
[/pre]
cynthia
[pre]
data testdate;
infile datalines;
input name $ temp : datetime.;
bday = datepart(temp);
otherdate = datepart(temp);
errdate = temp;
datalines;
alan 31DEC2007:02:35:43
bob 15NOV1984:08:55:41
carl .
dave 07JUL2016:10:45:31
;
run;
proc print data=testdate;
format bday bfmt. otherdate mmddyy10.
errdate mmddyy10.;
run;
[/pre]
The output
[pre]
Obs name temp bday otherdate errdate
1 alan 1514687743 12/31/2007 12/31/2007 **********
2 bob 784976141 11/15/1984 11/15/1984 **********
3 carl . Missing Date . .
4 dave 1783507531 Future Date 07/07/2016 **********
[/pre]
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Interesting..... I think I understand. So, you can't go directly from a SAS-format date into a date format. This gets deeper and deeper!! 🙂 OK... I'll play around with DATEPART also. Thanks for the information!!!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Not exactly --- the WHOLE point is that a SAS date/time value is different from a SAS date value. There ARE formats that work with SAS date/time values, but they do not use the mm/dd/yy form for representing dates.
So even though a SAS date/time value represents BOTH the date and the time, the number that's stored is the number of SECONDS since Jan 1, 1960; whereas, the SAS date value is the number of DAYS since Jan 1, 1960. So MMDDYY format works on DATE values, NOT date/time values.
That's why you need the DATEPART function -- to EXTRACT the number of DAYS from the number of SECONDS.
cynthia
So even though a SAS date/time value represents BOTH the date and the time, the number that's stored is the number of SECONDS since Jan 1, 1960; whereas, the SAS date value is the number of DAYS since Jan 1, 1960. So MMDDYY format works on DATE values, NOT date/time values.
That's why you need the DATEPART function -- to EXTRACT the number of DAYS from the number of SECONDS.
cynthia
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
OK....I am beginning to understand. I think part of my confusion has come from having worked with Microsoft products, like Access, where date fields containing Dates / Times could be broken out in one step to date formats.
It's starting to make sense....thanks!
It's starting to make sense....thanks!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I think just about every data storage package uses an epoch for calculating dates. SAS is 1960, Microsoft is usually 1900. However, be aware that not even M$ is consistent since the date time values stored on physical file headers in Windows represent the number of milliseconds since Jan 1 1900.
For Excel, Access and SQLServer, M$ also store date times as days and decimal parts of days, so the time 8am is actually 0.33333333. Try that for yourself in an Excel sheet, and change the format as needed to see the real values and the representation.
Frankly, although it is tiresome when you forget and try to format a DateTime as a date, it was a lesson I learned early and has been very helpful.
I'm also going to disagree very slightly with Cynthia, but in the nicest possible way. If you use the DateTime9. format for a datetime you will get what appears to be a Date9. appearance. This is because SAS formats are terribly clever and drop redundant pieces of information if there is insufficient space to display them.
Kind regards
David
For Excel, Access and SQLServer, M$ also store date times as days and decimal parts of days, so the time 8am is actually 0.33333333. Try that for yourself in an Excel sheet, and change the format as needed to see the real values and the representation.
Frankly, although it is tiresome when you forget and try to format a DateTime as a date, it was a lesson I learned early and has been very helpful.
I'm also going to disagree very slightly with Cynthia, but in the nicest possible way. If you use the DateTime9. format for a datetime you will get what appears to be a Date9. appearance. This is because SAS formats are terribly clever and drop redundant pieces of information if there is insufficient space to display them.
Kind regards
David
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi, David...I meant to put that in my code as a format on the TEMP variable and a neuron failed to fire. 😉
Thanks!
cynthia
Thanks!
cynthia
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I agree.... this has been a lesson learned and has been very helpful. And you and Cynthia have been extremely helpful. I just love vertical learning curves!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Very interesting and useful discussion but I think one of the key things that David wrote here is "If you use the DateTime9. format for a datetime you will get what appears to be a Date9. appearance.".
A formatted value is just that. The appearance of a data value in a manner that we can understand. The underlying value is still the number of seconds not days. Just a pointer when it comes to manipulating dates and date/times.
Cheers,
Lawrence
A formatted value is just that. The appearance of a data value in a manner that we can understand. The underlying value is still the number of seconds not days. Just a pointer when it comes to manipulating dates and date/times.
Cheers,
Lawrence