BookmarkSubscribeRSS Feed
Jepi
Calcite | Level 5
I need to compare 2 datasets (should be equal). They have same variables but I want to see matching and missing observations. I have attempted PROC Compare. I'm having a problem with the "value comparison results for variables" output listing a value in both datasets and calculating a percent difference.
Ex: obs Base Compare Diff. %Diff
17 17Sep2014 17Sep2014. 0.6250 0.003128
12 REPLIES 12
SASKiwi
PROC Star

One explanation for what you are seeing is that the variable you are comparing is in fact a DATETIME but you are only seeing the DATE part of it. What SAS format is assigned to this variable? 

Jepi
Calcite | Level 5
I don't see an assigned format for the specimen collection date variable. It only lists the format of "common variables with differing attributes" under variable summary
SASKiwi
PROC Star

Use PROC CONTENTS to check for assigned formats. PROC COMPARE is indicating there is an assigned format by displaying data in the form DDMONYYYY:

 

proc contents data = DatasetName;
run;
Jepi
Calcite | Level 5
Dataset Type Len Format Informat
Base Num 8 Date9. Date9.
Compare Num 8 Date9. Date9.
ballardw
Super User

Please show the code you are attempting. Compare has a number of options on what to compare and how to display the results.

 

Also if you are interested in comparisons of observations you may need to sort the data sets by a number of variables to get them into the same order.

Jepi
Calcite | Level 5
PROC COMPARE BASE = Dataset COMPARE = Dataset;

I created the base dataset manually to match that of compare. More than one facility reports to the base and therefore had more data than needed. As opposed to the compare dataset which is data from one facility.

I'm looking to compare them for accuracy
Tom
Super User Tom
Super User

You probably converted one of the files from Excel.  In Excel datetime values are stored using the interger part for the date and the fraction as the percent of 24 hours.  In SAS the DATE format will ignore the fractional part of the value and just display the date.

404   data _null_;
405    dt1='17Sep2014'd ;
406    dt2=dt1 + 0.6250;
407    put (dt1 dt2) (=) / (dt1 dt2) (date9. +1) ;
408   run;

dt1=19983 dt2=19983.625
17SEP2014 17SEP2014
Jepi
Calcite | Level 5
Both files are imported Excel files. I should do this with all (3) of date variables?
Tom
Super User Tom
Super User

You will need to decide what to do based on what you need.

If you are just looking to see if the values have the date part and the very few that don't are just because of these fractional parts then you could perhaps just declare success and move on.

 

If you did want to convert you would use a data step to make a new version of the data with the date values converted.  If you have more than one date variable per dataset then you can use ARRAY processing to simplify the code.

 

For example to just remove the fractional date part using the INT() or ROUND(,1) function.

mydate = int(mydate);

If you wanted to convert the variable to DATETIME instead to preserve the time part then you could use the DHMS() format. It will automatically conver the fractional part for you.

mydate = dhms(mydate,0,0,0);
format mydate datetime20.;

 

 

Jepi
Calcite | Level 5
I don't understand why it's reading date and time. I know fate is the integer portion and time fractional? Both excel sheets are in the same date format XX/XX/XXXX
Tom
Super User Tom
Super User

The import wizard noticed that most of the cells in that column have a date format attached, so it decided to import the value as a date. Since Excel stores dates as number and so does SAS it just copied over the number and adjusted the value to match the difference in base dates that the two systems use.  If the value had a fractional date then that was copied along with it.  In both SAS and Excel if you format the value with a format that ignores the fractional value then you do not normally see the fraction value, but it is still there in the data.

Jepi
Calcite | Level 5
I used the import wizard

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2303 views
  • 0 likes
  • 4 in conversation