SAS Programming

DATA Step, Macro, Functions and more
elwayfan446
Barite | Level 11

Hello,

 

I have a large dataset that was populated from multiple data sources. It has a percentage variable whose values are stored on different scales because of how they were formatted in the original datasets. I would like an easy way to compare the values of this variable to see which of them I need to rescale. For example, a value of 4.25% may be stored correctly as .0425 in one source or incorrectly as .00425 in another. That is just one example. I need to be able to compare, and ideally see, all of the differences in how that variable is stored.

 

Is there an easy way to do this?  

 

Thanks!

6 REPLIES
Reeza
Super User

If you can specify rules, yes. If not, then no. 

 

How do you know if it's really 0.045 or should be 0.0045?

elwayfan446
Barite | Level 11
Because I created the data in the multiple original sources. When I converted it from Excel, I may have divided some of the data by a bigger factor than I needed to. Some of the original data may not have needed to be divided at all, but I didn't catch this until all of the data was merged together in a new dataset.
Reeza
Super User

I would: 

 

1. Redo my merge/append and make sure to identify the source files. It's likely that each file is internally consistent, e.g. if a variable is messed up for FileA, it's messed up for all the records in FileA for that variable. If you appended, you can use the INDSNAME= option on the SET statement. 

 

2. Eyeball the data or do a histogram, broken out by file source. You'll likely be able to pick out the issues. 
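
Steps 1 and 2 could be sketched roughly as follows. This is only an illustration under assumptions: WORK.FILEA and WORK.FILEB stand in for the 50+ source datasets, and PCT is a placeholder name for the percentage variable, since none of these names appear in the thread.

```sas
/* Hypothetical names: WORK.FILEA, WORK.FILEB and PCT are placeholders. */
data combined;
    length source $41;                 /* libref.member name fits in 41 characters */
    set work.filea work.fileb indsname=dsn;
    source = dsn;                      /* the INDSNAME= variable is temporary, so copy it */
run;

/* Summary of PCT per source: a file that is off by a factor of 10 or 100
   should stand out in the min/median/max columns. */
proc means data=combined n min p25 median p75 max maxdec=5;
    class source;
    var pct;
run;

/* Optional visual check: one histogram panel per source file. */
proc sgpanel data=combined;
    panelby source / columns=2;
    histogram pct;
run;
```

Because the problem is likely consistent within each file, comparing the per-source medians is usually enough to tell which sources were divided by the wrong factor.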

 

Unless there's a rule that you can define, such as "if percent is < 0.001 then it's wrong", I'm not sure how you would identify those records.
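
If a rule like that does hold, applying it might look like the sketch below. The 0.001 cutoff is just the example threshold from the sentence above, and WORK.HAVE and PCT are placeholder names for the merged dataset and the percentage variable.

```sas
/* Hypothetical names: WORK.HAVE and PCT are placeholders.            */
/* Assumption: any non-missing PCT below 0.001 was scaled incorrectly. */
data flagged;
    set work.have;
    suspect = (not missing(pct) and pct < 0.001);   /* 1 = suspect, 0 = looks fine */
run;

/* How many records does the rule catch? */
proc freq data=flagged;
    tables suspect;
run;

/* Look at a sample of the flagged records before changing anything. */
proc print data=flagged(obs=20);
    where suspect = 1;
    var pct;
run;
```

If the dataset also carries a variable identifying the source file, crossing it with SUSPECT in the TABLES statement would show whether the flagged records cluster in particular files.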

 

 

elwayfan446
Barite | Level 11
Great, let me look into that and I will let you know. Thanks for the quick reply.
Reeza
Super User

It's probably obvious that reimporting/fixing data at the source is the ideal solution, which should be relatively easy since it's a program. Then you just re-run the remaining portion of your programs.

elwayfan446
Barite | Level 11

Yes, that would definitely be easiest.  It is a lot of code to sift through and over 50 datasets would have to be appended back together for the new dataset.  I was trying to avoid that by doing a quick update but that doesn't look likely at this point.

