Hello,
I have a large dataset that was populated from multiple sources of data. It has a percentage variable whose values are stored differently depending on how they were formatted in the original datasets. I would like an easy way to compare the different values of that variable to see which of them I need to reformat. For example, depending on the original data source, a value of 4.25% may be stored as .0425 or, incorrectly, as .00425. That is just one example. I need to compare the values and, if possible, see all of the differences in format for that variable.
Is there an easy way to do this?
Thanks!
If you can specify rules, yes. If not, then no.
How do you know if it's really 0.045 or should be 0.0045?
I would:
1. Redo my merge/append and make sure to identify the source files. It's likely that each file is internally consistent, e.g., if a variable is messed up for FileA, it's messed up for all the records in FileA for that variable. If you appended, you can use the INDSNAME= option on the SET statement.
2. Eyeball the data or do a histogram and isolate values with the file source. You'll likely be able to pick out the issues.
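The two steps above could be sketched roughly as follows. This is only an illustration: the dataset names work.file_a and work.file_b, the variable name pct, and the output names combined and source are all made up here and need to be replaced with your own.

```sas
/* Hypothetical inputs: work.file_a and work.file_b, each with a numeric variable pct */
data combined;
    length source $41;                        /* libref.member can be up to 41 characters */
    set work.file_a work.file_b indsname=dsn; /* dsn holds the name of the current input dataset */
    source = dsn;                             /* dsn is not written to the output, so copy it */
run;

/* Compare the distribution of pct by source file */
proc means data=combined n min p25 median p75 max;
    class source;
    var pct;
run;

/* One histogram per source file for eyeballing */
proc sgpanel data=combined;
    panelby source;
    histogram pct;
run;
```

A source whose minimum, median and maximum all sit roughly a factor of 10 or 100 away from the others is a likely candidate for a scaling problem.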
Unless there's a rule that you can define, such as "if percent is < 0.001 then it's wrong", I'm not sure how you would identify those records.
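If a rule like that can be defined, flagging the records is straightforward. A minimal sketch, assuming a dataset hypothetically named combined that holds the percentage in pct and a file identifier in source; the 0.001 cutoff is just the example threshold from above, not a recommendation:

```sas
data flagged;
    set combined;
    /* Assumed rule: a non-missing value below 0.001 is suspect.           */
    /* Missing values compare as smaller than any number in SAS, so they   */
    /* are excluded explicitly.                                            */
    suspect = (not missing(pct) and pct < 0.001);
run;

/* Count suspect records per source file */
proc freq data=flagged;
    tables source*suspect / nocol nopercent;
run;
```

If the suspect records cluster in a handful of sources, that points to a problem with specific files rather than with individual values.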
It's probably obvious that reimporting/fixing data at the source is the ideal solution, which should be relatively easy since it's a program. Then you just re-run the remaining portion of your programs.
Yes, that would definitely be easiest. It is a lot of code to sift through and over 50 datasets would have to be appended back together for the new dataset. I was trying to avoid that by doing a quick update but that doesn't look likely at this point.