elwayfan446
Barite | Level 11

Hello,

 

I have a large dataset that was populated from multiple sources of data. It has a percentage variable whose values are scaled differently depending on how they were formatted in the original datasets. I would like an easy way to compare the different values of the variable to see which of them I need to reformat. For example, a value of 4.25% may be stored correctly as .0425 or incorrectly as .00425. That is just one example. I need to compare the values and, if possible, see all of the differences in format for that variable.

 

Is there an easy way to do this?  

 

Thanks!

6 REPLIES
Reeza
Super User

If you can specify rules, yes. If not, then no. 

 

How do you know if it's really 0.045 or should be 0.0045?

elwayfan446
Barite | Level 11
Because I created the data in the multiple original sources. When I converted it from Excel, I may have divided the data by a larger factor than I needed to. Some of the original data may not have needed to be divided at all, but I didn't catch this until all of the data was merged together in a new dataset.
Reeza
Super User

I would: 

 

1. Redo my merge/append and make sure to identify the source files. It's likely the problem is consistent within each file, e.g., if a variable is messed up for FileA, it's messed up for all the records in FileA for that variable. If you appended, you can use the INDSNAME= option on the SET statement.

 

2. Eyeball the data or do a histogram and isolate values with the file source. You'll likely be able to pick out the issues. 
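
The two steps above could look something like the following. This is only a sketch: the datasets work.filea and work.fileb, the variables pct and source, and the sample values are all made up for illustration, so substitute your own names.

```sas
/* Hypothetical source files; names and values are for illustration only */
data work.filea;
    input pct;
    datalines;
0.0425
0.0375
0.0510
;
run;

/* Hypothetical file where 4.25% was stored as 0.00425 by mistake */
data work.fileb;
    input pct;
    datalines;
0.00425
0.00375
0.00510
;
run;

/* Step 1: append and record which dataset each row came from.
   INDSNAME= names a temporary variable, so copy it to a real one to keep it. */
data work.combined;
    length source $41;
    set work.filea work.fileb indsname=dsn;
    source = dsn;
run;

/* Step 2: compare the range of the percentage variable by source */
proc means data=work.combined n min median max maxdec=5;
    class source;
    var pct;
run;
```

A source whose minimum and maximum sit roughly a factor of 10 or 100 away from the others is a likely candidate for rescaling. For the visual version, PROC UNIVARIATE with a HISTOGRAM statement and the same CLASS variable should give one histogram per source.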

 

Unless there's a rule that you can define, such as "if percent is < 0.001 then it's wrong", I'm not sure how you would identify those records.
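
If a rule like that does hold for your data, it could be applied as a simple flag. A minimal sketch, assuming a hypothetical dataset work.sample with a numeric variable pct; the 0.001 cutoff is just the example threshold from above, not a recommendation.

```sas
/* Hypothetical data; pct holds percentages stored as fractions */
data work.sample;
    input pct;
    datalines;
0.0425
0.00042
0.0375
;
run;

/* Flag values below the example threshold of 0.001 for review.
   Missing values compare as less than any number in SAS, so exclude them explicitly. */
data work.flagged;
    set work.sample;
    suspect = (not missing(pct) and pct < 0.001);   /* 1 = below threshold, 0 = otherwise */
run;

/* Count how many records the rule catches */
proc freq data=work.flagged;
    tables suspect;
run;
```

Combined with a source variable, a crosstab of source by suspect would show whether the flagged records cluster in particular files.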

 

 

elwayfan446
Barite | Level 11
Great, let me look into that and I will let you know. Thanks for the quick reply.
Reeza
Super User

It's probably obvious that reimporting/fixing data at the source is the ideal solution, which should be relatively easy since it's a program. Then you just re-run the remaining portion of your programs.

elwayfan446
Barite | Level 11

Yes, that would definitely be easiest.  It is a lot of code to sift through and over 50 datasets would have to be appended back together for the new dataset.  I was trying to avoid that by doing a quick update but that doesn't look likely at this point.
