NOA
Calcite | Level 5

Perhaps my question is a total newbie thing, but I am hitting a wall with simple stuff.

I used PROC IMPORT to import an Access data file successfully. Now I need to fix my data: in one of the variables, called country, some observations say BRAZIL and others say BRAZIL1, when in fact they are the same thing.

Can someone walk me through how I can create a new variable that fixes this and merges the two values? I tried IF/THEN statements, but they did not work.

Thanks!

Reeza
Super User

You say there is a variable called country with observations containing Brazil/Brazil1. Is that correct?

If so, did the data import incorrectly, or is it incorrect in the source?

Or did you mean two variables, one called Brazil and one called Brazil1?

Regardless, what would you like the output to look like?

NOA
Calcite | Level 5

Just to clarify, this is an example of what my PROC IMPORT output looks like.

COUNTRY    SEX    MARITAL STATUS
USA        F      S
PERU       M      S
BRAZIL1    M      S
BRAZIL     F      M
USA        M      S
BRAZIL1    M      S

BRAZIL1 and BRAZIL are one and the same. Some observations say BRAZIL1 and some say BRAZIL because two different people collected the data and there was no uniformity. My question is how to convert every BRAZIL1 into BRAZIL so the results are consistent.

Reeza
Super User

So really this is about cleaning data. This can be done in a data step, for example by using the COMPRESS function to remove any digits from the country value.

Here's a link to the documentation for the Compress function.

SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition

data want;
    set have;
    /* 'k' keeps the listed characters instead of removing them; 'a' lists alphabetic characters */
    country_clean = compress(country, , 'ka');
run;
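
Since the original attempt used IF/THEN statements, here is a minimal sketch of that approach for comparison. It assumes BRAZIL1 is the only bad value and reuses the have and want data set names from the code above. Unlike COMPRESS with 'ka', it leaves every other value untouched, including country names that contain spaces. UPCASE and STRIP are there only as a guard against differences in case or leading blanks, which are a common reason such comparisons fail to match.

```sas
data want;
    set have;
    /* First assignment gives country_clean the same length as country */
    country_clean = country;
    /* Recode only the known bad value; comparison ignores case and leading blanks */
    if upcase(strip(country)) = 'BRAZIL1' then country_clean = 'BRAZIL';
run;

/* Check the result: BRAZIL1 should no longer appear */
proc freq data=want;
    tables country_clean;
run;
```

The PROC FREQ step is just a quick way to confirm which distinct values remain after the recode.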

ballardw
Super User

What was the original data file type or source?

I ask because if the original data was a text file, such as a tab- or comma-delimited file, then PROC IMPORT writes data step code to read the data. You could then modify the generated code to incorporate your fixes. If you are going to do this with a number of files, I would recommend considering this approach, as it lets you set things such as character variable lengths, informats, and formats yourself instead of accepting the defaults, and lets you include routine data manipulation such as this.
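
As a rough illustration of that idea, a hand-edited data step for a comma-delimited file might look like the sketch below. The file name survey.csv, the variable lengths, and the column order are all assumptions made for the example. They are not taken from the original data, and this approach does not apply directly to an Access source.

```sas
data want;
    /* Hypothetical path; dsd reads comma-delimited values, firstobs=2 skips a header row */
    infile 'survey.csv' dsd firstobs=2 truncover;
    /* Set character lengths explicitly instead of relying on defaults (assumed values) */
    length country $ 20 sex $ 1 marital_status $ 1;
    input country $ sex $ marital_status $;
    /* Clean while reading: the 'd' modifier removes digits, so BRAZIL1 becomes BRAZIL */
    country = compress(country, , 'd');
run;
```

The point of the design is that the cleaning rule lives in the same step that reads the file, so every file read this way gets the same fix.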
