Solved: Column is designated 'character' by Import, with % at end of each numb...

NKormanik · Posted 02-05-2021 02:55 AM

Imported .csv from Excel. Several columns are as follows:

+25.49%

+28.95%

-31.39%

+35.36%

-24.07%

-34.71%

Import has designated these as 'character.' I need to change these to 'number,' and get rid of the +, -, %.

Suggestions greatly appreciated.

Thanks!

Nicholas Kormanik

Kurt_Bremser · Posted 02-05-2021 04:00 AM

Don't use IMPORT for CSV files. You have much less trouble writing the data step yourself, and not have IMPORT make any guesses:

data want;
input number percent8.;
format number nlpct8.2;
datalines;
+25.49%
+28.95%
-31.39%
+35.36%
-24.07%
-34.71%
;

I used the NLS format because the "native SAS" forma PERCENT uses brackets for negative values.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

Kurt_Bremser · Posted 02-05-2021 04:00 AM

Don't use IMPORT for CSV files. You have much less trouble writing the data step yourself, and not have IMPORT make any guesses:

data want;
input number percent8.;
format number nlpct8.2;
datalines;
+25.49%
+28.95%
-31.39%
+35.36%
-24.07%
-34.71%
;

I used the NLS format because the "native SAS" forma PERCENT uses brackets for negative values.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

NKormanik · Posted 02-05-2021 04:15 AM

Beautiful, Kurt. Thanks tons.

The .csv file has 18 columns of a lot of junk that I actually do not want.

Only two of the columns I want to keep. The ones with the percentages.

After getting output from Import, maybe I could resave it, then use your approach?

As said, all I really want out of the whole big spreadsheet are those percentages, such as +2.07%. And they aren't even in the early rows.

Arrrggghhh.

Kurt_Bremser · Posted 02-05-2021 04:47 AM

Use a dummy variable to read unwanted columns, and drop it:

data want;
length dummy $1;
infile datalines dlm="," dsd truncover;
input id $ dummy dummy dummy number :percent8.;
format number nlpct8.2;
drop dummy;
datalines;
A,x,y,z,+25.49%
B,,f,,+28.95%
C,,,,-31.39%
D,1,2,3,+35.36%
E,a,b,c,-24.07%
F,s1,s2,s3,-34.71%
;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

NKormanik · Posted 02-05-2021 05:36 AM

Wow, as easy as that. There is extra junk even in the relevant columns. Need to get rid of that as well.

Thus, the idea of looking for a particular character in each 'cell' -- "%".

Then clearing every other cell in the entire dataset.

I see loads of advice and examples for deleting observations, rows, etc., based on the WHOLE value that's in a 'cell.' Like Sex=Male -- "Male" being the whole value. But not much on looking for specifics WITHIN a cell, such as Sex=Male%.

Objective: Find the "%" sign, regardless of the rest that's in the cell. DELETE every cell that lacks the magic % sign. Throughout the entire dataset.

Or, KEEP just those cells.

Column is designated 'character' by Import, with % at end of each number

Re: Column is designated 'character' by Import, with % at end of each number

Re: Column is designated 'character' by Import, with % at end of each number

Re: Column is designated 'character' by Import, with % at end of each number

Re: Column is designated 'character' by Import, with % at end of each number

Re: Column is designated 'character' by Import, with % at end of each number

Column is designated 'character' by Import, with % at end of each number

Re: Column is designated 'character' by Import, with % at end of each number

Re: Column is designated 'character' by Import, with % at end of each number

Re: Column is designated 'character' by Import, with % at end of each number

Re: Column is designated 'character' by Import, with % at end of each number

Re: Column is designated 'character' by Import, with % at end of each number

Ready to join fellow brilliant minds for the SAS Hackathon?

Click image to register for webinar

Classroom Training Available!