BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Hi guys,

1) I am using SAS EG5.1

2) My data looks something like this:

11,1374,C5,37,24 to,286330.962000,10. >=$23,000 <$51,000  ,6553.3000,11.

The problem is that I want to capture the ">=$23,000 <$51,000" as one variable, rather than giving me ">=$23" "000 <$51" "000" in three different columns.

Further compounded when some of the sections are like >=$5<$15 which is the problem.

I thought of concatenating them, but it doesnt fix the problem of data being out of alignment.

Any thoughts guys?

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Maybe you could simply replace all occurrences of  ",000" with "000" with a text editor before reading the file. Would that do the trick? - PG

PG

View solution in original post

7 REPLIES 7
JerryLeBreton
Pyrite | Level 9

A CSV file which has commas embedded in values will typically have those values within double quotes.

Can the CSV file be created so that it includes the double quotes?

nicholas_leung_gmail_com
Calcite | Level 5

Thanks Jerry, I don't have access to the raw data unfortunately - I'm working with extracts and they're extracting in text files than CSV.

Not sure if they can extract the data as CSV -> the entire dataset is allgedly 50GB.

PGStats
Opal | Level 21

Maybe you could simply replace all occurrences of  ",000" with "000" with a text editor before reading the file. Would that do the trick? - PG

PG
nicholas_leung_gmail_com
Calcite | Level 5

Good idea - do you have any suggestions for what to use?

Point and click interfaces are helpful

I tried using LTF to no success (couldn't replace text).

No clue how to work VIM out.


Downloading SlickEdit as I type.

Peter_L
Quartz | Level 8

I would use sed for this. It is the unix stream editor, also available for Windows. A command line like this will do what you want:

sed -e 's/,000/000/g' infile.csv >outfile.csv

My version is 37 kB. It will work efficiently with very large files.

An alternative would be to read the whole line in with SAS and use the perl regular expression functions to parse the line. The regular expression syntax is similar to that of sed.

Another possibility would be awk (or perl) to process the file into something SAS can read easily. I find these tools much easier to use on delimited text files than SAS.

PGStats
Opal | Level 21

You could also do everything in SAS, as suggested by Peter.L , in a fairly robust fashion :

data test;

length v1-v9 $24;

infile datalines dlm=',';

input @;

_infile_ = prxchange("s/(\$\d+),(\d{3})/$1$2/o", -1, _infile_);

input v1-v9;

datalines;

11,1374,C5,37,24 to,286330.962000,10. >=$23,000 <$51,000  ,6553.3000,11

;

PG

PG
art297
Opal | Level 21

I think that your problem isn't just not solved but, even worse, that you may not know what you are trying to import.

What do those values represent?  The 3rd value in your example is a character field, the 5th one looks like an incomplete range (i.e., 24 to .), and the 7th one (which you said you want to be">=$23,000 <$51,000" doesn't explain why it is preceded with 10. and not separated with a comma.

Do you know what variables all of those values represent and which ones might be ranges and how the end user wants each dealt with?

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1370 views
  • 1 like
  • 5 in conversation