BookmarkSubscribeRSS Feed
rsam1013
Calcite | Level 5

HELP!

 

Attached is a an example of a file that I need inport into SAS. Unfortunately, it has no delimiters and even better, the first data (Revenue Amount) varies in length before the first character of the next data (Priority Line) starts. Is there code I can use in the INPUT statement that can determine when the first character starts in each record?

 

The data in the attached example needs to be placed in columns as follows:

 

Revenue_Amount (varies in each record)

Priority_Line (first character in string and Length is 3)

Revenue_Year (used as text and length is 4)

Revenue_Month (used as text and length is 2)

Legacy_Customer_Numer (text with length of 9)

SAP_Customer_Number (text with length of 10)

 

 

Thanks in advance for your help with this!

 

Rudy

6 REPLIES 6
Kurt_Bremser
Super User

Since each line is encapsulated in those unhappy double quotes, you can't input in one statement, but have to read the line and dissect it:

data want;
infile '$HOME/sascommunity/SASHelpExample.csv' dsd firstobs=2;
input line :$100.;
length
  revenue_amount 8
  priority_line $3
  revenue_year $4
  revenue_month $2
  legacy_customer_number $9
  sap_customer_number $10
;
revenue_amount = input(scan(line,1,'09'x),12.);
priority_line = scan(line,2,'09'x);
revenue_year = scan(line,3,'09'x);
revenue_month = scan(line,4,'09'x);
legacy_customer_number = scan(line,5,'09'x);
sap_customer_number = scan(line,6,'09'x);
drop line;
run;
rsam1013
Calcite | Level 5

Kurt,

 

Thanks for the quick reply! This worked like a charm!

 

Thanks!

 

Rudy

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Why do you not just reject the file?  Surely according to your data import agreement and contract with the data provider you will know what you are expecting from the file.  Me, if someone sent me a file called .csv (Comma Separated Variable file) which wasn't in that format, or in fact a proper usable format, I would reject with the reponse of poor data received.

Reeza
Super User

@RW9 wrote:

Why do you not just reject the file?  Surely according to your data import agreement and contract with the data provider you will know what you are expecting from the file.  Me, if someone sent me a file called .csv (Comma Separated Variable file) which wasn't in that format, or in fact a proper usable format, I would reject with the reponse of poor data received.


This isn't always possible.

I've seen this due to either restrictions on what the survey tool exports or an application system. This is one way they get you with 'free' survey options, if you want a decent data file you need to pay to get a SPSS or SAS version.  Another is, I've worked for social services agencies and doctors. There focus is on patients/clients and data is secondary, that's why they hired me in the first place. Am I really going to ask a social worker to skip an appointment with a client or a doctor to skip time with a patient to fix data? No. I'm going to try and see if I can fix it first before 'rejecting' the file.  And if I had a staff member who did reject a file, they'd likely get reprimanded at minimum. 99% of the time data cleaning is part of the job, and if someone doesn't want to do that part they're not doing their job. 

 

ballardw
Super User

if your file is NOT delimited do not name it as CSV. Use TXT for an extension. CSV wants to open as if is comma separted and dependig on a number of different factors just plain misbehaves.

 

And except for the stupid " is actually TAB delimited.

JohnSAScom
Quartz | Level 8

Another option would be to open your file, save it under a different name and perform 2 search and replacements:  replace the double-quotation marks with nothing, and replace all tabs with a space.  To replace the tabs with a space, you would first have to copy a tab character from your data, and then paste it into the search and replace window as the item to search for.  Then do a "Replace All".  After that, this would work:

 

data one;
infile 'C:\MyPath\SASHelpExample.txt' firstobs=2;
input Revenue_Amount Priority_Line $ Revenue_Year Revenue_Month Legacy_Customer_Numer SAP_Customer_Number;
run;
 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2621 views
  • 3 likes
  • 6 in conversation