Desktop productivity for business analysts and programmers

Two Csv files on a similar format after import

Reply
Frequent Contributor
Posts: 120

Two Csv files on a similar format after import

Hi

i have two Csv files containing 10 coloumns and they are same coloumns but

formats of columns of diff from formats of other Csv file and order of columns also diff and in one Csv file (old) they have while I  trying to import

i saw. "mbrnum" "pricing" like this but we have remove these " "

Community Manager
Posts: 2,692

Re: Two Csv files on a similar format after import

I'm afraid that your question is difficult to understand.  Are you saying that you have two similar CSV files but that when you import them, the results are different?  Are you looking for a way to have more control over how the data fields are read in?

And are you using the SAS Enterprise Guide Import Data task, or running DATA step code?

If you could share details and maybe some example records, that would help.

Chris

Frequent Contributor
Posts: 120

Re: Two Csv files on a similar format after import

Hi

ok csvfile 1 : data looks like when I open the Csv file in process flow in EG is

"item","price","id","list"

1,24,"MAB123","Check"

2,34,"MAB324","Check2"

ccsvfile 2 : data looks like when I open csv file in process flow in sas EG

item,price,list,id

1,24,Check,MAB325

2,35,Birth,MAS434

so Csv file 1 is old file and Csv 2 is new file

we shd get it look similar when import Csv file 1 and Csv file 2

pld file shd to be changed to new file

1. Formats and informat after impirtimg to sas shd be same

2. the variables shd be aligned in same manner

3. Shd remove quotations in old file to look same as new file

Grand Advisor
Posts: 10,210

Re: Two Csv files on a similar format after import

You'll likely need to write 2 programs to get sufficient control. But since this has happened with just 2 files you might want to see if you can address this going forward from the data source. If the order and layout changes often you are asking for a lot of headaches.

The basic steps I would follow if this were my project would be:

1) Identify the characteristics of ALL of the expected fields. For obvious character values like your ID or LIST that would include the MAXIMUM number of characters these value will ever have. For less obvious such as ITEM, decide if this is even needed (if a sequence number possibly not needed) and whether it is numeric or not. HINT: most times if you are not going to use it in arithmetic calculations it probably should be character.

2) Decide on your variable names and labels. One reason, you can read the columns from a CSV file and completely ignore the headers provided by your data source.

3) I would generally not spend a lot of time early in a process making the order of a SAS data set align as you'll likely cause yourself MUCH work if the format keeps changing. If you are worried about combining the data sets SAS will "align" the same named variables as long as they are the same data type automatically. However the order a variable first appears in a SAS data step, which is what you'll want, will determine the order in the resulting data. So if you describe the variables with say INFORMAT statements then the order they are read (i.e. appear in the source file) does not matter.

If you run proc import on either of those data sets as a trial you will see that the quotes basically go away and they proc import will generate code, likely appearing in the log, of a data step you could use to read the data. Save one as a code node and edit the names and formats as desired. You'll see what I meant about the INFORMAT statements. To read the other layout just change the order of the INPUT statement (in a DIFFERENT node).

Due to the large number of acronyms and code shortcuts it is a good idea to spell out the English on this forum. I have no idea what you mean by a "pld file" and have to assume that shd means Should and not something like Shared Hard Drive...

Ask a Question
Discussion stats
  • 3 replies
  • 323 views
  • 0 likes
  • 3 in conversation