Show an example of a line where the values "shift".
If you data has the delimiter character in it (which is normally a comma for a Comma Separated Values file) then those values need to be enclosed in quotes. Otherwise there is no way to parse the line.
Consider the case of a field with variable A,B and C. A=1, B='2,3,4' and C=5 and you generate a comma delimited files without quoting the value of B you get a line like:
1,2,3,4,5
Which has 5 values, not 3 values.
Once you add the quotes then it has three values:
1,"2,3,4",5
If you have a file that was generated improperly you MIGHT be able to interpret it, but only if the delimiters can only appear in one of the fields. So in the example you can take A from the left and C from the right and whatever is left over is the value of B.
@Emma2021 wrote:
I deprecated by |, but still could not import properly. I think it is because there are random line spaces. How can I remove/fix those in Sas? Thank you.
Post example data of what you mean by "random line spaces".
Open a text box on the forum with the </> icon, copy a few lines that show the behavior from your file opened in a text editor like Notepad or similar, and paste into the window opened on the forum.
If you mean that different lines have different order of variables, or different variables, is there a pattern to tell which are which?
I have written code to read as many as 25 different file "layouts" inside a single file but details matter in such things.
Do you have any documentation from the source about the file structure, Variable types or lengths, date value appearance?
If you don't have any of this you add a lot of headaches to "guess until it doesn't fail" type coding.
@Emma2021 wrote:
I do not want to open each csv file and fix the line (inserted a new line within one text variable)—since there are many csv files. Is there a way to fix those random new lines were inserted within one text variable?
Exactly. That is why we asked you to open one of the files with issues and copy and paste a few of the lines where the issue appears and paste it into the Insert Code box that pops up.
Once we can see examples of the issue we can give you advice on whether or not a programmatic solution to clean up the files is possible.
@Emma2021 wrote:
I do not want to open each csv file and fix the line (inserted a new line within one text variable)—since there are many csv files. Is there a way to fix those random new lines were inserted within one text variable?
If it is actually random, then almost certainly there is no "good" programming method to fix it? But are you sure they are actually "random" or based on content. What is the response to the question about file layout documentation?
And did you open any of these in a spreadsheet program and then save it? That can change the actual content.
I assume you meant delimited.
Changing the delimiter to one that is less common reduces the chances of files that cannot be parsed, but does not eliminate the issue. Unless the process of creating the files makes sure to remove any delimiters that appear in the data before generating the file.
What is the source of these files? Are you making them yourself? Are they all historical? Or do you plan to get more in the future?
If you are making them yourself then how are you making them? Are you generating them from some database system? What system? What command are you using?
If you are getting them from someone else what agreements did you put in place in advance to make sure they delivered files that can be read?
Details matter.
If the values with the inserted end-of-line characters are quoted then it is easy to fix.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.