Hi All,
I have a bizarre issue. I created some plots in Tableau and I wanted to transfer my files to SAS to do some stat tests. It appeared that the process went smoothly. Tableau exported a .csv file and SAS imported that file without complaining. However, when I use the file in SAS some variables act odd. I have a ProjectID variable in .csv which has moved to SAS as Project_ID. The format is "$15."
I wrote a data step to transfer some Project_IDs to a new dataset to do debugging. Very simple:
data want;
set have;
if Project_ID eq 'ConnectCare_014';
run;
I have used codes like this at least 400 times before but somehow it did not work this time!
Interestingly when I wrote:
data have;
set have;
if Project_ID eq 'ConnectCare_014' then PersonID='person014';
run;
It worked! the PersonID variable was created and it was correctly populated.
Does anyone know what is going on?
Here is my dataset (have attached it as well). Thank you all,
Project ID | FT Eload | Surg Med | Surg Med Proc | Metric | Date scaf | Domain |
ConnectCare_014 | H2 | Surgery | Surgery | Time in In Basket per Day | 10/31/2020 | In Basket |
ConnectCare_014 | H2 | Surgery | Surgery | Time in In Basket per Day | 10/30/2020 | In Basket |
ConnectCare_014 | H2 | Surgery | Surgery | Time in In Basket per Day | 10/29/2020 | In Basket |
ConnectCare_014 | H2 | Surgery | Surgery | Time in In Basket per Day | 10/28/2020 | In Basket |
ConnectCare_014 | H2 | Surgery | Surgery | Time in In Basket per Day | 10/27/2020 | In Basket |
ConnectCare_014 | H2 | Surgery | Surgery | Time in In Basket per Day | 10/26/2020 | In Basket |
ConnectCare_029 | H1 | Medical | Procedure | Time in In Basket per Day | 6/1/2021 | In Basket |
ConnectCare_029 | H1 | Medical | Procedure | Time in In Basket per Day | 5/31/2021 | In Basket |
ConnectCare_029 | H1 | Medical | Procedure | Time in In Basket per Day | 5/30/2021 | In Basket |
ConnectCare_029 | H1 | Medical | Procedure | Time in In Basket per Day | 7/31/2021 | In Basket |
ConnectCare_029 | H1 | Medical | Procedure | Time in In Basket per Day | 7/30/2021 | In Basket |
ConnectCare_029 | H1 | Medical | Procedure | Time in In Basket per Day | 7/29/2021 | In Basket |
ConnectCare_029 | H1 | Medical | Procedure | Time in In Basket per Day | 7/28/2021 | In Basket |
Some questions may be in order.
Did you open that CSV in a spreadsheet program? Did you accidentally SAVE it afterwards if you did? This may be important as several spreadsheet programs will change some sorts of values when opened from CSV and hence may vary.
How did you import the file into SAS? I am going to guess Proc Import, which by default uses very few rows of data to set properties of your variables. If your actual data has more than the example that could be a cause for concern.
Use of a data step to read CSV allows you to control things.
As a quick note. Your header row of the attached CSV shows this text:
Project ID,FT Eload,Surg Med,Surg Med Proc,Metric,Date scaf,Domain
Since your column headers have spaces in most of those variables if your SAS is using the default VALIDVARNAME=V7 then proc import would turn those spaces into _ characters because the V7 rules do not allow spaces or other special characters as part of the variable name.
"Did not work" is awful vague.
Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.
No output? Post any log in a code box.
Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.
When I download your attached csv, use Proc Import to read the data and run your code:
proc import file="&outpath.sasproject_ID2.csv" out=have dbms=CSV; run; data want; set have; if Project_ID eq 'ConnectCare_014'; run; ods listing; proc print data=want; run;
My results of the Print output:
Surg_Med_ Obs Project_ID FT_Eload Surg_Med Proc Metric Date_scaf Domain 1 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/31/2020 In Basket 2 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/30/2020 In Basket 3 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/29/2020 In Basket 4 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/28/2020 In Basket 5 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/27/2020 In Basket 6 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/26/2020 In Basket
So you really need to show YOUR log for the data step that doesn't work and describe how the result is wrong.
Some questions may be in order.
Did you open that CSV in a spreadsheet program? Did you accidentally SAVE it afterwards if you did? This may be important as several spreadsheet programs will change some sorts of values when opened from CSV and hence may vary.
How did you import the file into SAS? I am going to guess Proc Import, which by default uses very few rows of data to set properties of your variables. If your actual data has more than the example that could be a cause for concern.
Use of a data step to read CSV allows you to control things.
As a quick note. Your header row of the attached CSV shows this text:
Project ID,FT Eload,Surg Med,Surg Med Proc,Metric,Date scaf,Domain
Since your column headers have spaces in most of those variables if your SAS is using the default VALIDVARNAME=V7 then proc import would turn those spaces into _ characters because the V7 rules do not allow spaces or other special characters as part of the variable name.
"Did not work" is awful vague.
Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.
No output? Post any log in a code box.
Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.
When I download your attached csv, use Proc Import to read the data and run your code:
proc import file="&outpath.sasproject_ID2.csv" out=have dbms=CSV; run; data want; set have; if Project_ID eq 'ConnectCare_014'; run; ods listing; proc print data=want; run;
My results of the Print output:
Surg_Med_ Obs Project_ID FT_Eload Surg_Med Proc Metric Date_scaf Domain 1 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/31/2020 In Basket 2 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/30/2020 In Basket 3 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/29/2020 In Basket 4 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/28/2020 In Basket 5 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/27/2020 In Basket 6 ConnectCare_014 H2 Surgery Surgery Time in In Basket per Day 10/26/2020 In Basket
So you really need to show YOUR log for the data step that doesn't work and describe how the result is wrong.
Hi,
Thank you for the quick reply.
I always open files after exporting to make sure that they are properly populated. I am not sure if I saved it or not before closing.
I used the Import Data (I) function in the drop-down File menu to import the file in SAS. I have used this technique many times and never have had an issue. This is the first time I am having issues.
The issue that I had was that the resulting want dataset had zero observations after running my code. There were no errors:
NOTE: There were 57463 observations read from the data set WORK.SASQ. NOTE: The data set WORK.SASQ1 has 0 observations and 25 variables. NOTE: DATA statement used (Total process time): real time 0.06 seconds cpu time 0.06 seconds
My expected result was pretty much the same as yours.
Update: My problem is solved. I closed my SAS session, deleted the file that I had imported, and used your proc import method to import the file again and it worked. I am not sure what had gone wrong but it seems to be fixed now. I should learn to close and reopen SAS before asking questions.
Sorry to take your time and thank you for the help 🙂
Now that I have your attention, may I ask one more question, please?
I also have a variable called "Day_of_Date_scaf" in my original file that I had exported from Tableau.
When I import that to SAS I get an odd date format "DATETIME16." for it and it looks like "31OCT20:00:00:00". I tried to format it as date9. in SAS and it does not wor:
319 data datescafq; 320 set datescafq; 321 format Day_of_Date_scaf date9.; 322 run; NOTE: There were 57463 observations read from the data set WORK.DATESCAFQ. NOTE: The data set WORK.DATESCAFQ has 57463 observations and 25 variables. NOTE: DATA statement used (Total process time): real time 0.09 seconds cpu time 0.09 seconds NOTE: Table has been opened in browse mode. ERROR: There was a problem with the format so BEST. was used. ERROR: There was a problem with the format so BEST. was used. ERROR: There was a problem with the format so BEST. was used. ERROR: There was a problem with the format so BEST. was used. ERROR: There was a problem with the format so BEST. was used. ERROR: There was a problem with the format so BEST. was used. ERROR: There was a problem with the format so BEST. was used. ERROR: There was a problem with the format so BEST. was used. ERROR: There was a problem with the format so BEST. was used. ERROR: There was a problem with the format so BEST. was used.
I get a strange thing that looks like "1.91972E9".
Do you know how I can fix that so I can use a line of code like this? Thank you (have attached the new data file)
if Day_of_Date_scaf eq '31OCT2020'd;
First I'll provide a link to information for working with dates and such that may be a good reference for later: https://communities.sas.com/t5/SAS-Communities-Library/Working-with-Dates-and-Times-in-SAS-Tutorial/... has a PDF with much information about dates.
If you don't need the time portion you could use a data step with the DATEPART function to make the value a date and then the value would be a date and would accept the date9. or other date format.
Another approach is leave the value as datetime and value dispaly using a different format: DTDATE9. The DT, for datetime versions are available for some of the common date formats to display just the date portion of a datetime value.
The CSV you attached for this should not have imported as a datetime as there is no time portion shown unless there is something interesting about your set up. You don't mention which version or environment you use for SAS so you might be getting something from options set by an admin.
It is easy to copy text from a text editor (Notepad for example) and post it as part of the message windows using the </> for a text box. So we can see the text without downloading. Many users will not download files from unknown sources and since the browsers typically want to force CSV to open in a spreadsheet the shown results may vary.
Day of Date scaf,Project ID,FT Eload,FTEmedian,Surg Med,Surg Med Proc 31-Oct-20,ConnectCare_014,H2,M2,Surgery,Surgery 30-Oct-20,ConnectCare_014,H2,M2,Surgery,Surgery 29-Oct-20,ConnectCare_014,H2,M2,Surgery,Surgery
Another approach is to write a data step to read the csv. Then you control how the text is read and the format assigned.
Most of the common datetime layouts will read just the date portion if the proper informat is used.
If you have any control over the data I suggest not relying on two-digit years. I spent way too much time dealing with Y2K issues from such at one time.
Some "date" formats when using 2-digit years are extremely difficult to determine the content. Example: 010203 could be 1 Feb 2003 (or 1903), 2 January 2003, or 3 Feb 2001. And if a spreadsheet gets ahold of that and strips off the leading 0, leaving 10203 you might be looking a date of 22 July 2010 (Look up Julian Dates).
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.