BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hhchenfx
Rhodochrosite | Level 12

Hi Everyone,

I run the proc import below to import my file (attached).

However, some column of number turn to character.

Can you help me to change all variables starting with _  into numeric?

Thank you for your help.

HHCFX

 

Each month, I need to import similar file and I don't know how many column there are. Thus, using data step might not be good idea for me.

 

	%put &=sysvlong ;
	options validvarname=v7 ;
		proc import datafile="C:\Users\sample.CSV"
	out=client dbms=CSV replace; 
		getnames=yes; 
	guessingrows=100;
	run;
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

That is not a good format for a guessing procedure like PROC IMPORT.  Having date in metadata (variable names) is not a good idea in general.  Since your file is just a text file you can instead just read it with a data step.

 

data want ;
 length id $40 date 8 value 8 ;
 infile "&path/sample.csv" dsd truncover ;
 if _n_=1 then do ;
   array dates (100) _temporary_;
   input id @ ;
   do ncol=1 by 1 until(date=.);
     input date :??mmddyy. @ ;
     dates(ncol)=date ;
   end;
   ncol=ncol-1;
   retain ncol;
   input ;
 end;
 input id @;
 do col=1 to ncol;
   input value ?? @;
   date=dates(col);
   output;
 end;
 drop col ncol;
 format date yymmdd10.;
run;

 

 

Note that your file also has text string ' -  ' for some of the values.  I have told the data step to just make those missing.  You could create a custom INFORMAT that would set those to a special missing instead.

 

You could recreate your original report from that data structure by using PROC REPORT.

 

proc report data=want;
  where date between '01DEC2018'd and '05dec2018'd ;
  columns id value,date ;
  define id / group ;
  define date / across ' ';
  define value / ' ';
run;

image.png

 

View solution in original post

10 REPLIES 10
Reeza
Super User
Will the data always have this form, ID, date start to date end + grand total?
hhchenfx
Rhodochrosite | Level 12

Yes, the actual data always has number column starts in column 4.

In the sample uploaded, I delete 2 first column.

and it also ended with grand total.

 

Reeza
Super User
Is there a way to determine how many dates you should have in the file, based on current date or date of report?
hhchenfx
Rhodochrosite | Level 12

The report will contain every days in a calendar month.

So to a certain extend, the answer is Yes. but there might be a need of an If-then for number of day in month. 

ballardw
Super User

@hhchenfx wrote:

Hi Everyone,

I run the proc import below to import my file (attached).

However, some column of number turn to character.

Can you help me to change all variables starting with _  into numeric?

Thank you for your help.

HHCFX

 

Each month, I need to import similar file and I don't know how many column there are. Thus, using data step might not be good idea for me.

 

	%put &=sysvlong ;
	options validvarname=v7 ;
		proc import datafile="C:\Users\sample.CSV"
	out=client dbms=CSV replace; 
		getnames=yes; 
	guessingrows=100;
	run;

The problem is that your data contains values that are not numeric

Example from your file the first 5 rows:

X1B_X1SF_X1TI_DWX1,,,8,8,8,8,8,,,8,8,8.5,8,7.5,,,8,8,8,8,8,,,8, -   ,8,8,,,,8,152
X1B_X1SF_X1TI_PJM_II,,,6.5,6,7.5,8,8,,,8,9,9.5,6,4.5,,,8.5,6,4,4,6.5,,,8,,,4,6,,,7,127
X1B_X1SF_X1TI_PJM_II,,,1.5,5,2,1,,,,,,,4,3,,,,3,6,7,2,,,,,,,4.5,,,1,40
X1B_X1SF_X1TI_SX1_II,,,8,8,8,8,8,,,6,9,9,8,6.5,,,8,8,8,8,8,,,8,,8,8,8,,,8,158.5
X1B_X1SF_X1TI_DWX1,,,,,,,,,,,,,,,,,,,,,,,,8, -   , -   , -   , -   ,,,8,16

Those fields that contain dashes are character not numeric. You have more examples further down in your file. Any column that has a dash in the first 100 rows (your guessing rows) will be set to character type.

Time to write a data step to read things correctly, whatever "correctly" may mean in the presence of a dash surrounded by several spaces.

 

Not terrible fond of making variable names alike '12/1/2018'n either. That way your variable name contains actual data in the name.

 

Tom
Super User Tom
Super User

That is not a good format for a guessing procedure like PROC IMPORT.  Having date in metadata (variable names) is not a good idea in general.  Since your file is just a text file you can instead just read it with a data step.

 

data want ;
 length id $40 date 8 value 8 ;
 infile "&path/sample.csv" dsd truncover ;
 if _n_=1 then do ;
   array dates (100) _temporary_;
   input id @ ;
   do ncol=1 by 1 until(date=.);
     input date :??mmddyy. @ ;
     dates(ncol)=date ;
   end;
   ncol=ncol-1;
   retain ncol;
   input ;
 end;
 input id @;
 do col=1 to ncol;
   input value ?? @;
   date=dates(col);
   output;
 end;
 drop col ncol;
 format date yymmdd10.;
run;

 

 

Note that your file also has text string ' -  ' for some of the values.  I have told the data step to just make those missing.  You could create a custom INFORMAT that would set those to a special missing instead.

 

You could recreate your original report from that data structure by using PROC REPORT.

 

proc report data=want;
  where date between '01DEC2018'd and '05dec2018'd ;
  columns id value,date ;
  define id / group ;
  define date / across ' ';
  define value / ' ';
run;

image.png

 

hhchenfx
Rhodochrosite | Level 12

I see.

I can replace all these "-" with blank in Excel in the new file enclosed

can it help to simplify the problem?

hhchenfx
Rhodochrosite | Level 12

I really appreciate your help!

HHCFX

ballardw
Super User

@hhchenfx wrote:

I see.

I can replace all these "-" with blank in Excel in the new file enclosed

can it help to simplify the problem?


It is likely actually easier to apply rules in SAS as @Tom shows  then to add a manual step in Excel that could cause other issues.

Tom
Super User Tom
Super User

@hhchenfx wrote:

I see.

I can replace all these "-" with blank in Excel in the new file enclosed

can it help to simplify the problem?


Opening a CSV file in Excel can cause a lot of problems.  If you are not careful Excel will transform some of the value.  For example it might convert character variables with hyphens in them into dates.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 5204 views
  • 3 likes
  • 4 in conversation