BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hhchenfx
Barite | Level 11

Hi Everyone,

I run the proc import below to import my file (attached).

However, some column of number turn to character.

Can you help me to change all variables starting with _  into numeric?

Thank you for your help.

HHCFX

 

Each month, I need to import similar file and I don't know how many column there are. Thus, using data step might not be good idea for me.

 

	%put &=sysvlong ;
	options validvarname=v7 ;
		proc import datafile="C:\Users\sample.CSV"
	out=client dbms=CSV replace; 
		getnames=yes; 
	guessingrows=100;
	run;
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

That is not a good format for a guessing procedure like PROC IMPORT.  Having date in metadata (variable names) is not a good idea in general.  Since your file is just a text file you can instead just read it with a data step.

 

data want ;
 length id $40 date 8 value 8 ;
 infile "&path/sample.csv" dsd truncover ;
 if _n_=1 then do ;
   array dates (100) _temporary_;
   input id @ ;
   do ncol=1 by 1 until(date=.);
     input date :??mmddyy. @ ;
     dates(ncol)=date ;
   end;
   ncol=ncol-1;
   retain ncol;
   input ;
 end;
 input id @;
 do col=1 to ncol;
   input value ?? @;
   date=dates(col);
   output;
 end;
 drop col ncol;
 format date yymmdd10.;
run;

 

 

Note that your file also has text string ' -  ' for some of the values.  I have told the data step to just make those missing.  You could create a custom INFORMAT that would set those to a special missing instead.

 

You could recreate your original report from that data structure by using PROC REPORT.

 

proc report data=want;
  where date between '01DEC2018'd and '05dec2018'd ;
  columns id value,date ;
  define id / group ;
  define date / across ' ';
  define value / ' ';
run;

image.png

 

View solution in original post

10 REPLIES 10
Reeza
Super User
Will the data always have this form, ID, date start to date end + grand total?
hhchenfx
Barite | Level 11

Yes, the actual data always has number column starts in column 4.

In the sample uploaded, I delete 2 first column.

and it also ended with grand total.

 

Reeza
Super User
Is there a way to determine how many dates you should have in the file, based on current date or date of report?
hhchenfx
Barite | Level 11

The report will contain every days in a calendar month.

So to a certain extend, the answer is Yes. but there might be a need of an If-then for number of day in month. 

ballardw
Super User

@hhchenfx wrote:

Hi Everyone,

I run the proc import below to import my file (attached).

However, some column of number turn to character.

Can you help me to change all variables starting with _  into numeric?

Thank you for your help.

HHCFX

 

Each month, I need to import similar file and I don't know how many column there are. Thus, using data step might not be good idea for me.

 

	%put &=sysvlong ;
	options validvarname=v7 ;
		proc import datafile="C:\Users\sample.CSV"
	out=client dbms=CSV replace; 
		getnames=yes; 
	guessingrows=100;
	run;

The problem is that your data contains values that are not numeric

Example from your file the first 5 rows:

X1B_X1SF_X1TI_DWX1,,,8,8,8,8,8,,,8,8,8.5,8,7.5,,,8,8,8,8,8,,,8, -   ,8,8,,,,8,152
X1B_X1SF_X1TI_PJM_II,,,6.5,6,7.5,8,8,,,8,9,9.5,6,4.5,,,8.5,6,4,4,6.5,,,8,,,4,6,,,7,127
X1B_X1SF_X1TI_PJM_II,,,1.5,5,2,1,,,,,,,4,3,,,,3,6,7,2,,,,,,,4.5,,,1,40
X1B_X1SF_X1TI_SX1_II,,,8,8,8,8,8,,,6,9,9,8,6.5,,,8,8,8,8,8,,,8,,8,8,8,,,8,158.5
X1B_X1SF_X1TI_DWX1,,,,,,,,,,,,,,,,,,,,,,,,8, -   , -   , -   , -   ,,,8,16

Those fields that contain dashes are character not numeric. You have more examples further down in your file. Any column that has a dash in the first 100 rows (your guessing rows) will be set to character type.

Time to write a data step to read things correctly, whatever "correctly" may mean in the presence of a dash surrounded by several spaces.

 

Not terrible fond of making variable names alike '12/1/2018'n either. That way your variable name contains actual data in the name.

 

Tom
Super User Tom
Super User

That is not a good format for a guessing procedure like PROC IMPORT.  Having date in metadata (variable names) is not a good idea in general.  Since your file is just a text file you can instead just read it with a data step.

 

data want ;
 length id $40 date 8 value 8 ;
 infile "&path/sample.csv" dsd truncover ;
 if _n_=1 then do ;
   array dates (100) _temporary_;
   input id @ ;
   do ncol=1 by 1 until(date=.);
     input date :??mmddyy. @ ;
     dates(ncol)=date ;
   end;
   ncol=ncol-1;
   retain ncol;
   input ;
 end;
 input id @;
 do col=1 to ncol;
   input value ?? @;
   date=dates(col);
   output;
 end;
 drop col ncol;
 format date yymmdd10.;
run;

 

 

Note that your file also has text string ' -  ' for some of the values.  I have told the data step to just make those missing.  You could create a custom INFORMAT that would set those to a special missing instead.

 

You could recreate your original report from that data structure by using PROC REPORT.

 

proc report data=want;
  where date between '01DEC2018'd and '05dec2018'd ;
  columns id value,date ;
  define id / group ;
  define date / across ' ';
  define value / ' ';
run;

image.png

 

hhchenfx
Barite | Level 11

I see.

I can replace all these "-" with blank in Excel in the new file enclosed

can it help to simplify the problem?

hhchenfx
Barite | Level 11

I really appreciate your help!

HHCFX

ballardw
Super User

@hhchenfx wrote:

I see.

I can replace all these "-" with blank in Excel in the new file enclosed

can it help to simplify the problem?


It is likely actually easier to apply rules in SAS as @Tom shows  then to add a manual step in Excel that could cause other issues.

Tom
Super User Tom
Super User

@hhchenfx wrote:

I see.

I can replace all these "-" with blank in Excel in the new file enclosed

can it help to simplify the problem?


Opening a CSV file in Excel can cause a lot of problems.  If you are not careful Excel will transform some of the value.  For example it might convert character variables with hyphens in them into dates.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 4010 views
  • 3 likes
  • 4 in conversation