Solved: Add column names to dataset

michokwu · Posted 06-23-2020 01:13 PM

Hello Experts,

Please, what is the best way to insert a column name row into a SAS dataset? The dataset is part of a larger file that was split into smaller parts for ease of transfer (I don't want to merge the files). See the sample below, i.e., name the columns in FILE2.

HAVE

FILE1
ID	date	Brand Code	Total transaction	Value($)
00303	1/1/2020	11566	1234	5000
05221	1/1/2020	38220	770	2000
44990	1/1/2020	76489	3567	7300

FILE2
07773	1/11/2020	35139	2345	3350
22222	1/11/2020	60275	167	470
10050	1/11/2020	24677	200	500

WANT

FILE1
ID	date	Brand Code	Total transaction	Value($)
00303	1/1/2020	11566	1234	5000
05221	1/1/2020	38220	770	2000
44990	1/1/2020	76489	3567	7300

FILE2
ID	date	Brand Code	Total transaction	Value($)
07773	1/11/2020	35139	2345	3350
22222	1/11/2020	60275	167	470
10050	1/11/2020	24677	200	500

Thank you,

PaigeMiller · Posted 06-23-2020 01:16 PM

I'm afraid your data doesn't make sense to me.

In FILE2, the HAVE data set has no variable names, which is impossible; all SAS data sets have variable names. Please explain.

--
Paige Miller

michokwu · Posted 06-23-2020 01:21 PM

The file was split into smaller csv files. When imported into SAS, the first observation is interpreted as variable names.

Reeza · Posted 06-23-2020 01:29 PM

Then you need to fix your data import step itself, not patch the result after the fact. I'm assuming you specified firstobs=1 and used a data step. As I'm sure you're aware, PROC IMPORT will not work for files of this structure. Or go back and fix how the file is split so that it writes headers to each file as well. That is the optimal solution to avoid issues, but you'll still want a data step. Otherwise, when it comes time to combine these datasets, you'll have a mismatch of types, and that will cause other issues.
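
For illustration, a data step along these lines might look like the sketch below. The path, the dataset name file2, and the variable names and lengths are assumptions based on the sample rows in the question, not on the actual files.

```sas
/* Hypothetical path; file2.csv has no header row, so reading starts at line 1 */
data file2;
   infile "c:\path\file2.csv" dlm=',' dsd firstobs=1 truncover;
   /* ID and brand code are read as character to keep leading zeros (assumed lengths) */
   informat id $5. date mmddyy10. brand_code $5.
            total_transaction value best12.;
   format date mmddyy10.;
   input id date brand_code total_transaction value;
run;
```

Reading every part with the same informat list should give each dataset identical variable names and types, which is what avoids the mismatch when the parts are combined later.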

michokwu · Posted 06-23-2020 03:16 PM

The files were sent by someone else. I've fixed it: I unchecked the box 'first row of range contains field names'.

Reeza · Posted 06-23-2020 03:48 PM

If you don't use a data step, you'll likely end up with the type inconsistency issue. You really need to fix it.

michokwu · Posted 06-23-2020 03:13 PM

@PaigeMiller You are right: if the 'first row of range contains field names' box is unchecked, the variables are automatically named F1, F2, and so on.

Reeza · Posted 06-23-2020 01:17 PM

So you have at least one dataset with the correct names? Are the positions the same between all versions of the data set?

Are you 100% sure you had to split your data set, and how did you do that? Ideally you'll go back and make sure it's happening correctly at that stage, but renaming is relatively straightforward once you clarify the rules. If you're certain all the file structures are exactly the same, you can use PROC DATASETS to easily update all your datasets. Whether you want variable names or labels is something else you should consider. Do you want to have 'Brand Code'n as your variable name, or BrandCode with a label of "Brand Code"?

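options validvarname=any; /* needed for name literals such as 'Brand Code'n */
proc datasets lib=work nodetails nolist;
modify want;
/* the fifth variable is var5; the original listed var4 twice */
rename var1 = ID var2 = Date var3 = 'Brand Code'n var4 = 'Total Transaction'n var5 = 'Value($)'n;
run;quit;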
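
If the label route is preferred, a sketch of that alternative could look like the following. It assumes the same WORK.WANT dataset with variables var1 through var5; the names BrandCode, TotalTransaction, and Value are illustrative choices, not names from the original files.

```sas
/* Assumes WORK.WANT exists with variables var1-var5 */
proc datasets lib=work nodetails nolist;
   modify want;
      /* plain names that are valid without name literals */
      rename var1=ID var2=Date var3=BrandCode
             var4=TotalTransaction var5=Value;
   run;
   /* second run group, so the labels are attached to the new names */
   modify want;
      label BrandCode        = "Brand Code"
            TotalTransaction = "Total transaction"
            Value            = "Value($)";
   run;
quit;
```

The variable names stay easy to type in code, while procedures that display labels, such as PROC PRINT with the LABEL option, can still show the original column headings.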

ballardw · Posted 06-23-2020 04:31 PM

I am very confused about splitting a file to "transfer" it but not wanting a single file. If the sole purpose of the two files is to append them back together, then read them correctly to begin with. You can read multiple files with a single data step. Sort of an example:

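filename toread ("c:\path\file1.csv" "c:\path\file2.csv" );
data want;
   /* firstobs=2 skips only the header line at the top of the first file */
   infile toread dlm=',' dsd firstobs=2 truncover;
   /* five fields per row, matching the sample: "Brand Code" is one column */
   informat id $6. date mmddyy10. brand_code $6.
           total  value best12.;
   format date mmddyy10.;
   input id  date  brand_code
           total  value ;
run;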
