BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cxyu1
Calcite | Level 5

have the following raw data set, and based on my understanding of "transpose" feature I need to first create some kind of group before I can transpose the data.

 

Raw Data

Field                 Value

Column1          ABC

Column2          DEF

Column3          HIJ

Column1          123

Column2          456

Column3          789

 

There are 3 columns or filed names. 

 

I think I need to create a group Id first before I can use the transpose feature.

 

GroupID    Field     Value

1                Column1          ABC

1                Column2          DEF

1                Column3          HIJ

2                Column1          123

2                Column2          456

2                Column3          789

 

 

Here is my desired outcome...

 

Column1           Column 2        Column3

ABC                   DEF                HIJ

123                     456                789

 

Thanks in advance for any help or guidance/ aadvice.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@cxyu1 wrote:

Yes, the columns fields will always be the same.  I am a newbie so all I really did thus far was import the data using the data import wizard and tried the transpose data task.  So this is a bit advanced for me. 

 

When I used the transpose data task, that's where I for that I needed a group ID.


If you read data properly, such as with @Reeza's example, then you avoid the whole issue. Additionally some of your values look like they should be numeric, dates, times which will take yet another step to straighten out.

Plus your "reference number" value of 1.90806E+11 may have been corrupted by Excel (at a fairly well educated guess). Typically long identifiers should be read as character because of precision of numeric storage and many ID systems actually have significant leading zeroes which get lost in conversion to numeric. If that is how the values appear from Proc Import than that is almost certainly not what you want.

 

Proc import is best when used for tabular data where each column has the same type of data for each value not the mixed stuff you show.

View solution in original post

10 REPLIES 10
Reeza
Super User
Is your file a text file? Is it just a single row of data? I'm worried about a solution provided not scaling to your actual problem.
cxyu1
Calcite | Level 5

Its a text file, and contains less than 100 records, and less than 10 column/fields.

Reeza
Super User
Can you show a small sample of how the text file is structured. I don't know if column1 is actually the first column or values in a row?
cxyu1
Calcite | Level 5

Here is the text data, so this would be 2 records after transposed

 

Cost Center0300/0025812
Initiatorcxyu1
MethodABC
Date Wire Initiated8/6/2019
Value Date8/6/2019
Time Wire Initiated7:29:17
Committed Time of Wire8:23:49
Reference Number1.90806E+11
Account NumberXXXXX456
Amount41,949.42
Cost Center0300/0025812
Initiatorcxyu1
MethodABC
Date Wire Initiated8/6/2019
Value Date8/6/2019
Time Wire Initiated8:03:03
Committed Time of Wire8:29:11
Reference Number1.90806E+11
Account NumberXXXXXXX12
Amount1,230.50
Reeza
Super User
Are you guaranteed to have every single value repeated?

If so, I highly recommend using a data step to import the file from a text file. Your code would end up looking like:

data want;
infile 'path to txt';
input #1 CostCenter $
#2 Intiator $
#3 Method
#4 Date_Wire_Initiated
#5 Date_Value
....
#10 Amount
;

run;

It's documented here:
http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a002001051.htm

This way you can control the types a bit more easily. I'm assuming you're familiar with a standard data step import and can modify this to fit your data via formats/informats or lengths.
cxyu1
Calcite | Level 5

Yes, the columns fields will always be the same.  I am a newbie so all I really did thus far was import the data using the data import wizard and tried the transpose data task.  So this is a bit advanced for me. 

 

When I used the transpose data task, that's where I for that I needed a group ID.

novinosrin
Tourmaline | Level 20

Hi @cxyu1 Import using datastep like this if you like, and then you can proceed 

 

/*If your variable name patterns are consisten as in your sample*/
data have;
infile cards;
input ;
call scan(_infile_, -1, _p, _,' ');
var_names=substr(_infile_,1,_p-1);
var_values=substr(_infile_,_p);
if var_names='Cost Center' then grp+1;
drop _:;
cards;
Cost Center	0300/0025812
Initiator	cxyu1
Method	ABC
Date Wire Initiated	8/6/2019
Value Date	8/6/2019
Time Wire Initiated	7:29:17
Committed Time of Wire	8:23:49
Reference Number	1.90806E+11
Account Number	XXXXX456
Amount	41,949.42
Cost Center	0300/0025812
Initiator	cxyu1
Method	ABC
Date Wire Initiated	8/6/2019
Value Date	8/6/2019
Time Wire Initiated	8:03:03
Committed Time of Wire	8:29:11
Reference Number	1.90806E+11
Account Number	XXXXXXX12
Amount	1,230.50
;


proc transpose data= have out=want(drop=grp _name_);
by grp;
var var_values;
id var_names;
run;
cxyu1
Calcite | Level 5

Instead of the ‘infile card’ function can I direct the program to an existing work query and then do the group ID assignment form there. I have friend different proc statements however no success.

novinosrin
Tourmaline | Level 20

@cxyu1  Infile cards that I used is merely a demonstration that gives you an idea.So yes, the idea is to use a datastep to import your raw data into sas dataset using infile and input precisely list input. 

 

For your raw txt file, the code could like like

 

infile 'raw_file' ;

input 

----------- and so on

ballardw
Super User

@cxyu1 wrote:

Yes, the columns fields will always be the same.  I am a newbie so all I really did thus far was import the data using the data import wizard and tried the transpose data task.  So this is a bit advanced for me. 

 

When I used the transpose data task, that's where I for that I needed a group ID.


If you read data properly, such as with @Reeza's example, then you avoid the whole issue. Additionally some of your values look like they should be numeric, dates, times which will take yet another step to straighten out.

Plus your "reference number" value of 1.90806E+11 may have been corrupted by Excel (at a fairly well educated guess). Typically long identifiers should be read as character because of precision of numeric storage and many ID systems actually have significant leading zeroes which get lost in conversion to numeric. If that is how the values appear from Proc Import than that is almost certainly not what you want.

 

Proc import is best when used for tabular data where each column has the same type of data for each value not the mixed stuff you show.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 826 views
  • 2 likes
  • 4 in conversation