BookmarkSubscribeRSS Feed
AvocadoRivalry
Calcite | Level 5

I have a raw data file that contains data in row-form that I need as columns.  The file has 23 columns and 61 rows as-is.  The first two are identifiers, which I need to keep as they are.  The third column contains each of the column names I need post-transposition and the rest of the columns contain the values that correspond with the post-transposition variable found in the third column of each row.  I need to transpose this entire file so that instead of having 23 columns and 61 rows, it has 61 columns and 20 rows (the two identifiers and one variable name column would not be observations), but I don't think SAS will let me read the data and do it with a PROC TRANSPOSE because the formats of each column as-is is inconsistent.  I've attached a sample of the data with a few rows as reference.

10 REPLIES 10
FriedEgg
SAS Employee

I am not entirely sure I understand the exact format by which you expect your output to conform, but here is what my guess is that you are trying to achieve.

data have;

infile cards dlm='09'x;

input id (name var data1-data20) (:$20.);

cards;

7647          Dummy          Date          11-Jun          11-Mar          10-Dec          10-Sep          10-Jun          10-Mar          9-Dec          9-Sep          9-Jun          9-Mar          8-Dec          8-Sep          8-Jun          8-Mar          7-Dec          7-Sep          7-Jun          7-Mar          6-Dec          6-Sep

7647          Dummy          Cash          374,999.00          355,305.00          344,476.00          421,880.00          419,419.00          362,088.00          335,474.00          363,165.00          350,761.00          350,637.00          124,905.00          138,088.00          153,846.00          169,608.00          183,856.00          178,935.00          180,308.00          182,232.00          185,859.00          178,840.00

7647          Dummy          Credit          903,945.00          892,582.00          898,555.00          890,329.00          910,922.00          929,207.00          862,928.00          878,434.00          908,463.00          947,960.00          908,375.00          922,330.00          853,334.00          858,979.00          864,756.00          784,002.00          749,575.00          714,901.00          697,474.00          660,277.00

7647          Dummy          Supplies          22,567.00          27,059.00          37,032.00          35,536.00          40,547.00          37,694.00          46,079.00          42,035.00          52,795.00          41,942.00          33,281.00          29,246.00          593          494          351          192          108          89          69          76

7647          Dummy          Other          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA

7647          Dummy          Total          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA

7647          Dummy          Equipment          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA

;

run;

proc transpose

data=have

out=want;

by id name var;

id var;

var data1-data20;

run;

          Obs     id     name     _NAME_     Date        Cash         Credit      Supplies     Other    Total    Equipment

            1    7647    Dummy    data1     11-Jun    374,999.00    903,945.00    22,567.00     @NA      @NA        @NA  

            2    7647    Dummy    data2     11-Mar    355,305.00    892,582.00    27,059.00     @NA      @NA        @NA  

            3    7647    Dummy    data3     10-Dec    344,476.00    898,555.00    37,032.00     @NA      @NA        @NA  

            4    7647    Dummy    data4     10-Sep    421,880.00    890,329.00    35,536.00     @NA      @NA        @NA  

            5    7647    Dummy    data5     10-Jun    419,419.00    910,922.00    40,547.00     @NA      @NA        @NA  

            6    7647    Dummy    data6     10-Mar    362,088.00    929,207.00    37,694.00     @NA      @NA        @NA  

            7    7647    Dummy    data7     9-Dec     335,474.00    862,928.00    46,079.00     @NA      @NA        @NA  

            8    7647    Dummy    data8     9-Sep     363,165.00    878,434.00    42,035.00     @NA      @NA        @NA  

            9    7647    Dummy    data9     9-Jun     350,761.00    908,463.00    52,795.00     @NA      @NA        @NA  

           10    7647    Dummy    data10    9-Mar     350,637.00    947,960.00    41,942.00     @NA      @NA        @NA  

           11    7647    Dummy    data11    8-Dec     124,905.00    908,375.00    33,281.00     @NA      @NA        @NA  

           12    7647    Dummy    data12    8-Sep     138,088.00    922,330.00    29,246.00     @NA      @NA        @NA  

           13    7647    Dummy    data13    8-Jun     153,846.00    853,334.00    593           @NA      @NA        @NA  

           14    7647    Dummy    data14    8-Mar     169,608.00    858,979.00    494           @NA      @NA        @NA  

           15    7647    Dummy    data15    7-Dec     183,856.00    864,756.00    351           @NA      @NA        @NA  

           16    7647    Dummy    data16    7-Sep     178,935.00    784,002.00    192           @NA      @NA        @NA  

           17    7647    Dummy    data17    7-Jun     180,308.00    749,575.00    108           @NA      @NA        @NA  

           18    7647    Dummy    data18    7-Mar     182,232.00    714,901.00    89            @NA      @NA        @NA  

           19    7647    Dummy    data19    6-Dec     185,859.00    697,474.00    69            @NA      @NA        @NA  

           20    7647    Dummy    data20    6-Sep     178,840.00    660,277.00    76            @NA      @NA        @NA 

AvocadoRivalry
Calcite | Level 5

That's the way I want the output to show up, but I want to read-in the file instead of creating a dataset in SAS.  Using an infile statement or Proc Import will suffice.  Will the fact that reading the data in before transposing it will result in mixed variable formatting cause problems?

FriedEgg
SAS Employee

The proc import procedure or whatever other method you would utilize would likely determine the input column formats as character, just as I have here.  The fact that they are mixed on the input file is why.  To adjust to the proper formats you would need to perform additional actions in a datastep after the transpose to fix it.

FriedEgg
SAS Employee

filename tmp temp;

data _null_;

infile cards missover;

input data $512.;

file tmp;

put _infile_;

cards;

7647          Dummy          Date          11-Jun          11-Mar          10-Dec          10-Sep          10-Jun          10-Mar          9-Dec          9-Sep          9-Jun          9-Mar          8-Dec          8-Sep          8-Jun          8-Mar          7-Dec          7-Sep          7-Jun          7-Mar          6-Dec          6-Sep

7647          Dummy          Cash          374,999.00          355,305.00          344,476.00          421,880.00          419,419.00          362,088.00          335,474.00          363,165.00          350,761.00          350,637.00          124,905.00          138,088.00          153,846.00          169,608.00          183,856.00          178,935.00          180,308.00          182,232.00          185,859.00          178,840.00

7647          Dummy          Credit          903,945.00          892,582.00          898,555.00          890,329.00          910,922.00          929,207.00          862,928.00          878,434.00          908,463.00          947,960.00          908,375.00          922,330.00          853,334.00          858,979.00          864,756.00          784,002.00          749,575.00          714,901.00          697,474.00          660,277.00

7647          Dummy          Supplies          22,567.00          27,059.00          37,032.00          35,536.00          40,547.00          37,694.00          46,079.00          42,035.00          52,795.00          41,942.00          33,281.00          29,246.00          593          494          351          192          108          89          69          76

7647          Dummy          Other          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA

7647          Dummy          Total          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA

7647          Dummy          Equipment          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA          @NA

;

run;

proc import

datafile=tmp

out=have

dbms=dlm

replace;

getnames=no;

delimiter='09'x;

run;

proc transpose

data=have

out=want(drop=_:)

prefix=xyz;

by var1 var2;

id var3;

var var4-var23;

run;

data want;

set want(rename=( var1=id var2=name xyzother=other xyztotal=total xyzequipment=equipment));

cash=input(xyzcash,comma10.);

credit=input(xyzcredit,comma10.);

supplies=input(xyzsupplies,comma10.);

xyzdate=strip(xyzdate) || '-2010';

date=input(xyzdate,date11.);

format cash credit supplies dollar12.2;

drop xyz:;

run;

art297
Opal | Level 21

Tom,

I never thought I'd ever be saying something nice about Excel (other than I really do like the simplicity of its graph routines), but I had never seen its transpose feature before.  Definitely a nice routine.

Avocado,

The one thing that the Excel transpose doesn't take care of is the first two columns.  They don't have variable names and there aren't enough of them to spread across the various rows.  Are they simply to be duplicated as done in FriedEgg's code?

AvocadoRivalry
Calcite | Level 5

It seems like doing it in Excel is probably the most efficient approach, as this will only have to be done a few times.  The transpose feature in Excel basically does exactly what I need, so no reason to over-complicate things.         

Peter_C
Rhodochrosite | Level 12

If the data are like the excel demo but in a plain CSV file, a DATA STEP can deliver the required result with little code. (not easy to create demo on iphone) - more later

Peter_C
Rhodochrosite | Level 12

my mis-judgement:

1

using @ and # to read from multiple lines would only be effective for a fixed layout (all raw data columns having the same width)

2

for a straightforward data step variabvle types and lengths must be able to be defined before runtime, so inforrmation only collected with input statements cannot affect the structure (names, types, lengths) of the output data set

so at least two steps would be needed

and so it will require more than a "little code"

peter

MikeZdeb
Rhodochrosite | Level 12

Hi ... here's another idea,  transpose the spreadsheet using the libname engine then use SQL to assign variable names and types ...

libname x 'z:\rawdata.xls' getnames=no scantime=no;

proc transpose data=x.'sheet1$'n out=new;

var f4-f20;

by f1 f2;

run;

libname x clear;

proc sql;

create table xyz as

select f1 as id, f2 as name, col7 as date,

       input(col6,comma10.) as cash,

       input(col5,comma10.) as credit,

       input(col4,comma10.) as supplies,

       col3 as other, col2 as total, col1 as equipment

from new;

quit;

If those "@NA" variables are really supposed to be missing numeric data ...

proc format;

invalue fix @NA = .  other=[comma10.];

value fixed . = '@NA' other=[10.];

run;

proc sql;

create table xyz as

select f1 as id, f2 as name, col7 as date,

       input(col6,comma10.) as cash,

       input(col5,comma10.) as credit,

       input(col4,comma10.) as supplies,

       input(col3,fix.) as other format=fixed.,

       input(col2,fix.) as total format=fixed.,

       input(col1,fix.) as equipment format=fixed.

from new;

quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 2611 views
  • 0 likes
  • 6 in conversation