BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
raqthesolid
Quartz | Level 8

Hello guys I'm trying to transpose some data in sas by using following codes
PROC TRANSPOSE DATA=temp OUT=Vertical4;
ID Variable;
VAR '19851001' - '20061001';
BY Ticker; RUN;

...

I am getting following error messages:

352 PROC TRANSPOSE DATA=temp OUT=Vertical4;
353 ID Variable;
354 VAR '19851001' - '20061001';
----------
22
200
ERROR 22-322: Syntax error, expecting one of the following: a name, ;, _ALL_, _CHARACTER_,
_CHAR_, _NUMERIC_.
ERROR 200-322: The symbol is not recognized and will be ignored.
355 BY Ticker; RUN;

my original data looks like data work.new;
  infile datalines dsd truncover;
  input Variable:$3. NAME:$15. _1_Oct_85:$7. _1_Nov_85:$7. _1_Dec_85:$7. _1_Jan_86:$7. _1_Feb_86:$7. _1_Mar_86:$7. _1_Apr_86:$7. _1_May_86:$7. _1_Jun_86:$7. _1_Jul_86:$7. _1_Aug_86:$7. _1_Sep_86:$7. _1_Oct_86:$7. _1_Nov_86:$7. _1_Dec_86:$7. _1_Jan_87:$7. _1_Feb_87:$7. _1_Mar_87:$7. _1_Apr_87:$7. _1_May_87:$7. _1_Jun_87:$7. _1_Jul_87:$7. _1_Aug_87:$7. _1_Sep_87:$7. _1_Oct_87:$7. _1_Nov_87:$7. _1_Dec_87:$7. _1_Jan_88:$7. _1_Feb_88:$7. _1_Mar_88:$7. _1_Apr_88:$7. _1_May_88:$7. _1_Jun_88:$7. _1_Jul_88:$7. _1_Aug_88:$7. _1_Sep_88:$7. _1_Oct_88:$7. _1_Nov_88:$7. _1_Dec_88:$7. _1_Jan_89:$7. _1_Feb_89:$7. _1_Mar_89:$7. _1_Apr_89:$7. _1_May_89:$7. _1_Jun_89:$7. _1_Jul_89:$7. _1_Aug_89:$7. _1_Sep_89:$7. _1_Oct_89:$7. _1_Nov_89:$7. _1_Dec_89:$7. _1_Jan_90:$7. _1_Feb_90:$7. _1_Mar_90:$7. _1_Apr_90:$7. _1_May_90:$7. _1_Jun_90:$7. _1_Jul_90:$7. _1_Aug_90:$7. _1_Sep_90:$7. _1_Oct_90:$7. _1_Nov_90:$7. _1_Dec_90:$7. _1_Jan_91:$7. _1_Feb_91:$7. _1_Mar_91:$7. _1_Apr_91:$7. _1_May_91:$7. _1_Jun_91:$7. _1_Jul_91:$7. _1_Aug_91:$7. _1_Sep_91:$7. _1_Oct_91:$7. _1_Nov_91:$7. _1_Dec_91:$7. _1_Jan_92:$7. _1_Feb_92:$7. _1_Mar_92:$7. _1_Apr_92:$7. _1_May_92:$7. _1_Jun_92:$7. _1_Jul_92:$7. _1_Aug_92:$7. _1_Sep_92:$7. _1_Oct_92:$7. _1_Nov_92:$7. _1_Dec_92:$7. _1_Jan_93:32. _1_Feb_93:32. _1_Mar_93:32. _1_Apr_93:32. _1_May_93:32. _1_Jun_93:32. _1_Jul_93:32. _1_Aug_93:32. _1_Sep_93:32. _1_Oct_93:32. _1_Nov_93:32. _1_Dec_93:32. _1_Jan_94:32. _1_Feb_94:32. _1_Mar_94:32. _1_Apr_94:32. _1_May_94:32. _1_Jun_94:32. _1_Jul_94:32. _1_Aug_94:32. _1_Sep_94:32. _1_Oct_94:32. _1_Nov_94:32. _1_Dec_94:32. _1_Jan_95:32. _1_Feb_95:32. _1_Mar_95:32. _1_Apr_95:32. _1_May_95:32. _1_Jun_95:32. _1_Jul_95:32. _1_Aug_95:32. _1_Sep_95:32. _1_Oct_95:32. _1_Nov_95:32. _1_Dec_95:32. _1_Jan_96:32. _1_Feb_96:32. _1_Mar_96:32. _1_Apr_96:32. _1_May_96:32. _1_Jun_96:32. _1_Jul_96:32. _1_Aug_96:32. _1_Sep_96:32. _1_Oct_96:32. _1_Nov_96:32. _1_Dec_96:32. _1_Jan_97:32. _1_Feb_97:32. _1_Mar_97:32. _1_Apr_97:32. _1_May_97:32. _1_Jun_97:32. _1_Jul_97:32. _1_Aug_97:32. _1_Sep_97:32. _1_Oct_97:32. _1_Nov_97:32. _1_Dec_97:32. _1_Jan_98:32. _1_Feb_98:32. _1_Mar_98:32. _1_Apr_98:32. _1_May_98:32. _1_Jun_98:32. _1_Jul_98:32. _1_Aug_98:32. _1_Sep_98:32. _1_Oct_98:32. _1_Nov_98:32. _1_Dec_98:32. _1_Jan_99:32. _1_Feb_99:32. _1_Mar_99:32. _1_Apr_99:32. _1_May_99:32. _1_Jun_99:32. _1_Jul_99:32. _1_Aug_99:32. _1_Sep_99:32. _1_Oct_99:32. _1_Nov_99:32. _1_Dec_99:32. _1_Jan_00:32. _1_Feb_00:32. _1_Mar_00:32. _1_Apr_00:32. _1_May_00:32. _1_Jun_00:32. _1_Jul_00:32. _1_Aug_00:32. _1_Sep_00:32. _1_Oct_00:32. _1_Nov_00:32. _1_Dec_00:32. _1_Jan_01:32. _1_Feb_01:32. _1_Mar_01:32. _1_Apr_01:32. _1_May_01:32. _1_Jun_01:32. _1_Jul_01:32. _1_Aug_01:32. _1_Sep_01:32. _1_Oct_01:32. _1_Nov_01:32. _1_Dec_01:32. _1_Jan_02:32. _1_Feb_02:32. _1_Mar_02:32. _1_Apr_02:32. _1_May_02:32. _1_Jun_02:32. _1_Jul_02:32. _1_Aug_02:32. _1_Sep_02:32. _1_Oct_02:32. _1_Nov_02:32. _1_Dec_02:32. _1_Jan_03:32. _1_Feb_03:32. _1_Mar_03:32. _1_Apr_03:32. _1_May_03:32. _1_Jun_03:32. _1_Jul_03:32. _1_Aug_03:32. _1_Sep_03:32. _1_Oct_03:32. _1_Nov_03:32. _1_Dec_03:32. _1_Jan_04:32. _1_Feb_04:32. _1_Mar_04:32. _1_Apr_04:32. _1_May_04:32. _1_Jun_04:32. _1_Jul_04:32. _1_Aug_04:32. _1_Sep_04:32. _1_Oct_04:32. _1_Nov_04:32. _1_Dec_04:32. _1_Jan_05:32. _1_Feb_05:32. _1_Mar_05:32. _1_Apr_05:32. _1_May_05:32. _1_Jun_05:32. _1_Jul_05:32. _1_Aug_05:32. _1_Sep_05:32. _1_Oct_05:32. _1_Nov_05:32. _1_Dec_05:32. _1_Jan_06:32. _1_Feb_06:32. _1_Mar_06:32. _1_Apr_06:32. _1_May_06:32. _1_Jun_06:32. _1_Jul_06:32. _1_Aug_06:32. _1_Sep_06:32. _1_Oct_06:32.;
datalines;
LTD,@LAUR(WC03251),NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,1783,1783,1783,1783,1783,1783,1783,1783,1783,1783,1783,1783,4045,4045,4045,4045,4045,4045,4045,4045,4045,4045,4045,4045,2521,2521,2521,2521,2521,2521,2521,2521,2521,2521,2521,2521,1575,1575,1575,1575,1575,1575,1575,1575,1575,1575,1575,1575,0,0,0,0,0,0,0,0,0,0,0,0,12504,12504,12504,12504,12504,12504,12504,12504,12504,12504,12504,12504,151204,151204,151204,151204,151204,151204,151204,151204,151204,151204,151204,151204,128575,128575,128575,128575,128575,128575,128575,128575,128575,128575,128575,128575,124474,124474,124474,124474,124474,124474,124474,124474,124474,124474,124474,124474,154055,154055,154055,154055,154055,154055,154055,154055,154055,154055,154055,154055,75100,75100,75100,75100,75100,75100,75100,75100,75100,75100,75100,75100,86605,86605,86605,86605,86605,86605,86605,86605,86605,86605,86605,86605,99997,99997,99997,99997,99997,99997,99997,99997,99997,99997,99997,99997,327734,327734,327734,327734,327734,327734,327734,327734,327734,327734
LTD,@BWLD

Please have a look if anyone can help.
thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

I would recommend getting it out of Excel as fast as you can.  For example you could use PROC IMPORT with GETNAMES=NO to read it.

 

proc import dbms=xlsx out=raw datafile='c:\downloads\book1.xlsx';
  getnames=no;
run;

Note this could cause dates to be imported as the raw number of days instead of valid dates or even date strings.

You could then transpose it and convert it into ROW,COL, value.  You can use INPUT to convert the character strings back into numbers.

proc transpose data=raw out=tall ;
  by A notsorted ;
  var _all_;
  format _all_ ;
run;
data tall2 ;
  set tall ;
  array x _all_ ;
  if _name_ = 'A' then do;
    row + 1;
    col = 0;
  end;
  col + 1;
  value = input(col1,??comma32.);
  valuec = col1 ;
  keep row col value valuec ;
run;

Now you can parse out the values like the company names in the first row and the dates in the first column.

data dates ;
  set tall2 ;
  where col=1 and row > 2 ;
  date = value + '01JAN1900'd -2 ;
  format date yymmdd10.;
  keep row date ;
run;

data ticker ;
  set tall2 ;
  where row=2 and col > 1;
  ticker = valuec ;
  keep col ticker ;
run;

data names ;
  set tall2 ;
  where row =1 and col>1 ;
  if index(valuec,' - ') then do ;
    type = left(scan(valuec,-1,'-'));
    company = substr(valuec,1,index(valuec,' - '));
  end;
  else type = 'ERROR' ;
  retain company ;
  keep col company type ;
run;

You can then combine them together again by ROW number or COL number where appropriate.

data tall3 ;
  merge dates tall2 ;
  by row ;
  if row > 2 ;
run;

proc sort data=tall3 out=tall4;
  by col row ;
run;

data tall5 (keep=col row company ticker type date value)
     error (keep=col row company valuec)
;
  merge names ticker tall4 ;
  by col ;
  if col > 1 ;
  if type ne 'ERROR' then output tall5;
  else if valuec ne ' ' then output error ;
run;

You could then transpose to have one row per company * date .

proc sort data=tall5 ;
  by company date type ;
run;

proc transpose data=tall5 out=want (drop=_:) ;
  by company date ;
  id type ;
  var value ;
run;

View solution in original post

8 REPLIES 8
Patrick
Opal | Level 21

@raqthesolid

I believe there are several issues in your code.

1. VAR '19851001' - '20061001'; These are not valid SAS variable names and also nothing you've provided as your sample data. You can use name literals as SAS variable names (not recommended) but you need to address them using syntax 'name literal'n

 

2. Proc Transpose transposes a data structure from Long to Wide. I believe you're after the opposite.

 

3. Proc Transpose deals only with Numeric variables in the VAR statement. You've got also character variables there.

 

Below a code sample using your sample data and transposing the data into a long structure. Is that what you're after?


data want(keep=variable name var_date var_value);

  if _n_=1 then
    do;
      if 0 then set work.new(keep= _:);
      array cvars {*} _character_;
      array nvars {*} _numeric_;
    end;

  set work.new;
  format var_date date9. var_value best32.;
  do _i=1 to dim(cvars);
    var_date=input(compress(vname(cvars[_i]),'_'),date.);
    var_value=input(cvars[_i],?? best32.);
    output;
  end;
  do _i=1 to dim(nvars);
    var_date=input(compress(vname(nvars[_i]),'_'),date.);
    var_value=input(nvars[_i],?? best32.);
    output;
  end;

run;
 
Tom
Super User Tom
Super User

That is not how to specify a variable list.  First you need to use variable names, not string literal. Second since you are transposing from wide to tall you don't want to use the ID statement.  You can add the extra variable to the BY statement instead.

 

Looking at your variable names there are a few ways you could list them.

You could just pick every variable whose name starts with an underscore.

var _: ;

You could list them by POSITION by using a double hyphen.

var _1_Oct_85 -- _1_Oct_06 ;

Or you could just do all of the variables then throw out the extra rows that are generated later.

var _all_ ;

Note that since you are mixing character and numeric variables the resulting column will be character.

raqthesolid
Quartz | Level 8

Hello @Tom @Patrick I am sure you are writing right codes but the data is terrible. I tried to transform it in excel then in sas but its not working for me. Here is the original data i downloaded from Datastream Database. I am uploading Excel file one with data(book1) and next with the format i want. I often encounter this type of data but this time im stuck. I will be very thankful if you can help me. 
Thanks

The first row is name of company followed by variable name (there are only two variables) like in column

B1= Avangrid Inc - Book Value per share           *I want only book value per share as my third column

C1= Avangrid Inc - Long term debt                       * Long term debt in Fourth column

Second row is Ticker:U: AGR(Wc05476)   I want only AGR in first column. Column header will be Ticker 

                                     U:AGR(Wc03251)

First column in excel is date:                          *These will come in second column

 I am trying to do it for last two days and i numerous different ways available on internet but failed. might be some variable format issues. 

I will be really grateful if you can help me with that. 
thanks a lot for all those you have done before. 

 

Tom
Super User Tom
Super User

Are you sure that is a valid XLSX file? The Website will not display it for me.

 

Error displaying attachment content
raqthesolid
Quartz | Level 8

@Tom Yes this is excel file may be due to large size. I dropped data and brought it to less than half mb. hopefully it will work now. 
Good day

raqthesolid
Quartz | Level 8
In browser window its giving error but when you download it, It's perfectly fine. There are few error columns in data. they are useless and not required at all.
Tom
Super User Tom
Super User

What exactly are you going to do with a table that has 37 rows and over 10,000 columns? Especially if many of the columns have names that are really not valid variable names?

Can't you clean it up in EXCEL before trying to read it into SAS?

 

3    libname x xlsx 'c:\downloads\book1.xlsx';
NOTE: Libref X was successfully assigned as follows:
      Engine:        XLSX
      Physical Name: c:\downloads\book1.xlsx
4    proc copy inlib=x outlib=work; run;
NOTE: Writing HTML Body file: sashtml.htm

NOTE: Copying X.BOOK1 to WORK.BOOK1 (memtype=DATA).
NOTE:    Variable Name Change.  AVANGRID INC - BOOK VALUE PER SH -> AVANGRID_INC___BOOK_VALUE_PER_SH
NOTE:    Variable Name Change.  AVANGRID INC - LONG TERM DEBT -> 
...
NOTE:    Variable Name Change.  #ERROR_7096 -> _ERROR_7096
NOTE: BUFSIZE is not cloned when copying across different engines. System Option for BUFSIZE was used.
NOTE: The import data set has 37 observations and 10204 variables.
NOTE: There were 37 observations read from the data set X.BOOK1.
NOTE: The data set WORK.BOOK1 has 37 observations and 10204 variables.
NOTE: Copying X.WANT to WORK.WANT (memtype=DATA).
NOTE:    Variable Name Change.  Long term debt -> Long_term_debt
NOTE:    Variable Name Change.  Book value per share -> Book_value_per_share
NOTE: BUFSIZE is not cloned when copying across different engines. System Option for BUFSIZE was used.
NOTE: The import data set has 7 observations and 4 variables.
NOTE: There were 7 observations read from the data set X.WANT.
NOTE: The data set WORK.WANT has 7 observations and 4 variables.
NOTE: PROCEDURE COPY used (Total process time):
      real time           53.46 seconds
      cpu time            53.10 seconds

Tom
Super User Tom
Super User

I would recommend getting it out of Excel as fast as you can.  For example you could use PROC IMPORT with GETNAMES=NO to read it.

 

proc import dbms=xlsx out=raw datafile='c:\downloads\book1.xlsx';
  getnames=no;
run;

Note this could cause dates to be imported as the raw number of days instead of valid dates or even date strings.

You could then transpose it and convert it into ROW,COL, value.  You can use INPUT to convert the character strings back into numbers.

proc transpose data=raw out=tall ;
  by A notsorted ;
  var _all_;
  format _all_ ;
run;
data tall2 ;
  set tall ;
  array x _all_ ;
  if _name_ = 'A' then do;
    row + 1;
    col = 0;
  end;
  col + 1;
  value = input(col1,??comma32.);
  valuec = col1 ;
  keep row col value valuec ;
run;

Now you can parse out the values like the company names in the first row and the dates in the first column.

data dates ;
  set tall2 ;
  where col=1 and row > 2 ;
  date = value + '01JAN1900'd -2 ;
  format date yymmdd10.;
  keep row date ;
run;

data ticker ;
  set tall2 ;
  where row=2 and col > 1;
  ticker = valuec ;
  keep col ticker ;
run;

data names ;
  set tall2 ;
  where row =1 and col>1 ;
  if index(valuec,' - ') then do ;
    type = left(scan(valuec,-1,'-'));
    company = substr(valuec,1,index(valuec,' - '));
  end;
  else type = 'ERROR' ;
  retain company ;
  keep col company type ;
run;

You can then combine them together again by ROW number or COL number where appropriate.

data tall3 ;
  merge dates tall2 ;
  by row ;
  if row > 2 ;
run;

proc sort data=tall3 out=tall4;
  by col row ;
run;

data tall5 (keep=col row company ticker type date value)
     error (keep=col row company valuec)
;
  merge names ticker tall4 ;
  by col ;
  if col > 1 ;
  if type ne 'ERROR' then output tall5;
  else if valuec ne ' ' then output error ;
run;

You could then transpose to have one row per company * date .

proc sort data=tall5 ;
  by company date type ;
run;

proc transpose data=tall5 out=want (drop=_:) ;
  by company date ;
  id type ;
  var value ;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 8602 views
  • 2 likes
  • 3 in conversation