Reshape Datastream data for SAS

Reply
New Contributor
Posts: 2

Reshape Datastream data for SAS

Dear All


Could someone help me with the following problem? It is urgent! Thanks a lot!

I have retrieved data about singapore equities from 1990 to 2012 from datastream, which is exported into excel files.

The format of the data, however, is not compatible with SAS.

For instance, from datastream,

                date1  date2  date3

comA       p           p       p

comB      p           p         p

comC     p           p          p

However, i understand that for sas, it need to be transformed to

Company Date Price

A               1         p

A               2         p

.

.

.

C               3         p

How could i reshape the data? As I have large number of companeis and dates in my dataset, I cant really mannually change the format.

Is there any function in SAS that could help the reshaping?

Or could someone help with an excel marco that could fulfill the purpose?

Thank you so much!

Best regards

Chloe

Respected Advisor
Posts: 3,840

Re: Reshape Datastream data for SAS

To give you actual code very close to what you need I suggest that you attach some of the data for us (eg. a copy of your Excel with the first 200 lines populated).

The approach I'm thinking about will include some array processing over all the date variables - but to then actually populate a date variable based on variable names we really need to see the actual data.

New Contributor
Posts: 2

Re: Reshape Datastream data for SAS

Thank you for your help!

Attached is the screen shot of the sample data.

How can I attach excel worksheet?

sample.PNG

Super Contributor
Posts: 644

Re: Reshape Datastream data for SAS

Here are a few steps you can take

  1. Delete rows 1 to 4 and column A of your Excel table and save the worksheet
  2. Import the table into SAS - use Proc Import if you have installed the SAS Access to PC File formats
  3. Otherwise export your dataset as a CSV or Tab delimited dataset and import it into SAS (Proc Export in its Bas cut down form)
  4. In Proc Import you need to specify a table name - temp will do
  5. Apart from the Name column, SAS will give unusual names to the weekly dates, like _2004_1_5
  6. Now use Proc transpose to change rows to columns

Proc transpose

               data = temp

               out = have

               ;

     By     Noitsorted Name ;

Run ;

Nearly there, just need to manage the dates, now held in the _NAME_ column :

Data have ;

     Set want ;

     Date_Str = Substr (_Name_, 1, 11) ;

     = Translate (Date_Str, '-', '_') ;

     date = input (DateStr, YYMMDD10.) ;

     Drop _NAME_ Date_str ;

Run ;

Then sort it by name and date ;

Proc Sort

               data = have ;

     By     Name Date ;

Run ;

    

[NB untested code]

Richard in NZ

Super Contributor
Posts: 644

Re: Reshape Datastream data for SAS

Also, change the name of the column with your data in the data step

Data have ;

     Format Name     $200.

               Date YYMMDD10.

               Col1 comma12.2

               ;

     Set want ;

     Date_Str = Substr (_Name_, 1, 11) ;

     = Translate (Date_Str, '-', '_') ;

     date = input (DateStr, YYMMDD10.) ;

     Drop _NAME_ Date_str ;

     Rename Col1 = Price ;

Run ;

Richard in OZ

Contributor
Posts: 58

Re: Reshape Datastream data for SAS

Hello, @RichardinOz @chloe @Patrick i am trying to do almost similar task but facing this issue. I will be thankful if anyone can guide.  Below is sas log file


80 Data want ;
81 Format Name $200.
82 Date YYMMDD10.
83 Col1 comma12.2
84 ;
85 Set have ;
86 Date_Str = Substr (_Name_, 1, 11) ;
87 = Translate (Date_Str, '-', '_') ;
-
180
ERROR 180-322: Statement is not valid or it is used out of proper order.

88 date = input (DateStr, YYMMDD10.) ;
89 Drop _NAME_ Date_str ;
90 Rename Col1 = Price ;
91 Run ;

Thanks 

Esteemed Advisor
Posts: 5,202

Re: Reshape Datastream data for SAS

You can't start a line of SAS code with an equal sign, you need a variable name, like in log row 86.
Data never sleeps
Contributor
Posts: 58

Re: Reshape Datastream data for SAS

[ Edited ]

Hello @LinusH @Peter_C my data set looks like this one;

data work.new;
infile datalines dsd truncover;
input Variable:$3. NAME:$15. _1_Oct_85:$7. _1_Nov_85:$7. _1_Dec_85:$7. _1_Jan_86:$7. _1_Feb_86:$7. _1_Mar_86:$7. _1_Apr_86:$7. _1_May_86:$7. _1_Jun_86:$7. _1_Jul_86:$7. _1_Aug_86:$7. _1_Sep_86:$7. _1_Oct_86:$7. _1_Nov_86:$7. _1_Dec_86:$7. _1_Jan_87:$7. _1_Feb_87:$7. _1_Mar_87:$7. _1_Apr_87:$7. _1_May_87:$7. _1_Jun_87:$7. _1_Jul_87:$7. _1_Aug_87:$7. _1_Sep_87:$7. _1_Oct_87:$7. _1_Nov_87:$7. _1_Dec_87:$7. _1_Jan_88:$7. _1_Feb_88:$7. _1_Mar_88:$7. _1_Apr_88:$7. _1_May_88:$7. _1_Jun_88:$7. _1_Jul_88:$7. _1_Aug_88:$7. _1_Sep_88:$7. _1_Oct_88:$7. _1_Nov_88:$7. _1_Dec_88:$7. _1_Jan_89:$7. _1_Feb_89:$7. _1_Mar_89:$7. _1_Apr_89:$7. _1_May_89:$7. _1_Jun_89:$7. _1_Jul_89:$7. _1_Aug_89:$7. _1_Sep_89:$7. _1_Oct_89:$7. _1_Nov_89:$7. _1_Dec_89:$7. _1_Jan_90:$7. _1_Feb_90:$7. _1_Mar_90:$7. _1_Apr_90:$7. _1_May_90:$7. _1_Jun_90:$7. _1_Jul_90:$7. _1_Aug_90:$7. _1_Sep_90:$7. _1_Oct_90:$7. _1_Nov_90:$7. _1_Dec_90:$7. _1_Jan_91:$7. _1_Feb_91:$7. _1_Mar_91:$7. _1_Apr_91:$7. _1_May_91:$7. _1_Jun_91:$7. _1_Jul_91:$7. _1_Aug_91:$7. _1_Sep_91:$7. _1_Oct_91:$7. _1_Nov_91:$7. _1_Dec_91:$7. _1_Jan_92:$7. _1_Feb_92:$7. _1_Mar_92:$7. _1_Apr_92:$7. _1_May_92:$7. _1_Jun_92:$7. _1_Jul_92:$7. _1_Aug_92:$7. _1_Sep_92:$7. _1_Oct_92:$7. _1_Nov_92:$7. _1_Dec_92:$7. _1_Jan_93:32. _1_Feb_93:32. _1_Mar_93:32. _1_Apr_93:32. _1_May_93:32. _1_Jun_93:32. _1_Jul_93:32. _1_Aug_93:32. _1_Sep_93:32. _1_Oct_93:32. _1_Nov_93:32. _1_Dec_93:32. _1_Jan_94:32. _1_Feb_94:32. _1_Mar_94:32. _1_Apr_94:32. _1_May_94:32. _1_Jun_94:32. _1_Jul_94:32. _1_Aug_94:32. _1_Sep_94:32. _1_Oct_94:32. _1_Nov_94:32. _1_Dec_94:32. _1_Jan_95:32. _1_Feb_95:32. _1_Mar_95:32. _1_Apr_95:32. _1_May_95:32. _1_Jun_95:32. _1_Jul_95:32. _1_Aug_95:32. _1_Sep_95:32. _1_Oct_95:32. _1_Nov_95:32. _1_Dec_95:32. _1_Jan_96:32. _1_Feb_96:32. _1_Mar_96:32. _1_Apr_96:32. _1_May_96:32. _1_Jun_96:32. _1_Jul_96:32. _1_Aug_96:32. _1_Sep_96:32. _1_Oct_96:32. _1_Nov_96:32. _1_Dec_96:32. _1_Jan_97:32. _1_Feb_97:32. _1_Mar_97:32. _1_Apr_97:32. _1_May_97:32. _1_Jun_97:32. _1_Jul_97:32. _1_Aug_97:32. _1_Sep_97:32. _1_Oct_97:32. _1_Nov_97:32. _1_Dec_97:32. _1_Jan_98:32. _1_Feb_98:32. _1_Mar_98:32. _1_Apr_98:32. _1_May_98:32. _1_Jun_98:32. _1_Jul_98:32. _1_Aug_98:32. _1_Sep_98:32. _1_Oct_98:32. _1_Nov_98:32. _1_Dec_98:32. _1_Jan_99:32. _1_Feb_99:32. _1_Mar_99:32. _1_Apr_99:32. _1_May_99:32. _1_Jun_99:32. _1_Jul_99:32. _1_Aug_99:32. _1_Sep_99:32. _1_Oct_99:32. _1_Nov_99:32. _1_Dec_99:32. _1_Jan_00:32. _1_Feb_00:32. _1_Mar_00:32. _1_Apr_00:32. _1_May_00:32. _1_Jun_00:32. _1_Jul_00:32. _1_Aug_00:32. _1_Sep_00:32. _1_Oct_00:32. _1_Nov_00:32. _1_Dec_00:32. _1_Jan_01:32. _1_Feb_01:32. _1_Mar_01:32. _1_Apr_01:32. _1_May_01:32. _1_Jun_01:32. _1_Jul_01:32. _1_Aug_01:32. _1_Sep_01:32. _1_Oct_01:32. _1_Nov_01:32. _1_Dec_01:32. _1_Jan_02:32. _1_Feb_02:32. _1_Mar_02:32. _1_Apr_02:32. _1_May_02:32. _1_Jun_02:32. _1_Jul_02:32. _1_Aug_02:32. _1_Sep_02:32. _1_Oct_02:32. _1_Nov_02:32. _1_Dec_02:32. _1_Jan_03:32. _1_Feb_03:32. _1_Mar_03:32. _1_Apr_03:32. _1_May_03:32. _1_Jun_03:32. _1_Jul_03:32. _1_Aug_03:32. _1_Sep_03:32. _1_Oct_03:32. _1_Nov_03:32. _1_Dec_03:32. _1_Jan_04:32. _1_Feb_04:32. _1_Mar_04:32. _1_Apr_04:32. _1_May_04:32. _1_Jun_04:32. _1_Jul_04:32. _1_Aug_04:32. _1_Sep_04:32. _1_Oct_04:32. _1_Nov_04:32. _1_Dec_04:32. _1_Jan_05:32. _1_Feb_05:32. _1_Mar_05:32. _1_Apr_05:32. _1_May_05:32. _1_Jun_05:32. _1_Jul_05:32. _1_Aug_05:32. _1_Sep_05:32. _1_Oct_05:32. _1_Nov_05:32. _1_Dec_05:32. _1_Jan_06:32. _1_Feb_06:32. _1_Mar_06:32. _1_Apr_06:32. _1_May_06:32. _1_Jun_06:32. _1_Jul_06:32. _1_Aug_06:32. _1_Sep_06:32. _1_Oct_06:32.;
datalines;
LTD,@LAUR(WC03251),NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N

and i want it follwing format; 

Name Date(Preferably in YYYYMM format) Variables (two variables in separate columns, LTD, BV)
Can you help with this thing. 
Good day

Valued Guide
Posts: 2,174

Re: Reshape Datastream data for SAS

Hi Contributor raqthesolid

 see the other message on this community

I posted a single step solution at https://communities.sas.com/t5/General-SAS-Programming/datastream-into-SAS/m-p/346407/highlight/true...

 

peterC

Valued Guide
Posts: 2,174

Re: Reshape Datastream data for SAS

is that a header row of dates, followed by 1 row for each company with price under the corresponding dates?
If that is the data structure, a short data step could read a .csv version of the datastream data and create thecrequested data structure ---
except we would have poorer performance if the row width is unknown.
Data ds1 ;
attrib co length=$100 pdate format= date9. Informat= date11. ;
array dates(10000) _temporary_ ; * allows 10,000 dates on a row.;
Infile "that .csv version of the input" dsd lrecl= 1000000 truncover ;
If _n_ = 1 then do;
Input co @;
do c= 1 to dim(dates) ;
Input dates(c) @ ;
end ;
Delete ;
end ;
Input co @ ;
do c= 1 to dim(dates) ;
Input price @ ;
Pdate = dates(c) ;
If price>0 then Output ;
End ;
Run ;

That code assumes the dates conform to date. format like dd-mmm-yyyy or dd-mmm-yy
Or without those dashes.
If datesxare in some other date layout for which SAS has a suitable informat that should replace the informat date11.
If no suitable informat is available conversion might still be possible. Other ways to define the price-date could be dealt with.
These alternatives need better definition of the input information structure .
Ask a Question
Discussion stats
  • 9 replies
  • 922 views
  • 0 likes
  • 6 in conversation