BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
BillJones
Calcite | Level 5

Hello everyone,

I'm trying to download some customer number data from a company intranet site.  Attempting to adapt some code that Tom gave me (thanks again!) which downloads data from the internet using an input statement.  Code works fine when I specify the number of customer numbers.  Unfortunately, the number of customers is unknown.  There may be 10 or 100. The number varies based on company region and other characteristics.  Note I tried using the DIM(var) syntax to obtain the dimensions of the array, but I get the following error message:

ERROR: The array CUST_NUM has been defined with zero elements.

ERROR: Too many array subscripts specified for array CUST_NUM.

Below is the code I'm using.  Does anyone have any suggestions on how to define the array such it'll be able to handle varying numbers of customers?

Thanks so much for any help or advice.

-Bill

Again when I run this code I get the above error message.

%MACRO CUSTOMER_NUMBER(HTML_TAG,SAS_NAME);

data &SAS_NAME.;

infile in flowover ;

input @&HTML_TAG. @;  /*START*/

array &SAS_NAME. (*) $8.;

do col=1 to dim(&SAS_NAME.);

   input @ '/q?s='  &SAS_NAME.(col) $8. @;

end;

drop col;

run;

%MEND;

%CUSTOMER_NUMBER('>Preferred Customers<',CUST_NUM);

1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

Why not just read the names as observations instead of into and array.  You don't know how many you need.  You also needed larger LRECL.  When doing the "blind scan" you can either live with LOSTCARD or use INFILE statement option EOF instead of END.

%MACRO INSDIER(HTML_TAG,SAS_NAME);


filename in url
"http://finance.yahoo.com/q/it?s=LNKD+Insider+Transactions" lrecl=1000000;

data test;
   infile in length=len truncover;
   input @;
   recno +
1;
   reclen = len;
   do block=
1 by 1;
      input record
$char128. @;
      if missing(record) then leave;
      output;
      end;
   run;



%LET NCOL=50; /*set the number of columns to an arbitrarily large number to ensure all insider info is recorded*/

data &
SAS_NAME.;
   infile in scanover eof=eof;
   input @&
HTML_TAG. @;  /*START*/
   length &sas_name $
25;
   do i =
1 by 1;
      input @
'http://biz.yahoo.com' @'>' string $40. @;
      &SAS_NAME=scan(string,1,'<'); /*SCAN STRING AND KEEP UP TO <*/
      output;
      end;
   eof: stop;
   drop string;
   run;
%MEND;
options mprint=1;
%
INSDIER('>Insider Transactions Reported - Last Two Years<',INSIDER);

View solution in original post

14 REPLIES 14
Tom
Super User Tom
Super User

You can only use (*) as the dimension for an array definition when the variables in question already exist.

You could try adding a third macro parameter to specify the number of expected values.

%MACRO CUSTOMER_NUMBER(HTML_TAG,SAS_NAME,NCOL);

...

array &SAS_NAME. (&NCOL) $8.;

If you cannot know in advance how many copies there will be then you probably need to add conditional to the DO loop to prevent reading past the end of the file.

infile in flowover end=eof ;

...

do col=1 to dim(&NCOL) while (not eof);


BillJones
Calcite | Level 5

Tom,

Thanks so much for your thoughts.  I tried to implement your solution by specifying an arbitrarily large number of customers and then using the conditional DO loop to prevent reading past the end of the file.  I can see the data in the log!  However, I get a "LOST CARD" error and SAS doesn't output any records to the data set.  I tried using the MISSOVER command in the infile statement but that set all the data to blank.  Am I correctly using the conditional DO loop as you suggested?  Should it be a loop within a loop?

Here's the revised code.  Note it works fine when I specify NCOL=10.  When I put NCOL=100, I get a "LOST CARD" error and no data records are output.

%LET NCOL=100;

%MACRO CUSTOMER_NUMBER(HTML_TAG,SAS_NAME);

data &SAS_NAME.;

infile in flowover end=eof ;

input @&HTML_TAG. @;  /*START*/

array &SAS_NAME. (&NCOL.) $8.;

do col=1 to dim(&SAS_NAME.) while (not eof);

   input @ '/q?s='  &SAS_NAME.(col) $8. @;

end;

drop col;

run;

%MEND;

%CUSTOMER_NUMBER('>Preferred Customers<',CUST_NUM);

data_null__
Jade | Level 19

Can you show example of the site you are reading from.

BillJones
Calcite | Level 5

data_null_;,

Thanks for the response.  Unfortunately, I can't give an example of the website since it's a company intranet site.  However, the code I'm using is almost working.  I just need to figure out how to generalize it, so it'll handle an unknown number of customers. 

Regards,

Bill

Scott_Mitchell
Quartz | Level 8

Substitute the data contained within the website with something generic.  It is difficult to help you if you can't provide a basic structure of the data you are reading.

BillJones
Calcite | Level 5

My apologies for 1) not having an example and 2) taking a long time to respond (work has been crazy this week).  Here's some code that pulls the names of LinkedIn insiders who've traded the stock in the last 2 years from yahoo finance.  The structure of the table on yahoo is very similar to the table on the intranet page that I'm working with. Also, both programs get the exact same error when I set the NCOL (number of columns) variable to something large, e.g., 50.  (SAS comes back with a "LOST CARD" error and states "SAS went to a new line when INPUT @'CHARACTER_STRING' scanned past the end of a line."  Note the program runs fine when I set the NCOL = 10.  Note I'm setting the number of columns to an arbitrarily large number because I don't know how many insiders traded the stock during the past 2 years.  Is there any easy way to count them before hand and then specify the number of columns correctly?  Or is it advisable to simply pick a large number and then delete any records with missing data?


Thanks so much for any help or advice.

Regards,

Bill



Code:

DM 'CLEAR LOG';

OPTIONS SPOOL;

%MACRO INSDIER(HTML_TAG,SAS_NAME);

filename in url "http://finance.yahoo.com/q/it?s=LNKD+Insider+Transactions" lrecl=32000;

data test;

infile in length=len;

input record $varying32000. len;

run;

%LET NCOL=50; /*set the number of columns to an arbitrarily large number to ensure all insider info is recorded*/

data &SAS_NAME.;

infile in flowover end=eof;

input @&HTML_TAG. @;  /*START*/

array &SAS_NAME. (&NCOL.) $25.;

do col=1 to dim(&SAS_NAME.) while (not eof);

    input @ '8557.html">'  string $25. @; /*GRAB 25 CHARACTERS AFTER TAG*/

  &SAS_NAME.(col)=input(scan(string,1,'<'),??$25.); /*SCAN STRING AND KEEP UP TO <*/

end;

drop col string;

run;

%MEND;

%INSDIER('>Insider Transactions Reported - Last Two Years<',INSIDER);

data_null__
Jade | Level 19

Why not just read the names as observations instead of into and array.  You don't know how many you need.  You also needed larger LRECL.  When doing the "blind scan" you can either live with LOSTCARD or use INFILE statement option EOF instead of END.

%MACRO INSDIER(HTML_TAG,SAS_NAME);


filename in url
"http://finance.yahoo.com/q/it?s=LNKD+Insider+Transactions" lrecl=1000000;

data test;
   infile in length=len truncover;
   input @;
   recno +
1;
   reclen = len;
   do block=
1 by 1;
      input record
$char128. @;
      if missing(record) then leave;
      output;
      end;
   run;



%LET NCOL=50; /*set the number of columns to an arbitrarily large number to ensure all insider info is recorded*/

data &
SAS_NAME.;
   infile in scanover eof=eof;
   input @&
HTML_TAG. @;  /*START*/
   length &sas_name $
25;
   do i =
1 by 1;
      input @
'http://biz.yahoo.com' @'>' string $40. @;
      &SAS_NAME=scan(string,1,'<'); /*SCAN STRING AND KEEP UP TO <*/
      output;
      end;
   eof: stop;
   drop string;
   run;
%MEND;
options mprint=1;
%
INSDIER('>Insider Transactions Reported - Last Two Years<',INSIDER);
BillJones
Calcite | Level 5

data_null_;,

Thanks so much for your help. Program works perfectly now!

Best,

Bill

BillJones
Calcite | Level 5

data_null_;

Thanks again for the code. Quick question: Currently, I'm using a macro to create separate data sets for each variable of interest and then merging the information.  Is there a way to easily modify the code such that it can capture more than one variable in a data step?  Using the insider information example, suppose I wanted to capture the insider type(director, officer) along with the insider name (html tags in code below).  Is this possible to do in one data step?  If this is a pain, then please don't worry about it.  The code you gave me works really well.  I'm just trying to make the program as efficient as possible.  Thank you!

-Bill

Current code uses macro to create separate data sets for each variable of interest (e.g., INSIDER_NAME, INSIDER_TYPE):

DM 'CLEAR LOG';

OPTIONS SPOOL;

filename in url "http://finance.yahoo.com/q/it?s=LNKD+Insider+Transactions" lrecl=1000000;

data test;

   infile in length=len truncover;

   input @;

   recno + 1;

   reclen = len;

   do block=1 by 1;

      input record $char128. @;

      if missing(record) then leave;

      output;

      end;

   run;

%LET MAIN_HTML_TAG='>Insider Transactions Reported - Last Two Years<';

%MACRO INSIDER_VARS(SAS_NAME,VAR_HTML_TAG,FORMAT);

data &SAS_NAME.;

   infile in scanover eof=eof;

   input @&MAIN_HTML_TAG. @;  /*START*/

   length &sas_name $25;

   do i = 1 by 1;

    INPUT @ &VAR_HTML_TAG. @'>' STRING &FORMAT. @;

    &SAS_NAME=INPUT(SCAN(STRING,1,'<'),??&FORMAT.);

    output;

    end;

   eof: stop;

   drop string;

run;

%MEND;

options mprint=1;

%INSIDER_VARS(INSIDER_NAME,'http://biz.yahoo.com',$40.);

%INSIDER_VARS(INSIDER_TYPE,'<span style="font-size:90%; display:block;"',$40.);

data_null__
Jade | Level 19

Yes can read everything at once, just more of the same so to speak.

data insidertrans;
   infile in scanover eof=eof;
   input @'>Insider Transactions Reported - Last Two Years<' @;  /*START*/
  
length name title $25 shares 8 type $10 transaction $40;
  
do i = 1 by 1;
     
input @'http://biz.yahoo.com' @'>' string $40. +(-40) @;
      NAME=scan(string,1,'<'); /*SCAN STRING AND KEEP UP TO <*/
     
input @'<span style="font-size:90%; display:block;"' @'>' string $40. +(-40) @;
      title=scan(string,1,'<');
      input @'class="yfnc_tabledata1"' @'>' string $40. +(-40) @;
      shares=input(scan(string,1,'<'),comma12.);
      input @'class="yfnc_tabledata1"' @'>' string $40. +(-40) @;
      type=scan(string,1,'<');
      input @'class="yfnc_tabledata1"' @'>' string $40. +(-40) @;
      transaction=scan(string,1,'<');
      output;
     
end;
   eof:
stop;
  
*drop string;
  
run;
BillJones
Calcite | Level 5

Thanks so much for the updated code!!!  It eliminates the need to merge the data in a subsequent data step.

Regards,

Bill

data_null__
Jade | Level 19

I think there is one more variable I forget to include but you should be able to do that with no problem.

BillJones
Calcite | Level 5

With the code you gave me, picking up the last variable should be no problem whatsoever.  Just a few more questions if you don't mind: What's the purpose of the +(-40) in the input statement?  Is it related to the variable with the greatest length?  In other words, if the variable with the greatest length was $60, then the code would be +(-60)?

Also, do you know of any good references for mastering the input statement?  I have a LOT to learn.

Thanks so much for your help!

Best,

Bill

data_null__
Jade | Level 19

I used +(-40) to move backward the amount that was read by STRING $40.  I don't think this is necessary but it thought it might happen that reading in the STRING could move the pointer past the next "target" of the @''class="yfnc_tabledata1" bit.  If you read string $60. then your would move back +(-60), or $10.  +(-10)

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 12389 views
  • 3 likes
  • 4 in conversation