BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MeganE
Pyrite | Level 9

Hi all,

 

I have array code that works just fine.

 

array startx(3) start1 start2 start3;

 

But i have more than 3 possibles and i want to run it as an unknown upperbound.  I know this:

 

array startx(*)

 

do i=1 to dim(startx);

 

and all that, but i can't figure out how to give a default base name so that sas can just number them as they come across them.  Is there a way to do that?  I've tried:

 

array startx(*) start:;

array startx(*) start*:

 

and nothing works.

 

I know that i can put a count into a macro variable and use that but because i'm working with millions of records and i'd rather not add the processing time.  I was hoping i could do it right in the array line.


Thanks,
Megan

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
MeganE
Pyrite | Level 9

So right now, sas doesn't have the ability to determine an upper bound array when using an array to create new variables.  That's what i'm hearing.

 

Thanks

View solution in original post

21 REPLIES 21
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

data have;
  start1=1; start2=3; start3=4;
run;
data want;
  set have;
  array startx start:;
  do i=1 to dim(startx);
    startx{i}=9;
  end;
run;

Although if you have lots of counts you would probably be better off having a normalised structure with data going down rather than across.

MeganE
Pyrite | Level 9

Thanks.  The data is currently going down and i want it to go across so i can compare end to start dates.

 

Doing it without the () and using the wildcard i'm getting warning saying it defined an array with zero elements.

 

 

 

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Sorry, I can't see your code/data or log??

MeganE
Pyrite | Level 9

I'm on an enclave computer that has no copy function or internet (security you know!).  I can't copy anything over.  The best i can do is a screenshot.

 

When the arrays are the normal "startx (3) start1 - start3" it runs just fine.  I just need to know how to not have to write out start3.  Leave it open for sas to fill.

 

array.png

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Its telling you there are no variables with start prefix.  Your code seems confused, why is the set <dataset> within the do loop?

Your code should look like:

data...;
  set mods.mcare...;
  by usrds_id;
  array startx start:;
  ...;
  do i=1 to dim(startx);
    ...;
  end;
run;
  
MeganE
Pyrite | Level 9

I have 2 million rows of data, multiple rows per id, and i want one row per id with all of the start and end values on a single row.

 

When i'm going from "long to wide" i always put the arrays first listing out the new variables that i want to have in the new dataset i'm trying to create.

 

 

 

mkeintz
PROC Star

By "the data is going down" do you mean that the "array" of values you want comes from the rows of data (i.e. one row per array element)?  If so then the array size would equal the number of obs in the data set, which is easily determined.

 

But you also say you want to "compare end to start dates", which I guess means the date in the first obs vs the date in the last obs.  In that case you don't even need the array.  Just read the first obs and also used the "point=" option to read the last obs without reading any obs in between:

 

 

data _null_;

   set have (obs=1)   nobs=nrows;

   set have (keep=date rename=(date=last_date)) point=nrows;

   last_minus_first=last_date - date;

   put date=date9.  last_date=date9. last_minus_first=;

run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
RW9
Diamond | Level 26 RW9
Diamond | Level 26

I think we have here another clear example of why following the guidance by the Post button is so key to getting clear answers.

1) Post test data in the form of a datastep

2) Post example required output

3) Explain logic between the two

4) Provide code/logs where present

 

We are just guessing here.

MeganE
Pyrite | Level 9

Frankly, either there's a way to let sas determine the upper bound array name or there isn't.  Me posting code isn't going to change whether or not someone knows if that's even possible.  And i did post the array line.  THe one line i needed help with.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

And following on from the minimum information you provided in the your first post:

---

array startx(3) start1 start2 start3;

 

But i have more than 3 possibles and i want to run it as an unknown upperbound.  I know this:

 

array startx(*)

 

do i=1 to dim(startx);

---

 

I created this code, which runs fine:

data have;
  start1=1; start2=3; start3=4;
run;
data want;
  set have;
  array startx start:;
  do i=1 to dim(startx);
    startx{i}=9;
  end;
run;

Now you are telling us that your data does not have variables start1 start2 etc.  Your data looks something like this:

START  END   MCARE

...

 

And you are putting these one by one up into variables, correct?  This is different as there is no way for the datastep to know how many elements until the end.

So then you have several options:

- Find out how many items there are before your datastep;

 

proc sql;
  select max(count(start))
  into  :items
  from  have
  group by urds_id;
quit;

data want;
  set have;
  array startx{&items.)...;

 

This will likely be the quickest method.

You could do three transposes.

Or create a large array that will be bigger than you will ever get to.

 

MeganE
Pyrite | Level 9

Correct, i only have start and end.  I wanted to create start1 - start?.

 

And i want to know if SAS can determine the "?" for me.

 

Thank you

Reeza
Super User

HBOUND()

LBOUND()

 


@MeganE wrote:

Frankly, either there's a way to let sas determine the upper bound array name or there isn't.  Me posting code isn't going to change whether or not someone knows if that's even possible.  And i did post the array line.  THe one line i needed help with.


 

You can short cut your variable list as well if you have a common prefix:

 

array myVars(*) Start: ;

If you're trying to automate the declaration of a macro that's going to be difficult, you need to precalculate the number, into a macro variable and work with that. The other option is to use PROC TRANSPOSE and then you don't need to know any of that. If you have multiple variables you'll need multiple transposes and a merge. 

MeganE
Pyrite | Level 9

I know the upper bound can be determined before hand, i said that in my initial post.  What i also said is that because i'm working with millions of records, i'd rather not add to the processing time.

 

It looks like your option only allows me to compare two dates.  I have multiple dates per id that i would like to see all on one line.

 

ballardw
Super User

@MeganE wrote:

I know the upper bound can be determined before hand, i said that in my initial post.  What i also said is that because i'm working with millions of records, i'd rather not add to the processing time.

 

It looks like your option only allows me to compare two dates.  I have multiple dates per id that i would like to see all on one line.

 


If those dates are not on the same record than an array, at least at the begining, is not going to work as arrays only exist for variables on a single record. What you will need to do is to transpose the data so those dates are one a single row

Something like this:

data example;
   do id = 1 to 3;
      do j= 1 to 5;
         date = today() + id +j; /* just to have some variety in the dates*/
         output;
      end;
   end;
   format date mmddyy10.;
run;

proc transpose data=example out=extrans
   prefix=start ;
by id;
var date;
run;

However with "millions" of records depending on how many ids and dates are concerned this might be a longish program.

 

And if you are looking at end dates you would need to make two separate transproses and then merge the two transposed sets to get both variables on the same row.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 21 replies
  • 6046 views
  • 0 likes
  • 8 in conversation