BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Gucci913
Fluorite | Level 6

I have imported a list which is one variable and over 66k records (A1:A66000).  I then need to create a list adding an apostrophe in order to query these IDs in the next data step for how the field is stored as VARCHAR.  I am unable to create a list from the Proc Import since there are too many records.  It appears I may need to run this below step several times to use the list(s) in my where statement.

 

Is there a way in the below to add in a range from the proc import?  Then maybe I can have NPI_LIST1, NPI_LIST2, NPI_LIST3 etc?

 

Is there a way I an make sure the excel import is clean enough so that I do not need to add the apostrophes and change the records to VARCHAR?

 

Or is there a better solution I am not considering?  I am new to SAS so any guidance is greatly appreciated.

 

PROC SQL:

Select

CATS(" ' ",NPI_ID," ' ") AS VARCHAR into: NPI_LIST1 separated by " ' "

From Work.npis;

Quit;

 

1 ACCEPTED SOLUTION

Accepted Solutions
SASKiwi
PROC Star

In SAS SQL you can join columns with different data types by using either the PUT or INPUT function in the join syntax:

proc sql;
  create table Want as 
  select *
  from Have1 as A
  left join Have2 as B
  on put(A.ID_Num, 8.) = B.ID_Char;
quit;

View solution in original post

10 REPLIES 10
PaigeMiller
Diamond | Level 26

I am confused.


First you say "I have imported a list which is one variable and over 66k records" — okay, that is clear. Then you say "I am unable to create a list from the Proc Import since there are too many records." How are you creating the list? Why are you creating the list? Why do you need to add an apostrophe?

--
Paige Miller
Gucci913
Fluorite | Level 6

Sorry I was not clear.  

 

The import works fine.  The problem is that I can not use the NPI_ID in the where statement because the field I am filtering on is stored as VARCHAR and it needs to have an apostrophe since that is how the field is formatted.

 

This works fine when the proc import is smaller. I think the limitation is 10,000.  Therefore I think I need to create multiple Lists.

 

Instead of importing several files, to then create the lists casting them as VARCHAR and adding an apostrophe I am trying to determine if there is a better solution.


Does that help?

Gucci913
Fluorite | Level 6
Thx for your suggestions. The join does not work because the components are of different data types (my main issue). I am not sure how to use hash object but I am going to look into it.
SASKiwi
PROC Star

In SAS SQL you can join columns with different data types by using either the PUT or INPUT function in the join syntax:

proc sql;
  create table Want as 
  select *
  from Have1 as A
  left join Have2 as B
  on put(A.ID_Num, 8.) = B.ID_Char;
quit;
Gucci913
Fluorite | Level 6
This is amazing! Thank you thank you !
Kurt_Bremser
Super User

Let's start at the beginning: Excel files are the worst means of data transfer, as the tools available for import have to make guesses about data types and attributes. The best way to work around this is to save the data as text (.csv) which enables you to read it with a DATA step where you set all the attributes so they match the data you already have (use PROC CONTENTS on your existing data).

 

To provide sample code for the use of hash objects, we need to see examples of your data. Ideally, you post these as DATA steps with DATALINES, so we can recreate your datasets with a simple copy/paste and submit. Do not use Excel attachments or pictures.

Tom
Super User Tom
Super User

Why do you want to stuff data into macro variable(s)?

Why can't you just use the data from the dataset?  What are you going to do with this list of values?

 

If you need to generate a string like 'A','B', .... that is going to be longer than the 64K byte limit of a single macro variable then you will need to use multiple macro variables.  Like this example:

891  data _null_;
892    call symputx('list1',"'A','B'");
893    call symputx('list2',"'C','D'");
894    call symputx('list','&list1,&list2');
895  run;

NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


896
897  %put &=list1 &=list1 &=list;
LIST1='A','B' LIST1='A','B' LIST='A','B','C','D'
898  %put LIST=%superq(list);
LIST=&list1,&list2

You can use a data step to generate the need macro variables.

Let's make up some sample data:

data have;
  input id $10.;
cards;
ABCD
EFGH
IJKL
MNOP
QRST
UVWX
YZ
;

Now let's convert into macro variables with a maximum length of 20 bytes each.  Note that a dataset variable has a maximum length of only 32K bytes (half of the maximum length of a macro variable).

%let maxlen=20 ;
data _null_;
  length string $%sysfunc(min(32767,%sysfunc(max(&maxlen,1000))));
  retain string;
  do until(eof or toolong) ;
    set have end=eof;
    qid=quote(trim(id),"'");
    toolong=length(catx(',',string,qid))>&maxlen ;
    if toolong then do;
      listnum+1;
      call symputx(cats('list',listnum),string);
      string=qid;
    end;
    else string=catx(',',string,qid);
  end;
  if eof then do;
    if toolong then do;
      listnum+1;
      call symputx(cats('list',listnum),string);
    end;
    string=' ';
    do i=1 to listnum;
      string=catx(',',string,cats('&list',i));
    end;
    call symputx('listnum',listnum);
    call symputx('list',string);
  end;
run;

Let's check the results:

1574  %put &=listnum &=list;
LISTNUM=3 LIST='ABCD','EFGH','IJKL','MNOP','QRST','UVWX','YZ'
1575  %put %superq(list);
&list1,&list2,&list3
1576  %put &=list1 ;
LIST1='ABCD','EFGH','IJKL'
1577  %put LIST&listnum=&&list&listnum;
LIST3='YZ'
Gucci913
Fluorite | Level 6
Thank you for taking the time to document the steps, admittedly its over my head. The excel file I am using as a dataset is not stored the same way as my data set. I am not sure how to fix this, therefore my workaround was the macro to generate a string... it works but not without creating the step many times into various lists.
Tom
Super User Tom
Super User

@Gucci913 wrote:
Thank you for taking the time to document the steps, admittedly its over my head. The excel file I am using as a dataset is not stored the same way as my data set. I am not sure how to fix this, therefore my workaround was the macro to generate a string... it works but not without creating the step many times into various lists.

You need to explain more about what is wrong.  SAS only has TWO data types.  So there are only THREE combinations you could have.  Both character, Both numeric and one character and one numeric.  Any of them is easy to work with using SQL.

create table want as 
  select * 
  from have1 inner join have2
  on have1.charvar1 = have2.charvar2
;
create table want as 
  select * 
  from have1 inner join have2
  on have1.numvar1 = have2.numvar2
;
create table want as 
  select * 
  from have1 inner join have2
  on have1.input(charvar1,32.) = have2.numvar1
;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 672 views
  • 4 likes
  • 5 in conversation