About glcoolj12

glcoolj12 · 02-27-2019

Hello, I have a list of 17,888 unique ids (IDVAR) from a very large dataset (HAVE1) that I would like to split into several macro variables. I'd like to query based on these unique ids in another very large dataset (HAVE2). I define "large dataset" as having over 200 million rows. Creating a sub-query would take hours to run, which is why I'm creating the macro.

I've used the code below when I have fewer unique IDs, and it runs beautifully. But now I run into the issue of exceeding the maximum macro variable length.

    PROC SQL NOPRINT;
      SELECT DISTINCT QUOTE(TRIM(idvar))
        INTO :LIST1 SEPARATED BY ","
        FROM have1;
    QUIT;
    %PUT LIST1= &LIST1;

    PROC SQL;
      CREATE TABLE want AS
      SELECT DISTINCT *
        FROM have2
        WHERE idvar IN (&LIST1);
    QUIT;

I've attempted to use the code below, but I run into issues where the unique IDs get cut off and additional wording gets added to the list.

    %let n_per_list=4000;
    data _null_;
      length idlist $32000;
      length macrolist $1000;
      retain macrolist;
      do i=1 to &n_per_list until (eof);
        set HAVE1 end=eof;
        idlist=catx(',',idlist,QUOTE(TRIM(IDVAR)));
      end;
      listno+1;
      call symputx(cats('paralist',listno),idlist);
      macrolist=catx(',',macrolist,cats('&','paralist',listno));
      call symputx('paralist',macrolist);
    run;
    %put Paralist=%superq(ParaList);
    Paralist=&paralist1,&paralist2,&paralist3,&paralist4,&paralist5
    %put &=Paralist;

Any assistance that explains how I can properly split up a long list of IDs into several macro variables will be much appreciated.

glcoolj12 · 10-11-2017

Thank you to everyone for their very helpful assistance!

glcoolj12 · 10-11-2017

Hello, I'm working with four data sets (6-year, 12-year, 18-year and 24-year measurements). Within each data set, every observation is unique to a person. In each of these data sets, there are 13 measures of interest (height, weight, pulse....) that were measured 3 times (height1, height2, height3). There are cases where an individual only had 1 or 2 measurements (e.g. height1=48, height2=48.2, height3= . ). I'm looking to find the average measurement in cases where the individual had 2 or more measurements (for all 13 variables of interest).

My base code works (see below), but I'd like to create a macro so I don't have 26 IF/DO statements x 4 data sets. Any suggestions on how to shorten the code? Any help is greatly appreciated!

    %LET WT= WEIGHT;
    %LET HT= HEIGHT;
    . . . .
    DATA Y;
      SET X;
      IF &WT.1 NE . AND &WT.2 NE . AND &WT.3 NE . THEN DO;
        AVG&WT. = (&WT.1+&WT.2+&WT.3)/3;
      END;
      IF &WT.1 NE . AND &WT.2 NE . AND &WT.3 = . THEN DO;
        AVG&WT. = (&WT.1+&WT.2)/2;
      END;
      IF &HT.1 NE........
    RUN;

glcoolj12 · 08-03-2017

Thank you everyone for your help! I restructured using the method @ballardw suggested and was successful. I'm very grateful for all your advice.

glcoolj12 · 08-02-2017

When I run the code you suggested, I get an error stating "Ambiguous reference, column "ID" is in more than one table".

glcoolj12 · 08-01-2017

I like this route, but I have several thousand IDs (that are 8 digits long); moreover, ID is a character variable in my data.

glcoolj12 · 08-01-2017

Hello, I'm looking to fully join 7 datasets (set1-set7) together on the variable 'ID' to create a master file that contains one record for each ID (no duplicate IDs). The issue is that not all IDs are in every dataset (e.g. ID="333" may only be in set3 and set6). I've been joining the sets piece by piece (see SAS code), but I was curious if anyone knows a way to consolidate my code into one PROC SQL statement to achieve what I'm trying to do. I've attempted this myself, but I get the SAS NOTE ("The execution of this query involves performing one or more Cartesian product joins that can not be optimized."). Thanks!

    proc sql;
      create table test1 as
      select distinct coalesce(a.ID, b.ID) as ID, a.var1, b.var2
        from (select ID, var1 from set1) as a
        full join (select ID, var2 from set2) as b
          on a.ID=b.ID;
    quit;

    proc sql;
      create table test2 as
      select distinct coalesce(a.ID, b.ID) as ID, a.var1, a.var2, b.var3
        from (select ID, var1, var2 from test1) as a
        full join (select ID, var3 from set3) as b
          on a.ID=b.ID;
    quit;

    set 4... set 5... set 6... set 7...

glcoolj12 · 07-31-2017

This fixed the issue - thank you so much!

glcoolj12 · 07-31-2017

I have several datasets (.sas7bdat files in libname 'dataset') that I'm trying to read into my program. I only want to keep variables that contain 'ID' in the name. I have code that does what I want it to do, but I want to put it in a macro. When I run the code in the macro I created, I have an issue at the PROC SQL/%PUT part of my code and I get a warning that says, "No rows selected.......no KEEP variables found, statement ignored." Thus, it reads the datasets in but doesn't keep the variables I want. Any suggestions? Thank you.

    %macro datain(old,new,final);
      data &new;
        set dataset.&old;
      run;
      proc sql noprint;
        select trim(compress(name)) into :keep_vars separated by ' '
          from dictionary.columns
          where libname = upcase('work')
            and memname = upcase('&new.')
            and upcase(name) like '%ID%';
      quit;
      %put &keep_vars.;
      data &final;
        set &new;
        keep &keep_vars.;
      run;
    %mend datain;

glcoolj12 · 07-28-2017

Thank you all for your help. I was able to piece together a program that works. Thanks!!

    %macro k;
      %put &keep_vars.;
    %mend k;

    %macro datain (old, new);
      data &new;
        set dataset.&old;
      run;
      proc sql noprint;
        select trim(compress(name)) into :keep_vars separated by ' '
          from sashelp.vcolumn
          where libname = upcase('work')
            and memname = upcase('&new.')
            and upcase(name) like '%PID%';
      quit;
      %k;
      data &new;
        set &new;
        keep &keep_vars.;
      run;
    %mend;

glcoolj12 · 07-28-2017

Yes. 'dataset' is the libname I assigned to the folder that holds all 50+ datasets (.sas7bdat)

glcoolj12 · 07-28-2017

Hello, I've created a very basic macro to bring in several datasets. I only want to keep variables in the datasets that contain the word "PID" (it's typically the last 3 characters in the variable name but that's not always the case). I created the following macro, but I get an error saying that "variable VARNAME is not on file". Any help would be greatly appreciated, thanks!

    %macro datain (old,new);
      data &new;
        set dataset.&old;
        where VARNAME contains 'PID';
      run;
    %mend;

glcoolj12 · 03-15-2017

Thank you very much for your help!

glcoolj12 · 03-15-2017

I have an .XLSX document with two sheets - sheet1 and sheet2. I saved both sheets as .CSV files. I'm using SAS 9.4. I'm having an issue importing the sheet1.csv file into SAS. The import itself works, but it's cutting off responses for one variable. For example, a response for CONDITION that is "to treat allergies" in the .csv file will be input as "to treat aller". I've attached the code that the SAS import wizard produced when importing this file.

I don't have this issue when I import the sheet2.csv file - the responses for CONDITION do not get cut off. For example, "cough and to control asthma" is input as "cough and to control asthma". Both .csv files come from the same document, so I don't know why the responses are getting cut off for CONDITION in sheet1. Any help would be much appreciated!

    PROC IMPORT out=work.sheet1
      datafile = "path\sheet1.csv"
      DBMS=CSV REPLACE;
      GETNAMES=YES;
      DATAROW=2;
    RUN;

    PROC IMPORT out=work.sheet2
      datafile = "path\sheet2.csv"
      DBMS=CSV REPLACE;
      GETNAMES=YES;
      DATAROW=2;
    RUN;

glcoolj12 · 03-07-2017

Yes, that worked, thank you! I appreciate all of your help.

Online Status	Offline
Date Last Visited	03-21-2019 02:41 PM

Macro length of the value of the macro variable exceeds maximum length

Re: Macro/Array for calculating averages on multiple, repeated variabl...

Macro/Array for calculating averages on multiple, repeated variables

Re: FULL JOIN with Multiple Datasets in ID variable (one record per ID...

Re: FULL JOIN with Multiple Datasets in ID variable (one record per ID...

Re: FULL JOIN with Multiple Datasets in ID variable (one record per ID...

FULL JOIN with Multiple Datasets in ID variable (one record per ID)

Re: Macro for Keeping variable names containing specific string

Macro for Keeping variable names containing specific string

Re: Issues with VARNAME when bringing in data

Re: Removing a suffix from a variable name

Re: Data Transformation - long to wide - multiple ID variables

Re: Data Transformation - long to wide - multiple ID variables

Re: Identifying and deleting duplicates by multiple variables

Re: Issues with VARNAME when bringing in data

Re: Issues with VARNAME when bringing in data

Issues with VARNAME when bringing in data

Re: CSV to SAS import is trimming variable responses

CSV to SAS import is trimming variable responses

Re: Data Transformation - long to wide - multiple ID variables