About river1

river1 · ‎03-31-2024

Hello, could you please direct me to where I can find more information about the data used to create the maps.nz dataset? I cannot find what level of data the ID variable represents, or even the level of the long/lat values. The IDs don't appear to be domicile codes. My data is grouped by domicile code or by DHB, which is the level I am after. Thank you so much.

river1 · ‎03-17-2024

Thank you, this is really helpful and much neater

river1 · ‎03-16-2024

Thank you for your reply, I'd be grateful for an example of the ODS OUTPUT and how to combine the output - it would be helpful to be able to output to Excel in a format in which I can manipulate the data. I've used the 'ods excel file= ...' but it outputs everything into separate tabs, plus there are merged cells and the percentages are output in the same cell as the frequency, so it is difficult to copy and paste quickly.

My dataset has around 50 variables and 60,000 observations. Mostly binary variables and a couple of categorical variables. I've made an example of the data and an example of one of my tables where I am pasting out the frequencies, I hope that is okay?

1. Example data:

ID2 ID EXP1 OUT1 OUT2 OUT3 OUT4 VAR1 VAR2 VAR3 VAR4 VAR5 VAR6 VAR7 VAR8 VAR9 VAR10 VAR11 VAR12 VAR13 VAR14 VAR15 VAR16 VAR17 VAR18 VAR19 CITY_NAME
A1 123 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 NEW_YORK
A2 124 0 0 1 0 1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 LONDON
A3 125 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 DEHLI
A4 126 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 KATHMANDU
A5 127 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 EDINBURGH
A6 128 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 TORONTO
A7 129 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 TOKYO
A8 130 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 NEW_YORK
A9 131 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 LONDON
A10 132 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 DEHLI
A11 133 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 KATHMANDU
A12 134 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 EDINBURGH
A13 135 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 TORONTO
A14 134 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 EDINBURGH
A15 135 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 TORONTO
A16 125 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 DEHLI
A17 126 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 KATHMANDU
A18 127 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 EDINBURGH
A19 128 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 TORONTO
A20 129 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 TOKYO
A21 130 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 NEW_YORK
A22 131 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 LONDON
A23 132 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 DEHLI

2. Example of a table I am creating with the output:

Label | EXP1 (0) | % | OUT1 (0) | % | OUT1 (2) | % | EXP1 (1) | % | OUT1 (0) | % | OUT1 (1) | % |
VAR1*EXP1*COF_2 DETAILS1
VAR1*EXP1*COF_2 DETAILS2
VAR1*EXP1*COF_2 DETAILS3 0

3. Example code to produce some frequency tables:

proc freq data=MYLIB.DATA1;
    tables VAR1*EXP1*OUT1 / chisq outpct nocum norow nocol;
    tables VAR2*EXP1*OUT1 / chisq outpct nocum norow nocol;
    tables VAR3*EXP1*OUT1 / chisq outpct nocum norow nocol;
    tables VAR4*EXP1*OUT1 / chisq outpct nocum norow nocol;
    tables VAR5*EXP1*OUT1 / chisq outpct nocum norow nocol;
    tables VAR6*EXP1*OUT1 / chisq outpct nocum norow nocol;
    tables VAR7*EXP1*OUT1 / chisq outpct nocum norow nocol;
    tables VAR8*EXP1*OUT1 / chisq outpct nocum norow nocol;
run;
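For readers following this thread, here is a minimal sketch of what the ODS OUTPUT route asked about above might look like. It is only an illustration under assumptions: the WORK.DEMO data, the output data set names and the workbook name are all made up, and CrossTabFreqs and ChiSq are the ODS table names PROC FREQ uses for its crosstab and chi-square output. Because both TABLES requests sit in one PROC FREQ step, each ODS table should be stacked into a single data set, with a Table variable identifying the request.

```sas
/* Hypothetical demo data: variable names echo the post, values are random. */
data work.demo;
    call streaminit(2024);
    do ID = 1 to 200;
        EXP1 = rand('bernoulli', 0.4);
        OUT1 = rand('bernoulli', 0.3);
        VAR1 = rand('bernoulli', 0.5);
        VAR2 = rand('bernoulli', 0.5);
        output;
    end;
run;

/* Capture the crosstab and chi-square tables as data sets. */
ods output CrossTabFreqs=work.all_freqs ChiSq=work.all_chisq;
proc freq data=work.demo;
    tables VAR1*EXP1*OUT1 / chisq;
    tables VAR2*EXP1*OUT1 / chisq;
run;

/* Print each combined data set to its own sheet: plain rows and columns. */
/* The workbook is written to the WORK directory here; change as needed.  */
ods excel file="%sysfunc(pathname(work))/freq_combined.xlsx"
    options(sheet_name="Frequencies");
proc print data=work.all_freqs noobs;
run;
ods excel options(sheet_name="ChiSq");
proc print data=work.all_chisq noobs;
run;
ods excel close;
```

In the resulting data sets the frequency and each percentage are separate columns, so they can be filtered or pivoted in Excel rather than copied cell by cell.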

river1 · ‎03-15-2024

Hello, I am trying to output a lot of data and present it in a spreadsheet. I am creating a table with the odds ratios, p values etc. for around 23 variables. Plus I am running it several times with different outcome variables. The problem I have is that it is tedious copying the data out. With the variables that have multiple levels I can use Excel to manipulate them, but for the single variables it is easier to copy and paste. SAS seems to output everything with merged cells. Even when I export to a spreadsheet every section is on a separate tab. It is the same problem with the percentage being in the same cell as the frequency. Very frustrating when you need to rerun the data with fresh data.

Is there a way to output everything to a single table so it can be more easily manipulated? I've given an example of the code I am using but I also have a lot of frequency tables. Many thanks for your time.

ods excel file="C:\RESULTS_D_CROSSTAB_OUTCOME1.xlsx"
    options(sheet_name="Sheet1" sheet_interval='none');

/*xxxxxxxxxxxxxxxxxxxxxxxx*/
title "ALL DATA - OUTCOME1";
/*xxxxxxxxxxxxxxxxxxxxxxxx*/

%macro iterate_variables;
    %local letters i;
    %let letters = VAR1...VAR22 ;
    %do i=1 %to 23;
        %let VAR1 = %scan(&letters, &i);

        title "PROC FREQ - OUTCOME1*EXP1* &VAR1 ";
        proc freq data=MYLIB.DATA1;
            tables &VAR1.*OUTCOME1*EXP1/ relrisk plots(only)=relriskplot(stats) cmh ;
            tables OUTCOME1*EXP1 / chisq oddsratio;
        run;

        title "PROC FREQ - OUTCOME1 - &VAR1 ";
        proc freq data=MYLIB.DATA1;
            tables &VAR1.*OUTCOME1/ relrisk plots(only)=relriskplot(stats) cmh ;
            tables OUTCOME1 / chisq oddsratio;
        run;

        title "PROC LOGISTIC - OUTCOME1 - &VAR1 ";
        proc logistic DATA=MYLIB.DATA1;
            model OUTCOME1(Event = '1')= &VAR1 / expb LINK=LOGIT;
        run;

        title "PROC LOGISTIC - OUTCOME1 - EXP1 &VAR1 ";
        proc logistic DATA=MYLIB.DATA1;
            model OUTCOME1(Event = '1')= EXP1 &VAR1 / expb LINK=LOGIT;
        run;
    %end;
%mend iterate_variables;
%iterate_variables;

river1 · ‎01-29-2024

Thanks for your reply, that is helpful. I have a lot of data and perhaps there is a simpler way to deal with this. I will now combine everything into a single dataset rather than the four datasets I had intended; this will reduce the duplication and make the analysis easier. There are also two smaller datasets with ethnicity/gender/age group variables, and they all have the linking variable VAR2.

I have only been able to use the macro above on the 18 single keywords. I have several other variables that are a combination of keywords (or excluding keywords), so I am extracting these separately. The main dataset (MYLIB2.DATA1) has about 3.5 million rows and it is down to 8 variables, but if I were to also merge it with the outputs from the hash objects and the two ethnicity datasets (using the linking variable VAR2) it would have 60+ variables, which would be large.

I’ve not been able to make the hash object add the output to a variable in a single dataset, so the best I could do is below. The final merge statement will be a bit messy; if there is an easier way to do this it would be helpful. But I will no longer need the "if a and b" statement. This was causing a warning, which I think was because of the multiple observations with the same VAR2 in MYLIB2.DATA1.

data TEST.B1 ;
    if _n_= 1 then do;
        declare hash CS (dataset:"MYLIB2.DATA1(where=( find(cats(VAR1),'XXX', 'I') AND find(cats(VAR1),'AAA', 'I') ))");
        CS.defineKey('VAR2');
        CS.defineDone();
    end;
    set MYLIB.DATA1;
    if CS.check()=0;
run;

proc sql;
    select count(distinct VAR2) as distinct_B1_XA from TEST.B1;
quit;

data TEST.B1_1;
    set TEST.B1;
    XA = 1;
run;

data TEST.B2 (compress=yes);
    if _n_= 1 then do;
        dcl hash CS(dataset:"MYLIB.DATA1(where=( find(cats(VAR1),'AAA', 'i') OR find(cats(VAR1),'BBB', 'i') ))");
        CS.defineKey('VAR2');
        CS.defineDone();
    end;
    set MYLIB.DATA1;
    if CS.check()=0;
run;

proc sql;
    select count(distinct VAR2) as distinct_B2 from TEST.B2;
quit;

data TEST.B2_1;
    set TEST.B2;
    AB = 1;
run;

proc sort data=TEST.B2_1 out=TEST_B2_1; by VAR2; run;
proc sort data=TEST.B1_1 out=TEST_B1_1 ; by VAR2; run;
proc sort data=MYLIB.DATA1 out= MYLIB_DATA1; by VAR2; run;

data test.F1;
    merge TEST_B2_1 (in=a) TEST_B1_1 (in=b) MYLIB_DATA1 (in=c);
    by VAR2;
    output;
run;

proc delete data=TEST.B1 TEST.B1_1 TEST_B1_1 TEST.B2 TEST.B2_1 TEST_B2_1 MYLIB_DATA1;
run;
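One way the flags might be added in a single pass, instead of building and merging several intermediate datasets, is to declare one hash object per keyword rule and assign each flag from its check() result. This is a sketch only, under assumptions: WORK.DATA1 and its rows are made up for illustration, and here the flags come out as 1/0 rather than 1/missing as in the merge above.

```sas
/* Hypothetical sample data: VAR2 is the linking key, VAR1 holds the text. */
data work.data1;
    length VAR2 8 VAR1 $40;
    VAR2 = 1; VAR1 = 'xxx and aaa together'; output;
    VAR2 = 1; VAR1 = 'unrelated text';       output;
    VAR2 = 2; VAR1 = 'only bbb here';        output;
    VAR2 = 3; VAR1 = 'nothing of interest';  output;
run;

data work.flagged;
    if _n_ = 1 then do;
        /* Keys whose text contains both XXX and AAA */
        dcl hash hxa(dataset:"work.data1(where=(find(cats(VAR1),'XXX','i') and find(cats(VAR1),'AAA','i')))");
        hxa.defineKey('VAR2');
        hxa.defineDone();
        /* Keys whose text contains AAA or BBB */
        dcl hash hab(dataset:"work.data1(where=(find(cats(VAR1),'AAA','i') or find(cats(VAR1),'BBB','i')))");
        hab.defineKey('VAR2');
        hab.defineDone();
    end;
    set work.data1;
    /* check() returns 0 when the key is found, so each flag is 1 or 0 */
    XA = (hxa.check() = 0);
    AB = (hab.check() = 0);
run;
```

Every row of the input is kept, so no sort or merge is needed afterwards; rows can be filtered later with a WHERE on XA or AB.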

river1 · ‎01-27-2024

Thanks for this, I'll change my code. Would it be possible to extend the macro to also iterate the same code and search for the keywords through multiple datasets? I am running this so many times, as I am also searching for different keywords in another variable plus different variations, and it would help to cut down on my pages of code. For example, to run the code through LIB2.DATA1 but also through LIB2.DATA2, LIB2.DATA3 and LIB2.DATA4. Many thanks

river1 · ‎01-27-2024

Hello, I have some code to search for keywords in multiple datasets that are created from about 40 merge statements. I've given an example of one of the merge statements and 8 keywords. I have been trying to get the macro to also iterate through three other datasets, for example, LIB2.DATA1, LIB2.DATA2, LIB2.DATA3 and LIB2.DATA4. Although it will take longer to run, I'm trying to simplify the output, but I've struggled adding in the different datasets. I'd really appreciate some ideas on how to do this. Many thanks

ods excel file="C:\SAS_output\output_A1.xlsx"
    options(sheet_name="Sheet1" sheet_interval='none');

%macro SINGLES4(keyword);
    data MYLIB1.A1_&keyword ;
        if _n_ = 1 then do;
            dcl hash h1(dataset:'LIB2.DATA1(where=(find(cats(VAR1), "&keyword", "i")))');
            h1.defineKey('VAR2');
            h1.defineDone();
        end;
        set LIB2.DATA1;
        if h1.check() = 0;
    run;

    title "next search";
    proc sort data=MYLIB1.A1_&keyword out=MYLIB1_A1_&keyword; by VAR2; run;
    proc sort data=MYLIB4.D4 out=MYLIB4_D4_&keyword; by VAR2; run;

    data MYLIB1.A2_&keyword;
        merge MYLIB1_A1_&keyword (in=a) MYLIB4_D4_&keyword (in=b);
        by VAR2;
        if a and b;
        output;
    run;

    proc sql;
        select count(distinct VAR2) as distinct_A2_&keyword from MYLIB1.A2_&keyword;
    quit;

    [merge and count statements repeated many times]
%mend;

%macro iterate_keywords;
    %do i=1 %to 8;
        %let keyword = %scan(ADF DF QW ER RT TY YU UI, &i);
        %SINGLES4(&keyword);
    %end;
%mend;
%iterate_keywords;
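One possible way to extend this to several datasets is to give the search macro a dataset parameter and wrap it in a second %do loop. The sketch below is only an illustration under assumptions: the macro names SEARCH_ONE and ITERATE_ALL, the WORK datasets and their rows are all hypothetical, and the datasets are passed as member names in a single library so the names can be reused in the output dataset name. The hash dataset string uses outer double quotes so that &dsn and &keyword are resolved by the macro processor.

```sas
/* Hypothetical sample data in WORK. */
data work.data1;
    length VAR2 8 VAR1 $40;
    VAR2 = 1; VAR1 = 'text with adf inside'; output;
    VAR2 = 1; VAR1 = 'other text';           output;
    VAR2 = 2; VAR1 = 'nothing here';         output;
run;

data work.data2;
    length VAR2 8 VAR1 $40;
    VAR2 = 7; VAR1 = 'QW at the start'; output;
    VAR2 = 8; VAR1 = 'nothing here';    output;
run;

/* Keep every row whose VAR2 has at least one row containing the keyword. */
%macro search_one(dsn, keyword);
    data work.hit_&dsn._&keyword;
        if _n_ = 1 then do;
            dcl hash h1(dataset:"work.&dsn(where=(find(cats(VAR1), '&keyword', 'i')))");
            h1.defineKey('VAR2');
            h1.defineDone();
        end;
        set work.&dsn;
        if h1.check() = 0;
    run;
%mend search_one;

/* Outer loop over datasets, inner loop over keywords. */
%macro iterate_all(datasets=, keywords=);
    %local d k dsn keyword;
    %do d = 1 %to %sysfunc(countw(&datasets, %str( )));
        %let dsn = %scan(&datasets, &d, %str( ));
        %do k = 1 %to %sysfunc(countw(&keywords, %str( )));
            %let keyword = %scan(&keywords, &k, %str( ));
            %search_one(&dsn, &keyword)
        %end;
    %end;
%mend iterate_all;

%iterate_all(datasets=data1 data2, keywords=ADF QW)
```

Using countw for the loop limits means the lists can grow or shrink without editing the %do bounds. The sort, merge and count steps from the original macro could be added inside search_one in the same way.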

river1 · ‎12-24-2023

Thanks for this, I used the hash object to create my intermediary datasets. I now see the problem was the use of the SQL. This makes sense.

river1 · ‎12-24-2023

Thanks, these were very helpful.

river1 · ‎12-24-2023

I’m retrieving data between a few datasets for multivariate data analysis. I’ve created a few variables which I am then using to compare age/gender/ethnicity and other variables. There isn’t a linking variable in all three datasets, so I have had to create intermediary variables and these have taken up a lot of storage. But for the counts I am running a few blocks and then deleting.

river1 · ‎12-24-2023

Thanks for your help. The compress is not working with the datasets created via proc sql. I tried COMPRESS=BINARY and the file size remained the same, with the same warning messages. No problem, I’ll just run fewer blocks of code each time and delete. I have several datasets that I am using and then the rest can be deleted. I have used compress=yes to create the earlier datasets with success, so this must be because I can only use it to create the initial SAS datasets.

Error message:
WARNING: The option COMPRESS is not valid in this context. Option ignored.
WARNING: Some options for file mylib.data1 were not processed because of errors or warnings noted above.

river1 · ‎12-18-2023

Hello, I'm having problems with disk space. I am running a lot of code that is linking three datasets, so I need to keep many of the datasets to refer back to. I delete wherever possible but they are huge files. I have used (compress=yes) wherever possible and I've dropped any variables I am not using. At the moment I am running this code over and over again to retrieve data using different variables in the two databases, but I'll need to work in a third next. In total it will be over 300 datasets for this segment, but I can only store about 10 before my 100GB of space is depleted.

I added the proc datasets block to compress the output but the file size is exactly the same. I've tried writing a programme to reduce the effort of changing the variables but I just ran out of space so I gave up. Are there any techniques that I can refer to, or do people just buy external storage? I am using SAS Enterprise Guide 8.3 Update 8 (8.3.8.206) (64-bit). Thank you for the advice

proc sql;
    create table MYLIB3.DATA3 as
    select * from MYLIB1.DATA1
    where ID in (select ID from MYLIB2.DATA2 where VAR_2="XXXX");
quit;

proc datasets library=MYLIB3;
    modify DATA3 (compress=yes);
run;

proc sql;
    select count(distinct ID) as DATA3 from MYLIB3.DATA3;
quit;
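A possible explanation for the unchanged file size, offered as a sketch rather than a confirmed diagnosis: COMPRESS= takes effect when a data set is written, so it may need to go on the table that PROC SQL creates rather than on a later MODIFY statement. The example below assumes made-up WORK.DATA1 and WORK.DATA2 tables in place of the real libraries; the COMPRESS= data set option and the COMPRESS system option are standard SAS.

```sas
/* Hypothetical sample data in WORK; names and values are placeholders. */
data work.data1;
    length ID 8 TEXT $200;
    do ID = 1 to 1000;
        TEXT = 'short value';
        output;
    end;
run;

data work.data2;
    length ID 8 VAR_2 $4;
    do ID = 1 to 1000 by 2;
        VAR_2 = 'XXXX';
        output;
    end;
run;

/* Option 1: request compression on the table as PROC SQL creates it. */
proc sql;
    create table work.data3 (compress=yes) as
    select *
    from work.data1
    where ID in (select ID from work.data2 where VAR_2 = 'XXXX');
quit;

/* Option 2: make compression the default for data sets created later */
/* in this session.                                                    */
options compress=yes;

/* The Compressed attribute in the output shows whether it was applied. */
proc contents data=work.data3;
run;
```

Compression tends to help most with long character variables that are mostly blank. For short or mostly numeric rows SAS may report that compression was disabled because it would increase the file size.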

river1 · ‎11-20-2023

Thank you. This looks much neater than what I ended up with. I ran the code three times: first to select the keyword of interest, then twice more to exclude the keywords I did not want, i.e.

data work.WANT;
    if _n_ = 1 then do;
        dcl hash h1(dataset:'WORK.HAVE(where=(find(cats(VAR_1), "KEYWORD_2", "i") > 0))');
        h1.defineKey('VAR_2');
        h1.defineDone();
    end;
    set WORK.HAVE;
    if h1.check() ne 0;
run;

river1 · ‎11-17-2023

Thanks, that works perfectly! It outputs the rows and includes the tilde characters. I've been trying to add a line to exclude keywords to reduce the dataset. I've tried the following but this is not working. Do you have suggestions for an alternative?

data work.new;
    if _n_=1 then do;
        dcl hash h1(dataset:'work.have(where=(find(cats(var_1),"keyword","i") and not find(cats(var_1), "keyword_2", "i")))');
        h1.defineKey('event_id');
        h1.defineDone();
    end;
    set work.have;
    if h1.check()=0;
run;

Example dataset

ID Num_1 Num_2 Char_1 Char_2 Char_3 Char_4 Char_5
145896555 19 25 A Text Text Text Text
145896555 19 25 B Text Text Text Text
145896555 19 25 B Text Text Text Text
145896555 19 25 B Text Text Text Text
145896555 19 25 B Keyword_1 Text Text Text
145896555 19 25 B Text Text Text Text
145896556 19 25 A Text Text Text Text
145896556 19 25 B Text Text Text Text
145896556 19 25 B Text Text Text Text
145896556 19 25 B Text Text Text Text
145896557 19 25 B Text Text Text Text
145896557 19 25 B Text Text Text Text
145896557 19 25 B Text Text Text Text
145896557 19 25 A Text Text Text Text
145896557 19 25 B Text Text Text Text
145896558 19 25 B Keyword_1 Text Text Text
145896558 19 25 B Keyword_2 Text Text Text
145896558 19 25 B Text Text Text Text
145896558 19 25 B Text Text Text Text
145896559 19 25 B Text Text Text Text
145896559 19 25 B Text Text Text Text
145896559 19 25 B Text Text Text Text
145896559 19 25 B Text Text Text Text

I'd like the output dataset to read:

145896555 19 25 A Text Text Text Text
145896555 19 25 B Text Text Text Text
145896555 19 25 B Text Text Text Text
145896555 19 25 B Text Text Text Text
145896555 19 25 B Keyword_1 Text Text Text
145896555 19 25 B Text Text Text Text
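A possible reason the single WHERE clause does not exclude an ID such as 145896558 is that the clause is evaluated row by row: the Keyword_1 row for that ID passes the test on its own, so the ID is still loaded into the hash. One alternative, sketched here under assumptions, is to load a second hash with the IDs to exclude and test both. The variable names event_id and var_1 follow the code above, while the WORK.HAVE rows below are a cut-down, made-up version of the example data.

```sas
/* Hypothetical cut-down sample: ID 1 has keyword_1 only, ID 2 has both. */
data work.have;
    length event_id 8 var_1 $20;
    event_id = 1; var_1 = 'Text';      output;
    event_id = 1; var_1 = 'Keyword_1'; output;
    event_id = 2; var_1 = 'Keyword_1'; output;
    event_id = 2; var_1 = 'Keyword_2'; output;
    event_id = 3; var_1 = 'Text';      output;
run;

data work.new;
    if _n_ = 1 then do;
        /* IDs with at least one row containing the wanted keyword */
        dcl hash inc(dataset:'work.have(where=(find(cats(var_1), "keyword_1", "i")))');
        inc.defineKey('event_id');
        inc.defineDone();
        /* IDs with at least one row containing the unwanted keyword */
        dcl hash exc(dataset:'work.have(where=(find(cats(var_1), "keyword_2", "i")))');
        exc.defineKey('event_id');
        exc.defineDone();
    end;
    set work.have;
    /* Keep all rows for IDs found in inc and not found in exc */
    if inc.check() = 0 and exc.check() ne 0;
run;
```

With this sample only the rows for event_id 1 should remain, which mirrors the desired output above where only 145896555 is kept.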

river1 · ‎11-16-2023

Thanks, this code was great. But I still had the same problem as above: when there was a tilde ~ character at the start of the cell, the row was not output, i.e. a value such as '~keyword' was not matched. The tilde ~ characters are present in the want dataset but the rows are not appearing in the output3 dataset. I've tried searching for '~keyword' and I've tried removing the tilde character before running the SQL code, but this did not work either.

Data want2 ;
    set want;
    Code=compress(Code,'~');
run;

Thanks for your help!

Online Status	Offline
Date Last Visited	‎04-01-2024 05:27 AM

maps level of data

Re: output multiple outputs to single table

Re: output multiple outputs to single table

output multiple outputs to single table

Re: create a macro to iterate through multiple keywords and datasets

Re: create a macro to iterate through multiple keywords and datasets

create a macro to iterate through multiple keywords and datasets

Re: Disk space

Re: Disk space

Re: Disk space

Re: Disk space

Disk space

Re: Search for a keyword contained within text and output other rows t...

Re: Search for a keyword contained within text and output other rows t...

Re: Search for a keyword contained within text and output other rows t...