BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Gick
Pyrite | Level 9
data mytable;
input name $ 1_1 2_1 3_1 4_1 5_1 6_1 7_1 8_1 9_1 10_1 11_1 12_1 13_1 14_1 15_1 16_1 17_1 18 19 20_1 21_1 22_1 23_1 24_1 25_1 26;
datalines;
John 0 1 3 5 7 2 4 7 6 8 3 4 5 8 9 10 3 4 5 1 3 4 6 32 1 
Mary 32 3 5 7 2 4 7 6 8 3 4 39 8 9 10 39 8 9 10 39 8 9 10
Peter 9 10 3 4 5 1 3 4 41 
Sarah 28 8 3 4 5 8 0 1 3 5 7 2 
Chris 19 9 10 3 4 5 1 0 1 3 5 7 2 3
Jane 36 32 3 5 7 2 4 7 6 8 5 1 3 4 6 32
Bob 47 0 1 4 7 11 20 
Alice 22 
David 39 8 9 10 3 4 5 4 5 8 0 1 3  8 5 1 3 4 6 32 
;
run; 

Hi,

I want to create variables such as:

"0_4ans_H"n=sum(of "1_1"n-"4_1"n); "

5_14ans_H"n=sum(of "5_1"n-"14_1"n);

"15_24ans_H"n=sum(of "15_1"n-"24_1"n);

Here is my code:

/*Age classes: Male*/

array var_0_4 {"0_1"n:"4_1"n} _numeric_;

"0_4ans_H"n=sum(of _numeric_var_0_4 [*]);

"5_14ans_H"n=sum(of "5_1"n-"14_1"n);

"15_24ans_H"n=sum(of "15_1"n-"24_1"n);

 

Note that the table columns are named 1, 2, 3, 4, 1_1, 2_1, 3_1, 4_1, etc.

I just took part of the table in occurrence from 1_.... For part 1, 2, 3, 4, etc. I was able to calculate the range sum by this formula: "0_4ans_F"n=sum(of "0"n-"4"n);

Which works but with 1_1, 2_1, etc. I can't find a solution.

 

Can anyone help me please. A SAS/SQL code proposal will be welcome.

 

THANKS

Gick

1 ACCEPTED SOLUTION

Accepted Solutions
yabwon
Onyx | Level 15

1) use the Maxim 1 and got to Documentation: Name Range Lists

 

2) do it like this:

 

options validvarname=any;
data mytable;
infile cards missover;
input name $ "1_1"n "2_1"n "3_1"n "4_1"n "5_1"n "6_1"n "7_1"n "8_1"n "9_1"n "10_1"n "11_1"n "12_1"n "13_1"n "14_1"n "15_1"n "16_1"n "17_1"n "18"n "19"n "20_1"n "21_1"n "22_1"n "23_1"n "24_1"n "25_1"n "26_1"n
;
datalines;
John 0 1 3 5 7 2 4 7 6 8 3 4 5 8 9 10 3 4 5 1 3 4 6 32 1 
Mary 32 3 5 7 2 4 7 6 8 3 4 39 8 9 10 39 8 9 10 39 8 9 10
Peter 9 10 3 4 5 1 3 4 41 
Sarah 28 8 3 4 5 8 0 1 3 5 7 2 
Chris 19 9 10 3 4 5 1 0 1 3 5 7 2 3
Jane 36 32 3 5 7 2 4 7 6 8 5 1 3 4 6 32
Bob 47 0 1 4 7 11 20 
Alice 22 
David 39 8 9 10 3 4 5 4 5 8 0 1 3  8 5 1 3 4 6 32 
;
run; 
proc print;
run;

data want;
set mytable;
"0_4ans_H"n = sum(of "1_1"n -numeric- "4_1"n);

"5_14ans_H"n = sum(of "5_1"n -numeric- "14_1"n);

"15_24ans_H"n = sum(of "15_1"n -numeric- "24_1"n);
run;
proc print;
run;

 

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



View solution in original post

6 REPLIES 6
SASKiwi
PROC Star

SAS variable names cannot start with a number so you will have to add at least one alpha character on the front, like V (for variable) or other character of your choosing.

 

EDIT: Setting the SAS option VALIDVARNAME to ANY will allow you specify the names you want as long as you specify all such names in quotes with a trailing letter N:

options validvarname = any;
data test;
  "1_1"n = 0;
run;

Your naming convention will require a LOT of extra typing however.

Gick
Pyrite | Level 9
Of course. I use it. But to calculate the sum as specified above does not work.

The variables are indeed "1_1"n , "2_1"n. I can directly calculate the sum by doing "1_1"n+ "2_1"n+... but it will be too cumbersome because I have a lot of variables. I want to use the ranges by doing "1_1"n-"4_1"n for example. But it does not work.

Does anyone have a suggestion
Best regards,
Gick
ballardw
Super User

I suspect any problem that some attempts to solve with arrays is going to drive them nuts with the amount of code required to accomplish similar actions in SQL. SQL does not allow any list arguments.

 

Also you have a bit of misunderstanding on how you define an array. The subscript portion, that controls the values of of the index of the array must be literal integer values, not variable names.

So in the following statement the highlighted text is an error.

array var_0_4 {"0_1"n:"4_1"n} _numeric_;

Second when you provide the _numeric_ then you have told SAS to use ALL of the numeric variables in the data set.

The proper array statement fro that attempt would be:

Array var_0_4 (*)   "1_1"n   "2_1"n   "3_1"n   "4_1"n ;

The use of variable names with the same suffix (the ending _1) means that you cannot use any of the list building options that would be be possible using a numeric suffix that changed like A1 - A4 where variable lists are allowed.

 

From the example data shown I see no reason to even use the 1_1 2_1 etc names. It might make sense if you also had 1_2 1_3 but without those why not use just 1 and 2?????

 

And this makes no sense at all: "

Note that the table columns are named 1, 2, 3, 4, 1_1, 2_1, 3_1, 4_1, etc.

I just took part of the table in occurrence from 1_.... For part 1, 2, 3, 4, etc. I was able to calculate the range sum by this formula: "0_4ans_F"n=sum(of "0"n-"4"n); "

Your example data does not show any "columns name 1 2 3 or 4. So where did they come from.

Your data step has other issues in that you are reading way more values than appear in the datalines which causes multiple errors.

 

IF you have variables that should be linked conceptually as a group then typically name the "group" with common prefix and then differentiate the members with a suffix: V1 V2 V3, A1 A2 A3, not 1_v 2_v 3_v or 1_a 2_a 3_a.

Then you can use lists like V1-V3 directly without having to create arrays (unless there are other things that need arrays)

data example ;
   input v1 - v3  a1-a4;
   v1_v3 = sum(of v1-v3);
   a1_a4 = sum(of a1-a4);
datalines;
1 2 3 44 55 66 77
;

As a restart I would go back to when you brought this data into SAS. I suspect probable use of proc import with a poor data layout. But if you set:

Options validvarname=V7; before that step at least it won't create all of those terribly verbose name literals. Better would be to write a data step to read the data in sensible form.

 

If this "

/*Age classes: Male*/

array var_0_4 {"0_1"n:"4_1"n} _numeric_;"

implies that certain columns are supposed to relate only to male subjects, with a different set of variables implied for females then your data is terribly screwy and probably needs to be completely restructured.

yabwon
Onyx | Level 15

1) use the Maxim 1 and got to Documentation: Name Range Lists

 

2) do it like this:

 

options validvarname=any;
data mytable;
infile cards missover;
input name $ "1_1"n "2_1"n "3_1"n "4_1"n "5_1"n "6_1"n "7_1"n "8_1"n "9_1"n "10_1"n "11_1"n "12_1"n "13_1"n "14_1"n "15_1"n "16_1"n "17_1"n "18"n "19"n "20_1"n "21_1"n "22_1"n "23_1"n "24_1"n "25_1"n "26_1"n
;
datalines;
John 0 1 3 5 7 2 4 7 6 8 3 4 5 8 9 10 3 4 5 1 3 4 6 32 1 
Mary 32 3 5 7 2 4 7 6 8 3 4 39 8 9 10 39 8 9 10 39 8 9 10
Peter 9 10 3 4 5 1 3 4 41 
Sarah 28 8 3 4 5 8 0 1 3 5 7 2 
Chris 19 9 10 3 4 5 1 0 1 3 5 7 2 3
Jane 36 32 3 5 7 2 4 7 6 8 5 1 3 4 6 32
Bob 47 0 1 4 7 11 20 
Alice 22 
David 39 8 9 10 3 4 5 4 5 8 0 1 3  8 5 1 3 4 6 32 
;
run; 
proc print;
run;

data want;
set mytable;
"0_4ans_H"n = sum(of "1_1"n -numeric- "4_1"n);

"5_14ans_H"n = sum(of "5_1"n -numeric- "14_1"n);

"15_24ans_H"n = sum(of "15_1"n -numeric- "24_1"n);
run;
proc print;
run;

 

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Gick
Pyrite | Level 9
It's amazing.

THANKS.
PaigeMiller
Diamond | Level 26

The idea of arranging data in this fashion seems to be a poor idea. Almost all SAS PROCs expect data to be arranged vertically, not horizontally. Yes, you can do additional programming to make it work horizontally, but its much much easier with vertical data sets, which is why I keep stealing Maxim 19

 

Maxim 19

Long beats wide.

(Don't keep data in structure)

In the world of spreadsheets, people tend to line up data side-by-side, and put data items (dates, categories, …) into column headers. This runs counter to all the methods available in SAS for group processing, and makes programming difficult, as one has variable column names and has to resort to creating dynamic code (with macros and/or call execute) where such is not necessary at all if categories were represented in their own column and data aligned vertically.
There are times where a wide format is needed, eg when preparing data for regression analysis. But for the processing and storing of data, long formats are always to be preferred.

Dynamic variable names force unnecessary dynamic code.

 

 

data have;
     input name $ time value;
     datalines;
John 1 0
John 2 1
John 3 3
...
David 16 32
;

 

From here, the programming is a lot easier than working with variable names like '4_1'n, using PROCs like PROC SUMMARY, and maybe even SQL works a lot easier. And while you have marked an answer as "Correct", I think the whole approach is a poor way to do this.

--
Paige Miller

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1019 views
  • 4 likes
  • 5 in conversation