
Need Help Constructing Code

New Contributor
Posts: 4


I have a small project that involves writing code to produce the counts listed below. Are crosstabs the best way to present these frequencies, or should I use PROC PRINT? I'm relatively new to SAS, and this project may seem trivial to advanced users. Please, no trolls; all input is welcome. Thank you, SAS community.


All counts and %s will be done by Facility and Facility/Program Type

- Count submissions with the same DOB, name, sex

- Count of Unknowns for assigned sex at birth

- Count and % of submissions age > 90 or < 2.

- Count and % of submissions with length of first name=2 or length of last name=2.

- Count and % of submissions with first name=(unknown, un, unk, qq, zz, or format of alpha, “.”) or last name=(unknown, un, unk, qq, zz, or format of alpha, “.”) [Note: check other non-alpha characters allowed by the application.]

- Count and % of submissions of questions with Unknown as a possible response by FUS (or Facility)

- Count and % of submissions with unknown month or day of birth.

- Count and % of submissions from children’s programs with age > 21. See list below. I may have listed a few that don’t have to do the PCS.

Super User
Posts: 6,751

Re: Need Help Constructing Code

Posted in reply to joeyyyyyg

Given your "newness", some of the responses you get may skip over questions that you really need to investigate.


PROC FREQ will answer most of these questions.  Some will require a combination of PROC FORMAT and PROC FREQ which is easy for advanced users, but might require some study on your part.
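To give a sense of what that combination looks like, here is a minimal sketch for the "age > 90 or < 2" item. The data set name WORK.SUBMISSIONS and the variables FACILITY, PROGRAM_TYPE and AGE are assumptions on my part, since I have not seen your data; substitute your own names.

```sas
/* Hypothetical names: WORK.SUBMISSIONS, FACILITY, PROGRAM_TYPE, AGE. */
/* The format groups AGE into the ranges of interest.                 */
proc format;
   value agechk
      .          = 'Missing'
      low -< 2   = 'Under 2'
      2 - 90     = '2 to 90'
      90 <- high = 'Over 90';
run;

/* PROC FREQ counts by the formatted values of AGE.                   */
/* Row percents give the % within each facility, and within each      */
/* facility/program type combination in the second table.             */
proc freq data=work.submissions;
   tables facility*age / missing nocol nopercent;
   tables facility*program_type*age / missing nocol nopercent;
   format age agechk.;
run;
```

The same pattern (define groups with PROC FORMAT, count them with PROC FREQ) should carry over to several of the other items, once you know how the values are actually stored.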


Where do you begin?  I would suggest looking at your data.


Do you already have a SAS data set? 


Does each "submission" represent a single observation in the SAS data set?


I would imagine that the answers are "yes" so far.


Do you know the names of the variables in the SAS data set?  (If you haven't done so already, run PROC CONTENTS on the data set.)
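A minimal sketch of that first look, assuming the data set is called WORK.SUBMISSIONS (a placeholder name):

```sas
/* WORK.SUBMISSIONS is a hypothetical data set name.                  */
/* PROC CONTENTS lists each variable's name, type, length and format. */
/* VARNUM lists them in data set order rather than alphabetically.    */
proc contents data=work.submissions varnum;
run;

/* Print the first 10 observations to see what the values look like.  */
proc print data=work.submissions (obs=10);
run;
```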


After that, explore the data.  You will need to know, for example, whether "unknown" is represented by a blank value, or by "unk" or by "UNK" (or by some other value).  A simple PROC FREQ (on GENDER) would tell you that. 
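For example, something along these lines would show every value that occurs, including blanks. This is only a sketch: it assumes GENDER is a character variable and that the data set is WORK.SUBMISSIONS with a FACILITY variable, all of which are guesses about your data.

```sas
/* Hypothetical names: WORK.SUBMISSIONS, GENDER, FACILITY.            */
/* MISSING makes blank values appear as their own row in the table.   */
proc freq data=work.submissions;
   tables gender / missing;
run;

/* If the same value appears in several spellings ("unk", "UNK"),     */
/* a standardized copy can be counted instead. Assumes GENDER is      */
/* a character variable.                                              */
data work.check;
   set work.submissions;
   gender_std = upcase(strip(gender));
run;

proc freq data=work.check;
   tables facility*gender_std / missing nocol nopercent;
run;
```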


You will need to know how the birth date values are stored ... as one numeric variable, as one character variable, as three variables ...
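The storage matters because age has to be computed from a numeric SAS date. As a rough sketch, assuming DOB is already a numeric SAS date and that there is a numeric SUBMIT_DATE to measure age against (both names are hypothetical):

```sas
/* Hypothetical names: WORK.SUBMISSIONS, DOB, SUBMIT_DATE.            */
/* Assumes both are numeric SAS date values.                          */
/* If the birth date is stored as three numeric variables, MDY()      */
/* can build a SAS date from them; if it is a character variable,     */
/* INPUT() with a matching informat is needed first.                  */
data work.ages;
   set work.submissions;
   /* Age in completed years; left missing if either date is missing. */
   if not missing(dob) and not missing(submit_date) then
      age = floor(yrdif(dob, submit_date, 'AGE'));
   /* 1 if age is over 90 or under 2, otherwise 0.                    */
   /* ". < age" keeps a missing age from counting as "under 2".       */
   age_out_of_range = (age > 90 or (. < age < 2));
run;
```

A missing AGE is deliberately not flagged as out of range here; whether it should be counted separately is a question for whoever defined the report.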


What is the definition of "non-alpha"?


What is the definition of a "children's program"?


All of this takes place before you begin to answer any of the questions that you originally posed.


Good luck.
