DATA Step, Macro, Functions and more

How to create info based on an unknown imported dataset (no clue of # and kind of variables)?

Reply
Contributor
Posts: 62

How to create info based on an unknown imported dataset (no clue of # and kind of variables)?

Hi,

I'm facing the following 'challenge': based on *any* imported dataset with an unknown number / set of variables (by proc import, or just simply dragged into Enterprise Guide) I want to create the following information among other things (as a report or output in a dataset):

1) The number of records filled and/or blanc (OBC_ID=_n_ creates a copy of obs, always filled, so that should be an easy step in the process)

2) Per variable (presented by variable name), the total sum of all records (in case of numeric field)

3) Per variable (presented by variable name), the ratio filled records vs. blanc records, so the % of the variable name that is blanc (for numeric and character variables)

The dataset vcolumn in SASHELP contains all variables of the (random) imported set, but not the info as needed unfortunately. For this assignment, it looks like I need to create a loop in which all unknown variables are assigned to a macro, and from there the code should read out all records of the variable, so the output can be generated? Should that be a good approach? If so, how can this be done?

For example, the output has to provide me with information as presented below (all variables of the imported dataset which contains unknown variables should be present, so the dummy dataset as presented as the desired output below does contain 7 variables):

***************************

Output, based on a random imported set:

Variable name | number of records in total dataset |    total sum of records    | % blanc (or % filled)

Video_ID      |                1600                |         1280800            |  100%

Video_ID2     |                800                 |         158766             |  50%

Name          |                1600                |         N/A (Char)         |  100%

Address       |                400                 |         N/A (Char)         |  25%

Amount        |                1600                |         5500               |  100%

Date_rent     |                1600                |    N/A (output=sum dates)  |  100%

Date_due      |                1600                |    N/A (output=sum dates)  |  100%

How can this be done...? There's no dataset in SASHELP with such info, if I'm correct.

Thanks for your input!!

Valued Guide
Posts: 3,208

Re: How to create info based on an unknown imported dataset (no clue of # and kind of variables)?

proc summary..... http://support.sas.com/resources/papers/proceedings10/026-2010.pdf

---->-- ja karman --<-----
Contributor
Posts: 62

Re: How to create info based on an unknown imported dataset (no clue of # and kind of variables)?

Yes, most likely that will be incorporated at the end of the code (or somtething similar). But for this, you *need* the variable names. And... these are unknown in 'my project'. So the question leaves us with: how to put in unknown (number of) variables into this PROC? Create a DO loop --> macro's?

Valued Guide
Posts: 3,208

Re: How to create info based on an unknown imported dataset (no clue of # and kind of variables)?

If you omit the var statement it will produce a report for all of them.

Base SAS(R) 9.2 Procedures Guide var statement proc summary

It is part of the concepts the variables are not necessary strictly defined. This is quite different is a lot of other languages.

SAS(R) 9.4 Language Reference: Concepts SAS Variable list. You can define ranges but also having special list like _all_ _numeric_ _character_

It is part of the SAS VA promotion this behavior is coming back at more parts. Just klik klik and a lot of programming decisions are automated behind screens.  

---->-- ja karman --<-----
Super User
Super User
Posts: 6,502

Re: How to create info based on an unknown imported dataset (no clue of # and kind of variables)?

Most SAS procedures have no need to know the variable names. 

And in many places you can substitute variables lists like _ALL_, _NUMERIC_ or _CHARACTER_.

Super User
Posts: 17,840

Re: How to create info based on an unknown imported dataset (no clue of # and kind of variables)?

There's at least 2 solutions on this forum, one from me and one from someone else that pretty much are the full solution to this question.

Some more searching will find you the full answer.

Some things to think about, SASHELP or the Dictionary tables don't have the full answer, but do they give you other things to get there, i.e. variable names? But as Tom's mentioned you don't even really need those Smiley Happy

Ask a Question
Discussion stats
  • 5 replies
  • 291 views
  • 0 likes
  • 4 in conversation