DATA Step, Macro, Functions and more

How would you transfer the data into long form?

Accepted Solution Solved
Reply
Super Contributor
Posts: 1,636
Accepted Solution

How would you transfer the data into long form?

Hi Experts,

If you were asked to transfer the attached file into a long form, how would you do it?

there are 18 subjects in the attached file and each subject has three rows of information. There should be 114 observations and 4 variables(NEWID,AAA,BBB,CCC) in the final long form dataset.

NEWID should be 1-18.

Thanks - Linlin

Attachment

Accepted Solutions
Solution
‎01-30-2013 08:59 PM
Respected Advisor
Posts: 4,920

Re: How would you transfer the data into long form?

Finally got around doing it :

data have;
array x{16}; /* Increase if needed */
infile "&sasforum.\datasets\sheet555.csv" dsd missover;
input type $ (x{*}) (??);
type = substr(type,1,3);
if not missing(type);
if upcase(type) = "AAA" then newid + 1;
run;

proc transpose data=have out=want(drop=_name_ where=(n(AAA,BBB,CCC)>0));
by newid;
var x:;
id type;
run;

PG

PG

View solution in original post


All Replies
Contributor
Posts: 22

Re: How would you transfer the data into long form?


Hey Linlin,

This is the way I'd feel most comfortable reading in the data. However, I get a weird note on the first record. See what you think:

data long_form (keep=ID AAA BBB CCC);
     infile "Q:\SAS Community\sheet555.csv" end=done;

     length ID $4 AAA BBB CCC 8.;
     do until (done=1);
          line+1;
          input #line @1;
          A_Rec = _infile_;

          line+1;
          input #line @1;
          B_Rec = _infile_;

          line+1;
          input #line @1;
          C_Rec = _infile_;


          ID=scan(scan(A_rec,1,","),2," ");
          do i=2 to countw(A_Rec,",");
               AAA=scan(A_rec,i,",");
               BBB=scan(B_rec,i,",");
               CCC=scan(C_rec,i,",");
               if not missing(AAA) then output;
          end;

          line+1;
end;
run;

There are many other techniques, but using line pointers in this way is how I'd do it.

Hope this helps,

Huey

Super User
Posts: 19,775

Re: How would you transfer the data into long form?

I don't like mine solution...its too hardcoded and I'm not sure its right, but its where I usually start.

data want;

    set test;

    retain newid 0;

    check=substr(var1, 1, 3);

    if check='AAA' then newid+1;

    array var(*) var2--var9;

    if check='AAA' then do;

        do i=1 to dim(var);

        AAA=var(i);

        output;

        end;

    end;

    else if check='BBB' then do;

        do i=1 to dim(var);

            BBB=var(i);

            output;

        end;

    end;

    else if check='CCC' then do;

        do i=1 to dim(var);

            CCC=var(i);

            output;

        end;

    end;

run;

data want2;

    merge want (where = (check='AAA') keep =newid check i aaa)

            want (where=(check='BBB') keep = newid check i bbb)

            want (where=(check='CCC') keep=newid check  i ccc);

    by newid i;

    keep newid AAA BBB CCC;

run;

Contributor
Posts: 22

Re: How would you transfer the data into long form?

So it looks like you read the csv file in a previous step and then porcess the resultant dataset two more times to get what you want.

Did my method work for you?

Super Contributor
Posts: 1,636

Re: How would you transfer the data into long form?

Hi Huey and Reeza,

Thank you for your inputs! I used proc import. there is an error message in the log file but the outcome is fine.

data WORK.temp    ;

     infile 'c:\temp\forum\sheet555.csv' delimiter = ','

MISSOVER DSD lrecl=32767 ;

        informat VAR1 $7. ;

        informat VAR2 best32. ;

         informat VAR3 best32. ;

         informat VAR4 best32. ;

         informat VAR5 best32. ;

         informat VAR6 best32. ;

         informat VAR7 best32. ;

        informat VAR8 best32. ;

         informat VAR9 best32. ;

         format VAR1 $7. ;

        format VAR2 best12. ;

         format VAR3 best12. ;

        format VAR4 best12. ;

        format VAR5 best12. ;

        format VAR6 best12. ;

        format VAR7 best12. ;

        format VAR8 best12. ;

         format VAR9 best12. ;

      input

                  VAR1 $

                  VAR2

                 VAR3

                  VAR4

                  VAR5

                  VAR6

                  VAR7

                 VAR8

                  VAR9

      ;

  

     run;

data wide (drop=var1);

set temp;

newid=round((_n_+1)/4,1);

if missing(var1) then delete;

id=upcase(scan(var1,1));

run;

proc transpose data=wide out=wanted(drop=_Smiley Happy;

   by newid;

   id id;

   var var2-var9;

run;

data long;

   set wanted(where=(aaa ne .));

run;

/********* log file ********/

1    data WORK.temp    ;

2         infile 'c:\temp\forum\sheet555.csv' delimiter = ','

3    MISSOVER DSD lrecl=32767 ;

4            informat VAR1 $7. ;

5            informat VAR2 best32. ;

6             informat VAR3 best32. ;

7             informat VAR4 best32. ;

8             informat VAR5 best32. ;

9             informat VAR6 best32. ;

10            informat VAR7 best32. ;

11           informat VAR8 best32. ;

12            informat VAR9 best32. ;

13            format VAR1 $7. ;

14           format VAR2 best12. ;

15            format VAR3 best12. ;

16           format VAR4 best12. ;

17           format VAR5 best12. ;

18           format VAR6 best12. ;

19           format VAR7 best12. ;

20           format VAR8 best12. ;

21            format VAR9 best12. ;

22         input

23                     VAR1 $

24                     VAR2

25                    VAR3

26                     VAR4

27                     VAR5

28                     VAR6

29                     VAR7

30                    VAR8

31                     VAR9

32         ;

33

34        run;

NOTE: The infile 'c:\temp\forum\sheet555.csv' is:

      Filename=c:\temp\forum\sheet555.csv,

      RECFM=V,LRECL=32767,File Size (bytes)=2051,

      Last Modified=29Jan2013:15:53:30,

      Create Time=29Jan2013:18:00:47

NOTE: Invalid data for VAR4 in line 62 16-16.

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7---

62        BBB,-4.6,-18.1,X,-4.5,1.3,-9.6,-9.3,-7 38

VAR1=BBB VAR2=-4.6 VAR3=-18.1 VAR4=. VAR5=-4.5 VAR6=1.3 VAR7=-9.6 VAR8=-9.3 VAR9=-7

_ERROR_=1 _N_=62

NOTE: 71 records were read from the infile 'c:\temp\forum\sheet555.csv'.

      The minimum record length was 8.

      The maximum record length was 53.

NOTE: The data set WORK.TEMP has 71 observations and 9 variables.

NOTE: DATA statement used (Total process time):

      real time           0.29 seconds

      cpu time            0.03 seconds

35   data wide (drop=var1);

36   set temp;

37   newid=round((_n_+1)/4,1);

38   if missing(var1) then delete;

39   id=upcase(scan(var1,1));

40   run;

NOTE: There were 71 observations read from the data set WORK.TEMP.

NOTE: The data set WORK.WIDE has 54 observations and 10 variables.

NOTE: DATA statement used (Total process time):

      real time           0.12 seconds

      cpu time            0.04 seconds

41

42   proc transpose data=wide out=wanted(drop=_Smiley Happy;

43      by newid;

44      id id;

45      var var2-var9;

46   run;

NOTE: There were 54 observations read from the data set WORK.WIDE.

NOTE: The data set WORK.WANTED has 144 observations and 4 variables.

NOTE: PROCEDURE TRANSPOSE used (Total process time):

      real time           0.22 seconds

      cpu time            0.00 seconds

47   data long;

48      set wanted(where=(aaa ne .));

49   run;

NOTE: There were 114 observations read from the data set WORK.WANTED.

      WHERE aaa not = .;

NOTE: The data set WORK.LONG has 114 observations and 4 variables.

NOTE: DATA statement used (Total process time):

      real time           0.01 seconds

      cpu time            0.00 seconds

Contributor
Posts: 22

Re: How would you transfer the data into long form?

Yeah, that 'X' messed up my input method as well. Looks like the data contain "X's" for missing data?

Solution
‎01-30-2013 08:59 PM
Respected Advisor
Posts: 4,920

Re: How would you transfer the data into long form?

Finally got around doing it :

data have;
array x{16}; /* Increase if needed */
infile "&sasforum.\datasets\sheet555.csv" dsd missover;
input type $ (x{*}) (??);
type = substr(type,1,3);
if not missing(type);
if upcase(type) = "AAA" then newid + 1;
run;

proc transpose data=have out=want(drop=_name_ where=(n(AAA,BBB,CCC)>0));
by newid;
var x:;
id type;
run;

PG

PG
Super Contributor
Posts: 1,636

Re: How would you transfer the data into long form?

Hi PG,

Thank you very much!!!  I am glad I posted the question. Would you please explain more about the double question marks?

input type $ (x{*}) (??);


Linlin

Respected Advisor
Posts: 4,920

Re: How would you transfer the data into long form?

Hi!

From the doc :

The ? modifier suppresses the invalid data message. The ?? modifier also

suppresses the invalid data message and, in addition, prevents the automatic

variable _ERROR_ from being set to 1 when invalid data are read.

PG


PG
Super Contributor
Posts: 1,636

Re: How would you transfer the data into long form?

Thank you PG! I wonder what caused the invalid message, Maybe some invisible characters in the file?

Respected Advisor
Posts: 4,920

Re: How would you transfer the data into long form?

As hdodson mentioned, there is a letter X on line 62 of the csv file where there should be a number.

PG
Super Contributor
Posts: 1,636

Re: How would you transfer the data into long form?

Thank you PG! I did not notice the "X".

Good night! - Linlin

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 440 views
  • 7 likes
  • 4 in conversation