|
|||
|
Dorian,
Here's a modified version to handle the x=0 case (only works correctly if the x=0 is on the last time record for an ID. The out of memory problem can be circumvented (up to a point) with the MVARSIZE system option: options mvarsize=1000000; which increases the memory allocation for the maximum macro variable size. It's hard to say what would be most efficient here without running some tests. The relative number of time points versus ID's makes a difference on the best approach. HTH, Dennis Diskin options mvarsize=512000; proc sql noprint; select distinct trim(left(put(time,8.))) into :times separated by ',' from test order by time; %let ntimes = &sqlobs; %put × quit; proc sort data=test out=stest; by id descending time; run; data fills; array times(&ntimes) _temporary_ (×); set stest; by id; original_time=time; do i = 1 to dim(times) until(itime ne .); if time eq times(i) then itime = i; end; retain ntime; if first.id then if x eq 0 then ntime = itime; else ntime = dim(times); do i = itime to ntime; time = times(i); output; end; drop i itime ntime; run; proc sort data=fills; by id time original_time; run; data newtime; set fills(drop=original_time); by id time; if first.time; run; "Noel, Dorian" <D.Noel@ISMAcentre.reading.ac.uk> wrote: Dennis, Thanks for your assistance. Your code looks fine for a small dataset. Unfortunately, I am working with a huge dataset (>1,000,000 observations by 7 variables) where the number of distinct time can exceed 400,000. As results, I was not surprise that I encountered the following: ERROR: Overflow has occurred; evaluation is terminated. ERROR: Out of memory. The mcaro-variable (×)’s value exceeds the 64K set in SAS. In addition, the line do i = itime to dim(times) in your program means that each id will move forward as long there are time points beyond the last occurrence of the id. I forgot to mention that there is a natural break in the processing the data. I apologise for that oversight. If x = 0 for a particular id then the processing stop at that time (e.g. id 1004 at time 37105). Therefore, no record should be written for id 1004 beyond time 37105. data test; format time time8.; length id $ 4 x 8; input id $ time x; cards; 1004 36678 13 100B 36684 12 100C 36690 9 1004 36700 9 100G 36790 10 100F 36805 10 100B 37002 10 100C 37105 5 100F 37105 7 1004 37105 0 100H 37150 10 100H 37155 5 ; run; I need an efficient code in order to process the data further where I am require to perform group processing of all ids by another variable by time. Take care. Dorian -----Original Message----- From: Dennis Diskin [mailto:diskin@snet.net] Sent: 14 January 2005 02:53 To: Noel, Dorian; SAS-L@LISTSERV.UGA.EDU Subject: Re: Adding Records for Missing Time d, Afraid my original reply left out a little: proc sql noprint; select distinct time into :times separated by ',' from test order by time; %let ntimes = &sqlobs; quit; proc sort data=test out=stest; by id time; run; data fills; array times(&ntimes) _temporary_ (×); set stest; by id; original_time=time; do i = 1 to dim(times) until(itime ne .); if time eq times(i) then itime = i; end; do i = itime to dim(times); time = times(i); output; end; drop i itime; run; proc sort data=fills; by id time original_time; run; data newtime; set fills(drop=original_time); by id time; if first.time; run; Hope that works better, Dennis Diskin "d.noel@ismacentre.rdg.ac.uk" <d.noel@ISMACENTRE.RDG.AC.UK> wrote: SAS-L, I am working on a block of programs that is turning out to very inefficient.Consider the following sample dataset. data test; length id $4 time 8 x 8; input id time x; cards; 1004 36678 13 100B 36684 12 100C 36690 9 1004 36700 9 100G 36790 10 100F 36805 10 100B 37002 10 100C 37105 5 100F 37105 7 1004 37105 0 ; run; I want to add records for each id for missing time between non-missing time for that id. Moreover, I want the following expanded dataset: 1004 36678 13 100B 36684 12 1004 36684 13 100C 36690 9 1004 36690 13 100B 36690 12 1004 36700 9 100C 36700 9 100B 36700 12 100G 36790 10 1004 36790 9 100C 36790 9 100B 36790 12 100F 36805 10 100G 36805 10 1004 36805 9 100C 36805 9 100B 36805 12 100B 37002 10 100F 37002 10 100G 37002 10 1004 37002 9 100C 37002 9 100C 37105 5 100F 37105 7 100B 37105 10 100G 37105 10 1004 37105 0 I want to avoid run the foll. inefficient set of codes: proc sql noprint; create table times as select id, min(time) as min_time, max(time) as max_time from test group by id order by id; run; data times (keep = id time); set times; do time = min_time to max_time; output; end; run; proc sort data = test; by id time; run; proc sort data = times; by id time; run; data output; merge test times; by id time; retain temp; if x = . then x = temp; temp = x; run; proc sql; create table eventdt as select distinct time as dt from test order by time; create table output as select * from output where time in (select dt from eventdt) order by time; quit; Thanks in advance. Dorian |
|
|
||||
|
||||
|
|
![]() |
| Thread Tools | |
| Display Modes | |
|
|
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Re: SAS/Windows CPU Utilization does not increase beyond 50% | Ben Powell | Newsgroup comp.soft-sys.sas | 0 | 03-20-2009 10:28 AM |
| Call Execute usage/behavior on Macros | dc353@hotmail.com | Newsgroup comp.soft-sys.sas | 0 | 11-14-2008 08:02 PM |
| Re: Comparing groups at each time point with Proc mixed model | Robin R High | Newsgroup comp.soft-sys.sas | 0 | 07-30-2008 02:17 PM |
| Re: SUDOKU Solver Informal Comparisons | Talbot Michael Katz | Newsgroup comp.soft-sys.sas | 0 | 05-24-2006 03:21 PM |
| Re: sas Performance Enhancement | Paul M. Dorfman | Newsgroup comp.soft-sys.sas | 0 | 10-14-2005 01:54 AM |