I'm working on a project using a text widget. I've noticed that after
saving the data from the text widget, the data file format is exactly as
it was displayed in the text widget, so TABs and CRLFs end up taking a
huge amount of space. My data file looks like...

Some text                  tabbed to some column

several CRLFs down the page

So the data file ends up taking 10 lines of data in this example. Is there
any way to "compress" this for saving and "decompress" for loading,
retaining the original tabbing & CRLFs without using so much space in the
data file?

Thanks!

GG

On Feb 9, 9:18 am, GizmoGorilla <gizmogori...@hotmail.com> wrote:
> I'm working on a project using a text widget. I've noticed that
> after saving the data from the text widget that the data file
> format is exactly as it was displayed in the text widget. So
> TABS and CRLFs end up taking a huge amount of space. My data
> file looks like...
>
> Some text                  tabbed to some column
>
> several CRLF's down the page
>
> So the data file ends up taking 10 lines of data in this
> example. Is there any way to "compress" this for saving
> and "decompress" for loading, retaining the original
> tabbing & CRLF's without using so much space in the
> data file??
>
> Thanks!
>
> GG

I'm missing something. Why is this a problem? Each TAB is one
character, and each newline is one character (two characters on the
disk, on Windows). How much more can you compress it?

You could choose to replace each tab and each CRLF with some other
printable string on output, e.g., "\t" and "\n", using [string map],
and convert them back on input; but that's not going to decrease the
actual number of bytes.

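A minimal sketch of the [string map] conversion described in the reply above. The proc names escapeWhitespace and unescapeWhitespace are hypothetical, made up for this illustration; only [string map] itself comes from the post. Escaping the backslash as well is an added assumption, so that text which already contains a literal "\t" survives the round trip.

```tcl
# Hypothetical helpers; only [string map] is taken from the thread.
proc escapeWhitespace {text} {
    # Backslashes are escaped too (an assumption) so decoding is unambiguous.
    return [string map [list "\\" "\\\\" "\t" "\\t" "\n" "\\n"] $text]
}

proc unescapeWhitespace {text} {
    # The doubled backslash is listed first so that it wins over "\t" and "\n"
    # when both could match at the same position.
    return [string map [list "\\\\" "\\" "\\t" "\t" "\\n" "\n"] $text]
}

set raw "some text\ttab to here\n\n"
set encoded [escapeWhitespace $raw]

puts "raw:     [string length $raw] characters"
puts "encoded: [string length $encoded] characters"
puts "round trip ok: [expr {[unescapeWhitespace $encoded] eq $raw}]"
```

The encoded form keeps every record on one physical line, which can be convenient in a flat file, but each tab or newline now costs two characters instead of one, so it does not shrink the data.
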
On 2010-02-09 1:30 PM, Will Duquette wrote:
[8<]
> I'm missing something. Why is this a problem? Each TAB is one
> character, and each newline is one character (two characters on the
> disk, on Windows). How much more can you compress it?
>
> You could choose to replace each tab and each CRLF with some other
> printable string on output, e.g., "\t" and "\n", using [string map],
> and convert them back on input; but that's not going to decrease the
> actual number of bytes.

The tab is inserting blanks up to the column that was tabbed to.
CRLFs will leave a blank line if there's no text on the line.
That's fine for the user but not for the data file.
If I CRLF 5 times with no text on the line, I get 5 blank lines
in my data file. This is a huge waste of space; that's why it's a
problem. Data is being saved exactly as it is viewed in the text
widget, WYSIWYG. So perhaps when I save, for example...

(user's view)
some text                            tab to here
\n
\n
\n
\n

...it could be saved as...

(data view)
some text [5x\t] tab to here
[4x\n]

and then expanded on a load. This is what I mean by compressing:
removing the redundant data to save space.

This would remove the spaces inserted by the tabs, and remove
the blank lines, so my data now uses 2 lines, not 5.

I was hoping there was an easier way to do this, other than
converting during save/load.

I hope that clarifies my post...

Thanks!

On Feb 9, 1:26 pm, GizmoGorilla <gizmogori...@hotmail.com> wrote:
>
> The tab is inserting blanks up to the column that was tabbed to.

... so, you don't actually have tabs in the files, you have blocks of
consecutive spaces. Make sure when you describe a problem that you
describe it accurately.

> CRLF's will leave a blank line if there's no text on the line.
> That's fine for the user but not for the data file.
> If I crlf 5 times with no text on the line, I get 5 blank lines,
> in my data file. This is a huge waste of space,

Five *bytes* is a huge waste of space? *Bytes*? Even if you're talking
about one thousand blank lines, that's still only 1K. That's a
minuscule amount of disk space in this day and age, probably smaller
than the disk block size. Are you on a system that is seriously
constrained in space?

If you're concerned about disk space you should consider using a
standard compression scheme. Just run your data through zip/gzip and
that way any other wasteful characters will also get compressed. Of
course, you're doing this all to the inconvenience of the user, who
will no longer be able to edit your text files with anything but your
tool (or will be forced to unzip the file before editing it with some
other tool).

I don't mean to sound preachy, but it sounds like you're making a
beginner's mistake of premature optimization. Don't worry about
compression until it actually proves to be a problem. Otherwise you'll
spend way too much time on something that just doesn't matter.

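A minimal sketch of the "standard compression scheme" suggestion, assuming Tcl 8.6 or later, where the built-in zlib command is available (older interpreters would need an extension or an external gzip). The proc names packComment and unpackComment and the sample comment are hypothetical.

```tcl
package require Tcl 8.6

# Hypothetical helpers built on the zlib command that ships with Tcl 8.6.
proc packComment {text} {
    # zlib works on bytes, so encode the text as UTF-8 first.
    return [zlib gzip [encoding convertto utf-8 $text]]
}

proc unpackComment {bytes} {
    return [encoding convertfrom utf-8 [zlib gunzip $bytes]]
}

# A sample comment padded with spaces and trailing blank lines.
set pad [string repeat " " 30]
set comment "some text${pad}tab to here\n\n\n\n"

set packed [packComment $comment]
puts "raw:  [string length [encoding convertto utf-8 $comment]] bytes"
puts "gzip: [string length $packed] bytes"
puts "round trip ok: [expr {[unpackComment $packed] eq $comment}]"
```

The gzip container adds a fixed header and trailer of roughly 18 bytes, so a short comment field can come out larger than the original; compressing a whole data file at once is more likely to pay off than compressing each record. The packed result is binary, so any channel it is written to would need `-translation binary`.
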
On Feb 9, 11:26 am, GizmoGorilla <gizmogori...@hotmail.com> wrote:
[8<]
> This would remove the spaces inserted by the tabs, and remove
> the blank lines, So my data now uses 2 lines, not 5.
>
> I was hoping there was an easier way to do this, other than
> converting during save/load.
>
> I hope that clarifies my post...
>
> Thanks!

No, converting during save/load is the way to do it.

But why does the number of lines in the data file matter? You say
it's a waste of space; but in what sense is the space being wasted?
Are you simply concerned about the amount of screen space the record
consumes when you view the file in an editor? Unless you're dealing
with truly vast amounts of data, you're not going to save enough
bytes to make the disk space consumption matter one way or another.

Will

On 2010-02-09 3:55 PM, Bryan Oakley wrote:
[8<]
> I don't mean to sound preachy, but it sounds like you're making a
> beginners mistake of premature optimization. Don't worry about
> compression until it actually proves to be a problem. Otherwise you'll
> spend way too much time on something that just doesn't matter.

...comments appreciated, and I will try to be more precise next
time...

Yes, you're right that in this day space is cheap & abundant,
but that is not always the case. I'm developing this software
to be run on an old, slow, networked Solaris server where space
is not as abundant and CPU speed is not blazing. This I have
no control over. The text widget will be used to link a
user comment to a record in a db, of which there may be tens
of thousands of records, so I'm trying to be efficient given my
constraints. I will take the posted comments into
consideration. Thanks.

Norm [GG]

On Feb 10, 4:49 am, GizmoGorilla <gizmogori...@hotmail.com> wrote:
> On 2010-02-09 3:55 PM, Bryan Oakley wrote:
>
> [8<]
>
> > I don't mean to sound preachy, but it sounds like you're making a
> > beginners mistake of premature optimization. Don't worry about
> > compression until it actually proves to be a problem. Otherwise you'll
> > spend way too much time on something that just doesn't matter.
>
> ...comments appreciated, and I will try to be more precise next
> time...
>
> Yes you're right that in this day, space is cheap & abundant,
> but that is not always the case. I'm developing this software
> to be run on an old, slow, Solaris networked server where space
> is not as abundant and cpu speed is not blazing. This, I have
> no control over. The text widget will be used to link a
> user comment to a record in a db, of which there may be 10's
> of 000's of records so Im trying to be efficient given my
> constraints. I will take the posted comments into
> consideration. Thanks.
>
> Norm [GG]

OK; but man, it's been a long time since I've worried about the space
taken up by *text*.

What kind of database are you using?

Will

[8<]
> OK; but man, it's been a long time since I've worried about the space
> taken up by *text*.
>
> What kind of database are you using?

Everything is custom made here using a simple flat file system.
SQL would be nicer but this is what I have to work with. The DB
is used to track about 1000 students per semester for equipment
assignment and account related issues. Each student could
potentially have a couple dozen entries in a semester, each
entry with a comment field...

Thanks again for your input.

Norm

GizmoGorilla wrote:
> widget, WYSIWYG. So perhaps when I save, for example...
>
> (users view)
> some text                            tab to here
> \n
> \n
> \n
> \n
>
> ...it could be saved as...
>
> (data view)
> some text [5x\t] tab to here
> [4x\n]

> This would remove the spaces inserted by the tabs, and remove
> the blank lines, So my data now uses 2 lines, not 5.

Do you realize by doing that you are actually increasing your file size
and not compressing it in any way?

    > string length "\n\n\n\n"
    4
    > string length "\[5x\\n\]"
    6

Especially with tabs, where you are more likely to have one here and
there, or 2-3 in consecutive order, you are increasing the size of the
file, not to mention the extra processing that you must do at
load/unload. Plus how many 1x commands are you planning to define?

I would suggest either trimming tabs/newlines away if they are
multiples, or leaving things as is.

DrS

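A rough sketch of both points in the reply above. The proc names markerLength and squeezeWhitespace are hypothetical. The first measures the length of a "[Nx\n]" style marker for a run of N newlines; the second is one possible reading of "trim tabs/newlines away if they are multiples", collapsing runs of blanks or newlines to a single character. Treating runs of spaces the same as runs of tabs is an assumption, and the squeeze is lossy: the original layout cannot be restored on load.

```tcl
# Hypothetical helper: length of a "[<count>x\n]" marker, i.e. the two
# brackets, the digits of the count, "x", a backslash, and "n".
proc markerLength {count} {
    return [string length "\[${count}x\\n\]"]
}

# Hypothetical helper: lossy squeeze of repeated whitespace.
proc squeezeWhitespace {text} {
    # Two or more consecutive spaces/tabs become one space.
    regsub -all {[ \t]{2,}} $text " " text
    # Two or more consecutive newlines become one newline.
    regsub -all {\n{2,}} $text "\n" text
    return $text
}

foreach n {2 4 6 7 20} {
    puts "run of $n newlines -> marker of [markerLength $n] characters"
}

set sample "some text\t\t\ttab to here\n\n\n\nnext line"
set squeezed [squeezeWhitespace $sample]
puts "before: [string length $sample] characters"
puts "after:  [string length $squeezed] characters"
```

With a single-digit count the marker is always 6 characters, so it only beats the raw run once the run is longer than 6; for the short runs typical of a comment field the marker scheme makes the file bigger, which is the point made above.
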
On 10.02.11 7:07, drscrypt@gmail.com wrote:
> Do you realize by doing that you are actually increasing your file size
> and not compressing it in any way?
[8<]
>
> I would suggest to either trim tabs/newlines away if they are multiples,
> or leave things as is.

Yeah, I do realize that, and in retrospect, a little more thought
should have been exercised prior to my posting.

I will leave things as they are for now. Once the software goes
live I'll see how it performs...

Thanks for the feedback...

Norm