05-20-2008, 07:07 PM
James Giles
Re: encoding="utf-8" and formatted direct access

Thomas Koenig wrote:
> On 2008-05-19, James Giles <jamesgiles@worldnet.att.net> wrote:
>
>> In fact there's probably not *any* language with built-in support for
>> direct I/O style access to records made up of utf-8 characters. At
>> least none that will do any better than merely accommodating 4*N
>> byte lengths where N is the max number of characters your records
>> might have.

>
> The conclusion appears to be that the combination of direct access
> formatted with utf-8 is fundamentally broken. If that's the case, should
> the Fortran standard address this issue by prohibiting the
> combination? I don't see any such wording in the F2003 draft.


The standard permits implementations not to support features, or
combinations of features, that the implementor finds too difficult.
That provision is in section 1 of the document and is specifically
reiterated in the I/O section.

I don't see anything in the combination of UTF-8 and direct I/O
that is actually undoable. And encouraging implementations to
find ways of doing it seems appropriate. So I don't think there's
any need for a specific prohibition.

To be sure, most implementations will probably have a lot of
"internal fragmentation" in such files.
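
As a rough sketch of where that fragmentation comes from (in Python
rather than Fortran, purely to show the byte layout; MAX_CHARS,
pack_record, read_record and wasted_bytes are names made up for this
illustration, not anything from the standard or from a real
implementation): if every record is sized for the 4*N worst case,
record k can be found by simple arithmetic, and whatever the text
doesn't use is padding.

```python
MAX_CHARS = 8                  # N: assumed maximum characters per record
RECORD_BYTES = 4 * MAX_CHARS   # worst case: each character needs 4 bytes in UTF-8


def pack_record(text: str) -> bytes:
    """Encode text as UTF-8 and blank-pad it to the fixed record length."""
    if len(text) > MAX_CHARS:
        raise ValueError("record holds at most MAX_CHARS characters")
    data = text.encode("utf-8")
    return data + b" " * (RECORD_BYTES - len(data))


def read_record(file_image: bytes, recno: int) -> str:
    """Fetch record recno (1-based) by computing its byte offset directly."""
    start = (recno - 1) * RECORD_BYTES
    record = file_image[start:start + RECORD_BYTES]
    # Trailing blanks are treated as padding, so blanks that were part of
    # the original text are lost; acceptable for a sketch.
    return record.decode("utf-8").rstrip(" ")


def wasted_bytes(text: str) -> int:
    """Padding bytes (the internal fragmentation) in one record."""
    return RECORD_BYTES - len(text.encode("utf-8"))
```

With N = 8 each record occupies 32 bytes, so a three-character ASCII
record leaves 29 of them as padding. A smarter implementation could
presumably do better, for instance with an index of record offsets,
but that is speculation on my part and not something the standard
describes.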

--
J. Giles

"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare

