Re: Wide_[Wide_]Character

by Adam Beneschan <adam@[EMAIL PROTECTED] > Jul 22, 2008 at 12:18 PM

On Jul 12, 12:44 am, Dale Stanbrough <MrNoS...@[EMAIL PROTECTED]> wrote:
> Unicode can be represented using UTF-8, UTF-16 and UTF-32 (amongst
> others).
>
> I gather that Character is simply ISO-8859-1 (Latin-1).
>
> I suspect that Wide_Character is UCS-2 (simple 2 byte values, no escapes
> like UTF-16).
>
> Is Wide_Wide_Character
>
>    * UTF-16
>    * UTF-32 (i.e. UCS-4)
>    * System dependent
>    * Something else
>
> Thanks,
>
> Dale

I'm not convinced that the question makes sense.  Wide_Character
refers to an enumeration type with 2**16 values, where
Wide_Character'Val(N) denotes the corresponding character in the ISO
10646 Basic Multilingual Plane, i.e. the first 2**16 code positions of
Unicode.  Unicode is a *character* *set*, i.e. a definition of what
character corresponds to each integer; that by itself says nothing
about how characters are represented.  Wide_Wide_Character is
similarly an enumeration type, with 2**31 values covering the whole
ISO 10646 code space.
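
To make the "character set, not representation" point concrete, here
is a minimal sketch, assuming an Ada 2005 compiler.  The procedure name
and the two sample characters are my own choices for illustration.  It
only uses 'Val and 'Pos, which map between characters and code
positions and say nothing about bytes in memory or in a file.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Char_Positions is
   --  Greek capital omega (16#03A9#) lies in the Basic Multilingual
   --  Plane, so it is a value of Wide_Character.
   Omega : constant Wide_Character := Wide_Character'Val (16#03A9#);

   --  Musical symbol G clef (16#1D11E#) lies outside the BMP and can
   --  only be a Wide_Wide_Character.
   Clef : constant Wide_Wide_Character :=
     Wide_Wide_Character'Val (16#1D11E#);

   --  Illustrative helper: print a label followed by a code position.
   procedure Show (Label : String; Pos : Long_Long_Integer) is
   begin
      Put_Line (Label & Long_Long_Integer'Image (Pos));
   end Show;
begin
   Show ("omega position:", Wide_Character'Pos (Omega));
   Show ("clef position:", Wide_Wide_Character'Pos (Clef));

   --  The 'Last positions show the range each type covers.
   Show ("Wide_Character'Last position:",
         Wide_Character'Pos (Wide_Character'Last));
   Show ("Wide_Wide_Character'Last position:",
         Wide_Wide_Character'Pos (Wide_Wide_Character'Last));
end Char_Positions;
```

Nothing here depends on UTF-8, UTF-16 or UTF-32; those only come into
play when the characters leave the program.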

When a sequence of characters is represented in internal memory, it's
up to an implementation to decide how to represent each character in
memory.  But in most cases, it makes no sense to represent it as
anything other than a flat array.  Thus, a Wide_String would be, in
essence, an array of 16-bit integers, and a Wide_Wide_String would be
an array of 32-bit integers.  If it were represented otherwise, how
could a program access, say, S(1000) where S is declared as a
Wide_Wide_String(1..2000)?  If it were represented as, say, UTF-8 or
UTF-16, the program would have to start at the beginning of the string
and do an expensive search every time it wanted to access one
particular character of the string.  This would not make sense.  So I
think that any implementation would implement those character (and
string) types as an integer (or array of integers), with whatever
endianness is most convenient for that processor.
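
Here is a rough sketch of the difference, under my own assumptions:
Nth_Char_Start is a hypothetical helper written just for this post
(not a library routine), and it assumes well-formed UTF-8 stored in an
ordinary String.  With the flat array, S(1000) is a plain indexing
operation; with the encoded bytes, finding the N-th character means
walking the string from the start.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Flat_Vs_Encoded is

   --  Flat representation: one array component per character.
   S : Wide_Wide_String (1 .. 2000) := (others => ' ');

   --  Hypothetical helper: return the index of the first byte of the
   --  N-th character in UTF-8 data.  Assumes the input is well formed.
   function Nth_Char_Start (Bytes : String; N : Positive) return Positive is
      Count : Natural := 0;
   begin
      for I in Bytes'Range loop
         --  Bytes of the form 2#10xxxxxx# are continuation bytes;
         --  any other byte starts a new character.
         if Character'Pos (Bytes (I)) / 64 /= 2 then
            Count := Count + 1;
            if Count = N then
               return I;
            end if;
         end if;
      end loop;
      raise Constraint_Error with "fewer than N characters";
   end Nth_Char_Start;

   --  'a', e-acute (2 bytes), euro sign (3 bytes), 'z' as UTF-8 bytes.
   Encoded : constant String :=
     'a'
     & Character'Val (16#C3#) & Character'Val (16#A9#)
     & Character'Val (16#E2#) & Character'Val (16#82#)
     & Character'Val (16#AC#)
     & 'z';
begin
   --  Direct indexing: no scan needed.
   S (1000) := Wide_Wide_Character'Val (16#20AC#);
   Put_Line ("S (1000) has code position"
             & Integer'Image (Wide_Wide_Character'Pos (S (1000))));

   --  Linear scan: cost grows with the position being looked up.
   Put_Line ("4th character starts at byte"
             & Integer'Image (Nth_Char_Start (Encoded, 4)));
end Flat_Vs_Encoded;
```

The second lookup has to visit every byte before the one it wants,
which is exactly the cost a flat array avoids.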

When a sequence of characters is represented in a file (or is
communicated some other way e.g. over a socket), the characters may
well be encoded as UTF-8 or UTF-16 or something.  The language doesn't
define how different encodings are handled.  I believe GNAT uses the
"form" parameter when a file is opened or created to specify the
encoding; it supports a number of different possible encodings,
because different files that come from different places may be encoded
in different ways.  When a line is read from one of those files into
memory, though, I'm sure that the runtime will convert it to an
internal representation that is a flat array.
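
A small sketch of what I mean, assuming GNAT: the form string "WCEM=8"
is GNAT's way of selecting UTF-8 as the external encoding, and other
compilers may well reject it.  The file name and procedure name are
made up for the example.

```ada
with Ada.Text_IO;
with Ada.Wide_Wide_Text_IO; use Ada.Wide_Wide_Text_IO;

procedure Form_Demo is
   Name : constant String := "form_demo.txt";  --  hypothetical file name
   F    : File_Type;
   Line : Wide_Wide_String (1 .. 80);
   Last : Natural;
begin
   --  GNAT-specific: ask for UTF-8 as the encoding used in the file.
   Create (F, Out_File, Name, Form => "WCEM=8");
   Put_Line (F, "omega: " & Wide_Wide_Character'Val (16#03A9#));
   Close (F);

   --  Read the line back with the same form string.
   Open (F, In_File, Name, Form => "WCEM=8");
   Get_Line (F, Line, Last);
   Close (F);

   --  In memory the line is a flat array again, one component per
   --  character, so the last character is simply Line (Last).
   Ada.Text_IO.Put_Line ("characters read:" & Natural'Image (Last));
   Ada.Text_IO.Put_Line
     ("last code position:"
      & Integer'Image (Wide_Wide_Character'Pos (Line (Last))));
end Form_Demo;
```

The program never sees the UTF-8 bytes; the runtime does the decoding
on the way in and the encoding on the way out.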

I'm not sure if this tells you what you need to know or not; if not,
then if you tell us why you're asking the question (i.e. what you want
to accomplish), this will give us a better idea of what we need to
tell you.  If you're trying to do some sort of overlay, where you read
in raw bytes from a file and then use Unchecked_Conversion or
something to convert it to a Wide_Wide_String, or something of that
nature, my advice is: Just don't do that.

P.S. I know I'm coming in late to this thread---I just got back from
vacation.  If your question has already been answered, my apologies.

                                -- Adam
 



