Discussion:
What is meant by 'canonical representation'
Petterson Mikael
2004-11-25 08:36:17 UTC
Hi,

I was reading the Sun Java API 1.4.2 documentation and found the following
for String.intern().

public String intern()

Returns a canonical representation for the string object.

What is a canonical representation? Any example?

//Mikael
VisionSet
2004-11-25 13:33:54 UTC
Post by Petterson Mikael
Hi,
I was reading the sun java api 1.4.2 and found the following for
String.intern().
public String intern()
Returns a canonical representation for the string object.
What is a canonical representation? Any example?
Canonical means reduced to its simplest form.
In this context it means intern() returns a reference to the single pooled
String object that represents that character sequence. That may be the
object you called it on, or an equal String that was already in the pool.
The process of interning means that the object is added to the internally
maintained collection of String objects, unless an equal String is already
there. String objects are immutable, so any two objects where
str1.equals(str2) == true are equivalent and are guaranteed to remain so.
Because of this it is reasonable, and can be more efficient, to allow any
newly created String to be GC'd and rely on the returned value of intern().
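
A minimal sketch of the point above. The class and method names are made up
for illustration; only String.intern(), equals() and the String constructor
come from the Java API.

```java
class InternIdentityDemo {
    // Returns true when two distinct-but-equal Strings intern to one object.
    static boolean internYieldsSameReference(String text) {
        // new String(...) always allocates a fresh object, so a and b differ.
        String a = new String(text);
        String b = new String(text);
        boolean distinctBefore = (a != b) && a.equals(b);

        // intern() returns the pooled representative for these characters.
        boolean sameAfter = a.intern() == b.intern();
        return distinctBefore && sameAfter;
    }

    public static void main(String[] args) {
        System.out.println(internYieldsSameReference("canonical")); // true
    }
}
```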

E.g. if you read a String from the standard input stream, it will be a new
String object, distinct from any equal String you already hold.

If you want to use it you can write

myInputString = myInputString.intern();

This guarantees that you do not hold on to extra copies of an identical
String; the copy you just read becomes eligible for garbage collection.
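
Roughly, the input case could look like the sketch below. It assumes a
Scanner over a fixed String as a stand-in for System.in so that it runs
without typing anything; readInterned is a hypothetical helper, not part of
any library.

```java
import java.util.Scanner;

class InternInputDemo {
    // Reads one line and returns the pooled String for its contents.
    static String readInterned(Scanner in) {
        return in.nextLine().intern();
    }

    public static void main(String[] args) {
        // Two independent "inputs" that happen to contain the same text.
        // Replace the String argument with System.in to read real input.
        String a = readInterned(new Scanner("quit\n"));
        String b = readInterned(new Scanner("quit\n"));

        System.out.println(a == b);      // true: both are the pooled object
        System.out.println(a == "quit"); // true: the literal is pooled as well
    }
}
```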

Note that interning is automatic for string literals and other compile-time
constant expressions, like so: String s = "xyz"; Every occurrence of the
literal "xyz", in any class, refers to the same pooled object.
Do not do String s = new String("xyz"); since that always creates a new
object which is not the pooled one.
Strings created at runtime (input, concatenation of non-constant values and
so on) are not auto interned.
See JLS section 3.10.5 (String Literals) for the exact mechanism.
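
The cases above can be checked with a few lines. This is only a sketch; the
class and method names are invented for the example, and == is used on
purpose to compare references rather than contents.

```java
class LiteralPoolDemo {
    // Two occurrences of the same literal refer to one pooled object.
    static boolean literalsShareOneObject() {
        String s = "xyz";
        String t = "xyz";
        return s == t;
    }

    // A constant expression is folded at compile time and pooled too.
    static boolean constantExpressionIsPooled() {
        String s = "xyz";
        String t = "x" + "yz";
        return s == t;
    }

    // new String(...) always creates a separate object.
    static boolean newStringIsPooled() {
        String s = "xyz";
        String t = new String("xyz");
        return s == t;
    }

    // 'part' is not final, so part + "yz" is evaluated at runtime.
    static boolean runtimeConcatIsPooled() {
        String s = "xyz";
        String part = "x";
        String t = part + "yz";
        return s == t;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());     // true
        System.out.println(constantExpressionIsPooled()); // true
        System.out.println(newStringIsPooled());          // false
        System.out.println(runtimeConcatIsPooled());      // false
    }
}
```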
--
Mike W
Chris Smith
2004-11-25 14:36:46 UTC
Post by VisionSet
Canonical means reduced to its simplest form.
It's more like "reduced to its representative form". The canonical
string is not necessarily simpler -- it's just the one that you always
get from intern(). Or, if you're a Latin buff, canonical roughly means
"from the list" -- the list being the Strings in the String pool --
which works, too.
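
To make "representative form" concrete, here is a hand-rolled pool. It is
an illustration only: SimplePool and canonical() are hypothetical names, and
this is not a claim about how the JVM implements its own String pool.

```java
import java.util.HashMap;
import java.util.Map;

class SimplePool {
    // Maps each distinct character sequence to the one object chosen to
    // represent it.
    private final Map<String, String> pool = new HashMap<>();

    // Returns the representative for s, registering s itself if no equal
    // String has been seen before.
    String canonical(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing == null ? s : existing;
    }

    public static void main(String[] args) {
        SimplePool pool = new SimplePool();
        String first = new String("abc");
        String second = new String("abc");

        System.out.println(pool.canonical(first) == first);  // true
        System.out.println(pool.canonical(second) == first); // true
    }
}
```

The first String to arrive becomes the canonical one; it is no simpler than
the second, it is just the one "from the list".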
Post by VisionSet
The process of interning means that the object is added to the internally
maintained collection of String objects, unless an equal String is already
there. String objects are immutable, so any two objects where
str1.equals(str2) == true are equivalent and are guaranteed to remain so.
Because of this it is reasonable, and can be more efficient, to allow any
newly created String to be GC'd and rely on the returned value of intern().
This really depends on how much comparing of the String you'll be doing.
Your post talks a lot about the cost of creating a new String, which is
O(n) with a fairly small constant on the number of characters in the
String, and not much about the cost of interning the String. That cost
depends on how the JVM implements the pool: probably about O(log m) on the
number of interned Strings in existence in the entire application for a
tree, or close to constant for a hash table, on top of the O(n) work of
hashing or comparing the characters. There is a definite tradeoff there,
and in my experience String interning is a very special-purpose technique.
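
A sketch of the kind of special-purpose use where the tradeoff can pay off:
intern each token once, then do many cheap reference comparisons. The class
and method names are invented for the example, and whether this is actually
faster than equals() in a given program is something to measure, not assume.

```java
class InternedCompareDemo {
    // Pays the pool-lookup cost once per token, up front.
    static String[] internAll(String[] tokens) {
        String[] out = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            out[i] = tokens[i].intern();
        }
        return out;
    }

    // Safe only because both sides are interned: == compares references.
    static int countIdentical(String[] internedTokens, String internedKey) {
        int count = 0;
        for (String token : internedTokens) {
            if (token == internedKey) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // new String(...) simulates tokens produced at runtime.
        String[] raw = { new String("if"), new String("else"), new String("if") };
        String[] tokens = internAll(raw);
        System.out.println(countIdentical(tokens, "if")); // 2
    }
}
```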
--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation