the name "AtomicString"

# Darin Adler (2 days ago)

The name “AtomicString” was inspired by the term of art, “atom”, traditionally used in at least some programming language implementations for what I see now is often called interned strings. You’ll see a mention of that term in the article en.wikipedia.org/wiki/String_interning in the context of ML along with some other terms used for this such as “symbol”.

I’ve gotten used to the name AtomicString over the years, but I wouldn’t strongly object to changing it if other programmers are often confused by its similarity to the term “atomic operations”.

A mild objection I have to the term “interned string” is that the term “interned” is not really a normal English word; I wasn’t familiar with that jargon when we named AtomicString and I am still not entirely thrilled with it. I think that specific term comes from the LISP intern function and is familiar to programmers largely because of its use in Java, .NET, and some other modern programming languages and libraries; I had encountered the technique many times over the years without ever hearing the word “interning” and don’t find the jargon entirely logical.

Some people might suggest using the term “flyweight string” instead en.wikipedia.org/wiki/Flyweight_pattern and I’m not sure which I’d prefer. Maybe there’s another obvious name?

Apparently in Delphi’s DWScript they called it “unified string” www.delphitools.info/2013/06/17/string-unification but in the article I cited they are chided for not calling it an interned string.

— Darin

PS: I do love loudly declaiming, “Atomic string, the string of tomorrow!” but I can get over it.

# Michael Catanzaro (a day ago)

On Tue, Dec 18, 2018 at 9:31 PM, Darin Adler <darin at apple.com> wrote:

I’ve gotten used to the name AtomicString over the years, but I wouldn’t strongly object to changing it if other programmers are often confused by its similarity to the term “atomic operations”.

Well there were two other developers in the thread Ryosuke linked to who made the exact same mistake as me, so I do think the current name is problematic. A change wouldn't need to be drastic, though. I think suggestions from the old thread like "StringAtom" or "AtomString" would be unproblematic. The problem is the specific word "atomic" carries an expectation that the object be safe to access concurrently across threads without locks; I think that expectation doesn't exist if not for the "ic" at the end.

FWIW I've only ever heard the "interned string" terminology prior to now.

Michael

# Simon Fraser (a day ago)

On Dec 19, 2018, at 12:33 PM, Michael Catanzaro <mcatanzaro at igalia.com> wrote:

On Tue, Dec 18, 2018 at 9:31 PM, Darin Adler <darin at apple.com> wrote:

I’ve gotten used to the name AtomicString over the years, but I wouldn’t strongly object to changing it if other programmers are often confused by its similarity to the term “atomic operations”.

Well there were two other developers in the thread Ryosuke linked to who made the exact same mistake as me, so I do think the current name is problematic. A change wouldn't need to be drastic, though. I think suggestions from the old thread like "StringAtom" or "AtomString" would be unproblematic. The problem is the specific word "atomic" carries an expectation that the object be safe to access concurrently across threads without locks; I think that expectation doesn't exist if not for the "ic" at the end.

FWIW I've only ever heard the "interned string" terminology prior to now.

SingletonString? UniquedString?

Simon

# Maciej Stachowiak (a day ago)

On Dec 19, 2018, at 12:33 PM, Michael Catanzaro <mcatanzaro at igalia.com> wrote:

On Tue, Dec 18, 2018 at 9:31 PM, Darin Adler <darin at apple.com> wrote:

I’ve gotten used to the name AtomicString over the years, but I wouldn’t strongly object to changing it if other programmers are often confused by its similarity to the term “atomic operations”.

Well there were two other developers in the thread Ryosuke linked to who made the exact same mistake as me, so I do think the current name is problematic. A change wouldn't need to be drastic, though. I think suggestions from the old thread like "StringAtom" or "AtomString" would be unproblematic. The problem is the specific word "atomic" carries an expectation that the object be safe to access concurrently across threads without locks; I think that expectation doesn't exist if not for the "ic" at the end.

FWIW I've only ever heard the "interned string" terminology prior to now.

“Interned string” is really obvious to people who have seen it but mysterious to those who have not. Unfortunately, I don’t think there is a term for this concept that most programmers will already know. So the best we can do is a term that is clearly something you need to look up and can’t be confused for something else. Another term of art for this concept is Symbol. Unfortunately, search results for String, Atom and Atom String don’t give a helpful explanation, and Symbol could be confused with the JavaScript language’s notion of symbols. “Interned String” gives helpful search results, but is not the normal use of the terminology. Usually there is an operation called “intern”, or there might be a reference to “string interning”, but the operation usually gives back just a plain string, or a type with a name like “atom” or “symbol”.

On balance, if we change, I’d lean towards Atom, since it is concise, obviously mysterious if you don’t know what it is, and understandable to at least some people who have seen that name for the concept. The only downside is that it’s not obvious from the name that it is a kind of string.

# Darin Adler (a day ago)

Seems to me we could fix the current problem by renaming from AtomicString to AtomString without causing any new problem.

— Darin

# Ryosuke Niwa (18 hours ago)

On Wed, Dec 19, 2018 at 1:13 PM Simon Fraser <simon.fraser at apple.com> wrote:

On Dec 19, 2018, at 12:33 PM, Michael Catanzaro <mcatanzaro at igalia.com> wrote:

On Tue, Dec 18, 2018 at 9:31 PM, Darin Adler <darin at apple.com> wrote:

I’ve gotten used to the name AtomicString over the years, but I wouldn’t strongly object to changing it if other programmers are often confused by its similarity to the term “atomic operations”.

Well there were two other developers in the thread Ryosuke linked to who made the exact same mistake as me, so I do think the current name is problematic. A change wouldn't need to be drastic, though. I think suggestions from the old thread like "StringAtom" or "AtomString" would be unproblematic. The problem is the specific word "atomic" carries an expectation that the object be safe to access concurrently across threads without locks; I think that expectation doesn't exist if not for the "ic" at the end.

FWIW I've only ever heard the "interned string" terminology prior to now.

SingletonString? UniquedString?

I do like UniquedString. That conveys what AtomicString really is. SingletonString isn't so great since AtomicString table is still per thread.

# Maciej Stachowiak (17 hours ago)

On Dec 19, 2018, at 8:06 PM, Ryosuke Niwa <rniwa at webkit.org> wrote:

On Wed, Dec 19, 2018 at 1:13 PM Simon Fraser <simon.fraser at apple.com> wrote:

On Dec 19, 2018, at 12:33 PM, Michael Catanzaro <mcatanzaro at igalia.com> wrote:

On Tue, Dec 18, 2018 at 9:31 PM, Darin Adler <darin at apple.com> wrote:

I’ve gotten used to the name AtomicString over the years, but I wouldn’t strongly object to changing it if other programmers are often confused by its similarity to the term “atomic operations”.

Well there were two other developers in the thread Ryosuke linked to who made the exact same mistake as me, so I do think the current name is problematic. A change wouldn't need to be drastic, though. I think suggestions from the old thread like "StringAtom" or "AtomString" would be unproblematic. The problem is the specific word "atomic" carries an expectation that the object be safe to access concurrently across threads without locks; I think that expectation doesn't exist if not for the "ic" at the end.

FWIW I've only ever heard the "interned string" terminology prior to now.

SingletonString? UniquedString?

I do like UniquedString. That conveys what AtomicString really is. SingletonString isn't so great since AtomicString table is still per thread.

So hard to pronounce though! Why not UniqueString? It’s not quite as explicit but close enough.

# Chris Dumez (17 hours ago)

On Dec 19, 2018, at 9:17 PM, Maciej Stachowiak <mjs at apple.com> wrote:

On Dec 19, 2018, at 8:06 PM, Ryosuke Niwa <rniwa at webkit.org> wrote:

On Wed, Dec 19, 2018 at 1:13 PM Simon Fraser <simon.fraser at apple.com> wrote:

On Dec 19, 2018, at 12:33 PM, Michael Catanzaro <mcatanzaro at igalia.com> wrote:

On Tue, Dec 18, 2018 at 9:31 PM, Darin Adler <darin at apple.com> wrote:

I’ve gotten used to the name AtomicString over the years, but I wouldn’t strongly object to changing it if other programmers are often confused by its similarity to the term “atomic operations”.

Well there were two other developers in the thread Ryosuke linked to who made the exact same mistake as me, so I do think the current name is problematic. A change wouldn't need to be drastic, though. I think suggestions from the old thread like "StringAtom" or "AtomString" would be unproblematic. The problem is the specific word "atomic" carries an expectation that the object be safe to access concurrently across threads without locks; I think that expectation doesn't exist if not for the "ic" at the end.

FWIW I've only ever heard the "interned string" terminology prior to now.

SingletonString? UniquedString?

I do like UniquedString. That conveys what AtomicString really is. SingletonString isn't so great since AtomicString table is still per thread.

So hard to pronounce though! Why not UniqueString? It’s not quite as explicit but close enough.

Wouldn’t it be confusing to use UniqueString type for a string that is common in order to save memory?

Personally, I like the AtomString proposal as it is close to the naming we are used to and addresses the issue raised (atomic has a different meaning with threading). Also, I had never heard of interned strings before.

# Chris Dumez (17 hours ago)

On Dec 19, 2018, at 9:41 PM, Chris Dumez <cdumez at apple.com> wrote:

On Dec 19, 2018, at 9:17 PM, Maciej Stachowiak <mjs at apple.com> wrote:

On Dec 19, 2018, at 8:06 PM, Ryosuke Niwa <rniwa at webkit.org> wrote:

On Wed, Dec 19, 2018 at 1:13 PM Simon Fraser <simon.fraser at apple.com> wrote:

On Dec 19, 2018, at 12:33 PM, Michael Catanzaro <mcatanzaro at igalia.com> wrote:

On Tue, Dec 18, 2018 at 9:31 PM, Darin Adler <darin at apple.com> wrote:

I’ve gotten used to the name AtomicString over the years, but I wouldn’t strongly object to changing it if other programmers are often confused by its similarity to the term “atomic operations”.

Well there were two other developers in the thread Ryosuke linked to who made the exact same mistake as me, so I do think the current name is problematic. A change wouldn't need to be drastic, though. I think suggestions from the old thread like "StringAtom" or "AtomString" would be unproblematic. The problem is the specific word "atomic" carries an expectation that the object be safe to access concurrently across threads without locks; I think that expectation doesn't exist if not for the "ic" at the end.

FWIW I've only ever heard the "interned string" terminology prior to now.

SingletonString? UniquedString?

I do like UniquedString. That conveys what AtomicString really is. SingletonString isn't so great since AtomicString table is still per thread.

So hard to pronounce though! Why not UniqueString? It’s not quite as explicit but close enough.

Wouldn’t it be confusing to use UniqueString type for a string that is common in order to save memory?

Personally, I like the AtomString proposal as it is close to the naming we are used to and addresses the issue raised (atomic has a different meaning with threading).

Or I also like “AtomizedString”.

# Maciej Stachowiak (12 hours ago)

On Dec 19, 2018, at 9:41 PM, Chris Dumez <cdumez at apple.com> wrote:

On Dec 19, 2018, at 9:17 PM, Maciej Stachowiak <mjs at apple.com> wrote:

On Dec 19, 2018, at 8:06 PM, Ryosuke Niwa <rniwa at webkit.org> wrote:

On Wed, Dec 19, 2018 at 1:13 PM Simon Fraser <simon.fraser at apple.com> wrote:

On Dec 19, 2018, at 12:33 PM, Michael Catanzaro <mcatanzaro at igalia.com> wrote:

On Tue, Dec 18, 2018 at 9:31 PM, Darin Adler <darin at apple.com> wrote:

I’ve gotten used to the name AtomicString over the years, but I wouldn’t strongly object to changing it if other programmers are often confused by its similarity to the term “atomic operations”.

Well there were two other developers in the thread Ryosuke linked to who made the exact same mistake as me, so I do think the current name is problematic. A change wouldn't need to be drastic, though. I think suggestions from the old thread like "StringAtom" or "AtomString" would be unproblematic. The problem is the specific word "atomic" carries an expectation that the object be safe to access concurrently across threads without locks; I think that expectation doesn't exist if not for the "ic" at the end.

FWIW I've only ever heard the "interned string" terminology prior to now.

SingletonString? UniquedString?

I do like UniquedString. That conveys what AtomicString really is. SingletonString isn't so great since AtomicString table is still per thread.

So hard to pronounce though! Why not UniqueString? It’s not quite as explicit but close enough.

Wouldn’t it be confusing to use UniqueString type for a string that is common in order to save memory?

I would interpret it as UniqueString(foo) means “give me the unique copy of string foo”. You use a unique copy so you can use the same string in many places without wasting memory, or excess time on string compares. It’s used in many places, but there is only one. (Maybe we should call it HighlanderString? OK, not serious.)
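
A minimal sketch of the "give me the unique copy of string foo" idea, written in JavaScript to match the snippet used elsewhere in this thread. The names `Atom`, `intern`, and `internTable` are hypothetical and are not WebKit's API; WebKit's real table lives in C++ and is per thread, which this single-table sketch does not model.

```javascript
// Hypothetical intern table: maps a character sequence to its one shared object.
const internTable = new Map();

// Hypothetical wrapper type; plain JS strings already compare by value,
// so an object is needed to make identity comparison meaningful.
class Atom {
  constructor(chars) {
    this.chars = chars;
    Object.freeze(this); // interned strings should be immutable
  }
}

// Return the unique Atom for `chars`, allocating only on first use.
function intern(chars) {
  let atom = internTable.get(chars);
  if (atom === undefined) {
    atom = new Atom(chars);
    internTable.set(chars, atom);
  }
  return atom;
}

const a = intern("href");
const b = intern("hr" + "ef"); // same characters, built differently
const c = intern("src");

console.log(a === b);          // true: one object, so equality is an identity check
console.log(a === c);          // false
console.log(internTable.size); // 2
```

The string is used in many places, but only one object exists per distinct character sequence, which is where both the memory saving and the cheap comparison come from.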

# Geoffrey Garen (3 hours ago)

So hard to pronounce though! Why not UniqueString? It’s not quite as explicit but close enough.

Wouldn’t it be confusing to use UniqueString type for a string that is common in order to save memory?

I would interpret it as UniqueString(foo) means “give me the unique copy of string foo”. You use a unique copy so you can use the same string in many places without wasting memory, or excess time on string compares. It’s used in many places, but there is only one. (Maybe we should call it HighlanderString? OK, not serious.)

By definition, any string that has been uniqued is unique.

So, maybe we like “unique” or maybe we don’t. But if we like “unique”, it’s strictly better than “uniqued”.

Personally, I like the AtomString proposal as it is close to the naming we are used to and addresses the issue raised (atomic has a different meaning with threading). Also, I had never heard of interned strings before.

AtomicString has two features:

(1) You do comparison by pointer / object identity;

(2) You never allocate two objects for the same sequence of characters.

JavaScript symbols offer (1) but avoid (2):

let a = Symbol("The string of the past!");
let b = Symbol("The string of the past!");
a == b; // false
a === b; // false

Today we call (1) “UniquedStringImpl” and (1) + (2) “AtomicStringImpl”.

If we rename (1) + (2) to “UniqueString” or “UniquedString”, we need a new name for (1) alone.
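
For contrast, JavaScript's global symbol registry behaves like (1) + (2) together. The sketch below uses only the standard `Symbol`, `Symbol.for`, and `Symbol.keyFor` functions; the variable names are illustrative, and nothing here describes how WebKit implements either kind of symbol internally.

```javascript
// Plain symbols: compared by identity, but a new one is allocated on every call.
const plainA = Symbol("The string of the past!");
const plainB = Symbol("The string of the past!");

// Registry symbols: compared by identity, and the same key always yields the same symbol.
const registeredA = Symbol.for("The string of tomorrow!");
const registeredB = Symbol.for("The string of tomorrow!");

console.log(plainA === plainB);           // false: feature (1) without (2)
console.log(registeredA === registeredB); // true: features (1) and (2)

// Only registry symbols can be mapped back to the key they were created from.
console.log(Symbol.keyFor(registeredA));  // "The string of tomorrow!"
console.log(Symbol.keyFor(plainA));       // undefined
```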

Geoff

# Maciej Stachowiak (2 hours ago)

On Dec 20, 2018, at 11:24 AM, Geoffrey Garen <ggaren at apple.com> wrote:

So hard to pronounce though! Why not UniqueString? It’s not quite as explicit but close enough.

Wouldn’t it be confusing to use UniqueString type for a string that is common in order to save memory?

I would interpret it as UniqueString(foo) means “give me the unique copy of string foo”. You use a unique copy so you can use the same string in many places without wasting memory, or excess time on string compares. It’s used in many places, but there is only one. (Maybe we should call it HighlanderString? OK, not serious.)

By definition, any string that has been uniqued is unique.

So, maybe we like “unique” or maybe we don’t. But if we like “unique”, it’s strictly better than “uniqued”.

Personally, I like the AtomString proposal as it is close to the naming we are used to and addresses the issue raised (atomic has a different meaning with threading). Also, I had never heard of interned strings before.

AtomicString has two features:

(1) You do comparison by pointer / object identity;

(2) You never allocate two objects for the same sequence of characters.

JavaScript symbols offer (1) but avoid (2):

let a = Symbol("The string of the past!");
let b = Symbol("The string of the past!");
a == b; // false
a === b; // false

Today we call (1) “UniquedStringImpl” and (1) + (2) “AtomicStringImpl”.

If we rename (1) + (2) to “UniqueString” or “UniquedString”, we need a new name for (1) alone.

It seems like (1) alone is not actually “uniqued”. In the case of symbols at least, it's the opposite: the string is deliberately kept distinct from all other strings of equal value, so pointer identity is itself the definition of equality, rather than a shortcut for equality by value.

From what I can tell, UniquedStringImpl exists mainly to be a base class for SymbolImpl and AtomicStringImpl. So you could imagine giving it a more verbose and explicit name, like PointerEqualityStringImpl. However, JSC seems to use UniquedStringImpl directly in a bunch of cases, I think to make it convenient for a property name to be either an AtomicString or a Symbol. Maybe a longer name for that case is ok?
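
A rough illustration of why a shared identity-compared base type is convenient for property names. It is written in JavaScript with entirely hypothetical names (`IdentityKey`, `AtomKey`, `SymbolKey`, `atomTable`); it is not how JSC is implemented, only a sketch of the shape described above: one base type compared by identity, with one subtype that is interned and one that is never shared.

```javascript
// Hypothetical base type: instances are only ever compared by identity.
class IdentityKey {
  constructor(description) {
    this.description = description;
  }
}

// Hypothetical table backing the interned subtype.
const atomTable = new Map();

// Interned subtype: equal characters always yield the same instance.
class AtomKey extends IdentityKey {
  static intern(chars) {
    let key = atomTable.get(chars);
    if (key === undefined) {
      key = new AtomKey(chars);
      atomTable.set(chars, key);
    }
    return key;
  }
}

// Symbol-like subtype: every instance is distinct, even with equal descriptions.
class SymbolKey extends IdentityKey {}

// A property table can hold both kinds, because Map compares object keys by identity.
const properties = new Map();
properties.set(AtomKey.intern("length"), 3);
const hidden = new SymbolKey("length");
properties.set(hidden, "private");

console.log(properties.get(AtomKey.intern("length")));    // 3
console.log(properties.get(hidden));                      // "private"
console.log(properties.get(new SymbolKey("length")));     // undefined
console.log(properties.size);                             // 2
```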
