Rename wtf/unicode/UTF8.h

# Fujii Hironori (3 days ago)

wtf/unicode/UTF8.h conflicts with an ICU header in MSVC builds. I'd like to rename wtf/unicode/UTF8.h to wtf/unicode/WTFUTF8.h. Any suggestions?

ICU's utf.h contains #include "unicode/utf8.h", which happens to pick up wtf/unicode/UTF8.h: unicode-org/icu/blob/master/icu4c/source/common/unicode/utf.h#L217

MSVC's search behavior for the quoted form of #include is documented here: msdn.microsoft.com/en-us/library/36k2cdd4.aspx

Bug 189693 – [Win][Clang] warning: #include resolved using non-portable Microsoft search rules as: ....\Source\WTF\wtf/unicode/utf8.h bugs.webkit.org/show_bug.cgi?id=189693
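
To make the collision concrete, here is a minimal sketch of the quoted-form lookup order described on that MSDN page: the directory of the including file first, then the directories of the files that included it, then the /I include paths. It is a standalone simulation, not code from MSVC or WebKit. The names resolveQuotedInclude and resolveExample, the directory layout, and the lower-cased in-memory "file system" (standing in for a case-insensitive one) are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Lower-case a path so lookups behave like a case-insensitive file system
// (an assumption standing in for typical Windows behavior).
static std::string lower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical model of the quoted-form search: the directory of the file
// containing the #include, then the directories of its includers (parent,
// grandparent, ...), then the /I style include paths.
std::optional<std::string> resolveQuotedInclude(
    const std::set<std::string>& files,           // lower-cased existing files
    const std::vector<std::string>& includerDirs, // innermost includer first
    const std::vector<std::string>& includePaths,
    const std::string& name)
{
    auto tryDir = [&](const std::string& dir) -> std::optional<std::string> {
        std::string candidate = lower(dir + "/" + name);
        if (files.count(candidate))
            return candidate;
        return std::nullopt;
    };
    for (const auto& dir : includerDirs)
        if (auto hit = tryDir(dir))
            return hit;
    for (const auto& dir : includePaths)
        if (auto hit = tryDir(dir))
            return hit;
    return std::nullopt;
}

// Made-up scenario: a header in Source/WTF/wtf includes ICU's unicode/utf.h,
// which in turn does #include "unicode/utf8.h".
std::string resolveExample(bool wtfHeaderRenamed)
{
    std::set<std::string> files = {
        lower("icu/include/unicode/utf.h"),
        lower("icu/include/unicode/utf8.h"),
        lower(wtfHeaderRenamed ? "Source/WTF/wtf/unicode/UTF8Conversion.h"
                               : "Source/WTF/wtf/unicode/UTF8.h"),
    };
    auto hit = resolveQuotedInclude(files,
        { "icu/include/unicode", "Source/WTF/wtf" },
        { "icu/include" },
        "unicode/utf8.h");
    return hit.value_or("");
}
```

In this made-up layout, resolveExample(false) finds the WTF header through the parent includer's directory before the ICU include path is ever consulted, while resolveExample(true) falls through to ICU's own utf8.h.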

# Mathias Bynens (2 days ago)

Note that "WTF-8" is an encoding: simonsapin.github.io/wtf-8 "WTFUTF8" might be ambiguous.

On Wed, Oct 31, 2018 at 3:18 AM Fujii Hironori <fujii.hironori at gmail.com> wrote:

# Fujii Hironori (2 days ago)

Oh no. I didn't know that. Thank you for letting me know.

Then, how about <wtf/unicode/UTF-8.h>?

There is <wtf/dtoa/double-conversion.h> which has '-' in the name.

# Konstantin Tokarev (2 days ago)

31.10.2018, 05:18, "Fujii Hironori" <fujii.hironori at gmail.com>:

wtf/unicode/UTF8.h is conflicting with ICU header in MSVC builds. I'd like to rename wtf/unicode/UTF8.h to wtf/unicode/WTFUTF8.h. Any suggestion?

What about Unicode.h or UnicodeHelpers.h? UTF8.h deals with UTF16 as well

# Darin Adler (2 days ago)

On Oct 31, 2018, at 7:52 AM, Konstantin Tokarev <annulen at yandex.ru> wrote:

31.10.2018, 05:18, "Fujii Hironori" <fujii.hironori at gmail.com>:

wtf/unicode/UTF8.h is conflicting with ICU header in MSVC builds. I'd like to rename wtf/unicode/UTF8.h to wtf/unicode/WTFUTF8.h. Any suggestion?

What about Unicode.h or UnicodeHelpers.h? UTF8.h deals with UTF16 as well

While I don’t love either of those names, I do like the idea of avoiding the awkward “WTF” in the filename.

I don’t think it’s right to say “deals with UTF16”; this header contains only functions for dealing with UTF-8 and converting UTF-8 to and from other encodings (and yes, those other encodings include UTF-16).

With a few seconds’ thought, I am thinking that maybe UTF8Conversion.h or UTF8Transcoding.h are possibly better ideas for new names. Neither is completely accurate. If we were going to add the word “helpers” then I would say UTF8Helpers.h, but I really don’t like those kinds of words in header names (“utilities”, “helpers”, “functions”, “classes”).

A separate issue once we rename: the header is also pretty old and crufty. Eventually we might want to remove or refine the functions in here. Not sure how widely they are used.

— Darin

# Fujii Hironori (2 days ago)

Thank you for the feedback, Konstantin and Darin.

On Thu, Nov 1, 2018 at 1:52 AM Darin Adler <darin at apple.com> wrote:

With a few seconds’ thought, I am thinking that maybe UTF8Conversion.h or UTF8Transcoding.h are possibly better ideas for new names. Neither is completely accurate. If we were going to add the word “helpers” then I would say UTF8Helpers.h, but I really don’t like those kinds of words in header names (“utilities”, “helpers”, “functions”, “classes”).

Sounds good. I'll go with UTF8Conversion.h.

A separate issue once we rename: the header is also pretty old and crufty. Eventually we might want to remove or refine the functions in here. Not sure how widely they are used.

There are 8 functions in the header.

The following 2 functions are used only in JavaScriptCore/runtime/JSGlobalObjectFunctions.cpp:

  • UTF8SequenceLength
  • decodeUTF8Sequence

The following 3 functions are used only in WTF/wtf/text/AtomicStringImpl.cpp:

  • calculateStringHashAndLengthFromUTF8MaskingTop8Bits
  • equalUTF16WithUTF8
  • equalLatin1WithUTF8

The following 3 functions are widely used:

  • convertUTF8ToUTF16
  • convertLatin1ToUTF8
  • convertUTF16ToUTF8
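
For readers unfamiliar with what such a conversion involves, here is a minimal sketch of Latin-1 to UTF-8 transcoding. It is not the WTF implementation and does not mirror the signature of convertLatin1ToUTF8; the name latin1ToUTF8 and the std::string interface are assumptions chosen to keep the example short.

```cpp
#include <string>

// Illustrative only: each Latin-1 byte is a code point in U+0000..U+00FF.
// Bytes below 0x80 are copied as-is; the rest become a two-byte UTF-8
// sequence (110xxxxx 10xxxxxx).
std::string latin1ToUTF8(const std::string& latin1)
{
    std::string utf8;
    utf8.reserve(latin1.size() * 2);
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            utf8.push_back(static_cast<char>(c));
        } else {
            utf8.push_back(static_cast<char>(0xC0 | (c >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return utf8;
}
```

For example, the Latin-1 byte 0xE9 (é) becomes the two bytes 0xC3 0xA9.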

Any suggestions for improvement are welcome.
