Zero-width space
In Unicode: U+200B ZERO WIDTH SPACE (&NegativeMediumSpace;, &NegativeThickSpace;, &NegativeThinSpace;, &NegativeVeryThinSpace;, &ZeroWidthSpace;)

The zero-width space (HTML entity: &#8203; or &ZeroWidthSpace;), abbreviated ZWSP, is a non-printing character used in computerized typesetting to mark word boundaries without displaying a visible space in the rendered text. This lets text-processing systems recognize word boundaries in scripts that do not use explicit spacing, so that lines can be broken appropriately.

The zero-width space is Unicode character U+200B, located in the General Punctuation block. In HTML, it can be represented by the character entity reference &ZeroWidthSpace;.

Purpose

The zero-width space marks a potential line break without hyphenation. Its semantics and HTML implementation are similar to the soft hyphen, but soft hyphens display a hyphen character at the point where the line is broken.

The zero-width space can be used to mark word breaks in languages without visible space between words, such as Thai, Myanmar, Khmer, and Japanese.[1]
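The line-breaking role described above can be illustrated with a minimal sketch. The function name `wrap_at_zwsp` and its greedy strategy are assumptions made for illustration; real layout engines follow the Unicode line breaking algorithm and measure glyph widths, whereas this sketch simply counts code points.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def wrap_at_zwsp(text: str, width: int) -> list[str]:
    """Greedy wrapping that breaks only where a zero-width space occurs.

    Hypothetical helper: width is counted in code points, and a single
    segment longer than `width` is left on its own overlong line.
    """
    lines: list[str] = []
    current = ""
    for segment in text.split(ZWSP):
        # Start a new line if adding this segment would exceed the width.
        if current and len(current) + len(segment) > width:
            lines.append(current)
            current = segment
        else:
            current += segment
    if current:
        lines.append(current)
    return lines


sample = ZWSP.join(["Lorem", "Ipsum", "Dolor", "Sit", "Amet"])
wrapped = wrap_at_zwsp(sample, 10)
```

With a width of 10, `wrapped` contains the lines `LoremIpsum`, `DolorSit` and `Amet`; the zero-width spaces are consumed at the break points and dropped elsewhere, which is a simplification.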

In justified text, the rendering engine may add inter-character spacing, also known as letter spacing, between letters separated by a zero-width space, unlike around fixed-width spaces.[1]

Example

To show the effect of the zero-width space in text, the following words have been separated with zero-width spaces:

Lorem​Ipsum​Dolor​Sit​Amet​Consectetur​Adipiscing​Elit​Sed​Do​Eiusmod​Tempor​Incididunt​Ut​Labore​Et​Dolore​Magna​Aliqua​Ut​Enim​Ad​Minim​Veniam​Quis​Nostrud​Exercitation​Ullamco​Laboris​Nisi​Ut​Aliquip​Ex​Ea​Commodo​Consequat​Duis​Aute​Irure​Dolor​In​Reprehenderit​In​Voluptate​Velit​Esse​Cillum​Dolore​Eu​Fugiat​Nulla​Pariatur​Excepteur​Sint​Occaecat​Cupidatat​Non​Proident​Sunt​In​Culpa​Qui​Officia​Deserunt​Mollit​Anim​Id​Est​Laborum

By contrast, the following words have not been separated:

LoremIpsumDolorSitAmetConsecteturAdipiscingElitSedDoEiusmodTemporIncididuntUtLaboreEtDoloreMagnaAliquaUtEnimAdMinimVeniamQuisNostrudExercitationUllamcoLaborisNisiUtAliquipExEaCommodoConsequatDuisAuteIrureDolorInReprehenderitInVoluptateVelitEsseCillumDoloreEuFugiatNullaPariaturExcepteurSintOccaecatCupidatatNonProidentSuntInCulpaQuiOfficiaDeseruntMollitAnimIdEstLaborum

The first text is broken into lines, but only at word boundaries, and resizing the browser window re-breaks it accordingly; the second text is not broken at all.
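The two sample strings above differ only in the invisible separators. A short sketch of how such strings could be built and compared, using a shortened word list chosen here for illustration:

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

words = ["Lorem", "Ipsum", "Dolor", "Sit", "Amet"]

separated = ZWSP.join(words)   # looks unbroken, but has break opportunities
unseparated = "".join(words)   # no break opportunities at all

# The strings look the same when rendered but are not equal.
same_text = separated == unseparated
extra_code_points = len(separated) - len(unseparated)

# Python does not treat U+200B as whitespace, so split() leaves it intact.
pieces = separated.split()

# Removing the separators recovers the unseparated string.
stripped = separated.replace(ZWSP, "")
```

Here `same_text` is `False`, `extra_code_points` is 4 (one separator between each pair of words), and `stripped` equals `unseparated`.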

Usage

HTML

In HTML pages, the HTML element <wbr> functions as a zero-width space. In Internet Explorer 6, the zero-width space was not supported in some fonts.[2]
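Since `<wbr>` plays the same role in markup, text containing zero-width spaces could be converted to HTML along the following lines. This is only a sketch: `zwsp_to_wbr` is a hypothetical helper, not part of any library, and it assumes the input is plain text that must be escaped before the tags are inserted.

```python
import html

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def zwsp_to_wbr(text: str) -> str:
    """Escape plain text for HTML and replace each ZWSP with a <wbr> tag."""
    return "<wbr>".join(html.escape(part) for part in text.split(ZWSP))


markup = zwsp_to_wbr("Lorem" + ZWSP + "Ipsum & Dolor")
```

The result is `Lorem<wbr>Ipsum &amp; Dolor`: the break opportunity is now visible in the source while the rendered text is unchanged.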

Prohibition in domain names

ICANN rules prohibit domain names from containing non-displayed characters, including the zero-width space, and most browsers prohibit their use within domain names because they can be used to create a homograph attack, where a malicious URL is visually indistinguishable from a legitimate one.[3][4]
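A rough illustration of how software might flag such names is shown below. It is not the ICANN rule set or any browser's actual blocklist; the function name is hypothetical, and the check simply relies on the fact that U+200B and similar invisible characters belong to the Unicode general category Cf (format).

```python
import unicodedata


def has_invisible_format_chars(hostname: str) -> bool:
    """Return True if any character is in general category Cf (format).

    Illustrative check only: U+200B ZERO WIDTH SPACE, U+200C, U+200D and
    U+FEFF all fall in this category.
    """
    return any(unicodedata.category(ch) == "Cf" for ch in hostname)


spoofed = "exam\u200bple.com"   # renders like "example.com"
genuine = "example.com"

spoofed_flagged = has_invisible_format_chars(spoofed)
genuine_flagged = has_invisible_format_chars(genuine)
```

The spoofed name is flagged and the genuine one is not, even though both look identical when displayed.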

Encoding

Character information
Unicode name: ZERO WIDTH SPACE

Encoding                      Decimal        Hex
Unicode                       8203           U+200B
UTF-8                         226 128 139    E2 80 8B
Numeric character reference   &#8203;        &#x200B;

Named character references: &NegativeMediumSpace;, &NegativeThickSpace;, &NegativeThinSpace;, &NegativeVeryThinSpace;, &ZeroWidthSpace;

The zero-width space character is encoded in Unicode as U+200B ZERO WIDTH SPACE.[5]
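The encodings listed above can be checked with the Python standard library; the variable names are chosen here for illustration.

```python
import unicodedata

zwsp = "\u200b"

name = unicodedata.name(zwsp)          # official Unicode character name
code_point = ord(zwsp)                 # decimal code point
utf8_bytes = zwsp.encode("utf-8")      # three-byte UTF-8 sequence
utf8_decimal = list(utf8_bytes)        # the same bytes as decimal values
utf8_hex = utf8_bytes.hex(" ").upper() # the same bytes as hex pairs
```

This gives the name `ZERO WIDTH SPACE`, the code point 8203 (hex 200B), and the UTF-8 bytes 226 128 139 (hex E2 80 8B).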

In HTML, it can be referenced as &ZeroWidthSpace;, &#8203; or &#x200B;. Additionally, the character entities &NegativeThickSpace;, &NegativeMediumSpace;, &NegativeThinSpace;, and &NegativeVeryThinSpace; all also refer to the zero-width space, contrary to what their names suggest.[6]
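As a quick check, Python's `html.unescape`, which follows the HTML5 named character reference list, can be used to confirm that all of these references decode to the same character. The list name `references` is illustrative.

```python
import html

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

references = [
    "&ZeroWidthSpace;",
    "&#8203;",
    "&#x200B;",
    "&NegativeThickSpace;",
    "&NegativeMediumSpace;",
    "&NegativeThinSpace;",
    "&NegativeVeryThinSpace;",
]

# Decode each reference and compare it with the literal character.
all_zwsp = all(html.unescape(ref) == ZWSP for ref in references)
```

Here `all_zwsp` is `True`: despite their names, the four "negative space" entities decode to U+200B.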

The TeX representation is \hskip0pt; the LaTeX representation is \hspace{0pt};[7] and the groff representation is \:.[8]

References

Citations

  1. ^ a b "23.2 Layout Controls". The Unicode Standard Version 15.0 – Core Specification (PDF). The Unicode Consortium. September 2022. p. 918. ISBN 978-1-936213-32-0.
  2. ^ Dunae, Alex. "Better Web Typography with Spaces and Hyphens". dunae.ca. Archived from the original on December 14, 2010. Retrieved December 3, 2009.
  3. ^ "Network.IDN.blacklist_chars". mozillaZine. Retrieved 2018-02-07.
  4. ^ "Unicode Character 'Zero Width Space'". FileFormat.Info. Retrieved 2018-02-07.
  5. ^ "General Punctuation – Unicode" (PDF). Retrieved 2013-07-20.
  6. ^ "Entities/ZeroWidthSpace". MathML Version 2.0.
  7. ^ "The LaTeX Companion. Chapter 3: Basic Formatting Tools" (PDF). Retrieved 2019-07-16.
  8. ^ "groff(7) – Linux manual page". Retrieved 2014-02-08.
