
    Character

    A single extended grapheme cluster that approximates a user-perceived character.

    @frozen struct Character

    Overview

    The Character type represents a character made up of one or more Unicode scalar values, grouped by a Unicode boundary algorithm. Generally, a Character instance matches what the reader of a string will perceive as a single character. Strings are collections of Character instances, so the number of visible characters is generally the most natural way to count the length of a string.

    let greeting = "Hello! 🐥"
    print("Length: \(greeting.count)")
    // Prints "Length: 8"

    Because each character in a string can be made up of one or more Unicode scalar values, the number of characters in a string may not match the length of the Unicode scalar value representation or the length of the string in a particular binary representation.

    print("Unicode scalar value count: \(greeting.unicodeScalars.count)")
    // Prints "Unicode scalar value count: 8"
    
    print("UTF-8 representation count: \(greeting.utf8.count)")
    // Prints "UTF-8 representation count: 11"
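    The counts diverge further when a single character is built from several scalar values. The following is a minimal sketch using an illustrative string that does not appear in the discussion above: the word "cafe" followed by a combining acute accent.

```swift
// "cafe" followed by U+0301 COMBINING ACUTE ACCENT. The accent joins the
// preceding "e", so the last two scalar values form one Character.
let word = "cafe\u{301}"

let characterCount = word.count              // extended grapheme clusters
let scalarCount = word.unicodeScalars.count  // Unicode scalar values
let utf8Count = word.utf8.count              // UTF-8 code units

print("Characters: \(characterCount)")
// Prints "Characters: 4"
print("Unicode scalar values: \(scalarCount)")
// Prints "Unicode scalar values: 5"
print("UTF-8 code units: \(utf8Count)")
// Prints "UTF-8 code units: 6"
```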

    Every Character instance is composed of one or more Unicode scalar values that are grouped together as an extended grapheme cluster. The way these scalar values are grouped is defined by a canonical, localized, or otherwise tailored Unicode segmentation algorithm.
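    To see the grouping directly, you can inspect the scalar values behind a Character through its unicodeScalars view. The sketch below is illustrative only. It assumes two spellings of "é" (a precomposed scalar and a base letter plus combining accent) that are not taken from the discussion above.

```swift
// Two ways to spell "é": one precomposed scalar, or "e" plus a combining accent.
let precomposed: Character = "\u{E9}"
let decomposed: Character = "e\u{301}"

// Each Character exposes the scalar values it was built from.
let precomposedScalars = precomposed.unicodeScalars.map { $0.value }
let decomposedScalars = decomposed.unicodeScalars.map { $0.value }

print(precomposedScalars)
// Prints "[233]"
print(decomposedScalars)
// Prints "[101, 769]"

// Equality follows Unicode canonical equivalence, so the two spellings
// compare equal even though their scalar values differ.
let areEqual = precomposed == decomposed
print(areEqual)
// Prints "true"
```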

    For example, a country’s Unicode flag character is made up of two regional indicator scalar values that correspond to that country’s ISO 3166-1 alpha-2 code. The alpha-2 code for the United States is “US”, so its flag character is made up of the Unicode scalar values "\u{1F1FA}" (REGIONAL INDICATOR SYMBOL LETTER U) and "\u{1F1F8}" (REGIONAL INDICATOR SYMBOL LETTER S). When placed next to each other in a string literal, these two scalar values are combined into a single grapheme cluster, represented by a Character instance in Swift.

    let usFlag: Character = "\u{1F1FA}\u{1F1F8}"
    print(usFlag)
    // Prints "🇺🇸"
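    The mapping from an alpha-2 code to a flag can also be computed rather than written out by hand. The function below, flag(forRegionCode:), is a hypothetical helper and not part of the standard library. It is a sketch that assumes the regional indicator symbols run consecutively from U+1F1E6 (letter A), and it does not check that the code is actually assigned to a country.

```swift
/// Hypothetical helper: builds a flag Character from a two-letter
/// ISO 3166-1 alpha-2 code such as "US". Returns nil for malformed input.
func flag(forRegionCode code: String) -> Character? {
    let base: UInt32 = 0x1F1E6  // REGIONAL INDICATOR SYMBOL LETTER A
    let letters = Array(code.uppercased().unicodeScalars)
    guard letters.count == 2 else { return nil }

    var result = ""
    for letter in letters {
        // Only ASCII "A" (65) through "Z" (90) map to regional indicators.
        guard (65...90).contains(letter.value),
              let indicator = Unicode.Scalar(base + letter.value - 65) else {
            return nil
        }
        result.unicodeScalars.append(indicator)
    }

    // Two regional indicators group into one grapheme cluster; the check
    // guards the Character initializer, which requires exactly one.
    return result.count == 1 ? Character(result) : nil
}

if let computedFlag = flag(forRegionCode: "US") {
    print(computedFlag.unicodeScalars.count)
    // Prints "2"
}
```

    The result compares equal to the usFlag literal shown above, because both consist of the same two scalar values.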

    For more information about the Unicode terms used in this discussion, see the Unicode.org glossary. In particular, this discussion mentions extended grapheme clusters and Unicode scalar values.
