{ Unicode Code Point Lookup }

// enter any character, get its code point, name & block

Enter any character or text to instantly get its Unicode code points, official names, blocks, categories, and escape sequences.

Enter any character, word, emoji, or string. Each code point will be listed separately.

HOW TO USE

  1. Enter Text

    Type or paste any character, word, emoji, or text string into the input area.

  2. Click Inspect

    Hit the Inspect button or press Ctrl+Enter to analyze all code points in your input.

  3. Read & Copy Results

    Each character shows its code point, name, block, category, and escape sequences. Copy any value with one click.

FEATURES

  • Code Points U+XXXX
  • Official Names
  • Unicode Blocks
  • General Categories
  • HTML Entities
  • JS / CSS / Python Escapes
  • Emoji Sequences
  • Client-Side Only

USE CASES

  • Debug encoding issues in strings
  • Get escape sequences for HTML/CSS/JS
  • Identify unknown or invisible characters
  • Inspect emoji and multi-codepoint sequences
  • Learn Unicode blocks and categories

WHAT IS THIS?

This tool decodes any text into its individual Unicode code points: the unique numbers assigned to every character in the Unicode standard. For each character you get its U+ code, official name, Unicode block, general category, and ready-to-use escape sequences for HTML, CSS, JavaScript, and Python. All processing happens in your browser.

FREQUENTLY ASKED QUESTIONS

What is a Unicode code point?

A Unicode code point is a unique integer assigned to every character in the Unicode standard, written as U+XXXX (e.g. U+0041 for "A"). The Unicode standard covers over 140,000 characters across all scripts and symbol sets.
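As a small illustration of the U+XXXX notation, the sketch below converts a character to its code point label in Python. The helper name `code_point_label` is made up for this example and is not part of the tool.

```python
def code_point_label(ch: str) -> str:
    """Return the U+XXXX label for a single character (hypothetical helper)."""
    # ord() gives the integer code point; pad to at least four hex digits.
    return f"U+{ord(ch):04X}"


print(code_point_label("A"))           # U+0041
print(code_point_label("\u20ac"))      # U+20AC (euro sign)
print(code_point_label("\U0001F600"))  # U+1F600 (five hex digits, no padding needed)
```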

What is a Unicode block?

Unicode code points are grouped into named blocks by script or purpose, for example "Basic Latin" (U+0000–U+007F), "CJK Unified Ideographs", or "Emoticons". The block tells you which writing system or symbol group a character belongs to.

Why does an emoji show multiple code points?

Many emoji are sequences of multiple code points that include, for example, a skin-tone modifier, a ZWJ (Zero Width Joiner, U+200D), or a variation selector. This tool lists each code point in the sequence individually so you can see the full composition.

What escape sequences does this tool provide?

For each code point the tool shows: HTML named entity (if available), HTML numeric entity, CSS content escape (\HHHH), JavaScript string escape (\uHHHH or \u{HHHHH}), and Python unicode literal (\uHHHH or \UHHHHHHHH).

What is the Unicode General Category?

Each code point has a two-letter General Category: Lu (Uppercase Letter), Ll (Lowercase Letter), Nd (Decimal Digit), Po (Other Punctuation), So (Other Symbol), Zs (Space Separator), Cf (Format character), and so on. Categories determine how software handles characters in regex, text layout, and processing.

Is my input sent to a server?

No. All processing happens entirely in your browser using JavaScript. No text you enter is sent to any server or stored anywhere. You can safely inspect passwords, tokens, or private strings.

What is the difference between UTF-8 and code points?

A Unicode code point is the abstract number assigned to a character. UTF-8 is a byte encoding that represents those code points in memory or files. The code point U+1F600 (😀) is one code point but takes 4 bytes when encoded in UTF-8.
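The difference between a code point and its UTF-8 bytes can be checked in a few lines of Python. This is only a sketch; `utf8_hex` is a hypothetical helper name.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of a character as space-separated hex (hypothetical helper)."""
    return " ".join(f"{byte:02X}" for byte in ch.encode("utf-8"))


print(utf8_hex("A"))           # 41           (1 byte)
print(utf8_hex("\u20ac"))      # E2 82 AC     (3 bytes)
print(utf8_hex("\U0001F600"))  # F0 9F 98 80  (4 bytes, still one code point)
```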

How do I look up a character by its U+ number?

Paste the actual character into the input field. If you know the hex value (e.g. 20AC for €) you can paste the character € directly. Most operating systems have a character map or emoji picker where you can find and copy characters by code point.
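If only the hex value is at hand, one way to obtain the character for pasting is to build it programmatically. The sketch below uses Python's built-in `chr` and the standard `unicodedata` module; `char_from_hex` is a hypothetical helper, not something the tool provides.

```python
import unicodedata


def char_from_hex(hex_value: str) -> str:
    """Turn '20AC' or 'U+20AC' into the corresponding character (hypothetical helper)."""
    cleaned = hex_value.strip().upper()
    if cleaned.startswith("U+"):
        cleaned = cleaned[2:]
    return chr(int(cleaned, 16))


ch = char_from_hex("20AC")
print(ch, unicodedata.name(ch))  # € EURO SIGN
```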

What is a Unicode Code Point Lookup Tool?

A Unicode Code Point Lookup tool takes any text input and breaks it down character by character, revealing the underlying numeric identity that the Unicode standard assigns to each symbol. Every letter, digit, punctuation mark, emoji, currency sign, mathematical operator, and ideograph in Unicode has a unique code point, an integer written in the familiar U+XXXX notation. This tool makes those hidden numbers visible and puts the accompanying metadata right in front of you.

Why Unicode Code Points Matter for Developers

Developers encounter Unicode issues constantly: corrupted strings, encoding mismatches, mystery characters that look identical but behave differently, emoji that break string length assumptions, or right-to-left marks that invisibly affect layout. Understanding the code points behind your text is the first step to diagnosing and fixing these problems.

When a character causes your regex to fail, a database insert to throw an error, or a font to render a tofu box, the code point lookup is where you start. Knowing that U+FEFF is a Byte Order Mark, that U+00AD is a soft hyphen, or that two visually similar dashes are actually U+002D (HYPHEN-MINUS) and U+2013 (EN DASH) gives you the leverage to fix the real problem.

Understanding Unicode Blocks

The Unicode standard organizes its 140,000+ characters into named blocks. Each block covers a contiguous range of code points and typically corresponds to a writing system, symbol set, or technical domain. Common blocks include Basic Latin (U+0000–U+007F), CJK Unified Ideographs, and Emoticons.

Knowing which block a character belongs to helps you understand why it might be missing from a font, excluded from an input validator, or handled differently by text-processing libraries.
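Because blocks are contiguous ranges, a block lookup amounts to a range search. Python's standard library does not ship block data, so the sketch below uses a deliberately tiny, hand-written excerpt of the block table. `BLOCKS` and `block_of` are hypothetical names, and a real lookup would load the full list from the Unicode Character Database.

```python
from bisect import bisect_right

# Hypothetical, hand-picked excerpt of the Unicode block table (start, end, name).
# A complete implementation would load every block from the Unicode data files.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x2000, 0x206F, "General Punctuation"),
    (0x20A0, 0x20CF, "Currency Symbols"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0x1F600, 0x1F64F, "Emoticons"),
]
STARTS = [start for start, _, _ in BLOCKS]


def block_of(ch: str):
    """Return the block name for a character, or None if it is outside this excerpt."""
    cp = ord(ch)
    # Find the last block whose start is <= cp, then confirm cp is within its end.
    index = bisect_right(STARTS, cp) - 1
    if index >= 0:
        _, end, name = BLOCKS[index]
        if cp <= end:
            return name
    return None


print(block_of("A"))           # Basic Latin
print(block_of("\u20ac"))      # Currency Symbols
print(block_of("\U0001F600"))  # Emoticons
```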

Unicode General Categories Explained

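Every code point is assigned a General Category, a two-letter code that classifies the character's type. The major categories relevant to everyday development include Lu (Uppercase Letter), Ll (Lowercase Letter), Nd (Decimal Digit), Po (Other Punctuation), So (Other Symbol), Zs (Space Separator), and Cf (Format character).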

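Regex engines that support Unicode properties expose these categories through patterns such as \p{L} and \p{N}. Text-processing software also consults them, together with other character properties, when deciding directionality, line-breaking behavior, and capitalization rules.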
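General Categories can be read directly with Python's standard `unicodedata` module. The sketch below prints the category of a few characters and shows a rough stand-in for a `\p{L}` test; the helper name `is_letter` is an assumption for this example (Python's built-in `re` module does not support `\p{...}`).

```python
import unicodedata


def is_letter(ch: str) -> bool:
    """Rough equivalent of \\p{L}: true for any category starting with 'L' (hypothetical helper)."""
    return unicodedata.category(ch).startswith("L")


for ch in ["A", "a", "7", "!", "\u00a9", " ", "\u200d"]:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch))
# U+0041 Lu
# U+0061 Ll
# U+0037 Nd
# U+0021 Po
# U+00A9 So
# U+0020 Zs
# U+200D Cf
```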

Escape Sequences for Different Languages

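Once you know a code point, you often need to reference it in source code. This tool generates the following automatically: an HTML named entity (if available), an HTML numeric entity, a CSS content escape (\HHHH), a JavaScript string escape (\uHHHH or \u{HHHHH}), and a Python Unicode literal (\uHHHH or \UHHHHHHHH).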
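To make the escape formats concrete, here is a minimal sketch of how such escapes could be derived from a code point. It is not the tool's actual implementation: the function name `escapes` and the dictionary keys are made up for illustration, and the hexadecimal form of the HTML numeric entity is an assumption. The named-entity lookup uses Python's standard `html.entities.codepoint2name` table, which covers only the HTML 4 entity names.

```python
import html.entities


def escapes(ch: str) -> dict:
    """Build common escape sequences for one character (hypothetical helper)."""
    cp = ord(ch)
    named = html.entities.codepoint2name.get(cp)  # e.g. 'euro' for U+20AC
    return {
        "html_named": f"&{named};" if named else None,
        "html_numeric": f"&#x{cp:X};",
        "css": f"\\{cp:04X}",
        # JavaScript needs the \u{...} form beyond the Basic Multilingual Plane.
        "js": f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\u{{{cp:X}}}",
        # Python uses \U with eight hex digits beyond the Basic Multilingual Plane.
        "python": f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}",
    }


for key, value in escapes("\u20ac").items():
    print(key, value)
# html_named &euro;
# html_numeric &#x20AC;
# css \20AC
# js \u20AC
# python \u20AC
```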

Emoji and Multi-Codepoint Sequences

Emoji are one of the most complex areas of Unicode. What looks like a single character may actually be a sequence of several code points. A family emoji like 👨‍👩‍👧 is composed of individual person emoji joined by Zero Width Joiner characters (U+200D). A flag emoji like 🇻🇳 is actually two Regional Indicator Symbol letters. Skin tone variants append a Fitzpatrick modifier (U+1F3FB through U+1F3FF) after the base emoji.

Understanding this composition is essential when working with character counts, string slicing, or rendering engines. A naive str.length in JavaScript returns the UTF-16 code unit count, which differs from the visible character count for supplementary characters and for multi-code-point sequences. This inspector shows you every code point in a sequence so there are no surprises.
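The gap between what is visible, the number of code points, and the UTF-16 length can be reproduced with a short Python sketch. The helpers `describe` and `utf16_length` are hypothetical names; `utf16_length` mimics what JavaScript's `str.length` would report by counting UTF-16 code units.

```python
import unicodedata


def describe(text: str) -> list:
    """List (U+XXXX, name) for every code point in the text (hypothetical helper)."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


def utf16_length(text: str) -> int:
    """Count UTF-16 code units, which is what JavaScript's str.length reports."""
    return len(text.encode("utf-16-le")) // 2


family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # man + ZWJ + woman + ZWJ + girl
for label, name in describe(family):
    print(label, name)
# U+1F468 MAN
# U+200D ZERO WIDTH JOINER
# U+1F469 WOMAN
# U+200D ZERO WIDTH JOINER
# U+1F467 GIRL

print(len(family))           # 5 code points
print(utf16_length(family))  # 8 UTF-16 code units, for one visible emoji
```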

Invisible and Control Characters

Some of the most troublesome characters in Unicode are the ones you cannot see. This tool highlights format characters and control characters with a special badge so you know they are there even when they render as nothing. Common invisible troublemakers include the Byte Order Mark (U+FEFF), the soft hyphen (U+00AD), the Zero Width Joiner (U+200D), and right-to-left marks.

Pasting text from PDFs, Office documents, or rich-text editors often introduces these characters silently. The code point lookup reveals them immediately.
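A similar check can be scripted when cleaning pasted text. The sketch below flags characters whose General Category is Cf (format) or Cc (control), using Python's `unicodedata` module. The function name `find_invisibles` is hypothetical, and skipping newline, carriage return, and tab is an assumption made here to cut noise, not a documented behavior of the tool.

```python
import unicodedata

# Assumption: ordinary whitespace controls are expected in text and not worth flagging.
IGNORED = {"\n", "\r", "\t"}


def find_invisibles(text: str) -> list:
    """Return (index, U+XXXX, name) for each format or control character found."""
    hits = []
    for index, ch in enumerate(text):
        if ch in IGNORED:
            continue
        if unicodedata.category(ch) in ("Cf", "Cc"):
            # Control characters have no Unicode name, so fall back to a placeholder.
            hits.append((index, f"U+{ord(ch):04X}", unicodedata.name(ch, "<control>")))
    return hits


sample = "\ufeffprice\u00adtag\u200b"
for hit in find_invisibles(sample):
    print(hit)
# (0, 'U+FEFF', 'ZERO WIDTH NO-BREAK SPACE')
# (6, 'U+00AD', 'SOFT HYPHEN')
# (10, 'U+200B', 'ZERO WIDTH SPACE')
```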

Related Unicode Tools

This lookup tool pairs well with the Unicode Normalizer (which converts between NFC, NFD, NFKC, and NFKD forms) and the ASCII / Unicode Converter. For encoding text for the web, check out the HTML Entity Encoder and URL Encoder/Decoder.
