Datasets
Emojis are generated into JSON files called datasets, with each dataset being grouped into one of
the following: localized data, versioned data, and metadata. These datasets can be found within the
emojibase-data package, or loaded from a CDN.
- Yarn: yarn add emojibase-data
- NPM: npm install emojibase-data
JSON files will need to be parsed manually unless handled by a build/bundle process.
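Where no bundler is involved, a dataset arrives as plain text and has to go through JSON.parse. The following is a minimal sketch, assuming the raw text has already been read (for example from disk or over the network). The parseDataset helper and the ParsedEmoji shape are hypothetical and cover only two of the fields shown later on this page.

```typescript
// Hypothetical minimal shape; the real dataset objects carry more fields.
interface ParsedEmoji {
  hexcode: string;
  annotation?: string;
}

// Parse raw dataset text and check that it is an array before using it.
function parseDataset(raw: string): ParsedEmoji[] {
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) {
    throw new TypeError('Expected the dataset to be a JSON array');
  }
  return parsed as ParsedEmoji[];
}

// Inline stand-in for the contents of a data.json file.
const rawSample =
  '[{"hexcode":"1F3CB-FE0F-200D-2642-FE0F","annotation":"man lifting weights"}]';
const parsedSample = parseDataset(rawSample);
```

The array check is deliberately shallow; stricter validation of each entry is left out to keep the sketch short.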
Usage
As stated, there are 3 groups of datasets, each serving a specific purpose. The first group, localized data, consists of datasets with localization provided by CLDR (view supported locales). These datasets return an array of emoji objects that adhere to the defined data structure.
import emojis from 'emojibase-data/<locale>/data.json';
import compactEmojis from 'emojibase-data/<locale>/compact.json';
import groupsSubgroups from 'emojibase-data/<locale>/messages.json';
The second group, versioned data, provides datasets for emoji and Unicode release versions. These datasets return a map, with the key being the version, and the value being an array of emoji hexcodes included in the associated release version.
- emojibase-data/versions/emoji.json - Emoji characters grouped by emoji version.
- emojibase-data/versions/unicode.json - Emoji characters grouped by Unicode version.
import unicodeVersions from 'emojibase-data/versions/unicode.json';
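Because a versioned dataset maps a version to an array of hexcodes, finding the release of a given emoji means inverting that map. The sketch below shows one way to do so. The indexByHexcode helper is hypothetical, and the sample is a small illustrative stand-in rather than a copy of the real file.

```typescript
// Shape described above: version -> hexcodes included in that release.
type VersionMap = Record<string, string[]>;

// Illustrative sample only; not copied from versions/unicode.json.
const sampleVersions: VersionMap = {
  '6.0': ['1F601', '1F602'],
  '7.0': ['1F3CB'],
};

// Invert the map so each hexcode points at its release version.
function indexByHexcode(versions: VersionMap): Map<string, string> {
  const index = new Map<string, string>();
  for (const [version, hexcodes] of Object.entries(versions)) {
    for (const hexcode of hexcodes) {
      index.set(hexcode, version);
    }
  }
  return index;
}

const versionIndex = indexByHexcode(sampleVersions);
```

Building the index once and reusing it avoids scanning every version array on each lookup.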
The third and last group, metadata, provides specialized datasets for unique use cases.
- emojibase-data/meta/groups.json - A map of non-localized emoji groups (Smileys & People), subgroups (Sky & Weather), and hierarchy, according to the official Unicode data files.
- emojibase-data/meta/hexcodes.json - A map of emoji hexcodes (hexadecimal codepoints) to an object of hexcodes with different qualified status: fully qualified, minimally qualified, and unqualified.
- emojibase-data/meta/unicode.json - An array of all emoji Unicode characters, including text and emoji presentation characters.
- emojibase-data/meta/unicode-names.json - A map of hexcodes to official Unicode names for each emoji.
import { groups, subgroups, hierarchy } from 'emojibase-data/meta/groups.json';
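The hierarchy can be used to list the subgroups that belong to a group. The sketch below assumes that groups and subgroups map a numeric index to a key string, and that hierarchy maps a group index to an array of subgroup indices. That shape is an assumption to verify against the actual file. The subgroupsOf helper and the sample values are illustrative.

```typescript
// Assumed shape of meta/groups.json; verify against the actual file.
interface GroupsMeta {
  groups: Record<string, string>;
  subgroups: Record<string, string>;
  hierarchy: Record<string, number[]>;
}

// Illustrative sample, not the full dataset.
const sampleMeta: GroupsMeta = {
  groups: { '0': 'smileys-emotion' },
  subgroups: { '0': 'face-smiling', '1': 'face-affection' },
  hierarchy: { '0': [0, 1] },
};

// Resolve the subgroup keys that belong to a group index.
function subgroupsOf(meta: GroupsMeta, group: number): string[] {
  const indices = meta.hierarchy[String(group)] ?? [];
  return indices
    .map((index) => meta.subgroups[String(index)])
    .filter((key): key is string => key !== undefined);
}

const smileySubgroups = subgroupsOf(sampleMeta, 0);
```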
Data structure
Each emoji character found within the pre-generated datasets is represented by an object composed
of the properties listed below. In an effort to reduce the overall dataset filesize, most property
values have been implemented using integers, with associated constants.
View the Emoji object for a list of all available fields.
Not every property appears on every emoji object: properties without an applicable value are omitted, which further reduces the filesize.
{
  annotation: 'man lifting weights',
  emoji: '🏋️♂️',
  gender: 1,
  group: 0,
  hexcode: '1F3CB-FE0F-200D-2642-FE0F',
  order: 1518,
  shortcodes: [
    'man_lifting_weights',
  ],
  subgroup: 0,
  tags: [
    'weight lifter',
    'man',
  ],
  type: 1,
  version: 4,
  skins: [
    {
      annotation: 'man lifting weights: light skin tone',
      emoji: '🏋🏻♂️',
      gender: 1,
      group: 0,
      hexcode: '1F3CB-1F3FB-200D-2642-FE0F',
      order: 1522,
      shortcodes: [
        'man_lifting_weights_tone1',
      ],
      subgroup: 0,
      type: 1,
      tone: 1,
      version: 4,
    },
    // ...
  ],
},
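Since properties without a value are simply absent, consuming code should treat most fields as optional. The sketch below picks a skin tone variant and falls back to the base emoji when none exists. EmojiLike is a trimmed, hypothetical shape for illustration (not the library's Emoji type), and withTone is a hypothetical helper.

```typescript
// Trimmed, hypothetical shape for illustration; not the library's Emoji type.
interface EmojiLike {
  hexcode: string;
  annotation?: string;
  tags?: string[];
  tone?: number;
  skins?: EmojiLike[];
}

// Return the variant matching the tone, or the base emoji when none exists.
function withTone(emoji: EmojiLike, tone: number): EmojiLike {
  return emoji.skins?.find((skin) => skin.tone === tone) ?? emoji;
}

const lifter: EmojiLike = {
  hexcode: '1F3CB-FE0F-200D-2642-FE0F',
  annotation: 'man lifting weights',
  tags: ['weight lifter', 'man'],
  skins: [{ hexcode: '1F3CB-1F3FB-200D-2642-FE0F', tone: 1 }],
};

const lightLifter = withTone(lifter, 1);
const fallbackLifter = withTone(lifter, 5);

// Omitted arrays such as tags need a default before they are iterated.
const skinTagCount = (lightLifter.tags ?? []).length;
```

Optional chaining and the nullish coalescing operator keep the absent-property cases explicit without extra branching.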
Compact format
While the emoji data is pretty thorough, not all of it may be required, and as such, a compact
dataset is supported. View the CompactEmoji object for a list of all available fields.
To use a compact dataset, replace data.json with compact.json.
import data from 'emojibase-data/en/compact.json';
{
  annotation: 'man lifting weights',
  group: 0,
  hexcode: '1F3CB-FE0F-200D-2642-FE0F',
  order: 1518,
  shortcodes: [
    'man_lifting_weights',
  ],
  tags: [
    'weight lifter',
    'man',
  ],
  unicode: '🏋️♂️',
  skins: [
    {
      annotation: 'man lifting weights: light skin tone',
      group: 0,
      hexcode: '1F3CB-1F3FB-200D-2642-FE0F',
      order: 1522,
      shortcodes: [
        'man_lifting_weights_tone1',
      ],
      unicode: '🏋🏻♂️',
    },
    // ...
  ],
},
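Comparing the two example objects on this page, the compact form keeps annotation, group, hexcode, order, shortcodes, tags, and skins, and exposes the character as unicode rather than emoji. The sketch below expresses that mapping under this reading of the examples. FullLike, CompactLike, and toCompact are hypothetical names, and the CompactEmoji reference remains the authority on which fields exist.

```typescript
// Hypothetical shapes derived from the two example objects on this page.
interface FullLike {
  annotation?: string;
  emoji: string;
  gender?: number;
  group?: number;
  hexcode: string;
  order?: number;
  shortcodes?: string[];
  subgroup?: number;
  tags?: string[];
  tone?: number;
  type?: number;
  version?: number;
  skins?: FullLike[];
}

interface CompactLike {
  annotation?: string;
  group?: number;
  hexcode: string;
  order?: number;
  shortcodes?: string[];
  tags?: string[];
  unicode: string;
  skins?: CompactLike[];
}

// Drop the fields only the full format carries and rename emoji to unicode.
function toCompact(full: FullLike): CompactLike {
  const { emoji, gender, subgroup, tone, type, version, skins, ...shared } = full;
  return {
    ...shared,
    unicode: emoji,
    ...(skins ? { skins: skins.map((skin) => toCompact(skin)) } : {}),
  };
}

const fullSample: FullLike = {
  annotation: 'man lifting weights',
  emoji: '🏋',
  gender: 1,
  group: 0,
  hexcode: '1F3CB',
  order: 1518,
  type: 1,
  version: 4,
  skins: [{ emoji: '🏋🏻', hexcode: '1F3CB-1F3FB', tone: 1, version: 4 }],
};

const compactSample = toCompact(fullSample);
```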
Messages format
The messages format is a special dataset that provides translations for groups, sub-groups, and any
other related emoji metadata. The key in each message lines up with a defined TypeScript type alias.
import data from 'emojibase-data/en/messages.json';
{
  groups: [
    {
      key: 'smileys-emotion',
      message: 'smileys & emotion',
      order: 0,
    },
    // ...
  ],
  subgroups: [
    {
      key: 'face-smiling',
      message: 'smiling',
      order: 0,
    },
    // ...
  ],
  skinTones: [
    {
      key: 'light',
      message: 'light skin tone',
    },
    // ...
  ],
};
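A common use of this dataset is turning a group or subgroup key into its translated label. The sketch below builds a key-to-message lookup from one of the lists. The interfaces mirror the example above rather than the library's own types, and labelsByKey is a hypothetical helper.

```typescript
// Shapes follow the example above; skin tone messages carry no order.
interface Message {
  key: string;
  message: string;
  order?: number;
}

interface MessagesLike {
  groups: Message[];
  subgroups: Message[];
  skinTones: Message[];
}

// Sample trimmed to the entries shown in the example above.
const sampleMessages: MessagesLike = {
  groups: [{ key: 'smileys-emotion', message: 'smileys & emotion', order: 0 }],
  subgroups: [{ key: 'face-smiling', message: 'smiling', order: 0 }],
  skinTones: [{ key: 'light', message: 'light skin tone' }],
};

// Build a key -> translated label lookup for one message list.
function labelsByKey(messages: Message[]): Map<string, string> {
  return new Map(
    messages.map((entry): [string, string] => [entry.key, entry.message]),
  );
}

const groupLabels = labelsByKey(sampleMessages.groups);
const skinToneLabels = labelsByKey(sampleMessages.skinTones);
```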
Fetching from a CDN
If you prefer not to inflate your bundle size with these large JSON datasets, you can fetch them
from our CDN (provided by jsdelivr.com) using fetchFromCDN(), fetchEmojis(), fetchMessages(), or
fetchShortcodes().
import { fetchFromCDN, fetchEmojis, fetchMessages, fetchShortcodes } from 'emojibase';
const englishEmojis = await fetchFromCDN('en/data.json', { shortcodes: ['github'] });
const japaneseCompactEmojis = await fetchEmojis('ja', { compact: true });
const germanCldrShortcodes = await fetchShortcodes('de', 'cldr');
const chineseTranslations = await fetchMessages('zh');
Fetching from your own CDN
If you want to load the JSON datasets from your own CDN, you can customize the cdnUrl using the
options object.
When cdnUrl is a string, fetchFromCDN will append '/${path}' to the URL. Make sure to include
the version within the cdnUrl yourself; it is not added automatically, which gives you control
over its placement.
import { fetchFromCDN, fetchEmojis, fetchMessages, fetchShortcodes } from 'emojibase';
const cdnUrl = 'https://example.com/cdn/emojidata/latest';
const englishEmojis = await fetchFromCDN('en/data.json', { shortcodes: ['github'], cdnUrl });
const japaneseCompactEmojis = await fetchEmojis('ja', { compact: true, cdnUrl });
const germanCldrShortcodes = await fetchShortcodes('de', 'cldr', { cdnUrl });
const chineseTranslations = await fetchMessages('zh', { cdnUrl });
cdnUrl can also be a function, so you have complete control over the format of the URL. This
function receives path and version as parameters. Version will be what you pass in within the
options object, or it will default to latest. Note that version is also used for the cache key,
so it's advised to set the option and not hard-code it in the cdnUrl function.
import { fetchFromCDN, fetchEmojis, fetchMessages, fetchShortcodes } from 'emojibase';
function cdnUrl(path: string, version: string): string {
  return `https://example.com/cdn/emojidata/${version}/${path}`;
}
const englishEmojis = await fetchFromCDN('en/data.json', { shortcodes: ['github'], cdnUrl });
const japaneseCompactEmojis = await fetchEmojis('ja', { compact: true, cdnUrl });
const germanCldrShortcodes = await fetchShortcodes('de', 'cldr', { cdnUrl });
const chineseTranslations = await fetchMessages('zh', { cdnUrl });
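To make the difference between the two cdnUrl forms concrete, the sketch below resolves a final URL for each. resolveCdnUrl is a hypothetical helper that illustrates the behavior described above; it is not emojibase's actual implementation, and the '15.0.0' version string is only an example value.

```typescript
// Mirrors the cdnUrl option described above: a base string or a builder function.
type CdnUrl = string | ((path: string, version: string) => string);

// Hypothetical helper; illustrates the documented behavior, not emojibase's source.
function resolveCdnUrl(path: string, cdnUrl: CdnUrl, version: string = 'latest'): string {
  if (typeof cdnUrl === 'function') {
    // Function form: the caller decides where path and version go.
    return cdnUrl(path, version);
  }
  // String form: the path is appended and the version is left to the caller.
  return `${cdnUrl}/${path}`;
}

const fromString = resolveCdnUrl(
  'en/data.json',
  'https://example.com/cdn/emojidata/latest',
);

const fromFunction = resolveCdnUrl(
  'ja/compact.json',
  (path, version) => `https://example.com/cdn/emojidata/${version}/${path}`,
  '15.0.0', // example version value
);
```

In the string form the version option never reaches the URL, which is why it has to be written into the base string by hand.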
Supported locales
The following locales are supported for both full and compact datasets.
- Bengali (bn)
- Chinese (zh)
- Chinese, Traditional (zh-hant)
- Danish (da)
- Dutch (nl)
- English (en)
- English, Great Britain (en-gb)
- Estonian (et)
- Finnish (fi)
- French (fr)
- German (de)
- Hindi (hi)
- Hungarian (hu)
- Italian (it)
- Japanese (ja)
- Korean (ko)
- Lithuanian (lt)
- Malay (ms)
- Norwegian (nb)
- Polish (pl)
- Portuguese (pt)
- Russian (ru)
- Spanish (es)
- Spanish, Mexico (es-mx)
- Swedish (sv)
- Thai (th)
- Ukrainian (uk)
- Vietnamese (vi)
Filesizes
Sorted by original size in ascending order.
The tables below cover, in order: emojis, emojis (compact), shortcodes, messages, and other datasets.
File | Size | Gzipped |
---|---|---|
zh-hant/data.json | 674.07 kB | 84.59 kB |
sv/data.json | 698.15 kB | 82.79 kB |
nb/data.json | 701.89 kB | 84.36 kB |
zh/data.json | 708.75 kB | 93.53 kB |
da/data.json | 710.08 kB | 86 kB |
fi/data.json | 715.13 kB | 86.26 kB |
et/data.json | 722.12 kB | 85.51 kB |
es/data.json | 730.87 kB | 83 kB |
es-mx/data.json | 732.06 kB | 83.39 kB |
lt/data.json | 735.13 kB | 88.48 kB |
en/data.json | 735.59 kB | 89.22 kB |
en-gb/data.json | 735.59 kB | 89.22 kB |
ja/data.json | 736.62 kB | 91.25 kB |
ms/data.json | 743.18 kB | 86.71 kB |
nl/data.json | 748.35 kB | 92.29 kB |
pt/data.json | 751.02 kB | 94.34 kB |
hu/data.json | 761.02 kB | 93.98 kB |
ko/data.json | 761.63 kB | 100.11 kB |
vi/data.json | 762.17 kB | 87.89 kB |
fr/data.json | 764.67 kB | 95.29 kB |
de/data.json | 768.58 kB | 97.99 kB |
pl/data.json | 772.79 kB | 96.72 kB |
it/data.json | 799.19 kB | 102.82 kB |
ru/data.json | 897.47 kB | 104.02 kB |
th/data.json | 913.12 kB | 92.7 kB |
uk/data.json | 931.48 kB | 106 kB |
hi/data.json | 977.49 kB | 103.19 kB |
bn/data.json | 1.02 MB | 102.24 kB |
File | Size | Gzipped |
---|---|---|
zh-hant/compact.json | 479.62 kB | 75.38 kB |
sv/compact.json | 503.7 kB | 73.14 kB |
nb/compact.json | 507.44 kB | 76.07 kB |
zh/compact.json | 514.3 kB | 83.42 kB |
da/compact.json | 515.63 kB | 76.4 kB |
fi/compact.json | 520.68 kB | 76.38 kB |
et/compact.json | 527.67 kB | 76.11 kB |
es/compact.json | 536.42 kB | 73.51 kB |
es-mx/compact.json | 537.61 kB | 73.8 kB |
lt/compact.json | 540.68 kB | 78.71 kB |
en/compact.json | 541.14 kB | 80.05 kB |
en-gb/compact.json | 541.14 kB | 80.05 kB |
ja/compact.json | 542.17 kB | 81.14 kB |
ms/compact.json | 548.73 kB | 77.29 kB |
nl/compact.json | 553.9 kB | 82.72 kB |
pt/compact.json | 556.57 kB | 84.61 kB |
hu/compact.json | 566.57 kB | 83.95 kB |
ko/compact.json | 567.17 kB | 89.51 kB |
vi/compact.json | 567.72 kB | 78.07 kB |
fr/compact.json | 570.22 kB | 85.5 kB |
de/compact.json | 574.13 kB | 87.44 kB |
pl/compact.json | 578.34 kB | 86.66 kB |
it/compact.json | 604.74 kB | 92.63 kB |
ru/compact.json | 703.02 kB | 93.93 kB |
th/compact.json | 718.67 kB | 82.66 kB |
uk/compact.json | 737.03 kB | 95.63 kB |
hi/compact.json | 783.04 kB | 92.86 kB |
bn/compact.json | 821.3 kB | 92.12 kB |
File | Size | Gzipped |
---|---|---|
fr/shortcodes/emojibase.json | 42 B | 62 B |
en/shortcodes/cldr-native.json | 258 B | 184 B |
en-gb/shortcodes/cldr-native.json | 258 B | 184 B |
zh/shortcodes/emojibase-native.json | 298 B | 202 B |
zh/shortcodes/emojibase.json | 347 B | 186 B |
ja/shortcodes/emojibase.json | 1.02 kB | 472 B |
ja/shortcodes/emojibase-native.json | 1.09 kB | 571 B |
it/shortcodes/cldr-native.json | 1.18 kB | 496 B |
nl/shortcodes/cldr-native.json | 2.39 kB | 725 B |
ru/shortcodes/emojibase.json | 19.23 kB | 5.9 kB |
ru/shortcodes/emojibase-native.json | 25.23 kB | 6.59 kB |
da/shortcodes/emojibase-native.json | 36.68 kB | 6.7 kB |
es/shortcodes/cldr-native.json | 42.73 kB | 8.38 kB |
es-mx/shortcodes/cldr-native.json | 42.76 kB | 8.47 kB |
de/shortcodes/cldr-native.json | 43.33 kB | 6.84 kB |
nb/shortcodes/cldr-native.json | 44.34 kB | 7.4 kB |
en/shortcodes/github.json | 45.31 kB | 15.6 kB |
en/shortcodes/iamcal.json | 47.83 kB | 15.11 kB |
pt/shortcodes/cldr-native.json | 53.2 kB | 10.53 kB |
sv/shortcodes/emojibase-native.json | 54.62 kB | 10.14 kB |
fr/shortcodes/cldr-native.json | 55.89 kB | 10.4 kB |
da/shortcodes/cldr-native.json | 56.77 kB | 9.1 kB |
fi/shortcodes/cldr-native.json | 69.88 kB | 10.7 kB |
et/shortcodes/cldr-native.json | 71.04 kB | 12.06 kB |
sv/shortcodes/cldr-native.json | 72.39 kB | 11.97 kB |
lt/shortcodes/cldr-native.json | 120.4 kB | 20.59 kB |
pl/shortcodes/cldr-native.json | 124.71 kB | 18.9 kB |
en/shortcodes/emojibase-legacy.json | 129.02 kB | 24.32 kB |
sv/shortcodes/emojibase.json | 137.61 kB | 27.25 kB |
zh-hant/shortcodes/cldr-native.json | 139.82 kB | 27.14 kB |
zh/shortcodes/cldr-native.json | 144.03 kB | 27.5 kB |
hu/shortcodes/cldr.json | 147.25 kB | 27.16 kB |
ja/shortcodes/cldr.json | 147.6 kB | 26.24 kB |
hu/shortcodes/cldr-native.json | 147.9 kB | 25.67 kB |
en/shortcodes/cldr.json | 148.77 kB | 26.83 kB |
en-gb/shortcodes/cldr.json | 148.77 kB | 26.83 kB |
zh-hant/shortcodes/cldr.json | 149.27 kB | 25.21 kB |
da/shortcodes/cldr.json | 150.32 kB | 26.76 kB |
da/shortcodes/emojibase.json | 150.69 kB | 29.99 kB |
sv/shortcodes/cldr.json | 150.92 kB | 27.05 kB |
th/shortcodes/cldr.json | 150.97 kB | 27.1 kB |
nb/shortcodes/cldr.json | 151.5 kB | 26.79 kB |
et/shortcodes/cldr.json | 152.36 kB | 27.01 kB |
fi/shortcodes/cldr.json | 153.47 kB | 27.39 kB |
nl/shortcodes/cldr.json | 156.87 kB | 27.49 kB |
ja/shortcodes/cldr-native.json | 156.96 kB | 28.82 kB |
vi/shortcodes/cldr.json | 157.32 kB | 26.44 kB |
de/shortcodes/cldr.json | 157.47 kB | 27.62 kB |
zh/shortcodes/cldr.json | 157.57 kB | 25.7 kB |
en/shortcodes/emojibase.json | 157.82 kB | 30.03 kB |
en-gb/shortcodes/emojibase.json | 157.82 kB | 30.03 kB |
ru/shortcodes/cldr.json | 158.47 kB | 27.99 kB |
pt/shortcodes/cldr.json | 158.65 kB | 27.69 kB |
bn/shortcodes/cldr.json | 161.54 kB | 28.37 kB |
ms/shortcodes/cldr.json | 164.58 kB | 27.8 kB |
hi/shortcodes/cldr.json | 164.82 kB | 29.08 kB |
pl/shortcodes/cldr.json | 164.96 kB | 28.54 kB |
lt/shortcodes/cldr.json | 165.08 kB | 28.53 kB |
it/shortcodes/cldr.json | 165.37 kB | 28.28 kB |
ko/shortcodes/cldr-native.json | 165.47 kB | 29.22 kB |
es-mx/shortcodes/cldr.json | 165.53 kB | 28.01 kB |
fr/shortcodes/cldr.json | 165.83 kB | 28.03 kB |
es/shortcodes/cldr.json | 165.87 kB | 27.99 kB |
ko/shortcodes/cldr.json | 166.01 kB | 27.46 kB |
uk/shortcodes/cldr.json | 172.82 kB | 29.26 kB |
vi/shortcodes/cldr-native.json | 178.58 kB | 29.13 kB |
ru/shortcodes/cldr-native.json | 212.58 kB | 31.13 kB |
en/shortcodes/joypixels.json | 223.11 kB | 34.54 kB |
th/shortcodes/cldr-native.json | 235 kB | 31.32 kB |
uk/shortcodes/cldr-native.json | 238.37 kB | 32.93 kB |
hi/shortcodes/cldr-native.json | 269.67 kB | 33.37 kB |
bn/shortcodes/cldr-native.json | 279.18 kB | 32.56 kB |
File | Size | Gzipped |
---|---|---|
zh/messages.json | 6.2 kB | 1.93 kB |
zh-hant/messages.json | 6.2 kB | 1.93 kB |
en/messages.json | 6.5 kB | 1.59 kB |
en-gb/messages.json | 6.5 kB | 1.6 kB |
da/messages.json | 6.51 kB | 1.79 kB |
sv/messages.json | 6.51 kB | 1.8 kB |
ms/messages.json | 6.54 kB | 1.81 kB |
nb/messages.json | 6.56 kB | 1.8 kB |
ko/messages.json | 6.57 kB | 2.09 kB |
et/messages.json | 6.61 kB | 1.85 kB |
nl/messages.json | 6.64 kB | 1.82 kB |
de/messages.json | 6.65 kB | 1.91 kB |
it/messages.json | 6.65 kB | 1.83 kB |
fi/messages.json | 6.66 kB | 1.88 kB |
pl/messages.json | 6.66 kB | 1.99 kB |
pt/messages.json | 6.74 kB | 1.89 kB |
es-mx/messages.json | 6.76 kB | 1.89 kB |
es/messages.json | 6.77 kB | 1.9 kB |
ja/messages.json | 6.78 kB | 2.24 kB |
fr/messages.json | 6.78 kB | 1.91 kB |
hu/messages.json | 6.81 kB | 2 kB |
vi/messages.json | 6.85 kB | 2.12 kB |
lt/messages.json | 6.85 kB | 1.95 kB |
ru/messages.json | 7.84 kB | 2.33 kB |
uk/messages.json | 7.91 kB | 2.38 kB |
hi/messages.json | 8.51 kB | 2.31 kB |
bn/messages.json | 8.66 kB | 2.32 kB |
th/messages.json | 9.1 kB | 2.42 kB |
File | Size | Gzipped |
---|---|---|
meta/groups.json | 3.86 kB | 1.24 kB |
meta/unicode.json | 72.55 kB | 12.63 kB |
versions/unicode.json | 95.08 kB | 11.98 kB |
versions/emoji.json | 95.12 kB | 12.08 kB |
meta/unicode-names.json | 237.34 kB | 28.48 kB |
meta/hexcodes.json | 258.7 kB | 28.51 kB |
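As a rough guide to what the compact format saves, the sketch below compares the en/data.json and en/compact.json rows from the tables above. The savingsPercent helper is hypothetical. With these numbers, the saving comes to roughly a quarter before compression and roughly a tenth after gzip.

```typescript
// Sizes in kB, copied from the en/data.json and en/compact.json rows above.
const enFull = { raw: 735.59, gzipped: 89.22 };
const enCompact = { raw: 541.14, gzipped: 80.05 };

// Percentage saved by moving from the `from` size to the `to` size.
function savingsPercent(from: number, to: number): number {
  return (1 - to / from) * 100;
}

const rawSavings = savingsPercent(enFull.raw, enCompact.raw);
const gzippedSavings = savingsPercent(enFull.gzipped, enCompact.gzipped);

console.log(`raw: ${rawSavings.toFixed(1)}%, gzipped: ${gzippedSavings.toFixed(1)}%`);
```

Since gzip narrows the gap, the compact format matters most where datasets are stored or parsed uncompressed.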