Skip to content

Commit

Permalink
feat(RegEx): Add Remove emojis from text
Browse files Browse the repository at this point in the history
  • Loading branch information
FlorentLvr committed Dec 4, 2023
1 parent c1b5dbf commit 70d6716
Showing 1 changed file with 252 additions and 0 deletions.
252 changes: 252 additions & 0 deletions RegEx/RegEx_Remove_emojis_from_text.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,252 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a5079ec3-1272-4ba9-8087-9ca2775b875c",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"<img width=\"10%\" alt=\"Naas\" src=\"https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160\"/>"
]
},
{
"cell_type": "markdown",
"id": "c2fb4534-d4bc-4a8d-aa7e-38fd8cbbb8e8",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"# RegEx - Remove emojis from text"
]
},
{
"cell_type": "markdown",
"id": "574b8aa0-d7f4-4968-bf81-cb9cb16fc2a9",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"**Tags:** #regex #text #emoji #remove #string #python"
]
},
{
"cell_type": "markdown",
"id": "88c2e429-f0e0-44e1-a4ff-9ce3ea2ba00a",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"**Author:** [Florent Ravenel](https://www.linkedin.com/in/florent-ravenel/)"
]
},
{
"cell_type": "markdown",
"id": "fba0b1cc-4994-44a2-9907-b29f6d97ebe7",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"**Last update:** 2023-12-04 (Created: 2023-12-04)"
]
},
{
"cell_type": "markdown",
"id": "94759cd1-345e-404e-a314-91c437f218bf",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"**Description:** This notebook will show how to remove emojis from a text using RegEx and Python. It is usefull for organizations that need to clean text from emojis."
]
},
{
"cell_type": "markdown",
"id": "696eb3c2-cfbf-4774-9212-8b6c561366c1",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"**References:**\n- [Regular Expressions - Python Documentation](https://docs.python.org/3/library/re.html)\n- [Remove Emojis from Text - Stack Overflow](https://stackoverflow.com/questions/33404752/removing-emojis-from-a-string-in-python)"
]
},
{
"cell_type": "markdown",
"id": "6dec562d-c2e4-428e-99e3-4e4b296531dd",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"## Input"
]
},
{
"cell_type": "markdown",
"id": "f3212bcb-8f14-48e8-851c-5641f82f2472",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"### Import libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ae43c8fe-0b1b-4d08-9288-f57bd10e33e8",
"metadata": {
"papermill": {},
"tags": []
},
"source": "import re",
"outputs": []
},
{
"cell_type": "markdown",
"id": "331c8e10-7d6b-4ca8-8fa4-9fb9703d8d66",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"### Setup variables\n- `text`: Text containing emojis"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f34796c6-2a63-420e-b9aa-908dcd5e3d06",
"metadata": {
"papermill": {},
"tags": []
},
"source": "text = \"This is a text with emojis \ud83d\ude0a\ud83d\ude0a\ud83d\ude0a\"",
"outputs": []
},
{
"cell_type": "markdown",
"id": "f761910f-065c-4283-9862-b7a3d4e0bda3",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"## Model"
]
},
{
"cell_type": "markdown",
"id": "dcfa31dd-e669-4560-ab25-39a9d44acdf6",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"### Remove emojis from text"
]
},
{
"cell_type": "markdown",
"id": "62834ce1-dee1-4dee-a047-2cca8f1e6ba5",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"This function will remove all emojis from a text using RegEx."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9f88fa6-8e9c-4a2d-a7ea-630c4b03fe18",
"metadata": {
"papermill": {},
"tags": []
},
"source": "def remove_emojis(text):\n return re.sub(r\"[^\\w\\s]\", \"\", text)",
"outputs": []
},
{
"cell_type": "markdown",
"id": "97a7aac6-6f66-451c-800e-450ca28f6bba",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"## Output"
]
},
{
"cell_type": "markdown",
"id": "61492b45-5d3b-4da5-9adf-016273345594",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
"### Display result"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "58b4118c-3a12-4bdd-8425-bb2d02039619",
"metadata": {
"papermill": {},
"tags": []
},
"source": "print(remove_emojis(text))",
"outputs": []
},
{
"cell_type": "markdown",
"id": "f527d917-519a-47ab-93b8-1264ca217b80",
"metadata": {
"papermill": {},
"tags": []
},
"source": [
" "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

0 comments on commit 70d6716

Please sign in to comment.