diff --git a/README.md b/README.md
index 3a471ec..44bf68f 100644
--- a/README.md
+++ b/README.md
@@ -14,6 +14,15 @@
 [![codecov](https://codecov.io/gh/joweich/chat-miner/branch/main/graph/badge.svg?token=6EQF0YNGLK)](https://codecov.io/gh/joweich/chat-miner)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 
+🌐
+**English**
+[Русский][RU]
+
+[EN]:README.md
+[RU]:README.ru.md
+
+-----------------
+
 **chat-miner** provides lean parsers for every major platform transforming chats into pandas dataframes.
 Artistic visualizations allow you to explore your data and create artwork from your chats.
diff --git a/README.ru.md b/README.ru.md
new file mode 100644
index 0000000..7e1161e
--- /dev/null
+++ b/README.ru.md
@@ -0,0 +1,157 @@
+chat-miner: turn your chats into artwork
+
+-----------------
+
+# chat-miner: Turn your chats into art!
+
+[![PyPI Version](https://img.shields.io/pypi/v/chat-miner.svg)](https://pypi.org/project/chat-miner/)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![Downloads](https://static.pepy.tech/badge/chat-miner/month)](https://pepy.tech/project/chat-miner)
+[![codecov](https://codecov.io/gh/joweich/chat-miner/branch/main/graph/badge.svg?token=6EQF0YNGLK)](https://codecov.io/gh/joweich/chat-miner)
+[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
+
+🌐
+[English][EN]
+**Русский**
+
+[EN]:README.md
+[RU]:README.ru.md
+
+-----------------
+
+**chat-miner** provides efficient parsers for every major platform that turn chats into pandas dataframes.
+Artistic visualizations let you explore your chat data and create artwork from it.
+
+
+## 1. Installation
+The latest release, including dependencies, can be installed from PyPI:
+```sh
+pip install chat-miner
+```
+If you are interested in contributing, running the latest source code, or just like to build everything yourself:
+```sh
+git clone https://github.com/joweich/chat-miner.git
+cd chat-miner
+pip install -r requirements.txt
+```
+
+## 2. Exporting chats
+See the official guides for [WhatsApp](https://faq.whatsapp.com/1180414079177245/), [Signal](https://github.com/carderne/signal-export), [Telegram](https://telegram.org/blog/export-and-more), [Facebook Messenger](https://www.facebook.com/help/messenger-app/713635396288741) or [Instagram Chats](https://help.instagram.com/181231772500920) to learn how to export chats for your platform.
+
+## 3. Parsing
+The code below shows the ``WhatsAppParser`` module in use.
+``SignalParser``, ``TelegramJsonParser``, ``FacebookMessengerParser`` and ``InstagramJsonParser`` are used in the same way.
+```python
+from chatminer.chatparsers import WhatsAppParser
+
+parser = WhatsAppParser(FILEPATH)
+parser.parse_file()
+df = parser.parsed_messages.get_df()
+```
+**Note:**
+Depending on your OS, Python may require the file path to be written as a raw string.
+```python
+import os
+FILEPATH = r"C:\Users\Username\chat.txt" # Windows
+FILEPATH = "/home/username/chat.txt" # Unix
+assert os.path.isfile(FILEPATH)
+```
+
+## 4. Visualization
+```python
+import chatminer.visualizations as vis
+import matplotlib.pyplot as plt
+```
+### 4.1 Heatmap: Message count per day
+```python
+fig, ax = plt.subplots(2, 1, figsize=(9, 3))
+ax[0] = vis.calendar_heatmap(df, year=2020, cmap='Oranges', ax=ax[0])
+ax[1] = vis.calendar_heatmap(df, year=2021, linewidth=0, monthly_border=True, ax=ax[1])
+```
+
+### 4.2 Sunburst: Message count by time of day
+```python
+fig, ax = plt.subplots(1, 2, figsize=(7, 3), subplot_kw={'projection': 'polar'})
+ax[0] = vis.sunburst(df, highlight_max=True, isolines=[2500, 5000], isolines_relative=False, ax=ax[0])
+ax[1] = vis.sunburst(df, highlight_max=False, isolines=[0.5, 1], color='C1', ax=ax[1])
+```
+
+### 4.3 Word cloud: Word frequencies
+```python
+fig, ax = plt.subplots(figsize=(8, 3))
+stopwords = ['these', 'are', 'stopwords']
+kwargs = {"background_color": "white", "width": 800, "height": 300, "max_words": 500}
+ax = vis.wordcloud(df, ax=ax, stopwords=stopwords, **kwargs)
+```
+
+### 4.4 Radar chart: Message count by weekday
+```python
+if not vis.is_radar_registered():
+    vis.radar_factory(7, frame="polygon")
+fig, ax = plt.subplots(1, 2, figsize=(7, 3), subplot_kw={'projection': 'radar'})
+ax[0] = vis.radar(df, ax=ax[0])
+ax[1] = vis.radar(df, ax=ax[1], color='C1', alpha=0)
+```
+
+## 5. Natural language processing
+
+### 5.1 Add sentiment
+
+```python
+from chatminer.nlp import add_sentiment
+
+df_sentiment = add_sentiment(df)
+```
+### 5.2 Example plot: Sentiment per author in a group chat
+
+```python
+df_grouped = df_sentiment.groupby(['author', 'sentiment']).size().unstack(fill_value=0)
+ax = df_grouped.plot(kind='bar', stacked=True, figsize=(8, 3))
+```
+
+
+## 6. Command line interface
+The command line interface supports parsing chats into CSV files.
+Currently, visualizations **cannot** be created directly from the command line.
+
+Usage example:
+```bash
+$ chatminer -p whatsapp -i exportfile.txt -o output.csv
+```
+
+Usage reference:
+```
+usage: chatminer [-h] [-p {whatsapp,instagram,facebook,signal,telegram}] [-i INPUT] [-o OUTPUT]
+
+options:
+  -h, --help
+                        Show this help message and exit
+  -p {whatsapp,instagram,facebook,signal,telegram}, --parser {whatsapp,instagram,facebook,signal,telegram}
+                        The platform from which the chats are imported
+  -i INPUT, --input INPUT
+                        Input file to be processed
+  -o OUTPUT, --output OUTPUT
+                        Output file for the results
+```
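
As an illustration of the CLI usage text above, the following sketch assembles a `chatminer` invocation from Python, which can be handy when several exports need converting in one script. The helper `build_chatminer_command` and the `SUPPORTED_PARSERS` tuple are hypothetical names, not part of chat-miner. Only the `-p`, `-i` and `-o` flags and the parser names are taken from the usage text. Actually running the command assumes the `chatminer` executable is on your `PATH`.

```python
import shlex

# Parser names copied from the CLI usage text; the tuple itself is hypothetical.
SUPPORTED_PARSERS = ("whatsapp", "instagram", "facebook", "signal", "telegram")


def build_chatminer_command(parser, input_file, output_file):
    """Return the argument list for one chatminer CLI call (hypothetical helper)."""
    if parser not in SUPPORTED_PARSERS:
        raise ValueError(f"unsupported parser: {parser!r}")
    return ["chatminer", "-p", parser, "-i", str(input_file), "-o", str(output_file)]


# Build the same call as the usage example and show it as a shell-quoted string.
command = build_chatminer_command("whatsapp", "exportfile.txt", "output.csv")
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it without going through a shell, which sidesteps quoting problems for paths that contain spaces.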