Update intent Recognition Engineering + Testing Section
nikl216 committed Mar 26, 2024
1 parent 23c145d commit df4c379
Showing 7 changed files with 236 additions and 115 deletions.
62 changes: 61 additions & 1 deletion docs/Intent Recognition System/Engineering/Deployment.md
@@ -1,3 +1,63 @@
# Deployment

- Feature List- Kanban Board
## Introduction

This documentation outlines the deployment strategy for Noora Health's Intent Recognition System, specifically designed to enhance user communication via the TURN platform. The system operates through a webhook mechanism that triggers the Intent Recognition System upon receiving new messages, ensuring each is processed and acted upon based on its categorized intent.

## System Overview

The Intent Recognition System is configured within the TURN platform. It leverages webhook triggers to initiate intent recognition processes for incoming messages. This setup aims to automate the categorization and appropriate handling of user queries, focusing on streamlining responses to non-medical inquiries and efficiently escalating medical-related queries.

## Deployment Process

### Prerequisites

- TURN platform with operational webhook functionality.
- Access to the GPT-4 API for intent recognition tasks.
- Predefined sets of few-shot examples and response templates for various intent categories.

### Configuration Steps

#### Step 1: TURN Platform Setup

- **Objective**: Prepare the TURN platform to interface with the Intent Recognition System via webhooks.
- **Actions**:
  - Ensure that TURN's webhook functionality is enabled and properly configured to interact with the external Intent Recognition System.
  - Validate TURN API keys and webhook permissions for secure and reliable data exchange.
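
One common way to validate webhook calls is to check an HMAC signature computed over the raw request body. The sketch below assumes the sender signs the body with HMAC-SHA256 using a shared secret and transmits the base64-encoded digest in a request header. The secret, the header, and the function name `is_valid_signature` are assumptions for illustration and should be checked against TURN's webhook documentation.

```python
import base64
import hashlib
import hmac


def is_valid_signature(raw_body: bytes, received_signature: str, secret: str) -> bool:
    """Return True if received_signature matches HMAC-SHA256(secret, raw_body).

    Assumption: the signature is the base64-encoded digest of the raw body.
    """
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).digest()
    expected = base64.b64encode(digest)
    # compare_digest runs in constant time, so it does not leak where the
    # two values first differ.
    return hmac.compare_digest(expected, received_signature.encode("utf-8"))
```

Requests that fail this check would typically be rejected before any intent recognition work is done.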

#### Step 2: Intent Recognition System Integration

- **Objective**: Integrate the Intent Recognition System with TURN, enabling automated intent analysis upon message reception.
- **Actions**:
  - Configure the webhook in TURN to trigger the Intent Recognition System whenever a new message is received.
  - Load the Intent Recognition System with predefined few-shot examples and response templates to facilitate accurate intent recognition.
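
As a rough illustration of what the webhook receiver does first, the sketch below pulls text messages out of an incoming JSON body. The payload shape (a `messages` list whose items carry `id`, `from`, `type`, and `text.body`) is an assumption modeled on WhatsApp-style payloads, and `IncomingMessage` and `extract_text_messages` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class IncomingMessage:
    message_id: str
    sender: str
    text: str


def extract_text_messages(raw_body: bytes) -> List[IncomingMessage]:
    """Return the non-empty text messages found in a webhook payload."""
    payload = json.loads(raw_body)
    result = []
    for msg in payload.get("messages", []):
        # Assumption: only messages of type "text" are sent for classification.
        if msg.get("type") != "text":
            continue
        body = msg.get("text", {}).get("body", "")
        if body.strip():
            result.append(IncomingMessage(msg.get("id", ""), msg.get("from", ""), body))
    return result
```

Each returned message can then be handed to the classification step; media and empty messages are skipped in this sketch.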

#### Step 3: Multilingual and Multi-Intent Configuration

- **Objective**: Ensure the system can accurately process messages in multiple languages and categorize them into predefined intents.
- **Actions**:
  - Implement and configure a multilingual model within the Intent Recognition System to handle messages across various languages.
  - Test the system with sample messages in different languages to fine-tune the multilingual model and ensure accurate intent recognition.

#### Step 4: Testing and Validation

- **Objective**: Conduct comprehensive testing to ensure the system accurately categorizes messages and triggers the appropriate responses or escalations.
- **Actions**:
  - Perform end-to-end testing by simulating incoming messages that cover all predefined intent categories.
  - Assess the system's response accuracy and tweak the intent recognition logic as needed to improve performance.
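
A small harness along these lines can report accuracy per intent category from a labeled test set. This is a minimal sketch: `per_intent_accuracy` is a hypothetical helper, and `keyword_stub` is a stand-in classifier that exists only to make the example runnable; in practice the real classification call would be passed in.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple


def per_intent_accuracy(
    classify: Callable[[str], str],
    labeled_messages: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Return the fraction of correctly classified messages for each expected intent."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for text, expected in labeled_messages:
        total[expected] += 1
        if classify(text) == expected:
            correct[expected] += 1
    return {intent: correct[intent] / total[intent] for intent in total}


def keyword_stub(text: str) -> str:
    """Stand-in classifier, not the real model."""
    return "greeting" if "hello" in text.lower() else "medical_question"
```

Reporting accuracy per intent rather than overall makes it easier to spot a category, such as medical questions, that is being misclassified.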

#### Step 5: Deployment and Monitoring

- **Objective**: Deploy the webhook-triggered Intent Recognition System in a live environment and set up monitoring for continuous performance evaluation.
- **Actions**:
  - Implement the system in the live TURN environment, enabling real-time processing of user messages.
  - Utilize monitoring tools to track the system's performance, focusing on the accuracy of intent recognition and the effectiveness of automated responses or query escalations.
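
Monitoring could start with something as simple as counting classified intents and tracking how many messages are escalated. The sketch below is illustrative only; `IntentMonitor` and the `medical_question` label are assumed names, and a production setup would more likely export these counts to a dedicated monitoring tool.

```python
from collections import Counter
from typing import Dict, Iterable


class IntentMonitor:
    """Keeps in-memory counts of classified intents."""

    def __init__(self, escalation_intents: Iterable[str] = ("medical_question",)) -> None:
        self._counts: Counter = Counter()
        self._escalation = set(escalation_intents)

    def record(self, intent: str) -> None:
        self._counts[intent] += 1

    def distribution(self) -> Dict[str, int]:
        return dict(self._counts)

    def escalation_rate(self) -> float:
        """Share of recorded messages whose intent requires escalation."""
        total = sum(self._counts.values())
        if total == 0:
            return 0.0
        escalated = sum(self._counts[i] for i in self._escalation)
        return escalated / total
```

A sudden shift in the intent distribution or escalation rate is a cheap signal that classification quality should be reviewed.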

### Post-Deployment

- **Continuous Optimization**: Regularly review system performance data and user feedback to optimize intent recognition accuracy and response effectiveness.
- **Scalability and Expansion**: Plan for future scalability to accommodate an increasing volume of messages and the potential inclusion of more languages and intents.

## Conclusion

The deployment of the Intent Recognition System through a webhook on the TURN platform represents a significant step forward in Noora Health's ability to efficiently categorize and respond to user messages. This system ensures that MSEs can focus their efforts on providing high-quality, empathetic responses to medical inquiries, thereby enhancing overall user satisfaction and operational efficiency.
104 changes: 104 additions & 0 deletions docs/Intent Recognition System/Engineering/Development-copy.md
@@ -0,0 +1,104 @@
# Development

- High Risk Intent Classification
- Data Processing and Message Classification
- Integration with Chat and Messaging Platforms
- Continuous Learning and Model Improvement
- Architecture Overview (Technical Stack, Data Flow, APIs)

Intent Recognition: A Turn Stack/Journey triggers a custom webhook
served by the LLM chatbot application that performs the classification.

![Untitled](Untitled.png)

<!-- ![Untitled](../Engineering%20f06030dea04e40cf84573246d73d39f9/Untitled.png) -->

### System Components

### Message Reception and Pre-processing

Ingestion API: Use Python’s FastAPI framework to create RESTful
APIs that efficiently handle incoming messages from Turn’s webhook (or
Meta’s cloud business manager).

Pre-processing Service: Implement a service to sanitize and
standardize messages by removing special characters, converting text to
lowercase, and applying any other preprocessing steps needed to ensure
data quality for NLP/LLM analysis.
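
A minimal sketch of such a pre-processing function follows. It assumes that "special characters" means punctuation and symbols, and it keeps letters, digits, and combining marks in any script so that non-Latin text (for example Devanagari vowel signs) is not damaged. The function name `preprocess` is hypothetical.

```python
import unicodedata


def preprocess(text: str) -> str:
    """Lowercase the text, drop punctuation/symbols, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    kept = []
    for ch in text:
        major = unicodedata.category(ch)[0]
        if major in ("L", "M", "N"):
            # Letters, combining marks, and numbers are kept in every script.
            kept.append(ch)
        elif ch.isspace() or major in ("P", "S"):
            # Punctuation and symbols become spaces so words stay separated.
            kept.append(" ")
        # Anything else (control characters, etc.) is dropped.
    return " ".join("".join(kept).split())
```

Filtering by Unicode category instead of an ASCII-only pattern is a deliberate choice here, since the system is expected to handle messages in multiple languages.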

### NLP and Machine Learning for Intent Classification

Method 1: Easy and fast dev time, but a bit expensive.

GPT-4 Classification: Leverage GPT-4 for direct message
classification, bypassing the need for a separate translation layer.
This is suitable for straightforward intent recognition tasks.

Method 2: More dev time, but cheaper and less accurate.

Translation Services and NLP Libraries: For multilingual
support, integrate AI4Bharat’s transliteration and translation services,
followed by TensorFlow or PyTorch for machine learning. Use NLP
libraries like NLTK or spaCy for additional text processing and
classification into predefined intent categories.

Model Training: Employ supervised learning techniques with a
well-labeled dataset containing diverse examples of user messages mapped
to target intent categories.

Model Deployment: Package and deploy the trained model as a
microservice, utilizing Docker for enhanced scalability, portability,
and isolation.

### Action Handlers

Function Dispatcher: A crucial system component responsible for
mapping classified intents to corresponding action handlers, including
integration with GPT-4 Functions for dynamic response generation.

Handlers: Develop specific handlers for common intents such as
greetings, feedback, acknowledgments, marking spam, and managing user
language preferences. Ensure handlers update the user’s status or
preferences in a unified database for consistent communication.

Types: Greetings, Acknowledgments, Spam, Requests to change language,
Medical Questions

### Suggested FAQ:

Semantic, GPT-4-based retrieval provides users with suggested FAQs and
their answers while they wait for a medical response.

![Whatsapp response](img/whatsapp.png)

<!-- ![Untitled](Objective%2087c497e68c234d699d6825e2549b06ce/Untitled.png) -->

### Technical Stack and Tools

Backend and APIs: Build the backend infrastructure using Python,
leveraging FastAPI for its asynchronous support and ease of use in
creating RESTful services.

NLP/ML Components: Utilize GPT-4 for intent classification, TensorFlow
or PyTorch for building custom ML models, and NLP libraries such as NLTK
or spaCy for text analysis and preprocessing.

Database Management: Use SQLAlchemy for structured data storage
and intent categorization records. Use Redis for caching frequently
accessed data or responses to enhance system performance.

Suggested FAQ: WhatsApp templates with buttons/lists.

### Caveats

When classifying messages received through webhooks, it’s important
to note that this approach may overlook the broader context of
conversations. Specifically, accurate message classification often
necessitates access to the chat history, as some inputs can only be
properly understood and classified within their conversational
context.

Therefore, after establishing a foundational message classification
system, it’s advisable to enhance its accuracy by incorporating the
user’s chat history into the classification process. This integration
will allow for a more nuanced understanding of the messages, leading to
improved classification outcomes.
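
One possible way to bring chat history into classification is to keep the last few messages per user and prepend them to the text sent to the classifier. The sketch below is an assumption-laden illustration: `ChatHistory`, `with_history`, the in-memory storage, and the window of five messages are all hypothetical choices, not part of the described system.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List


class ChatHistory:
    """Keeps the most recent messages per user in memory."""

    def __init__(self, max_messages: int = 5) -> None:
        self._max = max_messages
        self._by_user: Dict[str, Deque[str]] = defaultdict(
            lambda: deque(maxlen=self._max)
        )

    def add(self, user_id: str, text: str) -> None:
        # The deque silently discards the oldest message once it is full.
        self._by_user[user_id].append(text)

    def context_for(self, user_id: str) -> List[str]:
        return list(self._by_user.get(user_id, []))


def with_history(history: List[str], new_message: str) -> str:
    """Combine earlier messages and the new message into one classifier input."""
    lines = [f"Earlier message: {m}" for m in history]
    lines.append(f"Message to classify: {new_message}")
    return "\n".join(lines)
```

With this context, a short reply such as "yes" can be interpreted in light of the question that preceded it.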
123 changes: 40 additions & 83 deletions docs/Intent Recognition System/Engineering/Development.md
@@ -1,107 +1,64 @@
# Development

## Overview

The application uses a Large Language Model (LLM), GPT-4, at its core for intent recognition. The process involves embedding input data, using a multilingual embedding model for dynamic example selection, and performing intent classification with a few-shot learning approach. The detailed steps below follow the flow chart.

## Detailed Process Flow

![Untitled](intent-engg-flow.png)

### 1. Embed Source of Truth File

- **Purpose:** Create embeddings for the few-shot examples that serve as the "source of truth."
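- **Procedure:** Pass the few-shot examples through the embedding model (see step 2), which generates a vector representation of each example. GPT-4 is used later for classification rather than for producing these vectors.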

### 2. Use Multilingual Embedding Model

- **Functionality:** Facilitate the understanding and processing of messages in multiple languages.
- **Selection:** Dynamically choose the relevant embeddings based on the language and content of the incoming query.

### 3. Receive Query for Classification

- **Reception:** The system accepts an incoming user query that needs to be classified.
- **Pre-processing:** Standardize the query to match the format expected by the model (e.g., lowercasing, removing special characters).

### 4. Find Similar Examples in Embedding Model

- **Search:** Within the multilingual embedding space, identify few-shot examples with high similarity to the received query.
- **Criteria:** Examples are selected based on semantic similarity and language relevance.
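
Steps 1 to 4 can be sketched together as an index of embedded examples plus a cosine-similarity search. The `toy_embed` function below is only a stand-in (hashed character trigrams) so that the example runs without external services; it is not multilingual in any meaningful sense, and a real multilingual embedding model would replace it. All names here are hypothetical.

```python
import math
import zlib
from typing import List, Tuple

DIM = 256  # size of the toy vectors; a real model fixes its own dimension
Example = Tuple[str, str]  # (message text, intent label)


def toy_embed(text: str) -> List[float]:
    """Stand-in embedding: counts of hashed character trigrams."""
    vec = [0.0] * DIM
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        vec[zlib.crc32(trigram.encode("utf-8")) % DIM] += 1.0
    return vec


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_index(examples: List[Example]) -> List[Tuple[List[float], str, str]]:
    """Step 1: embed every source-of-truth example once, ahead of time."""
    return [(toy_embed(text), text, intent) for text, intent in examples]


def top_k(query: str, index: List[Tuple[List[float], str, str]], k: int = 3) -> List[Example]:
    """Step 4: return the k examples most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(index, key=lambda row: cosine(q, row[0]), reverse=True)
    return [(text, intent) for _, text, intent in ranked[:k]]
```

Embedding the examples once and reusing the index means only the incoming query has to be embedded at request time.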

### 5. Add Retrieved Examples as Few-Shots to Prompt

- **Integration:** The selected few-shot examples are combined with the query to create a new prompt for GPT-4.
- **Contextualization:** This step ensures that the context for the classification is set correctly, which is particularly important for the few-shot learning approach.
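
The prompt assembly in this step might look like the sketch below. The intent identifiers in `INTENTS` are assumed names derived from the intent types listed in this documentation, and the exact prompt wording is illustrative rather than the wording used by the application.

```python
from typing import List, Tuple

# Assumed identifiers for the documented intent types.
INTENTS = ["greeting", "acknowledgment", "spam", "change_language", "medical_question"]


def build_prompt(query: str, few_shots: List[Tuple[str, str]]) -> str:
    """Combine instructions, retrieved few-shot examples, and the query."""
    lines = [
        "Classify the user message into exactly one of these intents: "
        + ", ".join(INTENTS) + ".",
        "Reply with the intent name only.",
        "",
    ]
    for text, intent in few_shots:
        lines.append(f"Message: {text}")
        lines.append(f"Intent: {intent}")
        lines.append("")
    # The prompt ends with an open "Intent:" slot for the model to fill in.
    lines.append(f"Message: {query}")
    lines.append("Intent:")
    return "\n".join(lines)
```

Keeping the examples and the query in the same `Message:` / `Intent:` layout makes it more likely that the model replies with a bare label.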

### 6. Send Message with Few-Shots for Classification

- **Classification Request:** The prompt consisting of the query and few-shot examples is fed to GPT-4.
- **GPT-4 Processing:** GPT-4 analyzes the combined input to understand the intent of the query by drawing parallels with the provided few-shot examples.

### 7. Classification

- **Outcome:** GPT-4 outputs the classification of the query's intent.
- **Post-Processing:** The result may then be used to trigger corresponding action handlers or responses within the application.
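
Because a language model returns free text, the output usually needs to be normalized and checked against the allowed labels before it triggers anything. The sketch below assumes that unrecognized outputs should fall back to the medical-question path so that a human reviews them; that fallback policy and the names used are assumptions.

```python
from typing import Iterable

FALLBACK_INTENT = "medical_question"  # assumed: unknown outputs go to a human


def parse_intent(
    model_output: str,
    allowed: Iterable[str],
    fallback: str = FALLBACK_INTENT,
) -> str:
    """Normalize the model's reply and map anything unexpected to the fallback."""
    cleaned = model_output.strip().strip(".\"'`").lower().replace(" ", "_")
    return cleaned if cleaned in set(allowed) else fallback
```

Falling back to escalation errs on the side of caution: a misparsed reply costs reviewer time instead of sending an automated answer to a medical query.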

## Action Handlers and Response Functions

- **Purpose:** Design specific functions/handlers to address common intents such as greetings, feedback, acknowledgments, marking spam, and language preference management.
- **Implementation:**
  - Develop a set of predefined response templates for each intent category.
  - Create a function dispatcher that routes the classified intents to their respective handlers.
- **Database Integration:**
  - Ensure each handler is capable of updating the user's status or preferences in the application's unified database.
  - This update mechanism is crucial for maintaining consistent communication and personalizing the user experience.
- **Dynamic Response Generation:**
  - Integrate with GPT-4 Functions to dynamically generate responses for intents that require more nuanced or context-specific information.
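
A function dispatcher of this kind can be as small as a dictionary from intent names to handler functions. In the sketch below, the handler names, the response texts, and the use of a plain dictionary as the user record (standing in for the unified database) are all illustrative assumptions.

```python
from typing import Callable, Dict

UserRecord = Dict[str, str]
Handler = Callable[[str, UserRecord], str]


def handle_greeting(text: str, user: UserRecord) -> str:
    return "Hello! How can we help you today?"


def handle_spam(text: str, user: UserRecord) -> str:
    user["status"] = "spam"  # stand-in for a database update
    return ""  # no reply is sent to spam


def handle_medical_question(text: str, user: UserRecord) -> str:
    user["status"] = "awaiting_medical_response"
    return "Thank you. A team member will respond to your question soon."


HANDLERS: Dict[str, Handler] = {
    "greeting": handle_greeting,
    "spam": handle_spam,
    "medical_question": handle_medical_question,
}


def dispatch(intent: str, text: str, user: UserRecord) -> str:
    """Route a classified intent to its handler; unknown intents are escalated."""
    handler = HANDLERS.get(intent, handle_medical_question)
    return handler(text, user)
```

Adding a new intent then only requires writing a handler and registering it in `HANDLERS`.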

## Additional Technical Considerations

- **Multilingual Capability:** The system must accurately handle and classify queries across different languages, necessitating the use of a robust multilingual model.
- **Dynamic Example Selection:** To accommodate the variability in user queries, the embedding model must be capable of selecting the most appropriate few-shot examples on the fly.
- **Few-Shot Learning:** This approach allows the application to effectively classify intents with minimal training data per category, as GPT-4 can generalize from a few examples.

By following these steps, the application leverages the powerful capabilities of GPT-4 for intent recognition, while also ensuring adaptability and accuracy through the use of multilingual embeddings and dynamic few-shot example selection.