Add Intent Recognition- Dev,Deploy and testing protocol

NooraHealth · Mar 27, 2024 · c1ca90c
1 parent e8c2d3e

Showing 5 changed files with 248 additions and 347 deletions.
diff --git a/docs/Intent Recognition System/Engineering/Deployment.md b/docs/Intent Recognition System/Engineering/Deployment.md
@@ -1,3 +1,63 @@
 # Deployment
 
-- Feature List- Kanban Board
+## Introduction
+
+This document outlines the deployment strategy for Noora Health's Intent Recognition System, which is designed to improve user communication via the TURN platform. A webhook triggers the system whenever a new message arrives, so that each message is categorized by intent and handled accordingly.
+
+## System Overview
+
+The Intent Recognition System is configured within the TURN platform. It leverages webhook triggers to initiate intent recognition processes for incoming messages. This setup aims to automate the categorization and appropriate handling of user queries, focusing on streamlining responses to non-medical inquiries and efficiently escalating medical-related queries.
+
+## Deployment Process
+
+### Prerequisites
+
+- TURN platform with operational webhook functionality.
+- Access to the GPT-4 API for intent recognition tasks.
+- Predefined sets of few-shot examples and response templates for various intent categories.
+
+### Configuration Steps
+
+#### Step 1: TURN Platform Setup
+
+- **Objective**: Prepare the TURN platform to interface with the Intent Recognition System via webhooks.
+- **Actions**:
+  - Ensure that TURN's webhook functionality is enabled and properly configured to interact with the external Intent Recognition System.
+  - Validate TURN API keys and webhook permissions for secure and reliable data exchange.
+
+#### Step 2: Intent Recognition System Integration
+
+- **Objective**: Integrate the Intent Recognition System with TURN, enabling automated intent analysis upon message reception.
+- **Actions**:
+  - Configure the webhook in TURN to trigger the Intent Recognition System whenever a new message is received.
+  - Load the Intent Recognition System with predefined few-shot examples and response templates to facilitate accurate intent recognition.
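
The webhook trigger described in Step 2 could be handled roughly as follows. This is a minimal sketch, not the project's implementation: the payload shape (a top-level `messages` list with `type`, `from`, `id`, and `text.body` fields) is an assumption modeled on WhatsApp-style webhook bodies and should be checked against the actual TURN webhook payload. The function name `extract_text_messages` is hypothetical.

```python
import json
from typing import Iterator


def extract_text_messages(raw_body: str) -> Iterator[dict]:
    """Yield id, sender and text for each inbound text message.

    Assumes a WhatsApp-style payload: {"messages": [{"type": "text", ...}]}.
    """
    payload = json.loads(raw_body)
    for msg in payload.get("messages", []):
        if msg.get("type") != "text":
            continue  # media, location, etc. are skipped in this sketch
        body = msg.get("text", {}).get("body", "").strip()
        if body:
            yield {"id": msg.get("id"), "sender": msg.get("from"), "text": body}


# Hypothetical sample payload with one text message and one image message.
sample = json.dumps({"messages": [
    {"id": "abc", "from": "911234567890", "type": "text",
     "text": {"body": " Thank you "}},
    {"id": "def", "from": "911234567890", "type": "image"},
]})
extracted = list(extract_text_messages(sample))
print(extracted)
```

Each extracted message would then be passed to the intent classifier; non-text messages need their own handling, which this sketch leaves out.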
+
+#### Step 3: Multilingual and Multi-Intent Configuration
+
+- **Objective**: Ensure the system can accurately process messages in multiple languages and categorize them into predefined intents.
+- **Actions**:
+  - Implement and configure a multilingual model within the Intent Recognition System to handle messages across various languages.
+  - Test the system with sample messages in different languages to fine-tune the multilingual model and ensure accurate intent recognition.
+
+#### Step 4: Testing and Validation
+
+- **Objective**: Conduct comprehensive testing to ensure the system accurately categorizes messages and triggers the appropriate responses or escalations.
+- **Actions**:
+  - Perform end-to-end testing by simulating incoming messages that cover all predefined intent categories.
+  - Assess the system's response accuracy and tweak the intent recognition logic as needed to improve performance.
+
+#### Step 5: Deployment and Monitoring
+
+- **Objective**: Deploy the webhook-triggered Intent Recognition System in a live environment and set up monitoring for continuous performance evaluation.
+- **Actions**:
+  - Implement the system in the live TURN environment, enabling real-time processing of user messages.
+  - Utilize monitoring tools to track the system's performance, focusing on the accuracy of intent recognition and the effectiveness of automated responses or query escalations.
+
+### Post-Deployment
+
+- **Continuous Optimization**: Regularly review system performance data and user feedback to optimize intent recognition accuracy and response effectiveness.
+- **Scalability and Expansion**: Plan for future scalability to accommodate an increasing volume of messages and the potential inclusion of more languages and intents.
+
+## Conclusion
+
+Deploying the Intent Recognition System through a webhook on the TURN platform lets Noora Health categorize and respond to user messages automatically. With non-medical messages handled by the system, Medical Support Executives (MSEs) can focus on providing high-quality, empathetic responses to medical inquiries.
diff --git a/docs/Intent Recognition System/Engineering/Development.md b/docs/Intent Recognition System/Engineering/Development.md
@@ -1,96 +1,64 @@
 # Development
 
-These are the key engineering divisions that we need to bring together for an improvised intent recognition system:
+## Overview
 
-- High Risk Intent Classification
-- Data Processing and Message Classification
-- Integration with Chat and Messaging Platforms
-- Continuous Learning and Model Improvement
+The application uses a large language model (LLM), GPT-4, for intent recognition. The process involves embedding the few-shot examples, using a multilingual embedding model for dynamic example selection, and classifying intent with a few-shot learning approach. The steps below follow the flow chart.
 
+## Detailed Process Flow
 
-![Untitled](Untitled.png)
+![Untitled](intent-engg-flow.png)
 
+### 1. Embed Source of Truth File
 
-<!-- ![Untitled](../Engineering%20f06030dea04e40cf84573246d73d39f9/Untitled.png) -->
+- **Purpose:** Create embeddings for the few-shot examples that serve as the "source of truth."
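+- **Procedure:** Pass the few-shot examples through an embedding model, which generates a vector representation of each example. GPT-4's chat interface does not return embeddings, so a separate embedding model (such as the multilingual model in step 2) is needed for this step.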
 
+### 2. Use Multilingual Embedding Model
 
-### System Components
+- **Functionality:** Facilitate the understanding and processing of messages in multiple languages.
+- **Selection:** Dynamically choose the relevant embeddings based on the language and content of the incoming query.
 
-### Message Reception and Pre-processing
+### 3. Receive Query for Classification
 
-Ingestion API: Utilize Python’s Fast API framework to create RESTful APIs that efficiently handle incoming messages from turn’s webhook (or Meta’s cloud business manager.)
+- **Reception:** The system accepts an incoming user query that needs to be classified.
+- **Pre-processing:** Standardize the query to match the format expected by the model (e.g., lowercasing, removing special characters).
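
A possible shape for the pre-processing step is sketched below. It is an illustration under assumptions, not the project's code: the function name `preprocess` is hypothetical, and the choice to keep Unicode combining marks is ours. That choice matters for multilingual input, because a naive "remove everything that is not a letter" filter strips the vowel signs that many Indic scripts depend on.

```python
import unicodedata


def preprocess(text: str) -> str:
    """Lowercase, drop punctuation/symbols, and collapse whitespace.

    Letters (L*), combining marks (M*) and numbers (N*) are kept so that
    non-Latin scripts such as Devanagari survive intact.
    """
    text = unicodedata.normalize("NFKC", text).lower()
    kept = []
    for ch in text:
        if unicodedata.category(ch)[0] in ("L", "M", "N"):
            kept.append(ch)
        else:
            kept.append(" ")  # punctuation, symbols, whitespace, controls
    return " ".join("".join(kept).split())


print(preprocess("  Hello!!   World?? "))
print(preprocess("नमस्ते!"))
```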
 
-Pre-processing Service: Implement a service to sanitize and
-standardize messages by removing special characters, converting text to
-lowercase, and other necessary pre processing steps to ensure data
-quality for NLP/LLM analysis.
+### 4. Find Similar Examples in Embedding Model
 
-### NLP and Machine Learning for Intent Classification:
+- **Search:** Within the multilingual embedding space, identify few-shot examples with high similarity to the received query.
+- **Criteria:** Examples are selected based on semantic similarity and language relevance.
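
The similarity search can be illustrated with plain cosine similarity over stored example vectors. This is a sketch under assumptions: the three-dimensional vectors are toy values standing in for real embeddings, and `top_k_examples` and the intent labels are hypothetical names. A production system would likely use a vector index rather than a linear scan.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_examples(query_vec, examples, k=3):
    """Return the k stored examples most similar to the query vector."""
    ranked = sorted(examples,
                    key=lambda ex: cosine(query_vec, ex["vector"]),
                    reverse=True)
    return ranked[:k]


# Toy vectors; real ones would come from the multilingual embedding model.
EXAMPLES = [
    {"text": "hello", "intent": "greeting", "vector": [0.9, 0.1, 0.0]},
    {"text": "thanks a lot", "intent": "acknowledgment", "vector": [0.1, 0.9, 0.0]},
    {"text": "my baby has a fever", "intent": "medical", "vector": [0.0, 0.1, 0.9]},
]
nearest = top_k_examples([0.8, 0.2, 0.1], EXAMPLES, k=2)
print([ex["intent"] for ex in nearest])
```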
 
-Intent Recognition: Turn stacks/Journey to trigger custom webhook hosting the LLM Chat Bot Application that does classification.
+### 5. Add Retrieved Examples as Few-Shots to Prompt
 
+- **Integration:** The selected few-shot examples are combined with the query to create a new prompt for GPT-4.
+- **Contextualization:** This step ensures that the context for the classification is set correctly, which is particularly important for the few-shot learning approach.
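
One way to assemble the few-shot prompt is sketched below. The prompt wording, the `build_prompt` name, and the intent labels are assumptions chosen for illustration; the actual prompt used with GPT-4 is not shown in this commit.

```python
def build_prompt(query, few_shots, intents):
    """Combine retrieved examples and the query into one classification prompt."""
    lines = [
        "Classify the user message into exactly one of these intents: "
        + ", ".join(intents) + ".",
        "Reply with the intent name only.",
        "",
    ]
    for ex in few_shots:
        lines.append(f"Message: {ex['text']}")
        lines.append(f"Intent: {ex['intent']}")
        lines.append("")
    # The query goes last, with the answer slot left open for the model.
    lines.append(f"Message: {query}")
    lines.append("Intent:")
    return "\n".join(lines)


shots = [
    {"text": "hello", "intent": "greeting"},
    {"text": "thanks a lot", "intent": "acknowledgment"},
]
prompt = build_prompt("good morning", shots,
                      ["greeting", "acknowledgment", "spam", "medical"])
print(prompt)
```

Ending the prompt on an open `Intent:` line keeps the expected answer short, which makes the model output easier to map back to a known label.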
 
-#### Method 1
+### 6. Send Message with Few-Shots for Classification
 
-Easy and Fast Dev Time but a bit Expensive:
+- **Classification Request:** The prompt consisting of the query and few-shot examples is fed to GPT-4.
+- **GPT-4 Processing:** GPT-4 analyzes the combined input to understand the intent of the query by drawing parallels with the provided few-shot examples.
 
-GPT-4 Classification: Leverage GPT-4 for direct message
-classification, bypassing the need for a separate translation layer,suitable for straightforward intent recognition tasks.More Dev time but cheaper and less accurate:
+### 7. Classification
 
-##### Method 2
+- **Outcome:** GPT-4 outputs the classification of the query's intent.
+- **Post-Processing:** The result may then be used to trigger corresponding action handlers or responses within the application.
 
-Translation services and NLP Libraries: For multilingual
-support, integrate ai4bharat’s transliteration and translation services followed by TensorFlow or PyTorch for machine learning. Use NLP libraries like NLTK or spaCy for additional text processing and classification into predefined intent categories.
+## Action Handlers and Response Functions
 
-Model Training: Employ supervised learning techniques with a
-well-labeled dataset containing diverse examples of user messages mapped
-to target intent categories.
+- **Purpose:** Design specific functions/handlers to address common intents such as greetings, feedback, acknowledgments, marking spam, and language preference management.
+- **Implementation:**
+  - Develop a set of predefined response templates for each intent category.
+  - Create a function dispatcher that routes the classified intents to their respective handlers.
+- **Database Integration:**
+  - Ensure each handler is capable of updating the user's status or preferences in the application's unified database.
+  - This update mechanism is crucial for maintaining consistent communication and personalizing the user experience.
+- **Dynamic Response Generation:**
+  - Integrate with GPT-4 Functions to dynamically generate responses for intents that require more nuanced or context-specific information.
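
The function dispatcher described above might look like the following sketch. All handler names, intent labels, and response strings are hypothetical, and database updates are omitted. Routing unknown labels to escalation is an assumption made here so that an unexpected model output never results in a medical query being answered automatically.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def handle_greeting(user_id: str) -> str:
    return "Hello! How can we help you today?"


def handle_acknowledgment(user_id: str) -> str:
    return "You're welcome!"


def handle_spam(user_id: str) -> str:
    return ""  # no reply; a real handler might flag the chat instead


def escalate_to_mse(user_id: str) -> str:
    return "A medical support executive will respond to you shortly."


HANDLERS: Dict[str, Handler] = {
    "greeting": handle_greeting,
    "acknowledgment": handle_acknowledgment,
    "spam": handle_spam,
    "medical": escalate_to_mse,
}


def dispatch(intent: str, user_id: str) -> str:
    """Route a classified intent to its handler; unknown labels escalate."""
    handler = HANDLERS.get(intent.strip().lower(), escalate_to_mse)
    return handler(user_id)


print(dispatch("Greeting ", "user-1"))
print(dispatch("something-unexpected", "user-1"))
```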
 
-Model Deployment: Package and deploy the trained model as a
-microservice, utilizing Docker for enhanced scalability, portability,
-and isolation.
+## Additional Technical Considerations
 
-### Action Handlers
+- **Multilingual Capability:** The system must accurately handle and classify queries across different languages, necessitating the use of a robust multilingual model.
+- **Dynamic Example Selection:** To accommodate the variability in user queries, the embedding model must be capable of selecting the most appropriate few-shot examples on the fly.
+- **Few-Shot Learning:** This approach allows the application to effectively classify intents with minimal training data per category, as GPT-4 can generalize from a few examples.
 
-Function Dispatcher: A crucial system component responsible for
-mapping classified intents to corresponding action handlers, including
-integration with GPT-4 Functions for dynamic response generation.
-
-Handlers: Develop specific handlers for common intents such as greetings, feedback, acknowledgments, marking spam, and managing user language preferences. Ensure handlers update the user’s status or preferences in a unified database for consistent communication.
-
-Types: Greetings, Acknowledgments ,Spam, Requests to change language, Medical Questions
-
-### Suggested FAQ:
-
-Semantic based gpt4 retrieval to provide user with suggested FAQ and its answers while the users wait for a medical response.
-
-![Whatsapp response](img/whatsapp.png)
-
-<!-- ![Untitled](Objective%2087c497e68c234d699d6825e2549b06ce/Untitled.png) -->
-
-### Technical Stack and Tools
-
-Backend and APIs: Build the backend infrastructure using Python,
-leveraging Fast API for its asynchronous support and ease of use in
-creating RESTful services. NLP/ML Components: Utilize GPT-4 for intent
-classification, TensorFlow or PyTorch for building custom ML models, and
-NLP libraries such as NLTK or spaCy for text analysis and preprocessing.
-Database Management: Implement SQL Alchemy for structured data storage
-and intent categorization records. Use Redis for caching frequently
-accessed data or responses to enhance system performance. Suggested FAQ:
-Whatsapp templates with buttons/list.
-
-### Caveats
-
-When classifying messages received through webhooks, it’s important
-to note that this approach may overlook the broader context of
-conversations. Specifically, accurate message classification often
-necessitates access to the chat history, as some inputs can only be
-properly understood and classified within their conversational
-context.
-
-Therefore, after establishing a foundational message classification
-system, it’s advisable to enhance its accuracy by incorporating the
-user’s chat history into the classification process. This integration
-will allow for a more nuanced understanding of the messages, leading to
-improved classification outcomes.
+Following these steps, the application uses GPT-4 for intent recognition and relies on multilingual embeddings and dynamic few-shot example selection to stay accurate across languages and query types.
diff --git a/docs/Intent Recognition System/Engineering/intent-engg-flow.png b/docs/Intent Recognition System/Engineering/intent-engg-flow.png
diff --git a/docs/Intent Recognition System/Testing Procedures.md b/docs/Intent Recognition System/Testing Procedures.md
@@ -1,41 +1,40 @@
-# Testing Procedures
+# Testing Plan
 
-### Protocols for Testers
+The testing of the Intent Recognition System will be conducted in three distinct phases. Each phase is designed to progressively validate the system's accuracy and efficiency in categorizing incoming messages, with an increasing level of automation and integration into the daily operations of Medical Support Executives (MSEs).
 
-- Engage a diverse group of testers, including those fluent in the system’s supported languages and individuals with healthcare knowledge,
-to manually input queries and evaluate the system’s responses for accuracy, relevance, and usefulness.
-- Testers should also assess the user interface’s usability, particularly the effectiveness and accessibility of the feedback
-mechanisms. ### Protocols for Feedback Collection
-- For each query response provided by the MSE Assistant, include a “thumbs up” (satisfied) or “thumbs down” (dissatisfied) option, allowing
-users to quickly express their satisfaction level with the
-response.
-- Alongside the satisfaction indicator, provide an optional input field where users can elaborate on their feedback, offering insights
-into what they found helpful or areas where the response fell short. This input field can capture valuable details on the user’s experience,
-expectations, and specific issues encountered.
+## Test Phase 1: Observation Mode
 
-### Protocols for Iterative Improvement
+**Objective**: Silently evaluate the accuracy of the Intent Recognition System by comparing its categorizations against those made by MSEs without affecting current operations.
 
-### Analysis of Feedback
+### Steps:
 
-- Regularly analyze the feedback, categorizing it based on the type of query, the nature of the feedback (positive or negative), and any
-specific suggestions or issues highlighted by users.
-- Pay particular attention to repeated patterns of dissatisfaction or specific areas where users consistently request improvements.
+1. **Configuration**: Set up the Intent Recognition System to run in observation mode, where it categorizes incoming messages in the background.
+2. **Normal Operation**: Allow MSEs to label incoming messages as per the existing process without awareness of the system's categorizations.
+3. **Data Collection**: Collect data on the labels assigned by MSEs and the categorizations made by the Intent Recognition System for the same messages.
+4. **Analysis**: Compare the system's categorizations against MSE labels to assess accuracy, identify patterns in discrepancies, and adjust the system accordingly.
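
The Phase 1 analysis step can be sketched as a comparison of paired labels. The data layout (one `(mse_label, system_label)` pair per message) and the name `compare_labels` are assumptions for illustration; the sample pairs are made up and are not results from the system.

```python
from collections import Counter


def compare_labels(pairs):
    """Return overall agreement and the most frequent disagreement patterns.

    Each pair is (mse_label, system_label) for the same message.
    """
    pairs = list(pairs)
    total = len(pairs)
    agree = sum(1 for mse, system in pairs if mse == system)
    discrepancies = Counter((mse, system) for mse, system in pairs
                            if mse != system)
    return {
        "accuracy": agree / total if total else 0.0,
        "discrepancies": discrepancies.most_common(),
    }


# Made-up pairs purely to show the output shape.
report = compare_labels([
    ("medical", "medical"),
    ("greeting", "greeting"),
    ("medical", "acknowledgment"),
    ("spam", "spam"),
])
print(report)
```

Listing disagreements as (MSE label, system label) pairs makes it easy to spot the costly direction, where MSEs marked a message as medical but the system did not.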
 
-### System Refinement
+## Test Phase 2: UI Labeling
 
-- Utilize the feedback to make targeted improvements to the MSE Assistant. This could involve enhancing the translation layer for better
-accuracy, updating the medical FAQ bank to address gaps in information, or refining the response generation process for greater relevance and
-clarity.
-- Prioritize updates based on the frequency and severity of the feedback, ensuring that changes are likely to have a significant impact
-on user satisfaction.
+**Objective**: Test the system's integration with the user interface by displaying its categorizations to MSEs, allowing them to correct any inaccuracies.
 
-### Routine Review and Adjustment
+### Steps:
 
-- Establish a process for routine review of the feedback collection and analysis system itself, ensuring that it remains effective in
-capturing and categorizing user insights.
-- Adjust the feedback mechanism as needed to ensure it encourages maximum user participation and captures high-quality, actionable
-feedback. Correctness , consistency. 
+1. **UI Integration**: Modify the TURN platform's UI to display labels assigned by the Intent Recognition System for each chat.
+2. **Human in the Loop**: Instruct MSEs to review the system's categorizations and add or adjust labels if the categorization is found to be incorrect.
+3. **Feedback Loop**: Collect feedback from MSEs on the system's accuracy and the intuitiveness of correcting categorizations.
+4. **System Refinement**: Use MSE feedback and correction patterns to refine the intent recognition algorithms and improve categorization accuracy.
 
-### Protocols for Evaluation and Logging
+## Test Phase 3: Medical vs. Non-Medical Segregation
 
-(to be added)
+**Objective**: Fully integrate the Intent Recognition System into the operational flow, automatically segregating non-medical messages and allowing MSEs to focus on medically relevant queries.
+
+### Steps:
+
+1. **Automatic Segregation**: Configure the Intent Recognition System to automatically assign non-medical messages to a separate bucket, directing only medically relevant questions to MSEs.
+2. **Teletrainer Review**: Assign a team of teletrainers to regularly check the non-medical chat bucket for classification errors, ensuring no medical queries are incorrectly categorized.
+3. **Action Based on Classification**: Enable the Intent Recognition System to respond to or take action on non-medical messages based on their classification (e.g., automated responses for greetings or spam).
+4. **Evaluation and Adjustment**: Monitor the system's performance in segregating messages and the accuracy of automated actions, making necessary adjustments to improve reliability and efficiency.
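
For the teletrainer review in Phase 3, one option is to review a random sample of the non-medical bucket and track how often a medical query was missed. This is a sketch under assumptions: the 10% sampling fraction, the function names, and the boolean review format are all hypothetical choices, not part of the documented protocol.

```python
import random


def sample_for_review(bucket, fraction=0.1, seed=None):
    """Pick a random subset of the non-medical bucket for manual review."""
    if not bucket:
        return []
    k = max(1, round(len(bucket) * fraction))
    return random.Random(seed).sample(bucket, k)


def missed_medical_rate(review_flags):
    """Share of reviewed messages that were actually medical.

    review_flags holds one bool per reviewed message: True when the
    teletrainer found a medical query in the non-medical bucket.
    """
    return sum(review_flags) / len(review_flags) if review_flags else 0.0


to_review = sample_for_review(list(range(50)), fraction=0.1, seed=1)
print(len(to_review))
print(missed_medical_rate([True, False, False, False]))
```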
+
+## Overall Testing Strategy
+
+The phased testing approach allows for a careful evaluation of the Intent Recognition System's performance, starting from a non-intrusive observation mode to a full integration with operational processes. This strategy ensures that adjustments can be made before full deployment, minimizing the impact on MSEs' workload and maintaining the quality of user interactions. Regular feedback loops and data analysis are critical at each phase to refine the system's algorithms and ensure that it meets the operational needs of Noora Health.