Initial commit

laurybeth · Sep 8, 2023 · 0ab044b
commit 0ab044b
Showing 11 changed files with 5,594 additions and 0 deletions.
diff --git a/.eslintrc b/.eslintrc
@@ -0,0 +1,17 @@
+{
+  "env": {
+    "node": true
+  },
+  "parserOptions": {
+    "ecmaVersion": 2018,
+    "sourceType": "module"
+  },
+  "extends": "eslint:recommended",
+  "rules": {
+    "no-console": "warn",
+    "no-var": "error",
+    "prefer-const": "error",
+    "eqeqeq": "error",
+    "indent": ["error", 2]
+  }
+}
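The rules in this `.eslintrc` are easier to read next to code that follows them. The snippet below is a minimal sketch, not code from the commit: `isSameId` and `countMatches` are hypothetical helpers written only to show `no-var`, `prefer-const`, `eqeqeq`, and the two-space `indent` rule in practice.

```javascript
// Hypothetical helpers (not part of the repository), written to follow the
// rules configured in .eslintrc:
//   "no-var" / "prefer-const": use `const`, and `let` only when reassigning
//   "eqeqeq": always compare with `===` / `!==`
//   "indent": two spaces per level

const isSameId = (a, b) => {
  // With `==`, isSameId(1, '1') would be true because of type coercion.
  return a === b;
};

const countMatches = (ids, target) => {
  let count = 0; // reassigned in the loop, so `let` is appropriate here
  for (const id of ids) {
    if (isSameId(id, target)) {
      count += 1;
    }
  }
  return count;
};
```

Writing `var count = 0` or `a == b` in the same functions would be reported as errors under this configuration, while a `console.log` call would only produce a warning.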
diff --git a/.github/workflows/test-flowise-flow.yml b/.github/workflows/test-flowise-flow.yml
@@ -0,0 +1,38 @@
+name: Test Flowise Flow
+
+on:
+  push:
+    branches: ['main']
+  pull_request:
+    branches: ['main']
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+
+    strategy:
+      matrix:
+        node-version: [18.x]
+
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Use Node.js ${{ matrix.node-version }}
+        uses: actions/setup-node@v3
+        with:
+          node-version: ${{ matrix.node-version }}
+
+      - name: install flowise
+        run: npm install -g flowise
+
+      - name: install wait-on
+        run: npm install -g wait-on
+
+      - name: install dependencies
+        run: npm install
+
+      - name: main
+        run: npx flowise start &
+          wait-on tcp:3000 &&
+          OPENAI_API_KEY=${{ secrets.OPENAI_API_KEY }} npm run load &&
+          npm test
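The `main` step starts Flowise in the background and then blocks on `wait-on tcp:3000` until the server accepts connections. The sketch below illustrates that idea with Node's built-in `net` module. It is not how `wait-on` is implemented; `canConnect`, `waitForPort`, and the timeout values are assumptions made for illustration.

```javascript
const net = require('node:net');

// Try to open a TCP connection once; resolve true if it succeeds.
const canConnect = (port, host = '127.0.0.1') =>
  new Promise((resolve) => {
    const socket = net.createConnection({ port, host });
    socket.once('connect', () => {
      socket.end();
      resolve(true);
    });
    socket.once('error', () => {
      socket.destroy();
      resolve(false);
    });
  });

// Poll until the port accepts connections or the timeout elapses.
// The default timeout and interval are illustrative values.
const waitForPort = async (port, { timeoutMs = 30000, intervalMs = 250 } = {}) => {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await canConnect(port)) {
      return true;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
};
```

Calling `waitForPort(3000)` before loading the flow would play the same role as the `wait-on tcp:3000` command in the workflow: later commands run only once the server is reachable.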
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,16 @@
+coverage/
+node_modules/
+.DS_Store
+*.log
+
+# Editor directories and files
+.idea
+.vscode
+*.suo
+*.ntvs*
+*.njsproj
+*.sln
+*.sw*
+/test-results/
+/playwright-report/*
+/playwright/.cache/
diff --git a/README.md b/README.md
@@ -0,0 +1,128 @@
+# Chat with your files
+
+## Table of contents
+
+- [1. General considerations](#1-general-considerations)
+- [2. Preamble](#2-preamble)
+- [3. Project summary](#3-project-summary)
+- [4. Learning objectives](#4-learning-objectives)
+- [5. Acceptance criteria](#5-acceptance-criteria)
+- [6. Getting started](#6-getting-started)
+- [7. Validate your solution](#7-validate-your-solution)
+- [8. Resources](#8-resources)
+
+***
+
+## 1. General considerations
+
+- We complete this project **individually**.
+- The estimated time to complete the project is 1 to 2 sprints.
+
+## 2. Preamble
+
+In recent years, artificial intelligence (AI) has come to play a fundamental
+role in practically every aspect of our lives. It is no longer limited to
+the field of information technology: people have developed "intelligent"
+algorithms that are able to "understand" in a way similar to human beings.
+These algorithms can write essays, organize information, and even hold
+coherent conversations.
+
+The ability to direct AI effectively and to understand the different kinds of
+tools that make up its ecosystem is an exceptionally powerful skill, because
+it lets us make the most of AI's capabilities to automate tasks, optimize
+processes, generate high-quality content, analyze data, and much more.
+
+## 3. Project summary
+
+You will build a chatbot that can answer questions based on the contents of a
+document, for example a `txt` or `pdf` file. To do this, you can use
+[Flowise](https://flowiseai.com/) to extend the functionality of a
+traditional chatbot: you will create a chatflow with the available tools to
+give your chatbot these capabilities.
+
+![image](https://github.com/Laboratoria/DEV006-md-links/assets/5282075/2ef997e5-22b8-4f92-b4a0-9d000e31c4f1)
+
+## 4. Learning objectives
+
+Become familiar with the main concepts around
+[Generative Artificial Intelligence](https://es.wikipedia.org/wiki/Inteligencia_artificial_generativa)
+and work with [Flowise](https://docs.flowiseai.com/) to implement
+AI solutions exposed through an API.
+Flowise is a tool built on [LangChain](https://docs.langchain.com/docs/),
+so you should also be able to understand the fundamental concepts of that
+tool.
+
+- [ ] [Flowise basics](https://www.youtube.com/watch?v=tD6fwQyUIJE&list=PL4HikwTaYE0HDOuXMm5sU6DH6_ZrHBLSJ)
+- [ ] [Langchain Components](https://docs.langchain.com/docs/category/components)
+- [ ] [Chat models](https://docs.flowiseai.com/chat-models)
+- [ ] [Chat Flows Basics](https://www.youtube.com/watch?v=fn4GCZuiwdk&list=PL4HikwTaYE0HDOuXMm5sU6DH6_ZrHBLSJ&index=3)
+
+- [ ] Document Loaders
+  + [Flowise: Document loaders](https://docs.flowiseai.com/document-loaders)
+  + [Langchain Concepts: Document loaders](https://docs.langchain.com/docs/components/indexing/document-loaders)
+- [ ] Text Splitters
+  + [Text Splitters demo](https://www.youtube.com/watch?v=kMtf9sNIcao&list=PL4HikwTaYE0HDOuXMm5sU6DH6_ZrHBLSJ&index=3)
+  + [Langchain Concepts: Text Splitters](https://docs.langchain.com/docs/components/indexing/text-splitters)
+- [ ] [Embeddings](https://docs.flowiseai.com/embeddings/azure-openai-embeddings)
+- [ ] Vector Stores:
+  + [Flowise: Vector Stores](https://docs.flowiseai.com/vector-stores)
+  + [Langchain Concepts: Vector Stores](https://docs.langchain.com/docs/components/indexing/vectorstore)
+
+## 5. Acceptance criteria
+
+1. Configure your chatflow so that it accepts the upload of at least 1
+   text file, in `txt` or `pdf` format.
+
+2. Use the `gpt-3.5-turbo` model.
+
+3. The generated chatbot must be able to answer questions using the
+   information in the uploaded file(s).
+
+4. Your GitHub Actions must pass successfully.
+
+5. You must use at least the following nodes:
+
+- Conversational Retrieval QA Chain
+- Document Loaders
+- Text Splitters
+- Vector Stores
+- Embeddings
+- Memory
+- Conversational Agent
+
+## 6. Getting started
+
+### Install Flowise
+
+Follow the instructions to install [flowise](https://github.com/FlowiseAI/Flowise) globally:
+
+```bash
+npm install -g flowise
+
+npx flowise start
+```
+
+## 7. Validate your solution
+
+You will need to define an environment variable named `OPENAI_API_KEY` and
+set it to the OpenAI API key you will use.
+
+Before running the tests, copy the export file of your implemented flow into
+the `/test` folder with the name `flow.json`.
+
+```bash
+OPENAI_API_KEY=<TODO: set your API key> npm test
+```
+
+Once Flowise is running, you can access it [here](http://localhost:3000/).
+
+You will use this tool to create and configure your own AI application,
+which you can use through the provided UI and also through HTTP
+requests.
+
+## 8. Resources
+
+- [YouTube tutorial series (in English)](https://www.youtube.com/watch?v=tD6fwQyUIJE&list=PL4HikwTaYE0HDOuXMm5sU6DH6_ZrHBLSJ)
+- [Web Scrape QnA](https://docs.flowiseai.com/use-cases/web-scrape-qna)
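Section 7 of the README notes that the application can be used through HTTP requests as well as the UI. The sketch below shows what such a request might look like. It assumes Flowise's prediction endpoint (`POST /api/v1/prediction/<chatflow-id>` with a JSON body containing `question`), which should be checked against the Flowise documentation for your installed version. `buildPredictionRequest`, `askChatflow`, and the chatflow id are hypothetical names chosen for illustration.

```javascript
// Build the URL and fetch options for a prediction request.
// ASSUMPTION: the endpoint path and the `question` field follow Flowise's
// documented prediction API; verify them for your Flowise version.
const buildPredictionRequest = (baseUrl, chatflowId, question) => ({
  url: `${baseUrl}/api/v1/prediction/${encodeURIComponent(chatflowId)}`,
  options: {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ question }),
  },
});

// Send a question to a running chatflow and return the parsed JSON response.
// The global `fetch` used here is available in Node.js 18 and later.
const askChatflow = async (chatflowId, question, baseUrl = 'http://localhost:3000') => {
  const { url, options } = buildPredictionRequest(baseUrl, chatflowId, question);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Flowise responded with HTTP ${response.status}`);
  }
  return response.json();
};
```

With Flowise running locally, `askChatflow('<your-chatflow-id>', 'What is this document about?')` would resolve with whatever JSON the server returns. The response shape is not assumed here because it can vary between versions.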
diff --git a/jest.config.js b/jest.config.js
@@ -0,0 +1,196 @@
+/**
+ * For a detailed explanation regarding each configuration property, visit:
+ * https://jestjs.io/docs/configuration
+ */
+
+/** @type {import('jest').Config} */
+module.exports = {
+  // All imported modules in your tests should be mocked automatically
+  // automock: false,
+
+  // Stop running tests after `n` failures
+  // bail: 0,
+
+  // The directory where Jest should store its cached dependency information
+  // cacheDirectory: "/private/var/folders/gf/65r9cr7x6t1874dxr5nktff80000gn/T/jest_dx",
+
+  // Automatically clear mock calls, instances, contexts and results before every test
+  clearMocks: true,
+
+  // Indicates whether the coverage information should be collected while executing the test
+  collectCoverage: true,
+
+  // An array of glob patterns indicating a set of files for which coverage information should be collected
+  // collectCoverageFrom: undefined,
+
+  // The directory where Jest should output its coverage files
+  coverageDirectory: 'coverage',
+
+  // An array of regexp pattern strings used to skip coverage collection
+  // coveragePathIgnorePatterns: [
+  //   "/node_modules/"
+  // ],
+
+  // Indicates which provider should be used to instrument code for coverage
+  // coverageProvider: "babel",
+
+  // A list of reporter names that Jest uses when writing coverage reports
+  // coverageReporters: [
+  //   "json",
+  //   "text",
+  //   "lcov",
+  //   "clover"
+  // ],
+
+  // An object that configures minimum threshold enforcement for coverage results
+  // coverageThreshold: undefined,
+
+  // A path to a custom dependency extractor
+  // dependencyExtractor: undefined,
+
+  // Make calling deprecated APIs throw helpful error messages
+  // errorOnDeprecated: false,
+
+  // The default configuration for fake timers
+  // fakeTimers: {
+  //   "enableGlobally": false
+  // },
+
+  // Force coverage collection from ignored files using an array of glob patterns
+  // forceCoverageMatch: [],
+
+  // A path to a module which exports an async function that is triggered once before all test suites
+  // globalSetup: undefined,
+
+  // A path to a module which exports an async function that is triggered once after all test suites
+  // globalTeardown: undefined,
+
+  // A set of global variables that need to be available in all test environments
+  // globals: {},
+
+  // The maximum amount of workers used to run your tests. Can be specified as % or a number. E.g. maxWorkers: 10% will use 10% of your CPU amount + 1 as the maximum worker number. maxWorkers: 2 will use a maximum of 2 workers.
+  // maxWorkers: "50%",
+
+  // An array of directory names to be searched recursively up from the requiring module's location
+  // moduleDirectories: [
+  //   "node_modules"
+  // ],
+
+  // An array of file extensions your modules use
+  // moduleFileExtensions: [
+  //   "js",
+  //   "mjs",
+  //   "cjs",
+  //   "jsx",
+  //   "ts",
+  //   "tsx",
+  //   "json",
+  //   "node"
+  // ],
+
+  // A map from regular expressions to module names or to arrays of module names that allow to stub out resources with a single module
+  // moduleNameMapper: {},
+
+  // An array of regexp pattern strings, matched against all module paths before considered 'visible' to the module loader
+  // modulePathIgnorePatterns: [],
+
+  // Activates notifications for test results
+  // notify: false,
+
+  // An enum that specifies notification mode. Requires { notify: true }
+  // notifyMode: "failure-change",
+
+  // A preset that is used as a base for Jest's configuration
+  // preset: undefined,
+
+  // Run tests from one or more projects
+  // projects: undefined,
+
+  // Use this configuration option to add custom reporters to Jest
+  // reporters: undefined,
+
+  // Automatically reset mock state before every test
+  // resetMocks: false,
+
+  // Reset the module registry before running each individual test
+  // resetModules: false,
+
+  // A path to a custom resolver
+  // resolver: undefined,
+
+  // Automatically restore mock state and implementation before every test
+  // restoreMocks: false,
+
+  // The root directory that Jest should scan for tests and modules within
+  // rootDir: undefined,
+
+  // A list of paths to directories that Jest should use to search for files in
+  // roots: [
+  //   "<rootDir>"
+  // ],
+
+  // Allows you to use a custom runner instead of Jest's default test runner
+  // runner: "jest-runner",
+
+  // The paths to modules that run some code to configure or set up the testing environment before each test
+  // setupFiles: [],
+
+  // A list of paths to modules that run some code to configure or set up the testing framework before each test
+  // setupFilesAfterEnv: [],
+
+  // The number of seconds after which a test is considered as slow and reported as such in the results.
+  // slowTestThreshold: 5,
+
+  // A list of paths to snapshot serializer modules Jest should use for snapshot testing
+  // snapshotSerializers: [],
+
+  // The test environment that will be used for testing
+  // testEnvironment: "jest-environment-node",
+
+  // Options that will be passed to the testEnvironment
+  // testEnvironmentOptions: {},
+
+  // Adds a location field to test results
+  // testLocationInResults: false,
+
+  // The glob patterns Jest uses to detect test files
+  // testMatch: [
+  //   "**/__tests__/**/*.[jt]s?(x)",
+  //   "**/?(*.)+(spec|test).[tj]s?(x)"
+  // ],
+
+  // An array of regexp pattern strings that are matched against all test paths, matched tests are skipped
+  // testPathIgnorePatterns: [
+  //   "/node_modules/"
+  // ],
+
+  // The regexp pattern or array of patterns that Jest uses to detect test files
+  // testRegex: [],
+
+  // This option allows the use of a custom results processor
+  // testResultsProcessor: undefined,
+
+  // This option allows use of a custom test runner
+  // testRunner: "jest-circus/runner",
+
+  // A map from regular expressions to paths to transformers
+  // transform: undefined,
+
+  // An array of regexp pattern strings that are matched against all source file paths, matched files will skip transformation
+  // transformIgnorePatterns: [
+  //   "/node_modules/",
+  //   "\\.pnp\\.[^\\/]+$"
+  // ],
+
+  // An array of regexp pattern strings that are matched against all modules before the module loader will automatically return a mock for them
+  // unmockedModulePathPatterns: undefined,
+
+  // Indicates whether each individual test should be reported during the run
+  // verbose: undefined,
+
+  // An array of regexp patterns that are matched against all source file paths before re-running tests in watch mode
+  // watchPathIgnorePatterns: [],
+
+  // Whether to use watchman for file crawling
+  // watchman: true,
+};
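With this Jest configuration in place, a test could read the exported `test/flow.json` mentioned in the README and check it against the acceptance criteria. The sketch below is not the repository's actual test suite. It assumes that an exported flow has a `nodes` array whose items carry a `data.category` string, and that the category labels match the node groups listed in the README. Both assumptions should be confirmed against a real export.

```javascript
// Node categories taken from the README's acceptance criteria.
// ASSUMPTION: these labels and the `nodes[].data.category` shape are
// illustrative; confirm them against your own exported flow.json.
const REQUIRED_CATEGORIES = [
  'Document Loaders',
  'Text Splitters',
  'Vector Stores',
  'Embeddings',
  'Memory',
];

// Return the required categories that no node in the flow belongs to.
const missingCategories = (flow, required = REQUIRED_CATEGORIES) => {
  const nodes = Array.isArray(flow.nodes) ? flow.nodes : [];
  const present = new Set(nodes.map((node) => node.data && node.data.category));
  return required.filter((category) => !present.has(category));
};

// Possible use inside a Jest test file (hypothetical, shown as a comment):
//
//   const flow = require('./flow.json');
//   test('flow uses the required node categories', () => {
//     expect(missingCategories(flow)).toEqual([]);
//   });
```

Keeping the check in a pure function means it can be exercised without a running Flowise server or an OpenAI API key.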