changed readme and demonotebook for documentation
piterand committed Oct 24, 2023
1 parent 3c8bbe5 commit 0cfea1c
Showing 2 changed files with 19 additions and 6 deletions.
10 changes: 6 additions & 4 deletions README.md
@@ -38,9 +38,11 @@ The `AMMICO` package can be installed using pip:
```
pip install ammico
```
-This will install the package and its dependencies locally.
+This will install the package and its dependencies locally. If you get errors when running some modules after installation, please follow the instructions below.

-Some ammico components require tensorflow (e.g. Emotion detector), some pytorch (e.g. Summary detector). Sometimes there are compatibility problems between these two frameworks. To avoid compatibility problems on your machines, we suggest you to follow these steps before installing the package (you need conda on your machine):
+## Solving compatibility problems
+
+Some ammico components require `tensorflow` (e.g. the Emotion detector), others `pytorch` (e.g. the Summary detector). Sometimes there are compatibility problems between these two frameworks. To avoid such problems on your machine, you can prepare a suitable environment before installing the package (you need conda on your machine):

### 1. First, install tensorflow (https://www.tensorflow.org/install/pip)
- create a new environment with python and activate it
@@ -68,7 +70,7 @@ Some ammico components require tensorflow (e.g. Emotion detector), some pytorch
```conda activate ammico_env ```
-- and now we can install tensorflow
+- install tensorflow
```python -m pip install tensorflow==2.12.1```
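At this point you can check that tensorflow detects your GPU (a quick sanity check, not part of the original steps; the output depends on your hardware and drivers):
```
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```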
@@ -85,7 +87,7 @@ Some ammico components require tensorflow (e.g. Emotion detector), some pytorch
It is done.
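To confirm that the two frameworks coexist in the finished environment, a minimal smoke test (a sketch; it assumes `ammico_env` is active and both installs completed):
```
python -c "import tensorflow as tf; import torch; print(tf.__version__, torch.__version__, torch.cuda.is_available())"
```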
### Micromamba
-If you have micromamba on your machine you can prepare environment with just one command:
+If you are using micromamba, you can prepare the environment with just one command:
```micromamba create --no-channel-priority -c nvidia -c pytorch -c conda-forge -n ammico_env "python=3.10" pytorch torchvision torchaudio pytorch-cuda "tensorflow-gpu<=2.12.3" "numpy<=1.23.4"```
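Then activate the environment and install the package with `pip` as above (a sketch; `micromamba activate` works like its conda counterpart):
```
micromamba activate ammico_env
pip install ammico
```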
15 changes: 13 additions & 2 deletions ammico/notebooks/DemoNotebook_ammico.ipynb
@@ -134,10 +134,10 @@
"outputs": [],
"source": [
"# initialize the models\n",
-"summary_model, summary_vis_processors = ammico.SummaryDetector(image_dict).load_model(model_type=\"base\")\n",
+"obj = ammico.SummaryDetector(image_dict, analysis_type=\"summary\", model_type=\"base\")\n",
"# run the analysis without having to re-initialize the model\n",
"for key in image_dict:\n",
-" image_dict[key] = ammico.SummaryDetector(image_dict[key], analysis_type=\"summary\", summary_model=summary_model, summary_vis_processors=summary_vis_processors).analyse_image()"
+" image_dict[key] = obj.analyse_image(analysis_type=\"summary\", subdict=image_dict[key])"
]
},
{
@@ -210,6 +210,17 @@
"\n",
"## Image summary and query\n",
"\n",
+"`SummaryDetector` can be used to generate image captions (`summary`) as well as for visual question answering (`VQA`). This module is based on the [LAVIS](https://github.com/salesforce/LAVIS) library. Since the models can be quite large, an initial object is created that loads the necessary models into RAM/VRAM and then uses them in the analysis. The user can specify the type of analysis to be performed using the `analysis_type` keyword: setting it to `summary` will generate a caption (summary), `questions` will prepare answers (VQA) to a list of questions set by the user, and `summary_and_questions` will do both. Note that the desired analysis type needs to be set in the initialization of the detector object, not when running the analysis for each image; the same holds true for the selected model.\n",
+"\n",
+"For VQA, a list of questions needs to be passed when carrying out the analysis; these should be given as a list of strings.\n",
+"```\n",
+"list_of_questions = [\n",
+"    \"How many persons are in the picture?\",\n",
+"    \"Are there any politicians in the picture?\",\n",
+"    \"Does the picture show something from medicine?\",\n",
+"]\n",
+"```\n",
+"\n",
"## Detection of faces and facial expression analysis\n",
"Faces and facial expressions are detected and analyzed using the `EmotionDetector` class from the `faces` module. Initially, it is detected if faces are present on the image using RetinaFace, followed by analysis if face masks are worn (Face-Mask-Detection). The detection of age, gender, race, and emotions is carried out with deepface.\n",
"\n",
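"A minimal usage sketch (an assumption for illustration: `EmotionDetector` is taken to follow the same per-image `analyse_image` pattern as the other detectors; consult the ammico documentation for constructor options):\n",
"```\n",
"for key in image_dict:\n",
"    # detect faces, then analyze masks, age, gender, race, and emotions per image\n",
"    image_dict[key] = ammico.EmotionDetector(image_dict[key]).analyse_image()\n",
"```\n",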