Create Dockerfile for LibraryMetricScripts and Pull librarycomparisonswebsite images from Docker Hub #43

Open
wants to merge 47 commits into
base: master
47 commits
1afc1e1
create docker-compose file for LibraryMetricScripts and librarycompar…
Jan 15, 2021
e26a48b
update
Jan 15, 2021
4042cb8
enable the docker for combinding two repos
Jan 15, 2021
12eb56c
mysql problem to connect the host
Jan 22, 2021
45a1a64
run web successfully
Jan 29, 2021
85a4a25
update the readme
Jan 29, 2021
b53c425
update the readme
Jan 29, 2021
2c01a39
update readme
Feb 5, 2021
533beaf
update readme
Feb 5, 2021
8812cc5
revert the changes
Feb 12, 2021
fe02889
revert the changes
Feb 12, 2021
fd925a7
create the shared folder
Feb 12, 2021
b11ab27
create a shared folder in home/scripts/charts
Feb 12, 2021
0701eb9
update the app.py
Feb 12, 2021
ddb6be1
add the readme
Feb 26, 2021
2bc7e44
update readme
Feb 26, 2021
f78550b
update readme
Feb 26, 2021
0917ebc
update readme
Feb 26, 2021
f0d3db9
add openJDK (java) in docker
Feb 26, 2021
bc081e6
finish testing
Mar 11, 2021
ddec4b5
udpate the readme
Mar 11, 2021
643a3b6
Update libraries that changed repo urls
snadi May 20, 2021
181d84d
Address Issue #36 popularity calculation
snadi May 20, 2021
09eb866
Add git stash first
May 20, 2021
d089e21
Fix thousands number format
May 20, 2021
15a0f57
create docker-compose file for LibraryMetricScripts and librarycompar…
Jan 15, 2021
7d8dadf
update
Jan 15, 2021
295fc4a
enable the docker for combinding two repos
Jan 15, 2021
b0e6835
mysql problem to connect the host
Jan 22, 2021
af31859
run web successfully
Jan 29, 2021
f4871cd
update the readme
Jan 29, 2021
68fb526
update the readme
Jan 29, 2021
487d853
update readme
Feb 5, 2021
8a9f137
update readme
Feb 5, 2021
1e50747
revert the changes
Feb 12, 2021
ba276b9
revert the changes
Feb 12, 2021
4db6569
create the shared folder
Feb 12, 2021
fa24571
create a shared folder in home/scripts/charts
Feb 12, 2021
a6d8608
add the readme
Feb 26, 2021
f03cf2c
update readme
Feb 26, 2021
3ee8560
update readme
Feb 26, 2021
08cbb82
update readme
Feb 26, 2021
6d98a2e
add openJDK (java) in docker
Feb 26, 2021
83388ce
finish testing
Mar 11, 2021
1f4d1d0
udpate the readme
Mar 11, 2021
cb001b5
Sync with master
snadi Jun 17, 2021
6e55a04
Add WIP notes for testing containers
snadi Jun 17, 2021
13 changes: 0 additions & 13 deletions Dockerfile

This file was deleted.

4 changes: 2 additions & 2 deletions README.md
@@ -21,8 +21,8 @@ Each script describes its input and output.
 If you would like to get updated metric data, for the same list of libraries we have (found in `SharedFiles/LibraryData.json`, please follow the following steps:
 
 - You first need to set up some of the configuration parameters in the file `Config.json`:
-  - Change the value of `TOKEN` to your own GitHub generated token.
-  - Change the value of `SO_TOKEN` to your stack exchange key.
+  - Change the value of `TOKEN` to your own GitHub generated token. ([How to create Github TOKEN](https://github.com/ualberta-smr/LibraryMetricScripts/wiki/Creating-access-tokens#github-token))
+  - Change the value of `SO_TOKEN` to your stack exchange key. ([How to create StackOverflow TOKEN](https://github.com/ualberta-smr/LibraryMetricScripts/wiki/Creating-access-tokens#stackoverflow-token))
 - You also need to set a DB to fill with the results of running the script. You will need to create a MySQL database (we call ours libcomp). In the `librarycomparison/settings.py`, change the username, database name, and password in the `DATABASE` information to that of your created database. Afterwards run `python3 manage.py makemigrations` and then `python3 manage.py migrate`. This will create the database schema for you. See notes below about the database schema.
 - Run `python3 -m scripts` from within the main repo directory which will call all the metric scripts. This script runs all metrics and fills the MySQL database with all the results from all scripts.
 
10 changes: 10 additions & 0 deletions database/metric-setup.sql
@@ -0,0 +1,10 @@
use libcomp;
alter table Metric add unique (name);
insert into Metric(name) value("popularity");
insert into Metric(name) value("release frequency");
insert into Metric(name) value("last discussed on so");
insert into Metric(name) value("last modification date");
insert into Metric(name) value("breaking changes");
insert into Metric(name) value("issue response");
insert into Metric(name) value("issue closing");
insert into Metric(name) value("issue classification");
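The seed script above adds a unique constraint on `Metric.name` and then inserts eight rows, so running it a second time against the same volume would hit duplicate-key errors. The following is a minimal sketch of an idempotent seed, assuming a simplified `Metric` table and using Python's built-in `sqlite3` as a stand-in for MySQL. The table layout and the `seed_metrics` helper are illustrative assumptions, not part of this PR; `INSERT OR IGNORE` is SQLite syntax (MySQL spells it `INSERT IGNORE`).

```python
import sqlite3

# Metric names copied from database/metric-setup.sql in this PR.
METRIC_NAMES = [
    "popularity",
    "release frequency",
    "last discussed on so",
    "last modification date",
    "breaking changes",
    "issue response",
    "issue closing",
    "issue classification",
]


def seed_metrics(conn):
    """Insert each metric name once and return the resulting row count.

    Assumption: a simplified Metric table; the real schema comes from
    the Django migrations and is not shown in the diff.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Metric ("
        "id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    # Rows whose name already exists are skipped instead of raising.
    conn.executemany(
        "INSERT OR IGNORE INTO Metric(name) VALUES (?)",
        [(name,) for name in METRIC_NAMES],
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM Metric").fetchone()[0]
```

Calling `seed_metrics` twice on the same connection leaves the table unchanged after the first call, which is the property the unique constraint is meant to protect.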
55 changes: 55 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,55 @@
version: "3.9"
services:
  metric-script:
    build:
      context: .
      dockerfile: ./docker/Dockerfile
    restart: always
    stdin_open: true
    tty: true
    volumes:
      - shared_data:/home/scripts
    container_name: "metric-script"
    depends_on:
      - "db"
  web:
    image: ualbertasmr/librarycomparisons_web:latest
    hostname: web
    restart: always
    stdin_open: true
    tty: true
    ports:
      - "8000:8000"
    volumes:
      - shared_data:/home/scripts
    networks:
      default:
    container_name: "librarycomparisons_web"
    depends_on:
      - "db"
  db:
    image: ualbertasmr/librarycomparisons_db:latest
    hostname: db
    restart: always
    command:
      --default-authentication-plugin=mysql_native_password
    ports:
      - "3306:3306"
    volumes:
      - ./database:/docker-entrypoint-initdb.d
      - db_data:/var/lib/mysql
    networks:
      default:
    environment:
      MYSQL_HOST: localhost
      MYSQL_PORT: 3306
      MYSQL_DATABASE: "libcomp"
      MYSQL_PASSWORD: "mypwd"
      MYSQL_ROOT_PASSWORD: "mypwd"

volumes:
  db_data:
  shared_data:

networks:
  default:
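In this compose file, `depends_on` only controls the order in which containers are started; it does not wait for MySQL inside `db` to accept connections. A minimal sketch of a readiness wait that a script in `metric-script` or `web` could run before touching the database is shown below. `wait_for_port` is a hypothetical helper, not part of this PR, and an open TCP port is only an approximation of "MySQL is ready".

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before `timeout`
    seconds have elapsed. Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # The connection is closed again immediately; we only probe.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Inside the compose network the call would presumably be `wait_for_port("db", 3306)`, since `db` is the hostname given to the database service above.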
26 changes: 26 additions & 0 deletions docker/Dockerfile
@@ -0,0 +1,26 @@
FROM python:3.8.0-slim

# Install OpenJDK 11
ENV DEBIAN_FRONTEND=noninteractive
RUN mkdir -p /usr/share/man/man1 /usr/share/man/man2
RUN apt-get update && apt-get install -y --no-install-recommends openjdk-11-jre
# Prints installed java version, just for checking
RUN java --version

WORKDIR /main

COPY . /main

ENV PYTHONPATH="/main/scripts:${PYTHONPATH}"

RUN apt-get update; \
    apt-get -y install sudo; \
    sudo apt-get -y install default-libmysqlclient-dev \
        gcc \
        default-mysql-client \
        default-mysql-server \
        git \
        libpangocairo-1.0-0; \
    pip install --trusted-host pypi.python.org -r requirements.txt;

ENTRYPOINT [ "bash", "./docker/start.sh" ]
102 changes: 102 additions & 0 deletions docker/README.md
@@ -0,0 +1,102 @@
Temporary notes for updating the README:

- Had to run `docker-compose down -v` to delete the data.
- Followed the exact instructions, but noticed that `createmetrics` has to be run from inside the shell opened in the `docker-compose run metric-script` step, while the other containers are left running. This needs to be clarified in the instructions.
- Had a problem with the SO key because my key was for an old version of the API. Had to create one for API v2.0.
- Next step is to check: if I exit the metrics container, is the data still saved in the DB? Can I import the DB dump into the DB and have it reflected on the website? Can I trigger an update in the container from localhost? Right now, the release data is taking a long time to calculate.
- Remove the note about `MAXSIZE` since it's not used anymore.
- Add a note that, for testing, the number of libraries in the library file should be reduced.

# How to calculate metrics & run visualization website using docker

This is a simple way to run the metrics and also get a local website set up to view them (similar to [https://smr.cs.ualberta.ca/comparelibraries/](https://smr.cs.ualberta.ca/comparelibraries/)).

You'll need to have [docker](https://docs.docker.com/get-docker/) and [docker-compose](https://docs.docker.com/compose/install/) installed.

- After cloning this repo, you first need to set up some of the configuration parameters in the file `scripts/Config.json`:
  - Change the value of `TOKEN` to your own GitHub-generated token ([How to create a GitHub token](https://github.com/ualberta-smr/LibraryMetricScripts/wiki/Creating-access-tokens#github-token)).
  - Change the value of `SO_TOKEN` to your Stack Exchange key ([How to create a Stack Overflow token](https://github.com/ualberta-smr/LibraryMetricScripts/wiki/Creating-access-tokens#stackoverflow-token)). Please make sure to create a token for v2.0 of the API.
  - Change `"OUTPUT_PATH"` to `"../home/scripts/"`.

- For testing purposes, you can set `MAXSIZE` to 100 in `Config.json`.
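The configuration steps above are easy to miss before a long metrics run. As a rough sketch, a pre-flight check along these lines could flag an unedited `scripts/Config.json`. `find_config_problems` and `load_config` are hypothetical helpers, not part of the repository; the `SO_TOKEN` placeholder text is taken from the `Config.json` diff in this PR, while the `TOKEN` placeholder is not shown there, so only emptiness is checked for it.

```python
import json

# Placeholder value as it appears in scripts/Config.json in this PR.
SO_PLACEHOLDER = "enter your SO token"
# Value the Docker instructions ask for.
DOCKER_OUTPUT_PATH = "../home/scripts/"


def find_config_problems(config):
    """Return a list of human-readable problems found in the config dict."""
    problems = []
    if not config.get("TOKEN", "").strip():
        problems.append("TOKEN is empty")
    so_token = config.get("SO_TOKEN", "").strip()
    if not so_token or so_token == SO_PLACEHOLDER:
        problems.append("SO_TOKEN is missing or still the placeholder")
    if config.get("OUTPUT_PATH") != DOCKER_OUTPUT_PATH:
        problems.append("OUTPUT_PATH should be " + DOCKER_OUTPUT_PATH)
    return problems


def load_config(path="scripts/Config.json"):
    """Read the JSON config file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A caller would run `find_config_problems(load_config())` from the repo root and stop if the returned list is not empty.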

## Creating the image
### 1. Build or rebuild the images defined in `docker-compose.yml` (this does not start the containers)

```
docker-compose build --no-cache
```

### 2. Start the containers

**Start the containers and the website:**
```
docker-compose up
```
To access the website, use http://127.0.0.1:8000/comparelibraries/

**Run metric script:**
The step above gets the website running, but there is no data in the DB to display yet. To calculate the metrics, run:

```
docker-compose run metric-script
```

This opens an interactive shell in the container, where you can invoke `createmetrics` to calculate the metrics:

```
root@e7c767ab1a70:/main# createmetrics
```

**(Optional) Open librarycomparisons website command shell:**
```
docker-compose run --service-ports web
```
- `start`: Starts the Django server. The librarycomparison website runs on port `8000` by default.
- `migrate`: Runs Django migrations
- `make`: Runs Django makemigrations
- `createsuperuser`: Runs Django createsuperuser

To access the website, use http://127.0.0.1:8000/comparelibraries/

### 3. Stop and remove the containers and networks created by `up`

```
docker-compose down
```
To also remove the volumes, run `docker-compose down -v`. Warning: this permanently deletes the contents of the `db_data` volume, wiping out any database you previously had there.

### 4. Set up the Metric table if you are creating the Docker volume for the first time
```
docker exec librarymetricscripts_db_1 /bin/sh -c 'mysql -uroot -p"mypwd" libcomp < docker-entrypoint-initdb.d/metric-setup.sql'
```

## Accessing docker container mysql databases
1. Open a MySQL shell in the database container: `docker exec -it MyContainer mysql -uroot -pMyPassword`,
e.g. `docker exec -it librarymetricscripts_db_1 mysql -uroot -p"mypwd"`
2. Show MySQL Databases: `show databases;`
```
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| libcomp |
| mysql |
| performance_schema |
| sys |
+--------------------+
```
3. Show MySQL Tables:
```
use libcomp;
show tables;
```
4. Show the table's schema:
```
describe libcomp.Metric;
```
5. Show the values in the Metric table:
```
select * from libcomp.Metric;
```
6 changes: 6 additions & 0 deletions docker/start.sh
@@ -0,0 +1,6 @@
#!/bin/bash

echo "alias createmetrics='python -m scripts'" >> ~/.bashrc
echo "alias updatemetrics='./updatemetrics.sh'" >> ~/.bashrc

/bin/bash
4 changes: 3 additions & 1 deletion librarycomparison/settings.py
@@ -85,7 +85,9 @@
         'ENGINE': 'django.db.backends.mysql',
         'NAME': 'libcomp',
         'USER': 'root',
-        'PASSWORD': ''
+        'HOST': 'db',
+        'PORT' : 3306,
+        'PASSWORD': 'mypwd'
     }
 
 }
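The hunk above hardcodes the database host and root password in `settings.py`, and the same password appears in `docker-compose.yml`. One possible alternative, sketched here under assumptions, is to read the values from environment variables with defaults that mirror the PR. The `LIBCOMP_DB_*` variable names and the `database_settings` helper are hypothetical and do not exist in this repository; `DATABASES` is the standard Django setting name.

```python
import os


def database_settings(environ=os.environ):
    """Build a Django DATABASES dict from environment variables.

    The LIBCOMP_DB_* names are hypothetical; the defaults mirror the
    values in this PR, except that the password default is left empty.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            "NAME": environ.get("LIBCOMP_DB_NAME", "libcomp"),
            "USER": environ.get("LIBCOMP_DB_USER", "root"),
            "PASSWORD": environ.get("LIBCOMP_DB_PASSWORD", ""),
            "HOST": environ.get("LIBCOMP_DB_HOST", "db"),
            "PORT": int(environ.get("LIBCOMP_DB_PORT", "3306")),
        }
    }


DATABASES = database_settings()
```

With this shape, the same `settings.py` could serve both the compose setup (host `db`) and a local MySQL (host `localhost`) without editing the file.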
1 change: 1 addition & 0 deletions requirements.txt
@@ -12,3 +12,4 @@ cairosvg
 gitpython
 mysqlclient>=1.4.6
 djangorestframework>=3.11
+beautifulsoup4>=4.9.3
3 changes: 2 additions & 1 deletion scripts/Config.json
@@ -8,5 +8,6 @@
     "MAXSIZE": "1000",
     "POPULARITY_OUTPUT_FILE":"scripts/Popularity/popularity_results.txt",
     "TIME_SPAN":"365",
-    "SO_TOKEN":"enter your SO token"
+    "SO_TOKEN":"enter your SO token",
+    "OUTPUT_PATH": "../home/scripts/"
 }
1 change: 1 addition & 0 deletions scripts/IssueMetrics/performanceclassifier.py
@@ -9,6 +9,7 @@
 from github import Github, Repository
 import string
 
+nltk.download('punkt')
 stemmer = PorterStemmer()
 
 def stem_words(tokens):
67 changes: 67 additions & 0 deletions scripts/Popularity/GHDepPopularity.py
@@ -0,0 +1,67 @@
import requests
from bs4 import BeautifulSoup
import re
import json
from scripts.CommonUtilities import Common_Utilities
from scripts.SharedFiles.utility_tool import read_json_file

def get_num_dependents(repo):
    """Gets number of dependent repos as calculated by github dependency graph https://docs.github.com/en/code-security/supply-chain-security/understanding-your-software-supply-chain/about-the-dependency-graph#:~:text=The%20dependency%20graph%20is%20a,packages%20that%20depend%20on%20it

    Parameters
    ----------
    repo : str
        Github repo represented as user/repo

    Returns
    -------
    num_dependents
        number of dependents
    """
    #inspired from Bertrand Martel's answer on https://stackoverflow.com/questions/58734176/how-to-use-github-api-to-get-a-repositorys-dependents-information-in-github
    url = 'https://github.com/{}/network/dependents'.format(repo)
    dependent_href = '/{}/network/dependents?dependent_type=REPOSITORY'.format(repo)
    r = requests.get(url)
    soup = BeautifulSoup(r.content, "html.parser")

    # findAll("<text>") looks for a tag with that name, so search the page text instead ("." matches either apostrophe style)
    if soup.find(string=re.compile(r"We haven.t found any dependents for this repository yet")) is not None:
        return 0

    dependents = soup.find('a', href=dependent_href) #returns, for example, "1,234,000 Repositories"
    #regex from https://www.regexpal.com/98336
    num_dependents = re.search(r'(\d{0,3},)?(\d{3},)?\d{0,3}', dependents.text.strip()).group(0)
    print(num_dependents)
    return num_dependents

def read_libraries(file_path):
    libdict = {}
    f = read_json_file(file_path)
    for line in f:
        libdict[line['Package']]=line['FullRepoName']

    return libdict

def send_totals_to_file(output_file, keyword, num_found):
    with open(output_file, "a") as out:  # append one "repo:count" line per library
        out.write(keyword + ":" + str(num_found) + "\n")

def get_popularity():
    print("Getting popularity")
    config_dict = Common_Utilities.read_config_file() # read all config data

    library_dict = read_libraries(config_dict["LIBRARY_LIST"]) # read all libraries to search against

    output_file_name = config_dict["POPULARITY_OUTPUT_FILE"] # this is the output file that we are going to send libraries with their total counts to

    output_file = open(output_file_name, "w")
    output_file.close()

    for keyword,repo in library_dict.items():
        print("for lib", repo)
        num_dependents = get_num_dependents(repo)
        send_totals_to_file(output_file_name, repo, num_dependents)


if __name__ == "__main__":
    get_popularity()
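`get_num_dependents` returns the count exactly as GitHub displays it, for example `"1,234,000"`, so anything that needs a number downstream has to strip the thousands separators first. A small sketch of that conversion follows; `parse_dependent_count` is a hypothetical helper, not a function in this PR, and the input format is assumed from the code comment above ("1,234,000 Repositories").

```python
import re


def parse_dependent_count(link_text):
    """Turn text such as '1,234,000 Repositories' into the integer 1234000.

    Returns 0 when the text contains no digits at all.
    """
    # A digit followed by any run of digits and thousands separators.
    match = re.search(r"\d[\d,]*", link_text)
    if match is None:
        return 0
    return int(match.group(0).replace(",", ""))
```

Returning an `int` also keeps the "no dependents" case (`0`) and the normal case the same type, which the original function does not guarantee.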
1 change: 1 addition & 0 deletions scripts/Popularity/GitHub_Phase2.py
@@ -1,4 +1,5 @@
 '''
+This script is no longer used due to the limits of the API. Instead, we are going to depend on the github dependency graph
 This script searches the 1500 repositories in Top_Repo.txt for import statements of all the library packages in library.txt
 Requires: A configuration file called GitHubSearch.ini
 Ensure a GitHub token generated by your account is in the configuration account so that the script may connect to github
12 changes: 6 additions & 6 deletions scripts/SharedFiles/LibraryData.json
@@ -62,10 +62,10 @@
     {
         "LibraryName": "tinylog",
         "Domain": "Logging",
-        "FullRepoName": "pmwmedia/tinylog",
+        "FullRepoName": "tinylog-org/tinylog",
         "SOtags": "tinylog",
         "Package": "org.tinylog",
-        "GitHubURL": "git://github.com/pmwmedia/tinylog.git",
+        "GitHubURL": "git://github.com/tinylog-org/tinylog.git",
         "JIRAURL": "",
         "MavenURL":"https://mvnrepository.com/artifact/org.tinylog/tinylog"
     },
@@ -172,10 +172,10 @@
     {
         "LibraryName": "conceal",
         "Domain": "Cryptography",
-        "FullRepoName": "facebook/conceal",
+        "FullRepoName": "facebookarchive/conceal",
         "SOtags": "facebook-conceal",
         "Package": "com.facebook.crypto",
-        "GitHubURL": "git://github.com/facebook/conceal.git",
+        "GitHubURL": "git://github.com/facebookarchive/conceal.git",
         "JIRAURL": "",
         "MavenURL":"https://mvnrepository.com/artifact/com.facebook.conceal/conceal"
     },
@@ -432,10 +432,10 @@
     {
         "LibraryName": "jcommon",
         "Domain": "Collections",
-        "FullRepoName": "facebook/jcommon",
+        "FullRepoName": "facebookarchive/jcommon",
         "SOtags": "facebook-jcommon",
         "Package": "com.facebook.util",
-        "GitHubURL": "git://github.com/facebook/jcommon.git",
+        "GitHubURL": "git://github.com/facebookarchive/jcommon.git",
         "JIRAURL": "",
         "MavenURL":"https://mvnrepository.com/artifact/com.facebook.jcommon/util"
     },
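Each entry in `LibraryData.json` stores the repository twice, once as `FullRepoName` and once inside `GitHubURL`, and every hunk above has to change both in step. A minimal sketch of a check that catches entries where only one of the two was updated is shown below. `find_repo_mismatches` is a hypothetical helper, and the `git://github.com/<FullRepoName>.git` pattern is inferred from the three entries in this diff, not from a documented rule.

```python
def find_repo_mismatches(libraries):
    """Return the LibraryName of every entry whose GitHubURL does not
    match the URL derived from its FullRepoName.

    Assumption: URLs follow git://github.com/<FullRepoName>.git, as in
    the entries touched by this PR.
    """
    mismatches = []
    for entry in libraries:
        expected = "git://github.com/{}.git".format(entry["FullRepoName"])
        if entry["GitHubURL"] != expected:
            mismatches.append(entry["LibraryName"])
    return mismatches
```

The input would be the list loaded from `scripts/SharedFiles/LibraryData.json`; an empty result means the two fields agree for every library.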