Commit

feat(parser): Added Docker support with new parameter to define output path for exported files

- **Dockerfile**:
  - Added new file Dockerfile for building Docker image.

- **README.md**:
  - Added section for Docker
  - Updated the application usage to include the new parameters `-p` and `--output-path`.

- **app.ts**:
  - Added new variable `outputPath` to define the path of the output file.
  - Updated function for parsing input parameters to include the new parameters `-p` and `--output-path` with defined default value.
  - Updated `console.log` messages to include the new parameters.
  - Updated constant for calling a Chromium browser, including new arguments for launching browser in Docker container.
  - Updated calling `writeJson` and `writeCsv` functions to include the new `outputPath` parameter.

- **write.ts**:
  - Function `writeJson`: Added new parameter `outputPath` to define the path of the output file.
  - Function `writeJson`: Updated constant `outFilename` where base path is defined by `outputPath` variable, not hard-coded anymore.
  - Function `writeCsv`: Added new parameter `outputPath` to define the path of the output file.
  - Function `writeCsv`: Updated constant `outFilename` where base path is defined by `outputPath` variable, not hard-coded anymore.
Jan Pelikan committed Sep 20, 2024
1 parent 8dbf397 commit 5c9a49f
Showing 4 changed files with 123 additions and 11 deletions.
79 changes: 79 additions & 0 deletions Dockerfile
@@ -0,0 +1,79 @@
# Default architecture
ARG ARCH=amd64

# Use the official lightweight Node.js 22 image
FROM $ARCH/node:22-slim

# Do not download Chromium, will be installed manually from Debian repositories
ENV PUPPETEER_SKIP_DOWNLOAD=true
# Set the path to the Chromium executable
ENV PUPPETEER_EXECUTABLE_PATH="/usr/bin/chromium"

# Create a directory for the app
WORKDIR /app

# Create a directory for the app data
RUN mkdir /data

# Set volume for the app data
VOLUME /data

# Install dependencies
RUN apt-get update && apt-get install -y \
wget \
ca-certificates \
chromium \
fonts-liberation \
libasound2 \
libatk1.0-0 \
libc6 \
libcairo2 \
libcups2 \
libdbus-1-3 \
libexpat1 \
libfontconfig1 \
libgbm-dev \
libgcc1 \
libglib2.0-0 \
libgdk-pixbuf2.0-0 \
libgtk-3-0 \
libnspr4 \
libnss3 \
libpango-1.0-0 \
libpangocairo-1.0-0 \
libx11-6 \
libx11-xcb1 \
libxcb1 \
libxcomposite1 \
libxcursor1 \
libxdamage1 \
libxext6 \
libxfixes3 \
libxi6 \
libxrandr2 \
libxrender1 \
libxss1 \
libxtst6 \
lsb-release \
xdg-utils \
libu2f-udev \
libvulkan1 \
--no-install-recommends \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

# Copy the app files (azure-vm-pricing) to the container
COPY ./parser .

# When running as a non-root user, processes owned by that user cannot access the mounted volumes (cause not yet determined), so the non-root setup below is left commented out
# # Create a non-root user to run Puppeteer
# RUN groupadd -r pptruser && useradd -r -g pptruser -G audio,video pptruser \
# && mkdir -p /home/pptruser/Downloads \
# && chown -R pptruser:pptruser /home/pptruser \
# && chown -R pptruser:pptruser /app

# # Switch to the non-root user
# USER pptruser

# Install the app dependencies with Yarn (bundled with the Node.js image)
RUN yarn install
28 changes: 25 additions & 3 deletions README.md
@@ -74,13 +74,13 @@ Scroll down for the list of [supported regions](#supported-regions) and [support

```powershell
> cd .\parser\
> yarn crawl --culture en-us --currency usd --operating-system linux --region us-west
> yarn crawl --culture en-us --currency usd --operating-system linux --region us-west --output-path .\out\
```

You can also use short names:

```powershell
> yarn crawl -l en-us -c usd -o linux -r us-west
> yarn crawl -l en-us -c usd -o linux -r us-west -p .\out\
```

Arguments:
@@ -99,7 +99,7 @@ In the footer:

### Parser output

Writes `2` output files in the `out\` directory. One is a `CSV`, the other one is `JSON`. Both files contain the same data.
Writes `2` output files in the `out\` directory, or the directory specified by the `--output-path` argument. One is a `CSV`, the other one is `JSON`. Both files contain the same data.

```text
.\out\vm-pricing_<region>_<operating-system>.csv
@@ -124,6 +124,28 @@ Fields:
- _Three Year Savings plan_
- _Three Year Savings plan With Azure Hybrid Benefit_

### Docker

#### Build the Docker image

You can build a `Docker` image for `azure-vm-pricing`:

```bash
# For Linux machines running on x86_64 or in Windows WSL
docker build -f ./Dockerfile --platform linux/amd64 --build-arg ARCH=amd64 -t azure-vm-pricing .

# For arm64 hosts, for example Apple MacBooks with Apple Silicon (the official arm64 images are published under the arm64v8 namespace)
docker build -f ./Dockerfile --platform linux/arm64 --build-arg ARCH=arm64v8 -t azure-vm-pricing .
```

#### Run the Docker image

You can run the `azure-vm-pricing` image:

```bash
docker run --rm -it -v ./data:/data/ azure-vm-pricing:latest bash -c "yarn crawl --culture en-us --currency eur --operating-system linux --region europe-west -p /data/"
```

### Parser tests

The parser has unit tests focusing on edge cases of price formatting:
19 changes: 15 additions & 4 deletions parser/src/app.ts
@@ -8,6 +8,8 @@ import { isUrlBlocked } from './isUrlBlocked';
import { writeCsv, writeJson } from './writeFile';
import { AzurePortal, getPrice, getPricing } from './azurePortalExtensions';

// Default output directory, used when neither -p nor --output-path is provided
let outputPath = './out';

let recordTiming = false;
let previousPerformanceNow = 0;
let wasSuccessful = false;
@@ -91,6 +93,14 @@ function timeEvent(eventName: string): void {
case '--region':
region = args[offset + 1];
break;
case '-p':
case '--output-path':
outputPath = args[offset + 1];
// If the output path is not defined or is an empty string, use the default path
if (outputPath === undefined || outputPath === '') {
outputPath = './out';
}
break;
default:
parsedBinaryArg = false;
break;
@@ -104,7 +114,7 @@ function timeEvent(eventName: string): void {
debugMode = true;
break;
default:
console.log(`'${args[offset]}' is not a known switch, supported values are: '-l', '--culture', '-c', '--currency', '-o', '--operating-system', '-r', '--region'. None of these switches should be provided as the last arg as they require a value.`);
console.log(`'${args[offset]}' is not a known switch, supported values are: '-l', '--culture', '-c', '--currency', '-o', '--operating-system', '-r', '--region', '-p', '--output-path'. None of these switches should be provided as the last arg as they require a value.`);
break;
}
}
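
The `-p` / `--output-path` handling above applies the `./out` default only when the switch is present with a missing or empty value. A minimal sketch of a lookup that also covers the case where the switch is absent is shown below. `resolveOutputPath` and `DEFAULT_OUTPUT_PATH` are hypothetical names used for illustration and are not part of this commit.

```typescript
// Hypothetical helper, not part of the commit: resolves the output directory
// from a raw argument list and falls back to a default when needed.
const DEFAULT_OUTPUT_PATH = './out';

function resolveOutputPath(args: string[], fallback: string = DEFAULT_OUTPUT_PATH): string {
  for (let offset = 0; offset < args.length; offset++) {
    if (args[offset] === '-p' || args[offset] === '--output-path') {
      const value = args[offset + 1];
      // A missing or empty value falls back to the default, as in the diff.
      return value === undefined || value === '' ? fallback : value;
    }
  }
  // The switch was not supplied at all.
  return fallback;
}
```

With this shape the value handed to `writeJson` and `writeCsv` is always a string, so the `string | undefined` type is no longer needed.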
@@ -120,7 +130,8 @@ function timeEvent(eventName: string): void {
}

timeEvent('chromeStartedAt');
const browser = await puppeteer.launch({headless: headlessMode});
// --no-sandbox and --disable-setuid-sandbox are required for running in a Docker container
const browser = await puppeteer.launch({headless: headlessMode, args: ['--no-sandbox', '--disable-setuid-sandbox']});
const page = await browser.newPage();
timeEvent('chromeLaunchedAt');
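
The change above passes `--no-sandbox` and `--disable-setuid-sandbox` on every launch, including runs outside Docker. One possible alternative, sketched here under assumptions, is to add the flags only inside the container. `chromiumArgs` is a hypothetical helper and `RUNNING_IN_DOCKER` is an assumed environment variable that the Dockerfile would have to set; neither exists in this commit.

```typescript
// Hypothetical helper: returns the extra Chromium flags only when the
// (assumed) RUNNING_IN_DOCKER environment variable is set to 'true'.
function chromiumArgs(env: Record<string, string | undefined>): string[] {
  return env['RUNNING_IN_DOCKER'] === 'true'
    ? ['--no-sandbox', '--disable-setuid-sandbox']
    : [];
}

// Possible call site (not executed here):
// const browser = await puppeteer.launch({ headless: headlessMode, args: chromiumArgs(process.env) });
```

Keeping the sandbox enabled for local runs preserves Chromium's default isolation where it is available.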

@@ -229,8 +240,8 @@ function timeEvent(eventName: string): void {

console.log();

writeJson(vmPricing, config.region, config.operatingSystem);
writeCsv(vmPricing, config.culture, config.region, config.operatingSystem);
writeJson(vmPricing, config.region, config.operatingSystem, outputPath);
writeCsv(vmPricing, config.culture, config.region, config.operatingSystem, outputPath);
wasSuccessful = true;
}
catch (e)
8 changes: 4 additions & 4 deletions parser/src/writeFile.ts
@@ -1,8 +1,8 @@
const fs = require('fs');
import { VmPricing } from './vmPricing';

export function writeJson(vmPricing: VmPricing[], region: string, operatingSystem: string): void {
const outFilename = `./out/vm-pricing_${region}_${operatingSystem}.json`;
export function writeJson(vmPricing: VmPricing[], region: string, operatingSystem: string, outputPath: string): void {
const outFilename = `${outputPath}/vm-pricing_${region}_${operatingSystem}.json`;

fs.writeFile(outFilename, JSON.stringify(vmPricing, null, 2), function (err) {
if (err) {
@@ -13,8 +13,8 @@ export function writeJson(vmPricing: VmPricing[], region: string, operatingSyste
});
}

export function writeCsv(vmPricing: VmPricing[], culture: string, region: string, operatingSystem: string): void {
const outFilename = `./out/vm-pricing_${region}_${operatingSystem}.csv`;
export function writeCsv(vmPricing: VmPricing[], culture: string, region: string, operatingSystem: string, outputPath: string): void {
const outFilename = `${outputPath}/vm-pricing_${region}_${operatingSystem}.csv`;

const writer = fs.createWriteStream(outFilename);
writer.write('INSTANCE,VCPU,RAM,PAY AS YOU GO,PAY AS YOU GO WITH AZURE HYBRID BENEFIT,ONE YEAR RESERVED,ONE YEAR RESERVED WITH AZURE HYBRID BENEFIT,THREE YEAR RESERVED,THREE YEAR RESERVED WITH AZURE HYBRID BENEFIT,SPOT,SPOT WITH AZURE HYBRID BENEFIT,ONE YEAR SAVINGS PLAN,ONE YEAR SAVINGS PLAN WITH AZURE HYBRID BENEFIT,THREE YEAR SAVINGS PLAN,THREE YEAR SAVINGS PLAN WITH AZURE HYBRID BENEFIT\n');
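
Both functions now build `outFilename` by string interpolation, so a trailing slash in `outputPath` (such as `/data/` in the README example) produces a doubled separator, and nothing in the diff creates the directory if it is missing. A minimal sketch of one way to handle both is shown below. `buildOutFilename` is a hypothetical helper, not part of this commit.

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Hypothetical helper: builds the output file name and makes sure the
// target directory exists before anything is written to it.
function buildOutFilename(
  outputPath: string,
  region: string,
  operatingSystem: string,
  extension: 'csv' | 'json'
): string {
  // recursive: true creates missing parents and is a no-op if the directory exists.
  fs.mkdirSync(outputPath, { recursive: true });
  // path.join normalises separators, so '/data/' and '/data' give the same result.
  return path.join(outputPath, `vm-pricing_${region}_${operatingSystem}.${extension}`);
}
```

`writeJson` and `writeCsv` could then call `buildOutFilename(outputPath, region, operatingSystem, 'json')` and the `'csv'` equivalent instead of building the name inline.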