DON'T MERGE - i852 demo deploy test branch #957

Closed
wants to merge 41 commits into from
41 commits
3316f89
queue derivative jobs in new :resource_intensive queue
bkiahstroud Jan 3, 2024
ab95c9a
attempt to only put video and audio jobs in separate queue
bkiahstroud Jan 4, 2024
cceb544
create child job class to use new Sidekiq queue
bkiahstroud Jan 4, 2024
acb09ad
rubocop fixes in CreateDerivativesJobDecorator
bkiahstroud Jan 4, 2024
70bdf48
remove CreateLargeDerivativesJob priority override
bkiahstroud Jan 4, 2024
debb2f1
new worker to run resource intensive jobs
bkiahstroud Jan 4, 2024
1abe7bd
deploy resource intensive worker with 4x the normal resources
bkiahstroud Jan 4, 2024
23202f1
no need to specify default file location
bkiahstroud Jan 4, 2024
38b4f8a
add code docs
bkiahstroud Jan 4, 2024
817657c
Revert "🩹 Temporary fix for video processing"
bkiahstroud Jan 4, 2024
307236a
Revert "🐛 HACK: Disable mp4 and webm derivative generation"
bkiahstroud Jan 4, 2024
69b4165
helm deployment of intensive worker
orangewolf Jan 5, 2024
47546b6
resource adjustments
orangewolf Jan 5, 2024
e176af1
undo changes to unused file
bkiahstroud Jan 5, 2024
f82c858
use ENV var for intensive worker thread count
bkiahstroud Jan 6, 2024
1c89d52
adjust intensive worker prod resources
bkiahstroud Jan 6, 2024
10e834b
fix disarranged YAML
bkiahstroud Jan 6, 2024
8171b37
update prod to Hyrax helm chart v3.5.1
bkiahstroud Jan 6, 2024
ea6a6e0
Merge branch 'i852-new-worker-to-run-resource-intensive-jobs' into i8…
bkiahstroud Jan 9, 2024
00b4508
Merge branch 'i852-sack-the-hack' into i852-demo-deploy-test-branch
bkiahstroud Jan 9, 2024
70a805b
Merge branch 'main' into i852-demo-deploy-test-branch
bkiahstroud Jan 9, 2024
a4d5347
fix demo redis deploy issue
bkiahstroud Jan 9, 2024
68a681d
Merge branch 'i852-new-worker-to-run-resource-intensive-jobs' into i8…
bkiahstroud Jan 9, 2024
dc7d470
env vars need to be strings
bkiahstroud Jan 17, 2024
0aea9de
env vars need to be strings
bkiahstroud Jan 17, 2024
f763d4d
correctly parse Redis host value
bkiahstroud Jan 24, 2024
f0f90fd
Merge branch 'i852-new-worker-to-run-resource-intensive-jobs' into i8…
bkiahstroud Jan 24, 2024
3576eb8
prevent variable substitution for db-wait step
bkiahstroud Jan 24, 2024
6e632d8
prevent variable substitution for db-wait step
bkiahstroud Jan 24, 2024
f0b45b3
Merge branch 'i852-new-worker-to-run-resource-intensive-jobs' into i8…
bkiahstroud Jan 24, 2024
31e2564
attempt to escape variable substitution again
bkiahstroud Jan 24, 2024
7a4dedb
attempt to escape variable substitution again
bkiahstroud Jan 24, 2024
38642d1
Merge branch 'i852-new-worker-to-run-resource-intensive-jobs' into i8…
bkiahstroud Jan 24, 2024
9998530
use helm to hopefully avoid env variable substitution
bkiahstroud Jan 25, 2024
a7b07b9
use helm to hopefully avoid env variable substitution
bkiahstroud Jan 25, 2024
66476fc
Merge branch 'i852-new-worker-to-run-resource-intensive-jobs' into i8…
bkiahstroud Jan 25, 2024
eb3b2fa
let's try this again...
bkiahstroud Jan 25, 2024
989e5a5
Merge branch 'i852-new-worker-to-run-resource-intensive-jobs' into i8…
bkiahstroud Jan 25, 2024
f07f23b
add redis port
bkiahstroud Jan 25, 2024
4733738
add redis port
bkiahstroud Jan 25, 2024
322f1c9
Merge branch 'i852-new-worker-to-run-resource-intensive-jobs' into i8…
bkiahstroud Jan 25, 2024
1 change: 1 addition & 0 deletions .env
@@ -23,6 +23,7 @@ PASSENGER_APP_ENV=development
RAILS_LOG_TO_STDOUT=true
REDIS_HOST=redis
SECRET_KEY_BASE=asdf
SIDEKIQ_INTENSIVE_THREAD_COUNT=1
SOLR_ADMIN_PASSWORD=SolrRocks
SOLR_ADMIN_USER=solr
SOLR_COLLECTION_NAME=hydra-development
17 changes: 17 additions & 0 deletions app/jobs/create_derivatives_job_decorator.rb
@@ -0,0 +1,17 @@
# frozen_string_literal: true

# OVERRIDE Hyrax v3.6.0
# @see CreateLargeDerivativesJob
module CreateDerivativesJobDecorator
# OVERRIDE: Divert audio and video derivative
# creation to CreateLargeDerivativesJob.
def perform(file_set, file_id, filepath = nil)
return super if is_a?(CreateLargeDerivativesJob)
return super unless file_set.video? || file_set.audio?

CreateLargeDerivativesJob.perform_later(*arguments)
true
end
end

CreateDerivativesJob.prepend(CreateDerivativesJobDecorator)
16 changes: 16 additions & 0 deletions app/jobs/create_large_derivatives_job.rb
@@ -0,0 +1,16 @@
# frozen_string_literal: true

# CreateLargeDerivativesJob is intended to be used for resource-intensive derivative
# generation (e.g. video processing). It is functionally similar to CreateDerivativesJob,
# except that it queues jobs in the :resource_intensive queue.
#
# The worker responsible for processing jobs in the :resource_intensive queue should be
# configured to have more resources dedicated to it, especially CPU. Otherwise, the
# `ffmpeg` commands that this job class eventually triggers could be throttled.
#
# @see CreateDerivativesJobDecorator
# @see Hydra::Derivatives::Processors::Ffmpeg
# @see https://github.com/scientist-softserv/palni-palci/issues/852
class CreateLargeDerivativesJob < CreateDerivativesJob
queue_as :resource_intensive
end
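The routing that the decorator and the subclass set up together can be sketched in plain Ruby, without Rails or ActiveJob. Everything named `Sketch...` or `FakeFileSet` below is a hypothetical stand-in, `perform_later` runs inline instead of enqueueing, and the arguments are passed explicitly because ActiveJob's `arguments` accessor is not available here. It only illustrates the guard order, not the real job classes.

```ruby
# Records [queue, file name] pairs in the order jobs are performed.
PROCESSED = []

# Hypothetical stand-in for a Hyrax file set.
FakeFileSet = Struct.new(:name, :kind) do
  def video?
    kind == :video
  end

  def audio?
    kind == :audio
  end
end

# Stand-in for CreateDerivativesJob (assumption: no ActiveJob involved).
class SketchDerivativesJob
  def self.queue_name
    :default
  end

  # Stand-in for perform_later: runs the job inline.
  def self.perform_later(*args)
    new.perform(*args)
  end

  def perform(file_set, _file_id, _filepath = nil)
    PROCESSED << [self.class.queue_name, file_set.name]
    true
  end
end

# Same guard order as CreateDerivativesJobDecorator in the diff.
module SketchDerivativesJobDecorator
  def perform(file_set, file_id, filepath = nil)
    # Already running as the large job: do the real work.
    return super if is_a?(SketchLargeDerivativesJob)
    # Not audio or video: stay on the normal queue.
    return super unless file_set.video? || file_set.audio?

    SketchLargeDerivativesJob.perform_later(file_set, file_id, filepath)
    true
  end
end

SketchDerivativesJob.prepend(SketchDerivativesJobDecorator)

# Stand-in for CreateLargeDerivativesJob.
class SketchLargeDerivativesJob < SketchDerivativesJob
  def self.queue_name
    :resource_intensive
  end
end

SketchDerivativesJob.new.perform(FakeFileSet.new('page.tiff', :image), 'f1')
SketchDerivativesJob.new.perform(FakeFileSet.new('talk.mp4', :video), 'f2')
SketchDerivativesJob.new.perform(FakeFileSet.new('song.wav', :audio), 'f3')

PROCESSED.each { |queue, name| puts "#{name} -> #{queue}" }
# page.tiff -> default
# talk.mp4 -> resource_intensive
# song.wav -> resource_intensive
```

The `is_a?` guard matters because the subclass inherits the prepended `perform`; without it, the large job would re-enqueue itself instead of doing the work.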
@@ -42,18 +42,11 @@ def video_display_content(_url, label = '')
height = solr_document.height&.try(:to_i) || 240
duration = conformed_duration_in_seconds
IIIFManifest::V3::DisplayContent.new(
-# Hyrax::IiifAv::Engine.routes.url_helpers.iiif_av_content_url(
-# solr_document.id,
-# label: label,
-# host: request.base_url
-# ),
-# TODO: This is a hack to pull the download url from hyrax as the video resource.
-# Ultimately we want to fix the processing times of the video derivatives so it doesn't take
-# hours to days to complete. The draw back of doing it this way is that we're using the original
-# video file which is fine if it's already processed, but if it's a raw, then it is not ideal for
-# streaming purposes. The good thing is that PALs seem to be processing the video derivatives out
-# of band first before ingesting so we shouldn't run into this issue.
-Hyrax::Engine.routes.url_helpers.download_url(solr_document.id, host: request.base_url, protocol: 'https'),
+Hyrax::IiifAv::Engine.routes.url_helpers.iiif_av_content_url(
+solr_document.id,
+label: label,
+host: request.base_url
+),
label: label,
width: width,
height: height,
2 changes: 1 addition & 1 deletion bin/helm_deploy
@@ -23,7 +23,7 @@ WORKER_IMAGE="${WORKER_IMAGE:-ghcr.io/samvera/hyku/worker}"
DEPLOY_TAG="${DEPLOY_TAG:-latest}"
WORKER_TAG="${WORKER_TAG:-$DEPLOY_TAG}"

-helm pull oci://ghcr.io/samvera/charts/hyrax --version 2.0.0 --untar --untardir charts
+helm pull oci://ghcr.io/samvera/charts/hyrax --version 3.5.1 --untar --untardir charts

helm repo update

6 changes: 5 additions & 1 deletion bin/worker
@@ -9,4 +9,8 @@ else
puts 'DATABASE_URL not set, no pool change needed'
end

-exec "echo $DATABASE_URL && bundle exec sidekiq"
+if ENV['SIDEKIQ_CONFIG']
+exec "echo $DATABASE_URL && bundle exec sidekiq -C #{ENV['SIDEKIQ_CONFIG']}"
+else
+exec "echo $DATABASE_URL && bundle exec sidekiq"
+end
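The branch on `SIDEKIQ_CONFIG` above could also be written as a small helper that builds the argument list once. This is only a sketch: `sidekiq_command` is a hypothetical name that does not appear in the PR, it drops the `echo $DATABASE_URL` step, and unlike the diff it treats an empty `SIDEKIQ_CONFIG` as unset.

```ruby
# Hypothetical helper (not in the PR): builds the Sidekiq argv from an
# environment-like hash, adding -C only when a config path is given.
def sidekiq_command(env)
  command = %w[bundle exec sidekiq]
  config = env['SIDEKIQ_CONFIG']
  # Assumption: an empty string is treated the same as an unset variable.
  command += ['-C', config] unless config.nil? || config.empty?
  command
end

p sidekiq_command({})
# ["bundle", "exec", "sidekiq"]
p sidekiq_command('SIDEKIQ_CONFIG' => 'config/sidekiq_resource_intensive.yml')
# ["bundle", "exec", "sidekiq", "-C", "config/sidekiq_resource_intensive.yml"]
```

Passing the result as `exec(*sidekiq_command(ENV))` would start the process without going through a shell, so the config path would not be subject to shell interpolation.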
15 changes: 5 additions & 10 deletions config/initializers/file_set_derivatives_overrides.rb
@@ -48,17 +48,12 @@ def create_video_derivatives(filename)
original_size = "#{width}x#{height}"
size = width.nil? || height.nil? ? DEFAULT_VIDEO_SIZE : original_size
Hydra::Derivatives::Processors::Video::Processor.config.size_attributes = size
# HACK: Commented out the non-thumbnail derivative generation as they are clogging the ecosystem.
# See https://github.com/scientist-softserv/palni-palci/issues/924
# rubocop:disable Style/TrailingCommaInHashLiteral
Hydra::Derivatives::VideoDerivatives.create(filename,
outputs: [{ label: :thumbnail, format: 'jpg',
url: derivative_url('thumbnail') },
# { label: 'webm', format: 'webm',
# url: derivative_url('webm') },
# { label: 'mp4', format: 'mp4',
# url: derivative_url('mp4') }
])
# rubocop:enable Style/TrailingCommaInHashLiteral
{ label: 'webm', format: 'webm',
url: derivative_url('webm') },
{ label: 'mp4', format: 'mp4',
url: derivative_url('mp4') }])
end
end
end
7 changes: 7 additions & 0 deletions config/sidekiq_resource_intensive.yml
@@ -0,0 +1,7 @@
---
:concurrency: <%= ENV['SIDEKIQ_INTENSIVE_THREAD_COUNT'] %>
:queues:
- default
- import
- export
- resource_intensive
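To see what the `:concurrency:` line evaluates to, the config can be rendered by hand. The sketch below assumes the file is passed through ERB before being parsed as YAML, as the `<%= %>` tag implies. `load_sidekiq_config` is a hypothetical helper, and the template is an indented copy of the file above. It does not show how Sidekiq itself reacts to a missing value.

```ruby
require 'erb'
require 'yaml'

# Copy of config/sidekiq_resource_intensive.yml with list indentation restored.
TEMPLATE = <<~TEMPLATE
  ---
  :concurrency: <%= ENV['SIDEKIQ_INTENSIVE_THREAD_COUNT'] %>
  :queues:
    - default
    - import
    - export
    - resource_intensive
TEMPLATE

# Hypothetical helper: render ERB, then parse the result as YAML.
# Symbol must be permitted because the keys are written as :symbols.
def load_sidekiq_config(template)
  rendered = ERB.new(template).result
  YAML.safe_load(rendered, permitted_classes: [Symbol])
end

ENV['SIDEKIQ_INTENSIVE_THREAD_COUNT'] = '1'
config = load_sidekiq_config(TEMPLATE)
p config[:concurrency] # 1
p config[:queues]      # ["default", "import", "export", "resource_intensive"]

# With the variable unset, the ERB tag renders as an empty string,
# so the YAML value is parsed as nil.
ENV.delete('SIDEKIQ_INTENSIVE_THREAD_COUNT')
p load_sidekiq_config(TEMPLATE)[:concurrency] # nil
```

If a fallback is wanted, `ENV.fetch('SIDEKIQ_INTENSIVE_THREAD_COUNT', 1)` inside the tag is one option; the PR instead sets the variable explicitly in `.env`.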
63 changes: 36 additions & 27 deletions docker-compose.yml
@@ -20,6 +20,35 @@ x-app: &app
networks:
internal:

x-app-worker: &app-worker
<<: *app
build:
context: .
target: hyku-worker
args:
- EXTRA_APK_PACKAGES=less vim bash openjdk11-jre ffmpeg rsync exiftool
- HYKU_BULKRAX_ENABLED=true
cache_from:
- ghcr.io/scientist-softserv/palni-palci:${TAG:-latest}
- ghcr.io/scientist-softserv/palni-palci/worker:${TAG:-latest}
image: ghcr.io/scientist-softserv/palni-palci/worker:${TAG:-latest}
command: sh -l -c 'bundle && bundle exec sidekiq'
depends_on:
check_volumes:
condition: service_completed_successfully
initialize_app:
condition: service_completed_successfully
db:
condition: service_started
solr:
condition: service_started
fcrepo:
condition: service_started
redis:
condition: service_started
zoo:
condition: service_started

volumes:
assets:
cache:
@@ -145,6 +174,8 @@ services:
condition: service_started
worker:
condition: service_started
worker_resource_intensive:
condition: service_started
initialize_app:
condition: service_completed_successfully
# ports:
@@ -153,33 +184,11 @@ services:
- 3000

worker:
-<<: *app
-build:
-context: .
-target: hyku-worker
-args:
-- EXTRA_APK_PACKAGES=less vim bash openjdk11-jre ffmpeg rsync exiftool
-- HYKU_BULKRAX_ENABLED=true
-cache_from:
-- ghcr.io/scientist-softserv/palni-palci:${TAG:-latest}
-- ghcr.io/scientist-softserv/palni-palci/worker:${TAG:-latest}
-image: ghcr.io/scientist-softserv/palni-palci/worker:${TAG:-latest}
-command: sh -l -c 'bundle && bundle exec sidekiq'
-depends_on:
-check_volumes:
-condition: service_completed_successfully
-initialize_app:
-condition: service_completed_successfully
-db:
-condition: service_started
-solr:
-condition: service_started
-fcrepo:
-condition: service_started
-redis:
-condition: service_started
-zoo:
-condition: service_started
+<<: *app-worker

worker_resource_intensive:
<<: *app-worker
command: sh -l -c 'bundle && bundle exec sidekiq -C config/sidekiq_resource_intensive.yml'

# Do not recurse through all of tmp. Derivatives will make booting
# very slow and eventually just time out as data grows