Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve and unify ingestion server logging, simplify docker CMDs #2639

Merged
merged 7 commits into from
Jul 14, 2023

Conversation

AetherUnbound
Copy link
Collaborator

@AetherUnbound AetherUnbound commented Jul 12, 2023

Fixes

This PR is an effort towards addressing https://github.com/WordPress/openverse-infrastructure/issues/529, which seeks to handle multi-line log capturing in Cloudwatch. While working on that issue, I noticed that the ingestion server actually had two separate log formats: one coming from gunicorn and the other coming from the standard python in-process logging. Here's an example of them mixing in production:

[2023-07-11 06:30:23 +0000] [7] [DEBUG] GET /task/393d1d750d2843369185d9016bda243e
[2023-07-11 06:45:24 +0000] [7] [DEBUG] GET /task/393d1d750d2843369185d9016bda243e
2023-07-11 07:00:08,074 INFO base.py:265 - POST http://openverse-elasticsearch-prod.private:9200/_reindex?slices=auto&wait_for_completion=true [status:200 request:40542.852s]
[2023-07-11 07:00:25 +0000] [7] [DEBUG] GET /task/393d1d750d2843369185d9016bda243e
2023-07-11 07:00:38,835 INFO base.py:265 - POST http://openverse-elasticsearch-prod.private:9200/image-273c6f23b5b144ad8f1b576f5e3a14d7-filtered/_refresh [status:200 request:30.760s]
2023-07-11 07:00:38,861 INFO base.py:265 - PUT http://openverse-elasticsearch-prod.private:9200/image-273c6f23b5b144ad8f1b576f5e3a14d7-filtered/_settings [status:200 request:0.026s]
2023-07-11 07:00:38,862 INFO tasks.py:268 - Task 393d1d750d2843369185d9016bda243e completed.
[2023-07-11 07:15:27 +0000] [7] [DEBUG] GET /task/393d1d750d2843369185d9016bda243e
[2023-07-11 07:15:30 +0000] [7] [DEBUG] POST /task
2023-07-11 07:15:30,346 INFO es_helpers.py:58 - Connecting to openverse-elasticsearch-prod.private:9200 with AWS auth
2023-07-11 07:15:30,358 INFO base.py:265 - GET http://openverse-elasticsearch-prod.private:9200/ [status:200 request:0.010s]
2023-07-11 07:15:30,359 INFO base.py:265 - GET http://openverse-elasticsearch-prod.private:9200/ [status:200 request:0.001s]
2023-07-11 07:15:30,360 INFO indexer.py:357 - Waiting for replica shards. . .
[2023-07-11 07:15:31 +0000] [7] [DEBUG] POST /task
2023-07-11 07:15:31,939 INFO es_helpers.py:58 - Connecting to openverse-elasticsearch-prod.private:9200 with AWS auth
2023-07-11 07:15:31,947 INFO base.py:265 - GET http://openverse-elasticsearch-prod.private:9200/ [status:200 request:0.006s]
2023-07-11 07:15:31,948 INFO base.py:265 - GET http://openverse-elasticsearch-prod.private:9200/ [status:200 request:0.001s]

While we could set up a matcher for these two different formats, Cloudwatch also supports using the timestamp parsing format as the multiline matcher, which seems ideal for us. The timestamp parser is very naive, so we needed a single format per service in order to handle this appropriately.

Description

This PR modifies the gunicorn configuration for both the ingestion server and the indexer worker to provide a single logging format for all messages emitted by the service while preserving the various log levels. As an added bonus, we also gain the gunicorn.access logs, which were previously not showing up for the ingestion server.

Previously, we had separate configuration files for the ingestion server and the indexer worker. These files were nearly identical (all that differs now is the port binding and the actual WSGI application declaration). With the new logging config which was added, I decided that it'd be better to use the same config across both containers and only declare the differences in the actual command which runs the container. This prevents us having two verbose configurations which we would need to maintain two copies of.

Additionally, the docker-compose file within the ingestion server folder was removed. I believe this was intended to be used similar to the catalog's version (which was recently moved into the infra repo anyway). However we recreate the docker-compose file when deploying in the init file anyway, so this is no longer useful.

This will be followed up by a PR in the infrastructure repo with adjustments for deployment that are necessary given this new setup.

Logging comparison

Below are the logs for just api/init, before and after these changes. The important differences to note are that the log format is now all the same and we get a bit more information regarding access logs!

Logs before
openverse-ingestion_server-1  | [2023-07-12 21:47:18 +0000] [1] [DEBUG] Current configuration:
openverse-ingestion_server-1  |   config: ./gunicorn.conf.py
openverse-ingestion_server-1  |   wsgi_app: api:api
openverse-ingestion_server-1  |   bind: ['0.0.0.0:8001']
openverse-ingestion_server-1  |   backlog: 2048
openverse-ingestion_server-1  |   workers: 1
openverse-ingestion_server-1  |   worker_class: sync
openverse-ingestion_server-1  |   threads: 1
openverse-ingestion_server-1  |   worker_connections: 1000
openverse-ingestion_server-1  |   max_requests: 0
openverse-ingestion_server-1  |   max_requests_jitter: 0
openverse-ingestion_server-1  |   timeout: 120
openverse-ingestion_server-1  |   graceful_timeout: 30
openverse-ingestion_server-1  |   keepalive: 2
openverse-ingestion_server-1  |   limit_request_line: 4094
openverse-ingestion_server-1  |   limit_request_fields: 100
openverse-ingestion_server-1  |   limit_request_field_size: 8190
openverse-ingestion_server-1  |   reload: True
openverse-ingestion_server-1  |   reload_engine: auto
openverse-ingestion_server-1  |   reload_extra_files: []
openverse-ingestion_server-1  |   spew: False
openverse-ingestion_server-1  |   check_config: False
openverse-ingestion_server-1  |   print_config: False
openverse-ingestion_server-1  |   preload_app: False
openverse-ingestion_server-1  |   sendfile: None
openverse-ingestion_server-1  |   reuse_port: False
openverse-ingestion_server-1  |   chdir: /ingestion_server/ingestion_server
openverse-ingestion_server-1  |   daemon: False
openverse-ingestion_server-1  |   raw_env: []
openverse-ingestion_server-1  |   pidfile: None
openverse-ingestion_server-1  |   worker_tmp_dir: None
openverse-ingestion_server-1  |   user: 1000
openverse-ingestion_server-1  |   group: 1000
openverse-ingestion_server-1  |   umask: 0
openverse-ingestion_server-1  |   initgroups: False
openverse-ingestion_server-1  |   tmp_upload_dir: None
openverse-ingestion_server-1  |   secure_scheme_headers: {'X-FORWARDED-PROTOCOL': 'ssl', 'X-FORWARDED-PROTO': 'https', 'X-FORWARDED-SSL': 'on'}
openverse-ingestion_server-1  |   forwarded_allow_ips: ['127.0.0.1']
openverse-ingestion_server-1  |   accesslog: None
openverse-ingestion_server-1  |   disable_redirect_access_to_syslog: False
openverse-ingestion_server-1  |   access_log_format: %(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"
openverse-ingestion_server-1  |   errorlog: -
openverse-ingestion_server-1  |   loglevel: debug
openverse-ingestion_server-1  |   capture_output: True
openverse-ingestion_server-1  |   logger_class: gunicorn.glogging.Logger
openverse-ingestion_server-1  |   logconfig: None
openverse-ingestion_server-1  |   logconfig_dict: {}
openverse-ingestion_server-1  |   syslog_addr: udp://localhost:514
openverse-ingestion_server-1  |   syslog: False
openverse-ingestion_server-1  |   syslog_prefix: None
openverse-ingestion_server-1  |   syslog_facility: user
openverse-ingestion_server-1  |   enable_stdio_inheritance: False
openverse-ingestion_server-1  |   statsd_host: None
openverse-ingestion_server-1  |   dogstatsd_tags: 
openverse-ingestion_server-1  |   statsd_prefix: 
openverse-ingestion_server-1  |   proc_name: None
openverse-ingestion_server-1  |   default_proc_name: gunicorn
openverse-ingestion_server-1  |   pythonpath: None
openverse-ingestion_server-1  |   paste: None
openverse-ingestion_server-1  |   on_starting: <function OnStarting.on_starting at 0x7f3ea8c24400>
openverse-ingestion_server-1  |   on_reload: <function OnReload.on_reload at 0x7f3ea8c24540>
openverse-ingestion_server-1  |   when_ready: <function WhenReady.when_ready at 0x7f3ea8c24680>
openverse-ingestion_server-1  |   pre_fork: <function Prefork.pre_fork at 0x7f3ea8c247c0>
openverse-ingestion_server-1  |   post_fork: <function Postfork.post_fork at 0x7f3ea8c24900>
openverse-ingestion_server-1  |   post_worker_init: <function PostWorkerInit.post_worker_init at 0x7f3ea8c24a40>
openverse-ingestion_server-1  |   worker_int: <function WorkerInt.worker_int at 0x7f3ea8c24b80>
openverse-ingestion_server-1  |   worker_abort: <function WorkerAbort.worker_abort at 0x7f3ea8c24cc0>
openverse-ingestion_server-1  |   pre_exec: <function PreExec.pre_exec at 0x7f3ea8c24e00>
openverse-ingestion_server-1  |   pre_request: <function PreRequest.pre_request at 0x7f3ea8c24f40>
openverse-ingestion_server-1  |   post_request: <function PostRequest.post_request at 0x7f3ea8c24fe0>
openverse-ingestion_server-1  |   child_exit: <function ChildExit.child_exit at 0x7f3ea8c25120>
openverse-ingestion_server-1  |   worker_exit: <function WorkerExit.worker_exit at 0x7f3ea8c25260>
openverse-ingestion_server-1  |   nworkers_changed: <function NumWorkersChanged.nworkers_changed at 0x7f3ea8c253a0>
openverse-ingestion_server-1  |   on_exit: <function OnExit.on_exit at 0x7f3ea8c254e0>
openverse-ingestion_server-1  |   proxy_protocol: False
openverse-ingestion_server-1  |   proxy_allow_ips: ['127.0.0.1']
openverse-ingestion_server-1  |   keyfile: None
openverse-ingestion_server-1  |   certfile: None
openverse-ingestion_server-1  |   ssl_version: 2
openverse-ingestion_server-1  |   cert_reqs: 0
openverse-ingestion_server-1  |   ca_certs: None
openverse-ingestion_server-1  |   suppress_ragged_eofs: True
openverse-ingestion_server-1  |   do_handshake_on_connect: False
openverse-ingestion_server-1  |   ciphers: None
openverse-ingestion_server-1  |   raw_paste_global_conf: []
openverse-ingestion_server-1  |   strip_header_spaces: False
openverse-ingestion_server-1  | [2023-07-12 21:47:18 +0000] [1] [INFO] Starting gunicorn 20.1.0
openverse-ingestion_server-1  | [2023-07-12 21:47:18 +0000] [1] [DEBUG] Arbiter booted
openverse-ingestion_server-1  | [2023-07-12 21:47:18 +0000] [1] [INFO] Listening at: http://0.0.0.0:8001 (1)
openverse-ingestion_server-1  | [2023-07-12 21:47:18 +0000] [1] [INFO] Using worker: sync
openverse-ingestion_server-1  | [2023-07-12 21:47:18 +0000] [7] [INFO] Booting worker with pid: 7
openverse-ingestion_server-1  | [2023-07-12 21:47:18 +0000] [1] [DEBUG] 1 workers
openverse-ingestion_server-1  | [2023-07-12 21:47:20 +0000] [7] [DEBUG] GET /
openverse-ingestion_server-1  | [2023-07-12 21:47:44 +0000] [7] [DEBUG] POST /task
openverse-ingestion_server-1  | 2023-07-12 21:47:44,265 INFO es_helpers.py:58 - Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | 2023-07-12 21:47:44,275 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.008s]
openverse-ingestion_server-1  | 2023-07-12 21:47:44,279 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.003s]
openverse-ingestion_server-1  | 2023-07-12 21:47:44,280 INFO ingest.py:277 - `audio`: Starting ingestion server data refresh | _Next: copying data from upstream_
openverse-ingestion_server-1  | 2023-07-12 21:47:44,293 INFO ingest.py:79 - Shared columns: ['title', 'sample_rate', 'last_synced_with_source', 'thumbnail', 'watermarked', 'creator', 'meta_data', 'license_version', 'created_on', 'foreign_landing_url', 'provider', 'license', 'genres', 'filetype', 'creator_url', 'identifier', 'duration', 'removed_from_source', 'url', 'foreign_identifier', 'source', 'category', 'tags', 'filesize', 'alt_files', 'bit_rate', 'updated_on', 'audio_set_foreign_identifier']
openverse-ingestion_server-1  | 2023-07-12 21:47:44,293 INFO ingest.py:295 - (Re)initializing foreign data wrapper
openverse-ingestion_server-1  | 2023-07-12 21:47:44,312 INFO ingest.py:315 - Copying upstream data...
openverse-ingestion_server-1  | 2023-07-12 21:47:44,313 INFO ingest.py:325 - Running copy-data query: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | DROP TABLE IF EXISTS "temp_import_audio";
openverse-ingestion_server-1  | CREATE TABLE "temp_import_audio" (LIKE "audio" INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audio" ADD COLUMN IF NOT EXISTS
openverse-ingestion_server-1  |     id serial;
openverse-ingestion_server-1  | CREATE TEMP SEQUENCE IF NOT EXISTS id_temp_seq;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audio" ALTER COLUMN
openverse-ingestion_server-1  |     id SET DEFAULT nextval('id_temp_seq'::regclass);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audio" ADD COLUMN IF NOT EXISTS
openverse-ingestion_server-1  |     standardized_popularity double precision;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audio" ALTER COLUMN
openverse-ingestion_server-1  |     view_count SET DEFAULT 0;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | INSERT INTO "temp_import_audio" ("title","sample_rate","last_synced_with_source","thumbnail","watermarked","creator","meta_data","license_version","created_on","foreign_landing_url","provider","license","genres","filetype","creator_url","identifier","duration","removed_from_source","url","foreign_identifier","source","category","tags","filesize","alt_files","bit_rate","updated_on","audio_set_foreign_identifier")
openverse-ingestion_server-1  |     SELECT "title","sample_rate","last_synced_with_source","thumbnail","watermarked","creator","meta_data","license_version","created_on","foreign_landing_url","provider","license","genres","filetype","creator_url","identifier","duration","removed_from_source","url","foreign_identifier","source","category","tags","filesize","alt_files","bit_rate","updated_on","audio_set_foreign_identifier" from "upstream_schema"."audio_view" AS u
openverse-ingestion_server-1  |     WHERE NOT EXISTS(
openverse-ingestion_server-1  |         SELECT FROM "api_deletedaudio" WHERE identifier = u.identifier
openverse-ingestion_server-1  |     )
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ORDER BY identifier
openverse-ingestion_server-1  | LIMIT 100000;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audio" ADD PRIMARY KEY (id);
openverse-ingestion_server-1  | DROP SERVER upstream CASCADE;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | 2023-07-12 21:47:44,499 INFO ingest.py:329 - `audio`: Data copy complete | _Next: Elasticsearch reindex_
openverse-ingestion_server-1  | 2023-07-12 21:47:44,499 INFO ingest.py:339 - Finished refreshing table 'audio'.
openverse-ingestion_server-1  | 2023-07-12 21:47:44,500 INFO ingest.py:277 - `audioset`: Starting ingestion server data refresh | _Next: copying data from upstream_
openverse-ingestion_server-1  | 2023-07-12 21:47:44,509 INFO ingest.py:79 - Shared columns: ['foreign_identifier', 'filetype', 'thumbnail', 'title', 'creator_url', 'creator', 'filesize', 'url', 'foreign_landing_url', 'provider']
openverse-ingestion_server-1  | 2023-07-12 21:47:44,509 INFO ingest.py:295 - (Re)initializing foreign data wrapper
openverse-ingestion_server-1  | 2023-07-12 21:47:44,520 INFO ingest.py:315 - Copying upstream data...
openverse-ingestion_server-1  | 2023-07-12 21:47:44,521 INFO ingest.py:325 - Running copy-data query: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | DROP TABLE IF EXISTS "temp_import_audioset";
openverse-ingestion_server-1  | CREATE TABLE "temp_import_audioset" (LIKE "audioset" INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audioset" ADD COLUMN IF NOT EXISTS
openverse-ingestion_server-1  |     id serial;
openverse-ingestion_server-1  | CREATE TEMP SEQUENCE IF NOT EXISTS id_temp_seq;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audioset" ALTER COLUMN
openverse-ingestion_server-1  |     id SET DEFAULT nextval('id_temp_seq'::regclass);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audioset" ALTER COLUMN
openverse-ingestion_server-1  |     created_on SET DEFAULT CURRENT_TIMESTAMP;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audioset" ALTER COLUMN
openverse-ingestion_server-1  |     updated_on SET DEFAULT CURRENT_TIMESTAMP;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | INSERT INTO "temp_import_audioset" ("foreign_identifier","filetype","thumbnail","title","creator_url","creator","filesize","url","foreign_landing_url","provider") SELECT "foreign_identifier","filetype","thumbnail","title","creator_url","creator","filesize","url","foreign_landing_url","provider" FROM "upstream_schema"."audioset_view"
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | LIMIT 100000;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audioset" ADD PRIMARY KEY (id);
openverse-ingestion_server-1  | DROP SERVER upstream CASCADE;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | 2023-07-12 21:47:44,536 INFO ingest.py:329 - `audioset`: Data copy complete | _Next: Elasticsearch reindex_
openverse-ingestion_server-1  | 2023-07-12 21:47:44,536 INFO ingest.py:339 - Finished refreshing table 'audioset'.
openverse-ingestion_server-1  | 2023-07-12 21:47:44,536 INFO indexer.py:298 - Creating index audio-init for model audio from table temp_import_audio.
openverse-ingestion_server-1  | 2023-07-12 21:47:45,099 INFO base.py:265 - PUT http://es:9200/audio-init [status:200 request:0.562s]
openverse-ingestion_server-1  | 2023-07-12 21:47:45,099 INFO indexer.py:307 - Running distributed index using indexer workers.
openverse-ingestion_server-1  | 2023-07-12 21:47:45,116 INFO distributed_reindex_scheduler.py:107 - Checking http://172.25.0.7:8002/healthcheck. . .
openverse-ingestion_server-1  | 2023-07-12 21:47:45,125 INFO distributed_reindex_scheduler.py:120 - http://172.25.0.7:8002/healthcheck passed healthcheck
openverse-ingestion_server-1  | 2023-07-12 21:47:45,126 INFO distributed_reindex_scheduler.py:60 - Assigning job {'model_name': 'audio', 'table_name': 'temp_import_audio', 'start_id': 0, 'end_id': 5000, 'target_index': 'audio-init'} to http://172.25.0.7:8002
openverse-ingestion_server-1  | 2023-07-12 21:47:45,150 INFO tasks.py:268 - Task 777a38ff7f694b60b41aed126056a997 completed.
openverse-ingestion_server-1  | [2023-07-12 21:47:46 +0000] [7] [DEBUG] POST /worker_finished
openverse-ingestion_server-1  | 2023-07-12 21:47:46,843 INFO state.py:88 - Received worker_finished signal from 172.25.0.7
openverse-ingestion_server-1  | 2023-07-12 21:47:46,845 INFO api.py:285 - All indexer workers succeeded! New index: audio-init
openverse-ingestion_server-1  | 2023-07-12 21:47:46,846 INFO es_helpers.py:58 - Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | 2023-07-12 21:47:46,855 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.009s]
openverse-ingestion_server-1  | 2023-07-12 21:47:46,860 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.005s]
openverse-ingestion_server-1  | 2023-07-12 21:47:47,750 INFO base.py:265 - POST http://es:9200/audio-init/_refresh [status:200 request:0.885s]
openverse-ingestion_server-1  | 2023-07-12 21:47:47,812 INFO base.py:265 - PUT http://es:9200/audio-init/_settings [status:200 request:0.061s]
openverse-ingestion_server-1  | [2023-07-12 21:47:49 +0000] [7] [DEBUG] POST /task
openverse-ingestion_server-1  | 2023-07-12 21:47:49,663 INFO es_helpers.py:58 - Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | 2023-07-12 21:47:49,672 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.007s]
openverse-ingestion_server-1  | 2023-07-12 21:47:49,675 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.004s]
openverse-ingestion_server-1  | 2023-07-12 21:47:49,676 INFO ingest.py:360 - `audio`: Starting table promotion | _Next: recreate-indices_
openverse-ingestion_server-1  | 2023-07-12 21:47:49,679 INFO ingest.py:365 - Recreating database indices...
openverse-ingestion_server-1  | 2023-07-12 21:47:49,688 INFO ingest.py:370 - Running: CREATE UNIQUE INDEX temp_import_audio_identifier_key ON public.temp_import_audio USING btree (identifier)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,695 INFO ingest.py:370 - Running: CREATE UNIQUE INDEX temp_import_audio_url_key ON public.temp_import_audio USING btree (url)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,713 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audio_foreign_identifier_617f66ad_like ON public.temp_import_audio USING btree (foreign_identifier varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,720 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audio_url_b6a832d3_like ON public.temp_import_audio USING btree (url varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,729 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audio_provider_8fe1eb54 ON public.temp_import_audio USING btree (provider)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,736 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audio_provider_8fe1eb54_like ON public.temp_import_audio USING btree (provider varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,744 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audio_source_e9ccc813 ON public.temp_import_audio USING btree (source)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,754 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audio_source_e9ccc813_like ON public.temp_import_audio USING btree (source varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,762 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audio_last_synced_with_source_94c4a383 ON public.temp_import_audio USING btree (last_synced_with_source)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,768 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audio_genres_e34cc474 ON public.temp_import_audio USING btree (genres)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,775 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audio_category_ceb7d386 ON public.temp_import_audio USING btree (category)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,782 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audio_category_ceb7d386_like ON public.temp_import_audio USING btree (category varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,790 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audio_foreign_identifier_617f66ad ON public.temp_import_audio USING btree (foreign_identifier)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,800 INFO ingest.py:370 - Running: CREATE UNIQUE INDEX temp_import_unique_provider_audio ON public.temp_import_audio USING btree (foreign_identifier, provider)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,814 INFO ingest.py:372 - Done creating indices! Remapping constraints...
openverse-ingestion_server-1  | 2023-07-12 21:47:49,824 INFO ingest.py:379 - Running: DELETE FROM "audiolist_audios" AS fk_table WHERE NOT EXISTS(SELECT 1 FROM "temp_import_audio" AS r WHERE "r"."id" = "fk_table"."audio_id");
openverse-ingestion_server-1  | 2023-07-12 21:47:49,826 INFO ingest.py:379 - Running: ALTER TABLE "audio" DROP CONSTRAINT "unique_provider_audio"
openverse-ingestion_server-1  | 2023-07-12 21:47:49,828 INFO ingest.py:379 - Running: ALTER TABLE "audio" ADD UNIQUE (foreign_identifier, provider)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,831 INFO ingest.py:379 - Running: ALTER TABLE "audio" DROP CONSTRAINT "audio_url_key"
openverse-ingestion_server-1  | 2023-07-12 21:47:49,832 INFO ingest.py:379 - Running: ALTER TABLE "audio" ADD UNIQUE (url)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,835 INFO ingest.py:379 - Running: ALTER TABLE "audio" DROP CONSTRAINT "audio_identifier_key"
openverse-ingestion_server-1  | 2023-07-12 21:47:49,836 INFO ingest.py:379 - Running: ALTER TABLE "audio" ADD UNIQUE (identifier)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,840 INFO ingest.py:379 - Running: ALTER TABLE "audiolist_audios" DROP CONSTRAINT "audiolist_audios_audio_id_3b70015d_fk_audio_id"
openverse-ingestion_server-1  | 2023-07-12 21:47:49,841 INFO ingest.py:379 - Running: ALTER TABLE "audiolist_audios" ADD FOREIGN KEY (audio_id) REFERENCES temp_import_audio(id) DEFERRABLE INITIALLY DEFERRED
openverse-ingestion_server-1  | 2023-07-12 21:47:49,844 INFO ingest.py:381 - Done remapping constraints! Going live with new table...
openverse-ingestion_server-1  | 2023-07-12 21:47:49,844 INFO ingest.py:383 - `audio`: Indices & constraints applied | _Next: table promotion_
openverse-ingestion_server-1  | 2023-07-12 21:47:49,844 INFO ingest.py:387 - Running go-live: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  |         DROP TABLE "audio";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_pkey" RENAME TO "audio_pkey";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_identifier_key" RENAME TO "audio_identifier_key";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_url_key" RENAME TO "audio_url_key";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_foreign_identifier_617f66ad_like" RENAME TO "audio_foreign_identifier_617f66ad_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_url_b6a832d3_like" RENAME TO "audio_url_b6a832d3_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_provider_8fe1eb54" RENAME TO "audio_provider_8fe1eb54";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_provider_8fe1eb54_like" RENAME TO "audio_provider_8fe1eb54_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_source_e9ccc813" RENAME TO "audio_source_e9ccc813";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_source_e9ccc813_like" RENAME TO "audio_source_e9ccc813_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_last_synced_with_source_94c4a383" RENAME TO "audio_last_synced_with_source_94c4a383";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_genres_e34cc474" RENAME TO "audio_genres_e34cc474";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_category_ceb7d386" RENAME TO "audio_category_ceb7d386";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_category_ceb7d386_like" RENAME TO "audio_category_ceb7d386_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_foreign_identifier_617f66ad" RENAME TO "audio_foreign_identifier_617f66ad";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_unique_provider_audio" RENAME TO "unique_provider_audio";
openverse-ingestion_server-1  |         ALTER TABLE "temp_import_audio" RENAME TO "audio";
openverse-ingestion_server-1  |     
openverse-ingestion_server-1  | 2023-07-12 21:47:49,852 INFO ingest.py:389 - `audio`: Finished table promotion | _Next: Elasticsearch promotion_
openverse-ingestion_server-1  | 2023-07-12 21:47:49,855 INFO ingest.py:394 - Finished promoting table audio
openverse-ingestion_server-1  | 2023-07-12 21:47:49,855 INFO ingest.py:360 - `audioset`: Starting table promotion | _Next: recreate-indices_
openverse-ingestion_server-1  | 2023-07-12 21:47:49,859 INFO ingest.py:365 - Recreating database indices...
openverse-ingestion_server-1  | 2023-07-12 21:47:49,865 INFO ingest.py:370 - Running: CREATE UNIQUE INDEX temp_import_api_audioset_url_key ON public.temp_import_audioset USING btree (url)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,869 INFO ingest.py:370 - Running: CREATE INDEX temp_import_api_audioset_foreign_identifier_28c1db59_like ON public.temp_import_audioset USING btree (foreign_identifier varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,871 INFO ingest.py:370 - Running: CREATE INDEX temp_import_api_audioset_url_200f9fb4_like ON public.temp_import_audioset USING btree (url varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,876 INFO ingest.py:370 - Running: CREATE INDEX temp_import_api_audioset_foreign_identifier_28c1db59 ON public.temp_import_audioset USING btree (foreign_identifier)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,878 INFO ingest.py:370 - Running: CREATE UNIQUE INDEX temp_import_unique_foreign_identifier_provider ON public.temp_import_audioset USING btree (foreign_identifier, provider)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,881 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audioset_provider_cc061f30 ON public.temp_import_audioset USING btree (provider)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,883 INFO ingest.py:370 - Running: CREATE INDEX temp_import_audioset_provider_cc061f30_like ON public.temp_import_audioset USING btree (provider varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,885 INFO ingest.py:372 - Done creating indices! Remapping constraints...
openverse-ingestion_server-1  | 2023-07-12 21:47:49,893 INFO ingest.py:379 - Running: ALTER TABLE "audioset" DROP CONSTRAINT "unique_foreign_identifier_provider"
openverse-ingestion_server-1  | 2023-07-12 21:47:49,894 INFO ingest.py:379 - Running: ALTER TABLE "audioset" ADD UNIQUE (foreign_identifier, provider)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,897 INFO ingest.py:379 - Running: ALTER TABLE "audioset" DROP CONSTRAINT "api_audioset_url_key"
openverse-ingestion_server-1  | 2023-07-12 21:47:49,898 INFO ingest.py:379 - Running: ALTER TABLE "audioset" ADD UNIQUE (url)
openverse-ingestion_server-1  | 2023-07-12 21:47:49,900 INFO ingest.py:381 - Done remapping constraints! Going live with new table...
openverse-ingestion_server-1  | 2023-07-12 21:47:49,900 INFO ingest.py:383 - `audioset`: Indices & constraints applied | _Next: table promotion_
openverse-ingestion_server-1  | 2023-07-12 21:47:49,900 INFO ingest.py:387 - Running go-live: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  |         DROP TABLE "audioset";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audioset_pkey" RENAME TO "api_audioset_pkey";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_api_audioset_url_key" RENAME TO "api_audioset_url_key";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_api_audioset_foreign_identifier_28c1db59_like" RENAME TO "api_audioset_foreign_identifier_28c1db59_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_api_audioset_url_200f9fb4_like" RENAME TO "api_audioset_url_200f9fb4_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_api_audioset_foreign_identifier_28c1db59" RENAME TO "api_audioset_foreign_identifier_28c1db59";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_unique_foreign_identifier_provider" RENAME TO "unique_foreign_identifier_provider";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audioset_provider_cc061f30" RENAME TO "audioset_provider_cc061f30";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audioset_provider_cc061f30_like" RENAME TO "audioset_provider_cc061f30_like";
openverse-ingestion_server-1  |         ALTER TABLE "temp_import_audioset" RENAME TO "audioset";
openverse-ingestion_server-1  |     
openverse-ingestion_server-1  | 2023-07-12 21:47:49,905 INFO ingest.py:389 - `audioset`: Finished table promotion | _Next: Elasticsearch promotion_
openverse-ingestion_server-1  | 2023-07-12 21:47:49,907 INFO ingest.py:394 - Finished promoting table audioset
openverse-ingestion_server-1  | 2023-07-12 21:47:49,917 WARNING base.py:288 - GET http://es:9200/audio [status:404 request:0.009s]
openverse-ingestion_server-1  | 2023-07-12 21:47:49,952 INFO base.py:265 - PUT http://es:9200/audio-init/_alias/audio [status:200 request:0.034s]
openverse-ingestion_server-1  | 2023-07-12 21:47:49,952 INFO indexer.py:399 - `audio`: Created alias `audio` pointing to index `audio-init`.
openverse-ingestion_server-1  | 2023-07-12 21:47:49,952 INFO tasks.py:268 - Task 1035abe169e44798863ae05af5eadc3a completed.
openverse-ingestion_server-1  | [2023-07-12 21:47:55 +0000] [7] [DEBUG] POST /task
openverse-ingestion_server-1  | 2023-07-12 21:47:55,026 INFO es_helpers.py:58 - Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | 2023-07-12 21:47:55,036 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.009s]
openverse-ingestion_server-1  | 2023-07-12 21:47:55,041 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.004s]
openverse-ingestion_server-1  | [2023-07-12 21:47:55 +0000] [7] [DEBUG] POST /task
openverse-ingestion_server-1  | 2023-07-12 21:47:55,287 INFO es_helpers.py:58 - Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | 2023-07-12 21:47:55,301 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.011s]
openverse-ingestion_server-1  | 2023-07-12 21:47:55,309 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.007s]
openverse-ingestion_server-1  | 2023-07-12 21:47:55,315 WARNING base.py:288 - GET http://es:9200/audio-filtered [status:404 request:0.005s]
openverse-ingestion_server-1  | 2023-07-12 21:47:55,373 INFO base.py:265 - PUT http://es:9200/audio-init-filtered/_alias/audio-filtered [status:200 request:0.057s]
openverse-ingestion_server-1  | 2023-07-12 21:47:55,373 INFO indexer.py:399 - `audio`: Created alias `audio-filtered` pointing to index `audio-init-filtered`.
openverse-ingestion_server-1  | 2023-07-12 21:47:55,374 INFO tasks.py:268 - Task 597abf20ac494a349b3078935bb5d58e completed.
openverse-ingestion_server-1  | [2023-07-12 21:47:55 +0000] [7] [DEBUG] POST /task
openverse-ingestion_server-1  | 2023-07-12 21:47:55,523 INFO es_helpers.py:58 - Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | 2023-07-12 21:47:55,537 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.011s]
openverse-ingestion_server-1  | 2023-07-12 21:47:55,544 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.007s]
openverse-ingestion_server-1  | 2023-07-12 21:47:55,545 INFO ingest.py:277 - `image`: Starting ingestion server data refresh | _Next: copying data from upstream_
openverse-ingestion_server-1  | 2023-07-12 21:47:55,567 INFO ingest.py:79 - Shared columns: ['title', 'width', 'last_synced_with_source', 'thumbnail', 'watermarked', 'creator', 'meta_data', 'license_version', 'created_on', 'foreign_landing_url', 'provider', 'license', 'filetype', 'creator_url', 'identifier', 'removed_from_source', 'url', 'foreign_identifier', 'source', 'category', 'tags', 'filesize', 'updated_on', 'height']
openverse-ingestion_server-1  | 2023-07-12 21:47:55,567 INFO ingest.py:295 - (Re)initializing foreign data wrapper
openverse-ingestion_server-1  | 2023-07-12 21:47:55,586 INFO ingest.py:315 - Copying upstream data...
openverse-ingestion_server-1  | 2023-07-12 21:47:55,587 INFO ingest.py:325 - Running copy-data query: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | DROP TABLE IF EXISTS "temp_import_image";
openverse-ingestion_server-1  | CREATE TABLE "temp_import_image" (LIKE "image" INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_image" ADD COLUMN IF NOT EXISTS
openverse-ingestion_server-1  |     id serial;
openverse-ingestion_server-1  | CREATE TEMP SEQUENCE IF NOT EXISTS id_temp_seq;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_image" ALTER COLUMN
openverse-ingestion_server-1  |     id SET DEFAULT nextval('id_temp_seq'::regclass);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_image" ADD COLUMN IF NOT EXISTS
openverse-ingestion_server-1  |     standardized_popularity double precision;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_image" ALTER COLUMN
openverse-ingestion_server-1  |     view_count SET DEFAULT 0;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | INSERT INTO "temp_import_image" ("title","width","last_synced_with_source","thumbnail","watermarked","creator","meta_data","license_version","created_on","foreign_landing_url","provider","license","filetype","creator_url","identifier","removed_from_source","url","foreign_identifier","source","category","tags","filesize","updated_on","height")
openverse-ingestion_server-1  |     SELECT "title","width","last_synced_with_source","thumbnail","watermarked","creator","meta_data","license_version","created_on","foreign_landing_url","provider","license","filetype","creator_url","identifier","removed_from_source","url","foreign_identifier","source","category","tags","filesize","updated_on","height" from "upstream_schema"."image_view" AS u
openverse-ingestion_server-1  |     WHERE NOT EXISTS(
openverse-ingestion_server-1  |         SELECT FROM "api_deletedimage" WHERE identifier = u.identifier
openverse-ingestion_server-1  |     )
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ORDER BY identifier
openverse-ingestion_server-1  | LIMIT 100000;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_image" ADD PRIMARY KEY (id);
openverse-ingestion_server-1  | DROP SERVER upstream CASCADE;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | 2023-07-12 21:47:55,692 INFO base.py:265 - PUT http://es:9200/audio-init-filtered [status:200 request:0.650s]
openverse-ingestion_server-1  | [2023-07-12 21:47:55 +0000] [7] [DEBUG] GET /static/mock_sensitive_terms.txt
openverse-ingestion_server-1  | 2023-07-12 21:47:56,039 INFO ingest.py:329 - `image`: Data copy complete | _Next: image data cleaning_
openverse-ingestion_server-1  | 2023-07-12 21:47:56,039 INFO ingest.py:333 - Cleaning data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,040 INFO cleanup.py:303 - Cleaning up data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,040 INFO cleanup.py:320 - Running cleanup on selection "SELECT id, identifier, source, url, foreign_landing_url, creator_url, tags from temp_import_image"
openverse-ingestion_server-1  | 2023-07-12 21:47:56,072 INFO cleanup.py:332 - Fetching first batch
openverse-ingestion_server-1  | 2023-07-12 21:47:56,478 INFO base.py:265 - POST http://es:9200/_reindex?slices=auto&wait_for_completion=true [status:200 request:0.777s]
openverse-ingestion_server-1  | 2023-07-12 21:47:56,486 INFO cleanup.py:346 - Dividing work
openverse-ingestion_server-1  | 2023-07-12 21:47:56,486 INFO cleanup.py:348 - Scheduling job 1
openverse-ingestion_server-1  | 2023-07-12 21:47:56,486 INFO cleanup.py:348 - Scheduling job 2
openverse-ingestion_server-1  | 2023-07-12 21:47:56,487 INFO cleanup.py:348 - Scheduling job 3
openverse-ingestion_server-1  | 2023-07-12 21:47:56,487 INFO cleanup.py:348 - Scheduling job 4
openverse-ingestion_server-1  | 2023-07-12 21:47:56,487 INFO cleanup.py:348 - Scheduling job 5
openverse-ingestion_server-1  | 2023-07-12 21:47:56,487 INFO cleanup.py:348 - Scheduling job 6
openverse-ingestion_server-1  | 2023-07-12 21:47:56,487 INFO cleanup.py:348 - Scheduling job 7
openverse-ingestion_server-1  | 2023-07-12 21:47:56,487 INFO cleanup.py:348 - Scheduling job 8
openverse-ingestion_server-1  | 2023-07-12 21:47:56,487 INFO cleanup.py:348 - Scheduling job 9
openverse-ingestion_server-1  | 2023-07-12 21:47:56,487 INFO cleanup.py:348 - Scheduling job 10
openverse-ingestion_server-1  | 2023-07-12 21:47:56,487 INFO cleanup.py:348 - Scheduling job 11
openverse-ingestion_server-1  | 2023-07-12 21:47:56,487 INFO cleanup.py:348 - Scheduling job 12
openverse-ingestion_server-1  | 2023-07-12 21:47:56,488 INFO cleanup.py:348 - Scheduling job 13
openverse-ingestion_server-1  | 2023-07-12 21:47:56,488 INFO cleanup.py:348 - Scheduling job 14
openverse-ingestion_server-1  | 2023-07-12 21:47:56,489 INFO cleanup.py:348 - Scheduling job 15
openverse-ingestion_server-1  | 2023-07-12 21:47:56,489 INFO cleanup.py:348 - Scheduling job 16
openverse-ingestion_server-1  | 2023-07-12 21:47:56,683 INFO cleanup.py:362 - Starting 16 cleaning jobs
openverse-ingestion_server-1  | 2023-07-12 21:47:56,711 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,717 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,718 INFO cleanup.py:216 - Cleaning 312 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,727 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,735 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,736 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,740 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,740 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,742 INFO cleanup.py:269 - Worker finished batch in 0.02375507354736328
openverse-ingestion_server-1  | 2023-07-12 21:47:56,744 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,751 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,752 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,757 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,758 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,759 INFO cleanup.py:269 - Worker finished batch in 0.023212194442749023
openverse-ingestion_server-1  | 2023-07-12 21:47:56,763 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,770 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,770 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,771 INFO cleanup.py:269 - Worker finished batch in 0.01866292953491211
openverse-ingestion_server-1  | 2023-07-12 21:47:56,771 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,771 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,780 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,785 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,785 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,790 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,791 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,791 INFO cleanup.py:269 - Worker finished batch in 0.019587039947509766
openverse-ingestion_server-1  | 2023-07-12 21:47:56,792 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,798 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,798 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,800 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,800 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,800 INFO cleanup.py:269 - Worker finished batch in 0.014754772186279297
openverse-ingestion_server-1  | 2023-07-12 21:47:56,803 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,804 INFO base.py:265 - POST http://es:9200/audio-init-filtered/_refresh [status:200 request:0.326s]
openverse-ingestion_server-1  | 2023-07-12 21:47:56,809 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,810 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,817 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,818 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,818 INFO cleanup.py:269 - Worker finished batch in 0.0196993350982666
openverse-ingestion_server-1  | 2023-07-12 21:47:56,818 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,824 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,825 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,830 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,831 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,832 INFO cleanup.py:269 - Worker finished batch in 0.022463321685791016
openverse-ingestion_server-1  | 2023-07-12 21:47:56,835 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,841 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,842 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,845 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,847 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,847 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,848 INFO cleanup.py:269 - Worker finished batch in 0.022488832473754883
openverse-ingestion_server-1  | 2023-07-12 21:47:56,848 INFO base.py:265 - PUT http://es:9200/audio-init-filtered/_settings [status:200 request:0.043s]
openverse-ingestion_server-1  | 2023-07-12 21:47:56,849 INFO tasks.py:268 - Task 5a70fde35d5e4a9fa64d3e64059c06f6 completed.
openverse-ingestion_server-1  | 2023-07-12 21:47:56,851 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,851 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,858 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,861 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,862 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,862 INFO cleanup.py:269 - Worker finished batch in 0.019951343536376953
openverse-ingestion_server-1  | 2023-07-12 21:47:56,865 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,865 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,868 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,868 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,869 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,869 INFO cleanup.py:269 - Worker finished batch in 0.017658710479736328
openverse-ingestion_server-1  | 2023-07-12 21:47:56,873 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,873 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,877 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,882 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,883 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,885 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,885 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,886 INFO cleanup.py:269 - Worker finished batch in 0.020590782165527344
openverse-ingestion_server-1  | 2023-07-12 21:47:56,886 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,892 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,892 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,892 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,893 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,893 INFO cleanup.py:269 - Worker finished batch in 0.01956009864807129
openverse-ingestion_server-1  | 2023-07-12 21:47:56,898 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,898 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,898 INFO cleanup.py:269 - Worker finished batch in 0.01525259017944336
openverse-ingestion_server-1  | 2023-07-12 21:47:56,900 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,905 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,906 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,908 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,908 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,908 INFO cleanup.py:269 - Worker finished batch in 0.015399932861328125
openverse-ingestion_server-1  | 2023-07-12 21:47:56,910 INFO cleanup.py:211 - Starting data cleaning worker
openverse-ingestion_server-1  | 2023-07-12 21:47:56,915 INFO cleanup.py:214 - Data cleaning worker connected to database
openverse-ingestion_server-1  | 2023-07-12 21:47:56,915 INFO cleanup.py:216 - Cleaning 311 rows
openverse-ingestion_server-1  | 2023-07-12 21:47:56,926 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,926 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,926 INFO cleanup.py:269 - Worker finished batch in 0.020476341247558594
openverse-ingestion_server-1  | 2023-07-12 21:47:56,934 INFO cleanup.py:262 - TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,934 INFO cleanup.py:263 - Worker committing changes...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,934 INFO cleanup.py:269 - Worker finished batch in 0.01942157745361328
openverse-ingestion_server-1  | 2023-07-12 21:47:56,935 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,935 INFO cleanup.py:289 - Finished saving cleaned data in 1.9311904907226562e-05,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,935 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,936 INFO cleanup.py:289 - Finished saving cleaned data in 9.775161743164062e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,936 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,936 INFO cleanup.py:289 - Finished saving cleaned data in 4.5299530029296875e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,936 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,936 INFO cleanup.py:289 - Finished saving cleaned data in 4.291534423828125e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,936 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,936 INFO cleanup.py:289 - Finished saving cleaned data in 3.0994415283203125e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,936 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,936 INFO cleanup.py:289 - Finished saving cleaned data in 4.0531158447265625e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,937 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,937 INFO cleanup.py:289 - Finished saving cleaned data in 4.5299530029296875e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,937 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,937 INFO cleanup.py:289 - Finished saving cleaned data in 5.7220458984375e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,937 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,937 INFO cleanup.py:289 - Finished saving cleaned data in 3.337860107421875e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,937 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,937 INFO cleanup.py:289 - Finished saving cleaned data in 3.814697265625e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,937 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,937 INFO cleanup.py:289 - Finished saving cleaned data in 3.814697265625e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,938 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,938 INFO cleanup.py:289 - Finished saving cleaned data in 4.76837158203125e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,938 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,938 INFO cleanup.py:289 - Finished saving cleaned data in 4.5299530029296875e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,938 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,938 INFO cleanup.py:289 - Finished saving cleaned data in 5.0067901611328125e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,938 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,938 INFO cleanup.py:289 - Finished saving cleaned data in 4.76837158203125e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,938 INFO cleanup.py:274 - Saving cleaned data...
openverse-ingestion_server-1  | 2023-07-12 21:47:56,938 INFO cleanup.py:289 - Finished saving cleaned data in 4.291534423828125e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,956 INFO cleanup.py:374 - Batch finished, records/s: cleanup_rate=10636.543642678613, items cleaned: {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}.
openverse-ingestion_server-1  | Fetching next batch.
openverse-ingestion_server-1  | 2023-07-12 21:47:56,974 INFO cleanup.py:386 - Cleaned all records in 0.9340667724609375 seconds,counts: {'url': 0, 'foreign_landing_url': 0, 'creator_url': 0, 'tags': 0}
openverse-ingestion_server-1  | 2023-07-12 21:47:56,974 INFO ingest.py:335 - Cleaning completed!
openverse-ingestion_server-1  | 2023-07-12 21:47:56,975 INFO ingest.py:336 - `image`: Data cleaning complete | _Next: Elasticsearch reindex_
openverse-ingestion_server-1  | 2023-07-12 21:47:56,975 INFO ingest.py:339 - Finished refreshing table 'image'.
openverse-ingestion_server-1  | 2023-07-12 21:47:56,975 INFO indexer.py:298 - Creating index image-init for model image from table temp_import_image.
openverse-ingestion_server-1  | 2023-07-12 21:47:57,440 INFO base.py:265 - PUT http://es:9200/image-init [status:200 request:0.464s]
openverse-ingestion_server-1  | 2023-07-12 21:47:57,441 INFO indexer.py:307 - Running distributed index using indexer workers.
openverse-ingestion_server-1  | 2023-07-12 21:47:57,448 INFO distributed_reindex_scheduler.py:107 - Checking http://172.25.0.7:8002/healthcheck. . .
openverse-ingestion_server-1  | 2023-07-12 21:47:57,458 INFO distributed_reindex_scheduler.py:120 - http://172.25.0.7:8002/healthcheck passed healthcheck
openverse-ingestion_server-1  | 2023-07-12 21:47:57,458 INFO distributed_reindex_scheduler.py:60 - Assigning job {'model_name': 'image', 'table_name': 'temp_import_image', 'start_id': 0, 'end_id': 5000, 'target_index': 'image-init'} to http://172.25.0.7:8002
openverse-ingestion_server-1  | 2023-07-12 21:47:57,476 INFO tasks.py:268 - Task e3cf685e1be94c579fb786387ec6d64c completed.
openverse-ingestion_server-1  | [2023-07-12 21:47:59 +0000] [7] [DEBUG] POST /worker_finished
openverse-ingestion_server-1  | 2023-07-12 21:47:59,140 INFO state.py:88 - Received worker_finished signal from 172.25.0.7
openverse-ingestion_server-1  | 2023-07-12 21:47:59,142 INFO api.py:285 - All indexer workers succeeded! New index: image-init
openverse-ingestion_server-1  | 2023-07-12 21:47:59,142 INFO es_helpers.py:58 - Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | 2023-07-12 21:47:59,149 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.006s]
openverse-ingestion_server-1  | 2023-07-12 21:47:59,153 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.004s]
openverse-ingestion_server-1  | 2023-07-12 21:47:59,424 INFO base.py:265 - POST http://es:9200/image-init/_refresh [status:200 request:0.265s]
openverse-ingestion_server-1  | 2023-07-12 21:47:59,473 INFO base.py:265 - PUT http://es:9200/image-init/_settings [status:200 request:0.048s]
openverse-ingestion_server-1  | [2023-07-12 21:48:00 +0000] [7] [DEBUG] POST /task
openverse-ingestion_server-1  | 2023-07-12 21:48:00,899 INFO es_helpers.py:58 - Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | 2023-07-12 21:48:00,909 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.008s]
openverse-ingestion_server-1  | 2023-07-12 21:48:00,915 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.005s]
openverse-ingestion_server-1  | 2023-07-12 21:48:00,915 INFO ingest.py:360 - `image`: Starting table promotion | _Next: recreate-indices_
openverse-ingestion_server-1  | 2023-07-12 21:48:00,920 INFO ingest.py:365 - Recreating database indices...
openverse-ingestion_server-1  | 2023-07-12 21:48:00,928 INFO ingest.py:370 - Running: CREATE UNIQUE INDEX temp_import_image_url_key ON public.temp_import_image USING btree (url)
openverse-ingestion_server-1  | 2023-07-12 21:48:00,947 INFO ingest.py:370 - Running: CREATE INDEX temp_import_image_provider_7d11f847 ON public.temp_import_image USING btree (provider)
openverse-ingestion_server-1  | 2023-07-12 21:48:00,954 INFO ingest.py:370 - Running: CREATE INDEX temp_import_image_provider_7d11f847_like ON public.temp_import_image USING btree (provider varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:48:00,961 INFO ingest.py:370 - Running: CREATE INDEX temp_import_image_source_d5a89e97 ON public.temp_import_image USING btree (source)
openverse-ingestion_server-1  | 2023-07-12 21:48:00,969 INFO ingest.py:370 - Running: CREATE INDEX temp_import_image_source_d5a89e97_like ON public.temp_import_image USING btree (source varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:48:00,976 INFO ingest.py:370 - Running: CREATE INDEX temp_import_image_url_c6aabda2_like ON public.temp_import_image USING btree (url varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:48:00,986 INFO ingest.py:370 - Running: CREATE INDEX temp_import_image_last_synced_with_source_187adf09 ON public.temp_import_image USING btree (last_synced_with_source)
openverse-ingestion_server-1  | 2023-07-12 21:48:00,994 INFO ingest.py:370 - Running: CREATE INDEX temp_import_image_foreign_identifier_4c72d3ee_like ON public.temp_import_image USING btree (foreign_identifier varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:48:01,001 INFO ingest.py:370 - Running: CREATE UNIQUE INDEX temp_import_image_identifier_key ON public.temp_import_image USING btree (identifier)
openverse-ingestion_server-1  | 2023-07-12 21:48:01,007 INFO ingest.py:370 - Running: CREATE INDEX temp_import_image_foreign_identifier_4c72d3ee ON public.temp_import_image USING btree (foreign_identifier)
openverse-ingestion_server-1  | 2023-07-12 21:48:01,020 INFO ingest.py:370 - Running: CREATE UNIQUE INDEX temp_import_unique_provider_image ON public.temp_import_image USING btree (foreign_identifier, provider)
openverse-ingestion_server-1  | 2023-07-12 21:48:01,030 INFO ingest.py:370 - Running: CREATE INDEX temp_import_image_category_daa5d91c ON public.temp_import_image USING btree (category)
openverse-ingestion_server-1  | 2023-07-12 21:48:01,038 INFO ingest.py:370 - Running: CREATE INDEX temp_import_image_category_daa5d91c_like ON public.temp_import_image USING btree (category varchar_pattern_ops)
openverse-ingestion_server-1  | 2023-07-12 21:48:01,046 INFO ingest.py:372 - Done creating indices! Remapping constraints...
openverse-ingestion_server-1  | 2023-07-12 21:48:01,055 INFO ingest.py:379 - Running: DELETE FROM "imagelist_images" AS fk_table WHERE NOT EXISTS(SELECT 1 FROM "temp_import_image" AS r WHERE "r"."id" = "fk_table"."image_id");
openverse-ingestion_server-1  | 2023-07-12 21:48:01,057 INFO ingest.py:379 - Running: ALTER TABLE "image" DROP CONSTRAINT "unique_provider_image"
openverse-ingestion_server-1  | 2023-07-12 21:48:01,059 INFO ingest.py:379 - Running: ALTER TABLE "image" ADD UNIQUE (foreign_identifier, provider)
openverse-ingestion_server-1  | 2023-07-12 21:48:01,061 INFO ingest.py:379 - Running: ALTER TABLE "image" DROP CONSTRAINT "image_identifier_key"
openverse-ingestion_server-1  | 2023-07-12 21:48:01,062 INFO ingest.py:379 - Running: ALTER TABLE "image" ADD UNIQUE (identifier)
openverse-ingestion_server-1  | 2023-07-12 21:48:01,064 INFO ingest.py:379 - Running: ALTER TABLE "image" DROP CONSTRAINT "image_url_key"
openverse-ingestion_server-1  | 2023-07-12 21:48:01,065 INFO ingest.py:379 - Running: ALTER TABLE "image" ADD UNIQUE (url)
openverse-ingestion_server-1  | 2023-07-12 21:48:01,068 INFO ingest.py:379 - Running: ALTER TABLE "imagelist_images" DROP CONSTRAINT "imagelist_images_image_id_efcd124c_fk_image_id"
openverse-ingestion_server-1  | 2023-07-12 21:48:01,069 INFO ingest.py:379 - Running: ALTER TABLE "imagelist_images" ADD FOREIGN KEY (image_id) REFERENCES temp_import_image(id) DEFERRABLE INITIALLY DEFERRED
openverse-ingestion_server-1  | 2023-07-12 21:48:01,072 INFO ingest.py:381 - Done remapping constraints! Going live with new table...
openverse-ingestion_server-1  | 2023-07-12 21:48:01,072 INFO ingest.py:383 - `image`: Indices & constraints applied | _Next: table promotion_
openverse-ingestion_server-1  | 2023-07-12 21:48:01,073 INFO ingest.py:387 - Running go-live: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  |         DROP TABLE "image";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_pkey" RENAME TO "image_pkey";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_url_key" RENAME TO "image_url_key";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_provider_7d11f847" RENAME TO "image_provider_7d11f847";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_provider_7d11f847_like" RENAME TO "image_provider_7d11f847_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_source_d5a89e97" RENAME TO "image_source_d5a89e97";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_source_d5a89e97_like" RENAME TO "image_source_d5a89e97_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_url_c6aabda2_like" RENAME TO "image_url_c6aabda2_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_last_synced_with_source_187adf09" RENAME TO "image_last_synced_with_source_187adf09";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_foreign_identifier_4c72d3ee_like" RENAME TO "image_foreign_identifier_4c72d3ee_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_identifier_key" RENAME TO "image_identifier_key";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_foreign_identifier_4c72d3ee" RENAME TO "image_foreign_identifier_4c72d3ee";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_unique_provider_image" RENAME TO "unique_provider_image";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_category_daa5d91c" RENAME TO "image_category_daa5d91c";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_category_daa5d91c_like" RENAME TO "image_category_daa5d91c_like";
openverse-ingestion_server-1  |         ALTER TABLE "temp_import_image" RENAME TO "image";
openverse-ingestion_server-1  |     
openverse-ingestion_server-1  | 2023-07-12 21:48:01,081 INFO ingest.py:389 - `image`: Finished table promotion | _Next: Elasticsearch promotion_
openverse-ingestion_server-1  | 2023-07-12 21:48:01,084 INFO ingest.py:394 - Finished promoting table image
openverse-ingestion_server-1  | 2023-07-12 21:48:01,088 WARNING base.py:288 - GET http://es:9200/image [status:404 request:0.004s]
openverse-ingestion_server-1  | 2023-07-12 21:48:01,132 INFO base.py:265 - PUT http://es:9200/image-init/_alias/image [status:200 request:0.043s]
openverse-ingestion_server-1  | 2023-07-12 21:48:01,132 INFO indexer.py:399 - `image`: Created alias `image` pointing to index `image-init`.
openverse-ingestion_server-1  | 2023-07-12 21:48:01,132 INFO tasks.py:268 - Task 4f672ecac019455abe8f6efe3e968500 completed.
openverse-ingestion_server-1  | [2023-07-12 21:48:06 +0000] [7] [DEBUG] POST /task
openverse-ingestion_server-1  | 2023-07-12 21:48:06,262 INFO es_helpers.py:58 - Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | 2023-07-12 21:48:06,272 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.009s]
openverse-ingestion_server-1  | 2023-07-12 21:48:06,277 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.004s]
openverse-ingestion_server-1  | [2023-07-12 21:48:06 +0000] [7] [DEBUG] POST /task
openverse-ingestion_server-1  | 2023-07-12 21:48:06,508 INFO es_helpers.py:58 - Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | 2023-07-12 21:48:06,518 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.008s]
openverse-ingestion_server-1  | 2023-07-12 21:48:06,524 INFO base.py:265 - GET http://es:9200/ [status:200 request:0.005s]
openverse-ingestion_server-1  | 2023-07-12 21:48:06,529 WARNING base.py:288 - GET http://es:9200/image-filtered [status:404 request:0.004s]
openverse-ingestion_server-1  | 2023-07-12 21:48:06,588 INFO base.py:265 - PUT http://es:9200/image-init-filtered/_alias/image-filtered [status:200 request:0.058s]
openverse-ingestion_server-1  | 2023-07-12 21:48:06,589 INFO indexer.py:399 - `image`: Created alias `image-filtered` pointing to index `image-init-filtered`.
openverse-ingestion_server-1  | 2023-07-12 21:48:06,589 INFO tasks.py:268 - Task f01d8fda7ac046f093f1a873b158cb11 completed.
openverse-ingestion_server-1  | 2023-07-12 21:48:06,791 INFO base.py:265 - PUT http://es:9200/image-init-filtered [status:200 request:0.513s]
openverse-ingestion_server-1  | [2023-07-12 21:48:06 +0000] [7] [DEBUG] GET /static/mock_sensitive_terms.txt
openverse-ingestion_server-1  | 2023-07-12 21:48:07,176 INFO base.py:265 - POST http://es:9200/_reindex?slices=auto&wait_for_completion=true [status:200 request:0.377s]
openverse-ingestion_server-1  | 2023-07-12 21:48:07,407 INFO base.py:265 - POST http://es:9200/image-init-filtered/_refresh [status:200 request:0.230s]
openverse-ingestion_server-1  | 2023-07-12 21:48:07,457 INFO base.py:265 - PUT http://es:9200/image-init-filtered/_settings [status:200 request:0.049s]
openverse-ingestion_server-1  | 2023-07-12 21:48:07,458 INFO tasks.py:268 - Task 27a53b79574044eeb66e1fc6c8bc0e1d completed.
Logs after
openverse-ingestion_server-1  | [2023-07-12 22:40:41,111 - gunicorn.error - 267][DEBUG] Current configuration:
openverse-ingestion_server-1  |   config: ./gunicorn.conf.py
openverse-ingestion_server-1  |   wsgi_app: None
openverse-ingestion_server-1  |   bind: ['0.0.0.0:8001']
openverse-ingestion_server-1  |   backlog: 2048
openverse-ingestion_server-1  |   workers: 1
openverse-ingestion_server-1  |   worker_class: sync
openverse-ingestion_server-1  |   threads: 1
openverse-ingestion_server-1  |   worker_connections: 1000
openverse-ingestion_server-1  |   max_requests: 0
openverse-ingestion_server-1  |   max_requests_jitter: 0
openverse-ingestion_server-1  |   timeout: 120
openverse-ingestion_server-1  |   graceful_timeout: 30
openverse-ingestion_server-1  |   keepalive: 2
openverse-ingestion_server-1  |   limit_request_line: 4094
openverse-ingestion_server-1  |   limit_request_fields: 100
openverse-ingestion_server-1  |   limit_request_field_size: 8190
openverse-ingestion_server-1  |   reload: True
openverse-ingestion_server-1  |   reload_engine: auto
openverse-ingestion_server-1  |   reload_extra_files: []
openverse-ingestion_server-1  |   spew: False
openverse-ingestion_server-1  |   check_config: False
openverse-ingestion_server-1  |   print_config: False
openverse-ingestion_server-1  |   preload_app: False
openverse-ingestion_server-1  |   sendfile: None
openverse-ingestion_server-1  |   reuse_port: False
openverse-ingestion_server-1  |   chdir: /ingestion_server/ingestion_server
openverse-ingestion_server-1  |   daemon: False
openverse-ingestion_server-1  |   raw_env: []
openverse-ingestion_server-1  |   pidfile: None
openverse-ingestion_server-1  |   worker_tmp_dir: None
openverse-ingestion_server-1  |   user: 1000
openverse-ingestion_server-1  |   group: 1000
openverse-ingestion_server-1  |   umask: 0
openverse-ingestion_server-1  |   initgroups: False
openverse-ingestion_server-1  |   tmp_upload_dir: None
openverse-ingestion_server-1  |   secure_scheme_headers: {'X-FORWARDED-PROTOCOL': 'ssl', 'X-FORWARDED-PROTO': 'https', 'X-FORWARDED-SSL': 'on'}
openverse-ingestion_server-1  |   forwarded_allow_ips: ['127.0.0.1']
openverse-ingestion_server-1  |   accesslog: -
openverse-ingestion_server-1  |   disable_redirect_access_to_syslog: False
openverse-ingestion_server-1  |   access_log_format: %(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"
openverse-ingestion_server-1  |   errorlog: -
openverse-ingestion_server-1  |   loglevel: debug
openverse-ingestion_server-1  |   capture_output: True
openverse-ingestion_server-1  |   logger_class: gunicorn.glogging.Logger
openverse-ingestion_server-1  |   logconfig: None
openverse-ingestion_server-1  |   logconfig_dict: {'version': 1, 'disable_existing_loggers': False, 'formatters': {'generic': {'format': '[%(asctime)s - %(name)s - %(lineno)3d][%(levelname)s] %(message)s'}}, 'root': {'level': 'INFO', 'handlers': ['console']}, 'loggers': {'gunicorn.error': {'level': 'DEBUG', 'handlers': ['console'], 'propagate': False, 'qualname': 'gunicorn.error'}, 'gunicorn.access': {'level': 'INFO', 'handlers': ['console'], 'propagate': False, 'qualname': 'gunicorn.access'}, '': {'level': 'INFO', 'handlers': ['console']}}, 'handlers': {'console': {'class': 'logging.StreamHandler', 'formatter': 'generic', 'stream': 'ext://sys.stdout'}}}
openverse-ingestion_server-1  |   syslog_addr: udp://localhost:514
openverse-ingestion_server-1  |   syslog: False
openverse-ingestion_server-1  |   syslog_prefix: None
openverse-ingestion_server-1  |   syslog_facility: user
openverse-ingestion_server-1  |   enable_stdio_inheritance: False
openverse-ingestion_server-1  |   statsd_host: None
openverse-ingestion_server-1  |   dogstatsd_tags: 
openverse-ingestion_server-1  |   statsd_prefix: 
openverse-ingestion_server-1  |   proc_name: None
openverse-ingestion_server-1  |   default_proc_name: api:api
openverse-ingestion_server-1  |   pythonpath: None
openverse-ingestion_server-1  |   paste: None
openverse-ingestion_server-1  |   on_starting: <function OnStarting.on_starting at 0x7f55e1244400>
openverse-ingestion_server-1  |   on_reload: <function OnReload.on_reload at 0x7f55e1244540>
openverse-ingestion_server-1  |   when_ready: <function WhenReady.when_ready at 0x7f55e1244680>
openverse-ingestion_server-1  |   pre_fork: <function Prefork.pre_fork at 0x7f55e12447c0>
openverse-ingestion_server-1  |   post_fork: <function Postfork.post_fork at 0x7f55e1244900>
openverse-ingestion_server-1  |   post_worker_init: <function PostWorkerInit.post_worker_init at 0x7f55e1244a40>
openverse-ingestion_server-1  |   worker_int: <function WorkerInt.worker_int at 0x7f55e1244b80>
openverse-ingestion_server-1  |   worker_abort: <function WorkerAbort.worker_abort at 0x7f55e1244cc0>
openverse-ingestion_server-1  |   pre_exec: <function PreExec.pre_exec at 0x7f55e1244e00>
openverse-ingestion_server-1  |   pre_request: <function PreRequest.pre_request at 0x7f55e1244f40>
openverse-ingestion_server-1  |   post_request: <function PostRequest.post_request at 0x7f55e1244fe0>
openverse-ingestion_server-1  |   child_exit: <function ChildExit.child_exit at 0x7f55e1245120>
openverse-ingestion_server-1  |   worker_exit: <function WorkerExit.worker_exit at 0x7f55e1245260>
openverse-ingestion_server-1  |   nworkers_changed: <function NumWorkersChanged.nworkers_changed at 0x7f55e12453a0>
openverse-ingestion_server-1  |   on_exit: <function OnExit.on_exit at 0x7f55e12454e0>
openverse-ingestion_server-1  |   proxy_protocol: False
openverse-ingestion_server-1  |   proxy_allow_ips: ['127.0.0.1']
openverse-ingestion_server-1  |   keyfile: None
openverse-ingestion_server-1  |   certfile: None
openverse-ingestion_server-1  |   ssl_version: 2
openverse-ingestion_server-1  |   cert_reqs: 0
openverse-ingestion_server-1  |   ca_certs: None
openverse-ingestion_server-1  |   suppress_ragged_eofs: True
openverse-ingestion_server-1  |   do_handshake_on_connect: False
openverse-ingestion_server-1  |   ciphers: None
openverse-ingestion_server-1  |   raw_paste_global_conf: []
openverse-ingestion_server-1  |   strip_header_spaces: False
openverse-ingestion_server-1  | [2023-07-12 22:40:41,111 - gunicorn.error - 264][INFO] Starting gunicorn 20.1.0
openverse-ingestion_server-1  | [2023-07-12 22:40:41,112 - gunicorn.error - 267][DEBUG] Arbiter booted
openverse-ingestion_server-1  | [2023-07-12 22:40:41,112 - gunicorn.error - 264][INFO] Listening at: http://0.0.0.0:8001 (1)
openverse-ingestion_server-1  | [2023-07-12 22:40:41,112 - gunicorn.error - 264][INFO] Using worker: sync
openverse-ingestion_server-1  | [2023-07-12 22:40:41,114 - gunicorn.error - 264][INFO] Booting worker with pid: 7
openverse-ingestion_server-1  | [2023-07-12 22:40:41,185 - gunicorn.error - 267][DEBUG] 1 workers
openverse-ingestion_server-1  | [2023-07-12 22:40:43,663 - gunicorn.error - 267][DEBUG] GET /
openverse-ingestion_server-1  | [2023-07-12 22:40:43,665 - gunicorn.access - 349][INFO] 192.168.32.1 - - [12/Jul/2023:22:40:43 +0000] "GET / HTTP/1.1" 200 20 "-" "curl/7.81.0"
openverse-ingestion_server-1  | [2023-07-12 22:41:05,611 - gunicorn.error - 267][DEBUG] POST /task
openverse-ingestion_server-1  | [2023-07-12 22:41:05,640 - root -  58][INFO] Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | [2023-07-12 22:41:05,649 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.008s]
openverse-ingestion_server-1  | [2023-07-12 22:41:05,653 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.003s]
openverse-ingestion_server-1  | [2023-07-12 22:41:05,653 - ingestion_server.slack - 277][INFO] `audio`: Starting ingestion server data refresh | _Next: copying data from upstream_
openverse-ingestion_server-1  | [2023-07-12 22:41:05,665 - root -  79][INFO] Shared columns: ['duration', 'foreign_identifier', 'filetype', 'sample_rate', 'license', 'removed_from_source', 'provider', 'tags', 'alt_files', 'last_synced_with_source', 'meta_data', 'title', 'filesize', 'url', 'creator_url', 'created_on', 'updated_on', 'bit_rate', 'creator', 'identifier', 'license_version', 'category', 'genres', 'watermarked', 'foreign_landing_url', 'audio_set_foreign_identifier', 'source', 'thumbnail']
openverse-ingestion_server-1  | [2023-07-12 22:41:05,665 - root - 295][INFO] (Re)initializing foreign data wrapper
openverse-ingestion_server-1  | [2023-07-12 22:41:05,680 - root - 315][INFO] Copying upstream data...
openverse-ingestion_server-1  | [2023-07-12 22:41:05,680 - root - 325][INFO] Running copy-data query: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | DROP TABLE IF EXISTS "temp_import_audio";
openverse-ingestion_server-1  | CREATE TABLE "temp_import_audio" (LIKE "audio" INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audio" ADD COLUMN IF NOT EXISTS
openverse-ingestion_server-1  |     id serial;
openverse-ingestion_server-1  | CREATE TEMP SEQUENCE IF NOT EXISTS id_temp_seq;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audio" ALTER COLUMN
openverse-ingestion_server-1  |     id SET DEFAULT nextval('id_temp_seq'::regclass);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audio" ADD COLUMN IF NOT EXISTS
openverse-ingestion_server-1  |     standardized_popularity double precision;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audio" ALTER COLUMN
openverse-ingestion_server-1  |     view_count SET DEFAULT 0;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | INSERT INTO "temp_import_audio" ("duration","foreign_identifier","filetype","sample_rate","license","removed_from_source","provider","tags","alt_files","last_synced_with_source","meta_data","title","filesize","url","creator_url","created_on","updated_on","bit_rate","creator","identifier","license_version","category","genres","watermarked","foreign_landing_url","audio_set_foreign_identifier","source","thumbnail")
openverse-ingestion_server-1  |     SELECT "duration","foreign_identifier","filetype","sample_rate","license","removed_from_source","provider","tags","alt_files","last_synced_with_source","meta_data","title","filesize","url","creator_url","created_on","updated_on","bit_rate","creator","identifier","license_version","category","genres","watermarked","foreign_landing_url","audio_set_foreign_identifier","source","thumbnail" from "upstream_schema"."audio_view" AS u
openverse-ingestion_server-1  |     WHERE NOT EXISTS(
openverse-ingestion_server-1  |         SELECT FROM "api_deletedaudio" WHERE identifier = u.identifier
openverse-ingestion_server-1  |     )
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ORDER BY identifier
openverse-ingestion_server-1  | LIMIT 100000;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audio" ADD PRIMARY KEY (id);
openverse-ingestion_server-1  | DROP SERVER upstream CASCADE;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | [2023-07-12 22:41:05,741 - gunicorn.access - 349][INFO] 192.168.32.1 - - [12/Jul/2023:22:41:05 +0000] "POST /task HTTP/1.1" 202 169 "-" "curl/7.81.0"
openverse-ingestion_server-1  | [2023-07-12 22:41:05,836 - ingestion_server.slack - 329][INFO] `audio`: Data copy complete | _Next: Elasticsearch reindex_
openverse-ingestion_server-1  | [2023-07-12 22:41:05,836 - root - 339][INFO] Finished refreshing table 'audio'.
openverse-ingestion_server-1  | [2023-07-12 22:41:05,836 - ingestion_server.slack - 277][INFO] `audioset`: Starting ingestion server data refresh | _Next: copying data from upstream_
openverse-ingestion_server-1  | [2023-07-12 22:41:05,846 - root -  79][INFO] Shared columns: ['provider', 'foreign_identifier', 'foreign_landing_url', 'creator', 'title', 'filesize', 'filetype', 'url', 'creator_url', 'thumbnail']
openverse-ingestion_server-1  | [2023-07-12 22:41:05,846 - root - 295][INFO] (Re)initializing foreign data wrapper
openverse-ingestion_server-1  | [2023-07-12 22:41:05,857 - root - 315][INFO] Copying upstream data...
openverse-ingestion_server-1  | [2023-07-12 22:41:05,858 - root - 325][INFO] Running copy-data query: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | DROP TABLE IF EXISTS "temp_import_audioset";
openverse-ingestion_server-1  | CREATE TABLE "temp_import_audioset" (LIKE "audioset" INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audioset" ADD COLUMN IF NOT EXISTS
openverse-ingestion_server-1  |     id serial;
openverse-ingestion_server-1  | CREATE TEMP SEQUENCE IF NOT EXISTS id_temp_seq;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audioset" ALTER COLUMN
openverse-ingestion_server-1  |     id SET DEFAULT nextval('id_temp_seq'::regclass);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audioset" ALTER COLUMN
openverse-ingestion_server-1  |     created_on SET DEFAULT CURRENT_TIMESTAMP;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audioset" ALTER COLUMN
openverse-ingestion_server-1  |     updated_on SET DEFAULT CURRENT_TIMESTAMP;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | INSERT INTO "temp_import_audioset" ("provider","foreign_identifier","foreign_landing_url","creator","title","filesize","filetype","url","creator_url","thumbnail") SELECT "provider","foreign_identifier","foreign_landing_url","creator","title","filesize","filetype","url","creator_url","thumbnail" FROM "upstream_schema"."audioset_view"
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | LIMIT 100000;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_audioset" ADD PRIMARY KEY (id);
openverse-ingestion_server-1  | DROP SERVER upstream CASCADE;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | [2023-07-12 22:41:05,872 - ingestion_server.slack - 329][INFO] `audioset`: Data copy complete | _Next: Elasticsearch reindex_
openverse-ingestion_server-1  | [2023-07-12 22:41:05,873 - root - 339][INFO] Finished refreshing table 'audioset'.
openverse-ingestion_server-1  | [2023-07-12 22:41:05,873 - root - 298][INFO] Creating index audio-init for model audio from table temp_import_audio.
openverse-ingestion_server-1  | [2023-07-12 22:41:06,346 - elasticsearch - 265][INFO] PUT http://es:9200/audio-init [status:200 request:0.473s]
openverse-ingestion_server-1  | [2023-07-12 22:41:06,346 - root - 307][INFO] Running distributed index using indexer workers.
openverse-ingestion_server-1  | [2023-07-12 22:41:06,360 - root - 107][INFO] Checking http://192.168.32.7:8002/healthcheck. . .
openverse-ingestion_server-1  | [2023-07-12 22:41:06,365 - root - 120][INFO] http://192.168.32.7:8002/healthcheck passed healthcheck
openverse-ingestion_server-1  | [2023-07-12 22:41:06,365 - root -  60][INFO] Assigning job {'model_name': 'audio', 'table_name': 'temp_import_audio', 'start_id': 0, 'end_id': 5000, 'target_index': 'audio-init'} to http://192.168.32.7:8002
openverse-ingestion_server-1  | [2023-07-12 22:41:06,383 - root - 268][INFO] Task 3d3b103c6dec445faa23c7d46488d08b completed.
openverse-ingestion_server-1  | [2023-07-12 22:41:07,977 - gunicorn.error - 267][DEBUG] POST /worker_finished
openverse-ingestion_server-1  | [2023-07-12 22:41:07,984 - root -  88][INFO] Received worker_finished signal from 192.168.32.7
openverse-ingestion_server-1  | [2023-07-12 22:41:07,985 - root - 284][INFO] All indexer workers succeeded! New index: audio-init
openverse-ingestion_server-1  | [2023-07-12 22:41:07,985 - root -  58][INFO] Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | [2023-07-12 22:41:07,995 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.009s]
openverse-ingestion_server-1  | [2023-07-12 22:41:08,000 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.004s]
openverse-ingestion_server-1  | [2023-07-12 22:41:08,004 - gunicorn.access - 349][INFO] 192.168.32.7 - - [12/Jul/2023:22:41:08 +0000] "POST /worker_finished HTTP/1.1" 200 0 "-" "python-requests/2.31.0"
openverse-ingestion_server-1  | [2023-07-12 22:41:08,609 - elasticsearch - 265][INFO] POST http://es:9200/audio-init/_refresh [status:200 request:0.605s]
openverse-ingestion_server-1  | [2023-07-12 22:41:08,670 - elasticsearch - 265][INFO] PUT http://es:9200/audio-init/_settings [status:200 request:0.060s]
openverse-ingestion_server-1  | [2023-07-12 22:41:11,000 - gunicorn.error - 267][DEBUG] POST /task
openverse-ingestion_server-1  | [2023-07-12 22:41:11,010 - root -  58][INFO] Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | [2023-07-12 22:41:11,017 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.006s]
openverse-ingestion_server-1  | [2023-07-12 22:41:11,020 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.003s]
openverse-ingestion_server-1  | [2023-07-12 22:41:11,021 - root - 360][INFO] `audio`: Starting table promotion | _Next: recreate-indices_
openverse-ingestion_server-1  | [2023-07-12 22:41:11,024 - root - 365][INFO] Recreating database indices...
openverse-ingestion_server-1  | [2023-07-12 22:41:11,030 - root - 370][INFO] Running: CREATE UNIQUE INDEX temp_import_audio_identifier_key ON public.temp_import_audio USING btree (identifier)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,034 - root - 370][INFO] Running: CREATE UNIQUE INDEX temp_import_audio_url_key ON public.temp_import_audio USING btree (url)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,045 - root - 370][INFO] Running: CREATE INDEX temp_import_audio_foreign_identifier_617f66ad_like ON public.temp_import_audio USING btree (foreign_identifier varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,050 - root - 370][INFO] Running: CREATE INDEX temp_import_audio_url_b6a832d3_like ON public.temp_import_audio USING btree (url varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,057 - root - 370][INFO] Running: CREATE INDEX temp_import_audio_provider_8fe1eb54 ON public.temp_import_audio USING btree (provider)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,063 - root - 370][INFO] Running: CREATE INDEX temp_import_audio_provider_8fe1eb54_like ON public.temp_import_audio USING btree (provider varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,069 - root - 370][INFO] Running: CREATE INDEX temp_import_audio_source_e9ccc813 ON public.temp_import_audio USING btree (source)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,074 - root - 370][INFO] Running: CREATE INDEX temp_import_audio_source_e9ccc813_like ON public.temp_import_audio USING btree (source varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,079 - root - 370][INFO] Running: CREATE INDEX temp_import_audio_last_synced_with_source_94c4a383 ON public.temp_import_audio USING btree (last_synced_with_source)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,084 - root - 370][INFO] Running: CREATE INDEX temp_import_audio_genres_e34cc474 ON public.temp_import_audio USING btree (genres)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,090 - root - 370][INFO] Running: CREATE INDEX temp_import_audio_category_ceb7d386 ON public.temp_import_audio USING btree (category)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,096 - root - 370][INFO] Running: CREATE INDEX temp_import_audio_category_ceb7d386_like ON public.temp_import_audio USING btree (category varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,102 - root - 370][INFO] Running: CREATE INDEX temp_import_audio_foreign_identifier_617f66ad ON public.temp_import_audio USING btree (foreign_identifier)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,109 - root - 370][INFO] Running: CREATE UNIQUE INDEX temp_import_unique_provider_audio ON public.temp_import_audio USING btree (foreign_identifier, provider)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,109 - gunicorn.access - 349][INFO] 192.168.32.1 - - [12/Jul/2023:22:41:11 +0000] "POST /task HTTP/1.1" 202 169 "-" "curl/7.81.0"
openverse-ingestion_server-1  | [2023-07-12 22:41:11,118 - root - 372][INFO] Done creating indices! Remapping constraints...
openverse-ingestion_server-1  | [2023-07-12 22:41:11,124 - root - 379][INFO] Running: DELETE FROM "audiolist_audios" AS fk_table WHERE NOT EXISTS(SELECT 1 FROM "temp_import_audio" AS r WHERE "r"."id" = "fk_table"."audio_id");
openverse-ingestion_server-1  | [2023-07-12 22:41:11,125 - root - 379][INFO] Running: ALTER TABLE "audio" DROP CONSTRAINT "unique_provider_audio"
openverse-ingestion_server-1  | [2023-07-12 22:41:11,127 - root - 379][INFO] Running: ALTER TABLE "audio" ADD UNIQUE (foreign_identifier, provider)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,129 - root - 379][INFO] Running: ALTER TABLE "audio" DROP CONSTRAINT "audio_url_key"
openverse-ingestion_server-1  | [2023-07-12 22:41:11,130 - root - 379][INFO] Running: ALTER TABLE "audio" ADD UNIQUE (url)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,131 - root - 379][INFO] Running: ALTER TABLE "audio" DROP CONSTRAINT "audio_identifier_key"
openverse-ingestion_server-1  | [2023-07-12 22:41:11,132 - root - 379][INFO] Running: ALTER TABLE "audio" ADD UNIQUE (identifier)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,134 - root - 379][INFO] Running: ALTER TABLE "audiolist_audios" DROP CONSTRAINT "audiolist_audios_audio_id_3b70015d_fk_audio_id"
openverse-ingestion_server-1  | [2023-07-12 22:41:11,134 - root - 379][INFO] Running: ALTER TABLE "audiolist_audios" ADD FOREIGN KEY (audio_id) REFERENCES temp_import_audio(id) DEFERRABLE INITIALLY DEFERRED
openverse-ingestion_server-1  | [2023-07-12 22:41:11,136 - root - 381][INFO] Done remapping constraints! Going live with new table...
openverse-ingestion_server-1  | [2023-07-12 22:41:11,136 - ingestion_server.slack - 383][INFO] `audio`: Indices & constraints applied | _Next: table promotion_
openverse-ingestion_server-1  | [2023-07-12 22:41:11,136 - root - 387][INFO] Running go-live: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  |         DROP TABLE "audio";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_pkey" RENAME TO "audio_pkey";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_identifier_key" RENAME TO "audio_identifier_key";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_url_key" RENAME TO "audio_url_key";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_foreign_identifier_617f66ad_like" RENAME TO "audio_foreign_identifier_617f66ad_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_url_b6a832d3_like" RENAME TO "audio_url_b6a832d3_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_provider_8fe1eb54" RENAME TO "audio_provider_8fe1eb54";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_provider_8fe1eb54_like" RENAME TO "audio_provider_8fe1eb54_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_source_e9ccc813" RENAME TO "audio_source_e9ccc813";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_source_e9ccc813_like" RENAME TO "audio_source_e9ccc813_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_last_synced_with_source_94c4a383" RENAME TO "audio_last_synced_with_source_94c4a383";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_genres_e34cc474" RENAME TO "audio_genres_e34cc474";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_category_ceb7d386" RENAME TO "audio_category_ceb7d386";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_category_ceb7d386_like" RENAME TO "audio_category_ceb7d386_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audio_foreign_identifier_617f66ad" RENAME TO "audio_foreign_identifier_617f66ad";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_unique_provider_audio" RENAME TO "unique_provider_audio";
openverse-ingestion_server-1  |         ALTER TABLE "temp_import_audio" RENAME TO "audio";
openverse-ingestion_server-1  |     
openverse-ingestion_server-1  | [2023-07-12 22:41:11,141 - ingestion_server.slack - 389][INFO] `audio`: Finished table promotion | _Next: Elasticsearch promotion_
openverse-ingestion_server-1  | [2023-07-12 22:41:11,143 - root - 394][INFO] Finished promoting table audio
openverse-ingestion_server-1  | [2023-07-12 22:41:11,143 - root - 360][INFO] `audioset`: Starting table promotion | _Next: recreate-indices_
openverse-ingestion_server-1  | [2023-07-12 22:41:11,146 - root - 365][INFO] Recreating database indices...
openverse-ingestion_server-1  | [2023-07-12 22:41:11,151 - root - 370][INFO] Running: CREATE UNIQUE INDEX temp_import_api_audioset_url_key ON public.temp_import_audioset USING btree (url)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,153 - root - 370][INFO] Running: CREATE INDEX temp_import_api_audioset_foreign_identifier_28c1db59_like ON public.temp_import_audioset USING btree (foreign_identifier varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,156 - root - 370][INFO] Running: CREATE INDEX temp_import_api_audioset_url_200f9fb4_like ON public.temp_import_audioset USING btree (url varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,158 - root - 370][INFO] Running: CREATE INDEX temp_import_api_audioset_foreign_identifier_28c1db59 ON public.temp_import_audioset USING btree (foreign_identifier)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,159 - root - 370][INFO] Running: CREATE UNIQUE INDEX temp_import_unique_foreign_identifier_provider ON public.temp_import_audioset USING btree (foreign_identifier, provider)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,162 - root - 370][INFO] Running: CREATE INDEX temp_import_audioset_provider_cc061f30 ON public.temp_import_audioset USING btree (provider)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,164 - root - 370][INFO] Running: CREATE INDEX temp_import_audioset_provider_cc061f30_like ON public.temp_import_audioset USING btree (provider varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,165 - root - 372][INFO] Done creating indices! Remapping constraints...
openverse-ingestion_server-1  | [2023-07-12 22:41:11,171 - root - 379][INFO] Running: ALTER TABLE "audioset" DROP CONSTRAINT "unique_foreign_identifier_provider"
openverse-ingestion_server-1  | [2023-07-12 22:41:11,172 - root - 379][INFO] Running: ALTER TABLE "audioset" ADD UNIQUE (foreign_identifier, provider)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,174 - root - 379][INFO] Running: ALTER TABLE "audioset" DROP CONSTRAINT "api_audioset_url_key"
openverse-ingestion_server-1  | [2023-07-12 22:41:11,175 - root - 379][INFO] Running: ALTER TABLE "audioset" ADD UNIQUE (url)
openverse-ingestion_server-1  | [2023-07-12 22:41:11,176 - root - 381][INFO] Done remapping constraints! Going live with new table...
openverse-ingestion_server-1  | [2023-07-12 22:41:11,176 - ingestion_server.slack - 383][INFO] `audioset`: Indices & constraints applied | _Next: table promotion_
openverse-ingestion_server-1  | [2023-07-12 22:41:11,176 - root - 387][INFO] Running go-live: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  |         DROP TABLE "audioset";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audioset_pkey" RENAME TO "api_audioset_pkey";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_api_audioset_url_key" RENAME TO "api_audioset_url_key";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_api_audioset_foreign_identifier_28c1db59_like" RENAME TO "api_audioset_foreign_identifier_28c1db59_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_api_audioset_url_200f9fb4_like" RENAME TO "api_audioset_url_200f9fb4_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_api_audioset_foreign_identifier_28c1db59" RENAME TO "api_audioset_foreign_identifier_28c1db59";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_unique_foreign_identifier_provider" RENAME TO "unique_foreign_identifier_provider";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audioset_provider_cc061f30" RENAME TO "audioset_provider_cc061f30";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_audioset_provider_cc061f30_like" RENAME TO "audioset_provider_cc061f30_like";
openverse-ingestion_server-1  |         ALTER TABLE "temp_import_audioset" RENAME TO "audioset";
openverse-ingestion_server-1  |     
openverse-ingestion_server-1  | [2023-07-12 22:41:11,180 - ingestion_server.slack - 389][INFO] `audioset`: Finished table promotion | _Next: Elasticsearch promotion_
openverse-ingestion_server-1  | [2023-07-12 22:41:11,182 - root - 394][INFO] Finished promoting table audioset
openverse-ingestion_server-1  | [2023-07-12 22:41:11,188 - elasticsearch - 288][WARNING] GET http://es:9200/audio [status:404 request:0.006s]
openverse-ingestion_server-1  | [2023-07-12 22:41:11,211 - elasticsearch - 265][INFO] PUT http://es:9200/audio-init/_alias/audio [status:200 request:0.023s]
openverse-ingestion_server-1  | [2023-07-12 22:41:11,211 - ingestion_server.slack - 399][INFO] `audio`: Created alias `audio` pointing to index `audio-init`.
openverse-ingestion_server-1  | [2023-07-12 22:41:11,211 - root - 268][INFO] Task 79b24e40bcfc4e809cf730c7134ba8ea completed.
openverse-ingestion_server-1  | [2023-07-12 22:41:16,283 - gunicorn.error - 267][DEBUG] POST /task
openverse-ingestion_server-1  | [2023-07-12 22:41:16,291 - root -  58][INFO] Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | [2023-07-12 22:41:16,298 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.006s]
openverse-ingestion_server-1  | [2023-07-12 22:41:16,301 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.003s]
openverse-ingestion_server-1  | [2023-07-12 22:41:16,391 - gunicorn.access - 349][INFO] 192.168.32.1 - - [12/Jul/2023:22:41:16 +0000] "POST /task HTTP/1.1" 202 169 "-" "curl/7.81.0"
openverse-ingestion_server-1  | [2023-07-12 22:41:16,470 - gunicorn.error - 267][DEBUG] POST /task
openverse-ingestion_server-1  | [2023-07-12 22:41:16,483 - root -  58][INFO] Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | [2023-07-12 22:41:16,491 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.007s]
openverse-ingestion_server-1  | [2023-07-12 22:41:16,495 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.003s]
openverse-ingestion_server-1  | [2023-07-12 22:41:16,499 - elasticsearch - 288][WARNING] GET http://es:9200/audio-filtered [status:404 request:0.004s]
openverse-ingestion_server-1  | [2023-07-12 22:41:16,561 - elasticsearch - 265][INFO] PUT http://es:9200/audio-init-filtered/_alias/audio-filtered [status:200 request:0.062s]
openverse-ingestion_server-1  | [2023-07-12 22:41:16,561 - ingestion_server.slack - 399][INFO] `audio`: Created alias `audio-filtered` pointing to index `audio-init-filtered`.
openverse-ingestion_server-1  | [2023-07-12 22:41:16,561 - root - 268][INFO] Task d4ed54a04ce44b088255b88e5b4b0da4 completed.
openverse-ingestion_server-1  | [2023-07-12 22:41:16,583 - gunicorn.access - 349][INFO] 192.168.32.1 - - [12/Jul/2023:22:41:16 +0000] "POST /task HTTP/1.1" 202 169 "-" "curl/7.81.0"
openverse-ingestion_server-1  | [2023-07-12 22:41:16,664 - gunicorn.error - 267][DEBUG] POST /task
openverse-ingestion_server-1  | [2023-07-12 22:41:16,673 - root -  58][INFO] Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | [2023-07-12 22:41:16,680 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.006s]
openverse-ingestion_server-1  | [2023-07-12 22:41:16,683 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.003s]
openverse-ingestion_server-1  | [2023-07-12 22:41:16,683 - ingestion_server.slack - 277][INFO] `image`: Starting ingestion server data refresh | _Next: copying data from upstream_
openverse-ingestion_server-1  | [2023-07-12 22:41:16,695 - root -  79][INFO] Shared columns: ['foreign_identifier', 'filetype', 'license', 'removed_from_source', 'width', 'provider', 'tags', 'last_synced_with_source', 'meta_data', 'title', 'filesize', 'url', 'creator_url', 'created_on', 'updated_on', 'creator', 'identifier', 'license_version', 'category', 'watermarked', 'foreign_landing_url', 'height', 'source', 'thumbnail']
openverse-ingestion_server-1  | [2023-07-12 22:41:16,695 - root - 295][INFO] (Re)initializing foreign data wrapper
openverse-ingestion_server-1  | [2023-07-12 22:41:16,705 - root - 315][INFO] Copying upstream data...
openverse-ingestion_server-1  | [2023-07-12 22:41:16,706 - root - 325][INFO] Running copy-data query: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | DROP TABLE IF EXISTS "temp_import_image";
openverse-ingestion_server-1  | CREATE TABLE "temp_import_image" (LIKE "image" INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_image" ADD COLUMN IF NOT EXISTS
openverse-ingestion_server-1  |     id serial;
openverse-ingestion_server-1  | CREATE TEMP SEQUENCE IF NOT EXISTS id_temp_seq;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_image" ALTER COLUMN
openverse-ingestion_server-1  |     id SET DEFAULT nextval('id_temp_seq'::regclass);
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ALTER TABLE "temp_import_image" ADD COLUMN IF NOT EXISTS
openverse-ingestion_server-1  |     standardized_popularity double precision;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_image" ALTER COLUMN
openverse-ingestion_server-1  |     view_count SET DEFAULT 0;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | INSERT INTO "temp_import_image" ("foreign_identifier","filetype","license","removed_from_source","width","provider","tags","last_synced_with_source","meta_data","title","filesize","url","creator_url","created_on","updated_on","creator","identifier","license_version","category","watermarked","foreign_landing_url","height","source","thumbnail")
openverse-ingestion_server-1  |     SELECT "foreign_identifier","filetype","license","removed_from_source","width","provider","tags","last_synced_with_source","meta_data","title","filesize","url","creator_url","created_on","updated_on","creator","identifier","license_version","category","watermarked","foreign_landing_url","height","source","thumbnail" from "upstream_schema"."image_view" AS u
openverse-ingestion_server-1  |     WHERE NOT EXISTS(
openverse-ingestion_server-1  |         SELECT FROM "api_deletedimage" WHERE identifier = u.identifier
openverse-ingestion_server-1  |     )
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | ORDER BY identifier
openverse-ingestion_server-1  | LIMIT 100000;
openverse-ingestion_server-1  | ALTER TABLE "temp_import_image" ADD PRIMARY KEY (id);
openverse-ingestion_server-1  | DROP SERVER upstream CASCADE;
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  | [2023-07-12 22:41:16,734 - elasticsearch - 265][INFO] PUT http://es:9200/audio-init-filtered [status:200 request:0.432s]
openverse-ingestion_server-1  | [2023-07-12 22:41:16,773 - gunicorn.access - 349][INFO] 192.168.32.1 - - [12/Jul/2023:22:41:16 +0000] "POST /task HTTP/1.1" 202 169 "-" "curl/7.81.0"
openverse-ingestion_server-1  | [2023-07-12 22:41:16,773 - gunicorn.error - 267][DEBUG] GET /static/mock_sensitive_terms.txt
openverse-ingestion_server-1  | [2023-07-12 22:41:16,775 - gunicorn.access - 349][INFO] 127.0.0.1 - - [12/Jul/2023:22:41:16 +0000] "GET /static/mock_sensitive_terms.txt HTTP/1.1" 200 19 "-" "Python-urllib/3.11"
openverse-ingestion_server-1  | [2023-07-12 22:41:16,981 - ingestion_server.slack - 329][INFO] `image`: Data copy complete | _Next: image data cleaning_
openverse-ingestion_server-1  | [2023-07-12 22:41:16,981 - root - 333][INFO] Cleaning data...
openverse-ingestion_server-1  | [2023-07-12 22:41:16,982 - root - 303][INFO] Cleaning up data...
openverse-ingestion_server-1  | [2023-07-12 22:41:16,982 - root - 320][INFO] Running cleanup on selection "SELECT id, identifier, source, creator_url, tags, url, foreign_landing_url from temp_import_image"
openverse-ingestion_server-1  | [2023-07-12 22:41:17,002 - root - 332][INFO] Fetching first batch
openverse-ingestion_server-1  | [2023-07-12 22:41:17,280 - root - 346][INFO] Dividing work
openverse-ingestion_server-1  | [2023-07-12 22:41:17,280 - root - 348][INFO] Scheduling job 1
openverse-ingestion_server-1  | [2023-07-12 22:41:17,281 - root - 348][INFO] Scheduling job 2
openverse-ingestion_server-1  | [2023-07-12 22:41:17,281 - root - 348][INFO] Scheduling job 3
openverse-ingestion_server-1  | [2023-07-12 22:41:17,281 - root - 348][INFO] Scheduling job 4
openverse-ingestion_server-1  | [2023-07-12 22:41:17,281 - root - 348][INFO] Scheduling job 5
openverse-ingestion_server-1  | [2023-07-12 22:41:17,281 - root - 348][INFO] Scheduling job 6
openverse-ingestion_server-1  | [2023-07-12 22:41:17,281 - root - 348][INFO] Scheduling job 7
openverse-ingestion_server-1  | [2023-07-12 22:41:17,281 - root - 348][INFO] Scheduling job 8
openverse-ingestion_server-1  | [2023-07-12 22:41:17,281 - root - 348][INFO] Scheduling job 9
openverse-ingestion_server-1  | [2023-07-12 22:41:17,282 - root - 348][INFO] Scheduling job 10
openverse-ingestion_server-1  | [2023-07-12 22:41:17,282 - root - 348][INFO] Scheduling job 11
openverse-ingestion_server-1  | [2023-07-12 22:41:17,282 - root - 348][INFO] Scheduling job 12
openverse-ingestion_server-1  | [2023-07-12 22:41:17,282 - root - 348][INFO] Scheduling job 13
openverse-ingestion_server-1  | [2023-07-12 22:41:17,282 - root - 348][INFO] Scheduling job 14
openverse-ingestion_server-1  | [2023-07-12 22:41:17,282 - root - 348][INFO] Scheduling job 15
openverse-ingestion_server-1  | [2023-07-12 22:41:17,282 - root - 348][INFO] Scheduling job 16
openverse-ingestion_server-1  | [2023-07-12 22:41:17,366 - elasticsearch - 265][INFO] POST http://es:9200/_reindex?slices=auto&wait_for_completion=true [status:200 request:0.590s]
openverse-ingestion_server-1  | [2023-07-12 22:41:17,382 - root - 362][INFO] Starting 16 cleaning jobs
openverse-ingestion_server-1  | [2023-07-12 22:41:17,403 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,409 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,409 - root - 216][INFO] Cleaning 312 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,416 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,423 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,424 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,426 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,426 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,428 - root - 269][INFO] Worker finished batch in 0.01869654655456543
openverse-ingestion_server-1  | [2023-07-12 22:41:17,435 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,441 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,441 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,444 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,444 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,444 - root - 269][INFO] Worker finished batch in 0.01995992660522461
openverse-ingestion_server-1  | [2023-07-12 22:41:17,454 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,457 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,457 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,458 - root - 269][INFO] Worker finished batch in 0.016288042068481445
openverse-ingestion_server-1  | [2023-07-12 22:41:17,459 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,460 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,475 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,475 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,475 - root - 269][INFO] Worker finished batch in 0.01563429832458496
openverse-ingestion_server-1  | [2023-07-12 22:41:17,477 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,482 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,483 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,488 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,494 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,494 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,501 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,501 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,501 - root - 269][INFO] Worker finished batch in 0.018531084060668945
openverse-ingestion_server-1  | [2023-07-12 22:41:17,502 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,507 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,508 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,512 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,513 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,513 - root - 269][INFO] Worker finished batch in 0.018243789672851562
openverse-ingestion_server-1  | [2023-07-12 22:41:17,515 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,522 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,522 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,524 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,524 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,526 - root - 269][INFO] Worker finished batch in 0.018518924713134766
openverse-ingestion_server-1  | [2023-07-12 22:41:17,533 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,539 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,540 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,540 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,540 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,542 - root - 269][INFO] Worker finished batch in 0.018977880477905273
openverse-ingestion_server-1  | [2023-07-12 22:41:17,546 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,557 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,557 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,559 - root - 269][INFO] Worker finished batch in 0.01883721351623535
openverse-ingestion_server-1  | [2023-07-12 22:41:17,567 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,568 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,581 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,590 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,590 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,590 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,592 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,592 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,590 - root - 269][INFO] Worker finished batch in 0.022430419921875
openverse-ingestion_server-1  | [2023-07-12 22:41:17,596 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,597 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,605 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,609 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,609 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,611 - root - 269][INFO] Worker finished batch in 0.01842355728149414
openverse-ingestion_server-1  | [2023-07-12 22:41:17,615 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,615 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,615 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,616 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,618 - root - 269][INFO] Worker finished batch in 0.020381927490234375
openverse-ingestion_server-1  | [2023-07-12 22:41:17,621 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,627 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,627 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,632 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,633 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,635 - root - 269][INFO] Worker finished batch in 0.01963520050048828
openverse-ingestion_server-1  | [2023-07-12 22:41:17,640 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,645 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,645 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,645 - root - 269][INFO] Worker finished batch in 0.01754164695739746
openverse-ingestion_server-1  | [2023-07-12 22:41:17,646 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,646 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,656 - root - 211][INFO] Starting data cleaning worker
openverse-ingestion_server-1  | [2023-07-12 22:41:17,660 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,660 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,660 - root - 269][INFO] Worker finished batch in 0.014148712158203125
openverse-ingestion_server-1  | [2023-07-12 22:41:17,663 - root - 214][INFO] Data cleaning worker connected to database
openverse-ingestion_server-1  | [2023-07-12 22:41:17,663 - root - 216][INFO] Cleaning 311 rows
openverse-ingestion_server-1  | [2023-07-12 22:41:17,680 - root - 262][INFO] TLS cache: {'www.flickr.com': True, 'commons.wikimedia.org': True, '.geograph.org.uk': True, 'www.geograph.org.uk': True, '.eol.org': True, 'www.eol.org': True, '.digitaltmuseum.org': True, 'collections.musee-mccord.qc.ca': False}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,680 - root - 263][INFO] Worker committing changes...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,681 - root - 269][INFO] Worker finished batch in 0.017447233200073242
openverse-ingestion_server-1  | [2023-07-12 22:41:17,682 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,682 - root - 289][INFO] Finished saving cleaned data in 1.8358230590820312e-05,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,682 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,682 - root - 289][INFO] Finished saving cleaned data in 1.52587890625e-05,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,682 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,682 - root - 289][INFO] Finished saving cleaned data in 5.9604644775390625e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 289][INFO] Finished saving cleaned data in 3.337860107421875e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 289][INFO] Finished saving cleaned data in 3.814697265625e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 289][INFO] Finished saving cleaned data in 5.245208740234375e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 289][INFO] Finished saving cleaned data in 4.5299530029296875e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 289][INFO] Finished saving cleaned data in 3.814697265625e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 289][INFO] Finished saving cleaned data in 4.291534423828125e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,683 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 289][INFO] Finished saving cleaned data in 4.5299530029296875e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 289][INFO] Finished saving cleaned data in 4.76837158203125e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 289][INFO] Finished saving cleaned data in 4.291534423828125e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 289][INFO] Finished saving cleaned data in 3.337860107421875e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 289][INFO] Finished saving cleaned data in 5.0067901611328125e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 289][INFO] Finished saving cleaned data in 3.337860107421875e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 274][INFO] Saving cleaned data...
openverse-ingestion_server-1  | [2023-07-12 22:41:17,684 - root - 289][INFO] Finished saving cleaned data in 3.814697265625e-06,
openverse-ingestion_server-1  | {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,700 - root - 374][INFO] Batch finished, records/s: cleanup_rate=11918.683754369014, items cleaned: {'tags': 0, 'url': 0, 'creator_url': 0, 'foreign_landing_url': 0}.
openverse-ingestion_server-1  | Fetching next batch.
openverse-ingestion_server-1  | [2023-07-12 22:41:17,706 - elasticsearch - 265][INFO] POST http://es:9200/audio-init-filtered/_refresh [status:200 request:0.340s]
openverse-ingestion_server-1  | [2023-07-12 22:41:17,719 - root - 386][INFO] Cleaned all records in 0.7373456954956055 seconds,counts: {'creator_url': 0, 'tags': 0, 'url': 0, 'foreign_landing_url': 0}
openverse-ingestion_server-1  | [2023-07-12 22:41:17,720 - root - 335][INFO] Cleaning completed!
openverse-ingestion_server-1  | [2023-07-12 22:41:17,720 - ingestion_server.slack - 336][INFO] `image`: Data cleaning complete | _Next: Elasticsearch reindex_
openverse-ingestion_server-1  | [2023-07-12 22:41:17,720 - root - 339][INFO] Finished refreshing table 'image'.
openverse-ingestion_server-1  | [2023-07-12 22:41:17,721 - root - 298][INFO] Creating index image-init for model image from table temp_import_image.
openverse-ingestion_server-1  | [2023-07-12 22:41:17,742 - elasticsearch - 265][INFO] PUT http://es:9200/audio-init-filtered/_settings [status:200 request:0.036s]
openverse-ingestion_server-1  | [2023-07-12 22:41:17,743 - root - 268][INFO] Task a1ea3a85beda4846b5dd9dbb55fb0c53 completed.
openverse-ingestion_server-1  | [2023-07-12 22:41:18,150 - elasticsearch - 265][INFO] PUT http://es:9200/image-init [status:200 request:0.429s]
openverse-ingestion_server-1  | [2023-07-12 22:41:18,151 - root - 307][INFO] Running distributed index using indexer workers.
openverse-ingestion_server-1  | [2023-07-12 22:41:18,156 - root - 107][INFO] Checking http://192.168.32.7:8002/healthcheck. . .
openverse-ingestion_server-1  | [2023-07-12 22:41:18,160 - root - 120][INFO] http://192.168.32.7:8002/healthcheck passed healthcheck
openverse-ingestion_server-1  | [2023-07-12 22:41:18,160 - root -  60][INFO] Assigning job {'model_name': 'image', 'table_name': 'temp_import_image', 'start_id': 0, 'end_id': 5000, 'target_index': 'image-init'} to http://192.168.32.7:8002
openverse-ingestion_server-1  | [2023-07-12 22:41:18,170 - root - 268][INFO] Task a027e0e4b09a4088ba50954ade43e1d2 completed.
openverse-ingestion_server-1  | [2023-07-12 22:41:19,362 - gunicorn.error - 267][DEBUG] POST /worker_finished
openverse-ingestion_server-1  | [2023-07-12 22:41:19,363 - root -  88][INFO] Received worker_finished signal from 192.168.32.7
openverse-ingestion_server-1  | [2023-07-12 22:41:19,364 - root - 284][INFO] All indexer workers succeeded! New index: image-init
openverse-ingestion_server-1  | [2023-07-12 22:41:19,364 - root -  58][INFO] Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | [2023-07-12 22:41:19,370 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.005s]
openverse-ingestion_server-1  | [2023-07-12 22:41:19,373 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.003s]
openverse-ingestion_server-1  | [2023-07-12 22:41:19,377 - gunicorn.access - 349][INFO] 192.168.32.7 - - [12/Jul/2023:22:41:19 +0000] "POST /worker_finished HTTP/1.1" 200 0 "-" "python-requests/2.31.0"
openverse-ingestion_server-1  | [2023-07-12 22:41:19,607 - elasticsearch - 265][INFO] POST http://es:9200/image-init/_refresh [status:200 request:0.230s]
openverse-ingestion_server-1  | [2023-07-12 22:41:19,645 - elasticsearch - 265][INFO] PUT http://es:9200/image-init/_settings [status:200 request:0.037s]
openverse-ingestion_server-1  | [2023-07-12 22:41:21,953 - gunicorn.error - 267][DEBUG] POST /task
openverse-ingestion_server-1  | [2023-07-12 22:41:21,964 - root -  58][INFO] Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | [2023-07-12 22:41:21,970 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.005s]
openverse-ingestion_server-1  | [2023-07-12 22:41:21,972 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.003s]
openverse-ingestion_server-1  | [2023-07-12 22:41:21,973 - root - 360][INFO] `image`: Starting table promotion | _Next: recreate-indices_
openverse-ingestion_server-1  | [2023-07-12 22:41:21,976 - root - 365][INFO] Recreating database indices...
openverse-ingestion_server-1  | [2023-07-12 22:41:21,981 - root - 370][INFO] Running: CREATE UNIQUE INDEX temp_import_image_url_key ON public.temp_import_image USING btree (url)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,003 - root - 370][INFO] Running: CREATE INDEX temp_import_image_provider_7d11f847 ON public.temp_import_image USING btree (provider)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,008 - root - 370][INFO] Running: CREATE INDEX temp_import_image_provider_7d11f847_like ON public.temp_import_image USING btree (provider varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,013 - root - 370][INFO] Running: CREATE INDEX temp_import_image_source_d5a89e97 ON public.temp_import_image USING btree (source)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,017 - root - 370][INFO] Running: CREATE INDEX temp_import_image_source_d5a89e97_like ON public.temp_import_image USING btree (source varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,022 - root - 370][INFO] Running: CREATE INDEX temp_import_image_url_c6aabda2_like ON public.temp_import_image USING btree (url varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,028 - root - 370][INFO] Running: CREATE INDEX temp_import_image_last_synced_with_source_187adf09 ON public.temp_import_image USING btree (last_synced_with_source)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,032 - root - 370][INFO] Running: CREATE INDEX temp_import_image_foreign_identifier_4c72d3ee_like ON public.temp_import_image USING btree (foreign_identifier varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,046 - root - 370][INFO] Running: CREATE UNIQUE INDEX temp_import_image_identifier_key ON public.temp_import_image USING btree (identifier)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,050 - root - 370][INFO] Running: CREATE INDEX temp_import_image_foreign_identifier_4c72d3ee ON public.temp_import_image USING btree (foreign_identifier)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,056 - root - 370][INFO] Running: CREATE UNIQUE INDEX temp_import_unique_provider_image ON public.temp_import_image USING btree (foreign_identifier, provider)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,064 - gunicorn.access - 349][INFO] 192.168.32.1 - - [12/Jul/2023:22:41:22 +0000] "POST /task HTTP/1.1" 202 169 "-" "curl/7.81.0"
openverse-ingestion_server-1  | [2023-07-12 22:41:22,065 - root - 370][INFO] Running: CREATE INDEX temp_import_image_category_daa5d91c ON public.temp_import_image USING btree (category)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,070 - root - 370][INFO] Running: CREATE INDEX temp_import_image_category_daa5d91c_like ON public.temp_import_image USING btree (category varchar_pattern_ops)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,075 - root - 372][INFO] Done creating indices! Remapping constraints...
openverse-ingestion_server-1  | [2023-07-12 22:41:22,082 - root - 379][INFO] Running: DELETE FROM "imagelist_images" AS fk_table WHERE NOT EXISTS(SELECT 1 FROM "temp_import_image" AS r WHERE "r"."id" = "fk_table"."image_id");
openverse-ingestion_server-1  | [2023-07-12 22:41:22,083 - root - 379][INFO] Running: ALTER TABLE "image" DROP CONSTRAINT "unique_provider_image"
openverse-ingestion_server-1  | [2023-07-12 22:41:22,084 - root - 379][INFO] Running: ALTER TABLE "image" ADD UNIQUE (foreign_identifier, provider)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,086 - root - 379][INFO] Running: ALTER TABLE "image" DROP CONSTRAINT "image_identifier_key"
openverse-ingestion_server-1  | [2023-07-12 22:41:22,087 - root - 379][INFO] Running: ALTER TABLE "image" ADD UNIQUE (identifier)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,088 - root - 379][INFO] Running: ALTER TABLE "image" DROP CONSTRAINT "image_url_key"
openverse-ingestion_server-1  | [2023-07-12 22:41:22,089 - root - 379][INFO] Running: ALTER TABLE "image" ADD UNIQUE (url)
openverse-ingestion_server-1  | [2023-07-12 22:41:22,091 - root - 379][INFO] Running: ALTER TABLE "imagelist_images" DROP CONSTRAINT "imagelist_images_image_id_efcd124c_fk_image_id"
openverse-ingestion_server-1  | [2023-07-12 22:41:22,092 - root - 379][INFO] Running: ALTER TABLE "imagelist_images" ADD FOREIGN KEY (image_id) REFERENCES temp_import_image(id) DEFERRABLE INITIALLY DEFERRED
openverse-ingestion_server-1  | [2023-07-12 22:41:22,094 - root - 381][INFO] Done remapping constraints! Going live with new table...
openverse-ingestion_server-1  | [2023-07-12 22:41:22,095 - ingestion_server.slack - 383][INFO] `image`: Indices & constraints applied | _Next: table promotion_
openverse-ingestion_server-1  | [2023-07-12 22:41:22,095 - root - 387][INFO] Running go-live: 
openverse-ingestion_server-1  | 
openverse-ingestion_server-1  |         DROP TABLE "image";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_pkey" RENAME TO "image_pkey";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_url_key" RENAME TO "image_url_key";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_provider_7d11f847" RENAME TO "image_provider_7d11f847";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_provider_7d11f847_like" RENAME TO "image_provider_7d11f847_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_source_d5a89e97" RENAME TO "image_source_d5a89e97";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_source_d5a89e97_like" RENAME TO "image_source_d5a89e97_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_url_c6aabda2_like" RENAME TO "image_url_c6aabda2_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_last_synced_with_source_187adf09" RENAME TO "image_last_synced_with_source_187adf09";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_foreign_identifier_4c72d3ee_like" RENAME TO "image_foreign_identifier_4c72d3ee_like";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_identifier_key" RENAME TO "image_identifier_key";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_foreign_identifier_4c72d3ee" RENAME TO "image_foreign_identifier_4c72d3ee";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_unique_provider_image" RENAME TO "unique_provider_image";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_category_daa5d91c" RENAME TO "image_category_daa5d91c";
openverse-ingestion_server-1  |         ALTER INDEX "temp_import_image_category_daa5d91c_like" RENAME TO "image_category_daa5d91c_like";
openverse-ingestion_server-1  |         ALTER TABLE "temp_import_image" RENAME TO "image";
openverse-ingestion_server-1  |     
openverse-ingestion_server-1  | [2023-07-12 22:41:22,100 - ingestion_server.slack - 389][INFO] `image`: Finished table promotion | _Next: Elasticsearch promotion_
openverse-ingestion_server-1  | [2023-07-12 22:41:22,104 - root - 394][INFO] Finished promoting table image
openverse-ingestion_server-1  | [2023-07-12 22:41:22,107 - elasticsearch - 288][WARNING] GET http://es:9200/image [status:404 request:0.003s]
openverse-ingestion_server-1  | [2023-07-12 22:41:22,132 - elasticsearch - 265][INFO] PUT http://es:9200/image-init/_alias/image [status:200 request:0.025s]
openverse-ingestion_server-1  | [2023-07-12 22:41:22,133 - ingestion_server.slack - 399][INFO] `image`: Created alias `image` pointing to index `image-init`.
openverse-ingestion_server-1  | [2023-07-12 22:41:22,133 - root - 268][INFO] Task ec1c8dd4dc394d97bf88d6687e4efe49 completed.
openverse-ingestion_server-1  | [2023-07-12 22:41:27,234 - gunicorn.error - 267][DEBUG] POST /task
openverse-ingestion_server-1  | [2023-07-12 22:41:27,243 - root -  58][INFO] Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | [2023-07-12 22:41:27,249 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.005s]
openverse-ingestion_server-1  | [2023-07-12 22:41:27,252 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.003s]
openverse-ingestion_server-1  | [2023-07-12 22:41:27,343 - gunicorn.access - 349][INFO] 192.168.32.1 - - [12/Jul/2023:22:41:27 +0000] "POST /task HTTP/1.1" 202 169 "-" "curl/7.81.0"
openverse-ingestion_server-1  | [2023-07-12 22:41:27,423 - gunicorn.error - 267][DEBUG] POST /task
openverse-ingestion_server-1  | [2023-07-12 22:41:27,433 - root -  58][INFO] Connecting to es:9200 with AWS auth
openverse-ingestion_server-1  | [2023-07-12 22:41:27,441 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.007s]
openverse-ingestion_server-1  | [2023-07-12 22:41:27,444 - elasticsearch - 265][INFO] GET http://es:9200/ [status:200 request:0.003s]
openverse-ingestion_server-1  | [2023-07-12 22:41:27,448 - elasticsearch - 288][WARNING] GET http://es:9200/image-filtered [status:404 request:0.003s]
openverse-ingestion_server-1  | [2023-07-12 22:41:27,490 - elasticsearch - 265][INFO] PUT http://es:9200/image-init-filtered/_alias/image-filtered [status:200 request:0.042s]
openverse-ingestion_server-1  | [2023-07-12 22:41:27,491 - ingestion_server.slack - 399][INFO] `image`: Created alias `image-filtered` pointing to index `image-init-filtered`.
openverse-ingestion_server-1  | [2023-07-12 22:41:27,491 - root - 268][INFO] Task bf53ed02d94f4124892e4e103a1bc8cf completed.
openverse-ingestion_server-1  | [2023-07-12 22:41:27,532 - gunicorn.access - 349][INFO] 192.168.32.1 - - [12/Jul/2023:22:41:27 +0000] "POST /task HTTP/1.1" 202 169 "-" "curl/7.81.0"
openverse-ingestion_server-1  | [2023-07-12 22:41:27,630 - elasticsearch - 265][INFO] PUT http://es:9200/image-init-filtered [status:200 request:0.378s]
openverse-ingestion_server-1  | [2023-07-12 22:41:27,632 - gunicorn.error - 267][DEBUG] GET /static/mock_sensitive_terms.txt
openverse-ingestion_server-1  | [2023-07-12 22:41:27,633 - gunicorn.access - 349][INFO] 127.0.0.1 - - [12/Jul/2023:22:41:27 +0000] "GET /static/mock_sensitive_terms.txt HTTP/1.1" 200 19 "-" "Python-urllib/3.11"
openverse-ingestion_server-1  | [2023-07-12 22:41:27,859 - elasticsearch - 265][INFO] POST http://es:9200/_reindex?slices=auto&wait_for_completion=true [status:200 request:0.226s]
openverse-ingestion_server-1  | [2023-07-12 22:41:28,009 - elasticsearch - 265][INFO] POST http://es:9200/image-init-filtered/_refresh [status:200 request:0.150s]
openverse-ingestion_server-1  | [2023-07-12 22:41:28,049 - elasticsearch - 265][INFO] PUT http://es:9200/image-init-filtered/_settings [status:200 request:0.039s]
openverse-ingestion_server-1  | [2023-07-12 22:41:28,049 - root - 268][INFO] Task 5767d22927324f839145d872435384ec completed.

Testing Instructions

Run just build && just down -v then just api/init, followed with just logs ingestion_server. You should see the new log format in the produced logs!

Checklist

  • My pull request has a descriptive title (not a vague title likeUpdate index.md).
  • My pull request targets the default branch of the repository (main) or a parent feature branch.
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added or updated tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no visible errors.
  • I ran the DAG documentation generator (if applicable).

Developer Certificate of Origin

Developer Certificate of Origin
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@AetherUnbound AetherUnbound requested a review from a team as a code owner July 12, 2023 22:43
@AetherUnbound AetherUnbound added 🟧 priority: high Stalls work on the project or its dependents 🛠 goal: fix Bug fix 🤖 aspect: dx Concerns developers' experience with the codebase 🧱 stack: ingestion server Related to the ingestion/data refresh server labels Jul 12, 2023
Copy link
Contributor

@obulat obulat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested locally. The logs are much cleaner.

Do you know why we get an error logs at the beginning? Here, notice the gunicorn.error:

[2023-07-14 08:04:06,168 - gunicorn.error - 264][INFO] Starting gunicorn 20.1.0
[2023-07-14 08:04:06,169 - gunicorn.error - 267][DEBUG] Arbiter booted
[2023-07-14 08:04:06,169 - gunicorn.error - 264][INFO] Listening at: http://0.0.0.0:8002 (1)
[2023-07-14 08:04:06,169 - gunicorn.error - 264][INFO] Using worker: sync
[2023-07-14 08:04:06,171 - gunicorn.error - 264][INFO] Booting worker with pid: 7
[2023-07-14 08:04:06,260 - gunicorn.error - 267][DEBUG] 1 workers

Copy link
Member

@zackkrida zackkrida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@AetherUnbound
Copy link
Collaborator Author

@obulat I think that's just the logger that's delegated to log those specifics 🤷🏼‍♀️ Seems like a gunicorn internal thing, since the logging config is copied almost wholesale with only a few modifications. In short, no I'm not sure why! But it seems fine 🙂

@AetherUnbound AetherUnbound merged commit dd49aea into main Jul 14, 2023
43 checks passed
@AetherUnbound AetherUnbound deleted the feature/common-python-logging branch July 14, 2023 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖 aspect: dx Concerns developers' experience with the codebase 🛠 goal: fix Bug fix 🟧 priority: high Stalls work on the project or its dependents 🧱 stack: ingestion server Related to the ingestion/data refresh server
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants