Releases: dmwm/WMCore

WMAgent 2.2.3.1 WMAgent production release

15 Jul 13:08

This release provides a few important new features and feature changes, plus overall enhancements to make the agent more resilient.

Release date: 14 July 2023.
Changes since release: 2.2.0.4.

WMAgent

Software stack

  • Upgrade pip-based HTCondor from 8.9.7 to 10.2.3 (Alan Malta Rodrigues) cmsdist#8582

Features and/or feature changes

  • Add new exception/error code for jobs removed by condor for unknown reasons. (khurtado) #11649
  • Explicitly enabled verbose and abort on failure for GFAL2 plugin (Alan Malta Rodrigues) #11636
  • Make ResourceControlUpdater to continuously update PNNs in the database (Alan Malta Rodrigues) #11599
  • Replace imp by importlib (Valentin Kuznetsov) #11530

Bug Fixes

  • Gracefully parse cpu performance metrics in SummaryDB (Alan Malta Rodrigues) #11590
  • Use previous exitStatus code instead of default 99108 (Valentin Kuznetsov) #11581
  • Fix manage path in the restartComponent script (Alan Malta Rodrigues) #11572
  • Add new CMSCouch exception for Request Entity Too Large (Alan Malta Rodrigues) #11502
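
As an illustration of the last item above, a dedicated exception for HTTP 413 (Request Entity Too Large) lets callers react to oversized documents separately from other CouchDB failures. The sketch below is a minimal, assumed design: the class names, the status map, and `raise_for_status` are hypothetical and are not taken from the CMSCouch module.

```python
class CouchError(Exception):
    """Base class for the illustrative CouchDB errors (hypothetical name)."""

    def __init__(self, status, reason):
        super().__init__(f"CouchDB request failed: {status} {reason}")
        self.status = status
        self.reason = reason


class RequestTooLargeError(CouchError):
    """Raised when the server answers 413 Request Entity Too Large."""


# Map HTTP status codes to dedicated exception classes (illustrative subset).
STATUS_TO_ERROR = {413: RequestTooLargeError}


def raise_for_status(status, reason):
    """Raise the most specific exception for an HTTP error status."""
    if status < 400:
        return None
    raise STATUS_TO_ERROR.get(status, CouchError)(status, reason)
```

A caller could then catch `RequestTooLargeError` to, for example, split a bulk update into smaller chunks, while still letting every other `CouchError` propagate.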

Enhancements

WMCore 2.2.2 production central services release

11 Jul 17:15

This release mostly provides enhancements and some important bug fixes, such as a potential resolution for the GFAL2 stage-out issues; a correction of the workflow status transition for growing workflows; and the ability to automatically insert new RSEs (PNNs) into the agent database.
In addition, we have taken one step closer to a WMAgent containerization solution, which also involved some refactoring of our CD pipeline workflow.
Lastly, this is a base release for the upcoming WMAgent upgrade cycle.

Release date: 11 July 2023.
Changes since release: 2.2.1.

Central services

Software stack

  • Upgrade pip-based HTCondor from 8.9.7 to 10.2.3 (Alan Malta Rodrigues) cmsdist#8582
  • Changes to install WMAgent from PyPi; provided install and run.sh scripts (Todor Ivanov) CMSKubernetes#1393
  • Create a wmagent-base Dockerfile separating OS from application dependencies (Alan Malta Rodrigues) CMSKubernetes#1394
  • Update all WM Pypi Dockerfiles to take TAG as build argument (Alan Malta Rodrigues) CMSKubernetes#1397

Features and/or feature changes

  • Continuously update PNNs with ResourceControlUpdater (Alan Malta Rodrigues) #11599
  • New CouchDB view for WorkQueue OpenForNewData requests; add constraint to status transition (Alan Malta Rodrigues) #11611
  • CI/CD: Substitute curl with svn in GH actions workflow (Todor Ivanov) #11639
  • CI/CD: Add Docker context path to docker/build-push-action@v1 (Todor Ivanov) #11642
  • CI/CD: Update GH action for docker image to use a build argument (Alan Malta Rodrigues) #11638 #11652
  • CI/CD: Fix GH action to build/push docker image (Alan Malta Rodrigues) #11651
  • CI/CD: Refactor docker build/push workflow action (Alan Malta Rodrigues) #11653

Bug Fixes

  • Change GH api to build release notes; plus refactoring (Alan Malta Rodrigues) #11645 #11650

Enhancements

WMAgent

Features and/or feature changes

  • Explicitly enabled verbose and abort on failure for GFAL2 plugin (Alan Malta Rodrigues) #11636
  • Add new exception/error code for jobs removed by condor for unknown reasons. (khurtado) #11649

Bug Fixes

Enhancements

WMCore 2.2.1 production central services release

31 May 13:48

This release adds support for GPU StepChain workflows, assuming that all steps have the same GPU requirements. It brings in many bug fixes and improvements to both central services and WMAgent. Note that the imp Python library has been fully replaced by the standard importlib as well. Lastly, further changes to the Docker images have been made such that all central services run a Prometheus exporter and publish service metrics to MonIT.

Release date: 31 May 2023.
Changes since release: 2.2.0.2.

Central services

Software stack

  • Update dmwm-base image to pypi-20230525, fixing process_exporter

Features and/or feature changes

  • Add ReqMgr2 validateRunlist (Valentin Kuznetsov) #11535
  • Replace imp by importlib (Valentin Kuznetsov) #11530
  • Add GPU support to the StepChain spec (Alan Malta Rodrigues) #11588
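
The release summary notes that GPU StepChain support assumes every step has the same GPU requirements. A minimal sketch of such a consistency check follows. The request layout (`StepChain`, `Step1`, `Step2`, ...) and the key names `RequiresGPU` and `GPUParams` (a JSON-encoded string) are assumptions made for illustration, and `collect_gpu_requirements` is a hypothetical helper, not the actual spec code.

```python
import json


def collect_gpu_requirements(request):
    """Return the (RequiresGPU, GPUParams) pair shared by all steps.

    Raises ValueError if the steps disagree on their GPU requirements.
    """
    seen = set()
    for idx in range(1, request["StepChain"] + 1):
        step = request[f"Step{idx}"]
        requires = step.get("RequiresGPU", "forbidden")
        # Normalise the JSON string so key order does not matter.
        params = json.dumps(json.loads(step.get("GPUParams", "null")),
                            sort_keys=True)
        seen.add((requires, params))
    if len(seen) > 1:
        raise ValueError(f"Steps have different GPU requirements: {sorted(seen)}")
    return seen.pop()
```

Normalising the parameters before comparing them avoids rejecting two steps whose settings differ only in key order.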

Bug Fixes

  • MSPileup: Remove postToAMQ (post_to_amq) and fix typo in doc_type_amq (Valentin Kuznetsov) #11573
  • Fix race condition in unit test with CouchDB (Valentin Kuznetsov) #11540
  • Fix broken module path and module name during import (Todor Ivanov) #11587
  • MSPileup: detect active containers with no wmcore_transferor rules (Alan Malta Rodrigues) #11579

Enhancements

  • MSPileup consider Neutrino PDs as premix (Alan Malta Rodrigues) #11543

WMAgent

Features and/or feature changes

  • Replace imp by importlib (Valentin Kuznetsov) #11530
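
For readers migrating similar code, the usual importlib replacement for loading a module from a file path (what `imp.load_source` used to do) can be sketched as below. This is a generic pattern built on the standard library, not the exact change made in WMCore; `load_source`, the module name, and the temporary file are illustrative.

```python
import importlib.util
import os
import sys
import tempfile


def load_source(module_name, file_path):
    """importlib-based stand-in for the old imp.load_source()."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load {module_name!r} from {file_path!r}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing, mirroring what a regular import does.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


# Usage: load a throwaway configuration file by path.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "agent_config.py")
    with open(path, "w") as handle:
        handle.write("componentName = 'JobAccountant'\n")
    config = load_source("agent_config", path)
```

The `imp` module was deprecated in Python 3.4 and removed in Python 3.12, so code that still imports it fails on recent interpreters.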

Bug Fixes

  • Fix broken module path and module name during import (Todor Ivanov) #11587
  • Fix manage path in the restartComponent script (Alan Malta Rodrigues) #11572
  • Add new CMSCouch exception for Request Entity Too Large (Alan Malta Rodrigues) #11502
  • Use previous exitStatus code instead of default 99108 (Valentin Kuznetsov) #11581
  • PyPi: Fix missing static dependencies for wmagent package (Todor Ivanov) #11586
  • PyPi: Fix missing sublevel areas of packages in the wm-database module (Todor Ivanov) #11592
  • Gracefully parse cpu performance metrics in SummaryDB (Alan Malta Rodrigues) #11590

Enhancements

WMAgent 2.2.0.4 WMAgent production release

26 Apr 01:39

This release provides a few feature changes related to the worker node runtime environment. The remaining changes are mostly bug-fixing and overall enhancements of the agent.

Release date: 26 April 2023.
Changes since release: 2.1.6.1.

WMAgent

Software stack

Features and/or feature changes

  • Deprecate TFC override for multi-step jobs (Alan Malta Rodrigues) #11473
  • Move site config setup from CMS_PATH to SITECONFIG_PATH (Alan Malta Rodrigues) #11481
  • Added optional client to ctor; added getRSEUsage to wrap get_rse_usage rucio API (Valentin Kuznetsov) #11487
  • Add uploaderName as DBS3Upload config parameter (German Giraldo) #11497

Bug Fixes

  • Fix broken syntax for the f-string in DBSUploadPoller (Alan Malta Rodrigues) #11437
  • Ensure JobAccountant does not insert files without any location (Alan Malta Rodrigues) #11433
  • Corrected GQ -> LQ data acquisition in slices (Alan Malta Rodrigues) #11457
  • Fix a typo in the XMLParser (Todor Ivanov) #11533
  • Add new CMSCouch exception for Request Entity Too Large (Alan Malta Rodrigues) #11577

Enhancements

  • Do not duplicate error messages in DBS3Upload (Alan Malta Rodrigues) #11442
  • Add ORA-25401 to the list of handled exceptions in DBExceptionHandler. (Todor Ivanov) #11447
  • Update WMAgent deploy script to use deployment HEAD/master (Alan Malta Rodrigues) #11448
  • Fix CRIC unit tests wrt to the T2_CH_CERN_P5 site (Alan Malta Rodrigues) #11480
  • Fix manage path in the restartComponent script (Alan Malta Rodrigues) #11577

WMCore 2.2.0.2 production central services release

17 Apr 13:49

This release provides a production-ready MSPileup microservice to manage secondary input data. It also brings in the relevant changes to MSTransferor and MSRuleCleaner to support this new mode of secondary data management.
Important fixes to the base Docker image are provided, making Prometheus monitoring fully functional again. Usual enhancements are provided as well.

Release date: 17 April 2023.
Changes since release: 2.1.8.

Central services

Software stack

Features and/or feature changes

  • MSPileup: validate list of RSEs in both create/update calls (Alan Malta Rodrigues) #11510
  • Integrate MSAuth into MSPileup (Valentin Kuznetsov) #11509
  • Implement MSPileup clean-up task (Valentin Kuznetsov) #11517
  • New MSMonitoring module (Valentin Kuznetsov) #11478
  • Remove MS/DataPileup from MSRuleCleanerWflow && Remove Pileup logic from MSRuleCleaner. (Todor Ivanov) #11441
  • Refactor to use MSPileup data location in MSTransferor (Alan Malta Rodrigues) #11519
  • Add TapeRulesStatus to the alarm message && Decrease alarm severity. (Todor Ivanov) #11528
  • Track rule ids not created by MSPileup (Alan Malta Rodrigues) #11545

Bug Fixes

  • Only add data to the cache in Rucio wrapper when requested for (Alan Malta Rodrigues) #11510
  • Fix error message in MSPileup (Valentin Kuznetsov) #11512
  • Final tweaks to MSPileup (Alan Malta Rodrigues) #11526
  • Do not strip anything from MSPileup docs when fetching them and use pileupName for query in clean-up task (Valentin Kuznetsov) #11529
  • Remove _id attribute from pileup mongo objects (Alan Malta Rodrigues) #11539
  • Set rseList for RelVal workflows (Todor Ivanov) #11542
  • Protect rule and document deletion with the dryRun option (Alan Malta Rodrigues) #11545
  • MSPileup: fix logic for deleting old inactive pileup documents (Alan Malta Rodrigues) #11549
  • Couple pileup activatedOn/deactivatedOn to its active state change (Alan Malta Rodrigues) #11553
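
The last fix above ties the `activatedOn`/`deactivatedOn` timestamps to changes of the pileup's active state. One way to picture that coupling is the sketch below. It assumes a document with an `active` boolean next to the two timestamp fields; `apply_active_change` is a hypothetical helper and not the MSPileup implementation.

```python
import time


def apply_active_change(doc, active, now=None):
    """Return a copy of doc with timestamps updated only on a state change."""
    now = int(time.time()) if now is None else now
    updated = dict(doc)
    if updated.get("active") != active:
        updated["active"] = active
        # Only the timestamp matching the new state is touched.
        updated["activatedOn" if active else "deactivatedOn"] = now
    return updated
```

With this rule, re-submitting a document with an unchanged `active` value leaves both timestamps untouched.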

Enhancements

  • Mock Rucio in MSCore module (Alan Malta Rodrigues) #11510
  • Adding Null parentage information for RunLumi pairs missing at the parent Dataset. (Todor Ivanov) #11520
  • Support logger object in the CodeTimer class (Alan Malta Rodrigues) #11519
  • Script to parse campaigns and extract pileup configuration (Alan Malta Rodrigues) #11521
  • Dump of production pileup configurations; pileup json for dev cluster (Alan Malta Rodrigues) #11521
  • Make MSPileupTasks log more friendly (Alan Malta Rodrigues) #11545

WMAgent

Features and/or feature changes

  • Add uploaderName as DBS3Upload config parameter (German Giraldo) #11497

Bug Fixes

  • Fix a typo in the XMLParser (Todor Ivanov) #11533

Enhancements

WMCore 2.1.8 production central services release

09 Mar 18:57

This is a lightweight release which provides some fixes to the WMCore docker images, fixing the liveness probe and the monitoring process.
It also provides an important bug-fix for MSRuleCleaner. In addition, it brings in the initial functionality for a new microservice to manage input pileup data (called MSPileup). This service is still under development and not yet available in our production environment.

Release date: 9 March 2023.
Changes since release: 2.1.7.

Central services

Software stack

Features and/or feature changes

  • Fix /data/monitor.sh permissions in the base image (Valentin Kuznetsov) dmwm/CMSKubernetes#1323
  • MSAuth module for authorization (Valentin Kuznetsov) #11476
  • First draft implementation of MSPileup tasks (Valentin Kuznetsov) #11487
  • Added optional client to ctor; added getRSEUsage to wrap get_rse_usage rucio API (Valentin Kuznetsov) #11487
  • MSPileup integration with MSManager (Valentin Kuznetsov) #11504
  • Second round of changes for MSPileup data placement logic (Valentin Kuznetsov) #11499

Bug Fixes

  • Run all MSRuleCleaner pipelines for rejected/aborted requests (anehnis) #11489

Enhancements

WMAgent

Features and/or feature changes

  • Added optional client to ctor; added getRSEUsage to wrap get_rse_usage rucio API (Valentin Kuznetsov) #11487

Bug Fixes

Enhancements

WMCore 2.1.7 production central services release

13 Feb 20:13

This release introduces a new microservice to manage input pileup data, called MSPileup. It's not yet fully functional, hence it is only available in the Dev and Integration environments. This release also provides some minor bug fixes and enhancements to the REST layer, as well as the resolution of some annoying long-standing WMAgent issues.
Lastly, changes to support the latest CMSSW releases, which only support the JSON-based site catalog, have been integrated into our software.

Release date: 13 February 2023.
Changes since release: 2.1.6.

Central services

Software stack

Features and/or feature changes

  • Add reqmgr2ms-pileup (Valentin Kuznetsov) #11430 #11474
  • MSPileup service data management layer (Valentin Kuznetsov) #11438
  • MSPileup service HTTP APIs (Valentin Kuznetsov) #11443
  • Skip auth check in case of localhost usage (Valentin Kuznetsov) #11462

Bug Fixes

  • Fix file open issue in REST/Main (Valentin Kuznetsov) #11431 #11463
  • Fix bad import for mspileupError (Alan Malta Rodrigues) #11468
  • MSRuleCleaner: initialize transferInfo local var (Alan Malta Rodrigues) #11470
  • Fix MSPileup query logic (Valentin Kuznetsov) #11475
  • fix NameError import in the REST package (Thanayut Seethongchuen) #11485

Enhancements

  • Use setswitchinterval instead of setcheckinterval (Valentin Kuznetsov) #11432
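
For context on the item above: `sys.setcheckinterval` took a number of bytecode instructions and was removed in Python 3.9, while `sys.setswitchinterval` takes a duration in seconds. The snippet below is a small standalone illustration of the replacement call; the value 0.01 is an arbitrary example, not the value used in WMCore.

```python
import sys

# Remember the interpreter's current thread switch interval (in seconds).
previous = sys.getswitchinterval()

# Time-based replacement for the old instruction-count based
# sys.setcheckinterval(): ask for a thread switch roughly every 10 ms.
sys.setswitchinterval(0.01)
current = sys.getswitchinterval()

# Restore the original setting so the example has no lasting side effect.
sys.setswitchinterval(previous)
```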

WMAgent

Features and/or feature changes

  • Deprecate TFC override for multi-step jobs (Alan Malta Rodrigues) #11473
  • Move site config setup from CMS_PATH to SITECONFIG_PATH (Alan Malta Rodrigues) #11481

Bug Fixes

  • Fix broken syntax for the f-string in DBSUploadPoller (Alan Malta Rodrigues) #11437
  • Ensure JobAccountant does not insert files without any location (Alan Malta Rodrigues) #11433
  • Corrected GQ -> LQ data acquisition in slices (Alan Malta Rodrigues) #11457

Enhancements

  • Do not duplicate error messages in DBS3Upload (Alan Malta Rodrigues) #11442
  • Add ORA-25401 to the list of handled exceptions in DBExceptionHandler. (Todor Ivanov) #11447
  • Update WMAgent deploy script to use deployment HEAD/master (Alan Malta Rodrigues) #11448
  • Fix CRIC unit tests wrt to the T2_CH_CERN_P5 site (Alan Malta Rodrigues) #11480

WMAgent 2.1.6.1 production release

18 Jan 18:48

This is not a revolutionary release, but it does bring in some feature changes involving DBS, a few bug fixes and enhancements.
In addition, it has some software stack changes, where rucio-clients and CMSMonitoring python packages have been updated.

Release date: 18 January 2023.
Changes since release: 2.1.4.

WMAgent

Software stack

  • Update rucio-clients to 1.29.10 and all its dependencies (Alan Malta Rodrigues) cms-sw/cmsdist#8196
  • Update CMSMonitoring to 0.6.10 and all its dependencies (Alan Malta Rodrigues) cms-sw/cmsdist#8196

Features and/or feature changes

  • Provide DBSError class to handle DBS Go-server errors (Valentin Kuznetsov) #11173
  • Change metadata field in the Grafana SSB query (Alan Malta Rodrigues) #11379
  • Remove python2 compatibility for WMRuntime package (Alan Malta Rodrigues) #11420
  • Convert bin/ and test/data/ReqMgr scripts to python3-only (Alan Malta Rodrigues) #11419
  • Switch to use DBSError reason/srvCode instead of if/else exception block (Valentin Kuznetsov) #11375

Bug Fixes

  • Add crt/key path resolution during service creation for RucioConMon. (Todor Ivanov) #11398
  • Remove obsolete estimate_ttc parameter to Rucio client calls. (Todor Ivanov) #11383
  • Fix broken syntax for the f-string in DBSUploadPoller (Alan Malta Rodrigues) #11445
  • Ensure JobAccountant does not insert files without any location (Alan Malta Rodrigues) #11445

Enhancements

  • Bump WMAgent deployment example/tag to 2.1.4 (Alan Malta Rodrigues) #11362
  • Given a block name in DBS3Upload config, dump its info as json (Alan Malta Rodrigues) #11361
  • Add generic msDebugger && Fetch all wflows with missing TotalInput* parameters. (Todor Ivanov) #11366
  • Do not duplicate error messages in DBS3Upload (Alan Malta Rodrigues) #11445

WMCore 2.1.6 production central services release

12 Jan 15:47

This is the very first WMCore cycle fully adopting Pypi-based packages for central services deployment. In addition, a continuous delivery GH action has been created, allowing us to automatically and easily create Pypi packages and Docker images. The pycurl_manager module has now been made compliant with HTTP/2 as well.
Last but not least, there are many other bug fixes and enhancements in this release.

Release date: 12 January 2023.
Changes since release: 2.1.5.

Central services

Software stack

Features and/or feature changes

  • Many changes to the CMSKubernetes repository (new Dockerfiles based on Pypi-packages)
  • Change RucioConMon APIs' prefix. (Todor Ivanov) #11389
  • Build docker image for t0_reqmon (Alan Malta Rodrigues) #11395
  • Rename t0 package for pypi (to t0-agent) (Kenyi Hurtado) #11345

Bug Fixes

  • Add all the subpackages for each microservice (Alan Malta Rodrigues) #11392
  • Fix regexp pattern to match HTTP/1.1 and HTTP/2 (Valentin Kuznetsov) #11400
  • Fix multirequest API to encode input parameters and properly handle gzipped content (Valentin Kuznetsov) #11404
  • Add crt/key path resolution during service creation for RucioConMon. (Todor Ivanov) #11398
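
The HTTP/1.1 versus HTTP/2 fix above comes down to status lines such as `HTTP/2 200`, which carry no minor version and usually no reason phrase. A pattern that accepts both forms might look like the sketch below; the regular expression and `parse_status_line` are illustrative and are not copied from pycurl_manager.

```python
import re

# The version is "1.1"-style or a bare major number such as "2";
# the reason phrase after the status code is optional.
STATUS_LINE = re.compile(r"^HTTP/(\d+(?:\.\d+)?)\s+(\d{3})(?:\s+(.*))?$")


def parse_status_line(line):
    """Return (version, status_code, reason) for an HTTP status line."""
    match = STATUS_LINE.match(line.strip())
    if not match:
        raise ValueError(f"Not an HTTP status line: {line!r}")
    version, code, reason = match.groups()
    return version, int(code), reason or ""
```

A pattern hard-coded to `HTTP/1.1` would reject the shorter HTTP/2 form, which is the kind of mismatch this fix addresses.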

Enhancements

WMAgent

Features and/or feature changes

  • Change metadata field in the Grafana SSB query (Alan Malta Rodrigues) #11379
  • Remove python2 compatibility for WMRuntime package (Alan Malta Rodrigues) #11420
  • Convert bin/ and test/data/ReqMgr scripts to python3-only (Alan Malta Rodrigues) #11419
  • Switch to use DBSError reason/srvCode instead of if/else exception block (Valentin Kuznetsov) #11375

Bug Fixes

  • Add crt/key path resolution during service creation for RucioConMon. (Todor Ivanov) #11398

Enhancements

  • Given a block name in DBS3Upload config, dump its info as json (Alan Malta Rodrigues) #11361
  • Add generic msDebugger && Fetch all wflows with missing TotalInput* parameters. (Todor Ivanov) #11366

WMCore 2.1.5 production central services release

09 Dec 02:46

This cycle brings in a new WMCore tagging convention, compliant with PEP-440; a GitHub action for building and uploading packages to PyPi and the Gitlab registry has been integrated as well.
In addition, this release brings in feature changes and bug fixes for many of the Microservices; a new REST endpoint (metrics) is provided to yield monitoring information compliant with the Prometheus format.
There are also the usual enhancements and minor bug fixes here and there.

Release date: 8 December 2022.
Changes since release: 2.1.4.
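
To illustrate what PEP-440 compliance means for a tag, the sketch below checks a version string against the canonical-form pattern given in the appendix of PEP 440. The helper name and the sample tags are only examples, not a list of actual WMCore tags or the check used by the release tooling.

```python
import re

# Canonical public version form: [N!]N(.N)*[{a|b|rc}N][.postN][.devN]
CANONICAL = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?"
    r"(\.dev(0|[1-9][0-9]*))?$"
)


def is_canonical(version):
    """Return True if version is a canonical PEP 440 version string."""
    return CANONICAL.match(version) is not None
```

Under this pattern a release-candidate tag is written as `2.1.5rc3`, whereas forms such as `2.1.5_pre3` or `v2.1.5` are not canonical.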

Central services

Software stack

  • Update rucio-clients version from 1.25.5 to 1.29.10 (Alan Malta Rodrigues) #11368

Features and/or feature changes

  • [MSRuleCleaner] Add checks for StatusAdvanceTimeout expiration and send alarms. (Todor Ivanov) #11299
  • Add new GH Action workflows to build WMCore images (Valentin Kuznetsov) #11377
  • Add new module to convert cp stats into prometheus format suited for monitoring (Valentin Kuznetsov) #9940 #11381
  • Update buildrelease.sh to consider release-candidate tags (Alan Malta Rodrigues) #11370
  • [MSOutput] Switch mongo client to connect to a fully defined replicaset. (Todor Ivanov) #11360
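
The metrics endpoint mentioned in this release serves data in the Prometheus text exposition format, where each sample is a `name value` line optionally preceded by a `# TYPE` comment. The sketch below shows one way a flat dictionary of numeric stats could be rendered that way; `to_prometheus`, the `wmcore` prefix, and the sample keys are hypothetical and do not reproduce the module added in #11381.

```python
import re


def to_prometheus(stats, prefix="wmcore"):
    """Render numeric entries of a flat stats dict as Prometheus gauges."""
    lines = []
    for key in sorted(stats):
        value = stats[key]
        # Skip booleans and non-numeric values such as server names.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        # Metric names may only contain letters, digits and underscores here.
        name = re.sub(r"[^a-zA-Z0-9_]", "_", f"{prefix}_{key}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "".join(line + "\n" for line in lines)
```

For example, `to_prometheus({"Requests Active": 3})` yields a `# TYPE wmcore_Requests_Active gauge` line followed by `wmcore_Requests_Active 3`.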

Bug Fixes

  • [MSMonitor] Add try/except block for fetching PartialCopy parameter from campaign configuration. (Todor Ivanov) #11387
  • [MSRuleCleaner] properly evaluate tape transfers (Alan Malta Rodrigues) #11384
  • Remove obsolete estimate_ttc parameter to Rucio client calls. (Todor Ivanov) #11383

Enhancements

  • Adding sleep of 5min to the docker build images workflow (Alan Malta Rodrigues)
  • Update CERN credentials in the GH docker template (Alan Malta Rodrigues)
  • Update CERN credentials to lower case (Alan Malta Rodrigues)

WMAgent

Features and/or feature changes

  • Provide DBSError class to handle DBS Go-server errors (Valentin Kuznetsov) #11173

Bug Fixes

Enhancements

  • Bump WMAgent deployment example/tag to 2.1.4 (Alan Malta Rodrigues) #11362