Releases: dmwm/WMCore
WMAgent 2.2.3.1 WMAgent production release
This release provides a few important new features and feature changes, plus overall enhancements to make the agent more resilient.
Release date: 14 July 2023.
Changes since release: 2.2.0.4.
WMAgent
Software stack
- Upgrade pip-based HTCondor from 8.9.7 to 10.2.3 (Alan Malta Rodrigues) cmsdist#8582
Features and/or feature changes
- Add new exception/error code for jobs removed by condor for unknown reasons. (khurtado) #11649
- Explicitly enabled verbose and abort on failure for GFAL2 plugin (Alan Malta Rodrigues) #11636
- Make ResourceControlUpdater continuously update PNNs in the database (Alan Malta Rodrigues) #11599
- Replace imp by importlib (Valentin Kuznetsov) #11530
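For readers migrating similar code, the move from `imp` to `importlib` can be sketched as follows. This is a minimal illustration, not the code changed in #11530: the helper name `load_module_from_path` and the temporary `example_config.py` file are hypothetical, and only standard-library calls are used.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_path(module_name, file_path):
    """Load a module from an explicit file path, roughly what imp.load_source used to do.

    Hypothetical helper for illustration only; it is not part of WMCore.
    """
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot build an import spec for {file_path}")
    module = importlib.util.module_from_spec(spec)
    # Register the module before executing it, as the import system itself does.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


# Write a throwaway module to disk and load it by path.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / "example_config.py"
    config_path.write_text("AGENT_NAME = 'demo'\n")
    example = load_module_from_path("example_config", config_path)
    print(example.AGENT_NAME)
```

The `imp` module was removed in Python 3.12, so code that still relies on it will fail to import on recent interpreters.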
Bug Fixes
- Gracefully parse cpu performance metrics in SummaryDB (Alan Malta Rodrigues) #11590
- Use previous exitStatus code instead of default 99108 (Valentin Kuznetsov) #11581
- Fix manage path in the restartComponent script (Alan Malta Rodrigues) #11572
- Add new CMSCouch exception for Request Entity Too Large (Alan Malta Rodrigues) #11502
Enhancements
WMCore 2.2.2 production central services release
This release mostly provides enhancements and some important bug fixes, such as a potential resolution for the GFAL2 stage out issues; a correction of the workflow status transition for growing workflows; and the ability to automatically insert new RSEs (PNNs) into the agent database.
In addition, we have taken one step closer to a WMAgent containerization solution, which also involved some refactoring of our CD pipeline workflow.
Lastly, this is a base release for the upcoming WMAgent upgrade cycle.
Release date: 11 July 2023.
Changes since release: 2.2.1.
Central services
Software stack
- Upgrade pip-based HTCondor from 8.9.7 to 10.2.3 (Alan Malta Rodrigues) cmsdist#8582
- Changes to install WMAgent from PyPi; provided install and run.sh scripts (Todor Ivanov) CMSKubernetes#1393
- Create a wmagent-base Dockerfile separating OS from application dependencies (Alan Malta Rodrigues) CMSKubernetes#1394
- Update all WM Pypi Dockerfiles to take TAG as build argument (Alan Malta Rodrigues) CMSKubernetes#1397
Features and/or feature changes
- Continuously update PNNs with ResourceControlUpdater (Alan Malta Rodrigues) #11599
- New CouchDB view for WorkQueue OpenForNewData requests; add constraint to status transition (Alan Malta Rodrigues) #11611
- CI/CD: Substitute curl with svn in GH actions workflow (Todor Ivanov) #11639
- CI/CD: Add Docker context path to docker/build-push-action@v1 (Todor Ivanov) #11642
- CI/CD: Update GH action for docker image to use a build argument (Alan Malta Rodrigues) #11638 #11652
- CI/CD: Fix GH action to build/push docker image (Alan Malta Rodrigues) #11651
- CI/CD: Refactor docker build/push workflow action (Alan Malta Rodrigues) #11653
Bug Fixes
Enhancements
WMAgent
Features and/or feature changes
- Explicitly enabled verbose and abort on failure for GFAL2 plugin (Alan Malta Rodrigues) #11636
- Add new exception/error code for jobs removed by condor for unknown reasons. (khurtado) #11649
Bug Fixes
Enhancements
WMCore 2.2.1 production central services release
This release adds support for GPU StepChain workflows, assuming that all steps have the same GPU requirements. It brings in many bug fixes and improvements to both central services and WMAgent. Note that the imp Python library has been fully replaced by the standard importlib as well. Lastly, further changes to the Docker images have been made such that all central services run the Prometheus exporter and populate service metrics to MonIT.
Release date: 31 May 2023.
Changes since release: 2.2.0.2.
Central services
Software stack
- Update dmwm-base image to pypi-20230525, fixing process_exporter
Features and/or feature changes
- Add ReqMgr2 validateRunlist (Valentin Kuznetsov) #11535
- Replace imp by importlib (Valentin Kuznetsov) #11530
- Add GPU support to the StepChain spec (Alan Malta Rodrigues) #11588
Bug Fixes
- MSPileup: Remove postToAMQ (post_to_amq) and fix typo in doc_type_amq (Valentin Kuznetsov) #11573
- Fix race condition in unit test with CouchDB (Valentin Kuznetsov) #11540
- Fix broken module path and module name during import (Todor Ivanov) #11587
- MSPileup: detect active containers with no wmcore_transferor rules (Alan Malta Rodrigues) #11579
Enhancements
- MSPileup: consider Neutrino PDs as premix (Alan Malta Rodrigues) #11543
WMAgent
Features and/or feature changes
- Replace imp by importlib (Valentin Kuznetsov) #11530
Bug Fixes
- Fix broken module path and module name during import (Todor Ivanov) #11587
- Fix manage path in the restartComponent script (Alan Malta Rodrigues) #11572
- Add new CMSCouch exception for Request Entity Too Large (Alan Malta Rodrigues) #11502
- Use previous exitStatus code instead of default 99108 (Valentin Kuznetsov) #11581
- PyPi: Fix missing static dependencies for wmagent package (Todor Ivanov) #11586
- PyPi: Fix missing sublevel areas of packages in the wm-database module (Todor Ivanov) #11592
- Gracefully parse cpu performance metrics in SummaryDB (Alan Malta Rodrigues) #11590
Enhancements
WMAgent 2.2.0.4 WMAgent production release
This release provides a few feature changes related to the worker node runtime environment. The remaining changes are mostly bug-fixing and overall enhancements of the agent.
Release date: 26 April 2023.
Changes since release: 2.1.6.1.
WMAgent
Software stack
Features and/or feature changes
- Deprecate TFC override for multi-step jobs (Alan Malta Rodrigues) #11473
- Move site config setup from CMS_PATH to SITECONFIG_PATH (Alan Malta Rodrigues) #11481
- Added optional client to ctor; added getRSEUsage to wrap get_rse_usage rucio API (Valentin Kuznetsov) #11487
- Add uploaderName as DBS3Upload config parameter (German Giraldo) #11497
Bug Fixes
- Fix broken syntax for the f-string in DBSUploadPoller (Alan Malta Rodrigues) #11437
- Ensure JobAccountant does not insert files without any location (Alan Malta Rodrigues) #11433
- Corrected GQ -> LQ data acquisition in slices (Alan Malta Rodrigues) #11457
- Fix a typo in the XMLParser (Todor Ivanov) #11533
- Add new CMSCouch exception for Request Entity Too Large (Alan Malta Rodrigues) #11577
Enhancements
- Do not duplicate error messages in DBS3Upload (Alan Malta Rodrigues) #11442
- Add ORA-25401 to the list of handled exceptions in DBExceptionHandler. (Todor Ivanov) #11447
- Update WMAgent deploy script to use deployment HEAD/master (Alan Malta Rodrigues) #11448
- Fix CRIC unit tests wrt the T2_CH_CERN_P5 site (Alan Malta Rodrigues) #11480
- Fix manage path in the restartComponent script (Alan Malta Rodrigues) #11577
WMCore 2.2.0.2 production central services release
This release provides a production-ready MSPileup microservice to manage secondary input data. It also brings in the relevant changes to MSTransferor and MSRuleCleaner to support this new mode of secondary data management.
Important fixes to the base Docker image are provided, making Prometheus monitoring fully functional again. Usual enhancements are provided as well.
Release date: 17 April 2023.
Changes since release: 2.1.8.
Central services
Software stack
- New dmwm-base image tag (pypi-20230314) with fixes to the Prometheus monitoring dmwm/CMSKubernetes#1332
- Remove ms-output dependency on ms-output-mongo dmwm/CMSKubernetes#1339
- Set CouchDB image to use 3.2.2 version; fix monitoring.ini in the CouchDB image dmwm/CMSKubernetes#1350
Features and/or feature changes
- MSPileup: validate list of RSEs in both create/update calls (Alan Malta Rodrigues) #11510
- Integrate MSAuth into MSPileup (Valentin Kuznetsov) #11509
- Implement MSPileup clean-up task (Valentin Kuznetsov) #11517
- New MSMonitoring module (Valentin Kuznetsov) #11478
- Remove MS/DataPileup from MSRuleCleanerWflow && Remove Pileup logic from MSRuleCleaner. (Todor Ivanov) #11441
- Refactor to use MSPileup data location in MSTransferor (Alan Malta Rodrigues) #11519
- Add TapeRulesStatus to the alarm message && Decrease alarm severity. (Todor Ivanov) #11528
- Track rule ids not created by MSPileup (Alan Malta Rodrigues) #11545
Bug Fixes
- Only add data to the cache in Rucio wrapper when requested (Alan Malta Rodrigues) #11510
- Fix error message in MSPileup (Valentin Kuznetsov) #11512
- Final tweaks to MSPileup (Alan Malta Rodrigues) #11526
- Do not strip anything from MSPileup docs when fetching them and use pileupName for query in clean-up task (Valentin Kuznetsov) #11529
- Remove _id attribute from pileup mongo objects (Alan Malta Rodrigues) #11539
- Set rseList for RelVal workflows (Todor Ivanov) #11542
- Protect rule and document deletion with the dryRun option (Alan Malta Rodrigues) #11545
- MSPileup: fix logic for deleting old inactive pileup documents (Alan Malta Rodrigues) #11549
- Couple pileup activatedOn/deactivatedOn to its active state change (Alan Malta Rodrigues) #11553
Enhancements
- Mock Rucio in MSCore module (Alan Malta Rodrigues) #11510
- Adding Null parentage information for RunLumi pairs missing at the parent Dataset. (Todor Ivanov) #11520
- Support logger object in the CodeTimer class (Alan Malta Rodrigues) #11519
- Script to parse campaigns and extract pileup configuration (Alan Malta Rodrigues) #11521
- Dump of production pileup configurations; pileup json for dev cluster (Alan Malta Rodrigues) #11521
- Make MSPileupTasks log more friendly (Alan Malta Rodrigues) #11545
WMAgent
Features and/or feature changes
- Add uploaderName as DBS3Upload config parameter (German Giraldo) #11497
Bug Fixes
- Fix a typo in the XMLParser (Todor Ivanov) #11533
Enhancements
WMCore 2.1.8 production central services release
This is a lightweight release which provides some fixes to the WMCore docker images, fixing the liveness probe and the monitoring process.
It also provides an important bug-fix for MSRuleCleaner. In addition, it brings in the initial functionality for a new microservice to manage input pileup data (called MSPileup). This service is still under development and not yet available in our production environment.
Release date: 9 March 2023.
Changes since release: 2.1.7.
Central services
Software stack
Features and/or feature changes
- Fix /data/monitor.sh permissions in the base image (Valentin Kuznetsov) dmwm/CMSKubernetes#1323
- MSAuth module for authorization (Valentin Kuznetsov) #11476
- First draft implementation of MSPileup tasks (Valentin Kuznetsov) #11487
- Added optional client to ctor; added getRSEUsage to wrap get_rse_usage rucio API (Valentin Kuznetsov) #11487
- MSPileup integration with MSManager (Valentin Kuznetsov) #11504
- Second round of changes for MSPileup data placement logic (Valentin Kuznetsov) #11499
Bug Fixes
- Run all MSRuleCleaner pipelines for rejected/aborted requests (anehnis) #11489
Enhancements
WMAgent
Features and/or feature changes
- Added optional client to ctor; added getRSEUsage to wrap get_rse_usage rucio API (Valentin Kuznetsov) #11487
Bug Fixes
Enhancements
WMCore 2.1.7 production central services release
This release introduces a new microservice to manage input pileup data, called MSPileup. It is not yet fully functional, hence it is only available in the Dev and Integration environments. This release also provides some minor bug fixes and enhancements to the REST layer, as well as the resolution of some long-standing WMAgent issues.
Lastly, changes to support the latest CMSSW releases, which only support the JSON-based site catalog, have been integrated into our software.
Release date: 13 February 2023.
Changes since release: 2.1.6.
Central services
Software stack
Features and/or feature changes
- Add reqmgr2ms-pileup (Valentin Kuznetsov) #11430 #11474
- MSPileup service data management layer (Valentin Kuznetsov) #11438
- MSPileup service HTTP APIs (Valentin Kuznetsov) #11443
- Skip auth check in case of localhost usage (Valentin Kuznetsov) #11462
Bug Fixes
- Fix file open issue in REST/Main (Valentin Kuznetsov) #11431 #11463
- Fix bad import for mspileupError (Alan Malta Rodrigues) #11468
- MSRuleCleaner: initialize transferInfo local var (Alan Malta Rodrigues) #11470
- Fix MSPileup query logic (Valentin Kuznetsov) #11475
- Fix NameError import in the REST package (Thanayut Seethongchuen) #11485
Enhancements
- Use setswitchinterval instead of setcheckinterval (Valentin Kuznetsov) #11432
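As background for the `setswitchinterval` change above: `sys.setcheckinterval`, which counted bytecode instructions, was deprecated in Python 3.2 and removed in Python 3.9, while `sys.setswitchinterval` takes a duration in seconds. The snippet below is a minimal sketch of the newer call; the value `0.01` is an arbitrary illustration and is not taken from #11432.

```python
import sys

# Remember the interpreter's current thread switch interval (in seconds).
previous_interval = sys.getswitchinterval()

# Ask the interpreter to consider switching threads every 10 ms.
# The value is illustrative only, not the one used by WMCore.
sys.setswitchinterval(0.01)
current_interval = sys.getswitchinterval()
print(f"switch interval changed from {previous_interval} to {current_interval}")

# Restore the original setting so the rest of the process is unaffected.
sys.setswitchinterval(previous_interval)
```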
WMAgent
Features and/or feature changes
- Deprecate TFC override for multi-step jobs (Alan Malta Rodrigues) #11473
- Move site config setup from CMS_PATH to SITECONFIG_PATH (Alan Malta Rodrigues) #11481
Bug Fixes
- Fix broken syntax for the f-string in DBSUploadPoller (Alan Malta Rodrigues) #11437
- Ensure JobAccountant does not insert files without any location (Alan Malta Rodrigues) #11433
- Corrected GQ -> LQ data acquisition in slices (Alan Malta Rodrigues) #11457
Enhancements
- Do not duplicate error messages in DBS3Upload (Alan Malta Rodrigues) #11442
- Add ORA-25401 to the list of handled exceptions in DBExceptionHandler. (Todor Ivanov) #11447
- Update WMAgent deploy script to use deployment HEAD/master (Alan Malta Rodrigues) #11448
- Fix CRIC unit tests wrt the T2_CH_CERN_P5 site (Alan Malta Rodrigues) #11480
WMAgent 2.1.6.1 production release
This is not a revolutionary release, but it does bring in some feature changes involving DBS, a few bug fixes and enhancements.
In addition, it has some software stack changes, where rucio-clients and CMSMonitoring python packages have been updated.
Release date: 18 January 2023.
Changes since release: 2.1.4.
WMAgent
Software stack
- Update rucio-clients to 1.29.10 and all its dependencies (Alan Malta Rodrigues) cms-sw/cmsdist#8196
- Update CMSMonitoring to 0.6.10 and all its dependencies (Alan Malta Rodrigues) cms-sw/cmsdist#8196
Features and/or feature changes
- Provide DBSError class to handle DBS Go-server errors (Valentin Kuznetsov) #11173
- Change metadata field in the Grafana SSB query (Alan Malta Rodrigues) #11379
- Remove python2 compatibility for WMRuntime package (Alan Malta Rodrigues) #11420
- Convert bin/ and test/data/ReqMgr scripts to python3-only (Alan Malta Rodrigues) #11419
- Switch to use DBSError reason/srvCode instead of if/else exception block (Valentin Kuznetsov) #11375
Bug Fixes
- Add crt/key path resolution during service creation for RucioConMon. (Todor Ivanov) #11398
- Remove obsolete estimate_ttc parameter to Rucio client calls. (Todor Ivanov) #11383
- Fix broken syntax for the f-string in DBSUploadPoller (Alan Malta Rodrigues) #11445
- Ensure JobAccountant does not insert files without any location (Alan Malta Rodrigues) #11445
Enhancements
- Bump WMAgent deployment example/tag to 2.1.4 (Alan Malta Rodrigues) #11362
- Given a block name in DBS3Upload config, dump its info as json (Alan Malta Rodrigues) #11361
- Add generic msDebugger && Fetch all wflows with missing TotalInput* parameters. (Todor Ivanov) #11366
- Do not duplicate error messages in DBS3Upload (Alan Malta Rodrigues) #11445
WMCore 2.1.6 production central services release
This is the very first WMCore cycle fully adopting Pypi-based packages for central services deployment. In addition, a continuous delivery GH action has been created, allowing us to automatically and easily create Pypi packages and Docker images. The pycurl_manager module has now been made compliant with HTTP/2 as well.
Last but not least, there are many other bug fixes and enhancements in this release.
Release date: 12 January 2023.
Changes since release: 2.1.5.
Central services
Software stack
Features and/or feature changes
- Many changes to the CMSKubernetes repository (new Dockerfiles based on Pypi-packages)
- Change RucioConMon APIs' prefix. (Todor Ivanov) #11389
- Build docker image for t0_reqmon (Alan Malta Rodrigues) #11395
- Rename t0 package for pypi (to t0-agent) (Kenyi Hurtado) #11345
Bug Fixes
- Add all the subpackages for each microservice (Alan Malta Rodrigues) #11392
- Fix regexp pattern to match HTTP/1.1 and HTTP/2 (Valentin Kuznetsov) #11400
- Fix multirequest API to encode input parameters and properly handle gzipped content (Valentin Kuznetsov) #11404
- Add crt/key path resolution during service creation for RucioConMon. (Todor Ivanov) #11398
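To illustrate the kind of issue behind the HTTP/1.1 and HTTP/2 pattern fix listed above: an HTTP/2 status line carries no minor version and usually no reason phrase, so a pattern written for `HTTP/1.1 200 OK` can fail to match it. The sketch below is an assumption-laden example; `STATUS_LINE` and `parse_status_line` are hypothetical names and this is not the pattern used in pycurl_manager.

```python
import re

# Hypothetical pattern: optional minor version, three-digit code, optional reason phrase.
STATUS_LINE = re.compile(r"^HTTP/(\d+(?:\.\d+)?)\s+(\d{3})(?:\s+(.*))?$")


def parse_status_line(line):
    """Return (version, status_code, reason) parsed from an HTTP status line."""
    match = STATUS_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"Not an HTTP status line: {line!r}")
    version, code, reason = match.groups()
    return version, int(code), reason or ""


print(parse_status_line("HTTP/1.1 200 OK"))
print(parse_status_line("HTTP/2 200"))
```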
Enhancements
WMAgent
Features and/or feature changes
- Change metadata field in the Grafana SSB query (Alan Malta Rodrigues) #11379
- Remove python2 compatibility for WMRuntime package (Alan Malta Rodrigues) #11420
- Convert bin/ and test/data/ReqMgr scripts to python3-only (Alan Malta Rodrigues) #11419
- Switch to use DBSError reason/srvCode instead of if/else exception block (Valentin Kuznetsov) #11375
Bug Fixes
- Add crt/key path resolution during service creation for RucioConMon. (Todor Ivanov) #11398
Enhancements
WMCore 2.1.5 production central services release
This cycle brings in a new WMCore tagging convention, compliant with PEP-440; a GitHub action for building and uploading packages to PyPi and the Gitlab registry has been integrated as well.
In addition, this release brings in feature changes and bug fixes for many of the Microservices; a new REST endpoint (metrics) is provided to yield monitoring information compliant with Prometheus format.
There are also the usual enhancements and minor bug fixes here and there.
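As a rough illustration of what a PEP-440 compliant tag looks like, the check below uses the canonical-form regular expression given in the appendix of PEP 440. The function name `is_canonical_pep440` and the sample tags other than `2.2.3.1` are made up for this sketch; it does not reproduce the logic of buildrelease.sh or the GitHub action.

```python
import re

# Canonical-form pattern as published in the appendix of PEP 440.
CANONICAL_PEP440 = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?(\.dev(0|[1-9][0-9]*))?$"
)


def is_canonical_pep440(tag):
    """Return True if the tag is already in PEP 440 canonical (normalized) form."""
    return CANONICAL_PEP440.match(tag) is not None


# Sample tags: a final release, a release candidate, and a non-canonical spelling.
for tag in ("2.2.3.1", "2.1.5rc1", "2.1.5.pre1"):
    print(tag, is_canonical_pep440(tag))
```

Note that PEP 440 also accepts some alternate spellings and normalizes them, so a tag that fails this check is not necessarily invalid; it is only not in canonical form.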
Release date: 8 December 2022.
Changes since release: 2.1.4.
Central services
Software stack
- Update rucio-clients version from 1.25.5 to 1.29.10 (Alan Malta Rodrigues) #11368
Features and/or feature changes
- [MSRuleCleaner] Add checks for StatusAdvanceTimeout expiration and send alarms. (Todor Ivanov) #11299
- Add new GH Action workflows to build WMCore images (Valentin Kuznetsov) #11377
- Add new module to convert cp stats into prometheus format suited for monitoring (Valentin Kuznetsov) #9940 #11381
- Update buildrelease.sh to consider release-candidate tags (Alan Malta Rodrigues) #11370
- [MSOutput] Switch mongo client to connect to a fully defined replicaset. (Todor Ivanov) #11360
Bug Fixes
- [MSMonitor] Add try/except block for fetching PartialCopy parameter from campaign configuration. (Todor Ivanov) #11387
- [MSRuleCleaner] properly evaluate tape transfers (Alan Malta Rodrigues) #11384
- Remove obsolete estimate_ttc parameter to Rucio client calls. (Todor Ivanov) #11383
Enhancements
- Adding sleep of 5min to the docker build images workflow (Alan Malta Rodrigues)
- Update CERN credentials in the GH docker template (Alan Malta Rodrigues)
- Update CERN credentials to lower case (Alan Malta Rodrigues)
WMAgent
Features and/or feature changes
- Provide DBSError class to handle DBS Go-server errors (Valentin Kuznetsov) #11173
Bug Fixes
Enhancements
- Bump WMAgent deployment example/tag to 2.1.4 (Alan Malta Rodrigues) #11362