Version 2.3.0
Released on: xx/xx/2025
Operational version
Features
Indexing improvements
Entity insertions into Elasticsearch are now asynchronous. We recommend increasing the size of the Elasticsearch connection pool to handle the higher level of parallelism. See the configuration guide for details on how to set the relevant values (regards.jpa.multitenant.maxPoolSize and regards.elasticsearch.threadpool.size).
Multiple insertion tasks can now run at the same time. Depending on the size of the inserted products and the microservice’s allocated memory, you may need to reduce the configured insertion batch size of the rs-dam microservice (regards.crawler.max.bulk.size in rs-dam.properties) if you encounter memory errors.
The new Elasticsearch reindexing does not introduce any breaking changes. However, the following naming recommendations now apply:
- There is now a default Elasticsearch alias named [tenant]_alias. If you use a custom Elasticsearch alias, make sure it does not use this name.
- In the configuration of crawling datasources, do not use a label or ID ending with the string "_building", as this suffix is reserved for a specific category of ingestions used during reindexing.
Due to a limitation in the Elasticsearch 7 client, applying access rights may take longer than the default timeout configured in the rs-dam microservice (regards.elasticsearch.index.request.timeout in rs-dam.properties). When the timeout is reached, the aspiration process exits with an error, even though Elasticsearch continues applying the access rights in the background. In practice, the aspiration has typically succeeded despite the reported error. To avoid this false-negative outcome, you can significantly increase the timeout value.
- It is now forbidden to delete an index or an alias directly in Elasticsearch.
- When querying Elasticsearch directly to perform catalog searches, you must now use the alias name instead of the index name.
- During reindexing, the Elasticsearch server will require twice as much disk space as during normal use. Make sure you have enough free space allocated to Elasticsearch for the reindexing to complete successfully.
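The two naming recommendations above can be checked before deployment with a small helper. This is a minimal sketch, not part of the product: the function names are hypothetical, and it assumes that "[tenant]" in the default alias stands for the tenant name and that the comparison is an exact, case-sensitive match.

```python
# Suffix reserved for the ingestions used during reindexing (from the notes above).
RESERVED_DATASOURCE_SUFFIX = "_building"


def default_alias(tenant: str) -> str:
    """Return the default alias name, assuming [tenant] is the tenant name."""
    return f"{tenant}_alias"


def alias_is_allowed(tenant: str, custom_alias: str) -> bool:
    """A custom alias must not reuse the default alias name of the tenant."""
    return custom_alias != default_alias(tenant)


def datasource_name_is_allowed(label_or_id: str) -> bool:
    """A datasource label or ID must not end with the reserved suffix."""
    return not label_or_id.endswith(RESERVED_DATASOURCE_SUFFIX)


if __name__ == "__main__":
    print(alias_is_allowed("project1", "project1_alias"))    # collides with the default alias
    print(datasource_name_is_allowed("sentinel_building"))   # uses the reserved suffix
```

Only the suffix is reserved, so a name that merely contains "_building" elsewhere passes the check.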
Restriction to one category per OAIS product
The ability to associate several categories with an OAIS product has never been used, so the data structure and APIs have been modified to permit only one category per product. This restriction enables performance improvements when crawling OAIS products and in the administration interface.
Improved naming and layout of delivery files
The name and structure of the zip files generated by the delivery service have evolved. The new naming conventions are described in the documentation.
Breaking changes
Deployment
You need to follow the Ansible migration guide to update your playbook from V2.2.0 to V2.3.0.
Database
OAIS Category change
The database tables ingest.t_sip and ingest.t_aip have changed: the categories column (type jsonb) is replaced
by a category column (type varchar(128)). When version 2.3.0 of the service is first deployed, the data
is moved to the new column. If the categories column contained a list with a single category, that category is
transferred to the new column. If it contained several categories, only the first one is preserved.
The change is applied automatically when the service is updated, but may take a long time depending on the number of products. If the migration takes too long, the service may restart before it finishes. In that case, the migration resumes from where it stopped once the service has restarted. This may happen several times before the migration eventually completes. To avoid any problem during the migration, we provide a script to modify these two tables manually, to be executed after shutting down the current system and before restarting it in the new version: see the Ansible migration guide.
The same applies to the acquisition chains of the dataprovider service: in the table dataprovider.t_acq_processing_chain,
the categories column has been replaced by a category column. The change is applied automatically when the service
is updated, and the data is migrated automatically. If the categories column contained a list with a single category,
that category is transferred to the new column. If it contained several categories, only the first one is preserved.
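The rule applied to the old categories values, for both the ingest tables and the acquisition chains, can be sketched as follows. This is an illustration of the stated rule only, not the actual migration script; the function name is hypothetical, and the handling of an empty or missing list (returning None) is an assumption, since these notes do not describe that case.

```python
from typing import Optional, Sequence


def migrate_categories(categories: Optional[Sequence[str]]) -> Optional[str]:
    """Map the old list-valued categories column to the new single category."""
    if not categories:
        # Assumption: an empty or missing list yields no category.
        return None
    # A single category is transferred as-is; with several, only the first is kept.
    return categories[0]


if __name__ == "__main__":
    print(migrate_categories(["L1"]))         # single category is transferred
    print(migrate_categories(["L1", "L2"]))   # only the first category is preserved
```

Products that carried several categories lose all but the first, so it may be worth listing them before upgrading if that information matters.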
Index changes
An index was added in the table ingest.t_aip on (last_update, id) to improve the performance
of aspirations of OAIS products.
In the table fem.t_feature, the index on (model, last_update) was replaced by a new one on (model, last_update, id)
to improve the performance of aspirations of GeoJson products.
These changes are applied automatically when the service is updated, but may take a long time depending on the number of products.
To avoid any problem during the migration, you can use the script provided for the OAIS Category change above,
which also updates the indexes.
Run it after shutting down the current system and before restarting it in the new version:
see the Ansible migration guide.
Rest API
The OAIS product manager modification API (rs-ingest/aips/update) no longer allows setting multiple categories on a product. Requests to add a category to a product that already has one, or to add multiple categories, are ignored. It is still possible to replace an existing category by removing it and adding another one in the same request.
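The update semantics described above can be modeled as a small pure function. This is a sketch under stated assumptions, not the service implementation: the function name and the add/remove list parameters are hypothetical and do not reflect the real request format. It also assumes that an ignored request leaves the product unchanged, and that a request which only removes the current category clears it.

```python
from typing import Optional, Sequence


def apply_category_update(
    current: Optional[str],
    to_add: Sequence[str],
    to_remove: Sequence[str],
) -> Optional[str]:
    """Return the product category after a (hypothetical) add/remove request."""
    if len(to_add) > 1:
        # Adding multiple categories is ignored.
        return current
    # The current category survives unless the request removes it.
    remaining = None if current in to_remove else current
    if to_add:
        if remaining is not None:
            # Adding to a product that still has a category is ignored.
            return current
        # Either the product had no category, or it is replaced in one request.
        return to_add[0]
    # Assumption: a removal without an addition clears the category.
    return remaining


if __name__ == "__main__":
    print(apply_category_update("A", ["B"], ["A"]))  # replacement in one request
    print(apply_category_update("A", ["B"], []))     # ignored, product keeps "A"
```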
Service configuration
The format for the configuration file for the rs-dataprovider service has changed following the change
to a single category.
When specifying an AcquisitionProcessingChain in the configuration, the categories attribute must be replaced
by a category attribute containing a single string.
AMQP API
OAIS products for rs-ingest no longer allow multiple categories. For backward compatibility, a product can have either
a category field (a string) or a categories field (a list containing a single category), but not both.
The use of the categories field is deprecated.
The OAISMigrationWorker no longer allows multiple categories in a product. For backward compatibility,
a product can have either a category field (a string) or a categories field (a list containing a single category),
but not both. The use of the categories field is deprecated.
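A producer of AMQP messages can validate its payloads against the rule above before sending them. The following is a minimal sketch: the function name and the flat dictionary layout are hypothetical, since the real message structure is not described here. Treating a product with neither field as having no category, and rejecting an empty categories list, are also assumptions.

```python
from typing import Any, Mapping, Optional


def resolve_category(product: Mapping[str, Any]) -> Optional[str]:
    """Return the single category of a product payload, or raise ValueError."""
    has_single = "category" in product
    has_list = "categories" in product
    if has_single and has_list:
        raise ValueError("use either 'category' or 'categories', not both")
    if has_single:
        if not isinstance(product["category"], str):
            raise ValueError("'category' must be a string")
        return product["category"]
    if has_list:
        # Deprecated form: accepted only with exactly one category.
        categories = product["categories"]
        if not isinstance(categories, list) or len(categories) != 1:
            raise ValueError("'categories' must be a list with a single category")
        return categories[0]
    # Assumption: a product without either field has no category.
    return None


if __name__ == "__main__":
    print(resolve_category({"category": "L1"}))
    print(resolve_category({"categories": ["L1"]}))  # deprecated but still accepted
```

New producers should emit the category field directly, since the categories field is deprecated.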