This page collects information regarding the evaluation of various technologies, libraries, database design, and other activities related to the initial architecture of the system.
1. Current state of tool
1.1. Main parts of Catalog QT tool
- Client (written in C/C++) - does not have to run on the machine on which the API and UI run. Allows the user to add new data to the Catalog dataset.
- UI (written in PHP/JavaScript) - server part. It presents the collected data. For now, it does not communicate remotely with the API; both must be on the same machine.
- API (written in PHP) - server part. Handles SOAP requests from the client and the UI. The middleman between them and MySQL.
- MySQL - serves as the source of the data.
1.1.1. MySQL database schemas
At the moment there are two separate schemas for the MySQL based databases. The first one is responsible for storing summary IDS data.
1.1.2. MariaDB data sharding
1.2. Main problems of the existing solution
- The tool is very hermetic and designed for a single platform (Gateway)
- It is very problematic to create an environment for Catalog QT
- The current solutions are old and less efficient
- It is very difficult to add new functionalities or extend existing ones
- The authorization module is based on the Gateway Single Sign-On and works only on the Gateway
- SOAP - a tool with great possibilities and large requirements that we do not really need
- Only one request can be handled at a time (catalogUpdateProcess is a serial application), and it is quite fragile when it comes to error handling. If there is a major issue with a data import and catalogUpdateProcess fails, there is no way to recover other than restarting the process
2. Benefits of using Java and Spring Boot Framework
2.1. Parts of the new Catalog QT tool
- Client (Java based) - this application will be responsible for sending the user's requests (the equivalent of the catalogScheduler application)
- Server (Java based) - this part of the application will be developed using Spring Boot. Spring Boot is used by a large community of developers, so we can rely on extensive experience and numerous projects based on this solution.
- Server side (catalogUpdateProcess) will be embedded inside a Web Service container (e.g. using RabbitMQ)
2.2. Main problems of new approach
- we have to develop all the parts using a new technology
- we will not be backward compatible with old installations
2.3. Main benefits of new approach
- No need to configure a lot of components and modules to run the server side modules.
- To run the Server side components, it is enough to run a single executable JAR file.
- No Apache, GlassFish, or other web container is needed to run it - Spring Boot has its own embedded one.
- Better handling of multiple requests for storing data.
3. What was tested
3.1. Spring Boot Framework
We have checked what possibilities it offers.
It is entirely based on Java and uses the latest capabilities of the language.
It allows creating stand-alone Web Services.
3.2. Multidatabase access
Unfortunately, using Hibernate can cause long data access times, because Catalog uses extended SQL queries.
Therefore, a lighter and simpler way of accessing the MySQL databases should be used.
To test multi-database access with the Spring Boot Framework, a basic service for managing a set of JDBC drivers was used.
It fulfilled its task and made it possible to correctly select the database from which the data was read.
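The database selection step can be sketched as below. This is a minimal illustration, not the actual configuration: the schema names (`summary`, `detail`) and JDBC URLs are assumptions. In the real service, the selected URL would be handed to a JDBC `DataSource` or to `DriverManager.getConnection()`.

```java
import java.util.Map;

/** Hypothetical sketch of routing requests to one of several MySQL schemas. */
public class DataSourceRouter {

    // Assumed schema names and URLs, for illustration only.
    private final Map<String, String> jdbcUrls = Map.of(
            "summary", "jdbc:mysql://localhost:3306/catalog_summary",
            "detail",  "jdbc:mysql://localhost:3306/catalog_detail");

    /** Returns the JDBC URL for the requested schema, or fails loudly. */
    public String urlFor(String schema) {
        String url = jdbcUrls.get(schema);
        if (url == null) {
            throw new IllegalArgumentException("Unknown schema: " + schema);
        }
        return url;
    }

    public static void main(String[] args) {
        DataSourceRouter router = new DataSourceRouter();
        // The returned URL would be passed to DriverManager.getConnection(...)
        System.out.println(router.urlFor("summary"));
    }
}
```

The same idea can also be expressed with Spring's `AbstractRoutingDataSource`, but the plain map keeps the selection logic easy to follow.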
3.3. Web services
Web Services are part of the Spring Boot Framework. Creating a new WS requires marking the access path in the URL and specifying the returned data, if any.
@RestController
public class HelloController {

    @RequestMapping(value = "/hello")
    public String sayHelloWorld() {
        return "Hello World\n";
    }
}
4. Final architecture
The final architecture contains all the components that will take part in data processing. Note that the Catalog QT server will provide access to data via Web Services - we want to be as client independent as possible. We have planned to implement the whole solution in Java, due to the huge maturity of the Spring framework.
5. Tool development plans
5.1. Separation of UI and API
For now, the User Interface (UI) does not query the API remotely; it knows the API's location on the machine on which both of them are installed and queries it locally.
Thanks to the separation, the UI could connect to any API on any machine and browse the stored data remotely.
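Once separated, the UI would talk to the API over plain HTTP. The sketch below illustrates this with the JDK's own HTTP client; a stand-in "API" is started locally and queried exactly the way a remote one would be. The `/api/records` path and the empty-list response are assumptions, not the actual API contract.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical sketch: the UI querying a Catalog QT API over HTTP. */
public class RemoteApiDemo {

    /** What the UI side would do: issue a GET against any API base URL. */
    public static String fetch(String baseUrl) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/api/records")).build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the remote API: serves "[]" on /api/records.
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/api/records", exchange -> {
            byte[] body = "[]".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();

        String base = "http://localhost:" + server.getAddress().getPort();
        System.out.println(fetch(base)); // prints the API response: []
        server.stop(0);
    }
}
```

The point of the sketch is that `fetch()` takes any base URL - local or remote - which is exactly what the separation enables.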
5.2. Integration with IMAS Docker image
5.2.1. IMAS Docker
IMAS Docker is the basis for the Catalog QT Docker. It provides a set of the most basic components required to run IMAS based codes. You can find a description of IMAS Docker at the following location: WFMS:IMAS @Docker
There is a dedicated repository with samples you can use, for starters: https://github.com/tzok/imas-hello-world.git
5.2.2. Catalog QT Docker
Catalog QT Docker is built on top of IMAS Docker. In addition to the IMAS components, it introduces additional elements relevant to Catalog QT itself:
- MySQL database
- Java environment
- Spring-based web services
- TBD: Dashboard
This way, it provides all the components in one place. It is, however, possible to detach all the elements and run them separately. This would require manual intervention and customisation, yet it is still possible.
Catalog QT Docker can be found at the following location: https://github.com/mkowsiak/catalogue_qt_docker
5.2.3. Scenarios with Docker based installation
5.3. SOAP removal - time for JSON?
SOAP requires many additional libraries, and data must be created in a specific format, which makes creating the software more difficult and time-consuming, and complicates code debugging.
In addition, when installing Catalog QT on a different machine, the developer must ensure that the SOAP libraries are built and loaded to run the tool.
Current trends show that data formats such as JSON are more practical and allow data to be exchanged with other systems in a simpler way.
If we want to integrate several tools, the change of the data format to a more universal and developer-friendly one is the right moment to do it.
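To illustrate the direction, a JSON based endpoint could return a payload like the one sketched below. The class and field names (`shot`, `run`, `status`) are assumptions; in a Spring Boot service, Jackson would serialize such an object automatically, so the manual `toJson()` here only shows the payload shape - compare it with the boilerplate of an equivalent SOAP envelope.

```java
/** Hypothetical sketch of a JSON payload a REST endpoint could return. */
public class SimulationSummary {

    private final long shot;
    private final int run;
    private final String status;

    public SimulationSummary(long shot, int run, String status) {
        this.shot = shot;
        this.run = run;
        this.status = status;
    }

    /** Manual serialization, for illustration only; Jackson would do this in Spring Boot. */
    public String toJson() {
        return String.format("{\"shot\":%d,\"run\":%d,\"status\":\"%s\"}", shot, run, status);
    }

    public static void main(String[] args) {
        System.out.println(new SimulationSummary(54178, 1, "ok").toJson());
        // prints: {"shot":54178,"run":1,"status":"ok"}
    }
}
```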
6. Data Feeder refactoring
At the moment, data feeder is tightly coupled with client application (CLI based client for Catalog QT). This was a natural choice as we planned to use only IMAS based data sources. Over the course of the development, it turned out that different data sources should be taken into consideration, for example: CSV files, HDF5 files, ASCII based formats. These formats are not supported by IMAS. It means, they shouldn't require IMAS dependency at all. Client code (CLI) depends on IMAS library (imas.jar). We want to avoid this dependency for other clients. This is why Data Feeder has to be moved into common part of codes, new data structures are needed to transfer data between common and client libraries, IMASFeeder should be the only IMAS dependent component. What we also need is a complete separation of URIParsers. Each client should depend on it's own URIParser implementation. These parsers should be either implemented inside common part, or stay close to client code responsible for reading data.