1. Current solution
It looks like (from the user's perspective) we lack a simplified way of choosing the "dressed" Kepler and the workflow that will be used to launch ETS. In fact, the same sort of issues will affect Python based workflows.
At the moment, a release of Kepler follows this (heavily simplified) process:
- release of new FC2K (e.g. version R4.2.6; this will be transparent for users - they don't need FC2K for day-to-day usage)
- release of new Kepler (e.g. version R5.0.16_v5.2.2)
- release of all actors (based on tag, e.g. v5.2.2)
- release of supplementary components (e.g. kplots - ETS_4.10b.10_v5.6.0/kplots)
- release of workflow (e.g. ETS_v5.2.2.xml)
- release of autoGui (e.g. 1.1)
All these components together make up the user's environment.
2. Main goal
To provide the user with a mechanism that:
- allows running the ETS workflow via the ETS GUI
- automatically and transparently updates all ETS components
3. Assumptions
- User sets up and runs ETS via ETS GUI only
3.1. Module ets
Module ets configures user working environment by loading consistent set of libraries that were used to build particular version of ETS.
It internally
- loads module itmenv
- sets particular release of ETS actors
- sets particular release of ETS workflow (ETS.xml)
Proposed syntax: ets/<dd-ver>/<ets-ver>, e.g. ets/3.27.0/6.3
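As a sketch of what such a module could do under the hood (a bash illustration; the central install root and directory layout below are my assumptions - only ETS_HOME is confirmed by the sbk example in section 5.1.5):

```shell
#!/bin/bash
# Rough equivalent of "module load ets/3.27.0/6.3" (bash; the site uses
# tcsh and Tcl modulefiles). ETS_ROOT and the layout are hypothetical.
DD_VER="3.27.0"                                  # data dictionary version
ETS_VER="6.3"                                    # ETS release
ETS_ROOT="${ETS_ROOT:-/gw/switm/ets}"            # hypothetical central root
export ETS_HOME="$ETS_ROOT/$DD_VER/$ETS_VER"     # actors + workflow release
export ETS_WORKFLOW="$ETS_HOME/ETS.xml"          # default workflow file
# module load itmenv   # would load the consistent set of libraries
```

The real modulefile would additionally pin the actor release and adjust paths; this only shows the variable layout.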
3.2. Script launch_ets.sh (please name it simply "ETS.sh")
This script
- checks if new versions of the ETS environment are available
- in case a new version was released, allows the user to decide whether the local installation should be updated
- if YES - automatically updates the local user installation of the ETS environment
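The update check could look roughly like this (a minimal sketch; the LATEST/version file layout, the file names, and recording the installed release in a per-user file are all assumptions, not the agreed design):

```shell
#!/bin/bash
# Sketch of the launch_ets.sh update check. File layout is hypothetical:
# a central file holding the latest release tag, a per-user file holding
# the locally installed one. "answer" is an argument here for testability;
# the real script would use an interactive read.
check_and_update() {
    local central_file="$1" local_file="$2" answer="$3"
    local central local_v
    central=$(cat "$central_file" 2>/dev/null)
    local_v=$(cat "$local_file" 2>/dev/null)
    if [ -n "$central" ] && [ "$central" != "$local_v" ]; then
        echo "There is a new release of ETS. Do you want to upgrade [Y/N]?: $answer"
        case "$answer" in
            [Yy])
                # keep the previous version so the user can roll back later
                [ -f "$local_file" ] && cp "$local_file" "$local_file.previous"
                echo "$central" > "$local_file"   # record the upgraded release
                ;;
            *) : ;;  # N: leave the current installation untouched
        esac
    fi
}
```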
4. Use cases:
4.1. Use-case 1 - Running existing installation of ETS
1) Precondition - first run or no ETS versions released since last run
2) Setting ETS environment and launching ETS
# 1. log in to machine
# 2. initialise environment
> module load ets
# 3. run the autoGui
> launch_ets.sh
autoGui is started; the user can use their version of ETS with the default workflow loaded by the environment.
4.2. Use-case 2 - Running already installed version of ETS (even if new version was released)
1) Precondition - ETS environment updated
In this scenario, the system was upgraded and a new release of the tools was installed:
- release of new Kepler (e.g. version R5.0.16_v5.2.3)
- release of all actors (based on tag, e.g. v5.2.3)
- release of supplementary components (e.g. kplots)
- release of workflow (e.g. ETS_v5.2.3.xml)
With the current layout of Kepler and actors, we are forced to create a new version of Kepler. This is why, after releasing a new set of actors, we get a new version of Kepler:
R5.0.16_v5.2.2 -> R5.0.16_v5.2.3
2) Setting ETS environment and launching ETS
# 1. log in to machine
# 2. initialise environment
> module load ets
# 3. run the autoGui
> launch_ets.sh
There is a new release of ETS. Do you want to upgrade [Y/N]?: N
User presses N; the currently installed version of autoGui [is autoGui being "installed" locally?] and the current version of the environment (Kepler, set of ETS actors, and workflow file) are used. No changes are made to the user's settings.
4.3. Use-case 3 - Running ETS (automatic update of ETS installation)
1) Precondition - ETS environment updated
In this scenario, the system was upgraded and the same new release of tools as in use-case 2 was installed (new Kepler R5.0.16_v5.2.3, all actors tagged v5.2.3, supplementary components such as kplots, and workflow ETS_v5.2.3.xml). As before, with the current layout of Kepler and actors we are forced to create a new version of Kepler:
R5.0.16_v5.2.2 -> R5.0.16_v5.2.3
2) Setting ETS environment and launching ETS
# 1. log in to machine
# 2. initialise environment
> module load ets
# 3. run the autoGui
> launch_ets.sh
There is a new release of ETS. Do you want to upgrade [Y/N]?: Y
User presses Y. The script upgrades all tools installed for the user and starts the most recent version of autoGui and the most recent version of Kepler. The previous version of the user's installation is preserved, and it will be possible to go back to it later.
5. Preliminary investigation
All the above solutions keep the current approach, which requires the user to have a sort of local install of Kepler (composed of links or of the full sources).
Before going in that direction, we want to check the technical feasibility of running a workflow with a Kepler that remains entirely in a public place.
5.1. Kepler
We need to check that running Kepler in such a mode is possible. This involves a few actions, mainly on Michal, with help from others when/where needed:
5.1.1. System variables used by Kepler
At the moment, the way Kepler is executed can be controlled by a set of environment variables and Java properties:
- _JAVA_OPTIONS - the heaviest way of altering Java and Kepler. This option makes it possible to alter default settings of the JVM. It overrides everything else.
- KEPLER_JAVA_OPTIONS - equivalent of _JAVA_OPTIONS; however, this variable alters only Kepler.
  example: setenv KEPLER_JAVA_OPTIONS "-Xss20m -Xms1g -Xmx4g -Dsun.java2d.xrender=false"
- KEPLER_DOT - points to the root directory where the .kepler and KeplerData directories will be created
- PTOLEMYII_DOT - points to the root directory where the .ptolemyII directory will be created
- KEPLER_WORK_DIR - points to the location where Kepler will be started. This way, it is possible to change the current directory (instead of $KEPLER it will be pointed to by $KEPLER_WORK_DIR). Alternatively, this value can be passed via: -Dkepler.work.dir=...
- -Duser.home= - this property is passed to Kepler; this way it is possible to change user.home inside Kepler
- -Duser.start.dir= - this property allows passing information about the current directory. Such a property can be accessed later on by calling System.getProperty("java.property.name")
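Put together, a bash equivalent of the tcsh setenv example above that points all of Kepler's cache and work locations at the current directory (the JVM flags are the ones from the example; everything else is only an illustration of the variables just listed):

```shell
#!/bin/bash
# Redirect Kepler's caches, logs and working directory to the current
# directory instead of $HOME (bash syntax; the examples on this page
# use tcsh's setenv).
export KEPLER_JAVA_OPTIONS="-Xss20m -Xms1g -Xmx4g -Dsun.java2d.xrender=false"
export KEPLER_DOT="$PWD"       # .kepler and KeplerData will be created here
export PTOLEMYII_DOT="$PWD"    # .ptolemyII will be created here
export KEPLER_WORK_DIR="$PWD"  # Kepler starts in this directory
```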
5.1.2. Preparation for the test
To simulate the case where we run everything from the central installation, I am working on a local copy with all access rights set to read-only:
> chmod a-w kepler
Test installation is here: /gss_efgw_work/work/g2michal/cpt/development/isolated_kepler/kepler
5.1.2.1. Running from public installation
The first test is related to running Kepler interactively from the public install (can the Kepler log be stored elsewhere?). I am using a copy of ETS_4.10b.10_v5.2.2.
The first issue is related to local changes inside the KEPLER directory:
> ant run
...
...
[compile] Compiling itm...
BUILD FAILED
/gss_efgw_work/work/g2michal/cpt/development/isolated_kepler/kepler/build-area/build.xml:56: java.io.FileNotFoundException: ./depcache/itm/dependencies.txt (Permission denied)
The above issue can be solved by running from a place where the user has write access. For example, running it inside a directory where the user has write access will result in creating the directory depcache, and Kepler will start:
> pwd
/gss_efgw_work/work/g2michal/cpt/development/isolated_kepler
> ant -f kepler/build-area/build.xml run
> tree depcache
depcache
|-- actors
|   `-- dependencies.txt
|-- authentication
|   `-- dependencies.txt
|-- authentication-gui
|   `-- dependencies.txt
|-- component-library
|   `-- dependencies.txt
|-- configuration-manager
|   `-- dependencies.txt
|-- core
|   `-- dependencies.txt
|-- data-handling
|   `-- dependencies.txt
|-- dataone
|   `-- dependencies.txt
|-- dataturbine
|   `-- dependencies.txt
|-- display-redirect
|   `-- dependencies.txt
|-- ecogrid
|   `-- dependencies.txt
|-- event-state
|   `-- dependencies.txt
|-- gui
|   `-- dependencies.txt
|-- io
|   `-- dependencies.txt
|-- itm
|   `-- dependencies.txt
|-- job
|   `-- dependencies.txt
|-- loader
|   `-- dependencies.txt
|-- module-manager
|   `-- dependencies.txt
|-- module-manager-gui
|   `-- dependencies.txt
|-- opendap
|   `-- dependencies.txt
|-- r
|   `-- dependencies.txt
|-- repository
|   `-- dependencies.txt
|-- sms
|   `-- dependencies.txt
|-- ssh
|   `-- dependencies.txt
`-- util
    `-- dependencies.txt
install_path.txt
There is an issue with the install_path.txt file. It can be solved by adding a file named use.keplerdata inside the Kepler installation.
There is also an issue with the Log4J based logger:
[compile] Compiling itm...
run:
[run] JVM Memory: min = 1g, max = 4g, stack = 20m
[run] log4j.properties found in CLASSPATH: /gss_efgw_work/work/g2michal/cpt/development/isolated_kepler/kepler/itm/resources/log4j.properties
[run] log4j:ERROR setFile(null,true) call failed.
[run] java.io.FileNotFoundException: keplerLog4J.log (Permission denied)
FIXED: I was able to overcome this issue (and at the same time change the work dir to a different location) by implementing something like this inside the file kepler/build-area/src/org/kepler/build/Run.java:
Java java = new Java();
java.setDir(new java.io.File(System.getenv("KEPLER_WORK_DIR")));
Kepler is started in two phases: the first one is responsible for collecting modules and starting the second one (workflow execution). This process is maintained by Ant. It is possible to override the defaults and run the code in any directory (e.g. the current one). At the moment, the solution is a little bit ugly (I am reading the location where Kepler is supposed to be started from the environment).
5.1.2.2. Running FC2K based actor inside Kepler
Running Kepler interactively with FC2K actors inside the workflow. We have to make sure that the current directory can be set to any location outside of Kepler. We run simple cases first, then ETS-like ones, where some actors might read/write files in the current directory.
In order to run the test case for ETS, make sure to follow these steps:
# prepare env.
> module load itmenv/ETS_4.10b.10_v5.6.0
# This is the place where your workflow will run.
# Just for convenience I am creating a new directory
# where all the files will be stored. However, it can be
# any location you like:
#   /tmp
#   $ITMWORK
#   $HOME
#
# For the whole scenario it really doesn't matter where you run the whole thing.
# However, note that after execution, inside your current directory you will find
# lots of small files. This is why it might be a good idea to start everything inside
# some separate directory - just to not pollute your $HOME
> mkdir $ITMWORK/isolated_kepler_test
> cd $ITMWORK/isolated_kepler_test
# This step is still required, but it may change in the future
> source $ITMSCRIPTDIR/ITMv2 jet
# Note that we are using an "artificial" central installation.
# In the future, this location will be replaced with something like
#   $SWITMDIR/kepler/trunk/ETS_4.10b.10_v5.6.1/kepler
> setenv KEPLER /gss_efgw_work/work/g2michal/cpt/development/isolated_kepler/ETS_4.10b.10_v5.6.1/kepler
# If you have these files somewhere else, this step is not required.
# If you want to make sure you are working with the most recent release of the ETS workflow,
# I suggest you run this step anyway.
> svn co https://gforge6.eufus.eu/svn/keplerworkflows/tags/ETS_4.10b.10_v5.6.0
# Do not go inside $KEPLER!
# This command starts Kepler and it will run it inside your
# current directory.
> $KEPLER/kepler.sh ETS_4.10b.10_v5.6.0/ETS_WORKFLOW.xml
# Once Kepler is open and you can see the workflow on the canvas, simply execute the workflow
You should be able to execute the whole workflow and see the results as below.
In case actors create some files (e.g. for internal processing or as intermediate results), these files are no longer created inside $KEPLER. Instead, they are created in the current directory - the place where you started Kepler:
> pwd
/gss_efgw_work/work/g2michal/isolated_kepler_test
> tree . -L 1 -a --charset=ascii
.
|-- EQDSK_COCOS_02_POS.OUT  - these are some files created by workflow (codes inside workflow)
|-- EQDSK_COCOS_13.OUT      - -"-
|-- EXPEQ_KEPLER.IN         - -"-
|-- EXPEQ.OUT               - -"-
|-- EXPEQ.OUT.TOR           - -"-
|-- EXPTNZ.OUT              - -"-
|-- NOUT                    - -"-
|-- ETS_4.10b.10_v5.6.0     - This is the repository with ETS workflow, kplots, etc.
|-- kepler.log              - These are the logs of kepler.sh
|-- keplerLog4J.log         - These are the logs generated by Kepler via Log4J
|-- .kepler                 - This is the file you would normally find in $HOME
|-- KeplerData              - -"-
`-- .ptolemyII              - -"-
The kepler.log file - the one that contains the log of Kepler's execution - is also created in the current directory.
No special actions are needed from the user's side. Everything is set up inside the kepler.sh script.
5.1.2.3. Standalone mode
Running these actors in standalone mode (checking the behaviour of the copy of sources/exec from initial place in Kepler to KEPLEREXECUTION/sandbox)
5.1.2.4. Debug mode
Running these actors in debug mode (checking the behaviour of the copy of sources/exec from initial place in Kepler to KEPLEREXECUTION/sandbox)
5.1.2.5. Batch mode
Running these actors in batch mode (checking the behaviour of the copy of sources/exec from initial place in Kepler to KEPLEREXECUTION/sandbox)
5.1.2.6. SBK based execution
Checking sbk execution - with simple workflow and then with ETS workflow
5.1.2.7. Default location of Kepler based cache files
We have agreed to make current working directory a place where all the local data used by Kepler will be stored.
5.1.2.7.1. Use-case - running two different sessions of Kepler
Let's say we want to run two completely separate sessions of Kepler. In that case, the user will do the following.
We are using already existing features:
KEPLER_DOT
PTOLEMYII_DOT
KEPLER_WORK_DIR
> mkdir experiments
> cd experiments
> mkdir my_first_experiment
> cd my_first_experiment
> kepler.sh
... some things are happening
... workflow is started
... done ...
... at the end, the user saves his own version of the ETS workflow as "workflow.xml"
> cd ..
> mkdir my_second_experiment
> cd my_second_experiment
> kepler.sh
... some things are happening
... workflow is started
... done ...
... at the end, the user saves his own version of the ETS workflow as "workflow.xml"
> cd ../..
> tree experiments
experiments
|-- my_first_experiment
|   |-- KeplerData
|   |   `-- workflow.xml
|   |-- kepler.log
|   |-- keplerLog4J.log
|   |-- .local
|   |-- .ptolemyII
|   `-- SOME_FILE_CREATED_BY_PHYSICS_CODE
`-- my_second_experiment
    |-- KeplerData
    |   `-- workflow.xml
    |-- kepler.log
    |-- keplerLog4J.log
    |-- .local
    |-- .ptolemyII
    `-- SOME_FILE_CREATED_BY_PHYSICS_CODE
At the very end, all the local data ends up in the location where Kepler was started. There are no references to $HOME, $KEPLER, or other locations. Everything is stored in the directory where Kepler was started.
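The two-session pattern above can be wrapped in a small helper (entirely hypothetical, not part of kepler.sh) that creates a fresh directory per session and points the existing variables at it:

```shell
#!/bin/bash
# Hypothetical helper: run each Kepler session in its own directory so
# sessions never share caches or logs. The helper name and the use of a
# subshell are my choices; KEPLER_DOT / PTOLEMYII_DOT / KEPLER_WORK_DIR
# are the existing features mentioned above.
run_session() {
    local dir="$1"; shift
    ( mkdir -p "$dir" && cd "$dir" || exit 1
      export KEPLER_DOT="$PWD" PTOLEMYII_DOT="$PWD" KEPLER_WORK_DIR="$PWD"
      "$@" )                 # e.g. kepler.sh my_workflow.xml
}
# run_session my_first_experiment  kepler.sh
# run_session my_second_experiment kepler.sh
```

Running in a subshell keeps the caller's working directory and environment untouched between sessions.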
5.1.3. First implementation of centrally protected Kepler release
This section contains a very simple sample of a dressed Kepler that is write-protected. To get it running, do the following:
> mkdir -p ~/tmp/experiments/my_first_execution
> cd ~/tmp/experiments/my_first_execution
# This workflow simulates the ETS workflow.
# It uses an actor that is centrally installed.
> cp ~g2michal/public/nocpo_workflow.xml .
> setenv KEPLER /gw/switm/kepler/trunk/R6.0.7/kepler
> $KEPLER/kepler.sh -cwd
# Open file ~/tmp/experiments/my_first_execution/nocpo_workflow.xml and run the workflow
5.1.4. Using centrally installed Kepler with sbk
sbk.sh
was adapted to support centrally installed releases of Kepler. In order to use it, one must do the following:
> module load itmenv
> setenv KEPLER /gw/switm/kepler/trunk/R6.0.7/kepler
> module switch scripts/R4
# /gss_efgw_work/work/g2michal/cpt/issues/batch_and_central_kepler/Test.xml
#   contains a sample workflow that we can use for testing
# $ITMSCRIPTDIR/batch_submission/wrapper-central.sh
#   contains a modified wrapper script that supports centrally installed Kepler;
#   it should be backward compatible with regular Kepler
> sbk.sh -name=test_batch_central \
    -file=/gss_efgw_work/work/g2michal/cpt/issues/batch_and_central_kepler/Test.xml \
    -marco:mem=10 \
    -marco:select=1 \
    -marco:cpus=1 \
    -marco:walltime=00:10:00 \
    -marco:block=false \
    -marco:queue=gw \
    -wrapper=$ITMSCRIPTDIR/batch_submission/wrapper-central.sh
5.1.5. Using centrally installed Kepler with sbk via ets module
> module purge
> module load cineca
> module load ets
> sbk.sh -name=test_2 \
    -file=$ETS_HOME/ETS5.xml \
    -marco:mem=10 \
    -marco:select=1 \
    -marco:cpus=1 \
    -marco:walltime=00:10:00 \
    -marco:block=false \
    -marco:queue=gw
5.2. Kplots and other scripts
We need to check that these scripts (currently copied within the Kepler sources when Kepler is being dressed up) can be stored outside Kepler (in a public place, versioned, and made available through a module) and referenced from the workflow. The kplots module will set up a KPLOTS_HOME variable which points to the directory containing all the scripts; we need to check that reading this variable from the workflow will work correctly (getenv). Action on Olivier to create a public install of kplots with modules, and on Dmitriy to update the ETS workflow to link to the kplots scripts using the env variable set by the module. These actions do not depend on the Kepler actions above.
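A sketch of the intended mechanism (KPLOTS_HOME is the variable named above; the install path and the script name are my assumptions):

```shell
#!/bin/bash
# What "module load kplots" would effectively do, plus how the workflow
# side sees it. The install path and script name are hypothetical.
export KPLOTS_HOME="${KPLOTS_HOME:-/gw/switm/kplots/1.0}"
# Inside the workflow, scripts are then referenced through the variable,
# resolved at run time via getenv("KPLOTS_HOME"), e.g.:
PLOT_SCRIPT="$KPLOTS_HOME/kplots.py"
```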
Acknowledgement
This work has been carried out within the framework of the EUROfusion Consortium and has received funding from the Euratom research and training programme 2014-2018 under grant agreement No 633053. The scientific work is published for the realization of the international project co-financed by the Polish Ministry of Science and Higher Education in 2019 and 2020 from financial resources of the program entitled "PMW"; Agreement No. 5040/H2020/Euratom/2019/2 and 5142/H2020-Euratom/2020/2.
5 Comments
Unknown User (olivier.hoenen@ipp.mpg.de)
In this page I am missing the part where users can select the exact version of ets module they want (I hope this is being planned)
(no this is not really covered by use-case 2)
Unknown User (olivier.hoenen@ipp.mpg.de)
btw, for this auto install of kepler, or update of local install of kepler, what will be the chosen name of the local kepler?
Unknown User (olivier.hoenen@ipp.mpg.de)
time for kplots to be released independently of Kepler, no? (I know that ideally actors should also, but in the case of kplots this should be easy, we just need a module that sets an env variable to where the kplots scripts are, and to update the ETS workflow to use this env for reaching scripts)
Unknown User (olivier.hoenen@ipp.mpg.de)
I am missing use-case 0, very first time a user wants to run ETS (nothing installed locally)
Unknown User (olivier.hoenen@ipp.mpg.de)
I would also like to see if no kepler install at all could be targeted (might require some updates, for instance the logging mechanism, etc...), because this would be most useful for end-users (people who will never be developers of actors)