1. State of the Art
The current layout of components generated by FC2K follows Kepler's module layout. This is a historical choice: Kepler used to be the main workflow engine. Python based actors, being an extension of the already existing solution, use the same layout. As a result, parts of the Python code depend strongly on components created by FC2K.
1.1. Generated code
FC2K generates:
- Kepler actor
- Python actor
- Resources used by both Python and Kepler actors
  - libActor.so
    - embeds user code
    - provides Fortran/C wrapper that reads/writes IDSes
    - stored in $KEPLER/imas/lib64
  - Actor.exe
    - executable that is used to run the actor in standalone mode
      - debugging
      - any kind of batch jobs (e.g. MPI)
  - Other: actor XML parameters, etc. (in case of Python actors these files must be located close to the Python module)
1.2. Why does the Python actor use $KEPLER?
- Python script loads libActor.so to run user's (Fortran or C++) code
  - libActor.so is stored in $KEPLER/imas/lib64
  - directory lib64 is Kepler's default place to look for libraries to be loaded
  - adding the library path to LD_LIBRARY_PATH could be an alternative, BUT in case of complex workflows (100+ actors) it can become extremely long, with a potential risk of exceeding system limits
- Python module loads code parameters based on the $KEPLER variable
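The dependency described above can be sketched as follows. This is only an illustration, assuming the library is loaded with ctypes and named lib<actor>.so; the helper names actor_library_path and load_actor_library are hypothetical and are not taken from the code generated by FC2K.

```python
import ctypes
import os


def actor_library_path(actor_name, kepler_dir=None):
    """Build the path to lib<actor_name>.so under $KEPLER/imas/lib64."""
    kepler_dir = kepler_dir or os.environ.get("KEPLER")
    if not kepler_dir:
        raise RuntimeError("KEPLER is not set")
    return os.path.join(kepler_dir, "imas", "lib64",
                        "lib{}.so".format(actor_name))


def load_actor_library(actor_name, kepler_dir=None):
    """Load the library by absolute path (assumption: ctypes is used)."""
    path = actor_library_path(actor_name, kepler_dir)
    if not os.path.isfile(path):
        raise IOError("Actor library not found: {}".format(path))
    # An absolute path means LD_LIBRARY_PATH does not need to list this directory
    return ctypes.CDLL(path)
```

Loading by absolute path avoids growing LD_LIBRARY_PATH, but it also shows why the Python actor cannot work without $KEPLER in the current layout.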
2. Separation of actors
2.1. Raw idea
Separation of actors should, at least in theory, be easy. It would be enough:
- To move actor resources out of $KEPLER
- To make resources "visible" to both Kepler and Python actors
but we also have to consider many other aspects...
2.2. Open points
Problems to be solved:
- How to ensure compatibility with the current version of IMAS?
  - Binaries are built for a particular IMAS version
  - If the user switches IMAS version, the version of resources should also be changed
- How to ensure compatibility of a Kepler or Python workflow with a given set of actors?
  - Actor A could have different ports (API) in workflow versions X and Y
  - This is currently achieved by the actor release procedure and maintenance of the ETS workflow (release of ETS workflow)
    - ETS workflow + visualisation scripts + Kepler release + actor release + autoGUI release make a package that works.
- How to make libraries "discoverable" by both actors?
  - LD_LIBRARY_PATH?
  - absolute path to libActor.so?
  - Any other mechanism?
- How to design the layout of directories to address the above-mentioned points?
3. Proposed solutions
3.1. Independent Python/Kepler actor installations
In this scenario I assume complete separation of Kepler and Python. FC2K should allow choosing the workflow platform for which the actor is generated: Python, Kepler, or something else in the future. Once the platform is selected, FC2K generates source code and a "Weighted Companion Cube" (a bundle that contains all the native codes, libraries, etc.) for the given platform.
3.1.1. Kepler
Because Kepler will (probably) be used for ETS only, there does not seem to be much sense in redesigning the whole idea of actor generation.
3.1.2. Python
I suggest moving towards self-contained Python packages consisting of:
- python wrapper
- shared libraries
- standalone binary (MPI based execution)
- default parameters
Each package should be either installed inside the user's virtual environment or made available via a wrapper module. This way, it will be possible to provide users with any version of an actor, and a Python workflow will be composed of loosely coupled actors.
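A self-contained package can locate its own resources relative to the package directory instead of relying on $KEPLER. The sketch below is a minimal illustration; the subdirectory names lib, bin and params, the file name default.xml and both helper functions are assumptions made for this example only.

```python
import os


def package_resource(package_file, *parts):
    """Return the absolute path of a file shipped inside the actor package."""
    package_dir = os.path.dirname(os.path.abspath(package_file))
    return os.path.join(package_dir, *parts)


def actor_resources(package_file, actor_name):
    """Collect the resources listed above (hypothetical package layout)."""
    return {
        # shared library with the embedded user code
        "library": package_resource(
            package_file, "lib", "lib{}.so".format(actor_name)),
        # standalone binary used for MPI based execution
        "binary": package_resource(
            package_file, "bin", "{}.exe".format(actor_name)),
        # default code parameters
        "parameters": package_resource(package_file, "params", "default.xml"),
    }
```

Inside the package, the Python wrapper would call actor_resources(__file__, 'actor_demo'), so the package keeps working wherever it is installed.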
A sample structure of the actor can be found here: /gss_efgw_work/work/g2michal/cpt/development/python_modules/simple
Inside, there is a sample structure of a Python module with native code: demo.c.
    .
    |-- src                         - source code of the actor (native code)
    |-- testing                     - scripts used for testing the solution
    |-- venv-27                     - virtual environment
    `-- workspace                   - workspace (as described by Bartek)
        `-- 3.24.0
            `-- actor_demo          - actor directory (this is not Python package)
                |-- 1.0             - version of the actor
                |   `-- actor_demo  - this is Python package (that contains module)
                `-- 1.1
                    `-- actor_demo
In this scenario, the actor is treated as a package. It means we can either install it using the pip command or access it via a wrapper module by setting an artificial access point for all the actors: PYTHON_WORKSPACE. This approach provides huge flexibility when it comes to distributing actors. It is very easy to create a virtual environment and install an actor inside it.
3.1.3. Virtual environments
A virtual environment provides per-user separation of the Python environment from the system one. There are multiple choices when it comes to "virtualization" of the Python environment:
- https://docs.python.org/3/tutorial/venv.html
- https://docs.conda.io/en/latest/
- https://www.anaconda.com
In this sample, I will use virtualenv, which is available as a Python package.
3.1.3.1. Installation of actor
In this section, I will install the actor (version 1.0) and execute it, then upgrade the actor to version 1.1 and execute it once again.
During the test, I will use a very simple test code:

    from actor_demo import wrapper
    wrapper.actor_demo()

wrapper is the name of the file inside the actor_demo module; it's an arbitrary choice.
    # Initialise virtual environment
    > virtualenv --no-site-packages -p /gw/switm/python/2.7/bin/python2.7 venv-27
    > source venv-27/bin/activate.csh

    # Install actor with version 1.0
    > pip install --upgrade --force-reinstall ./workspace/3.24.0/actor_demo/1.0/
    > python testing/test.py
    Hello from fun 1.0 Hello World!

    # Now, I want to use another version of an actor
    > pip install --upgrade --force-reinstall ./workspace/3.24.0/actor_demo/1.1
    > python testing/test.py
    Hello from fun 1.1 Hello World!
This way, it's possible to create custom work environments where various actors can be installed and reinstalled, and where different versions of different actors can be mixed. However, due to pip related limitations, only one version of a given actor can be installed inside a virtual environment at a time.
3.1.4. Accessing actors via wrappers
In this scenario, we provide one place with all the actors (WORKSPACE) and, via the wrapper, we allow the user to use different versions of an actor. The structure of the workspace directory remains the same. We have to make sure we point to this place using an environment variable:

    # Make sure that Python based wrapper will be able to load actor's code
    > module load imasenv
    > setenv PYTHON_WORKSPACE `pwd`/workspace

For each actor we need a wrapper:
    ''' actor_demo_loader.py '''
    import os
    import sys
    import imp  # Python 2.7 API; deprecated in Python 3 in favour of importlib


    def import_actor(actor_name='actor_demo', version=None):
        workspace = os.getenv('PYTHON_WORKSPACE')
        if workspace is None:
            raise Exception('It looks like PYTHON_WORKSPACE is not set')

        imas_version = os.getenv('IMAS_VERSION')
        if imas_version is None:
            raise Exception('It looks like IMAS_VERSION is not set')

        if version is None:
            raise Exception('Actor version must be given, e.g. version="1.0"')

        # <workspace>/<IMAS version>/<actor>/<actor version>/<actor package>
        actor_path = os.path.join(workspace, imas_version,
                                  actor_name, version, actor_name)
        print('Importing actor: \n ' + actor_path)

        if not os.path.isdir(actor_path):
            raise Exception(
                'Directory {} does not exist, you have probably not '
                'generated any actor'.format(actor_path))

        sys.path.append(actor_path)
        fp, pathname, description = imp.find_module('wrapper', [actor_path])
        try:
            _imas_module = imp.load_module('wrapper', fp, pathname, description)
        finally:
            # imp.find_module returns an open file that the caller must close
            if fp:
                fp.close()

        # Expose the actor's functions as attributes of this loader module
        globals().update(_imas_module.__dict__)
        del _imas_module
With this, we can use actors inside regular Python environment (actors are never installed inside site-packages).
    > module load imasenv
    > setenv PYTHON_WORKSPACE `pwd`/workspace

    > python testing/test-wrapper-1.0.py
    Importing actor:
     /gss_efgw_work/work/g2michal/cpt/development/python_modules/simple/workspace/3.24.0/actor_demo/1.0/actor_demo
    Hello from fun 1.0 Hello World!

    > python testing/test-wrapper-1.1.py
    Importing actor:
     /gss_efgw_work/work/g2michal/cpt/development/python_modules/simple/workspace/3.24.0/actor_demo/1.1/actor_demo
    Hello from fun 1.1 Hello World!
However, this solution has a small issue: it requires explicit selection of the module version, which is not quite the Python way of loading modules. Note the difference between test-wrapper-1.0.py

    import actor_demo_loader
    actor_demo_loader.import_actor(version='1.0')
    actor_demo_loader.actor_demo()

and test-wrapper-1.1.py

    import actor_demo_loader
    actor_demo_loader.import_actor(version='1.1')
    actor_demo_loader.actor_demo()

As you can see, we have to explicitly select the version of the actor. An alternative approach would be importing actors by embedding the version as a part of the package path (sort of: from actor_demo.1_0.actor_demo import wrapper). Anyway, whatever way of selecting the version we choose, there is no simple way of making the very same workflow compatible with different versions of an actor without altering the workflow's code or environment variables.
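One possible way of keeping the version out of the workflow code is to resolve it from the environment. The sketch below is only an illustration: the variable name <ACTOR>_VERSION (e.g. ACTOR_DEMO_VERSION), the fallback to the newest installed version and all function names are assumptions, not part of the existing loader. The selection still has to be changed somewhere, in this case in the environment.

```python
import os


def is_version(name):
    """True for dotted numeric names such as '1.0' or '1.10'."""
    return all(part.isdigit() for part in name.split("."))


def version_key(version):
    """Turn '1.10' into (1, 10) so that versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def available_versions(workspace, imas_version, actor_name):
    """List actor versions found under <workspace>/<IMAS version>/<actor>."""
    actor_dir = os.path.join(workspace, imas_version, actor_name)
    found = [name for name in os.listdir(actor_dir)
             if is_version(name)
             and os.path.isdir(os.path.join(actor_dir, name))]
    return sorted(found, key=version_key)


def resolve_version(workspace, imas_version, actor_name, requested=None):
    """Pick: explicit argument, then <ACTOR>_VERSION, then newest version."""
    env_name = actor_name.upper() + "_VERSION"  # hypothetical variable name
    requested = requested or os.environ.get(env_name)
    versions = available_versions(workspace, imas_version, actor_name)
    if not versions:
        raise LookupError("No versions of {} found".format(actor_name))
    if requested is None:
        return versions[-1]
    if requested not in versions:
        raise LookupError("Version {} of {} is not installed".format(
            requested, actor_name))
    return requested
```

The result could be passed to import_actor(version=...), so that the same workflow script runs against whichever version is selected in the environment.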
3.2. One common installation of actor resources
3.2.1. Idea
- To keep all (common) actor resources for both: Python and Kepler actors in one place (directory)
- To allow switching between "workspaces" (user defined sets of actors)
- To load actor libraries only for current version of IMAS (if actor is IMAS dependent)
3.2.2. Layout of actor directory
TBD (the same as previously?)
3.2.3. IMAS Workspace
    workspaces/
    |-- workspaceX
    |-- workspaceY
    `-- workspaceZ
        |-- common
        |   |-- actor1
        |   |   `-- lib
        |   |       `-- libActor1.so
        |   |-- actor2
        |   |   `-- lib
        |   |       `-- libActor2.so
        |   `-- lib
        |       |-- libActor1.so -> ../actor1/lib/libActor1.so
        |       `-- libActor2.so -> ../actor2/lib/libActor2.so
        `-- imas
            |-- 3.23.2
            |   |-- actorA
            |   |   `-- lib
            |   |       `-- libActorA.so
            |   |-- actorB
            |   |   `-- lib
            |   |       `-- libActorB.so
            |   |-- actorC
            |   |   `-- lib
            |   |       `-- libActorC.so
            |   `-- lib
            |       |-- libActorA.so -> ../actorA/lib/libActorA.so
            |       |-- libActorB.so -> ../actorB/lib/libActorB.so
            |       `-- libActorC.so -> ../actorC/lib/libActorC.so
            `-- 3.24.0
                |-- actorA
                |   `-- lib
                |       `-- libActorA.so
                |-- actorB
                |   `-- lib
                |       `-- libActorB.so
                |-- actorD
                |   `-- lib
                |       `-- libActorD.so
                `-- lib
                    |-- libActorA.so -> ../actorA/lib/libActorA.so
                    |-- libActorB.so -> ../actorB/lib/libActorB.so
                    `-- libActorD.so -> ../actorD/lib/libActorD.so
Workspace layout:
- directory common - keeps all actors with no IMAS dependencies
- directory imas/$IMAS_VERSION - keeps all actors built for the given version of IMAS
- directories common/lib and imas/$IMAS_VERSION/lib - keep links to libraries (to simplify LD_LIBRARY_PATH)
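The links kept in the lib directories could be maintained by a small helper run after each actor is generated. This is a minimal sketch under the layout shown in the workspace tree; the function name refresh_lib_links is hypothetical.

```python
import os


def refresh_lib_links(root):
    """Recreate <root>/lib with relative links to <root>/<actor>/lib/*.so.

    root is either <workspace>/common or <workspace>/imas/<IMAS version>.
    """
    lib_dir = os.path.join(root, "lib")
    if not os.path.isdir(lib_dir):
        os.makedirs(lib_dir)
    created = []
    for actor in sorted(os.listdir(root)):
        actor_lib = os.path.join(root, actor, "lib")
        if actor == "lib" or not os.path.isdir(actor_lib):
            continue
        for name in sorted(os.listdir(actor_lib)):
            if not name.endswith(".so"):
                continue
            link = os.path.join(lib_dir, name)
            if os.path.islink(link):
                os.remove(link)  # replace a stale link
            # Relative target, e.g. ../actorA/lib/libActorA.so
            os.symlink(os.path.join("..", actor, "lib", name), link)
            created.append(name)
    return created
```

Relative link targets keep a workspace relocatable: the whole directory can be copied or moved without breaking the links.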
Actor generation:
- Generated code (wrapper) will be saved under $ACTIVE_WORKSPACE/imas/$IMAS_VERSION/<actor_name>
Switching workspace:
- LD_LIBRARY_PATH = $ACTIVE_WORKSPACE/common/lib + $ACTIVE_WORKSPACE/imas/$IMAS_VERSION/lib
- java.library.path = $LD_LIBRARY_PATH
- Kepler: rm target/* ? ant compile?
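The library path for the active workspace can be computed as in the sketch below. The function name workspace_library_path is hypothetical; preserving the entries already present on the path is an assumption, since the note does not say whether the old value should be kept.

```python
import os


def workspace_library_path(active_workspace, imas_version, existing=""):
    """Compose LD_LIBRARY_PATH for the given workspace and IMAS version."""
    dirs = [
        os.path.join(active_workspace, "common", "lib"),
        os.path.join(active_workspace, "imas", imas_version, "lib"),
    ]
    # Assumption: keep previous entries, without duplicating workspace dirs
    for entry in existing.split(os.pathsep):
        if entry and entry not in dirs:
            dirs.append(entry)
    return os.pathsep.join(dirs)
```

Only two workspace entries are added regardless of the number of actors, which addresses the concern about LD_LIBRARY_PATH growing with 100+ actors. The value has to be exported by the shell (or the module system) before Python or Kepler starts, because a child process cannot modify the environment of its parent shell.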
Scripts:
- list workspaces
- switch workspace
- remove workspace
- create workspace
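The listing, creation and removal scripts could be as simple as the sketch below; all function names are hypothetical. Switching is not shown, as it comes down to setting $ACTIVE_WORKSPACE and the library paths in the user's shell.

```python
import os
import shutil


def list_workspaces(root):
    """Return the names of all workspaces stored under root."""
    return sorted(name for name in os.listdir(root)
                  if os.path.isdir(os.path.join(root, name)))


def create_workspace(root, name, imas_version):
    """Create an empty workspace with common/lib and imas/<version>/lib."""
    path = os.path.join(root, name)
    for parts in (("common", "lib"), ("imas", imas_version, "lib")):
        target = os.path.join(path, *parts)
        if not os.path.isdir(target):
            os.makedirs(target)
    return path


def remove_workspace(root, name):
    """Delete a workspace together with all actors stored inside it."""
    shutil.rmtree(os.path.join(root, name))
```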
3.2.4. Open points
- Do we need the directory "common"? Should we care about "no-IMAS" actors (usually there are none of them)...
- At which level should Kepler actors be separated from their "resources" (i.e. what can be put within the workspace)?
  - The whole imas related directory ($KEPLER/imas)?
  - How to achieve this? Kepler module or just CLASSPATH?
- Could we separate Kepler Core Actors into an independent module?