public release 3.0.0 - see README and CHANGES for details

update README
public release 2.2.0 - see README.md and CHANGES.md for details
2021-02-09 12:46:20 +01:00 · 2020-09-04 16:31:45 +02:00 · 2020-09-04 16:22:42 +02:00
92 changed files with 7210 additions and 2207 deletions
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -0,0 +1,14 @@
+pages:
+  stage: deploy
+  script:
+  - ~/miniconda3/bin/activate pmsco
+  - make docs
+  - mv docs/html/ public/
+  artifacts:
+    paths:
+    - public
+  only:
+  - master
+  tags:
+  - doxygen
+  
--- a/CHANGES.md
+++ b/CHANGES.md
@@ -0,0 +1,59 @@
+Release 3.0.0 (2021-02-01)
+==========================
+
+| Hash | Date | Description |
+| ---- | ---- | ----------- |
+| 72a9f38 | 2021-02-06 | introduce run file based job scheduling |
+| 42e12d8 | 2021-02-05 | compatibility with recent conda and singularity versions |
+| caf9f43 | 2021-02-03 | installation: include plantuml.jar |
+| 574c88a | 2021-02-01 | docs: replace doxypy by doxypypy |
+| a5cb831 | 2021-02-05 | redefine output_file property |
+| 49dbb89 | 2021-01-27 | documentation of run file interface |
+| 940d9ae | 2021-01-07 | introduce run file interface |
+| 6950f98 | 2021-02-05 | set legacy fortran for compatibility with recent compiler |
+| 28d8bc9 | 2021-01-27 | graphics: fixed color range for modulation functions |
+| 1382508 | 2021-01-16 | cluster: build_element accepts symbol or number |
+| 53508b7 | 2021-01-06 | graphics: swarm plot |
+| 4a24163 | 2021-01-05 | graphics: genetic chart |
+| 99e9782 | 2020-12-23 | periodic table: use common binding energies in condensed matter XPS |
+| fdfcf90 | 2020-12-23 | periodic table: reformat bindingenergy.json, add more import/export functions |
+| 13cf90f | 2020-12-21 | hbnni: parameters for xpd demo with two domains |
+| 680edb4 | 2020-12-21 | documentation: update documentation of optimizers |
+| d909469 | 2020-12-18 | doc: update top components diagram (pmsco module is entry point) |
+| 574993e | 2020-12-09 | spectrum: add plot cross section function |
+
+
+Release 2.2.0 (2020-09-04)
+==========================
+
+| Hash | Date | Description |
+| ---- | ---- | ----------- |
+| 4bb2331 | 2020-07-30 | demo project for arbitrary molecule (cluster file) |
+| f984f64 | 2020-09-03 | bugfix: DATA CORRUPTION in phagen translator (emitter mix-up) |
+| 11fb849 | 2020-09-02 | bugfix: load native cluster file: wrong column order |
+| d071c97 | 2020-09-01 | bugfix: initial-state command line option not respected |
+| 9705eed | 2020-07-28 | photoionization cross sections and spectrum simulator |
+| 98312f0 | 2020-06-12 | database: use local lock objects |
+| c8fb974 | 2020-04-30 | database: create view on results and models |
+| 2cfebcb | 2020-05-14 | REFACTORING: Domain -> ModelSpace, Params -> CalculatorParams |
+| d5516ae | 2020-05-14 | REFACTORING: symmetry -> domain |
+| b2dd21b | 2020-05-13 | possible conda/mpi4py conflict - changed installation procedure |
+| cf5c7fd | 2020-05-12 | cluster: new calc_scattering_angles function |
+| 20df82d | 2020-05-07 | include a periodic table of binding energies of the elements |
+| 5d560bf | 2020-04-24 | clean up files in the main loop and in the end |
+| 6e0ade5 | 2020-04-24 | bugfix: database ingestion overwrites results from previous jobs |
+| 263b220 | 2020-04-24 | time out at least 10 minutes before the hard time limit given on the command line |
+| 4ec526d | 2020-04-09 | cluster: new get_center function |
+| fcdef4f | 2020-04-09 | bugfix: type error in grid optimizer |
+| a4d1cf7 | 2020-03-05 | bugfix: file extension in phagen/makefile |
+| 9461e46 | 2019-09-11 | dispatch: new algo to distribute processing slots to task levels |
+| 30851ea | 2020-03-04 | bugfix: load single-line data files correctly! |
+| 71fe0c6 | 2019-10-04 | cluster generator for zincblende crystal |
+| 23965e3 | 2020-02-26 | phagen translator: fix phase convention (MAJOR), fix single-energy |
+| cf1814f | 2019-09-11 | dispatch: give more priority to mid-level tasks in single mode |
+| 58c778d | 2019-09-05 | improve performance of cluster add_bulk, add_layer and rotate |
+| 20ef1af | 2019-09-05 | unit test for Cluster.translate, bugfix in translate and relax |
+| 0b80850 | 2019-07-17 | fix compatibility with numpy >= 1.14, require numpy >= 1.13 |
+| 1d0a542 | 2019-07-16 | database: introduce job-tags |
+| 8461d81 | 2019-07-05 | qpmsco: delete code after execution |
+
--- a/201
+++ b/201
@@ -0,0 +1,201 @@
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "{}"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright 2015-2020 Paul Scherrer Institut
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
--- a/README.md
+++ b/README.md
@@ -6,29 +6,33 @@ It is a collection of computer programs to calculate photoelectron diffraction p
 and to optimize structural models based on measured data.

 The actual scattering calculation is done by code developed by other parties.
-PMSCO wraps around that program and facilitates parameter handling, cluster building, structural optimization and parallel processing.
+PMSCO wraps around those programs and facilitates parameter handling, cluster building, structural optimization and parallel processing.
 In the current version, the [EDAC](http://garciadeabajos-group.icfo.es/widgets/edac/) code
 developed by F. J. García de Abajo, M. A. Van Hove, and C. S. Fadley (1999) is used for scattering calculations.
-Other code can be integrated as well.
+As an alternative to the EDAC built-in routines,
+the PHAGEN program from [MsSpec-1.0](https://msspec.cnrs.fr/index.html) can be used to calculate atomic scattering factors.
+

 Highlights
 ----------

- angle or energy scanned XPD.
- various scanning modes including energy, polar angle, azimuthal angle, analyser angle.
- averaging over multiple symmetries (domains or emitters).
+- angle and energy scanned XPD.
+- various scanning modes including energy, manipulator angle (polar/azimuthal), emission angle.
+- averaging over multiple domains and emitters.
 - global optimization of multiple scans.
- structural optimization algorithms: particle swarm optimization, grid search, gradient search.
+- structural optimization algorithms: particle swarm optimization, genetic algorithm, grid scan, table scan.
+- detailed reports and graphs of result files.
 - calculation of the modulation function.
 - calculation of the weighted R-factor.
 - automatic parallel processing using OpenMPI.
+- compatible with Slurm resource manager on Linux cluster machines.


 Installation
 ============

-PMSCO is written in Python 3.6 and compatible with Python 2.7.
-The code will run in any recent Linux environment on a workstation or in a virtual machine.
+PMSCO is written in Python 3.6.
+The code will run in any recent Linux environment on a workstation or virtual machine.
 Scientific Linux, CentOS7, [Ubuntu](https://www.ubuntu.com/)
 and [Lubuntu](http://lubuntu.net/) (recommended for virtual machine) have been tested.
 For optimization jobs, a cluster with 20-50 available processor cores is recommended.
@@ -36,7 +40,12 @@ The code requires about 2 GB of RAM per process.

 Detailed installation instructions and dependencies can be found in the documentation
 (docs/src/installation.dox).
-A [Doxygen](http://www.stack.nl/~dimitri/doxygen/index.html) compiler with Doxypy is required to generate the documentation in HTML or LaTeX format.
+A [Doxygen](http://www.stack.nl/~dimitri/doxygen/index.html) compiler with Doxypypy is required to generate the documentation in HTML format.
+
+The easiest way to set up an environment with all dependencies and without side-effects on other installed software is to use a [Singularity](https://www.sylabs.io/guides/3.7/user-guide/index.html) container.
+A Singularity recipe file is part of the distribution (see the PMSCO documentation for details); Singularity must be installed separately.
+Installation in a [virtual box](https://www.virtualbox.org/) on Windows or Mac is straightforward using pre-compiled images with [Vagrant](https://www.vagrantup.com/).
+A Vagrant definition file is included in the distribution.

 The public distribution of PMSCO does not contain the [EDAC](http://garciadeabajos-group.icfo.es/widgets/edac/) code.
 Please obtain the EDAC source code from the original author, copy it to the pmsco/edac directory, and apply the edac_all.patch patch.
@@ -61,10 +70,39 @@ Matthias Muntwiler, <mailto:matthias.muntwiler@psi.ch>
 Copyright
 ---------

-Copyright 2015-2018 by [Paul Scherrer Institut](http://www.psi.ch)
+Copyright 2015-2021 by [Paul Scherrer Institut](http://www.psi.ch)


 Release Notes
 =============

+For a detailed list of changes, see the CHANGES.md file.
+
+3.0.0 (2021-02-08)
+------------------
+
+- Run file interface replaces command line arguments:
+  - Specify all run-time parameters in a JSON-formatted text file.
+  - Override any public attribute of the project class.
+  - Only the name of the run file is needed on the command line.
+- The command line interface is still available, but some default values and the handling of directory paths have changed.
+  Check your code for compatibility.
+- Integrated job scheduling with the Slurm resource manager:
+  - Declare all job arguments in the run file and have PMSCO submit the job.
+- Graphics scripts for genetic chart and swarm population (experimental feature).
+- Update for compatibility with recent Ubuntu (20.04), Anaconda (4.8) and Singularity (3.7).
+- Drop compatibility with Python 2.7; the minimum requirement is Python 3.6.
+
+
+2.2.0 (2020-09-04)
+------------------
+
+This release breaks existing project code unless the listed refactorings are applied.
+
+- Major refactoring: The 'symmetry' calculation level is renamed to 'domain'. 
+  The previous Domain class is renamed to ModelSpace, Params to CalculatorParams.
+  The refactorings must be applied to project code as well.
+- Included periodic table of elements with electron binding energies and scattering cross-sections.
+- Various bug fixes in cluster routines, data file handling, and in the PHAGEN interface.
+- Experimental sqlite3 database interface for optimization results.

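The run file interface described in the 3.0.0 release notes can be pictured roughly as follows. This is a minimal sketch and not PMSCO's actual implementation: the `Project` class, its attributes, the `"project"` key and the `apply_run_file` helper are hypothetical names chosen for illustration. It only shows the general idea of reading a JSON-formatted text and overriding public attributes of a project object.

```python
import json


class Project:
    """Hypothetical stand-in for a project class (not the PMSCO class)."""

    def __init__(self):
        self.mode = "single"
        self.output_file = "out"
        self.time_limit = 24.0
        self._internal = None  # leading underscore: treated as non-public


def apply_run_file(project, text):
    """Override public attributes of `project` from JSON text.

    The layout {"project": {...}} is an assumption made for this sketch.
    """
    settings = json.loads(text)["project"]
    for name, value in settings.items():
        # Refuse private names and names the project does not declare.
        if name.startswith("_") or not hasattr(project, name):
            raise KeyError(f"unknown or non-public attribute: {name}")
        setattr(project, name, value)
    return project


# In practice the text would be read from the run file named on the command line.
RUN_FILE = '{"project": {"mode": "swarm", "time_limit": 2.0}}'
project = apply_run_file(Project(), RUN_FILE)
print(project.mode, project.time_limit, project.output_file)
```

Attributes that the run file does not mention keep their defaults, which is why only the run file name would be needed on the command line in such a design.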
--- a/bin/pmsco.ra-git.template
+++ b/bin/pmsco.ra-git.template
@@ -1,136 +0,0 @@
-#!/bin/bash
-#
-# Slurm script template for PMSCO calculations on the Ra cluster
-# based on run_mpi_HPL_nodes-2.sl by V. Markushin 2016-03-01
-#
-# this version checks out the source code from a git repository
-# to a temporary location and compiles the code.
-# this is to minimize conflicts between different jobs
-# but requires that each job has its own git commit.
-#
-# Use:
-# - enter the appropriate parameters and save as a new file.
-# - call the sbatch command to pass the job script.
-#   request a specific number of nodes and tasks.
-#   example:
-#   sbatch --nodes=2  --ntasks-per-node=24 --time=02:00:00 run_pmsco.sl
-# the qpmsco script does all this for you.
-#
-# PMSCO arguments
-# copy this template to a new file, and set the arguments
-#
-# PMSCO_WORK_DIR
-#   path to be used as working directory.
-#   contains the script derived from this template
-#   and a copy of the pmsco code in the 'pmsco' directory.
-#   receives output and temporary files.
-#
-# PMSCO_PROJECT_FILE
-#   python module that declares the project and starts the calculation.
-#   must include the file path relative to $PMSCO_WORK_DIR.
-#
-# PMSCO_OUT
-#   name of output file. should not include a path.
-#
-# all paths are relative to $PMSCO_WORK_DIR or (better) absolute.
-#
-#
-# Further arguments
-#
-# PMSCO_JOBNAME (required)
-#   the job name is the base name for output files.
-#
-# PMSCO_WALLTIME_HR (integer, required)
-#   wall time limit in hours. must be integer, minimum 1.
-#   this value is passed to PMSCO.
-#   it should specify the same amount of wall time as requested from the scheduler.
-#
-# PMSCO_PROJECT_ARGS (optional)
-#   extra arguments that are parsed by the project module.
-#
-#SBATCH --job-name="_PMSCO_JOBNAME"
-#SBATCH --output="_PMSCO_JOBNAME.o.%j"
-#SBATCH --error="_PMSCO_JOBNAME.e.%j"
-
-PMSCO_WORK_DIR="_PMSCO_WORK_DIR"
-PMSCO_JOBNAME="_PMSCO_JOBNAME"
-PMSCO_WALLTIME_HR=_PMSCO_WALLTIME_HR
-
-PMSCO_PROJECT_FILE="_PMSCO_PROJECT_FILE"
-PMSCO_OUT="_PMSCO_JOBNAME"
-PMSCO_PROJECT_ARGS="_PMSCO_PROJECT_ARGS"
-
-module load psi-python36/4.4.0
-module load gcc/4.8.5
-module load openmpi/3.1.3
-source activate pmsco3
-
-echo '================================================================================'
-echo "=== Running $0 at the following time and place:"
-date
-/bin/hostname
-cd $PMSCO_WORK_DIR
-pwd
-ls -lA
-#the intel compiler is currently not compatible with mpi4py. -mm 170131
-#echo
-#echo '================================================================================'
-#echo "=== Setting the environment to use Intel Cluster Studio XE 2016 Update 2 intel/16.2:"
-#cmd="source /opt/psi/Programming/intel/16.2/bin/compilervars.sh intel64"
-#echo $cmd
-#$cmd
-echo
-echo '================================================================================'
-echo "=== The environment is set as follows:"
-env
-echo
-echo '================================================================================'
-echo "BEGIN test"
-which mpirun
-cmd="mpirun /bin/hostname"
-echo $cmd
-$cmd
-echo "END test"
-echo
-echo '================================================================================'
-echo "BEGIN mpirun pmsco"
-echo
-
-cd "$PMSCO_WORK_DIR"
-cd pmsco
-echo "code revision"
-git log --pretty=tformat:'%h %ai %d' -1
-make -C pmsco all
-python -m compileall pmsco
-python -m compileall projects
-echo
-
-cd "$PMSCO_WORK_DIR"
-PMSCO_CMD="python pmsco/pmsco $PMSCO_PROJECT_FILE"
-PMSCO_ARGS="$PMSCO_PROJECT_ARGS"
-if [ -n "$PMSCO_SCAN_FILES" ]; then
-    PMSCO_ARGS="-s $PMSCO_SCAN_FILES $PMSCO_ARGS"
-fi
-if [ -n "$PMSCO_OUT" ]; then
-    PMSCO_ARGS="-o $PMSCO_OUT $PMSCO_ARGS"
-fi
-if [ "$PMSCO_WALLTIME_HR" -ge 1 ]; then
-    PMSCO_ARGS="-t $PMSCO_WALLTIME_HR $PMSCO_ARGS"
-fi
-if [ -n "$PMSCO_LOGLEVEL" ]; then
-    PMSCO_ARGS="--log-level $PMSCO_LOGLEVEL --log-file $PMSCO_JOBNAME.log $PMSCO_ARGS"
-fi
-
-# Do not use the OpenMPI specific options, like "-x LD_LIBRARY_PATH", with the Intel mpirun.
-cmd="mpirun $PMSCO_CMD $PMSCO_ARGS"
-echo $cmd
-$cmd
-echo "END mpirun pmsco"
-echo '================================================================================'
-cd "$PMSCO_WORK_DIR"
-rm -rf pmsco
-date
-ls -lAtr
-echo '================================================================================'
-
-exit 0
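The option assembly in the removed template (each option that is set gets prepended to the argument string) can be sketched in Python as below. This is an illustration under stated assumptions, not code from PMSCO: `build_command` and its parameter names are hypothetical, and the example project path is made up. Only the flags `-s`, `-o`, `-t`, `--log-level` and `--log-file` are taken from the template. Unlike the shell version, the sketch builds an argument list, so values are not subject to unquoted word splitting.

```python
import shlex


def build_command(project_file, project_args="", scan_files="", out="",
                  walltime_hr=0, loglevel="", jobname="job"):
    """Assemble the mpirun command line; every set option is prepended."""
    args = shlex.split(project_args)
    if scan_files:
        args = ["-s", *shlex.split(scan_files), *args]
    if out:
        args = ["-o", out, *args]
    if walltime_hr >= 1:
        args = ["-t", str(walltime_hr), *args]
    if loglevel:
        args = ["--log-level", loglevel,
                "--log-file", f"{jobname}.log", *args]
    return ["mpirun", "python", "pmsco/pmsco", project_file, *args]


# Hypothetical project file, used for illustration only.
cmd = build_command("projects/demo.py", out="demo", walltime_hr=2)
print(" ".join(cmd))
```

The returned list could be passed to `subprocess.run` directly, which avoids the `cmd="..."; $cmd` pattern of the shell template.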
--- a/bin/pmsco.ra.template
+++ b/bin/pmsco.ra.template
@@ -1,157 +0,0 @@
-#!/bin/bash
-#
-# Slurm script template for PMSCO calculations on the Ra cluster
-# based on run_mpi_HPL_nodes-2.sl by V. Markushin 2016-03-01
-#
-# Use:
-# - enter the appropriate parameters and save as a new file.
-# - call the sbatch command to pass the job script.
-#   request a specific number of nodes and tasks.
-#   example:
-#   sbatch --nodes=2  --ntasks-per-node=24 --time=02:00:00 run_pmsco.sl
-#
-# PMSCO arguments
-# copy this template to a new file, and set the arguments
-#
-# PMSCO_WORK_DIR
-#   path to be used as working directory.
-#   contains the script derived from this template.
-#   receives output and temporary files.
-#
-# PMSCO_PROJECT_FILE
-#   python module that declares the project and starts the calculation.
-#   must include the file path relative to $PMSCO_WORK_DIR.
-#
-# PMSCO_SOURCE_DIR
-#   path to the pmsco source directory
-#   (the directory which contains the bin, lib, pmsco sub-directories)
-#
-# PMSCO_SCAN_FILES
-#   list of scan files.
-#
-# PMSCO_OUT
-#   name of output file. should not include a path.
-#
-# all paths are relative to $PMSCO_WORK_DIR or (better) absolute.
-#
-#
-# Further arguments
-#
-# PMSCO_JOBNAME (required)
-#   the job name is the base name for output files.
-#
-# PMSCO_WALLTIME_HR (integer, required)
-#   wall time limit in hours. must be integer, minimum 1.
-#   this value is passed to PMSCO.
-#   it should specify the same amount of wall time as requested from the scheduler.
-#
-# PMSCO_MODE (optional)
-#   calculation mode: single, swarm, grid, gradient
-#
-# PMSCO_CODE (optional)
-#   calculation code: edac, msc, test
-#
-# PMSCO_LOGLEVEL (optional)
-#   request log level: DEBUG, INFO, WARNING, ERROR
-#   create a log file based on the job name.
-#
-# PMSCO_PROJECT_ARGS (optional)
-#   extra arguments that are parsed by the project module.
-#
-#SBATCH --job-name="_PMSCO_JOBNAME"
-#SBATCH --output="_PMSCO_JOBNAME.o.%j"
-#SBATCH --error="_PMSCO_JOBNAME.e.%j"
-
-PMSCO_WORK_DIR="_PMSCO_WORK_DIR"
-PMSCO_JOBNAME="_PMSCO_JOBNAME"
-PMSCO_WALLTIME_HR=_PMSCO_WALLTIME_HR
-
-PMSCO_PROJECT_FILE="_PMSCO_PROJECT_FILE"
-PMSCO_MODE="_PMSCO_MODE"
-PMSCO_CODE="_PMSCO_CODE"
-PMSCO_SOURCE_DIR="_PMSCO_SOURCE_DIR"
-PMSCO_SCAN_FILES="_PMSCO_SCAN_FILES"
-PMSCO_OUT="_PMSCO_JOBNAME"
-PMSCO_LOGLEVEL="_PMSCO_LOGLEVEL"
-PMSCO_PROJECT_ARGS="_PMSCO_PROJECT_ARGS"
-
-module load psi-python36/4.4.0
-module load gcc/4.8.5
-module load openmpi/3.1.3
-source activate pmsco3
-
-echo '================================================================================'
-echo "=== Running $0 at the following time and place:"
-date
-/bin/hostname
-cd $PMSCO_WORK_DIR
-pwd
-ls -lA
-#the intel compiler is currently not compatible with mpi4py. -mm 170131
-#echo
-#echo '================================================================================'
-#echo "=== Setting the environment to use Intel Cluster Studio XE 2016 Update 2 intel/16.2:"
-#cmd="source /opt/psi/Programming/intel/16.2/bin/compilervars.sh intel64"
-#echo $cmd
-#$cmd
-echo
-echo '================================================================================'
-echo "=== The environment is set as follows:"
-env
-echo
-echo '================================================================================'
-echo "BEGIN test"
-echo "=== Intel native mpirun will get the number of nodes and the machinefile from Slurm"
-which mpirun
-cmd="mpirun /bin/hostname"
-echo $cmd
-$cmd
-echo "END test"
-echo
-echo '================================================================================'
-echo "BEGIN mpirun pmsco"
-echo "Intel native mpirun will get the number of nodes and the machinefile from Slurm"
-echo
-echo "code revision"
-cd "$PMSCO_SOURCE_DIR"
-git log --pretty=tformat:'%h %ai %d' -1
-python -m compileall pmsco
-python -m compileall projects
-cd "$PMSCO_WORK_DIR"
-echo
-
-PMSCO_CMD="python $PMSCO_SOURCE_DIR/pmsco $PMSCO_PROJECT_FILE"
-PMSCO_ARGS="$PMSCO_PROJECT_ARGS"
-if [ -n "$PMSCO_SCAN_FILES" ]; then
-    PMSCO_ARGS="-s $PMSCO_SCAN_FILES $PMSCO_ARGS"
-fi
-if [ -n "$PMSCO_CODE" ]; then
-    PMSCO_ARGS="-c $PMSCO_CODE $PMSCO_ARGS"
-fi
-if [ -n "$PMSCO_MODE" ]; then
-    PMSCO_ARGS="-m $PMSCO_MODE $PMSCO_ARGS"
-fi
-if [ -n "$PMSCO_OUT" ]; then
-    PMSCO_ARGS="-o $PMSCO_OUT $PMSCO_ARGS"
-fi
-if [ "$PMSCO_WALLTIME_HR" -ge 1 ]; then
-    PMSCO_ARGS="-t $PMSCO_WALLTIME_HR $PMSCO_ARGS"
-fi
-if [ -n "$PMSCO_LOGLEVEL" ]; then
-    PMSCO_ARGS="--log-level $PMSCO_LOGLEVEL --log-file $PMSCO_JOBNAME.log $PMSCO_ARGS"
-fi
-
-which mpirun
-ls -l "$PMSCO_SOURCE_DIR"
-ls -l "$PMSCO_PROJECT_FILE"
-# Do not use the OpenMPI specific options, like "-x LD_LIBRARY_PATH", with the Intel mpirun.
-cmd="mpirun $PMSCO_CMD $PMSCO_ARGS"
-echo $cmd
-$cmd
-echo "END mpirun pmsco"
-echo '================================================================================'
-date
-ls -lAtr
-echo '================================================================================'
-
-exit 0
--- a/bin/pmsco.sge.template
+++ b/bin/pmsco.sge.template
@@ -1,178 +0,0 @@
-#!/bin/bash
-#
-# SGE script template for MSC calculations
-#
-# This script uses the tight integration of openmpi-1.4.5-gcc-4.6.3 in SGE
-# using the parallel environment (PE) "orte".
-# This script must be used only with qsub command - do NOT run it as a stand-alone
-# shell script because it will start all processes on the local node.
-#
-# PhD arguments
-# copy this template to a new file, and set the arguments
-#
-# PHD_WORK_DIR
-#   path to be used as working directory.
-#   contains the SGE script derived from this template.
-#   receives output and temporary files.
-#
-# PHD_PROJECT_FILE
-#   python module that declares the project and starts the calculation.
-#   must include the file path relative to $PHD_WORK_DIR.
-#
-# PHD_SOURCE_DIR
-#   path to the pmsco source directory
-#   (the directory which contains the bin, lib, pmsco sub-directories)
-#
-# PHD_SCAN_FILES
-#   list of scan files.
-#
-# PHD_OUT
-#   name of output file. should not include a path.
-#
-# all paths are relative to $PHD_WORK_DIR or (better) absolute.
-#
-#
-# Further arguments
-#
-# PHD_JOBNAME (required)
-#   the job name is the base name for output files.
-#
-# PHD_NODES (required)
-#   number of computing nodes (processes) to allocate for the job.
-#
-# PHD_WALLTIME_HR (required)
-#   wall time limit (hours)
-#
-# PHD_WALLTIME_MIN (required)
-#   wall time limit (minutes)
-#
-# PHD_MODE (optional)
-#   calculation mode: single, swarm, grid, gradient
-#
-# PHD_CODE (optional)
-#   calculation code: edac, msc, test
-#
-# PHD_LOGLEVEL (optional)
-#   request log level: DEBUG, INFO, WARNING, ERROR
-#   create a log file based on the job name.
-#
-# PHD_PROJECT_ARGS (optional)
-#   extra arguments that are parsed by the project module.
-#
-
-PHD_WORK_DIR="_PHD_WORK_DIR"
-PHD_JOBNAME="_PHD_JOBNAME"
-PHD_NODES=_PHD_NODES
-PHD_WALLTIME_HR=_PHD_WALLTIME_HR
-PHD_WALLTIME_MIN=_PHD_WALLTIME_MIN
-
-PHD_PROJECT_FILE="_PHD_PROJECT_FILE"
-PHD_MODE="_PHD_MODE"
-PHD_CODE="_PHD_CODE"
-PHD_SOURCE_DIR="_PHD_SOURCE_DIR"
-PHD_SCAN_FILES="_PHD_SCAN_FILES"
-PHD_OUT="_PHD_JOBNAME"
-PHD_LOGLEVEL="_PHD_LOGLEVEL"
-PHD_PROJECT_ARGS="_PHD_PROJECT_ARGS"
-
-# Define your job name, parallel environment with the number of slots, and run time:
-#$ -cwd
-#$ -N _PHD_JOBNAME.job
-#$ -pe orte _PHD_NODES
-#$ -l ram=2G
-#$ -l s_rt=_PHD_WALLTIME_HR:_PHD_WALLTIME_MIN:00
-#$ -l h_rt=_PHD_WALLTIME_HR:_PHD_WALLTIME_MIN:30
-#$ -V
-
-###################################################
-# Fix the SGE environment-handling bug (bash):
-source /usr/share/Modules/init/sh
-export -n -f module
-
-# Load the environment modules for this job (the order may be important):
-module load python/python-2.7.5
-module load gcc/gcc-4.6.3
-module load mpi/openmpi-1.4.5-gcc-4.6.3
-module load blas/blas-20110419-gcc-4.6.3
-module load lapack/lapack-3.4.2-gcc-4.6.3
-export LD_LIBRARY_PATH=$PHD_SOURCE_DIR/lib/:$LD_LIBRARY_PATH
-
-###################################################
-# Set the environment variables:
-MPIEXEC=$OPENMPI/bin/mpiexec
-# OPENMPI is set by the mpi/openmpi-* module.
-
-export OMP_NUM_THREADS=1
-export OMPI_MCA_btl='openib,sm,self'
-# export OMPI_MCA_orte_process_binding=core
-
-##############
-# BEGIN DEBUG
-# Print the SGE environment on master host:
-echo "================================================================"
-echo "=== SGE job  JOB_NAME=$JOB_NAME  JOB_ID=$JOB_ID"
-echo "================================================================"
-echo DATE=`date`
-echo HOSTNAME=`hostname`
-echo PWD=`pwd`
-echo "NSLOTS=$NSLOTS"
-echo "PE_HOSTFILE=$PE_HOSTFILE"
-cat $PE_HOSTFILE
-echo "================================================================"
-echo "Running environment:"
-env
-echo "================================================================"
-echo "Loaded environment modules:"
-module list 2>&1
-echo
-# END DEBUG
-##############
-
-##############
-# Setup
-cd "$PHD_SOURCE_DIR"
-python -m compileall .
-
-cd "$PHD_WORK_DIR"
-ulimit -c 0
-
-###################################################
-# The command to run with mpiexec:
-CMD="python $PHD_PROJECT_FILE"
-ARGS="$PHD_PROJECT_ARGS"
-
-if [ -n "$PHD_SCAN_FILES" ]; then
-    ARGS="-s $PHD_SCAN_FILES -- $ARGS"
-fi
-
-if [ -n "$PHD_CODE" ]; then
-    ARGS="-c $PHD_CODE $ARGS"
-fi
-
-if [ -n "$PHD_MODE" ]; then
-    ARGS="-m $PHD_MODE $ARGS"
-fi
-
-if [ -n "$PHD_OUT" ]; then
-    ARGS="-o $PHD_OUT $ARGS"
-fi
-
-if [ "$PHD_WALLTIME_HR" -ge 1 ]
-then
-    ARGS="-t $PHD_WALLTIME_HR $ARGS"
-else
-    ARGS="-t 0.5 $ARGS"
-fi
-
-if [ -n "$PHD_LOGLEVEL" ]; then
-    ARGS="--log-level $PHD_LOGLEVEL --log-file $PHD_JOBNAME.log $ARGS"
-fi
-
-# The MPI command to run:
-MPICMD="$MPIEXEC --prefix $OPENMPI -x PATH -x LD_LIBRARY_PATH -x OMP_NUM_THREADS -x OMPI_MCA_btl -np $NSLOTS $CMD $ARGS"
-echo "Command to run:"
-echo "$MPICMD"
-echo
-exec $MPICMD
-
-exit 0
--- a/bin/qpmsco.ra-git.sh
+++ b/bin/qpmsco.ra-git.sh
@ -1,145 +0,0 @@
-#!/bin/sh
-#
-# submission script for PMSCO calculations on the Ra cluster
-#
-# this version clones the current git repository at HEAD to the work directory.
-# thus, version conflicts between jobs are avoided.
-#
-
-if [ $# -lt 1 ]; then
-  echo "Usage: $0 [NOSUB] GIT_TAG DESTDIR JOBNAME NODES TASKS_PER_NODE WALLTIME:HOURS PROJECT [ARGS [ARGS [...]]]"
-  echo ""
-  echo "       NOSUB (optional): do not submit the script to the queue. default: submit."
-  echo "       GIT_TAG: git tag or branch name of the code. HEAD for current code."
-  echo "       DESTDIR: destination directory. must exist. a sub-dir \$JOBNAME is created."
-  echo "       JOBNAME (text): name of job. use only alphanumeric characters, no spaces."
-  echo "       NODES (integer): number of computing nodes. (1 node = 24 or 32 processors)."
-  echo "          do not specify more than 2."
-  echo "       TASKS_PER_NODE (integer): 1...24, or 32."
-  echo "          24 or 32 for full-node allocation."
-  echo "          1...23 for shared node allocation."
-  echo "       WALLTIME:HOURS (integer): requested wall time."
-  echo "          1...24 for day partition"
-  echo "          24...192 for week partition"
-  echo "          1...192 for shared partition"
-  echo "       PROJECT: python module (file path) that declares the project and starts the calculation."
-  echo "       ARGS (optional): any number of further PMSCO or project arguments (except time)."
-  echo ""
-  echo "the job script is written to \$DESTDIR/\$JOBNAME which is also the destination of calculation output."
-  exit 1
-fi
-
-# location of the pmsco package is derived from the path of this script
-SCRIPTDIR="$(dirname $(readlink -f $0))"
-SOURCEDIR="$(readlink -f $SCRIPTDIR/..)"
-PMSCO_SOURCE_DIR="$SOURCEDIR"
-
-# read arguments
-if [ "$1" == "NOSUB" ]; then
-  NOSUB="true"
-  shift
-else
-  NOSUB="false"
-fi
-
-if [ "$1" == "HEAD" ]; then
-    BRANCH_ARG=""
-else
-    BRANCH_ARG="-b $1"
-fi
-shift
-
-DEST_DIR="$1"
-shift
-
-PMSCO_JOBNAME=$1
-shift
-
-PMSCO_NODES=$1
-PMSCO_TASKS_PER_NODE=$2
-PMSCO_TASKS=$(expr $PMSCO_NODES \* $PMSCO_TASKS_PER_NODE)
-shift 2
-
-PMSCO_WALLTIME_HR=$1
-PMSCO_WALLTIME_MIN=$(expr $PMSCO_WALLTIME_HR \* 60)
-shift
-
-# select partition
-if [ $PMSCO_WALLTIME_HR -ge 25 ]; then
-    PMSCO_PARTITION="week"
-else
-    PMSCO_PARTITION="day"
-fi
-if [ $PMSCO_TASKS_PER_NODE -lt 24 ]; then
-    PMSCO_PARTITION="shared"
-fi
-
-PMSCO_PROJECT_FILE="$(readlink -f $1)"
-shift
-
-PMSCO_PROJECT_ARGS="$*"
-
-# set up working directory
-cd "$DEST_DIR"
-if [ ! -d "$PMSCO_JOBNAME" ]; then
-    mkdir "$PMSCO_JOBNAME"
-fi
-cd "$PMSCO_JOBNAME"
-WORKDIR="$(pwd)"
-PMSCO_WORK_DIR="$WORKDIR"
-
-# copy code
-PMSCO_SOURCE_REPO="file://$PMSCO_SOURCE_DIR"
-echo "$PMSCO_SOURCE_REPO"
-
-cd "$PMSCO_WORK_DIR"
-git clone $BRANCH_ARG --single-branch --depth 1 $PMSCO_SOURCE_REPO pmsco || exit
-cd pmsco
-PMSCO_REV=$(git log --pretty=format:"%h, %ai" -1) || exit
-cd "$WORKDIR"
-echo "$PMSCO_REV" > revision.txt
-
-# generate job script from template
-sed -e "s:_PMSCO_WORK_DIR:$PMSCO_WORK_DIR:g" \
-    -e "s:_PMSCO_JOBNAME:$PMSCO_JOBNAME:g" \
-    -e "s:_PMSCO_NODES:$PMSCO_NODES:g" \
-    -e "s:_PMSCO_WALLTIME_HR:$PMSCO_WALLTIME_HR:g" \
-    -e "s:_PMSCO_PROJECT_FILE:$PMSCO_PROJECT_FILE:g" \
-    -e "s:_PMSCO_PROJECT_ARGS:$PMSCO_PROJECT_ARGS:g" \
-    "$SCRIPTDIR/pmsco.ra-git.template" > $PMSCO_JOBNAME.job
-
-chmod u+x "$PMSCO_JOBNAME.job" || exit
-
-# request nodes and tasks
-#
-# The option --ntasks-per-node is meant to be used with the --nodes option.
-# (For the --ntasks option, the default is one task per node, use the --cpus-per-task option to change this default.)
-#
-# sbatch options
-# --cores-per-socket=16
-#   32 cores per node
-# --partition=[shared|day|week]
-# --time=8-00:00:00
-#   override default time limit (2 days in long queue)
-#   time formats: "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes", "days-hours:minutes:seconds"
-# --mail-type=ALL
-# --test-only
-#   check script but do not submit
-#
-SLURM_ARGS="--nodes=$PMSCO_NODES --ntasks-per-node=$PMSCO_TASKS_PER_NODE"
-
-if [ $PMSCO_TASKS_PER_NODE -gt 24 ]; then
-    SLURM_ARGS="--cores-per-socket=16 $SLURM_ARGS"
-fi
-
-SLURM_ARGS="--partition=$PMSCO_PARTITION $SLURM_ARGS"
-
-SLURM_ARGS="--time=$PMSCO_WALLTIME_HR:00:00 $SLURM_ARGS"
-
-CMD="sbatch $SLURM_ARGS $PMSCO_JOBNAME.job"
-echo $CMD
-if [ "$NOSUB" != "true" ]; then
-  $CMD
-fi
-
-exit 0
--- a/bin/qpmsco.ra.sh
+++ b/bin/qpmsco.ra.sh
@ -1,151 +0,0 @@
-#!/bin/sh
-#
-# submission script for PMSCO calculations on the Ra cluster
-#
-# CAUTION: the job will execute the pmsco code which is present in the directory tree
-#          of this script _at the time of job execution_, not submission!
-#          before changing the code, make sure that all pending jobs have started execution,
-#          otherwise you will experience version conflicts.
-#          it's better to use the qpmsco.ra-git.sh script which clones the code.
-
-if [ $# -lt 1 ]; then
-  echo "Usage: $0 [NOSUB] DESTDIR JOBNAME NODES TASKS_PER_NODE WALLTIME:HOURS PROJECT MODE [ARGS [ARGS [...]]]"
-  echo ""
-  echo "       NOSUB (optional): do not submit the script to the queue. default: submit."
-  echo "       DESTDIR: destination directory. must exist. a sub-dir \$JOBNAME is created."
-  echo "       JOBNAME (text): name of job. use only alphanumeric characters, no spaces."
-  echo "       NODES (integer): number of computing nodes. (1 node = 24 or 32 processors)."
-  echo "          do not specify more than 2."
-  echo "       TASKS_PER_NODE (integer): 1...24, or 32."
-  echo "          24 or 32 for full-node allocation."
-  echo "          1...23 for shared node allocation."
-  echo "       WALLTIME:HOURS (integer): requested wall time."
-  echo "          1...24 for day partition"
-  echo "          24...192 for week partition"
-  echo "          1...192 for shared partition"
-  echo "       PROJECT: python module (file path) that declares the project and starts the calculation."
-  echo "       MODE: PMSCO calculation mode (single|swarm|gradient|grid)."
-  echo "       ARGS (optional): any number of further PMSCO or project arguments (except mode and time)."
-  echo ""
-  echo "the job script is written to \$DESTDIR/\$JOBNAME which is also the destination of calculation output."
-  exit 1
-fi
-
-# location of the pmsco package is derived from the path of this script
-SCRIPTDIR="$(dirname $(readlink -f $0))"
-SOURCEDIR="$SCRIPTDIR/.."
-PMSCO_SOURCE_DIR="$SOURCEDIR"
-
-# read arguments
-if [ "$1" == "NOSUB" ]; then
-  NOSUB="true"
-  shift
-else
-  NOSUB="false"
-fi
-
-DEST_DIR="$1"
-shift
-
-PMSCO_JOBNAME=$1
-shift
-
-PMSCO_NODES=$1
-PMSCO_TASKS_PER_NODE=$2
-PMSCO_TASKS=$(expr $PMSCO_NODES \* $PMSCO_TASKS_PER_NODE)
-shift 2
-
-PMSCO_WALLTIME_HR=$1
-PMSCO_WALLTIME_MIN=$(expr $PMSCO_WALLTIME_HR \* 60)
-shift
-
-# select partition
-if [ $PMSCO_WALLTIME_HR -ge 25 ]; then
-    PMSCO_PARTITION="week"
-else
-    PMSCO_PARTITION="day"
-fi
-if [ $PMSCO_TASKS_PER_NODE -lt 24 ]; then
-    PMSCO_PARTITION="shared"
-fi
-
-PMSCO_PROJECT_FILE="$(readlink -f $1)"
-shift
-
-PMSCO_MODE="$1"
-shift
-
-PMSCO_PROJECT_ARGS="$*"
-
-# use defaults, override explicitly in PMSCO_PROJECT_ARGS if necessary
-PMSCO_SCAN_FILES=""
-PMSCO_LOGLEVEL=""
-PMSCO_CODE=""
-
-# set up working directory
-cd "$DEST_DIR"
-if [ ! -d "$PMSCO_JOBNAME" ]; then
-    mkdir "$PMSCO_JOBNAME"
-fi
-cd "$PMSCO_JOBNAME"
-WORKDIR="$(pwd)"
-PMSCO_WORK_DIR="$WORKDIR"
-
-# provide revision information, requires git repository
-cd "$SOURCEDIR"
-PMSCO_REV=$(git log --pretty=format:"%h, %ai" -1)
-if [ $? -ne 0 ]; then
-   PMSCO_REV="revision unknown, "$(date +"%F %T %z")
-fi
-cd "$WORKDIR"
-echo "$PMSCO_REV" > revision.txt
-
-# generate job script from template
-sed -e "s:_PMSCO_WORK_DIR:$PMSCO_WORK_DIR:g" \
-    -e "s:_PMSCO_JOBNAME:$PMSCO_JOBNAME:g" \
-    -e "s:_PMSCO_NODES:$PMSCO_NODES:g" \
-    -e "s:_PMSCO_WALLTIME_HR:$PMSCO_WALLTIME_HR:g" \
-    -e "s:_PMSCO_PROJECT_FILE:$PMSCO_PROJECT_FILE:g" \
-    -e "s:_PMSCO_PROJECT_ARGS:$PMSCO_PROJECT_ARGS:g" \
-    -e "s:_PMSCO_CODE:$PMSCO_CODE:g" \
-    -e "s:_PMSCO_MODE:$PMSCO_MODE:g" \
-    -e "s:_PMSCO_SOURCE_DIR:$PMSCO_SOURCE_DIR:g" \
-    -e "s:_PMSCO_SCAN_FILES:$PMSCO_SCAN_FILES:g" \
-    -e "s:_PMSCO_LOGLEVEL:$PMSCO_LOGLEVEL:g" \
-    "$SCRIPTDIR/pmsco.ra.template" > $PMSCO_JOBNAME.job
-
-chmod u+x "$PMSCO_JOBNAME.job"
-
-# request nodes and tasks
-#
-# The option --ntasks-per-node is meant to be used with the --nodes option.
-# (For the --ntasks option, the default is one task per node, use the --cpus-per-task option to change this default.)
-#
-# sbatch options
-# --cores-per-socket=16
-#   32 cores per node
-# --partition=[shared|day|week]
-# --time=8-00:00:00
-#   override default time limit (2 days in long queue)
-#   time formats: "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes", "days-hours:minutes:seconds"
-# --mail-type=ALL
-# --test-only
-#   check script but do not submit
-#
-SLURM_ARGS="--nodes=$PMSCO_NODES --ntasks-per-node=$PMSCO_TASKS_PER_NODE"
-
-if [ $PMSCO_TASKS_PER_NODE -gt 24 ]; then
-    SLURM_ARGS="--cores-per-socket=16 $SLURM_ARGS"
-fi
-
-SLURM_ARGS="--partition=$PMSCO_PARTITION $SLURM_ARGS"
-
-SLURM_ARGS="--time=$PMSCO_WALLTIME_HR:00:00 $SLURM_ARGS"
-
-CMD="sbatch $SLURM_ARGS $PMSCO_JOBNAME.job"
-echo $CMD
-if [ "$NOSUB" != "true" ]; then
-  $CMD
-fi
-
-exit 0
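
The partition selection in the removed `qpmsco.ra.sh` script above can be summarized in a few lines of Python. This is only a sketch of the shell logic as it appears in the diff; the function name `select_partition` is hypothetical and not part of PMSCO. Note that the script switches to the week partition at 25 hours, although its usage text lists 24 hours under both the day and week partitions.

```python
def select_partition(walltime_hours, tasks_per_node):
    """Mirror the partition choice of the removed submission script (sketch)."""
    # Fewer than 24 tasks per node means a shared-node allocation,
    # which overrides the wall-time based choice.
    if tasks_per_node < 24:
        return "shared"
    # The shell script tests `-ge 25` for the week partition.
    if walltime_hours >= 25:
        return "week"
    return "day"


if __name__ == "__main__":
    print(select_partition(24, 24))
    print(select_partition(48, 32))
    print(select_partition(100, 12))
```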
--- a/bin/qpmsco.sge
+++ b/bin/qpmsco.sge
@ -1,128 +0,0 @@
-#!/bin/sh
-#
-# submission script for PMSCO calculations on Merlin cluster
-#
-
-if [ $# -lt 1 ]; then
-  echo "Usage: $0 [NOSUB] JOBNAME NODES WALLTIME:HOURS PROJECT MODE [LOG_LEVEL]"
-  echo ""
-  echo "       NOSUB (optional): do not submit the script to the queue. default: submit."
-  echo "       WALLTIME:HOURS (integer): sets the wall time limits."
-  echo "          soft limit = HOURS:00:00"
-  echo "          hard limit = HOURS:00:30"
-  echo "          for short.q: HOURS = 0 (-> MINUTES=30)"
-  echo "          for all.q:   HOURS <= 24"
-  echo "          for long.q:  HOURS <= 96"
-  echo "       PROJECT: python module (file path) that declares the project and starts the calculation."
-  echo "       MODE: PMSCO calculation mode (single|swarm|gradient|grid)."
-  echo "       LOG_LEVEL (optional): one of DEBUG, INFO, WARNING, ERROR if log files should be produced."
-  echo ""
-  echo "the job script complete with the program code and input/output data is generated in ~/jobs/\$JOBNAME"
-  exit 1
-fi
-
-# location of the pmsco package is derived from the path of this script
-SCRIPTDIR="$(dirname $(readlink -f $0))"
-SOURCEDIR="$SCRIPTDIR/.."
-PHD_SOURCE_DIR="$SOURCEDIR"
-
-PHD_CODE="edac"
-
-# read arguments
-if [ "$1" == "NOSUB" ]; then
-  NOSUB="true"
-  shift
-else
-  NOSUB="false"
-fi
-
-PHD_JOBNAME=$1
-shift
-
-PHD_NODES=$1
-shift
-
-PHD_WALLTIME_HR=$1
-PHD_WALLTIME_MIN=0
-shift
-
-PHD_PROJECT_FILE="$(readlink -f $1)"
-PHD_PROJECT_ARGS=""
-shift
-
-PHD_MODE="$1"
-shift
-
-PHD_LOGLEVEL=""
-if [ "$1" == "DEBUG" ] || [ "$1" == "INFO" ] || [ "$1" == "WARNING" ] || [ "$1" == "ERROR" ]; then
-  PHD_LOGLEVEL="$1"
-  shift
-fi
-
-# ignore remaining arguments
-PHD_SCAN_FILES=""
-
-# select allowed queues
-QUEUE=short.q,all.q,long.q
-
-# for short queue (limit 30 minutes)
-if [ "$PHD_WALLTIME_HR" -lt 1 ]; then
-    PHD_WALLTIME_HR=0
-    PHD_WALLTIME_MIN=30
-fi
-
-# set up working directory
-cd ~
-if [ ! -d "jobs" ]; then
-    mkdir jobs
-fi
-cd jobs
-if [ ! -d "$PHD_JOBNAME" ]; then
-    mkdir "$PHD_JOBNAME"
-fi
-cd "$PHD_JOBNAME"
-WORKDIR="$(pwd)"
-PHD_WORK_DIR="$WORKDIR"
-
-# provide revision information, requires git repository
-cd "$SOURCEDIR"
-PHD_REV=$(git log --pretty=format:"%h, %ad" --date=iso -1)
-if [ $? -ne 0 ]; then
-   PHD_REV="revision unknown, "$(date +"%F %T %z")
-fi
-cd "$WORKDIR"
-echo "$PHD_REV" > revision.txt
-
-# generate job script from template
-sed -e "s:_PHD_WORK_DIR:$PHD_WORK_DIR:g" \
-    -e "s:_PHD_JOBNAME:$PHD_JOBNAME:g" \
-    -e "s:_PHD_NODES:$PHD_NODES:g" \
-    -e "s:_PHD_WALLTIME_HR:$PHD_WALLTIME_HR:g" \
-    -e "s:_PHD_WALLTIME_MIN:$PHD_WALLTIME_MIN:g" \
-    -e "s:_PHD_PROJECT_FILE:$PHD_PROJECT_FILE:g" \
-    -e "s:_PHD_PROJECT_ARGS:$PHD_PROJECT_ARGS:g" \
-    -e "s:_PHD_CODE:$PHD_CODE:g" \
-    -e "s:_PHD_MODE:$PHD_MODE:g" \
-    -e "s:_PHD_SOURCE_DIR:$PHD_SOURCE_DIR:g" \
-    -e "s:_PHD_SCAN_FILES:$PHD_SCAN_FILES:g" \
-    -e "s:_PHD_LOGLEVEL:$PHD_LOGLEVEL:g" \
-    "$SCRIPTDIR/pmsco.sge.template" > $PHD_JOBNAME.job
-
-chmod u+x "$PHD_JOBNAME.job"
-
-if [ "$NOSUB" != "true" ]; then
-
-# suppress bash error [stackoverflow.com/questions/10496758]
-unset module
-
-# submit the job script
-# EMAIL must be defined in the environment
-if [ -n "$EMAIL" ]; then
-  qsub -q $QUEUE -m ae -M $EMAIL $PHD_JOBNAME.job
-else
-  qsub -q $QUEUE $PHD_JOBNAME.job
-fi
-
-fi
-
-exit 0
--- a/docs/config.dox
+++ b/docs/config.dox
@ -32,7 +32,7 @@ DOXYFILE_ENCODING      = UTF-8
 # title of most generated pages and in a few other places.
 # The default value is: My Project.

-PROJECT_NAME           = "PEARL MSCO"
+PROJECT_NAME           = "PMSCO"

 # The PROJECT_NUMBER tag can be used to enter a project or revision number. This
 # could be handy for archiving the generated documentation or if some version
@ -765,8 +765,10 @@ src/concepts-tasks.dox \
 src/concepts-emitter.dox \
 src/concepts-atomscat.dox \
 src/installation.dox \
+src/project.dox \
 src/execution.dox \
 src/commandline.dox \
+src/runfile.dox \
 src/optimizers.dox \
                         ../pmsco \
                         ../projects \
@ -889,7 +891,7 @@ INPUT_FILTER           =
 # filters are used. If the FILTER_PATTERNS tag is empty or if none of the
 # patterns match the file name, INPUT_FILTER is applied.

-FILTER_PATTERNS        = *.py=/usr/bin/doxypy
+FILTER_PATTERNS        = *.py=./py_filter.sh

 # If the FILTER_SOURCE_FILES tag is set to YES, the input filter (if set using
 # INPUT_FILTER) will also be used to filter the input files that are used for
@ -2083,12 +2085,6 @@ EXTERNAL_GROUPS        = YES

 EXTERNAL_PAGES         = YES

-# The PERL_PATH should be the absolute path and name of the perl script
-# interpreter (i.e. the result of 'which perl').
-# The default file (with absolute path) is: /usr/bin/perl.
-
-PERL_PATH              = /usr/bin/perl
-
 #---------------------------------------------------------------------------
 # Configuration options related to the dot tool
 #---------------------------------------------------------------------------
@ -2102,15 +2098,6 @@ PERL_PATH              = /usr/bin/perl

 CLASS_DIAGRAMS         = YES

-# You can define message sequence charts within doxygen comments using the \msc
-# command. Doxygen will then run the mscgen tool (see:
-# http://www.mcternan.me.uk/mscgen/)) to produce the chart and insert it in the
-# documentation. The MSCGEN_PATH tag allows you to specify the directory where
-# the mscgen tool resides. If left empty the tool is assumed to be found in the
-# default search path.
-
-MSCGEN_PATH            = 
-
 # You can include diagrams made with dia in doxygen documentation. Doxygen will
 # then run dia to produce the diagram and insert it in the documentation. The
 # DIA_PATH tag allows you to specify the directory where the dia binary resides.
--- a/docs/py_filter.sh
+++ b/docs/py_filter.sh
@ -0,0 +1,2 @@
+#!/bin/bash
+python -m doxypypy.doxypypy -a -c $1
--- a/docs/readme.txt
+++ b/docs/readme.txt
@ -1,7 +1,17 @@
-to compile the source code documentation, you need the following packages (naming according to Debian):
+To compile the source code documentation in HTML format, 
+you need the following packages.
+They are available from Linux distributions unless noted otherwise.

+GNU make
 doxygen
-doxygen-gui (optional)
-doxypy
+python
+doxypypy (pip)
 graphviz
-latex (optional)
+java JRE
+plantuml (download from plantuml.com)
+
+Export the location of plantuml.jar in the PLANTUML_JAR_PATH environment variable.
+
+Go to the `docs` directory and execute `make html`.
+
+Open `docs/html/index.html` in your browser.
--- a/docs/src/commandline.dox
+++ b/docs/src/commandline.dox
@ -22,7 +22,7 @@ Do not include the extension <code>.py</code> or a trailing slash.
 Common args and project args are described below.


-\subsection sec_common_args Common Arguments
+\subsection sec_command_common Common Arguments

 All common arguments are optional and default to more or less reasonable values if omitted.
 They can be added to the command line in arbitrary order.
@ -34,7 +34,7 @@ The following table is ordered by importance.
 | -h , --help | | Display a command line summary and exit. |
 | -m , --mode | single (default), grid, swarm, genetic | Operation mode. |
 | -d, --data-dir | file system path | Directory path for experimental data files (if required by project). Default: current working directory. |
-| -o, --output-file | file system path | Base path and/or name for intermediate and output files. Default: pmsco_data |
+| -o, --output-file | file system path | Base path and/or name for intermediate and output files. Default: pmsco0 |
 | -t, --time-limit | decimal number | Wall time limit in hours. The optimizers try to finish before the limit. Default: 24.0. |
 | -k, --keep-files | list of file categories | Output file categories to keep after the calculation. Multiple values can be specified and must be separated by spaces. By default, cluster and model (simulated data) of a limited number of best models are kept. See @ref sec_file_categories below. |
 | --log-level | DEBUG, INFO, WARNING (default), ERROR, CRITICAL | Minimum level of messages that should be added to the log. |
@ -45,7 +45,7 @@ The following table is ordered by importance.
 | --table-file | file system path | Name of the model table file in table scan mode. |


-\subsubsection sec_file_categories File Categories
+\subsubsection sec_command_files File Categories

 The following category names can be used with the `--keep-files` option.
 Multiple names can be specified and must be separated by spaces.
@ -60,7 +60,7 @@ Multiple names can be specified and must be separated by spaces.
 | debug |      debug files |  delete |
 | model |       output files in ETPAI format: complete simulation  (a_-1_-1_-1_-1) | keep |
 | scan |       output files in ETPAI format: scan (a_b_-1_-1_-1) |  keep |
-| symmetry |   output files in ETPAI format: symmetry (a_b_c_-1_-1) |  delete |
+| domain |     output files in ETPAI format: domain (a_b_c_-1_-1) |  delete |
 | emitter |    output files in ETPAI format: emitter (a_b_c_d_-1) |  delete |
 | region |     output files in ETPAI format: region (a_b_c_d_e) |  delete |
 | report|      final report of results | keep always |
@ -79,7 +79,7 @@ you have to add the file categories that you want to keep, e.g.,
 Do not specify `rfac` alone as this will effectively not return any file.


-\subsection sec_project_args Project Arguments
+\subsection sec_command_project_args Project Arguments

 The following table lists a few recommended options that are handled by the project code.
 Project options that are not listed here should use the long form to avoid conflicts in future versions.
@ -90,7 +90,7 @@ Project options that are not listed here should use the long form to avoid confl
 | -s, --scans | project-dependent | Nick names of scans to use in calculation. The nick name selects the experimental data file and the initial state of the photoelectron. Multiple values can be specified and must be separated by spaces. |


-\subsection sec_scanfile Experimental Scan Files
+\subsection sec_command_scanfile Experimental Scan Files

 The recommended way of specifying experimental scan files is using nick names (dictionary keys) and the @c --scans option.
 A dictionary in the module code defines the corresponding file name, chemical species of the emitter and initial state of the photoelectron.
@ -99,7 +99,7 @@ This way, the file names and photoelectron parameters are versioned with the cod
 whereas command line arguments may easily get forgotten in the records.


-\subsection sec_project_example Argument Handling
+\subsection sec_command_example Argument Handling

 To handle command line arguments in a project module,
 the module must define a <code>parse_project_args</code> and a <code>set_project_args</code> function.
--- a/docs/src/concepts-emitter.dox
+++ b/docs/src/concepts-emitter.dox
@ -105,12 +105,12 @@ is assigned to the project's cluster_generator attribute.
 1. Implement a count_emitters method in your project class
   if the project uses more than one emitter configurations.
   It must have same method contract as pmsco.cluster.ClusterGenerator.count_emitters.
-   Specifically, it must return the number of emitter configurations of a given model, scan and symmetry.
+   Specifically, it must return the number of emitter configurations of a given model, scan and domain.
   If there is only one configuration, the method does not need to be implemented.

 2. Implement a create_cluster method in your project class.
   It must have same method contract as pmsco.cluster.ClusterGenerator.create_cluster.
-   Specifically, it must return a cluster.Cluster object for the given model, scan, symmetry and emitter configuration.
+   Specifically, it must return a cluster.Cluster object for the given model, scan, domain and emitter configuration.
   The emitter atoms must be marked according to the emitter configuration specified by the index argument.
   Note that, depending on the index.emit argument, all emitter atoms must be marked
   or only the ones of the corresponding emitter configuration.
--- a/docs/src/concepts-symmetry.dox
+++ b/docs/src/concepts-symmetry.dox
@ -1,32 +1,32 @@
-/*! @page pag_concepts_symmetry Symmetry
+/*! @page pag_concepts_domain Domain

-\section sec_symmetry Symmetry and Domain Averaging
+\section sec_domain Domain Averaging

-A _symmetry_ under PMSCO is a discrete variant of a set of calculation parameters (including the atomic cluster)
+A _domain_ under PMSCO is a discrete variant of a set of calculation parameters (including the atomic cluster)
 that is derived from the same set of model parameters
 and that contributes incoherently to the measured diffraction pattern.
-A symmetry may be represented by a special symmetry parameter which is not subject to optimization.
+A domain may be represented by special domain parameters that are not subject to optimization.

-For instance, a real sample may have additional rotational domains that are not present in the cluster,
-increasing the symmetry from three-fold to six-fold.
+For instance, a real sample may have rotational domains that are not present in the cluster,
+changing the symmetry from three-fold to six-fold.
 Or, an adsorbate may be present in a number of different lateral configurations on the substrate.
 In the first case, it may be sufficient to fold calculated data in the proper way to generate the same symmetry as in the measurement.
 In the latter case, it may be necessary to execute a scattering calculation for each possible orientation or a representative number of possible orientations.

-PMSCO provides the basic framework to spawn multiple calculations according to the number of symmetries (cf. \ref sec_tasks).
-The actual data reduction from multiple symmetries to one measurement needs to be implemented on the project level.
+PMSCO provides the basic framework to spawn multiple calculations according to the number of domains (cf. \ref sec_tasks).
+The actual data reduction from multiple domains to one measurement needs to be implemented on the project level.
 This section explains the necessary steps.

-1. Your project needs to populate the pmsco.project.Project.symmetries list.
-   For each symmetry, add a dictionary of symmetry parameters,  e.g. <code>{'angle_azi': 15.0}</code>.
-   There must be at least one symmetry in a project, otherwise no calculation is executed.
+1. Your project needs to populate the pmsco.project.Project.domains list.
+   For each domain, add a dictionary of domain parameters,  e.g. <code>{'angle_azi': 15.0}</code>.
+   At least one domain must be declared in a project, otherwise no calculation is executed.

-2. The project may apply the symmetry of a task to the cluster and parameter file if necessary.
-   The pmsco.project.Project.create_cluster and pmsco.project.Project.create_params methods receive the index of the particular symmetry in addition to the model parameters.
-
-3. The project combines the results of the calculations for the various symmetries into one dataset that can be compared to the measurement.
-   The default method implemented in pmsco.project.Project just adds up all calculations with equal weight.
-   If you need more control, you need to override the pmsco.project.Project.combine_symmetries method and implement your own algorithm.
+2. The project may use the domain index of a task to build the cluster and parameter file as necessary.
+   The pmsco.project.Project.create_cluster and pmsco.project.Project.create_params methods receive the index of the particular domain in addition to the model parameters.

+3. The project combines the results of the calculations for the various domains into one dataset that can be compared to the measurement.
+   The default method implemented in pmsco.project.Project just adds up all calculations with customizable weights.
+   It uses the special model parameters `wdom1`, `wdom2`, ... (if defined, default 1) to weight each domain.
+   If you need more control, override the pmsco.project.Project.combine_domains method and implement your own algorithm.

 */
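
The weighted domain sum described in step 3 of the new documentation text can be sketched as follows. This is a minimal illustration, not the PMSCO implementation: the function name `combine_domains_sketch`, the plain-list data layout, the mapping of `wdom1` to the first list entry, and the absence of any normalization are all assumptions made for the example.

```python
def combine_domains_sketch(domain_intensities, model_params):
    """Add up per-domain intensities, weighted by optional wdomN parameters.

    domain_intensities: list of equally long lists of floats, one per domain.
    model_params: dict of model parameters; a missing wdomN defaults to 1.
    """
    n_points = len(domain_intensities[0])
    total = [0.0] * n_points
    # Assumption: wdom1 belongs to the first entry, wdom2 to the second, etc.
    for number, intensity in enumerate(domain_intensities, start=1):
        weight = model_params.get("wdom{}".format(number), 1.0)
        for k, value in enumerate(intensity):
            total[k] += weight * value
    return total


if __name__ == "__main__":
    domains = [[1.0, 2.0], [3.0, 4.0]]
    print(combine_domains_sketch(domains, {"wdom2": 0.5}))
```

With no `wdomN` parameters in the model, the sketch reduces to an equally weighted sum of all domains.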
--- a/docs/src/concepts-tasks.dox
+++ b/docs/src/concepts-tasks.dox
@ -12,7 +12,7 @@ mandated by the project but also efficient calculations in a multi-process envir
   A concrete set of parameters is called @ref sec_task_model.
 2. The sample was measured multiple times or under different conditions (initial states, photon energy, emission angle).
   Each contiguous measured dataset is called a @ref sec_task_scan.
-3. The measurement averages over multiple inequivalent domains, cf. @ref sec_task_symmetry.
+3. The measurement averages over multiple inequivalent domains, cf. @ref sec_task_domain.
 4. The measurement includes multiple geometrically inequivalent emitters, cf. @ref sec_task_emitter.
 5. The calculation should be distributed over multiple processes that run in parallel to reduce the wall time, cf. @ref sec_task_region.

@ -24,7 +24,7 @@ as shown schematically in the following diagram.
 class CalculationTask {
 model
 scan
-symmetry
+domain
 emitter
 region
 ..
@ -55,7 +55,7 @@ class Scan {
    alphas
 }

-class Symmetry {
+class Domain {
    index
    ..
    rotation
@ -75,13 +75,13 @@ class Region {

 CalculationTask *-- Model
 CalculationTask *-- Scan
-CalculationTask *-- Symmetry
+CalculationTask *-- Domain
 CalculationTask *-- Emitter
 CalculationTask *-- Region

 class Project {
    scans
-    symmetries
+    domains
    model_handler
    cluster_generator
 }
@ -98,7 +98,7 @@ class ModelHandler {

 Model ..> ModelHandler
 Scan ..> Project
-Symmetry ..> Project
+Domain ..> Project
 Emitter ..> ClusterGenerator
 Region ..> Project

@ -141,29 +141,29 @@ PMSCO runs a separate calculation for each scan file and compares the combined r
 This is sometimes called a _global fit_.


-\subsection sec_task_symmetry Symmetry
+\subsection sec_task_domain Domain

-A _symmetry_ is a discrete variant of a set of calculation parameters (including the atomic cluster)
+A _domain_ is a discrete variant of a set of calculation parameters (including the atomic cluster)
 that is independent of the _model_ and contributes incoherently to the measured diffraction pattern.
 For instance, for a system that includes two inequivalent structural domains,
 two separate clusters have to be generated and calculated for each model.

-The symmetry parameter is not subject to optimization.
+The domain parameter is not subject to optimization.
 However, if the branching ratio is unknown a priori, a model parameter can be introduced
-to control the relative contribution of a particular symmetry to the diffraction pattern.
-In that case, the @ref pmsco.project.Project.combine_symmetries method must be overridden.
+to control the relative contribution of a particular domain to the diffraction pattern.
+The basic @ref pmsco.project.Project.combine_domains method reads the special model parameters `wdom1`, `wdom2`, etc. to weight the individual domains.

-A symmetry is identified by its index which is an index into the project's symmetries table (pmsco.project.Project.symmetries).
-It is up to the user project to give a physical description of the symmetry, e.g. a rotation angle,
-by assigning a meaningful value (e.g. a dictionary with key-value pairs) to the symmetries table.
+A domain is identified by its index which is an index into the project's domains table (pmsco.project.Project.domains).
+It is up to the user project to give a physical description of the domain, e.g. a rotation angle,
+by assigning a meaningful value (e.g. a dictionary with key-value pairs) to the domains table.
 The cluster generator can then read the value from the table rather than from constants in the code.

-The figure shows two examples of symmetry parameters.
-The corresponding symmetry table could be set up like this:
+The figure shows two examples of domain parameters.
+The corresponding domains table could be set up like this:

@code{.py}
-project.add_symmetry = {'rotation': 0.0, 'registry': 0.0}
-project.add_symmetry = {'rotation': 30.0, 'registry': 0.0}
+project.add_domain({'rotation': 0.0, 'registry': 0.0})
+project.add_domain({'rotation': 30.0, 'registry': 0.0})
@endcode
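
The weighting of domains by `wdom` parameters described above can be illustrated with a short sketch. This is not the code of `combine_domains`; it assumes that a parameter `wdom<index>` weights the domain with that index, that a missing parameter means weight 1, and that the result is normalized by the sum of weights. The function name `combine_domains_sketch` is hypothetical.

```python
def combine_domains_sketch(domain_curves, model):
    """Return the weighted average of per-domain modulation curves.

    domain_curves: list of equally long lists of floats, one list per domain,
        ordered by domain index.
    model: dict of model parameters. A key "wdom<index>" weights the domain
        with that index (assumption); domains without a key get weight 1.
    """
    totals = [0.0] * len(domain_curves[0])
    weight_sum = 0.0
    for index, curve in enumerate(domain_curves):
        weight = model.get("wdom{}".format(index), 1.0)
        weight_sum += weight
        for point, value in enumerate(curve):
            totals[point] += weight * value
    # Normalize so that the result does not scale with the number of domains.
    return [value / weight_sum for value in totals]


# Two domains; the second one contributes three times as much as the first.
curves = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
combined = combine_domains_sketch(curves, {"wdom1": 3.0})
print(combined)
```

Because the contributions are incoherent, the curves are added as intensities. A project that needs a different rule would override the combination method instead.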


@ -173,9 +173,9 @@ The _emitter_ component of the calculation task selects a specific emitter confi
 This is merely an index whose interpretation is up to the cluster generator.
 The default emitter handler enumerates the emitter index from 1 to the emitter count reported by the cluster generator.

-The emitter count and list of emitters may depend on model, scan and symmetry.
+The emitter count and list of emitters may depend on model, scan and domain.

-The cluster generator can tailor a cluster to the given model, scan, symmetry and emitter index.
+The cluster generator can tailor a cluster to the given model, scan, domain and emitter index.
 For example, in a large unit cell with many inequivalent emitters,
 the generator might return a small sub-cluster around the actual emitter for better calculation performance
 since the distant atoms of the unit cell do not contribute to the diffraction pattern.
@ -237,20 +237,20 @@ scan

 object ScanHandler

-object "Sym: CalculationTask" as Sym {
+object "Domain: CalculationTask" as Domain {
 index = (i,j,k,-1,-1)
 model
 scan
-symmetry
+domain
 }

-object "SymmetryHandler" as SymHandler
+object "DomainHandler" as DomainHandler

 object "Emitter: CalculationTask" as Emitter {
 index = (i,j,k,l,-1)
 model
 scan
-symmetry
+domain
 emitter
 }

@ -260,7 +260,7 @@ object "Region: CalculationTask" as Region {
 index = (i,j,k,l,m)
 model
 scan
-symmetry
+domain
 emitter
 region
 }
@ -270,14 +270,14 @@ object RegionHandler

 Root "1" o.. "1..*" Model
 Model "1" o.. "1..*" Scan
-Scan "1" o.. "1..*" Sym
-Sym "1" o.. "1..*" Emitter
+Scan "1" o.. "1..*" Domain
+Domain "1" o.. "1..*" Emitter
 Emitter "1" o.. "1..*" Region

 (Root, Model) .. ModelHandler
 (Model, Scan) .. ScanHandler
-(Scan, Sym) .. SymHandler
-(Sym, Emitter) .. EmitterHandler
+(Scan, Domain) .. DomainHandler
+(Domain, Emitter) .. EmitterHandler
 (Emitter, Region) .. RegionHandler

@enduml
@ -293,7 +293,7 @@ and the tasks are passed back through the task handler stack.
 In this phase, each level joins the datasets from the sub-tasks to the data requested by the parent task.
 For example, at the lowest level, one result file is present for each region.
 The region handler gathers all files that correspond to the same parent task
-(i.e. have the same emitter, symmetry, scan and model attributes),
+(i.e. have the same emitter, domain, scan and model attributes),
 joins them to one file which includes all regions,
 links the file to the parent task and passes the result to the next higher level.
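
The gathering step of the region handler can be sketched as a grouping by parent index. The sketch relies only on the index convention shown in the object diagram, where a region task has index `(i,j,k,l,m)` and its parent emitter task has `(i,j,k,l,-1)`. The function name and the file names are hypothetical, and the actual handler also joins the file contents, which is omitted here.

```python
from collections import defaultdict


def group_by_parent(region_results):
    """Group region result files by the index of their parent task.

    region_results: iterable of (index, filename) pairs, where index is a
        5-tuple (model, scan, domain, emitter, region).
    """
    groups = defaultdict(list)
    for index, filename in region_results:
        # The parent task has the same index with the region level unset (-1).
        parent_index = index[:-1] + (-1,)
        groups[parent_index].append(filename)
    return dict(groups)


results = [
    ((0, 0, 0, 1, 0), "job_0_0_0_1_0.dat"),
    ((0, 0, 0, 1, 1), "job_0_0_0_1_1.dat"),
    ((0, 0, 0, 2, 0), "job_0_0_0_2_0.dat"),
]
grouped = group_by_parent(results)
print(grouped)
```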

--- a/docs/src/concepts.dox
+++ b/docs/src/concepts.dox
@ -8,28 +8,30 @@ The code for a PMSCO job consists of the following components.

 skinparam componentStyle uml2

-component "project" as project
 component "PMSCO" as pmsco
+component "project" as project
 component "scattering code\n(calculator)" as calculator

 interface "command line" as cli
-interface "input files" as input
-interface "output files" as output
 interface "experimental data" as data
 interface "results" as results
+interface "output files" as output

+cli --> pmsco
 data -> project
-project ..> pmsco
+pmsco ..> project
 pmsco ..> calculator
-cli --> project
-input -> calculator
 calculator -> output
 pmsco -> results

@enduml

+The main entry point is the _PMSCO_ module.
+It implements a task loop to carry out the structural optimization
+and provides an interface between calculation programs and project-specific code.
+It also provides common utility classes and functions for handling project data.

-The _project_ consists of program code, system and experimental parameters
+The _project_ consists of program code and parameters
 that are specific to a particular experiment and calculation job.
 The project code reads experimental data, defines the parameter dictionary of the model,
 and contains code to generate the cluster, parameter and phase files for the scattering code.
@ -40,10 +42,6 @@ which accepts detailed input files
 (parameters, atomic coordinates, emitter specification, scattering phases)
 and outputs an intensity distribution of photoelectrons versus energy and/or angle.

-The _PMSCO core_ interfaces between the project and the calculator.
-It carries out the structural optimization and manages the calculation tasks.
-It generates and sends input files to the calculator and reads back the output.
-

 \section sec_control_flow Control flow

--- a/docs/src/dataflow.dot
+++ b/docs/src/dataflow.dot
@ -10,7 +10,7 @@ digraph G {
        create_params;
        calc_modf;
        calc_rfac;
-        comb_syms;
+        comb_doms;
        comb_scans;
    }
    */
@ -24,11 +24,11 @@ digraph G {
        model_handler -> model_creator [constraint=false, label="optimize"];
    }

-    subgraph cluster_symmetry {
-        label = "symmetry handler";
+    subgraph cluster_domain {
+        label = "domain handler";
        rank = same;
-        sym_creator [label="expand models", group=creators];
-        sym_handler [label="combine symmetries", group=handlers];
+        dom_creator [label="expand models", group=creators];
+        dom_handler [label="combine domains", group=handlers];
    }

    subgraph cluster_scan {
@ -47,15 +47,15 @@ digraph G {

    calculator [label="calculator (EDAC)", shape=box];

-    model_creator -> sym_creator [label="model", style=bold];
-    sym_creator -> scan_creator [label="models", style=bold];
+    model_creator -> dom_creator [label="model", style=bold];
+    dom_creator -> scan_creator [label="models", style=bold];
    scan_creator -> calc_creator [label="models", style=bold];
    calc_creator -> calculator [label="clusters,\rparameters", style=bold];

    calculator -> calc_handler [label="output files", style=bold];
    calc_handler -> scan_handler [label="raw data files", style=bold];
-    scan_handler -> sym_handler [label="combined scans", style=bold];
-    sym_handler -> model_handler [label="combined symmetries", style=bold];
+    scan_handler -> dom_handler [label="combined scans", style=bold];
+    dom_handler -> model_handler [label="combined domains", style=bold];

    mode [shape=parallelogram];
    mode -> model_creator [lhead="cluster_model"];
@ -76,8 +76,8 @@ digraph G {
    calc_rfac [shape=cds, label="R-factor function"];
    calc_rfac -> model_handler [style=dashed];

-    comb_syms [shape=cds, label="symmetry combination rule"];
-    comb_syms -> sym_handler [style=dashed];
+    comb_doms [shape=cds, label="domain combination rule"];
+    comb_doms -> dom_handler [style=dashed];

    comb_scans [shape=cds, label="scan combination rule"];
    comb_scans -> scan_handler [style=dashed];
--- a/docs/src/execution.dox
+++ b/docs/src/execution.dox
@ -2,10 +2,15 @@
 \section sec_run Running PMSCO

 To run PMSCO you need the PMSCO code and its dependencies (cf. @ref pag_install),
-a code module that contains the project-specific code,
+a customized code module that contains the project-specific code,
 and one or several files containing the scan parameters and experimental data.
 Please check the <code>projects</code> folder for examples of project modules.
-For a detailed description of the command line, see @ref pag_command.
+
+The run-time arguments can either be passed on the command line
+(@ref pag_command - the older and less flexible way)
+or in a JSON-formatted run-file
+(@ref pag_runfile - the recommended new and flexible way).
+For beginners, it's also possible to hard-code all project parameters in the custom project module.


 \subsection sec_run_single Single Process
@ -14,40 +19,28 @@ Run PMSCO from the command prompt:

@code{.sh}
 cd work-dir
-python pmsco-dir project-dir/project.py [pmsco-arguments] [project-arguments]
+python pmsco-dir -r run-file
@endcode

 where <code>work-dir</code> is the destination directory for output files,
 <code>pmsco-dir</code> is the directory containing the <code>__main__.py</code> file,
-<code>project.py</code> is the specific project module,
-and <code>project-dir</code> is the directory where the project file is located.
-PMSCO is run in one process which handles all calculations sequentially.
+<code>run-file</code> is a json-formatted configuration file that defines run-time parameters.
+The format and content of the run-file are described in a separate section (@ref pag_runfile).

-The command line arguments are divided into common arguments interpreted by the main pmsco code (pmsco.py),
-and project-specific arguments interpreted by the project module.
+In this form, PMSCO is run in one process which handles all calculations sequentially.

 Example command line for a single EDAC calculation of the two-atom project:
@code{.sh}
 cd work/twoatom
-python ../../pmsco ../../projects/twoatom/twoatom.py -s ea -o twoatom-demo -m single
+python ../../pmsco -r twoatom-hemi.json
@endcode

 This command line executes the main pmsco module <code>pmsco.py</code>.
-The main module loads the project file <code>twoatom.py</code> as a plug-in
-and starts processing the common arguments.
-The <code>twoatom.py</code> module contains only project-specific code
-with several defined entry-points called from the main module.
+The <code>twoatom-hemi.json</code> file specifies which project to load,
+along with all common and project-specific arguments.

-In the command line above, the <code>-o twoatom-demo</code> and <code>-m single</code> arguments
-are interpreted by the pmsco module.
-<code>-o</code> sets the base name of output files,
-and <code>-m</code> selects the operation mode to a single calculation.
-
-The scan argument is interpreted by the project module.
-It refers to a dictionary entry that declares the scan file, the emitting atomic species, and the initial state.
-In this example, the project looks for the <code>twoatom_energy_alpha.etpai</code> scan file in the project directory,
-and calculates the modulation function for a N 1s initial state.
-The kinetic energy and emission angles are contained in the scan file.
+This example can be run for testing.
+All necessary parameters and data files are included in the code repository.


 \subsection sec_run_parallel Parallel Processes
@ -61,29 +54,45 @@ The slave processes will run the scattering calculations, while the master coord
 and optimizes the model parameters (depending on the operation mode).

 For optimum performance, the number of processes should not exceed the number of available processors.
-To start a two-hour optimization job with multiple processes on an quad-core workstation with hyperthreading:
+To start an optimization job with multiple processes on a quad-core workstation with hyperthreading:
@code{.sh}
 cd work/my_project
-mpiexec -np 8 pmsco-dir/pmsco project-dir/project.py -o my_job_0001 -t 2 -m swarm
+mpiexec -np 8 --use-hwthread-cpus python pmsco-dir -r run-file
@endcode

+The `--use-hwthread-cpus` option may be necessary on certain hyperthreading architectures.
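
The single-process and parallel command lines differ only in the MPI launcher prefix. The following sketch assembles both variants from the pieces shown above (`python pmsco-dir -r run-file`, `mpiexec -np`, `--use-hwthread-cpus`). The helper name `build_pmsco_command` is hypothetical and not part of PMSCO.

```python
def build_pmsco_command(pmsco_dir, run_file, processes=1, use_hwthreads=False):
    """Return the argument list that starts PMSCO with a run file."""
    command = ["python", pmsco_dir, "-r", run_file]
    if processes > 1:
        # Parallel run: prepend the MPI launcher and the process count.
        launcher = ["mpiexec", "-np", str(processes)]
        if use_hwthreads:
            launcher.append("--use-hwthread-cpus")
        command = launcher + command
    return command


single = build_pmsco_command("pmsco-dir", "run-file")
parallel = build_pmsco_command("pmsco-dir", "run-file", processes=8, use_hwthreads=True)
print(" ".join(single))
print(" ".join(parallel))
```

The resulting list can be passed to `subprocess.run` from the work directory if a script should start the job.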
+

 \subsection sec_run_hpc High-Performance Cluster

-The script @c bin/qpmsco.ra.sh takes care of submitting a PMSCO job to the slurm queue of the Ra cluster at PSI.
-The script can be adapted to other machines running the slurm resource manager.
-The script generates a job script based on @c pmsco.ra.template,
-substituting the necessary environment and parameters,
-and submits it to the queue.
+PMSCO is ready to run with resource managers on cluster machines.
+Code for submitting jobs to the slurm queue of the Ra cluster at PSI is included in the pmsco.schedule module
+(see also the PEARL wiki pages in the PSI intranet).
+The job parameters are entered in a separate section of the run file, cf. @ref pag_runfile for details.
+Other machines can be supported by sub-classing pmsco.schedule.JobSchedule or pmsco.schedule.SlurmSchedule.

-Execute @c bin/qpmsco.ra.sh without arguments to see a summary of the arguments.
+If a schedule section is present and enabled in the run file,
+the following command will submit a job to the cluster machine
+rather than starting a calculation directly:

-To submit a job to the PSI clusters (see also the PEARL-Wiki page MscCalcRa),
-the analog command to the previous section would be:
@code{.sh}
-bin/qpmsco.ra.sh my_job_0001 1 8 2 projects/my_project/project.py swarm
+cd ~/pmsco
+python pmsco -r run-file.json
@endcode

+The command will copy the pmsco and project source trees as well as the run file and job script to a job directory
+under the output directory specified in the project section of the run file.
+The full path of the job directory is _output-dir/job-name_.
+The directory must be empty or must not exist when you run the above command.
+
+Be careful to specify correct project file paths.
+The output and data directories should be specified as absolute paths.
+
+The scheduling command will also load the project and scan files.
+Many parameter errors can, thus, be caught and fixed before the job is submitted to the queue.
+The run file also offers an option to stop just before submitting the job
+so that you can inspect the job files and submit the job manually.
+
 Be sure to consider the resource allocation policy of the cluster
 before you decide on the number of processes.
 Requesting less resources will prolong the run time but might increase the scheduling priority.
--- a/docs/src/installation.dox
+++ b/docs/src/installation.dox
@ -9,7 +9,7 @@ the public repository at https://gitlab.psi.ch/pearl/pmsco.
 For their own developments, users should clone the repository.
 Changes to common code should be submitted via pull requests.

-The program code of PMSCO and its external programs is written in Python, C++ and Fortran.
+The program code of PMSCO and its external programs is written in Python 3.6, C++ and Fortran.
 The code will run in any recent Linux environment on a workstation or in a virtual machine.
 Scientific Linux, CentOS7, [Ubuntu](https://www.ubuntu.com/)
 and [Lubuntu](http://lubuntu.net/) (recommended for virtual machine) have been tested.
@ -18,6 +18,7 @@ or cluster with 20-50 available processor cores is recommended.
 The program requires about 2 GB of RAM per process.

 The recommended IDE is [PyCharm (community edition)](https://www.jetbrains.com/pycharm).
+[Spyder](https://docs.spyder-ide.org/index.html) is a good alternative with a better focus on scientific data.
 The documentation in [Doxygen](http://www.stack.nl/~dimitri/doxygen/index.html) format is part of the source code.
 The Doxygen compiler can generate separate documentation in HTML or LaTeX.

@ -38,7 +39,7 @@ The code depends on the following libraries:
 - SWIG
 - BLAS
 - LAPACK
- Python 2.7 or 3.6
+- Python 3.6
 - Numpy >= 1.13
 - Python packages listed in the requirements.txt file

@ -50,12 +51,14 @@ and it's difficult to switch between different Python versions.
 On the PSI cluster machines, the environment must be set using the module system and conda (on Ra).
 Details are explained in the PEARL Wiki.

-PMSCO runs under Python 2.7 or Python 3.6.
-Since Python 2 is being deprecated, Python 3.6 is recommended.
-Compatibility with Python 2.7 is currently maintained by using
-the [future package](http://python-future.org/compatible_idioms.html)
-but may be dropped at any time.
+The following tools are required to compile the documentation:

+- doxygen
+- doxypypy
+- graphviz
+- Java
+- [plantUML](https://plantuml.com)
+- LaTeX (optional, generally not recommended)

 \subsection sec_install_instructions Instructions

@ -71,7 +74,6 @@ sudo apt install \
 binutils \
 build-essential \
 doxygen \
-doxypy \
 f2c \
 g++ \
 gcc \
@ -97,18 +99,20 @@ cd /usr/lib
 sudo ln -s /usr/lib/libblas/libblas.so.3 libblas.so
@endcode

-Install Miniconda according to their [instructions](https://conda.io/docs/user-guide/install/index.html),
+Download and install [Miniconda](https://conda.io/), 
 then configure the Python environment:

@code{.sh}
+wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
+bash ~/miniconda.sh
+
 conda create -q --yes -n pmsco python=3.6
-source activate pmsco
+conda activate pmsco
 conda install -q --yes -n pmsco \
    pip \
    "numpy>=1.13" \
    scipy \
    ipython \
-    mpi4py \
    matplotlib \
    nose \
    mock \
@ -116,21 +120,24 @@ conda install -q --yes -n pmsco \
    statsmodels \
    swig \
    gitpython
-pip install periodictable attrdict fasteners
+pip install periodictable attrdict commentjson fasteners mpi4py doxypypy
@endcode

+@note `mpi4py` should be installed via pip, _not_ conda.
+   conda might install its own MPI libraries, which can cause a conflict with system libraries.
+   (cf. [mpi4py forum](https://groups.google.com/forum/#!topic/mpi4py/xpPKcOO-H4k))
+
 \subsubsection sec_install_singularity Installation in Singularity container

-A [Singularity](https://www.sylabs.io/guides/2.5/user-guide/index.html) container
+A [Singularity](https://sylabs.io/singularity/) container
 contains all OS and Python dependencies for running PMSCO.
 Besides the Singularity executable, nothing else needs to be installed in the host system.
 This may be the fastest way to get PMSCO running.

-For installation of Singularity,
-see their [user guide](https://www.sylabs.io/guides/2.5/user-guide/installation.html).
-On newer Linux systems (e.g. Ubuntu 18.04), Singularity is available from the package manager.
-Installation in a virtual machine on Windows or Mac are straightforward
-thanks to the [Vagrant system](https://www.vagrantup.com/).
+To get started with Singularity,
+download it from [sylabs.io](https://www.sylabs.io/singularity/) and install it according to their instructions.
+On Windows, Singularity can be installed in a virtual machine using the [Vagrant](https://www.vagrantup.com/)
+script included under `extras/vagrant`.

 After installing Singularity,
 check out PMSCO as explained in the @ref sec_compile section:
@ -138,6 +145,7 @@ check out PMSCO as explained in the @ref sec_compile section:
@code{.sh}
 cd ~
 mkdir containers
+cd containers
 git clone git@git.psi.ch:pearl/pmsco.git pmsco
 cd pmsco
 git checkout master
@ -145,11 +153,14 @@ git checkout -b my_branch
@endcode

 Then, either copy a pre-built container into `~/containers`,
-or build one from a script provided by the PMSCO repository:
+or build one from the definition file included under extras/singularity.
+You may need to customize the definition file to match the host OS
+or to install compatible OpenMPI libraries,
+cf. the [Singularity user guide](https://sylabs.io/guides/3.7/user-guide/mpi.html).

@code{.sh}
 cd ~/containers
-sudo singularity build pmsco.simg ~/containers/pmsco/extras/singularity/singularity_python2
+sudo singularity build pmsco.sif ~/containers/pmsco/extras/singularity/singularity_python3
@endcode

 To work with PMSCO, start an interactive shell in the container and switch to the pmsco environment.
@ -157,8 +168,9 @@ Note that the PMSCO code is outside the container and can be edited with the usu

@code{.sh}
 cd ~/containers
-singularity shell pmsco.simg
-source activate pmsco
+singularity shell pmsco.sif
+. /opt/miniconda/etc/profile.d/conda.sh
+conda activate pmsco
 cd ~/containers/pmsco
 make all
 nosetests -w tests/
@ -170,16 +182,17 @@ Or call PMSCO from outside:
 cd ~/containers
 mkdir output
 cd output
-singularity run ../pmsco.simg python ~/containers/pmsco/pmsco path/to/your-project.py arg1 arg2 ...
+singularity run -e ../pmsco.sif ~/containers/pmsco/pmsco -r path/to/your-runfile
@endcode

 For parallel processing, prepend `mpirun -np X` to the singularity command as needed.
+Note that this requires "compatible" OpenMPI versions on the host and container to avoid runtime errors.


 \subsubsection sec_install_extra Additional Applications

 For working with the code and data, some other applications are recommended.
-The PyCharm IDE can be installed from the Ubuntu software center.
+The PyCharm IDE (community edition) can be installed from the Ubuntu software center.
 The following commands install other useful helper applications:

@code{.sh}
@ -189,10 +202,24 @@ gitg \
 meld
@endcode

-To produce documentation in PDF format (not recommended on virtual machine), install LaTeX:
+To compile the documentation, install the following tools.
+The basic documentation is in HTML format and can be opened in any internet browser.
+If you have a working LaTeX installation, a PDF document can be produced as well.
+It is not recommended to install LaTeX just for this documentation, however.

@code{.sh}
-sudo apt-get install texlive-latex-recommended
+sudo apt install \
+doxygen \
+graphviz \
+default-jre
+
+conda activate pmsco
+conda install -q --yes -n pmsco doxypypy
+
+wget -O plantuml.jar https://sourceforge.net/projects/plantuml/files/plantuml.jar/download
+sudo mkdir /opt/plantuml/
+sudo mv plantuml.jar /opt/plantuml/
+echo "export PLANTUML_JAR_PATH=/opt/plantuml/plantuml.jar" | sudo tee /etc/profile.d/pmsco-env.sh
@endcode


@ -252,7 +279,7 @@ mkdir work
 cd work
 mkdir twoatom
 cd twoatom/
-nice python ~/pmsco/pmsco ~/pmsco/projects/twoatom/twoatom.py -s ea -o twoatom_energy_alpha -m single
+nice python ~/pmsco/pmsco -r ~/pmsco/projects/twoatom/twoatom-energy.json
@endcode

 Runtime warnings may appear because the twoatom project does not contain experimental data.
--- a/docs/src/introduction.dox
+++ b/docs/src/introduction.dox
@ -24,28 +24,27 @@ Other programs may be integrated as well.

 - angle or energy scanned XPD.
 - various scanning modes including energy, polar angle, azimuthal angle, analyser angle.
- averaging over multiple symmetries (domains or emitters).
+- averaging over multiple domains and emitters.
 - global optimization of multiple scans.
- structural optimization algorithms: particle swarm optimization, grid search, gradient search.
+- structural optimization algorithms: genetic, particle swarm, grid search.
 - calculation of the modulation function.
 - calculation of the weighted R-factor.
 - automatic parallel processing using OpenMPI.


-\section sec_project Optimization Projects
+\section sec_intro_project Optimization Projects

 To set up a new optimization project, you need to:

 - create a new directory under projects.
 - create a new Python module in this directory, e.g., my_project.py.
 - implement a sub-class of project.Project in my_project.py.
- override the create_cluster, create_params, and create_domain methods.
- optionally, override the combine_symmetries and combine_scans methods.
+- override the create_cluster, create_params, and create_model_space methods.
+- optionally, override the combine_domains and combine_scans methods.
 - add a global function create_project to my_project.py.
 - provide experimental data files (intensity or modulation function).

-For details, see the documentation of the Project class,
-and the example projects.
+For details, see @ref pag_project, the documentation of the pmsco.project.Project class and the example projects.


 \section sec_intro_start Getting Started
@ -54,8 +53,9 @@ and the example projects.
  - @ref pag_concepts_tasks
  - @ref pag_concepts_emitter
 - @ref pag_install
+- @ref pag_project
 - @ref pag_run
- @ref pag_command
+- @ref pag_opt

 \section sec_license License Information

@ -70,6 +70,6 @@ These programs may not be used without an explicit agreement by the respective o

 \author    Matthias Muntwiler, <mailto:matthias.muntwiler@psi.ch>
 \version   This documentation is compiled from version $(REVISION).
-\copyright 2015-2019 by [Paul Scherrer Institut](http://www.psi.ch)
+\copyright 2015-2021 by [Paul Scherrer Institut](http://www.psi.ch)
 \copyright Licensed under the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0)
 */
--- a/docs/src/optimizers.dox
+++ b/docs/src/optimizers.dox
@ -3,28 +3,34 @@



-\subsection sec_opt_swarm Particle swarm
+\subsection sec_opt_swarm Particle swarm optimization (PSO)

-The particle swarm algorithm is adapted from
+The particle swarm optimization (PSO) algorithm seeks to find a global optimum in a multi-dimensional model space
+by employing the _swarm intelligence_ of a number of particles traversing space,
+each at its own velocity and direction,
+but adjusting its trajectory based on its own experience and the results of its peers.
+
+The PSO algorithm is adapted from
 D. A. Duncan et al., Surface Science 606, 278 (2012).
+It is implemented in the @ref pmsco.optimizers.swarm module.

-The general parameters of the genetic algorithm are specified in the @ref Project.optimizer_params dictionary.
+The general parameters of the algorithm are specified in the @ref Project.optimizer_params dictionary.
 Some of them can be changed on the command line.

 | Parameter | Command line | Range | Description |
 | --- | --- | --- | --- |
-| pop_size | --pop-size | &ge; 1 | |
+| pop_size | --pop-size | &ge; 1 | Recommended 20..50 |
 | position_constrain_mode | | default bounce | Resolution of domain limit violations. |
 | seed_file | --seed-file | a file path, default none | |
 | seed_limit | --seed-limit | 0..pop_size | |
 | rfac_limit | | 0..1, default 0.8 | Accept only seed values that have a lower R-factor. |
 | recalc_seed | | True or False, default True | |

-The domain parameters have the following meanings:
+The model space attributes have the following meaning:

 | Parameter | Description |
 | --- | --- |
-| start | Seed model. The start values are copied into particle 0 of the initial population. |
+| start | Start value of particle 0 in first iteration. |
 | min | Lower limit of the parameter range. |
 | max | Upper limit of the parameter range. |
 | step | Not used. |
@ -32,23 +38,23 @@ The domain parameters have the following meanings:

 \subsubsection sec_opt_seed Seeding a population

-By default, one particle is initialized with the start value declared in the parameter domain,
-and the other are set to random values within the domain.
+By default, one particle is initialized with the start value declared with the model space,
+and the other ones are initialized at random positions in the model space.
 You may initialize more particles of the population with specific values by providing a seed file.

 The seed file must have a similar format as the result `.dat` files
 with a header line specifying the column names and data rows containing the values for each particle.
 A good practice is to use a previous `.dat` file and remove unwanted rows.
-To continue an interrupted optimization,
-the `.dat` file from the previous optimization can be used as is.
+The `.dat` file from a previous optimization job can be used as is to continue the optimization,
+also in a different optimization mode.

 The seeding procedure can be tweaked by several optimizer parameters (see above).
 PMSCO normally loads the first rows up to population size - 1 or up to the `seed_limit` parameter,
 whichever is lower.
 If an `_rfac` column is present, the file is first sorted by R-factor and only the best models are loaded.
-Models that resulted in an R-factor above the `rfac_limit` parameter are always ignored.
+Models that resulted in an R-factor above the `rfac_limit` parameter are ignored in any case.
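
The selection rules of this paragraph can be summarized in a short sketch. It assumes that the seed file has already been parsed into a list of dictionaries (one per row) and that a model whose R-factor equals `rfac_limit` is still accepted. The function name `select_seeds` is hypothetical; the actual loader may differ in details.

```python
def select_seeds(rows, pop_size, seed_limit, rfac_limit=0.8):
    """Pick the seed models that would be loaded into the population.

    rows: list of dicts, one per data row of the seed file.
    """
    if rows and all("_rfac" in row for row in rows):
        # Best models first, then drop those above the R-factor limit.
        rows = sorted(rows, key=lambda row: row["_rfac"])
        rows = [row for row in rows if row["_rfac"] <= rfac_limit]
    # One particle is reserved for the start value of the model space.
    count = max(min(pop_size - 1, seed_limit), 0)
    return rows[:count]


seed_rows = [
    {"_rfac": 0.9, "a": 1.0},
    {"_rfac": 0.2, "a": 2.0},
    {"_rfac": 0.5, "a": 3.0},
    {"_rfac": 0.4, "a": 4.0},
]
seeds = select_seeds(seed_rows, pop_size=3, seed_limit=3)
print(seeds)
```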

-During the optimization process, all models loaded from the seed file are normally re-calculated.
+In the first iteration of the optimization run, the models loaded from the seed file are re-calculated by default.
 This may waste CPU time if the calculation is run under the same conditions
 and would result in exactly the same R-factor,
 as is the case if the seed is used to continue a previous optimization, for example.
@ -58,25 +64,26 @@ and PMSCO will use the R-factor value from the seed file rather than calculating

 \subsubsection sec_opt_patch Patching a running optimization

-While an optimization process is running, the user can manually patch the population with arbitrary values,
+While an optimization job is running, the user can manually patch the population with arbitrary values,
 for instance, to kick the population out of a local optimum or to drive it to a less sampled parameter region.
 To patch a running population, prepare a population file named `pmsco_patch.pop` and copy it to the work directory.

-The file must have a similar format as the result `.dat` files
+The patch file must have the same format as the result `.dat` files
 with a header line specifying the column names and data rows containing the values.
 It should contain as many rows as particles to be patched but not more than the size of the population.
-The columns must include a `_particle` column which specifies the particle to patch
-as well as the model parameters to be changed.
+The columns must include a `_particle` column and the model parameters to be changed.
+The `_particle` column specifies the index of the particle that is patched (ranging from 0 to population size - 1).
 Parameters that should remain unaffected can be left out,
 extra columns including `_gen`, `_rfac` etc. are ignored.

 PMSCO checks the file for syntax errors and ignores it if errors are present.
-Parameter values that lie outside the domain boundary are ignored.
+Individual parameter values that lie outside the model space boundary are silently ignored.
 Successful or failed patching is logged at warning level.
-The patch file is re-applied whenever its time stamp has changed.
+PMSCO keeps track of the time stamp of the file and re-applies the patch whenever the time stamp has changed.
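
The patching rules can be sketched as follows, assuming that the patch file has been parsed into a list of dictionaries and that the population is a list of parameter dictionaries indexed by particle number. The function name `apply_patch` and the `limits` argument are hypothetical; the sketch only mirrors the rules stated above (control columns ignored, missing parameters unchanged, out-of-range values skipped).

```python
def apply_patch(population, patch_rows, limits):
    """Overwrite selected parameters of selected particles in place.

    population: list of dicts, one per particle, mapping parameter to value.
    patch_rows: list of dicts, each with a "_particle" key and new values.
    limits: dict mapping parameter name to a (minimum, maximum) tuple.
    """
    for row in patch_rows:
        particle = population[int(row["_particle"])]
        for name, value in row.items():
            if name not in limits:
                # Control columns such as _particle, _gen or _rfac.
                continue
            minimum, maximum = limits[name]
            if minimum <= value <= maximum:
                particle[name] = value
    return population


swarm = [{"a": 1.0, "b": 2.0}, {"a": 3.0, "b": 4.0}]
patch = [{"_particle": 1, "a": 9.0, "b": 4.5, "_rfac": 0.3}]
apply_patch(swarm, patch, {"a": (0.0, 5.0), "b": (0.0, 5.0)})
print(swarm)
```

In this example, the value of `a` lies outside its range and is skipped, while `b` is patched.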

-\attention Do not edit the patch file in the working directory
-to prevent it from being read in an unfinished state or multiple times.
+\attention Since each change of time stamp may trigger patching,
+do not edit the patch file in the working directory
+to prevent it from being read in an unfinished state or multiple times!


 \subsection sec_opt_genetic Genetic optimization
@ -103,7 +110,7 @@ Some of them can be changed on the command line.

 | Parameter | Command line | Range | Description |
 | --- | --- | --- | --- |
-| pop_size | --pop-size | &ge; 1 | |
+| pop_size | --pop-size | &ge; 1 | Recommended 10..40 |
 | mating_factor | | 1..pop_size, default 4 | |
 | strong_mutation_probability | | 0..1, default 0.01 | Probability that a parameter undergoes a strong mutation. |
 | weak_mutation_probability | | 0..1, default 1 | Probability that a parameter undergoes a weak mutation. This parameters should be left at 1. Lower values tend to produce discrete parameter values. Weak mutations can be tuned by the step domain parameters. |
@ -113,7 +120,7 @@ Some of them can be changed on the command line.
 | rfac_limit | | 0..1, default 0.8 | Accept only seed values that have a lower R-factor. |
 | recalc_seed | | True or False, default True | |

-The domain parameters have the following meanings:
+The model space attributes have the following meaning:

 | Parameter | Description |
 | --- | --- |
@ -129,7 +136,11 @@ cf. sections @ref sec_opt_seed and @ref sec_opt_swarm.
 \subsection sec_opt_grid Grid search

 The grid search algorithm samples the parameter space at equidistant steps.
-The order of calculations is randomized so that distant parts of the parameter space are sampled at an early stage.
+It is implemented in the @ref pmsco.optimizers.grid module.
+
+
+The model space attributes have the following meaning.
+The order of calculations is random so that results from different parts of the model space become available early.

 | Parameter | Description |
 | --- | --- |
@ -149,15 +160,19 @@ The table scan calculates models from an explicit table of model parameters.
 It can be used to recalculate models from a previous optimization run on other experimental data,
 as an interface to external optimizers,
 or as a simple input of manually edited model parameters.
+It is implemented in the @ref pmsco.optimizers.table module.

 The table can be stored in an external file that is specified on the command line,
 or supplied in one of several forms by the custom project class.
 The table can be left unchanged during the calculations,
 or new models can be added on the go.
+Duplicate models are ignored.

-@attention Because it is not easily possible to know when and which models have been read from the table file, if you do modify the table file during processing, pay attention to the following hints:
-1. The file on disk must not be locked for more than a second. Do not keep the file open unnecessarily.
-2. _Append_ new models to the end of the table rather than overwriting previous ones. Otherwise, some models may be lost before they have been calculated.
+@attention Because it is not easily possible to know when the table file is read,
+observe the following rules if you modify the table file while calculations are running:
+1. Do not keep the file locked for longer than a second.
+2. Append new models to the end of the table rather than overwriting previous ones.
+3. Delete lines only if you're sure that they are not needed any more.

 The general parameters of the table scan are specified in the @ref Project.optimizer_params dictionary.
 Some of them can be changed on the command line or in the project class (depending on how the project class is implemented).
@ -167,7 +182,7 @@ Some of them can be changed on the command line or in the project class (dependi
 | pop_size | --pop-size | &ge; 1 | Number of models in a generation (calculated in parallel). In table mode, this parameter is not so important and can be left at the default. It has nothing to do with table size. |
 | table_file | --table-file | a file path, default none | |

-The domain parameters have the following meanings.
+The model space attributes have the following meaning.
 Models that violate the parameter range are not calculated.

 | Parameter | Description |
--- a/docs/src/project.dox
+++ b/docs/src/project.dox
@ -0,0 +1,454 @@
+/*! @page pag_project Setting up a new project
+\section sec_project Setting Up a New Project
+
+This topic guides you through the setup of a new project.
+Be sure to check out the examples in the projects folder
+and the code documentation as well.
+
+The basic steps are:
+
+1. Create a new folder under `projects`.
+2. In the new folder, create a Python module for the project (subsequently called _the project module_).
+3. In the project module, define a cluster generator class which derives from pmsco.cluster.ClusterGenerator.
+4. In the project module, define a project class which derives from pmsco.project.Project.
+5. In the same folder as the project module, create a JSON run-file.
+
+\subsection sec_project_module Project Module
+
+A skeleton of the project module file (with some common imports) may look like this:
+
+~~~~~~{.py}
+import logging
+import math
+import numpy as np
+import periodictable as pt
+from pathlib import Path
+
+import pmsco.cluster
+import pmsco.data
+import pmsco.dispatch
+import pmsco.elements.bindingenergy
+import pmsco.project
+
+logger = logging.getLogger(__name__)
+
+
+class MyClusterGenerator(pmsco.cluster.ClusterGenerator):
+    def create_cluster(self, model, index):
+        clu = pmsco.cluster.Cluster()
+        # ...
+        return clu
+
+    def count_emitters(self, model, index):
+        # ...
+        return 1
+
+
+class MyProject(pmsco.project.Project):
+    def __init__(self):
+        super().__init__()
+        # ...
+        self.cluster_generator = MyClusterGenerator(self)
+
+    def create_model_space(self):
+        spa = pmsco.project.ModelSpace()
+        # ...
+        return spa
+
+    def create_params(self, model, index):
+        par = pmsco.project.CalculatorParams()
+        # ...
+        return par
+~~~~~~
+
+The main purpose of the `MyProject` class is to bundle the project-specific calculation parameters and code.
+The purpose of the `MyClusterGenerator` class is to produce atomic clusters as a function of a number of model parameters.
+For the project to be useful, some of the methods in the skeleton above need to be implemented.
+The individual methods are discussed in the following.
+Further descriptions can be found in the documentation of the code.
+
+\subsection sec_project_cluster Cluster Generator
+
+The cluster generator is a project-specific Python object that produces a cluster, i.e., a list of atomic coordinates,
+based on a small number of model parameters whenever PMSCO requires it.
+The most important member of a cluster generator is its `create_cluster` method.
+At least this method must be implemented for a functional cluster generator.
+
+A generic `count_emitters` method is implemented in the base class.
+It needs to be overridden if you want to use parallel calculation of multiple emitters.
+
+\subsubsection sec_project_cluster_create Cluster Definition
+
+The `create_cluster` method takes the model parameters (a dictionary)
+and the task index (a pmsco.dispatch.CalcID, cf. @ref pag_concepts_tasks) as arguments.
+Given these arguments, it must create and fill a pmsco.cluster.Cluster object.
+See pmsco.cluster.ClusterGenerator.create_cluster for details on the method contract.
+
+As an example, have a look at the following simplified excerpt from the twoatom demo project.
+
+~~~~~~{.py}
+    def create_cluster(self, model, index):
+        # access model parameters
+        # dAB - distance between atoms in Angstroms
+        # th - polar angle in degrees
+        # ph - azimuthal angle in degrees
+        r = model['dAB']
+        th = math.radians(model['th'])
+        ph = math.radians(model['ph'])
+
+        # prepare a cluster object
+        clu = pmsco.cluster.Cluster()
+        # the comment line is optional but can be useful
+        clu.comment = "{0} {1}".format(self.__class__, index)
+        # set the maximum radius of the cluster (outliers will be ignored)
+        clu.set_rmax(r * 2.0)
+
+        # calculate atomic vectors
+        dx = r * math.sin(th) * math.cos(ph)
+        dy = r * math.sin(th) * math.sin(ph)
+        dz = r * math.cos(th)
+        a_top = np.array((0.0, 0.0, 0.0))
+        a_bot = np.array((-dx, -dy, -dz))
+
+        # add an oxygen atom at a_top position and mark it as emitter
+        clu.add_atom('O', a_top, 1)
+        # add a copper atom at a_bot position
+        clu.add_atom('Cu', a_bot, 0)
+
+        # pass the created cluster to the calculator
+        return clu
+~~~~~~
+
+In this example, two atoms are added to the cluster.
+The pmsco.cluster.Cluster class provides several methods to simplify the task,
+such as adding layers or bulk regions, rotation, translation, trim, emitter selection, etc.
+Please refer to the documentation of its code for details.
+It may also be instructive to have a look at the demo projects.
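+
+The coordinate arithmetic of the example can be checked in isolation.
+The following sketch uses only the standard library; `bond_vector` is a hypothetical helper name,
+and the numbers are the start values from the model space example on this page.
+

```python
import math


def bond_vector(r, theta_deg, phi_deg):
    """Cartesian vector of length r at polar angle theta and azimuthal angle phi (degrees)."""
    th = math.radians(theta_deg)
    ph = math.radians(phi_deg)
    return (
        r * math.sin(th) * math.cos(ph),
        r * math.sin(th) * math.sin(ph),
        r * math.cos(th),
    )


# start values: dAB = 2.10 Angstrom, th = 15 degrees, ph = 90 degrees
dx, dy, dz = bond_vector(2.10, 15.0, 90.0)
# the scatterer sits below the emitter, which is at the origin
a_bot = (-dx, -dy, -dz)
# the length of the vector must reproduce the bond distance
length = math.sqrt(sum(c * c for c in a_bot))
```

+
+Whatever the angles, the length of `a_bot` equals `dAB`, which is a convenient sanity check for a new cluster generator.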
+
+The main purposes of the cluster object are to store an array of atoms and to read/write cluster files in a variety of formats.
+For each atom, the following properties are stored:
+
+- sequential atom index (1-based, maintained by cluster code)
+- atom type (chemical element number)
+- chemical element symbol from periodic table
+- x coordinate of the atom position
+- y coordinate of the atom position
+- z coordinate of the atom position
+- emitter flag (0 = scatterer, 1 = emitter, default 0)
+- charge/ionicity (units of elementary charge, default 0)
+- scatterer class (default 0)
+
+All of these properties except the scatterer class can be set by the add methods of the cluster.
+The scatterer class is used internally by the atomic scattering factor calculators.
+Whether the charge/ionicity is used depends on the particular calculator. EDAC, for instance, does not use it.
+
+Note: You do not need to worry about how many emitters a calculator allows,
+or whether the emitter needs to be at the origin or the first place of the array.
+These technical aspects are handled by PMSCO code transparently.
+
+\subsubsection sec_project_cluster_domains Domains
+
+Domains refer to regions of inequivalent structure in the probing region.
+This may include regions of different orientation, different lattice constant, or even different structure.
+The cluster methods can read the selected domain from the `index.domain` argument.
+This is an index into the pmsco.project.Project.domains list where each item is a dictionary
+that holds additional, invariable structural parameters.
+
+Rotational domains are a common case.
+In this case, the list of domains may look like `[{"zrot": 0.0}, {"zrot": 60.0}]`, for example,
+and the `create_cluster` method would include additional code to rotate the cluster:
+
+~~~~~~{.py}
+    def create_cluster(self, model, index):
+        # filling atoms here
+        # ...
+
+        dom = self.domains[index.domain]
+        try:
+            z_rot = dom['zrot']
+        except KeyError:
+            z_rot = 0.0
+        if z_rot:
+            clu.rotate_z(z_rot)
+
+        # selecting emitters
+        # ...
+
+        return clu
+~~~~~~
+
+Depending on the complexity of the system, it may, however, be necessary to write a specific sub-routine for each domain.
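+
+For illustration, the domain lookup and the rotation can be sketched without PMSCO.
+The functions below are hypothetical stand-ins that operate on plain coordinate tuples.
+The counter-clockwise sense of rotation is an assumption of this sketch
+and should be checked against the documentation of `rotate_z` in pmsco.cluster.Cluster.
+

```python
import math


def rotate_z(positions, angle_deg):
    """Rotate (x, y, z) tuples counter-clockwise about the z axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in positions]


def apply_domain(positions, domain):
    """Apply the optional 'zrot' entry of a domain dictionary."""
    # dict.get has the same effect as the try/except KeyError idiom
    z_rot = domain.get("zrot", 0.0)
    return rotate_z(positions, z_rot) if z_rot else list(positions)


domains = [{"zrot": 0.0}, {"zrot": 60.0}]
atoms = [(1.0, 0.0, 0.0), (0.0, 0.0, -2.0)]
rotated = [apply_domain(atoms, dom) for dom in domains]
```

+
+Atoms on the rotation axis, like the second atom here, are unaffected by the domain rotation.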
+
+The pmsco.project.Project class includes generic code to add intensities of domains incoherently (cf. pmsco.project.Project.combine_domains).
+If the model space contains parameters 'wdom0', 'wdom1', etc.,
+these parameters are interpreted as weights of domain 0, 1, etc.
+One domain must have a fixed weight to avoid correlated parameters.
+Typically, 'wdom0' is left undefined and defaults to 1.
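+
+The idea of a weighted incoherent sum can be sketched as follows.
+This is not the code of pmsco.project.Project.combine_domains.
+The function name `weighted_domain_sum` is hypothetical,
+and two details are assumptions of the sketch: every undefined weight defaults to 1, and the sum is not normalized.
+

```python
def weighted_domain_sum(model, domain_intensities):
    """Add per-domain intensity lists, each scaled by its 'wdomN' weight."""
    n_points = len(domain_intensities[0])
    total = [0.0] * n_points
    for i, intensities in enumerate(domain_intensities):
        # assumption: a weight that is missing from the model counts as 1
        weight = model.get("wdom{}".format(i), 1.0)
        for k, value in enumerate(intensities):
            total[k] += weight * value
    return total


# two domains, two data points each; only wdom1 is a model parameter
model = {"dAB": 2.10, "wdom1": 0.5}
combined = weighted_domain_sum(model, [[1.0, 2.0], [4.0, 6.0]])
```

+
+Only the ratio of the weights matters for the shape of the resulting curve,
+which is why one of the weights is kept fixed.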
+
+\subsubsection sec_project_cluster_emitters Emitter Configurations
+
+If your project has a large cluster and/or many emitters, have a look at @ref pag_concepts_emitter.
+In this case, you should override the `count_emitters` method and return the number of emitter configurations.
+In the simplest case, this is the number of inequivalent emitters, and the implementation would be:
+
+~~~~~~{.py}
+    def count_emitters(self, model, index):
+        index = index._replace(emit=-1)
+        clu = self.create_cluster(model, index)
+        return clu.get_emitter_count()
+~~~~~~
+
+Next, modify the `create_cluster` method to check the emitter index (`index.emit`).
+If it is -1, the method must return the full cluster with all inequivalent emitters marked.
+If it is zero or positive, only the corresponding emitter must be marked.
+The code could be similar to this example:
+
+~~~~~~{.py}
+    def create_cluster(self, model, index):
+        # filling atoms here
+        # ...
+
+        # select all possible emitters (atoms of a specific element) in a cylindrical volume
+        # idx_emit is an array of atom numbers (0-based atom index)
+        idx_emit = clu.find_index_cylinder(origin, r_xy, r_z, self.project.scans[index.scan].emitter)
+        # if a specific emitter should be marked, restrict the array index.
+        if index.emit >= 0:
+            idx_emit = idx_emit[index.emit]
+        # mark the selected emitters
+        # if index.emit was < 0, all emitters are marked
+        clu.data['e'][idx_emit] = 1
+
+        return clu
+~~~~~~
+
+Now, the individual emitter configurations will be calculated in separate tasks
+which can be run in parallel in a multi-process environment.
+Note that the processing time of EDAC scales linearly with the number of emitters.
+Thus, parallel execution is beneficial.
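+
+The interplay between the two methods can be tried out without PMSCO.
+In the following sketch, `CalcID` is a hypothetical stand-in for pmsco.dispatch.CalcID.
+Its field names are an assumption, and the candidate list replaces the result of the cluster search.
+

```python
from collections import namedtuple

# hypothetical stand-in for pmsco.dispatch.CalcID; the field names are assumed
CalcID = namedtuple("CalcID", ["model", "scan", "domain", "emit", "region"])


def select_emitters(candidates, index):
    """Return the atom indices to be marked as emitters for a task index."""
    if index.emit >= 0:
        # a specific emitter configuration was requested
        return [candidates[index.emit]]
    # emit < 0: mark all inequivalent emitters
    return list(candidates)


def count_emitters(candidates, index):
    """Count the emitter configurations by requesting the full selection."""
    full = index._replace(emit=-1)
    return len(select_emitters(candidates, full))


candidates = [3, 7, 12]
task = CalcID(model=0, scan=0, domain=0, emit=1, region=-1)
marked = select_emitters(candidates, task)
n_emit = count_emitters(candidates, task)
```

+
+In this sketch, the task with `emit=1` marks only the second candidate,
+while the count is always derived from the full selection.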
+
+Advanced programmers may exploit more of the flexibility of emitter configurations, cf. @ref pag_concepts_emitter.
+
+\subsection sec_project_project Project Class
+
+Most commonly, a project class overrides the `__init__`, `create_model_space` and `create_params` methods.
+Most other inherited methods can be overridden optionally,
+for instance `validate`, `setup`, `calc_modulation`, `rfactor`,
+as well as the combine methods `combine_rfactors`, `combine_domains`, `combine_emitters`, etc.
+In this introduction, we focus on the three most basic methods.
+
+\subsubsection sec_project_project_init Initialization and Defaults
+
+In the `__init__` method, you define and initialize (with default values) additional project properties.
+You may also redefine properties of the base class.
+The following code is just an example to give you some ideas.
+
+~~~~~~{.py}
+class MyProject(pmsco.project.Project):
+    def __init__(self):
+        # call the inherited method first
+        super().__init__()
+        # re-define an inherited property
+        self.directories["data"] = Path("/home/pmsco/data")
+        # define a scan dictionary
+        self.scan_dict = {}
+        # fill the scan dictionary
+        self.build_scan_dict()
+        # create the cluster generator
+        self.cluster_generator = MyClusterGenerator(self)
+        # declare the list of domains (at least one is required)
+        self.domains = [{"zrot": 0.}]
+
+    def build_scan_dict(self):
+        self.scan_dict["empty"] = {"filename": "{pmsco}/projects/common/empty-hemiscan.etpi",
+                                   "emitter": "Si", "initial_state": "2p3/2"}
+        self.scan_dict["Si2p"]  = {"filename": "{data}/xpd-Si2p.etpis",
+                                   "emitter": "Si", "initial_state": "2p3/2"}
+~~~~~~
+
+The scan dictionary can come in handy if you want to select scans by a shortcut on the command line or in a run file.
+
+Note that most of the properties can be assigned from a run file.
+This happens after the `__init__` method.
+The values set by `__init__` serve as default values.
+
+\subsubsection sec_project_project_space Model Space
+
+The model space defines the keys and value ranges of the model parameters.
+There are three ways to declare the model space in order of priority:
+
+1. Declare the model space in the run-file.
+2. Assign a ModelSpace to the self.model_space property directly in the `__init__` method.
+3. Implement the `create_model_space` method.
+
+We begin with the third way:
+
+~~~~~~{.py}
+# under class MyProject(pmsco.project.Project):
+    def create_model_space(self):
+        # create an empty model space
+        spa = pmsco.project.ModelSpace()
+
+        # add parameters
+        spa.add_param('dAB',    2.10,  2.00,   2.25,  0.05)
+        spa.add_param('th',    15.00,  0.00,  30.00,  1.00)
+        spa.add_param('ph',    90.00)
+        spa.add_param('V0',    21.96, 15.00,  25.00,  1.00)
+        spa.add_param('Zsurf',  1.50)
+        spa.add_param('wdom1',  0.5,   0.10,  10.00,  0.10)
+
+        # return the model space
+        return spa
+~~~~~~
+
+This code declares six model parameters: `dAB`, `th`, `ph`, `V0`, `Zsurf` and `wdom1`.
+Three of them are structural parameters (used by the cluster generator above),
+two are used by the `create_params` method (see below),
+and `wdom1` is used in pmsco.project.Project.combine_domains while summing up contributions from different domains.
+
+The values in the arguments list correspond to the start value (initial guess),
+the lower and upper boundaries of the value range,
+and the step size for optimizers that require it.
+If just one value is given, like for `ph` and `Zsurf`, the parameter is held constant during the optimization.
+
+The equivalent declaration in the run-file would look like (parameters after `th` omitted):
+
+~~~~~~{.py}
+{
+  "project": {
+    // ...
+    "model_space": {
+      "dAB": {
+        "start": 2.109,
+        "min": 2.0,
+        "max": 2.25,
+        "step": 0.05
+      },
+      "th": {
+        "start": 15.0,
+        "min": 0.0,
+        "max": 30.0,
+        "step": 1.0
+      },
+      // ...
+    }
+  }
+}
+~~~~~~
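+
+The correspondence between the positional arguments of `add_param` and the keys of the run-file
+can be made explicit with a small conversion sketch.
+The `param_entry` function is hypothetical.
+In particular, how a constant parameter should be written in a run-file is not specified on this page;
+the zero-width range used below is an assumption.
+

```python
import json


def param_entry(start, lower=None, upper=None, step=None):
    """Map add_param-style values to the run-file keys start, min, max, step."""
    if lower is None and upper is None:
        # assumption: a constant parameter is written as a zero-width range
        lower = upper = start
        step = 0.0
    return {"start": start, "min": lower, "max": upper, "step": step}


model_space = {
    "dAB": param_entry(2.10, 2.00, 2.25, 0.05),
    "th": param_entry(15.00, 0.00, 30.00, 1.00),
    "ph": param_entry(90.00),
}

# serialize in the nesting used by the run-file
text = json.dumps({"project": {"model_space": model_space}}, indent=2)
```

+
+Generating the dictionary from code can help to keep a run-file and a hard-coded model space in sync during a transition.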
+
+\subsubsection sec_project_project_params Calculation Parameters
+
+Non-structural parameters that are needed for the input files of the calculators are passed
+in a pmsco.project.CalculatorParams object.
+This object should be created and filled in the `create_params` method of the project class.
+
+The following example is from the twoatom demo project:
+
+~~~~~~{.py}
+# under class MyProject(pmsco.project.Project):
+    def create_params(self, model, index):
+        params = pmsco.project.CalculatorParams()
+
+        # meta data
+        params.title = "two-atom demo"
+        params.comment = "{0} {1}".format(self.__class__, index)
+
+        # initial state and binding energy
+        initial_state = self.scans[index.scan].initial_state
+        params.initial_state = initial_state
+        emitter = self.scans[index.scan].emitter
+        params.binding_energy = pt.elements.symbol(emitter).binding_energy[initial_state]
+
+        # experimental setup
+        params.polarization = "H"
+        params.polar_incidence_angle = 60.0
+        params.azimuthal_incidence_angle = 0.0
+        params.experiment_temperature = 300.0
+
+        # material parameters
+        params.z_surface = model['Zsurf']
+        params.work_function = 4.5
+        params.inner_potential = model['V0']
+        params.debye_temperature = 356.0
+
+        # multiple-scattering parameters (EDAC)
+        params.emitters = []
+        params.lmax = 15
+        params.dmax = 5.0
+        params.orders = [25]
+
+        return params
+~~~~~~
+
+Most of the code is generic and can be copied to other projects.
+Only the experimental and material parameters need to be adjusted.
+Other properties can be changed as needed, see the documentation of pmsco.project.CalculatorParams for details.
+
+\subsection sec_project_args Passing Runtime Parameters
+
+Runtime parameters can be passed in one of three ways:
+
+1. hard-coded in the project module,
+2. on the command line, or
+3. in a JSON run-file.
+
+In the first way, all parameters are hard-coded in the `create_project` function of the project module.
+This is the simplest way for a quick start to a small project.
+However, as the project code grows, it's easy to lose track of revisions.
+In programming, it is usually best practice to separate code and data.
+
+The command line is another option for passing parameters to a process.
+It requires extra code for parsing the command line and is not very flexible.
+It is difficult to pass complex data types.
+Using the command line is no longer recommended and may become deprecated in a future version.
+
+The recommended way of passing parameters is via run-files.
+Run-files allow for complete separation of code and data in a generic and flexible way.
+For example, run-files can be stored along with the results.
+However, the semantics of the run-file may look intimidating at first.
+
+\subsubsection sec_project_args_runfile Setting Up a Run-File
+
+The usage and format of run-files are described in detail under @ref pag_runfile.
+
+\subsubsection sec_project_args_code Hard-Coded Arguments
+
+Hard-coded parameters are usually set in a `create_project` function of the project module.
+Placed at the end of the module, this function can easily be found.
+The function has two purposes: to create the project object and to set parameters.
+The parameters can be any attributes of the project class and its ancestors.
+See the parent pmsco.project.Project class for a list of common attributes.
+
+The `create_project` function may look like the following example.
+It must return a project object, i.e. an object instance of a class that inherits from pmsco.project.Project.
+
+~~~~~~{.py}
+def create_project():
+    project = MyProject()
+
+    project.optimizer_params["pop_size"] = 20
+
+    project_dir = Path(__file__).parent
+    scan_file = Path(project_dir, "hbnni_e156_int.etpi")
+    project.add_scan(filename=scan_file, emitter="N", initial_state="1s")
+
+    project.add_domain({"zrot": 0.0})
+    project.add_domain({"zrot": 60.0})
+
+    return project
+~~~~~~
+
+To have PMSCO call this function,
+pass the file path of the containing module as the first command line argument of PMSCO, cf. @ref pag_command.
+PMSCO calls this function in absence of a run-file.
+
+
+\subsubsection sec_project_args_cmd Command Line
+
+Since it is not recommended to pass calculation parameters on the command line,
+this mechanism is not described in detail here.
+It is, however, still available.
+If you really need to use it,
+have a look at the code of the pmsco.pmsco.main function
+and how it calls the `create_project`, `parse_project_args` and `set_project_args` functions of the project module.
+
+*/
--- a/docs/src/runfile.dox
+++ b/docs/src/runfile.dox
@ -0,0 +1,333 @@
+/*! @page pag_runfile Run File
+\section sec_runfile Run File
+
+This section describes the format of a run-file.
+Run-files are a new way of passing arguments to a PMSCO process which avoids cluttering up the command line.
+They are more flexible than the command line
+because run-files can assign a value to any property of the project object in an abstract way.
+Moreover, there is no necessity for the project code to parse the command line.
+
+
+\subsection sec_runfile_how How It Works
+
+Run-files are text files in [JSON](https://en.wikipedia.org/wiki/JSON) format
+which shares most syntax elements with Python.
+JSON files contain nested dictionaries, lists, strings and numbers.
+
+In PMSCO, run-files contain a dictionary of parameters for the project object
+which is the main container for calculation parameters, model objects and links to data files.
+An abstract run-file parser reads the run-file,
+constructs the specified project object based on the custom project class
+and assigns the attributes of the project object.
+It's important to note that the parser does not recognize specific data types or classes.
+All specific data handling is done by the instantiated objects, mainly the project class.
+
+The parser can handle the following situations:
+
+- Strings, numbers as well as dictionaries and lists of simple objects can be assigned directly to project attributes.
+- If the project class defines an attribute as a _property_,
+  the class can execute custom code to import or validate data.
+- The parser can instantiate an object from a class in the namespace of the project module
+  and assign its properties.
+
+
+\subsection sec_runfile_general General File Format
+
+Run-files must adhere to the [JSON](https://en.wikipedia.org/wiki/JSON) format,
+which shares most syntax elements with Python.
+Specifically, a JSON file can declare dictionaries, lists and simple objects
+such as strings, numbers and `null`.
+As one extension to plain JSON, PMSCO ignores line comments starting with a hash `#` or double-slash `//`.
+This can be used to temporarily hide a parameter from the parser.
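+
+The effect of the comment extension can be reproduced with the standard `json` module.
+The following sketch is not the PMSCO parser.
+The `load_commented_json` function is hypothetical and assumes that a comment always occupies a whole line.
+

```python
import json


def load_commented_json(text):
    """Parse JSON after dropping whole-line comments that start with # or //."""
    kept = []
    for line in text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("#") or stripped.startswith("//"):
            continue
        kept.append(line)
    return json.loads("\n".join(kept))


run_file = """
{
  "project": {
    // "mode": "swarm",
    # "time_limit": 1,
    "mode": "single",
    "output_file": "twoatom0001"
  }
}
"""
config = load_commented_json(run_file)
```

+
+The commented `mode` and `time_limit` lines never reach the JSON decoder, so only the active `mode` entry is read.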
+
+For example run-files, have a look at the twoatom demo project.
+
+
+\subsection sec_runfile_project Project Specification
+
+
+The following minimum run-file demonstrates how to specify the project at the top level:
+
+~~~~~~{.py}
+{
+  "project": {
+    "__module__": "projects.twoatom.twoatom",
+    "__class__": "TwoatomProject",
+    "mode": "single",
+    "output_file": "twoatom0001"
+  }
+}
+~~~~~~
+
+Here, the `project` keyword denotes the dictionary that is used to construct the project object.
+
+Within the project dictionary, the `__module__` key selects the Python module file that contains the project code,
+and `__class__` refers to the name of the actual project class.
+Further dictionary items correspond to attributes of the project class.
+
+The module name is the same as would be used in a Python import statement.
+It must be findable on the Python path.
+PMSCO ensures that the directory containing the `pmsco` and `projects` sub-directories is on the Python path.
+The class name must be in the namespace of the loaded module.
+
+As PMSCO starts, it imports the specified module,
+constructs an object of the specified project class,
+and assigns any further items to project attributes.
+In the example above, `twoatom0001` is assigned to the `output_file` property.
+Any attributes not specified in the run-file will remain at their default values
+that were set by the `__init__` method of the project class.
+
+Note that parameter names must start with an alphabetic character, else they are ignored.
+This provides another way to temporarily ignore an item from the file besides line comments.
+
+Also note that PMSCO does not spell-check parameter names.
+The parameter values are just written to the corresponding object attribute.
+If a name is misspelled, the value will be written under the wrong name and missed by the code eventually.
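+
+The consequences of these rules can be demonstrated with a strongly simplified sketch of the assignment step.
+`DemoProject` and `build_object` are hypothetical names,
+and the real parser additionally imports the module and handles nested objects.
+

```python
class DemoProject:
    """Hypothetical stand-in for a project class with defaults set in __init__."""

    def __init__(self):
        self.mode = "single"
        self.output_file = ""
        self.time_limit = 24.0


def build_object(spec, namespace):
    """Instantiate spec['__class__'] from namespace and assign the remaining items."""
    cls = namespace[spec["__class__"]]
    obj = cls()
    for key, value in spec.items():
        if not key[:1].isalpha():
            # skips __class__, __module__ and items disabled by a leading underscore
            continue
        setattr(obj, key, value)
    return obj


spec = {
    "__class__": "DemoProject",
    "mode": "grid",
    "_output_file": "ignored",  # leading underscore: the item is skipped
    "time_limt": 1.0,           # misspelled: stored under the wrong name
}
project = build_object(spec, {"DemoProject": DemoProject})
```

+
+After this call, `project.time_limit` still holds the default value,
+whereas the misspelled `time_limt` exists as a new attribute that no code will ever read.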
+
+PMSCO carries out only the most important checks on the given parameter values.
+Incorrect values may lead to improper operation or exceptions later in the calculations.
+
+
+\subsection sec_runfile_common Common Arguments
+
+The following table lists some important parameters controlling the calculations.
+They are declared in the pmsco.project.Project class.
+
+| Key | Values | Description |
+| --- | --- | --- |
+| mode | `single` (default), `grid`, `swarm`, `genetic`, `table`, `test`, `validate` | Operation mode. `validate` can be used to check the syntax of the run-file, the process exits before starting calculations. |
+| directories | dictionary | This dictionary lists common file paths used in the project. It contains keys such as `home`, `project`, `output` (see documentation of Project class in pmsco.project). Enclosed in curly braces, the keys can be used as placeholders in filenames. |
+| output_dir | path | Shortcut for directories["output"] |
+| data_dir | path | Shortcut for directories["data"] |
+| job_name | string, must be a valid file name | Base name for all produced output files. It is recommended to set a unique name for each calculation run. Do not include a path. The path can be set in _output_dir_. |
+| cluster_generator | dictionary | Class name and attributes of the cluster generator. See below. |
+| atomic_scattering_factory | string<br>Default: InternalAtomicCalculator from pmsco.calculators.calculator | Class name of the atomic scattering calculator. This name must be in the namespace of the project module. |
+| multiple_scattering_factory | string<br>Default: EdacCalculator from  pmsco.calculators.edac | Class name of the multiple scattering calculator. This name must be in the namespace of the project module. |
+| model_space | dictionary | See @ref sec_runfile_space below. |
+| domains | list of dictionaries | See @ref sec_runfile_domains below. |
+| scans | list of dictionaries | See @ref sec_runfile_scans below. |
+| optimizer_params | dictionary | See @ref sec_runfile_optimizer below. |
+
+The following table lists some common control parameters and metadata
+that affect the behaviour of the program but do not affect the calculation results.
+The job metadata is used to identify and describe a job in the results database if requested.
+
+| Key | Values | Description |
+| --- | --- | --- |
+| job_tags | list of strings | User-specified job tags (metadata). |
+| description | string | Description of the calculation job (metadata) |
+| time_limit | decimal number<br>Default: 24. | Wall time limit in hours. The optimizers try to finish before the limit. This cannot be guaranteed, however. |
+| keep_files | list of file categories | Output file categories to keep after the calculation. Multiple values can be specified and must be separated by spaces. By default, cluster and model (simulated data) of a limited number of best models are kept. See @ref sec_runfile_files below. |
+| keep_best | integer number<br>Default: 10 | number of best models for which result files should be kept. |
+| keep_level | integer number<br>Default: 1 | numeric task level down to which files are kept. 1 = scan level, 2 = domain level, etc. |
+| log_level | DEBUG, INFO, WARNING, ERROR, CRITICAL | Minimum level of messages that should be added to the log. Empty string turns off logging. |
+| log_file | file system path<br>Default: job_name + ".log". | Name of the main log file. Under MPI, the rank of the process is inserted before the extension. The log file is created in the working directory. |
+
+
+\subsection sec_runfile_space Model Space
+
+The `model_space` parameter is a dictionary of model parameters.
+The key is the name of the parameter as used by the cluster and input-formatting code,
+the value is a dictionary holding the `start`, `min`, `max`, `step` values to be used by the optimizer.
+
+~~~~~~{.py}
+{
+  "project": {
+    // ...
+    "model_space": {
+      "dAB": {
+        "start": 2.109,
+        "min": 2.0,
+        "max": 2.25,
+        "step": 0.05
+      },
+      "pAB": {
+        "start": 15.0,
+        "min": 0.0,
+        "max": 30.0,
+        "step": 1.0
+      },
+      // ...
+    }
+  }
+}
+~~~~~~
+
+
+\subsection sec_runfile_domains Domains
+
+Domains is a list of dictionaries.
+Each dictionary holds keys describing the domain to the cluster and input-formatting code.
+The meaning of these keys is up to the project.
+
+~~~~~~{.py}
+{
+  "project": {
+    // ...
+    "domains": [
+      {"surface": "Te", "doping": null, "zrot": 0.0},
+      {"surface": "Te", "doping": null, "zrot": 60.0}
+    ],
+  }
+}
+~~~~~~
+
+
+\subsection sec_runfile_scans Experimental Scan Files
+
+The pmsco.project.Scan objects used in the calculation cannot be instantiated from the run-file directly.
+Instead, the scans object is a list of scan creators/loaders which specify what to do to create a Scan object.
+The pmsco.project module defines three scan creators: ScanLoader, ScanCreator and ScanKey.
+The following code block shows an example of each of the three:
+
+~~~~~~{.py}
+{
+  "project": {
+    // ...
+    "scans": [
+      {
+        "__class__": "pmsco.project.ScanCreator",
+        "filename": "twoatom_energy_alpha.etpai",
+        "emitter": "N",
+        "initial_state": "1s",
+        "positions": {
+          "e": "np.arange(10, 400, 5)",
+          "t": "0",
+          "p": "0",
+          "a": "np.linspace(-30, 30, 31)"
+        }
+      },
+      {
+        "__class__": "pmsco.project.ScanLoader",
+        "filename": "{project}/twoatom_hemi_250e.etpi",
+        "emitter": "N",
+        "initial_state": "1s",
+        "is_modf": false
+      },
+      {
+        "__class__": "pmsco.project.ScanKey",
+        "key": "Ge3s113tp"
+      }
+    ]
+  }
+}
+~~~~~~
+
+The class name must be specified as it would be called in the custom project module.
+`pmsco.project` must, thus, be imported in the custom project module.
+
+The *ScanCreator* object creates a scan using Numpy array constructors in `positions`.
+In the example above, a two-dimensional rectangular energy-alpha scan grid is created.
+The values of the positions axes are passed to Python's `eval` function
+and must return a one-dimensional Numpy `ndarray`.
+
+The `emitter` and `initial_state` keys define the probed core level.
+
+The *ScanLoader* object loads a data file, specified under `filename`.
+The filename can include a placeholder which is replaced by the corresponding item from Project.directories.
+Note that some of the directories (including `project`) are pre-set by PMSCO.
+It is recommended to add a `data` key under `directories` in the run-file
+if the data files are outside of the PMSCO directory tree.
+The `is_modf` key indicates whether the file contains a modulation function (`true`) or intensity (`false`).
+In the latter case, the modulation function is calculated after loading.
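As a rough illustration of the placeholder substitution, the sketch below resolves `{project}` and a user-defined `{data}` entry. The directory values and the helper `resolve_filename` are hypothetical, and the use of `str.format` is an assumption about the mechanism rather than a statement about the PMSCO code.

```python
# Hypothetical directory table; in PMSCO the real values come from Project.directories.
directories = {
    "project": "/home/user/pmsco/projects/twoatom",
    "data": "/home/user/data",  # corresponds to a `data` key added under `directories`
}


def resolve_filename(template, directories):
    """Replace {name} placeholders by the matching directory entry (assumed: str.format)."""
    return template.format(**directories)


print(resolve_filename("{project}/twoatom_hemi_250e.etpi", directories))
print(resolve_filename("{data}/twoatom_hemi_250e.etpi", directories))
```

A file name without placeholders passes through unchanged, so absolute paths can still be used directly.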
+
+The *ScanKey* is the shortest scan specification in the run-file.
+It is a shortcut to a complete scan description in the `scan_dict` dictionary of the project object.
+The `scan_dict` must be set up in the `__init__` method of the project class.
+The `key` item specifies which key of `scan_dict` should be used to create the Scan object.
+
+Each item of `scan_dict` holds a dictionary
+that in turn holds the attributes for either a `ScanCreator` or a `ScanLoader`.
+If it contains a `positions` key, it represents a `ScanCreator`, else a `ScanLoader`.
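The dispatch rule described above can be sketched as follows. The `scan_dict` contents (apart from the key `Ge3s113tp` taken from the example), the file names and the helper `creator_class_name` are hypothetical and only serve to show the `positions` test.

```python
# Hypothetical scan_dict as it might be set up in the __init__ method of a project class.
scan_dict = {
    "Ge3s113tp": {
        "filename": "{project}/demo_Ge3s.etpi",  # made-up file name
        "emitter": "Ge",
        "initial_state": "3s",
        "is_modf": False,
    },
    "demo_grid": {
        "filename": "demo_grid.etpai",  # made-up file name
        "emitter": "N",
        "initial_state": "1s",
        "positions": {
            "e": "np.arange(10, 400, 5)",
            "a": "np.linspace(-30, 30, 31)",
        },
    },
}


def creator_class_name(key, scan_dict):
    """Return which kind of scan creator the entry under `key` represents."""
    spec = scan_dict[key]
    # An entry with a `positions` key describes a ScanCreator, any other a ScanLoader.
    return "ScanCreator" if "positions" in spec else "ScanLoader"


for key in scan_dict:
    print(key, "->", creator_class_name(key, scan_dict))
```

Keeping full scan descriptions in `scan_dict` lets several run files share one definition and refer to it by key only.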
+
+
+\subsection sec_runfile_optimizer Optimizer Parameters
+
+`optimizer_params` is a dictionary that holds one or more of the following items.
+
+| Key | Values | Description |
+| --- | --- | --- |
+| pop-size | integer<br>The default value is the greater of 4 or the number of parallel calculation processes. | Population size (number of particles) in swarm and genetic optimization mode. |
+| seed-file | file system path | Name of the population seed file. Population data of previous optimizations can be used to seed a new optimization. The file must have the same structure as the .pop or .dat files. See @ref pmsco.project.Project.seed_file. |
+| table-file | file system path | Name of the model table file in table scan mode. |
+
+
+\subsubsection sec_runfile_files File Categories
+
+The following category names can be used with the `keep_files` option.
+Multiple names can be specified as a list.
+
+| Category | Description | Default Action |
+| --- | --- | --- |
+| all | shortcut to include all categories | |
+| input |      raw input files for calculator, including cluster and phase files in custom format | delete |
+| output |     raw output files from calculator | delete |
+| atomic |     atomic scattering and emission files in portable format | delete |
+| cluster |    cluster files in portable XYZ format for report | keep |
+| debug |      debug files |  delete |
+| model |       output files in ETPAI format: complete simulation  (a_-1_-1_-1_-1) | keep |
+| scan |       output files in ETPAI format: scan (a_b_-1_-1_-1) |  keep |
+| domain |     output files in ETPAI format: domain (a_b_c_-1_-1) |  delete |
+| emitter |    output files in ETPAI format: emitter (a_b_c_d_-1) |  delete |
+| region |     output files in ETPAI format: region (a_b_c_d_e) |  delete |
+| report|      final report of results | keep always |
+| population |  final state of particle population | keep |
+| rfac |        files related to models which give bad r-factors, see warning below | delete |
+
+\note
+The `report` category is always kept and cannot be turned off.
+The `model` category is always kept in single calculation mode.
+
+\warning
+If you specify `rfac` in the `keep_files` option,
+you must also list the file categories that you want to keep, e.g.,
+`"keep_files": ["rfac", "cluster", "model", "scan", "population"]`
+(this returns the default categories for all calculated models).
+Do not specify `rfac` alone, as effectively no files would be returned.
+
+
+\subsection sec_runfile_schedule Job Scheduling
+
+To submit a job to a resource manager such as Slurm, add a `schedule` section to the run file
+(section ordering is not important):
+
+~~~~~~{.py}
+{
+  "schedule": {
+    "__module__": "pmsco.schedule",
+    "__class__": "PsiRaSchedule",
+    "nodes": 1,
+    "tasks_per_node": 24,
+    "wall_time": "2:00",
+    "manual": true,
+    "enabled": true
+  },
+  "project": {
+    "__module__": "projects.twoatom.twoatom",
+    "__class__": "TwoatomProject",
+    "mode": "single",
+    "output_file": "{home}/pmsco/twoatom0001",
+    // ...
+  }
+}
+~~~~~~
+
+In the same way as for the project, the `__module__` and `__class__` keys select the class that handles the job submission.
+In this example, it is pmsco.schedule.PsiRaSchedule which is tied to the Ra cluster at PSI.
+For other machines, you can sub-class one of the classes in the pmsco.schedule module and include it in your project module.
+
+The parameters of pmsco.schedule.PsiRaSchedule are as follows.
+Some of them are also used in other schedule classes or may have different types or ranges.
+
+| Key | Values | Description |
+| --- | --- | --- |
+| nodes | integer: 1..2 | Number of compute nodes (main boards on Ra). The maximum number available for PEARL is 2. |
+| tasks_per_node | integer: 1..24, 32 | Number of tasks (CPU cores on Ra) per node. Jobs with fewer than 24 tasks are assigned to the shared partition. |
+| wall_time | string: [days-]hours[:minutes[:seconds]] <br> dict: with any combination of days, hours, minutes, seconds | Maximum run time (wall time) of the job. |
+| manual | bool | Manual submission (true) or automatic submission (false). Manual submission allows you to inspect the job files before submission. |
+| enabled | bool | Enable scheduling (true). Otherwise, the calculation is started directly (false). |
+
+@note The calculation job may run in a different working directory than the current one.
+It is important to specify absolute data and output directories in the run file (project/directories section).
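To make the wall-time formats in the table concrete, the following sketch converts both the string form `[days-]hours[:minutes[:seconds]]` and the dictionary form into a `datetime.timedelta`. The function `parse_wall_time` is a hypothetical helper written for this illustration; the schedule classes may parse the value differently.

```python
import datetime


def parse_wall_time(value):
    """Hypothetical helper: convert a wall-time specification to a timedelta.

    Accepts a dict with any of days, hours, minutes, seconds,
    or a string of the form [days-]hours[:minutes[:seconds]].
    """
    if isinstance(value, dict):
        return datetime.timedelta(**{k: float(v) for k, v in value.items()})

    days = 0
    rest = value
    if "-" in value:
        # Optional leading "days-" part.
        day_part, rest = value.split("-", 1)
        days = int(day_part)

    # Hours are mandatory; minutes and seconds default to zero.
    parts = [int(p) for p in rest.split(":")]
    parts += [0] * (3 - len(parts))
    hours, minutes, seconds = parts
    return datetime.timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


print(parse_wall_time("2:00"))
print(parse_wall_time("1-12"))
print(parse_wall_time({"hours": 1, "minutes": 30}))
```

Note that under this format a value of `"2:00"` means two hours, not two minutes, because the first field is always hours.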
+
+*/
--- a/docs/src/tasks.dot
+++ b/docs/src/tasks.dot
@ -38,15 +38,15 @@ custom_scan [label="scan\nconfiguration", shape=note];
 {rank=same; custom_scan; create_scan; combine_scan;}
 custom_scan -> create_scan [lhead=cluster_scan];

-subgraph cluster_symmetry {
-label="symmetry handler";
+subgraph cluster_domain {
+label="domain handler";
 rank=same;
-create_symmetry [label="define\nsymmetry\ntasks"];
-combine_symmetry  [label="gather\nsymmetry\nresults"];
+create_model_space [label="define\ndomain\ntasks"];
+combine_domain  [label="gather\ndomain\nresults"];
 }
-custom_symmetry [label="symmetry\ndefinition", shape=cds];
-{rank=same; create_symmetry; combine_symmetry; custom_symmetry;}
-custom_symmetry -> combine_symmetry [lhead=cluster_symmetry];
+custom_domain [label="domain\ndefinition", shape=cds];
+{rank=same; create_model_space; combine_domain; custom_domain;}
+custom_domain -> combine_domain [lhead=cluster_domain];

 subgraph cluster_emitter {
 label="emitter handler";
@ -80,11 +80,11 @@ create_cluster -> edac;
 create_model -> create_scan [label="level 1 tasks"];
 evaluate_model -> combine_scan [label="level 1 results", dir=back];

-create_scan -> create_symmetry [label="level 2 tasks"];
-combine_scan -> combine_symmetry [label="level 2 results", dir=back];
+create_scan -> create_model_space [label="level 2 tasks"];
+combine_scan -> combine_domain [label="level 2 results", dir=back];

-create_symmetry -> create_emitter [label="level 3 tasks"];
-combine_symmetry -> combine_emitter [label="level 3 results", dir=back];
+create_model_space -> create_emitter [label="level 3 tasks"];
+combine_domain -> combine_emitter [label="level 3 results", dir=back];

 create_emitter -> create_region [label="level 4 tasks"];
 combine_emitter -> combine_region [label="level 4 results", dir=back];
--- a/docs/src/uml/CalculationTask-class.puml
+++ b/docs/src/uml/CalculationTask-class.puml
@ -28,7 +28,7 @@ remove_task_file()
 class CalcID {
 model
 scan
-sym
+domain
 emit
 region
 }
--- a/docs/src/uml/CalculationTask-objects.puml
+++ b/docs/src/uml/CalculationTask-objects.puml
@ -43,15 +43,15 @@ parent = 2, -1, -1, -1, -1
 model = {'d': 7}
 }

-Scan11 o.. Sym111
+Scan11 o.. Dom111

-object Sym111 {
+object Dom111 {
 id = 1, 1, 1, -1, -1
 parent = 1, 1, -1, -1, -1
 model = {'d': 5}
 }

-Sym111 o.. Emitter1111
+Dom111 o.. Emitter1111

 object Emitter1111 {
 id = 1, 1, 1, 1, -1
@ -90,18 +90,18 @@ scan

 object ScanHandler

-object "Sym: CalculationTask" as Sym {
+object "Domain: CalculationTask" as Domain {
 model
 scan
-symmetry
+domain
 }

-object "SymmetryHandler" as SymHandler
+object "DomainHandler" as DomainHandler

 object "Emitter: CalculationTask" as Emitter {
 model
 scan
-symmetry
+domain
 emitter
 }

@ -110,7 +110,7 @@ object EmitterHandler
 object "Region: CalculationTask" as Region {
 model
 scan
-symmetry
+domain
 emitter
 region
 }
@ -120,14 +120,14 @@ object RegionHandler

 Root "1" o.. "1..*" Model
 Model "1" o.. "1..*" Scan
-Scan "1" o.. "1..*" Sym
-Sym "1" o.. "1..*" Emitter
+Scan "1" o.. "1..*" Domain
+Domain "1" o.. "1..*" Emitter
 Emitter "1" o.. "1..*" Region

 (Root, Model) .. ModelHandler
 (Model, Scan) .. ScanHandler
-(Scan, Sym) .. SymHandler
-(Sym, Emitter) .. EmitterHandler
+(Scan, Domain) .. DomainHandler
+(Domain, Emitter) .. EmitterHandler
 (Emitter, Region) .. RegionHandler

@enduml
--- a/docs/src/uml/calculation-task.puml
+++ b/docs/src/uml/calculation-task.puml
@ -4,7 +4,7 @@
 class CalculationTask {
 model
 scan
-symmetry
+domain
 emitter
 region
 ..
@ -35,7 +35,7 @@ class Scan {
    alphas
 }

-class Symmetry {
+class Domain {
    index
    ..
    rotation
@ -55,13 +55,13 @@ class Region {

 CalculationTask *-- Model
 CalculationTask *-- Scan
-CalculationTask *-- Symmetry
+CalculationTask *-- Domain
 CalculationTask *-- Emitter
 CalculationTask *-- Region

 class Project {
    scans
-    symmetries
+    domains
    model_handler
    cluster_generator
 }
@ -78,7 +78,7 @@ class ModelHandler {

 Model ..> ModelHandler
 Scan ..> Project
-Symmetry ..> Project
+Domain ..> Project
 Emitter ..> ClusterGenerator
 Region ..> Project

--- a/docs/src/uml/database.puml
+++ b/docs/src/uml/database.puml
@ -9,14 +9,6 @@ name
 code
 }

-class Scan << (T,orchid) >> {
-id
-..
-job_id
-..
-name
-}
-
 class Job << (T,orchid) >> {
 id
 ..
@ -30,6 +22,22 @@ datetime
 description
 }

+class Tag << (T,orchid) >> {
+id
+..
+..
+key
+}
+
+class JobTag << (T,orchid) >> {
+id
+..
+tag_id
+job_id
+..
+value
+}
+
 class Model << (T,orchid) >> {
 id
 ..
@ -46,7 +54,7 @@ id
 model_id
 ..
 scan
-sym
+domain
 emit
 region
 rfac
@ -69,8 +77,9 @@ value
 }

 Project "1" *-- "*" Job
+Job "1" *-- "*" JobTag
+Tag "1" *-- "*" JobTag
 Job "1" *-- "*" Model
-Job "1" *-- "*" Scan
 Param "1" *-- "*" ParamValue
 Model "1" *-- "*" ParamValue
 Model "1" *-- "*" Result
--- a/docs/src/uml/handler-activity.puml
+++ b/docs/src/uml/handler-activity.puml
@ -20,7 +20,7 @@ repeat
 partition "generate tasks" {
 :define model tasks;
 :define scan tasks;
-:define symmetry tasks;
+:define domain tasks;
 :define emitter tasks;
 :define region tasks;
 }
@ -34,7 +34,7 @@ end fork
 partition "collect results" {
 :gather region results;
 :gather emitter results;
-:gather symmetry results;
+:gather domain results;
 :gather scan results;
 :gather model results;
 }
--- a/docs/src/uml/minimum-project-classes.puml
+++ b/docs/src/uml/minimum-project-classes.puml
@ -5,10 +5,10 @@ package pmsco {
        mode
        code
        scans
-        symmetries
+        domains
        {abstract} create_cluster()
        {abstract} create_params()
-        {abstract} create_domain()
+        {abstract} create_model_space()
    }

 }
@ -18,7 +18,7 @@ package projects {
        __init__()
        create_cluster()
        create_params()
-        create_domain()
+        create_model_space()
    }

 }
--- a/docs/src/uml/project-classes.puml
+++ b/docs/src/uml/project-classes.puml
@ -4,13 +4,13 @@ abstract class Project {
        mode : str = "single"
        code : str = "edac"
        scans : Scan [1..*]
-        symmetries : dict [1..*]
+        domains : dict [1..*]
        cluster_generator : ClusterGenerator
        handler_classes
        files : FileTracker
        {abstract} create_cluster() : Cluster
-        {abstract} create_params() : Params
-        {abstract} create_domain() : Domain
+        {abstract} create_params() : CalculatorParams
+        {abstract} create_model_space() : ModelSpace
    }

 class Scan {
@ -28,7 +28,7 @@ class Scan {
    import_scan_file()
 }

-class Domain {
+class ModelSpace {
    start : dict
    min : dict
    max : dict
@ -37,7 +37,7 @@ class Domain {
    get_param(name)
 }

-class Params {
+class CalculatorParams {
    title
    comment
    cluster_file
--- a/docs/src/uml/top-activity-partitions.puml
+++ b/docs/src/uml/top-activity-partitions.puml
@ -25,7 +25,7 @@ stop

 |pmsco|
 start
-:define task (model, scan, symmetry, emitter, region);
+:define task (model, scan, domain, emitter, region);
 |project|
 :create cluster;
 :create parameters;
--- a/docs/src/uml/top-components.puml
+++ b/docs/src/uml/top-components.puml
@ -2,21 +2,19 @@

 skinparam componentStyle uml2

-component "project" as project
 component "PMSCO" as pmsco
+component "project" as project
 component "scattering code\n(calculator)" as calculator

 interface "command line" as cli
-interface "input files" as input
-interface "output files" as output
 interface "experimental data" as data
 interface "results" as results
+interface "output files" as output

+cli --> pmsco
 data -> project
-project ..> pmsco
+pmsco ..> project
 pmsco ..> calculator
-cli --> project
-input -> calculator
 calculator -> output
 pmsco -> results

--- a/docs/src/uml/user-project-classes.puml
+++ b/docs/src/uml/user-project-classes.puml
@ -5,16 +5,16 @@ package pmsco {
        mode
        code
        scans
-        symmetries
+        domains
        cluster_generator
        handler_classes
        __
        {abstract} create_cluster()
        {abstract} create_params()
-        {abstract} create_domain()
+        {abstract} create_model_space()
        ..
        combine_scans()
-        combine_symmetries()
+        combine_domains()
        combine_emitters()
        calc_modulation()
        calc_rfactor()
@ -34,9 +34,9 @@ package projects {
        setup()
        ..
        create_params()
-        create_domain()
+        create_model_space()
        ..
-        combine_symmetries()
+        combine_domains()
    }

    class UserClusterGenerator {
--- a/extras/singularity/singularity_python2
+++ b/extras/singularity/singularity_python2
@ -1,118 +0,0 @@
-BootStrap: debootstrap
-OSVersion: bionic
-MirrorURL: http://ch.archive.ubuntu.com/ubuntu/
-
-%help
-a singularity container for PMSCO.
-
-git clone requires an ssh key for git.psi.ch.
-try agent forwarding (-A option to ssh).
-
-#%setup
-# executed on the host system outside of the container before %post
-#
-# this will be inside the container
-#    touch ${SINGULARITY_ROOTFS}/tacos.txt
-# this will be on the host
-#    touch avocados.txt
-
-#%files
-# files are copied before %post
-#
-# this copies to root
-#    avocados.txt
-# this copies to /opt
-#    avocados.txt /opt
-#
-# this does not work
-#    ~/.ssh/known_hosts /etc/ssh/ssh_known_hosts
-#    ~/.ssh/id_rsa /etc/ssh/id_rsa
-
-%labels
-    Maintainer Matthias Muntwiler
-    Maintainer_Email matthias.muntwiler@psi.ch
-    Python_Version 2.7
-
-%environment
-    export PATH="/usr/local/miniconda3/bin:$PATH"
-    export PYTHON_VERSION=2.7
-    export SINGULAR_BRANCH="singular"
-    export LC_ALL=C
-
-%post
-    export PYTHON_VERSION=2.7
-    export LC_ALL=C
-
-    sed -i 's/$/ universe/' /etc/apt/sources.list
-    apt-get update
-    apt-get -y install \
-        binutils \
-        build-essential \
-        doxygen \
-        doxypy \
-        f2c \
-        g++ \
-        gcc \
-        gfortran \
-        git \
-        graphviz \
-        libblas-dev \
-        liblapack-dev \
-        libopenmpi-dev \
-        make \
-        nano \
-        openmpi-bin \
-        openmpi-common \
-        sqlite3 \
-        wget
-    apt-get clean
-
-    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
-    bash ~/miniconda.sh -b -p /usr/local/miniconda3
-    export PATH="/usr/local/miniconda3/bin:$PATH"
-
-    conda create -q --yes -n pmsco python=${PYTHON_VERSION}
-    . /usr/local/miniconda3/bin/activate pmsco
-    conda install -q --yes -n pmsco \
-        pip \
-        "numpy>=1.13" \
-        scipy \
-        ipython \
-        mpi4py \
-        matplotlib \
-        nose \
-        mock \
-        future \
-        statsmodels \
-        swig
-    conda clean --all -y
-    /usr/local/miniconda3/envs/pmsco/bin/pip install periodictable attrdict fasteners
-    
-    
-#%test
-# test the image after build
-
-%runscript
-    # executes command from command line
-    . /usr/local/miniconda3/bin/activate pmsco
-    exec echo "$@"
-
-%apprun install
-    . /usr/local/miniconda3/bin/activate pmsco
-    cd ~
-    git clone https://git.psi.ch/pearl/pmsco.git pmsco
-    cd pmsco
-    git checkout develop
-    git checkout -b ${SINGULAR_BRANCH}
-
-    make all
-    nosetests
-
-%apprun python
-    . /usr/local/miniconda3/bin/activate pmsco
-    exec python "${@}"
-
-%apprun conda
-    . /usr/local/miniconda3/bin/activate pmsco
-    exec conda "${@}"
-
--- a/extras/singularity/singularity_python3
+++ b/extras/singularity/singularity_python3
@ -3,10 +3,11 @@ OSVersion: bionic
 MirrorURL: http://ch.archive.ubuntu.com/ubuntu/

 %help
-a singularity container for PMSCO.
+A singularity container for PMSCO.

-git clone requires an ssh key for git.psi.ch.
-try agent forwarding (-A option to ssh).
+singularity run -e pmsco.sif path/to/pmsco -r path/to/your-runfile
+
+path/to/pmsco must point to the directory that contains the __main__.py file.

 #%setup
 # executed on the host system outside of the container before %post
@ -34,22 +35,25 @@ try agent forwarding (-A option to ssh).
    Python_Version 3

 %environment
-    export PATH="/usr/local/miniconda3/bin:$PATH"
-    export PYTHON_VERSION=3
-    export SINGULAR_BRANCH="singular"
    export LC_ALL=C
+    export PYTHON_VERSION=3
+    export CONDA_ROOT=/opt/miniconda
+    export PLANTUML_JAR_PATH=/opt/plantuml/plantuml.jar
+    export SINGULAR_BRANCH="singular"

 %post
-    export PYTHON_VERSION=3
    export LC_ALL=C
+    export PYTHON_VERSION=3
+    export CONDA_ROOT=/opt/miniconda
+    export PLANTUML_ROOT=/opt/plantuml

    sed -i 's/$/ universe/' /etc/apt/sources.list
    apt-get update
    apt-get -y install \
        binutils \
        build-essential \
+        default-jre \
        doxygen \
-        doxypy \
        f2c \
        g++ \
        gcc \
@ -67,51 +71,51 @@ try agent forwarding (-A option to ssh).
    apt-get clean

    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
-    bash ~/miniconda.sh -b -p /usr/local/miniconda3
-    export PATH="/usr/local/miniconda3/bin:$PATH"
+    bash ~/miniconda.sh -b -p ${CONDA_ROOT}

+    . ${CONDA_ROOT}/bin/activate
    conda create -q --yes -n pmsco python=${PYTHON_VERSION}
-    . /usr/local/miniconda3/bin/activate pmsco
+    conda activate pmsco
    conda install -q --yes -n pmsco \
        pip \
        "numpy>=1.13" \
        scipy \
        ipython \
-        mpi4py \
        matplotlib \
        nose \
        mock \
        future \
        statsmodels \
-        swig
+        swig \
+        gitpython
    conda clean --all -y
-    /usr/local/miniconda3/envs/pmsco/bin/pip install periodictable attrdict fasteners
+    pip install periodictable attrdict commentjson fasteners mpi4py doxypypy

+    mkdir ${PLANTUML_ROOT}
+    wget -O ${PLANTUML_ROOT}/plantuml.jar https://sourceforge.net/projects/plantuml/files/plantuml.jar/download

 #%test
 # test the image after build

 %runscript
-    # executes command from command line
-    source /usr/local/miniconda3/bin/activate pmsco
-    exec echo "$@"
+    . ${CONDA_ROOT}/etc/profile.d/conda.sh
+    conda activate pmsco
+    exec python "$@"

 %apprun install
-    source /usr/local/miniconda3/bin/activate pmsco
+    . ${CONDA_ROOT}/etc/profile.d/conda.sh
+    conda activate pmsco
    cd ~
    git clone https://git.psi.ch/pearl/pmsco.git pmsco
    cd pmsco
-    git checkout develop
+    git checkout master
    git checkout -b ${SINGULAR_BRANCH}

+    make all
+    nosetests -w tests/
+
+%apprun compile
+    . ${CONDA_ROOT}/etc/profile.d/conda.sh
+    conda activate pmsco
    make all
    nosetests
-
-%apprun python
-    source /usr/local/miniconda3/bin/activate pmsco
-    exec python "${@}"
-
-%apprun conda
-    source /usr/local/miniconda3/bin/activate pmsco
-    exec conda "${@}"
-
--- a/extras/vagrant/Vagrantfile
+++ b/extras/vagrant/Vagrantfile
@ -12,8 +12,8 @@ Vagrant.configure("2") do |config|

  # Every Vagrant development environment requires a box. You can search for
  # boxes at https://vagrantcloud.com/search.
-  config.vm.box = "singularityware/singularity-2.4"
-  config.vm.box_version = "2.4"
+  config.vm.box = "sylabs/singularity-3.7-ubuntu-bionic64"
+  config.vm.box_version = "3.7"

  # Disable automatic box update checking. If you disable this, then
  # boxes will only be checked for updates when the user runs
--- a/4
+++ b/4
@ -40,9 +40,9 @@ SHELL=/bin/sh
 PMSCO_DIR = pmsco
 DOCS_DIR = docs

-all: edac loess docs
+all: edac loess phagen docs

-bin: edac loess
+bin: edac loess phagen

 edac loess msc mufpot phagen:
 	$(MAKE) -C $(PMSCO_DIR)
--- a/pmsco/main.py
+++ b/pmsco/main.py
@ -8,16 +8,13 @@ python pmsco [pmsco-arguments]
@endverbatim
 """

-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
+from pathlib import Path
 import sys
-import os.path

-file_dir = os.path.dirname(__file__) or '.'
-root_dir = os.path.join(file_dir, '..')
-root_dir = os.path.abspath(root_dir)
-sys.path[0] = root_dir
+pmsco_root = Path(__file__).resolve().parent.parent
+if str(pmsco_root) not in sys.path:
+    sys.path.insert(0, str(pmsco_root))
+

 if __name__ == '__main__':
    import pmsco.pmsco
--- a/pmsco/calculators/calculator.py
+++ b/pmsco/calculators/calculator.py
@ -42,7 +42,7 @@ class Calculator(object):
        or <code>output_file + '.etpai'</code> depending on scan mode.
        all other intermediate files are deleted unless keep_temp_files is True.

-        @param params: a pmsco.project.Params object with all necessary values except cluster and output files set.
+        @param params: a pmsco.project.CalculatorParams object with all necessary values except cluster and output files set.

        @param cluster: a pmsco.cluster.Cluster(format=FMT_EDAC) object with all atom positions set.

--- a/pmsco/calculators/edac.py
+++ b/pmsco/calculators/edac.py
@ -49,7 +49,7 @@ class EdacCalculator(calculator.Calculator):

        if alpha is defined, theta is implicitly set to normal emission! (to be generalized)

-        @param params: a pmsco.project.Params object with all necessary values except cluster and output files set.
+        @param params: a pmsco.project.CalculatorParams object with all necessary values except cluster and output files set.

        @param scan: a pmsco.project.Scan() object describing the experimental scanning scheme.

@ -173,7 +173,7 @@ class EdacCalculator(calculator.Calculator):
            f.write(" ".join(format(order, "d") for order in params.orders) + "\n")
            f.write("emission angle window {0:F}\n".format(params.angular_resolution / 2.0))

-            # scattering factor output (see project.Params.phase_output_classes)
+            # scattering factor output (see project.CalculatorParams.phase_output_classes)
            if params.phase_output_classes is not None:
                fn = "{0}.clu".format(params.output_file)
                f.write("cluster output l(A) {fn}\n".format(fn=fn))
@ -197,7 +197,7 @@ class EdacCalculator(calculator.Calculator):
        """
        run EDAC with the given parameters and cluster.

-        @param params: a pmsco.project.Params object with all necessary values except cluster and output files set.
+        @param params: a pmsco.project.CalculatorParams object with all necessary values except cluster and output files set.

        @param cluster: a pmsco.cluster.Cluster(format=FMT_EDAC) object with all atom positions set.

--- a/pmsco/calculators/msc.py
+++ b/pmsco/calculators/msc.py
@ -62,7 +62,7 @@ class MscCalculator(calculator.Calculator):
        """
        run the MSC program with the given parameters and cluster.

-        @param params: a project.Params() object with all necessary values except cluster and output files set.
+        @param params: a project.CalculatorParams() object with all necessary values except cluster and output files set.

        @param cluster: a cluster.Cluster(format=FMT_MSC) object with all atom positions set.

--- a/pmsco/calculators/phagen/makefile
+++ b/pmsco/calculators/phagen/makefile
@ -13,8 +13,9 @@ SHELL=/bin/sh
 .PHONY: all clean phagen

 FC?=gfortran
+FCOPTS?=-std=legacy
 F2PY?=f2py
-F2PYOPTS?=
+F2PYOPTS?=--f77flags=-std=legacy --f90flags=-std=legacy
 CC?=gcc
 CCOPTS?=
 SWIG?=swig
@ -28,7 +29,7 @@ PYTHON_EXT_SUFFIX ?= $(shell ${PYTHON_CONFIG} --extension-suffix)

 all: phagen

-phagen: phagen.exe phagen$(EXT_SUFFIX)
+phagen: phagen.exe phagen$(PYTHON_EXT_SUFFIX)

 phagen.exe: phagen_scf.f msxas3.inc msxasc3.inc
 	$(FC) $(FCOPTS) -o phagen.exe phagen_scf.f
@ -36,7 +37,7 @@ phagen.exe: phagen_scf.f msxas3.inc msxasc3.inc
 phagen.pyf: | phagen_scf.f
 	$(F2PY) -h phagen.pyf -m phagen phagen_scf.f only: libmain

-phagen$(EXT_SUFFIX): phagen_scf.f phagen.pyf msxas3.inc msxasc3.inc
+phagen$(PYTHON_EXT_SUFFIX): phagen_scf.f phagen.pyf msxas3.inc msxasc3.inc
 	$(F2PY) -c $(F2PYOPTS) -m phagen phagen.pyf phagen_scf.f

 clean:
--- a/pmsco/calculators/phagen/runner.py
+++ b/pmsco/calculators/phagen/runner.py
@ -61,7 +61,7 @@ class PhagenCalculator(AtomicCalculator):
        because PHAGEN generates a lot of files with hard-coded names,
        the function creates a temporary directory for PHAGEN and deletes it before returning.

-        @param params: pmsco.project.Params object.
+        @param params: pmsco.project.CalculatorParams object.
            the phase_files attribute is updated with the paths of the scattering files.

        @param cluster: pmsco.cluster.Cluster object.
@ -76,6 +76,8 @@ class PhagenCalculator(AtomicCalculator):
        @return (None, dict) where dict is a list of output files with their category.
            the category is "atomic" for all output files.
        """
+        assert cluster.get_emitter_count() == 1, "PHAGEN cannot handle more than one emitter at a time"
+
        transl = Translator()
        transl.params.set_params(params)
        transl.params.set_cluster(cluster)
@ -132,6 +134,14 @@ class PhagenCalculator(AtomicCalculator):
                except IOError:
                    logger.error("error loading phagen cluster file {fi}".format(fi=clufile))

+                try:
+                    listfile = outfile + ".list"
+                    report_listfile = os.path.join(prev_wd, output_file + ".phagen.list")
+                    shutil.copy(listfile, report_listfile)
+                    files[report_listfile] = "log"
+                except IOError:
+                    logger.error("error loading phagen list file {fi}".format(fi=listfile))
+
        finally:
            os.chdir(prev_wd)

--- a/pmsco/calculators/phagen/translator.py
+++ b/pmsco/calculators/phagen/translator.py
@ -19,6 +19,7 @@ from __future__ import print_function

 import numpy as np

+from pmsco.cluster import Cluster
 from pmsco.compat import open

 ## rydberg energy in electron volts
@ -72,7 +73,7 @@ class TranslationParams(object):
        """
        set the translation parameters.

-        @param params: a pmsco.project.Params object or
+        @param params: a pmsco.project.CalculatorParams object or
                       a dictionary containing some or all public fields of this class.
        @return: None
        """
@ -125,6 +126,44 @@ class Translator(object):
    6. call write_edac_scattering to produce the EDAC scattering matrix files.
    7. call write_edac_emission to produce the EDAC emission matrix file.
    """
+
+    ## @var params
+    #
+    # project parameters needed for translation.
+    #
+    # fill the attributes of this object before using any translator methods.
+
+    ## @var scattering
+    #
+    # t-matrix storage
+    #
+    # the t-matrix is stored in a flat, one-dimensional numpy structured array consisting of the following fields:
+    # @arg e (float) energy (eV)
+    # @arg a (int) atom index (1-based)
+    # @arg l (int) angular momentum quantum number l
+    # @arg t (complex) scattering matrix element, t = exp(-i * delta) * sin delta
+    #
+    # @note PHAGEN uses the convention t = exp(-i * delta) * sin delta,
+    # whereas EDAC uses t = exp(i * delta) * sin delta (complex conjugate).
+    # this object stores the t-matrix according to the PHAGEN convention.
+    # the conversion to the EDAC convention occurs in write_edac_scattering_file().
+
+    ## @var emission
+    #
+    # radial matrix element storage
+    #
+    # the radial matrix elements are stored in a flat, one-dimensional numpy structured array
+    # consisting of the following fields:
+    # @arg e (float) energy (eV)
+    # @arg dw (complex) matrix element for the transition to l-1
+    # @arg up (complex) matrix element for the transition to l+1
+
+    ## @var cluster
+    #
+    # cluster object for PHAGEN
+    #
+    # this object is created by translate_cluster().
+
    def __init__(self):
        """
        initialize the object instance.
@ -134,18 +173,33 @@ class Translator(object):
        self.scattering = np.empty(0, dtype=dt)
        dt = [('e', 'f4'), ('dw', 'c16'), ('up', 'c16')]
        self.emission = np.empty(0, dtype=dt)
+        self.cluster = None
+
+    def translate_cluster(self):
+        """
+        translate the cluster into a form suitable for PHAGEN.
+
+        specifically, move the (first and hopefully only) emitter to the first atom position.
+
+        the method copies the cluster from self.params into a new object
+        and stores it under self.cluster.
+
+        @return: None
+        """
+        self.cluster = Cluster()
+        self.cluster.copy_from(self.params.cluster)
+        ems = self.cluster.get_emitters(['i'])
+        self.cluster.move_to_first(idx=ems[0][0]-1)

    def write_cluster(self, f):
        """
        write the cluster section of the PHAGEN input file.

-        requires a valid pmsco.cluster.Cluster in self.params.cluster.
-
        @param f: file or output stream (an object with a write method)

        @return: None
        """
-        for atom in self.params.cluster.data:
+        for atom in self.cluster.data:
            d = {k: atom[k] for k in atom.dtype.names}
            f.write("{s} {t} {x} {y} {z}\n".format(**d))
        f.write("-1 -1 0. 0. 0.\n")
@ -163,7 +217,7 @@ class Translator(object):

        @return: None
        """
-        data = self.params.cluster.data
+        data = self.cluster.data
        elements = np.unique(data['t'])
        for element in elements:
            idx = np.where(data['t'] == element)
@ -181,29 +235,34 @@ class Translator(object):
        @return: None
        """
        phagen_params = {}
+
+        self.translate_cluster()
+        phagen_params['absorber'] = 1
        phagen_params['emin'] = self.params.kinetic_energies.min() / ERYDBERG
        phagen_params['emax'] = self.params.kinetic_energies.max() / ERYDBERG
+        if self.params.kinetic_energies.shape[0] > 1:
            phagen_params['delta'] = (phagen_params['emax'] - phagen_params['emin']) / \
                                     (self.params.kinetic_energies.shape[0] - 1)
-        if phagen_params['delta'] < 0.0001:
+        else:
            phagen_params['delta'] = 0.1
-        phagen_params['edge'] = state_to_edge(self.params.initial_state)  # possibly not used
+        phagen_params['edge'] = state_to_edge(self.params.initial_state)
        phagen_params['edge1'] = 'm4'  # auger not supported
        phagen_params['edge2'] = 'm4'  # auger not supported
        phagen_params['cip'] = self.params.binding_energy / ERYDBERG
        if phagen_params['cip'] < 0.001:
            raise ValueError("binding energy parameter is zero.")

-        if np.sum(np.abs(self.params.cluster.data['q']) >= 0.001) > 0:
+        if np.sum(np.abs(self.cluster.data['q'])) > 0.:
            phagen_params['ionzst'] = 'ionic'
        else:
            phagen_params['ionzst'] = 'neutral'

-        if hasattr(f, "write"):
+        if hasattr(f, "write") and callable(f.write):
            f.write("&job\n")
            f.write("calctype='xpd',\n")
            f.write("coor='angs',\n")
            f.write("cip={cip},\n".format(**phagen_params))
+            f.write("absorber={absorber},\n".format(**phagen_params))
            f.write("edge='{edge}',\n".format(**phagen_params))
            f.write("edge1='{edge1}',\n".format(**phagen_params))
            f.write("edge2='{edge2}',\n".format(**phagen_params))
@ -254,13 +313,18 @@ class Translator(object):
        @arg l angular momentum quantum number l
        @arg t complex scattering matrix element

+        @note PHAGEN uses the convention t = exp(-i * delta) * sin delta,
+        whereas EDAC uses t = exp(i * delta) * sin delta (complex conjugate).
+        this class stores the t-matrix according to the PHAGEN convention.
+        the conversion to the EDAC convention occurs in write_edac_scattering_file().
+
        @param f: file or path (any file-like or path-like object that can be passed to numpy.genfromtxt).

        @return: None
        """
        dt = [('e', 'f4'), ('x1', 'f4'), ('x2', 'f4'), ('na', 'i4'), ('nl', 'i4'),
              ('tr', 'f8'), ('ti', 'f8'), ('ph', 'f4')]
-        data = np.genfromtxt(f, dtype=dt)
+        data = np.atleast_1d(np.genfromtxt(f, dtype=dt))

        self.scattering = np.resize(self.scattering, data.shape)
        scat = self.scattering
@ -308,7 +372,7 @@ class Translator(object):

        @return: None
        """
-        if hasattr(f, "write"):
+        if hasattr(f, "write") and callable(f.write):
            energies = np.unique(scat['e'])
            ne = energies.shape[0]
            lmax = scat['l'].max()
@ -323,7 +387,7 @@ class Translator(object):
                if ne > 1:
                    f.write("{0:.3f} ".format(energy))
                for item in energy_scat:
-                    f.write(" {0:.6f} {1:.6f}".format(item['t'].real, item['t'].imag))
+                    f.write(" {0:.6f} {1:.6f}".format(item['t'].real, -item['t'].imag))
                for i in range(len(energy_scat), lmax + 1):
                    f.write(" 0 0")
                f.write("\n")
@ -341,7 +405,7 @@ class Translator(object):

        @return: None
        """
-        if hasattr(f, "write"):
+        if hasattr(f, "write") and callable(f.write):
            energies = np.unique(scat['e'])
            ne = energies.shape[0]
            lmax = scat['l'].max()
@ -356,7 +420,8 @@ class Translator(object):
                if ne > 1:
                    f.write("{0:.3f} ".format(energy))
                for item in energy_scat:
-                    f.write(" {0:.6f}".format(np.angle(item['t'])))
+                    pha = np.sign(item['t'].real) * np.arcsin(np.sqrt(np.abs(item['t'].imag)))
+                    f.write(" {0:.6f}".format(pha))
                for i in range(len(energy_scat), lmax + 1):
                    f.write(" 0")
                f.write("\n")
@ -373,7 +438,7 @@ class Translator(object):
        @return: None
        """
        dt = [('ar', 'f8'), ('ai', 'f8'), ('br', 'f8'), ('bi', 'f8')]
-        data = np.genfromtxt(f, dtype=dt)
+        data = np.atleast_1d(np.genfromtxt(f, dtype=dt))

        self.emission = np.resize(self.emission, data.shape)
        emission = self.emission
@ -390,7 +455,7 @@ class Translator(object):

        @return: None
        """
-        if hasattr(f, "write"):
+        if hasattr(f, "write") and callable(f.write):
            l0 = self.params.l_init
            energies = self.params.kinetic_energies
            emission = self.emission
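
The note on the two t-matrix conventions can be checked numerically. The following is a minimal sketch, not code from the repository: `delta` is a hypothetical phase shift, and the phase expression is copied from the phase output hunk above, where it recovers the phase shift for |delta| < pi/2.

```python
import numpy as np

delta = 0.3  # hypothetical phase shift in radians, |delta| < pi/2

# PHAGEN convention: t = exp(-i * delta) * sin(delta)
t_phagen = np.exp(-1j * delta) * np.sin(delta)

# EDAC convention is the complex conjugate: t = exp(i * delta) * sin(delta)
t_edac = np.conj(t_phagen)

# phase recovered from the stored (PHAGEN-convention) matrix element,
# using the same expression as in the diff:
# real part = cos(delta) * sin(delta), imaginary part = -sin(delta)**2
pha = np.sign(t_phagen.real) * np.arcsin(np.sqrt(np.abs(t_phagen.imag)))

print(t_phagen, t_edac, pha)
```

Writing `-item['t'].imag` in the scattering file is the same conjugation applied to each matrix element.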
--- a/pmsco/cluster.py
+++ b/pmsco/cluster.py
@ -17,22 +17,20 @@ pip install --user periodictable

@author Matthias Muntwiler

-@copyright (c) 2015-19 by Paul Scherrer Institut @n
+@copyright (c) 2015-21 by Paul Scherrer Institut @n
 Licensed under the Apache License, Version 2.0 (the "License"); @n
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
  http://www.apache.org/licenses/LICENSE-2.0
 """

-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
 import math
 import numpy as np
 import periodictable as pt
 import sys

+import pmsco.config as config
+
 ## default file format identifier
 FMT_DEFAULT = 0
 ## MSC file format identifier
@ -54,14 +52,14 @@ if sys.version_info[0] >= 3:
 else:
    _SYMBOL_TYPE = 'S2'

-## numpy.array datatype of Cluster.data array
-DTYPE_CLUSTER_INTERNAL = [('i', 'i4'), ('t', 'i4'), ('s', _SYMBOL_TYPE), ('x', 'f4'), ('y', 'f4'), ('z', 'f4'),
-                          ('e', 'u1'), ('q', 'f4'), ('c', 'i4')]
-## file format of internal Cluster.data array
+## numpy.array datatype of internal Cluster.data array
+DTYPE_CLUSTER_INTERNAL = [('i', 'i4'), ('t', 'i4'), ('s', _SYMBOL_TYPE), ('c', 'i4'),
+                          ('x', 'f4'), ('y', 'f4'), ('z', 'f4'), ('e', 'u1'), ('q', 'f4')]
+## string formatting of native file format
 FMT_CLUSTER_INTERNAL = ["%5u", "%2u", "%s", "%5u", "%7.3f", "%7.3f", "%7.3f", "%1u", "%7.3f"]
-## field (column) names of internal Cluster.data array
+## field (column) names of native file format
 FIELDS_CLUSTER_INTERNAL = ['i', 't', 's', 'c', 'x', 'y', 'z', 'e', 'q']
-## column names for export
+## column names of native file format
 NAMES_CLUSTER_INTERNAL = {'i': 'index', 't': 'element', 's': 'symbol', 'c': 'class', 'x': 'x', 'y': 'y', 'z': 'z',
                          'e': 'emitter', 'q': 'charge'}

@ -178,12 +176,12 @@ class Cluster(object):
    #       @arg @c 'i' (int) atom index (1-based)
    #       @arg @c 't' (int) atom type (chemical element number)
    #       @arg @c 's' (string) chemical element symbol
+    #       @arg @c 'c' (int) scatterer class
    #       @arg @c 'x' (float32) x coordinate of the atom position
    #       @arg @c 'y' (float32) y coordinate of the atom position
    #       @arg @c 'z' (float32) z coordinate of the atom position
    #       @arg @c 'e' (uint8)   1 = emitter, 0 = regular atom
    #       @arg @c 'q' (float32) charge/ionicity
-    #       @arg @c 'c' (int) scatterer class

    ##  @var comment (str)
    #   one-line comment that can be included in some cluster files
@ -227,13 +225,13 @@ class Cluster(object):
        """
        self.rmax = r

-    def build_element(self, index, element_number, x, y, z, emitter, charge=0., scatterer=0):
+    def build_element(self, index, element, x, y, z, emitter, charge=0., scatterer_class=0):
        """
        build a tuple in the format of the internal data array.
        
        @param index: (int) index
        
-        @param element_number: (int) chemical element number
+        @param element: chemical element number (int) or symbol (str)
        
        @param x, y, z: (float) atom coordinates in the cluster
        
@ -241,17 +239,23 @@ class Cluster(object):

        @param charge: (float) ionicity. default = 0

-        @param scatterer: (int) scatterer class. default = 0.
+        @param scatterer_class: (int) scatterer class. default = 0.
        """
+        try:
+            element_number = int(element)
            symbol = pt.elements[element_number].symbol
-        element = (index, element_number, symbol, x, y, z, int(emitter), charge, scatterer)
+        except ValueError:
+            symbol = element
+            element_number = pt.elements.symbol(symbol.strip()).number
+
+        element = (index, element_number, symbol, scatterer_class, x, y, z, int(emitter), charge)
        return element

    def add_atom(self, atomtype, v_pos, is_emitter=False, charge=0.):
        """
        add a single atom to the cluster.
        
-        @param atomtype: (int) chemical element number
+        @param atomtype: chemical element number (int) or symbol (str)
        
        @param v_pos: (numpy.ndarray, shape = (3)) position vector
        
@ -274,7 +278,7 @@ class Cluster(object):
        self.rmax (maximum distance from the origin).
        all atoms are non-emitters.
        
-        @param atomtype: (int) chemical element number
+        @param atomtype: chemical element number (int) or symbol (str)
        
        @param v_pos: (numpy.ndarray, shape = (3))
            position vector of the first atom (basis vector)
@ -284,17 +288,18 @@ class Cluster(object):
        """
        r_great = max(self.rmax, np.linalg.norm(v_pos))
        n0 = self.data.shape[0] + 1
-        n1 = max(int(r_great / np.linalg.norm(v_lat1)) + 1, 3) * 2
-        n2 = max(int(r_great / np.linalg.norm(v_lat2)) + 1, 3) * 2
-        nn = 0
-        buf = np.empty((2 * n1 + 1) * (2 * n2 + 1), dtype=self.dtype)
-        for i1 in range(-n1, n1 + 1):
-            for i2 in range(-n2, n2 + 1):
-                v = v_pos + v_lat1 * i1 + v_lat2 * i2
-                if np.linalg.norm(v) <= self.rmax:
-                    buf[nn] = self.build_element(nn + n0, atomtype, v[0], v[1], v[2], 0)
-                    nn += 1
-        buf = np.resize(buf, nn)
+        n1 = max(int(r_great / np.linalg.norm(v_lat1)) + 1, 4) * 3
+        n2 = max(int(r_great / np.linalg.norm(v_lat2)) + 1, 4) * 3
+        idx = np.mgrid[-n1:n1+1, -n2:n2+1]
+        idx = idx.reshape(idx.shape[0], -1)
+        lat = np.array([v_lat1, v_lat2])
+        v = v_pos + np.matmul(idx.T, lat)
+        rsq = np.sum(np.square(v), axis=-1)
+        b1 = rsq <= self.rmax**2
+        sel = b1.nonzero()[0]
+        buf = np.empty((len(sel)), dtype=self.dtype)
+        for nn, ii in enumerate(sel):
+            buf[nn] = self.build_element(nn + n0, atomtype, v[ii, 0], v[ii, 1], v[ii, 2], 0)
        self.data = np.append(self.data, buf)

    def add_bulk(self, atomtype, v_pos, v_lat1, v_lat2, v_lat3, z_surf=0.0):
@ -306,7 +311,7 @@ class Cluster(object):
        and z_surf (position of the surface).
        all atoms are non-emitters.

-        @param atomtype: (int) chemical element number
+        @param atomtype: chemical element number (int) or symbol (str)
        
        @param v_pos: (numpy.ndarray, shape = (3))
            position vector of the first atom (basis vector)
@ -322,16 +327,18 @@ class Cluster(object):
        n1 = max(int(r_great / np.linalg.norm(v_lat1)) + 1, 4) * 3
        n2 = max(int(r_great / np.linalg.norm(v_lat2)) + 1, 4) * 3
        n3 = max(int(r_great / np.linalg.norm(v_lat3)) + 1, 4) * 3
-        nn = 0
-        buf = np.empty((2 * n1 + 1) * (2 * n2 + 1) * (n3 + 1), dtype=self.dtype)
-        for i1 in range(-n1, n1 + 1):
-            for i2 in range(-n2, n2 + 1):
-                for i3 in range(-n3, n3 + 1):
-                    v = v_pos + v_lat1 * i1 + v_lat2 * i2 + v_lat3 * i3
-                    if np.linalg.norm(v) <= self.rmax and v[2] <= z_surf:
-                        buf[nn] = self.build_element(nn + n0, atomtype, v[0], v[1], v[2], 0)
-                        nn += 1
-        buf = np.resize(buf, nn)
+        idx = np.mgrid[-n1:n1+1, -n2:n2+1, -n3:n3+1]
+        idx = idx.reshape(idx.shape[0], -1)
+        lat = np.array([v_lat1, v_lat2, v_lat3])
+        v = v_pos + np.matmul(idx.T, lat)
+        rsq = np.sum(np.square(v), axis=-1)
+        b1 = rsq <= self.rmax**2
+        b2 = v[:, 2] <= z_surf
+        ba = np.all([b1, b2], axis=0)
+        sel = ba.nonzero()[0]
+        buf = np.empty((len(sel)), dtype=self.dtype)
+        for nn, ii in enumerate(sel):
+            buf[nn] = self.build_element(nn + n0, atomtype, v[ii, 0], v[ii, 1], v[ii, 2], 0)
        self.data = np.append(self.data, buf)

    def add_cluster(self, cluster, check_rmax=False, check_unique=False, tol=0.001):
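
The vectorized lattice generation in add_layer and add_bulk can be tried in isolation. This is a sketch under assumptions: `lattice_points` and its parameter `n` (half-width of the index grid) are hypothetical names, and the function returns plain coordinates instead of filling the structured Cluster.data array.

```python
import numpy as np


def lattice_points(v_pos, v_lat1, v_lat2, rmax, n=4):
    """return all lattice positions within rmax of the origin (hypothetical helper)."""
    # integer index pairs (i1, i2), flattened to shape (2, (2n+1)**2)
    idx = np.mgrid[-n:n + 1, -n:n + 1]
    idx = idx.reshape(idx.shape[0], -1)
    # lattice vectors as rows, shape (2, 3)
    lat = np.array([v_lat1, v_lat2])
    # one position per index pair, shape (N, 3)
    v = v_pos + np.matmul(idx.T, lat)
    # compare squared distances to avoid a square root per atom
    rsq = np.sum(np.square(v), axis=-1)
    return v[rsq <= rmax ** 2]


points = lattice_points(np.zeros(3),
                        np.array([1., 0., 0.]),
                        np.array([0., 1., 0.]),
                        rmax=1.0)
print(points)
```

For a square lattice with unit spacing and rmax = 1, the selection keeps the origin and its four nearest neighbours.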
@ -426,15 +433,47 @@ class Cluster(object):
        idx = np.where(b_all)
        self.data['z'][idx] += z_shift

-        return idx
+        return idx[0]

-    def translate(self, vector, element=0):
+    def get_center(self, element=None):
+        """
+        get the geometric center of the cluster or a class of atoms.
+
+        @param element: chemical element number (int) or symbol (str)
+            if only atoms of a specific element should be considered.
+            by default (element == None or 0 or ""),
+            all atoms are included in the calculation.
+
+        @return: (numpy.ndarray) 3-dimensional vector.
+        """
+
+        if element:
+            try:
+                sel = self.data['t'] == int(element)
+            except ValueError:
+                sel = self.data['s'] == element
+        else:
+            sel = np.ones_like(self.data['t'])
+        idx = np.where(sel)
+        center = np.zeros(3)
+        center[0] = np.mean(self.data['x'][idx])
+        center[1] = np.mean(self.data['y'][idx])
+        center[2] = np.mean(self.data['z'][idx])
+        return center
+
+    def translate(self, vector, element=None):
        """
        translate the cluster or all atoms of a specified element.

+        translation shifts each selected atom by the given vector.
+
        @param vector: (numpy.ndarray) 3-dimensional displacement vector.
-        @param element: (int) chemical element number if atoms of a specific element should be affected.
-            by default (element = 0), all atoms are moved.
+
+        @param element: chemical element number (int) or symbol (str)
+            if only atoms of a specific element should be affected.
+            by default (element == None or 0 or ""),
+            all atoms are translated.
+
        @return: (numpy.ndarray) indices of the atoms that have been shifted.
        """
        if element:
@ -449,7 +488,7 @@ class Cluster(object):
        self.data['y'][idx] += vector[1]
        self.data['z'][idx] += vector[2]

-        return idx
+        return idx[0]

    def matrix_transform(self, matrix):
        """
@ -461,47 +500,49 @@ class Cluster(object):
        
        @return: None 
        """
-        for atom in self.data:
-            v = np.matrix([atom['x'], atom['y'], atom['z']])
-            w = matrix * v.transpose()
-            atom['x'] = float(w[0])
-            atom['y'] = float(w[1])
-            atom['z'] = float(w[2])
+        pos = np.empty((3, self.data.shape[0]), np.float32)
+        pos[0, :] = self.data['x']
+        pos[1, :] = self.data['y']
+        pos[2, :] = self.data['z']
+        pos = np.matmul(matrix, pos)
+        self.data['x'] = pos[0, :]
+        self.data['y'] = pos[1, :]
+        self.data['z'] = pos[2, :]

    def rotate_x(self, angle):
        """
-        rotate cluster about the surface normal axis
+        rotate cluster about the x-axis

        @param angle (float) in degrees
        """
        angle = math.radians(angle)
        s = math.sin(angle)
        c = math.cos(angle)
-        matrix = np.matrix([[1, 0, 0], [0, c, -s], [0, s, c]])
+        matrix = np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
        self.matrix_transform(matrix)

    def rotate_y(self, angle):
        """
-        rotate cluster about the surface normal axis
+        rotate cluster about the y-axis

        @param angle (float) in degrees
        """
        angle = math.radians(angle)
        s = math.sin(angle)
        c = math.cos(angle)
-        matrix = np.matrix([[c, 0, s], [0, 1, 0], [-s, 0, c]])
+        matrix = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
        self.matrix_transform(matrix)

    def rotate_z(self, angle):
        """
-        rotate cluster about the surface normal axis
+        rotate cluster about the z-axis (surface normal)

        @param angle (float) in degrees
        """
        angle = math.radians(angle)
        s = math.sin(angle)
        c = math.cos(angle)
-        matrix = np.matrix([[c, -s, 0], [s, c, 0], [0, 0, 1]])
+        matrix = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
        self.matrix_transform(matrix)

    def find_positions(self, pos, tol=0.001):
@ -794,6 +835,53 @@ class Cluster(object):
        idx = self.data['e'] != 0
        return np.sum(idx)

+    def calc_scattering_angles(self, index_emitter, radius):
+        """
+        calculate forward-scattering angles of the cluster atoms
+
+        for each atom within a given radius of the emitter,
+        the connecting vector between emitter and scatterer is calculated
+        and returned in cartesian and polar coordinates.
+
+        @param index_emitter: atom index of the emitter.
+            all angles are calculated with respect to this atom.
+
+        @param radius: include only atoms within this radius of the emitter.
+
+        @note back-scattering angles can be obtained by inverting the angle on the unit sphere:
+            th' = 180 - th, ph' = ph + 180 (modulo 360).
+
+        @return dictionary with results.
+            each item is a numpy.ndarray with one entry per scatterer,
+            where N is the number of scatterers.
+            dict['xyz'] has shape (N, 3), all other items have shape (N,).
+            @arg dict['index']: atom index into the cluster array.
+            @arg dict['xyz']: connecting vector between the emitter and the atom in cartesian coordinates.
+            @arg dict['dist']: distance between the emitter and the atom.
+            @arg dict['polar']: polar angle with respect to the z-axis.
+            @arg dict['azimuth']: azimuthal angle with respect to the x-axis.
+        """
+        # position of emitter atom
+        em = self.data[index_emitter]
+        em = np.asarray((em['x'], em['y'], em['z']))
+
+        # relative positions of scattering atoms
+        xyz = self.get_positions()
+        xyz -= em
+        dist = np.linalg.norm(xyz, axis=1)
+        sel1 = dist <= radius
+        sel2 = dist > 0.
+        idx = np.where(np.all([sel1, sel2], axis=0))
+        xyz = xyz[idx]
+        dist = dist[idx]
+
+        # angles
+        v1 = np.asarray([0, 0, 1])
+        v2 = np.transpose(xyz / dist.reshape((dist.shape[0], 1)))
+        th = np.degrees(np.arccos(np.clip(np.dot(v1, v2), -1., 1.)))
+        ph = np.degrees(np.arctan2(v2[1], v2[0]))
+        return {'index': idx[0], 'xyz': xyz, 'dist': dist, 'polar': th, 'azimuth': ph}
+
    def load_from_file(self, f, fmt=FMT_DEFAULT):
        """
        load a cluster from a file created by the scattering program.
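
The angle calculation in calc_scattering_angles can be illustrated on a few hand-picked vectors. This is a sketch, not the method itself: `direction_angles` is a hypothetical helper that takes the emitter-to-scatterer vectors directly and skips the radius selection.

```python
import numpy as np


def direction_angles(xyz):
    """polar and azimuthal angles (degrees) of connecting vectors, shape (N, 3)."""
    xyz = np.asarray(xyz, dtype=float)
    dist = np.linalg.norm(xyz, axis=1)
    unit = xyz / dist[:, np.newaxis]
    # polar angle with respect to the z-axis; clip guards against rounding errors
    polar = np.degrees(np.arccos(np.clip(unit[:, 2], -1., 1.)))
    # azimuthal angle with respect to the x-axis
    azimuth = np.degrees(np.arctan2(unit[:, 1], unit[:, 0]))
    return dist, polar, azimuth


dist, polar, azimuth = direction_angles([[0., 0., 2.],
                                         [1., 0., 0.],
                                         [0., 1., 1.]])
print(dist, polar, azimuth)
```

The emitter itself has to be excluded beforehand, as the method does with `dist > 0.`, because a zero-length vector cannot be normalized.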
@ -848,7 +936,7 @@ class Cluster(object):
        else:
            raise ValueError("unknown file format {}".format(fmt))

-        data = np.genfromtxt(f, dtype=dtype, skip_header=sh)
+        data = np.atleast_1d(np.genfromtxt(f, dtype=dtype, skip_header=sh))
        if fmt == FMT_PHAGEN_IN and data['t'][-1] < 1:
            data = data[:-1]

@ -920,7 +1008,7 @@ class Cluster(object):
        or left at the default value 0 in which case PMSCO sets the correct values.

        if the scattering factors are loaded from existing files,
-        the atom class corresponds to the key of the pmsco.project.Params.phase_files dictionary.
+        the atom class corresponds to the key of the pmsco.project.CalculatorParams.phase_files dictionary.
        in this case the meaning of the class value is up to the project,
        and the class must be set either by the cluster generator
        or the project's after_atomic_scattering hook.
@ -956,7 +1044,7 @@ class Cluster(object):
        the other cluster must contain the same atoms (same coordinates) in a possibly random order.
        the atoms of this and the other cluster are matched up by sorting them by coordinate.

-        atomic scattering calculators often change the order of atoms in a cluster based on symmetry,
+        atomic scattering calculators often change the order of atoms in a cluster based on domain,
        and return atom classes versus atomic coordinates.
        this method imports the atom classes into the original cluster.

@ -1049,7 +1137,7 @@ class Cluster(object):
        np.savetxt(f, data, fmt=file_format, header=header, comments="")


-class ClusterGenerator(object):
+class ClusterGenerator(config.ConfigurableObject):
    """
    cluster generator class.

@ -1067,13 +1155,14 @@ class ClusterGenerator(object):
        @param project: reference to the project object.
            cluster generators may need to look up project parameters.
        """
+        super().__init__()
        self.project = project

    def count_emitters(self, model, index):
        """
-        return the number of emitter configurations for a particular model, scan and symmetry.
+        return the number of emitter configurations for a particular model, scan and domain.

-        the number of emitter configurations may depend on the model parameters, scan index and symmetry index.
+        the number of emitter configurations may depend on the model parameters, scan index and domain index.
        by default, the method returns 1, which means that there is only one emitter configuration.

        emitter configurations are mainly a way to distribute the calculations to multiple processes
@ -1100,9 +1189,9 @@ class ClusterGenerator(object):

        @param index (named tuple CalcID) calculation index.
            the method should consider only the following attributes:
-            @arg @c scan   scan index (index into Project.scans)
-            @arg @c sym    symmetry index (index into Project.symmetries)
-            @arg @c emit   emitter index must be -1.
+            @arg scan   scan index (index into Project.scans)
+            @arg domain domain index (index into Project.domains)
+            @arg emit   emitter index must be -1.

        @return number of emitter configurations.
            this implementation returns the default value of 1.
@ -1114,23 +1203,23 @@ class ClusterGenerator(object):
        create a Cluster object given the model parameters and calculation index.

        the generated cluster will typically depend on the model parameters.
-        depending on the project, it may also depend on the scan index, symmetry index and emitter index.
+        depending on the project, it may also depend on the scan index, domain index and emitter index.

        the scan index can be used to generate a different cluster for different scan geometry,
        e.g., if some atoms can be excluded due to a longer mean free path.
        if this is not the case for the specific project, the scan index can be ignored.

-        the symmetry index may select a particular domain that has a different atomic arrangement.
-        in this case, depending on the value of index.sym, the function must generate a cluster corresponding
-        to the particular domain/symmetry.
-        the method can ignore the symmetry index if the project defines only one symmetry,
-        or if the symmetry does not correspond to a different atomic structure.
+        the domain index may select a particular domain that has a different atomic arrangement.
+        in this case, depending on the value of index.domain, the function must generate a cluster corresponding
+        to the particular domain.
+        the method can ignore the domain index if the project defines only one domain,
+        or if the domain does not correspond to a different atomic structure.

        the emitter index selects a particular emitter configuration.
        depending on the value of the emitter index, the method must react differently:

        1. if the value is -1, return the full cluster and mark all inequivalent emitter atoms.
-           emitters which are reproduced by a symmetry expansion in combine_emitters() should not be marked.
+           emitters which are reproduced by a domain expansion in combine_emitters() should not be marked.
           the full diffraction scan will be calculated in one calculation.

        2. if the value is greater than or equal to zero, generate the cluster with the emitter configuration
@ -1152,9 +1241,9 @@ class ClusterGenerator(object):

        @param index (named tuple CalcID) calculation index.
            the method should consider only the following attributes:
-            @arg @c scan   scan index (index into Project.scans)
-            @arg @c sym    symmetry index (index into Project.symmetries)
-            @arg @c emit   emitter index.
+            @arg scan   scan index (index into Project.scans)
+            @arg domain domain index (index into Project.domains)
+            @arg emit   emitter index.
                            if -1, generate the full cluster and mark all emitters.
                            if greater than or equal to zero, the value is a zero-based index of the emitter configuration.

@ -1174,7 +1263,7 @@ class LegacyClusterGenerator(ClusterGenerator):
    """

    def __init__(self, project):
-        super(LegacyClusterGenerator, self).__init__(project)
+        super().__init__(project)

    def count_emitters(self, model, index):
        """
--- a/pmsco/config.py
+++ b/pmsco/config.py
@ -0,0 +1,120 @@
+"""
+@package pmsco.config
+infrastructure for configurable objects
+
+@author Matthias Muntwiler
+
+@copyright (c) 2021 by Paul Scherrer Institut @n
+Licensed under the Apache License, Version 2.0 (the "License"); @n
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+"""
+
+import collections.abc
+import functools
+import inspect
+import logging
+from pathlib import Path
+
+logger = logging.getLogger(__name__)
+
+
+def resolve_path(path, dirs):
+    """
+    resolve a file path by replacing placeholders
+
+    placeholders are enclosed in curly braces.
+    values for all possible placeholders are provided in a dictionary.
+
+    @param path: str, Path or other path-like.
+        example: '{work}/test/testfile.dat'.
+    @param dirs: dictionary mapping placeholders to project paths.
+        the paths can be str, Path or other path-like
+        example: {'work': '/home/user/work'}
+    @return: pathlib.Path object
+    """
+    return Path(*(p.format(**dirs) for p in Path(path).parts))
+
+
+class ConfigurableObject(object):
+    """
+    Parent class for objects that can be configured by a run file
+
+    the run file is a JSON file that contains object data in a nested dictionary structure.
+
+    in the dictionary structure the keys are property or attribute names of the object to be initialized.
+    keys starting with a non-alphabetic character (except for some special keys like __class__) are ignored.
+    these can be used as comments, or they protect private attributes.
+
+    the values can be numeric values, strings, lists or dictionaries.
+
+    simple values are simply assigned using setattr.
+    this may call a property setter if defined.
+
+    lists are iterated. each item is appended to the attribute.
+    the attribute must implement an append method in this case.
+
+    if an item is a dictionary and contains the special key '__class__',
+    an object of that class is instantiated and recursively initialized with the dictionary elements.
+    this requires that the class can be found in the module scope passed to the parser methods,
+    and that the class inherits from this class.
+
+    cases that can't be covered easily using this mechanism
+    should be implemented in a property setter.
+    value-checking should also be done in a property setter (or the append method in sequence-like objects).
+    """
+    def __init__(self):
+        pass
+
+    def set_properties(self, module, data_dict, project):
+        """
+        set properties of this class.
+
+        @param module: module reference that should be used to resolve class names.
+            this is usually the project module.
+        @param data_dict: dictionary of properties to set.
+            see the class description for details.
+        @param project: reference to the project object.
+        @return: None
+        """
+        for key in data_dict:
+            if key[0].isalpha():
+                self.set_property(module, key, data_dict[key], project)
+
+    def set_property(self, module, key, value, project):
+        obj = self.parse_object(module, value, project)
+        if hasattr(self, key):
+            if obj is not None:
+                if isinstance(obj, collections.abc.MutableSequence):
+                    attr = getattr(self, key)
+                    for item in obj:
+                        attr.append(item)
+                elif isinstance(obj, collections.abc.Mapping):
+                    d = getattr(self, key)
+                    if d is not None and isinstance(d, collections.abc.MutableMapping):
+                        d.update(obj)
+                    else:
+                        setattr(self, key, obj)
+                else:
+                    setattr(self, key, obj)
+            else:
+                setattr(self, key, obj)
+        else:
+            logger.warning(f"class {self.__class__.__name__} does not have attribute {key}.")
+
+    def parse_object(self, module, value, project):
+        if isinstance(value, collections.abc.MutableMapping) and "__class__" in value:
+            cn = value["__class__"].split('.')
+            c = functools.reduce(getattr, cn, module)
+            s = inspect.signature(c)
+            if 'project' in s.parameters:
+                o = c(project=project)
+            else:
+                o = c()
+            o.set_properties(module, value, project)
+        elif isinstance(value, collections.abc.MutableSequence):
+            o = [self.parse_object(module, i, project) for i in value]
+        else:
+            o = value
+        return o
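
The run file mechanism described in the ConfigurableObject docstring can be sketched without PMSCO. The example below is an illustration under assumptions: `Scan`, `build` and `namespace` are hypothetical stand-ins for a project class, for parse_object/set_properties and for the project module; only `resolve_path` repeats the expression from the diff.

```python
import functools
import types
from pathlib import Path


def resolve_path(path, dirs):
    # same expression as in pmsco/config.py above
    return Path(*(p.format(**dirs) for p in Path(path).parts))


class Scan:
    # hypothetical configurable class, not part of PMSCO
    def __init__(self):
        self.filename = ""
        self.emitter = ""


def build(namespace, data):
    """simplified stand-in for ConfigurableObject.parse_object (assumption)."""
    if isinstance(data, dict) and "__class__" in data:
        # resolve a possibly dotted class name inside the namespace
        cls = functools.reduce(getattr, data["__class__"].split('.'), namespace)
        obj = cls()
        for key, value in data.items():
            # keys starting with a non-alphabetic character are skipped,
            # which covers "__class__" and comment keys
            if key[0].isalpha():
                setattr(obj, key, build(namespace, value))
        return obj
    if isinstance(data, list):
        return [build(namespace, item) for item in data]
    return data


namespace = types.SimpleNamespace(Scan=Scan)
run_data = {"__class__": "Scan",
            "_comment": "ignored by the parser",
            "filename": "{work}/scan.etpi",
            "emitter": "N"}
scan = build(namespace, run_data)
scan_path = resolve_path(scan.filename, {"work": "/home/user/work"})
print(scan_path)
```

Unlike this sketch, the class in the diff appends list items to existing attributes, updates existing dictionaries, and passes the project reference to constructors that accept it.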
--- a/pmsco/data.py
+++ b/pmsco/data.py
@ -117,7 +117,7 @@ def load_plt(filename, int_column=-1):
    data[i]['p'] = phi
    data[i]['i'] = selected intensity column
    """
-    data = np.genfromtxt(filename, usecols=(0, 2, 3, int_column), dtype=DTYPE_ETPI)
+    data = np.atleast_1d(np.genfromtxt(filename, usecols=(0, 2, 3, int_column), dtype=DTYPE_ETPI))
    sort_data(data)
    return data

@ -189,7 +189,7 @@ def load_edac_pd(filename, int_column=-1, energy=0.0, theta=0.0, phi=0.0, fixed_
            logger.warning("unexpected EDAC output file column name")
            break
    cols = tuple(cols)
-    raw = np.genfromtxt(filename, usecols=cols, dtype=dtype, skip_header=2)
+    raw = np.atleast_1d(np.genfromtxt(filename, usecols=cols, dtype=dtype, skip_header=2))

    if fixed_cluster:
        etpi = np.empty(raw.shape, dtype=DTYPE_ETPAI)
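
The np.atleast_1d wrapper added around the genfromtxt calls matters for files that contain a single data row. A minimal sketch with in-memory text instead of a file (the dtype is copied from the emission hunk above, the data row is made up):

```python
import io
import numpy as np

dt = [('ar', 'f8'), ('ai', 'f8'), ('br', 'f8'), ('bi', 'f8')]

# a single data row: genfromtxt returns a zero-dimensional structured array
raw = np.genfromtxt(io.StringIO("1.0 2.0 3.0 4.0\n"), dtype=dt)

# atleast_1d restores the record axis so that indexing and shape-based
# resizing behave the same as for files with several rows
data = np.atleast_1d(raw)

print(raw.shape, data.shape)
```

With several rows, genfromtxt already returns a one-dimensional array and atleast_1d leaves it unchanged.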
--- a/pmsco/database.py
+++ b/pmsco/database.py
@ -29,6 +29,7 @@ import sqlite3
 import fasteners
 import numpy as np
 import pmsco.dispatch as dispatch
+from pmsco.helpers import BraceMessage as BMsg

 logger = logging.getLogger(__name__)

@ -60,7 +61,7 @@ DB_SPECIAL_PARAMS = {"job_id": "_db_job",
                     "result_id": "_db_result",
                     "model": "_model",
                     "scan": "_scan",
-                     "sym": "_sym",
+                     "domain": "_domain",
                     "emit": "_emit",
                     "region": "_region",
                     "gen": "_gen",
@ -77,7 +78,7 @@ DB_SPECIAL_NUMPY_TYPES = {"job_id": "i8",
                          "result_id": "i8",
                          "model": "i8",
                          "scan": "i8",
-                          "sym": "i8",
+                          "domain": "i8",
                          "emit": "i8",
                          "region": "i8",
                          "gen": "i8",
@ -259,7 +260,7 @@ class ResultsDatabase(object):
    sql_select_model = """select id, job_id, model, gen, particle
        from Models where id=:id"""
    sql_select_model_model = """select id, job_id, model, gen, particle
-        from Models where model=:model"""
+        from Models where job_id=:job_id and model=:model"""
    sql_select_model_job = """select id, job_id, model, gen, particle
        from Models where job_id=:job_id"""
    sql_delete_model = """delete from Models where model_id = :model_id"""
@ -268,7 +269,7 @@ class ResultsDatabase(object):
        `id` INTEGER PRIMARY KEY,
        `model_id` INTEGER,
        `scan` integer,
-        `sym` integer,
+        `domain` integer,
        `emit` integer,
        `region` integer,
        `rfac` REAL,
@ -276,22 +277,29 @@ class ResultsDatabase(object):
        )"""
    sql_index_results_tasks = """create index if not exists 
        `index_results_tasks` ON `Results` 
-        (`model_id`, `scan`,`sym`,`emit`,`region`)"""
+        (`model_id`, `scan`,`domain`,`emit`,`region`)"""
    sql_drop_index_results_tasks = "drop index if exists index_results_tasks"
    sql_index_results_models = """create index if not exists 
        `index_results_models` ON `Results` 
        (`id`, `model_id`)"""
    sql_drop_index_results_models = "drop index if exists index_results_models"
-    sql_insert_result = """insert into Results(model_id, scan, sym, emit, region, rfac)
-        values (:model_id, :scan, :sym, :emit, :region, :rfac)"""
+    sql_insert_result = """insert into Results(model_id, scan, domain, emit, region, rfac)
+        values (:model_id, :scan, :domain, :emit, :region, :rfac)"""
    sql_update_result = """update Results
        set rfac=:rfac
        where id=:result_id"""
-    sql_select_result = """select id, model_id, scan, sym, emit, region, rfac
+    sql_select_result = """select id, model_id, scan, domain, emit, region, rfac
        from Results where id=:id"""
-    sql_select_result_index = """select id, model_id, scan, sym, emit, region, rfac
-        from Results where model_id=:model_id and scan=:scan and sym=:sym and emit=:emit and region=:region"""
+    sql_select_result_index = """select id, model_id, scan, domain, emit, region, rfac
+        from Results where model_id=:model_id and scan=:scan and domain=:domain and emit=:emit and region=:region"""
    sql_delete_result = """delete from Results where id = :result_id"""
+    sql_view_results_models = """create view if not exists `ViewResultsModels` as 
+        select project_id, job_id, model_id, Results.id as result_id, rfac, model, scan, domain, emit, region 
+        from Models 
+        join Results on Results.model_id = Models.id 
+        join Jobs on Jobs.id = Models.job_id 
+        order by project_id, job_id, rfac, model, scan, domain, emit, region 
+        """

    sql_create_params = """CREATE TABLE IF NOT EXISTS `Params` (
        `id` INTEGER PRIMARY KEY,
@ -422,16 +430,6 @@ class ResultsDatabase(object):
    # @var _lock_filename (str).
    # path and name of the lock file or an empty string if no locking is used.

-    # @var _lock (obj).
-    # context manager which provides a locking mechanism for the database.
-    #
-    # this is either a fasteners.InterprocessLock or _DummyLock.
-    # InterprocessLock allows to serialize access to the database by means of a lock file.
-    # _DummyLock is used with an in-memory database which does not require locking.
-    #
-    # @note InterprocessLock is re-usable but not re-entrant.
-    # Be careful not to nest contexts when calling other methods from within this class!
-
    def __init__(self):
        self._conn = None
        self._db_filename = ""
@ -440,7 +438,6 @@ class ResultsDatabase(object):
        self._model_params = {}
        self._tags = {}
        self._lock_filename = ""
-        self._lock = None

    def connect(self, db_filename, lock_filename=""):
        """
@ -469,14 +466,10 @@ class ResultsDatabase(object):
            self._lock_filename = ""
        else:
            self._lock_filename = db_filename + ".lock"
-        if self._lock_filename:
-            self._lock = fasteners.InterProcessLock(self._lock_filename)
-        else:
-            self._lock = _DummyLock()

        self._conn = sqlite3.connect(self._db_filename)
        self._conn.row_factory = sqlite3.Row
-        with self._lock:
+        with self.lock():
            self._conn.execute("PRAGMA foreign_keys = 1")
            self._conn.commit()
            c = self._conn.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='Models'")
@ -496,7 +489,6 @@ class ResultsDatabase(object):
        if self._conn is not None:
            self._conn.close()
        self._conn = None
-        self._lock = None

    def check_connection(self):
        """
@ -511,9 +503,25 @@ class ResultsDatabase(object):

        @raise AssertionError if the connection is not valid.
        """
-        assert self._lock is not None, "database not connected"
        assert self._conn is not None, "database not connected"

+    def lock(self):
+        """
+        create a file-lock context manager for the database.
+
+        this is either a fasteners.InterProcessLock object on self._lock_filename
+        or a _DummyLock object if the database is in memory.
+        InterProcessLock serializes access to the database by means of a lock file.
+        this is necessary if multiple pmsco instances require access to the same database.
+        _DummyLock is used with an in-memory database which does not require locking.
+
+        the lock object can be used as context-manager in a with statement.
+        """
+        if self._lock_filename:
+            return fasteners.InterProcessLock(self._lock_filename)
+        else:
+            return _DummyLock()
+
    def create_schema(self):
        """
        create the database schema (tables and indices).
@ -525,7 +533,7 @@ class ResultsDatabase(object):
        @return: None
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            self._conn.execute(self.sql_create_projects)
            self._conn.execute(self.sql_create_jobs)
            self._conn.execute(self.sql_create_models)
@ -539,19 +547,23 @@ class ResultsDatabase(object):
            self._conn.execute(self.sql_index_paramvalues)
            self._conn.execute(self.sql_index_jobtags)
            self._conn.execute(self.sql_index_models)
+            self._conn.execute(self.sql_view_results_models)

    def register_project(self, name, code):
        """
        register a project with the database.

        @param name: name of the project. alphanumeric characters only. no spaces or special characters!
+            if a project of the same name exists in the database,
+            the id of the existing entry is returned.
+            the existing entry is not modified.

        @param code: name of the pmsco module that defines the project.

        @return: id value of the project in the database.
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            c = self._conn.execute(self.sql_select_project_name, {'name': name})
            v = c.fetchone()
            if v:
@ -574,7 +586,7 @@ class ResultsDatabase(object):
        @return None
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            param_dict = {'project_id': project_id}
            self._conn.execute(self.sql_delete_project, param_dict)

@ -585,7 +597,11 @@ class ResultsDatabase(object):

        @param project_id: identifier of the project. see register_project().

-        @param name: name of the job. up to the user, must be unique within a project.
+        @param name: name of the job. alphanumeric characters only. no spaces or special characters!
+            must be unique within a project.
+            if a job of the same name and same project exists in the database,
+            the id of the existing entry is returned.
+            the existing entry is not modified.

        @param mode: optimization mode string (should be same as command line argument).

@ -600,7 +616,7 @@ class ResultsDatabase(object):
        @return: id value of the job in the database.
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            c = self._conn.execute(self.sql_select_job_name, {'project_id': project_id, 'name': name})
            v = c.fetchone()
            if v:
@ -630,7 +646,7 @@ class ResultsDatabase(object):
        @return None
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            param_dict = {'job_id': job_id}
            self._conn.execute(self.sql_delete_job, param_dict)

@ -669,7 +685,7 @@ class ResultsDatabase(object):
        @return: id value of the job in the database
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            job_id = self._query_job_name(job_name, project_id=project_id)

        return job_id
@ -686,7 +702,7 @@ class ResultsDatabase(object):
        @return: id value of the parameter in the database.
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            return self._register_param(key)

    def _register_param(self, key):
@ -721,7 +737,7 @@ class ResultsDatabase(object):
        @return: None
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            for key in model_params:
                if key[0] != '_':
                    self._register_param(key)
@ -762,7 +778,7 @@ class ResultsDatabase(object):

        params = {}
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            c = self._conn.execute(sql, args)
            for row in c:
                params[row['key']] = row['param_id']
@ -790,7 +806,7 @@ class ResultsDatabase(object):
        @return: id value of the tag in the database.
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            return self._register_tag(key)

    def _register_tag(self, key):
@ -825,7 +841,7 @@ class ResultsDatabase(object):
        @return: None
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            for key in tags:
                self._register_tag(key)

@ -865,7 +881,7 @@ class ResultsDatabase(object):

        tags = {}
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            c = self._conn.execute(sql, args)
            for row in c:
                tags[row['key']] = row['tag_id']
@ -889,7 +905,7 @@ class ResultsDatabase(object):

        tags = {}
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            c = self._conn.execute(sql, args)
            for row in c:
                tags[row['key']] = row['value']
@ -912,7 +928,7 @@ class ResultsDatabase(object):
        @return: None
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            for key, value in tags.items():
                try:
                    tag_id = self._tags[key]
@ -965,7 +981,7 @@ class ResultsDatabase(object):
        params = self.query_project_params(project_id, job_id)
        params.update(self._model_params)
        param_names = sorted(params, key=lambda s: s.lower())
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            if job_id:
                view_name = "ViewModelsJob{0}".format(job_id)
            else:
@ -1009,7 +1025,7 @@ class ResultsDatabase(object):
        @raise KeyError if a parameter hasn't been registered.
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            # insert model record
            model_dict = {'job_id': self.job_id, 'gen': None, 'particle': None}
            model_dict.update(special_params(model_params))
@ -1036,7 +1052,7 @@ class ResultsDatabase(object):
        @return None
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            param_dict = {'model_id': model_id}
            self._conn.execute(self.sql_delete_model, param_dict)

@ -1048,7 +1064,7 @@ class ResultsDatabase(object):
        @return: dict
        """
        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            c = self._conn.execute(self.sql_select_paramvalue_model, {'model_id': model_id})
            d = {}
            for row in c:
@ -1084,7 +1100,7 @@ class ResultsDatabase(object):
        @param filter: list of filter expressions.
                       each expression is a relational expression of the form <code>field operator value</code>,
                       where field is a unique field name of the Projects, Jobs, Models or Results table, e.g.
-                       `job_id`, `model`, `rfac`, `scan`, `sym`, etc.
+                       `job_id`, `model`, `rfac`, `scan`, `domain`, etc.
                       operator is one of the relational operators in SQL syntax.
                       value is a numeric or string constant, the latter including single or double quotes.
                       if the list is empty, no filtering is applied.
@ -1102,7 +1118,7 @@ class ResultsDatabase(object):
        """
        self.check_connection()
        filter += [" project_id = {0} ".format(self.project_id)]
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            sql = "select distinct Models.id as model_id, model "
            sql += "from Models "
            sql += "join Results on Models.id = Results.model_id "
@ -1147,7 +1163,7 @@ class ResultsDatabase(object):
        @param filter: list of filter expressions.
                       each expression is a relational expression of the form <code>field operator value</code>,
                       where field is a unique field name of the Projects, Jobs, Models or Results table, e.g.
-                       `job_id`, `model`, `rfac`, `scan`, `sym`, etc.
+                       `job_id`, `model`, `rfac`, `scan`, `domain`, etc.
                       operator is one of the relational operators in SQL syntax.
                       value is a numeric or string constant, the latter including single or double quotes.
                       if the list is empty, no filtering is applied.
@ -1161,9 +1177,9 @@ class ResultsDatabase(object):
        """
        self.check_connection()
        filter += [" project_id = {0} ".format(self.project_id)]
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            sql = "select Results.id as result_id, model_id, job_id, "
-            sql += "model, scan, sym, emit, region, rfac, gen, particle "
+            sql += "model, scan, domain, emit, region, rfac, gen, particle "
            sql += "from Models "
            sql += "join Results on Models.id = Results.model_id "
            sql += "join Jobs on Models.job_id = Jobs.id "
@ -1172,7 +1188,7 @@ class ResultsDatabase(object):
                sql += "where "
                sql += " and ".join(filter)
                sql += " "
-            sql += "order by rfac, job_id, model, scan, sym, emit, region "
+            sql += "order by rfac, job_id, model, scan, domain, emit, region "
            if limit:
                sql += "limit {0} ".format(limit)
            c = self._conn.execute(sql)
@ -1240,7 +1256,7 @@ class ResultsDatabase(object):
            level_name = dispatch.CALC_LEVELS[4]

        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            sql = "select Models.id from Models "
            sql += "join Results on Models.id = Results.model_id "
            sql += "join Jobs on Models.job_id = Jobs.id "
@ -1250,7 +1266,7 @@ class ResultsDatabase(object):
                sql += "and Models.job_id in ({0}) ".format(",".join(map(str, job_ids)))
            sql += "group by Models.job_id "
            sql += "having min(rfac) "
-            sql += "order by rfac, job_id, model, scan, sym, emit, region "
+            sql += "order by rfac, job_id, model, scan, domain, emit, region "
            c = self._conn.execute(sql)
            models = [row['id'] for row in c]

@ -1261,7 +1277,7 @@ class ResultsDatabase(object):
        query the task index used in a calculation job.

        this query neglects the model index
-        and returns the unique tuples (-1, scan, sym, emit, region).
+        and returns the unique tuples (-1, scan, domain, emit, region).

        @param job_id: (int) id of the associated Jobs entry.
            if 0, self.job_id is used.
@ -1273,8 +1289,8 @@ class ResultsDatabase(object):
            job_id = self.job_id

        self.check_connection()
-        with self._lock, self._conn:
-            sql = "select scan, sym, emit, region "
+        with self.lock(), self._conn:
+            sql = "select scan, domain, emit, region "
            sql += "from Models "
            sql += "join Results on Models.id = Results.model_id "
            sql += "join Jobs on Models.job_id = Jobs.id "
@ -1323,7 +1339,7 @@ class ResultsDatabase(object):
            sql += "join Results on Models.id = Results.model_id "
            sql += "where Models.job_id = :job_id "
            sql += "and scan = :scan "
-            sql += "and sym = :sym "
+            sql += "and domain = :domain "
            sql += "and emit = :emit "
            sql += "and region = :region "
            sql += "order by rfac "
@ -1334,7 +1350,7 @@ class ResultsDatabase(object):

        tasks = self.query_tasks(job_id)
        models = set([])
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            for task in tasks:
                if task.numeric_level <= level:
                    d = task._asdict()
@ -1360,7 +1376,7 @@ class ResultsDatabase(object):
        @param index: (pmsco.dispatch.CalcID or dict)
            calculation index.
            in case of dict, the keys must be the attribute names of CalcID prefixed with an underscore, i.e.,
-            '_model', '_scan', '_sym', '_emit', '_region'.
+            '_model', '_scan', '_domain', '_emit', '_region'.
            extra values in the dictionary are ignored.
            undefined indices must be -1.

@ -1377,11 +1393,13 @@ class ResultsDatabase(object):
            job_id = self.job_id

        self.check_connection()
-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            model_id = self._insert_result_model(job_id, index, result)
            result_id = self._insert_result_data(model_id, index, result)
            self._insert_result_paramvalues(model_id, result)

+        logger.debug(BMsg("database insert result: job {}, model {}, result {}", job_id, model_id, result_id))
+
        return result_id

    def _insert_result_model(self, job_id, index, result):
@ -1402,7 +1420,7 @@ class ResultsDatabase(object):
        @param index: (pmsco.dispatch.CalcID or dict)
            calculation index.
            in case of dict, the keys must be the attribute names of CalcID prefixed with an underscore, i.e.,
-            '_model', '_scan', '_sym', '_emit', '_region'.
+            '_model', '_scan', '_domain', '_emit', '_region'.
            extra values in the dictionary are ignored.
            undefined indices must be -1.

@ -1448,7 +1466,7 @@ class ResultsDatabase(object):
        @param index: (pmsco.dispatch.CalcID or dict)
            calculation index.
            in case of dict, the keys must be the attribute names of CalcID prefixed with an underscore, i.e.,
-            '_model', '_scan', '_sym', '_emit', '_region'.
+            '_model', '_scan', '_domain', '_emit', '_region'.
            extra values in the dictionary are ignored.
            undefined indices must be -1.
        @param result: (dict) dictionary containing the parameter values and the '_rfac' result.
@ -1525,7 +1543,7 @@ class ResultsDatabase(object):
        if not job_id:
            job_id = self.job_id

-        data = np.genfromtxt(filename, names=True)
+        data = np.atleast_1d(np.genfromtxt(filename, names=True))
        self.register_params(data.dtype.names)
        try:
            unique_models, unique_index = np.unique(data['_model'], True)
@ -1552,7 +1570,7 @@ class ResultsDatabase(object):
                    model = unique_models[0]
                result_entry = {'model_id': model_ids[model],
                                'scan': -1,
-                                'sym': -1,
+                                'domain': -1,
                                'emit': -1,
                                'region': -1,
                                'rfac': None}
@ -1571,7 +1589,7 @@ class ResultsDatabase(object):
                                   'value': value}
                    yield param_entry

-        with self._lock, self._conn:
+        with self.lock(), self._conn:
            c = self._conn.execute(self.sql_select_model_job, {'job_id': job_id})
            v = c.fetchone()
            if v:
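
Note on the `ViewResultsModels` view introduced in the database.py diff above: the join can be tried against an in-memory SQLite database. This is a sketch under assumptions. The three tables are reduced to the columns the view needs and do not reproduce the full PMSCO schema, and the inserted values are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row

# Simplified tables (assumption): only the columns referenced by the view.
conn.executescript("""
    create table Jobs (id integer primary key, project_id integer);
    create table Models (id integer primary key, job_id integer, model integer);
    create table Results (id integer primary key, model_id integer,
        scan integer, domain integer, emit integer, region integer, rfac real);
    create view if not exists ViewResultsModels as
        select project_id, job_id, model_id, Results.id as result_id, rfac,
               model, scan, domain, emit, region
        from Models
        join Results on Results.model_id = Models.id
        join Jobs on Jobs.id = Models.job_id
        order by project_id, job_id, rfac, model, scan, domain, emit, region;
""")

# Invented sample data: one job with two models and one result each.
conn.execute("insert into Jobs(id, project_id) values (1, 1)")
conn.executemany("insert into Models(id, job_id, model) values (?, 1, ?)",
                 [(1, 0), (2, 1)])
conn.executemany(
    "insert into Results(model_id, scan, domain, emit, region, rfac) "
    "values (?, -1, -1, -1, -1, ?)",
    [(1, 0.8), (2, 0.3)])

# The outer ORDER BY makes the row order explicit instead of relying on the view.
rows = [dict(r) for r in
        conn.execute("select * from ViewResultsModels order by rfac")]
```

Each row combines the project and job of a model with one of its results, so the best R-factor per job can be read directly from the view.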
--- a/pmsco/dispatch.py
+++ b/pmsco/dispatch.py
@ -4,16 +4,13 @@ calculation dispatcher.

@author Matthias Muntwiler

-@copyright (c) 2015 by Paul Scherrer Institut @n
+@copyright (c) 2015-21 by Paul Scherrer Institut @n
 Licensed under the Apache License, Version 2.0 (the "License"); @n
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
  http://www.apache.org/licenses/LICENSE-2.0
 """

-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
 import os
 import os.path
 import datetime
@ -21,8 +18,20 @@ import signal
 import collections
 import copy
 import logging
+
 from attrdict import AttrDict
+
+try:
    from mpi4py import MPI
+    mpi_comm = MPI.COMM_WORLD
+    mpi_size = mpi_comm.Get_size()
+    mpi_rank = mpi_comm.Get_rank()
+except ImportError:
+    MPI = None
+    mpi_comm = None
+    mpi_size = 1
+    mpi_rank = 0
+
 from pmsco.helpers import BraceMessage as BMsg

 logger = logging.getLogger(__name__)
@ -53,7 +62,7 @@ TAG_ERROR_ABORTING = 4

 ## levels of calculation tasks
 #
-CALC_LEVELS = ('model', 'scan', 'sym', 'emit', 'region')
+CALC_LEVELS = ('model', 'scan', 'domain', 'emit', 'region')

 ## intermediate sub-class of CalcID
 #
@ -159,13 +168,13 @@ class CalculationTask(object):

    @arg @c id.model  structure number or iteration (handled by the mode module)
    @arg @c id.scan   scan number (handled by the project)
-    @arg @c id.sym    symmetry number (handled by the project)
+    @arg @c id.domain domain number (handled by the project)
    @arg @c id.emit   emitter number (handled by the project)
    @arg @c id.region region number (handled by the region handler)

    specified members must be greater or equal to zero.
    -1 is the wildcard which is used in parent tasks,
-    where, e.g., no specific symmetry is chosen.
+    where, e.g., no specific domain is chosen.
    the root task has the ID (-1, -1, -1, -1, -1).
    """

@ -311,7 +320,8 @@ class CalculationTask(object):
        format input or output file name including calculation index.

        @param overrides optional keyword arguments override object fields.
-            the following keywords are handled: @c root, @c model, @c scan, @c sym, @c emit, @c region, @c ext.
+            the following keywords are handled:
+            `root`, `model`, `scan`, `domain`, `emit`, `region`, `ext`.

        @return a string consisting of the concatenation of the base name, the ID, and the extension.
        """
@ -322,7 +332,7 @@ class CalculationTask(object):
        for key in overrides.keys():
            parts[key] = overrides[key]

-        filename = "{root}_{model}_{scan}_{sym}_{emit}_{region}{ext}".format(**parts)
+        filename = "{root}_{model}_{scan}_{domain}_{emit}_{region}{ext}".format(**parts)
        return filename

    def copy(self):
@ -462,7 +472,7 @@ class CachedCalculationMethod(object):
        def wrapped_func(inst, model, index):
            # note: _replace returns a new instance of the namedtuple
            index = index._replace(emit=-1, region=-1)
-            cache_index = (id(inst), index.model, index.scan, index.sym)
+            cache_index = (id(inst), index.model, index.scan, index.domain)
            try:
                result = self._cache[cache_index]
            except KeyError:
@ -518,8 +528,7 @@ class MscoProcess(object):
    #
    #  the default is 2 days after start.

-    def __init__(self, comm):
-        self._comm = comm
+    def __init__(self):
        self._project = None
        self._atomic_scattering = None
        self._multiple_scattering = None
@ -565,6 +574,8 @@ class MscoProcess(object):
        """
        clean up after all calculations.

+        this method must be called after run() has finished.
+
        @return: None
        """
        pass
@ -693,7 +704,7 @@ class MscoProcess(object):
        parameters generation is delegated to the project's create_params method.

        @param task: CalculationTask with all attributes set for the calculation.
-        @return: pmsco.project.Params object for the calculator.
+        @return: pmsco.project.CalculatorParams object for the calculator.
        """
        par = self._project.create_params(task.model, task.id)

@ -711,7 +722,7 @@ class MscoProcess(object):

        @param task: CalculationTask with all attributes set for the calculation.

-        @param par: pmsco.project.Params object for the calculator.
+        @param par: pmsco.project.CalculatorParams object for the calculator.
            its phase_files attribute is updated with the created scattering files.
            the radial matrix elements are not changed (but may be in a future version).

@ -740,7 +751,7 @@ class MscoProcess(object):
        calculate the multiple scattering intensity.

        @param task: CalculationTask with all attributes set for the calculation.
-        @param par: pmsco.project.Params object for the calculator.
+        @param par: pmsco.project.CalculatorParams object for the calculator.
        @param clu: pmsco.cluster.Cluster object for the calculator.
        @return: None
        """
@ -820,16 +831,16 @@ class MscoMaster(MscoProcess):
    ## @var task_handlers
    #       (AttrDict) dictionary of task handler objects
    #
-    #       the keys are the task levels 'model', 'scan', 'sym', 'emit' and 'region'.
+    #       the keys are the task levels 'model', 'scan', 'domain', 'emit' and 'region'.
    #       the values are handlers.TaskHandler objects.
    #       the objects can be accessed in attribute or dictionary notation.

-    def __init__(self, comm):
-        super(MscoMaster, self).__init__(comm)
+    def __init__(self):
+        super().__init__()
        self._pending_tasks = collections.OrderedDict()
        self._running_tasks = collections.OrderedDict()
        self._complete_tasks = collections.OrderedDict()
-        self._slaves = self._comm.Get_size() - 1
+        self._slaves = mpi_size - 1
        self._idle_ranks = []
        self.max_calculations = 1000000
        self._calculations = 0
@ -854,12 +865,18 @@ class MscoMaster(MscoProcess):

        the method notifies the handlers of the number of available slave processes (slots).
        some of the tasks handlers adjust their branching according to the number of slots.
-        this mechanism may be used to balance the load between the task levels.
-        however, the current implementation is very coarse in this respect.
-        it advertises all slots to the model handler but a reduced number to the remaining handlers
-        depending on the operation mode.
-        the region handler receives a maximum of 4 slots except in single calculation mode.
-        in single calculation mode, all slots can be used by all handlers.
+
+        this mechanism may be used to adjust the priorities of the task levels,
+        i.e., whether one slot handles all calculations of one model
+        so that all models of a generation finish around the same time,
+        or whether a model is finished completely before the next one is calculated
+        so that a result is returned as soon as possible.
+
+        the current algorithm tries to pass as many slots as available
+        down to the lowest level (region) in order to minimize wall time.
+        the lowest level is restricted to the minimum number of splits
+        only if the intermediate levels create a lot of branches,
+        in which case splitting scans would not offer a performance benefit.
        """
        super(MscoMaster, self).setup(project)

@ -868,8 +885,8 @@ class MscoMaster(MscoProcess):
        self._idle_ranks = list(range(1, self._running_slaves + 1))

        self._root_task = CalculationTask()
-        self._root_task.file_root = project.output_file
-        self._root_task.model = project.create_domain().start
+        self._root_task.file_root = str(project.output_file)
+        self._root_task.model = project.model_space.start

        for level in self.task_levels:
            self.task_handlers[level] = project.handler_classes[level]()
@ -877,14 +894,22 @@ class MscoMaster(MscoProcess):
        self.task_handlers.model.datetime_limit = self.datetime_limit

        slaves_adj = max(self._slaves, 1)
-        self.task_handlers.model.setup(project, slaves_adj)
-        if project.mode != "single":
-            slaves_adj = max(slaves_adj / 2, 1)
-        self.task_handlers.scan.setup(project, slaves_adj)
-        self.task_handlers.sym.setup(project, slaves_adj)
-        self.task_handlers.emit.setup(project, slaves_adj)
-        if project.mode != "single":
+        n_models = self.task_handlers.model.setup(project, slaves_adj)
+        if n_models > 1:
+            slaves_adj = max(int(slaves_adj / 2), 1)
+        n_scans = self.task_handlers.scan.setup(project, slaves_adj)
+        if n_scans > 1:
+            slaves_adj = max(int(slaves_adj / 2), 1)
+        n_doms = self.task_handlers.domain.setup(project, slaves_adj)
+        if n_doms > 1:
+            slaves_adj = max(int(slaves_adj / 2), 1)
+        n_emits = self.task_handlers.emit.setup(project, slaves_adj)
+        if n_emits > 1:
+            slaves_adj = max(int(slaves_adj / 2), 1)
+        n_extra = max(n_scans, n_doms, n_emits)
+        if n_extra > slaves_adj * 2:
            slaves_adj = min(slaves_adj, 4)
+        logger.debug(BMsg("{regions} slots available for region handler", regions=slaves_adj))
        self.task_handlers.region.setup(project, slaves_adj)

        project.setup(self.task_handlers)
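
Note on the slot allocation in the hunk above: the arithmetic can be isolated in a small function to see how many slots reach the region handler. This is a sketch only. `region_slots` is a hypothetical helper, and it assumes that the branch counts returned by the handlers' `setup()` calls are already known.

```python
def region_slots(slots, n_models, n_scans, n_doms, n_emits):
    """Return the number of slots advertised to the region handler."""
    adj = max(slots, 1)
    # Each level that branches into more than one task halves the slots
    # passed on to the levels below it.
    for n in (n_models, n_scans, n_doms, n_emits):
        if n > 1:
            adj = max(int(adj / 2), 1)
    # If the intermediate levels branch heavily, cap the region level at 4.
    n_extra = max(n_scans, n_doms, n_emits)
    if n_extra > adj * 2:
        adj = min(adj, 4)
    return adj


no_branching = region_slots(16, 1, 1, 1, 1)     # all slots reach the region level
many_models = region_slots(16, 10, 1, 1, 1)     # halved once
many_scans = region_slots(64, 1, 100, 1, 1)     # halved once, then capped
```

In this sketch a single calculation keeps all 16 slots for region splitting, whereas 100 scans on 64 slots leave only 4 slots at the region level.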
@ -911,6 +936,7 @@ class MscoMaster(MscoProcess):
            else:
                self._dispatch_tasks()
            self._receive_result()
+            self._cleanup_tasks()
            self._check_finish()

        logger.debug("master exiting main loop")
@ -918,12 +944,32 @@ class MscoMaster(MscoProcess):
        self._save_report()

    def cleanup(self):
+        """
+        clean up after all calculations.
+
+        this method must be called after run() has finished.
+
+        in the master process, this calls cleanup() of each task handler and of the project.
+
+        @return: None
+        """
        logger.debug("master entering cleanup")
        for level in reversed(self.task_levels):
            self.task_handlers[level].cleanup()
        self._project.cleanup()
        super(MscoMaster, self).cleanup()

+    def _cleanup_tasks(self):
+        """
+        periodic clean-up in the main loop.
+
+        once per iteration of the main loop, this method cleans up unnecessary files.
+        this is done by the project's cleanup_files() method.
+
+        @return: None
+        """
+        self._project.cleanup_files()
+
    def _dispatch_results(self):
        """
        pass results through the post-processing modules.
@ -993,7 +1039,7 @@ class MscoMaster(MscoProcess):
                else:
                    logger.debug("assigning task %s to rank %u", str(task.id), rank)
                    self._running_tasks[task.id] = task
-                    self._comm.send(task.get_mpi_message(), dest=rank, tag=TAG_NEW_TASK)
+                    mpi_comm.send(task.get_mpi_message(), dest=rank, tag=TAG_NEW_TASK)
                    self._calculations += 1
        else:
            if not self._finishing:
@ -1015,7 +1061,7 @@ class MscoMaster(MscoProcess):
        while self._idle_ranks:
            rank = self._idle_ranks.pop()
            logger.debug("send finish tag to rank %u", rank)
-            self._comm.send(None, dest=rank, tag=TAG_FINISH)
+            mpi_comm.send(None, dest=rank, tag=TAG_FINISH)
            self._running_slaves -= 1

    def _receive_result(self):
@ -1025,7 +1071,7 @@ class MscoMaster(MscoProcess):
        if self._running_slaves > 0:
            logger.debug("waiting for calculation result")
            s = MPI.Status()
-            data = self._comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=s)
+            data = mpi_comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=s)

            if s.tag == TAG_NEW_RESULT:
                task_id = self._accept_task_done(data)
@ -1122,9 +1168,9 @@ class MscoMaster(MscoProcess):

        scan_tasks = self.task_handlers.scan.create_tasks(task)
        for scan_task in scan_tasks:
-            sym_tasks = self.task_handlers.sym.create_tasks(scan_task)
-            for sym_task in sym_tasks:
-                emitter_tasks = self.task_handlers.emit.create_tasks(sym_task)
+            dom_tasks = self.task_handlers.domain.create_tasks(scan_task)
+            for dom_task in dom_tasks:
+                emitter_tasks = self.task_handlers.emit.create_tasks(dom_task)
                for emitter_task in emitter_tasks:
                    region_tasks = self.task_handlers.region.create_tasks(emitter_task)
                    for region_task in region_tasks:
@ -1145,8 +1191,8 @@ class MscoSlave(MscoProcess):
    #
    #       typically, a task is aborted when an exception is encountered.

-    def __init__(self, comm):
-        super(MscoSlave, self).__init__(comm)
+    def __init__(self):
+        super().__init__()
        self._errors = 0
        self._max_errors = 5

@ -1159,7 +1205,7 @@ class MscoSlave(MscoProcess):
        self._running = True
        while self._running:
            logger.debug("waiting for message")
-            data = self._comm.recv(source=0, tag=MPI.ANY_TAG, status=s)
+            data = mpi_comm.recv(source=0, tag=MPI.ANY_TAG, status=s)
            if s.tag == TAG_NEW_TASK:
                logger.debug("received new task")
                self.accept_task(data)
@ -1189,17 +1235,17 @@ class MscoSlave(MscoProcess):
            logger.exception(BMsg("unhandled exception in calculation task {0}", task.id))
            self._errors += 1
            if self._errors <= self._max_errors:
-                self._comm.send(data, dest=0, tag=TAG_INVALID_RESULT)
+                mpi_comm.send(data, dest=0, tag=TAG_INVALID_RESULT)
            else:
                logger.error("too many exceptions, aborting")
                self._running = False
-                self._comm.send(data, dest=0, tag=TAG_ERROR_ABORTING)
+                mpi_comm.send(data, dest=0, tag=TAG_ERROR_ABORTING)
        else:
            logger.debug(BMsg("sending result of task {0} to master", result.id))
-            self._comm.send(result.get_mpi_message(), dest=0, tag=TAG_NEW_RESULT)
+            mpi_comm.send(result.get_mpi_message(), dest=0, tag=TAG_NEW_RESULT)


-def run_master(mpi_comm, project):
+def run_master(project):
    """
    initialize and run the master calculation loop.

@ -1211,25 +1257,25 @@ def run_master(mpi_comm, project):
    if an unhandled exception occurs, this function aborts the MPI communicator, killing all MPI processes.
    the caller will not have a chance to handle the exception.

-    @param mpi_comm: MPI communicator (mpi4py.MPI.COMM_WORLD).
-
    @param project: project instance (sub-class of project.Project).
    """
    try:
-        master = MscoMaster(mpi_comm)
+        master = MscoMaster()
        master.setup(project)
        master.run()
        master.cleanup()
    except (SystemExit, KeyboardInterrupt):
+        if mpi_comm:
            mpi_comm.Abort()
        raise
    except Exception:
        logger.exception("unhandled exception in master calculation loop.")
+        if mpi_comm:
            mpi_comm.Abort()
        raise


-def run_slave(mpi_comm, project):
+def run_slave(project):
    """
    initialize and run the slave calculation loop.

@ -1242,12 +1288,10 @@ def run_slave(mpi_comm, project):
    unless it is a SystemExit or KeyboardInterrupt (where we expect that the master also receives the signal),
    the MPI communicator is aborted, killing all MPI processes.

-    @param mpi_comm: MPI communicator (mpi4py.MPI.COMM_WORLD).
-
    @param project: project instance (sub-class of project.Project).
    """
    try:
-        slave = MscoSlave(mpi_comm)
+        slave = MscoSlave()
        slave.setup(project)
        slave.run()
        slave.cleanup()
@ -1255,6 +1299,7 @@ def run_slave(mpi_comm, project):
        raise
    except Exception:
        logger.exception("unhandled exception in slave calculation loop.")
+        if mpi_comm:
            mpi_comm.Abort()
        raise

@ -1267,12 +1312,9 @@ def run_calculations(project):

    @param project: project instance (sub-class of project.Project).
    """
-    mpi_comm = MPI.COMM_WORLD
-    mpi_rank = mpi_comm.Get_rank()
-
    if mpi_rank == 0:
        logger.debug("MPI rank %u setting up master loop", mpi_rank)
-        run_master(mpi_comm, project)
+        run_master(project)
    else:
        logger.debug("MPI rank %u setting up slave loop", mpi_rank)
-        run_slave(mpi_comm, project)
+        run_slave(project)
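
Note on the slot allocation in MscoMaster.setup() above: each task level (model, scan, domain, emitter) that produces more than one task halves the number of slots handed to the next level, and the region handler is capped at 4 slots when the largest of the scan, domain and emitter counts exceeds twice the remaining slots. The following is a minimal stand-alone sketch of that arithmetic, not the project's code. The function name `region_slots` and its integer arguments are hypothetical; they stand in for the return values of the handlers' setup() calls.

```python
def region_slots(slots, n_models, n_scans, n_doms, n_emits):
    """Hypothetical helper that mirrors the slot arithmetic of the hunk above."""
    # every level with more than one task halves the slots left for the next level
    for n_tasks in (n_models, n_scans, n_doms, n_emits):
        if n_tasks > 1:
            slots = max(int(slots / 2), 1)
    # many scans/domains/emitters relative to the remaining slots: cap the region handler at 4
    n_extra = max(n_scans, n_doms, n_emits)
    if n_extra > slots * 2:
        slots = min(slots, 4)
    return slots


print(region_slots(16, 10, 3, 2, 1))
print(region_slots(32, 1, 40, 1, 1))
```

In this sketch, a single-model, single-scan job keeps all slots for the region handler, while a job with many scans on few slots ends up at the cap of 4.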
--- a/pmsco/edac/edac.i
+++ b/pmsco/edac/edac.i
@ -1,7 +0,0 @@
-/* EDAC interface for other programs */
-%module edac
-%{
-extern int run_script(char *scriptfile);
-%}
-
-extern int run_script(char *scriptfile);
--- a/pmsco/elements/__init__.py
+++ b/pmsco/elements/__init__.py
@ -0,0 +1,41 @@
+"""
+@package pmsco.elements
+extended properties of the elements
+
+this package extends the element table of the `periodictable` package
+(https://periodictable.readthedocs.io/en/latest/index.html)
+by additional attributes like the electron binding energies.
+
+the package requires the periodictable package (https://pypi.python.org/pypi/periodictable).
+
+
+@author Matthias Muntwiler
+
+@copyright (c) 2020 by Paul Scherrer Institut @n
+Licensed under the Apache License, Version 2.0 (the "License"); @n
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+"""
+
+import periodictable.core
+
+
+def _load_binding_energy():
+    """
+    delayed loading of the binding energy table.
+    """
+    from . import bindingenergy
+    bindingenergy.init(periodictable.core.default_table())
+
+
+def _load_photoionization():
+    """
+    delayed loading of the photoionization cross-section table.
+    """
+    from . import photoionization
+    photoionization.init(periodictable.core.default_table())
+
+
+periodictable.core.delayed_load(['binding_energy'], _load_binding_energy)
+periodictable.core.delayed_load(['photoionization'], _load_photoionization)
--- a/pmsco/elements/bindingenergy.json
+++ b/pmsco/elements/bindingenergy.json
--- a/pmsco/elements/bindingenergy.py
+++ b/pmsco/elements/bindingenergy.py
@ -0,0 +1,212 @@
+"""
+@package pmsco.elements.bindingenergy
+electron binding energies of the elements
+
+extends the element table of the `periodictable` package
+(https://periodictable.readthedocs.io/en/latest/index.html)
+by the electron binding energies.
+
+the binding energies are compiled from Gwyn Williams' web page
+(https://userweb.jlab.org/~gwyn/ebindene.html).
+please refer to the original web page or the x-ray data booklet
+for original sources, definitions and remarks.
+binding energies of gases are replaced by respective values of a common compound
+from the 'handbook of x-ray photoelectron spectroscopy' (physical electronics, inc., 1995).
+
+usage
+-----
+
+this module requires the periodictable package (https://pypi.python.org/pypi/periodictable).
+
+~~~~~~{.py}
+import periodictable as pt
+import pmsco.elements.bindingenergy
+
+# read the data through any of periodictable's element interfaces, e.g.
+print(pt.gold.binding_energy['4f7/2'])
+print(pt.elements.symbol('Au').binding_energy['4f7/2'])
+print(pt.elements.name('gold').binding_energy['4f7/2'])
+print(pt.elements[79].binding_energy['4f7/2'])
+~~~~~~
+
+note that attributes are writable.
+you may assign refined values in your instance of the database.
+
+the query_binding_energy() function queries all terms with a particular binding energy.
+
+
+@author Matthias Muntwiler
+
+@copyright (c) 2020 by Paul Scherrer Institut @n
+Licensed under the Apache License, Version 2.0 (the "License"); @n
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+"""
+
+import json
+import numpy as np
+import os
+import periodictable as pt
+from pmsco.compat import open
+
+
+index_energy = np.zeros(0)
+index_number = np.zeros(0)
+index_term = []
+default_data_path = os.path.join(os.path.dirname(__file__), "bindingenergy.json")
+
+
+def load_data(data_path=None):
+    """
+    load binding energy data from json file
+
+    the data file must be in the same format as generated by save_data.
+
+    @param data_path path of the data file. default: "bindingenergy.json" next to this module file
+
+    @return dictionary
+    """
+    if data_path is None:
+        data_path = default_data_path
+    with open(data_path) as fp:
+        data = json.load(fp)
+    return data
+
+
+def save_data(data_path=None):
+    """
+    save binding energy data to json file
+
+    @param data_path path of the data file. default: "bindingenergy.json" next to this module file
+
+    @return None
+    """
+    if data_path is None:
+        data_path = default_data_path
+    data = {}
+    for element in pt.elements:
+        element_data = {}
+        for term, energy in element.binding_energy.items():
+            element_data[term] = energy
+        if element_data:
+            data[element.number] = element_data
+    with open(data_path, 'w', 'utf8') as fp:
+        json.dump(data, fp, sort_keys=True, indent='\t')
+
+
+def init(table, reload=False):
+    if 'binding_energy' in table.properties and not reload:
+        return
+    table.properties.append('binding_energy')
+
+    pt.core.Element.binding_energy = {}
+    pt.core.Element.binding_energy_units = "eV"
+
+    data = load_data()
+    for el_key, el_data in data.items():
+        try:
+            el = table[int(el_key)]
+        except ValueError:
+            el = table.symbol(el_key)
+        el.binding_energy = el_data
+
+
+def build_index():
+    """
+    build an index for query_binding_energy().
+
+    the index is kept in global variables of the module.
+
+    @return None
+    """
+    global index_energy
+    global index_number
+    global index_term
+
+    n = 0
+    for element in pt.elements:
+        n += len(element.binding_energy)
+
+    index_energy = np.zeros(n)
+    index_number = np.zeros(n)
+    index_term = []
+
+    for element in pt.elements:
+        for term, energy in element.binding_energy.items():
+            index_term.append(term)
+            i = len(index_term) - 1
+            index_energy[i] = energy
+            index_number[i] = element.number
+
+
+def query_binding_energy(energy, tol=1.0):
+    """
+    search the periodic table for a specific binding energy and return all matching terms.
+
+    @param energy: binding energy in eV.
+
+    @param tol: tolerance in eV.
+
+    @return: list of dictionaries containing element and term specification.
+             the list is ordered arbitrarily.
+             each dictionary contains the following keys:
+             @arg 'number': element number
+             @arg 'symbol': element symbol
+             @arg 'term': spectroscopic term
+             @arg 'energy': actual binding energy
+    """
+    if len(index_energy) == 0:
+        build_index()
+    sel = np.abs(index_energy - energy) < tol
+    idx = np.where(sel)
+    result = []
+    for i in idx[0]:
+        el_num = int(index_number[i])
+        d = {'number': el_num,
+             'symbol': pt.elements[el_num].symbol,
+             'term': index_term[i],
+             'energy': index_energy[i]}
+        result.append(d)
+
+    return result
+
+
+def export_flat_text(f):
+    """
+    export the binding energies to a flat general text file.
+
+    the file has four space-separated columns `number`, `symbol`, `term`, `energy`.
+    column names are included in the first row.
+
+    @param f: file path or open file object
+    @return: None
+    """
+    if hasattr(f, "write") and callable(f.write):
+        f.write("number symbol term energy\n")
+        for element in pt.elements:
+            for term, energy in element.binding_energy.items():
+                f.write(f"{element.number} {element.symbol} {term} {energy}\n")
+    else:
+        with open(f, "w") as fi:
+            export_flat_text(fi)
+
+
+def import_flat_text(f):
+    """
+    import binding energies from a flat general text file.
+
+    data is in space-separated columns.
+    the first row contains column names.
+    at least the columns `number`, `term`, `energy` must be present.
+
+    the function updates existing entries and appends entries of non-existing terms.
+    existing terms that are not listed in the file remain unchanged.
+
+    @param f: file path or open file object
+
+    @return: None
+    """
+    data = np.atleast_1d(np.genfromtxt(f, names=True, dtype=None, encoding="utf8"))
+    for d in data:
+        pt.elements[d['number']].binding_energy[d['term']] = d['energy']
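
Note on the flat text format used by export_flat_text() and import_flat_text() above: the file has a header row followed by space-separated rows, and the import relies on numpy.genfromtxt with names=True to address columns by name. The sketch below parses such text without touching the periodic table, assuming only numpy. The helper name `parse_flat_text` is hypothetical, and the two data rows are illustrative placeholder values, not an excerpt of bindingenergy.json.

```python
import io
import numpy as np

# illustrative rows in the four-column layout (placeholder values)
text = "number symbol term energy\n79 Au 4f7/2 84.0\n6 C 1s 284.2\n"


def parse_flat_text(f):
    """Hypothetical helper: read the flat text layout into a {(number, term): energy} dict."""
    # atleast_1d keeps a one-row file iterable, as in import_flat_text()
    data = np.atleast_1d(np.genfromtxt(f, names=True, dtype=None, encoding="utf8"))
    return {(int(d['number']), str(d['term'])): float(d['energy']) for d in data}


table = parse_flat_text(io.StringIO(text))
print(table)
```

The np.atleast_1d call matters for files with a single data row, where genfromtxt returns a zero-dimensional record that cannot be iterated directly.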
--- a/pmsco/elements/cross-sections.dat
+++ b/pmsco/elements/cross-sections.dat
--- a/pmsco/elements/photoionization.py
+++ b/pmsco/elements/photoionization.py
@ -0,0 +1,248 @@
+"""
+@package pmsco.elements.photoionization
+photoionization cross-sections of the elements
+
+extends the element table of the `periodictable` package
+(https://periodictable.readthedocs.io/en/latest/index.html)
+by a table of photoionization cross-sections.
+
+
+the data is available from Elettra (https://vuo.elettra.eu/services/elements/)
+or from the digitised tables by Kalha et al. (https://figshare.com/articles/dataset/Digitisation_of_Yeh_and_Lindau_Photoionisation_Cross_Section_Tabulated_Data/12389750).
+both sources are based on the original atomic data tables by Yeh and Lindau (1985).
+the Elettra data includes interpolation at finer steps,
+whereas the Kalha data contains only the original data points by Yeh and Lindau
+plus an additional point at 8 keV.
+the tables go up to 1500 eV photon energy and do not resolve spin-orbit splitting.
+
+
+usage
+-----
+
+this module requires python 3.6, numpy and the periodictable package (https://pypi.python.org/pypi/periodictable).
+
+~~~~~~{.py}
+import numpy as np
+import periodictable as pt
+import pmsco.elements.photoionization
+
+# read the data through any of periodictable's element interfaces as follows.
+# eph and cs are numpy arrays of identical shape that hold the photon energies and cross sections.
+eph, cs = pt.gold.photoionization.cross_section['4f']
+eph, cs = pt.elements.symbol('Au').photoionization.cross_section['4f']
+eph, cs = pt.elements.name('gold').photoionization.cross_section['4f']
+eph, cs = pt.elements[79].photoionization.cross_section['4f']
+
+# interpolate for specific photon energy
+print(np.interp(photon_energy, eph, cs))
+~~~~~~
+
+the data is loaded from the cross-sections.dat file which is a python-pickled data file.
+to switch between data sources, use one of the load functions defined here
+and dump the data to the cross-sections.dat file.
+
+
+@author Matthias Muntwiler
+
+@copyright (c) 2020 by Paul Scherrer Institut @n
+Licensed under the Apache License, Version 2.0 (the "License"); @n
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+"""
+
+import numpy as np
+from pathlib import Path
+import periodictable as pt
+import pickle
+import urllib.request
+import urllib.error
+from . import bindingenergy
+
+
+def load_kalha_data():
+    """
+    load all cross-sections from csv-files by Kalha et al.
+
+    the files must be placed in the 'kalha' directory next to this file.
+
+    @return: cross-section data in a nested dictionary, cf. load_pickled_data().
+    """
+    data = {}
+    p = Path(Path(__file__).parent, "kalha")
+    for entry in p.glob('*_*.csv'):
+        if entry.is_file():
+            try:
+                element = int(entry.stem.split('_')[0])
+            except ValueError:
+                pass
+            else:
+                data[element] = load_kalha_file(entry)
+    return data
+
+
+def load_kalha_file(path):
+    """
+    load the cross-sections of an element from a csv-file by Kalha et al.
+
+    @param path: file path
+    @return: (dict) dictionary of 'nl' terms.
+        the data items are tuples (photon_energy, cross_sections) of 1-dimensional numpy arrays.
+    """
+    a = np.genfromtxt(path, delimiter=',', names=True)
+    b = ~np.isnan(a['Photon_Energy__eV'])
+    a = a[b]
+    eph = a['Photon_Energy__eV'].copy()
+    data = {}
+    for n in range(1, 8):
+        for l in 'spdf':
+            col = f"{n}{l}"
+            try:
+                data[col] = (eph, a[col].copy())
+            except ValueError:
+                pass
+    return data
+
+
+def load_kalha_configuration(path):
+    """
+    load the electron configuration from a csv-file by Kalha et al.
+
+    @param path: file path
+    @return: (dict) dictionary of 'nl' terms mapping to number of electrons in the sub-shell.
+    """
+    p = Path(path)
+    subshells = []
+    electrons = []
+    config = {}
+    with p.open() as f:
+        for l in f.readlines():
+            s = l.split(',')
+            k_eph = "Photon Energy"
+            k_el = "#electrons"
+            if s[0][0:len(k_eph)] == k_eph:
+                subshells = s[1:]
+            elif s[0][0:len(k_el)] == k_el:
+                electrons = s[1:]
+
+    for i, sh in enumerate(subshells):
+        if sh:
+            config[sh] = electrons[i]
+
+    return config
+
+
+def load_elettra_file(symbol, nl):
+    """
+    download the cross sections of one level from the Elettra webelements web site.
+
+    @param symbol: (str) element symbol
+    @param nl: (str) nl term, e.g. '2p' (no spin-orbit)
+    @return: (photon_energy, cross_section) tuple of 1-dimensional numpy arrays.
+    """
+    url = f"https://vuo.elettra.eu/services/elements/data/{symbol.lower()}{nl}.txt"
+    try:
+        data = urllib.request.urlopen(url)
+    except urllib.error.HTTPError:
+        eph = None
+        cs = None
+    else:
+        a = np.genfromtxt(data)
+        try:
+            eph = a[:, 0]
+            cs = a[:, 1]
+        except IndexError:
+            eph = None
+            cs = None
+
+    return eph, cs
+
+
+def load_elettra_data():
+    """
+    download the cross sections from the Elettra webelements web site.
+
+    @return: cross-section data in a nested dictionary, cf. load_pickled_data().
+    """
+    data = {}
+    for element in pt.elements:
+        element_data = {}
+        for nlj in element.binding_energy:
+            nl = nlj[0:2]
+            eb = element.binding_energy[nlj]
+            if nl not in element_data and eb <= 2000:
+                eph, cs = load_elettra_file(element.symbol, nl)
+                if eph is not None and cs is not None:
+                    element_data[nl] = (eph, cs)
+        if len(element_data):
+            data[element.symbol] = element_data
+
+    return data
+
+
+def save_pickled_data(path, data):
+    """
+    save a cross section data dictionary to a python-pickled file.
+
+    @param path: file path
+    @param data: cross-section data in a nested dictionary, cf. load_pickled_data().
+    @return: None
+    """
+    with open(path, "wb") as f:
+        pickle.dump(data, f)
+
+
+def load_pickled_data(path):
+    """
+    load the cross section data from a python-pickled file.
+
+    the file can be generated by the save_pickled_data() function.
+
+    @param path: file path
+    @return: cross-section data in a nested dictionary.
+        the first-level keys are element symbols.
+        the second-level keys are 'nl' terms (e.g. '2p').
+        note that the Yeh and Lindau tables do not resolve spin-orbit splitting.
+        the data items are (photon_energy, cross_sections) tuples
+        of 1-dimensional numpy arrays holding the data table.
+        cross section values are given in Mb.
+    """
+    with open(path, "rb") as f:
+        data = pickle.load(f)
+    return data
+
+
+class Photoionization(object):
+    def __init__(self):
+        self.cross_section = {}
+        self.cross_section_units = "Mb"
+
+
+def init(table, reload=False):
+    """
+    loads cross section data into the periodic table.
+
+    this function is called by the periodictable to load the data on demand.
+
+    @param table:
+    @param reload:
+    @return:
+    """
+    if 'photoionization' in table.properties and not reload:
+        return
+    table.properties.append('photoionization')
+
+    # default value
+    pt.core.Element.photoionization = Photoionization()
+
+    p = Path(Path(__file__).parent, "cross-sections.dat")
+    data = load_pickled_data(p)
+    for el_key, el_data in data.items():
+        try:
+            el = table[int(el_key)]
+        except ValueError:
+            el = table.symbol(el_key)
+        pi = Photoionization()
+        pi.cross_section = el_data
+        pi.cross_section_units = "Mb"
+        el.photoionization = pi
--- a/pmsco/elements/spectrum.py
+++ b/pmsco/elements/spectrum.py
@ -0,0 +1,208 @@
+"""
+@package pmsco.elements.spectrum
+photoelectron spectrum simulator
+
+this module calculates the basic structure of a photoelectron spectrum.
+it calculates positions and approximate amplitude of elastic peaks
+based on photon energy, binding energy, photoionization cross section, and stoichiometry.
+escape depth, photon flux, analyser transmission are not accounted for.
+
+
+usage
+-----
+
+this module requires python 3.6, numpy, matplotlib and
+the periodictable package (https://pypi.python.org/pypi/periodictable).
+
+~~~~~~{.py}
+import numpy as np
+import periodictable as pt
+import pmsco.elements.spectrum as spec
+
+# for working with the data
+labels, energy, intensity = spec.build_spectrum(800., {"Ti": 1, "O": 2})
+
+# for plotting
+spec.plot_spectrum(800., {"Ti": 1, "O": 2})
+~~~~~~
+
+
+
+@author Matthias Muntwiler
+
+@copyright (c) 2020 by Paul Scherrer Institut @n
+Licensed under the Apache License, Version 2.0 (the "License"); @n
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+"""
+
+from matplotlib import pyplot as plt
+import numpy as np
+import periodictable as pt
+from . import bindingenergy
+from . import photoionization
+
+
+def get_element(number_or_symbol):
+    """
+    look up an Element object of the periodic table by atomic number or chemical symbol.
+
+    @param number_or_symbol: atomic number (int) or chemical symbol (str).
+    @return: Element object.
+    """
+    try:
+        el = pt.elements[number_or_symbol]
+    except KeyError:
+        el = pt.elements.symbol(number_or_symbol)
+    return el
+
+
+def get_binding_energy(photon_energy, element, nlj):
+    """
+    look up the binding energy of a core level and check whether it is smaller than the photon energy.
+
+    @param photon_energy: photon energy in eV.
+    @param element: Element object of the periodic table.
+    @param nlj: (str) spectroscopic term, e.g. '4f7/2'.
+    @return: (float) binding energy or numpy.nan.
+    """
+    try:
+        eb = element.binding_energy[nlj]
+    except KeyError:
+        return np.nan
+    if eb < photon_energy:
+        return eb
+    else:
+        return np.nan
+
+
+def get_cross_section(photon_energy, element, nlj):
+    """
+    look up the photoionization cross section.
+
+    since the Yeh/Lindau tables do not resolve the spin-orbit splitting,
+    this function applies the normal relative weights of a full sub-shell.
+
+    the result is a linear interpolation between tabulated values.
+
+    @param photon_energy: photon energy in eV.
+    @param element: Element object of the periodic table.
+    @param nlj: (str) spectroscopic term, e.g. '4f7/2'.
+    @return: (float) cross section in Mb.
+    """
+    nl = nlj[0:2]
+    if not hasattr(element, "photoionization"):
+        element = get_element(element)
+    try:
+        pet, cst = element.photoionization.cross_section[nl]
+    except KeyError:
+        return np.nan
+
+    # weights of spin-orbit peaks
+    d_wso = {"p1/2": 1./3.,
+             "p3/2": 2./3.,
+             "d3/2": 2./5.,
+             "d5/2": 3./5.,
+             "f5/2": 3./7.,
+             "f7/2": 4./7.}
+    wso = d_wso.get(nlj[1:], 1.)
+    cst = cst * wso
+
+    # todo: consider spline
+    return np.interp(photon_energy, pet, cst)
+
+
+def build_spectrum(photon_energy, elements, binding_energy=False, work_function=4.5):
+    """
+    calculate the positions and amplitudes of core-level photoemission lines.
+
+    the function looks up the binding energies and cross sections of all photoemission lines in the energy range
+    given by the photon energy and returns an array of expected spectral lines.
+
+    @param photon_energy: (numeric) photon energy in eV.
+    @param elements: list or dictionary of elements.
+        elements are identified by their atomic number (int) or chemical symbol (str).
+        if a dictionary is given, the (float) values are stoichiometric weights of the elements.
+    @param binding_energy: (bool) return binding energies (True) rather than kinetic energies (False, default).
+    @param work_function: (float) work function of the instrument in eV.
+    @return: tuple (labels, positions, intensities) of 1-dimensional numpy arrays representing the spectrum.
+        labels are in the format {Symbol}{n}{l}{j}.
+    """
+    ekin = []
+    ebind = []
+    intens = []
+    labels = []
+
+    for element in elements:
+        el = get_element(element)
+        for n in range(1, 8):
+            for l in "spdf":
+                for j in ['', '1/2', '3/2', '5/2', '7/2']:
+                    nlj = f"{n}{l}{j}"
+                    eb = get_binding_energy(photon_energy, el, nlj)
+                    cs = get_cross_section(photon_energy, el, nlj)
+                    try:
+                        cs = cs * elements[element]
+                    except (KeyError, TypeError):
+                        pass
+                    if not np.isnan(eb) and not np.isnan(cs):
+                        ekin.append(photon_energy - eb - work_function)
+                        ebind.append(eb)
+                        intens.append(cs)
+                        labels.append(f"{el.symbol}{nlj}")
+
+    ebind = np.array(ebind)
+    ekin = np.array(ekin)
+    intens = np.array(intens)
+    labels = np.array(labels)
+
+    if binding_energy:
+        return labels, ebind, intens
+    else:
+        return labels, ekin, intens
+
+
+def plot_spectrum(photon_energy, elements, binding_energy=False, work_function=4.5, show_labels=True):
+    """
+    plot a simple spectrum representation of a material.
+
+    the function looks up the binding energies and cross sections of all photoemission lines in the energy range
+    given by the photon energy and plots the expected spectral lines.
+
+    the spectrum is plotted using matplotlib.pyplot.stem.
+
+    @param photon_energy: (numeric) photon energy in eV.
+    @param elements: list or dictionary of elements.
+        elements are identified by their atomic number (int) or chemical symbol (str).
+        if a dictionary is given, the (float) values are stoichiometric weights of the elements.
+    @param binding_energy: (bool) plot versus binding energy (True) rather than kinetic energy (False, default).
+    @param work_function: (float) work function of the instrument in eV.
+    @param show_labels: (bool) show peak labels (True, default) or not (False).
+    @return: (figure, axes)
+    """
+    labels, energy, intensity = build_spectrum(photon_energy, elements, binding_energy=binding_energy,
+                                               work_function=work_function)
+
+    fig, ax = plt.subplots()
+    ax.stem(energy, intensity, basefmt=' ', use_line_collection=True)
+    if show_labels:
+        for sxy in zip(labels, energy, intensity):
+            ax.annotate(sxy[0], xy=(sxy[1], sxy[2]), textcoords='data')
+
+    ax.grid()
+    if binding_energy:
+        ax.set_xlabel('binding energy')
+    else:
+        ax.set_xlabel('kinetic energy')
+    ax.set_ylabel('intensity')
+    ax.set_title(elements)
+    return fig, ax
+
+
+def plot_cross_section(el, nlj):
+    energy = np.arange(100, 1500, 140)
+    cs = get_cross_section(energy, el, nlj)
+    fig, ax = plt.subplots()
+    ax.set_yscale("log")
+    ax.plot(energy, cs)
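
Note on the spin-orbit weights in get_cross_section() above: the entries of d_wso match the statistical branching ratio (2j+1) / (2(2l+1)) of a full sub-shell, and build_spectrum() places each line at photon energy minus binding energy minus work function. The sketch below reproduces both relations in isolation. The names `spin_orbit_weight`, `kinetic_energy` and `L_QUANTUM` are hypothetical and not part of the module.

```python
from fractions import Fraction

# orbital quantum number for each sub-shell letter
L_QUANTUM = {"s": 0, "p": 1, "d": 2, "f": 3}


def spin_orbit_weight(nlj):
    """Hypothetical helper: weight (2j+1) / (2(2l+1)) of a term like '4f7/2'; 1 if j is not given."""
    l = L_QUANTUM[nlj[1]]
    j_str = nlj[2:]
    if not j_str:
        return Fraction(1)
    j = Fraction(j_str)
    return (2 * j + 1) / (2 * (2 * l + 1))


def kinetic_energy(photon_energy, binding_energy, work_function=4.5):
    """Hypothetical helper: kinetic energy of an elastic peak in eV, as computed in build_spectrum()."""
    return photon_energy - binding_energy - work_function


print(spin_orbit_weight("4f7/2"), spin_orbit_weight("4f5/2"))
```

Exact fractions are used here only to make the comparison with the table unambiguous; the module itself stores the weights as floats.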
--- a/pmsco/files.py
+++ b/pmsco/files.py
@ -27,20 +27,20 @@ logger = logging.getLogger(__name__)
 #
 # each string of this set marks a category of files.
 #
-# @arg @c 'input' :     raw input files for calculator, including cluster and atomic files in custom format
-# @arg @c 'output' :    raw output files from calculator
-# @arg @c 'atomic' :    atomic scattering (phase, emission) files in portable format
-# @arg @c 'cluster' :   cluster files in portable XYZ format for report
-# @arg @c 'log' :       log files
-# @arg @c 'debug' :     debug files
-# @arg @c 'model':      output files in ETPAI format: complete simulation  (a_-1_-1_-1_-1)
-# @arg @c 'scan' :      output files in ETPAI format: scan (a_b_-1_-1_-1)
-# @arg @c 'symmetry' :  output files in ETPAI format: symmetry (a_b_c_-1_-1)
-# @arg @c 'emitter' :   output files in ETPAI format: emitter (a_b_c_d_-1)
-# @arg @c 'region' :    output files in ETPAI format: region (a_b_c_d_e)
-# @arg @c 'report':     final report of results
-# @arg @c 'population': final state of particle population
-# @arg @c 'rfac':       files related to models which give bad r-factors (dynamic category, see below).
+# @arg 'input' :     raw input files for calculator, including cluster and atomic files in custom format
+# @arg 'output' :    raw output files from calculator
+# @arg 'atomic' :    atomic scattering (phase, emission) files in portable format
+# @arg 'cluster' :   cluster files in portable XYZ format for report
+# @arg 'log' :       log files
+# @arg 'debug' :     debug files
+# @arg 'model':      output files in ETPAI format: complete simulation  (a_-1_-1_-1_-1)
+# @arg 'scan' :      output files in ETPAI format: scan (a_b_-1_-1_-1)
+# @arg 'domain' :    output files in ETPAI format: domain (a_b_c_-1_-1)
+# @arg 'emitter' :   output files in ETPAI format: emitter (a_b_c_d_-1)
+# @arg 'region' :    output files in ETPAI format: region (a_b_c_d_e)
+# @arg 'report':     final report of results
+# @arg 'population': final state of particle population
+# @arg 'rfac':       files related to models which give bad r-factors (dynamic category, see below).
 #
 # @note @c 'rfac' is a dynamic category not connected to a particular file or content type.
 # no file should be marked @c 'rfac'.
@ -48,7 +48,7 @@ logger = logging.getLogger(__name__)
 # if so, all files related to bad models are deleted, regardless of their static category.
 #
 FILE_CATEGORIES = {'cluster', 'atomic', 'input', 'output',
-                   'report', 'region', 'emitter', 'scan', 'symmetry', 'model',
+                   'report', 'region', 'emitter', 'scan', 'domain', 'model',
                   'log', 'debug', 'population', 'rfac'}

 ## @var FILE_CATEGORIES_TO_KEEP
@ -242,37 +242,52 @@ class FileTracker(object):
        else:
            self._complete_models.discard(model)

-    def delete_files(self, categories=None):
+    def delete_files(self, categories=None, incomplete_models=False):
        """
-        delete the files matching the list of categories.
+        delete all files matching a set of categories.

-        @version this method does not act on the 'rfac' category.
+        this function deletes all files that are tagged with one of the given categories.
+        tags are set by the code sections that create the files.
+        for a list of common categories, see FILE_CATEGORIES.
+        the categories can be given as an argument or taken from the categories_to_delete property.
+
+        files are deleted regardless of R-factor.
+        be sure to specify only categories that you don't need in the output at all.
+
+        by default, only files of complete models (cf. set_model_complete()) are deleted
+        to avoid interference with running calculations.
+        to clean up after calculations, the incomplete_models argument can override this.
+
+        @note this method does not act on the special 'rfac' category (see delete_bad_rfac()).

        @param categories: set of file categories to delete.
-            defaults to self.categories_to_delete.
+            if the argument is None, it defaults to the categories_to_delete property.
+
+        @param incomplete_models: (bool) delete files of incomplete models as well.
+            by default (False), incomplete models are not deleted.

        @return: None
        """
        if categories is None:
            categories = self.categories_to_delete
        for cat in categories:
-            self.delete_category(cat)
+            self.delete_category(cat, incomplete_models=incomplete_models)

    def delete_bad_rfac(self, keep=0, force_delete=False):
        """
-        delete the files of all models except a specified number of good models.
+        delete all files of all models except for a specified number of best ranking models.

        the method first determines which models to keep.
-        models with R factor values of 0.0, without a specified R-factor, and
        the specified number of best ranking non-zero models are kept.
-        the files belonging to the keeper models are kept, all others are deleted,
-        regardless of category.
-        files of incomplete models are also kept.
+        in addition, incomplete models, models with R factor = 0.0,
+        and those without a specified R-factor are kept.
+        all other files are deleted.
+        the method does not consider the file category.

        the files are deleted from the list and the file system.

-        files are deleted only if 'rfac' is specified in self.categories_to_delete
-        or if force_delete is set to True.
+        the method executes only if 'rfac' is specified in self.categories_to_delete
+        or if force_delete is True.
        otherwise the method does nothing.

        @param keep: number of files to keep.
@ -330,17 +345,31 @@ class FileTracker(object):

        return len(del_models)

-    def delete_category(self, category):
+    def delete_category(self, category, incomplete_models=False):
        """
        delete all files of a specified category from the list and the file system.

-        only files of complete models (cf. set_model_complete()) are deleted, but regardless of R-factor.
+        this function deletes all files that are tagged with the given category.
+        tags are set by the code sections that create the files.
+        for a list of common categories, see FILE_CATEGORIES.
+
+        files are deleted regardless of R-factor.
+        be sure to specify only categories that you don't need in the output at all.
+
+        by default, only files of complete models (cf. set_model_complete()) are deleted
+        to avoid interference with running calculations.
+        to clean up after calculations, the incomplete_models argument can override this.

        @param category: (str) category.
+            should be one of FILE_CATEGORIES. otherwise, the function has no effect.
+
+        @param incomplete_models: (bool) delete files of incomplete models as well.
+            by default (False), incomplete models are not deleted.

        @return: None
        """
        del_names = {name for (name, cat) in self._file_category.items() if cat == category}
+        if not incomplete_models:
            del_names &= {name for (name, model) in self._file_model.items() if model in self._complete_models}
        for name in del_names:
            self.delete_file(name)
@ -375,3 +404,33 @@ class FileTracker(object):
                logger.warning("file system error deleting file {0}".format(path))
            else:
                logger.debug("delete file {0} ({1}, model {2})".format(path, cat, model))
+
+
+def list_files_other_models(prefix, models):
+    """
+    list input/output files except those of the given models.
+
+    this can be used to clean up all files except those belonging to the given models.
+
+    to delete the listed files:
+
+        for f in files:
+            os.remove(f)
+
+    @param prefix: file name prefix up to the first underscore.
+        only files starting with this prefix are listed.
+
+    @param models: sequence or set of model numbers that should not be listed.
+
+    @return: set of file names
+    """
+    file_names = set([])
+    for entry in os.scandir():
+        if entry.is_file():
+            elements = entry.name.split('_')
+            try:
+                if len(elements) == 6 and elements[0] == prefix and int(elements[1]) not in models:
+                    file_names.add(entry.name)
+            except (IndexError, ValueError):
+                pass
+    return file_names
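The two selection rules introduced in `pmsco/files.py` can be tried in isolation. The following is a minimal sketch, not the project's code: the helper names `select_for_deletion` and `is_other_model_file` and the example file names are hypothetical, and the dictionaries only mimic the roles of `_file_category`, `_file_model` and `_complete_models`.

```python
def select_for_deletion(file_category, file_model, complete_models,
                        category, incomplete_models=False):
    """Return the names of files tagged with `category` that may be deleted."""
    names = {name for name, cat in file_category.items() if cat == category}
    if not incomplete_models:
        # default: spare files of models that are still being calculated
        names &= {name for name, model in file_model.items()
                  if model in complete_models}
    return names


def is_other_model_file(name, prefix, models):
    """True if `name` has six underscore-separated parts, starts with `prefix`,
    and its model number (second part) is not in `models`."""
    elements = name.split('_')
    try:
        return (len(elements) == 6 and elements[0] == prefix
                and int(elements[1]) not in models)
    except ValueError:
        # second part is not an integer, so this is not a model file
        return False


# hypothetical file names, for illustration only
file_category = {'job_1_-1_-1_-1_-1.etpi': 'model',
                 'job_2_-1_-1_-1_-1.etpi': 'model',
                 'job.log': 'log'}
file_model = {'job_1_-1_-1_-1_-1.etpi': 1,
              'job_2_-1_-1_-1_-1.etpi': 2,
              'job.log': -1}
complete = {1}

only_complete = select_for_deletion(file_category, file_model, complete, 'model')
everything = select_for_deletion(file_category, file_model, complete, 'model',
                                 incomplete_models=True)
```

With these inputs, `only_complete` contains just the file of model 1, while `everything` contains both model files. Note that the name filter relies on the prefix containing no underscore, as the docstring of `list_files_other_models()` states.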
--- a/pmsco/graphics/population.py
+++ b/pmsco/graphics/population.py
@ -0,0 +1,443 @@
+"""
+@package pmsco.graphics.population
+graphics rendering module for population dynamics.
+
+the main function is render_genetic_chart().
+
+this module is experimental.
+interface and implementation are subject to change.
+
+@author Matthias Muntwiler, matthias.muntwiler@psi.ch
+
+@copyright (c) 2021 by Paul Scherrer Institut @n
+Licensed under the Apache License, Version 2.0 (the "License"); @n
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+"""
+
+import logging
+import numpy as np
+import os
+from pmsco.database import regular_params, special_params
+
+logger = logging.getLogger(__name__)
+
+try:
+    from matplotlib.figure import Figure
+    from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
+    # from matplotlib.backends.backend_pdf import FigureCanvasPdf
+    # from matplotlib.backends.backend_svg import FigureCanvasSVG
+except ImportError:
+    Figure = None
+    FigureCanvas = None
+    logger.warning("error importing matplotlib. graphics rendering disabled.")
+
+
+def _default_range(pos):
+    """
+    determine a default range from actual values.
+
+    @param pos: (numpy.ndarray) 1-dimensional structured array of parameter values.
+    @return: range_min, range_max are dictionaries of the minimum and maximum values of each parameter.
+    """
+    names = regular_params(pos.dtype.names)
+    range_min = {}
+    range_max = {}
+    for name in names:
+        range_min[name] = pos[name].min()
+        range_max[name] = pos[name].max()
+    return range_min, range_max
+
+
+def _prune_constant_params(pnames, range_min, range_max):
+    """
+    remove constant parameters from the list and range
+
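+    @param pnames: (list) parameter names. modified in place.
+    @param range_min: (dict) minimum value of each parameter. modified in place.
+    @param range_max: (dict) maximum value of each parameter. modified in place.
+    @return: None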
+    """
+    del_names = [name for name in pnames if range_max[name] <= range_min[name]]
+    for name in del_names:
+        pnames.remove(name)
+        del range_min[name]
+        del range_max[name]
+
+
+def render_genetic_chart(output_file, input_data_or_file, model_space=None, generations=None, title=None, cmap=None,
+                         canvas=None):
+    """
+    produce a genetic chart from a given population.
+
+    a genetic chart is a pseudo-colour representation of the coordinates of each individual in the model space.
+    the axes are the particle number and the model parameter.
+    the colour is mapped from the relative position of a parameter value within the parameter range.
+
+    the chart should illustrate the diversity in the population.
+    converged parameters will show similar colours.
+    by comparing charts of different generations, the effect of the optimization algorithm can be examined.
+    though the chart type is designed for the genetic algorithm, it may be useful for other algorithms as well.
+
+    the function requires input in one of the following forms:
+    - a result (.dat) file or numpy structured array.
+      the array must contain regular parameters, as well as the _particle and _gen columns.
+      the function generates one chart per generation unless the generations argument is specified.
+    - a population (.pop) file or numpy structured array.
+      the array must contain regular parameters, as well as the _particle column.
+    - a pmsco.optimizers.population.Population object with valid data.
+
+    the graphics file format can be changed by providing a specific canvas. default is PNG.
+
+    this function requires the matplotlib module.
+    if it is not available, the function raises an error.
+
+    @param output_file: path and base name of the output file without extension.
+        a generation index and the file extension according to the file format are appended.
+    @param input_data_or_file: a numpy structured ndarray of a population or result list from an optimization run.
+        alternatively, the file path of a result file (.dat) or population file (.pop) can be given.
+        file can be any object that numpy.genfromtxt() can handle.
+    @param model_space: model space can be a pmsco.project.ModelSpace object,
+        any object that contains the same min and max attributes as pmsco.project.ModelSpace,
+        or a dictionary with two keys 'min' and 'max' that provides the corresponding ModelSpace dictionaries.
+        by default, the model space boundaries are derived from the input data.
+        if a model_space is specified, only the parameters listed in it are plotted.
+    @param generations: (int or sequence) generation index or list of indices.
+        this index is used in the output file name and for filtering input data by generation.
+        if the input data does not contain the generation, no filtering is applied.
+        by default, no filtering is applied, and one graph for each generation is produced.
+    @param title: (str) title of the chart.
+        the title is a {}-style format string, where {base} is the output file name and {gen} is the generation.
+        default: derived from file name.
+    @param cmap: (str) name of colour map supported by matplotlib.
+        default is 'jet'.
+        other good-looking options are 'PiYG', 'RdBu', 'RdYlGn', 'coolwarm'.
+    @param canvas: a FigureCanvas class reference from a matplotlib backend.
+        if None, the default FigureCanvasAgg is used which produces a bitmap file in PNG format.
+        some other options are:
+        matplotlib.backends.backend_pdf.FigureCanvasPdf or
+        matplotlib.backends.backend_svg.FigureCanvasSVG.
+
+    @return (list of str) paths and names of the generated graphics files.
+        the list contains one entry per generated chart.
+
+    @raise TypeError if matplotlib is not available.
+    """
+
+    try:
+        pos = np.copy(input_data_or_file.pos)
+        range_min = input_data_or_file.model_min
+        range_max = input_data_or_file.model_max
+        generations = [input_data_or_file.generation]
+    except AttributeError:
+        try:
+            pos = np.atleast_1d(np.genfromtxt(input_data_or_file, names=True))
+        except TypeError:
+            pos = np.copy(input_data_or_file)
+        range_min, range_max = _default_range(pos)
+    pnames = regular_params(pos.dtype.names)
+
+    if model_space is not None:
+        try:
+            # a ModelSpace-like object
+            range_min = model_space.min
+            range_max = model_space.max
+        except AttributeError:
+            # a dictionary-like object
+            range_min = model_space['min']
+            range_max = model_space['max']
+        try:
+            pnames = range_min.keys()
+        except AttributeError:
+            pnames = range_min.dtype.names
+
+    pnames = list(pnames)
+    _prune_constant_params(pnames, range_min, range_max)
+
+    if generations is None:
+        try:
+            generations = np.unique(pos['_gen'])
+        except ValueError:
+            pass
+
+    files = []
+    path, base = os.path.split(output_file)
+    if generations is not None and len(generations):
+        if title is None:
+            title = "{base} gen {gen}"
+
+        for generation in generations:
+            idx = np.where(pos['_gen'] == generation)
+            gpos = pos[idx]
+            gtitle = title.format(base=base, gen=int(generation))
+            out_filename = "{base}-{gen}".format(base=os.fspath(output_file), gen=int(generation))
+            out_filename = _render_genetic_chart_2(out_filename, gpos, pnames, range_min, range_max,
+                                                   gtitle, cmap, canvas)
+            files.append(out_filename)
+    else:
+        if title is None:
+            title = "{base}"
+        gtitle = title.format(base=base, gen="")
+        out_filename = "{base}".format(base=os.fspath(output_file))
+        out_filename = _render_genetic_chart_2(out_filename, pos, pnames, range_min, range_max, gtitle, cmap, canvas)
+        files.append(out_filename)
+
+    return files
+
+
+def _render_genetic_chart_2(out_filename, pos, pnames, range_min, range_max, title, cmap, canvas):
+    """
+    internal part of render_genetic_chart()
+
+    this function calculates the relative position in the model space,
+    sorts the positions array by particle index,
+    and calls plot_genetic_chart().
+
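+    @param out_filename: path and name of the output file without extension.
+    @param pos: (numpy.ndarray) structured array of parameter values, including the _particle column.
+    @param pnames: (list) names of the parameters to plot.
+    @param range_min: (dict) minimum value of each parameter.
+    @param range_max: (dict) maximum value of each parameter.
+    @param title: (str) chart title.
+    @param cmap: (str) name of the colour map, see plot_genetic_chart().
+    @param canvas: FigureCanvas class reference, see plot_genetic_chart().
+    @return: (str) path and name of the generated graphics file, including the extension.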
+    """
+    spos = np.sort(pos, order='_particle')
+    rpos2d = np.zeros((spos.shape[0], len(pnames)))
+    for index, pname in enumerate(pnames):
+        rpos2d[:, index] = (spos[pname] - range_min[pname]) / (range_max[pname] - range_min[pname])
+    out_filename = plot_genetic_chart(out_filename, rpos2d, pnames, title=title, cmap=cmap, canvas=canvas)
+    return out_filename
+
+
+def plot_genetic_chart(filename, rpos2d, param_labels, title=None, cmap=None, canvas=None):
+    """
+    produce a genetic chart from the given data.
+
+    a genetic chart is a pseudo-colour representation of the coordinates of each individual in the model space.
+    the chart should highlight the amount of diversity in the population
+    and - by comparing charts of different generations - the changes due to mutation.
+    the axes are the model parameter (x) and particle number (y).
+    the colour is mapped from the relative position of a parameter value within the parameter range.
+
+    in contrast to render_genetic_chart() this function contains only the drawing code.
+    it requires input in the final form and does not do any checks, conversion or processing.
+
+    the graphics file format can be changed by providing a specific canvas. default is PNG.
+
+    this function requires the matplotlib module.
+    if it is not available, the function raises an error.
+
+    @param filename: path and name of the output file without extension.
+    @param rpos2d: (two-dimensional numpy array of numeric type)
+        relative positions of the particles in the model space.
+        dimension 0 (y-axis) is the particle index,
+        dimension 1 (x-axis) is the parameter index (in the order given by param_labels).
+        all values must be between 0 and 1.
+    @param param_labels: (sequence) list or tuple of parameter names.
+    @param title: (str) string to be printed as chart title. default is 'genetic chart'.
+    @param cmap: (str) name of colour map supported by matplotlib.
+        default is 'jet'.
+        other good-looking options are 'PiYG', 'RdBu', 'RdYlGn', 'coolwarm'.
+    @param canvas: a FigureCanvas class reference from a matplotlib backend.
+        if None, the default FigureCanvasAgg is used which produces a bitmap file in PNG format.
+        some other options are:
+        matplotlib.backends.backend_pdf.FigureCanvasPdf or
+        matplotlib.backends.backend_svg.FigureCanvasSVG.
+
+    @raise TypeError if matplotlib is not available.
+    """
+    if canvas is None:
+        canvas = FigureCanvas
+    if cmap is None:
+        cmap = 'jet'
+    if title is None:
+        title = 'genetic chart'
+
+    fig = Figure()
+    canvas(fig)
+    ax = fig.add_subplot(111)
+    im = ax.imshow(rpos2d, aspect='auto', cmap=cmap, origin='lower')
+    im.set_clim((0.0, 1.0))
+    ax.set_xticks(np.arange(len(param_labels)))
+    ax.set_xticklabels(param_labels, rotation=45, ha="right", rotation_mode="anchor")
+    ax.set_ylabel('particle')
+    ax.set_title(title)
+    cb = ax.figure.colorbar(im, ax=ax)
+    cb.ax.set_ylabel("relative value", rotation=-90, va="bottom")
+
+    out_filename = "{base}.{ext}".format(base=filename, ext=canvas.get_default_filetype())
+    fig.savefig(out_filename)
+    return out_filename
+
+
+def render_swarm(output_file, input_data, model_space=None, title=None, cmap=None, canvas=None):
+    """
+    render a two-dimensional particle swarm population.
+
+    this function generates a schematic rendering of a particle swarm in two dimensions.
+    particles are represented by their position and velocity, indicated by an arrow.
+    the model space is projected on the first two (or selected two) variable parameters.
+    in the background, a scatter plot of results (dots with pseudocolor representing the R-factor) can be plotted.
+    the chart type is designed for the particle swarm optimization algorithm.
+
+    the function requires input in one of the following forms:
+    - position (.pos), velocity (.vel) and result (.dat) files or the respective numpy structured arrays.
+      the arrays must contain regular parameters, as well as the `_particle` column.
+      the result file must also contain an `_rfac` column.
+    - a pmsco.optimizers.population.Population object with valid data.
+
+    the graphics file format can be changed by providing a specific canvas. default is PNG.
+
+    this function requires the matplotlib module.
+    if it is not available, the function raises an error.
+
+    @param output_file: path and base name of the output file without extension.
+        the file extension according to the file format is appended.
+    @param input_data: a pmsco.optimizers.population.Population object with valid data,
+        or a sequence of position, velocity and result arrays.
+        the arrays must be structured ndarrays corresponding to the respective Population members.
+        alternatively, the arrays can be referenced as file paths
+        in any format that numpy.genfromtxt() can handle.
+    @param model_space: model space can be a pmsco.project.ModelSpace object,
+        any object that contains the same min and max attributes as pmsco.project.ModelSpace,
+        or a dictionary with two keys 'min' and 'max' that provides the corresponding ModelSpace dictionaries.
+        by default, the model space boundaries are derived from the input data.
+        if a model_space is specified, only the parameters listed in it are plotted.
+    @param title: (str) title of the chart.
+        the title is a {}-style format string, where {base} is the output file name and {gen} is the generation.
+        default: derived from file name.
+    @param cmap: (str) name of colour map supported by matplotlib.
+        default is 'plasma'.
+        other good-looking options are 'viridis', 'inferno', 'magma', 'cividis'.
+    @param canvas: a FigureCanvas class reference from a matplotlib backend.
+        if None, the default FigureCanvasAgg is used which produces a bitmap file in PNG format.
+        some other options are:
+        matplotlib.backends.backend_pdf.FigureCanvasPdf or
+        matplotlib.backends.backend_svg.FigureCanvasSVG.
+
+    @return (list of str) path and name of the generated graphics file.
+        the list is empty if the model space is not two-dimensional.
+
+    @raise TypeError if matplotlib is not available.
+    """
+    try:
+        range_min = input_data.model_min
+        range_max = input_data.model_max
+        pos = np.copy(input_data.pos)
+        vel = np.copy(input_data.vel)
+        rfac = np.copy(input_data.results)
+        generation = input_data.generation
+    except AttributeError:
+        try:
+            pos = np.atleast_1d(np.genfromtxt(input_data[0], names=True))
+            vel = np.atleast_1d(np.genfromtxt(input_data[1], names=True))
+            rfac = np.atleast_1d(np.genfromtxt(input_data[2], names=True))
+        except TypeError:
+            pos = np.copy(input_data[0])
+            vel = np.copy(input_data[1])
+            rfac = np.copy(input_data[2])
+        range_min, range_max = _default_range(rfac)
+    pnames = regular_params(pos.dtype.names)
+
+    if model_space is not None:
+        try:
+            # a ModelSpace-like object
+            range_min = model_space.min
+            range_max = model_space.max
+        except AttributeError:
+            # a dictionary-like object
+            range_min = model_space['min']
+            range_max = model_space['max']
+        try:
+            pnames = range_min.keys()
+        except AttributeError:
+            pnames = range_min.dtype.names
+
+    pnames = list(pnames)
+    _prune_constant_params(pnames, range_min, range_max)
+    pnames = pnames[0:2]
+    files = []
+    if len(pnames) == 2:
+        params = {pnames[0]: [range_min[pnames[0]], range_max[pnames[0]]],
+                  pnames[1]: [range_min[pnames[1]], range_max[pnames[1]]]}
+        out_filename = plot_swarm(output_file, pos, vel, rfac, params, title=title, cmap=cmap, canvas=canvas)
+        files.append(out_filename)
+    else:
+        logger.warning("model space must be two-dimensional and non-degenerate.")
+
+    return files
+
+
+def plot_swarm(filename, pos, vel, rfac, params, title=None, cmap=None, canvas=None):
+    """
+    plot a two-dimensional particle swarm population.
+
+    this is a sub-function of render_swarm() containing just the plotting commands.
+
+    the graphics file format can be changed by providing a specific canvas. default is PNG.
+
+    this function requires the matplotlib module.
+    if it is not available, the function raises an error.
+
+    @param filename: path and base name of the output file without extension.
+        the file extension according to the file format is appended.
+    @param pos: structured ndarray containing the positions of the particles.
+    @param vel: structured ndarray containing the velocities of the particles.
+    @param rfac: structured ndarray containing positions and R-factor values.
+        this array is independent of pos and vel.
+        it can also be set to None if results should be suppressed.
+    @param params: dictionary of two parameters to be plotted.
+        the keys correspond to columns of the pos, vel and rfac arrays.
+        the values are lists [minimum, maximum] that define the axis range.
+    @param title: (str) title of the chart.
+        the title is a {}-style format string, where {base} is the output file name and {gen} is the generation.
+        default: derived from file name.
+    @param cmap: (str) name of colour map supported by matplotlib.
+        default is 'plasma'.
+        other good-looking options are 'viridis', 'inferno', 'magma', 'cividis'.
+    @param canvas: a FigureCanvas class reference from a matplotlib backend.
+        if None, the default FigureCanvasAgg is used which produces a bitmap file in PNG format.
+        some other options are:
+        matplotlib.backends.backend_pdf.FigureCanvasPdf or
+        matplotlib.backends.backend_svg.FigureCanvasSVG.
+
+    @return (str) path and name of the generated graphics file.
+        empty string if an error occurred.
+
+    @raise TypeError if matplotlib is not available.
+    """
+    if canvas is None:
+        canvas = FigureCanvas
+    if cmap is None:
+        cmap = 'plasma'
+    if title is None:
+        title = 'swarm map'
+
+    pnames = list(params.keys())
+    fig = Figure()
+    canvas(fig)
+    ax = fig.add_subplot(111)
+
+    if rfac is not None:
+        try:
+            s = ax.scatter(rfac[pnames[0]], rfac[pnames[1]], s=5, c=rfac['_rfac'], cmap=cmap, vmin=0, vmax=1)
+        except ValueError:
+            # _rfac column missing
+            pass
+        else:
+            cb = ax.figure.colorbar(s, ax=ax)
+            cb.ax.set_ylabel("R-factor", rotation=-90, va="bottom")
+
+    p = ax.plot(pos[pnames[0]], pos[pnames[1]], 'co')
+    q = ax.quiver(pos[pnames[0]], pos[pnames[1]], vel[pnames[0]], vel[pnames[1]], color='c')
+    ax.set_xlim(params[pnames[0]])
+    ax.set_ylim(params[pnames[1]])
+    ax.set_xlabel(pnames[0])
+    ax.set_ylabel(pnames[1])
+    ax.set_title(title)
+
+    out_filename = "{base}.{ext}".format(base=filename, ext=canvas.get_default_filetype())
+    fig.savefig(out_filename)
+    return out_filename
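The colour values of the genetic chart are relative positions within the parameter range, computed after sorting the population by particle index. The following is a minimal sketch of that normalization step, assuming numpy is installed. The function name `relative_positions` and the parameter names `dx` and `dz` are hypothetical, chosen for illustration only.

```python
import numpy as np


def relative_positions(pos, pnames, range_min, range_max):
    """Map each parameter value to [0, 1] within its range.

    Rows of the result are ordered by the _particle column,
    columns follow the order of pnames.
    """
    spos = np.sort(pos, order='_particle')
    rpos2d = np.zeros((spos.shape[0], len(pnames)))
    for index, pname in enumerate(pnames):
        # constant parameters (zero span) must be pruned before this step
        span = range_max[pname] - range_min[pname]
        rpos2d[:, index] = (spos[pname] - range_min[pname]) / span
    return rpos2d


# three particles in arbitrary order, two hypothetical parameters
pos = np.array([(2.0, 10.0, 1), (1.0, 30.0, 0), (3.0, 20.0, 2)],
               dtype=[('dx', 'f8'), ('dz', 'f8'), ('_particle', 'i4')])
rpos = relative_positions(pos, ['dx', 'dz'],
                          {'dx': 1.0, 'dz': 10.0},
                          {'dx': 3.0, 'dz': 30.0})
```

The resulting array has one row per particle and one column per parameter, which is the layout that `plot_genetic_chart()` expects for its `rpos2d` argument. Pruning constant parameters first, as `_prune_constant_params()` does, avoids a division by zero here.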
--- a/pmsco/graphics/rfactor.py
+++ b/pmsco/graphics/rfactor.py
@ -182,7 +182,7 @@ def render_results(results_file, data=None):
    """

    if data is None:
-        data = np.genfromtxt(results_file, names=True)
+        data = np.atleast_1d(np.genfromtxt(results_file, names=True))

    summary = evaluate_results(data)
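The `np.atleast_1d()` wrapper matters when a results file contains a single data row: `np.genfromtxt()` then returns a zero-dimensional structured array, which cannot be indexed or iterated like a list of results. A minimal sketch of the difference, assuming numpy is installed and using a made-up in-memory file with hypothetical column names:

```python
import io

import numpy as np

# a results table with a header line and a single data row
text = "# x y _rfac\n1.0 2.0 0.5\n"

raw = np.genfromtxt(io.StringIO(text), names=True)
data = np.atleast_1d(raw)

print(raw.ndim, data.shape)
```

After the conversion, code that indexes rows or calls `len(data)` works the same way for one row as for many.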

--- a/pmsco/graphics/scan.py
+++ b/pmsco/graphics/scan.py
@ -7,16 +7,13 @@ interface and implementation are subject to change.

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

-@copyright (c) 2018 by Paul Scherrer Institut @n
+@copyright (c) 2018-21 by Paul Scherrer Institut @n
 Licensed under the Apache License, Version 2.0 (the "License"); @n
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
  http://www.apache.org/licenses/LICENSE-2.0
 """

-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
 import logging
 import math
 import numpy as np
@ -135,9 +132,8 @@ def render_ea_scan(filename, data, scan_mode, canvas=None, is_modf=False):
        im.set_cmap("RdBu_r")
        dhi = max(abs(dlo), abs(dhi))
        dlo = -dhi
-        im.set_clim((dlo, dhi))
+        im.set_clim((-1., 1.))
        try:
-            # requires matplotlib 2.1.0
            ti = cb.get_ticks()
            ti = [min(ti), 0., max(ti)]
            cb.set_ticks(ti)
@ -213,9 +209,8 @@ def render_tp_scan(filename, data, canvas=None, is_modf=False):
        # im.set_cmap("coolwarm")
        dhi = max(abs(dlo), abs(dhi))
        dlo = -dhi
-        pc.set_clim((dlo, dhi))
+        pc.set_clim((-1., 1.))
        try:
-            # requires matplotlib 2.1.0
            ti = cb.get_ticks()
            ti = [min(ti), 0., max(ti)]
            cb.set_ticks(ti)
@ -226,9 +221,12 @@ def render_tp_scan(filename, data, canvas=None, is_modf=False):
        # im.set_cmap("inferno")
        # im.set_cmap("viridis")
        pc.set_clim((dlo, dhi))
+        try:
            ti = cb.get_ticks()
            ti = [min(ti), max(ti)]
            cb.set_ticks(ti)
+        except AttributeError:
+            pass

    out_filename = "{0}.{1}".format(filename, canvas.get_default_filetype())
    fig.savefig(out_filename)
--- a/pmsco/handlers.py
+++ b/pmsco/handlers.py
@ -1,6 +1,6 @@
 """
@package pmsco.handlers
-project-independent task handlers for models, scans, symmetries, emitters and energies.
+project-independent task handlers for models, scans, domains, emitters and energies.

 calculation tasks are organized in a hierarchical tree.
 at each node, a task handler (feel free to find a better name)
@ -20,9 +20,9 @@ the handlers of the structural optimizers are declared in separate modules.
 scans are defined by the project.
 the actual merging step from multiple scans into one result dataset is delegated to the project class.

-<em>symmetry handlers</em> split a task into one child per symmetry.
-symmetries are defined by the project.
-the actual merging step from multiple symmetries into one result dataset is delegated to the project class.
+<em>domain handlers</em> split a task into one child per domain.
+domains are defined by the project.
+the actual merging step from multiple domains into one result dataset is delegated to the project class.

 <em>emitter handlers</em> split a task into one child per emitter configuration (inequivalent sets of emitting atoms).
 emitter configurations are defined by the project.
@ -35,31 +35,29 @@ code inspection and tests have shown that per-emitter results from EDAC can be s
 in order to take advantage of parallel processing.

 while several classes of model handlers are available,
-the default handlers for scans, symmetries, emitters and energies should be sufficient in most situations.
-the scan and symmetry handlers call methods of the project class to invoke project-specific functionality.
+the default handlers for scans, domains, emitters and energies should be sufficient in most situations.
+the scan and domain handlers call methods of the project class to invoke project-specific functionality.

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

-@copyright (c) 2015-18 by Paul Scherrer Institut @n
+@copyright (c) 2015-21 by Paul Scherrer Institut @n
 Licensed under the Apache License, Version 2.0 (the "License"); @n
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
  http://www.apache.org/licenses/LICENSE-2.0
 """

-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
 import datetime
 from functools import reduce
 import logging
 import math
 import numpy as np
 import os
+from pathlib import Path

 from pmsco.compat import open
 import pmsco.data as md
+import pmsco.dispatch as dispatch
 import pmsco.graphics.scan as mgs
 from pmsco.helpers import BraceMessage as BMsg

@ -127,10 +125,14 @@ class TaskHandler(object):
            for best efficiency the number of tasks generated should be greater or equal the number of slots.
            it should not exceed N times the number of slots, where N is a reasonably small number.

-        @return None
+        @return (int) number of children that create_tasks() will generate on average.
+            the number does not need to be accurate. a rough estimate, or the order of magnitude if greater than 10, is fine.
+            it is used to distribute processing slots across task levels.
+            see pmsco.dispatch.MscoMaster.setup().
        """
        self._project = project
        self._slots = slots
+        return 1

    def cleanup(self):
        """
@ -372,7 +374,7 @@ class SingleModelHandler(ModelHandler):
        keys = [key for key in self.result]
        keys.sort(key=lambda t: t[0].lower())
        vals = (str(self.result[key]) for key in keys)
-        filename = self._project.output_file + ".dat"
+        filename = Path(self._project.output_file).with_suffix(".dat")
        with open(filename, "w") as outfile:
            outfile.write("# ")
            outfile.write(" ".join(keys))
@ -416,6 +418,8 @@ class ScanHandler(TaskHandler):
    def setup(self, project, slots):
        """
        initialize the scan task handler and save processed experimental scans.
+
+        @return (int) number of scans defined in the project.
        """
        super(ScanHandler, self).setup(project, slots)

@ -430,13 +434,15 @@ class ScanHandler(TaskHandler):

        if project.combined_scan is not None:
            ext = md.format_extension(project.combined_scan)
-            filename = project.output_file + ext
+            filename = Path(project.output_file).with_suffix(ext)
            md.save_data(filename, project.combined_scan)
        if project.combined_modf is not None:
            ext = md.format_extension(project.combined_modf)
-            filename = project.output_file + ".modf" + ext
+            filename = Path(project.output_file).with_suffix(".modf" + ext)
            md.save_data(filename, project.combined_modf)

+        return len(self._project.scans)
+
    def create_tasks(self, parent_task):
        """
        generate a calculation task for each scan of the given parent task.
@ -526,7 +532,7 @@ class ScanHandler(TaskHandler):
            return None


-class SymmetryHandler(TaskHandler):
+class DomainHandler(TaskHandler):
    ## @var _pending_ids_per_parent
    #       (dict) sets of child task IDs per parent
    #
@ -546,20 +552,29 @@ class SymmetryHandler(TaskHandler):
    #       the values are sets of all child CalculationTask.id belonging to the parent.

    def __init__(self):
-        super(SymmetryHandler, self).__init__()
+        super(DomainHandler, self).__init__()
        self._pending_ids_per_parent = {}
        self._complete_ids_per_parent = {}

+    def setup(self, project, slots):
+        """
+        initialize the domain task handler.
+
+        @return (int) number of domains defined in the project.
+        """
+        super(DomainHandler, self).setup(project, slots)
+        return len(self._project.domains)
+
    def create_tasks(self, parent_task):
        """
-        generate a calculation task for each symmetry of the given parent task.
+        generate a calculation task for each domain of the given parent task.

-        all symmetries share the same model parameters.
+        all domains share the same model parameters.

-        @return list of CalculationTask objects, with one element per symmetry.
-            the symmetry index varies according to project.symmetries.
+        @return list of CalculationTask objects, with one element per domain.
+            the domain index varies according to project.domains.
        """
-        super(SymmetryHandler, self).create_tasks(parent_task)
+        super(DomainHandler, self).create_tasks(parent_task)

        parent_id = parent_task.id
        self._parent_tasks[parent_id] = parent_task
@ -567,10 +582,10 @@ class SymmetryHandler(TaskHandler):
        self._complete_ids_per_parent[parent_id] = set()

        out_tasks = []
-        for (i_sym, sym) in enumerate(self._project.symmetries):
+        for (i_dom, domain) in enumerate(self._project.domains):
            new_task = parent_task.copy()
            new_task.parent_id = parent_id
-            new_task.change_id(sym=i_sym)
+            new_task.change_id(domain=i_dom)

            child_id = new_task.id
            self._pending_tasks[child_id] = new_task
@ -579,25 +594,25 @@ class SymmetryHandler(TaskHandler):
            out_tasks.append(new_task)

        if not out_tasks:
-            logger.error("no symmetry tasks generated. your project must declare at least one symmetry.")
+            logger.error("no domain tasks generated. your project must declare at least one domain.")

        return out_tasks

    def add_result(self, task):
        """
-        collect and combine the calculation results versus symmetry.
+        collect and combine the calculation results versus domain.

        * mark the task as complete
        * store its result for later
        * check whether this was the last pending task of the family (belonging to the same parent).

-        the actual merging of data is delegated to the project's combine_symmetries() method.
+        the actual merging of data is delegated to the project's combine_domains() method.

        @param task: (CalculationTask) calculation task that completed.

        @return parent task (CalculationTask) if the family is complete. None if the family is not complete yet.
        """
-        super(SymmetryHandler, self).add_result(task)
+        super(DomainHandler, self).add_result(task)

        self._complete_tasks[task.id] = task
        del self._pending_tasks[task.id]
@ -607,7 +622,7 @@ class SymmetryHandler(TaskHandler):
        family_pending.remove(task.id)
        family_complete.add(task.id)

-        # all symmetries complete?
+        # all domains complete?
        if len(family_pending) == 0:
            parent_task = self._parent_tasks[task.parent_id]

@ -624,7 +639,7 @@ class SymmetryHandler(TaskHandler):
            parent_task.time = reduce(lambda a, b: a + b, child_times)

            if parent_task.result_valid:
-                self._project.combine_symmetries(parent_task, child_tasks)
+                self._project.combine_domains(parent_task, child_tasks)
                self._project.evaluate_result(parent_task, child_tasks)
                self._project.files.add_file(parent_task.result_filename, parent_task.id.model, 'scan')
                self._project.files.add_file(parent_task.modf_filename, parent_task.id.model, 'scan')
@ -669,6 +684,19 @@ class EmitterHandler(TaskHandler):
        self._pending_ids_per_parent = {}
        self._complete_ids_per_parent = {}

+    def setup(self, project, slots):
+        """
+        initialize the emitter task handler.
+
+        @return (int) estimated number of emitter configurations that the cluster generator will generate.
+            the estimate is based on the start parameters, scan 0 and domain 0.
+        """
+        super(EmitterHandler, self).setup(project, slots)
+        mock_model = self._project.model_space.start
+        mock_index = dispatch.CalcID(-1, 0, 0, -1, -1)
+        n_emitters = project.cluster_generator.count_emitters(mock_model, mock_index)
+        return n_emitters
+
    def create_tasks(self, parent_task):
        """
        generate a calculation task for each emitter configuration of the given parent task.
@ -750,11 +778,11 @@ class EmitterHandler(TaskHandler):
            if parent_task.result_valid:
                self._project.combine_emitters(parent_task, child_tasks)
                self._project.evaluate_result(parent_task, child_tasks)
-                self._project.files.add_file(parent_task.result_filename, parent_task.id.model, 'symmetry')
-                self._project.files.add_file(parent_task.modf_filename, parent_task.id.model, 'symmetry')
+                self._project.files.add_file(parent_task.result_filename, parent_task.id.model, 'domain')
+                self._project.files.add_file(parent_task.modf_filename, parent_task.id.model, 'domain')
                graph_file = mgs.render_scan(parent_task.modf_filename,
                                             ref_data=self._project.scans[parent_task.id.scan].modulation)
-                self._project.files.add_file(graph_file, parent_task.id.model, 'symmetry')
+                self._project.files.add_file(graph_file, parent_task.id.model, 'domain')

            del self._pending_ids_per_parent[parent_task.id]
            del self._complete_ids_per_parent[parent_task.id]
@ -921,7 +949,7 @@ class EnergyRegionHandler(RegionHandler):

        @param slots (int) number of calculation slots (processes).

-        @return None
+        @return (int) average number of child tasks
        """
        super(EnergyRegionHandler, self).setup(project, slots)

@ -934,6 +962,8 @@ class EnergyRegionHandler(RegionHandler):
            logger.debug(BMsg("region handler: split scan {file} into {slots} chunks",
                              file=os.path.basename(scan.filename), slots=self._slots_per_scan[i]))

+        return max(int(sum(self._slots_per_scan) / len(self._slots_per_scan)), 1)
+
    def create_tasks(self, parent_task):
        """
        generate a calculation task for each energy region of the given parent task.
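
Review note on the `Path(...).with_suffix(...)` changes in handlers.py: a minimal sketch of how `with_suffix` differs from the string concatenation it replaces. The helper name `output_name`, the example paths and the `.etpi` extension are assumptions for illustration only, not part of the change.

```python
from pathlib import Path


def output_name(output_file, suffix):
    """Return output_file with its last suffix replaced by suffix (hypothetical helper)."""
    return Path(output_file).with_suffix(suffix)


# no suffix present: the new suffix is simply appended
print(output_name("work/job01", ".dat"))
# a compound suffix such as ".modf" + ext is accepted as well
print(output_name("work/job01", ".modf.etpi"))
# an existing last suffix is replaced, unlike output_file + ".dat"
print(output_name("work/job.v2", ".dat"))
```

The last call shows the behavioural difference: if `output_file` already contains a dot in its final component, everything after that dot is dropped.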
--- a/pmsco/makefile
+++ b/pmsco/makefile
@ -12,7 +12,7 @@ MUFPOT_DIR = mufpot
 LOESS_DIR = loess
 PHAGEN_DIR = calculators/phagen

-all: edac loess
+all: edac loess phagen

 edac:
 	$(MAKE) -C $(EDAC_DIR)
--- a/pmsco/optimizers/genetic.py
+++ b/pmsco/optimizers/genetic.py
@ -45,7 +45,7 @@ class GeneticPopulation(population.Population):
    4. mutation: a gene may mutate at random.
    5. selection: the globally best individual is added to a parent population (and replaces the worst).

-    the main tuning parameter of the algorithm is the mutation_step which is copied from the domain.step.
+    the main tuning parameter of the algorithm is the mutation_step which is copied from the model_space.step.
    it defines the width of a gaussian distribution of change under a weak mutation.
    it should be large enough so that the whole parameter space can be probed,
    but small enough that a frequent mutation does not throw the individual out of the convergence region.
@ -92,9 +92,9 @@ class GeneticPopulation(population.Population):
    ## @var mutation_step
    #
    # standard deviations of the gaussian distribution function used in the mutate_weak() method.
-    # the variable is a dictionary with the same keys as model_step (the parameter domain).
+    # the variable is a dictionary with the same keys as model_step (the parameter space).
    #
-    # it is initialized from the domain.step
+    # it is initialized from the model_space.step
    # or set to a default value based on the parameter range and population size.

    def __init__(self):
@ -110,15 +110,15 @@ class GeneticPopulation(population.Population):
        self.position_constrain_mode = 'random'
        self.mutation_step = {}

-    def setup(self, size, domain, **kwargs):
+    def setup(self, size, model_space, **kwargs):
        """
        @copydoc Population.setup()

        in addition to the inherited behaviour, this method initializes self.mutation_step.
-        mutation_step of a parameter is set to its domain.step if non-zero.
+        mutation_step of a parameter is set to its model_space.step if non-zero.
        otherwise it is set to the parameter range divided by the population size.
        """
-        super(GeneticPopulation, self).setup(size, domain, **kwargs)
+        super(GeneticPopulation, self).setup(size, model_space, **kwargs)

        for key in self.model_step:
            val = self.model_step[key]
@ -131,7 +131,7 @@ class GeneticPopulation(population.Population):
        this implementation is a new proposal.
        the distribution is not completely random.
        rather, a position vector (by parameter) is initialized with a linear function
-        that covers the parameter domain.
+        that covers the parameter space.
        the linear function is then permuted randomly.

        the method does not update the particle info fields.
@ -243,7 +243,7 @@ class GeneticPopulation(population.Population):
        """
        apply a weak mutation to a model.

-        each parameter is changed to a different value in the domain of the parameter at the given probability.
+        each parameter is changed to a different value in the parameter space at the given probability.
        the amount of change has a gaussian distribution with a standard deviation of mutation_step.

        @param[in,out] model: structured numpy.ndarray holding the model parameters.
@ -263,7 +263,7 @@ class GeneticPopulation(population.Population):
        """
        apply a strong mutation to a model.

-        each parameter is changed to a random value in the domain of the parameter at the given probability.
+        each parameter is changed to a random value in the parameter space at the given probability.

        @param[in,out] model: structured numpy.ndarray holding the model parameters.
            model is modified in place.
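
Review note on the mutation_step docstrings in genetic.py: a rough sketch of the rule they describe, where the step is taken from `model_space.step` if non-zero and otherwise set to the parameter range divided by the population size. The function names and the plain-dict data layout are assumptions for illustration. The real class works on structured numpy arrays and also applies boundary constraints, which are omitted here.

```python
import random


def default_mutation_step(step, vmin, vmax, pop_size):
    """Per parameter: use the declared step if non-zero, else range / population size."""
    result = {}
    for key, val in step.items():
        if val == 0:
            val = (vmax[key] - vmin[key]) / pop_size
        result[key] = val
    return result


def mutate_weak(model, mutation_step, probability, rng=random):
    """Shift each parameter by a gaussian-distributed amount at the given probability (in place)."""
    for key in model:
        if rng.random() < probability:
            model[key] += rng.gauss(0.0, mutation_step[key])
    return model


steps = default_mutation_step({"a": 0.1, "b": 0.0},
                              {"a": 0.0, "b": 0.0},
                              {"a": 1.0, "b": 2.0},
                              pop_size=10)
print(steps)
print(mutate_weak({"a": 0.5, "b": 1.0}, steps, probability=0.5))
```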
--- a/pmsco/optimizers/gradient.py
+++ b/pmsco/optimizers/gradient.py
@ -8,7 +8,7 @@ the optimization task is distributed over multiple processes using MPI.
 the optimization must be started with N+1 processes in the MPI environment,
 where N equals the number of fit parameters.

-IMPLEMENTATION IN PROGRESS - DEBUGGING
+THIS MODULE IS NOT INTEGRATED INTO PMSCO YET.

 Requires: scipy, numpy

@ -109,7 +109,7 @@ class MscMaster(MscProcess):

    def setup(self, project):
        super(MscMaster, self).setup(project)
-        self.dom = project.create_domain()
+        self.dom = project.create_model_space()
        self.running_slaves = self.slaves

        self._outfile = open(self.project.output_file + ".dat", "w")
--- a/pmsco/optimizers/grid.py
+++ b/pmsco/optimizers/grid.py
@ -63,7 +63,7 @@ class GridPopulation(object):
    ## @var positions
    # (numpy.ndarray) flat list of grid coordinates and results.
    #
-    # the column names include the names of the model parameters, taken from domain.start,
+    # the column names include the names of the model parameters, taken from model_space.start,
    # and the special names @c '_model', @c '_rfac'.
    # the special fields have the following meanings:
    #
@ -113,11 +113,12 @@ class GridPopulation(object):
        dt.sort(key=lambda t: t[0].lower())
        return dt

-    def setup(self, domain):
+    def setup(self, model_space):
        """
        set up the population and result arrays.

-        @param domain: definition of initial and limiting model parameters
+        @param model_space: (pmsco.project.ModelSpace)
+            definition of initial and limiting model parameters
            expected by the cluster and parameters functions.
            the attributes have the following meanings:
            @arg start: values of the fixed parameters.
@ -128,24 +129,24 @@ class GridPopulation(object):
                        if step <= 0, the parameter is kept constant.

        """
-        self.model_start = domain.start
-        self.model_min = domain.min
-        self.model_max = domain.max
-        self.model_step = domain.step
+        self.model_start = model_space.start
+        self.model_min = model_space.min
+        self.model_max = model_space.max
+        self.model_step = model_space.step

        self.model_count = 1
        self.search_keys = []
        self.fixed_keys = []
        scales = []

-        for p in domain.step.keys():
-            if domain.step[p] > 0:
-                n = np.round((domain.max[p] - domain.min[p]) / domain.step[p]) + 1
+        for p in model_space.step.keys():
+            if model_space.step[p] > 0:
+                n = int(np.round((model_space.max[p] - model_space.min[p]) / model_space.step[p]) + 1)
            else:
                n = 1
            if n > 1:
                self.search_keys.append(p)
-                scales.append(np.linspace(domain.min[p], domain.max[p], n))
+                scales.append(np.linspace(model_space.min[p], model_space.max[p], n))
            else:
                self.fixed_keys.append(p)

@ -221,7 +222,7 @@ class GridPopulation(object):

        @raise AssertionError if the number of rows of the two files differ.
        """
-        data = np.genfromtxt(filename, names=True)
+        data = np.atleast_1d(np.genfromtxt(filename, names=True))
        assert data.shape == array.shape
        for name in data.dtype.names:
            array[name] = data[name]
@ -298,12 +299,12 @@ class GridSearchHandler(handlers.ModelHandler):
            the minimum number of slots is 1, the recommended value is 10 or greater.
            the population size is set to at least 4.

-        @return:
+        @return (int) number of models to be calculated.
        """
        super(GridSearchHandler, self).setup(project, slots)

        self._pop = GridPopulation()
-        self._pop.setup(self._project.create_domain())
+        self._pop.setup(self._project.model_space)
        self._invalid_limit = max(slots, self._invalid_limit)

        self._outfile = open(self._project.output_file + ".dat", "w")
@ -311,7 +312,7 @@ class GridSearchHandler(handlers.ModelHandler):
        self._outfile.write(" ".join(self._pop.positions.dtype.names))
        self._outfile.write("\n")

-        return None
+        return self._pop.model_count

    def cleanup(self):
        self._outfile.close()
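
Review note on the `int(np.round(...) + 1)` change in grid.py: `np.linspace` expects an integer number of samples, and recent NumPy versions reject a float there. The sketch below isolates the per-parameter logic. The function name `grid_axis` is an assumption for illustration.

```python
import numpy as np


def grid_axis(vmin, vmax, step):
    """Return the grid values of one parameter, or None if the parameter is held constant."""
    if step > 0:
        # convert to int so that the count is a valid sample number for linspace
        n = int(np.round((vmax - vmin) / step) + 1)
    else:
        n = 1
    if n > 1:
        return np.linspace(vmin, vmax, n)
    return None


axis = grid_axis(0.0, 1.0, 0.25)
print(axis)
print(grid_axis(1.0, 2.0, 0.0))
```

A parameter with `step <= 0` yields no axis and would go into the list of fixed keys.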
--- a/pmsco/optimizers/population.py
+++ b/pmsco/optimizers/population.py
@ -3,7 +3,7 @@
 base classes for population-based optimizers.

 a _population_ is a set of individuals or particles
-that can assume coordinates from the parameter domain.
+that can assume coordinates from the parameter space.
 a tuple of coordinates is also called _model parameters_ which define the _model_.
 the individuals travel through parameter space according to an algorithm defined separately.
 depending on the algorithm, the population can converge towards the optimum coordinates based on calculated R-factors.
@ -117,7 +117,7 @@ class Population(object):
    ## @var pos
    # (numpy.ndarray) current positions of each particle.
    #
-    # the column names include the names of the model parameters, taken from domain.start,
+    # the column names include the names of the model parameters, taken from model_space.start,
    # and the special names @c '_particle', @c '_model', @c '_rfac'.
    # the special fields have the following meanings:
    #
@ -299,7 +299,7 @@ class Population(object):
            arr[k] = model_dict[k]
        return arr

-    def setup(self, size, domain, **kwargs):
+    def setup(self, size, model_space, **kwargs):
        """
        set up the population arrays seeded with previous results and the start model.

@ -315,12 +315,12 @@ class Population(object):

        @param size: requested number of particles.

-        @param domain: definition of initial and limiting model parameters
+        @param model_space: definition of initial and limiting model parameters
            expected by the cluster and parameters functions.
-            @arg domain.start: initial guess.
-            @arg domain.min:   minimum values allowed.
-            @arg domain.max:   maximum values allowed. if min == max, the parameter is kept constant.
-            @arg domain.step:  depends on the actual algorithm.
+            @arg model_space.start: initial guess.
+            @arg model_space.min:   minimum values allowed.
+            @arg model_space.max:   maximum values allowed. if min == max, the parameter is kept constant.
+            @arg model_space.step:  depends on the actual algorithm.
                not used in particle swarm.
                standard deviation of mutations in genetic optimization.

@ -335,14 +335,14 @@ class Population(object):
        """
        self.size_req = size
        self.size_act = size
-        self.model_start = domain.start
-        self.model_min = domain.min
-        self.model_max = domain.max
-        self.model_step = domain.step
-        self.model_start_array = self.get_model_array(domain.start)
-        self.model_min_array = self.get_model_array(domain.min)
-        self.model_max_array = self.get_model_array(domain.max)
-        self.model_step_array = self.get_model_array(domain.step)
+        self.model_start = model_space.start
+        self.model_min = model_space.min
+        self.model_max = model_space.max
+        self.model_step = model_space.step
+        self.model_start_array = self.get_model_array(model_space.start)
+        self.model_min_array = self.get_model_array(model_space.min)
+        self.model_max_array = self.get_model_array(model_space.max)
+        self.model_step_array = self.get_model_array(model_space.step)

        # allocate arrays
        dt = self.get_pop_dtype(self.model_start)
@ -378,8 +378,8 @@ class Population(object):
        """
        initializes a random population.

-        the position array is filled with random values (uniform distribution) from the parameter domain.
-        velocity values are randomly chosen between -1/8 to 1/8 times the width (max - min) of the parameter domain.
+        the position array is filled with random values (uniform distribution) from the parameter space.
+        velocity values are randomly chosen between -1/8 to 1/8 times the width (max - min) of the parameter space.

        the method does not update the particle info fields.

@ -402,8 +402,8 @@ class Population(object):
        the method does not update the particle info fields.

        @param params: dictionary of model parameters.
-            the keys must match the ones of domain.start.
-            values that lie outside of the domain are skipped.
+            the keys must match the ones of model_space.start.
+            values that lie outside of the model space are skipped.

        @param index: index of the particle that is seeded.
            the index must be in the allowed range of the self.pos array.
@ -440,7 +440,7 @@ class Population(object):
        this method is called as a part of setup().
        it must not be called after the optimization has started.

-        parameter values that lie outside the parameter domain (min/max) are left at their previous value.
+        parameter values that lie outside the model space (min/max) are left at their previous value.

        @note this method does not initialize the remaining particles.
            neither does it set the velocity and best position arrays of the seeded particles.
@ -488,7 +488,7 @@ class Population(object):
            count_limit = self.pos.shape[0]
        count_limit = min(count_limit, self.pos.shape[0] - first_particle)

-        seed = np.genfromtxt(seed_file, names=True)
+        seed = np.atleast_1d(np.genfromtxt(seed_file, names=True))
        try:
            seed = seed[seed['_rfac'] <= rfac_limit]
        except ValueError:
@ -554,14 +554,14 @@ class Population(object):
        however, the patch is applied only upon the next execution of advance_population().

        an info or warning message is printed to the log
-        depending on whether the filed contained a complete dataset or not.
+        depending on whether the file contained a complete dataset or not.

        @attention patching a live population is a potentially dangerous operation.
        it may cause an optimization to abort because of an error in the file.
        this method does not handle exceptions coming from numpy.genfromtxt
        such as missing file (IOError) or conversion errors (ValueError).
        exception handling should be done by the owner of the population (typically the model handler).
-        patch values that lie outside the population domain aresilently ignored.
+        patch values that lie outside the model space are silently ignored.

        @param patch_file: path and name of the patch file.
            the file must have the correct format for load_array(),
@ -572,7 +572,7 @@ class Population(object):

        @raise ValueError for conversion errors.
        """
-        self.pos_patch = np.genfromtxt(patch_file, names=True)
+        self.pos_patch = np.atleast_1d(np.genfromtxt(patch_file, names=True))
        source_fields = set(self.pos_patch.dtype.names)
        dest_fields = set(self.model_start.keys())
        common_fields = source_fields & dest_fields
@ -592,7 +592,7 @@ class Population(object):

        the method overwrites only parameter values, not control variables.
        _particle indices that lie outside the range of available population items are ignored.
-        parameter values that lie outside the parameter domain (min/max) are ignored.
+        parameter values that lie outside the model space (min/max) are ignored.
        """
        if self.pos_patch is not None:
            logger.warning(BMsg("patching generation {gen} with new positions.", gen=self.generation))
@ -658,7 +658,7 @@ class Population(object):
        elif isinstance(source, str):
            for i in range(timeout):
                try:
-                    array = np.genfromtxt(source, names=True)
+                    array = np.atleast_1d(np.genfromtxt(source, names=True))
                except IOError:
                    time.sleep(1)
                else:
@ -708,7 +708,7 @@ class Population(object):

        the method also performs a range check.
        the parameter values are constrained according to self.position_constrain_mode
-        and the parameter domain self.model_min and self.model_max.
+        and the model space self.model_min and self.model_max.
        if the constrain mode is `error`, models that violate the constraints are ignored
        and removed from the import queue.

@ -844,18 +844,18 @@ class Population(object):
        """
        constrain a position to the given bounds.

-        this method resolves violations of parameter boundaries, i.e. when a particle is leaving the designated domain.
-        if a violation is detected, the method calculates an updated position inside the domain
+        this method resolves violations of parameter boundaries, i.e. when a particle is leaving the designated model space.
+        if a violation is detected, the method calculates an updated position inside the model space
        according to the selected algorithm.
        in some cases the velocity or boundaries have to be updated as well.

        the method distinguishes overshoot and undershoot violations.
-        overshoot is the normal case when the particle is leaving the domain.
+        overshoot is the normal case when the particle is leaving the model space.
        it is handled according to the selected algorithm.

        undershoot is a special case where the particle was outside the boundaries before the move.
        this case can occur in the beginning if the population is seeded with out-of-bounds values.
-        undershoot is always handled by placing the particle at a random position in the domain
+        undershoot is always handled by placing the particle at a random position in the model space
        regardless of the chosen constraint mode.

        @note it is important to avoid bias while handling constraint violations.
@ -877,7 +877,7 @@ class Population(object):

        @param _mode: what to do if a boundary constraint is violated:
            @arg 're-enter': re-enter from the opposite side of the parameter interval.
-            @arg 'bounce': fold the motion vector at the boundary and move the particle back into the domain.
+            @arg 'bounce': fold the motion vector at the boundary and move the particle back into the model space.
            @arg 'scatter': place the particle at a random place between its old position and the violated boundary.
            @arg 'stick': place the particle at the violated boundary.
            @arg 'expand': move the boundary so that the particle fits.
@ -982,7 +982,7 @@ class Population(object):
        @param search_array: population-like numpy structured array to search for the model.
            defaults to self.results if None.

-        @param precision: precision relative to model domain at which elements should be considered equal.
+        @param precision: precision relative to model space at which elements should be considered equal.

        @return index of the first occurrence.

@ -1071,7 +1071,7 @@ class Population(object):

        @raise AssertionError if the number of rows of the two files differ.
        """
-        data = np.genfromtxt(filename, names=True)
+        data = np.atleast_1d(np.genfromtxt(filename, names=True))
        assert data.shape == array.shape
        for name in data.dtype.names:
            array[name] = data[name]
@ -1182,7 +1182,7 @@ class PopulationHandler(handlers.ModelHandler):
        which may slow down convergence.

        if calculations take a long time compared to the available computation time
-        or spawn a lot of sub-tasks due to complex symmetry,
+        or spawn a lot of sub-tasks due to multiple domains,
        and you prefer to allow for a good number of generations,
        you should override the population size.

@ -1190,7 +1190,7 @@ class PopulationHandler(handlers.ModelHandler):

        @param slots: number of calculation processes available through MPI.

-        @return: None
+        @return (int) population size
        """
        super(PopulationHandler, self).setup(project, slots)

@ -1206,10 +1206,10 @@ class PopulationHandler(handlers.ModelHandler):
            outfile.write(" ".join(self._pop.results.dtype.names))
            outfile.write("\n")

-        return None
+        return self._pop_size

    def setup_population(self):
-        self._pop.setup(self._pop_size, self._project.create_domain(), **self._project.optimizer_params)
+        self._pop.setup(self._pop_size, self._project.model_space, **self._project.optimizer_params)

    def cleanup(self):
        super(PopulationHandler, self).cleanup()
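
Review note on the `np.atleast_1d(np.genfromtxt(...))` changes in population.py and grid.py: with `names=True`, `genfromtxt` returns a zero-dimensional structured array when the file holds a single data row, so shape comparisons against a one-row population array fail. The sketch below reproduces this with an in-memory file. The column names and values are made up for illustration.

```python
import io

import numpy as np

# one header line and a single data row, similar in layout to a .dat results file
one_row = "# a b _rfac\n1.0 2.0 0.5\n"

raw = np.genfromtxt(io.StringIO(one_row), names=True)
fixed = np.atleast_1d(np.genfromtxt(io.StringIO(one_row), names=True))

print(raw.shape)    # zero-dimensional: there is no row axis
print(fixed.shape)  # one-dimensional with a single row
print(fixed["_rfac"][0])
```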
--- a/pmsco/optimizers/table.py
+++ b/pmsco/optimizers/table.py
@ -83,8 +83,8 @@ class TablePopulation(population.Population):
    although a seed file is accepted, it is not used.
    patching is allowed, but there is normally no advantage over modifying the table.

-    the domain is used to define the model parameters and the parameter range.
-    models violating the parameter domain are ignored.
+    the model space is used to define the model parameters and the parameter range.
+    models violating the model space are ignored.
    """

    ## @var table_source
@ -103,20 +103,20 @@ class TablePopulation(population.Population):
        self.table_source = None
        self.position_constrain_mode = 'error'

-    def setup(self, size, domain, **kwargs):
+    def setup(self, size, model_space, **kwargs):
        """
-        set up the population arrays, parameter domain and data source.
+        set up the population arrays, model space and data source.

        @param size: requested number of particles.
            this does not need to correspond to the number of table entries.
            on each generation the population loads up to this number of new entries from the table source.

-        @param domain: definition of initial and limiting model parameters
+        @param model_space: definition of initial and limiting model parameters
            expected by the cluster and parameters functions.
-            @arg domain.start: not used.
-            @arg domain.min:   minimum values allowed.
-            @arg domain.max:   maximum values allowed.
-            @arg domain.step:  not used.
+            @arg model_space.start: not used.
+            @arg model_space.min:   minimum values allowed.
+            @arg model_space.max:   maximum values allowed.
+            @arg model_space.step:  not used.

        the following arguments are keyword arguments.
        the method also accepts the inherited arguments for seeding. they do not have an effect, however.
@ -128,7 +128,7 @@ class TablePopulation(population.Population):

        @return: None
        """
-        super(TablePopulation, self).setup(size, domain, **kwargs)
+        super(TablePopulation, self).setup(size, model_space, **kwargs)
        self.table_source = kwargs['table_source']

    def advance_population(self):
--- a/pmsco/pmsco.py
+++ b/pmsco/pmsco.py
@ -6,12 +6,12 @@ PEARL Multiple-Scattering Calculation and Structural Optimization

 this is the top-level interface of the PMSCO package.
 all calculations (any mode, any project) start by calling the run_project() function of this module.
-the module also provides a command line parser for common options.
+the module also provides a command line and a run-file/run-dict interface.

 for parallel execution, prefix the command line with mpi_exec -np NN, where NN is the number of processes to use.
 note that in parallel mode, one process takes the role of the coordinator (master).
 the master does not run calculations and is idle most of the time.
-to benefit from parallel execution on a work station, NN should be the number of processors plus one.
+to benefit from parallel execution on a work station, NN should be the number of processors.
 on a cluster, the number of processes is chosen according to the available resources.

 all calculations can also be run in a single process.
@ -25,26 +25,35 @@ refer to the projects folder for examples.

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

-@copyright (c) 2015-18 by Paul Scherrer Institut @n
+@copyright (c) 2015-21 by Paul Scherrer Institut @n
 Licensed under the Apache License, Version 2.0 (the "License"); @n
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
  http://www.apache.org/licenses/LICENSE-2.0
 """

-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
 import argparse
 from builtins import range
-import datetime
 import logging
 import importlib
-import os.path
+import commentjson as json
+from pathlib import Path
 import sys

+try:
    from mpi4py import MPI
+    mpi_comm = MPI.COMM_WORLD
+    mpi_size = mpi_comm.Get_size()
+    mpi_rank = mpi_comm.Get_rank()
+except ImportError:
+    MPI = None
+    mpi_comm = None
+    mpi_size = 1
+    mpi_rank = 0
+
+pmsco_root = Path(__file__).resolve().parent.parent
+if str(pmsco_root) not in sys.path:
+    sys.path.insert(0, str(pmsco_root))

 import pmsco.dispatch as dispatch
 import pmsco.files as files
@ -71,40 +80,36 @@ def setup_logging(enable=False, filename="pmsco.log", level="WARNING"):

    @param enable: (bool) True=enable logging to the specified file,
        False=do not generate a log (null handler).
-    @param filename: (string) path and name of the log file.
+    @param filename: (Path-like) path and name of the log file.
        if this process is part of an MPI communicator,
        the function inserts a dot and the MPI rank of this process before the extension.
+        if the filename is empty, logging is disabled.
    @param level: (string) name of the log level.
        must be the name of one of "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL".
-        if empty or invalid, the function raises a ValueError.
+        if empty, logging is disabled.
+        if not a valid level, defaults to "WARNING".
    @return None
    """
-    numeric_level = getattr(logging, level.upper(), None)
-    if not isinstance(numeric_level, int):
-        raise ValueError('Invalid log level: %s' % level)
-
-    logger = logging.getLogger("")
-    logger.setLevel(numeric_level)
-
-    logformat = '%(asctime)s (%(name)s) %(levelname)s: %(message)s'
-    formatter = logging.Formatter(logformat)
+    enable = enable and str(filename) and level
+    numeric_level = getattr(logging, level.upper(), logging.WARNING)
+    root_logger = logging.getLogger()
+    root_logger.setLevel(numeric_level)

    if enable:
-        mpi_comm = MPI.COMM_WORLD
-        mpi_size = mpi_comm.Get_size()
        if mpi_size > 1:
-            mpi_rank = mpi_comm.Get_rank()
-            root, ext = os.path.splitext(filename)
-            filename = root + "." + str(mpi_rank) + ext
+            p = Path(filename)
+            filename = p.with_suffix(f".{mpi_rank}" + p.suffix)
+
+        log_format = '%(asctime)s (%(name)s) %(levelname)s: %(message)s'
+        formatter = logging.Formatter(log_format)

        handler = logging.FileHandler(filename, mode="w", delay=True)
        handler.setLevel(numeric_level)
-
        handler.setFormatter(formatter)
    else:
        handler = logging.NullHandler()

-    logger.addHandler(handler)
+    root_logger.addHandler(handler)


 def set_common_args(project, args):
@ -124,65 +129,58 @@ def set_common_args(project, args):

    @return: None
    """
-    log_file = "pmsco.log"

    if args.data_dir:
        project.data_dir = args.data_dir
    if args.output_file:
-        project.set_output(args.output_file)
-        log_file = args.output_file + ".log"
+        project.output_file = args.output_file
+    if args.db_file:
+        project.db_file = args.db_file
    if args.log_file:
-        log_file = args.log_file
-    setup_logging(enable=args.log_enable, filename=log_file, level=args.log_level)
-
-    logger.debug("creating project")
-    mode = args.mode.lower()
-    if mode in {'single', 'grid', 'swarm', 'genetic', 'table'}:
-        project.mode = mode
-    else:
-        logger.error("invalid optimization mode '%s'.", mode)
-
-    if args.pop_size:
-        project.optimizer_params['pop_size'] = args.pop_size
-
-    if args.seed_file:
-        project.optimizer_params['seed_file'] = args.seed_file
-    if args.seed_limit:
-        project.optimizer_params['seed_limit'] = args.seed_limit
-    if args.table_file:
-        project.optimizer_params['table_file'] = args.table_file
-
+        project.log_file = args.log_file
+    if args.log_level:
+        project.log_level = args.log_level
+    if not args.log_enable:
+        project.log_file = ""
+        project.log_level = ""
+    if args.mode:
+        project.mode = args.mode.lower()
    if args.time_limit:
-        project.set_timedelta_limit(datetime.timedelta(hours=args.time_limit))
-
+        project.time_limit = args.time_limit
    if args.keep_files:
-        if "all" in args.keep_files:
-            cats = set([])
-        else:
-            cats = files.FILE_CATEGORIES - set(args.keep_files)
-        cats -= {'report'}
-        if mode == 'single':
-            cats -= {'model'}
-        project.files.categories_to_delete = cats
-    if args.keep_levels > project.keep_levels:
-        project.keep_levels = args.keep_levels
-    if args.keep_best > project.keep_best:
-        project.keep_best = args.keep_best
+        project.keep_files = args.keep_files
+    if args.keep_levels:
+        project.keep_levels = max(args.keep_levels, project.keep_levels)
+    if args.keep_best:
+        project.keep_best = max(args.keep_best, project.keep_best)


 def run_project(project):
    """
    run a calculation project.

-    @param project:
-    @return:
+    the function sets up logging, validates the project, chooses the handler classes,
+    and passes control to the pmsco.dispatch module to run the calculations.
+
+    @param project: fully initialized project object.
+        the validate method is called as part of this function after setting up the logger.
+    @return: None
    """
-    # log project arguments only in rank 0
-    mpi_comm = MPI.COMM_WORLD
-    mpi_rank = mpi_comm.Get_rank()
+
+    log_file = Path(project.log_file)
+    if not log_file.name:
+        log_file = Path(project.job_name).with_suffix(".log")
+    if log_file.name:
+        log_file.parent.mkdir(exist_ok=True)
+        log_level = project.log_level
+    else:
+        log_level = ""
+    setup_logging(enable=bool(log_level), filename=log_file, level=log_level)
    if mpi_rank == 0:
        project.log_project_args()

+    project.validate()
+
    optimizer_class = None
    if project.mode == 'single':
        optimizer_class = handlers.SingleModelHandler
@ -219,6 +217,34 @@ def run_project(project):
        logger.error("undefined project, optimizer, or calculator.")


+def schedule_project(project, run_dict):
+    """
+    schedule a calculation project.
+
+    the function validates the project and submits a job to the scheduler.
+
+    @param project: fully initialized project object.
+        the validate method is called as part of this function.
+
+    @param run_dict: dictionary holding the contents of the run file.
+
+    @return: None
+    """
+    assert mpi_rank == 0
+    setup_logging(enable=False)
+
+    project.validate()
+
+    schedule_dict = run_dict['schedule']
+    module = importlib.import_module(schedule_dict['__module__'])
+    schedule_class = getattr(module, schedule_dict['__class__'])
+    schedule = schedule_class(project)
+    schedule.set_properties(module, schedule_dict, project)
+    schedule.run_dict = run_dict
+    schedule.validate()
+    schedule.submit()
+
+
 class Args(object):
    """
    arguments of the main function.
@ -231,7 +257,7 @@ class Args(object):
    values as the command line parser.
    """

-    def __init__(self, mode="single", output_file="pmsco_data"):
+    def __init__(self):
        """
        constructor.
        
@ -240,12 +266,9 @@ class Args(object):
        other parameters may be required depending on the project
        and/or the calculation mode.
        """
-        self.mode = mode
-        self.pop_size = 0
-        self.seed_file = ""
-        self.seed_limit = 0
        self.data_dir = ""
-        self.output_file = output_file
+        self.output_file = ""
+        self.db_file = ""
        self.time_limit = 24.0
        self.keep_files = files.FILE_CATEGORIES_TO_KEEP
        self.keep_best = 10
@ -253,13 +276,9 @@ class Args(object):
        self.log_level = "WARNING"
        self.log_file = ""
        self.log_enable = True
-        self.table_file = ""


-def get_cli_parser(default_args=None):
-    if not default_args:
-        default_args = Args()
-
+def get_cli_parser():
    KEEP_FILES_CHOICES = files.FILE_CATEGORIES | {'all'}

    parser = argparse.ArgumentParser(
@ -272,7 +291,7 @@ def get_cli_parser(default_args=None):

        1) a project class derived from pmsco.project.Project.
           the class implements/overrides all necessary methods of the calculation project,
-           in particular create_domain, create_cluster, and create_params.
+           in particular create_model_space, create_cluster, and create_params.

        2) a global function named create_project.
           the function accepts a namespace object from the argument parser.
@ -287,54 +306,45 @@ def get_cli_parser(default_args=None):
    # for simplicity, the parser does not check these requirements.
    # all parameters are optional and accepted regardless of mode.
    # errors may occur if implicit requirements are not met.
-    parser.add_argument('project_module',
+    parser.add_argument('project_module', nargs='?',
                        help="path to custom module that defines the calculation project")
-    parser.add_argument('-m', '--mode', default=default_args.mode,
+    parser.add_argument('-r', '--run-file',
+                        help="path to run-time parameters file which contains all program arguments. " +
+                        "must be in JSON format.")
+    parser.add_argument('-m', '--mode',
                        choices=['single', 'grid', 'swarm', 'genetic', 'table'],
                        help='calculation mode')
-    parser.add_argument('--pop-size', type=int, default=default_args.pop_size,
-                        help='population size (number of particles) in swarm or genetic optimization mode. ' +
-                        'default is the greater of 4 or the number of calculation processes.')
-    parser.add_argument('--seed-file',
-                        help='path and name of population seed file. ' +
-                        'population data of previous optimizations can be used to seed a new optimization. ' +
-                        'the file must have the same structure as the .pop or .dat files.')
-    parser.add_argument('--seed-limit', type=int, default=default_args.seed_limit,
-                        help='maximum number of models to use from the seed file. ' +
-                        'the models with the best R-factors are selected.')
-    parser.add_argument('-d', '--data-dir', default=default_args.data_dir,
+    parser.add_argument('-d', '--data-dir',
                        help='directory path for experimental data files (if required by project). ' +
                             'default: working directory')
-    parser.add_argument('-o', '--output-file', default=default_args.output_file,
+    parser.add_argument('-o', '--output-file',
                        help='base path for intermediate and output files.')
-    parser.add_argument('--table-file',
-                        help='path and name of population table file for table optimization mode. ' +
-                        'the file must have the same structure as the .pop or .dat files.')
-    parser.add_argument('-k', '--keep-files', nargs='*', default=default_args.keep_files,
+    parser.add_argument('-b', '--db-file',
+                        help='name of an sqlite3 database file where the results should be stored.')
+    parser.add_argument('-k', '--keep-files', nargs='*',
                        choices=KEEP_FILES_CHOICES,
                        help='output file categories to keep after the calculation. '
                             'by default, cluster and model (simulated data) '
                             'of a limited number of best models are kept.')
-    parser.add_argument('--keep-best', type=int, default=default_args.keep_best,
+    parser.add_argument('--keep-best', type=int,
                        help='number of best models for which to keep result files '
                             '(at each node from root down to keep-levels).')
    parser.add_argument('--keep-levels', type=int, choices=range(5),
-                        default=default_args.keep_levels,
                        help='task level down to which result files of best models are kept. '
-                             '0 = model, 1 = scan, 2 = symmetry, 3 = emitter, 4 = region.')
-    parser.add_argument('-t', '--time-limit', type=float, default=default_args.time_limit,
+                             '0 = model, 1 = scan, 2 = domain, 3 = emitter, 4 = region.')
+    parser.add_argument('-t', '--time-limit', type=float,
                        help='wall time limit in hours. the optimizers try to finish before the limit.')
-    parser.add_argument('--log-file', default=default_args.log_file,
+    parser.add_argument('--log-file',
                        help='name of the main log file. ' +
                             'under MPI, the rank of the process is inserted before the extension.')
-    parser.add_argument('--log-level', default=default_args.log_level,
+    parser.add_argument('--log-level',
                        help='minimum level of log messages. DEBUG, INFO, WARNING, ERROR, CRITICAL.')
    feature_parser = parser.add_mutually_exclusive_group(required=False)
    feature_parser.add_argument('--log-enable', dest='log_enable', action="store_true",
                        help="enable logging. by default, logging is on.")
    feature_parser.add_argument('--log-disable', dest='log_enable', action='store_false',
                        help="disable logging. by default, logging is on.")
-    parser.set_defaults(log_enable=default_args.log_enable)
+    parser.set_defaults(log_enable=True)

    return parser

@ -345,52 +355,135 @@ def parse_cli():

    @return: Namespace object created by the argument parser.
    """
-    default_args = Args()
-    parser = get_cli_parser(default_args)
+    parser = get_cli_parser()

    args, unknown_args = parser.parse_known_args()

    return args, unknown_args


-def import_project_module(path):
+def import_module(module_name):
    """
-    import the custom project module.
+    import a custom module by name.

-    imports the project module given its file path.
-    the path is expanded to its absolute form and appended to the python path.
+    import a module given its file path or module name (like in an import statement).

-    @param path: path and name of the module to be loaded.
-        path is optional and defaults to the python path.
-        if the name includes an extension, it is stripped off.
+    preferably, the module name should be given as in an import statement.
+    as the top-level pmsco directory is on the python path,
+    the module name will begin with `projects` for a custom project module or `pmsco` for a core pmsco module.
+    in this case, the function just calls importlib.import_module.
+
+    if a file path is given, i.e., `module_name` links to an existing file and has a `.py` extension,
+    the function extracts the directory path,
+    inserts it into the python path,
+    and calls importlib.import_module on the stem of the file name.
+
+    @note the file path remains in the python path.
+    this option should be used carefully to avoid breaking file name resolution.
+
+    @param module_name: file path or module name.
+        file path is interpreted relative to the working directory.

    @return: the loaded module as a python object
    """
-    path, name = os.path.split(path)
-    name, __ = os.path.splitext(name)
-    path = os.path.abspath(path)
-    sys.path.append(path)
-    project_module = importlib.import_module(name)
-    return project_module
+    p = Path(module_name)
+    if p.is_file() and p.suffix == ".py":
+        path = str(p.parent.resolve())
+        module_name = p.stem
+        if path not in sys.path:
+            sys.path.insert(0, path)
+
+    module = importlib.import_module(module_name)
+    return module
+
+
+def main_dict(run_params):
+    """
+    main function with dictionary run-time parameters
+
+    this function starts the whole process with parameters passed directly as a dictionary.
+    the command line is not parsed.
+    no run-file is loaded (just the project module).
+
+    @param run_params: dictionary with the same structure as the JSON run-file.
+
+    @return: None
+    """
+    project_params = run_params['project']
+
+    module = importlib.import_module(project_params['__module__'])
+    try:
+        project_class = getattr(module, project_params['__class__'])
+    except (AttributeError, KeyError):
+        project = module.create_project()
+    else:
+        project = project_class()
+
+    project._module = module
+    project.directories['pmsco'] = Path(__file__).parent
+    project.directories['project'] = Path(module.__file__).parent
+    project.set_properties(module, project_params, project)
+    run_project(project)


 def main():
+    """
+    main function with command line parsing
+
+    this function starts the whole process with parameters from the command line.
+
+    if the command line contains a run-file parameter, it determines the module to load and the project parameters.
+    otherwise, the command line parameters apply.
+
+    the project class can be specified either in the run-file or the project module.
+    if the run-file specifies a class name, that class is looked up in the project module and instantiated.
+    otherwise, the module's create_project is called.
+
+    @return: None
+    """
    args, unknown_args = parse_cli()

-    if args:
-        module = import_project_module(args.project_module)
+    try:
+        with open(args.run_file, 'r') as f:
+            rf = json.load(f)
+    except (AttributeError, TypeError):
+        rf, rfp = {}, {'__module__': args.project_module}
+    else:
+        rfp = rf['project']
+
+    module = import_module(rfp['__module__'])
    try:
        project_args = module.parse_project_args(unknown_args)
-        except NameError:
+    except AttributeError:
        project_args = None

+    try:
+        project_class = getattr(module, rfp['__class__'])
+    except (AttributeError, KeyError):
        project = module.create_project()
+    else:
+        project = project_class()
+        project_args = None
+
+    project._module = module
+    project.directories['pmsco'] = Path(__file__).parent
+    project.directories['project'] = Path(module.__file__).parent
+    project.set_properties(module, rfp, project)
+
    set_common_args(project, args)
    try:
+        if project_args:
            module.set_project_args(project, project_args)
-        except NameError:
+    except AttributeError:
        pass

+    try:
+        schedule_enabled = rf['schedule']['enabled']
+    except KeyError:
+        schedule_enabled = False
+    if schedule_enabled:
+        schedule_project(project, rf)
+    else:
        run_project(project)

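the rank-specific log file naming used in setup_logging can be tried in isolation. the following is a minimal sketch, not part of the change set: the helper name `rank_log_name` is hypothetical, and the rank and size are passed as arguments instead of being read from mpi4py.

```python
from pathlib import Path


def rank_log_name(filename, mpi_rank, mpi_size):
    """Insert the MPI rank before the extension if more than one process runs."""
    p = Path(filename)
    if mpi_size > 1:
        # "pmsco.log" -> "pmsco.3.log" for rank 3
        p = p.with_suffix(f".{mpi_rank}" + p.suffix)
    return p


if __name__ == "__main__":
    print(rank_log_name("out/pmsco.log", 3, 8))
    print(rank_log_name("out/pmsco.log", 0, 1))
```

in a single-process run the file name is returned unchanged, which matches the `mpi_size > 1` condition in the diff.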

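the run dictionary read by main() and main_dict() might look as sketched below. only the keys that appear in this diff are shown (`project`, `schedule`, `__module__`, `__class__`, `directories`, `log_file`, `enabled`). the module name, class name and paths are placeholders, and a real run file will likely need further project properties. the helper `schedule_enabled` is hypothetical and mirrors the lookup at the end of main().

```python
# hypothetical run dictionary - all values are placeholders for illustration
run_params = {
    "project": {
        "__module__": "projects.demo.example",  # assumed project module
        "__class__": "ExampleProject",          # assumed project class
        "directories": {"output": "work"},
        "log_file": "",
    },
    "schedule": {
        "__module__": "pmsco.schedule",
        "__class__": "PsiRaSchedule",
        "enabled": False,
    },
}


def schedule_enabled(run_dict):
    """Return True only if the run dictionary requests a scheduled job."""
    try:
        return bool(run_dict["schedule"]["enabled"])
    except KeyError:
        # no schedule section or no enabled flag: run the calculation directly
        return False


if __name__ == "__main__":
    print(schedule_enabled(run_params))
```

with `enabled` set to false (or the schedule section missing), the calculation runs directly. with `enabled` set to true, the job is handed to the schedule class named in the dictionary.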
--- a/pmsco/project.py
+++ b/pmsco/project.py
--- a/pmsco/schedule.py
+++ b/pmsco/schedule.py
@ -0,0 +1,309 @@
+"""
+@package pmsco.schedule
+job schedule interface
+
+this module defines common infrastructure to submit a pmsco calculation job to a job scheduler such as slurm.
+
+the schedule can be defined as part of the run-file (see pmsco module).
+users may derive sub-classes in a separate module to adapt to their own computing cluster.
+
+the basic call sequence is:
+1. create a schedule object.
+2. initialize its properties with job parameters.
+3. validate()
+4. submit()
+
+@author Matthias Muntwiler, matthias.muntwiler@psi.ch
+
+@copyright (c) 2015-21 by Paul Scherrer Institut @n
+Licensed under the Apache License, Version 2.0 (the "License"); @n
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+"""
+
+import collections.abc
+import commentjson as json
+import datetime
+import logging
+from pathlib import Path
+import shutil
+import subprocess
+import pmsco.config
+
+logger = logging.getLogger(__name__)
+
+
+class JobSchedule(pmsco.config.ConfigurableObject):
+    """
+    base class for job schedule
+
+    this class defines the abstract interface and some utilities.
+    derived classes may override any method, but should call the inherited method.
+
+    usage:
+    1. create object, assigning a project instance.
+    2. assign run_dict.
+    3. call validate.
+    4. call submit.
+
+    this class' properties should not be listed in the run file - they will be overwritten.
+    """
+
+    ## @var enabled (bool)
+    #
+    # this parameter signals whether pmsco should schedule a job or run the calculation.
+    # it is not directly used by the schedule classes but by the pmsco module.
+    # it must be defined in the run file and set to true to submit the job to a scheduler.
+    # it is set to false in the run file copied to the job directory so that the job script starts the calculation.
+
+    def __init__(self, project):
+        super(JobSchedule, self).__init__()
+        self.project = project
+        self.enabled = False
+        self.run_dict = {}
+        self.job_dir = Path()
+        self.job_file = Path()
+        self.run_file = Path()
+        # directory that contains the pmsco and projects directories
+        self.pmsco_root = Path(__file__).parent.parent
+
+    def validate(self):
+        """
+        validate the job parameters.
+
+        make sure all object attributes are correct for submission.
+
+        @return: None
+        """
+        self.pmsco_root = Path(self.project.directories['pmsco']).parent
+        output_dir = Path(self.project.directories['output'])
+
+        assert self.pmsco_root.is_dir()
+        assert (self.pmsco_root / "pmsco").is_dir()
+        assert (self.pmsco_root / "projects").is_dir()
+        assert output_dir.is_dir()
+        assert self.project.job_name
+
+        self.job_dir = output_dir / self.project.job_name
+        self.job_dir.mkdir(parents=True, exist_ok=True)
+        self.job_file = (self.job_dir / self.project.job_name).with_suffix(".sh")
+        self.run_file = (self.job_dir / self.project.job_name).with_suffix(".json")
+
+    def submit(self):
+        """
+        submit the job to the scheduler.
+
+        as of this class, the method does the following:
+
+        1. copy the source files.
+        2. copy a patched version of the run file.
+        3. write the job file (_write_job_file must be implemented by a derived class).
+
+        @return: None
+        """
+        self._copy_source()
+        self._fix_run_file()
+        self._write_run_file()
+        self._write_job_file()
+
+    def _copy_source(self):
+        """
+        copy the source files to the job directory.
+
+        the pmsco_root and job_dir attributes must be correct.
+        the job_dir/pmsco directory must not exist and will be created.
+
+        this is a utility method used internally by derived classes.
+
+        job_dir/pmsco/pmsco/**
+        job_dir/pmsco/projects/**
+        job_dir/job.sh
+        job_dir/job.json
+
+        @return: None
+        """
+
+        source = self.pmsco_root
+        dest = self.job_dir / "pmsco"
+        ignore = shutil.ignore_patterns(".*", "~*", "*~")
+        shutil.copytree(source / "pmsco", dest / "pmsco", ignore=ignore)
+        shutil.copytree(source / "projects", dest / "projects", ignore=ignore)
+
+    def _fix_run_file(self):
+        """
+        fix the run file.
+
+        patch some entries of self.run_dict so that it can be used as run file.
+        the following changes are made:
+        1. set schedule.enabled to false so that the calculation is run.
+        2. set the output directory to the job directory.
+        3. set the log file to the job directory.
+
+        @return: None
+        """
+        self.run_dict['schedule']['enabled'] = False
+        self.run_dict['project']['directories']['output'] = str(self.job_dir)
+        self.run_dict['project']['log_file'] = str((self.job_dir / self.project.job_name).with_suffix(".log"))
+
+    def _write_run_file(self):
+        """
+        copy the run file.
+
+        this is a JSON dump of self.run_dict to the self.run_file file.
+
+        @return: None
+        """
+        with open(self.run_file, "wt") as f:
+            json.dump(self.run_dict, f, indent=2)
+
+    def _write_job_file(self):
+        """
+        create the job script.
+
+        this method must be implemented by a derived class.
+        the script must be written to the self.job_file file.
+        don't forget to make the file executable.
+
+        @return: None
+        """
+        pass
+
+
+class SlurmSchedule(JobSchedule):
+    """
+    job schedule for a slurm scheduler.
+
+    this class implements commonly used features of the slurm scheduler.
+    host-specific features and the creation of the job file should be done in a derived class.
+    derived classes must, in particular, implement the _write_job_file method.
+    they can override other methods, too, but should call the inherited method first.
+
+    the submit method performs the following steps:
+
+    1. copy the source trees (pmsco and projects) to the job directory.
+    2. copy a patched version of the run file.
+    3. call the submission command.
+
+    the public properties of this class should be assigned from the run file.
+    """
+    def __init__(self, project):
+        super(SlurmSchedule, self).__init__(project)
+        self.host = ""
+        self.nodes = 1
+        self.tasks_per_node = 8
+        self.wall_time = datetime.timedelta(hours=1)
+        self.signal_time = 600
+        self.manual = True
+
+    @staticmethod
+    def parse_timedelta(td):
+        """
+        parse time delta input formats
+
+        converts a string or dictionary from run-file into datetime.timedelta.
+
+        @param td:
+            str: [days-]hours[:minutes[:seconds]]
+            dict: days, hours, minutes, seconds - at least one needs to be defined. values must be numeric.
+            datetime.timedelta - native type
+        @return: datetime.timedelta
+        """
+        if isinstance(td, str):
+            dt = {}
+            d = td.split("-")
+            if len(d) > 1:
+                dt['days'] = float(d.pop(0))
+            t = d[0].split(":")
+            try:
+                dt['hours'] = float(t.pop(0))
+                dt['minutes'] = float(t.pop(0))
+                dt['seconds'] = float(t.pop(0))
+            except (IndexError, ValueError):
+                pass
+            td = datetime.timedelta(**dt)
+        elif isinstance(td, collections.abc.Mapping):
+            td = datetime.timedelta(**td)
+        return td
+
+    def validate(self):
+        super(SlurmSchedule, self).validate()
+        self.wall_time = self.parse_timedelta(self.wall_time)
+        assert self.job_dir.is_absolute()
+
+    def submit(self):
+        """
+        call the sbatch command
+
+        if manual is true, the job files are generated but the job is not submitted.
+
+        @return: None
+        """
+        super(SlurmSchedule, self).submit()
+        args = ['sbatch', str(self.job_file)]
+        print(" ".join(args))
+        if self.manual:
+            print("manual run - job files created but not submitted")
+        else:
+            cp = subprocess.run(args)
+            cp.check_returncode()
+
+
+class PsiRaSchedule(SlurmSchedule):
+    """
+    job schedule for the Ra cluster at PSI.
+
+    this class selects specific features of the Ra cluster,
+    such as the partition and node type (24 or 32 cores).
+    it also implements the _write_job_file method.
+    """
+
+    ## @var partition (str)
+    #
+    # the partition is selected based on wall time and number of tasks by the validate() method.
+    # it should not be listed in the run file.
+
+    def __init__(self, project):
+        super(PsiRaSchedule, self).__init__(project)
+        self.partition = "shared"
+
+    def validate(self):
+        super(PsiRaSchedule, self).validate()
+        assert self.nodes <= 2
+        assert self.tasks_per_node <= 24 or self.tasks_per_node == 32
+        assert self.wall_time.total_seconds() >= 60
+        if self.wall_time.total_seconds() > 24 * 60 * 60:
+            self.partition = "week"
+        elif self.tasks_per_node < 24:
+            self.partition = "shared"
+        else:
+            self.partition = "day"
+        assert self.partition in ["day", "week", "shared"]
+
+    def _write_job_file(self):
+        lines = []
+
+        lines.append('#!/bin/bash')
+        lines.append('#SBATCH --export=NONE')
+        lines.append(f'#SBATCH --job-name="{self.project.job_name}"')
+        lines.append(f'#SBATCH --partition={self.partition}')
+        lines.append(f'#SBATCH --time={int(self.wall_time.total_seconds() / 60)}')
+        lines.append(f'#SBATCH --nodes={self.nodes}')
+        lines.append(f'#SBATCH --ntasks-per-node={self.tasks_per_node}')
+        if self.tasks_per_node > 24:
+            lines.append('#SBATCH --cores-per-socket=16')
+        # 0 - 65535 seconds
+        # currently, PMSCO does not react to signals properly
+        # lines.append(f'#SBATCH --signal=TERM@{self.signal_time}')
+        lines.append(f'#SBATCH --output="{self.project.job_name}.o.%j"')
+        lines.append(f'#SBATCH --error="{self.project.job_name}.e.%j"')
+        lines.append('module load psi-python36/4.4.0')
+        lines.append('module load gcc/4.8.5')
+        lines.append('module load openmpi/3.1.3')
+        lines.append('source activate pmsco')
+        lines.append(f'cd "{self.job_dir}"')
+        lines.append(f'mpirun python pmsco/pmsco -r {self.run_file.name}')
+        lines.append(f'cd "{self.job_dir}"')
+        lines.append('rm -rf pmsco')
+        lines.append('exit 0')
+
+        self.job_file.write_text("\n".join(lines))
+        self.job_file.chmod(0o755)
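the wall time formats accepted by SlurmSchedule.parse_timedelta can be illustrated with a standalone sketch. the function below re-states the parsing logic from the diff outside the class so that it runs without pmsco. the loop replaces the three sequential pop calls and is meant to behave the same way. the input values are arbitrary examples.

```python
import collections.abc
import datetime


def parse_timedelta(td):
    """Convert "[days-]hours[:minutes[:seconds]]" or a dict into a timedelta."""
    if isinstance(td, str):
        fields = {}
        parts = td.split("-")
        if len(parts) > 1:
            fields["days"] = float(parts.pop(0))
        clock = parts[0].split(":")
        for key in ("hours", "minutes", "seconds"):
            try:
                fields[key] = float(clock.pop(0))
            except (IndexError, ValueError):
                # stop at the first missing or non-numeric field
                break
        td = datetime.timedelta(**fields)
    elif isinstance(td, collections.abc.Mapping):
        td = datetime.timedelta(**td)
    return td


if __name__ == "__main__":
    for value in ("2", "1-12:30", {"minutes": 90}):
        print(value, "->", parse_timedelta(value))
```

a bare number is read as hours, so a wall time of "2" corresponds to 120 minutes in the `#SBATCH --time` line written by PsiRaSchedule.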
--- a/projects/common/clusters/crystals.py
+++ b/projects/common/clusters/crystals.py
@ -0,0 +1,138 @@
+"""
+@package projects.common.clusters.crystals
+cluster generators for some common bulk crystals
+
+@author Matthias Muntwiler, matthias.muntwiler@psi.ch
+
+@copyright (c) 2015-19 by Paul Scherrer Institut @n
+Licensed under the Apache License, Version 2.0 (the "License"); @n
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import math
+import numpy as np
+import os.path
+import periodictable as pt
+import logging
+
+import pmsco.cluster as cluster
+import pmsco.dispatch as dispatch
+import pmsco.project as project
+from pmsco.helpers import BraceMessage as BMsg
+
+logger = logging.getLogger(__name__)
+
+
+class ZincblendeCluster(cluster.ClusterGenerator):
+    def __init__(self, proj):
+        super(ZincblendeCluster, self).__init__(proj)
+        self.atomtype1 = 30
+        self.atomtype2 = 16
+        self.bulk_lattice = 1.0
+        self.surface = (1, 1, 1)
+
+    @classmethod
+    def check(cls, outfilename=None, model_dict=None, domain_dict=None):
+        """
+        function to test and debug the cluster generator.
+
+        to use this function, you don't need to import or initialize anything but the class.
+        though the project class is used internally, the result does not depend on any project settings.
+
+        @param outfilename: name of output file for the cluster (XYZ format).
+            the file is written to the same directory where this module is located.
+            if empty or None, no file is written.
+
+        @param model_dict: dictionary of model parameters to override the default values.
+
+        @param domain_dict: dictionary of domain parameters to override the default values.
+
+        @return: @ref pmsco.cluster.Cluster object
+        """
+        proj = project.Project()
+        dom = project.ModelSpace()
+        dom.add_param('dlat', 10.)
+        dom.add_param('rmax', 5.0)
+        if model_dict:
+            dom.start.update(model_dict)
+
+        try:
+            proj.domains[0].update({'zrot': 0.})
+        except IndexError:
+            proj.add_domain({'zrot': 0.})
+        if domain_dict:
+            proj.domains[0].update(domain_dict)
+        proj.add_scan("", 'C', '1s')
+
+        clu_gen = cls(proj)
+        index = dispatch.CalcID(0, 0, 0, -1, -1)
+        clu = clu_gen.create_cluster(dom.start, index)
+
+        if outfilename:
+            project_dir = os.path.dirname(os.path.abspath(__file__))
+            outfilepath = os.path.join(project_dir, outfilename)
+            clu.save_to_file(outfilepath, fmt=cluster.FMT_XYZ, comment="{0} {1} {2}".format(cls, index, str(dom.start)))
+
+        return clu
+
+    def count_emitters(self, model, index):
+        return 1
+
+    def create_cluster(self, model, index):
+        """
+        calculate a specific set of atom positions given the optimizable parameters.
+
+        @param model  (dict)          optimizable parameters
+            @arg    model['dlat']     bulk lattice constant in Angstrom
+            @arg    model['rmax']     cluster radius
+            @arg    model['phi']      azimuthal rotation angle in degrees
+
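+        @param index (named tuple CalcID) calculation index.
+            the domain index selects the domain parameters.
+            dom['term'] (surface termination) is given as atomic number or chemical symbol.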
+        """
+        clu = cluster.Cluster()
+        clu.comment = "{0} {1}".format(self.__class__, index)
+        clu.set_rmax(model['rmax'])
+        a_lat = model['dlat']
+        dom = self.project.domains[index.domain]
+        try:
+            term = int(dom['term'])
+        except ValueError:
+            term = pt.elements.symbol(dom['term'].strip()).number
+
+        if self.surface == (0, 0, 1):
+            # identity matrix
+            m = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
+        elif self.surface == (1, 1, 1):
+            # this will map the [111] direction onto the z-axis
+            m1 = np.array([1, -1, 0]) * math.sqrt(1/2)
+            m2 = np.array([0.5, 0.5, -1]) * math.sqrt(2/3)
+            m3 = np.array([1, 1, 1]) * math.sqrt(1/3)
+            m = np.array([m1, m2, m3])
+        else:
+            raise ValueError("unsupported surface specification")
+
+        # lattice vectors
+        a1 = np.matmul(m, np.array((1.0, 0.0, 0.0)) * a_lat)
+        a2 = np.matmul(m, np.array((0.0, 1.0, 0.0)) * a_lat)
+        a3 = np.matmul(m, np.array((0.0, 0.0, 1.0)) * a_lat)
+
+        # basis
+        b1 = [np.array((0.0, 0.0, 0.0)), (a2 + a3) / 2, (a3 + a1) / 2, (a1 + a2) / 2]
+        if term == self.atomtype1:
+            d1 = np.array((0, 0, 0))
+            d2 = (a1 + a2 + a3) / 4
+        else:
+            d1 = -(a1 + a2 + a3) / 4
+            d2 = np.array((0, 0, 0))
+        for b in b1:
+            clu.add_bulk(self.atomtype1, b + d1, a1, a2, a3)
+            clu.add_bulk(self.atomtype2, b + d2, a1, a2, a3)
+
+        return clu
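
Review note: the (1, 1, 1) branch of `create_cluster` above builds a matrix whose rows are an orthonormal basis with the third row along [111]. The following minimal sketch, written without numpy so that it has no dependencies, checks that the matrix maps [111] onto the z-axis and that its rows are orthonormal. The helper names are hypothetical.

```python
import math


def surface_matrix_111():
    """Rows as in the diff above: m1, m2 span the surface plane, m3 is along [111]."""
    m1 = [c * math.sqrt(1 / 2) for c in (1, -1, 0)]
    m2 = [c * math.sqrt(2 / 3) for c in (0.5, 0.5, -1)]
    m3 = [c * math.sqrt(1 / 3) for c in (1, 1, 1)]
    return [m1, m2, m3]


def matvec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(row[i] * v[i] for i in range(3)) for row in m]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    m = surface_matrix_111()
    # the cube diagonal ends up on the z-axis with length sqrt(3)
    print(matvec(m, [1.0, 1.0, 1.0]))
    # rows are orthonormal: the diagonal of m * m^T is 1, the rest is 0
    print([[round(dot(r1, r2), 12) for r2 in m] for r1 in m])
```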
--- a/projects/demo/fcc.py
+++ b/projects/demo/fcc.py
@ -91,7 +91,7 @@ class FCC111Project(mp.Project):
        par['V0']  = inner potential
        par['Zsurf'] = position of surface
        """
-        params = mp.Params()
+        params = mp.CalculatorParams()

        params.title = "fcc(111)"
        params.comment = "{0} {1}".format(self.__class__, index)
@ -133,11 +133,11 @@ class FCC111Project(mp.Project):

        return params

-    def create_domain(self):
+    def create_model_space(self):
        """
-        define the domain of the optimization parameters.
+        define the model space of the optimization parameters.
        """
-        dom = mp.Domain()
+        dom = mp.ModelSpace()

        if self.mode == "single":
            dom.add_param('rmax',     5.00,    5.00, 15.00, 2.50)
@ -190,7 +190,7 @@ def create_project():
    project.scan_dict['alpha'] = {'filename': os.path.join(project_dir, "demo_alpha_scan.etp"),
                                  'emitter': "Ni", 'initial_state': "3s"}

-    project.add_symmetry({'default': 0.0})
+    project.add_domain({'default': 0.0})

    return project

@ -229,8 +229,9 @@ def set_project_args(project, project_args):

    try:
        if project_args.initial_state:
-            project.initial_state = project_args.initial_state
-            logger.warning(BMsg("override initial states to {0}", project.initial_state))
+            for scan in project.scans:
+                scan.initial_state = project_args.initial_state
+            logger.warning(f"override initial states of all scans to {project_args.initial_state}")
    except AttributeError:
        pass

--- a/projects/demo/molecule.py
+++ b/projects/demo/molecule.py
@ -0,0 +1,384 @@
+"""
+@package pmsco.projects.demo.molecule
+scattering calculation project for single molecules
+
+the atomic positions are read from a molecule file.
+cluster file, emitter (by chemical symbol), initial state and kinetic energy are specified on the command line.
+there are no structural parameters.
+
+@author Matthias Muntwiler, matthias.muntwiler@psi.ch
+
+@copyright (c) 2015-20 by Paul Scherrer Institut @n
+Licensed under the Apache License, Version 2.0 (the "License"); @n
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+"""
+
+import math
+import numpy as np
+import os.path
+from pathlib import Path
+import periodictable as pt
+import argparse
+import logging
+
+# noinspection PyUnresolvedReferences
+from pmsco.calculators.calculator import InternalAtomicCalculator
+# noinspection PyUnresolvedReferences
+from pmsco.calculators.edac import EdacCalculator
+# noinspection PyUnresolvedReferences
+from pmsco.calculators.phagen.runner import PhagenCalculator
+import pmsco.cluster as cluster
+from pmsco.data import calc_modfunc_loess
+# noinspection PyUnresolvedReferences
+import pmsco.elements.bindingenergy
+from pmsco.helpers import BraceMessage as BMsg
+import pmsco.project as project
+
+logger = logging.getLogger(__name__)
+
+
+class MoleculeFileCluster(cluster.ClusterGenerator):
+    """
+    cluster generator based on external file.
+
+    work in progress.
+    """
+    def __init__(self, project):
+        super(MoleculeFileCluster, self).__init__(project)
+        self.base_cluster = None
+
+    def load_base_cluster(self):
+        """
+        load and cache the project-defined coordinate file.
+
+        the file path is set in self.project.cluster_file.
+        the file must be in XYZ (.xyz) or PMSCO cluster (.clu) format (cf. pmsco.cluster module).
+
+        @return: Cluster object (also referenced by self.base_cluster)
+        """
+        if self.base_cluster is None:
+            clu = cluster.Cluster()
+            clu.set_rmax(120.0)
+            p = Path(self.project.cluster_file)
+            ext = p.suffix
+            if ext == ".xyz":
+                fmt = cluster.FMT_XYZ
+            elif ext == ".clu":
+                fmt = cluster.FMT_PMSCO
+            else:
+                raise ValueError(f"unknown cluster file extension {ext}")
+            clu.load_from_file(self.project.cluster_file, fmt=fmt)
+            self.base_cluster = clu
+
+        return self.base_cluster
+
+    def count_emitters(self, model, index):
+        """
+        count the number of emitter configurations.
+
+        the method creates the full cluster and counts the emitters.
+
+        @param model: model parameters.
+        @param index: scan and domain are used by the create_cluster() method,
+            emit decides whether the method returns the number of configurations (-1),
+            or the number of emitters in the specified configuration (>= 0).
+        @return: number of emitter configurations.
+        """
+        clu = self.create_cluster(model, index)
+        return clu.get_emitter_count()
+
+    def create_cluster(self, model, index):
+        """
+        import a cluster from a coordinate file (XYZ or PMSCO cluster format).
+
+        the method does the following:
+        - load the cluster file specified by self.project.cluster_file.
+        - trim the cluster according to model['rmax'].
+        - mark the atoms whose chemical symbol matches the emitter of the scan as emitters.
+        - rotate the cluster according to the xrot, yrot and zrot parameters.
+
+        @param model: rmax is the trim radius of the cluster in angstrom.
+
+        @param index (named tuple CalcID) calculation index.
+            this method uses the domain index to look up domain parameters in
+            `pmsco.project.Project.domains`.
+            `index.emit` selects whether a single-emitter (>= 0) or all-emitter cluster (== -1) is returned.
+
+        @return pmsco.cluster.Cluster object
+        """
+        self.load_base_cluster()
+        clu = cluster.Cluster()
+        clu.copy_from(self.base_cluster)
+        clu.comment = f"{self.__class__}, {index}"
+        dom = self.project.domains[index.domain]
+
+        # trim
+        clu.set_rmax(model['rmax'])
+        clu.trim_sphere(clu.rmax)
+
+        # emitter selection
+        idx_emit = np.where(clu.data['s'] == self.project.scans[index.scan].emitter)
+        assert isinstance(idx_emit, tuple)
+        idx_emit = idx_emit[0]
+        if index.emit >= 0:
+            idx_emit = idx_emit[index.emit]
+        clu.data['e'][idx_emit] = 1
+
+        # rotation
+        if 'xrot' in model:
+            clu.rotate_x(model['xrot'])
+        elif 'xrot' in dom:
+            clu.rotate_x(dom['xrot'])
+        if 'yrot' in model:
+            clu.rotate_y(model['yrot'])
+        elif 'yrot' in dom:
+            clu.rotate_y(dom['yrot'])
+        if 'zrot' in model:
+            clu.rotate_z(model['zrot'])
+        elif 'zrot' in dom:
+            clu.rotate_z(dom['zrot'])
+
+        logger.info(f"cluster for calculation {index}: "
+                    f"{clu.get_atom_count()} atoms, {clu.get_emitter_count()} emitters")
+
+        return clu
+
+
+class MoleculeProject(project.Project):
+    """
+    general molecule project.
+
+    the following model parameters are used:
+
+    @arg `model['zsurf']`   : position of surface above molecule (angstrom)
+    @arg `model['Texp']`    : experimental temperature (K)
+    @arg `model['Tdeb']`    : debye temperature (K)
+    @arg `model['V0']`      : inner potential (eV)
+    @arg `model['rmax']`    : cluster radius (angstrom)
+    @arg `model['ares']`    : angular resolution (degrees, FWHM)
+    @arg `model['distm']`   : dmax for EDAC (angstrom)
+
+    the following domain parameters are used.
+    they can also be specified as model parameters.
+
+    @arg `'xrot'`           : rotation about x-axis (applied first) (deg)
+    @arg `'yrot'`           : rotation about y-axis (applied after x) (deg)
+    @arg `'zrot'`           : rotation about z-axis (applied after x and y) (deg)
+
+    the project parameters are:
+
+    @arg `cluster_file`    : name of cluster file of template molecule.
+                              default: "demo-cluster.xyz"
+    """
+    def __init__(self):
+        """
+        initialize a project instance
+        """
+        super(MoleculeProject, self).__init__()
+        self.model_space = project.ModelSpace()
+        self.scan_dict = {}
+        self.cluster_file = "demo-cluster.xyz"
+        self.cluster_generator = MoleculeFileCluster(self)
+        self.atomic_scattering_factory = PhagenCalculator
+        self.multiple_scattering_factory = EdacCalculator
+        self.phase_files = {}
+        self.rme_files = {}
+        self.modf_smth_ei = 0.5
+
+    def create_params(self, model, index):
+        """
+        set a specific set of parameters given the optimizable parameters.
+
+        @param model: (dict) optimization parameters
+            this method requires zsurf, V0, Texp, Tdeb, ares and distm.
+
+        @param index (named tuple CalcID) calculation index.
+            this method formats the index into the comment line.
+        """
+        params = project.CalculatorParams()
+
+        params.title = "molecule demo"
+        params.comment = f"{self.__class__} {index}"
+        params.cluster_file = ""
+        params.output_file = ""
+        initial_state = self.scans[index.scan].initial_state
+        params.initial_state = initial_state
+        emitter = self.scans[index.scan].emitter
+        params.binding_energy = pt.elements.symbol(emitter).binding_energy[initial_state]
+        params.polarization = "H"
+        params.z_surface = model['zsurf']
+        params.inner_potential = model['V0']
+        params.work_function = 4.5
+        params.polar_incidence_angle = 60.0
+        params.azimuthal_incidence_angle = 0.0
+        params.angular_resolution = model['ares']
+        params.experiment_temperature = model['Texp']
+        params.debye_temperature = model['Tdeb']
+        params.phase_files = self.phase_files
+        params.rme_files = self.rme_files
+        # edac_interface only
+        params.emitters = []
+        params.lmax = 15
+        params.dmax = model['distm']
+        params.orders = [20]
+
+        return params
+
+    def create_model_space(self):
+        """
+        define the range of model parameters.
+
+        see the class description for a list of parameters.
+        """
+
+        return self.model_space
+
+    # noinspection PyUnusedLocal
+    def calc_modulation(self, data, model):
+        """
+        calculate the modulation function with project-specific smoothing factor
+
+        see @ref pmsco.pmsco.project.calc_modulation.
+
+        @param data: (numpy.ndarray) experimental data in ETPI, or ETPAI format.
+
+        @param model: (dict) model parameters of the calculation task. not used.
+
+        @return copy of the data array with the modulation function in the 'i' column.
+        """
+        return calc_modfunc_loess(data, smth=self.modf_smth_ei)
+
+
+def create_model_space(mode):
+    """
+    define the model space.
+    """
+    dom = project.ModelSpace()
+
+    if mode == "single":
+        dom.add_param('zsurf',   1.20)
+        dom.add_param('Texp',  300.00)
+        dom.add_param('Tdeb',  100.00)
+        dom.add_param('V0',     10.00)
+        dom.add_param('rmax',   50.00)
+        dom.add_param('ares',    5.00)
+        dom.add_param('distm',   5.00)
+        dom.add_param('wdom1', 1.0)
+        dom.add_param('wdom2', 1.0)
+        dom.add_param('wdom3', 1.0)
+        dom.add_param('wdom4', 1.0)
+        dom.add_param('wdom5', 1.0)
+    else:
+        raise ValueError(f"undefined model space for {mode} optimization")
+
+    return dom
+
+
+def create_project():
+    """
+    create the project instance.
+    """
+
+    proj = MoleculeProject()
+    proj_dir = os.path.dirname(os.path.abspath(__file__))
+    proj.project_dir = proj_dir
+
+    # scan dictionary
+    # to select any number of scans, add their dictionary keys as scans option on the command line
+    proj.scan_dict['empty'] = {'filename': os.path.join(proj_dir, "../common/empty-hemiscan.etpi"),
+                               'emitter': "N", 'initial_state': "1s"}
+
+    proj.mode = 'single'
+    proj.model_space = create_model_space(proj.mode)
+    proj.job_name = 'molecule0000'
+    proj.description = 'molecule demo'
+
+    return proj
+
+
+def set_project_args(project, project_args):
+    """
+    set the project arguments.
+
+    @param project: project instance
+
+    @param project_args: (Namespace object) project arguments.
+    """
+
+    scans = []
+    try:
+        if project_args.scans:
+            scans = project_args.scans
+        else:
+            logger.error("missing scan argument")
+            exit(1)
+    except AttributeError:
+        logger.error("missing scan argument")
+        exit(1)
+
+    for scan_key in scans:
+        scan_spec = project.scan_dict[scan_key]
+        project.add_scan(**scan_spec)
+
+    try:
+        project.cluster_file = os.path.abspath(project_args.cluster_file)
+        project.cluster_generator = MoleculeFileCluster(project)
+    except (AttributeError, TypeError):
+        logger.error("missing cluster-file argument")
+        exit(1)
+
+    try:
+        if project_args.emitter:
+            for scan in project.scans:
+                scan.emitter = project_args.emitter
+            logger.warning(f"override emitters of all scans to {project_args.emitter}")
+    except AttributeError:
+        pass
+
+    try:
+        if project_args.initial_state:
+            for scan in project.scans:
+                scan.initial_state = project_args.initial_state
+            logger.warning(f"override initial states of all scans to {project_args.initial_state}")
+    except AttributeError:
+        pass
+
+    try:
+        if project_args.energy:
+            for scan in project.scans:
+                scan.energies = np.asarray((project_args.energy, ))
+            logger.warning(f"override scan energy of all scans to {project_args.energy}")
+    except AttributeError:
+        pass
+
+    try:
+        if project_args.symmetry:
+            for angle in np.linspace(0, 360, num=project_args.symmetry, endpoint=False):
+                project.add_domain({'xrot': 0., 'yrot': 0., 'zrot': angle})
+            logger.warning(f"override rotation symmetry to {project_args.symmetry}")
+    except AttributeError:
+        pass
+
+
+def parse_project_args(_args):
+    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
+
+    # main arguments
+    parser.add_argument('--scans', nargs="*",
+                        help="nick names of scans to use in calculation (see create_project function)")
+    parser.add_argument('--cluster-file',
+                        help="path name of molecule file (xyz format).")
+
+    # conditional arguments
+    parser.add_argument('--emitter',
+                        help="emitter: chemical symbol")
+    parser.add_argument('--initial-state',
+                        help="initial state term: e.g. 2p1/2")
+    parser.add_argument('--energy', type=float,
+                        help="kinetic energy (eV)")
+    parser.add_argument('--symmetry', type=int, default=1,
+                        help="n-fold rotational symmetry")
+
+    parsed_args = parser.parse_args(_args)
+    return parsed_args
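
Review note: the `--symmetry` option above is expanded into one domain per rotation angle. The following minimal sketch reproduces that expansion with the standard library only. It assumes that `360 * k / n` for `k = 0 .. n-1` is equivalent to `np.linspace(0, 360, num=n, endpoint=False)`; `domain_angles` and `make_domains` are hypothetical helpers, not PMSCO functions.

```python
import argparse


def parse_args(argv):
    """Parse only the --symmetry option, with the same type and default as above."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--symmetry', type=int, default=1,
                        help="n-fold rotational symmetry")
    return parser.parse_args(argv)


def domain_angles(n_fold):
    """Evenly spaced rotation angles in degrees, excluding 360."""
    return [360.0 * k / n_fold for k in range(n_fold)]


def make_domains(argv):
    """Build one domain dictionary per symmetry-equivalent rotation."""
    args = parse_args(argv)
    return [{'xrot': 0., 'yrot': 0., 'zrot': angle}
            for angle in domain_angles(args.symmetry)]


if __name__ == "__main__":
    for dom in make_domains(['--symmetry', '3']):
        print(dom)
```

Because the default is 1, a call without `--symmetry` still produces a single domain at zero rotation.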
--- a/projects/twoatom/twoatom-energy.json
+++ b/projects/twoatom/twoatom-energy.json
@ -0,0 +1,93 @@
+{
+  // line comments using // or # prefix are allowed as an extension of JSON syntax
+  "project": {
+    "__module__": "projects.twoatom.twoatom",
+    "__class__": "TwoatomProject",
+    "job_name": "twoatom0002",
+    "job_tags": [],
+    "description": "",
+    "mode": "single",
+    "directories": {
+      "data": "",
+      "output": ""
+    },
+    "keep_files": [
+      "cluster",
+      "model",
+      "scan",
+      "report",
+      "population"
+    ],
+    "keep_best": 10,
+    "keep_levels": 1,
+    "time_limit": 24,
+    "log_file": "",
+    "log_level": "WARNING",
+    "cluster_generator": {
+      "__class__": "TwoatomCluster",
+      "atom_types": {
+        "A": "N",
+        "B": "Ni"
+      },
+      "model_dict": {
+        "dAB": "dNNi",
+        "th": "pNNi",
+        "ph": "aNNi"
+      }
+    },
+    "atomic_scattering_factory": "InternalAtomicCalculator",
+    "multiple_scattering_factory": "EdacCalculator",
+    "model_space": {
+      "dNNi": {
+        "start": 2.109,
+        "min": 2.0,
+        "max": 2.25,
+        "step": 0.05
+      },
+      "pNNi": {
+        "start": 15.0,
+        "min": 0.0,
+        "max": 30.0,
+        "step": 1.0
+      },
+      "V0": {
+        "start": 21.966,
+        "min": 15.0,
+        "max": 25.0,
+        "step": 1.0
+      },
+      "Zsurf": {
+        "start": 1.449,
+        "min": 0.5,
+        "max": 2.0,
+        "step": 0.25
+      }
+    },
+    "domains": [
+      {
+        "default": 0.0
+      }
+    ],
+    "scans": [
+      {
+        "__class__": "mp.ScanCreator",
+        "filename": "twoatom_energy_alpha.etpai",
+        "emitter": "N",
+        "initial_state": "1s",
+        "positions": {
+          "e": "np.arange(10, 400, 5)",
+          "t": "0",
+          "p": "0",
+          "a": "np.linspace(-30, 30, 31)"
+        }
+      }
+    ],
+    "optimizer_params": {
+      "pop_size": 0,
+      "seed_file": "",
+      "seed_limit": 0,
+      "recalc_seed": true,
+      "table_file": ""
+    }
+  }
+}
--- a/projects/twoatom/twoatom-hemi.json
+++ b/projects/twoatom/twoatom-hemi.json
@ -0,0 +1,90 @@
+{
+  // line comments using // or # prefix are allowed as an extension of JSON syntax
+  "project": {
+    "__module__": "projects.twoatom.twoatom",
+    "__class__": "TwoatomProject",
+    "job_name": "twoatom0001",
+    "job_tags": [],
+    "description": "",
+    "mode": "single",
+    "directories": {
+      "data": "",
+      "output": ""
+    },
+    "keep_files": [
+      "cluster",
+      "model",
+      "scan",
+      "report",
+      "population"
+    ],
+    "keep_best": 10,
+    "keep_levels": 1,
+    "time_limit": 24,
+    "log_file": "",
+    "log_level": "WARNING",
+    "cluster_generator": {
+      "__class__": "TwoatomCluster",
+      "atom_types": {
+        "A": "N",
+        "B": "Ni"
+      },
+      "model_dict": {
+        "dAB": "dNNi",
+        "th": "pNNi",
+        "ph": "aNNi"
+      }
+    },
+    "atomic_scattering_factory": "InternalAtomicCalculator",
+    "multiple_scattering_factory": "EdacCalculator",
+    "model_space": {
+        "dNNi": {
+          "start": 2.109,
+          "min": 2.0,
+          "max": 2.25,
+          "step": 0.05
+        },
+        "pNNi": {
+          "start": 15.0,
+          "min": 0.0,
+          "max": 30.0,
+          "step": 1.0
+        },
+        "V0": {
+          "start": 21.966,
+          "min": 15.0,
+          "max": 25.0,
+          "step": 1.0
+        },
+        "Zsurf": {
+          "start": 1.449,
+          "min": 0.5,
+          "max": 2.0,
+          "step": 0.25
+        }
+    },
+    "domains": [
+      {
+        "default": 0.0
+      }
+    ],
+    "scans": [
+      {
+        // class name as it would be used in the project module
+        "__class__": "mp.ScanLoader",
+        // any placeholder key from project.directories can be used
+        "filename": "{project}/twoatom_hemi_250e.etpi",
+        "emitter": "N",
+        "initial_state": "1s",
+        "is_modf": false
+      }
+    ],
+    "optimizer_params": {
+      "pop_size": 0,
+      "seed_file": "",
+      "seed_limit": 0,
+      "recalc_seed": true,
+      "table_file": ""
+    }
+  }
+}
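
Review note: the run files above use line comments and a `{project}` placeholder, neither of which plain JSON understands. The commit adds `commentjson` to the requirements, presumably for the comments. The following is a minimal standard-library sketch of the idea and not how PMSCO or commentjson actually work: `strip_line_comments` is a hypothetical helper that only drops whole-line comments, and the `directories` mapping and its path are made-up values for illustration.

```python
import json


def strip_line_comments(text):
    """Drop lines whose first non-blank characters are // or #.

    Trailing comments after a value are not handled in this sketch.
    """
    kept = []
    for line in text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("//") or stripped.startswith("#"):
            continue
        kept.append(line)
    return "\n".join(kept)


RUN_FILE = """
{
  // line comments are an extension of JSON syntax
  "project": {
    "job_name": "twoatom0001",
    "model_space": {
      "dNNi": {"start": 2.109, "min": 2.0, "max": 2.25, "step": 0.05}
    },
    "scans": [
      {"filename": "{project}/twoatom_hemi_250e.etpi"}
    ]
  }
}
"""

config = json.loads(strip_line_comments(RUN_FILE))
project = config["project"]

# hypothetical placeholder table; the real keys come from project.directories
directories = {"project": "/hypothetical/project/dir"}
filename = project["scans"][0]["filename"].format(**directories)

if __name__ == "__main__":
    print(project["model_space"]["dNNi"])
    print(filename)
```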
--- a/projects/twoatom/twoatom.py
+++ b/projects/twoatom/twoatom.py
@ -17,6 +17,9 @@ import numpy as np
 import os.path
 import periodictable as pt

+from pmsco.calculators.calculator import InternalAtomicCalculator
+from pmsco.calculators.edac import EdacCalculator
+from pmsco.calculators.phagen.runner import PhagenCalculator
 import pmsco.cluster as mc
 import pmsco.project as mp
 from pmsco.helpers import BraceMessage as BMsg
@ -152,6 +155,17 @@ class TwoatomProject(mp.Project):
        self.cluster_generator.model_dict['dAB'] = 'dNNi'
        self.cluster_generator.model_dict['th'] = 'pNNi'
        self.cluster_generator.model_dict['ph'] = 'aNNi'
+        self.atomic_scattering_factory = PhagenCalculator
+        self.multiple_scattering_factory = EdacCalculator
+        self.phase_files = {}
+        self.rme_files = {}
+        self.bindings = {}
+        self.bindings['N'] = {'1s': 409.9}
+        self.bindings['B'] = {'1s': 188.0}
+        self.bindings['Ni'] = {'2s': 1008.6,
+                               '2p': (870.0 + 852.7) / 2, '2p1/2': 870.0, '2p3/2': 852.7,
+                               '3s': 110.8,
+                               '3p': (68.0 + 66.2) / 2, '3p1/2': 68.0, '3p3/2': 66.2}

    def create_params(self, model, index):
        """
@ -159,40 +173,40 @@ class TwoatomProject(mp.Project):

        @param model: (dict) optimizable parameters
        """
-        params = mp.Params()
+        params = mp.CalculatorParams()

        params.title = "two-atom demo"
        params.comment = "{0} {1}".format(self.__class__, index)
        params.cluster_file = ""
        params.output_file = ""
        params.initial_state = self.scans[index.scan].initial_state
-        params.spherical_order = 2
+        initial_state = self.scans[index.scan].initial_state
+        params.initial_state = initial_state
+        emitter = self.scans[index.scan].emitter
+        params.binding_energy = self.bindings[emitter][initial_state]
        params.polarization = "H"
-        params.scattering_level = 5
-        params.fcut = 15.0
-        params.cut = 15.0
-        params.angular_resolution = 0.0
-        params.lattice_constant = 1.0
        params.z_surface = model['Zsurf']
-        params.phase_files = {self.cluster_generator.atom_types['A']: "",
-                              self.cluster_generator.atom_types['B']: ""}
-        params.msq_displacement = {self.cluster_generator.atom_types['A']: 0.01,
-                                   self.cluster_generator.atom_types['B']: 0.0}
-        params.planewave_attenuation = 1.0
        params.inner_potential = model['V0']
        params.work_function = 3.6
-        params.symmetry_range = 360.0
        params.polar_incidence_angle = 60.0
        params.azimuthal_incidence_angle = 0.0
-        params.vibration_model = "P"
-        params.substrate_atomic_mass = 58.69
        params.experiment_temperature = 300.0
        params.debye_temperature = 356.0
-        params.debye_wavevector = 1.7558
-        params.rme_minus_value = 0.0
+
+        if self.phase_files:
+            state = emitter + initial_state
+            try:
+                params.phase_files = self.phase_files[state]
+            except KeyError:
+                params.phase_files = {}
+                logger.warning("no phase files found for {} - using default calculator".format(state))
+
+        params.rme_files = {}
+        params.rme_minus_value = 0.1
        params.rme_minus_shift = 0.0
        params.rme_plus_value = 1.0
        params.rme_plus_shift = 0.0
+
        # used by EDAC only
        params.emitters = []
        params.lmax = 15
@ -201,11 +215,11 @@ class TwoatomProject(mp.Project):

        return params

-    def create_domain(self):
+    def create_model_space(self):
        """
        define the domain of the optimization parameters.
        """
-        dom = mp.Domain()
+        dom = mp.ModelSpace()

        if self.mode == "single":
            dom.add_param('dNNi',     2.109,  2.000,  2.250, 0.050)
@ -294,21 +308,19 @@ def set_project_args(project, project_args):
    @param project_args: (Namespace object) project arguments.
    """

-    scans = ['tp250e']
+    scans = []
    try:
        if project_args.scans:
            scans = project_args.scans
-        else:
-            logger.warning(BMsg("missing scan argument, using {0}", scans[0]))
    except AttributeError:
-        logger.warning(BMsg("missing scan argument, using {0}", scans[0]))
+        pass

    for scan_key in scans:
        scan_spec = project.scan_dict[scan_key]
        project.add_scan(**scan_spec)
        logger.info(BMsg("add scan {filename} ({emitter} {initial_state})", **scan_spec))

-    project.add_symmetry({'default': 0.0})
+    project.add_domain({'default': 0.0})


 def parse_project_args(_args):
@ -323,7 +335,7 @@ def parse_project_args(_args):
    parser = argparse.ArgumentParser()

    # main arguments
-    parser.add_argument('-s', '--scans', nargs="*", default=['tp250e'],
+    parser.add_argument('-s', '--scans', nargs="*",
                        help="nick names of scans to use in calculation (see create_project function)")

    parsed_args = parser.parse_args(_args)
--- a/requirements.txt
+++ b/requirements.txt
@ -1,3 +1,4 @@
+python >= 3.6
 attrdict
 fasteners
 numpy >= 1.13
@ -11,3 +12,4 @@ matplotlib
 future
 swig
 gitpython
+commentjson
--- a/setup.py
+++ b/setup.py
@ -0,0 +1,26 @@
+#!/usr/bin/env python
+
+"""
+@file package distribution information.
+
+preliminary - not tested.
+"""
+
+try:
+    from setuptools import setup
+except ImportError:
+    from distutils.core import setup
+
+config = {
+    'name': "pmsco",
+    'description': "PEARL Multiple-Scattering Cluster Calculation and Structural Optimization",
+    'url': "https://git.psi.ch/pearl/pmsco",
+    'author': "Matthias Muntwiler",
+    'author_email': "matthias.muntwiler@psi.ch",
+    'version': '0.1',
+    'packages': ['pmsco'],
+    'scripts': [],
+    'install_requires': ['numpy', 'periodictable', 'statsmodels', 'mpi4py', 'nose', 'scipy']
+}
+
+setup(**config)
--- a/tests/calculators/__init__.py
+++ b/tests/calculators/__init__.py
--- a/tests/calculators/phagen/__init__.py
+++ b/tests/calculators/phagen/__init__.py
--- a/tests/calculators/phagen/test_translator.py
+++ b/tests/calculators/phagen/test_translator.py
@ -0,0 +1,79 @@
+"""
+@package tests.calculators.phagen.test_translator
+unit tests for pmsco.calculators.phagen.translator
+
+the purpose of these tests is to check whether the code runs as expected in a particular environment.
+
+to run the tests, change to the directory which contains the tests directory, and execute =nosetests=.
+
+@pre nose must be installed (python-nose package on Debian).
+
+@author Matthias Muntwiler, matthias.muntwiler@psi.ch
+
+@copyright (c) 2015-19 by Paul Scherrer Institut @n
+Licensed under the Apache License, Version 2.0 (the "License"); @n
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import unittest
+from pmsco.calculators.phagen import translator
+
+
+class TestModule(unittest.TestCase):
+    def setUp(self):
+        # before each test method
+        pass
+
+    def tearDown(self):
+        # after each test method
+        pass
+
+    @classmethod
+    def setup_class(cls):
+        # before any methods in this class
+        pass
+
+    @classmethod
+    def teardown_class(cls):
+        # teardown_class() after any methods in this class
+        pass
+
+    def test_state_to_edge(self):
+        self.assertEqual(translator.state_to_edge('1s'), 'k')
+        self.assertEqual(translator.state_to_edge('2s'), 'l1')
+        self.assertEqual(translator.state_to_edge('3s'), 'm1')
+        self.assertEqual(translator.state_to_edge('4s'), 'n1')
+        self.assertEqual(translator.state_to_edge('5s'), 'o1')
+
+        self.assertEqual(translator.state_to_edge('2p'), 'l2')
+        self.assertEqual(translator.state_to_edge('3p'), 'm2')
+        self.assertEqual(translator.state_to_edge('4p'), 'n2')
+        self.assertEqual(translator.state_to_edge('5p'), 'o2')
+
+        self.assertEqual(translator.state_to_edge('2p1/2'), 'l2')
+        self.assertEqual(translator.state_to_edge('3p1/2'), 'm2')
+        self.assertEqual(translator.state_to_edge('4p1/2'), 'n2')
+        self.assertEqual(translator.state_to_edge('5p1/2'), 'o2')
+
+        self.assertEqual(translator.state_to_edge('2p3/2'), 'l3')
+        self.assertEqual(translator.state_to_edge('3p3/2'), 'm3')
+        self.assertEqual(translator.state_to_edge('4p3/2'), 'n3')
+        self.assertEqual(translator.state_to_edge('5p3/2'), 'o3')
+
+        self.assertEqual(translator.state_to_edge('3d'), 'm4')
+        self.assertEqual(translator.state_to_edge('4d'), 'n4')
+        self.assertEqual(translator.state_to_edge('5d'), 'o4')
+
+        self.assertEqual(translator.state_to_edge('3d3/2'), 'm4')
+        self.assertEqual(translator.state_to_edge('4d3/2'), 'n4')
+        self.assertEqual(translator.state_to_edge('5d3/2'), 'o4')
+
+        self.assertEqual(translator.state_to_edge('3d5/2'), 'm5')
+        self.assertEqual(translator.state_to_edge('4d5/2'), 'n5')
+        self.assertEqual(translator.state_to_edge('5d5/2'), 'o5')
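Reviewer note: the assertions above pin down a mapping from spectroscopic state labels to X-ray edge labels. The sketch below is one way such a mapping could be written. It is inferred from the test expectations only and is not the code in `pmsco.calculators.phagen.translator`. The name `state_to_edge_sketch` and all module-level tables are hypothetical.

```python
import re

# Hypothetical lookup tables, derived from the expectations in the test above.
_SHELL_LETTERS = {1: "k", 2: "l", 3: "m", 4: "n", 5: "o"}
_ORBITAL_L = {"s": 0, "p": 1, "d": 2}
# (orbital letter, total angular momentum j) -> subshell number in X-ray notation
_SUBSHELL_INDEX = {
    ("s", "1/2"): 1,
    ("p", "1/2"): 2,
    ("p", "3/2"): 3,
    ("d", "3/2"): 4,
    ("d", "5/2"): 5,
}
# If j is omitted, the test expects the lower-j subshell of the orbital.
_DEFAULT_J = {"s": "1/2", "p": "1/2", "d": "3/2"}
_STATE_PATTERN = re.compile(r"([1-5])([spd])([135]/2)?")


def state_to_edge_sketch(state):
    """Translate a label such as '2p3/2' into an edge label such as 'l3'."""
    match = _STATE_PATTERN.fullmatch(state)
    if match is None:
        raise ValueError("unsupported state: %r" % (state,))
    n = int(match.group(1))
    orbital = match.group(2)
    j = match.group(3) or _DEFAULT_J[orbital]
    # Reject combinations that do not exist, e.g. '1p' or '2s3/2'.
    if _ORBITAL_L[orbital] >= n or (orbital, j) not in _SUBSHELL_INDEX:
        raise ValueError("unsupported state: %r" % (state,))
    if n == 1:
        # The K shell has a single edge and carries no subshell number.
        return _SHELL_LETTERS[n]
    return _SHELL_LETTERS[n] + str(_SUBSHELL_INDEX[(orbital, j)])
```

A table-driven form keeps the default-j rule (for example `'3d'` resolving to `'m4'`) in one place, which is the part of the behaviour that is least obvious from the function name.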
--- a/tests/test_cluster.py
+++ b/tests/test_cluster.py
@ -175,6 +175,22 @@ class TestClusterFunctions(unittest.TestCase):

        np.testing.assert_allclose(layers, np.asarray([-0.3, -0.2, -0.1, 0.0, +0.1]), atol=0.001)

+    def test_get_center(self):
+        clu = mc.Cluster()
+        clu.add_atom(1, np.asarray([1, 0, 0]), 0)
+        clu.add_atom(2, np.asarray([0, 0, 1]), 0)
+        clu.add_atom(1, np.asarray([0, 1, 0]), 0)
+        clu.add_atom(2, np.asarray([-1, -1, -1]), 0)
+        v0 = np.asarray([0, 0, 0])
+        v1 = np.asarray([1/2, 1/2, 0])
+        v2 = np.asarray([-1/2, -1/2, 0])
+        v = clu.get_center()
+        np.testing.assert_allclose(v, v0, atol=0.001)
+        v = clu.get_center(element=1)
+        np.testing.assert_allclose(v, v1, atol=0.001)
+        v = clu.get_center(element="He")
+        np.testing.assert_allclose(v, v2, atol=0.001)
+
    def test_relax(self):
        clu = mc.Cluster()
        clu.add_atom(1, np.asarray([1, 0, 1]), 0)
@ -184,7 +200,7 @@ class TestClusterFunctions(unittest.TestCase):
        clu.add_atom(2, np.asarray([0, 1, -3]), 0)
        idx = clu.relax(-0.3, -0.1, 2)

-        np.testing.assert_almost_equal(idx, np.asarray([[2, 4]]))
+        np.testing.assert_almost_equal(idx, np.asarray([2, 4]))
        np.testing.assert_allclose(clu.get_position(0), np.asarray([1, 0, 1]), atol=1e-6)
        np.testing.assert_allclose(clu.get_position(1), np.asarray([1, 0, 0]), atol=1e-6)
        np.testing.assert_allclose(clu.get_position(2), np.asarray([0, 1, -1.1]), atol=1e-6)
@ -224,18 +240,81 @@ class TestClusterFunctions(unittest.TestCase):
        np.testing.assert_allclose(clu.get_position(1), np.asarray([-1, 0, 0]), atol=1e-6)
        np.testing.assert_allclose(clu.get_position(2), np.asarray([0, 0, 1]), atol=1e-6)

+    def test_translate(self):
+        clu = mc.Cluster()
+        clu.add_atom(1, np.asarray([1, 0, 0]), 0)
+        clu.add_atom(2, np.asarray([0, 1, 0]), 0)
+        clu.add_atom(3, np.asarray([0, 0, 1]), 0)
+
+        v = np.array((0.1, 0.2, 0.3))
+        shift = clu.translate(v)
+        np.testing.assert_allclose(clu.get_position(0), np.asarray([1.1, 0.2, 0.3]), atol=1e-6)
+        np.testing.assert_allclose(clu.get_position(1), np.asarray([0.1, 1.2, 0.3]), atol=1e-6)
+        np.testing.assert_allclose(clu.get_position(2), np.asarray([0.1, 0.2, 1.3]), atol=1e-6)
+        np.testing.assert_allclose(shift, np.asarray([0, 1, 2]))
+
+        shift = clu.translate(v, element=3)
+        np.testing.assert_allclose(clu.get_position(0), np.asarray([1.1, 0.2, 0.3]), atol=1e-6)
+        np.testing.assert_allclose(clu.get_position(1), np.asarray([0.1, 1.2, 0.3]), atol=1e-6)
+        np.testing.assert_allclose(clu.get_position(2), np.asarray([0.2, 0.4, 1.6]), atol=1e-6)
+        np.testing.assert_allclose(shift, np.asarray([2]))
+
    def test_add_layer(self):
        clu = mc.Cluster()
-        # from hbncu project
-        b_surf = 2.50
-        clu.set_rmax(4.0)
-        b1 = np.array((b_surf, 0.0, 0.0))
-        b2 = np.array((b_surf / 2.0, b_surf * math.sqrt(3.0) / 2.0, 0.0))
-        a1 = -10.0 * b1 - 10.0 * b2
-        emitter = np.array((0.0, 0.0, 0.0))
+        clu.set_rmax(2.0)
+        b1 = np.array((1.0, 0.0, 0.0))
+        b2 = np.array((0.0, 1.0, 0.0))
+        a1 = np.array((0.1, 0.0, -0.1))
        clu.add_layer(7, a1, b1, b2)
-        pos = clu.find_positions(pos=emitter)
-        self.assertEqual(1, len(pos))
+
+        exp_pos = [[-0.9, -1.0, -0.1], [0.1, -1.0, -0.1], [1.1, -1.0, -0.1],
+                   [-1.9, 0.0, -0.1], [-0.9, 0.0, -0.1], [0.1, 0.0, -0.1], [1.1, 0.0, -0.1],
+                   [-0.9, 1.0, -0.1], [0.1, 1.0, -0.1], [1.1, 1.0, -0.1]]
+
+        nn = len(exp_pos)
+        self.assertEqual(nn, clu.data.shape[0])
+        self.assertEqual(nn, clu.get_atom_count())
+        act_pos = np.sort(clu.get_positions(), axis=0)
+        exp_pos = np.sort(np.array(exp_pos), axis=0)
+        np.testing.assert_allclose(act_pos, exp_pos)
+        act_idx = np.unique(clu.data['i'])
+        self.assertEqual(nn, act_idx.shape[0])
+        act_typ = (clu.data['t'] == 7).nonzero()[0]
+        self.assertEqual(nn, act_typ.shape[0])
+
+    def test_add_bulk(self):
+        clu = mc.Cluster()
+        clu.set_rmax(2.0)
+        b1 = np.array((1.0, 0.0, 0.0))
+        b2 = np.array((0.0, 1.0, 0.0))
+        b3 = np.array((0.0, 0.0, 1.0))
+        a1 = np.array((0.1, 0.0, -0.1))
+        z_surf = 0.8
+        clu.add_bulk(7, a1, b1, b2, b3, z_surf=z_surf)
+
+        r_great = max(clu.rmax, np.linalg.norm(a1))
+        n1 = max(int(r_great / np.linalg.norm(b1)) + 1, 4) * 3
+        n2 = max(int(r_great / np.linalg.norm(b2)) + 1, 4) * 3
+        n3 = max(int(r_great / np.linalg.norm(b3)) + 1, 4) * 3
+        exp_pos = []
+        nn = 0
+        for i1 in range(-n1, n1 + 1):
+            for i2 in range(-n2, n2 + 1):
+                for i3 in range(-n3, n3 + 1):
+                    v = a1 + b1 * i1 + b2 * i2 + b3 * i3
+                    if np.linalg.norm(v) <= clu.rmax and v[2] <= z_surf:
+                        exp_pos.append(v)
+                        nn += 1
+
+        self.assertEqual(nn, clu.data.shape[0])
+        self.assertEqual(nn, clu.get_atom_count())
+        act_pos = np.sort(clu.get_positions(), axis=0)
+        exp_pos = np.sort(np.array(exp_pos), axis=0)
+        np.testing.assert_allclose(act_pos, exp_pos)
+        act_idx = np.unique(clu.data['i'])
+        self.assertEqual(nn, act_idx.shape[0])
+        act_typ = (clu.data['t'] == 7).nonzero()[0]
+        self.assertEqual(nn, act_typ.shape[0])

    def test_add_cluster(self):
        clu1 = mc.Cluster()
@ -375,6 +454,18 @@ class TestClusterFunctions(unittest.TestCase):
        f = BytesIO()
        pos = np.asarray((-1, -1, 0))
        clu.set_emitter(pos=pos)
+        clu.save_to_file(f, mc.FMT_PMSCO, "qwerty", emitters_only=True)
+        f.seek(0)
+        line = f.readline()
+        self.assertEqual(line, b"# index element symbol class x y z emitter charge\n", b"line 1: " + line)
+        line = f.readline()
+        self.assertRegexpMatches(line, b"[0-9]+ +1 +H +[0-9]+ +[0.]+ +[0.]+ +[0.]+ +1 +[0.]", b"line 2: " + line)
+        line = f.readline()
+        self.assertRegexpMatches(line, b"[0-9]+ +14 +Si +[0-9]+ +[01.-]+ +[01.-]+ +[0.]+ +1 +[0.]", b"line 3: " + line)
+        line = f.readline()
+        self.assertEqual(b"", line, b"end of file")
+
+        f = BytesIO()
        clu.save_to_file(f, mc.FMT_XYZ, "qwerty", emitters_only=True)
        f.seek(0)
        line = f.readline()
@ -388,6 +479,37 @@ class TestClusterFunctions(unittest.TestCase):
        line = f.readline()
        self.assertEqual(b"", line, b"end of file")

+    def test_load_from_file(self):
+        f = BytesIO()
+        f.write(b"2\n")
+        f.write(b"qwerty\n")
+        f.write(b"H 0.5 0.6 0.7\n")
+        f.write(b"Si -1.5 -1.6 -1.7\n")
+        f.seek(0)
+        clu = mc.Cluster()
+        clu.load_from_file(f, fmt=mc.FMT_XYZ)
+        np.testing.assert_allclose(clu.data['t'], np.array([1, 14]))
+        np.testing.assert_allclose(clu.data['x'], np.array([0.5, -1.5]))
+        np.testing.assert_allclose(clu.data['y'], np.array([0.6, -1.6]))
+        np.testing.assert_allclose(clu.data['z'], np.array([0.7, -1.7]))
+        np.testing.assert_allclose(clu.data['e'], np.array([0, 0]))
+        np.testing.assert_allclose(clu.data['q'], np.array([0, 0]))
+
+        f = BytesIO()
+        f.write(b"# index element symbol class x y z emitter charge\n")
+        # ['i', 't', 's', 'c', 'x', 'y', 'z', 'e', 'q']
+        f.write(b"1 6 C 1 0.5 0.6 0.7 0 -0.5\n")
+        f.write(b"2 14 Si 2 -1.5 -1.6 -1.7 1 0.5\n")
+        f.seek(0)
+        clu = mc.Cluster()
+        clu.load_from_file(f, fmt=mc.FMT_PMSCO)
+        np.testing.assert_allclose(clu.data['t'], np.array([6, 14]))
+        np.testing.assert_allclose(clu.data['x'], np.array([0.5, -1.5]))
+        np.testing.assert_allclose(clu.data['y'], np.array([0.6, -1.6]))
+        np.testing.assert_allclose(clu.data['z'], np.array([0.7, -1.7]))
+        np.testing.assert_allclose(clu.data['e'], np.array([0, 1]))
+        np.testing.assert_allclose(clu.data['q'], np.array([-0.5, 0.5]))
+
    def test_update_atoms(self):
        clu = mc.Cluster()
        clu.add_atom(1, np.asarray([0, 0, 0]), 1)
@ -409,3 +531,29 @@ class TestClusterFunctions(unittest.TestCase):
        clu.update_atoms(other, {'c'})
        expected = np.asarray((1, 3, 2, 3, 2, 4))
        np.testing.assert_array_equal(expected, clu.data['c'])
+
+    def test_calc_scattering_angles(self):
+        clu = mc.Cluster()
+        ref_em = np.asarray([0.1, -0.1, 0.5])
+        ref_th = np.asarray([0., 15., 90., 100., 120.])
+        ref_ph = np.asarray([0., 90., 180., 270., 360.])
+        ref_di = np.asarray([0.5, 1.0, 1.5, 2.0, 2.5])
+        exp_th = ref_th[0:4]
+        exp_ph = ref_ph[0:4].copy()
+        exp_di = ref_di[0:4]
+        sel_ph = exp_ph > 180.
+        exp_ph[sel_ph] = exp_ph[sel_ph] - 360.
+
+        idx_em = clu.add_atom(1, ref_em, 1)
+        for i, r in enumerate(ref_di):
+            v = np.asarray([
+                r * math.cos(math.radians(ref_ph[i])) * math.sin(math.radians(ref_th[i])) + ref_em[0],
+                r * math.sin(math.radians(ref_ph[i])) * math.sin(math.radians(ref_th[i])) + ref_em[1],
+                r * math.cos(math.radians(ref_th[i])) + ref_em[2]])
+            clu.add_atom(i, v, 0)
+
+        result = clu.calc_scattering_angles(idx_em, 2.2)
+        np.testing.assert_allclose(result['index'], np.arange(1, exp_di.shape[0] + 1))
+        np.testing.assert_allclose(result['polar'], exp_th, atol=1e-3)
+        np.testing.assert_allclose(result['azimuth'], exp_ph, atol=1e-3)
+        np.testing.assert_allclose(result['dist'], exp_di, rtol=1e-5)
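Reviewer note: `test_calc_scattering_angles` builds atom positions from distance, polar angle and azimuth relative to the emitter, and expects azimuths above 180 degrees to come back wrapped into the negative range. The sketch below shows that round trip with the standard library only. The function names are hypothetical and the angle conventions are inferred from the test data; it makes no claim about how `Cluster.calc_scattering_angles` is implemented.

```python
import math


def spherical_to_cartesian(r, polar_deg, azimuth_deg, origin=(0.0, 0.0, 0.0)):
    """Position at distance r and the given angles (degrees) from origin."""
    th = math.radians(polar_deg)
    ph = math.radians(azimuth_deg)
    return (
        r * math.cos(ph) * math.sin(th) + origin[0],
        r * math.sin(ph) * math.sin(th) + origin[1],
        r * math.cos(th) + origin[2],
    )


def cartesian_to_spherical(pos, origin=(0.0, 0.0, 0.0)):
    """Return (distance, polar, azimuth) of pos as seen from origin.

    Angles are in degrees. The azimuth comes from atan2 and therefore lies
    between -180 and +180, which is why 270 degrees reads back as -90.
    The origin itself (distance zero) is not handled.
    """
    dx = pos[0] - origin[0]
    dy = pos[1] - origin[1]
    dz = pos[2] - origin[2]
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    # Clamp to guard against rounding slightly outside the acos domain.
    cos_th = max(-1.0, min(1.0, dz / r))
    polar = math.degrees(math.acos(cos_th))
    azimuth = math.degrees(math.atan2(dy, dx))
    return r, polar, azimuth
```

Since sine and cosine are periodic, an input azimuth of 270 and one of -90 generate the same position, so only the expected values need wrapping, not the inputs.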
--- a/tests/test_database.py
+++ b/tests/test_database.py
@ -77,7 +77,7 @@ class TestDatabase(unittest.TestCase):

        cid1 = dispatch.CalcID(1, 2, 3, 4, -1)
        cid2 = db.special_params(cid1)
-        cid3 = {'model': 1, 'scan': 2, 'sym': 3, 'emit': 4, 'region': -1}
+        cid3 = {'model': 1, 'scan': 2, 'domain': 3, 'emit': 4, 'region': -1}
        self.assertEqual(cid2, cid3)

        l1 = d1.keys()
@ -91,6 +91,7 @@ class TestDatabase(unittest.TestCase):
        self.assertEqual(t2, t3)

    def setup_sample_database(self):
+        self.db.register_project("oldproject", "oldcode")
        self.db.register_project("unittest", "testcode")
        self.db.register_job(self.db.project_id, "testjob", "testmode", "testhost", None, datetime.datetime.now())
        self.ex_model = {'parA': 1.234, 'parB': 5.678, '_model': 91, '_rfac': 0.534}
@ -101,10 +102,13 @@ class TestDatabase(unittest.TestCase):
    def test_register_project(self):
        id1 = self.db.register_project("unittest1", "Atest")
        self.assertIsInstance(id1, int)
+        self.assertEqual(id1, self.db.project_id)
        id2 = self.db.register_project("unittest2", "Btest")
        self.assertIsInstance(id2, int)
+        self.assertEqual(id2, self.db.project_id)
        id3 = self.db.register_project("unittest1", "Ctest")
        self.assertIsInstance(id3, int)
+        self.assertEqual(id3, self.db.project_id)
        self.assertNotEqual(id1, id2)
        self.assertEqual(id1, id3)
        c = self.db._conn.cursor()
@ -120,6 +124,47 @@ class TestDatabase(unittest.TestCase):
        self.assertEqual(row['name'], "unittest1")
        self.assertEqual(row['code'], "Atest")

+    def test_register_job(self):
+        pid1 = self.db.register_project("unittest1", "Acode")
+        pid2 = self.db.register_project("unittest2", "Bcode")
+        dt1 = datetime.datetime.now()
+
+        # insert new job
+        id1 = self.db.register_job(pid1, "Ajob", "Amode", "local", "Ahash", dt1, "Adesc")
+        self.assertIsInstance(id1, int)
+        self.assertEqual(id1, self.db.job_id)
+        # insert another job
+        id2 = self.db.register_job(pid1, "Bjob", "Amode", "local", "Ahash", dt1, "Adesc")
+        self.assertIsInstance(id2, int)
+        self.assertEqual(id2, self.db.job_id)
+        # register first job again: same id expected, stored fields checked below are unchanged
+        id3 = self.db.register_job(pid1, "Ajob", "Cmode", "local", "Chash", dt1, "Cdesc")
+        self.assertIsInstance(id3, int)
+        self.assertEqual(id3, self.db.job_id)
+        # insert another job with same name but in other project
+        id4 = self.db.register_job(pid2, "Ajob", "Dmode", "local", "Dhash", dt1, "Ddesc")
+        self.assertIsInstance(id4, int)
+        self.assertEqual(id4, self.db.job_id)
+
+        self.assertNotEqual(id1, id2)
+        self.assertEqual(id1, id3)
+        self.assertNotEqual(id1, id4)
+
+        c = self.db._conn.cursor()
+        c.execute("select count(*) from Jobs")
+        count = c.fetchone()
+        self.assertEqual(count[0], 3)
+        c.execute("select name, mode, machine, git_hash, datetime, description from Jobs where id=:id", {'id': id1})
+        row = c.fetchone()
+        self.assertIsNotNone(row)
+        self.assertEqual(len(row), 6)
+        self.assertEqual(row[0], "Ajob")
+        self.assertEqual(row[1], "Amode")
+        self.assertEqual(row['machine'], "local")
+        self.assertEqual(str(row['datetime']), str(dt1))
+        self.assertEqual(row['git_hash'], "Ahash")
+        self.assertEqual(row['description'], "Adesc")
+
    def test_register_params(self):
        self.setup_sample_database()
        model5 = {'parA': 2.341, 'parC': 6.785, '_model': 92, '_rfac': 0.453}
@ -181,7 +226,7 @@ class TestDatabase(unittest.TestCase):

    def test_query_model_array(self):
        self.setup_sample_database()
-        index = {'_scan': -1, '_sym': -1, '_emit': -1, '_region': -1}
+        index = {'_scan': -1, '_domain': -1, '_emit': -1, '_region': -1}
        model2 = {'parA': 4.123, 'parB': 8.567, '_model': 92, '_rfac': 0.654}
        model3 = {'parA': 3.412, 'parB': 7.856, '_model': 93, '_rfac': 0.345}
        model4 = {'parA': 4.123, 'parB': 8.567, '_model': 94, '_rfac': 0.354}
@ -234,12 +279,12 @@ class TestDatabase(unittest.TestCase):
        model7 = {'parA': 5.123, 'parB': 6.567, '_model': 97, '_rfac': 0.154, '_gen': 1, '_particle': 7}
        self.db.register_params(model5)
        self.db.create_models_view()
-        model2.update({'_scan': -1, '_sym': 11, '_emit': 21, '_region': 31})
-        model3.update({'_scan':  1, '_sym': 12, '_emit': 22, '_region': 32})
-        model4.update({'_scan':  2, '_sym': 11, '_emit': 23, '_region': 33})
-        model5.update({'_scan':  3, '_sym': 11, '_emit': 24, '_region': 34})
-        model6.update({'_scan':  4, '_sym': 11, '_emit': 25, '_region': 35})
-        model7.update({'_scan':  5, '_sym': -1, '_emit': -1, '_region': -1})
+        model2.update({'_scan': -1, '_domain': 11, '_emit': 21, '_region': 31})
+        model3.update({'_scan':  1, '_domain': 12, '_emit': 22, '_region': 32})
+        model4.update({'_scan':  2, '_domain': 11, '_emit': 23, '_region': 33})
+        model5.update({'_scan':  3, '_domain': 11, '_emit': 24, '_region': 34})
+        model6.update({'_scan':  4, '_domain': 11, '_emit': 25, '_region': 35})
+        model7.update({'_scan':  5, '_domain': -1, '_emit': -1, '_region': -1})
        self.db.insert_result(model2, model2)
        self.db.insert_result(model3, model3)
        self.db.insert_result(model4, model4)
@ -248,12 +293,12 @@ class TestDatabase(unittest.TestCase):
        self.db.insert_result(model7, model7)

        # only model3, model4 and model5 fulfill all conditions and limits
-        fil = ['mode = "testmode"', 'sym = 11']
+        fil = ['mode = "testmode"', 'domain = 11']
        lim = 3
        result = self.db.query_best_results(filter=fil, limit=lim)

        ifields = ['_db_job', '_db_model', '_db_result',
-                   '_model', '_scan', '_sym', '_emit', '_region',
+                   '_model', '_scan', '_domain', '_emit', '_region',
                   '_gen', '_particle']
        ffields = ['_rfac']
        dt = [(f, 'i8') for f in ifields]
@ -262,7 +307,7 @@ class TestDatabase(unittest.TestCase):
        expected['_rfac'] = np.array([0.354, 0.354, 0.453])
        expected['_model'] = np.array([94, 96, 95])
        expected['_scan'] = np.array([2, 4, 3])
-        expected['_sym'] = np.array([11, 11, 11])
+        expected['_domain'] = np.array([11, 11, 11])
        expected['_emit'] = np.array([23, 25, 24])
        expected['_region'] = np.array([33, 35, 34])
        expected['_gen'] = np.array([1, 1, 1])
@ -272,7 +317,7 @@ class TestDatabase(unittest.TestCase):
        np.testing.assert_array_almost_equal(result['_rfac'], expected['_rfac'])
        np.testing.assert_array_equal(result['_model'], expected['_model'])
        np.testing.assert_array_equal(result['_scan'], expected['_scan'])
-        np.testing.assert_array_equal(result['_sym'], expected['_sym'])
+        np.testing.assert_array_equal(result['_domain'], expected['_domain'])
        np.testing.assert_array_equal(result['_emit'], expected['_emit'])
        np.testing.assert_array_equal(result['_region'], expected['_region'])
        np.testing.assert_array_equal(result['_gen'], expected['_gen'])
@ -296,7 +341,7 @@ class TestDatabase(unittest.TestCase):
        model_id = row['model_id']
        self.assertIsInstance(model_id, int)
        self.assertEqual(row['scan'], index.scan)
-        self.assertEqual(row['sym'], index.sym)
+        self.assertEqual(row['domain'], index.domain)
        self.assertEqual(row['emit'], index.emit)
        self.assertEqual(row['region'], index.region)
        self.assertEqual(row['rfac'], result['_rfac'])
@ -342,7 +387,7 @@ class TestDatabase(unittest.TestCase):
        model_id = row['model_id']
        self.assertIsInstance(model_id, int)
        self.assertEqual(row['scan'], index.scan)
-        self.assertEqual(row['sym'], index.sym)
+        self.assertEqual(row['domain'], index.domain)
        self.assertEqual(row['emit'], index.emit)
        self.assertEqual(row['region'], index.region)
        self.assertEqual(row['rfac'], result2['_rfac'])
@ -372,7 +417,7 @@ class TestDatabase(unittest.TestCase):
        @return:
        """
        self.setup_sample_database()
-        index = {'_model': 15, '_scan': 16, '_sym': 17, '_emit': 18, '_region': -1}
+        index = {'_model': 15, '_scan': 16, '_domain': 17, '_emit': 18, '_region': -1}
        result1 = {'parA': 4.123, 'parB': 8.567, '_rfac': 0.654, '_particle': 21}
        result_id1 = self.db.insert_result(index, result1)
        result2 = {'parA': 5.456, '_rfac': 0.254, '_particle': 11}
@ -393,7 +438,7 @@ class TestDatabase(unittest.TestCase):
        model_id = row['model_id']
        self.assertIsInstance(model_id, int)
        self.assertEqual(row['scan'], index['_scan'])
-        self.assertEqual(row['sym'], index['_sym'])
+        self.assertEqual(row['domain'], index['_domain'])
        self.assertEqual(row['emit'], index['_emit'])
        self.assertEqual(row['region'], index['_region'])
        self.assertEqual(row['rfac'], result2['_rfac'])
@ -418,19 +463,19 @@ class TestDatabase(unittest.TestCase):

    def test_query_best_task_models(self):
        self.setup_sample_database()
-        model0xxx = {'_model': 0, '_scan': -1, '_sym': -1, '_emit': -1, '_region': -1, 'parA': 4., 'parB': 8.567, '_rfac': 0.01}
-        model00xx = {'_model': 1, '_scan': 0, '_sym': -1, '_emit': -1, '_region': -1, 'parA': 4., 'parB': 8.567, '_rfac': 0.02}
-        model000x = {'_model': 2, '_scan': 0, '_sym': 0, '_emit': -1, '_region': -1, 'parA': 4., 'parB': 8.567, '_rfac': 0.03}
-        model01xx = {'_model': 3, '_scan': 1, '_sym': -1, '_emit': -1, '_region': -1, 'parA': 4., 'parB': 8.567, '_rfac': 0.04}
-        model010x = {'_model': 4, '_scan': 1, '_sym': 0, '_emit': -1, '_region': -1, 'parA': 4., 'parB': 8.567, '_rfac': 0.05}
+        model0xxx = {'_model': 0, '_scan': -1, '_domain': -1, '_emit': -1, '_region': -1, 'parA': 4., 'parB': 8.567, '_rfac': 0.01}
+        model00xx = {'_model': 1, '_scan': 0, '_domain': -1, '_emit': -1, '_region': -1, 'parA': 4., 'parB': 8.567, '_rfac': 0.02}
+        model000x = {'_model': 2, '_scan': 0, '_domain': 0, '_emit': -1, '_region': -1, 'parA': 4., 'parB': 8.567, '_rfac': 0.03}
+        model01xx = {'_model': 3, '_scan': 1, '_domain': -1, '_emit': -1, '_region': -1, 'parA': 4., 'parB': 8.567, '_rfac': 0.04}
+        model010x = {'_model': 4, '_scan': 1, '_domain': 0, '_emit': -1, '_region': -1, 'parA': 4., 'parB': 8.567, '_rfac': 0.05}

-        model1xxx = {'_model': 5, '_scan': -1, '_sym': -1, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.09}
-        model10xx = {'_model': 6, '_scan': 0, '_sym': -1, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.08}
-        model100x = {'_model': 7, '_scan': 0, '_sym': 0, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.07}
-        model11xx = {'_model': 8, '_scan': 1, '_sym': -1, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.06}
-        model110x = {'_model': 9, '_scan': 1, '_sym': 0, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.05}
+        model1xxx = {'_model': 5, '_scan': -1, '_domain': -1, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.09}
+        model10xx = {'_model': 6, '_scan': 0, '_domain': -1, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.08}
+        model100x = {'_model': 7, '_scan': 0, '_domain': 0, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.07}
+        model11xx = {'_model': 8, '_scan': 1, '_domain': -1, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.06}
+        model110x = {'_model': 9, '_scan': 1, '_domain': 0, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.05}

-        model2xxx = {'_model': 10, '_scan': -1, '_sym': -1, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.01}
+        model2xxx = {'_model': 10, '_scan': -1, '_domain': -1, '_emit': -1, '_region': -1, 'parA': 4.123, 'parB': 8.567, '_rfac': 0.01}

        self.db.insert_result(model0xxx, model0xxx)
        self.db.insert_result(model00xx, model00xx)
@ -451,6 +496,75 @@ class TestDatabase(unittest.TestCase):
        expected = {0, 1, 3, 6, 8, 10}
        self.assertEqual(result, expected)

+    def test_sample_project(self):
+        """
+        test ingestion of two results
+
+        this test uses the same call sequence as the actual pmsco code.
+        it has been used to debug a problem in the main code
+        where previous results were overwritten.
+        """
+        db_filename = os.path.join(self.test_dir, "sample_database.db")
+        lock_filename = os.path.join(self.test_dir, "sample_database.lock")
+
+        # project
+        project_name = self.__class__.__name__
+        project_module = self.__class__.__module__
+
+        # job 1
+        job_name1 = "job1"
+        result1 = {'parA': 1.234, 'parB': 5.678, '_model': 91, '_rfac': 0.534}
+        task1 = dispatch.CalcID(91, -1, -1, -1, -1)
+
+        # ingest job 1
+        _db = db.ResultsDatabase()
+        _db.connect(db_filename, lock_filename=lock_filename)
+        project_id1 = _db.register_project(project_name, project_module)
+        job_id1 = _db.register_job(project_id1, job_name1, "test", "localhost", "", datetime.datetime.now(), "")
+        # _db.insert_jobtags(job_id, self.job_tags)
+        _db.register_params(result1.keys())
+        _db.create_models_view()
+        result_id1 = _db.insert_result(task1, result1)
+        _db.disconnect()
+
+        # job 2
+        job_name2 = "job2"
+        result2 = {'parA': 1.345, 'parB': 5.789, '_model': 91, '_rfac': 0.654}
+        task2 = dispatch.CalcID(91, -1, -1, -1, -1)
+
+        # ingest job 2
+        _db = db.ResultsDatabase()
+        _db.connect(db_filename, lock_filename=lock_filename)
+        project_id2 = _db.register_project(project_name, project_module)
+        job_id2 = _db.register_job(project_id2, job_name2, "test", "localhost", "", datetime.datetime.now(), "")
+        # _db.insert_jobtags(job_id, self.job_tags)
+        _db.register_params(result2.keys())
+        _db.create_models_view()
+        result_id2 = _db.insert_result(task2, result2)
+        _db.disconnect()
+
+        # check jobs
+        _db = db.ResultsDatabase()
+        _db.connect(db_filename, lock_filename=lock_filename)
+        sql = "select * from Jobs "
+        c = _db._conn.execute(sql)
+        rows = c.fetchall()
+        self.assertEqual(len(rows), 2)
+
+        # check models
+        sql = "select * from Models "
+        c = _db._conn.execute(sql)
+        rows = c.fetchall()
+        self.assertEqual(len(rows), 2)
+
+        # check results
+        sql = "select * from Results "
+        c = _db._conn.execute(sql)
+        rows = c.fetchall()
+        self.assertEqual(len(rows), 2)
+
+        _db.disconnect()
+

 if __name__ == '__main__':
    unittest.main()
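Reviewer note: `test_register_project` and `test_register_job` expect that registering an existing name returns the existing id and leaves the stored row as it was, while the same job name under a different project gets a new row. The sketch below reproduces that behaviour with `sqlite3` from the standard library. The reduced `Jobs` schema and both function names are assumptions made for illustration; the real `ResultsDatabase` schema has more columns and may enforce uniqueness differently.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a reduced, hypothetical Jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table Jobs ("
        "id integer primary key, "
        "project_id integer, "
        "name text, "
        "mode text, "
        "unique (project_id, name))"
    )
    return conn


def register_job(conn, project_id, name, mode):
    """Insert the job if it is new and return its id in either case."""
    # 'insert or ignore' skips the insert when (project_id, name) exists,
    # so the values stored by the first registration are preserved.
    conn.execute(
        "insert or ignore into Jobs (project_id, name, mode) values (?, ?, ?)",
        (project_id, name, mode),
    )
    row = conn.execute(
        "select id from Jobs where project_id = ? and name = ?",
        (project_id, name),
    ).fetchone()
    conn.commit()
    return row[0]
```

Leaving the duplicate check to a unique constraint keeps the insert and the lookup consistent even if two registrations of the same name arrive close together.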
--- a/tests/test_dispatch.py
+++ b/tests/test_dispatch.py
@ -99,7 +99,7 @@ class TestCalculationTask(unittest.TestCase):

    def test_get_mpi_message(self):
        result = self.sample.get_mpi_message()
-        expected = {'model': 11, 'scan': 12, 'sym': 13, 'emit': 14, 'region': 15}
+        expected = {'model': 11, 'scan': 12, 'domain': 13, 'emit': 14, 'region': 15}
        self.assertEqual(result['id'], expected)
        self.assertEqual(result['model'], self.sample.model)
        self.assertEqual(result['result_filename'], self.sample.result_filename)
--- a/tests/test_genetic.py
+++ b/tests/test_genetic.py
@ -39,11 +39,11 @@ class TestPopulation(unittest.TestCase):
    def setUp(self):
        random.seed(0)
        self._test_dir = ""
-        self.domain = mp.Domain()
+        self.model_space = mp.ModelSpace()

-        self.domain.add_param('A', 1.5, 1.0, 2.0, 0.1)
-        self.domain.add_param('B', 2.5, 2.0, 3.0, 0.1)
-        self.domain.add_param('C', 3.5, 3.0, 4.0, 0.1)
+        self.model_space.add_param('A', 1.5, 1.0, 2.0, 0.1)
+        self.model_space.add_param('B', 2.5, 2.0, 3.0, 0.1)
+        self.model_space.add_param('C', 3.5, 3.0, 4.0, 0.1)
        self.expected_names = ('_gen', '_model', '_particle', '_rfac', 'A', 'B', 'C')

        self.size = POP_SIZE
@ -114,7 +114,7 @@ class TestPopulation(unittest.TestCase):
        return r

    def test_setup(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        self.assertEqual(self.pop.pos.dtype.names, self.expected_names)
        self.assertEqual(self.pop.pos.shape, (POP_SIZE,))
        np.testing.assert_array_equal(np.arange(POP_SIZE), self.pop.pos['_particle'])
@ -131,7 +131,7 @@ class TestPopulation(unittest.TestCase):
    def test_setup_with_results(self):
        data_dir = os.path.dirname(os.path.abspath(__file__))
        data_file = os.path.join(data_dir, "test_swarm.setup_with_results.1.dat")
-        self.pop.setup(self.size, self.domain, seed_file=data_file, recalc_seed=False)
+        self.pop.setup(self.size, self.model_space, seed_file=data_file, recalc_seed=False)

        self.assertEqual(self.pop.pos.dtype.names, self.expected_names)
        self.assertEqual(self.pop.pos.shape, (POP_SIZE,))
@ -158,7 +158,7 @@ class TestPopulation(unittest.TestCase):
    def test_setup_with_results_recalc(self):
        data_dir = os.path.dirname(os.path.abspath(__file__))
        data_file = os.path.join(data_dir, "test_swarm.setup_with_results.1.dat")
-        self.pop.setup(self.size, self.domain, seed_file=data_file, recalc_seed=True)
+        self.pop.setup(self.size, self.model_space, seed_file=data_file, recalc_seed=True)

        self.assertEqual(self.pop.pos.dtype.names, self.expected_names)
        self.assertEqual(self.pop.pos.shape, (POP_SIZE,))
@ -183,26 +183,26 @@ class TestPopulation(unittest.TestCase):
        self.assertAlmostEqual(3.5, self.pop.pos['C'][0], 3)

    def test_pos_gen(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        for index, item in enumerate(self.pop.pos_gen()):
            self.assertIsInstance(item, dict)
            self.assertEqual(set(item.keys()), set(self.expected_names))
            self.assertEqual(item['_particle'], index)

    def test_randomize(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        self.pop.randomize()
-        self.assertTrue(np.all(self.pop.pos['A'] >= self.domain.min['A']))
-        self.assertTrue(np.all(self.pop.pos['A'] <= self.domain.max['A']))
-        self.assertGreater(np.std(self.pop.pos['A']), self.domain.step['A'])
+        self.assertTrue(np.all(self.pop.pos['A'] >= self.model_space.min['A']))
+        self.assertTrue(np.all(self.pop.pos['A'] <= self.model_space.max['A']))
+        self.assertGreater(np.std(self.pop.pos['A']), self.model_space.step['A'])

    def test_seed(self):
-        self.pop.setup(self.size, self.domain)
-        self.pop.seed(self.domain.start)
-        self.assertAlmostEqual(self.pop.pos['A'][0], self.domain.start['A'], delta=0.001)
+        self.pop.setup(self.size, self.model_space)
+        self.pop.seed(self.model_space.start)
+        self.assertAlmostEqual(self.pop.pos['A'][0], self.model_space.start['A'], delta=0.001)

    def test_add_result(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        i_sample = 1
        i_result = 0
        result = self.pop.pos[i_sample]
@ -212,7 +212,7 @@ class TestPopulation(unittest.TestCase):
        self.assertEqual(self.pop.best[i_sample], result)

    def test_is_converged(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        self.assertFalse(self.pop.is_converged())
        i_sample = 0
        result = self.pop.pos[i_sample]
@ -226,12 +226,12 @@ class TestPopulation(unittest.TestCase):
        self.assertTrue(self.pop.is_converged())

    def test_save_population(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        filename = os.path.join(self.test_dir, "test_save_population.pop")
        self.pop.save_population(filename)

    def test_save_results(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        i_sample = 1
        result = self.pop.pos[i_sample]
        self.pop.add_result(result, 1.0)
@ -239,17 +239,17 @@ class TestPopulation(unittest.TestCase):
        self.pop.save_results(filename)

    def test_save_array(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        filename = os.path.join(self.test_dir, "test_save_array.pos")
        self.pop.save_array(filename, self.pop.pos)

    def test_load_array(self):
        n = 3
        filename = os.path.join(self.test_dir, "test_load_array")
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)

        # expected array
-        dt_exp = self.pop.get_pop_dtype(self.domain.start)
+        dt_exp = self.pop.get_pop_dtype(self.model_space.start)
        a_exp = np.zeros((n,), dtype=dt_exp)
        a_exp['A'] = np.linspace(0, 1, n)
        a_exp['B'] = np.linspace(1, 2, n)
@ -276,13 +276,13 @@ class TestPopulation(unittest.TestCase):
            np.testing.assert_almost_equal(result[name], a_exp[name], err_msg=name)

    def test_mate_parents(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        pos1 = self.pop.pos.copy()
        parents = self.pop.mate_parents(pos1)
        self.assertEqual(len(parents), pos1.shape[0] / 2)

    def test_crossover(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        p1 = self.pop.pos[2].copy()
        p2 = self.pop.pos[3].copy()
        c1, c2 = self.pop.crossover(p1, p2)
@ -290,11 +290,11 @@ class TestPopulation(unittest.TestCase):
        self.assertIsInstance(c2, np.void)
        self.assertEqual(c1['_particle'], p1['_particle'])
        self.assertEqual(c2['_particle'], p2['_particle'])
-        for name in self.domain.start:
+        for name in self.model_space.start:
            self.assertAlmostEqual(c1[name] + c2[name], p1[name] + p2[name], msg=name)

    def test_mutate_weak(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        p1 = self.pop.pos[3].copy()
        c1 = p1.copy()
        self.pop.mutate_weak(c1, 1.0)
@ -304,7 +304,7 @@ class TestPopulation(unittest.TestCase):
        self.assertNotAlmostEqual(c1['C'], p1['C'])

    def test_mutate_strong(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        p1 = self.pop.pos[3].copy()
        c1 = p1.copy()
        self.pop.mutate_strong(c1, 1.0)
@ -314,7 +314,7 @@ class TestPopulation(unittest.TestCase):
        self.assertNotAlmostEqual(c1['C'], p1['C'])

    def test_advance_population(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)

        p1 = {'A': np.linspace(1.0, 2.0, POP_SIZE),
              'B': np.linspace(2.0, 3.0, POP_SIZE),
@ -335,7 +335,7 @@ class TestPopulation(unittest.TestCase):
            self.assertTrue(np.any(abs(self.pop.pos[name] - value) >= 0.001), msg=name)

    def test_convergence_1(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)

        self.pop.pos['A'] = np.linspace(1.0, 2.0, POP_SIZE)
        self.pop.pos['B'] = np.linspace(2.0, 3.0, POP_SIZE)
@ -352,7 +352,7 @@ class TestPopulation(unittest.TestCase):

    def optimize_rfactor_2(self, pop_size, iterations):
        self.size = pop_size
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)

        for i in range(iterations):
            self.pop.advance_population()
--- a/tests/test_grid.py
+++ b/tests/test_grid.py
@ -32,11 +32,11 @@ import pmsco.project as mp
 class TestPopulation(unittest.TestCase):
    def setUp(self):
        random.seed(0)
-        self.domain = mp.Domain()
+        self.model_space = mp.ModelSpace()

-        self.domain.add_param('A', 1.5, 1.0, 2.0, 0.2)
-        self.domain.add_param('B', 2.5, 2.0, 3.0, 0.25)
-        self.domain.add_param('C', 3.5, 3.5, 3.5, 0.0)
+        self.model_space.add_param('A', 1.5, 1.0, 2.0, 0.2)
+        self.model_space.add_param('B', 2.5, 2.0, 3.0, 0.25)
+        self.model_space.add_param('C', 3.5, 3.5, 3.5, 0.0)
        self.expected_popsize = 30
        self.expected_names = ('_model', '_rfac', 'A', 'B', 'C')

@ -57,7 +57,7 @@ class TestPopulation(unittest.TestCase):
        pass

    def test_setup(self):
-        self.pop.setup(self.domain)
+        self.pop.setup(self.model_space)
        self.assertEqual(self.pop.positions.dtype.names, self.expected_names)
        self.assertEqual(self.pop.positions.shape, (self.expected_popsize,))
        self.assertEqual(self.pop.model_count, self.expected_popsize)
--- a/tests/test_population.py
+++ b/tests/test_population.py
@ -41,11 +41,11 @@ class TestPopulation(unittest.TestCase):
    def setUp(self):
        random.seed(0)
        self.test_dir = tempfile.mkdtemp()
-        self.domain = project.Domain()
+        self.model_space = project.ModelSpace()

-        self.domain.add_param('A', 1.5, 1.0, 2.0, 0.1)
-        self.domain.add_param('B', 2.5, 2.0, 3.0, 0.1)
-        self.domain.add_param('C', 3.5, 3.0, 4.0, 0.1)
+        self.model_space.add_param('A', 1.5, 1.0, 2.0, 0.1)
+        self.model_space.add_param('B', 2.5, 2.0, 3.0, 0.1)
+        self.model_space.add_param('C', 3.5, 3.0, 4.0, 0.1)
        self.expected_names = ('_gen', '_model', '_particle', '_rfac', 'A', 'B', 'C')

        self.size = POP_SIZE
@ -115,7 +115,7 @@ class TestPopulation(unittest.TestCase):
        return r

    def test_setup(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        self.assertEqual(self.pop.pos.dtype.names, self.expected_names)
        self.assertEqual(self.pop.pos.shape, (POP_SIZE,))
        np.testing.assert_array_equal(np.arange(POP_SIZE), self.pop.pos['_particle'])
@ -132,7 +132,7 @@ class TestPopulation(unittest.TestCase):
    def test_setup_with_results(self):
        data_dir = os.path.dirname(os.path.abspath(__file__))
        data_file = os.path.join(data_dir, "test_swarm.setup_with_results.1.dat")
-        self.pop.setup(self.size, self.domain, seed_file=data_file, recalc_seed=False)
+        self.pop.setup(self.size, self.model_space, seed_file=data_file, recalc_seed=False)

        self.assertEqual(self.pop.pos.dtype.names, self.expected_names)
        self.assertEqual(self.pop.pos.shape, (POP_SIZE,))
@ -159,7 +159,7 @@ class TestPopulation(unittest.TestCase):
    def test_setup_with_results_recalc(self):
        data_dir = os.path.dirname(os.path.abspath(__file__))
        data_file = os.path.join(data_dir, "test_swarm.setup_with_results.1.dat")
-        self.pop.setup(self.size, self.domain, seed_file=data_file, recalc_seed=True)
+        self.pop.setup(self.size, self.model_space, seed_file=data_file, recalc_seed=True)

        self.assertEqual(self.pop.pos.dtype.names, self.expected_names)
        self.assertEqual(self.pop.pos.shape, (POP_SIZE,))
@ -184,12 +184,12 @@ class TestPopulation(unittest.TestCase):
        self.assertAlmostEqual(3.5, self.pop.pos['C'][0], 3)

    def test_setup_with_partial_results(self):
-        self.domain.add_param('D', 4.5, 4.0, 5.0, 0.1)
+        self.model_space.add_param('D', 4.5, 4.0, 5.0, 0.1)
        self.expected_names = ('_gen', '_model', '_particle', '_rfac', 'A', 'B', 'C', 'D')

        data_dir = os.path.dirname(os.path.abspath(__file__))
        data_file = os.path.join(data_dir, "test_swarm.setup_with_results.1.dat")
-        self.pop.setup(self.size, self.domain, seed_file=data_file, recalc_seed=False)
+        self.pop.setup(self.size, self.model_space, seed_file=data_file, recalc_seed=False)

        self.assertEqual(self.pop.pos.dtype.names, self.expected_names)
        self.assertEqual(self.pop.pos.shape, (POP_SIZE,))
@ -214,7 +214,7 @@ class TestPopulation(unittest.TestCase):
        self.assertAlmostEqual(3.5, self.pop.pos['C'][0], 3)

    def test_pos_gen(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        for index, item in enumerate(self.pop.pos_gen()):
            self.assertIsInstance(item, dict)
            self.assertEqual(set(item.keys()), set(self.expected_names))
@ -242,19 +242,19 @@ class TestPopulation(unittest.TestCase):
        np.testing.assert_array_equal(result, expected)

    def test_randomize(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        self.pop.randomize()
        m = np.mean(self.pop.pos['A'])
-        self.assertGreaterEqual(m, self.domain.min['A'])
-        self.assertLessEqual(m, self.domain.max['A'])
+        self.assertGreaterEqual(m, self.model_space.min['A'])
+        self.assertLessEqual(m, self.model_space.max['A'])

    def test_seed(self):
-        self.pop.setup(self.size, self.domain)
-        self.pop.seed(self.domain.start)
-        self.assertAlmostEqual(self.pop.pos['A'][0], self.domain.start['A'], delta=0.001)
+        self.pop.setup(self.size, self.model_space)
+        self.pop.seed(self.model_space.start)
+        self.assertAlmostEqual(self.pop.pos['A'][0], self.model_space.start['A'], delta=0.001)

    def test_add_result(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        i_sample = 1
        i_result = 0
        result = self.pop.pos[i_sample]
@ -264,12 +264,12 @@ class TestPopulation(unittest.TestCase):
        self.assertEqual(self.pop.best[i_sample], result)

    def test_save_population(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        filename = os.path.join(self.test_dir, "test_save_population.pop")
        self.pop.save_population(filename)

    def test_save_results(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        i_sample = 1
        result = self.pop.pos[i_sample]
        self.pop.add_result(result, 1.0)
@ -277,17 +277,17 @@ class TestPopulation(unittest.TestCase):
        self.pop.save_results(filename)

    def test_save_array(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        filename = os.path.join(self.test_dir, "test_save_array.pos")
        self.pop.save_array(filename, self.pop.pos)

    def test_load_array(self):
        n = 3
        filename = os.path.join(self.test_dir, "test_load_array")
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)

        # expected array
-        dt_exp = self.pop.get_pop_dtype(self.domain.start)
+        dt_exp = self.pop.get_pop_dtype(self.model_space.start)
        a_exp = np.zeros((n,), dtype=dt_exp)
        a_exp['A'] = np.linspace(0, 1, n)
        a_exp['B'] = np.linspace(1, 2, n)
@ -395,7 +395,7 @@ class TestPopulation(unittest.TestCase):
        self.assertRaises(ValueError, population.Population.constrain_position, pos1, vel1, min1, max1, 'error')

    def test_patch_from_file(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)

        data_dir = os.path.dirname(os.path.abspath(__file__))
        data_file = os.path.join(data_dir, "test_swarm.setup_with_results.1.dat")
@ -411,7 +411,7 @@ class TestPopulation(unittest.TestCase):
        np.testing.assert_array_almost_equal(self.pop.pos_patch['C'], expected_pos['C'])

    def test_apply_patch(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        expected_pos = self.pop.pos.copy()
        dt_test = [('A', 'f4'), ('_particle', 'i4'), ('_rfac', 'f4'), ('C', 'f4'), ('_model', 'i4')]
        patch_size = 3
@ -436,14 +436,14 @@ class TestPopulation(unittest.TestCase):
        self.assert_pop_array_equal(self.pop.pos, expected_pos)

    def test_find_result(self):
-        self.domain.min['A'] = -0.1
-        self.domain.max['A'] = 0.1
-        self.domain.min['B'] = 0.0
-        self.domain.max['B'] = 1000.
-        self.domain.min['C'] = 9.
-        self.domain.max['C'] = 9.001
+        self.model_space.min['A'] = -0.1
+        self.model_space.max['A'] = 0.1
+        self.model_space.min['B'] = 0.0
+        self.model_space.max['B'] = 1000.
+        self.model_space.min['C'] = 9.
+        self.model_space.max['C'] = 9.001
        self.size = 100
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        self.pop.results = self.pop.pos.copy()

        expected_index = 77
@ -472,7 +472,7 @@ class TestPopulation(unittest.TestCase):
        check the different type conversions.
        the main work is in test_import_positions_array.
        """
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)

        source_type = [('A', 'f4'), ('B', 'f4'), ('C', 'f4'), ('D', 'f4'),
                       ('_model', 'i4'), ('_particle', 'i4'), ('_gen', 'i4'), ('_rfac', 'f4')]
@ -548,7 +548,7 @@ class TestPopulation(unittest.TestCase):
        - no range or duplicate checking.
        - missing parameter.
        """
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        source_type = [('A', 'f4'), ('B', 'f4'), ('C', 'f4'), ('D', 'f4'),
                       ('_model', 'i4'), ('_particle', 'i4'), ('_gen', 'i4'), ('_rfac', 'f4')]
        source = np.array([(1.0, 0.0, 0.0, 0.0, 0, 0, 0, 0.0),
@ -584,7 +584,7 @@ class TestPopulation(unittest.TestCase):
        - range checks.
        - de-duplication.
        """
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        self.pop.position_constrain_mode = 'error'
        source_type = [('A', 'f8'), ('B', 'f8'), ('C', 'f8'),
                       ('_model', 'i8'), ('_particle', 'i8'), ('_gen', 'i8'), ('_rfac', 'f8')]
--- a/tests/test_project.py
+++ b/tests/test_project.py
@ -10,20 +10,17 @@ to run the tests, change to the directory which contains the tests directory, an

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

-@copyright (c) 2015-18 by Paul Scherrer Institut @n
+@copyright (c) 2015-21 by Paul Scherrer Institut @n
 Licensed under the Apache License, Version 2.0 (the "License"); @n
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
  http://www.apache.org/licenses/LICENSE-2.0
 """

-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
 import mock
 import numpy as np
 import os
+from pathlib import Path
 import unittest

 import pmsco.data as data
@ -31,6 +28,103 @@ import pmsco.dispatch as dispatch
 import pmsco.project as project


+class TestModelSpace(unittest.TestCase):
+    def setUp(self):
+        self.d1 = {
+            "A": {"start": 2.1, "min": 2.0, "max": 3.0, "step": 0.05},
+            "B": {"start": 15.0, "min": 0.0, "max": 30.0, "step": 1.0}}
+        self.d2 = {
+            "C": {"start": 22.0, "min": 15.0, "max": 25.0, "step": 1.0},
+            "D": {"start": 1.5, "min": 0.5, "max": 2.0, "step": 0.25}}
+
+    def test_add_param(self):
+        ms = project.ModelSpace()
+        ms.start['A'] = 2.1
+        ms.min['A'] = 2.0
+        ms.max['A'] = 3.0
+        ms.step['A'] = 0.05
+        ms.add_param("E", 5.0, 1.0, 9.0, 0.2)
+        ms.add_param("F", 8.0, width=6.0, step=0.5)
+        d_start = {'A': 2.1, 'E': 5.0, 'F': 8.0}
+        d_min = {'A': 2.0, 'E': 1.0, 'F': 5.0}
+        d_max = {'A': 3.0, 'E': 9.0, 'F': 11.0}
+        d_step = {'A': 0.05, 'E': 0.2, 'F': 0.5}
+        self.assertDictEqual(ms.start, d_start)
+        self.assertDictEqual(ms.min, d_min)
+        self.assertDictEqual(ms.max, d_max)
+        self.assertDictEqual(ms.step, d_step)
+
+    def test_get_param(self):
+        ms = project.ModelSpace()
+        ms.add_param("A", **self.d1['A'])
+        ms.add_param("B", **self.d1['B'])
+        result = ms.get_param('B')
+        expected = {'start': 15.0, 'min': 0.0, 'max': 30.0, 'step': 1.0}
+        self.assertIsInstance(result, project.ParamSpace)
+        self.assertEqual(result.start, expected['start'])
+        self.assertEqual(result.min, expected['min'])
+        self.assertEqual(result.max, expected['max'])
+        self.assertEqual(result.step, expected['step'])
+
+    def test_set_param_dict(self):
+        ms = project.ModelSpace()
+        ms.set_param_dict(self.d1)
+        ms.set_param_dict(self.d2)
+        d_start = {'C': 22.0, 'D': 1.5}
+        d_min = {'C': 15.0, 'D': 0.5}
+        d_max = {'C': 25.0, 'D': 2.0}
+        d_step = {'C': 1.0, 'D': 0.25}
+        self.assertDictEqual(ms.start, d_start)
+        self.assertDictEqual(ms.min, d_min)
+        self.assertDictEqual(ms.max, d_max)
+        self.assertDictEqual(ms.step, d_step)
+
+
+class TestScanCreator(unittest.TestCase):
+    """
+    test case for @ref pmsco.project.ScanCreator class
+
+    """
+    def test_load_1(self):
+        """
+        test the load method, case 1
+
+        test for:
+        - correct array expansion of an ['e', 'a'] scan.
+        - correct file name expansion with place holders and pathlib.Path objects.
+        """
+        sc = project.ScanCreator()
+        sc.filename = Path("{test_p}", "twoatom_energy_alpha.etpai")
+        sc.positions = {
+            "e": "np.arange(10, 400, 5)",
+            "t": "0",
+            "p": "0",
+            "a": "np.linspace(-30, 30, 31)"
+        }
+        sc.emitter = "Cu"
+        sc.initial_state = "2p3/2"
+
+        p = Path(__file__).parent / ".." / "projects" / "twoatom"
+        dirs = {"test_p": p,
+                "test_s": str(p)}
+
+        result = sc.load(dirs=dirs)
+
+        self.assertEqual(result.mode, ['e', 'a'])
+        self.assertEqual(result.emitter, sc.emitter)
+        self.assertEqual(result.initial_state, sc.initial_state)
+
+        e = np.arange(10, 400, 5)
+        a = np.linspace(-30, 30, 31)
+        t = p = np.asarray([0])
+        np.testing.assert_array_equal(result.energies, e)
+        np.testing.assert_array_equal(result.thetas, t)
+        np.testing.assert_array_equal(result.phis, p)
+        np.testing.assert_array_equal(result.alphas, a)
+
+        self.assertTrue(Path(result.filename).is_file(), msg=f"file {result.filename} not found")
+
+
 class TestScan(unittest.TestCase):
    """
    test case for @ref pmsco.project.Scan class
@ -106,16 +200,16 @@ class TestProject(unittest.TestCase):

    @mock.patch('pmsco.data.load_data')
    @mock.patch('pmsco.data.save_data')
-    def test_combine_symmetries(self, save_data_mock, load_data_mock):
+    def test_combine_domains(self, save_data_mock, load_data_mock):
        self.project.scans.append(project.Scan())

        parent_task = dispatch.CalculationTask()
        parent_task.change_id(model=0, scan=0)
-        parent_task.model['wsym1'] = 0.5
+        parent_task.model['wdom1'] = 0.5

        child_tasks = [parent_task.copy()] * 2
        for idx, task in enumerate(child_tasks):
-            task.change_id(sym=idx)
+            task.change_id(domain=idx)

        data1 = data.create_data(5, datatype='EI')
        data1['e'] = np.arange(5)
@ -126,7 +220,7 @@ class TestProject(unittest.TestCase):
        data3 = data1.copy()
        data3['i'] = (10. + 0.5 * 10.) / 1.5

-        self.project.combine_symmetries(parent_task, child_tasks)
+        self.project.combine_domains(parent_task, child_tasks)

        save_data_mock.assert_called()
        args, kwargs = save_data_mock.call_args
--- a/tests/test_swarm.py
+++ b/tests/test_swarm.py
@ -38,11 +38,11 @@ class TestSwarmPopulation(unittest.TestCase):
    def setUp(self):
        random.seed(0)
        self.test_dir = tempfile.mkdtemp()
-        self.domain = project.Domain()
+        self.model_space = project.ModelSpace()
        
-        self.domain.add_param('A', 1.5, 1.0, 2.0, 0.1)
-        self.domain.add_param('B', 2.5, 2.0, 3.0, 0.1)
-        self.domain.add_param('C', 3.5, 3.0, 4.0, 0.1)
+        self.model_space.add_param('A', 1.5, 1.0, 2.0, 0.1)
+        self.model_space.add_param('B', 2.5, 2.0, 3.0, 0.1)
+        self.model_space.add_param('C', 3.5, 3.0, 4.0, 0.1)
        self.expected_names = ('_gen', '_model', '_particle', '_rfac', 'A', 'B', 'C')

        self.size = POP_SIZE
@ -73,14 +73,14 @@ class TestSwarmPopulation(unittest.TestCase):
        return r

    def test_best_friend(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        self.pop.best['_rfac'] = np.arange(self.size)
        friend = self.pop.best_friend(0)
        self.assertNotIsInstance(friend, np.ndarray)
        self.assertEqual(friend.dtype.names, self.expected_names)
        
    def test_advance_particle(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        
        self.pop.pos['A'] = np.linspace(1.0, 2.0, POP_SIZE)
        self.pop.pos['B'] = np.linspace(2.0, 3.0, POP_SIZE)
@ -98,11 +98,11 @@ class TestSwarmPopulation(unittest.TestCase):
        
        for key in ['A','B','C']:
            for pos in self.pop.pos[key]:
-                self.assertGreaterEqual(pos, self.domain.min[key])
-                self.assertLessEqual(pos, self.domain.max[key])
+                self.assertGreaterEqual(pos, self.model_space.min[key])
+                self.assertLessEqual(pos, self.model_space.max[key])

    def test_is_converged(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)
        self.assertFalse(self.pop.is_converged())
        i_sample = 0
        result = self.pop.pos[i_sample]
@ -116,7 +116,7 @@ class TestSwarmPopulation(unittest.TestCase):
        self.assertTrue(self.pop.is_converged())
        
    def test_convergence_1(self):
-        self.pop.setup(self.size, self.domain)
+        self.pop.setup(self.size, self.model_space)

        self.pop.pos['A'] = np.linspace(1.0, 2.0, POP_SIZE)
        self.pop.pos['B'] = np.linspace(2.0, 3.0, POP_SIZE)
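Note on the renamed parameter container: the TestModelSpace cases added in tests/test_project.py pin down three behaviours of `ModelSpace`. `add_param` accepts either explicit `min`/`max` bounds or a `width`, a `width` appears to be centred on `start` (start 8.0 with width 6.0 gives 5.0 to 11.0), and `set_param_dict` replaces the existing parameters rather than merging into them. The sketch below reproduces only that observable behaviour. `ModelSpaceSketch` is a hypothetical stand-in written for illustration, its signature is an assumption inferred from the test calls, and it is not the pmsco implementation.

```python
class ModelSpaceSketch:
    """Hypothetical stand-in for illustration; not pmsco.project.ModelSpace."""

    def __init__(self):
        # one dictionary per attribute, keyed by parameter name
        self.start = {}
        self.min = {}
        self.max = {}
        self.step = {}

    # The argument names shadow the built-ins min and max on purpose,
    # so that dictionaries like {"start": .., "min": .., "max": .., "step": ..}
    # can be passed with ** as the tests do.
    def add_param(self, name, start, min=None, max=None, step=None, width=None):
        # Assumption inferred from test_add_param: width is centred on start.
        if width is not None:
            min = start - width / 2
            max = start + width / 2
        self.start[name] = start
        self.min[name] = min
        self.max[name] = max
        self.step[name] = step

    def set_param_dict(self, params):
        # Assumption inferred from test_set_param_dict:
        # existing parameters are discarded before the new ones are added.
        self.start, self.min, self.max, self.step = {}, {}, {}, {}
        for name, spec in params.items():
            self.add_param(name, **spec)


ms = ModelSpaceSketch()
ms.add_param("E", 5.0, 1.0, 9.0, 0.2)        # explicit bounds
ms.add_param("F", 8.0, width=6.0, step=0.5)  # bounds derived from width
print(ms.min, ms.max)
```

The four parallel dictionaries mirror the `start`, `min`, `max` and `step` attributes that the population tests read, for example `self.model_space.min['A']`.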
Author	SHA1	Message	Date
matthias muntwiler	ef781e2db4	public release 3.0.0 - see README and CHANGES for details	2021-02-09 12:46:20 +01:00
matthias muntwiler	2b3dbd8bac	update README	2020-09-04 16:31:45 +02:00
matthias muntwiler	7c61eb1b41	public release 2.2.0 - see README.md and CHANGES.md for details	2020-09-04 16:22:42 +02:00