Compare commits

4 commits: rev-distro ... rev-distro

Commit SHA1s: 2b3dbd8bac, 7c61eb1b41, fbd2d4fa8c, acea809e4e
.gitignore (vendored): 2 changes

@@ -1,6 +1,7 @@
work/*
debug/*
lib/*
dev/*
*.pyc
*.o
*.so

@@ -13,3 +14,4 @@ lib/*
.eric5project/*
.ropeproject/*
.fuse*
.trash
.gitlab-ci.yml (new file): 13 lines

@@ -0,0 +1,13 @@
pages:
  stage: deploy
  script:
    - make docs
    - mv docs/html/ public/
  artifacts:
    paths:
      - public
  only:
    - master
  tags:
    - doxygen
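The pages job publishes the Doxygen HTML to GitLab Pages in two steps: build the documentation, then stage the HTML tree as the public/ artifact. A minimal Python sketch that reproduces the same two steps locally (assuming the repository's make docs target, as used by the CI script):

```python
import subprocess

def build_pages():
    """Reproduce the CI pages job locally."""
    subprocess.run(["make", "docs"], check=True)                  # CI step: make docs
    subprocess.run(["mv", "docs/html/", "public/"], check=True)   # CI step: mv docs/html/ public/

if __name__ == "__main__":
    build_pages()
```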
CHANGES.md (new file): 34 lines

@@ -0,0 +1,34 @@
Release 2.2.0 (2020-09-04)
==========================

| Hash | Date | Description |
| ---- | ---- | ----------- |
| 4bb2331 | 2020-07-30 | demo project for arbitrary molecule (cluster file) |
| f984f64 | 2020-09-03 | bugfix: DATA CORRUPTION in phagen translator (emitter mix-up) |
| 11fb849 | 2020-09-02 | bugfix: load native cluster file: wrong column order |
| d071c97 | 2020-09-01 | bugfix: initial-state command line option not respected |
| 9705eed | 2020-07-28 | photoionization cross sections and spectrum simulator |
| 98312f0 | 2020-06-12 | database: use local lock objects |
| c8fb974 | 2020-04-30 | database: create view on results and models |
| 2cfebcb | 2020-05-14 | REFACTORING: Domain -> ModelSpace, Params -> CalculatorParams |
| d5516ae | 2020-05-14 | REFACTORING: symmetry -> domain |
| b2dd21b | 2020-05-13 | possible conda/mpi4py conflict - changed installation procedure |
| cf5c7fd | 2020-05-12 | cluster: new calc_scattering_angles function |
| 20df82d | 2020-05-07 | include a periodic table of binding energies of the elements |
| 5d560bf | 2020-04-24 | clean up files in the main loop and in the end |
| 6e0ade5 | 2020-04-24 | bugfix: database ingestion overwrites results from previous jobs |
| 263b220 | 2020-04-24 | time out at least 10 minutes before the hard time limit given on the command line |
| 4ec526d | 2020-04-09 | cluster: new get_center function |
| fcdef4f | 2020-04-09 | bugfix: type error in grid optimizer |
| a4d1cf7 | 2020-03-05 | bugfix: file extension in phagen/makefile |
| 9461e46 | 2019-09-11 | dispatch: new algo to distribute processing slots to task levels |
| 30851ea | 2020-03-04 | bugfix: load single-line data files correctly! |
| 71fe0c6 | 2019-10-04 | cluster generator for zincblende crystal |
| 23965e3 | 2020-02-26 | phagen translator: fix phase convention (MAJOR), fix single-energy |
| cf1814f | 2019-09-11 | dispatch: give more priority to mid-level tasks in single mode |
| 58c778d | 2019-09-05 | improve performance of cluster add_bulk, add_layer and rotate |
| 20ef1af | 2019-09-05 | unit test for Cluster.translate, bugfix in translate and relax |
| 0b80850 | 2019-07-17 | fix compatibility with numpy >= 1.14, require numpy >= 1.13 |
| 1d0a542 | 2019-07-16 | database: introduce job-tags |
| 8461d81 | 2019-07-05 | qpmsco: delete code after execution |
LICENSE (new file): 201 lines

@@ -0,0 +1,201 @@
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "{}"
      replaced with your own identifying information. (Don't include
      the brackets!) The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright 2015-2020 Paul Scherrer Institut

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
README.md: 36 changes

@@ -9,26 +9,29 @@ The actual scattering calculation is done by code developed by other parties.
PMSCO wraps around that program and facilitates parameter handling, cluster building, structural optimization and parallel processing.
In the current version, the [EDAC](http://garciadeabajos-group.icfo.es/widgets/edac/) code
developed by F. J. García de Abajo, M. A. Van Hove, and C. S. Fadley (1999) is used for scattering calculations.
Other code can be integrated as well.
Instead of EDAC built-in routines, alternatively,
the PHAGEN program from [MsSpec-1.0](https://msspec.cnrs.fr/index.html) can be used to calculate atomic scattering factors.


Highlights
----------

- angle or energy scanned XPD.
- various scanning modes including energy, polar angle, azimuthal angle, analyser angle.
- averaging over multiple symmetries (domains or emitters).
- angle and energy scanned XPD.
- various scanning modes including energy, manipulator angle (polar/azimuthal), emission angle.
- averaging over multiple domains and emitters.
- global optimization of multiple scans.
- structural optimization algorithms: particle swarm optimization, grid search, gradient search.
- calculation of the modulation function.
- calculation of the weighted R-factor.
- automatic parallel processing using OpenMPI.
- tested on Linux cluster machines.


Installation
============

PMSCO is written in Python 2.7.
The code will run in any recent Linux environment on a workstation or in a virtual machine.
PMSCO is written in Python 3.6.
The code will run in any recent Linux environment on a workstation or virtual machine.
Scientific Linux, CentOS7, [Ubuntu](https://www.ubuntu.com/)
and [Lubuntu](http://lubuntu.net/) (recommended for virtual machine) have been tested.
For optimization jobs, a cluster with 20-50 available processor cores is recommended.

@@ -38,6 +41,12 @@ Detailed installation instructions and dependencies can be found in the documentation
(docs/src/installation.dox).
A [Doxygen](http://www.stack.nl/~dimitri/doxygen/index.html) compiler with Doxypy is required to generate the documentation in HTML or LaTeX format.

The easiest way to set up an environment with all dependencies and without side-effects on other installed software is to use a [Singularity](https://www.sylabs.io/guides/2.5/user-guide/index.html) container.
A Singularity recipe file is part of the distribution, see the PMSCO documentation for details.
On newer Linux systems (e.g. Ubuntu 18.04), Singularity is available from the package manager.
Installation in a [virtual box](https://www.virtualbox.org/) on Windows or Mac is straightforward using the [Vagrant](https://www.vagrantup.com/) system.
A Vagrant file is included in the distribution.

The public distribution of PMSCO does not contain the [EDAC](http://garciadeabajos-group.icfo.es/widgets/edac/) code.
Please obtain the EDAC source code from the original author, copy it to the pmsco/edac directory, and apply the edac_all.patch patch.

@@ -61,10 +70,23 @@ Matthias Muntwiler, <mailto:matthias.muntwiler@psi.ch>
Copyright
---------

Copyright 2015-2017 by [Paul Scherrer Institut](http://www.psi.ch)
Copyright 2015-2020 by [Paul Scherrer Institut](http://www.psi.ch)


Release Notes
=============

For a detailed list of changes, see the CHANGES.md file.

2.2.0 (2020-09-04)
------------------

This release breaks existing project code unless the listed refactorings are applied.

- Major refactoring: The 'symmetry' calculation level is renamed to 'domain'.
  The previous Domain class is renamed to ModelSpace, Params to CalculatorParams.
  The refactorings must be applied to project code as well.
- Included periodic table of elements with electron binding energies and scattering cross-sections.
- Various bug fixes in cluster routines, data file handling, and in the PHAGEN interface.
- Experimental sqlite3 database interface for optimization results.
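The 2.2.0 renames can be applied to an existing project with a straightforward search-and-replace. A minimal before/after sketch (the import path and surrounding code are assumptions for illustration, not taken from the PMSCO sources):

```python
# pre-2.2.0 project code (hypothetical):
#   from pmsco.project import Domain, Params
#   space = Domain()      # parameter space of the optimization
#   params = Params()     # calculator input parameters
#
# post-2.2.0 equivalent, after applying the renames listed above:
from pmsco.project import ModelSpace, CalculatorParams  # import path assumed

space = ModelSpace()         # was: Domain
params = CalculatorParams()  # was: Params
```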
bin/pmsco.ra-git.template (new file): 136 lines

@@ -0,0 +1,136 @@
#!/bin/bash
#
# Slurm script template for PMSCO calculations on the Ra cluster
# based on run_mpi_HPL_nodes-2.sl by V. Markushin 2016-03-01
#
# this version checks out the source code from a git repository
# to a temporary location and compiles the code.
# this is to minimize conflicts between different jobs
# but requires that each job has its own git commit.
#
# Use:
# - enter the appropriate parameters and save as a new file.
# - call the sbatch command to pass the job script.
#   request a specific number of nodes and tasks.
#   example:
#   sbatch --nodes=2 --ntasks-per-node=24 --time=02:00:00 run_pmsco.sl
#   the qpmsco script does all this for you.
#
# PMSCO arguments
# copy this template to a new file, and set the arguments
#
# PMSCO_WORK_DIR
#   path to be used as working directory.
#   contains the script derived from this template
#   and a copy of the pmsco code in the 'pmsco' directory.
#   receives output and temporary files.
#
# PMSCO_PROJECT_FILE
#   python module that declares the project and starts the calculation.
#   must include the file path relative to $PMSCO_WORK_DIR.
#
# PMSCO_OUT
#   name of output file. should not include a path.
#
# all paths are relative to $PMSCO_WORK_DIR or (better) absolute.
#
#
# Further arguments
#
# PMSCO_JOBNAME (required)
#   the job name is the base name for output files.
#
# PMSCO_WALLTIME_HR (integer, required)
#   wall time limit in hours. must be integer, minimum 1.
#   this value is passed to PMSCO.
#   it should specify the same amount of wall time as requested from the scheduler.
#
# PMSCO_PROJECT_ARGS (optional)
#   extra arguments that are parsed by the project module.
#
#SBATCH --job-name="_PMSCO_JOBNAME"
#SBATCH --output="_PMSCO_JOBNAME.o.%j"
#SBATCH --error="_PMSCO_JOBNAME.e.%j"

PMSCO_WORK_DIR="_PMSCO_WORK_DIR"
PMSCO_JOBNAME="_PMSCO_JOBNAME"
PMSCO_WALLTIME_HR=_PMSCO_WALLTIME_HR

PMSCO_PROJECT_FILE="_PMSCO_PROJECT_FILE"
PMSCO_OUT="_PMSCO_JOBNAME"
PMSCO_PROJECT_ARGS="_PMSCO_PROJECT_ARGS"

module load psi-python36/4.4.0
module load gcc/4.8.5
module load openmpi/3.1.3
source activate pmsco3

echo '================================================================================'
echo "=== Running $0 at the following time and place:"
date
/bin/hostname
cd $PMSCO_WORK_DIR
pwd
ls -lA
# the intel compiler is currently not compatible with mpi4py. -mm 170131
#echo
#echo '================================================================================'
#echo "=== Setting the environment to use Intel Cluster Studio XE 2016 Update 2 intel/16.2:"
#cmd="source /opt/psi/Programming/intel/16.2/bin/compilervars.sh intel64"
#echo $cmd
#$cmd
echo
echo '================================================================================'
echo "=== The environment is set as following:"
env
echo
echo '================================================================================'
echo "BEGIN test"
which mpirun
cmd="mpirun /bin/hostname"
echo $cmd
$cmd
echo "END test"
echo
echo '================================================================================'
echo "BEGIN mpirun pmsco"
echo

cd "$PMSCO_WORK_DIR"
cd pmsco
echo "code revision"
git log --pretty=tformat:'%h %ai %d' -1
make -C pmsco all
python -m compileall pmsco
python -m compileall projects
echo

cd "$PMSCO_WORK_DIR"
PMSCO_CMD="python pmsco/pmsco $PMSCO_PROJECT_FILE"
PMSCO_ARGS="$PMSCO_PROJECT_ARGS"
if [ -n "$PMSCO_SCAN_FILES" ]; then
    PMSCO_ARGS="-s $PMSCO_SCAN_FILES $PMSCO_ARGS"
fi
if [ -n "$PMSCO_OUT" ]; then
    PMSCO_ARGS="-o $PMSCO_OUT $PMSCO_ARGS"
fi
if [ "$PMSCO_WALLTIME_HR" -ge 1 ]; then
    PMSCO_ARGS="-t $PMSCO_WALLTIME_HR $PMSCO_ARGS"
fi
if [ -n "$PMSCO_LOGLEVEL" ]; then
    PMSCO_ARGS="--log-level $PMSCO_LOGLEVEL --log-file $PMSCO_JOBNAME.log $PMSCO_ARGS"
fi

# Do not use the OpenMPI specific options, like "-x LD_LIBRARY_PATH", with the Intel mpirun.
cmd="mpirun $PMSCO_CMD $PMSCO_ARGS"
echo $cmd
$cmd
echo "END mpirun pmsco"
echo '================================================================================'
cd "$PMSCO_WORK_DIR"
rm -rf pmsco
date
ls -lAtr
echo '================================================================================'

exit 0
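The tail of the template builds the PMSCO argument list from optional settings. The same conditional assembly, re-expressed as a Python sketch for illustration (option letters as in the template above):

```python
def build_pmsco_args(jobname, out=None, walltime_hr=0, scan_files=None,
                     loglevel=None, project_args=()):
    """Mirror the template's conditional argument assembly."""
    args = list(project_args)
    if scan_files:
        args = ["-s", scan_files] + args
    if out:
        args = ["-o", out] + args
    if walltime_hr >= 1:
        args = ["-t", str(walltime_hr)] + args
    if loglevel:
        args = ["--log-level", loglevel, "--log-file", jobname + ".log"] + args
    return args

# build_pmsco_args("myjob", out="myjob", walltime_hr=12)
# -> ['-t', '12', '-o', 'myjob']
```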
@@ -75,10 +75,10 @@ PMSCO_OUT="_PMSCO_JOBNAME"
PMSCO_LOGLEVEL="_PMSCO_LOGLEVEL"
PMSCO_PROJECT_ARGS="_PMSCO_PROJECT_ARGS"

module load psi-python27/2.4.1
module load psi-python36/4.4.0
module load gcc/4.8.5
module load openmpi/1.10.2
source activate pmsco
module load openmpi/3.1.3
source activate pmsco3

echo '================================================================================'
echo "=== Running $0 at the following time and place:"

@@ -120,7 +120,7 @@ python -m compileall projects
cd "$PMSCO_WORK_DIR"
echo

PMSCO_CMD="python $PMSCO_PROJECT_FILE"
PMSCO_CMD="python $PMSCO_SOURCE_DIR/pmsco $PMSCO_PROJECT_FILE"
PMSCO_ARGS="$PMSCO_PROJECT_ARGS"
if [ -n "$PMSCO_SCAN_FILES" ]; then
    PMSCO_ARGS="-s $PMSCO_SCAN_FILES $PMSCO_ARGS"
bin/qpmsco.ra-git.sh (new executable file): 145 lines

@@ -0,0 +1,145 @@
#!/bin/sh
#
# submission script for PMSCO calculations on the Ra cluster
#
# this version clones the current git repository at HEAD to the work directory.
# thus, version conflicts between jobs are avoided.
#

if [ $# -lt 1 ]; then
    echo "Usage: $0 [NOSUB] GIT_TAG DESTDIR JOBNAME NODES TASKS_PER_NODE WALLTIME:HOURS PROJECT [ARGS [ARGS [...]]]"
    echo ""
    echo "  NOSUB (optional): do not submit the script to the queue. default: submit."
    echo "  GIT_TAG: git tag or branch name of the code. HEAD for current code."
    echo "  DESTDIR: destination directory. must exist. a sub-dir \$JOBNAME is created."
    echo "  JOBNAME (text): name of job. use only alphanumeric characters, no spaces."
    echo "  NODES (integer): number of computing nodes. (1 node = 24 or 32 processors)."
    echo "    do not specify more than 2."
    echo "  TASKS_PER_NODE (integer): 1...24, or 32."
    echo "    24 or 32 for full-node allocation."
    echo "    1...23 for shared node allocation."
    echo "  WALLTIME:HOURS (integer): requested wall time."
    echo "    1...24 for day partition"
    echo "    24...192 for week partition"
    echo "    1...192 for shared partition"
    echo "  PROJECT: python module (file path) that declares the project and starts the calculation."
    echo "  ARGS (optional): any number of further PMSCO or project arguments (except time)."
    echo ""
    echo "the job script is written to \$DESTDIR/\$JOBNAME which is also the destination of calculation output."
    exit 1
fi

# location of the pmsco package is derived from the path of this script
SCRIPTDIR="$(dirname $(readlink -f $0))"
SOURCEDIR="$(readlink -f $SCRIPTDIR/..)"
PMSCO_SOURCE_DIR="$SOURCEDIR"

# read arguments
if [ "$1" == "NOSUB" ]; then
    NOSUB="true"
    shift
else
    NOSUB="false"
fi

if [ "$1" == "HEAD" ]; then
    BRANCH_ARG=""
else
    BRANCH_ARG="-b $1"
fi
shift

DEST_DIR="$1"
shift

PMSCO_JOBNAME=$1
shift

PMSCO_NODES=$1
PMSCO_TASKS_PER_NODE=$2
PMSCO_TASKS=$(expr $PMSCO_NODES \* $PMSCO_TASKS_PER_NODE)
shift 2

PMSCO_WALLTIME_HR=$1
PMSCO_WALLTIME_MIN=$(expr $PMSCO_WALLTIME_HR \* 60)
shift

# select partition
if [ $PMSCO_WALLTIME_HR -ge 25 ]; then
    PMSCO_PARTITION="week"
else
    PMSCO_PARTITION="day"
fi
if [ $PMSCO_TASKS_PER_NODE -lt 24 ]; then
    PMSCO_PARTITION="shared"
fi

PMSCO_PROJECT_FILE="$(readlink -f $1)"
shift

PMSCO_PROJECT_ARGS="$*"

# set up working directory
cd "$DEST_DIR"
if [ ! -d "$PMSCO_JOBNAME" ]; then
    mkdir "$PMSCO_JOBNAME"
fi
cd "$PMSCO_JOBNAME"
WORKDIR="$(pwd)"
PMSCO_WORK_DIR="$WORKDIR"

# copy code
PMSCO_SOURCE_REPO="file://$PMSCO_SOURCE_DIR"
echo "$PMSCO_SOURCE_REPO"

cd "$PMSCO_WORK_DIR"
git clone $BRANCH_ARG --single-branch --depth 1 $PMSCO_SOURCE_REPO pmsco || exit
cd pmsco
PMSCO_REV=$(git log --pretty=format:"%h, %ai" -1) || exit
cd "$WORKDIR"
echo "$PMSCO_REV" > revision.txt

# generate job script from template
sed -e "s:_PMSCO_WORK_DIR:$PMSCO_WORK_DIR:g" \
    -e "s:_PMSCO_JOBNAME:$PMSCO_JOBNAME:g" \
    -e "s:_PMSCO_NODES:$PMSCO_NODES:g" \
    -e "s:_PMSCO_WALLTIME_HR:$PMSCO_WALLTIME_HR:g" \
    -e "s:_PMSCO_PROJECT_FILE:$PMSCO_PROJECT_FILE:g" \
    -e "s:_PMSCO_PROJECT_ARGS:$PMSCO_PROJECT_ARGS:g" \
    "$SCRIPTDIR/pmsco.ra-git.template" > $PMSCO_JOBNAME.job

chmod u+x "$PMSCO_JOBNAME.job" || exit

# request nodes and tasks
#
# The option --ntasks-per-node is meant to be used with the --nodes option.
# (For the --ntasks option, the default is one task per node, use the --cpus-per-task option to change this default.)
#
# sbatch options
# --cores-per-socket=16
#   32 cores per node
# --partition=[shared|day|week]
# --time=8-00:00:00
#   override default time limit (2 days in long queue)
#   time formats: "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes", "days-hours:minutes:seconds"
# --mail-type=ALL
# --test-only
#   check script but do not submit
#
SLURM_ARGS="--nodes=$PMSCO_NODES --ntasks-per-node=$PMSCO_TASKS_PER_NODE"

if [ $PMSCO_TASKS_PER_NODE -gt 24 ]; then
    SLURM_ARGS="--cores-per-socket=16 $SLURM_ARGS"
fi

SLURM_ARGS="--partition=$PMSCO_PARTITION $SLURM_ARGS"

SLURM_ARGS="--time=$PMSCO_WALLTIME_HR:00:00 $SLURM_ARGS"

CMD="sbatch $SLURM_ARGS $PMSCO_JOBNAME.job"
echo $CMD
if [ "$NOSUB" != "true" ]; then
    $CMD
fi

exit 0
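The partition selection above maps the requested walltime and per-node task count to Ra's Slurm partitions. A Python re-expression of the same decision logic, useful for sanity-checking a request before submission (illustrative only):

```python
def select_partition(walltime_hr: int, tasks_per_node: int) -> str:
    """Mirror qpmsco.ra-git.sh: week for long jobs, day otherwise,
    and shared whenever the node is not fully allocated."""
    partition = "week" if walltime_hr >= 25 else "day"
    if tasks_per_node < 24:
        partition = "shared"  # partial-node allocations go to the shared partition
    return partition

assert select_partition(12, 24) == "day"
assert select_partition(48, 32) == "week"
assert select_partition(48, 8) == "shared"
```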
@@ -1,11 +1,18 @@
#!/bin/sh
#
# submission script for PMSCO calculations on the Ra cluster
#
# CAUTION: the job will execute the pmsco code which is present in the directory tree
# of this script _at the time of job execution_, not submission!
# before changing the code, make sure that all pending jobs have started execution,
# otherwise you will experience version conflicts.
# it's better to use the qpmsco.ra-git.sh script which clones the code.

if [ $# -lt 1 ]; then
    echo "Usage: $0 [NOSUB] JOBNAME NODES TASKS_PER_NODE WALLTIME:HOURS PROJECT MODE [ARGS [ARGS [...]]]"
    echo "Usage: $0 [NOSUB] DESTDIR JOBNAME NODES TASKS_PER_NODE WALLTIME:HOURS PROJECT MODE [ARGS [ARGS [...]]]"
    echo ""
    echo "  NOSUB (optional): do not submit the script to the queue. default: submit."
    echo "  DESTDIR: destination directory. must exist. a sub-dir \$JOBNAME is created."
    echo "  JOBNAME (text): name of job. use only alphanumeric characters, no spaces."
    echo "  NODES (integer): number of computing nodes. (1 node = 24 or 32 processors)."
    echo "    do not specify more than 2."

@@ -20,7 +27,7 @@ if [ $# -lt 1 ]; then
    echo "  MODE: PMSCO calculation mode (single|swarm|gradient|grid)."
    echo "  ARGS (optional): any number of further PMSCO or project arguments (except mode and time)."
    echo ""
    echo "the job script complete with the program code and input/output data is generated in ~/jobs/\$JOBNAME"
    echo "the job script is written to \$DESTDIR/\$JOBNAME which is also the destination of calculation output."
    exit 1
fi

@@ -37,6 +44,9 @@ else
    NOSUB="false"
fi

DEST_DIR="$1"
shift

PMSCO_JOBNAME=$1
shift

@@ -73,11 +83,7 @@ PMSCO_LOGLEVEL=""
PMSCO_CODE=""

# set up working directory
cd ~
if [ ! -d "jobs" ]; then
    mkdir jobs
fi
cd jobs
cd "$DEST_DIR"
if [ ! -d "$PMSCO_JOBNAME" ]; then
    mkdir "$PMSCO_JOBNAME"
fi

@@ -87,9 +93,9 @@ PMSCO_WORK_DIR="$WORKDIR"

# provide revision information, requires git repository
cd "$SOURCEDIR"
PMSCO_REV=$(git log --pretty=format:"Data revision %h, %ai" -1)
PMSCO_REV=$(git log --pretty=format:"%h, %ai" -1)
if [ $? -ne 0 ]; then
    PMSCO_REV="Data revision unknown, "$(date +"%F %T %z")
    PMSCO_REV="revision unknown, "$(date +"%F %T %z")
fi
cd "$WORKDIR"
echo "$PMSCO_REV" > revision.txt
@@ -86,9 +86,9 @@ PHD_WORK_DIR="$WORKDIR"

# provide revision information, requires git repository
cd "$SOURCEDIR"
PHD_REV=$(git log --pretty=format:"Data revision %h, %ad" --date=iso -1)
PHD_REV=$(git log --pretty=format:"%h, %ad" --date=iso -1)
if [ $? -ne 0 ]; then
    PHD_REV="Data revision unknown, "$(date +"%F %T %z")
    PHD_REV="revision unknown, "$(date +"%F %T %z")
fi
cd "$WORKDIR"
echo "$PHD_REV" > revision.txt
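Both submission scripts record the code revision to revision.txt and fall back to a timestamp when git fails. A Python re-expression of this fallback pattern, for illustration only:

```python
import subprocess
from datetime import datetime, timezone

def record_revision(source_dir, out_file="revision.txt"):
    """Write 'short-hash, author-date' of HEAD, or a dated fallback."""
    try:
        rev = subprocess.check_output(
            ["git", "log", "--pretty=format:%h, %ai", "-1"],
            cwd=source_dir, text=True)
    except (OSError, subprocess.CalledProcessError):
        rev = ("revision unknown, "
               + datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S %z"))
    with open(out_file, "w") as f:
        f.write(rev + "\n")
```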
@@ -38,13 +38,13 @@ PROJECT_NAME = "PEARL MSCO"
# could be handy for archiving the generated documentation or if some version
# control system is used.

PROJECT_NUMBER =
PROJECT_NUMBER = $(REVISION)

# Using the PROJECT_BRIEF tag one can provide an optional one line description
# for a project that appears at the top of each page and should give viewer a
# quick idea about the purpose of the project. Keep the description short.

PROJECT_BRIEF = "PEARL multiple scattering calculations and optimizations"
PROJECT_BRIEF = "PEARL multiple scattering calculation and optimization"

# With the PROJECT_LOGO tag one can specify a logo or an icon that is included
# in the documentation. The maximum height of the logo should not exceed 55

@@ -228,7 +228,7 @@ TAB_SIZE = 4
# "Side Effects:". You can put \n's in the value part of an alias to insert
# newlines.

ALIASES =
ALIASES = "raise=@exception"

# This tag can be used to specify a number of word-keyword mappings (TCL only).
# A mapping has the form "name=value". For example adding "class=itcl::class"

@@ -597,19 +597,19 @@ STRICT_PROTO_MATCHING = NO
# list. This list is created by putting \todo commands in the documentation.
# The default value is: YES.

GENERATE_TODOLIST = YES
GENERATE_TODOLIST = NO

# The GENERATE_TESTLIST tag can be used to enable (YES) or disable (NO) the test
# list. This list is created by putting \test commands in the documentation.
# The default value is: YES.

GENERATE_TESTLIST = YES
GENERATE_TESTLIST = NO

# The GENERATE_BUGLIST tag can be used to enable (YES) or disable (NO) the bug
# list. This list is created by putting \bug commands in the documentation.
# The default value is: YES.

GENERATE_BUGLIST = YES
GENERATE_BUGLIST = NO

# The GENERATE_DEPRECATEDLIST tag can be used to enable (YES) or disable (NO)
# the deprecated list. This list is created by putting \deprecated commands in

@@ -761,9 +761,13 @@ WARN_LOGFILE =
INPUT = \
    src/introduction.dox \
    src/concepts.dox \
    src/concepts-tasks.dox \
    src/concepts-emitter.dox \
    src/concepts-atomscat.dox \
    src/installation.dox \
    src/execution.dox \
    src/commandline.dox \
    src/optimizers.dox \
    ../pmsco \
    ../projects \
    ../tests

@@ -859,7 +863,7 @@ EXAMPLE_RECURSIVE = NO
# that contain images that are to be included in the documentation (see the
# \image command).

IMAGE_PATH =
IMAGE_PATH = src/images

# The INPUT_FILTER tag can be used to specify a program that doxygen should
# invoke to filter for each input file. Doxygen will invoke the filter program

@@ -876,7 +880,7 @@ IMAGE_PATH =
# code is scanned, but not when the output code is generated. If lines are added
# or removed, the anchors will not be placed correctly.

INPUT_FILTER = /usr/bin/doxypy
INPUT_FILTER =

# The FILTER_PATTERNS tag can be used to specify filters on a per file pattern
# basis. Doxygen will compare the file name with each pattern and apply the

@@ -885,7 +889,7 @@ INPUT_FILTER = /usr/bin/doxypy
# filters are used. If the FILTER_PATTERNS tag is empty or if none of the
# patterns match the file name, INPUT_FILTER is applied.

FILTER_PATTERNS =
FILTER_PATTERNS = *.py=/usr/bin/doxypy

# If the FILTER_SOURCE_FILES tag is set to YES, the input filter (if set using
# INPUT_FILTER) will also be used to filter the input files that are used for

@@ -2328,7 +2332,7 @@ DIAFILE_DIRS =
# generate a warning when it encounters a \startuml command in this case and
# will not generate output for the diagram.

PLANTUML_JAR_PATH =
PLANTUML_JAR_PATH = $(PLANTUML_JAR_PATH)

# When using plantuml, the specified paths are searched for files specified by
# the !include statement in a plantuml block.
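With `FILTER_PATTERNS = *.py=/usr/bin/doxypy` and the `raise=@exception` alias, Python docstrings can carry Doxygen markup directly. A hypothetical example of a docstring these settings would process (the function itself is not from the PMSCO sources):

```python
def load_cluster(filename):
    """
    Load a cluster definition from a file.

    @param filename: path of the cluster file to read.

    @raise IOError if the file cannot be read.

    @return the parsed cluster object.
    """
    ...
```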
@@ -2,6 +2,11 @@ SHELL=/bin/sh

# makefile for PMSCO documentation
#
# requirements
#
# 1) doxygen
# 2) /usr/bin/doxypy
# 3) PLANTUML_JAR_PATH environment variable must point to plantUML jar.

.SUFFIXES:
.SUFFIXES: .c .cpp .cxx .exe .f .h .i .o .py .pyf .so .html

@@ -11,6 +16,9 @@ DOX=doxygen
DOXOPTS=
LATEX_DIR=latex

REVISION=$(shell git describe --always --tags --dirty --long || echo "unknown, "`date +"%F %T %z"`)
export REVISION

all: docs

docs: doxygen pdf

@@ -22,5 +30,6 @@ pdf: doxygen
	-$(MAKE) -C $(LATEX_DIR)

clean:
	-rm -rf latex/*
	-rm -rf html/*
	-rm -r latex/*
	-rm -r html/*
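The exported REVISION reaches the Doxyfile through `PROJECT_NUMBER = $(REVISION)`, which Doxygen expands from the environment. A Python sketch of the same flow (illustrative; the fallback mirrors the makefile's):

```python
import os
import subprocess
from datetime import datetime, timezone

def docs_revision():
    """git describe --always --tags --dirty --long, or a dated fallback."""
    try:
        return subprocess.check_output(
            ["git", "describe", "--always", "--tags", "--dirty", "--long"],
            text=True).strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown, " + datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S %z")

# doxygen picks up $(REVISION) for PROJECT_NUMBER from the environment:
env = dict(os.environ, REVISION=docs_revision())
subprocess.run(["doxygen"], env=env, check=True)
```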
@@ -11,12 +11,14 @@ it is recommended to adhere to the standard syntax described below.

The basic command line is as follows:
@code{.sh}
[mpiexec -np NPROCESSES] python path-to-project.py [common args] [project args]
[mpiexec -np NPROCESSES] python path/to/pmsco path/to/project.py [common args] [project args]
@endcode

Include the first portion between square brackets if you want to run parallel processes.
Specify the number of processes as the @c -np option.
@c path-to-project.py should be the path and name to your project module.
@c path/to/pmsco is the directory where <code>__main__.py</code> is located.
Do not include the extension <code>.py</code> or a trailing slash.
@c path/to/project.py should be the path and name to your project module.
Common args and project args are described below.


@@ -30,7 +32,7 @@ The following table is ordered by importance.
| Option | Values | Description |
| --- | --- | --- |
| -h , --help | | Display a command line summary and exit. |
| -m , --mode | single (default), grid, swarm | Operation mode. |
| -m , --mode | single (default), grid, swarm, genetic | Operation mode. |
| -d, --data-dir | file system path | Directory path for experimental data files (if required by project). Default: current working directory. |
| -o, --output-file | file system path | Base path and/or name for intermediate and output files. Default: pmsco_data |
| -t, --time-limit | decimal number | Wall time limit in hours. The optimizers try to finish before the limit. Default: 24.0. |

@@ -38,30 +40,43 @@ The following table is ordered by importance.
| --log-level | DEBUG, INFO, WARNING (default), ERROR, CRITICAL | Minimum level of messages that should be added to the log. |
| --log-file | file system path | Name of the main log file. Under MPI, the rank of the process is inserted before the extension. Default: output-file + log, or pmsco.log. |
| --log-disable | | Disable logging. By default, logging is on. |
| --pop-size | integer | Population size (number of particles) in swarm optimization mode. The default value is the greater of 4 or two times the number of calculation processes. |
| -c, --code | edac (default) | Scattering code. At the moment, only edac is supported. |
| --pop-size | integer | Population size (number of particles) in swarm and genetic optimization mode. The default value is the greater of 4 or the number of parallel calculation processes. |
| --seed-file | file system path | Name of the population seed file. Population data of previous optimizations can be used to seed a new optimization. The file must have the same structure as the .pop or .dat files. See @ref pmsco.project.Project.seed_file. |
| --table-file | file system path | Name of the model table file in table scan mode. |


\subsubsection sec_file_categories File Categories

The following category names can be used with the @c --keep-files option.
The following category names can be used with the `--keep-files` option.
Multiple names can be specified and must be separated by spaces.

| Category | Description | Default Action |
| --- | --- | --- |
| all | shortcut to include all categories | |
| input | raw input files for calculator, including cluster and phase files in custom format | delete |
| output | raw output files from calculator | delete |
| phase | phase files in portable format for report | delete |
| atomic | atomic scattering and emission files in portable format | delete |
| cluster | cluster files in portable XYZ format for report | keep |
| debug | debug files | delete |
| model | output files in ETPAI format: complete simulation (a_-1_-1_-1_-1) | keep |
| scan | output files in ETPAI format: scan (a_b_-1_-1_-1) | delete |
| symmetry | output files in ETPAI format: symmetry (a_b_c_-1_-1) | delete |
| scan | output files in ETPAI format: scan (a_b_-1_-1_-1) | keep |
| domain | output files in ETPAI format: domain (a_b_c_-1_-1) | delete |
| emitter | output files in ETPAI format: emitter (a_b_c_d_-1) | delete |
| region | output files in ETPAI format: region (a_b_c_d_e) | delete |
| report | final report of results | keep |
| report | final report of results | keep always |
| population | final state of particle population | keep |
| rfac | files related to models which give bad r-factors | delete |
| rfac | files related to models which give bad r-factors, see warning below | delete |

\note
The `report` category is always kept and cannot be turned off.
The `model` category is always kept in single calculation mode.

\warning
If you want to specify `rfac` with the `--keep-files` option,
you have to add the file categories that you want to keep, e.g.,
`--keep-files rfac cluster model scan population`
(to return the default categories for all calculated models).
Do not specify `rfac` alone as this will effectively not return any file.


\subsection sec_project_args Project Arguments

@@ -84,36 +99,11 @@ This way, the file names and photoelectron parameters are versioned with the code,
whereas command line arguments may easily get forgotten in the records.


\subsection sec_project_example Example Argument Handling
\subsection sec_project_example Argument Handling

An example for handling the command line in a project module can be found in the twoatom.py demo project.
The following code snippet shows how the common and project arguments are separated and handled.

@code{.py}
def main():
    # have the pmsco module parse the common arguments.
    args, unknown_args = pmsco.pmsco.parse_cli()

    # pass any arguments not handled by pmsco
    # to the project-defined parse_project_args function.
    # unknown_args can be passed to argparse.ArgumentParser.parse_args().
    if unknown_args:
        project_args = parse_project_args(unknown_args)
    else:
        project_args = None

    # create the project object
    project = create_project()

    # apply the common arguments on the project
    pmsco.pmsco.set_common_args(project, args)

    # apply the specific arguments on the project
    set_project_args(project, project_args)

    # run the project
    pmsco.pmsco.run_project(project)
@endcode
To handle command line arguments in a project module,
the module must define a <code>parse_project_args</code> and a <code>set_project_args</code> function.
An example can be found in the twoatom.py demo project.
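As an illustrative sketch of the two required functions (the option and attribute used here are hypothetical; twoatom.py defines its own):

@code{.py}
import argparse

def parse_project_args(unknown_args):
    """Parse the project-specific part of the command line."""
    parser = argparse.ArgumentParser()
    # hypothetical project option for illustration
    parser.add_argument("--distance", type=float, default=2.5,
                        help="interatomic distance in Angstrom")
    return parser.parse_args(unknown_args)

def set_project_args(project, project_args):
    """Apply the parsed project arguments to the project object."""
    if project_args is not None:
        project.distance = project_args.distance  # hypothetical attribute
@endcode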
\section sec_slurm Slurm Job Submission

The command line of the Slurm job submission script for the Ra cluster at PSI is as follows.
This script is specific to the configuration of the Ra cluster but may be adapted to other Slurm-based queues.

@code{.sh}
qpmsco.sh [NOSUB] JOBNAME NODES TASKS_PER_NODE WALLTIME:HOURS PROJECT MODE [ARGS [ARGS [...]]]
qpmsco.sh [NOSUB] DESTDIR JOBNAME NODES TASKS_PER_NODE WALLTIME:HOURS PROJECT MODE [ARGS [ARGS [...]]]
@endcode

Here, the first few arguments are positional and their order must be strictly adhered to.
After the positional arguments, optional arguments of the PMSCO project command line can be added in arbitrary order.
If you execute the script without arguments, it displays a short summary.
The job script is written to @c ~/jobs/\$JOBNAME.
The job script is written to @c $DESTDIR/$JOBNAME which is also the destination of calculation output.

| Argument | Values | Description |
| --- | --- | --- |
| NOSUB (optional) | NOSUB or omitted | If NOSUB is present as the first argument, create the job script but do not submit it to the queue. Otherwise, submit the job script. |
| DESTDIR | file system path | destination directory. must exist. a sub-dir $JOBNAME is created. |
| JOBNAME | text | Name of job. Use only alphanumeric characters, no spaces. |
| NODES | integer | Number of computing nodes. (1 node = 24 or 32 processors). Do not specify more than 2. |
| TASKS_PER_NODE | 1...24, or 32 | Number of processes per node. 24 or 32 for full-node allocation. 1...23 for shared node allocation. |
| WALLTIME:HOURS | integer | Requested wall time. 1...24 for day partition, 24...192 for week partition, 1...192 for shared partition. This value is also passed on to PMSCO as the @c --time-limit argument. |
| PROJECT | file system path | Python module (file path) that declares the project and starts the calculation. |
| MODE | single, swarm, grid | PMSCO operation mode. This value is passed on to PMSCO as the @c --mode argument. |
| MODE | single, swarm, grid, genetic | PMSCO operation mode. This value is passed on to PMSCO as the @c --mode argument. |
| ARGS (optional) | | Any further arguments are passed on verbatim to PMSCO. You don't need to specify the mode and time limit here. |

*/
docs/src/concepts-atomscat.dox (new file): 114 lines

@@ -0,0 +1,114 @@
/*! @page pag_concepts_atomscat Atomic scattering

\section sec_atomscat Atomic scattering

\subsection sec_atomscat_intro Introduction

The process of calculating atomic scattering factors (phase shifts) can be customized in several ways.

1. Internal processing.
   Some multiple scattering programs, like EDAC, contain a built-in facility to calculate phase shifts.
   This is the simplest behaviour and the default.
2. Automatic calculation in a separate program.
   PMSCO has an interface to run the PHAGEN program from
   the [MsSpec-1.0 package](https://ipr.univ-rennes1.fr/msspec) to calculate scattering factors.
   Note that the PHAGEN code is not included in the public distribution of PMSCO.
3. Manual calculation.
   Scattering files created manually using an external program can be used by providing the file names.
   The files must have the format required by the multiple scattering code,
   and they must be linked to the corresponding atoms of the cluster.

In the case of automatic calculation, the project code can optionally hook into the process
and modify clusters before and after scattering factors are calculated.
For instance, it may provide an extended cluster in order to reduce boundary effects,
or it may modify the assignment of scattering files to cluster atoms
so that the scattering factors of selected atom classes are used
(cf. section \ref sec_atomscat_atomclass).


\subsection sec_atomscat_usage Usage

\subsubsection sec_atomscat_internal Internal processing

This is the default behaviour selected in the inherited pmsco.project.Project class.
Make sure not to override the `atomic_scattering_factory` attribute.
Its default value is pmsco.calculators.calculator.InternalAtomicCalculator.

\subsubsection sec_atomscat_external Automatic calculation in a separate program

To select the atomic scattering calculator,
assign its interface class to the project's `atomic_scattering_factory` attribute.
For example, to use PHAGEN, add the following code to your project's `__init__` constructor:

@code{.py}
from pmsco.calculators.phagen import PhagenCalculator
self.atomic_scattering_factory = PhagenCalculator
@endcode

\subsubsection sec_atomscat_manual Manual calculation

If you want to keep the scattering factors constant during an optimization,
you should run PMSCO in _single_ mode and provide the model parameters and cluster
that will return the desired scattering files.
In the `create_params` method of your project,
you should then set the `phase_files` attribute,
which is a dictionary that maps atom classes to the names of the scattering files.
Unless you set specific values in the cluster object, the atom class defaults to the element number.
The file names should include a path relative to the working directory.
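A minimal sketch of such a `create_params` override (the method signature, import path and file names are assumptions for illustration):

@code{.py}
from pmsco.project import CalculatorParams  # import path assumed

def create_params(self, model, index):
    """Return calculator parameters with fixed, pre-calculated scattering files."""
    params = CalculatorParams()
    # atom class (default: element number) -> scattering file,
    # relative to the working directory; file names are hypothetical
    params.phase_files = {
        13: "scattering/al_phase.pha",
        8: "scattering/o_phase.pha",
    }
    return params
@endcode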
\subsection sec_atomscat_implement Implementation

\subsubsection sec_atomscat_atomclass Atom classes

Atomic scattering programs classify atoms based on chemical element, charge state and symmetry of the local environment.
This means that two atoms of the same chemical element may have different scattering factors.
For example, if you have EDAC output the cluster after calculation of the muffin tin potential,
you will find that the chemical element number has been replaced by an arbitrary integer.

By default, PMSCO will do the linking of atom classes and scattering files transparently.
However, if you want to reduce the number of atom classes,
or if you have the scattering factors calculated on a reference cluster,
you will have to provide project code to do the assignment.
This is described further below.


\subsubsection sec_atomscat_calculator Atomic scattering calculator

The project selects the atomic scattering calculation mode by specifying its `atomic_scattering_factory` attribute.
This is the name of a class that inherits from @ref pmsco.calculators.calculator.AtomicCalculator.

The following calculators are currently implemented:

| Class | Description |
| --- | --- |
| pmsco.calculators.calculator.InternalAtomicCalculator | Calculate the atomic scattering factors in the multiple-scattering program. |
| pmsco.calculators.phagen.PhagenCalculator | Calculate the atomic scattering factors in the PHAGEN program. |

An atomic calculator class essentially defines a `run` method that operates on a cluster and scattering parameters object.
It generates the necessary scattering files, updates the cluster with the new atom classes
and updates the parameters with the file names of the scattering files.
Note that the scattering files have to be in the correct format for the multiple scattering calculator.


\subsubsection sec_atomscat_hooks Project hooks

Before and after calculation of the scattering factors,
the project's `before_atomic_scattering` and `after_atomic_scattering` methods are called
with the cluster and input parameters.

The _before_ method provides the cluster to be used for atomic scattering calculations.
It may:
1. just return the original cluster,
2. modify the provided cluster to include additional atoms or modify the charge state of the emitter,
3. create a completely different cluster, or
4. return None to suppress the atomic scattering calculation.

The method is called once at the beginning of the PMSCO job with model -1,
where it may return the global reference cluster.
Later on it is called once for each calculation task with the specific task index.

Similarly, the _after_ method collects the results and updates the `phase_files` dictionary of the input parameters.
It is free to consolidate atom classes and remove unwanted atoms.
However, it must make sure that for each atom class in the cluster,
there is a corresponding link to a scattering file.
*/
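A hedged skeleton of these hooks in a project class (the exact signatures and return conventions are assumptions based on the description above, not the actual PMSCO interface):

@code{.py}
class MyProject(Project):
    def before_atomic_scattering(self, cluster, params):
        # option 1 of the list above: return the original cluster unchanged;
        # an extended cluster could be returned here instead
        return cluster

    def after_atomic_scattering(self, cluster, params):
        # consolidate atom classes if desired, but keep params.phase_files
        # consistent: every atom class in the cluster needs a scattering file
        return cluster, params
@endcode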
185
docs/src/concepts-emitter.dox
Normal file
185
docs/src/concepts-emitter.dox
Normal file
@ -0,0 +1,185 @@
/*! @page pag_concepts_emitter Emitter configurations

\section sec_emitters Emitter configurations

\subsection sec_emit_intro Introduction

Since emitters contribute incoherently to the diffraction pattern,
it should make no difference how the emitters are grouped and calculated.
This fact can be used to distribute a calculation over multiple parallel processes
if each process calculates the diffraction pattern coming from one particular emitter atom.
In fact, some calculation codes are implemented for a single emitter per calculation.

With PMSCO, it is easy to distribute the emitters over parallel processes.
The project just declares the number of emitters and returns one specific cluster per emitter.
In the simplest case, this means that the emitter attribute of the cluster atoms is set differently,
while the atomic coordinates are the same for all clusters generated.
PMSCO takes care of dispatching the clusters to multiple calculation processes
depending on the number of allocated MPI processes
as well as summing up the resulting diffraction patterns.

In addition, the emitter framework also supports clusters that are tailored to a specific emitter configuration.
Suppose that the unit cell contains a large number of inequivalent emitters.
If all emitters had to be included in a single calculation,
the cluster would grow very large and the calculation would include many long scattering paths
that effectively do not contribute intensity to the final result.
Splitting a large cluster into small ones built locally around one emitter
can provide a significant performance gain in complex systems.

Note that the emitter framework does not require that an emitter _configuration_ contain only one emitter _atom_.
It is up to the project to define how many emitter configurations there are and what they encompass.
Grouping multiple emitter atoms into one configuration should, however, normally not be necessary.
To avoid confusion, it is recommended to declare exactly one emitter atom per configuration.
\subsection sec_emit_implement Implementation

There are several implementation routes of varying complexity.
Which route to take can depend on the complexity of the system and/or the programming skills of the user.
The following class diagram illustrates the classes and packages involved in cluster generation.

@startuml "class diagram for cluster generation"

package pmsco {
  class Project {
    cluster_generator
    export_cluster()
  }

  abstract class ClusterGenerator {
    project
    {abstract} count_emitters()
    {abstract} create_cluster()
  }

  class LegacyClusterGenerator {
    project
    count_emitters()
    create_cluster()
  }
}

package "user project" {
  class UserClusterGenerator {
    project
    count_emitters()
    create_cluster()
  }

  note bottom : for complex cluster

  class UserProject {
    count_emitters()
    create_cluster()
  }

  note bottom : for simple cluster
}

Project <|-- UserProject
ClusterGenerator <|-- LegacyClusterGenerator
ClusterGenerator <|-- UserClusterGenerator
Project *-- ClusterGenerator
UserProject .> LegacyClusterGenerator
UserProject .> UserClusterGenerator

@enduml

In general, the cluster is generated by calls to the project's cluster_generator object.
This can be either a custom generator class derived from pmsco.cluster.ClusterGenerator
or the default pmsco.cluster.LegacyClusterGenerator which calls back into the UserProject.
For simple clusters, it may be sufficient to implement the cluster directly in the user project class
(UserProject in the diagram).
For more complex systems, it is recommended to implement a custom cluster generator class
(UserClusterGenerator).
\subsubsection sec_emit_implement_legacy Static cluster implemented in project methods

This is the simplest route as it requires the implementation of only one or two methods of the user project class.
It can be used for single-emitter and multi-emitter problems.
This implementation is active while a pmsco.cluster.LegacyClusterGenerator
is assigned to the project's cluster_generator attribute.

1. Implement a count_emitters method in your project class
if the project uses more than one emitter configuration.
It must have the same method contract as pmsco.cluster.ClusterGenerator.count_emitters.
Specifically, it must return the number of emitter configurations of a given model, scan and domain.
If there is only one configuration, the method does not need to be implemented.

2. Implement a create_cluster method in your project class.
It must have the same method contract as pmsco.cluster.ClusterGenerator.create_cluster.
Specifically, it must return a cluster.Cluster object for the given model, scan, domain and emitter configuration.
The emitter atoms must be marked according to the emitter configuration specified by the index argument.
Note that, depending on the index.emit argument, either all emitter atoms must be marked
or only the ones of the corresponding emitter configuration.

3. (Optionally) override the pmsco.project.Project.combine_emitters method
if the emitters should be added with non-uniform weights.

Although it is possible to produce emitter-dependent clusters using this approach,
this is usually not recommended.
Rather, the generator approach described below should be followed in this case.
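A minimal sketch of this legacy route; the method signatures and bodies are simplified assumptions (the authoritative contracts are those of pmsco.cluster.ClusterGenerator):

@code{.py}
from pmsco.cluster import Cluster
from pmsco.project import Project


class MyProject(Project):
    def count_emitters(self, model, index):
        # assumed signature: return the number of emitter configurations
        # for the given model, scan and domain
        return 2

    def create_cluster(self, model, index):
        # assumed signature: build the cluster from the model parameters
        clu = Cluster()
        # ... add atoms here, then mark the emitter(s) selected by
        # index.emit (or all emitters, depending on the convention)
        return clu
@endcode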
\subsubsection sec_emit_implement_generator Static cluster implemented by generator class

The preferred way of creating clusters is to implement a _generator_ class
because it is the most scalable way from simple to complex systems.
In addition, one cluster generator class can be quickly exchanged for another
if there are multiple possibilities.

1. Implement a cluster generator class which inherits from pmsco.cluster.ClusterGenerator
in your project module.

2. Implement the create_cluster and count_emitters methods of the generator.
The method contracts are the same as the ones described in the previous paragraph,
just in the context of a separate class.

3. Initialize an instance of the generator and assign it to the project.cluster_generator attribute
in the initialization of your project.
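The same contracts in a separate generator class might look like this (a sketch under the assumptions noted above; the constructor argument follows the project attribute shown in the class diagram):

@code{.py}
from pmsco.cluster import Cluster, ClusterGenerator
from pmsco.project import Project


class MyClusterGenerator(ClusterGenerator):
    def count_emitters(self, model, index):
        return 1

    def create_cluster(self, model, index):
        clu = Cluster()
        # ... build the cluster and mark the emitter
        return clu


class MyProject(Project):
    def __init__(self):
        super().__init__()
        # step 3: attach the custom generator to the project
        self.cluster_generator = MyClusterGenerator(self)
@endcode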
\subsubsection sec_emit_implement_local Local clusters implemented by generator class

The basic method contract outlined in the previous paragraph is equally applicable to the case
where a local cluster is generated for each emitter configuration.
Again, the generator class with the two methods (count_emitters and create_cluster) is the minimum requirement.
However, for ease of code maintenance and/or for improved performance with large clusters,
some internal structure may be helpful.

Suppose that the system consists of a large supercell containing many emitters
and that a small cluster shall be built for each emitter configuration.
During the calculations, the generator will receive several calls to the count_emitters and create_cluster methods.
Every time the model and index are the same, the functions must return the same result.
Thus, most importantly, the implementation must make sure that the results are fully deterministic.
Second, depending on the complexity, it can be more efficient to cache a cluster for later use.

One way to reduce the complexity is to introduce a _master cluster_
from which the emitter configurations and individual clusters are derived.

1. Implement a master_cluster method with the same arguments and result types as create_cluster.
The method returns a full cluster of the supercell and its neighbouring cells.
All inequivalent emitters are marked (which determines the number of emitter configurations).

2. Decorate the master_cluster method with pmsco.dispatch.CachedCalculationMethod.
This pre-defined decorator transparently caches the cluster
so that subsequent calls with the same arguments do not re-create the cluster but return the cached one.

3. The count_emitters method can simply return the emitter count of the master cluster.

4. The create_cluster method calls master_cluster() and extracts the region
corresponding to the requested emitter configuration.
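A sketch of the master-cluster pattern (the decorator is the pmsco.dispatch.CachedCalculationMethod named above; the method bodies and the emitter-count helper are illustrative assumptions):

@code{.py}
from pmsco.cluster import Cluster, ClusterGenerator
from pmsco.dispatch import CachedCalculationMethod


class SupercellGenerator(ClusterGenerator):
    @CachedCalculationMethod
    def master_cluster(self, model, index):
        clu = Cluster()
        # build the supercell and its neighbouring cells from the model
        # parameters and mark all inequivalent emitters - deterministically!
        return clu

    def count_emitters(self, model, index):
        # step 3: one configuration per marked emitter in the master cluster
        return self.master_cluster(model, index).get_emitter_count()  # assumed helper

    def create_cluster(self, model, index):
        # step 4: derive the local cluster for the requested configuration
        master = self.master_cluster(model, index)
        # ... trim the master cluster around the emitter selected by index.emit
        return master
@endcode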
\subsection sec_emit_report Reporting

The pmsco.project.Project class implements a method that saves a cluster to two XYZ files,
one containing the coordinates of all atoms
and one containing only the coordinates of the emitters.

The method is called for each cluster that is passed to the calculator, i.e., for each emitter index.
You may override the method in your project to alter the reporting.
*/
3
docs/src/concepts-model.dox
Normal file
@ -0,0 +1,3 @@
/*! @page pag_concepts_model Model

*/
3
docs/src/concepts-region.dox
Normal file
@ -0,0 +1,3 @@
/*! @page pag_concepts_region Region

*/
31
docs/src/concepts-scan.dox
Normal file
@ -0,0 +1,31 @@
/*! @page pag_concepts_scan Scans

\section sec_scanning Scanning

PMSCO with EDAC currently supports the following scan axes:

- kinetic energy E
- polar angle theta T
- azimuthal angle phi P
- analyser angle alpha A

The following combinations of these scan axes are allowed (see pmsco.data.SCANTYPES):

- E
- E-T
- E-A
- T-P (hemispherical or hologram scan)

@attention The T and A axes cannot be combined.
If a scan of one of them is specified, the other is assumed to be fixed at zero!
This assumption may change in the future,
so it is best to explicitly set the fixed angle to zero in the scan file.

@remark According to the measurement geometry at PEARL,
alpha scans are implemented in EDAC as theta scans at phi = 90 in fixed cluster mode.
The switch to fixed cluster mode is made by PMSCO internally;
no change of angles or other parameters is necessary in the scan or project files
besides filling the alpha instead of the theta column.

*/
32
docs/src/concepts-symmetry.dox
Normal file
@ -0,0 +1,32 @@
/*! @page pag_concepts_domain Domain

\section sec_domain Domain Averaging

A _domain_ under PMSCO is a discrete variant of a set of calculation parameters (including the atomic cluster)
that is derived from the same set of model parameters
and that contributes incoherently to the measured diffraction pattern.
A domain may be represented by special domain parameters that are not subject to optimization.

For instance, a real sample may have rotational domains that are not present in the cluster,
changing the symmetry from three-fold to six-fold.
Or, an adsorbate may be present in a number of different lateral configurations on the substrate.
In the first case, it may be sufficient to fold calculated data in the proper way to generate the same symmetry as in the measurement.
In the latter case, it may be necessary to execute a scattering calculation for each possible orientation or a representative number of possible orientations.

PMSCO provides the basic framework to spawn multiple calculations according to the number of domains (cf. \ref sec_tasks).
The actual data reduction from multiple domains to one measurement needs to be implemented on the project level.
This section explains the necessary steps; a short sketch follows the list.

1. Your project needs to populate the pmsco.project.Project.domains list.
For each domain, add a dictionary of domain parameters, e.g. <code>{'angle_azi': 15.0}</code>.
At least one domain must be declared in a project, otherwise no calculation is executed.

2. The project may use the domain index of a task to build the cluster and parameter file as necessary.
The pmsco.project.Project.create_cluster and pmsco.project.Project.create_params methods receive the index of the particular domain in addition to the model parameters.

3. The project combines the results of the calculations for the various domains into one dataset that can be compared to the measurement.
The default method implemented in pmsco.project.Project just adds up all calculations with customizable weights.
It uses the special model parameters `wdom1`, `wdom2`, ... (if defined, default 1) to weight each domain.
If you need more control, override the pmsco.project.Project.combine_domains method and implement your own algorithm.
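A minimal sketch of step 1, using the add_domain call shown in the calculation-task documentation (the parameter name follows the example above):

@code{.py}
# declare two azimuthal domains of the adsorbate
project.add_domain({'angle_azi': 0.0})
project.add_domain({'angle_azi': 15.0})
@endcode

The default combine_domains method then weights the two datasets with `wdom1` and `wdom2` if these model parameters are defined.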
*/
306
docs/src/concepts-tasks.dox
Normal file
@ -0,0 +1,306 @@
/*! @page pag_concepts_tasks Task concept
\section sec_tasks Calculation tasks

A _calculation task_ defines a concrete set of model parameters, atomic coordinates, emitter configuration,
experimental reference and meta-data (such as file names)
that completely defines how to produce the input data for the scattering program (the _calculator_).
For each task, the calculator is executed once and produces one result dataset.
In a typical optimization project, however, the calculator is executed multiple times,
for reasons mandated by the project as well as for efficient calculation in a multi-process environment:

1. The calculation must be repeated under variation of parameters.
A concrete set of parameters is called @ref sec_task_model.
2. The sample was measured multiple times or under different conditions (initial states, photon energy, emission angle).
Each contiguous measured dataset is called a @ref sec_task_scan.
3. The measurement averages over multiple inequivalent domains, cf. @ref sec_task_domain.
4. The measurement includes multiple geometrically inequivalent emitters, cf. @ref sec_task_emitter.
5. The calculation should be distributed over multiple processes that run in parallel to reduce the wall time, cf. @ref sec_task_region.

In PMSCO, these aspects are modelled as attributes of a calculation task
as shown schematically in the following diagram.
@startuml "attributes of a calculation task"

class CalculationTask {
  model
  scan
  domain
  emitter
  region
  ..
  files
}

class Model {
  index
  ..
  dlat
  dAS
  dS1S2
  V0
  Zsurf
  Texp
  rmax
}

class Scan {
  index
  ..
  filename
  mode
  initial_state
  energies
  thetas
  phis
  alphas
}

class Domain {
  index
  ..
  rotation
  registry
}

class Emitter {
  index
}

class Region {
  index
  ..
  range
}

CalculationTask *-- Model
CalculationTask *-- Scan
CalculationTask *-- Domain
CalculationTask *-- Emitter
CalculationTask *-- Region

class Project {
  scans
  domains
  model_handler
  cluster_generator
}

class ClusterGenerator {
  count_emitters()
  create_cluster()
}

class ModelHandler {
  create_tasks()
  add_result()
}

Model ..> ModelHandler
Scan ..> Project
Domain ..> Project
Emitter ..> ClusterGenerator
Region ..> Project

Project *-left- ModelHandler
Project *- ClusterGenerator

hide empty members

@enduml
Although the attributes may have quite different types (as detailed below),
each instance is also given a unique (per attribute) integer index,
where -1 means that the attribute is undefined.
The indices of the five attributes together (the pmsco.dispatch.CalcID tuple)
serve internally to identify a task and the data belonging to it.
The identifier appears, for instance, in input and output file names.
Normally, data files are deleted after the calculation, and only a few top-level files are kept
(this can be overridden at the command line or in the project code).
At the top level, only the model ID is set, the other ones are undefined (-1).
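For illustration, the identifier can be pictured as a named tuple with one index per level; the field names here are assumptions of this sketch, the actual definition is pmsco.dispatch.CalcID:

@code{.py}
from collections import namedtuple

CalcID = namedtuple('CalcID', ['model', 'scan', 'domain', 'emitter', 'region'])

top = CalcID(model=7, scan=-1, domain=-1, emitter=-1, region=-1)   # top level: only the model ID is set
leaf = CalcID(model=7, scan=0, domain=1, emitter=2, region=0)      # fully specified task
@endcode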
\subsection sec_task_model Model

The _model_ attribute is a dictionary of continuously variable parameters of the system, such as lattice constants, relaxation constants, rotation angles, etc.
It may also define non-structural or non-physical parameters such as temperature, inner potential or cluster radius.

The dictionary contains key-value pairs where the keys are up to the user project (the figure shows some examples).
The values are floating-point numbers that are chosen by the model handler within the domain specified by the user project.

Models are generated by the chosen optimizer according to a particular algorithm or, in single mode, directly by the project.
Each specific instance of model parameters is given a unique index that identifies the related input and output files.
Model parameters are reported with the corresponding R-factors during the optimization process.
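A hypothetical model dictionary using the parameter names from the figure (the keys are entirely up to the project; the values are arbitrary):

@code{.py}
model = {'dlat': 3.92, 'dAS': 1.20, 'dS1S2': 2.35,
         'V0': 12.0, 'Zsurf': 1.5, 'Texp': 300.0, 'rmax': 8.0}
@endcode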
\subsection sec_task_scan Scan

The _scan_ attribute is an index into the list of scans defined by the user project.
Each scan refers to one experimental data file and, thus, defines the initial and final states of the photoelectron.
PMSCO runs a separate calculation for each scan file and compares the combined results to the experimental data.
This is sometimes called a _global fit_.

\subsection sec_task_domain Domain

A _domain_ is a discrete variant of a set of calculation parameters (including the atomic cluster)
that is independent of the _model_ and contributes incoherently to the measured diffraction pattern.
For instance, for a system that includes two inequivalent structural domains,
two separate clusters have to be generated and calculated for each model.

The domain parameter is not subject to optimization.
However, if the branching ratio is unknown a priori, a model parameter can be introduced
to control the relative contribution of a particular domain to the diffraction pattern.
The basic @ref pmsco.project.Project.combine_domains method reads the special model parameters `wdom1`, `wdom2`, etc. to weight the individual domains.

A domain is identified by its index, which is an index into the project's domains table (pmsco.project.Project.domains).
It is up to the user project to give a physical description of the domain, e.g. a rotation angle,
by assigning a meaningful value (e.g. a dictionary with key-value pairs) to the domains table.
The cluster generator can then read the value from the table rather than from constants in the code.

The figure shows two examples of domain parameters.
The corresponding domains table could be set up like this:

@code{.py}
project.add_domain({'rotation': 0.0, 'registry': 0.0})
project.add_domain({'rotation': 30.0, 'registry': 0.0})
@endcode
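If the branching ratio between the two domains is to be optimized, the model space can include the corresponding weight parameter; a hypothetical model instance could then look like this (`wdom1` defaults to 1):

@code{.py}
model = {'dlat': 3.92, 'wdom2': 0.5}   # second domain weighted at 50%
@endcode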
\subsection sec_task_emitter Emitter

The _emitter_ component of the calculation task selects a specific emitter configuration of the cluster generator.
This is merely an index whose interpretation is up to the cluster generator.
The default emitter handler enumerates the emitter index from 1 to the emitter count reported by the cluster generator.

The emitter count and list of emitters may depend on model, scan and domain.

The cluster generator can tailor a cluster to the given model, scan, domain and emitter index.
For example, in a large unit cell with many inequivalent emitters,
the generator might return a small sub-cluster around the actual emitter for better calculation performance
since the distant atoms of the unit cell do not contribute to the diffraction pattern.

Emitter branching must be requested specifically by using a particular pattern in the code.
By default, it is disabled, which allows the cluster code to be written in a slightly easier way.
\subsection sec_task_region Region

The _region_ handler may split a scan region into several smaller chunks
so that the tasks can be distributed to multiple processes.

Chunking by energy regions is enabled automatically if the project contains an energy scan of at least 10 points
and the project is run in multiple processes.
It can be disabled by the user project.
\section sec_task_handler Task handlers

The previous section described the five important attributes of a calculation task.
These attributes span a five-dimensional index space
where each point maps to one task and, consequently, one calculation and one result dataset.
To populate the index space, however, calculation tasks are more adequately arranged in a tree-like hierarchy with five levels.
The code that defines attributes and processes results can then be separated into _handlers_.

Each level calls for a particular functional contract of the handler.
According to object-oriented principles, the contracts at the five levels are defined by abstract base classes
which can be sub-classed for more specific behaviour.
For instance, the class of the model handler is chosen based on the execution mode (single, grid, swarm, etc.).
Though it is possible for a project to define its own handlers,
the PMSCO core declares handlers that should cover most calculation scenarios.

The following diagram shows the tree of calculation tasks and how handlers act on the task objects to populate the task attributes.
At the top of the tree, an empty task object (all attributes undefined) is fed into the model-level handler which takes care of the model attribute.
The model handler generates a number of sub-tasks, one for each set of model parameters.
Each of these (incompletely defined) tasks is then passed to the next handler, and so on.
@startuml "calculation task hierarchy and task handler stack"

object "Root: CalculationTask" as Root {
  index = (-1,-1,-1,-1,-1)
}
note right: all attributes undefined

object "Model: CalculationTask" as Model {
  index = (i,-1,-1,-1,-1)
  model
}
note right: model is defined\nother attributes undefined

object ModelHandler

object "Scan: CalculationTask" as Scan {
  index = (i,j,-1,-1,-1)
  model
  scan
}

object ScanHandler

object "Domain: CalculationTask" as Domain {
  index = (i,j,k,-1,-1)
  model
  scan
  domain
}

object "DomainHandler" as DomainHandler

object "Emitter: CalculationTask" as Emitter {
  index = (i,j,k,l,-1)
  model
  scan
  domain
  emitter
}

object EmitterHandler

object "Region: CalculationTask" as Region {
  index = (i,j,k,l,m)
  model
  scan
  domain
  emitter
  region
}
note right: all attributes well-defined

object RegionHandler

Root "1" o.. "1..*" Model
Model "1" o.. "1..*" Scan
Scan "1" o.. "1..*" Domain
Domain "1" o.. "1..*" Emitter
Emitter "1" o.. "1..*" Region

(Root, Model) .. ModelHandler
(Model, Scan) .. ScanHandler
(Scan, Domain) .. DomainHandler
(Domain, Emitter) .. EmitterHandler
(Emitter, Region) .. RegionHandler

@enduml
At the end of the stack, the tasks are fully specified and are passed to the calculation queue.
They are dispatched to the available processes of the MPI environment in which PMSCO was started,
which allows calculations to be run in parallel.
Only once the model is broken down into multiple, fully specified tasks
are the cluster and input files generated and the calculation program started.

At the end of a calculation, the output files are associated with their original task objects,
and the tasks are passed back through the task handler stack.
In this phase, each level joins the datasets from the sub-tasks into the data requested by the parent task.
For example, at the lowest level, one result file is present for each region.
The region handler gathers all files that correspond to the same parent task
(i.e. have the same emitter, domain, scan and model attributes),
joins them into one file which includes all regions,
links the file to the parent task and passes the result to the next higher level.

On the top level, the model handler compares the result to the experimental data.
Depending on the operation mode, it refines the model parameters and issues new tasks by passing them down the stack.
When the optimization is finished (according to a set of defined criteria),
the model handler returns the root task to the caller, which causes PMSCO to exit.

*/
@ -1,153 +1,85 @@
/*! @page pag_concepts Design Concepts
\section sec_tasks Tasks
/*! @page pag_concepts Design

In an optimization project, a number of optimizable, high-level parameters generated by the optimization algorithm
must be mapped to the input parameters and atomic coordinates before the calculation program is executed.
Possibly, the calculation program is executed multiple times for inequivalent domains, emitters or scan geometries.
After the calculation, the output is collected, compared to the experimental data, and the model is refined.
In PMSCO, the optimization is broken down into a set of _tasks_ and assigned to a stack of task _handlers_ according to the following figure.
Each invocation of the scattering program (EDAC) runs a specific task,
i.e. a calculation for a set of specific parameters, a fully-qualified cluster of atoms, and a specific angle and/or energy scan.
\section sec_components Components

\dotfile tasks.dot "PMSCO task stack"
The code for a PMSCO job consists of the following components.

At the root, the _model handler_ proposes models that need to be calculated according to the operation mode specified at the command line.
A _model_ is the minimum set of variable parameters in the context of a custom project.
Other parameters that will not vary under optimization are set directly by the project code.
The model handler may generate models based on a fixed scheme, e.g. on a grid, or based on R-factors of previous results.
@startuml "top-level components of scattering and optimization code"

For each model, one task is passed to the task handling chain, starting with the scan handler.
The _scan handler_ generates sub-tasks for each experimental scan dataset.
This way, the model can be optimized for multiple experimental scans in the same run (see Sec. \ref sec_scanning).
skinparam componentStyle uml2

The _symmetry handler_ generates sub-tasks based on the number of symmetries contained in the experimental data (see Sec. \ref sec_symmetry).
For instance, for a system that includes two inequivalent structural domains, two separate calculations have to be run for each model.
The symmetry handler is implemented on the project level and may be customized for a specific system.
component "project" as project
component "PMSCO" as pmsco
component "scattering code\n(calculator)" as calculator

The _emitter handler_ generates a sub-task for each inequivalent emitter atom
so that the tasks can be distributed to multiple processes (see Sec. \ref sec_emitters).
In a single-process environment, all emitters are calculated in one task.
interface "command line" as cli
interface "input files" as input
interface "output files" as output
interface "experimental data" as data
interface "results" as results

The _region handler_ may split a scan region into several smaller chunks
so that the tasks can be distributed to multiple processes.
With EDAC, only energy scans can benefit from chunking
since it always calculates the full angular distribution.
This layer has to be enabled specifically in the project module.
It is disabled by default.
data -> project
project ..> pmsco
pmsco ..> calculator
cli --> project
input -> calculator
calculator -> output
pmsco -> results

At the end of the stack, the tasks are fully specified and are passed to the calculation queue.
They are dispatched to the available processes of the MPI environment in which PMSCO was started,
which allows calculations to be run in parallel.
Only once the model is broken down into multiple tasks
are the cluster and input files generated and the calculation program started.

At the end of a calculation, the output is passed back through the task handler stack.
In this phase, each level gathers the datasets from the sub-tasks into the data requested by the parent task
and passes the result to the next higher level.

On the top level, the calculation is compared to the experimental data.
Depending on the operation mode, the model parameters are refined, and new tasks are issued.
If the optimization is finished according to a set of defined criteria, PMSCO exits.

As an implementation detail, each task is given a unique _identifier_ consisting of five integer numbers
which correspond to the five levels model, scan, symmetry, emitter and region.
The identifier appears in the file names in the communication with the scattering program.
Normally, the data files are deleted after the calculation, and only a few top-level files are kept
(this can be overridden at the command line or in the project code).
At the top level, only the model ID is set, the other ones are undefined (-1).
@enduml
\section sec_symmetry Symmetry and Domain Averaging
The _project_ consists of program code, system and experimental parameters
that are specific to a particular experiment and calculation job.
The project code reads experimental data, defines the parameter dictionary of the model,
and contains code to generate the cluster, parameter and phase files for the scattering code.
The project is also the main entry point of process execution.

A _symmetry_ under PMSCO is a discrete variant of a set of calculation parameters (including the atomic cluster)
that is derived from the same set of model parameters
and that contributes incoherently to the measured diffraction pattern.
A symmetry may be represented by a special symmetry parameter which is not subject to optimization.
The _scattering code_ on the other hand is a static calculation engine
which accepts detailed input files
(parameters, atomic coordinates, emitter specification, scattering phases)
and outputs an intensity distribution of photoelectrons versus energy and/or angle.

For instance, a real sample may have additional rotational domains that are not present in the cluster,
increasing the symmetry from three-fold to six-fold.
Or, an adsorbate may be present in a number of different lateral configurations on the substrate.
In the first case, it may be sufficient to fold calculated data in the proper way to generate the same symmetry as in the measurement.
In the latter case, it may be necessary to execute a scattering calculation for each possible orientation or a representative number of possible orientations.

PMSCO provides the basic framework to spawn multiple calculations according to the number of symmetries (cf. \ref sec_tasks).
The actual data reduction from multiple symmetries to one measurement needs to be implemented on the project level.
This section explains the necessary steps.

1. Your project needs to populate the pmsco.project.Project.symmetries list.
For each symmetry, add a dictionary of symmetry parameters, e.g. <code>{'angle_azi': 15.0}</code>.
There must be at least one symmetry in a project, otherwise no calculation is executed.

2. The project may apply the symmetry of a task to the cluster and parameter file if necessary.
The pmsco.project.Project.create_cluster and pmsco.project.Project.create_params methods receive the index of the particular symmetry in addition to the model parameters.

3. The project combines the results of the calculations for the various symmetries into one dataset that can be compared to the measurement.
The default method implemented in pmsco.project.Project just adds up all calculations with equal weight.
If you need more control, you need to override the pmsco.project.Project.combine_symmetries method and implement your own algorithm.
The _PMSCO core_ interfaces between the project and the calculator.
It carries out the structural optimization and manages the calculation tasks.
It generates and sends input files to the calculator and reads back the output.
\section sec_scanning Scanning
\section sec_control_flow Control flow

PMSCO with EDAC currently supports the following scan axes.
The basic control flow of an optimization job is depicted schematically in the following figure.

- kinetic energy E
- polar angle theta T
- azimuthal angle phi P
- analyser angle alpha A
@startuml "top-level activity diagram"

The following combinations of these scan axes are allowed (see pmsco.data.SCANTYPES).
start
:initialize;
:import experimental data;
repeat
:define tasks;
fork
:calculate\ntask 1;
fork again
:calculate\ntask N;
end fork
:evaluate results;
repeat while
-> [finished];
:report results;

- E
- E-T
- E-A
- T-P (hemispherical or hologram scan)
stop

@attention The T and A axes cannot be combined.
If a scan of one of them is specified, the other is assumed to be fixed at zero!
This assumption may change in the future,
so it is best to explicitly set the fixed angle to zero in the scan file.
@enduml

@remark According to the measurement geometry at PEARL,
alpha scans are implemented in EDAC as theta scans at phi = 90 in fixed cluster mode.
The switch to fixed cluster mode is made by PMSCO internally;
no change of angles or other parameters is necessary in the scan or project files
besides filling the alpha instead of the theta column.
After importing experimental data and setting up the model dictionary and job parameters,
the calculation tasks are defined depending on the execution mode and system setup.
Each task consists of a specific set of model, experimental and calculation parameters
that describe an independent calculation step,
while several steps may be required to produce a dataset that can be compared to the experimental data.
The idea is that tasks can be defined quickly
and that the time-consuming operations are dispatched to slave processes which can run in parallel.

\section sec_emitters Emitter Configurations

Since emitters contribute incoherently to the diffraction pattern,
it should make no difference how the emitters are grouped and calculated.
EDAC allows multiple emitters to be specified in one calculation.
However, running EDAC multiple times for a single-emitter configuration or simply summing up the results
gives the same final diffraction pattern with no significant difference in used CPU time.
It is, thus, easy to distribute the emitters over parallel processes in a multi-process environment.
PMSCO can handle this transparently with minimal effort.

Within the same framework, PMSCO also supports clusters that are tailored to a specific emitter configuration.
Suppose that the unit cell contains a large number of inequivalent emitters.
If all emitters had to be included in a single calculation,
the cluster would grow very large and the calculation would take a long time
because it would include many long scattering paths
that effectively do not contribute intensity to the final result.
Using single emitters, a cluster can be built locally around the emitter and kept to a reasonable size.

Even when using this feature, PMSCO does not require that each configuration contain only one emitter.
The term _emitter_ effectively means _emitter configuration_.
A configuration can include multiple emitters which will not be broken up further.
It is up to the project what is included in a particular configuration.

To enable emitter handling,

1. override the count_emitters method of your cluster generator
and return the number of emitter configurations of a given model, scan and symmetry.

2. handle the emitter index in your create_cluster method.

3. (optionally) override the pmsco.project.Project.combine_emitters method
if the emitters should not be added with equal weights.

For implementation details see the respective method descriptions.
As soon as all necessary results are available, they are combined into one dataset and compared to the experimental data.
Depending on the execution mode, the process of task definition and calculation repeats until the model has converged
or the calculations are stopped for another reason.

*/
@ -10,7 +10,7 @@ digraph G {
create_params;
calc_modf;
calc_rfac;
comb_syms;
comb_doms;
comb_scans;
}
*/
@ -24,11 +24,11 @@ digraph G {
model_handler -> model_creator [constraint=false, label="optimize"];
}

subgraph cluster_symmetry {
label = "symmetry handler";
subgraph cluster_domain {
label = "domain handler";
rank = same;
sym_creator [label="expand models", group=creators];
sym_handler [label="combine symmetries", group=handlers];
dom_creator [label="expand models", group=creators];
dom_handler [label="combine domains", group=handlers];
}

subgraph cluster_scan {
@ -47,15 +47,15 @@ digraph G {

calculator [label="calculator (EDAC)", shape=box];

model_creator -> sym_creator [label="model", style=bold];
sym_creator -> scan_creator [label="models", style=bold];
model_creator -> dom_creator [label="model", style=bold];
dom_creator -> scan_creator [label="models", style=bold];
scan_creator -> calc_creator [label="models", style=bold];
calc_creator -> calculator [label="clusters,\rparameters", style=bold];

calculator -> calc_handler [label="output files", style=bold];
calc_handler -> scan_handler [label="raw data files", style=bold];
scan_handler -> sym_handler [label="combined scans", style=bold];
sym_handler -> model_handler [label="combined symmetries", style=bold];
scan_handler -> dom_handler [label="combined scans", style=bold];
dom_handler -> model_handler [label="combined domains", style=bold];

mode [shape=parallelogram];
mode -> model_creator [lhead="cluster_model"];
@ -76,8 +76,8 @@ digraph G {
calc_rfac [shape=cds, label="R-factor function"];
calc_rfac -> model_handler [style=dashed];

comb_syms [shape=cds, label="symmetry combination rule"];
comb_syms -> sym_handler [style=dashed];
comb_doms [shape=cds, label="domain combination rule"];
comb_doms -> dom_handler [style=dashed];

comb_scans [shape=cds, label="scan combination rule"];
comb_scans -> scan_handler [style=dashed];
@ -14,26 +14,29 @@ Run PMSCO from the command prompt:

@code{.sh}
cd work-dir
python project-dir/project.py [pmsco-arguments] [project-arguments]
python pmsco-dir project-dir/project.py [pmsco-arguments] [project-arguments]
@endcode

where <code>work-dir</code> is the destination directory for output files,
<code>pmsco-dir</code> is the directory containing the <code>__main__.py</code> file,
<code>project.py</code> is the specific project module,
and <code>project-dir</code> is the directory where the project file is located.
PMSCO is run in one process which handles all calculations sequentially.

The command line arguments are usually divided into common arguments interpreted by the main pmsco code (pmsco.py),
The command line arguments are divided into common arguments interpreted by the main pmsco code (pmsco.py),
and project-specific arguments interpreted by the project module.
However, it is ultimately up to the project module how the command line is interpreted.

Example command line for a single EDAC calculation of the two-atom project:
@code{.sh}
cd work/twoatom
python pmsco/projects/twoatom/twoatom.py -s ea -o twoatom-demo -m single
python ../../pmsco ../../projects/twoatom/twoatom.py -s ea -o twoatom-demo -m single
@endcode

The project file <code>twoatom.py</code> takes the lead of the project execution.
Usually, it contains only project-specific code and delegates common tasks to the main pmsco code.
This command line executes the main pmsco module <code>pmsco.py</code>.
The main module loads the project file <code>twoatom.py</code> as a plug-in
and starts processing the common arguments.
The <code>twoatom.py</code> module contains only project-specific code
with several defined entry points called from the main module.

In the command line above, the <code>-o twoatom-demo</code> and <code>-m single</code> arguments
are interpreted by the pmsco module.
@ -61,7 +64,7 @@ For optimum performance, the number of processes should not exceed the number of
To start a two-hour optimization job with multiple processes on a quad-core workstation with hyperthreading:
@code{.sh}
cd work/my_project
mpiexec -np 8 project-dir/project.py -o my_job_0001 -t 2 -m swarm
mpiexec -np 8 pmsco-dir/pmsco project-dir/project.py -o my_job_0001 -t 2 -m swarm
@endcode

@ -84,4 +87,4 @@ bin/qpmsco.ra.sh my_job_0001 1 8 2 projects/my_project/project.py swarm
Be sure to consider the resource allocation policy of the cluster
before you decide on the number of processes.
Requesting fewer resources will prolong the run time but might increase the scheduling priority.
*/
*/
@ -3,60 +3,66 @@

\subsection sec_general General Remarks

The PMSCO code is maintained under git.
The PMSCO code is maintained under [Git](https://git-scm.com/).
The central repository for PSI-internal projects is at https://git.psi.ch/pearl/pmsco,
the public repository at https://gitlab.psi.ch/pearl/pmsco.

For their own developments, users should clone the repository.
Changes to common code should be submitted via pull requests.

The program code of PMSCO and its external programs is written in Python 3.6, C++ and Fortran.
The code will run in any recent Linux environment on a workstation or in a virtual machine.
Scientific Linux, CentOS7, [Ubuntu](https://www.ubuntu.com/)
and [Lubuntu](http://lubuntu.net/) (recommended for virtual machine) have been tested.
For optimization jobs, a workstation with at least 4 processor cores
or a cluster with 20-50 available processor cores is recommended.
The program requires about 2 GB of RAM per process.

The recommended IDE is [PyCharm (community edition)](https://www.jetbrains.com/pycharm).
[Spyder](https://docs.spyder-ide.org/index.html) is a good alternative with a better focus on scientific data.
The documentation in [Doxygen](http://www.stack.nl/~dimitri/doxygen/index.html) format is part of the source code.
The Doxygen compiler can generate separate documentation in HTML or LaTeX.

\subsection sec_requirements Requirements

The recommended IDE is [PyCharm (community edition)](https://www.jetbrains.com/pycharm).
The documentation in [Doxygen](http://www.stack.nl/~dimitri/doxygen/index.html) format is part of the source code.
The Doxygen compiler can generate separate documentation in HTML or LaTeX.
Please note that in some environments (particularly shared high-performance machines)
it may be important to choose specific compiler and library versions.
In order to maintain backward compatibility with some of these older machines,
code that requires new versions of compilers and libraries should be introduced carefully.

The MSC and EDAC codes compile with the GNU Fortran and C++ compilers on Linux.
Other compilers may work but have not been tested.
The code will run in any recent Linux environment on a workstation or in a virtual machine.
Scientific Linux, CentOS7, [Ubuntu](https://www.ubuntu.com/)
and [Lubuntu](http://lubuntu.net/) (recommended for virtual machine) have been tested.
For optimization jobs, a high-performance cluster with 20-50 available processor cores is recommended.
The code requires about 2 GB of RAM per process.

Please note that it may be important that the code remains compatible with earlier compiler and library versions.
Newer compilers or the latest versions of the libraries contain features that will break the compatibility.
The code can be used with newer versions as long as they are backward compatible.
The code depends on the following libraries:

- GCC 4.8
- OpenMPI 1.10
- GCC >= 4.8
- OpenMPI >= 1.10
- F2PY
- F2C
- SWIG
- Python 2.7 (incompatible with Python 3.0)
- Numpy 1.11 (incompatible with Numpy 1.13 and later)
- MPI4PY (from PyPI)
- BLAS
- LAPACK
- periodictable
- Python 3.6
- Numpy >= 1.13
- Python packages listed in the requirements.txt file

Most of these requirements are available from the Linux distribution.
For an easily maintainable Python environment, [Miniconda](https://conda.io/miniconda.html) is recommended.
The Python environment distributed with the OS often contains outdated packages,
and it is difficult to switch between different Python versions.

Most of these requirements are available from the Linux distribution, or from PyPI (pip install), respectively.
If there are any issues with the packages installed by the distribution, try the ones from PyPI
(e.g. there is currently a bug in the Debian mpi4py package).
The F2C source code is contained in the repository for machines which don't have it installed.
On the PSI cluster machines, the environment must be set using the module system and conda (on Ra).
Details are explained in the PEARL Wiki.

\subsubsection sec_install_ubuntu Installation on Ubuntu 16.04

The following instructions install the necessary dependencies on Ubuntu (or Lubuntu 16.04):
\subsection sec_install_instructions Instructions

\subsubsection sec_install_ubuntu Installation on Ubuntu

The following instructions install the necessary dependencies on Ubuntu, Debian or related distributions.
The Python environment is provided by [Miniconda](https://conda.io/miniconda.html).

@code{.sh}
sudo apt-get update
sudo apt update

sudo apt-get install \
sudo apt install \
binutils \
build-essential \
doxygen \
@ -67,38 +73,118 @@ gcc \
gfortran \
git \
graphviz \
ipython \
libblas-dev \
liblapack-dev \
libopenmpi-dev \
make \
nano \
openmpi-bin \
openmpi-common \
python-all \
python-mock \
python-nose \
python-numpy \
python-pip \
python-scipy \
python2.7-dev \
swig
sqlite3 \
wget
@endcode

sudo pip install --system mpi4py periodictable
On systems where the link to libblas is missing (see @ref sec_compile below),
the following lines are necessary.

@code{.sh}
cd /usr/lib
sudo ln -s /usr/lib/libblas/libblas.so.3 libblas.so
@endcode

The following instructions install the PyCharm IDE and a few other useful utilities:
Install Miniconda according to their [instructions](https://conda.io/docs/user-guide/install/index.html),
then configure the Python environment:

@code{.sh}
sudo sh -c 'echo "deb http://archive.getdeb.net/ubuntu xenial-getdeb apps" >> /etc/apt/sources.list.d/getdeb.list'
wget -q -O - http://archive.getdeb.net/getdeb-archive.key | sudo apt-key add -
sudo apt-get update
sudo apt-get install \
conda create -q --yes -n pmsco python=3.6
source activate pmsco
conda install -q --yes -n pmsco \
pip \
"numpy>=1.13" \
scipy \
ipython \
matplotlib \
nose \
mock \
future \
statsmodels \
swig \
gitpython
pip install periodictable attrdict fasteners mpi4py
@endcode

@note `mpi4py` should be installed via pip, _not_ conda.
conda might install its own MPI libraries, which can cause a conflict with system libraries.
(cf. [mpi4py forum](https://groups.google.com/forum/#!topic/mpi4py/xpPKcOO-H4k))

\subsubsection sec_install_singularity Installation in Singularity container

A [Singularity](https://www.sylabs.io/guides/2.5/user-guide/index.html) container
contains all OS and Python dependencies for running PMSCO.
Besides the Singularity executable, nothing else needs to be installed in the host system.
This may be the fastest way to get PMSCO running.

For installation of Singularity,
see their [user guide](https://www.sylabs.io/guides/2.5/user-guide/installation.html).
On newer Linux systems (e.g. Ubuntu 18.04), Singularity is available from the package manager.
Installation in a virtual machine on Windows or Mac is straightforward
thanks to the [Vagrant system](https://www.vagrantup.com/).

After installing Singularity,
check out PMSCO as explained in the @ref sec_compile section:

@code{.sh}
cd ~
mkdir containers
git clone git@git.psi.ch:pearl/pmsco.git pmsco
cd pmsco
git checkout master
git checkout -b my_branch
@endcode

Then, either copy a pre-built container into `~/containers`,
or build one from a script provided by the PMSCO repository:

@code{.sh}
cd ~/containers
sudo singularity build pmsco.simg ~/containers/pmsco/extras/singularity/singularity_python3
@endcode

To work with PMSCO, start an interactive shell in the container and switch to the pmsco environment.
Note that the PMSCO code is outside the container and can be edited with the usual tools.

@code{.sh}
cd ~/containers
singularity shell pmsco.simg
source activate pmsco
cd ~/containers/pmsco
make all
nosetests -w tests/
@endcode

Or call PMSCO from outside:

@code{.sh}
cd ~/containers
mkdir output
cd output
singularity run ../pmsco.simg python ~/containers/pmsco/pmsco path/to/your-project.py arg1 arg2 ...
@endcode

For parallel processing, prepend `mpirun -np X` to the singularity command as needed.

\subsubsection sec_install_extra Additional Applications

For working with the code and data, some other applications are recommended.
The PyCharm IDE can be installed from the Ubuntu software center.
The following commands install other useful helper applications:

@code{.sh}
sudo apt install \
avogadro \
gitg \
meld \
openjdk-9-jdk \
pycharm
meld
@endcode

To produce documentation in PDF format (not recommended on virtual machine), install LaTeX:
@ -124,15 +210,18 @@ Private key authentication is usually recommended except on shared computers.
Clone the code repository using one of these repository addresses and switch to the desired branch:

@code{.sh}
cd ~
git clone git@git.psi.ch:pearl/pmsco.git pmsco
cd pmsco
git checkout master
git checkout -b my_branch
@endcode

The compilation of the various modules is started by <code>make all</code>.
The compilation step is necessary only once after installation.
Compile the code and run the unit tests to check that it worked.

@code{.sh}
make all
nosetests -w tests/
@endcode

If the compilation of _loess.so fails due to a missing BLAS library,
try to set a link to the BLAS library as follows (the actual file names may vary depending on the distribution or version):
@ -150,7 +239,7 @@ Re-check from time to time.

@code{.sh}
cd ~/pmsco
nosetests
nosetests -w tests/
@endcode

Run the twoatom project to check the compilation of the calculation programs.
@ -161,8 +250,10 @@ mkdir work
cd work
mkdir twoatom
cd twoatom/
nice python ~/pmsco/projects/twoatom/twoatom.py -s ~/pmsco/projects/twoatom/twoatom_energy_alpha.etpai -o twoatom_energy_alpha -m single
nice python ~/pmsco/pmsco ~/pmsco/projects/twoatom/twoatom.py -s ea -o twoatom_energy_alpha -m single
@endcode

Runtime warnings may appear because the twoatom project does not contain experimental data.

To learn more about running PMSCO, see @ref pag_run.
*/
@ -9,25 +9,22 @@ The actual scattering calculation is done by code developed by other parties.
|
||||
While the scattering program typically calculates a diffraction pattern based on a set of static parameters and a specific coordinate file in a single process,
|
||||
PMSCO wraps around that program to facilitate parameter handling, cluster building, structural optimization and parallel processing.
|
||||
|
||||
In the current version, the [EDAC](http://garciadeabajos-group.icfo.es/widgets/edac/) code
|
||||
developed by F. J. García de Abajo, M. A. Van Hove, and C. S. Fadley (1999) is used for scattering calculations.
|
||||
Other code can be integrated as well.
|
||||
Initially, support for the MSC program by Kaduwela, Friedman, and Fadley was planned but is currently not maintained.
|
||||
PMSCO is written in Python 2.7.
|
||||
EDAC is written in C++, MSC in Fortran.
|
||||
PMSCO interacts with the calculation programs through Python wrappers for C++ or Fortran.
|
||||
In the current version, PMSCO can make use of the following programs.
|
||||
Other programs may be integrated as well.
|
||||
|
||||
The MSC and EDAC source code is contained in the same software repository.
|
||||
The PMSCO, MSC, and EDAC programs may not be used outside the PEARL group without an explicit agreement by the respective original authors.
|
||||
Users of the PMSCO code are requested to coordinate and share the development of the code with the original author.
|
||||
Please read and respect the respective license agreements.
|
||||
- [EDAC](http://garciadeabajos-group.icfo.es/widgets/edac/)
|
||||
by F. J. García de Abajo, M. A. Van Hove, and C. S. Fadley,
|
||||
[Phys. Rev. B 63 (2001) 075404](http://dx.doi.org/10.1103/PhysRevB.63.075404)
|
||||
- PHAGEN from the [MsSpec package](https://ipr.univ-rennes1.fr/msspec)
|
||||
by C. R. Natoli and D. Sébilleau,
|
||||
[Comp. Phys. Comm. 182 (2011) 2567](http://dx.doi.org/10.1016/j.cpc.2011.07.012)
|
||||
|
||||
|
||||
\section sec_intro_highlights Highlights
|
||||
|
||||
- angle or energy scanned XPD.
|
||||
- various scanning modes including energy, polar angle, azimuthal angle, analyser angle.
|
||||
- averaging over multiple symmetries (domains or emitters).
|
||||
- averaging over multiple domains and emitters.
|
||||
- global optimization of multiple scans.
|
||||
- structural optimization algorithms: particle swarm optimization, grid search, gradient search.
|
||||
- calculation of the modulation function.
|
||||
@ -42,8 +39,8 @@ To set up a new optimization project, you need to:
|
||||
- create a new directory under projects.
|
||||
- create a new Python module in this directory, e.g., my_project.py.
|
||||
- implement a sub-class of project.Project in my_project.py.
|
||||
- override the create_cluster, create_params, and create_domain methods.
|
||||
- optionally, override the combine_symmetries and combine_scans methods.
|
||||
- override the create_cluster, create_params, and create_model_space methods.
|
||||
- optionally, override the combine_domains and combine_scans methods.
|
||||
- add a global function create_project to my_project.py.
|
||||
- provide experimental data files (intensity or modulation function).
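The steps above are illustrated by the following minimal skeleton
(a sketch only; the exact signatures are defined in pmsco/project.py, and the
parameter names and values are hypothetical):

@code{.py}
# my_project.py - minimal sketch of a user project module.
# The overridden methods and the create_project function are the hooks
# described above; argument lists follow this documentation and may
# differ in detail from the current code base.
import pmsco.project as project

class MyProject(project.Project):
    def create_model_space(self):
        # declare the optimizable parameters: name, start, min, max, step
        spa = project.ModelSpace()
        spa.add_param('d', 2.0, 1.5, 2.5, 0.1)
        return spa

    def create_cluster(self, model, index):
        # build and return a pmsco.cluster.Cluster for the given model
        raise NotImplementedError

    def create_params(self, model, index):
        # fill and return a pmsco.project.CalculatorParams object
        raise NotImplementedError

def create_project():
    return MyProject()
@endcode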
|
||||
|
||||
@ -54,8 +51,25 @@ and the example projects.
|
||||
\section sec_intro_start Getting Started
|
||||
|
||||
- @ref pag_concepts
|
||||
- @ref pag_concepts_tasks
|
||||
- @ref pag_concepts_emitter
|
||||
- @ref pag_install
|
||||
- @ref pag_run
|
||||
- @ref pag_command
|
||||
|
||||
\section sec_license License Information
|
||||
|
||||
An open distribution of PMSCO is available under the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0) at <https://gitlab.psi.ch/pearl-public/pmsco>.
|
||||
|
||||
- Please read and respect the respective license agreements.
|
||||
- Please acknowledge the use of the code.
|
||||
- Please share your development of the code with the original author.
|
||||
|
||||
Due to different copyright terms, the third-party calculation programs are not contained in the public software repository.
|
||||
These programs may not be used without an explicit agreement by the respective original authors.
|
||||
|
||||
\author Matthias Muntwiler, <mailto:matthias.muntwiler@psi.ch>
|
||||
\version This documentation is compiled from version $(REVISION).
|
||||
\copyright 2015-2019 by [Paul Scherrer Institut](http://www.psi.ch)
|
||||
\copyright Licensed under the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0)
|
||||
*/
|
||||
|
193
docs/src/optimizers.dox
Normal file
193
docs/src/optimizers.dox
Normal file
@ -0,0 +1,193 @@
|
||||
/*! @page pag_opt Model optimizers
|
||||
\section sec_opt Model optimizers
|
||||
|
||||
|
||||
|
||||
\subsection sec_opt_swarm Particle swarm
|
||||
|
||||
The particle swarm algorithm is adapted from
|
||||
D. A. Duncan et al., Surface Science 606, 278 (2012).
|
||||
|
||||
The general parameters of the particle swarm algorithm are specified in the @ref Project.optimizer_params dictionary.
|
||||
Some of them can be changed on the command line.
|
||||
|
||||
| Parameter | Command line | Range | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| pop_size | --pop-size | ≥ 1 | |
|
||||
| position_constrain_mode | | default bounce | Resolution of domain limit violations. |
|
||||
| seed_file | --seed-file | a file path, default none | |
|
||||
| seed_limit | --seed-limit | 0..pop_size | |
|
||||
| rfac_limit | | 0..1, default 0.8 | Accept only seed models whose R-factor is below this limit. |
|
||||
| recalc_seed | | True or False, default True | |
|
||||
|
||||
The domain parameters have the following meanings:
|
||||
|
||||
| Parameter | Description |
|
||||
| --- | --- |
|
||||
| start | Seed model. The start values are copied into particle 0 of the initial population. |
|
||||
| min | Lower limit of the parameter range. |
|
||||
| max | Upper limit of the parameter range. |
|
||||
| step | Not used. |
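For orientation, the core update of a textbook particle swarm is sketched below.
This is an illustration only; the actual coefficients and constraint handling in
pmsco.optimizers.swarm follow Duncan et al. and may differ in detail.

@code{.py}
# textbook particle swarm update for a single parameter (illustrative only)
import random

def update_particle(pos, vel, best_own, best_global, lo, hi):
    r1, r2 = random.random(), random.random()
    vel += 2.0 * r1 * (best_own - pos) + 2.0 * r2 * (best_global - pos)
    pos += vel
    while pos < lo or pos > hi:
        # default position_constrain_mode "bounce": reflect at the boundary
        pos = 2.0 * lo - pos if pos < lo else 2.0 * hi - pos
        vel = -vel
    return pos, vel
@endcode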
|
||||
|
||||
|
||||
\subsubsection sec_opt_seed Seeding a population
|
||||
|
||||
By default, one particle is initialized with the start value declared in the parameter domain,
|
||||
and the others are set to random values within the domain.
|
||||
You may initialize more particles of the population with specific values by providing a seed file.
|
||||
|
||||
The seed file must have a similar format as the result `.dat` files
|
||||
with a header line specifying the column names and data rows containing the values for each particle.
|
||||
A good practice is to use a previous `.dat` file and remove unwanted rows.
|
||||
To continue an interrupted optimization,
|
||||
the `.dat` file from the previous optimization can be used as is.
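For illustration, a seed file for a model with one parameter `d` might look as
follows (assuming the whitespace-delimited layout of the result files; the
control columns `_gen`, `_particle` and `_rfac` are optional, and all other
column names are project-specific):

@verbatim
_gen _particle _rfac d
0 0 0.3724 2.05
0 1 0.4105 2.20
@endverbatim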
|
||||
|
||||
The seeding procedure can be tweaked by several optimizer parameters (see above).
|
||||
PMSCO normally loads the first rows up to population size - 1 or up to the `seed_limit` parameter,
|
||||
whichever is lower.
|
||||
If an `_rfac` column is present, the file is first sorted by R-factor and only the best models are loaded.
|
||||
Models that resulted in an R-factor above the `rfac_limit` parameter are always ignored.
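In pseudo-Python, the selection rules described above amount to the following
(a sketch of the documented behaviour, not the actual implementation):

@code{.py}
# sketch of the documented seeding rules (not the actual implementation)
def select_seed_rows(rows, pop_size, seed_limit, rfac_limit=0.8):
    """rows: list of dicts parsed from the seed file."""
    if rows and '_rfac' in rows[0]:
        rows = [r for r in rows if r['_rfac'] < rfac_limit]  # drop bad models
        rows.sort(key=lambda r: r['_rfac'])                  # best models first
    limit = min(seed_limit, pop_size - 1) if seed_limit else pop_size - 1
    return rows[:limit]
@endcode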
|
||||
|
||||
During the optimization process, all models loaded from the seed file are normally re-calculated.
|
||||
This may waste CPU time if the calculation is run under the same conditions
|
||||
and would result in exactly the same R-factor,
|
||||
as is the case if the seed is used to continue a previous optimization, for example.
|
||||
In these situations, the `recalc_seed` parameter can be set to False,
|
||||
and PMSCO will use the R-factor value from the seed file rather than calculating the model again.
|
||||
|
||||
|
||||
\subsubsection sec_opt_patch Patching a running optimization
|
||||
|
||||
While an optimization process is running, the user can manually patch the population with arbitrary values,
|
||||
for instance, to kick the population out of a local optimum or to drive it to a less sampled parameter region.
|
||||
To patch a running population, prepare a population file named `pmsco_patch.pop` and copy it to the work directory.
|
||||
|
||||
The file must have a similar format as the result `.dat` files
|
||||
with a header line specifying the column names and data rows containing the values.
|
||||
It should contain as many rows as particles to be patched but not more than the size of the population.
|
||||
The columns must include a `_particle` column which specifies the particle to patch
|
||||
as well as the model parameters to be changed.
|
||||
Parameters that should remain unaffected can be left out;
|
||||
extra columns such as `_gen` and `_rfac` are ignored.
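For example, a hypothetical patch file that moves particles 3 and 5 of a
one-parameter model to new positions:

@verbatim
_particle d
3 1.95
5 2.35
@endverbatim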
|
||||
|
||||
PMSCO checks the file for syntax errors and ignores it if errors are present.
|
||||
Parameter values that lie outside the domain boundary are ignored.
|
||||
Successful or failed patching is logged at warning level.
|
||||
The patch file is re-applied whenever its time stamp has changed.
|
||||
|
||||
\attention Do not edit the patch file in place in the work directory,
|
||||
otherwise it may be read in an unfinished state or multiple times.
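A safe way to deploy the patch file is to write it under a temporary name and
move it into place in one step; a minimal sketch, assuming that source and
destination are on the same POSIX file system (where rename is atomic):

@code{.py}
# write the patch file under a temporary name, then move it atomically
import os
import tempfile

def deploy_patch(work_dir, header, rows):
    fd, tmp_path = tempfile.mkstemp(dir=work_dir, suffix='.tmp')
    with os.fdopen(fd, 'w') as f:
        f.write(header + '\n')                    # e.g. "_particle d"
        for row in rows:
            f.write(' '.join(format(v) for v in row) + '\n')
    # atomic on POSIX: PMSCO never sees a half-written pmsco_patch.pop
    os.rename(tmp_path, os.path.join(work_dir, 'pmsco_patch.pop'))
@endcode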
|
||||
|
||||
|
||||
\subsection sec_opt_genetic Genetic optimization
|
||||
|
||||
The genetic algorithm evolves a population of individuals
|
||||
by a combination of inheritance, crossover, mutation
|
||||
and selection in analogy to biological evolution.
|
||||
The _genes_ are in this case the model parameters,
|
||||
and selection occurs based on R-factor.
|
||||
The genetic algorithm is adapted from
|
||||
D. A. Duncan et al., Surface Science 606, 278 (2012).
|
||||
It is implemented in the @ref pmsco.optimizers.genetic module.
|
||||
|
||||
The genetic optimization is helpful in the first stage of an optimization
|
||||
where a large parameter space needs to be sampled
|
||||
and fast convergence on a small part of the parameter space is less desirable
|
||||
as it might get trapped in a local optimum.
|
||||
On the other hand, convergence near the optimum is slower than in the particle swarm.
|
||||
The genetic optimization should be run with a large number of iterations
|
||||
rather than a large population size.
|
||||
|
||||
The general parameters of the genetic algorithm are specified in the @ref Project.optimizer_params dictionary.
|
||||
Some of them can be changed on the command line.
|
||||
|
||||
| Parameter | Command line | Range | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| pop_size | --pop-size | ≥ 1 | |
|
||||
| mating_factor | | 1..pop_size, default 4 | |
|
||||
| strong_mutation_probability | | 0..1, default 0.01 | Probability that a parameter undergoes a strong mutation. |
|
||||
| weak_mutation_probability | | 0..1, default 1 | Probability that a parameter undergoes a weak mutation. This parameter should be left at 1. Lower values tend to produce discrete parameter values. Weak mutations can be tuned by the `step` domain parameter. |
|
||||
| position_constrain_mode | | default random | Resolution of domain limit violations. |
|
||||
| seed_file | --seed-file | a file path, default none | |
|
||||
| seed_limit | --seed-limit | 0..pop_size | |
|
||||
| rfac_limit | | 0..1, default 0.8 | Accept only seed models whose R-factor is below this limit. |
|
||||
| recalc_seed | | True or False, default True | |
|
||||
|
||||
The domain parameters have the following meanings:
|
||||
|
||||
| Parameter | Description |
|
||||
| --- | --- |
|
||||
| start | Seed model. The start values are copied into particle 0 of the initial population. |
|
||||
| min | Lower limit of the parameter range. |
|
||||
| max | Upper limit of the parameter range. |
|
||||
| step | Standard deviation of the Gaussian distribution of weak mutations. The step should not be much lower than the parameter range divided by the population size and not greater than one third of the parameter range. |
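A weak mutation thus draws from a Gaussian centred on the current value, as
sketched below (illustrative only; see pmsco.optimizers.genetic for the real
code):

@code{.py}
# weak mutation of one parameter (illustrative sketch)
import random

def weak_mutation(value, step, lo, hi):
    new = value + random.gauss(0.0, step)    # step acts as standard deviation
    if new < lo or new > hi:
        new = random.uniform(lo, hi)         # constraint mode "random"
    return new
@endcode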
|
||||
|
||||
The population of the genetic optimizer can be seeded and patched in the same way as the particle swarm,
|
||||
cf. sections @ref sec_opt_seed and @ref sec_opt_swarm.
|
||||
|
||||
|
||||
\subsection sec_opt_grid Grid search
|
||||
|
||||
The grid search algorithm samples the parameter space at equidistant steps.
|
||||
The order of calculations is randomized so that distant parts of the parameter space are sampled at an early stage.
|
||||
|
||||
| Parameter | Description |
|
||||
| --- | --- |
|
||||
| start | Values of fixed parameters. |
|
||||
| min | Lower limit of the parameter range. |
|
||||
| max | Upper limit of the parameter range. If abs(max - min) < step/2, the parameter is kept constant. |
|
||||
| step | Step size (distance between two grid points). If step <= 0, the parameter is kept constant. |
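The grid construction and randomization described above can be pictured as
follows (a sketch of the documented rules, not the actual implementation in
pmsco.optimizers.grid):

@code{.py}
# build a randomized list of grid models from a model space (sketch)
import itertools
import random
import numpy as np

def grid_models(space):
    names = sorted(space.min)
    axes = []
    for name in names:
        lo, hi, step = space.min[name], space.max[name], space.step[name]
        if step <= 0 or abs(hi - lo) < step / 2:
            axes.append([space.start[name]])      # parameter kept constant
        else:
            axes.append(np.arange(lo, hi + step / 2, step))
    models = [dict(zip(names, values)) for values in itertools.product(*axes)]
    random.shuffle(models)    # sample distant parts of the space early
    return models
@endcode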
|
||||
|
||||
|
||||
\subsection sec_opt_gradient Gradient search
|
||||
|
||||
Currently not implemented.
|
||||
|
||||
\subsection sec_opt_table Table scan
|
||||
|
||||
The table scan calculates models from an explicit table of model parameters.
|
||||
It can be used to recalculate models from a previous optimization run on other experimental data,
|
||||
as an interface to external optimizers,
|
||||
or as a simple input of manually edited model parameters.
|
||||
|
||||
The table can be stored in an external file that is specified on the command line,
|
||||
or supplied in one of several forms by the custom project class.
|
||||
The table can be left unchanged during the calculations,
|
||||
or new models can be added on the go.
|
||||
|
||||
@attention It is not easily possible to know when and which models have been read from the table file. If you do modify the table file during processing, pay attention to the following hints:
|
||||
1. The file on disk must not be locked for more than a second. Do not keep the file open unnecessarily.
|
||||
2. _Append_ new models to the end of the table rather than overwriting previous ones. Otherwise, some models may be lost before they have been calculated.
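Following these hints, an external script could append new models roughly like
this (a sketch; the column layout is hypothetical and must match the header of
the existing table file):

@code{.py}
# append new models to a running table scan (sketch)
def append_models(table_path, models):
    with open(table_path, 'a') as f:          # hint 2: append, never rewrite
        for m in models:
            f.write('{d} {e}\n'.format(**m))  # hypothetical columns d and e
    # the file is closed again immediately (hint 1: do not keep it locked)
@endcode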
|
||||
|
||||
The general parameters of the table scan are specified in the @ref Project.optimizer_params dictionary.
|
||||
Some of them can be changed on the command line or in the project class (depending on how the project class is implemented).
|
||||
|
||||
| Parameter | Command line | Range | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| pop_size | --pop-size | ≥ 1 | Number of models in a generation (calculated in parallel). In table mode this parameter merely sets the batch size and is unrelated to the table size; it can normally be left at the default. |
|
||||
| table_file | --table-file | a file path, default none | |
|
||||
|
||||
The domain parameters have the following meanings.
|
||||
Models that violate the parameter range are not calculated.
|
||||
|
||||
| Parameter | Description |
|
||||
| --- | --- |
|
||||
| start | Not used. |
|
||||
| min | Lower limit of the parameter range. |
|
||||
| max | Upper limit of the parameter range. |
|
||||
| step | Not used. |
|
||||
|
||||
|
||||
\subsection sec_opt_single Single model
|
||||
|
||||
The single model optimizer calculates the model defined by domain.start.
|
||||
|
||||
| Parameter | Description |
|
||||
| --- | --- |
|
||||
| start | Values of model parameters. |
|
||||
| min | Not used. |
|
||||
| max | Not used. |
|
||||
| step | Not used. |
|
||||
|
||||
*/
|
||||
|
@ -38,15 +38,15 @@ custom_scan [label="scan\nconfiguration", shape=note];
|
||||
{rank=same; custom_scan; create_scan; combine_scan;}
|
||||
custom_scan -> create_scan [lhead=cluster_scan];
|
||||
|
||||
subgraph cluster_symmetry {
|
||||
label="symmetry handler";
|
||||
subgraph cluster_domain {
|
||||
label="domain handler";
|
||||
rank=same;
|
||||
create_symmetry [label="define\nsymmetry\ntasks"];
|
||||
combine_symmetry [label="gather\nsymmetry\nresults"];
|
||||
create_model_space [label="define\ndomain\ntasks"];
|
||||
combine_domain [label="gather\ndomain\nresults"];
|
||||
}
|
||||
custom_symmetry [label="symmetry\ndefinition", shape=cds];
|
||||
{rank=same; create_symmetry; combine_symmetry; custom_symmetry;}
|
||||
custom_symmetry -> combine_symmetry [lhead=cluster_symmetry];
|
||||
custom_domain [label="domain\ndefinition", shape=cds];
|
||||
{rank=same; create_model_space; combine_domain; custom_domain;}
|
||||
custom_domain -> combine_domain [lhead=cluster_domain];
|
||||
|
||||
subgraph cluster_emitter {
|
||||
label="emitter handler";
|
||||
@ -80,11 +80,11 @@ create_cluster -> edac;
|
||||
create_model -> create_scan [label="level 1 tasks"];
|
||||
evaluate_model -> combine_scan [label="level 1 results", dir=back];
|
||||
|
||||
create_scan -> create_symmetry [label="level 2 tasks"];
|
||||
combine_scan -> combine_symmetry [label="level 2 results", dir=back];
|
||||
create_scan -> create_model_space [label="level 2 tasks"];
|
||||
combine_scan -> combine_domain [label="level 2 results", dir=back];
|
||||
|
||||
create_symmetry -> create_emitter [label="level 3 tasks"];
|
||||
combine_symmetry -> combine_emitter [label="level 3 results", dir=back];
|
||||
create_model_space -> create_emitter [label="level 3 tasks"];
|
||||
combine_domain -> combine_emitter [label="level 3 results", dir=back];
|
||||
|
||||
create_emitter -> create_region [label="level 4 tasks"];
|
||||
combine_emitter -> combine_region [label="level 4 results", dir=back];
|
||||
|
38
docs/src/uml/CalculationTask-class.puml
Normal file
38
docs/src/uml/CalculationTask-class.puml
Normal file
@ -0,0 +1,38 @@
|
||||
@startuml
|
||||
|
||||
class CalculationTask {
|
||||
id : CalcID
|
||||
parent : CalcID
|
||||
model : dict
|
||||
file_root : str
|
||||
file_ext : str
|
||||
result_filename : str
|
||||
modf_filename : str
|
||||
result_valid : bool
|
||||
time : datetime.timedelta
|
||||
files : dict
|
||||
region : dict
|
||||
__init__()
|
||||
__eq__()
|
||||
__hash__()
|
||||
copy()
|
||||
change_id()
|
||||
format_filename()
|
||||
get_mpi_message()
|
||||
set_mpi_message()
|
||||
add_task_file()
|
||||
rename_task_file()
|
||||
remove_task_file()
|
||||
}
|
||||
|
||||
class CalcID {
|
||||
model
|
||||
scan
|
||||
domain
|
||||
emit
|
||||
region
|
||||
}
|
||||
|
||||
CalculationTask *-- CalcID
|
||||
|
||||
@enduml
|
133
docs/src/uml/CalculationTask-objects.puml
Normal file
133
docs/src/uml/CalculationTask-objects.puml
Normal file
@ -0,0 +1,133 @@
|
||||
@startuml
|
||||
|
||||
object Root {
|
||||
id = -1, -1, -1, -1, -1
|
||||
parent = -1, -1, -1, -1, -1
|
||||
model = {}
|
||||
}
|
||||
|
||||
Root o.. Model1
|
||||
Root o.. Model2
|
||||
|
||||
object Model1 {
|
||||
id = 1, -1, -1, -1, -1
|
||||
parent = -1, -1, -1, -1, -1
|
||||
model = {'d': 5}
|
||||
}
|
||||
|
||||
object Model2 {
|
||||
id = 2, -1, -1, -1, -1
|
||||
parent = -1, -1, -1, -1, -1
|
||||
model = {'d': 7}
|
||||
}
|
||||
|
||||
Model1 o.. Scan11
|
||||
Model1 o.. Scan12
|
||||
Model2 o.. Scan21
|
||||
|
||||
object Scan11 {
|
||||
id = 1, 1, -1, -1, -1
|
||||
parent = 1, -1, -1, -1, -1
|
||||
model = {'d': 5}
|
||||
}
|
||||
|
||||
object Scan12 {
|
||||
id = 1, 2, -1, -1, -1
|
||||
parent = 1, -1, -1, -1, -1
|
||||
model = {'d': 5}
|
||||
}
|
||||
|
||||
object Scan21 {
|
||||
id = 2, 1, -1, -1, -1
|
||||
parent = 2, -1, -1, -1, -1
|
||||
model = {'d': 7}
|
||||
}
|
||||
|
||||
Scan11 o.. Dom111
|
||||
|
||||
object Dom111 {
|
||||
id = 1, 1, 1, -1, -1
|
||||
parent = 1, 1, -1, -1, -1
|
||||
model = {'d': 5}
|
||||
}
|
||||
|
||||
Dom111 o.. Emitter1111
|
||||
|
||||
object Emitter1111 {
|
||||
id = 1, 1, 1, 1, -1
|
||||
parent = 1, 1, 1, -1, -1
|
||||
model = {'d': 5}
|
||||
}
|
||||
|
||||
Emitter1111 o.. Region11111
|
||||
|
||||
object Region11111 {
|
||||
id = 1, 1, 1, 1, 1
|
||||
parent = 1, 1, 1, 1, -1
|
||||
model = {'d': 5}
|
||||
}
|
||||
|
||||
|
||||
@enduml
|
||||
|
||||
@startuml
|
||||
|
||||
object "Root: CalculationTask" as Root {
|
||||
}
|
||||
note right: all attributes undefined
|
||||
|
||||
object "Model: CalculationTask" as Model {
|
||||
model
|
||||
}
|
||||
note right: model is defined\nother attributes undefined
|
||||
|
||||
object ModelHandler
|
||||
|
||||
object "Scan: CalculationTask" as Scan {
|
||||
model
|
||||
scan
|
||||
}
|
||||
|
||||
object ScanHandler
|
||||
|
||||
object "Domain: CalculationTask" as Domain {
|
||||
model
|
||||
scan
|
||||
domain
|
||||
}
|
||||
|
||||
object "DomainHandler" as DomainHandler
|
||||
|
||||
object "Emitter: CalculationTask" as Emitter {
|
||||
model
|
||||
scan
|
||||
domain
|
||||
emitter
|
||||
}
|
||||
|
||||
object EmitterHandler
|
||||
|
||||
object "Region: CalculationTask" as Region {
|
||||
model
|
||||
scan
|
||||
domain
|
||||
emitter
|
||||
region
|
||||
}
|
||||
note right: all attributes well-defined
|
||||
|
||||
object RegionHandler
|
||||
|
||||
Root "1" o.. "1..*" Model
|
||||
Model "1" o.. "1..*" Scan
|
||||
Scan "1" o.. "1..*" Domain
|
||||
Domain "1" o.. "1..*" Emitter
|
||||
Emitter "1" o.. "1..*" Region
|
||||
|
||||
(Root, Model) .. ModelHandler
|
||||
(Model, Scan) .. ScanHandler
|
||||
(Scan, Domain) .. DomainHandler
|
||||
(Domain, Emitter) .. EmitterHandler
|
||||
(Emitter, Region) .. RegionHandler
|
||||
|
||||
@enduml
|
90
docs/src/uml/calculation-task.puml
Normal file
90
docs/src/uml/calculation-task.puml
Normal file
@ -0,0 +1,90 @@
|
||||
@startuml
|
||||
|
||||
|
||||
class CalculationTask {
|
||||
model
|
||||
scan
|
||||
domain
|
||||
emitter
|
||||
region
|
||||
..
|
||||
files
|
||||
}
|
||||
|
||||
class Model {
|
||||
index
|
||||
..
|
||||
dlat
|
||||
dAS
|
||||
dS1S2
|
||||
V0
|
||||
Zsurf
|
||||
Texp
|
||||
rmax
|
||||
}
|
||||
|
||||
class Scan {
|
||||
index
|
||||
..
|
||||
filename
|
||||
mode
|
||||
initial_state
|
||||
energies
|
||||
thetas
|
||||
phis
|
||||
alphas
|
||||
}
|
||||
|
||||
class Domain {
|
||||
index
|
||||
..
|
||||
rotation
|
||||
registry
|
||||
}
|
||||
|
||||
class Emitter {
|
||||
index
|
||||
|
||||
}
|
||||
|
||||
class Region {
|
||||
index
|
||||
..
|
||||
range
|
||||
}
|
||||
|
||||
CalculationTask *-- Model
|
||||
CalculationTask *-- Scan
|
||||
CalculationTask *-- Domain
|
||||
CalculationTask *-- Emitter
|
||||
CalculationTask *-- Region
|
||||
|
||||
class Project {
|
||||
scans
|
||||
domains
|
||||
model_handler
|
||||
cluster_generator
|
||||
}
|
||||
|
||||
class ClusterGenerator {
|
||||
count_emitters()
|
||||
create_cluster()
|
||||
}
|
||||
|
||||
class ModelHandler {
|
||||
create_tasks()
|
||||
add_result()
|
||||
}
|
||||
|
||||
Model ..> ModelHandler
|
||||
Scan ..> Project
|
||||
Domain ..> Project
|
||||
Emitter ..> ClusterGenerator
|
||||
Region ..> Project
|
||||
|
||||
Project *-left- ModelHandler
|
||||
Project *- ClusterGenerator
|
||||
|
||||
hide empty members
|
||||
|
||||
@enduml
|
47
docs/src/uml/cluster-generator.puml
Normal file
47
docs/src/uml/cluster-generator.puml
Normal file
@ -0,0 +1,47 @@
|
||||
@startuml
|
||||
|
||||
package pmsco {
|
||||
class Project {
|
||||
cluster_generator
|
||||
export_cluster()
|
||||
}
|
||||
|
||||
abstract class ClusterGenerator {
|
||||
project
|
||||
{abstract} count_emitters()
|
||||
{abstract} create_cluster()
|
||||
}
|
||||
|
||||
class LegacyClusterGenerator {
|
||||
project
|
||||
count_emitters()
|
||||
create_cluster()
|
||||
}
|
||||
}
|
||||
|
||||
package "user project" {
|
||||
class UserClusterGenerator {
|
||||
project
|
||||
count_emitters()
|
||||
create_cluster()
|
||||
}
|
||||
|
||||
note bottom : for complex cluster
|
||||
|
||||
class UserProject {
|
||||
count_emitters()
|
||||
create_cluster()
|
||||
}
|
||||
|
||||
note bottom : for simple cluster
|
||||
|
||||
}
|
||||
|
||||
Project <|-- UserProject
|
||||
ClusterGenerator <|-- LegacyClusterGenerator
|
||||
ClusterGenerator <|-- UserClusterGenerator
|
||||
Project *-- ClusterGenerator
|
||||
UserProject .> LegacyClusterGenerator
|
||||
UserProject .> UserClusterGenerator
|
||||
|
||||
@enduml
|
90
docs/src/uml/database.puml
Normal file
90
docs/src/uml/database.puml
Normal file
@ -0,0 +1,90 @@
|
||||
@startuml
|
||||
|
||||
|
||||
class Project << (T,orchid) >> {
|
||||
id
|
||||
..
|
||||
..
|
||||
name
|
||||
code
|
||||
}
|
||||
|
||||
class Job << (T,orchid) >> {
|
||||
id
|
||||
..
|
||||
project_id
|
||||
..
|
||||
name
|
||||
mode
|
||||
machine
|
||||
git_hash
|
||||
datetime
|
||||
description
|
||||
}
|
||||
|
||||
class Tag << (T,orchid) >> {
|
||||
id
|
||||
..
|
||||
..
|
||||
key
|
||||
}
|
||||
|
||||
class JobTag << (T,orchid) >> {
|
||||
id
|
||||
..
|
||||
tag_id
|
||||
job_id
|
||||
..
|
||||
value
|
||||
}
|
||||
|
||||
class Model << (T,orchid) >> {
|
||||
id
|
||||
..
|
||||
job_id
|
||||
..
|
||||
model
|
||||
gen
|
||||
particle
|
||||
}
|
||||
|
||||
class Result << (T,orchid) >> {
|
||||
id
|
||||
..
|
||||
model_id
|
||||
..
|
||||
scan
|
||||
domain
|
||||
emit
|
||||
region
|
||||
rfac
|
||||
}
|
||||
|
||||
class Param << (T,orchid) >> {
|
||||
id
|
||||
..
|
||||
..
|
||||
key
|
||||
}
|
||||
|
||||
class ParamValue << (T,orchid) >> {
|
||||
id
|
||||
..
|
||||
param_id
|
||||
model_id
|
||||
..
|
||||
value
|
||||
}
|
||||
|
||||
Project "1" *-- "*" Job
|
||||
Job "1" *-- "*" JobTag
|
||||
Tag "1" *-- "*" JobTag
|
||||
Job "1" *-- "*" Model
|
||||
Param "1" *-- "*" ParamValue
|
||||
Model "1" *-- "*" ParamValue
|
||||
Model "1" *-- "*" Result
|
||||
|
||||
hide empty members
|
||||
|
||||
|
||||
@enduml
|
45
docs/src/uml/handler-activity.puml
Normal file
45
docs/src/uml/handler-activity.puml
Normal file
@ -0,0 +1,45 @@
|
||||
@startuml
|
||||
|
||||
start
|
||||
|
||||
repeat
|
||||
:define model tasks;
|
||||
|
||||
:gather model results;
|
||||
repeat while
|
||||
|
||||
stop
|
||||
|
||||
@enduml
|
||||
|
||||
@startuml
|
||||
|
||||
start
|
||||
|
||||
repeat
|
||||
partition "generate tasks" {
|
||||
:define model tasks;
|
||||
:define scan tasks;
|
||||
:define domain tasks;
|
||||
:define emitter tasks;
|
||||
:define region tasks;
|
||||
}
|
||||
fork
|
||||
:calculate task 1;
|
||||
fork again
|
||||
:calculate task 2;
|
||||
fork again
|
||||
:calculate task N;
|
||||
end fork
|
||||
partition "collect results" {
|
||||
:gather region results;
|
||||
:gather emitter results;
|
||||
:gather domain results;
|
||||
:gather scan results;
|
||||
:gather model results;
|
||||
}
|
||||
repeat while
|
||||
|
||||
stop
|
||||
|
||||
@enduml
|
24
docs/src/uml/master-slave-messages.puml
Normal file
24
docs/src/uml/master-slave-messages.puml
Normal file
@ -0,0 +1,24 @@
|
||||
@startuml{master-slave-messages.png}
|
||||
== task execution ==
|
||||
loop calculation tasks
|
||||
hnote over Master : define task
|
||||
|
||||
Master -> Slave: TAG_NEW_TASK
|
||||
activate Slave
|
||||
hnote over Slave : calculation
|
||||
alt successful
|
||||
Slave --> Master: TAG_NEW_RESULT
|
||||
else calculation failed
|
||||
Slave --> Master: TAG_INVALID_RESULT
|
||||
else critical error
|
||||
Slave --> Master: TAG_ERROR_ABORTING
|
||||
end
|
||||
deactivate Slave
|
||||
|
||||
hnote over Master : collect results
|
||||
end
|
||||
...
|
||||
== termination ==
|
||||
Master -> Slave: TAG_FINISH
|
||||
destroy Slave
|
||||
@enduml
|
31
docs/src/uml/minimum-project-classes.puml
Normal file
31
docs/src/uml/minimum-project-classes.puml
Normal file
@ -0,0 +1,31 @@
|
||||
@startuml
|
||||
|
||||
package pmsco {
|
||||
abstract class Project {
|
||||
mode
|
||||
code
|
||||
scans
|
||||
domains
|
||||
{abstract} create_cluster()
|
||||
{abstract} create_params()
|
||||
{abstract} create_model_space()
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
package projects {
|
||||
class UserProject {
|
||||
__init__()
|
||||
create_cluster()
|
||||
create_params()
|
||||
create_model_space()
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
Project <|-- UserProject
|
||||
|
||||
hide empty members
|
||||
|
||||
|
||||
@enduml
|
66
docs/src/uml/mpi-processes.puml
Normal file
66
docs/src/uml/mpi-processes.puml
Normal file
@ -0,0 +1,66 @@
|
||||
@startuml
|
||||
participant rank0 as "rank 0 (master)"
|
||||
participant rank1 as "rank 1 (slave)"
|
||||
participant rank2 as "rank 2 (slave)"
|
||||
participant rankN as "rank N (slave)"
|
||||
|
||||
== initialization ==
|
||||
rank0 ->> rank0
|
||||
activate rank0
|
||||
|
||||
rnote over rank0: initialize project
|
||||
|
||||
== task loop ==
|
||||
|
||||
rnote over rank0: specify tasks
|
||||
rank0 ->> rank1: task 1
|
||||
activate rank1
|
||||
rnote over rank1: execute task 1
|
||||
rank0 ->> rank2: task 2
|
||||
activate rank2
|
||||
rnote over rank2: execute task 2
|
||||
rank0 ->> rankN: task N
|
||||
deactivate rank0
|
||||
activate rankN
|
||||
rnote over rankN: execute task N
|
||||
rank0 <<-- rank1: result 1
|
||||
deactivate rank1
|
||||
rnote over rank0: process results\nspecify tasks
|
||||
activate rank0
|
||||
rank0 ->> rank1: task N+1
|
||||
deactivate rank0
|
||||
activate rank1
|
||||
rnote over rank1: execute task N+1
|
||||
rank0 <<-- rank2: result 2
|
||||
deactivate rank2
|
||||
activate rank0
|
||||
rank0 ->> rank2: task N+2
|
||||
deactivate rank0
|
||||
activate rank2
|
||||
rnote over rank2: execute task N+2
|
||||
rank0 <<-- rankN: result N
|
||||
deactivate rankN
|
||||
activate rank0
|
||||
rank0 ->> rankN: task 2N
|
||||
deactivate rank0
|
||||
activate rankN
|
||||
rnote over rankN: execute task 2N
|
||||
rank0 <<-- rank1: result N+1
|
||||
deactivate rank1
|
||||
rank0 <<-- rank2: result N+2
|
||||
deactivate rank2
|
||||
rank0 <<-- rankN: result 2N
|
||||
deactivate rankN
|
||||
rnote over rank0: process results
|
||||
activate rank0
|
||||
hnote over rank0: calculations complete
|
||||
== termination ==
|
||||
rnote over rank0: report results
|
||||
rank0 ->> rank1: finish
|
||||
destroy rank1
|
||||
rank0 ->> rank2: finish
|
||||
destroy rank2
|
||||
rank0 ->> rankN: finish
|
||||
destroy rankN
|
||||
deactivate rank0
|
||||
@enduml
|
59
docs/src/uml/project-classes.puml
Normal file
59
docs/src/uml/project-classes.puml
Normal file
@ -0,0 +1,59 @@
|
||||
@startuml
|
||||
|
||||
abstract class Project {
|
||||
mode : str = "single"
|
||||
code : str = "edac"
|
||||
scans : Scan [1..*]
|
||||
domains : dict [1..*]
|
||||
cluster_generator : ClusterGenerator
|
||||
handler_classes
|
||||
files : FileTracker
|
||||
{abstract} create_cluster() : Cluster
|
||||
{abstract} create_params() : CalculatorParams
|
||||
{abstract} create_model_space() : ModelSpace
|
||||
}
|
||||
|
||||
class Scan {
|
||||
filename
|
||||
raw_data
|
||||
dtype
|
||||
modulation
|
||||
mode
|
||||
emitter
|
||||
initial_state
|
||||
energies
|
||||
thetas
|
||||
phis
|
||||
alphas
|
||||
import_scan_file()
|
||||
}
|
||||
|
||||
class ModelSpace {
|
||||
start : dict
|
||||
min : dict
|
||||
max : dict
|
||||
step : dict
|
||||
add_param(name, start, min, max, step)
|
||||
get_param(name)
|
||||
}
|
||||
|
||||
class CalculatorParams {
|
||||
title
|
||||
comment
|
||||
cluster_file
|
||||
output_file
|
||||
scan_file
|
||||
initial_state
|
||||
polarization
|
||||
angular_resolution
|
||||
z_surface
|
||||
inner_potential
|
||||
work_function
|
||||
polar_incidence_angle
|
||||
azimuthal_incidence_angle
|
||||
experiment_temperature
|
||||
}
|
||||
|
||||
Project "1" *-- "1..*" Scan
|
||||
|
||||
@enduml
|
26
docs/src/uml/scan-tasks-activity.puml
Normal file
26
docs/src/uml/scan-tasks-activity.puml
Normal file
@ -0,0 +1,26 @@
|
||||
@startuml
|
||||
:model task|
|
||||
fork
|
||||
partition "scan 0" {
|
||||
:define scan;
|
||||
:scan 0 task|
|
||||
detach
|
||||
:scan 0 result|
|
||||
}
|
||||
fork again
|
||||
partition "scan 1" {
|
||||
:define scan;
|
||||
:scan 1 task|
|
||||
detach
|
||||
:scan 1 result|
|
||||
}
|
||||
fork again
|
||||
partition "scan N" {
|
||||
:define scan;
|
||||
:scan N task|
|
||||
detach
|
||||
:scan N result|
|
||||
}
|
||||
end fork
|
||||
:model result|
|
||||
@enduml
|
42
docs/src/uml/top-activity-partitions.puml
Normal file
42
docs/src/uml/top-activity-partitions.puml
Normal file
@ -0,0 +1,42 @@
|
||||
@startuml
|
||||
|
||||
|user|
|
||||
start
|
||||
:setup;
|
||||
|pmsco|
|
||||
:initialize;
|
||||
:import experimental data;
|
||||
repeat
|
||||
:define task;
|
||||
|calculator|
|
||||
:calculate\ntask;
|
||||
|pmsco|
|
||||
:evaluate results;
|
||||
repeat while
|
||||
-> [finished];
|
||||
:report results;
|
||||
|
||||
|
||||
stop
|
||||
|
||||
@enduml
|
||||
|
||||
@startuml
|
||||
|
||||
|pmsco|
|
||||
start
|
||||
:define task (model, scan, domain, emitter, region);
|
||||
|project|
|
||||
:create cluster;
|
||||
:create parameters;
|
||||
|calculator|
|
||||
:scattering calculation;
|
||||
|pmsco|
|
||||
:combine results;
|
||||
|project|
|
||||
:calculate modulation function;
|
||||
:calculate R-factor;
|
||||
stop
|
||||
|
||||
@enduml
|
||||
|
21
docs/src/uml/top-activity.puml
Normal file
21
docs/src/uml/top-activity.puml
Normal file
@ -0,0 +1,21 @@
|
||||
@startuml
|
||||
|
||||
start
|
||||
:initialize;
|
||||
:import experimental data|
|
||||
repeat
|
||||
:define tasks;
|
||||
fork
|
||||
:calculate\ntask 1;
|
||||
fork again
|
||||
:calculate\ntask N;
|
||||
end fork
|
||||
:evaluate results;
|
||||
repeat while
|
||||
-> [finished];
|
||||
:report results|
|
||||
|
||||
|
||||
stop
|
||||
|
||||
@enduml
|
23
docs/src/uml/top-components.puml
Normal file
23
docs/src/uml/top-components.puml
Normal file
@ -0,0 +1,23 @@
|
||||
@startuml
|
||||
|
||||
skinparam componentStyle uml2
|
||||
|
||||
component "project" as project
|
||||
component "PMSCO" as pmsco
|
||||
component "scattering code\n(calculator)" as calculator
|
||||
|
||||
interface "command line" as cli
|
||||
interface "input files" as input
|
||||
interface "output files" as output
|
||||
interface "experimental data" as data
|
||||
interface "results" as results
|
||||
|
||||
data -> project
|
||||
project ..> pmsco
|
||||
pmsco ..> calculator
|
||||
cli --> project
|
||||
input -> calculator
|
||||
calculator -> output
|
||||
pmsco -> results
|
||||
|
||||
@enduml
|
55
docs/src/uml/user-project-classes.puml
Normal file
55
docs/src/uml/user-project-classes.puml
Normal file
@ -0,0 +1,55 @@
|
||||
@startuml
|
||||
|
||||
package pmsco {
|
||||
abstract class Project {
|
||||
mode
|
||||
code
|
||||
scans
|
||||
domains
|
||||
cluster_generator
|
||||
handler_classes
|
||||
__
|
||||
{abstract} create_cluster()
|
||||
{abstract} create_params()
|
||||
{abstract} create_model_space()
|
||||
..
|
||||
combine_scans()
|
||||
combine_domains()
|
||||
combine_emitters()
|
||||
calc_modulation()
|
||||
calc_rfactor()
|
||||
}
|
||||
|
||||
abstract class ClusterGenerator {
|
||||
{abstract} count_emitters()
|
||||
{abstract} create_cluster()
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
package projects {
|
||||
class UserProject {
|
||||
scan_dict
|
||||
__
|
||||
setup()
|
||||
..
|
||||
create_params()
|
||||
create_model_space()
|
||||
..
|
||||
combine_domains()
|
||||
}
|
||||
|
||||
class UserClusterGenerator {
|
||||
count_emitters()
|
||||
create_cluster()
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
Project <|-- UserProject
|
||||
Project *-- ClusterGenerator
|
||||
ClusterGenerator <|-- UserClusterGenerator
|
||||
|
||||
hide empty members
|
||||
|
||||
@enduml
|
117
extras/singularity/singularity_python2
Normal file
117
extras/singularity/singularity_python2
Normal file
@ -0,0 +1,117 @@
|
||||
BootStrap: debootstrap
|
||||
OSVersion: bionic
|
||||
MirrorURL: http://ch.archive.ubuntu.com/ubuntu/
|
||||
|
||||
%help
|
||||
a singularity container for PMSCO.
|
||||
|
||||
git clone requires an ssh key for git.psi.ch.
|
||||
try agent forwarding (-A option to ssh).
|
||||
|
||||
#%setup
|
||||
# executed on the host system outside of the container before %post
|
||||
#
|
||||
# this will be inside the container
|
||||
# touch ${SINGULARITY_ROOTFS}/tacos.txt
|
||||
# this will be on the host
|
||||
# touch avocados.txt
|
||||
|
||||
#%files
|
||||
# files are copied before %post
|
||||
#
|
||||
# this copies to root
|
||||
# avocados.txt
|
||||
# this copies to /opt
|
||||
# avocados.txt /opt
|
||||
#
|
||||
# this does not work
|
||||
# ~/.ssh/known_hosts /etc/ssh/ssh_known_hosts
|
||||
# ~/.ssh/id_rsa /etc/ssh/id_rsa
|
||||
|
||||
%labels
|
||||
Maintainer Matthias Muntwiler
|
||||
Maintainer_Email matthias.muntwiler@psi.ch
|
||||
Python_Version 2.7
|
||||
|
||||
%environment
|
||||
export PATH="/usr/local/miniconda3/bin:$PATH"
|
||||
export PYTHON_VERSION=2.7
|
||||
export SINGULAR_BRANCH="singular"
|
||||
export LC_ALL=C
|
||||
|
||||
%post
|
||||
export PYTHON_VERSION=2.7
|
||||
export LC_ALL=C
|
||||
|
||||
sed -i 's/$/ universe/' /etc/apt/sources.list
|
||||
apt-get update
|
||||
apt-get -y install \
|
||||
binutils \
|
||||
build-essential \
|
||||
doxygen \
|
||||
doxypy \
|
||||
f2c \
|
||||
g++ \
|
||||
gcc \
|
||||
gfortran \
|
||||
git \
|
||||
graphviz \
|
||||
libblas-dev \
|
||||
liblapack-dev \
|
||||
libopenmpi-dev \
|
||||
make \
|
||||
nano \
|
||||
openmpi-bin \
|
||||
openmpi-common \
|
||||
sqlite3 \
|
||||
wget
|
||||
apt-get clean
|
||||
|
||||
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
|
||||
bash ~/miniconda.sh -b -p /usr/local/miniconda3
|
||||
export PATH="/usr/local/miniconda3/bin:$PATH"
|
||||
|
||||
conda create -q --yes -n pmsco python=${PYTHON_VERSION}
|
||||
. /usr/local/miniconda3/bin/activate pmsco
|
||||
conda install -q --yes -n pmsco \
|
||||
pip \
|
||||
"numpy>=1.13" \
|
||||
scipy \
|
||||
ipython \
|
||||
matplotlib \
|
||||
nose \
|
||||
mock \
|
||||
future \
|
||||
statsmodels \
|
||||
swig
|
||||
conda clean --all -y
|
||||
/usr/local/miniconda3/envs/pmsco/bin/pip install periodictable attrdict fasteners mpi4py
|
||||
|
||||
|
||||
#%test
|
||||
# test the image after build
|
||||
|
||||
%runscript
|
||||
# executes command from command line
|
||||
. /usr/local/miniconda3/bin/activate pmsco
|
||||
exec echo "$@"
|
||||
|
||||
%apprun install
|
||||
. /usr/local/miniconda3/bin/activate pmsco
|
||||
cd ~
|
||||
git clone https://git.psi.ch/pearl/pmsco.git pmsco
|
||||
cd pmsco
|
||||
git checkout develop
|
||||
git checkout -b ${SINGULAR_BRANCH}
|
||||
|
||||
make all
|
||||
nosetests
|
||||
|
||||
%apprun python
|
||||
. /usr/local/miniconda3/bin/activate pmsco
|
||||
exec python "${@}"
|
||||
|
||||
%apprun conda
|
||||
. /usr/local/miniconda3/bin/activate pmsco
|
||||
exec conda "${@}"
|
||||
|
116
extras/singularity/singularity_python3
Normal file
116
extras/singularity/singularity_python3
Normal file
@ -0,0 +1,116 @@
|
||||
BootStrap: debootstrap
|
||||
OSVersion: bionic
|
||||
MirrorURL: http://ch.archive.ubuntu.com/ubuntu/
|
||||
|
||||
%help
|
||||
a singularity container for PMSCO.
|
||||
|
||||
git clone requires an ssh key for git.psi.ch.
|
||||
try agent forwarding (-A option to ssh).
|
||||
|
||||
#%setup
|
||||
# executed on the host system outside of the container before %post
|
||||
#
|
||||
# this will be inside the container
|
||||
# touch ${SINGULARITY_ROOTFS}/tacos.txt
|
||||
# this will be on the host
|
||||
# touch avocados.txt
|
||||
|
||||
#%files
|
||||
# files are copied before %post
|
||||
#
|
||||
# this copies to root
|
||||
# avocados.txt
|
||||
# this copies to /opt
|
||||
# avocados.txt /opt
|
||||
#
|
||||
# this does not work
|
||||
# ~/.ssh/known_hosts /etc/ssh/ssh_known_hosts
|
||||
# ~/.ssh/id_rsa /etc/ssh/id_rsa
|
||||
|
||||
%labels
|
||||
Maintainer Matthias Muntwiler
|
||||
Maintainer_Email matthias.muntwiler@psi.ch
|
||||
Python_Version 3
|
||||
|
||||
%environment
|
||||
export PATH="/usr/local/miniconda3/bin:$PATH"
|
||||
export PYTHON_VERSION=3
|
||||
export SINGULAR_BRANCH="singular"
|
||||
export LC_ALL=C
|
||||
|
||||
%post
|
||||
export PYTHON_VERSION=3
|
||||
export LC_ALL=C
|
||||
|
||||
sed -i 's/$/ universe/' /etc/apt/sources.list
|
||||
apt-get update
|
||||
apt-get -y install \
|
||||
binutils \
|
||||
build-essential \
|
||||
doxygen \
|
||||
doxypy \
|
||||
f2c \
|
||||
g++ \
|
||||
gcc \
|
||||
gfortran \
|
||||
git \
|
||||
graphviz \
|
||||
libblas-dev \
|
||||
liblapack-dev \
|
||||
libopenmpi-dev \
|
||||
make \
|
||||
openmpi-bin \
|
||||
openmpi-common \
|
||||
sqlite3 \
|
||||
wget
|
||||
apt-get clean
|
||||
|
||||
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
|
||||
bash ~/miniconda.sh -b -p /usr/local/miniconda3
|
||||
export PATH="/usr/local/miniconda3/bin:$PATH"
|
||||
|
||||
conda create -q --yes -n pmsco python=${PYTHON_VERSION}
|
||||
. /usr/local/miniconda3/bin/activate pmsco
|
||||
conda install -q --yes -n pmsco \
|
||||
pip \
|
||||
"numpy>=1.13" \
|
||||
scipy \
|
||||
ipython \
|
||||
matplotlib \
|
||||
nose \
|
||||
mock \
|
||||
future \
|
||||
statsmodels \
|
||||
swig
|
||||
conda clean --all -y
|
||||
/usr/local/miniconda3/envs/pmsco/bin/pip install periodictable attrdict fasteners mpi4py
|
||||
|
||||
|
||||
#%test
|
||||
# test the image after build
|
||||
|
||||
%runscript
|
||||
# executes command from command line
|
||||
source /usr/local/miniconda3/bin/activate pmsco
|
||||
exec echo "$@"
|
||||
|
||||
%apprun install
|
||||
source /usr/local/miniconda3/bin/activate pmsco
|
||||
cd ~
|
||||
git clone https://git.psi.ch/pearl/pmsco.git pmsco
|
||||
cd pmsco
|
||||
git checkout develop
|
||||
git checkout -b ${SINGULAR_BRANCH}
|
||||
|
||||
make all
|
||||
nosetests
|
||||
|
||||
%apprun python
|
||||
source /usr/local/miniconda3/bin/activate pmsco
|
||||
exec python "${@}"
|
||||
|
||||
%apprun conda
|
||||
source /usr/local/miniconda3/bin/activate pmsco
|
||||
exec conda "${@}"
|
||||
|
76
extras/vagrant/Vagrantfile
vendored
Normal file
76
extras/vagrant/Vagrantfile
vendored
Normal file
@ -0,0 +1,76 @@
|
||||
# -*- mode: ruby -*-
|
||||
# vi: set ft=ruby :
|
||||
|
||||
# All Vagrant configuration is done below. The "2" in Vagrant.configure
|
||||
# configures the configuration version (we support older styles for
|
||||
# backwards compatibility). Please don't change it unless you know what
|
||||
# you're doing.
|
||||
Vagrant.configure("2") do |config|
|
||||
# The most common configuration options are documented and commented below.
|
||||
# For a complete reference, please see the online documentation at
|
||||
# https://docs.vagrantup.com.
|
||||
|
||||
# Every Vagrant development environment requires a box. You can search for
|
||||
# boxes at https://vagrantcloud.com/search.
|
||||
config.vm.box = "singularityware/singularity-2.4"
|
||||
config.vm.box_version = "2.4"
|
||||
|
||||
# Disable automatic box update checking. If you disable this, then
|
||||
# boxes will only be checked for updates when the user runs
|
||||
# `vagrant box outdated`. This is not recommended.
|
||||
# config.vm.box_check_update = false
|
||||
|
||||
# Create a forwarded port mapping which allows access to a specific port
|
||||
# within the machine from a port on the host machine. In the example below,
|
||||
# accessing "localhost:8080" will access port 80 on the guest machine.
|
||||
# NOTE: This will enable public access to the opened port
|
||||
# config.vm.network "forwarded_port", guest: 80, host: 8080
|
||||
|
||||
# Create a forwarded port mapping which allows access to a specific port
|
||||
# within the machine from a port on the host machine and only allow access
|
||||
# via 127.0.0.1 to disable public access
|
||||
# config.vm.network "forwarded_port", guest: 80, host: 8080, host_ip: "127.0.0.1"
|
||||
|
||||
# Create a private network, which allows host-only access to the machine
|
||||
# using a specific IP.
|
||||
# config.vm.network "private_network", ip: "192.168.33.10"
|
||||
|
||||
# Create a public network, which generally matched to bridged network.
|
||||
# Bridged networks make the machine appear as another physical device on
|
||||
# your network.
|
||||
# config.vm.network "public_network"
|
||||
|
||||
# Share an additional folder to the guest VM. The first argument is
|
||||
# the path on the host to the actual folder. The second argument is
|
||||
# the path on the guest to mount the folder. And the optional third
|
||||
# argument is a set of non-required options.
|
||||
# config.vm.synced_folder "../data", "/vagrant_data"
|
||||
|
||||
# Provider-specific configuration so you can fine-tune various
|
||||
# backing providers for Vagrant. These expose provider-specific options.
|
||||
# Example for VirtualBox:
|
||||
#
|
||||
config.vm.provider "virtualbox" do |vb|
|
||||
# Display the VirtualBox GUI when booting the machine
|
||||
# vb.gui = true
|
||||
|
||||
# Customize the amount of memory on the VM:
|
||||
# Increase this number if you plan to run parallel processes.
|
||||
vb.memory = "2048"
|
||||
# Customize the number of CPUs:
|
||||
# Increase as necessary but not more than physically available on the host.
|
||||
vb.cpus = 2
|
||||
end
|
||||
#
|
||||
# View the documentation for the provider you are using for more
|
||||
# information on available options.
|
||||
|
||||
# Enable provisioning with a shell script. Additional provisioners such as
|
||||
# Puppet, Chef, Ansible, Salt, and Docker are also available. Please see the
|
||||
# documentation for more information about their specific syntax and use.
|
||||
# config.vm.provision "shell", inline: <<-SHELL
|
||||
# apt-get update
|
||||
# apt-get install -y apache2
|
||||
# SHELL
|
||||
end
|
||||
|
27
makefile
27
makefile
@ -15,17 +15,36 @@ SHELL=/bin/sh
|
||||
#
|
||||
# the MSC and MUFPOT programs are currently not used.
|
||||
# they are not built by the top-level targets all and bin.
|
||||
#
|
||||
# the make system uses the compiler executables of the current environment.
|
||||
# to override the executables, you may set the following variables.
|
||||
# to switch between python versions, however, the developers recommend miniconda.
|
||||
#
|
||||
# PYTHON = python executable (default: python)
|
||||
# PYTHONOPTS = python options (default: none)
|
||||
# CC = C and Fortran compiler executable (default: gcc)
|
||||
# CCOPTS = C compiler options (default: none)
|
||||
# CXX = C++ compiler executable (default: g++)
|
||||
# CXXOPTS = C++ compiler options (default: none)
|
||||
#
|
||||
# make all PYTHON=/usr/bin/python2.7
|
||||
#
|
||||
# or:
|
||||
#
|
||||
# export PYTHON=/usr/bin/python2.7
|
||||
# make all
|
||||
#
|
||||
|
||||
.PHONY: all bin docs clean edac loess msc mufpot
|
||||
.PHONY: all bin docs clean edac loess msc mufpot phagen
|
||||
|
||||
PMSCO_DIR = pmsco
|
||||
DOCS_DIR = docs
|
||||
|
||||
all: edac loess docs
|
||||
all: edac loess phagen docs
|
||||
|
||||
bin: edac loess
|
||||
bin: edac loess phagen
|
||||
|
||||
edac loess msc mufpot:
|
||||
edac loess msc mufpot phagen:
|
||||
$(MAKE) -C $(PMSCO_DIR)
|
||||
|
||||
docs:
|
||||
|
@ -8,10 +8,17 @@ python pmsco [pmsco-arguments]
|
||||
@endverbatim
|
||||
"""
|
||||
|
||||
import pmsco
|
||||
from __future__ import absolute_import
|
||||
from __future__ import division
|
||||
from __future__ import print_function
|
||||
import sys
|
||||
import os.path
|
||||
|
||||
file_dir = os.path.dirname(__file__) or '.'
|
||||
root_dir = os.path.join(file_dir, '..')
|
||||
root_dir = os.path.abspath(root_dir)
|
||||
sys.path[0] = root_dir
|
||||
|
||||
if __name__ == '__main__':
|
||||
args, unknown_args = pmsco.parse_cli()
|
||||
pmsco.main_pmsco(args, unknown_args)
|
||||
sys.exit(0)
|
||||
import pmsco.pmsco
|
||||
pmsco.pmsco.main()
|
||||
|
0
pmsco/calculators/__init__.py
Normal file
0
pmsco/calculators/__init__.py
Normal file
@ -1,5 +1,5 @@
|
||||
"""
|
||||
@package pmsco.calculator
|
||||
@package pmsco.calculators.calculator
|
||||
abstract scattering program interface.
|
||||
|
||||
this module declares the basic interface to scattering programs.
|
||||
@ -11,17 +11,21 @@ TestCalcInterface is provided for testing the PMSCO code quickly without calling
|
||||
|
||||
@author Matthias Muntwiler
|
||||
|
||||
@copyright (c) 2015 by Paul Scherrer Institut @n
|
||||
@copyright (c) 2015-19 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
from __future__ import absolute_import
|
||||
from __future__ import division
|
||||
from __future__ import print_function
|
||||
|
||||
import time
|
||||
import numpy as np
|
||||
import data as md
|
||||
import cluster as mc
|
||||
|
||||
import pmsco.data as md
|
||||
|
||||
__author__ = 'matthias muntwiler'
|
||||
|
||||
@ -38,56 +42,37 @@ class Calculator(object):
|
||||
or <code>output_file + '.etpai'</code> depending on scan mode.
|
||||
all other intermediate files are deleted unless keep_temp_files is True.
|
||||
|
||||
@param params: a msco_project.Params() object with all necessary values except cluster and output files set.
|
||||
@param params: a pmsco.project.CalculatorParams object with all necessary values except cluster and output files set.
|
||||
|
||||
@param cluster: a msco_cluster.Cluster() object with all atom positions set.
|
||||
@param cluster: a pmsco.cluster.Cluster(format=FMT_EDAC) object with all atom positions set.
|
||||
|
||||
@param scan: a msco_project.Scan() object describing the experimental scanning scheme.
|
||||
@param scan: a pmsco.project.Scan() object describing the experimental scanning scheme.
|
||||
|
||||
@param output_file: base name for all intermediate and output files
|
||||
|
||||
@return: result_file, files_cats
|
||||
@arg result_file is the name of the main ETPI or ETPAI result file to be further processed.
|
||||
@arg files_cats is a dictionary that lists the names of all created data files with their category.
|
||||
@return: (str, dict) result_file, and dictionary of created files {filename: category}
|
||||
|
||||
@return: (str, dict) result_file, and dictionary of created files.
|
||||
@arg the first element is the name of the main ETPI or ETPAI result file to be further processed.
|
||||
@arg the second element is a dictionary that lists the names of all created data files with their category.
|
||||
the dictionary key is the file name,
|
||||
the value is the file category (cluster, phase, etc.).
|
||||
the value is the file category (cluster, atomic, etc.).
|
||||
"""
|
||||
return None, None
|
||||
|
||||
def check_cluster(self, cluster, output_file):
|
||||
"""
|
||||
export the cluster in XYZ format for reference.
|
||||
|
||||
along with the complete cluster, the method also saves cuts in the xz (extension .y.xyz) and yz (.x.xyz) plane.
|
||||
class AtomicCalculator(Calculator):
|
||||
"""
|
||||
abstract interface class to the atomic scattering calculation program.
|
||||
"""
|
||||
pass
|
||||
|
||||
@param cluster: a pmsco.cluster.Cluster() object with all atom positions set.
|
||||
|
||||
@param output_file: base name for all intermediate and output files
|
||||
|
||||
@return: dictionary listing the names of the created files with their category.
|
||||
the dictionary key is the file name,
|
||||
the value is the file category (cluster).
|
||||
|
||||
@warning experimental: this method may be moved elsewhere in a future version.
|
||||
"""
|
||||
xyz_filename = output_file + ".xyz"
|
||||
cluster.save_to_file(xyz_filename, fmt=mc.FMT_XYZ)
|
||||
files = {xyz_filename: 'cluster'}
|
||||
|
||||
clucut = mc.Cluster()
|
||||
clucut.copy_from(cluster)
|
||||
clucut.trim_slab("x", 0.0, 0.1)
|
||||
xyz_filename = output_file + ".x.xyz"
|
||||
clucut.save_to_file(xyz_filename, fmt=mc.FMT_XYZ)
|
||||
files[xyz_filename] = 'cluster'
|
||||
|
||||
clucut.copy_from(cluster)
|
||||
clucut.trim_slab("y", 0.0, 0.1)
|
||||
xyz_filename = output_file + ".y.xyz"
|
||||
clucut.save_to_file(xyz_filename, fmt=mc.FMT_XYZ)
|
||||
files[xyz_filename] = 'cluster'
|
||||
|
||||
return files
|
||||
class InternalAtomicCalculator(AtomicCalculator):
|
||||
"""
|
||||
dummy atomic scattering class if scattering factors are calculated internally by the multiple scattering calculator.
|
||||
"""
|
||||
pass
|
||||
|
||||
|
||||
class TestCalculator(Calculator):
|
||||
@ -127,5 +112,5 @@ class TestCalculator(Calculator):
|
||||
|
||||
md.save_data(etpi_filename, result_etpi)
|
||||
|
||||
files = {clu_filename: 'cluster', etpi_filename: 'energy'}
|
||||
files = {clu_filename: 'cluster', etpi_filename: 'region'}
|
||||
return etpi_filename, files
|
@ -1,25 +1,30 @@
|
||||
"""
|
||||
@package pmsco.edac_calculator
|
||||
@package pmsco.calculators.edac
|
||||
Garcia de Abajo EDAC program interface.
|
||||
|
||||
@author Matthias Muntwiler
|
||||
|
||||
@copyright (c) 2015 by Paul Scherrer Institut @n
|
||||
@copyright (c) 2015-18 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
from __future__ import absolute_import
|
||||
from __future__ import division
|
||||
import os
|
||||
from __future__ import print_function
|
||||
|
||||
import logging
|
||||
import math
|
||||
import numpy as np
|
||||
import calculator
|
||||
import data as md
|
||||
import cluster as mc
|
||||
import edac.edac as edac
|
||||
import os
|
||||
|
||||
import pmsco.calculators.calculator as calculator
|
||||
from pmsco.compat import open
|
||||
import pmsco.data as md
|
||||
import pmsco.cluster as mc
|
||||
import pmsco.edac.edac as edac
|
||||
from pmsco.helpers import BraceMessage as BMsg
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@ -44,18 +49,22 @@ class EdacCalculator(calculator.Calculator):
|
||||
|
||||
if alpha is defined, theta is implicitly set to normal emission! (to be generalized)
|
||||
|
||||
TODO: some parameters are still hard-coded.
|
||||
@param params: a pmsco.project.CalculatorParams object with all necessary values except cluster and output files set.
|
||||
|
||||
@param scan: a pmsco.project.Scan() object describing the experimental scanning scheme.
|
||||
|
||||
@param filepath: (str) name and path of the file to be created.
|
||||
|
||||
@return dictionary of created files {filename: category}
|
||||
"""
|
||||
files = {}
|
||||
|
||||
with open(filepath, "w") as f:
|
||||
f.write("verbose off\n")
|
||||
f.write("cluster input %s\n" % (params.cluster_file))
|
||||
f.write("emitters %u l(A)\n" % (len(params.emitters)))
|
||||
f.write("cluster input {0}\n".format(params.cluster_file))
|
||||
f.write("emitters {0:d} l(A)\n".format(len(params.emitters)))
|
||||
for em in params.emitters:
|
||||
f.write("%g %g %g %u\n" % em)
|
||||
#for iat in range(params.atom_types):
|
||||
#pf = params.phase_file[iat]
|
||||
#pf = pf.replace(".pha", ".edac.pha")
|
||||
#f.write("scatterer %u %s\n" % (params.atomic_number[iat], pf))
|
||||
f.write("{0:f} {1:f} {2:f} {3:d}\n".format(*em))
|
||||
|
||||
en = scan.energies + params.work_function
|
||||
en_min = en.min()
|
||||
@ -106,7 +115,7 @@ class EdacCalculator(calculator.Calculator):
|
||||
assert th_num < th.shape[0] * 10, \
|
||||
"linearization of theta scan causes excessive oversampling {0}/{1}".format(th_num, th.shape[0])
|
||||
|
||||
f.write("beta {0}\n".format(params.polar_incidence_angle, params.azimuthal_incidence_angle))
f.write("beta {0}\n".format(params.polar_incidence_angle))
f.write("incidence {0} {1}\n".format(params.polar_incidence_angle, params.azimuthal_incidence_angle))
f.write("emission angle theta {th0:f} {th1:f} {nth:d}\n".format(th0=th_min, th1=th_max, nth=th_num))

@ -136,35 +145,67 @@ class EdacCalculator(calculator.Calculator):
f.write("initial state {0}\n".format(params.initial_state))
polarizations = {'H': 'LPx', 'V': 'LPy', 'L': 'LCP', 'R': 'RCP'}
f.write("polarization {0}\n".format(polarizations[params.polarization]))
f.write("muffin-tin\n")
f.write("V0 E(eV) {0}\n".format(params.inner_potential))
f.write("cluster surface l(A) {0}\n".format(params.z_surface))

scatterers = ["scatterer {at} {fi}\n".format(at=at, fi=fi)
for (at, fi) in params.phase_files.items()
if os.path.isfile(fi)]
rme = ["rmat {fi}\n".format(fi=fi)
for (at, fi) in params.rme_files.items()
if at == params.emitters[0][3] and os.path.isfile(fi)] or \
["rmat inline 1 regular1 {l0} {pv} {pd} {mv} {md}\n".format(l0=params.l_init,
pv=params.rme_plus_value, pd=params.rme_plus_shift,
mv=params.rme_minus_value, md=params.rme_minus_shift)]
if scatterers and rme:
for scat in scatterers:
f.write(scat)
f.write(rme[0])
else:
f.write("muffin-tin\n")

f.write("V0 E(eV) {0:f}\n".format(params.inner_potential))
f.write("cluster surface l(A) {0:f}\n".format(params.z_surface))
f.write("imfp SD-UC\n")
f.write("temperature %g %g\n" % (params.experiment_temperature, params.debye_temperature))
f.write("temperature {0:f} {1:f}\n".format(params.experiment_temperature, params.debye_temperature))
f.write("iteration recursion\n")
f.write("dmax l(A) %g\n" % (params.dmax))
f.write("lmax %u\n" % (params.lmax))
f.write("orders %u " % (len(params.orders)))
for order in params.orders:
f.write("%u " % (order))
f.write("\n")
f.write("emission angle window 1\n")
f.write("scan pd %s\n" % (params.output_file))
f.write("dmax l(A) {0:f}\n".format(params.dmax))
f.write("lmax {0:d}\n".format(params.lmax))
f.write("orders {0:d} ".format(len(params.orders)))
f.write(" ".join(format(order, "d") for order in params.orders) + "\n")
f.write("emission angle window {0:F}\n".format(params.angular_resolution / 2.0))

# scattering factor output (see project.CalculatorParams.phase_output_classes)
if params.phase_output_classes is not None:
fn = "{0}.clu".format(params.output_file)
f.write("cluster output l(A) {fn}\n".format(fn=fn))
files[fn] = "output"
try:
cls = (cl for cl in params.phase_output_classes)
except TypeError:
cls = range(params.phase_output_classes)
for cl in cls:
fn = "{of}.{cl}.scat".format(cl=cl, of=params.output_file)
f.write("scan scatterer {cl} phase-shifts {fn}\n".format(cl=cl, fn=fn))
files[fn] = "output"

f.write("scan pd {0}\n".format(params.output_file))
files[params.output_file] = "output"
f.write("end\n")

return files

def run(self, params, cluster, scan, output_file):
"""
run EDAC with the given parameters and cluster.

@param params: a msc_param.Params() object with all necessary values except cluster and output files set.
@param params: a pmsco.project.CalculatorParams object with all necessary values except cluster and output files set.

@param cluster: a msc_cluster.Cluster(format=FMT_EDAC) object with all atom positions set.
@param cluster: a pmsco.cluster.Cluster(format=FMT_EDAC) object with all atom positions set.

@param scan: a msco_project.Scan() object describing the experimental scanning scheme.
@param scan: a pmsco.project.Scan() object describing the experimental scanning scheme.

@param output_file: base name for all intermediate and output files

@return: result_file, files_cats
@return: (str, dict) result_file, and dictionary of created files {filename: category}
"""

# set up scan
@ -185,13 +226,13 @@ class EdacCalculator(calculator.Calculator):
params.cluster_file = clu_filename
params.output_file = out_filename
params.data_file = dat_filename
params.emitters = cluster.get_emitters()
params.emitters = cluster.get_emitters(['x', 'y', 'z', 'c'])

# save parameter files
logger.debug("writing cluster file %s", clu_filename)
cluster.save_to_file(clu_filename, fmt=mc.FMT_EDAC)
logger.debug("writing input file %s", par_filename)
self.write_input_file(params, scan, par_filename)
files = self.write_input_file(params, scan, par_filename)

# run EDAC
logger.info("calling EDAC with input file %s", par_filename)
@ -213,11 +254,20 @@ class EdacCalculator(calculator.Calculator):
pass
result_etpi = md.interpolate_hemi_scan(result_etpi, hemi_tpi)

if result_etpi.shape[0] != scan.raw_data.shape[0]:
logger.error("scan length mismatch: EDAC result: %u, scan data: %u", result_etpi.shape[0], scan.raw_data.shape[0])
if params.fixed_cluster:
expected_shape = max(scan.energies.shape[0], 1) * max(scan.alphas.shape[0], 1)
else:
expected_shape = max(scan.energies.shape[0], 1) * max(scan.phis.shape[0], scan.thetas.shape[0], 1)
if result_etpi.shape[0] != expected_shape:
logger.warning(BMsg("possible scan length mismatch: EDAC result: {result}, expected: {expected}",
result=result_etpi.shape[0], expected=expected_shape))

logger.debug("save result to file %s", etpi_filename)
md.save_data(etpi_filename, result_etpi)

files = {clu_filename: 'input', par_filename: 'input', dat_filename: 'output',
etpi_filename: 'region'}
files[clu_filename] = 'input'
files[par_filename] = 'input'
files[dat_filename] = 'output'
files[etpi_filename] = 'region'

return etpi_filename, files
@ -13,9 +13,12 @@ Licensed under the Apache License, Version 2.0 (the "License"); @n
http://www.apache.org/licenses/LICENSE-2.0
"""

import calculator
import data as md
import msc.msc as msc
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import pmsco.calculators.calculator as calculator
import pmsco.data as md
import pmsco.msc.msc as msc
import logging

logger = logging.getLogger(__name__)
@ -32,7 +35,7 @@ class MscCalculator(calculator.Calculator):
f.write(" %s\n" % (params.polarization) )
f.write(" %4u\n" % (params.scattering_level) )
f.write(" %7.2f%7.2f\n" % (params.fcut, params.cut) )
f.write(" %12.6f\n" % (params.angular_broadening) )
f.write(" %12.6f\n" % (params.angular_resolution) )
f.write(" %12.6f\n" % (params.lattice_constant) )
f.write(" %12.6f\n" % (params.z_surface) )
f.write(" %4u\n" % (params.atom_types) )
@ -59,7 +62,7 @@ class MscCalculator(calculator.Calculator):
"""
run the MSC program with the given parameters and cluster.

@param params: a project.Params() object with all necessary values except cluster and output files set.
@param params: a project.CalculatorParams() object with all necessary values except cluster and output files set.

@param cluster: a cluster.Cluster(format=FMT_MSC) object with all atom positions set.
0
pmsco/calculators/phagen/__init__.py
Normal file
43
pmsco/calculators/phagen/makefile
Normal file
@ -0,0 +1,43 @@
SHELL=/bin/sh

# makefile for PHAGEN program and module
#
# the PHAGEN source code is not included in the public distribution.
# please obtain the PHAGEN code from the original author,
# and copy it to this directory before compilation.
#
# see the top-level makefile for additional information.

.SUFFIXES:
.SUFFIXES: .c .cpp .cxx .exe .f .h .i .o .py .pyf .so
.PHONY: all clean phagen

FC?=gfortran
F2PY?=f2py
F2PYOPTS?=
CC?=gcc
CCOPTS?=
SWIG?=swig
SWIGOPTS?=
PYTHON?=python
PYTHONOPTS?=
PYTHONINC?=
PYTHON_CONFIG = ${PYTHON}-config
PYTHON_CFLAGS ?= $(shell ${PYTHON_CONFIG} --cflags)
PYTHON_EXT_SUFFIX ?= $(shell ${PYTHON_CONFIG} --extension-suffix)

all: phagen

phagen: phagen.exe phagen$(PYTHON_EXT_SUFFIX)

phagen.exe: phagen_scf.f msxas3.inc msxasc3.inc
$(FC) $(FCOPTS) -o phagen.exe phagen_scf.f

phagen.pyf: | phagen_scf.f
$(F2PY) -h phagen.pyf -m phagen phagen_scf.f only: libmain

phagen$(PYTHON_EXT_SUFFIX): phagen_scf.f phagen.pyf msxas3.inc msxasc3.inc
$(F2PY) -c $(F2PYOPTS) -m phagen phagen.pyf phagen_scf.f

clean:
rm -f *.so *.o *.exe
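The f2py targets above build a standalone phagen.exe and a Python extension module that exposes the libmain entry point (added by the Fortran patch below). A minimal sketch of how the built module is used, assuming it was compiled in place; the file names are examples and mirror the runner module further below:

from pmsco.calculators.phagen.phagen import libmain

# libmain reads the parameter input file and writes output files
# with endings .list, .clu, .pha, .tl and .rad next to the base name
libmain("phagen.in", "phagen.out")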
102
pmsco/calculators/phagen/phagen_scf.f.patch
Normal file
@ -0,0 +1,102 @@
--- phagen_scf.orig.f	2019-06-05 16:45:52.977855859 +0200
+++ phagen_scf.f	2019-05-09 16:32:35.790286429 +0200
@@ -174,6 +174,99 @@
 1100 format(//,1x,' ** phagen terminated normally ** ',//)
 end

+
+c-----------------------------------------------------------------------
+ subroutine libmain(infile,outfile,etcfile)
+c main calculation routine
+c entry point for external callers
+c
+c infile: name of parameter input file
+c
+c outfile: base name of output files
+c output files with endings .list, .clu, .pha, .tl, .rad
+c will be created
+c-----------------------------------------------------------------------
+ implicit real*8 (a-h,o-z)
+c
+ include 'msxas3.inc'
+ include 'msxasc3.inc'
+
+ character*60 infile,outfile,etcfile
+ character*70 listfile,clufile,tlfile,radfile,phafile
+
+c
+c.. constants
+ antoau = 0.52917715d0
+ pi = 3.141592653589793d0
+ ev = 13.6058d0
+ zero = 0.d0
+c.. threshold for linearity
+ thresh = 1.d-4
+c.. fortran io units
+ idat = 5
+ iwr = 6
+ iphas = 30
+ iedl0 = 31
+ iwf = 32
+ iof = 17
+
+ iii=LnBlnk(outfile)+1
+ listfile=outfile
+ listfile(iii:)='.list'
+ clufile=outfile
+ clufile(iii:)='.clu'
+ phafile=outfile
+ phafile(iii:)='.pha'
+ tlfile=outfile
+ tlfile(iii:)='.tl'
+ radfile=outfile
+ radfile(iii:)='.rad'
+
+ open(idat,file=infile,form='formatted',status='old')
+ open(iwr,file=listfile,form='formatted',status='unknown')
+ open(10,file=clufile,form='formatted',status='unknown')
+ open(35,file=tlfile,form='formatted',status='unknown')
+ open(55,file=radfile,form='formatted',status='unknown')
+ open(iphas,file=phafile,form='formatted',status='unknown')
+
+ open(iedl0,form='unformatted',status='scratch')
+ open(iof,form='unformatted',status='scratch')
+ open(unit=21,form='unformatted',status='scratch')
+ open(60,form='formatted',status='scratch')
+ open(50,form='formatted',status='scratch')
+ open(unit=13,form='formatted',status='scratch')
+ open(unit=14,form='formatted',status='scratch')
+ open(unit=11,status='scratch')
+ open(unit=iwf,status='scratch')
+ open(unit=33,status='scratch')
+ open(unit=66,status='scratch')
+
+ call inctrl
+ call intit(iof)
+ call incoor
+ call calphas
+
+ close(idat)
+ close(iwr)
+ close(10)
+ close(35)
+ close(55)
+ close(iphas)
+ close(iedl0)
+ close(iof)
+ close(60)
+ close(50)
+ close(13)
+ close(14)
+ close(11)
+ close(iwf)
+ close(33)
+ close(66)
+ close(21)
+
+ endsubroutine
+
+
 subroutine inctrl
 implicit real*8 (a-h,o-z)
 include 'msxas3.inc'
163
pmsco/calculators/phagen/runner.py
Normal file
@ -0,0 +1,163 @@
"""
@package pmsco.calculators.phagen.runner
Natoli/Sebilleau PHAGEN interface

this module runs the PHAGEN program to calculate scattering factors and radial matrix element.

@author Matthias Muntwiler

@copyright (c) 2015-19 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import logging
import os
import shutil
import tempfile

from pmsco.calculators.calculator import AtomicCalculator
from pmsco.calculators.phagen.phagen import libmain
from pmsco.calculators.phagen.translator import Translator
import pmsco.cluster

logger = logging.getLogger(__name__)


class PhagenCalculator(AtomicCalculator):
"""
use the PHAGEN program to calculate scattering factors and radial matrix element.

this produces scatterer, radial matrix element and cluster files for EDAC.
"""

def run(self, params, cluster, scan, output_file):
"""
create the input file, run PHAGEN, and translate the output to EDAC format.

the following files are created in the job work directory:
- scattering factor files in EDAC format.
their names are `output_file + "_{atomclass}.scat"`.
- radial matrix element file in EDAC format.
its name is `output_file + ".rme"`.
- cluster file in PMSCO format.
its name is `output_file + ".clu"`.

the cluster and params objects are updated and linked to the scattering files
so that they can be passed to EDAC without further modification.
the radial matrix element is currently not used.

note that the scattering files are numbered according to the atomic environment and not chemical element.
this means that the updated cluster (cluster object or ".clu" file)
must be used in the scattering calculation.
atomic index is not preserved - atoms in the input and output clusters can only be related by coordinate!

because PHAGEN generates a lot of files with hard-coded names,
the function creates a temporary directory for PHAGEN and deletes it before returning.

@param params: pmsco.project.CalculatorParams object.
the phase_files attribute is updated with the paths of the scattering files.

@param cluster: pmsco.cluster.Cluster object.
the cluster is updated with the one returned from PHAGEN.
the atom classes are linked to the scattering files.

@param scan: pmsco.project.Scan object.
the scan object is used to determine the kinetic energy range.

@param output_file: base path and name of the output files.

@return (None, dict) where dict is a list of output files with their category.
the category is "atomic" for all output files.
"""
assert cluster.get_emitter_count() == 1, "PHAGEN cannot handle more than one emitter at a time"

transl = Translator()
transl.params.set_params(params)
transl.params.set_cluster(cluster)
transl.params.set_scan(scan)
phagen_cluster = pmsco.cluster.Cluster()

files = {}
prev_wd = os.getcwd()
try:
with tempfile.TemporaryDirectory() as temp_dir:
os.chdir(temp_dir)
os.mkdir("div")
os.mkdir("div/wf")
os.mkdir("plot")
os.mkdir("data")

# prepare input for phagen
infile = "phagen.in"
outfile = "phagen.out"

try:
transl.write_input(infile)
report_infile = os.path.join(prev_wd, output_file + ".phagen.in")
shutil.copy(infile, report_infile)
files[report_infile] = "input"
except IOError:
logger.warning("error writing phagen input file {fi}.".format(fi=infile))

# call phagen
libmain(infile, outfile)

# collect results
try:
phafile = outfile + ".pha"
transl.parse_phagen_phase(phafile)
report_phafile = os.path.join(prev_wd, output_file + ".phagen.pha")
shutil.copy(phafile, report_phafile)
files[report_phafile] = "output"
except IOError:
logger.error("error loading phagen phase file {fi}".format(fi=phafile))

try:
radfile = outfile + ".rad"
transl.parse_radial_file(radfile)
report_radfile = os.path.join(prev_wd, output_file + ".phagen.rad")
shutil.copy(radfile, report_radfile)
files[report_radfile] = "output"
except IOError:
logger.error("error loading phagen radial file {fi}".format(fi=radfile))

try:
clufile = outfile + ".clu"
phagen_cluster.load_from_file(clufile, pmsco.cluster.FMT_PHAGEN_OUT)
except IOError:
logger.error("error loading phagen cluster file {fi}".format(fi=clufile))

try:
listfile = outfile + ".list"
report_listfile = os.path.join(prev_wd, output_file + ".phagen.list")
shutil.copy(listfile, report_listfile)
files[report_listfile] = "log"
except IOError:
logger.error("error loading phagen list file {fi}".format(fi=listfile))

finally:
os.chdir(prev_wd)

# write edac files
scatfile = output_file + "_{}.scat"
scatfiles = transl.write_edac_scattering(scatfile)
params.phase_files = {c: scatfiles[c] for c in scatfiles}
files.update({scatfiles[c]: "atomic" for c in scatfiles})

rmefile = output_file + ".rme"
transl.write_edac_emission(rmefile)
files[rmefile] = "atomic"

cluster.update_atoms(phagen_cluster, {'c'})
clufile = output_file + ".pmsco.clu"
cluster.save_to_file(clufile, pmsco.cluster.FMT_PMSCO)
files[clufile] = "cluster"

return None, files
476
pmsco/calculators/phagen/translator.py
Normal file
@ -0,0 +1,476 @@
"""
@package pmsco.calculators.phagen.translator
Natoli/Sebilleau PHAGEN interface

this module provides conversion between input/output files of PHAGEN and EDAC.

@author Matthias Muntwiler

@copyright (c) 2015-19 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np

from pmsco.cluster import Cluster
from pmsco.compat import open

## rydberg energy in electron volts
ERYDBERG = 13.6056923


def state_to_edge(state):
"""
translate spectroscopic notation to edge notation.

@param state: spectroscopic notation: "1s", "2s", "2p1/2", etc.
@return: edge notation: "k", "l1", "l2", etc.
note: if the j-value is not given, the lower j edge is returned.
"""
jshells = ['s', 'p1/2', 'p3/2', 'd3/2', 'd5/2', 'f5/2', 'f7/2']
lshells = [s[0] for s in jshells]
shell = int(state[0])
try:
subshell = jshells.index(state[1:]) + 1
except ValueError:
subshell = lshells.index(state[1]) + 1
except IndexError:
subshell = 1
edge = "klmnop"[shell-1]
if shell > 1:
edge += str(subshell)
return edge
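A few concrete mappings of state_to_edge, following its docstring:

assert state_to_edge("1s") == "k"
assert state_to_edge("2p1/2") == "l2"
assert state_to_edge("2p") == "l2"      # j unspecified: lower j edge
assert state_to_edge("3d5/2") == "m5"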
class TranslationParams(object):
"""
project parameters needed for translation.

energy unit is eV.
"""
def __init__(self):
self.initial_state = "1s"
self.binding_energy = 0.
self.cluster = None
self.kinetic_energies = np.empty(0, dtype=np.float)

@property
def l_init(self):
return "spdf".index(self.initial_state[1])

@property
def edge(self):
return state_to_edge(self.initial_state)

def set_params(self, params):
"""
set the translation parameters.

@param params: a pmsco.project.CalculatorParams object or
a dictionary containing some or all public fields of this class.
@return: None
"""
try:
self.initial_state = params.initial_state
self.binding_energy = params.binding_energy
except AttributeError:
for key in params:
self.__setattr__(key, params[key])

def set_scan(self, scan):
"""
set the scan parameters.

@param scan: a pmsco.project.Scan object
@return: None
"""
try:
energies = scan.energies
except AttributeError:
try:
energies = scan['e']
except KeyError:
energies = scan
if not isinstance(energies, np.ndarray):
energies = np.array(energies)
self.kinetic_energies = np.resize(self.kinetic_energies, energies.shape)
self.kinetic_energies = energies

def set_cluster(self, cluster):
"""
set the initial cluster.

@param cluster: a pmsco.cluster.Cluster object
@return: None
"""
self.cluster = cluster


class Translator(object):
"""
data conversion to/from phagen input/output files.

usage:
1. set the translation parameters self.params.
2. call write_input to create the phagen input files.
3. call phagen on the input file.
4. call parse_phagen_phase.
5. call parse_radial_file.
6. call write_edac_scattering to produce the EDAC scattering matrix files.
7. call write_edac_emission to produce the EDAC emission matrix file.
"""

## @var params
#
# project parameters needed for translation.
#
# fill the attributes of this object before using any translator methods.

## @var scattering
#
# t-matrix storage
#
# the t-matrix is stored in a flat, one-dimensional numpy structured array consisting of the following fields:
# @arg e (float) energy (eV)
# @arg a (int) atom index (1-based)
# @arg l (int) angular momentum quantum number l
# @arg t (complex) scattering matrix element, t = exp(-i * delta) * sin delta
#
# @note PHAGEN uses the convention t = exp(-i * delta) * sin delta,
# whereas EDAC uses t = exp(i * delta) * sin delta (complex conjugate).
# this object stores the t-matrix according to the PHAGEN convention.
# the conversion to the EDAC convention occurs in write_edac_scattering_file().
## @var emission
#
# radial matrix element storage
#
# the radial matrix elements are stored in a flat, one-dimensional numpy structured array
# consisting of the following fields:
# @arg e (float) energy (eV)
# @arg dw (complex) matrix element for the transition to l-1
# @arg up (complex) matrix element for the transition to l+1
## @var cluster
#
# cluster object for PHAGEN
#
# this object is created by translate_cluster().

def __init__(self):
"""
initialize the object instance.
"""
self.params = TranslationParams()
dt = [('e', 'f4'), ('a', 'i4'), ('l', 'i4'), ('t', 'c16')]
self.scattering = np.empty(0, dtype=dt)
dt = [('e', 'f4'), ('dw', 'c16'), ('up', 'c16')]
self.emission = np.empty(0, dtype=dt)
self.cluster = None

def translate_cluster(self):
"""
translate the cluster into a form suitable for PHAGEN.

specifically, move the (first and hopefully only) emitter to the first atom position.

the method copies the cluster from self.params into a new object
and stores it under self.cluster.

@return: None
"""
self.cluster = Cluster()
self.cluster.copy_from(self.params.cluster)
ems = self.cluster.get_emitters(['i'])
self.cluster.move_to_first(idx=ems[0][0]-1)

def write_cluster(self, f):
"""
write the cluster section of the PHAGEN input file.

@param f: file or output stream (an object with a write method)

@return: None
"""
for atom in self.cluster.data:
d = {k: atom[k] for k in atom.dtype.names}
f.write("{s} {t} {x} {y} {z}\n".format(**d))
f.write("-1 -1 0. 0. 0.\n")

def write_ionicity(self, f):
"""
write the ionicity section of the PHAGEN input file.

ionicity is read from the 'q' column of the cluster.
all atoms of a chemical element must have the same charge state
because ionicity has to be specified per element.
this function writes the average of all charge states of an element.

@param f: file or output stream (an object with a write method)

@return: None
"""
data = self.cluster.data
elements = np.unique(data['t'])
for element in elements:
idx = np.where(data['t'] == element)
charge = np.mean(data['q'][idx])
f.write("{t} {q}\n".format(t=element, q=charge))

f.write("-1\n")
def write_input(self, f):
"""
write the PHAGEN input file.

@param f: file path or output stream (an object with a write method).

@return: None
"""
phagen_params = {}

self.translate_cluster()
phagen_params['absorber'] = 1
phagen_params['emin'] = self.params.kinetic_energies.min() / ERYDBERG
phagen_params['emax'] = self.params.kinetic_energies.max() / ERYDBERG
if self.params.kinetic_energies.shape[0] > 1:
phagen_params['delta'] = (phagen_params['emax'] - phagen_params['emin']) / \
(self.params.kinetic_energies.shape[0] - 1)
else:
phagen_params['delta'] = 0.1
phagen_params['edge'] = state_to_edge(self.params.initial_state)
phagen_params['edge1'] = 'm4'  # auger not supported
phagen_params['edge2'] = 'm4'  # auger not supported
phagen_params['cip'] = self.params.binding_energy / ERYDBERG
if phagen_params['cip'] < 0.001:
raise ValueError("binding energy parameter is zero.")

if np.sum(np.abs(self.cluster.data['q'])) > 0.:
phagen_params['ionzst'] = 'ionic'
else:
phagen_params['ionzst'] = 'neutral'

if hasattr(f, "write") and callable(f.write):
f.write("&job\n")
f.write("calctype='xpd',\n")
f.write("coor='angs',\n")
f.write("cip={cip},\n".format(**phagen_params))
f.write("absorber={absorber},\n".format(**phagen_params))
f.write("edge='{edge}',\n".format(**phagen_params))
f.write("edge1='{edge1}',\n".format(**phagen_params))
f.write("edge2='{edge1}',\n".format(**phagen_params))
f.write("gamma=0.03,\n")
f.write("lmax_mode=2,\n")
f.write("lmaxt=50,\n")
f.write("emin={emin},\n".format(**phagen_params))
f.write("emax={emax},\n".format(**phagen_params))
f.write("delta={delta},\n".format(**phagen_params))
f.write("potgen='in',\n")
f.write("potype='hedin',\n")
f.write("norman='stdcrm',\n")
f.write("ovlpfac=0.0,\n")
f.write("ionzst='{ionzst}',\n".format(**phagen_params))
f.write("charelx='ex',\n")
f.write("l2h=4\n")
f.write("&end\n")
f.write("comment 1\n")
f.write("comment 2\n")
f.write("\n")

self.write_cluster(f)
self.write_ionicity(f)
else:
with open(f, "w") as fi:
self.write_input(fi)
def parse_phagen_phase(self, f):
"""
parse the phase output file from PHAGEN.

the phase file is written to div/phases.dat.
it contains the following columns:

@arg e energy (Ry)
@arg x1 unknown 1
@arg x2 unknown 2
@arg na atom index (1-based)
@arg nl angular momentum quantum number l
@arg tr real part of the scattering matrix element
@arg ti imaginary part of the scattering matrix element
@arg ph phase shift

the data is translated into the self.scattering array.

@arg e energy (eV)
@arg a atom index (1-based)
@arg l angular momentum quantum number l
@arg t complex scattering matrix element

@note PHAGEN uses the convention t = exp(-i * delta) * sin delta,
whereas EDAC uses t = exp(i * delta) * sin delta (complex conjugate).
this class stores the t-matrix according to the PHAGEN convention.
the conversion to the EDAC convention occurs in write_edac_scattering_file().

@param f: file or path (any file-like or path-like object that can be passed to numpy.genfromtxt).

@return: None
"""
dt = [('e', 'f4'), ('x1', 'f4'), ('x2', 'f4'), ('na', 'i4'), ('nl', 'i4'),
('tr', 'f8'), ('ti', 'f8'), ('ph', 'f4')]
data = np.atleast_1d(np.genfromtxt(f, dtype=dt))

self.scattering = np.resize(self.scattering, data.shape)
scat = self.scattering
scat['e'] = data['e'] * ERYDBERG
scat['a'] = data['na']
scat['l'] = data['nl']
scat['t'] = data['tr'] + 1j * data['ti']
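The sign-convention note above boils down to a complex conjugation; a minimal numeric sketch:

import numpy as np
delta = 0.3                                     # example phase shift in radians
t_phagen = np.exp(-1j * delta) * np.sin(delta)  # PHAGEN convention
t_edac = np.conj(t_phagen)                      # EDAC convention
# write_edac_scattering_file() applies this by writing t.real and -t.imag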
def write_edac_scattering(self, filename_format, phases=False):
"""
write scatterer files for EDAC.

produces one file for each atom class in self.scattering.

@param filename_format: file name including a placeholder {} for the atom class.

@param phases: write phase files instead of t-matrix files.

@return: dictionary that maps atom classes to file names
"""
if phases:
write = self.write_edac_phase_file
else:
write = self.write_edac_scattering_file
scat = self.scattering
atoms = np.unique(scat['a'])
files = {}
for atom in atoms:
f = filename_format.format(atom)
sel = scat['a'] == atom
idx = np.where(sel)
atom_scat = scat[idx]
write(f, atom_scat)
files[atom] = f

return files

def write_edac_scattering_file(self, f, scat):
"""
write a scatterer file for EDAC.

@param f: file path or output stream (an object with a write method).

@param scat: a slice of the self.scattering array belonging to the same atom class.

@return: None
"""
if hasattr(f, "write") and callable(f.write):
energies = np.unique(scat['e'])
ne = energies.shape[0]
lmax = scat['l'].max()
if ne == 1:
f.write("1 {lmax} regular tl\n".format(lmax=lmax))
else:
f.write("{nk} E(eV) {lmax} regular tl\n".format(nk=ne, lmax=lmax))
for energy in energies:
sel = scat['e'] == energy
idx = np.where(sel)
energy_scat = scat[idx]
if ne > 1:
f.write("{0:.3f} ".format(energy))
for item in energy_scat:
f.write(" {0:.6f} {1:.6f}".format(item['t'].real, -item['t'].imag))
for i in range(len(energy_scat), lmax + 1):
f.write(" 0 0")
f.write("\n")
else:
with open(f, "w") as fi:
self.write_edac_scattering_file(fi, scat)

def write_edac_phase_file(self, f, scat):
"""
write a phase file for EDAC.

@param f: file path or output stream (an object with a write method).

@param scat: a slice of the self.scattering array belonging to the same atom class.

@return: None
"""
if hasattr(f, "write") and callable(f.write):
energies = np.unique(scat['e'])
ne = energies.shape[0]
lmax = scat['l'].max()
if ne == 1:
f.write("1 {lmax} regular real\n".format(lmax=lmax))
else:
f.write("{nk} E(eV) {lmax} regular real\n".format(nk=ne, lmax=lmax))
for energy in energies:
sel = scat['e'] == energy
idx = np.where(sel)
energy_scat = scat[idx]
if ne > 1:
f.write("{0:.3f} ".format(energy))
for item in energy_scat:
pha = np.sign(item['t'].real) * np.arcsin(np.sqrt(np.abs(item['t'].imag)))
f.write(" {0:.6f}".format(pha))
for i in range(len(energy_scat), lmax + 1):
f.write(" 0")
f.write("\n")
else:
with open(f, "w") as fi:
self.write_edac_phase_file(fi, scat)
def parse_radial_file(self, f):
"""
parse the radial matrix element output file from PHAGEN.

@param f: file or path (any file-like or path-like object that can be passed to numpy.genfromtxt).

@return: None
"""
dt = [('ar', 'f8'), ('ai', 'f8'), ('br', 'f8'), ('bi', 'f8')]
data = np.atleast_1d(np.genfromtxt(f, dtype=dt))

self.emission = np.resize(self.emission, data.shape)
emission = self.emission
emission['dw'] = data['ar'] + 1j * data['ai']
emission['up'] = data['br'] + 1j * data['bi']

def write_edac_emission(self, f):
"""
write the radial photoemission matrix element in EDAC format.

requires self.emission, self.params.kinetic_energies and self.params.initial_state.

@param f: file path or output stream (an object with a write method).

@return: None
"""
if hasattr(f, "write") and callable(f.write):
l0 = self.params.l_init
energies = self.params.kinetic_energies
emission = self.emission
emission['e'] = energies
ne = energies.shape[0]
if ne == 1:
f.write("1 regular2 {l0}\n".format(l0=l0))
else:
f.write("{nk} E(eV) regular2 {l0}\n".format(nk=ne, l0=l0))
for item in emission:
if ne > 1:
f.write("{0:.3f} ".format(item['e']))
f.write(" {0:.6f} {1:.6f}".format(item['up'].real, item['up'].imag))
f.write(" {0:.6f} {1:.6f}".format(item['dw'].real, item['dw'].imag))
f.write("\n")
else:
with open(f, "w") as of:
self.write_edac_emission(of)
761
pmsco/cluster.py
Normal file → Executable file
File diff suppressed because it is too large
40
pmsco/compat.py
Normal file
@ -0,0 +1,40 @@
"""
@package pmsco.compat
compatibility code

code bits to provide compatibility for different python versions.
currently supports Python 2.7 and 3.6.
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from io import open as io_open


def open(fname, mode='r', encoding='latin1'):
"""
open a data file for read/write/append using the default str type

this is a drop-in for io.open
where data is exchanged via the built-in str type of python,
whether this is a byte string (python 2) or unicode string (python 3).

the file is assumed to be a latin-1 encoded binary file.

@param fname: file name and path
@param mode: 'r', 'w' or 'a'
@param encoding: 'latin1' (default), 'ascii' or 'utf-8'
@return file handle
"""
if isinstance(b'b', str):
# python 2
mode += 'b'
kwargs = {}
else:
# python 3
mode += 't'
kwargs = {'encoding': encoding}

return io_open(fname, mode, **kwargs)
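A usage sketch of the drop-in (the file name is a hypothetical example):

from pmsco.compat import open

with open("scan.etpi", "r") as f:   # yields native str lines on python 2 and 3
    header = f.readline()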
@ -1,21 +1,29 @@
"""
@package pmsco.data
import, export, evaluation of msc data
import, export, evaluation of msc data.

this module provides common functions for loading/saving and manipulating PED scan data sets.

@author Matthias Muntwiler

@copyright (c) 2015 by Paul Scherrer Institut @n
@copyright (c) 2015-17 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

import os
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import logging
import numpy as np
import os
import scipy.optimize as so
import loess.loess as loess

from pmsco.compat import open
import pmsco.loess.loess as loess

logger = logging.getLogger(__name__)

@ -109,7 +117,7 @@ def load_plt(filename, int_column=-1):
data[i]['p'] = phi
data[i]['i'] = selected intensity column
"""
data = np.genfromtxt(filename, usecols=(0, 2, 3, int_column), dtype=DTYPE_ETPI)
data = np.atleast_1d(np.genfromtxt(filename, usecols=(0, 2, 3, int_column), dtype=DTYPE_ETPI))
sort_data(data)
return data

@ -149,7 +157,7 @@ def load_edac_pd(filename, int_column=-1, energy=0.0, theta=0.0, phi=0.0, fixed_
data[i]['i'] = selected intensity column
@endverbatim
"""
with open(filename, 'r') as f:
with open(filename, "r") as f:
header1 = f.readline().strip()
header2 = f.readline().strip()
if not header1 == '--- scan PD':
@ -181,7 +189,7 @@ def load_edac_pd(filename, int_column=-1, energy=0.0, theta=0.0, phi=0.0, fixed_
logger.warning("unexpected EDAC output file column name")
break
cols = tuple(cols)
raw = np.genfromtxt(filename, usecols=cols, dtype=dtype, skip_header=2)
raw = np.atleast_1d(np.genfromtxt(filename, usecols=cols, dtype=dtype, skip_header=2))

if fixed_cluster:
etpi = np.empty(raw.shape, dtype=DTYPE_ETPAI)
@ -259,19 +267,38 @@ def load_data(filename, dtype=None):
DTYPE_EI, DTYPE_ETPI, DTYPE_ETPIS, DTYPE_ETPAI, or DTYPE_ETPAIS.
by default, the function uses the extension to determine the data type.
the actual type can be read from the dtype attribute of the returned array.
if the extension is missing, DTYPE_EI is assumed.

@return one-dimensional numpy structured ndarray with data

@raise IOError if the file cannot be read.

@raise IndexError if the number of columns is lower than expected based on the dtype or extension.
"""
if not dtype:
(root, ext) = os.path.splitext(filename)
datatype = ext[1:].upper()
dtype = DTYPES[datatype]
ext_type = ext[1:].upper()
try:
dtype = DTYPES[ext_type]
except KeyError:
dtype = DTYPE_EI

data = np.loadtxt(filename, dtype=dtype)
sort_data(data)
return data
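With the fall-back above, the file extension selects the dtype; a short sketch (file names are hypothetical):

data = load_data("demo.etpi")                    # dtype inferred from the .etpi extension
data = load_data("demo.dat", dtype=DTYPE_ETPI)   # unknown extension: pass the dtype explicitly
print(data.dtype.names)                          # actual columns of the returned array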
def format_extension(data):
"""
format the file extension based on the contents of an array.

@param data ETPI-like structured numpy.ndarray.

@return: file extension string including the leading period.
"""
return "." + "".join(data.dtype.names)


def save_data(filename, data):
"""
save column data (ETPI, and the like) to a text file.
@ -331,20 +358,24 @@ def restructure_data(data, dtype=DTYPE_ETPAIS, defaults=None):
undefined fields are initialized to zero.
if the parameter is unspecified, all fields are initialized to zero.

@return: re-structured numpy array
@return: re-structured numpy array or
@c data if the new and original data types are the same.
"""
new_data = np.zeros(data.shape, dtype=dtype)
fields = [dt[0] for dt in dtype if dt[0] in data.dtype.names]
if data.dtype == dtype:
return data
else:
new_data = np.zeros(data.shape, dtype=dtype)
fields = [dt[0] for dt in dtype if dt[0] in data.dtype.names]

if defaults is not None:
for field, value in defaults.iteritems():
if field in new_data.dtype.names:
new_data[field] = value
if defaults is not None:
for field, value in defaults.items():
if field in new_data.dtype.names:
new_data[field] = value

for field in fields:
new_data[field] = data[field]
for field in fields:
new_data[field] = data[field]

return new_data
return new_data
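A sketch of the early-return and defaults behavior documented above (the input file is hypothetical):

etpi = load_data("demo.etpi")                       # structured array with dtype DTYPE_ETPI
same = restructure_data(etpi, dtype=DTYPE_ETPI)     # same dtype: returns etpi unchanged
wide = restructure_data(etpi, dtype=DTYPE_ETPAIS, defaults={'s': 1.0})
# new fields default to zero except 's' = 1.0; all shared fields are copied over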
def common_dtype(scans):
@ -584,7 +615,7 @@ def calc_modfunc_mean(data):
return modf


def calc_modfunc_loess(data):
def calc_modfunc_loess(data, smth=0.4):
"""
calculate the modulation function using LOESS (locally weighted regression) smoothing.

@ -609,9 +640,11 @@ def calc_modfunc_loess(data):
the modulation function is calculated for the finite-valued scan points.
NaNs are ignored and do not affect the finite values.

@return copy of the data array with the modulation function in the 'i' column.
@param smth: size of the smoothing window relative to the size of the scan.
reasonable values are between 0.2 and 0.5.
the default value 0.4 has been found to work in many cases.

@todo is a fixed smoothing factor of 0.5 okay?
@return copy of the data array with the modulation function in the 'i' column.
"""
sel = np.isfinite(data['i'])
_data = data[sel]
@ -626,7 +659,7 @@ def calc_modfunc_loess(data):
factors = [_data[axis] for axis in scan_mode]
lo.set_x(np.hstack(tuple(factors)))
lo.set_y(_data['i'])
lo.model.span = 0.5
lo.model.span = smth
loess.loess(lo)

modf['i'][sel] = lo.get_fitted_residuals() / lo.get_fitted_values()
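Usage of the new smth parameter, with values in the range recommended by the docstring (data is any ETPI-like scan array):

modf = calc_modfunc_loess(data)                  # default smoothing window, smth=0.4
modf_fine = calc_modfunc_loess(data, smth=0.2)   # narrower window, follows the data more closely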
1657
pmsco/database.py
Normal file
File diff suppressed because it is too large
@ -11,7 +11,9 @@ Licensed under the Apache License, Version 2.0 (the "License"); @n
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import os.path
import datetime
@ -19,8 +21,11 @@ import signal
import collections
import copy
import logging
import math

from attrdict import AttrDict
from mpi4py import MPI
from helpers import BraceMessage as BMsg
from pmsco.helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)

@ -48,7 +53,99 @@ TAG_INVALID_RESULT = 3
## the message is empty
TAG_ERROR_ABORTING = 4

CalcID = collections.namedtuple('CalcID', ['model', 'scan', 'sym', 'emit', 'region'])
## levels of calculation tasks
#
CALC_LEVELS = ('model', 'scan', 'domain', 'emit', 'region')

## intermediate sub-class of CalcID
#
# this class should not be instantiated.
# instead, use CalcID which provides some useful helper methods.
#
_CalcID = collections.namedtuple('_CalcID', CALC_LEVELS)


class CalcID(_CalcID):
"""
named tuple class to uniquely identify a calculation task.

this is a 5-tuple of indices, one index per task level.
a positive index refers to a specific instance in the task hierarchy.

the ID is defined as a named tuple so that it can be used as key of a dictionary.
cf. @ref CalculationTask for further detail.
compared to a plain named tuple, the CalcID class provides additional helper methods and properties.

example constructor: CalcID(1, 2, 3, 4, 5).
"""

@property
def levels(self):
"""
level names.

this property returns the defined level names in a tuple.
this is the same as @ref CALC_LEVELS.

@return: tuple of level names
"""
return self._fields

@property
def level(self):
"""
specific level of a task, dictionary key form.

this corresponds to the name of the last positive component.

@return: attribute name corresponding to the task level.
empty string if all members are negative (the root task).
"""
for k in reversed(self._fields):
if self.__getattribute__(k) >= 0:
return k
return ''

@property
def numeric_level(self):
"""
specific level of a task, numeric form.

this corresponds to the last positive value in the sequence of indices.

@return: index corresponding to the significant task level component of the id.
the value ranges from -1 to len(CalcID) - 1.
it is -1 if all indices are negative (root task).
"""
for k in reversed(range(len(self))):
if self[k] >= 0:
return k
return -1

def collapse_levels(self, level):
"""
return a new CalcID that is collapsed at a specific level.

the method returns a new CalcID object where the indices below the given level are -1 (undefined).
this can be seen as collapsing the tree at the specified node (level).

@note because a CalcID is immutable, this method returns a new instance.

@param level: level at which to collapse.
the index at this level remains unchanged, lower ones are set to -1.
the level can be specified by attribute name (str) or numeric index (-1..4).

@raise ValueError if level is not numeric and not in CALC_LEVELS.

@return: new CalcID instance.
"""
try:
level = int(level)
except ValueError:
level = CALC_LEVELS.index(level)
assert -1 <= level < len(CALC_LEVELS)
mask = {l: -1 for (i, l) in enumerate(CALC_LEVELS) if i > level}
return self._replace(**mask)
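A quick illustration of the CalcID helpers (the index values are arbitrary):

cid = CalcID(model=1, scan=2, domain=0, emit=3, region=4)
assert cid.level == 'region'
assert cid.numeric_level == 4
assert cid.collapse_levels('scan') == CalcID(1, 2, -1, -1, -1)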
class CalculationTask(object):
@ -64,18 +161,18 @@ class CalculationTask(object):

@arg @c id.model structure number or iteration (handled by the mode module)
@arg @c id.scan scan number (handled by the project)
@arg @c id.sym symmetry number (handled by the project)
@arg @c id.domain domain number (handled by the project)
@arg @c id.emit emitter number (handled by the project)
@arg @c id.region region number (handled by the region handler)

specified members must be greater or equal to zero.
-1 is the wildcard which is used in parent tasks,
where, e.g., no specific symmetry is chosen.
the root task has the ID (-1, -1, -1, -1).
where, e.g., no specific domain is chosen.
the root task has the ID (-1, -1, -1, -1, -1).
"""

## @var id (CalcID)
# named tuple CalcID containing the 4-part calculation task identifier.
# named tuple CalcID containing the 5-part calculation task identifier.

## @var parent_id (CalcID)
# named tuple CalcID containing the task identifier of the parent task.
@ -105,6 +202,21 @@ class CalculationTask(object):
## @var modf_filename (string)
# name of the ETPI or ETPAI file that contains the resulting modulation function.

## @var result_valid (bool)
# validity status of the result file @ref result_filename.
#
# if True, the file must exist and contain valid data according to the task specification.
# the value is set True when a calculation task completes successfully.
# it may be reset later to invalidate the data if an error occurs during processing.
#
# validity of a parent task requires validity of all child tasks.

## @var rfac (float)
# r-factor value of the task result.
#
# the rfac field is written by @ref pmsco.project.Project.evaluate_result.
# the initial value is Not A Number.

## @var time (timedelta)
# execution time of the task.
#
@ -117,7 +229,7 @@ class CalculationTask(object):
# files generated by the task and their category
#
# dictionary key is the file name,
# value is the file category, e.g. 'cluster', 'phase', etc.
# value is the file category, e.g. 'cluster', 'atomic', etc.
#
# this information is used to automatically clean up unnecessary data files.

@ -125,7 +237,7 @@ class CalculationTask(object):
# scan positions to substitute the ones from the original scan.
#
# this is used to distribute scans over multiple calculator processes,
# cf. e.g. @ref EnergyRegionHandler.
# cf. e.g. @ref pmsco.handlers.EnergyRegionHandler.
#
# dictionary key must be the scan dimension 'e', 't', 'p', 'a'.
# the value is a numpy.ndarray containing the scan positions.
@ -147,6 +259,7 @@ class CalculationTask(object):
self.time = datetime.timedelta()
self.files = {}
self.region = {}
self.rfac = float('nan')
def __eq__(self, other):
"""
@ -192,7 +305,7 @@ class CalculationTask(object):
msg['id'] = CalcID(**msg['id'])
if isinstance(msg['parent_id'], dict):
msg['parent_id'] = CalcID(**msg['parent_id'])
for k, v in msg.iteritems():
for k, v in msg.items():
self.__setattr__(k, v)

def format_filename(self, **overrides):
@ -200,23 +313,19 @@ class CalculationTask(object):
format input or output file name including calculation index.

@param overrides optional keyword arguments override object fields.
the following keywords are handled: @c root, @c model, @c scan, @c sym, @c emit, @c region, @c ext.
the following keywords are handled:
`root`, `model`, `scan`, `domain`, `emit`, `region`, `ext`.

@return a string consisting of the concatenation of the base name, the ID, and the extension.
"""
parts = {}
parts = self.id._asdict()
parts['root'] = self.file_root
parts['model'] = self.id.model
parts['scan'] = self.id.scan
parts['sym'] = self.id.sym
parts['emit'] = self.id.emit
parts['region'] = self.id.region
parts['ext'] = self.file_ext

for key in overrides.keys():
parts[key] = overrides[key]

filename = "{root}_{model}_{scan}_{sym}_{emit}_{region}{ext}".format(**parts)
filename = "{root}_{model}_{scan}_{domain}_{emit}_{region}{ext}".format(**parts)
return filename
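For example (root and extension are hypothetical values), a task at model 0, scan 1 with all lower levels undefined yields:

task.file_root = "twoatom"
task.file_ext = ".etpi"
task.id = CalcID(model=0, scan=1, domain=-1, emit=-1, region=-1)
task.format_filename()    # -> "twoatom_0_1_-1_-1_-1.etpi"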
def copy(self):
@ -237,6 +346,30 @@ class CalculationTask(object):
"""
self.id = self.id._replace(**kwargs)

@property
def level(self):
"""
specific level of a task, dictionary key form.

this corresponds to the name of the last positive component of self.id.

@return: attribute name corresponding to the task level.
empty string for the root task.
"""
return self.id.level

@property
def numeric_level(self):
"""
specific level of a task, numeric form.

this corresponds to the index of the last positive component of self.id.

@return: index corresponding to the significant task level component of the id.
-1 for the root task.
"""
return self.id.numeric_level

def add_task_file(self, name, category):
"""
register a file that was generated by the calculation task.
@ -244,7 +377,7 @@ class CalculationTask(object):
this information is used to automatically clean up unnecessary data files.

@param name: file name (optionally including a path).
@param category: file category, e.g. 'cluster', 'phase', etc.
@param category: file category, e.g. 'cluster', 'atomic', etc.
@return: None
"""
self.files[name] = category
@ -284,6 +417,82 @@ class CalculationTask(object):
logger.warning("CalculationTask.remove_task_file: could not remove file {0}".format(filename))
class CachedCalculationMethod(object):
"""
decorator to cache results of expensive calculation functions.

this decorator can be used to transparently cache any expensive calculation result
that depends in a deterministic way on the calculation index.
for example, each cluster gets a unique calculation index.
if a cluster needs to be calculated repeatedly, it may be more efficient to cache it.

the method to decorate must have the following signature:
result = func(self, model, index).
the index (neglecting emitter and region) completely and uniquely identifies the result.

the number of cached results is limited by the ttl (time to live) attribute.
the items' ttl values are decreased each time a requested calculation is not found in the cache (miss).
on a cache hit, the corresponding item's ttl is reset.

the target ttl (time to live) can be specified as an optional parameter of the decorator.
time increases with every cache miss.
"""

## @var _cache (dict)
#
# key = calculation index,
# value = function result

## @var _ttl (dict)
#
# key = calculation index,
# value (int) = remaining time to live
# where time is the number of subsequent cache misses.

## @var ttl (int)
#
# target time to live of cache items.
# time is given by the number of cache misses.

def __init__(self, ttl=10):
super(CachedCalculationMethod, self).__init__()
self._cache = {}
self._ttl = {}
self.ttl = ttl

def __call__(self, func):

def wrapped_func(inst, model, index):
# note: _replace returns a new instance of the namedtuple
index = index._replace(emit=-1, region=-1)
cache_index = (id(inst), index.model, index.scan, index.domain)
try:
result = self._cache[cache_index]
except KeyError:
result = func(inst, model, index)
self._expire()
self._cache[cache_index] = result

self._ttl[cache_index] = self.ttl

return result

return wrapped_func

def _expire(self):
"""
decrease the remaining ttl of cache items and delete items whose ttl has fallen below 0.

@return: None
"""
for index in self._ttl:
self._ttl[index] -= 1
old_items = [index for index in self._ttl if self._ttl[index] < 0]
for index in old_items:
del self._ttl[index]
del self._cache[index]
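A minimal sketch of applying the decorator (the generator class and helper are hypothetical; the signature follows the documented result = func(self, model, index) contract):

class MyClusterGenerator(object):
    @CachedCalculationMethod(ttl=5)
    def create_cluster(self, model, index):
        # expensive construction; the cache key is (instance, model, scan, domain),
        # so emit and region indices do not cause separate cache entries
        return build_cluster_somehow(model, index)   # hypothetical helper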
class MscoProcess(object):
"""
code shared by MscoMaster and MscoSlave.
@ -315,7 +524,8 @@ class MscoProcess(object):
def __init__(self, comm):
self._comm = comm
self._project = None
self._calculator = None
self._atomic_scattering = None
self._multiple_scattering = None
self._running = False
self._finishing = False
self.stop_signal = False
@ -323,7 +533,8 @@ class MscoProcess(object):

def setup(self, project):
self._project = project
self._calculator = project.calculator_class()
self._atomic_scattering = project.atomic_scattering_factory()
self._multiple_scattering = project.multiple_scattering_factory()
self._running = False
self._finishing = False
self.stop_signal = False
@ -357,11 +568,11 @@ class MscoProcess(object):
"""
clean up after all calculations.

this method calls the clean up function of the project.
this method must be called after run() has finished.

@return: None
"""
self._project.cleanup()
pass

def calc(self, task):
"""
@ -387,14 +598,35 @@ class MscoProcess(object):
logger.info("model %s", s_model)
start_time = datetime.datetime.now()

# create parameter and cluster structures
clu = self._project.cluster_generator.create_cluster(task.model, task.id)
par = self._project.create_params(task.model, task.id)

# generate file names
clu = self._create_cluster(task)
par = self._create_params(task)
scan = self._define_scan(task)
output_file = task.format_filename(ext="")

# determine scan range
# check parameters and call the calculators
if clu.get_atom_count() >= 1:
self._calc_atomic(task, par, clu, scan, output_file)
else:
logger.error("empty cluster in calculation %s", s_id)
task.result_valid = False

if clu.get_emitter_count() >= 1:
self._calc_multiple(task, par, clu, scan, output_file)
else:
logger.error("no emitters in cluster of calculation %s.", s_id)
task.result_valid = False

task.time = datetime.datetime.now() - start_time

return task
def _define_scan(self, task):
|
||||
"""
|
||||
define the scan range.
|
||||
|
||||
@param task: CalculationTask with all attributes set for the calculation.
|
||||
@return: pmsco.project.Scan object for the calculator.
|
||||
"""
|
||||
scan = self._project.scans[task.id.scan]
|
||||
if task.region:
|
||||
scan = scan.copy()
|
||||
@ -419,27 +651,112 @@ class MscoProcess(object):
|
||||
except KeyError:
|
||||
pass
|
||||
|
||||
# check parameters and call the msc program
|
||||
if clu.get_atom_count() < 2:
|
||||
logger.error("empty cluster in calculation %s", s_id)
|
||||
task.result_valid = False
|
||||
elif clu.get_emitter_count() < 1:
|
||||
logger.error("no emitters in cluster of calculation %s.", s_id)
|
||||
task.result_valid = False
|
||||
else:
|
||||
files = self._calculator.check_cluster(clu, output_file)
|
||||
return scan
|
||||
|
||||
def _create_cluster(self, task):
|
||||
"""
|
||||
generate the cluster for the given calculation task.
|
||||
|
||||
cluster generation is delegated to the project's cluster_generator object.
|
||||
|
||||
if the current task has region == 0,
|
||||
the method also exports diagnostic clusters via the project's export_cluster() method.
|
||||
the file name is formatted with the given task index except that region is -1.
|
||||
|
||||
if (in addition to region == 0) the current task has emit == 0 and cluster includes multiple emitters,
|
||||
the method also exports the master cluster and full emitter list.
|
||||
the file name is formatted with the given task index except that emitter and region are -1.
|
||||
|
||||
@param task: CalculationTask with all attributes set for the calculation.
|
||||
@return: pmsco.cluster.Cluster object for the calculator.
|
||||
"""
|
||||
nem = self._project.cluster_generator.count_emitters(task.model, task.id)
|
||||
clu = self._project.cluster_generator.create_cluster(task.model, task.id)
|
||||
# overwrite atom classes only if they are at their default value
|
||||
clu.init_atomclasses(field_or_value='t', default_only=True)
|
||||
|
||||
if task.id.region == 0:
|
||||
file_index = task.id._replace(region=-1)
|
||||
filename = task.format_filename(region=-1)
|
||||
files = self._project.export_cluster(file_index, filename, clu)
|
||||
task.files.update(files)
|
||||
|
||||
task.result_filename, files = self._calculator.run(par, clu, scan, output_file)
|
||||
# master cluster
|
||||
if nem > 1 and task.id.emit == 0:
|
||||
master_index = task.id._replace(emit=-1, region=-1)
|
||||
filename = task.format_filename(emit=-1, region=-1)
|
||||
master_cluster = self._project.cluster_generator.create_cluster(task.model, master_index)
|
||||
files = self._project.export_cluster(master_index, filename, master_cluster)
|
||||
task.files.update(files)
|
||||
|
||||
return clu
|
||||
|
||||
def _create_params(self, task):
|
||||
"""
|
||||
generate the calculation parameters.
|
||||
|
||||
parameter generation is delegated to the project's create_params method.
|
||||
|
||||
@param task: CalculationTask with all attributes set for the calculation.
|
||||
@return: pmsco.project.CalculatorParams object for the calculator.
|
||||
"""
|
||||
par = self._project.create_params(task.model, task.id)
|
||||
|
||||
return par
|
||||
|
||||
def _calc_atomic(self, task, par, clu, scan, output_file):
|
||||
"""
|
||||
calculate the atomic scattering factors if necessary and link them to the cluster.
|
||||
|
||||
the method first calls the `before_atomic_scattering` project hook,
|
||||
the atomic scattering calculator,
|
||||
and finally the `after_atomic_scattering` hook.
|
||||
this process updates the par and clu objects to link to the created files.
|
||||
if any of the functions returns None, the par and clu objects are left unchanged.
|
||||
|
||||
@param task: CalculationTask with all attributes set for the calculation.
|
||||
|
||||
@param par: pmsco.project.CalculatorParams object for the calculator.
|
||||
its phase_files attribute is updated with the created scattering files.
|
||||
the radial matrix elements are not changed (but may be in a future version).
|
||||
|
||||
@param clu: pmsco.cluster.Cluster object for the calculator.
|
||||
the cluster is overwritten with the one returned by the calculator,
|
||||
so that atom classes match the phase_files.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
_par = copy.deepcopy(par)
|
||||
_clu = copy.deepcopy(clu)
|
||||
|
||||
_par, _clu = self._project.before_atomic_scattering(task, _par, _clu)
|
||||
if _clu is not None:
|
||||
filename, files = self._atomic_scattering.run(_par, _clu, scan, output_file)
|
||||
if files:
|
||||
task.files.update(files)
|
||||
|
||||
_par, _clu = self._project.after_atomic_scattering(task, _par, _clu)
|
||||
if _clu is not None:
|
||||
par.phase_files = _par.phase_files
|
||||
clu.copy_from(_clu)
|
||||
|
||||
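The None-fallback contract of the hooks above lends itself to selective recalculation. A minimal sketch, assuming a hypothetical project subclass (`MyProject` and the reuse condition are illustrative and not part of the code above; the hook signature is taken from the call site in _calc_atomic):

~~~~~~{.py}
import pmsco.project

class MyProject(pmsco.project.Project):
    def before_atomic_scattering(self, task, par, clu):
        # returning (None, None) makes _calc_atomic skip the calculator
        # and leave the caller's par and clu objects unchanged.
        if task.id.scan > 0:
            return None, None  # hypothetical: rely on the files of scan 0
        return par, clu
~~~~~~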
def _calc_multiple(self, task, par, clu, scan, output_file):
|
||||
"""
|
||||
calculate the multiple scattering intensity.
|
||||
|
||||
@param task: CalculationTask with all attributes set for the calculation.
|
||||
@param par: pmsco.project.CalculatorParams object for the calculator.
|
||||
@param clu: pmsco.cluster.Cluster object for the calculator.
|
||||
@return: None
|
||||
"""
|
||||
task.result_filename, files = self._multiple_scattering.run(par, clu, scan, output_file)
|
||||
if task.result_filename:
|
||||
(root, ext) = os.path.splitext(task.result_filename)
|
||||
task.file_ext = ext
|
||||
task.result_valid = True
|
||||
if files:
|
||||
task.files.update(files)
|
||||
|
||||
task.time = datetime.datetime.now() - start_time
|
||||
|
||||
return task
|
||||
|
||||
|
||||
class MscoMaster(MscoProcess):
|
||||
"""
|
||||
@ -505,20 +822,12 @@ class MscoMaster(MscoProcess):
|
||||
# it defines the initial model and the output file name.
|
||||
# it is passed to the model handler during the main loop.
|
||||
|
||||
# @var _model_handler
|
||||
# (ModelHandler) model handler instance
|
||||
|
||||
# @var _scan_handler
|
||||
# (ScanHandler) scan handler instance
|
||||
|
||||
# @var _symmetry_handler
|
||||
# (SymmetryHandler) symmetry handler instance
|
||||
|
||||
# @var _emitter_handler
|
||||
# (EmitterHandler) emitter handler instance
|
||||
|
||||
# @var _region_handler
|
||||
# (RegionHandler) region handler instance
|
||||
## @var task_handlers
|
||||
# (AttrDict) dictionary of task handler objects
|
||||
#
|
||||
# the keys are the task levels 'model', 'scan', 'domain', 'emit' and 'region'.
|
||||
# the values are handlers.TaskHandler objects.
|
||||
# the objects can be accessed in attribute or dictionary notation.
|
||||
|
||||
def __init__(self, comm):
|
||||
super(MscoMaster, self).__init__(comm)
|
||||
@ -534,11 +843,8 @@ class MscoMaster(MscoProcess):
|
||||
self._min_queue_len = self._slaves + 1
|
||||
|
||||
self._root_task = None
|
||||
self._model_handler = None
|
||||
self._scan_handler = None
|
||||
self._symmetry_handler = None
|
||||
self._emitter_handler = None
|
||||
self._region_handler = None
|
||||
self.task_levels = list(CalcID._fields)
|
||||
self.task_handlers = AttrDict()
|
||||
|
||||
def setup(self, project):
|
||||
"""
|
||||
@ -553,41 +859,54 @@ class MscoMaster(MscoProcess):
|
||||
|
||||
the method notifies the handlers of the number of available slave processes (slots).
|
||||
some of the tasks handlers adjust their branching according to the number of slots.
|
||||
this mechanism may be used to balance the load between the task levels.
|
||||
however, the current implementation is very coarse in this respect.
|
||||
it advertises all slots to the model handler but a reduced number to the remaining handlers
|
||||
depending on the operation mode.
|
||||
the region handler receives a maximum of 4 slots except in single calculation mode.
|
||||
in single calculation mode, all slots can be used by all handlers.
|
||||
|
||||
this mechanism may be used to adjust the priorities of the task levels,
|
||||
i.e., whether one slot handles all calculations of one model
|
||||
so that all models of a generation finish around the same time,
|
||||
or whether a model is finished completely before the next one is calculated
|
||||
so that a result is returned as soon as possible.
|
||||
|
||||
the current algorithm tries to pass as many slots as available
|
||||
down to the lowest level (region) in order to minimize wall time.
|
||||
the lowest level is restricted to the minimum number of splits
|
||||
only if the intermediate levels create a lot of branches,
|
||||
in which case splitting scans would not offer a performance benefit.
|
||||
"""
|
||||
super(MscoMaster, self).setup(project)
|
||||
|
||||
logger.debug("master entering setup")
|
||||
self._running_slaves = self._slaves
|
||||
self._idle_ranks = range(1, self._running_slaves + 1)
|
||||
self._idle_ranks = list(range(1, self._running_slaves + 1))
|
||||
|
||||
self._root_task = CalculationTask()
|
||||
self._root_task.file_root = project.output_file
|
||||
self._root_task.model = project.create_domain().start
|
||||
self._root_task.model = project.create_model_space().start
|
||||
|
||||
self._model_handler = project.handler_classes['model']()
|
||||
self._scan_handler = project.handler_classes['scan']()
|
||||
self._symmetry_handler = project.handler_classes['symmetry']()
|
||||
self._emitter_handler = project.handler_classes['emitter']()
|
||||
self._region_handler = project.handler_classes['region']()
|
||||
for level in self.task_levels:
|
||||
self.task_handlers[level] = project.handler_classes[level]()
|
||||
|
||||
self._model_handler.datetime_limit = self.datetime_limit
|
||||
self.task_handlers.model.datetime_limit = self.datetime_limit
|
||||
|
||||
slaves_adj = max(self._slaves, 1)
|
||||
self._model_handler.setup(project, slaves_adj)
|
||||
if project.mode != "single":
|
||||
slaves_adj = max(slaves_adj / 2, 1)
|
||||
self._scan_handler.setup(project, slaves_adj)
|
||||
self._symmetry_handler.setup(project, slaves_adj)
|
||||
self._emitter_handler.setup(project, slaves_adj)
|
||||
if project.mode != "single":
|
||||
n_models = self.task_handlers.model.setup(project, slaves_adj)
|
||||
if n_models > 1:
|
||||
slaves_adj = max(int(slaves_adj / 2), 1)
|
||||
n_scans = self.task_handlers.scan.setup(project, slaves_adj)
|
||||
if n_scans > 1:
|
||||
slaves_adj = max(int(slaves_adj / 2), 1)
|
||||
n_doms = self.task_handlers.domain.setup(project, slaves_adj)
|
||||
if n_doms > 1:
|
||||
slaves_adj = max(int(slaves_adj / 2), 1)
|
||||
n_emits = self.task_handlers.emit.setup(project, slaves_adj)
|
||||
if n_emits > 1:
|
||||
slaves_adj = max(int(slaves_adj / 2), 1)
|
||||
n_extra = max(n_scans, n_doms, n_emits)
|
||||
if n_extra > slaves_adj * 2:
|
||||
slaves_adj = min(slaves_adj, 4)
|
||||
self._region_handler.setup(project, slaves_adj)
|
||||
logger.debug(BMsg("{regions} slots available for region handler", regions=slaves_adj))
|
||||
self.task_handlers.region.setup(project, slaves_adj)
|
||||
|
||||
project.setup(self.task_handlers)
|
||||
|
||||
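Condensed, the allocation rule above reads as follows. This is a sketch under simplifying assumptions (the branch counts are passed in up front, whereas the real code obtains them from the handlers' setup() calls, and in single mode the halving is skipped entirely):

~~~~~~{.py}
def allocate_region_slots(n_slaves, n_models, n_scans, n_doms, n_emits):
    # halve the advertised slots at every task level that branches,
    # then cap the region level at 4 slots if the intermediate levels
    # branch more than the remaining slots could absorb.
    slots = max(n_slaves, 1)
    for branches in (n_models, n_scans, n_doms, n_emits):
        if branches > 1:
            slots = max(int(slots / 2), 1)
    if max(n_scans, n_doms, n_emits) > slots * 2:
        slots = min(slots, 4)
    return slots
~~~~~~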
def run(self):
|
||||
"""
|
||||
@ -611,20 +930,40 @@ class MscoMaster(MscoProcess):
|
||||
else:
|
||||
self._dispatch_tasks()
|
||||
self._receive_result()
|
||||
self._cleanup_tasks()
|
||||
self._check_finish()
|
||||
|
||||
logger.debug("master exiting main loop")
|
||||
self._running = False
|
||||
self._save_report()
|
||||
|
||||
def cleanup(self):
|
||||
"""
|
||||
clean up after all calculations.
|
||||
|
||||
this method must be called after run() has finished.
|
||||
|
||||
in the master process, this calls cleanup() of each task handler and of the project.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
logger.debug("master entering cleanup")
|
||||
self._region_handler.cleanup()
|
||||
self._emitter_handler.cleanup()
|
||||
self._symmetry_handler.cleanup()
|
||||
self._scan_handler.cleanup()
|
||||
self._model_handler.cleanup()
|
||||
for level in reversed(self.task_levels):
|
||||
self.task_handlers[level].cleanup()
|
||||
self._project.cleanup()
|
||||
super(MscoMaster, self).cleanup()
|
||||
|
||||
def _cleanup_tasks(self):
|
||||
"""
|
||||
periodic clean-up in the main loop.
|
||||
|
||||
once per iteration of the main loop, this method cleans up unnecessary files.
|
||||
this is done by the project's cleanup_files() method.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
self._project.cleanup_files()
|
||||
|
||||
def _dispatch_results(self):
|
||||
"""
|
||||
pass results through the post-processing modules.
|
||||
@ -632,29 +971,25 @@ class MscoMaster(MscoProcess):
|
||||
logger.debug("dispatching results of %u tasks", len(self._complete_tasks))
|
||||
while self._complete_tasks:
|
||||
__, task = self._complete_tasks.popitem(last=False)
|
||||
self._dispatch_result(task)
|
||||
|
||||
logger.debug("passing task %s to region handler", str(task.id))
|
||||
task = self._region_handler.add_result(task)
|
||||
def _dispatch_result(self, task):
|
||||
"""
|
||||
pass a result through the post-processing modules.
|
||||
|
||||
@param task: a CalculationTask object.
|
||||
|
||||
@return None
|
||||
"""
|
||||
level = task.level
|
||||
if level:
|
||||
logger.debug(BMsg("passing task {task} to {level} handler", task=str(task.id), level=level))
|
||||
task = self.task_handlers[level].add_result(task)
|
||||
if task:
|
||||
logger.debug("passing task %s to emitter handler", str(task.id))
|
||||
task = self._emitter_handler.add_result(task)
|
||||
|
||||
if task:
|
||||
logger.debug("passing task %s to symmetry handler", str(task.id))
|
||||
task = self._symmetry_handler.add_result(task)
|
||||
|
||||
if task:
|
||||
logger.debug("passing task %s to scan handler", str(task.id))
|
||||
task = self._scan_handler.add_result(task)
|
||||
|
||||
if task:
|
||||
logger.debug("passing task %s to model handler", str(task.id))
|
||||
task = self._model_handler.add_result(task)
|
||||
|
||||
if task:
|
||||
logger.debug("root task %s complete", str(task.id))
|
||||
self._finishing = True
|
||||
self._dispatch_result(task)
|
||||
else:
|
||||
self._finishing = True
|
||||
logger.debug(BMsg("root task {task} complete", task=str(task.id)))
|
||||
|
||||
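The recursion in _dispatch_result amounts to the following loop (a sketch, not the shipped code): each handler either consumes the result and returns None, or returns a parent task for the next level; a task with an empty level is the completed root task.

~~~~~~{.py}
def dispatch(task_handlers, task):
    # bubble a result up the task hierarchy
    while task is not None and task.level:
        task = task_handlers[task.level].add_result(task)
    return task  # None if consumed, else the completed root task
~~~~~~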
def _create_tasks(self):
|
||||
"""
|
||||
@ -668,7 +1003,7 @@ class MscoMaster(MscoProcess):
|
||||
"""
|
||||
logger.debug("creating new tasks from root")
|
||||
while len(self._pending_tasks) < self._min_queue_len:
|
||||
tasks = self._model_handler.create_tasks(self._root_task)
|
||||
tasks = self.task_handlers.model.create_tasks(self._root_task)
|
||||
logger.debug("model handler returned %u new tasks", len(tasks))
|
||||
if not tasks:
|
||||
self._model_done = True
|
||||
@ -786,19 +1121,19 @@ class MscoMaster(MscoProcess):
|
||||
@return: self._finishing
|
||||
"""
|
||||
if not self._finishing and (self._model_done and not self._pending_tasks and not self._running_tasks):
|
||||
logger.info("finish: model handler is done")
|
||||
logger.warning("finish: model handler is done")
|
||||
self._finishing = True
|
||||
if not self._finishing and (self._calculations >= self.max_calculations):
|
||||
logger.warning("finish: max. calculations (%u) exeeded", self.max_calculations)
|
||||
self._finishing = True
|
||||
if not self._finishing and self.stop_signal:
|
||||
logger.info("finish: stop signal received")
|
||||
logger.warning("finish: stop signal received")
|
||||
self._finishing = True
|
||||
if not self._finishing and (datetime.datetime.now() > self.datetime_limit):
|
||||
logger.warning("finish: time limit exceeded")
|
||||
self._finishing = True
|
||||
if not self._finishing and os.path.isfile("finish_pmsco"):
|
||||
logger.info("finish: finish_pmsco file detected")
|
||||
logger.warning("finish: finish_pmsco file detected")
|
||||
self._finishing = True
|
||||
|
||||
if self._finishing and not self._running_slaves and not self._running_tasks:
|
||||
@ -807,6 +1142,17 @@ class MscoMaster(MscoProcess):
|
||||
|
||||
return self._finishing
|
||||
|
||||
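As the checks above show, a running job can be asked to finish gracefully from outside by creating a file named finish_pmsco in the working directory; the master polls for it once per loop iteration. A minimal sketch:

~~~~~~{.py}
import pathlib

# request a graceful shutdown of a running pmsco job
pathlib.Path("finish_pmsco").touch()
~~~~~~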
def _save_report(self):
|
||||
"""
|
||||
generate a final report.
|
||||
|
||||
this method is called at the end of the master loop.
|
||||
it passes the call to @ref pmsco.handlers.ModelHandler.save_report.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
self.task_handlers.model.save_report(self._root_task)
|
||||
|
||||
def add_model_task(self, task):
|
||||
"""
|
||||
add a new model task including all of its children to the task queue.
|
||||
@ -814,13 +1160,13 @@ class MscoMaster(MscoProcess):
|
||||
@param task (CalculationTask) task identifier and model parameters.
|
||||
"""
|
||||
|
||||
scan_tasks = self._scan_handler.create_tasks(task)
|
||||
scan_tasks = self.task_handlers.scan.create_tasks(task)
|
||||
for scan_task in scan_tasks:
|
||||
sym_tasks = self._symmetry_handler.create_tasks(scan_task)
|
||||
for sym_task in sym_tasks:
|
||||
emitter_tasks = self._emitter_handler.create_tasks(sym_task)
|
||||
dom_tasks = self.task_handlers.domain.create_tasks(scan_task)
|
||||
for dom_task in dom_tasks:
|
||||
emitter_tasks = self.task_handlers.emit.create_tasks(dom_task)
|
||||
for emitter_task in emitter_tasks:
|
||||
region_tasks = self._region_handler.create_tasks(emitter_task)
|
||||
region_tasks = self.task_handlers.region.create_tasks(emitter_task)
|
||||
for region_task in region_tasks:
|
||||
self._pending_tasks[region_task.id] = region_task
|
||||
|
||||
|
3
pmsco/edac/.gitignore
vendored
@ -1,3 +1,2 @@
|
||||
edac_all_wrap.*
|
||||
edac.py
|
||||
edac_wrap.cxx
|
||||
revision.py
|
||||
|
15224
pmsco/edac/edac_all.cpp
Normal file
File diff suppressed because it is too large
7
pmsco/edac/edac_all.i
Normal file
@ -0,0 +1,7 @@
|
||||
/* EDAC interface for other programs */
|
||||
%module edac
|
||||
%{
|
||||
extern int run_script(char *scriptfile);
|
||||
%}
|
||||
|
||||
extern int run_script(char *scriptfile);
|
@ -1,8 +1,18 @@
|
||||
*** /home/muntwiler_m/mnt/pearl_data/software/edac/edac_all.cpp 2011-04-14 23:38:44.000000000 +0200
|
||||
--- edac_all.cpp 2016-02-11 12:15:45.322049772 +0100
|
||||
--- edac_all.cpp 2018-02-05 17:30:17.347373088 +0100
|
||||
***************
|
||||
*** 3085,3090 ****
|
||||
--- 3085,3091 ----
|
||||
numero Vxc_Barth_Hedin(numero den, numero denup)
|
||||
{
|
||||
if(den<1e-10) return 0;
|
||||
+ if(denup<0) denup=0;
|
||||
numero rs=1/pow(4*pi/3*den, 1/3.0);
|
||||
numero x=denup/den;
|
||||
numero alpha0=pow(4/(9*pi), 1/3.0);
|
||||
***************
|
||||
*** 10117,10122 ****
|
||||
--- 10117,10123 ----
|
||||
--- 10118,10124 ----
|
||||
void scan_imfp(char *name);
|
||||
void scan_imfp(FILE *fout);
|
||||
numero iimfp_TPP(numero kr);
|
||||
@ -12,7 +22,7 @@
|
||||
int scattering_so;
|
||||
***************
|
||||
*** 10230,10235 ****
|
||||
--- 10231,10237 ----
|
||||
--- 10232,10238 ----
|
||||
|
||||
int n_th;
|
||||
int n_fi;
|
||||
@ -22,7 +32,7 @@
|
||||
numero *th_out,
|
||||
***************
|
||||
*** 10239,10244 ****
|
||||
--- 10241,10247 ----
|
||||
--- 10242,10248 ----
|
||||
void free(void);
|
||||
void init_th(numero thi, numero thf, int nth);
|
||||
void init_phi(numero fii, numero fif, int nfi);
|
||||
@ -31,8 +41,121 @@
|
||||
numero refraction);
|
||||
void init_transmission(
|
||||
***************
|
||||
*** 10905,10942 ****
|
||||
}
|
||||
numero calculation::IIIthfi(int no, numero th, numero fi)
|
||||
{
|
||||
! numero ii,ii0, xth,xfi;
|
||||
! numero th0=final.th[0], th1=final.th[final.n_th-1];
|
||||
numero fi0=final.fi[0], fi1=final.fi[final.n_fi-1];
|
||||
int ith,ifi,ij;
|
||||
while(fi<0) fi+=2*pi;
|
||||
while(fi>2*pi) fi-=2*pi;
|
||||
! xth=(final.n_th-1)*(th-th0)/(th1-th0); ith=int(floor(xth-0.001));
|
||||
! xfi=(final.n_fi-1)*(fi-fi0)/(fi1-fi0); ifi=int(floor(xfi-0.001));
|
||||
if(ifi==-1) ifi=0;
|
||||
if(0<=ith && ith<final.n_th-1 && 0<=ifi && ifi<final.n_fi-1) {
|
||||
ij=no*n_ang+ith*final.n_fi+ifi;
|
||||
! ii0=III0[ij];
|
||||
! ii=ii0 + (xth-ith)*(III0[ij+final.n_fi]-ii0) + (xfi-ifi)*(III0[ij+1]-ii0);
|
||||
} else ii=0;
|
||||
if(ii<0) ii=0;
|
||||
return ii;
|
||||
}
|
||||
! numero calculation::IIIave(int no, numero th, numero fi)
|
||||
{
|
||||
! if(thave<=1e-6) return IIIthfi(no,th,fi);
|
||||
int i,j, nn=10, mm=50;
|
||||
! numero tth,ffi,val=0, r,f;
|
||||
for(i=0; i<nn; i++)
|
||||
for(j=0; j<mm; j++) {
|
||||
! r=i*thave/nn;
|
||||
! f=j*2*pi/mm;
|
||||
! tth=th+r*cos(f);
|
||||
! ffi=fi+r*sin(f)/cos(pi*th/180);
|
||||
! val+=IIIthfi(no,tth,ffi);
|
||||
}
|
||||
! return val/(nn*mm);
|
||||
}
|
||||
void calculation::write_ang(FILE *fout_, int ik)
|
||||
{
|
||||
int no,nno,i,j;
|
||||
--- 10909,10963 ----
|
||||
}
|
||||
numero calculation::IIIthfi(int no, numero th, numero fi)
|
||||
{
|
||||
! numero ii,xth,xfi;
|
||||
! numero th0=final.th_out[0], th1=final.th_out[final.n_th-1];
|
||||
numero fi0=final.fi[0], fi1=final.fi[final.n_fi-1];
|
||||
int ith,ifi,ij;
|
||||
while(fi<0) fi+=2*pi;
|
||||
while(fi>2*pi) fi-=2*pi;
|
||||
! xth=(final.n_th-1)*(th-th0)/(th1-th0); ith=int(floor(xth-0.0001));
|
||||
! xfi=(final.n_fi-1)*(fi-fi0)/(fi1-fi0); ifi=int(floor(xfi-0.0001));
|
||||
if(ifi==-1) ifi=0;
|
||||
+ if(ith==-1) ith=0;
|
||||
if(0<=ith && ith<final.n_th-1 && 0<=ifi && ifi<final.n_fi-1) {
|
||||
ij=no*n_ang+ith*final.n_fi+ifi;
|
||||
! xth = xth-ith;
|
||||
! xfi = xfi-ifi;
|
||||
! ii=III0[ij]*(1-xth)*(1-xfi) + III0[ij+final.n_fi]*xth*(1-xfi) + III0[ij+1]*(1-xth)*xfi + III0[ij+final.n_fi+1]*xth*xfi;
|
||||
} else ii=0;
|
||||
if(ii<0) ii=0;
|
||||
return ii;
|
||||
}
|
||||
! numero calculation::IIIave(int no, numero th, numero ph)
|
||||
{
|
||||
! if(thave<=1e-6) return IIIthfi(no,th,ph);
|
||||
int i,j, nn=10, mm=50;
|
||||
! numero tth, ffi, val=0, th1, ph1, cf, nw=0, x0, y0, z0, x1, y1, th2, ph2;
|
||||
for(i=0; i<nn; i++)
|
||||
for(j=0; j<mm; j++) {
|
||||
! th1=i*2*thave/(nn-1); //2*sigma range
|
||||
! ph1=j*2*pi/mm;
|
||||
! //rotation of (001) around Y by th1 and around Z by ph1
|
||||
! x0 = cos(ph1)*sin(th1);
|
||||
! y1 = sin(ph1)*sin(th1);
|
||||
! z0 = cos(th1);
|
||||
! //rotation around Y by th
|
||||
! x1 = x0*cos(th) + z0*sin(th);
|
||||
! z0 = -x0*sin(th) + z0*cos(th);
|
||||
! //rotation around Z by ph
|
||||
! x0 = x1*cos(ph) - y1*sin(ph);
|
||||
! y0 = x1*sin(ph) + y1*cos(ph);
|
||||
!
|
||||
! th2 = acos(z0);
|
||||
! ph2 = atan2(y0,x0);
|
||||
!
|
||||
! cf = exp(-(th1*th1/thave*thave))*(i>0.1?i:0.1); //gauss weight * radial weight
|
||||
! nw += cf; //sum of weights
|
||||
! val+=IIIthfi(no,th2,ph2)*cf;
|
||||
}
|
||||
! return val/nw;
|
||||
}
|
||||
+
|
||||
void calculation::write_ang(FILE *fout_, int ik)
|
||||
{
|
||||
int no,nno,i,j;
|
||||
***************
|
||||
*** 10961,10967 ****
|
||||
for(no=0; no<nno; no++)
|
||||
for(i=0; i<final.n_th; i++)
|
||||
for(j=0; j<final.n_fi; j++)
|
||||
! III[no*n_ang+i*final.n_fi+j]=IIIave(no, final.th[i], final.fi[j]);
|
||||
delete [] III0;
|
||||
}
|
||||
for(i=0; i<final.n_th; i++)
|
||||
--- 10982,10988 ----
|
||||
for(no=0; no<nno; no++)
|
||||
for(i=0; i<final.n_th; i++)
|
||||
for(j=0; j<final.n_fi; j++)
|
||||
! III[no*n_ang+i*final.n_fi+j]=IIIave(no, final.th_out[i], final.fi[j]);
|
||||
delete [] III0;
|
||||
}
|
||||
for(i=0; i<final.n_th; i++)
|
||||
***************
|
||||
*** 12485,12490 ****
|
||||
--- 12488,12494 ----
|
||||
--- 12506,12512 ----
|
||||
else {
|
||||
kr=sqrt(sqr(calc.k[ik])+2*V0);
|
||||
if(iimfp_flag==0) ki=iimfp.val(kr)/2;
|
||||
@ -42,7 +165,7 @@
|
||||
} } else if(calc.k_flag==2) set_k(calc.kc[ik]);
|
||||
***************
|
||||
*** 12507,12512 ****
|
||||
--- 12511,12522 ----
|
||||
--- 12529,12540 ----
|
||||
numero imfp=E/(TPP_Ep*TPP_Ep*(beta*log(gamma*E)-C/E+D/(E*E)))/a0_au;
|
||||
return 1/imfp;
|
||||
}
|
||||
@ -64,7 +187,7 @@
|
||||
n_1=n_2=0;
|
||||
Ylm0_th_flag=Ylm0_fi_flag=0;
|
||||
mesh_flag=0;
|
||||
--- 13212,13218 ----
|
||||
--- 13230,13236 ----
|
||||
}
|
||||
final_state::final_state(void)
|
||||
{
|
||||
@ -74,7 +197,7 @@
|
||||
mesh_flag=0;
|
||||
***************
|
||||
*** 13233,13238 ****
|
||||
--- 13243,13271 ----
|
||||
--- 13261,13289 ----
|
||||
if(n_fi==1) fi[0]=fii;
|
||||
else for(j=0; j<n_fi; j++) fi[j]=fii+j*(fif-fii)/(n_fi-1);
|
||||
} }
|
||||
@ -106,7 +229,7 @@
|
||||
int i;
|
||||
***************
|
||||
*** 14743,14748 ****
|
||||
--- 14776,14783 ----
|
||||
--- 14794,14801 ----
|
||||
|| scat.TPP_Ep<=0 || scat.TPP_Eg<0)
|
||||
on_error(foutput,"(input) imfp TPP-2M", "wrong parameters");
|
||||
scat.iimfp_flag=1;
|
||||
@ -117,7 +240,7 @@
|
||||
scat.iimfp_flag=0;
|
||||
***************
|
||||
*** 15162,15164 ****
|
||||
--- 15197,15206 ----
|
||||
--- 15215,15224 ----
|
||||
fprintf(foutput,"That's all, folks!\n");
|
||||
return 0;
|
||||
}
|
||||
|
@ -13,28 +13,23 @@ SHELL=/bin/sh
|
||||
.SUFFIXES: .c .cpp .cxx .exe .f .h .i .o .py .pyf .so
|
||||
.PHONY: all clean edac
|
||||
|
||||
FC=gfortran
|
||||
FCCOPTS=
|
||||
F2PY=f2py
|
||||
F2PYOPTS=
|
||||
CC=g++
|
||||
CCOPTS=-Wno-write-strings
|
||||
SWIG=swig
|
||||
SWIGOPTS=
|
||||
PYTHON=python
|
||||
PYTHONOPTS=
|
||||
FC?=gfortran
|
||||
FCCOPTS?=
|
||||
F2PY?=f2py
|
||||
F2PYOPTS?=
|
||||
CXX?=g++
|
||||
CXXOPTS?=-Wno-write-strings
|
||||
PYTHON?=python
|
||||
PYTHONOPTS?=
|
||||
|
||||
all: edac
|
||||
|
||||
edac: edac.exe _edac.so edac.py
|
||||
|
||||
edac.exe: edac_all.cpp
|
||||
$(CC) $(CCOPTS) -o edac.exe edac_all.cpp
|
||||
$(CXX) $(CXXOPTS) -o edac.exe edac_all.cpp
|
||||
|
||||
edac_wrap.cxx: edac_all.cpp edac.i
|
||||
$(SWIG) $(SWIGOPTS) -c++ -python edac.i
|
||||
|
||||
edac.py _edac.so: edac_wrap.cxx setup.py
|
||||
edac.py _edac.so: edac_all.cpp edac_all.i setup.py
|
||||
$(PYTHON) $(PYTHONOPTS) setup.py build_ext --inplace
|
||||
|
||||
revision.py: _edac.so
|
||||
@ -46,7 +41,7 @@ revision.txt: _edac.so edac.exe
|
||||
echo "" >> revision.txt
|
||||
|
||||
clean:
|
||||
rm -f *.so *.o *.exe
|
||||
rm -f *_wrap.cxx
|
||||
rm -f revision.py
|
||||
rm -f revision.txt
|
||||
rm -f *.so *.o *.exe *.pyc
|
||||
rm -f edac.py edac_all_wrap.*
|
||||
rm -f revision.*
|
||||
|
||||
|
@ -8,7 +8,8 @@ from distutils.core import setup, Extension
|
||||
|
||||
|
||||
edac_module = Extension('_edac',
|
||||
sources=['edac_wrap.cxx', 'edac_all.cpp'],
|
||||
sources=['edac_all.cpp', 'edac_all.i'],
|
||||
swig_opts=['-c++']
|
||||
)
|
||||
|
||||
setup (name = 'edac',
|
||||
@ -16,5 +17,7 @@ setup (name = 'edac',
|
||||
author = "Matthias Muntwiler",
|
||||
description = """EDAC module in Python""",
|
||||
ext_modules = [edac_module],
|
||||
py_modules = ["edac"], requires=['numpy']
|
||||
py_modules = ["edac"],
|
||||
requires=['numpy']
|
||||
)
|
||||
|
||||
|
41
pmsco/elements/__init__.py
Normal file
@ -0,0 +1,41 @@
|
||||
"""
|
||||
@package pmsco.elements
|
||||
extended properties of the elements
|
||||
|
||||
this package extends the element table of the `periodictable` package
|
||||
(https://periodictable.readthedocs.io/en/latest/index.html)
|
||||
by additional attributes like the electron binding energies.
|
||||
|
||||
the package requires the periodictable package (https://pypi.python.org/pypi/periodictable).
|
||||
|
||||
|
||||
@author Matthias Muntwiler
|
||||
|
||||
@copyright (c) 2020 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
import periodictable.core
|
||||
|
||||
|
||||
def _load_binding_energy():
|
||||
"""
|
||||
delayed loading of the binding energy table.
|
||||
"""
|
||||
from . import bindingenergy
|
||||
bindingenergy.init(periodictable.core.default_table())
|
||||
|
||||
|
||||
def _load_photoionization():
|
||||
"""
|
||||
delayed loading of the photoionization cross section table.
|
||||
"""
|
||||
from . import photoionization
|
||||
photoionization.init(periodictable.core.default_table())
|
||||
|
||||
|
||||
periodictable.core.delayed_load(['binding_energy'], _load_binding_energy)
|
||||
periodictable.core.delayed_load(['photoionization'], _load_photoionization)
|
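With these hooks registered, importing the package is enough; the data tables are loaded lazily on first attribute access. A usage sketch (the expected value is taken from the binding energy table below):

~~~~~~{.py}
import periodictable as pt
import pmsco.elements  # registers binding_energy and photoionization

# the first access triggers the delayed load
print(pt.elements[79].binding_energy['4f7/2'])  # 84.0 (eV)
~~~~~~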
93
pmsco/elements/bindingenergy.json
Normal file
@ -0,0 +1,93 @@
|
||||
{ "1": {"1s": 13.6},
|
||||
"2": {"1s": 24.6},
|
||||
"3": {"1s": 54.7},
|
||||
"4": {"1s": 111.5},
|
||||
"5": {"1s": 188.0},
|
||||
"6": {"1s": 284.2},
|
||||
"7": {"1s": 399.6, "2s": 27.0},
|
||||
"8": {"1s": 543.1, "2s": 41.6},
|
||||
"9": {"1s": 696.7},
|
||||
"10": {"1s": 870.2, "2s": 48.5, "2p1/2": 21.7, "2p3/2": 21.6},
|
||||
"11": {"1s": 1070.8, "2s": 63.5, "2p1/2": 30.65, "2p3/2": 30.81},
|
||||
"12": {"1s": 1303.0, "2s": 88.7, "2p1/2": 49.78, "2p3/2": 49.5},
|
||||
"13": {"1s": 1559.6, "2s": 117.8, "2p1/2": 72.95, "2p3/2": 72.55},
|
||||
"14": {"1s": 1839.0, "2s": 149.7, "2p1/2": 99.82, "2p3/2": 99.42},
|
||||
"15": {"1s": 2145.5, "2s": 189.0, "2p1/2": 136.0, "2p3/2": 135.0},
|
||||
"16": {"1s": 2472.0, "2s": 230.9, "2p1/2": 163.6, "2p3/2": 162.5},
|
||||
"17": {"1s": 2822.4, "2s": 270.0, "2p1/2": 202.0, "2p3/2": 200.0},
|
||||
"18": {"1s": 3205.9, "2s": 326.3, "2p1/2": 250.6, "2p3/2": 248.4, "3s": 29.3, "3p1/2": 15.9, "3p3/2": 15.7},
|
||||
"19": {"1s": 3608.4, "2s": 378.6, "2p1/2": 297.3, "2p3/2": 294.6, "3s": 34.8, "3p1/2": 18.3, "3p3/2": 18.3},
|
||||
"20": {"1s": 4038.5, "2s": 438.4, "2p1/2": 349.7, "2p3/2": 346.2, "3s": 44.3, "3p1/2": 25.4, "3p3/2": 25.4},
|
||||
"21": {"1s": 4492.0, "2s": 498.0, "2p1/2": 403.6, "2p3/2": 398.7, "3s": 51.1, "3p1/2": 28.3, "3p3/2": 28.3},
|
||||
"22": {"1s": 4966.0, "2s": 560.9, "2p1/2": 460.2, "2p3/2": 453.8, "3s": 58.7, "3p1/2": 32.6, "3p3/2": 32.6},
|
||||
"23": {"1s": 5465.0, "2s": 626.7, "2p1/2": 519.8, "2p3/2": 512.1, "3s": 66.3, "3p1/2": 37.2, "3p3/2": 37.2},
|
||||
"24": {"1s": 5989.0, "2s": 696.0, "2p1/2": 583.8, "2p3/2": 574.1, "3s": 74.1, "3p1/2": 42.2, "3p3/2": 42.2},
|
||||
"25": {"1s": 6539.0, "2s": 769.1, "2p1/2": 649.9, "2p3/2": 638.7, "3s": 82.3, "3p1/2": 47.2, "3p3/2": 47.2},
|
||||
"26": {"1s": 7112.0, "2s": 844.6, "2p1/2": 719.9, "2p3/2": 706.8, "3s": 91.3, "3p1/2": 52.7, "3p3/2": 52.7},
|
||||
"27": {"1s": 7709.0, "2s": 925.1, "2p1/2": 793.2, "2p3/2": 778.1, "3s": 101.0, "3p1/2": 58.9, "3p3/2": 59.9},
|
||||
"28": {"1s": 8333.0, "2s": 1008.6, "2p1/2": 870.0, "2p3/2": 852.7, "3s": 110.8, "3p1/2": 68.0, "3p3/2": 66.2},
|
||||
"29": {"1s": 8979.0, "2s": 1096.7, "2p1/2": 952.3, "2p3/2": 932.7, "3s": 122.5, "3p1/2": 77.3, "3p3/2": 75.1},
|
||||
"30": {"1s": 9659.0, "2s": 1196.2, "2p1/2": 1044.9, "2p3/2": 1021.8, "3s": 139.8, "3p1/2": 91.4, "3p3/2": 88.6, "3d3/2": 10.2, "3d5/2": 10.1},
|
||||
"31": {"1s": 10367.0, "2s": 1299.0, "2p1/2": 1143.2, "2p3/2": 1116.4, "3s": 159.5, "3p1/2": 103.5, "3p3/2": 100.0, "3d3/2": 18.7, "3d5/2": 18.7},
|
||||
"32": {"1s": 11103.0, "2s": 1414.6, "2p1/2": 1248.1, "2p3/2": 1217.0, "3s": 180.1, "3p1/2": 124.9, "3p3/2": 120.8, "3d3/2": 29.8, "3d5/2": 29.2},
|
||||
"33": {"1s": 11867.0, "2s": 1527.0, "2p1/2": 1359.1, "2p3/2": 1323.6, "3s": 204.7, "3p1/2": 146.2, "3p3/2": 141.2, "3d3/2": 41.7, "3d5/2": 41.7},
|
||||
"34": {"1s": 12658.0, "2s": 1652.0, "2p1/2": 1474.3, "2p3/2": 1433.9, "3s": 229.6, "3p1/2": 166.5, "3p3/2": 160.7, "3d3/2": 55.5, "3d5/2": 54.6},
|
||||
"35": {"1s": 13474.0, "2s": 1782.0, "2p1/2": 1596.0, "2p3/2": 1550.0, "3s": 257.0, "3p1/2": 189.0, "3p3/2": 182.0, "3d3/2": 70.0, "3d5/2": 69.0},
|
||||
"36": {"1s": 14326.0, "2s": 1921.0, "2p1/2": 1730.9, "2p3/2": 1678.4, "3s": 292.8, "3p1/2": 222.2, "3p3/2": 214.4, "3d3/2": 95.0, "3d5/2": 93.8, "4s": 27.5, "4p1/2": 14.1, "4p3/2": 14.1},
|
||||
"37": {"1s": 15200.0, "2s": 2065.0, "2p1/2": 1864.0, "2p3/2": 1804.0, "3s": 326.7, "3p1/2": 248.7, "3p3/2": 239.1, "3d3/2": 113.0, "3d5/2": 112.0, "4s": 30.5, "4p1/2": 16.3, "4p3/2": 15.3},
|
||||
"38": {"1s": 16105.0, "2s": 2216.0, "2p1/2": 2007.0, "2p3/2": 1940.0, "3s": 358.7, "3p1/2": 280.3, "3p3/2": 270.0, "3d3/2": 136.0, "3d5/2": 134.2, "4s": 38.9, "4p1/2": 21.3, "4p3/2": 20.1},
|
||||
"39": {"1s": 17038.0, "2s": 2373.0, "2p1/2": 2156.0, "2p3/2": 2080.0, "3s": 392.0, "3p1/2": 310.6, "3p3/2": 298.8, "3d3/2": 157.7, "3d5/2": 155.8, "4s": 43.8, "4p1/2": 24.4, "4p3/2": 23.1},
|
||||
"40": {"1s": 17998.0, "2s": 2532.0, "2p1/2": 2307.0, "2p3/2": 2223.0, "3s": 430.3, "3p1/2": 343.5, "3p3/2": 329.8, "3d3/2": 181.1, "3d5/2": 178.8, "4s": 50.6, "4p1/2": 28.5, "4p3/2": 27.1},
|
||||
"41": {"1s": 18986.0, "2s": 2698.0, "2p1/2": 2465.0, "2p3/2": 2371.0, "3s": 466.6, "3p1/2": 376.1, "3p3/2": 360.6, "3d3/2": 205.0, "3d5/2": 202.3, "4s": 56.4, "4p1/2": 32.6, "4p3/2": 30.8},
|
||||
"42": {"1s": 20000.0, "2s": 2866.0, "2p1/2": 2625.0, "2p3/2": 2520.0, "3s": 506.3, "3p1/2": 411.6, "3p3/2": 394.0, "3d3/2": 231.1, "3d5/2": 227.9, "4s": 63.2, "4p1/2": 37.6, "4p3/2": 35.5},
|
||||
"43": {"1s": 21044.0, "2s": 3043.0, "2p1/2": 2793.0, "2p3/2": 2677.0, "3s": 544.0, "3p1/2": 447.6, "3p3/2": 417.7, "3d3/2": 257.6, "3d5/2": 253.9, "4s": 69.5, "4p1/2": 42.3, "4p3/2": 39.9},
|
||||
"44": {"1s": 22117.0, "2s": 3224.0, "2p1/2": 2967.0, "2p3/2": 2838.0, "3s": 586.1, "3p1/2": 483.5, "3p3/2": 461.4, "3d3/2": 284.2, "3d5/2": 280.0, "4s": 75.0, "4p1/2": 46.3, "4p3/2": 43.2},
|
||||
"45": {"1s": 23220.0, "2s": 3412.0, "2p1/2": 3146.0, "2p3/2": 3004.0, "3s": 628.1, "3p1/2": 521.3, "3p3/2": 496.5, "3d3/2": 311.9, "3d5/2": 307.2, "4s": 81.4, "4p1/2": 50.5, "4p3/2": 47.3},
|
||||
"46": {"1s": 24350.0, "2s": 3604.0, "2p1/2": 3330.0, "2p3/2": 3173.0, "3s": 671.6, "3p1/2": 559.9, "3p3/2": 532.3, "3d3/2": 340.5, "3d5/2": 335.2, "4s": 87.1, "4p1/2": 55.7, "4p3/2": 50.9},
|
||||
"47": {"1s": 25514.0, "2s": 3806.0, "2p1/2": 3524.0, "2p3/2": 3351.0, "3s": 719.0, "3p1/2": 603.8, "3p3/2": 573.0, "3d3/2": 374.0, "3d5/2": 368.3, "4s": 97.0, "4p1/2": 63.7, "4p3/2": 58.3},
|
||||
"48": {"1s": 26711.0, "2s": 4018.0, "2p1/2": 3727.0, "2p3/2": 3538.0, "3s": 772.0, "3p1/2": 652.6, "3p3/2": 618.4, "3d3/2": 411.9, "3d5/2": 405.2, "4s": 109.8, "4p1/2": 63.9, "4p3/2": 63.9, "4d3/2": 11.7, "4d5/2": 10.7},
|
||||
"49": {"1s": 27940.0, "2s": 4238.0, "2p1/2": 3938.0, "2p3/2": 3730.0, "3s": 827.2, "3p1/2": 703.2, "3p3/2": 665.3, "3d3/2": 451.4, "3d5/2": 443.9, "4s": 122.9, "4p1/2": 73.5, "4p3/2": 73.5, "4d3/2": 17.7, "4d5/2": 16.9},
|
||||
"50": {"1s": 29200.0, "2s": 4465.0, "2p1/2": 4156.0, "2p3/2": 3929.0, "3s": 884.7, "3p1/2": 756.5, "3p3/2": 714.6, "3d3/2": 493.2, "3d5/2": 484.9, "4s": 137.1, "4p1/2": 83.6, "4p3/2": 83.6, "4d3/2": 24.9, "4d5/2": 23.9},
|
||||
"51": {"1s": 30491.0, "2s": 4698.0, "2p1/2": 4380.0, "2p3/2": 4132.0, "3s": 946.0, "3p1/2": 812.7, "3p3/2": 766.4, "3d3/2": 537.5, "3d5/2": 528.2, "4s": 153.2, "4p1/2": 95.6, "4p3/2": 95.6, "4d3/2": 33.3, "4d5/2": 32.1},
|
||||
"52": {"1s": 31814.0, "2s": 4939.0, "2p1/2": 4612.0, "2p3/2": 4341.0, "3s": 1006.0, "3p1/2": 870.8, "3p3/2": 820.0, "3d3/2": 583.4, "3d5/2": 573.0, "4s": 169.4, "4p1/2": 103.3, "4p3/2": 103.3, "4d3/2": 41.9, "4d5/2": 40.4},
|
||||
"53": {"1s": 33169.0, "2s": 5188.0, "2p1/2": 4852.0, "2p3/2": 4557.0, "3s": 1072.0, "3p1/2": 931.0, "3p3/2": 875.0, "3d3/2": 630.8, "3d5/2": 619.3, "4s": 186.0, "4p1/2": 123.0, "4p3/2": 123.0, "4d3/2": 50.6, "4d5/2": 48.9},
|
||||
"54": {"1s": 34561.0, "2s": 5453.0, "2p1/2": 5107.0, "2p3/2": 4786.0, "3s": 1148.7, "3p1/2": 1002.1, "3p3/2": 940.6, "3d3/2": 689.0, "3d5/2": 676.4, "4s": 213.2, "4p1/2": 146.7, "4p3/2": 145.5, "4d3/2": 69.5, "4d5/2": 67.5, "5s": 23.3, "5p1/2": 13.4, "5p3/2": 12.1},
|
||||
"55": {"1s": 35985.0, "2s": 5714.0, "2p1/2": 5359.0, "2p3/2": 5012.0, "3s": 1211.0, "3p1/2": 1071.0, "3p3/2": 1003.0, "3d3/2": 740.5, "3d5/2": 726.6, "4s": 232.3, "4p1/2": 172.4, "4p3/2": 161.3, "4d3/2": 79.8, "4d5/2": 77.5, "5s": 22.7, "5p1/2": 14.2, "5p3/2": 12.1},
|
||||
"56": {"1s": 37441.0, "2s": 5989.0, "2p1/2": 5624.0, "2p3/2": 5247.0, "3s": 1293.0, "3p1/2": 1137.0, "3p3/2": 1063.0, "3d3/2": 795.7, "3d5/2": 780.5, "4s": 253.5, "4p1/2": 192.0, "4p3/2": 178.6, "4d3/2": 92.6, "4d5/2": 89.9, "5s": 30.3, "5p1/2": 17.0, "5p3/2": 14.8},
|
||||
"57": {"1s": 38925.0, "2s": 6266.0, "2p1/2": 5891.0, "2p3/2": 5483.0, "3s": 1362.0, "3p1/2": 1209.0, "3p3/2": 1128.0, "3d3/2": 853.0, "3d5/2": 836.0, "4s": 274.7, "4p1/2": 205.8, "4p3/2": 196.0, "4d3/2": 105.3, "4d5/2": 102.5, "5s": 34.3, "5p1/2": 19.3, "5p3/2": 16.8},
|
||||
"58": {"1s": 40443.0, "2s": 6549.0, "2p1/2": 6164.0, "2p3/2": 5723.0, "3s": 1436.0, "3p1/2": 1274.0, "3p3/2": 1187.0, "3d3/2": 902.4, "3d5/2": 883.8, "4s": 291.0, "4p1/2": 223.2, "4p3/2": 206.5, "4d3/2": 109.0, "4f5/2": 0.1, "4f7/2": 0.1, "5s": 37.8, "5p1/2": 19.8, "5p3/2": 17.0},
|
||||
"59": {"1s": 41991.0, "2s": 6835.0, "2p1/2": 6440.0, "2p3/2": 5964.0, "3s": 1511.0, "3p1/2": 1337.0, "3p3/2": 1242.0, "3d3/2": 948.3, "3d5/2": 928.8, "4s": 304.5, "4p1/2": 236.3, "4p3/2": 217.6, "4d3/2": 115.1, "4d5/2": 115.1, "4f5/2": 2.0, "4f7/2": 2.0, "5s": 37.4, "5p1/2": 22.3, "5p3/2": 22.3},
|
||||
"60": {"1s": 43569.0, "2s": 7126.0, "2p1/2": 6722.0, "2p3/2": 6208.0, "3s": 1575.0, "3p1/2": 1403.0, "3p3/2": 1297.0, "3d3/2": 1003.3, "3d5/2": 980.4, "4s": 319.2, "4p1/2": 243.3, "4p3/2": 224.6, "4d3/2": 120.5, "4d5/2": 120.5, "4f5/2": 1.5, "4f7/2": 1.5, "5s": 37.5, "5p1/2": 21.1, "5p3/2": 21.1},
|
||||
"61": {"1s": 45184.0, "2s": 7428.0, "2p1/2": 7013.0, "2p3/2": 6459.0, "3p1/2": 1471.0, "3p3/2": 1357.0, "3d3/2": 1052.0, "3d5/2": 1027.0, "4p1/2": 242.0, "4p3/2": 242.0, "4d3/2": 120.0, "4d5/2": 120.0},
|
||||
"62": {"1s": 46834.0, "2s": 7737.0, "2p1/2": 7312.0, "2p3/2": 6716.0, "3s": 1723.0, "3p1/2": 1541.0, "3p3/2": 1420.0, "3d3/2": 1110.9, "3d5/2": 1083.4, "4s": 347.2, "4p1/2": 265.6, "4p3/2": 247.4, "4d3/2": 129.0, "4d5/2": 129.0, "4f5/2": 5.2, "4f7/2": 5.2, "5s": 37.4, "5p1/2": 21.3, "5p3/2": 21.3},
|
||||
"63": {"1s": 48519.0, "2s": 8052.0, "2p1/2": 7617.0, "2p3/2": 6977.0, "3s": 1800.0, "3p1/2": 1614.0, "3p3/2": 1481.0, "3d3/2": 1158.6, "3d5/2": 1127.5, "4s": 360.0, "4p1/2": 284.0, "4p3/2": 257.0, "4d3/2": 133.0, "4d5/2": 127.7, "4f5/2": 0.0, "4f7/2": 0.0, "5s": 32.0, "5p1/2": 22.0, "5p3/2": 22.0},
|
||||
"64": {"1s": 50239.0, "2s": 8376.0, "2p1/2": 7930.0, "2p3/2": 7243.0, "3s": 1881.0, "3p1/2": 1688.0, "3p3/2": 1544.0, "3d3/2": 1221.9, "3d5/2": 1189.6, "4s": 378.6, "4p1/2": 286.0, "4p3/2": 271.0, "4d5/2": 142.6, "4f5/2": 8.6, "4f7/2": 8.6, "5s": 36.0, "5p1/2": 28.0, "5p3/2": 21.0},
|
||||
"65": {"1s": 51996.0, "2s": 8708.0, "2p1/2": 8252.0, "2p3/2": 7514.0, "3s": 1968.0, "3p1/2": 1768.0, "3p3/2": 1611.0, "3d3/2": 1276.9, "3d5/2": 1241.1, "4s": 396.0, "4p1/2": 322.4, "4p3/2": 284.1, "4d3/2": 150.5, "4d5/2": 150.5, "4f5/2": 7.7, "4f7/2": 2.4, "5s": 45.6, "5p1/2": 28.7, "5p3/2": 22.6},
|
||||
"66": {"1s": 53789.0, "2s": 9046.0, "2p1/2": 8581.0, "2p3/2": 7790.0, "3s": 2047.0, "3p1/2": 1842.0, "3p3/2": 1676.0, "3d3/2": 1333.0, "3d5/2": 1292.6, "4s": 414.2, "4p1/2": 333.5, "4p3/2": 293.2, "4d3/2": 153.6, "4d5/2": 153.6, "4f5/2": 8.0, "4f7/2": 4.3, "5s": 49.9, "5p1/2": 26.3, "5p3/2": 26.3},
|
||||
"67": {"1s": 55618.0, "2s": 9394.0, "2p1/2": 8918.0, "2p3/2": 8071.0, "3s": 2128.0, "3p1/2": 1923.0, "3p3/2": 1741.0, "3d3/2": 1392.0, "3d5/2": 1351.0, "4s": 432.4, "4p1/2": 343.5, "4p3/2": 308.2, "4d3/2": 160.0, "4d5/2": 160.0, "4f5/2": 8.6, "4f7/2": 5.2, "5s": 49.3, "5p1/2": 30.8, "5p3/2": 24.1},
|
||||
"68": {"1s": 57486.0, "2s": 9751.0, "2p1/2": 9264.0, "2p3/2": 8358.0, "3s": 2207.0, "3p1/2": 2006.0, "3p3/2": 1812.0, "3d3/2": 1453.0, "3d5/2": 1409.0, "4s": 449.8, "4p1/2": 366.2, "4p3/2": 320.2, "4d3/2": 167.6, "4d5/2": 167.6, "4f7/2": 4.7, "5s": 50.6, "5p1/2": 31.4, "5p3/2": 24.7},
|
||||
"69": {"1s": 59390.0, "2s": 10116.0, "2p1/2": 9617.0, "2p3/2": 8648.0, "3s": 2307.0, "3p1/2": 2090.0, "3p3/2": 1885.0, "3d3/2": 1515.0, "3d5/2": 1468.0, "4s": 470.9, "4p1/2": 385.9, "4p3/2": 332.6, "4d3/2": 175.5, "4d5/2": 175.5, "4f7/2": 4.6, "5s": 54.7, "5p1/2": 31.8, "5p3/2": 25.0},
|
||||
"70": {"1s": 61332.0, "2s": 10486.0, "2p1/2": 9978.0, "2p3/2": 8944.0, "3s": 2398.0, "3p1/2": 2173.0, "3p3/2": 1950.0, "3d3/2": 1576.0, "3d5/2": 1528.0, "4s": 480.5, "4p1/2": 388.7, "4p3/2": 339.7, "4d3/2": 191.2, "4d5/2": 182.4, "4f5/2": 2.5, "4f7/2": 1.3, "5s": 52.0, "5p1/2": 30.3, "5p3/2": 24.1},
|
||||
"71": {"1s": 63314.0, "2s": 10870.0, "2p1/2": 10349.0, "2p3/2": 9244.0, "3s": 2491.0, "3p1/2": 2264.0, "3p3/2": 2024.0, "3d3/2": 1639.0, "3d5/2": 1589.0, "4s": 506.8, "4p1/2": 412.4, "4p3/2": 359.2, "4d3/2": 206.1, "4d5/2": 196.3, "4f5/2": 8.9, "4f7/2": 7.5, "5s": 57.3, "5p1/2": 33.6, "5p3/2": 26.7},
|
||||
"72": {"1s": 65351.0, "2s": 11271.0, "2p1/2": 10739.0, "2p3/2": 9561.0, "3s": 2601.0, "3p1/2": 2365.0, "3p3/2": 2108.0, "3d3/2": 1716.0, "3d5/2": 1662.0, "4s": 538.0, "4p1/2": 438.2, "4p3/2": 380.7, "4d3/2": 220.0, "4d5/2": 211.5, "4f5/2": 15.9, "4f7/2": 14.2, "5s": 64.2, "5p1/2": 38.0, "5p3/2": 29.9},
|
||||
"73": {"1s": 67416.0, "2s": 11682.0, "2p1/2": 11136.0, "2p3/2": 9881.0, "3s": 2708.0, "3p1/2": 2469.0, "3p3/2": 2194.0, "3d3/2": 1793.0, "3d5/2": 1735.0, "4s": 563.4, "4p1/2": 463.4, "4p3/2": 400.9, "4d3/2": 237.9, "4d5/2": 226.4, "4f5/2": 23.5, "4f7/2": 21.6, "5s": 69.7, "5p1/2": 42.2, "5p3/2": 32.7},
|
||||
"74": {"1s": 69525.0, "2s": 12100.0, "2p1/2": 11544.0, "2p3/2": 10207.0, "3s": 2820.0, "3p1/2": 2575.0, "3p3/2": 2281.0, "3d3/2": 1872.0, "3d5/2": 1809.0, "4s": 594.1, "4p1/2": 490.4, "4p3/2": 423.6, "4d3/2": 255.9, "4d5/2": 243.5, "4f5/2": 33.6, "4f7/2": 31.4, "5s": 75.6, "5p1/2": 45.3, "5p3/2": 36.8},
|
||||
"75": {"1s": 71676.0, "2s": 12527.0, "2p1/2": 11959.0, "2p3/2": 10535.0, "3s": 2932.0, "3p1/2": 2682.0, "3p3/2": 2367.0, "3d3/2": 1949.0, "3d5/2": 1883.0, "4s": 625.4, "4p1/2": 518.7, "4p3/2": 446.8, "4d3/2": 273.9, "4d5/2": 260.5, "4f5/2": 42.9, "4f7/2": 40.5, "5s": 83.0, "5p1/2": 45.6, "5p3/2": 34.6},
|
||||
"76": {"1s": 73871.0, "2s": 12968.0, "2p1/2": 12385.0, "2p3/2": 10871.0, "3s": 3049.0, "3p1/2": 2792.0, "3p3/2": 2457.0, "3d3/2": 2031.0, "3d5/2": 1960.0, "4s": 658.2, "4p1/2": 549.1, "4p3/2": 470.7, "4d3/2": 293.1, "4d5/2": 278.5, "4f5/2": 53.4, "4f7/2": 50.7, "5s": 84.0, "5p1/2": 58.0, "5p3/2": 44.5},
|
||||
"77": {"1s": 76111.0, "2s": 13419.0, "2p1/2": 12824.0, "2p3/2": 11215.0, "3s": 3174.0, "3p1/2": 2909.0, "3p3/2": 2551.0, "3d3/2": 2116.0, "3d5/2": 2040.0, "4s": 691.1, "4p1/2": 577.8, "4p3/2": 495.8, "4d3/2": 311.9, "4d5/2": 296.3, "4f5/2": 63.8, "4f7/2": 60.8, "5s": 95.2, "5p1/2": 63.0, "5p3/2": 48.0},
|
||||
"78": {"1s": 78395.0, "2s": 13880.0, "2p1/2": 13273.0, "2p3/2": 11564.0, "3s": 3296.0, "3p1/2": 3027.0, "3p3/2": 2645.0, "3d3/2": 2202.0, "3d5/2": 2122.0, "4s": 725.4, "4p1/2": 609.1, "4p3/2": 519.4, "4d3/2": 331.6, "4d5/2": 314.6, "4f5/2": 74.5, "4f7/2": 71.2, "5s": 101.7, "5p1/2": 65.3, "5p3/2": 51.7},
|
||||
"79": {"1s": 80725.0, "2s": 14353.0, "2p1/2": 13734.0, "2p3/2": 11919.0, "3s": 3425.0, "3p1/2": 3148.0, "3p3/2": 2743.0, "3d3/2": 2291.0, "3d5/2": 2206.0, "4s": 762.1, "4p1/2": 642.7, "4p3/2": 546.3, "4d3/2": 353.2, "4d5/2": 335.1, "4f5/2": 87.6, "4f7/2": 84.0, "5s": 107.2, "5p1/2": 74.2, "5p3/2": 57.2},
|
||||
"80": {"1s": 83102.0, "2s": 14839.0, "2p1/2": 14209.0, "2p3/2": 12284.0, "3s": 3562.0, "3p1/2": 3279.0, "3p3/2": 2847.0, "3d3/2": 2385.0, "3d5/2": 2295.0, "4s": 802.2, "4p1/2": 680.2, "4p3/2": 576.6, "4d3/2": 378.2, "4d5/2": 358.8, "4f5/2": 104.0, "4f7/2": 99.9, "5s": 127.0, "5p1/2": 83.1, "5p3/2": 64.5, "5d3/2": 9.6, "5d5/2": 7.8},
|
||||
"81": {"1s": 85530.0, "2s": 15347.0, "2p1/2": 14698.0, "2p3/2": 12658.0, "3s": 3704.0, "3p1/2": 3416.0, "3p3/2": 2957.0, "3d3/2": 2485.0, "3d5/2": 2389.0, "4s": 846.2, "4p1/2": 720.5, "4p3/2": 609.5, "4d3/2": 405.7, "4d5/2": 385.0, "4f5/2": 122.2, "4f7/2": 117.8, "5s": 136.0, "5p1/2": 94.6, "5p3/2": 73.5, "5d3/2": 14.7, "5d5/2": 12.5},
|
||||
"82": {"1s": 88005.0, "2s": 15861.0, "2p1/2": 15200.0, "2p3/2": 13035.0, "3s": 3851.0, "3p1/2": 3554.0, "3p3/2": 3066.0, "3d3/2": 2586.0, "3d5/2": 2484.0, "4s": 891.8, "4p1/2": 761.9, "4p3/2": 643.5, "4d3/2": 434.3, "4d5/2": 412.2, "4f5/2": 141.7, "4f7/2": 136.9, "5s": 147.0, "5p1/2": 106.4, "5p3/2": 83.3, "5d3/2": 20.7, "5d5/2": 18.1},
|
||||
"83": {"1s": 90526.0, "2s": 16388.0, "2p1/2": 15711.0, "2p3/2": 13419.0, "3s": 3999.0, "3p1/2": 3696.0, "3p3/2": 3177.0, "3d3/2": 2688.0, "3d5/2": 2580.0, "4s": 939.0, "4p1/2": 805.2, "4p3/2": 678.8, "4d3/2": 464.0, "4d5/2": 440.1, "4f5/2": 162.3, "4f7/2": 157.0, "5s": 159.3, "5p1/2": 119.0, "5p3/2": 92.6, "5d3/2": 26.9, "5d5/2": 23.8},
|
||||
"84": {"1s": 93105.0, "2s": 16939.0, "2p1/2": 16244.0, "2p3/2": 13814.0, "3s": 4149.0, "3p1/2": 3854.0, "3p3/2": 3302.0, "3d3/2": 2798.0, "3d5/2": 2683.0, "4s": 995.0, "4p1/2": 851.0, "4p3/2": 705.0, "4d3/2": 500.0, "4d5/2": 473.0, "4f5/2": 184.0, "4f7/2": 184.0, "5s": 177.0, "5p1/2": 132.0, "5p3/2": 104.0, "5d3/2": 31.0, "5d5/2": 31.0},
|
||||
"85": {"1s": 95730.0, "2s": 17493.0, "2p1/2": 16785.0, "2p3/2": 14214.0, "3s": 4317.0, "3p1/2": 4008.0, "3p3/2": 3426.0, "3d3/2": 2909.0, "3d5/2": 2787.0, "4s": 1042.0, "4p1/2": 886.0, "4p3/2": 740.0, "4d3/2": 533.0, "4d5/2": 507.0, "4f5/2": 210.0, "4f7/2": 210.0, "5s": 195.0, "5p1/2": 148.0, "5p3/2": 115.0, "5d3/2": 40.0, "5d5/2": 40.0},
|
||||
"86": {"1s": 98404.0, "2s": 18049.0, "2p1/2": 17337.0, "2p3/2": 14619.0, "3s": 4482.0, "3p1/2": 4159.0, "3p3/2": 3538.0, "3d3/2": 3022.0, "3d5/2": 2892.0, "4s": 1097.0, "4p1/2": 929.0, "4p3/2": 768.0, "4d3/2": 567.0, "4d5/2": 541.0, "4f5/2": 238.0, "4f7/2": 238.0, "5s": 214.0, "5p1/2": 164.0, "5p3/2": 127.0, "5d3/2": 48.0, "5d5/2": 48.0, "6s": 26.0},
|
||||
"87": {"1s": 101137.0, "2s": 18639.0, "2p1/2": 17907.0, "2p3/2": 15031.0, "3s": 4652.0, "3p1/2": 4327.0, "3p3/2": 3663.0, "3d3/2": 3136.0, "3d5/2": 3000.0, "4s": 1153.0, "4p1/2": 980.0, "4p3/2": 810.0, "4d3/2": 603.0, "4d5/2": 577.0, "4f5/2": 268.0, "4f7/2": 268.0, "5s": 234.0, "5p1/2": 182.0, "5p3/2": 140.0, "5d3/2": 58.0, "5d5/2": 58.0, "6s": 34.0, "6p1/2": 15.0, "6p3/2": 15.0},
|
||||
"88": {"1s": 103922.0, "2s": 19237.0, "2p1/2": 18484.0, "2p3/2": 15444.0, "3s": 4822.0, "3p1/2": 4490.0, "3p3/2": 3792.0, "3d3/2": 3248.0, "3d5/2": 3105.0, "4s": 1208.0, "4p1/2": 1058.0, "4p3/2": 879.0, "4d3/2": 636.0, "4d5/2": 603.0, "4f5/2": 299.0, "4f7/2": 299.0, "5s": 254.0, "5p1/2": 200.0, "5p3/2": 153.0, "5d3/2": 68.0, "5d5/2": 68.0, "6s": 44.0, "6p1/2": 19.0, "6p3/2": 19.0},
|
||||
"89": {"1s": 106755.0, "2s": 19840.0, "2p1/2": 19083.0, "2p3/2": 15871.0, "3s": 5002.0, "3p1/2": 4656.0, "3p3/2": 3909.0, "3d3/2": 3370.0, "3d5/2": 3219.0, "4s": 1269.0, "4p1/2": 1080.0, "4p3/2": 890.0, "4d3/2": 675.0, "4d5/2": 639.0, "4f5/2": 319.0, "4f7/2": 319.0, "5s": 272.0, "5p1/2": 215.0, "5p3/2": 167.0, "5d3/2": 80.0, "5d5/2": 80.0},
|
||||
"90": {"1s": 109651.0, "2s": 20472.0, "2p1/2": 19693.0, "2p3/2": 16300.0, "3s": 5182.0, "3p1/2": 4830.0, "3p3/2": 4046.0, "3d3/2": 3491.0, "3d5/2": 3332.0, "4s": 1330.0, "4p1/2": 1168.0, "4p3/2": 966.4, "4d3/2": 712.1, "4d5/2": 675.2, "4f5/2": 342.4, "4f7/2": 333.1, "5s": 290.0, "5p1/2": 229.0, "5p3/2": 182.0, "5d3/2": 92.5, "5d5/2": 85.4, "6s": 41.4, "6p1/2": 24.5, "6p3/2": 16.6},
|
||||
"91": {"1s": 112601.0, "2s": 21105.0, "2p1/2": 20314.0, "2p3/2": 16733.0, "3s": 5367.0, "3p1/2": 5001.0, "3p3/2": 4174.0, "3d3/2": 3611.0, "3d5/2": 3442.0, "4s": 1387.0, "4p1/2": 1224.0, "4p3/2": 1007.0, "4d3/2": 743.0, "4d5/2": 708.0, "4f5/2": 371.0, "4f7/2": 360.0, "5s": 310.0, "5p1/2": 232.0, "5p3/2": 232.0, "5d3/2": 94.0, "5d5/2": 94.0},
|
||||
"92": {"1s": 115606.0, "2s": 21757.0, "2p1/2": 20948.0, "2p3/2": 17166.0, "3s": 5548.0, "3p1/2": 5182.0, "3p3/2": 4303.0, "3d3/2": 3728.0, "3d5/2": 3552.0, "4s": 1439.0, "4p1/2": 1271.0, "4p3/2": 1043.0, "4d3/2": 778.3, "4d5/2": 736.2, "4f5/2": 388.2, "4f7/2": 377.4, "5s": 321.0, "5p1/2": 257.0, "5p3/2": 192.0, "5d3/2": 102.8, "5d5/2": 94.2, "6s": 43.9, "6p1/2": 26.8, "6p3/2": 16.8}
|
||||
}
|
155
pmsco/elements/bindingenergy.py
Normal file
@ -0,0 +1,155 @@
|
||||
"""
|
||||
@package pmsco.elements.bindingenergy
|
||||
electron binding energies of the elements
|
||||
|
||||
extends the element table of the `periodictable` package
|
||||
(https://periodictable.readthedocs.io/en/latest/index.html)
|
||||
by the electron binding energies.
|
||||
|
||||
the binding energies are compiled from Gwyn Williams' web page
|
||||
(https://userweb.jlab.org/~gwyn/ebindene.html).
|
||||
please refer to the original web page or the x-ray data booklet
|
||||
for original sources, definitions and remarks.
|
||||
|
||||
usage
|
||||
-----
|
||||
|
||||
this module requires the periodictable package (https://pypi.python.org/pypi/periodictable).
|
||||
|
||||
~~~~~~{.py}
|
||||
import periodictable as pt
|
||||
import pmsco.elements.bindingenergy
|
||||
|
||||
# access the binding energies via any of periodictable's element interfaces, e.g.
|
||||
print(pt.gold.binding_energy['4f7/2'])
|
||||
print(pt.elements.symbol('Au').binding_energy['4f7/2'])
|
||||
print(pt.elements.name('gold').binding_energy['4f7/2'])
|
||||
print(pt.elements[79].binding_energy['4f7/2'])
|
||||
~~~~~~
|
||||
|
||||
note that attributes are writable.
|
||||
you may assign refined values in your instance of the database.
|
||||
|
||||
the query_binding_energy() function queries all terms with a particular binding energy.
|
||||
|
||||
|
||||
@author Matthias Muntwiler
|
||||
|
||||
@copyright (c) 2020 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
import json
|
||||
import numpy as np
|
||||
import os
|
||||
import periodictable as pt
|
||||
from pmsco.compat import open
|
||||
|
||||
|
||||
index_energy = np.zeros(0)
|
||||
index_number = np.zeros(0)
|
||||
index_term = []
|
||||
|
||||
|
||||
def load_data():
|
||||
data_path = os.path.join(os.path.dirname(__file__), "bindingenergy.json")
|
||||
with open(data_path) as fp:
|
||||
data = json.load(fp)
|
||||
return data
|
||||
|
||||
|
||||
def init(table, reload=False):
|
||||
if 'binding_energy' in table.properties and not reload:
|
||||
return
|
||||
table.properties.append('binding_energy')
|
||||
|
||||
pt.core.Element.binding_energy = {}
|
||||
pt.core.Element.binding_energy_units = "eV"
|
||||
|
||||
data = load_data()
|
||||
for el_key, el_data in data.items():
|
||||
try:
|
||||
el = table[int(el_key)]
|
||||
except ValueError:
|
||||
el = table.symbol(el_key)
|
||||
el.binding_energy = el_data
|
||||
|
||||
|
||||
def build_index():
|
||||
"""
|
||||
build an index for query_binding_energy().
|
||||
|
||||
the index is kept in global variables of the module.
|
||||
|
||||
@return None
|
||||
"""
|
||||
global index_energy
|
||||
global index_number
|
||||
global index_term
|
||||
|
||||
n = 0
|
||||
for element in pt.elements:
|
||||
n += len(element.binding_energy)
|
||||
|
||||
index_energy = np.zeros(n)
|
||||
index_number = np.zeros(n)
|
||||
index_term = []
|
||||
|
||||
for element in pt.elements:
|
||||
for term, energy in element.binding_energy.items():
|
||||
index_term.append(term)
|
||||
i = len(index_term) - 1
|
||||
index_energy[i] = energy
|
||||
index_number[i] = element.number
|
||||
|
||||
|
||||
def query_binding_energy(energy, tol=1.0):
|
||||
"""
|
||||
search the periodic table for a specific binding energy and return all matching terms.
|
||||
|
||||
@param energy: binding energy in eV.
|
||||
|
||||
@param tol: tolerance in eV.
|
||||
|
||||
@return: list of dictionaries containing element and term specification.
|
||||
the list is ordered arbitrarily.
|
||||
each dictionary contains the following keys:
|
||||
@arg 'number': element number
|
||||
@arg 'symbol': element symbol
|
||||
@arg 'term': spectroscopic term
|
||||
@arg 'energy': actual binding energy
|
||||
"""
|
||||
if len(index_energy) == 0:
|
||||
build_index()
|
||||
sel = np.abs(index_energy - energy) < tol
|
||||
idx = np.where(sel)
|
||||
result = []
|
||||
for i in idx[0]:
|
||||
el_num = int(index_number[i])
|
||||
d = {'number': el_num,
|
||||
'symbol': pt.elements[el_num].symbol,
|
||||
'term': index_term[i],
|
||||
'energy': index_energy[i]}
|
||||
result.append(d)
|
||||
|
||||
return result
|
||||
|
||||
|
||||
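A usage sketch for query_binding_energy(), e.g. to identify a peak near 84 eV (the matches are expected to include the Au 4f7/2 level at 84.0 eV, cf. the table above):

~~~~~~{.py}
from pmsco.elements.bindingenergy import query_binding_energy

for match in query_binding_energy(84.0, tol=0.5):
    print(match['number'], match['symbol'], match['term'], match['energy'])
~~~~~~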
def export_flat_text(f):
|
||||
"""
|
||||
export the binding energies to a flat general text file.
|
||||
|
||||
@param f: file path or open file object
|
||||
@return: None
|
||||
"""
|
||||
if hasattr(f, "write") and callable(f.write):
|
||||
f.write("number symbol term energy\n")
|
||||
for element in pt.elements:
|
||||
for term, energy in element.binding_energy.items():
|
||||
f.write(f"{element.number} {element.symbol} {term} {energy}\n")
|
||||
else:
|
||||
with open(f, "w") as fi:
|
||||
export_flat_text(fi)
|
BIN
pmsco/elements/cross-sections.dat
Normal file
Binary file not shown.
248
pmsco/elements/photoionization.py
Normal file
@ -0,0 +1,248 @@
|
||||
"""
|
||||
@package pmsco.elements.photoionization
|
||||
photoionization cross-sections of the elements
|
||||
|
||||
extends the element table of the `periodictable` package
|
||||
(https://periodictable.readthedocs.io/en/latest/index.html)
|
||||
by a table of photoionization cross-sections.
|
||||
|
||||
|
||||
the data is available from (https://vuo.elettra.eu/services/elements/)
|
||||
or (https://figshare.com/articles/dataset/Digitisation_of_Yeh_and_Lindau_Photoionisation_Cross_Section_Tabulated_Data/12389750).
|
||||
both sources are based on the original atomic data tables by Yeh and Lindau (1985).
|
||||
the Elettra data includes interpolation at finer steps,
|
||||
whereas the Kalha data contains only the original data points by Yeh and Lindau
|
||||
plus an additional point at 8 keV.
|
||||
the tables go up to 1500 eV photon energy and do not resolve spin-orbit splitting.
|
||||
|
||||
|
||||
usage
|
||||
-----
|
||||
|
||||
this module requires python 3.6, numpy and the periodictable package (https://pypi.python.org/pypi/periodictable).
|
||||
|
||||
~~~~~~{.py}
|
||||
import numpy as np
|
||||
import periodictable as pt
|
||||
import pmsco.elements.photoionization
|
||||
|
||||
# access the cross sections via any of periodictable's element interfaces as follows.
|
||||
# eph and cs are numpy arrays of identical shape that hold the photon energies and cross sections.
|
||||
eph, cs = pt.gold.photoionization.cross_section['4f']
|
||||
eph, cs = pt.elements.symbol('Au').photoionization.cross_section['4f']
|
||||
eph, cs = pt.elements.name('gold').photoionization.cross_section['4f']
|
||||
eph, cs = pt.elements[79].photoionization.cross_section['4f']
|
||||
|
||||
# interpolate for specific photon energy
|
||||
print(np.interp(photon_energy, eph, cs))
|
||||
~~~~~~
|
||||
|
||||
the data is loaded from the cross-sections.dat file which is a python-pickled data file.
|
||||
to switch between data sources, use one of the load functions defined here
|
||||
and dump the data to the cross-sections.dat file.
|
||||
|
||||
|
||||
@author Matthias Muntwiler
|
||||
|
||||
@copyright (c) 2020 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
import numpy as np
|
||||
from pathlib import Path
|
||||
import periodictable as pt
|
||||
import pickle
|
||||
import urllib.request
|
||||
import urllib.error
|
||||
from . import bindingenergy
|
||||
|
||||
|
||||
def load_kalha_data():
|
||||
"""
|
||||
load all cross-sections from csv-files by Kalha et al.
|
||||
|
||||
the files must be placed in the 'kalha' directory next to this file.
|
||||
|
||||
@return: cross-section data in a nested dictionary, cf. load_pickled_data().
|
||||
"""
|
||||
data = {}
|
||||
p = Path(Path(__file__).parent, "kalha")
|
||||
for entry in p.glob('*_*.csv'):
|
||||
if entry.is_file():
|
||||
try:
|
||||
element = int(entry.stem.split('_')[0])
|
||||
except ValueError:
|
||||
pass
|
||||
else:
|
||||
data[element] = load_kalha_file(entry)
|
||||
return data
|
||||
|
||||
|
||||
def load_kalha_file(path):
|
||||
"""
|
||||
load the cross-sections of an element from a csv-file by Kalha et al.
|
||||
|
||||
@param path: file path
|
||||
@return: (dict) dictionary of 'nl' terms.
|
||||
the data items are tuples (photon_energy, cross_sections) of 1-dimensional numpy arrays.
|
||||
"""
|
||||
a = np.genfromtxt(path, delimiter=',', names=True)
|
||||
b = ~np.isnan(a['Photon_Energy__eV'])
|
||||
a = a[b]
|
||||
eph = a['Photon_Energy__eV'].copy()
|
||||
data = {}
|
||||
for n in range(1, 8):
|
||||
for l in 'spdf':
|
||||
col = f"{n}{l}"
|
||||
try:
|
||||
data[col] = (eph, a[col].copy())
|
||||
except ValueError:
|
||||
pass
|
||||
return data
|
||||
|
||||
|
||||
def load_kalha_configuration(path):
|
||||
"""
|
||||
load the electron configuration from a csv-file by Kalha et al.
|
||||
|
||||
@param path: file path
|
||||
@return: (dict) dictionary of 'nl' terms mapping to number of electrons in the sub-shell.
|
||||
"""
|
||||
p = Path(path)
|
||||
subshells = []
|
||||
electrons = []
|
||||
config = {}
|
||||
with p.open() as f:
|
||||
for l in f.readlines():
|
||||
s = l.split(',')
|
||||
k_eph = "Photon Energy"
|
||||
k_el = "#electrons"
|
||||
if s[0][0:len(k_eph)] == k_eph:
|
||||
subshells = s[1:]
|
||||
elif s[0][0:len(k_el)] == k_el:
|
||||
electrons = s[1:]
|
||||
|
||||
for i, sh in enumerate(subshells):
|
||||
if sh:
|
||||
config[sh] = electrons[i]
|
||||
|
||||
return config
|
||||
|
||||
|
||||
def load_elettra_file(symbol, nl):
|
||||
"""
|
||||
download the cross sections of one level from the Elettra webelements web site.
|
||||
|
||||
@param symbol: (str) element symbol
|
||||
@param nl: (str) nl term, e.g. '2p' (no spin-orbit)
|
||||
@return: (photon_energy, cross_section) tuple of 1-dimensional numpy arrays.
|
||||
"""
|
||||
url = f"https://vuo.elettra.eu/services/elements/data/{symbol.lower()}{nl}.txt"
|
||||
try:
|
||||
data = urllib.request.urlopen(url)
|
||||
except urllib.error.HTTPError:
|
||||
eph = None
|
||||
cs = None
|
||||
else:
|
||||
a = np.genfromtxt(data)
|
||||
try:
|
||||
eph = a[:, 0]
|
||||
cs = a[:, 1]
|
||||
except IndexError:
|
||||
eph = None
|
||||
cs = None
|
||||
|
||||
return eph, cs


def load_elettra_data():
    """
    download the cross sections from the Elettra webelements web site.

    @return: cross-section data in a nested dictionary, cf. load_pickled_data().
    """
    data = {}
    for element in pt.elements:
        element_data = {}
        for nlj in element.binding_energy:
            nl = nlj[0:2]
            eb = element.binding_energy[nlj]
            if nl not in element_data and eb <= 2000:
                eph, cs = load_elettra_file(element.symbol, nl)
                if eph is not None and cs is not None:
                    element_data[nl] = (eph, cs)
        if len(element_data):
            data[element.symbol] = element_data

    return data


def save_pickled_data(path, data):
    """
    save a cross section data dictionary to a python-pickled file.

    @param path: file path
    @param data: cross-section data in a nested dictionary, cf. load_pickled_data().
    @return: None
    """
    with open(path, "wb") as f:
        pickle.dump(data, f)


def load_pickled_data(path):
    """
    load the cross section data from a python-pickled file.

    the file can be generated by the save_pickled_data() function.

    @param path: file path
    @return: cross-section data in a nested dictionary.
        the first-level keys are element symbols.
        the second-level keys are 'nl' terms (e.g. '2p').
        note that the Yeh and Lindau tables do not resolve spin-orbit splitting.
        the data items are (photon_energy, cross_sections) tuples
        of 1-dimensional numpy arrays holding the data table.
        cross section values are given in Mb.
    """
    with open(path, "rb") as f:
        data = pickle.load(f)
    return data


class Photoionization(object):
    def __init__(self):
        self.cross_section = {}
        self.cross_section_units = "Mb"


def init(table, reload=False):
    """
    load the cross section data into the periodic table.

    this function is called by the periodictable package to load the data on demand.

    @param table: periodic table instance to be extended.
    @param reload: (bool) reload the data even if they have been loaded before.
    @return: None
    """
    if 'photoionization' in table.properties and not reload:
        return
    table.properties.append('photoionization')

    # default value
    pt.core.Element.photoionization = Photoionization()

    p = Path(Path(__file__).parent, "cross-sections.dat")
    data = load_pickled_data(p)
    for el_key, el_data in data.items():
        try:
            el = table[int(el_key)]
        except ValueError:
            el = table.symbol(el_key)
        pi = Photoionization()
        pi.cross_section = el_data
        pi.cross_section_units = "Mb"
        el.photoionization = pi
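
# usage sketch (not part of the original module): regenerate the pickled data
# file from one of the sources above. load_kalha_data() keys elements by atomic
# number, load_elettra_data() by symbol; init() accepts both kinds of keys.
#
#   data = load_elettra_data()
#   save_pickled_data(Path(__file__).parent / "cross-sections.dat", data)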

198 pmsco/elements/spectrum.py Normal file
@@ -0,0 +1,198 @@
"""
@package pmsco.elements.spectrum
photoelectron spectrum simulator

this module calculates the basic structure of a photoelectron spectrum.
it calculates positions and approximate amplitude of elastic peaks
based on photon energy, binding energy, photoionization cross section, and stoichiometry.
escape depth, photon flux, analyser transmission are not accounted for.


usage
-----

this module requires python 3.6, numpy, matplotlib and
the periodictable package (https://pypi.python.org/pypi/periodictable).

~~~~~~{.py}
import numpy as np
import periodictable as pt
import pmsco.elements.spectrum as spec

# for working with the data
labels, energy, intensity = spec.build_spectrum(800., {"Ti": 1, "O": 2})

# for plotting
spec.plot_spectrum(800., {"Ti": 1, "O": 2})
~~~~~~


@author Matthias Muntwiler

@copyright (c) 2020 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from matplotlib import pyplot as plt
import numpy as np
import periodictable as pt
from . import bindingenergy
from . import photoionization


def get_element(number_or_symbol):
    """
    return the given Element object of the periodic table.

    @param number_or_symbol: atomic number (int) or chemical symbol (str).
    @return: Element object.
    """
    try:
        el = pt.elements[number_or_symbol]
    except KeyError:
        el = pt.elements.symbol(number_or_symbol)
    return el


def get_binding_energy(photon_energy, element, nlj):
    """
    look up the binding energy of a core level and check whether it is smaller than the photon energy.

    @param photon_energy: photon energy in eV.
    @param element: Element object of the periodic table.
    @param nlj: (str) spectroscopic term, e.g. '4f7/2'.
    @return: (float) binding energy or numpy.nan.
    """
    try:
        eb = element.binding_energy[nlj]
    except KeyError:
        return np.nan
    if eb < photon_energy:
        return eb
    else:
        return np.nan


def get_cross_section(photon_energy, element, nlj):
    """
    look up the photoionization cross section.

    since the Yeh/Lindau tables do not resolve the spin-orbit splitting,
    this function applies the normal relative weights of a full sub-shell.

    the result is a linear interpolation between tabulated values.

    @param photon_energy: photon energy in eV.
    @param element: Element object of the periodic table.
    @param nlj: (str) spectroscopic term, e.g. '4f7/2'.
    @return: (float) cross section in Mb.
    """
    nl = nlj[0:2]
    try:
        pet, cst = element.photoionization.cross_section[nl]
    except KeyError:
        return np.nan

    # weights of spin-orbit peaks
    d_wso = {"p1/2": 1./3.,
             "p3/2": 2./3.,
             "d3/2": 2./5.,
             "d5/2": 3./5.,
             "f5/2": 3./7.,
             "f7/2": 4./7.}
    wso = d_wso.get(nlj[1:], 1.)
    cst = cst * wso

    # todo: consider spline
    return np.interp(photon_energy, pet, cst)
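
# worked example (not part of the original module): the tables store the total
# sub-shell cross section, so a spin-orbit component is weighted by its
# multiplicity (2j+1) relative to the filled sub-shell, e.g. 2p3/2 -> 4/6 = 2/3:
#
#   el = get_element("Ti")
#   cs_2p32 = get_cross_section(800., el, "2p3/2")   # 2/3 of the tabulated 2p value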


def build_spectrum(photon_energy, elements, binding_energy=False, work_function=4.5):
    """
    calculate the positions and amplitudes of core-level photoemission lines.

    the function looks up the binding energies and cross sections of all photoemission lines in the energy range
    given by the photon energy and returns an array of expected spectral lines.

    @param photon_energy: (numeric) photon energy in eV.
    @param elements: list or dictionary of elements.
        elements are identified by their atomic number (int) or chemical symbol (str).
        if a dictionary is given, the (float) values are stoichiometric weights of the elements.
    @param binding_energy: (bool) return binding energies (True) rather than kinetic energies (False, default).
    @param work_function: (float) work function of the instrument in eV.
    @return: tuple (labels, positions, intensities) of 1-dimensional numpy arrays representing the spectrum.
        labels are in the format {Symbol}{n}{l}{j}.
    """
    ekin = []
    ebind = []
    intens = []
    labels = []

    for element in elements:
        el = get_element(element)
        for n in range(1, 8):
            for l in "spdf":
                for j in ['', '1/2', '3/2', '5/2', '7/2']:
                    nlj = f"{n}{l}{j}"
                    eb = get_binding_energy(photon_energy, el, nlj)
                    cs = get_cross_section(photon_energy, el, nlj)
                    try:
                        cs = cs * elements[element]
                    except (KeyError, TypeError):
                        pass
                    if not np.isnan(eb) and not np.isnan(cs):
                        ekin.append(photon_energy - eb - work_function)
                        ebind.append(eb)
                        intens.append(cs)
                        labels.append(f"{el.symbol}{nlj}")

    ebind = np.array(ebind)
    ekin = np.array(ekin)
    intens = np.array(intens)
    labels = np.array(labels)

    if binding_energy:
        return labels, ebind, intens
    else:
        return labels, ekin, intens


def plot_spectrum(photon_energy, elements, binding_energy=False, work_function=4.5, show_labels=True):
    """
    plot a simple spectrum representation of a material.

    the function looks up the binding energies and cross sections of all photoemission lines in the energy range
    given by the photon energy and plots them as a stick spectrum.

    the spectrum is plotted using matplotlib.pyplot.stem.

    @param photon_energy: (numeric) photon energy in eV.
    @param elements: list or dictionary of elements.
        elements are identified by their atomic number (int) or chemical symbol (str).
        if a dictionary is given, the (float) values are stoichiometric weights of the elements.
    @param binding_energy: (bool) plot binding energies (True) rather than kinetic energies (False, default).
    @param work_function: (float) work function of the instrument in eV.
    @param show_labels: (bool) show peak labels (True, default) or not (False).
    @return: (figure, axes)
    """
    labels, energy, intensity = build_spectrum(photon_energy, elements, binding_energy=binding_energy,
                                               work_function=work_function)

    fig, ax = plt.subplots()
    ax.stem(energy, intensity, basefmt=' ', use_line_collection=True)
    if show_labels:
        for sxy in zip(labels, energy, intensity):
            ax.annotate(sxy[0], xy=(sxy[1], sxy[2]), textcoords='data')

    ax.grid()
    if binding_energy:
        ax.set_xlabel('binding energy')
    else:
        ax.set_xlabel('kinetic energy')
    ax.set_ylabel('intensity')
    ax.set_title(elements)
    return fig, ax

354 pmsco/files.py
@@ -1,16 +1,19 @@
"""
@package pmsco.files
manage files produced by pmsco.
manage the lifetime of files produced by pmsco.

@author Matthias Muntwiler

@copyright (c) 2016 by Paul Scherrer Institut @n
@copyright (c) 2016-18 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import logging
import mpi4py

@@ -24,28 +27,28 @@ logger = logging.getLogger(__name__)
#
# each string of this set marks a category of files.
#
# @arg @c 'input' : raw input files for calculator, including cluster and phase files in custom format
# @arg @c 'output' : raw output files from calculator
# @arg @c 'phase' : phase files in portable format for report
# @arg @c 'cluster' : cluster files in portable XYZ format for report
# @arg @c 'log' : log files
# @arg @c 'debug' : debug files
# @arg @c 'model': output files in ETPAI format: complete simulation (a_-1_-1_-1_-1)
# @arg @c 'scan' : output files in ETPAI format: scan (a_b_-1_-1_-1)
# @arg @c 'symmetry' : output files in ETPAI format: symmetry (a_b_c_-1_-1)
# @arg @c 'emitter' : output files in ETPAI format: emitter (a_b_c_d_-1)
# @arg @c 'region' : output files in ETPAI format: region (a_b_c_d_e)
# @arg @c 'report': final report of results
# @arg @c 'population': final state of particle population
# @arg @c 'rfac': files related to models which give bad r-factors (dynamic category, see below).
# @arg 'input' : raw input files for calculator, including cluster and atomic files in custom format
# @arg 'output' : raw output files from calculator
# @arg 'atomic' : atomic scattering (phase, emission) files in portable format
# @arg 'cluster' : cluster files in portable XYZ format for report
# @arg 'log' : log files
# @arg 'debug' : debug files
# @arg 'model': output files in ETPAI format: complete simulation (a_-1_-1_-1_-1)
# @arg 'scan' : output files in ETPAI format: scan (a_b_-1_-1_-1)
# @arg 'domain' : output files in ETPAI format: domain (a_b_c_-1_-1)
# @arg 'emitter' : output files in ETPAI format: emitter (a_b_c_d_-1)
# @arg 'region' : output files in ETPAI format: region (a_b_c_d_e)
# @arg 'report': final report of results
# @arg 'population': final state of particle population
# @arg 'rfac': files related to models which give bad r-factors (dynamic category, see below).
#
# @note @c 'rfac' is a dynamic category not connected to a particular file or content type.
# no file should be marked @c 'rfac'.
# the string is used only to specify whether bad models should be deleted or not.
# if so, all files related to bad models are deleted, regardless of their static category.
#
FILE_CATEGORIES = {'cluster', 'phase', 'input', 'output',
                   'report', 'region', 'emitter', 'scan', 'symmetry', 'model',
FILE_CATEGORIES = {'cluster', 'atomic', 'input', 'output',
                   'report', 'region', 'emitter', 'scan', 'domain', 'model',
                   'log', 'debug', 'population', 'rfac'}

## @var FILE_CATEGORIES_TO_KEEP
@@ -53,7 +56,7 @@ FILE_CATEGORIES = {'cluster', 'phase', 'input', 'output',
#
# this constant defines the default set of file categories that are kept after the calculation.
#
FILE_CATEGORIES_TO_KEEP = {'cluster', 'model', 'report', 'population'}
FILE_CATEGORIES_TO_KEEP = {'cluster', 'model', 'scan', 'report', 'population'}

## @var FILE_CATEGORIES_TO_DELETE
# categories of files to be deleted.
@@ -67,13 +70,17 @@ FILE_CATEGORIES_TO_DELETE = FILE_CATEGORIES - FILE_CATEGORIES_TO_KEEP

class FileTracker(object):
    """
    organize output files of calculations.
    manage the lifetime of files produced by the calculations.

    the file manager stores references to data files generated during calculations
    and cleans up unused files according to a range of filter criteria.

    this class identifies files by _file name_.
    file names must therefore be unique over the whole calculation process.
    it is possible to specify a full path that is used for communication with the operating system.
    """

    ## @var files_to_delete (set)
    ## @var categories_to_delete (set)
    # categories of generated files that should be deleted after the calculation.
    #
    # each string of this set marks a category of files to be deleted.
@@ -93,96 +100,119 @@ class FileTracker(object):
    #
    # the default is 10.

    ## @var _last_id (int)
    # last used file identification number (incremental)

    ## @var _path_by_id (dict)
    # key = file id, value = file path

    ## @var _model_by_id (dict)
    # key = file id, value = model number

    ## @var _category_by_id (dict)
    # key = file id, value = category (str)
    ## @var _file_model (dict)
    # key = file name, value = model number

    ## @var _file_category (dict)
    # key = file name, value = category (str)

    ## @var _file_path (dict)
    # key = file name, value = absolute file path (str)

    ## @var _rfac_by_model (dict)
    # key = model number, value = file id
    # key = model number, value = R-factor

    ## @var _complete_by_model (dict)
    # key = model number, value (boolean) = all calculations complete, files can be deleted
    ## @var _complete_models (set)
    # this set contains the model numbers of the models that have finished all calculations.
    # files of these models can be considered for clean up.

    def __init__(self):
        self._id_by_path = {}
        self._path_by_id = {}
        self._model_by_id = {}
        self._category_by_id = {}
        self._file_model = {}
        self._file_category = {}
        self._file_path = {}
        self._rfac_by_model = {}
        self._complete_by_model = {}
        self._last_id = 0
        self._complete_models = set([])
        self.categories_to_delete = FILE_CATEGORIES_TO_DELETE
        self.keep_rfac = 10

    def add_file(self, path, model, category='default'):
    def get_file_count(self):
        """
        return the number of tracked files.

        @return: (int) number of tracked files.
        """
        return len(self._file_path)

    def get_complete_models_count(self):
        """
        return the number of complete models.

        @return: (int) number of complete models.
        """
        return len(self._complete_models)

    def add_file(self, name, model, category='default', path=''):
        """
        add a new data file to the list.

        @param path: (str) system path of the file relative to the working directory.
        @param name: (str) unique identification of the file.
            this can be the file name in the file system if file names are unique without path specification.
            the name must be spelled identically
            whenever the same file is referenced in a call to another method of this class.
            the empty string is ignored.

        @param model: (int) model number

        @param category: (str) file category, e.g. 'output', etc.

        @param path: (str) file system path of the file.
            the file system path is used for communication with the operating system when the file is deleted.

            by default, the path is the name argument expanded to a full path relative to the current working directory.
            the path is expanded during the call of this method and will not change when the working directory changes.

        @return: None
        """
        self._last_id += 1
        _id = self._last_id
        self._id_by_path[path] = _id
        self._path_by_id[_id] = path
        self._model_by_id[_id] = model
        self._category_by_id[_id] = category
        if name:
            self._file_model[name] = model
            self._file_category[name] = category
            self._file_path[name] = path if path else os.path.abspath(name)

    def rename_file(self, old_path, new_path):
    def rename_file(self, old_name, new_name, new_path=''):
        """
        rename a data file in the list.

        the method does not rename the file in the file system.

        @param old_path: must match an existing file path identically.
            if old_path is not in the list, the method does nothing.
        @param old_name: name used in the original add_file() call.
            if it is not in the list, the method does nothing.

        @param new_path: new path.
        @param new_name: new name of the file, see add_file().
            if the file is already in the list, its model and category is overwritten by the values of the old file.

        @param new_path: new file system path of the file, see add_file().
            by default, the path is the name argument expanded to a full path relative to the current working directory.

        @return: None
        """
        try:
            _id = self._id_by_path[old_path]
            model = self._file_model[old_name]
            cat = self._file_category[old_name]
        except KeyError:
            pass
        else:
            del self._id_by_path[old_path]
            self._id_by_path[new_path] = _id
            self._path_by_id[_id] = new_path
            del self._file_model[old_name]
            del self._file_category[old_name]
            del self._file_path[old_name]
            self.add_file(new_name, model, cat, new_path)

    def remove_file(self, path):
    def remove_file(self, name):
        """
        remove a file from the list.

        the method does not delete the file from the file system.

        @param path: must match an existing file path identically.
            if path is not in the list, the method does nothing.
        @param name: must match an existing file name identically.
            if the name is not found in the list, the method does nothing.

        @return: None
        """
        try:
            _id = self._id_by_path[path]
            del self._file_model[name]
            del self._file_category[name]
            del self._file_path[name]
        except KeyError:
            pass
        else:
            del self._id_by_path[path]
            del self._path_by_id[_id]
            del self._model_by_id[_id]
            del self._category_by_id[_id]

    def update_model_rfac(self, model, rfac):
        """
@@ -207,43 +237,57 @@ class FileTracker(object):
        @param complete: (bool) True if all calculations of the model are complete (files can be deleted).
        @return: None
        """
        self._complete_by_model[model] = complete
        if complete:
            self._complete_models.add(model)
        else:
            self._complete_models.discard(model)

    def delete_files(self, categories=None, keep_rfac=0):
    def delete_files(self, categories=None, incomplete_models=False):
        """
        delete the files matching the list of categories.
        delete all files matching a set of categories.

        this function deletes all files that are tagged with one of the given categories.
        tags are set by the code sections that create the files.
        for a list of common categories, see FILE_CATEGORIES.
        the categories can be given as an argument or taken from the categories_to_delete property.

        files are deleted regardless of R-factor.
        be sure to specify only categories that you don't need in the output at all.

        by default, only files of complete models (cf. set_model_complete()) are deleted
        to avoid interference with running calculations.
        to clean up after calculations, the incomplete_models argument can override this.

        @note this method does not act on the special 'rfac' category (see delete_bad_rfac()).

        @param categories: set of file categories to delete.
            may include 'rfac' if bad r-factors should be deleted additionally (regardless of static category).
            defaults to self.categories_to_delete.
            if the argument is None, it defaults to the categories_to_delete property.

        @param keep_rfac: number of best models to keep if bad r-factors are to be deleted.
            the effective keep number is the greater of self.keep_rfac and this argument.
        @param incomplete_models: (bool) delete files of incomplete models as well.
            by default (False), incomplete models are not deleted.

        @return: None
        """
        if categories is None:
            categories = self.categories_to_delete
        for cat in categories:
            self.delete_category(cat)
        if 'rfac' in categories:
            self.delete_bad_rfac(keep=keep_rfac)
            self.delete_category(cat, incomplete_models=incomplete_models)

    def delete_bad_rfac(self, keep=0, force_delete=False):
        """
        delete the files of all models except a specified number of good models.
        delete all files of all models except for a specified number of best ranking models.

        the method first determines which models to keep.
        models with R factor values of 0.0, without a specified R-factor, and
        the specified number of best ranking non-zero models are kept.
        the files belonging to the keeper models are kept, all others are deleted,
        regardless of category.
        files of incomplete models are also kept.
        in addition, incomplete models, models with R factor = 0.0,
        and those without a specified R-factor are kept.
        all other files are deleted.
        the method does not consider the file category.

        the files are deleted from the list and the file system.

        files are deleted only if 'rfac' is specified in self.categories_to_delete
        or if force_delete is set to True.
        the method executes only if 'rfac' is specified in self.categories_to_delete
        or if force_delete is True.
        otherwise the method does nothing.

        @param keep: number of files to keep.
@@ -252,8 +296,6 @@ class FileTracker(object):
        @param force_delete: delete the bad files even if 'rfac' is not selected in categories_to_delete.

        @return: None

        @todo should clean up rfac and model dictionaries from time to time.
        """
        if force_delete or 'rfac' in self.categories_to_delete:
            keep = max(keep, self.keep_rfac)
@@ -263,62 +305,132 @@ class FileTracker(object):
            except IndexError:
                return

            complete_models = {_model for (_model, _complete) in self._complete_by_model.iteritems() if _complete}
            del_models = {_model for (_model, _rfac) in self._rfac_by_model.iteritems() if _rfac > rfac_split}
            del_models &= complete_models
            del_ids = {_id for (_id, _model) in self._model_by_id.iteritems() if _model in del_models}
            for _id in del_ids:
                self.delete_file(_id)
            keep_models = {model for (model, rfac) in self._rfac_by_model.items() if 0.0 <= rfac <= rfac_split}
            del_models = self._complete_models - keep_models
            del_names = {name for (name, model) in self._file_model.items() if model in del_models}
            for name in del_names:
                self.delete_file(name)

    def delete_category(self, category):
    def delete_models(self, keep=None, delete=None):
        """
        delete all files by model.

        this involves the following steps:
        1. determine a list of complete models
           (incomplete models are still being processed and must not be deleted).
        2. intersect with the _delete_ list if specified.
        3. subtract the _keep_ list if specified.

        if neither the _keep_ nor the _delete_ list is specified,
        or if the steps above resolve to the _complete_ list
        the method considers it as an error and does nothing.

        @param keep: (sequence) model numbers to keep, i.e., delete all others.

        @param delete: (sequence) model numbers to delete.

        @return (int) number of models deleted.
        """
        del_models = self._complete_models.copy()
        # set() conversion lets callers pass any sequence type, as documented
        if delete:
            del_models &= set(delete)
        if keep:
            del_models -= set(keep)
        if not del_models or del_models == self._complete_models:
            return 0

        del_names = {name for (name, model) in self._file_model.items() if model in del_models}
        for name in del_names:
            self.delete_file(name)

        return len(del_models)

    def delete_category(self, category, incomplete_models=False):
        """
        delete all files of a specified category from the list and the file system.

        only files of complete models (cf. set_model_complete()) are deleted, but regardless of R-factor.
        this function deletes all files that are tagged with the given category.
        tags are set by the code sections that create the files.
        for a list of common categories, see FILE_CATEGORIES.

        files are deleted regardless of R-factor.
        be sure to specify only categories that you don't need in the output at all.

        by default, only files of complete models (cf. set_model_complete()) are deleted
        to avoid interference with running calculations.
        to clean up after calculations, the incomplete_models argument can override this.

        @param category: (str) category.
            should be one of FILE_CATEGORIES. otherwise, the function has no effect.

        @param incomplete_models: (bool) delete files of incomplete models as well.
            by default (False), incomplete models are not deleted.

        @return: None
        """
        complete_models = {_model for (_model, _complete) in self._complete_by_model.iteritems() if _complete}
        del_ids = {_id for (_id, cat) in self._category_by_id.iteritems() if cat == category}
        del_ids &= {_id for (_id, _model) in self._model_by_id.iteritems() if _model in complete_models}
        for _id in del_ids:
            self.delete_file(_id)
        del_names = {name for (name, cat) in self._file_category.items() if cat == category}
        if not incomplete_models:
            del_names &= {name for (name, model) in self._file_model.items() if model in self._complete_models}
        for name in del_names:
            self.delete_file(name)

    def delete_file(self, _id):
    def delete_file(self, name):
        """
        delete a specified file from the list and the file system.

        the file is identified by ID number.
        this method is unconditional. it does not consider category, completeness, nor R-factor.

        @param _id: (int) ID number of the file to delete.
        the method catches errors during file deletion and prints warnings to the logger.

        @param name: must match an existing file name identically.
            if it is not in the list, the method does nothing.
            the method uses the associated path declared in add_file() to delete the file.

        @return: None
        """
        path = self._path_by_id[_id]
        cat = self._category_by_id[_id]
        model = self._model_by_id[_id]
        del self._id_by_path[path]
        del self._path_by_id[_id]
        del self._model_by_id[_id]
        del self._category_by_id[_id]
        try:
            self._os_delete_file(path)
        except OSError:
            logger.warning("error deleting file {0}".format(path))
            cat = self._file_category[name]
            model = self._file_model[name]
            path = self._file_path[name]
        except KeyError:
            logger.warning("tried to delete untracked file {0}".format(name))
        else:
            logger.debug("delete file {0} ({1}, model {2})".format(path, cat, model))
            del self._file_model[name]
            del self._file_category[name]
            del self._file_path[name]
            try:
                os.remove(path)
            except OSError:
                logger.warning("file system error deleting file {0}".format(path))
            else:
                logger.debug("delete file {0} ({1}, model {2})".format(path, cat, model))

    @staticmethod
    def _os_delete_file(path):
        """
        have the operating system delete a file path.

        this function is separate so that we can mock it in unit tests.
def list_files_other_models(prefix, models):
    """
    list input/output files except those of the given models.

        @param path: OS path
        @return: None
        """
        os.remove(path)
    this can be used to clean up all files except those belonging to the given models.

    to delete the listed files:

        for f in files:
            os.remove(f)

    @param prefix: file name prefix up to the first underscore.
        only files starting with this prefix are listed.

    @param models: sequence or set of model numbers that should not be listed.

    @return: set of file names
    """
    file_names = set([])
    for entry in os.scandir():
        if entry.is_file():
            elements = entry.name.split('_')
            try:
                if len(elements) == 6 and elements[0] == prefix and int(elements[1]) not in models:
                    file_names.add(entry.name)
            except (IndexError, ValueError):
                pass
    return file_names
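
A minimal usage sketch of the name-based FileTracker interface introduced above
(the file name is hypothetical; the default category sets are assumed):

~~~~~~{.py}
from pmsco.files import FileTracker

tracker = FileTracker()
tracker.add_file("twoatom_1_-1_-1_-1_-1.etpai", model=1, category='output')
tracker.update_model_rfac(1, 0.35)
tracker.set_model_complete(1, True)
# 'output' is not in FILE_CATEGORIES_TO_KEEP, so this deletes the file
tracker.delete_files()
~~~~~~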

0 pmsco/graphics/__init__.py Normal file

198 pmsco/graphics/rfactor.py Normal file
@@ -0,0 +1,198 @@
"""
@package pmsco.graphics.rfactor
graphics rendering module for r-factor optimization results.

this module is under development.
interface and implementation are subject to change.

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

@copyright (c) 2018 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import logging
import math
import numpy as np
from pmsco.helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)

try:
    from matplotlib.figure import Figure
    from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
except ImportError:
    Figure = None
    FigureCanvas = None
    logger.warning("error importing matplotlib. graphics rendering disabled.")


def render_param_rfac(filename, data, param_name, summary=None, canvas=None):
    """
    render an r-factor versus one model parameter graph.

    the default file format is PNG.

    this function requires the matplotlib module.
    if it is not available, the function raises an error.

    @param filename: path and name of the results file.
        this is used to derive the output file path by adding the parameter name and
        the extension of the graphics file format.

    @param data: numpy-structured array of results (one-dimensional).

        the field names identify the model parameters and optimization control values.
        model parameters can have any name not including a leading underscore and are evaluated as is.
        the names of the special optimization control values begin with an underscore.
        of these, at least _rfac must be provided.

    @param param_name: name of the model parameter to display.
        this must correspond to a field name of the data array.

    @param summary: (dict) the dictionary returned by @ref evaluate_results.
        this is used to mark the optimum value and the error limits.
        if None, these values are not marked in the plot.

    @param canvas: a FigureCanvas class reference from a matplotlib backend.
        if None, the default FigureCanvasAgg is used which produces a bitmap file in PNG format.

    @return (str) path and name of the generated graphics file.
        empty string if an error occurred.

    @raise TypeError if matplotlib is not available.
    """
    if canvas is None:
        canvas = FigureCanvas
    fig = Figure()
    canvas(fig)

    ax = fig.add_subplot(111)
    ax.scatter(data[param_name], data['_rfac'], c='b', marker='o', s=4.0)

    if summary is not None:
        xval = summary['val'][param_name]
        ymin = summary['vmin']['_rfac']
        ymax = summary['vmax']['_rfac']
        ax.plot((xval, xval), (ymin, ymax), ':k')
        xmin = summary['vmin'][param_name]
        xmax = summary['vmax'][param_name]
        varr = summary['rmin'] + summary['rvar']
        ax.plot((xmin, xmax), (varr, varr), ':k')

    ax.grid(True)
    ax.set_xlabel(param_name)
    ax.set_ylabel('R-factor')

    out_filename = "{0}.{1}.{2}".format(filename, param_name, canvas.get_default_filetype())
    fig.savefig(out_filename)
    return out_filename


def evaluate_results(data, features=50.):
    """
    determine the minimum r-factor, best parameter values and error estimates from optimization results.

    @param data: numpy-structured array of results (one-dimensional).

        the field names identify the model parameters and optimization control values.
        model parameters can have any name not including a leading underscore and are evaluated as is.
        the names of the special optimization control values begin with an underscore.
        of these, at least _rfac must be provided.

    @param features: number of independent features (pieces of information) in the data.
        this quantity can be approximated as the scan range divided by the average width of a feature
        which includes an intrinsic component and the instrumental resolution.
        see Booth et al., Surf. Sci. 387 (1997), 152 for energy scans, and
        Muntwiler et al., Surf. Sci. 472 (2001), 125 for angle scans.
        the default value of 50 is a typical value.

    @return dictionary of evaluation results.

        the dictionary contains scalars and structured arrays as follows.
        the structured arrays have the same data type as the input data and contain exactly one element.

    @arg rmin: (scalar) minimum r-factor.
    @arg rvar: (scalar) one-sigma variation of r-factor.
    @arg imin: (scalar) array index where the minimum is located.
    @arg val: (structured array) estimates of parameter values (parameter value at rmin).
    @arg sig: (structured array) one-sigma error of estimated values.
    @arg vmin: (structured array) minimum value of the parameter.
    @arg vmax: (structured array) maximum value of the parameter.
    """
    imin = data['_rfac'].argmin()
    rmin = data['_rfac'][imin]
    rvar = rmin * math.sqrt(2. / float(features))

    val = np.zeros(1, dtype=data.dtype)
    sig = np.zeros(1, dtype=data.dtype)
    vmin = np.zeros(1, dtype=data.dtype)
    vmax = np.zeros(1, dtype=data.dtype)
    sel = data['_rfac'] <= rmin + rvar
    for name in data.dtype.names:
        val[name] = data[name][imin]
        vmin[name] = data[name].min()
        vmax[name] = data[name].max()
        if name[0] != '_':
            sig[name] = (data[name][sel].max() - data[name][sel].min()) / 2.

    results = {'rmin': rmin, 'rvar': rvar, 'imin': imin, 'val': val, 'sig': sig, 'vmin': vmin, 'vmax': vmax}
    return results
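
# worked example (not part of the original module): with rmin = 0.2 and the
# default features = 50, rvar = 0.2 * sqrt(2/50) = 0.04; all models with an
# r-factor up to rmin + rvar = 0.24 enter the one-sigma error estimation window.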


def render_results(results_file, data=None):
    """
    produce a graphics file from optimization results.

    the results can be passed in a file name or numpy array (see parameter section).

    the default file format is PNG.

    this function requires the matplotlib module.
    if it is not available, the function will log a warning message and return gracefully.

    @param results_file: path and name of the result file.

        result files are the ones written by swarm.SwarmPopulation.save_array, for instance.
        the file contains columns of model parameters and optimization control values.
        the first row must contain column names that identify the quantity.
        model parameters can have any name not including a leading underscore and are evaluated as is.
        the names of the special optimization control values begin with an underscore.
        of these, at least _rfac must be provided.

        if the optional data parameter is present,
        this is used only to derive the output file path by adding the extension of the graphics file format.

    @param data: numpy-structured array of results (one-dimensional).

        the field names identify the model parameters and optimization control values.
        model parameters can have any name not including a leading underscore and are evaluated as is.
        the names of the special optimization control values begin with an underscore.
        of these, at least _rfac must be provided.

        if this argument is omitted, the data is loaded from the file referenced by the filename argument.

    @return (list of str) path names of the generated graphics files.
        empty if an error occurred.
        the most common exceptions are caught and add a warning in the log file.
    """

    if data is None:
        data = np.atleast_1d(np.genfromtxt(results_file, names=True))

    summary = evaluate_results(data)

    out_files = []
    try:
        for name in data.dtype.names:
            if name[0] != '_' and summary['sig'][name] > 0.:
                graph_file = render_param_rfac(results_file, data, name, summary)
                out_files.append(graph_file)
    except (TypeError, AttributeError, IOError) as e:
        logger.warning(BMsg("error rendering results file {file}: {msg}", file=results_file, msg=str(e)))

    return out_files
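
# usage sketch (not part of the original module; file name hypothetical):
#
#   files = render_results("myproject.swarm.dat")
#   # one graphics file per model parameter that has a non-zero error estimate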

288 pmsco/graphics/scan.py Normal file
@@ -0,0 +1,288 @@
"""
@package pmsco.graphics.scan
graphics rendering module for energy and angle scans.

this module is experimental.
interface and implementation are subject to change.

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

@copyright (c) 2018 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import logging
import math
import numpy as np
import pmsco.data as md
from pmsco.helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)

try:
    from matplotlib.figure import Figure
    from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
    # from matplotlib.backends.backend_pdf import FigureCanvasPdf
    # from matplotlib.backends.backend_svg import FigureCanvasSVG
except ImportError:
    Figure = None
    FigureCanvas = None
    logger.warning("error importing matplotlib. graphics rendering disabled.")


def render_1d_scan(filename, data, scan_mode, canvas=None, is_modf=False, ref_data=None):
    """
    produce a graphics file from a one-dimensional scan file.

    the default file format is PNG.

    this function requires the matplotlib module.
    if it is not available, the function raises an error.

    @param filename: path and name of the scan file.
        this is used to derive the output file path by adding the extension of the graphics file format.

    @param data: numpy-structured array of EI, ETPI or ETPAI data.

    @param scan_mode: list containing the field name of the scanning axis of the data array.
        it must contain one element exactly.

    @param canvas: a FigureCanvas class reference from a matplotlib backend.
        if None, the default FigureCanvasAgg is used which produces a bitmap file in PNG format.

    @param is_modf: whether data contains a modulation function (True) or intensity (False, default).
        this parameter is used to set axis labels.

    @param ref_data: numpy-structured array of EI, ETPI or ETPAI data.
        this is reference data (e.g. experimental data) that should be plotted with the main dataset.
        both datasets will be plotted on the same axis and should have similar data range.

    @return (str) path and name of the generated graphics file.
        empty string if an error occurred.

    @raise TypeError if matplotlib is not available.
    """
    if canvas is None:
        canvas = FigureCanvas
    fig = Figure()
    canvas(fig)

    ax = fig.add_subplot(111)
    if ref_data is not None:
        ax.plot(ref_data[scan_mode[0]], ref_data['i'], 'k.')
    ax.plot(data[scan_mode[0]], data['i'])

    ax.set_xlabel(scan_mode[0])
    if is_modf:
        ax.set_ylabel('chi')
    else:
        ax.set_ylabel('int')

    out_filename = "{0}.{1}".format(filename, canvas.get_default_filetype())
    fig.savefig(out_filename)
    return out_filename


def render_ea_scan(filename, data, scan_mode, canvas=None, is_modf=False):
    """
    produce a graphics file from an energy-angle scan file.

    the default file format is PNG.

    this function requires the matplotlib module.
    if it is not available, the function raises an error.

    @param filename: path and name of the scan file.
        this is used to derive the output file path by adding the extension of the graphics file format.
    @param data: numpy-structured array of ETPI or ETPAI data.
    @param scan_mode: list containing the field names of the scanning axes of the data array,
        i.e. 'e' and one of the angle axes.
    @param canvas: a FigureCanvas class reference from a matplotlib backend.
        if None, the default FigureCanvasAgg is used which produces a bitmap file in PNG format.
    @param is_modf: whether data contains a modulation function (True) or intensity (False, default).
        this parameter is used to select a suitable color scale.

    @return (str) path and name of the generated graphics file.
        empty string if an error occurred.

    @raise TypeError if matplotlib is not available.
    """
    (data2d, axis0, axis1) = md.reshape_2d(data, scan_mode, 'i')

    if canvas is None:
        canvas = FigureCanvas
    fig = Figure()
    canvas(fig)

    ax = fig.add_subplot(111)
    im = ax.imshow(data2d, origin='lower', aspect='auto', interpolation='none')
    im.set_extent((axis1[0], axis1[-1], axis0[0], axis0[-1]))

    ax.set_xlabel(scan_mode[1])
    ax.set_ylabel(scan_mode[0])

    cb = fig.colorbar(im, shrink=0.4, pad=0.1)

    dlo = np.nanpercentile(data['i'], 1)
    dhi = np.nanpercentile(data['i'], 99)
    if is_modf:
        im.set_cmap("RdBu_r")
        dhi = max(abs(dlo), abs(dhi))
        dlo = -dhi
        im.set_clim((dlo, dhi))
        try:
            # requires matplotlib 2.1.0
            ti = cb.get_ticks()
            ti = [min(ti), 0., max(ti)]
            cb.set_ticks(ti)
        except AttributeError:
            pass
    else:
        im.set_cmap("magma")
        im.set_clim((dlo, dhi))

    out_filename = "{0}.{1}".format(filename, canvas.get_default_filetype())
    fig.savefig(out_filename)
    return out_filename


def render_tp_scan(filename, data, canvas=None, is_modf=False):
    """
    produce a graphics file from a theta-phi (hemisphere) scan file.

    the default file format is PNG.

    this function requires the matplotlib module.
    if it is not available, the function raises an error.

    @param filename: path and name of the scan file.
        this is used to derive the output file path by adding the extension of the graphics file format.
    @param data: numpy-structured array of TPI data.
        the T and P columns describe a full or partial hemispherical scan.
        the I column contains the intensity or modulation values.
        other columns are ignored.
    @param canvas: a FigureCanvas class reference from a matplotlib backend.
        if None, the default FigureCanvasAgg is used which produces a bitmap file in PNG format.
    @param is_modf: whether data contains a modulation function (True) or intensity (False, default).
        this parameter is used to select a suitable color scale.

    @return (str) path and name of the generated graphics file.
        empty string if an error occurred.

    @raise TypeError if matplotlib is not available.
    """
    if canvas is None:
        canvas = FigureCanvas
    fig = Figure()
    canvas(fig)

    ax = fig.add_subplot(111, projection='polar')

    data = data[data['t'] <= 89.0]
    # stereographic projection
    rd = 2 * np.tan(np.radians(data['t']) / 2)
    drdt = 1 + np.tan(np.radians(data['t']) / 2)**2
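    # note (not part of the original module): rd = 2 tan(t/2) is the
    # stereographic projection radius of the polar angle t, and
    # drdt = 1 + tan(t/2)^2 is its derivative with respect to t,
    # used below to adjust the marker size with polar angle.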

    # http://matplotlib.org/api/collections_api.html#matplotlib.collections.PathCollection
    pc = ax.scatter(data['p'] * math.pi / 180., rd, c=data['i'], lw=0, alpha=1.)

    # interpolate marker size between 4 and 9 (for theta step = 1)
    unique_theta = np.unique(data['t'])
    theta_step = (np.max(unique_theta) - np.min(unique_theta)) / unique_theta.shape[0]
    sz = np.ones_like(pc.get_sizes()) * drdt * 4.5 * theta_step**2
    pc.set_sizes(sz)

    # xticks = angles where grid lines are displayed (in radians)
    ax.set_xticks([])
    # rticks = radii where grid lines (circles) are displayed
    ax.set_rticks([])
    ax.set_rmax(2.0)

    cb = fig.colorbar(pc, shrink=0.4, pad=0.1)

    dlo = np.nanpercentile(data['i'], 2)
    dhi = np.nanpercentile(data['i'], 98)
    if is_modf:
        pc.set_cmap("RdBu_r")
        # im.set_cmap("coolwarm")
        dhi = max(abs(dlo), abs(dhi))
        dlo = -dhi
        pc.set_clim((dlo, dhi))
        try:
            # requires matplotlib 2.1.0
            ti = cb.get_ticks()
            ti = [min(ti), 0., max(ti)]
            cb.set_ticks(ti)
        except AttributeError:
            pass
    else:
        pc.set_cmap("magma")
        # im.set_cmap("inferno")
        # im.set_cmap("viridis")
        pc.set_clim((dlo, dhi))
        ti = cb.get_ticks()
        ti = [min(ti), max(ti)]
        cb.set_ticks(ti)

    out_filename = "{0}.{1}".format(filename, canvas.get_default_filetype())
    fig.savefig(out_filename)
    return out_filename


def render_scan(filename, data=None, ref_data=None):
    """
    produce a graphics file from a scan file.

    the default file format is PNG.

    this function requires the matplotlib module.
    if it is not available, the function will log a warning message and return gracefully.

    @param filename: path and name of the scan file.
        the file must have one of the formats supported by pmsco.data.load_data().
        it must contain a single scan (not the combined scan from the model level of PMSCO).
        supported are all one-dimensional linear scans,
        two-dimensional energy-angle scans (each axis must be linear),
        and hemispherical theta-phi scans (rendered in stereographic projection).
        the filename should include ".modf" if the data contains a modulation function rather than intensity.

        if the optional data parameter is present,
        this is used only to derive the output file path by adding the extension of the graphics file format.

    @param data: numpy-structured array of ETPI or ETPAI data.
        if this argument is omitted, the data is loaded from the file referenced by the filename argument.

    @param ref_data: numpy-structured array of ETPI or ETPAI data.
        this is reference data (e.g. experimental data) that should be plotted with the main dataset.
        this is supported for 1d scans only.
        both datasets will be plotted on the same axis and should have similar data range.

    @return (str) path and name of the generated graphics file.
        empty string if an error occurred.
    """
    if data is None:
        data = md.load_data(filename)
    scan_mode, scan_positions = md.detect_scan_mode(data)
    is_modf = filename.find(".modf") >= 0

    try:
        if len(scan_mode) == 1:
            out_filename = render_1d_scan(filename, data, scan_mode, is_modf=is_modf, ref_data=ref_data)
        elif len(scan_mode) == 2 and 'e' in scan_mode:
            out_filename = render_ea_scan(filename, data, scan_mode, is_modf=is_modf)
        elif len(scan_mode) == 2 and 't' in scan_mode and 'p' in scan_mode:
            out_filename = render_tp_scan(filename, data, is_modf=is_modf)
        else:
            out_filename = ""
            logger.warning(BMsg("no render function for scan file {file}", file=filename))
    except (TypeError, AttributeError, IOError) as e:
        out_filename = ""
        logger.warning(BMsg("error rendering scan file {file}: {msg}", file=filename, msg=str(e)))

    return out_filename
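
# usage sketch (not part of the original module; file name hypothetical):
#
#   render_scan("twoatom_energy.modf.etpi")
#   # detects the scan axes and writes twoatom_energy.modf.etpi.png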
|
@ -1,6 +1,6 @@
|
||||
"""
|
||||
@package pmsco.handlers
|
||||
project-independent task handlers for models, scans, symmetries, emitters and energies.
|
||||
project-independent task handlers for models, scans, domains, emitters and energies.
|
||||
|
||||
calculation tasks are organized in a hierarchical tree.
|
||||
at each node, a task handler (feel free to find a better name)
|
||||
@ -20,9 +20,9 @@ the handlers of the structural optimizers are declared in separate modules.
|
||||
scans are defined by the project.
|
||||
the actual merging step from multiple scans into one result dataset is delegated to the project class.
|
||||
|
||||
<em>symmetry handlers</em> split a task into one child per symmetry.
|
||||
symmetries are defined by the project.
|
||||
the actual merging step from multiple symmetries into one result dataset is delegated to the project class.
|
||||
<em>domain handlers</em> split a task into one child per domain.
|
||||
domains are defined by the project.
|
||||
the actual merging step from multiple domains into one result dataset is delegated to the project class.
|
||||
|
||||
<em>emitter handlers</em> split a task into one child per emitter configuration (inequivalent sets of emitting atoms).
|
||||
emitter configurations are defined by the project.
|
||||
@ -35,26 +35,34 @@ code inspection and tests have shown that per-emitter results from EDAC can be s
|
||||
in order to take advantage of parallel processing.
|
||||
|
||||
while several classes of model handlers are available,
|
||||
the default handlers for scans, symmetries, emitters and energies should be sufficient in most situations.
|
||||
the scan and symmetry handlers call methods of the project class to invoke project-specific functionality.
|
||||
the default handlers for scans, domains, emitters and energies should be sufficient in most situations.
|
||||
the scan and domain handlers call methods of the project class to invoke project-specific functionality.
|
||||
|
||||
@author Matthias Muntwiler, matthias.muntwiler@psi.ch
|
||||
|
||||
@copyright (c) 2015-17 by Paul Scherrer Institut @n
|
||||
@copyright (c) 2015-18 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
from __future__ import absolute_import
|
||||
from __future__ import division
|
||||
from __future__ import print_function
|
||||
|
||||
import datetime
|
||||
import os
|
||||
from functools import reduce
|
||||
import logging
|
||||
import math
|
||||
import numpy as np
|
||||
import data as md
|
||||
from helpers import BraceMessage as BMsg
|
||||
import os
|
||||
|
||||
from pmsco.compat import open
|
||||
import pmsco.data as md
|
||||
import pmsco.dispatch as dispatch
|
||||
import pmsco.graphics.scan as mgs
|
||||
from pmsco.helpers import BraceMessage as BMsg
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -66,10 +74,10 @@ class TaskHandler(object):
    this class defines the common interface of task handlers.
    """

    ## @var project
    ## @var _project
    # (Project) project instance.

    ## @var slots
    ## @var _slots
    # (int) number of calculation slots (processes).
    #
    # for best efficiency the number of tasks generated should be greater than or equal to the number of slots.
@@ -93,7 +101,7 @@ class TaskHandler(object):
    # the dictionary keys are the task identifiers CalculationTask.id,
    # the values are the corresponding CalculationTask objects.

    ## @var invalid_count (int)
    ## @var _invalid_count (int)
    # accumulated total number of invalid results received.
    #
    # the number is incremented by add_result if an invalid task is reported.
@@ -120,10 +128,14 @@ class TaskHandler(object):
        for best efficiency the number of tasks generated should be greater than or equal to the number of slots.
        it should not exceed N times the number of slots, where N is a reasonably small number.

        @return None
        @return (int) number of children that create_tasks() will generate on average.
        the number does not need to be accurate; a rough estimate or the order of magnitude is fine if it is greater than 10.
        it is used to distribute processing slots across task levels.
        see pmsco.dispatch.MscoMaster.setup().
        """
        self._project = project
        self._slots = slots
        return 1

    def cleanup(self):
        """
@@ -188,21 +200,22 @@ class TaskHandler(object):
        the id, model, and files attributes are required.
        if model contains a '_rfac' value, the r-factor is

        @return: None
        @return None
        """
        model_id = task.id.model
        for path, cat in task.files.iteritems():
        for path, cat in task.files.items():
            self._project.files.add_file(path, model_id, category=cat)

    def cleanup_files(self, keep=10):
    def cleanup_files(self, keep=0):
        """
        delete uninteresting files.

        @param: number of best ranking models to keep.
        @param keep: minimum number of models to keep.
        0 (default): leave the decision to the project.

        @return: None
        @return None
        """
        self._project.files.delete_files(keep_rfac=keep)
        self._project.cleanup_files(keep=keep)


class ModelHandler(TaskHandler):
@@ -255,6 +268,22 @@ class ModelHandler(TaskHandler):

        return None

    def save_report(self, root_task):
        """
        generate a final report of the optimization procedure.

        detailed calculation results are usually saved as soon as they become available.
        this method may be implemented in sub-classes to aggregate and summarize the results, generate plots, etc.
        in this class, the method does nothing.

        @note: implementations must add the path names of generated files to self._project.files.

        @param root_task: (CalculationTask) task with initial model parameters.

        @return: None
        """
        pass


class SingleModelHandler(ModelHandler):
    """
@@ -263,6 +292,10 @@ class SingleModelHandler(ModelHandler):
    this class runs a single calculation on the start parameters defined in the domain of the project.
    """

    def __init__(self):
        super(SingleModelHandler, self).__init__()
        self.result = {}

    def create_tasks(self, parent_task):
        """
        start one task with the start parameters.
@@ -316,25 +349,17 @@ class SingleModelHandler(ModelHandler):
        modf_ext = ".modf" + parent_task.file_ext
        parent_task.modf_filename = parent_task.file_root + modf_ext

        rfac = 1.0
        if task.result_valid:
            try:
                rfac = self._project.calc_rfactor(task)
            except ValueError:
                task.result_valid = False
                logger.warning(BMsg("calculation of model {0} resulted in an undefined R-factor.", task.id.model))
            self.result = task.model.copy()
            self.result['_rfac'] = task.rfac

        task.model['_rfac'] = rfac
        self.save_report_file(task.model)

        self._project.files.update_model_rfac(task.id.model, rfac)
        self._project.files.update_model_rfac(task.id.model, task.rfac)
        self._project.files.set_model_complete(task.id.model, True)

        parent_task.time = task.time

        return parent_task

    def save_report_file(self, result):
    def save_report(self, root_task):
        """
        save model parameters and r-factor to a file.

@@ -343,20 +368,25 @@ class SingleModelHandler(ModelHandler):
        the first line contains the parameter names.
        this is the same format as used by the swarm and grid handlers.

        @param result: dictionary of results and parameters. the values should be scalars and strings.
        @param root_task: (CalculationTask) the id.model attribute is used to register the generated files.

        @return: None
        """
        keys = [key for key in result]
        super(SingleModelHandler, self).save_report(root_task)

        keys = [key for key in self.result]
        keys.sort(key=lambda t: t[0].lower())
        vals = (str(result[key]) for key in keys)
        with open(self._project.output_file + ".dat", "w") as outfile:
        vals = (str(self.result[key]) for key in keys)
        filename = self._project.output_file + ".dat"
        with open(filename, "w") as outfile:
            outfile.write("# ")
            outfile.write(" ".join(keys))
            outfile.write("\n")
            outfile.write(" ".join(vals))
            outfile.write("\n")

        self._project.files.add_file(filename, root_task.id.model, "report")

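for illustration, the report file written by save_report above contains one line of sorted parameter names and one line of values; with the hypothetical parameters dx and th it would read:

# _rfac dx th
0.3421 0.01 12.5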
class ScanHandler(TaskHandler):
    """
@@ -388,6 +418,34 @@ class ScanHandler(TaskHandler):
        self._pending_ids_per_parent = {}
        self._complete_ids_per_parent = {}

    def setup(self, project, slots):
        """
        initialize the scan task handler and save processed experimental scans.

        @return (int) number of scans defined in the project.
        """
        super(ScanHandler, self).setup(project, slots)

        for (i_scan, scan) in enumerate(self._project.scans):
            if scan.modulation is not None:
                __, filename = os.path.split(scan.filename)
                pre, ext = os.path.splitext(filename)
                filename = "{pre}_{scan}.modf{ext}".format(pre=pre, ext=ext, scan=i_scan)
                filepath = os.path.join(self._project.output_dir, filename)
                md.save_data(filepath, scan.modulation)
                mgs.render_scan(filepath, data=scan.modulation)

        if project.combined_scan is not None:
            ext = md.format_extension(project.combined_scan)
            filename = project.output_file + ext
            md.save_data(filename, project.combined_scan)
        if project.combined_modf is not None:
            ext = md.format_extension(project.combined_modf)
            filename = project.output_file + ".modf" + ext
            md.save_data(filename, project.combined_modf)

        return len(self._project.scans)

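a quick sketch of the modulation file naming in setup() above (the scan file name is invented):

import os

pre, ext = os.path.splitext("twoatom_hemi.etpi")   # -> ("twoatom_hemi", ".etpi")
filename = "{pre}_{scan}.modf{ext}".format(pre=pre, ext=ext, scan=0)
# -> "twoatom_hemi_0.modf.etpi"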
    def create_tasks(self, parent_task):
        """
        generate a calculation task for each scan of the given parent task.
@@ -464,6 +522,7 @@ class ScanHandler(TaskHandler):

        if parent_task.result_valid:
            self._project.combine_scans(parent_task, child_tasks)
            self._project.evaluate_result(parent_task, child_tasks)
            self._project.files.add_file(parent_task.result_filename, parent_task.id.model, 'model')
            self._project.files.add_file(parent_task.modf_filename, parent_task.id.model, 'model')

@@ -476,7 +535,7 @@ class ScanHandler(TaskHandler):
        return None


class SymmetryHandler(TaskHandler):
class DomainHandler(TaskHandler):
    ## @var _pending_ids_per_parent
    # (dict) sets of child task IDs per parent
    #
@@ -496,20 +555,29 @@ class SymmetryHandler(TaskHandler):
    # the values are sets of all child CalculationTask.id belonging to the parent.

    def __init__(self):
        super(SymmetryHandler, self).__init__()
        super(DomainHandler, self).__init__()
        self._pending_ids_per_parent = {}
        self._complete_ids_per_parent = {}

    def setup(self, project, slots):
        """
        initialize the domain task handler.

        @return (int) number of domains defined in the project.
        """
        super(DomainHandler, self).setup(project, slots)
        return len(self._project.domains)

    def create_tasks(self, parent_task):
        """
        generate a calculation task for each symmetry of the given parent task.
        generate a calculation task for each domain of the given parent task.

        all symmetries share the same model parameters.
        all domains share the same model parameters.

        @return list of CalculationTask objects, with one element per symmetry.
        the symmetry index varies according to project.symmetries.
        @return list of CalculationTask objects, with one element per domain.
        the domain index varies according to project.domains.
        """
        super(SymmetryHandler, self).create_tasks(parent_task)
        super(DomainHandler, self).create_tasks(parent_task)

        parent_id = parent_task.id
        self._parent_tasks[parent_id] = parent_task
@@ -517,10 +585,10 @@ class SymmetryHandler(TaskHandler):
        self._complete_ids_per_parent[parent_id] = set()

        out_tasks = []
        for (i_sym, sym) in enumerate(self._project.symmetries):
        for (i_dom, domain) in enumerate(self._project.domains):
            new_task = parent_task.copy()
            new_task.parent_id = parent_id
            new_task.change_id(sym=i_sym)
            new_task.change_id(domain=i_dom)

            child_id = new_task.id
            self._pending_tasks[child_id] = new_task
@@ -529,25 +597,25 @@ class SymmetryHandler(TaskHandler):
            out_tasks.append(new_task)

        if not out_tasks:
            logger.error("no symmetry tasks generated. your project must declare at least one symmetry.")
            logger.error("no domain tasks generated. your project must declare at least one domain.")

        return out_tasks

    def add_result(self, task):
        """
        collect and combine the calculation results versus symmetry.
        collect and combine the calculation results versus domain.

        * mark the task as complete
        * store its result for later
        * check whether this was the last pending task of the family (belonging to the same parent).

        the actual merging of data is delegated to the project's combine_symmetries() method.
        the actual merging of data is delegated to the project's combine_domains() method.

        @param task: (CalculationTask) calculation task that completed.

        @return parent task (CalculationTask) if the family is complete. None if the family is not complete yet.
        """
        super(SymmetryHandler, self).add_result(task)
        super(DomainHandler, self).add_result(task)

        self._complete_tasks[task.id] = task
        del self._pending_tasks[task.id]
@@ -557,7 +625,7 @@ class SymmetryHandler(TaskHandler):
        family_pending.remove(task.id)
        family_complete.add(task.id)

        # all symmetries complete?
        # all domains complete?
        if len(family_pending) == 0:
            parent_task = self._parent_tasks[task.parent_id]

@@ -574,9 +642,13 @@ class SymmetryHandler(TaskHandler):
            parent_task.time = reduce(lambda a, b: a + b, child_times)

            if parent_task.result_valid:
                self._project.combine_symmetries(parent_task, child_tasks)
                self._project.combine_domains(parent_task, child_tasks)
                self._project.evaluate_result(parent_task, child_tasks)
                self._project.files.add_file(parent_task.result_filename, parent_task.id.model, 'scan')
                self._project.files.add_file(parent_task.modf_filename, parent_task.id.model, 'scan')
                graph_file = mgs.render_scan(parent_task.modf_filename,
                                             ref_data=self._project.scans[parent_task.id.scan].modulation)
                self._project.files.add_file(graph_file, parent_task.id.model, 'scan')

            del self._pending_ids_per_parent[parent_task.id]
            del self._complete_ids_per_parent[parent_task.id]
@@ -615,13 +687,26 @@ class EmitterHandler(TaskHandler):
        self._pending_ids_per_parent = {}
        self._complete_ids_per_parent = {}

    def setup(self, project, slots):
        """
        initialize the emitter task handler.

        @return (int) estimated number of emitter configurations that the cluster generator will generate.
        the estimate is based on the start parameters, scan 0 and domain 0.
        """
        super(EmitterHandler, self).setup(project, slots)
        mock_model = self._project.create_model_space().start
        mock_index = dispatch.CalcID(-1, 0, 0, -1, -1)
        n_emitters = project.cluster_generator.count_emitters(mock_model, mock_index)
        return n_emitters

    def create_tasks(self, parent_task):
        """
        generate a calculation task for each emitter configuration of the given parent task.

        all emitters share the same model parameters.

        @return list of @ref CalculationTask objects with one element per emitter configuration
        @return list of @ref pmsco.dispatch.CalculationTask objects with one element per emitter configuration
        if parallel processing is enabled.
        otherwise the list contains a single CalculationTask object with emitter index 0.
        the emitter index is used by the project's create_cluster method.
@@ -634,10 +719,7 @@ class EmitterHandler(TaskHandler):
        self._complete_ids_per_parent[parent_id] = set()

        n_emitters = self._project.cluster_generator.count_emitters(parent_task.model, parent_task.id)
        if n_emitters > 1 and self._slots > 1:
            emitters = range(1, n_emitters + 1)
        else:
            emitters = [0]
        emitters = range(n_emitters)

        out_tasks = []
        for em in emitters:
@@ -698,8 +780,12 @@ class EmitterHandler(TaskHandler):

        if parent_task.result_valid:
            self._project.combine_emitters(parent_task, child_tasks)
            self._project.files.add_file(parent_task.result_filename, parent_task.id.model, 'symmetry')
            self._project.files.add_file(parent_task.modf_filename, parent_task.id.model, 'symmetry')
            self._project.evaluate_result(parent_task, child_tasks)
            self._project.files.add_file(parent_task.result_filename, parent_task.id.model, 'domain')
            self._project.files.add_file(parent_task.modf_filename, parent_task.id.model, 'domain')
            graph_file = mgs.render_scan(parent_task.modf_filename,
                                         ref_data=self._project.scans[parent_task.id.scan].modulation)
            self._project.files.add_file(graph_file, parent_task.id.model, 'domain')

        del self._pending_ids_per_parent[parent_task.id]
        del self._complete_ids_per_parent[parent_task.id]
@@ -776,15 +862,10 @@ class RegionHandler(TaskHandler):
            parent_task.time = reduce(lambda a, b: a + b, child_times)

            if parent_task.result_valid:
                stack1 = [md.load_data(t.result_filename) for t in child_tasks]
                dtype = md.common_dtype(stack1)
                stack2 = [md.restructure_data(d, dtype) for d in stack1]
                result_data = np.hstack(tuple(stack2))
                md.sort_data(result_data)
                md.save_data(parent_task.result_filename, result_data)
                self._project.combine_regions(parent_task, child_tasks)
                self._project.evaluate_result(parent_task, child_tasks)
                self._project.files.add_file(parent_task.result_filename, parent_task.id.model, "emitter")
                for t in child_tasks:
                    self._project.files.remove_file(t.result_filename)
                self._project.files.add_file(parent_task.modf_filename, parent_task.id.model, "emitter")

            del self._pending_ids_per_parent[parent_task.id]
            del self._complete_ids_per_parent[parent_task.id]
@@ -840,7 +921,7 @@ class EnergyRegionHandler(RegionHandler):
    so that all child tasks of the same parent finish approximately in the same time.
    pure angle scans are not split.

    to use this feature, the project assigns this class to its @ref handler_classes['region'].
    to use this feature, the project assigns this class to its @ref pmsco.project.Project.handler_classes['region'].
    it is safe to use this handler for calculations that do not involve energy scans.
    the handler is best used for single calculations.
    in optimizations that calculate many models there is no advantage in using it
@@ -871,7 +952,7 @@ class EnergyRegionHandler(RegionHandler):

        @param slots (int) number of calculation slots (processes).

        @return None
        @return (int) average number of child tasks
        """
        super(EnergyRegionHandler, self).setup(project, slots)

@@ -884,6 +965,8 @@ class EnergyRegionHandler(RegionHandler):
            logger.debug(BMsg("region handler: split scan {file} into {slots} chunks",
                              file=os.path.basename(scan.filename), slots=self._slots_per_scan[i]))

        return max(int(sum(self._slots_per_scan) / len(self._slots_per_scan)), 1)

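a brief numeric sketch of the return value of setup() above (the chunk counts are invented):

slots_per_scan = [4, 1]   # e.g. one energy scan split into 4 chunks, one angle scan left whole
average_children = max(int(sum(slots_per_scan) / len(slots_per_scan)), 1)   # = 2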
    def create_tasks(self, parent_task):
        """
        generate a calculation task for each energy region of the given parent task.

@@ -1,4 +1,20 @@
class BraceMessage:
"""
@package pmsco.helpers
helper classes

a collection of small and generic code bits mostly collected from the www.

"""


class BraceMessage(object):
    """
    a string formatting proxy class useful for logging and exceptions.

    use BraceMessage("{0} {1}", "formatted", "message")
    in place of "{0} {1}".format("formatted", "message").
    the advantage is that the format method is called only if the string is actually used.
    """
    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
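a short usage sketch of the proxy class (this assumes that BraceMessage also stores the kwargs and implements __str__ by calling self.fmt.format, which the hunk above cuts off):

import logging
from pmsco.helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)
# fmt.format() runs only if the DEBUG level is actually enabled:
logger.debug(BMsg("model {0} finished with R-factor {1:.4f}", 42, 0.3421))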
143 pmsco/igor.py Normal file
@@ -0,0 +1,143 @@
"""
@package pmsco.igor
data exchange with wavemetrics igor pro.

this module provides functions for loading/saving pmsco data in igor pro.

@author Matthias Muntwiler

@copyright (c) 2019 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np

from pmsco.compat import open


def _escape_igor_string(s):
    s = s.replace('\\', '\\\\')
    s = s.replace('"', '\\"')
    return s


def namefix_double(name):
    """
    fix 1-character wave name by doubling

    replaces length-1 string by a doubled version.

    @param name: (str) proposed wave name

    @return: corrected name
    """
    return name*2 if len(name) == 1 else name


def namefix_etpais(name):
    """
    fix 1-character wave name according to ETPAIS scheme

    replaces 'e' by 'en' etc.

    @param name: (str) proposed wave name

    @return: corrected name
    """
    name_map = {'e': 'en', 't': 'th', 'p': 'ph', 'i': 'in', 'm': 'mo', 's': 'si'}
    try:
        return name_map[name]
    except KeyError:
        return name


class IgorExport(object):
    """
    this class exports pmsco data to an Igor text (ITX) file.

    usage:
    1) create an object instance.
    2) set @ref data.
    3) set optional attributes: @ref prefix and @ref namefix.
    4) call @ref export.
    """

    def __init__(self):
        super(IgorExport, self).__init__()
        self.data = None
        self.prefix = ""
        self.namefix = namefix_double

    def set_data(self, data):
        """
        set the data array to export.

        this must (currently) be a one-dimensional structured array.
        the column names will become wave names.

        @param data: numpy.ndarray
        @return:
        """
        self.data = data

    def export(self, filename):
        """
        write to igor file.
        """
        with open(filename, 'w') as f:
            self._write_header(f)
            self._write_data(f)

    def _fix_name(self, name):
        """
        fix a wave name.

        this function first applies @ref namefix and @ref prefix to the proposed wave name.

        @param name: (str) proposed wave name

        @return: corrected name
        """
        if self.namefix is not None:
            name = self.namefix(name)
        return self.prefix + name

    def _write_header(self, f):
        """
        write the header of the igor text file

        @param f: open file or stream

        @return: None
        """
        f.write('IGOR' + '\n')
        f.write('X // pmsco data export\n')

    def _write_data(self, f):
        """
        write a data section to the igor text file.

        @param f: open file or stream

        @return: None
        """
        assert isinstance(self.data, np.ndarray)
        assert len(self.data.shape) == 1
        assert len(self.data.dtype.names[0]) >= 1

        arr = self.data
        shape = ",".join(map(str, arr.shape))
        names = (self._fix_name(name) for name in arr.dtype.names)
        names = ", ".join(names)

        f.write('Waves/O/D/N=({shape}) {names}\n'.format(shape=shape, names=names))
        f.write('BEGIN\n')
        np.savetxt(f, arr, fmt='%g')
        f.write('END\n')
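a usage sketch following the four steps of the IgorExport docstring (the data array and file name are invented):

import numpy as np

data = np.array([(10.0, 0.1), (20.0, 0.2)], dtype=[('e', 'f8'), ('i', 'f8')])
exporter = IgorExport()              # step 1
exporter.set_data(data)              # step 2
exporter.prefix = "scan0_"           # step 3 (optional)
exporter.namefix = namefix_etpais    # step 3: maps 'e' -> 'en', 'i' -> 'in'
exporter.export("scan0.itx")         # step 4: writes waves scan0_en and scan0_in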
@@ -5,45 +5,42 @@ SHELL=/bin/sh
# required libraries: libblas, liblapack, libf2c
# (you may have to set soft links so that linker finds them)
#
# the makefile calls python-config to get the compilation flags and include path.
# you may override the corresponding variables on the command line or by environment variables:
#
# PYTHON_INC: specify additional include directories. each dir must start with -I prefix.
# PYTHON_CFLAGS: specify the C compiler flags.
#
# see the top-level makefile for additional information.

.SUFFIXES:
.SUFFIXES: .c .cpp .cxx .exe .f .h .i .o .py .pyf .so .x
.PHONY: all loess test gas madeup ethanol air galaxy

HOST=$(shell hostname)
CFLAGS=-O
FFLAGS=-O
OBJ=loessc.o loess.o predict.o misc.o loessf.o dqrsl.o dsvdc.o fix_main.o

FFLAGS?=-O
LIB=-lblas -lm -lf2c
LIBPATH=
CC=gcc
CCOPTS=
SWIG=swig
SWIGOPTS=
PYTHON=python
PYTHONOPTS=
ifneq (,$(filter merlin%,$(HOST)))
PYTHONINC=-I/usr/include/python2.7 -I/opt/python/python-2.7.5/include/python2.7/
else ifneq (,$(filter ra%,$(HOST)))
PYTHONINC=-I${PSI_PYTHON27_INCLUDE_DIR}/python2.7 -I${PSI_PYTHON27_LIBRARY_DIR}/python2.7/site-packages/numpy/core/include
else
PYTHONINC=-I/usr/include/python2.7
endif
LIBPATH?=
CC?=gcc
CCOPTS?=
SWIG?=swig
SWIGOPTS?=
PYTHON?=python
PYTHONOPTS?=
PYTHON_CONFIG = ${PYTHON}-config
#PYTHON_LIB ?= $(shell ${PYTHON_CONFIG} --libs)
#PYTHON_INC ?= $(shell ${PYTHON_CONFIG} --includes)
PYTHON_INC ?=
PYTHON_CFLAGS ?= $(shell ${PYTHON_CONFIG} --cflags)
#PYTHON_LDFLAGS ?= $(shell ${PYTHON_CONFIG} --ldflags)

all: loess

loess: _loess.so

loess_wrap.c: loess.c loess.i
	$(SWIG) $(SWIGOPTS) -python loess.i

loess.py _loess.so: loess_wrap.c
	# setuptools doesn't handle the fortran files correctly
	# $(PYTHON) $(PYTHONOPTS) setup.py build_ext --inplace
	$(CC) $(CFLAGS) -fpic -c loessc.c loess.c predict.c misc.c loessf.f dqrsl.f dsvdc.f fix_main.c
	$(CC) $(CFLAGS) -fpic -c loess_wrap.c $(PYTHONINC)
	$(CC) -shared $(OBJ) $(LIB) $(LIBPATH) loess_wrap.o -o _loess.so
loess.py _loess.so: loess.c loess.i
	$(PYTHON) $(PYTHONOPTS) setup.py build_ext --inplace

examples: gas madeup ethanol air galaxy

@@ -1,7 +1,5 @@
#!/usr/bin/env python

__author__ = 'Matthias Muntwiler'

"""
@package loess.setup
setup.py file for LOESS
@@ -17,39 +15,49 @@ the Python wrapper was set up by M. Muntwiler
with the help of the SWIG toolkit
and other incredible goodies available in the Linux world.

@bug this file is currently not used because
distutils does not compile the included Fortran files.
@bug numpy.distutils.build_src in python 2.7 treats all Fortran files with f2py
so that they are compiled via both f2py and swig.
this produces extra object files which cause the linker to fail.
to fix this issue, this module hacks the build_src class.
this hack does not work with python 3. perhaps it's even unnecessary.

@author Matthias Muntwiler

@copyright (c) 2015 by Paul Scherrer Institut @n
@copyright (c) 2015-18 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from distutils.core import setup, Extension
from distutils import sysconfig

import numpy
try:
    numpy_include = numpy.get_include()
except AttributeError:
    numpy_include = numpy.get_numpy_include()

loess_module = Extension('_loess',
                         sources=['loess.i', 'loess_wrap.c', 'loess.c', 'loessc.c', 'predict.c', 'misc.c', 'loessf.f',
                                  'dqrsl.f', 'dsvdc.f'],
                         include_dirs = [numpy_include],
                         libraries=['blas', 'm', 'f2c'],
                         )

setup(name='loess',
      version='0.1',
      author=__author__,
      author_email='matthias.muntwiler@psi.ch',
      description="""LOESS module in Python""",
      ext_modules=[loess_module],
      py_modules=["loess"], requires=['numpy']
      )
def configuration(parent_package='', top_path=None):
    from numpy.distutils.misc_util import Configuration
    config = Configuration('loess', parent_package, top_path)
    lib = ['blas', 'm', 'f2c']
    src = ['loess.c', 'loessc.c', 'predict.c', 'misc.c', 'loessf.f', 'dqrsl.f', 'dsvdc.f', 'fix_main.c', 'loess.i']
    inc_dir = [numpy_include]
    config.add_extension('_loess',
                         sources=src,
                         libraries=lib,
                         include_dirs=inc_dir
                         )
    return config

def ignore_sources(self, sources, extension):
    return sources

if __name__ == '__main__':
    try:
        from numpy.distutils.core import numpy_cmdclass
        numpy_cmdclass['build_src'].f2py_sources = ignore_sources
    except ImportError:
        pass
    from numpy.distutils.core import setup
    setup(**configuration(top_path='').todict())

@@ -1,17 +1,18 @@
SHELL=/bin/sh

# makefile for EDAC, MSC, and MUFPOT programs and modules
# makefile for external programs and modules
#
# see the top-level makefile for additional information.

.PHONY: all clean edac loess msc mufpot
.PHONY: all clean edac loess msc mufpot phagen

EDAC_DIR = edac
MSC_DIR = msc
MUFPOT_DIR = mufpot
LOESS_DIR = loess
PHAGEN_DIR = calculators/phagen

all: edac loess
all: edac loess phagen

edac:
	$(MAKE) -C $(EDAC_DIR)
@@ -25,9 +26,13 @@ msc:
mufpot:
	$(MAKE) -C $(MUFPOT_DIR)

phagen:
	$(MAKE) -C $(PHAGEN_DIR)

clean:
	$(MAKE) -C $(EDAC_DIR) clean
	$(MAKE) -C $(LOESS_DIR) clean
	$(MAKE) -C $(MSC_DIR) clean
	$(MAKE) -C $(MUFPOT_DIR) clean
	$(MAKE) -C $(PHAGEN_DIR) clean
	rm -f *.pyc

@@ -12,16 +12,17 @@ SHELL=/bin/sh
.SUFFIXES: .c .cpp .cxx .exe .f .h .i .o .py .pyf .so
.PHONY: all clean edac msc mufpot

FC=gfortran
FCCOPTS=
F2PY=f2py
F2PYOPTS=
CC=gcc
CCOPTS=
SWIG=swig
SWIGOPTS=
PYTHON=python
PYTHONOPTS=
FC?=gfortran
FCCOPTS?=
F2PY?=f2py
F2PYOPTS?=
CC?=gcc
CCOPTS?=
SWIG?=swig
SWIGOPTS?=
PYTHON?=python
PYTHONOPTS?=
PYTHONINC?=

all: msc

@@ -20,7 +20,7 @@ CC=gcc
CCOPTS=
SWIG=swig
SWIGOPTS=
PYTHON=python
PYTHON=python2
PYTHONOPTS=

all: mufpot

0 pmsco/optimizers/__init__.py Normal file
308 pmsco/optimizers/genetic.py Normal file
@@ -0,0 +1,308 @@
"""
|
||||
@package pmsco.optimizers.genetic
|
||||
genetic optimization algorithm.
|
||||
|
||||
this module implements a genetic algorithm for structural optimization.
|
||||
|
||||
the genetic algorithm is adapted from
|
||||
D. A. Duncan et al., Surface Science 606, 278 (2012)
|
||||
|
||||
the genetic algorithm evolves a population of individuals
|
||||
by a combination of inheritance, crossover and mutation
|
||||
and R-factor based selection.
|
||||
|
||||
@author Matthias Muntwiler, matthias.muntwiler@psi.ch
|
||||
|
||||
@copyright (c) 2018 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
from __future__ import absolute_import
|
||||
from __future__ import division
|
||||
from __future__ import print_function
|
||||
import logging
|
||||
import numpy as np
|
||||
import random
|
||||
import pmsco.optimizers.population as population
|
||||
from pmsco.helpers import BraceMessage as BMsg
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class GeneticPopulation(population.Population):
|
||||
"""
|
||||
population implementing a genetic optimization algorithm.
|
||||
|
||||
the genetic algorithm implements the following principles:
|
||||
|
||||
1. inheritance: two children of a new generation are generated from the genes (i.e. model parameters)
|
||||
of two parents of the old generation.
|
||||
2. elitism: individuals with similar r-factors are more likely to mate.
|
||||
3. crossover: the genes of the parents are randomly distributed to their children.
|
||||
4. mutation: a gene may mutate at random.
|
||||
5. selection: the globally best individual is added to a parent population (and replaces the worst).
|
||||
|
||||
the main tuning parameter of the algorithm is the mutation_step which is copied from the model_space.step.
|
||||
it defines the width of a gaussian distribution of change under a weak mutation.
|
||||
it should be large enough so that the whole parameter space can be probed,
|
||||
but small enough that a frequent mutation does not throw the individual out of the convergence region.
|
||||
typically, the step should be of the order of the parameter range divided by the population size.
|
||||
|
||||
other tunable parameters are the mating_factor, the weak_mutation_probability and the strong_mutation_probability.
|
||||
the defaults should normally be fine.
|
||||
"""
|
||||
|
||||
## @var weak_mutation_probability
|
||||
#
|
||||
# probability (between 0 and 1) that a parameter changes in the mutate_weak() method.
|
||||
#
|
||||
# the default is 1.0, i.e., each parameter mutates in each generation.
|
||||
#
|
||||
# 1.0 has shown better coverage of the continuous parameter space and faster finding of the optimum.
|
||||
|
||||
## @var strong_mutation_probability
|
||||
#
|
||||
# probability (between 0 and 1) that a parameter changes in the mutate_strong() method.
|
||||
#
|
||||
# the default is 0.01, i.e., on average, every hundredth probed parameter is affected by a strong mutation.
|
||||
# if the model contains 10 parameters, for example,
|
||||
# every tenth particle would see a mutation of at least one of its parameters.
|
||||
#
|
||||
# too high value may disturb convergence,
|
||||
# too low value may trap the algorithm in a local optimum.
|
||||
|
||||
## @var mating_factor
|
||||
#
|
||||
# inverse width of the mating preference distribution.
|
||||
#
|
||||
# the greater this value, the more similar partners are mated by the mate_parents() method.
|
||||
#
|
||||
# the default value 4.0 results in a probability of about 0.0025
|
||||
# that the best particle mates the worst.
|
||||
|
||||
## @var position_constrain_mode
|
||||
#
|
||||
# the position constrain mode selects what to do if a particle violates the parameter limits.
|
||||
#
|
||||
# the default is "random" which resets the parameter to a random value.
|
||||
|
||||
## @var mutation_step
|
||||
#
|
||||
# standard deviations of the exponential distribution function used in the mutate_weak() method.
|
||||
# the variable is a dictionary with the same keys as model_step (the parameter space).
|
||||
#
|
||||
# it is initialized from the model_space.step
|
||||
# or set to a default value based on the parameter range and population size.
|
||||
|
||||
def __init__(self):
|
||||
"""
|
||||
initialize the population object.
|
||||
|
||||
"""
|
||||
super(GeneticPopulation, self).__init__()
|
||||
|
||||
self.weak_mutation_probability = 1.0
|
||||
self.strong_mutation_probability = 0.01
|
||||
self.mating_factor = 4.
|
||||
self.position_constrain_mode = 'random'
|
||||
self.mutation_step = {}
|
||||
|
||||
def setup(self, size, model_space, **kwargs):
|
||||
"""
|
||||
@copydoc Population.setup()
|
||||
|
||||
in addition to the inherited behaviour, this method initializes self.mutation_step.
|
||||
mutation_step of a parameter is set to its model_space.step if non-zero.
|
||||
otherwise it is set to the parameter range divided by the population size.
|
||||
"""
|
||||
super(GeneticPopulation, self).setup(size, model_space, **kwargs)
|
||||
|
||||
for key in self.model_step:
|
||||
val = self.model_step[key]
|
||||
self.mutation_step[key] = val if val != 0 else (self.model_max[key] - self.model_min[key]) / size
|
||||
|
||||
def randomize(self, pos=True, vel=True):
|
||||
"""
|
||||
initializes a "random" population.
|
||||
|
||||
this implementation is a new proposal.
|
||||
the distribution is not completely random.
|
||||
rather, a position vector (by parameter) is initialized with a linear function
|
||||
that covers the parameter space.
|
||||
the linear function is then permuted randomly.
|
||||
|
||||
the method does not update the particle info fields.
|
||||
|
||||
@param pos: randomize positions. if False, the positions are not changed.
|
||||
@param vel: randomize velocities. if False, the velocities are not changed.
|
||||
"""
|
||||
if pos:
|
||||
for key in self.model_start:
|
||||
self.pos[key] = np.random.permutation(np.linspace(self.model_min[key], self.model_max[key],
|
||||
self.pos.shape[0]))
|
||||
if vel:
|
||||
for key in self.model_start:
|
||||
d = (self.model_max[key] - self.model_min[key]) / 8
|
||||
self.vel[key] = np.random.permutation(np.linspace(-d, d, self.vel.shape[0]))
|
||||
|
||||
def advance_population(self):
|
||||
"""
|
||||
advance the population by one generation.
|
||||
|
||||
the population is advanced in several steps:
|
||||
1. replace the worst individual by the best found so far.
|
||||
2. mate the parents in pairs of two.
|
||||
3. produce children by crossover from the parents.
|
||||
4. apply weak mutations.
|
||||
5. apply strong mutations.
|
||||
|
||||
if generation is lower than zero, the method increases the generation number but does not advance the particles.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
if not self._hold_once:
|
||||
self.generation += 1
|
||||
|
||||
pop = self.pos.copy()
|
||||
pop.sort(order='_rfac')
|
||||
elite = self.best.copy()
|
||||
elite.sort(order='_rfac')
|
||||
if elite[0]['_model'] not in pop['_model']:
|
||||
elite[0]['_particle'] = pop[-1]['_particle']
|
||||
pop[-1] = elite[0]
|
||||
pop.sort(order='_rfac')
|
||||
|
||||
parents = self.mate_parents(pop)
|
||||
|
||||
children = []
|
||||
for x, y in parents:
|
||||
a, b = self.crossover(x, y)
|
||||
children.append(a)
|
||||
children.append(b)
|
||||
|
||||
for child in children:
|
||||
index = child['_particle']
|
||||
self.mutate_weak(child, self.weak_mutation_probability)
|
||||
self.mutate_strong(child, self.strong_mutation_probability)
|
||||
self.mutate_duplicate(child)
|
||||
for key in self.model_start:
|
||||
vel = child[key] - self.pos[index][key]
|
||||
child[key], vel, self.model_min[key], self.model_max[key] = \
|
||||
self.constrain_position(child[key], vel, self.model_min[key], self.model_max[key],
|
||||
self.position_constrain_mode)
|
||||
|
||||
self.pos[index] = child
|
||||
self.update_particle_info(index)
|
||||
|
||||
super(GeneticPopulation, self).advance_population()
|
||||
|
||||
def mate_parents(self, positions):
|
||||
"""
|
||||
group the population in pairs of two.
|
||||
|
||||
to mate two individuals, the first individual of the (remaining) population selects one of the following
|
||||
with an exponential preference of earlier ones.
|
||||
the process is repeated until all individuals are mated.
|
||||
|
||||
@param positions: original population (numpy structured array)
|
||||
the population should be ordered with best model first.
|
||||
@return: sequence of pairs (tuples) of structured arrays holding one model each.
|
||||
"""
|
||||
seq = [model for model in positions]
|
||||
parents = []
|
||||
while len(seq) >= 2:
|
||||
p1 = seq.pop(0)
|
||||
ln = len(seq)
|
||||
i = min(int(random.expovariate(self.mating_factor / ln) * ln), ln - 1)
|
||||
p2 = seq.pop(i)
|
||||
parents.append((p1, p2))
|
||||
return parents
|
||||
|
||||
def crossover(self, parent1, parent2):
|
||||
"""
|
||||
crossover two parents to create two children.
|
||||
|
||||
for each model parameter, the parent's value is randomly assigned to either one of the children.
|
||||
|
||||
@param parent1: numpy structured array holding the model of the first parent.
|
||||
@param parent2: numpy structured array holding the model of the second parent.
|
||||
@return: tuple of the two crossed children.
|
||||
these are two new ndarray instances that are independent of their parents.
|
||||
"""
|
||||
child1 = parent1.copy()
|
||||
child2 = parent2.copy()
|
||||
for key in self.model_start:
|
||||
if random.random() >= 0.5:
|
||||
child1[key], child2[key] = parent2[key], parent1[key]
|
||||
return child1, child2
|
||||
|
||||
def mutate_weak(self, model, probability):
|
||||
"""
|
||||
apply a weak mutation to a model.
|
||||
|
||||
each parameter is changed to a different value in the parameter space at the given probability.
|
||||
the amount of change has a gaussian distribution with a standard deviation of mutation_step.
|
||||
|
||||
@param[in,out] model: structured numpy.ndarray holding the model parameters.
|
||||
model is modified in place.
|
||||
|
||||
@param probability: probability between 0 and 1 at which to change a parameter.
|
||||
0 = no change, 1 = force change.
|
||||
|
||||
@return: model (same instance as the @c model input argument).
|
||||
"""
|
||||
for key in self.model_start:
|
||||
if random.random() < probability:
|
||||
model[key] += random.gauss(0, self.mutation_step[key])
|
||||
return model
|
||||
|
||||
def mutate_strong(self, model, probability):
|
||||
"""
|
||||
apply a strong mutation to a model.
|
||||
|
||||
each parameter is changed to a random value in the parameter space at the given probability.
|
||||
|
||||
@param[in,out] model: structured numpy.ndarray holding the model parameters.
|
||||
model is modified in place.
|
||||
|
||||
@param probability: probability between 0 and 1 at which to change a parameter.
|
||||
0 = no change, 1 = force change.
|
||||
|
||||
@return: model (same instance as the @c model input argument).
|
||||
"""
|
||||
for key in self.model_start:
|
||||
if random.random() < probability:
|
||||
model[key] = (self.model_max[key] - self.model_min[key]) * random.random() + self.model_min[key]
|
||||
return model
|
||||
|
||||
def mutate_duplicate(self, model):
|
||||
"""
|
||||
mutate a model if it is identical to a previously calculated one.
|
||||
|
||||
if the model was calculated before, the mutate_weak mutation is applied with probability 1.
|
||||
|
||||
@param[in,out] model: structured numpy.ndarray holding the model parameters.
|
||||
model is modified in place.
|
||||
|
||||
@return: model (same instance as the @c model input argument).
|
||||
"""
|
||||
try:
|
||||
self.find_model(model)
|
||||
self.mutate_weak(model, 1.0)
|
||||
except ValueError:
|
||||
pass
|
||||
return model
|
||||
|
||||
|
||||
class GeneticOptimizationHandler(population.PopulationHandler):
|
||||
"""
|
||||
model handler which implements a genetic algorithm.
|
||||
|
||||
"""
|
||||
|
||||
def __init__(self):
|
||||
super(GeneticOptimizationHandler, self).__init__()
|
||||
self._pop = GeneticPopulation()
|
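a small numeric sketch of the mutation_step fallback in GeneticPopulation.setup() above (range and population size are invented):

model_min, model_max, size = -0.5, 0.5, 20   # hypothetical parameter range, 20 individuals
step = 0.0                                   # model_space.step == 0 triggers the fallback
mutation_step = step if step != 0 else (model_max - model_min) / size   # = 0.05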
@@ -8,7 +8,7 @@ the optimization task is distributed over multiple processes using MPI.
the optimization must be started with N+1 processes in the MPI environment,
where N equals the number of fit parameters.

IMPLEMENTATION IN PROGRESS - DEBUGGING
THIS MODULE IS NOT INTEGRATED INTO PMSCO YET.

Requires: scipy, numpy

@@ -109,7 +109,7 @@ class MscMaster(MscProcess):

    def setup(self, project):
        super(MscMaster, self).setup(project)
        self.dom = project.create_domain()
        self.dom = project.create_model_space()
        self.running_slaves = self.slaves

        self._outfile = open(self.project.output_file + ".dat", "w")
@@ -13,14 +13,19 @@ Licensed under the Apache License, Version 2.0 (the "License"); @n
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
import copy
import os
from __future__ import print_function

import datetime
import math
import numpy as np
import logging
import handlers
from helpers import BraceMessage as BMsg

from pmsco.compat import open
import pmsco.handlers as handlers
import pmsco.graphics as graphics
from pmsco.helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)

@@ -58,7 +63,7 @@ class GridPopulation(object):
    ## @var positions
    # (numpy.ndarray) flat list of grid coordinates and results.
    #
    # the column names include the names of the model parameters, taken from domain.start,
    # the column names include the names of the model parameters, taken from model_space.start,
    # and the special names @c '_model', @c '_rfac'.
    # the special fields have the following meanings:
    #
@@ -108,41 +113,40 @@ class GridPopulation(object):
        dt.sort(key=lambda t: t[0].lower())
        return dt

    def setup(self, domain):
    def setup(self, model_space):
        """
        set up the population and result arrays.

        @param domain: definition of initial and limiting model parameters
        @param model_space: (pmsco.project.ModelSpace)
            definition of initial and limiting model parameters
            expected by the cluster and parameters functions.
            the attributes have the following meanings:
            @arg start: values of the fixed parameters.
            @arg min: minimum values allowed.
            @arg max: maximum values allowed.
                if abs(max - min) < step/2, the parameter is kept constant.
            @arg step: step size (distance between two grid points).
                if step <= 0, the parameter is kept constant.

        @param domain.start: values of the fixed parameters.

        @param domain.min: minimum values allowed.

        @param domain.max: maximum values allowed.
        if abs(max - min) < step/2, the parameter is kept constant.

        @param domain.step: step size (distance between two grid points).
        if step <= 0, the parameter is kept constant.
        """
        self.model_start = domain.start
        self.model_min = domain.min
        self.model_max = domain.max
        self.model_step = domain.step
        self.model_start = model_space.start
        self.model_min = model_space.min
        self.model_max = model_space.max
        self.model_step = model_space.step

        self.model_count = 1
        self.search_keys = []
        self.fixed_keys = []
        scales = []

        for p in domain.step.keys():
            if domain.step[p] > 0:
                n = np.round((domain.max[p] - domain.min[p]) / domain.step[p]) + 1
        for p in model_space.step.keys():
            if model_space.step[p] > 0:
                n = int(np.round((model_space.max[p] - model_space.min[p]) / model_space.step[p]) + 1)
            else:
                n = 1
            if n > 1:
                self.search_keys.append(p)
                scales.append(np.linspace(domain.min[p], domain.max[p], n))
                scales.append(np.linspace(model_space.min[p], model_space.max[p], n))
            else:
                self.fixed_keys.append(p)

@@ -218,7 +222,7 @@ class GridPopulation(object):

        @raise AssertionError if the number of rows of the two files differ.
        """
        data = np.genfromtxt(filename, names=True)
        data = np.atleast_1d(np.genfromtxt(filename, names=True))
        assert data.shape == array.shape
        for name in data.dtype.names:
            array[name] = data[name]
@@ -295,12 +299,12 @@ class GridSearchHandler(handlers.ModelHandler):
        the minimum number of slots is 1, the recommended value is 10 or greater.
        the population size is set to at least 4.

        @return:
        @return (int) number of models to be calculated.
        """
        super(GridSearchHandler, self).setup(project, slots)

        self._pop = GridPopulation()
        self._pop.setup(self._project.create_domain())
        self._pop.setup(self._project.create_model_space())
        self._invalid_limit = max(slots, self._invalid_limit)

        self._outfile = open(self._project.output_file + ".dat", "w")
@@ -308,7 +312,7 @@ class GridSearchHandler(handlers.ModelHandler):
        self._outfile.write(" ".join(self._pop.positions.dtype.names))
        self._outfile.write("\n")

        return None
        return self._pop.model_count

    def cleanup(self):
        self._outfile.close()
@@ -341,13 +345,18 @@ class GridSearchHandler(handlers.ModelHandler):
            time_pending += self._model_time
            if time_pending > time_avail:
                self._timeout = True
                logger.warning("time limit reached")

        if self._invalid_count > self._invalid_limit:
            self._timeout = True
            logger.error("number of invalid calculations (%u) exceeds limit", self._invalid_count)

        model = self._next_model
        if not self._timeout and model < self._pop.model_count and self._invalid_count < self._invalid_limit:
        if not self._timeout and model < self._pop.model_count:
            new_task = parent_task.copy()
            new_task.parent_id = parent_id
            pos = self._pop.positions[model]
            new_task.model = {k:pos[k] for k in pos.dtype.names}
            new_task.model = {k: pos[k] for k in pos.dtype.names}
            new_task.change_id(model=model)

            child_id = new_task.id
@@ -374,17 +383,10 @@ class GridSearchHandler(handlers.ModelHandler):
        del self._pending_tasks[task.id]
        parent_task = self._parent_tasks[task.parent_id]

        rfac = 1.0
        if task.result_valid:
            try:
                rfac = self._project.calc_rfactor(task)
            except ValueError:
                task.result_valid = False
                self._invalid_count += 1
                logger.warning(BMsg("calculation of model {0} resulted in an undefined R-factor.", task.id.model))

        task.model['_rfac'] = rfac
        self._pop.add_result(task.model, rfac)
        assert not math.isnan(task.rfac)
        task.model['_rfac'] = task.rfac
        self._pop.add_result(task.model, task.rfac)

        if self._outfile:
            s = (str(task.model[name]) for name in self._pop.positions.dtype.names)
@@ -392,12 +394,14 @@ class GridSearchHandler(handlers.ModelHandler):
            self._outfile.write("\n")
            self._outfile.flush()

        self._project.files.update_model_rfac(task.id.model, rfac)
        self._project.files.update_model_rfac(task.id.model, task.rfac)
        self._project.files.set_model_complete(task.id.model, True)

        if task.result_valid:
            if task.time > self._model_time:
                self._model_time = task.time
        else:
            self._invalid_count += 1

        # grid search complete?
        if len(self._pending_tasks) == 0:
@@ -407,3 +411,17 @@ class GridSearchHandler(handlers.ModelHandler):

        self.cleanup_files()
        return parent_task

    def save_report(self, root_task):
        """
        generate a graphical summary of the optimization.

        @param root_task: (CalculationTask) the id.model attribute is used to register the generated files.

        @return: None
        """
        super(GridSearchHandler, self).save_report(root_task)

        files = graphics.rfactor.render_results(self._project.output_file + ".dat", self._pop.positions)
        for f in files:
            self._project.files.add_file(f, root_task.id.model, "report")
1369 pmsco/optimizers/population.py Normal file
File diff suppressed because it is too large
139 pmsco/optimizers/swarm.py Normal file
@@ -0,0 +1,139 @@
"""
@package pmsco.optimizers.swarm
particle swarm optimization handler.

the module starts multiple MSC calculations and optimizes the model parameters
according to the particle swarm optimization algorithm.

Particle swarm optimization adapted from
D. A. Duncan et al., Surface Science 606, 278 (2012)

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

@copyright (c) 2015-18 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import logging
import numpy as np
import pmsco.optimizers.population as population
from pmsco.helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)


class SwarmPopulation(population.Population):
    """
    particle swarm population.
    """

    ## @var friends
    # number of other particles that each particle consults for the global best fit.
    # default = 3.

    ## @var momentum
    # momentum of the particle.
    # default = 0.689343.

    ## @var attract_local
    # preference for returning to the local best fit
    # default = 1.92694.

    ## @var attract_global
    # preference for heading towards the global best fit.
    # default = 1.92694

    def __init__(self):
        """
        initialize the population object.

        """
        super(SwarmPopulation, self).__init__()

        self.friends = 3
        self.momentum = 0.689343
        self.attract_local = 1.92694
        self.attract_global = 1.92694
        self.position_constrain_mode = 'default'
        self.velocity_constrain_mode = 'default'

    def advance_population(self):
        """
        advance the population by one step.

        this method just calls advance_particle() for each particle of the population.
        if generation is lower than zero, the method increases the generation number but does not advance the particles.

        @return: None
        """
        if not self._hold_once:
            self.generation += 1
            for index, __ in enumerate(self.pos):
                self.advance_particle(index)

        super(SwarmPopulation, self).advance_population()

    def advance_particle(self, index):
        """
        advance a particle by one step.

        @param index: index of the particle in the population.
        """

        # note: the following two identifiers are views,
        # assignment will modify the original array
        pos = self.pos[index]
        vel = self.vel[index]
        # best fit that this individual has seen
        xl = self.best[index]
        # best fit that a group of others have seen
        xg = self.best_friend(index)

        for key in self.model_start:
            # update velocity
            dxl = xl[key] - pos[key]
            dxg = xg[key] - pos[key]
            pv = np.random.random()
            pl = np.random.random()
            pg = np.random.random()
            vel[key] = (self.momentum * pv * vel[key] +
                        self.attract_local * pl * dxl +
                        self.attract_global * pg * dxg)
            pos[key], vel[key], self.model_min[key], self.model_max[key] = \
                self.constrain_velocity(pos[key], vel[key], self.model_min[key], self.model_max[key],
                                        self.velocity_constrain_mode)
            # update position
            pos[key] += vel[key]
            pos[key], vel[key], self.model_min[key], self.model_max[key] = \
                self.constrain_position(pos[key], vel[key], self.model_min[key], self.model_max[key],
                                        self.position_constrain_mode)

        self.update_particle_info(index)

    # noinspection PyUnusedLocal
    def best_friend(self, index):
        """
        select the best fit out of a random set of particles

        returns the "best friend"
        """
        friends = np.random.choice(self.best, self.friends, replace=False)
        index = np.argmin(friends['_rfac'])
        return friends[index]


class ParticleSwarmHandler(population.PopulationHandler):
    """
    model handler which implements the particle swarm optimization algorithm.

    """

    def __init__(self):
        super(ParticleSwarmHandler, self).__init__()
        self._pop = SwarmPopulation()
155 pmsco/optimizers/table.py Normal file
@@ -0,0 +1,155 @@
"""
@package pmsco.table
table scan optimization handler

the table scan scans through an explicit table of model parameters.
it can be used to recalculate models from a previous optimization run on different scans,
or as an interface to external optimizers.
new elements can be added to the table while the calculation loop is in progress.

though the concepts _population_ and _optimization_ are not intrinsic to a table scan,
the classes defined here inherit from the generic population class and optimization handler.
this is done to share as much code as possible between the different optimizers.
the only difference is that the table optimizer does not generate models internally.
instead, it loads them (possibly repeatedly) from a file or asks the project code to provide the data.

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

@copyright (c) 2015-18 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import logging
import numpy as np
import pmsco.optimizers.population as population
from pmsco.helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)


class TablePopulation(population.Population):
    """
    population generated from explicit values.

    this class maintains a population that is updated from a table of explicit values.
    the table can be static (defined at the start of the optimization process)
    or dynamic (new models appended during the optimization process).

    for each generation, the table is read and the next models are imported into the population.
    the class de-duplicates the table, i.e. models with equal parameters as a previous one are not calculated again.
    it is, thus, perfectly fine that new models are appended to the table rather than overwrite previous entries.

    the table can be built from the following data sources:

    @arg (numpy.ndarray): structured array that can be added to self.positions,
        having at least the columns defining the model parameters.
    @arg (sequence of dict, numpy.ndarray, numpy.void, named tuple):
        each element must be syntactically compatible with a dict
        that holds the model parameters.
    @arg (str): file name that contains a table in the same format as
        @ref pmsco.optimizers.population.Population.save_array produces.
    @arg (callable): a function that returns one of the above objects
        (or None to mark the end of the table).

    the data source is passed as an argument to the self.setup() method.
    structured arrays and sequences cannot be modified after they are passed to `setup`.
    this means that the complete table must be known at the start of the process.

    the most flexible way is to pass a function that generates a structured array in each call (see the sketch below).
    this would even allow including a non-standard optimization algorithm.
    the function is best defined in the custom project class.
    the population calls it every time before a new generation starts.
    to end the optimization process, it simply returns None.

    the table can also be defined in an external file, e.g. as calculated by other programs or edited manually.
    the table file can either remain unchanged during the optimization process,
    or new models can be added while the optimization is in progress.
    in the latter case, note that there is no reliable synchronization of file access.

    first, writing to the file must be as short as possible.
    the population class has a read timeout of ten seconds.

    second, because it is impossible to know whether the file has been read or not,
    new models should be _appended_ rather than used to _overwrite_ previous ones.
    the population class automatically skips models that have already been read.

    this class does not support seeding.
    although a seed file is accepted, it is not used.
    patching is allowed, but there is normally no advantage over modifying the table.

    the model space is used to define the model parameters and the parameter range.
    models violating the parameter model space are ignored.
    """
|
||||
## @var table_source
|
||||
# data source of the model table
|
||||
#
|
||||
# this can be any object accepted by @ref pmsco.optimizers.population.Population.import_positions,
|
||||
# e.g. a file name, a numpy structured array, or a function returning a structured array.
|
||||
# see the class description for details.
|
||||
|
||||
def __init__(self):
|
||||
"""
|
||||
initialize the population object.
|
||||
|
||||
"""
|
||||
super(TablePopulation, self).__init__()
|
||||
self.table_source = None
|
||||
self.position_constrain_mode = 'error'
|
||||
|
||||
def setup(self, size, model_space, **kwargs):
|
||||
"""
|
||||
set up the population arrays, parameter model space and data source.
|
||||
|
||||
@param size: requested number of particles.
|
||||
this does not need to correspond to the number of table entries.
|
||||
on each generation the population loads up to this number of new entries from the table source.
|
||||
|
||||
@param model_space: definition of initial and limiting model parameters
|
||||
expected by the cluster and parameters functions.
|
||||
@arg model_space.start: not used.
|
||||
@arg model_space.min: minimum values allowed.
|
||||
@arg model_space.max: maximum values allowed.
|
||||
@arg model_space.step: not used.
|
||||
|
||||
the following arguments are keyword arguments.
|
||||
the method also accepts the inherited arguments for seeding. they do not have an effect, however.
|
||||
|
||||
@param table_source: data source of the model table.
|
||||
this can be any object accepted by @ref pmsco.optimizers.population.Population.import_positions,
|
||||
e.g. a file name, a numpy structured array, or a function returning a structured array.
|
||||
see the class description for details.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
super(TablePopulation, self).setup(size, model_space, **kwargs)
|
||||
self.table_source = kwargs['table_source']
|
||||
|
||||
def advance_population(self):
|
||||
"""
|
||||
advance the population by one step.
|
||||
|
||||
this methods re-imports the table file
|
||||
and copies the table to current population.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
self.import_positions(self.table_source)
|
||||
self.advance_from_import()
|
||||
super(TablePopulation, self).advance_population()
|
||||
|
||||
|
||||
class TableModelHandler(population.PopulationHandler):
|
||||
"""
|
||||
model handler which implements the table algorithm.
|
||||
|
||||
"""
|
||||
def __init__(self):
|
||||
super(TableModelHandler, self).__init__()
|
||||
self._pop = TablePopulation()
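A minimal sketch of the callable data source described in the TablePopulation docstring (illustration only, not part of this changeset; the parameter names 'dlat' and 'rmax' and the factory name are hypothetical):

import numpy as np

def make_table_source():
    """hypothetical factory for a one-batch table source."""
    batches = [np.array([(1.0, 5.0), (1.1, 5.0)],
                        dtype=[('dlat', 'f4'), ('rmax', 'f4')])]

    def table_source():
        # TablePopulation calls this before each generation;
        # returning None marks the end of the table.
        return batches.pop(0) if batches else None

    return table_source

# usage sketch: pop.setup(size, model_space, table_source=make_table_source())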
239 pmsco/pmsco.py
@ -4,13 +4,9 @@
@package pmsco.pmsco
PEARL Multiple-Scattering Calculation and Structural Optimization

this is the main entry point and top-level interface of the PMSCO package.
all calculations (any mode, any project) start by calling the main_pmsco() function of this module.
the module also provides a command line parser.

command line usage: call with -h option to see the list of arguments.

python usage: call main_pmsco() with suitable arguments.
this is the top-level interface of the PMSCO package.
all calculations (any mode, any project) start by calling the run_project() function of this module.
the module also provides a command line parser for common options.

for parallel execution, prefix the command line with mpi_exec -np NN, where NN is the number of processes to use.
note that in parallel mode, one process takes the role of the coordinator (master).
@ -24,48 +20,37 @@ PMSCO serializes the calculations automatically.
the code of the main module is independent of a particular calculation project.
all project-specific code must be in a separate python module.
the project module must implement a class derived from pmsco.project.Project,
and a global function create_project which returns a new instance of the derived project class.
and call run_project() with an instance of the project class.
refer to the projects folder for examples.

@pre
* python 2.7, including python-pip
* numpy
* nose from Debian python-nose
* statsmodels from Debian python-statsmodels, or PyPI (https://pypi.python.org/pypi/statsmodels)
* periodictable from PyPI (https://pypi.python.org/pypi/periodictable)
* mpi4py from PyPI (the Debian package may have a bug causing the program to crash)
* OpenMPI, including libopenmpi-dev
* SWIG from Debian swig

to install a PyPI package, e.g. periodictable, do
@code{.sh}
pip install --user periodictable
@endcode

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

@copyright (c) 2015 by Paul Scherrer Institut @n
@copyright (c) 2015-18 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
from builtins import range
import datetime
import logging
import importlib
import os.path
import sys
import datetime
import argparse
import logging
import cluster
import dispatch
import handlers
import files
import calculator
import swarm
import grid
# import gradient

from mpi4py import MPI

import pmsco.dispatch as dispatch
import pmsco.files as files
import pmsco.handlers as handlers
from pmsco.optimizers import genetic, swarm, grid, table

# the module-level logger
logger = logging.getLogger(__name__)

@ -146,25 +131,28 @@ def set_common_args(project, args):
    if args.output_file:
        project.set_output(args.output_file)
        log_file = args.output_file + ".log"
    if args.db_file:
        project.db_file = args.db_file
    if args.log_file:
        log_file = args.log_file
    setup_logging(enable=args.log_enable, filename=log_file, level=args.log_level)

    logger.debug("creating project")
    mode = args.mode.lower()
    if mode in {'single', 'grid', 'swarm'}:
    if mode in {'single', 'grid', 'swarm', 'genetic', 'table'}:
        project.mode = mode
    else:
        logger.error("invalid optimization mode '%s'.", mode)

    if args.pop_size:
        project.pop_size = args.pop_size
        project.optimizer_params['pop_size'] = args.pop_size

    code = args.code.lower()
    if code in {'edac', 'msc', 'test'}:
        project.code = code
    else:
        logger.error("invalid code argument")
    if args.seed_file:
        project.optimizer_params['seed_file'] = args.seed_file
    if args.seed_limit:
        project.optimizer_params['seed_limit'] = args.seed_limit
    if args.table_file:
        project.optimizer_params['table_file'] = args.table_file

    if args.time_limit:
        project.set_timedelta_limit(datetime.timedelta(hours=args.time_limit))
@ -178,27 +166,10 @@ def set_common_args(project, args):
    if mode == 'single':
        cats -= {'model'}
    project.files.categories_to_delete = cats


def log_project_args(project):
    """
    send some common project arguments to the log.

    @param project: project instance (sub-class of pmsco.project.Project).
    @return: None
    """
    try:
        logger.info("scattering code: {0}".format(project.code))
        logger.info("optimization mode: {0}".format(project.mode))
        logger.info("minimum swarm size: {0}".format(project.pop_size))

        logger.info("data directory: {0}".format(project.data_dir))
        logger.info("output file: {0}".format(project.output_file))

        _files_to_keep = files.FILE_CATEGORIES - project.files.categories_to_delete
        logger.info("intermediate files to keep: {0}".format(", ".join(_files_to_keep)))
    except AttributeError:
        logger.warning("AttributeError in log_project_args")
    if args.keep_levels > project.keep_levels:
        project.keep_levels = args.keep_levels
    if args.keep_best > project.keep_best:
        project.keep_best = args.keep_best


def run_project(project):
@ -208,7 +179,11 @@ def run_project(project):

    @param project:
    @return:
    """
    log_project_args(project)
    # log project arguments only in rank 0
    mpi_comm = MPI.COMM_WORLD
    mpi_rank = mpi_comm.Get_rank()
    if mpi_rank == 0:
        project.log_project_args()

    optimizer_class = None
    if project.mode == 'single':
@ -217,36 +192,21 @@ def run_project(project):
        optimizer_class = grid.GridSearchHandler
    elif project.mode == 'swarm':
        optimizer_class = swarm.ParticleSwarmHandler
    elif project.mode == 'genetic':
        optimizer_class = genetic.GeneticOptimizationHandler
    elif project.mode == 'gradient':
        logger.error("gradient search not implemented")
        # TODO: implement gradient search
        # optimizer_class = gradient.GradientSearchHandler
    elif project.mode == 'table':
        optimizer_class = table.TableModelHandler
    else:
        logger.error("invalid optimization mode '%s'.", project.mode)
    project.handler_classes['model'] = optimizer_class

    project.handler_classes['region'] = handlers.choose_region_handler_class(project)

    calculator_class = None
    if project.code == 'edac':
        logger.debug("importing EDAC interface")
        import edac_calculator
        project.cluster_format = cluster.FMT_EDAC
        calculator_class = edac_calculator.EdacCalculator
    elif project.code == 'msc':
        logger.debug("importing MSC interface")
        import msc_calculator
        project.cluster_format = cluster.FMT_MSC
        calculator_class = msc_calculator.MscCalculator
    elif project.code == 'test':
        logger.debug("importing TEST interface")
        project.cluster_format = cluster.FMT_EDAC
        calculator_class = calculator.TestCalculator
    else:
        logger.error("invalid code argument")
    project.calculator_class = calculator_class

    if project and optimizer_class and calculator_class:
    if project and optimizer_class:
        logger.info("starting calculations")
        try:
            dispatch.run_calculations(project)
@ -273,7 +233,7 @@ class Args(object):
    values as the command line parser.
    """

    def __init__(self, mode="single", code="edac", output_file=""):
    def __init__(self, mode="single", output_file="pmsco_data"):
        """
        constructor.

@ -284,14 +244,19 @@ class Args(object):
        """
        self.mode = mode
        self.pop_size = 0
        self.code = code
        self.data_dir = os.getcwd()
        self.seed_file = ""
        self.seed_limit = 0
        self.data_dir = ""
        self.output_file = output_file
        self.db_file = ""
        self.time_limit = 24.0
        self.keep_files = []
        self.keep_files = files.FILE_CATEGORIES_TO_KEEP
        self.keep_best = 10
        self.keep_levels = 1
        self.log_level = "WARNING"
        self.log_file = ""
        self.log_enable = True
        self.table_file = ""


def get_cli_parser(default_args=None):
@ -301,6 +266,7 @@ def get_cli_parser(default_args=None):
    KEEP_FILES_CHOICES = files.FILE_CATEGORIES | {'all'}

    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        description="""
multiple-scattering calculations and optimization

@ -309,7 +275,7 @@ def get_cli_parser(default_args=None):

1) a project class derived from pmsco.project.Project.
   the class implements/overrides all necessary methods of the calculation project,
   in particular create_domain, create_cluster, and create_params.
   in particular create_model_space, create_cluster, and create_params.

2) a global function named create_project.
   the function accepts a namespace object from the argument parser.
@ -324,33 +290,50 @@ def get_cli_parser(default_args=None):
    # for simplicity, the parser does not check these requirements.
    # all parameters are optional and accepted regardless of mode.
    # errors may occur if implicit requirements are not met.
    parser.add_argument('-m', '--mode', default='single',
                        choices=['single', 'grid', 'swarm', 'gradient'],
    parser.add_argument('project_module',
                        help="path to custom module that defines the calculation project")
    parser.add_argument('-m', '--mode', default=default_args.mode,
                        choices=['single', 'grid', 'swarm', 'genetic', 'table'],
                        help='calculation mode')
    parser.add_argument('--pop-size', type=int, default=0,
                        help='population size (number of particles) in swarm optimization mode. ' +
                             'default is the greater of 4 or two times the number of calculation processes.')
    parser.add_argument('-c', '--code', choices=['msc', 'edac', 'test'], default="edac",
                        help='scattering code (default: edac)')
    parser.add_argument('-d', '--data-dir', default=os.getcwd(),
    parser.add_argument('--pop-size', type=int, default=default_args.pop_size,
                        help='population size (number of particles) in swarm or genetic optimization mode. ' +
                             'default is the greater of 4 or the number of calculation processes.')
    parser.add_argument('--seed-file',
                        help='path and name of population seed file. ' +
                             'population data of previous optimizations can be used to seed a new optimization. ' +
                             'the file must have the same structure as the .pop or .dat files.')
    parser.add_argument('--seed-limit', type=int, default=default_args.seed_limit,
                        help='maximum number of models to use from the seed file. ' +
                             'the models with the best R-factors are selected.')
    parser.add_argument('-d', '--data-dir', default=default_args.data_dir,
                        help='directory path for experimental data files (if required by project). ' +
                             'default: working directory')
    parser.add_argument('-o', '--output-file',
                        help='base path for intermediate and output files.' +
                             'default: pmsco_data')
    parser.add_argument('-k', '--keep-files', nargs='*', default=files.FILE_CATEGORIES_TO_KEEP,
    parser.add_argument('-o', '--output-file', default=default_args.output_file,
                        help='base path for intermediate and output files.')
    parser.add_argument('-b', '--db-file', default=default_args.db_file,
                        help='name of an sqlite3 database file where the results should be stored.')
    parser.add_argument('--table-file',
                        help='path and name of population table file for table optimization mode. ' +
                             'the file must have the same structure as the .pop or .dat files.')
    parser.add_argument('-k', '--keep-files', nargs='*', default=default_args.keep_files,
                        choices=KEEP_FILES_CHOICES,
                        help='output file categories to keep after the calculation. '
                             'by default, cluster and model (simulated data) '
                             'of a limited number of best models are kept.')
    parser.add_argument('-t', '--time-limit', type=float, default=24.0,
                        help='wall time limit in hours. the optimizers try to finish before the limit. default: 24.')
    parser.add_argument('--keep-best', type=int, default=default_args.keep_best,
                        help='number of best models for which to keep result files '
                             '(at each node from root down to keep-levels).')
    parser.add_argument('--keep-levels', type=int, choices=range(5),
                        default=default_args.keep_levels,
                        help='task level down to which result files of best models are kept. '
                             '0 = model, 1 = scan, 2 = domain, 3 = emitter, 4 = region.')
    parser.add_argument('-t', '--time-limit', type=float, default=default_args.time_limit,
                        help='wall time limit in hours. the optimizers try to finish before the limit.')
    parser.add_argument('--log-file', default=default_args.log_file,
                        help='name of the main log file. ' +
                             'under MPI, the rank of the process is inserted before the extension. ' +
                             'defaults: output file + log, or pmsco.log.')
                             'under MPI, the rank of the process is inserted before the extension.')
    parser.add_argument('--log-level', default=default_args.log_level,
                        help='minimum level of log messages. DEBUG, INFO, WARNING, ERROR, CRITICAL. default: WARNING.')
                        help='minimum level of log messages. DEBUG, INFO, WARNING, ERROR, CRITICAL.')
    feature_parser = parser.add_mutually_exclusive_group(required=False)
    feature_parser.add_argument('--log-enable', dest='log_enable', action="store_true",
                                help="enable logging. by default, logging is on.")
@ -375,7 +358,47 @@ def parse_cli():
    return args, unknown_args


def import_project_module(path):
    """
    import the custom project module.

    imports the project module given its file path.
    the path is expanded to its absolute form and appended to the python path.

    @param path: path and name of the module to be loaded.
        path is optional and defaults to the python path.
        if the name includes an extension, it is stripped off.

    @return: the loaded module as a python object
    """
    path, name = os.path.split(path)
    name, __ = os.path.splitext(name)
    path = os.path.abspath(path)
    sys.path.append(path)
    project_module = importlib.import_module(name)
    return project_module


def main():
    args, unknown_args = parse_cli()

    if args:
        module = import_project_module(args.project_module)
        try:
            project_args = module.parse_project_args(unknown_args)
        except NameError:
            project_args = None

        project = module.create_project()
        set_common_args(project, args)
        try:
            module.set_project_args(project, project_args)
        except NameError:
            pass

        run_project(project)


if __name__ == '__main__':
    main_parser = get_cli_parser()
    main_parser.print_help()
    main()
    sys.exit(0)
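For illustration only, a minimal project module compatible with this entry point could look like the sketch below; the class name and parameter values are hypothetical, and a real project must also implement create_cluster and create_params as noted in the parser description:

# my_project.py -- hypothetical sketch, not part of this changeset
import pmsco.project as project


class MyProject(project.Project):
    def create_model_space(self):
        # placeholder parameters, following the usage in projects/common/clusters/crystals.py
        spa = project.ModelSpace()
        spa.add_param('dlat', 10.)
        spa.add_param('rmax', 5.0)
        return spa


def create_project():
    # main() imports this module and calls create_project()
    return MyProject()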
1313 pmsco/project.py
File diff suppressed because it is too large
909 pmsco/swarm.py
@ -1,909 +0,0 @@
"""
@package pmsco.swarm
particle swarm optimization handler.

the module starts multiple MSC calculations and optimizes the model parameters
according to the particle swarm optimization algorithm.

Particle swarm optimization adapted from
D. A. Duncan et al., Surface Science 606, 278 (2012)

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

@copyright (c) 2015 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import division
import copy
import os
import datetime
import logging
import numpy as np
import handlers
from helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)


CONSTRAIN_MODES = {'re-enter', 'bounce', 'scatter', 'stick', 'expand'}


class Population(object):
    """
    particle swarm population.
    """

    ## @var size_req
    # requested number of particles.
    # read-only. call setup() to change this attribute.

    ## @var model_start
    # (dict) initial model parameters.
    # read-only. call setup() to change this attribute.

    ## @var model_min
    # (dict) low limits of the model parameters.
    # read-only. call setup() to change this attribute.

    ## @var model_max
    # (dict) high limits of the model parameters.
    # if min == max, the parameter is kept constant.
    # read-only. call setup() to change this attribute.

    ## @var model_step
    # (dict) initial velocity (difference between two steps) of the particle.
    # read-only. call setup() to change this attribute.

    ## @var friends
    # number of other particles that each particle consults for the global best fit.
    # default = 3.

    ## @var momentum
    # momentum of the particle.
    # default = 0.689343.

    ## @var attract_local
    # preference for returning to the local best fit.
    # default = 1.92694.

    ## @var attract_global
    # preference for heading towards the global best fit.
    # default = 1.92694.

    ## @var generation
    # generation number. the counter is incremented by advance_population().
    # initial value = 0.

    ## @var model_count
    # model number.
    # the counter is incremented by advance_particle() each time a particle position is changed.
    # initial value = 0.

    ## @var pos
    # (numpy.ndarray) current positions of each particle.
    #
    # the column names include the names of the model parameters, taken from domain.start,
    # and the special names @c '_particle', @c '_model', @c '_rfac'.
    # the special fields have the following meanings:
    #
    # * @c '_particle': index of the particle in the array.
    #   the particle index is used to match a calculation result and its original particle.
    #   it must be preserved during the calculation process.
    #
    # * @c '_gen': generation number.
    #   the generation number counts the number of calls to advance_population().
    #   this field is not used internally.
    #   the first population is generation 0.
    #
    # * @c '_model': model number.
    #   the model number counts the number of calls to advance_particle().
    #   the field is filled with the current value of model_count whenever the position is changed.
    #   this field is not used internally.
    #   the model handlers use it to derive their model ID.
    #
    # * @c '_rfac': calculated R-factor for this position.
    #   this field is meaningful in the best and results arrays only,
    #   where it is set by the add_result() method.
    #   in the pos and vel arrays, the field value is arbitrary.
    #
    # @note if you read a single element, e.g. pos[0], from the array, you will get a numpy.void object.
    # this object is a <em>view</em> of the original array item.

    ## @var vel
    # (numpy.ndarray) the current velocities of each particle.
    # the structure is the same as for the pos array.

    ## @var best
    # (numpy.ndarray) best positions found by each particle so far.
    # the structure is the same as for the pos array.

    ## @var results
    # (numpy.ndarray) all positions and resulting R-factors calculated.
    # the structure is the same as for the pos array.

    ## @var _hold_once
    # (bool) hold the population once during the next update.
    # if _hold_once is True, advance_population() will skip the update process once.
    # this flag is set by setup() because it sets up a valid initial population.
    # the caller then doesn't have to care whether to skip advance_population() after setup.
    def __init__(self):
        """
        initialize the population object.

        """
        self.size_req = 0
        self.model_start = {}
        self.model_min = {}
        self.model_max = {}
        self.model_step = {}

        self.friends = 3
        self.momentum = 0.689343
        self.attract_local = 1.92694
        self.attract_global = 1.92694
        self.position_constrain_mode = 'default'
        self.velocity_constrain_mode = 'default'

        self.generation = 0
        self.model_count = 0
        self._hold_once = False

        self.pos = None
        self.vel = None
        self.best = None
        self.results = None

    def pos_gen(self):
        """
        generator for dictionaries of the pos array.

        the generator can be used to loop over the array.
        on each iteration, it yields a dictionary of the position at the current index.
        for example,
        @code{.py}
        for pos in pop.pos_gen():
            print pos['_particle'], pos['_rfac']
        @endcode
        """
        return ({name: pos[name] for name in pos.dtype.names} for pos in self.pos)

    def vel_gen(self):
        """
        generator for dictionaries of the vel array.

        @see pos_gen() for details.
        """
        return ({name: vel[name] for name in vel.dtype.names} for vel in self.vel)

    def best_gen(self):
        """
        generator for dictionaries of the best array.

        @see pos_gen() for details.
        """
        return ({name: best[name] for name in best.dtype.names} for best in self.best)

    def results_gen(self):
        """
        generator for dictionaries of the results array.

        @see pos_gen() for details.
        """
        return ({name: results[name] for name in results.dtype.names} for results in self.results)

    @staticmethod
    def get_model_dtype(model_params):
        """
        get numpy array data type for model parameters and swarm control variables.

        @param model_params: dictionary of model parameters or list of parameter names.

        @return: dtype for use with numpy array constructors.
            this is a sorted list of (name, type) tuples.
        """
        dt = []
        for key in model_params:
            dt.append((key, 'f4'))
        dt.append(('_particle', 'i4'))
        dt.append(('_gen', 'i4'))
        dt.append(('_model', 'i4'))
        dt.append(('_rfac', 'f4'))
        dt.sort(key=lambda t: t[0].lower())
        return dt
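A short sketch of what get_model_dtype produces, inferred from the code above (the parameter names are hypothetical):

dt = Population.get_model_dtype({'dlat': 10.0, 'rmax': 5.0})
# sorted by lower-case name, control fields included:
# [('_gen', 'i4'), ('_model', 'i4'), ('_particle', 'i4'), ('_rfac', 'f4'),
#  ('dlat', 'f4'), ('rmax', 'f4')]
pos = np.zeros(3, dtype=dt)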

    def setup(self, size, domain, history_file="", recalc_history=True):
        """
        set up the population arrays seeded with previous results and the start model.

        * set the population parameters and allocate the data arrays.
        * set one particle to the initial guess, and the others to positions from a previous results file.
          if the file contains fewer particles than allocated, the remaining particles are initialized randomly.

        seeding from a history file can be used to continue an interrupted optimization process.
        the method loads the results into the best and position arrays,
        and updates the other arrays and variables
        so that the population can be advanced and calculated.

        by default, the calculations of the previous parameters are repeated.
        this is recommended whenever the code, the experimental input, or the project arguments change
        because all of them may have an influence on the R-factor.

        re-calculation can be turned off by setting recalc_history to false.
        this is recommended only if the calculation is a direct continuation of a previous one
        without any changes to the code or input.
        in that case, the previous results are marked as generation -1 with a negative model number.
        upon the first iteration before running the scattering calculations,
        new parameters will be derived by the swarm algorithm.

        @param size: requested number of particles.

        @param domain: definition of initial and limiting model parameters
            expected by the cluster and parameters functions.

        @arg domain.start: initial guess.
        @arg domain.min: minimum values allowed.
        @arg domain.max: maximum values allowed. if min == max, the parameter is kept constant.
        @arg domain.step: initial velocity (difference between two steps) for particle swarm.

        @param history_file: name of the results history file.
            this can be a file created by the @ref save_array or @ref save_results methods.
            the columns of the plain-text file contain model parameters and
            the _rfac values of a previous calculation.
            additional columns are ignored.
            the first row must contain the column names.
            if a parameter column is missing,
            the corresponding parameter is seeded with a random value within the domain.
            in this case, a warning is added to the log file.

            the number of rows does not need to be equal to the population size.
            if it is lower, the remaining particles are initialized randomly.
            if it is higher, only the ones with the lowest R factors are used.
            results with R >= 1.0 are ignored in any case.

        @param recalc_history: select whether the R-factors of the historic models are calculated again.
            this is useful if the historic data was calculated for a different cluster, different set of parameters,
            or different experimental data, and if the R-factors of the new optimization may be systematically greater.
            set this argument to False only if the calculation is a continuation of a previous one
            without any changes to the code.

        @return: None
        """
        self.size_req = size
        self.model_start = domain.start
        self.model_min = domain.min
        self.model_max = domain.max
        self.model_step = domain.step

        # allocate arrays
        dt = self.get_model_dtype(self.model_start)
        self.pos = np.zeros(self.size_req, dtype=dt)
        self.vel = np.zeros(self.size_req, dtype=dt)
        self.results = np.empty((0), dtype=dt)

        # randomize population
        self.generation = 0
        self.randomize()
        self.pos['_particle'] = np.arange(self.size_req)
        self.pos['_gen'] = self.generation
        self.pos['_model'] = np.arange(self.size_req)
        self.pos['_rfac'] = 2.1
        self.model_count = self.size_req

        # add previous results
        if history_file:
            hist = np.genfromtxt(history_file, names=True)
            hist = hist[hist['_rfac'] < 1.0]
            hist.sort(order='_rfac')
            hist_size = min(hist.shape[0], self.size_req - 1)

            discarded_fields = {'_particle', '_gen', '_model'}
            source_fields = set(hist.dtype.names) - discarded_fields
            dest_fields = set(self.pos.dtype.names) - discarded_fields
            common_fields = source_fields & dest_fields
            if len(common_fields) < len(dest_fields):
                logger.warning(BMsg("missing columns in history file {hf} default to random seed value.",
                                    hf=history_file))
            for name in common_fields:
                self.pos[name][0:hist_size] = hist[name][0:hist_size]

            self.pos['_particle'] = np.arange(self.size_req)
            logger.info(BMsg("seeding swarm population with {hs} models from history file {hf}.",
                             hs=hist_size, hf=history_file))
            if recalc_history:
                self.pos['_gen'] = self.generation
                self.pos['_model'] = np.arange(self.size_req)
                self.pos['_rfac'] = 2.1
                logger.info("historic models will be re-calculated.")
            else:
                self.pos['_gen'][0:hist_size] = -1
                self.pos['_model'][0:hist_size] = -np.arange(hist_size) - 1
                self.model_count = self.size_req - hist_size
                self.pos['_model'][hist_size:] = np.arange(self.model_count)
                logger.info("historic models will not be re-calculated.")

        # seed last particle with start parameters
        self.seed(self.model_start, index=-1)

        # initialize best array
        self.best = self.pos.copy()

        self._hold_once = True
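A usage sketch of the seeding mechanism described above (the domain object and file name are hypothetical):

pop = Population()
# continue an interrupted run: re-use previous results and recalculate their R-factors
pop.setup(size=20, domain=my_domain, history_file="previous_run.dat",
          recalc_history=True)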

    def randomize(self, pos=True, vel=True):
        """
        initializes a random population.

        the position array is filled with random values (uniform distribution) from the parameter domain.
        velocity values are randomly chosen between -1/8 to 1/8 times the width (max - min) of the parameter domain.

        the method does not update the particle info fields.

        @param pos: randomize positions. if False, the positions are not changed.
        @param vel: randomize velocities. if False, the velocities are not changed.
        """
        if pos:
            for key in self.model_start:
                self.pos[key] = ((self.model_max[key] - self.model_min[key]) *
                                 np.random.random_sample(self.pos.shape) + self.model_min[key])
        if vel:
            for key in self.model_start:
                self.vel[key] = ((self.model_max[key] - self.model_min[key]) *
                                 (np.random.random_sample(self.pos.shape) - 0.5) / 4.0)

    def seed(self, params, index=0):
        """
        set one of the particles to the specified seed values.

        the method does not update the particle info fields.

        @param params: dictionary of model parameters.
            the keys must match the ones of domain.start.

        @param index: index of the particle that is seeded.
            the index must be in the allowed range of the self.pos array.
            0 is the first, -1 the last particle.
        """
        for key in params:
            self.pos[key][index] = params[key]

    def update_particle_info(self, index, inc_model=True):
        """
        set the internal particle info fields.

        the fields @c _particle, @c _gen, and @c _model are updated with the current values.
        @c _rfac is set to the default value 2.1.

        this method must be called after each change of particle position.

        @param index: (int) particle index.

        @param inc_model: (bool) if True, increment the model count afterwards.

        @return: None
        """
        self.pos['_particle'][index] = index
        self.pos['_gen'][index] = self.generation
        self.pos['_model'][index] = self.model_count
        self.pos['_rfac'][index] = 2.1

        if inc_model:
            self.model_count += 1

    def advance_population(self):
        """
        advance the population by one step.

        this method just calls advance_particle() for each particle of the population.
        if the _hold_once flag is set (as it is right after setup()), the method clears the flag
        and skips the update once, since setup() already provides a valid initial population.

        @return: None
        """
        if not self._hold_once:
            self.generation += 1
            for index, __ in enumerate(self.pos):
                self.advance_particle(index)
        self._hold_once = False

    def advance_particle(self, index):
        """
        advance a particle by one step.

        @param index: index of the particle in the population.
        """

        # note: the following two identifiers are views,
        # assignment will modify the original array
        pos = self.pos[index]
        vel = self.vel[index]
        # best fit that this individual has seen
        xl = self.best[index]
        # best fit that a group of others have seen
        xg = self.best_friend(index)

        for key in self.model_start:
            # update velocity
            dxl = xl[key] - pos[key]
            dxg = xg[key] - pos[key]
            pv = np.random.random()
            pl = np.random.random()
            pg = np.random.random()
            vel[key] = (self.momentum * pv * vel[key] +
                        self.attract_local * pl * dxl +
                        self.attract_global * pg * dxg)
            pos[key], vel[key], self.model_min[key], self.model_max[key] = \
                self.constrain_velocity(pos[key], vel[key], self.model_min[key], self.model_max[key],
                                        self.velocity_constrain_mode)
            # update position
            pos[key] += vel[key]
            pos[key], vel[key], self.model_min[key], self.model_max[key] = \
                self.constrain_position(pos[key], vel[key], self.model_min[key], self.model_max[key],
                                        self.position_constrain_mode)

        self.update_particle_info(index)

    @staticmethod
    def constrain_velocity(_pos, _vel, _min, _max, _mode='default'):
        """
        constrain a velocity to the given bounds.

        @param _pos: current position of the particle.

        @param _vel: new velocity of the particle, i.e. distance to move.

        @param _min: lower position boundary.

        @param _max: upper position boundary.
            _max must be greater or equal to _min.

        @param _mode: what to do if a boundary constraint is violated.
            reserved for future use. should be set to 'default'.

        @return: tuple (new position, new velocity, new lower boundary, new upper boundary).
            in the current implementation only the velocity may change.
            however, in future versions any of these values may change.
        """
        d = abs(_max - _min) / 2.0
        if d > 0.0:
            while abs(_vel) >= d:
                _vel /= 2.0
        else:
            _vel = 0.0
        return _pos, _vel, _min, _max

    @staticmethod
    def constrain_position(_pos, _vel, _min, _max, _mode='default'):
        """
        constrain a position to the given bounds.

        @param _pos: new position of the particle, possibly out of bounds.

        @param _vel: velocity of the particle, i.e. distance from the previous position.
            _vel must be lower than _max - _min.

        @param _min: lower boundary.

        @param _max: upper boundary.
            _max must be greater or equal to _min.

        @param _mode: what to do if a boundary constraint is violated:
        @arg 're-enter': re-enter from the opposite side of the parameter interval.
        @arg 'bounce': fold the motion vector at the boundary and move the particle back into the domain.
        @arg 'scatter': place the particle at a random place between its old position and the violated boundary.
        @arg 'stick': place the particle at the violated boundary.
        @arg 'expand': move the boundary so that the particle fits.
        @arg 'random': place the particle at a random position between the lower and upper boundaries.
        @arg 'default': the default mode is 'bounce'. this may change in future versions.

        @return: tuple (new position, new velocity, new lower boundary, new upper boundary).
            depending on the mode, any of these values may change.
            the velocity is adjusted to be consistent with the change of position.
        """
        _rng = max(_max - _min, 0.0)
        _old = _pos - _vel

        # prevent undershoot
        if _vel > 0.0 and _pos < _min:
            _pos = _min
            _vel = _pos - _old
        if _vel < 0.0 and _pos > _max:
            _pos = _max
            _vel = _pos - _old

        assert abs(_vel) <= _rng, \
            "velocity: pos = {0}, min = {1}, max = {2}, vel = {3}, _rng = {4}".format(_pos, _min, _max, _vel, _rng)
        assert (_vel >= 0 and _pos >= _min) or (_vel <= 0 and _pos <= _max), \
            "undershoot: pos = {0}, min = {1}, max = {2}, vel = {3}, _rng = {4}".format(_pos, _min, _max, _vel, _rng)

        if _rng > 0.0:
            while _pos > _max:
                if _mode == 're-enter':
                    _pos -= _rng
                elif _mode == 'bounce' or _mode == 'default':
                    _pos = _max - (_pos - _max)
                    _vel = -_vel
                elif _mode == 'scatter':
                    _pos = _old + (_max - _old) * np.random.random()
                    _vel = _pos - _old
                elif _mode == 'stick':
                    _pos = _max
                    _vel = _pos - _old
                elif _mode == 'expand':
                    _max = _pos
                elif _mode == 'random':
                    _pos = _min + _rng * np.random.random()
                    _vel = _pos - _old
                else:
                    raise ValueError('invalid constrain mode')

            while _pos < _min:
                if _mode == 're-enter':
                    _pos += _rng
                elif _mode == 'bounce' or _mode == 'default':
                    _pos = _min - (_pos - _min)
                    _vel = -_vel
                elif _mode == 'scatter':
                    _pos = _old + (_min - _old) * np.random.random()
                    _vel = _pos - _old
                elif _mode == 'stick':
                    _pos = _min
                    _vel = _pos - _old
                elif _mode == 'expand':
                    _min = _pos
                elif _mode == 'random':
                    _pos = _min + _rng * np.random.random()
                    _vel = _pos - _old
                else:
                    raise ValueError('invalid constrain mode')
        else:
            _pos = _max
            _vel = 0.0

        return _pos, _vel, _min, _max
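To make the boundary handling concrete, a worked sketch of the 'bounce' mode (numbers chosen for illustration):

# particle overshoots the upper boundary by 0.2:
# _pos = 1.2, _vel = 0.5, _min = 0.0, _max = 1.0
# bounce folds the overshoot back: _pos = _max - (_pos - _max) = 0.8, _vel = -0.5
pos, vel, lo, hi = Population.constrain_position(1.2, 0.5, 0.0, 1.0, 'bounce')
assert abs(pos - 0.8) < 1e-12 and abs(vel + 0.5) < 1e-12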

    # noinspection PyUnusedLocal
    def best_friend(self, index):
        """
        select the best fit out of a random set of particles.

        returns the "best friend".
        """
        friends = np.random.choice(self.best, self.friends, replace=False)
        index = np.argmin(friends['_rfac'])
        return friends[index]

    def add_result(self, particle, rfac):
        """
        add a calculation particle to the results array, and update the best fit array.

        @param particle: dictionary of model parameters and particle values.
            the keys must correspond to the columns of the pos array,
            i.e. the names of the model parameters plus the _rfac, _particle, and _model fields.

        @param rfac: calculated R-factor.
            the R-factor is written to the '_rfac' field.

        @return better (bool): True if the new R-factor is better than the particle's previous best mark.
        """
        particle['_rfac'] = rfac
        l = [particle[n] for n in self.results.dtype.names]
        t = tuple(l)
        a = np.asarray(t, dtype=self.results.dtype)
        self.results = np.append(self.results, a)
        index = particle['_particle']
        better = particle['_rfac'] < self.best['_rfac'][index]
        if better:
            self.best[index] = a

        return better

    def is_converged(self, tol=0.01):
        """
        check whether the population has converged.

        convergence is reached when the R-factors of the N latest results
        do not vary by more than tol, where N is the size of the population.

        @param tol: max. difference allowed between greatest and lowest value of the R factor in the population.
        """
        nres = self.results.shape[0]
        npop = self.pos.shape[0]
        if nres >= npop:
            rfac1 = np.min(self.results['_rfac'][-npop:])
            rfac2 = np.max(self.results['_rfac'][-npop:])
            converg = rfac2 - rfac1 < tol
            return converg
        else:
            return False
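The convergence criterion in numbers, as an editor's sketch of the same test applied to a plain sequence (assumes a population of 4 and the default tol of 0.01):

import numpy as np

def converged(rfacs, npop, tol=0.01):
    # same criterion as Population.is_converged, on a plain list of R-factors
    if len(rfacs) < npop:
        return False
    tail = np.asarray(rfacs[-npop:])
    return float(tail.max() - tail.min()) < tol

assert converged([0.512, 0.509, 0.507, 0.505], npop=4)    # spread 0.007 < 0.01
assert not converged([0.60, 0.51, 0.505, 0.50], npop=4)   # spread 0.10 >= 0.01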

    def save_array(self, filename, array):
        """
        save a population array to a text file.

        the columns are space-delimited.
        the first line contains the column names.

        @param filename: name of destination file, optionally including a path.

        @param array: population array to save.
            must be one of self.pos, self.vel, self.best, self.results
        """
        header = " ".join(self.results.dtype.names)
        np.savetxt(filename, array, fmt='%g', header=header)

    def load_array(self, filename, array):
        """
        load a population array from a text file.

        the array to load must be compatible with the current population
        (same number of rows, same columns).
        the first row must contain column names.
        the ordering of columns may be different.
        the returned array is ordered according to the array argument.

        @param filename: name of source file, optionally including a path.

        @param array: population array to load.
            must be one of self.pos, self.vel, self.results.

        @return array with loaded data.
            this may be the same instance as on input.

        @raise AssertionError if the number of rows of the two files differ.
        """
        data = np.genfromtxt(filename, names=True)
        assert data.shape == array.shape
        for name in data.dtype.names:
            array[name] = data[name]
        return array

    def save_population(self, base_filename):
        """
        save the population array to a set of text files.

        the file name extensions are .pos, .vel, and .best
        """
        self.save_array(base_filename + ".pos", self.pos)
        self.save_array(base_filename + ".vel", self.vel)
        self.save_array(base_filename + ".best", self.best)

    def load_population(self, base_filename):
        """
        load the population array from a set of previously saved text files.
        this can be used to continue an optimization job.

        the file name extensions are .pos, .vel, and .best.
        the files must have the same format as produced by save_population.
        the files must have the same number of rows.
        """
        self.pos = self.load_array(base_filename + ".pos", self.pos)
        self.vel = self.load_array(base_filename + ".vel", self.vel)
        self.best = self.load_array(base_filename + ".best", self.best)

    def save_results(self, filename):
        """
        save the complete list of calculation results.
        """
        self.save_array(filename, self.results)


class ParticleSwarmHandler(handlers.ModelHandler):
    """
    model handler which implements the particle swarm optimization algorithm.

    """

    ## @var _pop (Population)
    # holds the population object.

    ## @var _pop_size (int)
    # number of particles in the swarm.

    ## @var _outfile (file)
    # output file for model parameters and R factor.
    # the file is open during calculations.
    # each calculation result adds one line.

    ## @var _model_time (timedelta)
    # estimated CPU time to calculate one model.
    # this value is the maximum time measured of the completed calculations.
    # it is used to determine when the optimization should be finished so that the time limit is not exceeded.

    ## @var _converged (bool)
    # indicates that the population has converged.
    # convergence is detected by calling Population.is_converged().
    # once convergence has been reached, this flag is set, and further convergence tests are skipped.

    ## @var _timeout (bool)
    # indicates when the handler has run out of time,
    # i.e. time is up before convergence has been reached.
    # if _timeout is True, create_tasks() will not create further tasks,
    # and add_result() will signal completion when the _pending_tasks queue becomes empty.

    ## @var _invalid_limit (int)
    # maximum tolerated number of invalid calculations.
    #
    # if the number of invalid calculations (self._invalid_count) exceeds this limit,
    # the optimization is aborted.
    # the variable is initialized by self.setup() to 10 times the population size.

    def __init__(self):
        super(ParticleSwarmHandler, self).__init__()
        self._pop = None
        self._pop_size = 0
        self._outfile = None
        self._model_time = datetime.timedelta()
        self._converged = False
        self._timeout = False
        self._invalid_limit = 10

    def setup(self, project, slots):
        """
        initialize the particle swarm and open an output file.

        the population size is set to project.pop_size if it is defined (at least 4 particles are used).
        otherwise, it defaults to <code>max(2 * slots, 4)</code>.

        for good efficiency the population size (number of particles) should be
        greater or equal to the number of available processing slots,
        otherwise the next generation is created before all particles have been calculated
        which may slow down convergence.

        if calculations take a long time compared to the available computation time
        or spawn a lot of sub-tasks due to complex symmetry,
        and you prefer to allow for a good number of generations,
        you should override the population size.

        @param project: project instance.

        @param slots: number of calculation processes available through MPI.

        @return: None
        """
        super(ParticleSwarmHandler, self).setup(project, slots)

        _min_size = 4
        if project.pop_size:
            self._pop_size = max(project.pop_size, _min_size)
        else:
            self._pop_size = max(self._slots * 2, _min_size)
        self._pop = Population()
        self._pop.setup(self._pop_size, self._project.create_domain(), self._project.history_file,
                        self._project.recalc_history)
        self._invalid_limit = self._pop_size * 10

        self._outfile = open(self._project.output_file + ".dat", "w")
        self._outfile.write("# ")
        self._outfile.write(" ".join(self._pop.results.dtype.names))
        self._outfile.write("\n")

        return None

    def cleanup(self):
        self._outfile.close()
        super(ParticleSwarmHandler, self).cleanup()

    def create_tasks(self, parent_task):
        """
        develop the particle population and create a calculation task per particle.

        this method advances the population by one step.
        it generates one task for each particle if its model number is positive.
        negative model numbers indicate that the particle is used for seeding
        and does not need to be calculated in the first generation.

        if the time limit is approaching, no new tasks are created.

        the process loop calls this method every time the length of the task queue drops
        below the number of calculation processes (slots).
        this means in particular that a population will not be completely calculated
        before the next generation starts.
        for efficiency reasons, we do not wait until a population is complete.
        this will cause a certain mixing of generations and slow down convergence
        because the best peer position in the generation may not be known yet.
        the effect can be reduced by making the population larger than the number of processes.

        @return list of generated tasks. empty list if the optimization has converged (see Population.is_converged()).
        """

        super(ParticleSwarmHandler, self).create_tasks(parent_task)

        # this is the top-level handler, we expect just one parent: root.
        parent_id = parent_task.id
        assert parent_id == (-1, -1, -1, -1, -1)
        self._parent_tasks[parent_id] = parent_task

        time_pending = self._model_time * len(self._pending_tasks)
        time_avail = (self.datetime_limit - datetime.datetime.now()) * max(self._slots, 1)

        out_tasks = []
        if not self._timeout and not self._converged:
            self._pop.advance_population()

            for pos in self._pop.pos_gen():
                time_pending += self._model_time
                if time_pending > time_avail:
                    self._timeout = True
                    logger.info("time limit reached")
                    break

                if pos['_model'] >= 0:
                    new_task = parent_task.copy()
                    new_task.parent_id = parent_id
                    new_task.model = pos
                    new_task.change_id(model=pos['_model'])

                    child_id = new_task.id
                    self._pending_tasks[child_id] = new_task
                    out_tasks.append(new_task)

        return out_tasks

    def add_result(self, task):
        """
        calculate the R factor of the result and add it to the results list of the population.

        * save the current population.
        * append the result to the result output file.
        * update the execution time statistics.
        * remove temporary files if requested.
        * check whether the population has converged.

        @return parent task (CalculationTask) if the optimization has converged, @c None otherwise.
        """
        super(ParticleSwarmHandler, self).add_result(task)

        self._complete_tasks[task.id] = task
        del self._pending_tasks[task.id]
        parent_task = self._parent_tasks[task.parent_id]

        rfac = 1.0
        if task.result_valid:
            try:
                rfac = self._project.calc_rfactor(task)
            except ValueError:
                task.result_valid = False
                self._invalid_count += 1
                logger.warning(BMsg("calculation of model {0} resulted in an undefined R-factor.", task.id.model))

        task.model['_rfac'] = rfac
        self._pop.add_result(task.model, rfac)
        self._pop.save_population(self._project.output_file + ".pop")

        if self._outfile:
            s = (str(task.model[name]) for name in self._pop.results.dtype.names)
            self._outfile.write(" ".join(s))
            self._outfile.write("\n")
            self._outfile.flush()

        self._project.files.update_model_rfac(task.id.model, rfac)
        self._project.files.set_model_complete(task.id.model, True)

        if task.result_valid:
            if self._pop.is_converged() and not self._converged:
                logger.info("population converged")
                self._converged = True

            if task.time > self._model_time:
                self._model_time = task.time
        else:
            if self._invalid_count >= self._invalid_limit:
                logger.error("number of invalid calculations (%u) exceeds limit", self._invalid_count)
                self._converged = True

        # optimization complete?
        if (self._timeout or self._converged) and len(self._pending_tasks) == 0:
            del self._parent_tasks[parent_task.id]
        else:
            parent_task = None

        self.cleanup_files(keep=self._pop_size)
        return parent_task
138 projects/common/clusters/crystals.py Normal file
@ -0,0 +1,138 @@
"""
@package projects.common.clusters.crystals
cluster generators for some common bulk crystals

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

@copyright (c) 2015-19 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import math
import numpy as np
import os.path
import periodictable as pt
import logging

import pmsco.cluster as cluster
import pmsco.dispatch as dispatch
import pmsco.project as project
from pmsco.helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)


class ZincblendeCluster(cluster.ClusterGenerator):
    def __init__(self, proj):
        super(ZincblendeCluster, self).__init__(proj)
        self.atomtype1 = 30
        self.atomtype2 = 16
        self.bulk_lattice = 1.0
        self.surface = (1, 1, 1)

    @classmethod
    def check(cls, outfilename=None, model_dict=None, domain_dict=None):
        """
        function to test and debug the cluster generator.

        to use this function, you don't need to import or initialize anything but the class.
        though the project class is used internally, the result does not depend on any project settings.

        @param outfilename: name of output file for the cluster (XYZ format).
            the file is written to the same directory where this module is located.
            if empty or None, no file is written.

        @param model_dict: dictionary of model parameters to override the default values.

        @param domain_dict: dictionary of domain parameters to override the default values.

        @return: @ref pmsco.cluster.Cluster object
        """
        proj = project.Project()
        dom = project.ModelSpace()
        dom.add_param('dlat', 10.)
        dom.add_param('rmax', 5.0)
        if model_dict:
            dom.start.update(model_dict)

        try:
            proj.domains[0].update({'zrot': 0.})
        except IndexError:
            proj.add_domain({'zrot': 0.})
        if domain_dict:
            proj.domains[0].update(domain_dict)
        proj.add_scan("", 'C', '1s')

        clu_gen = cls(proj)
        index = dispatch.CalcID(0, 0, 0, -1, -1)
        clu = clu_gen.create_cluster(dom.start, index)

        if outfilename:
            project_dir = os.path.dirname(os.path.abspath(__file__))
            outfilepath = os.path.join(project_dir, outfilename)
            clu.save_to_file(outfilepath, fmt=cluster.FMT_XYZ, comment="{0} {1} {2}".format(cls, index, str(dom.start)))

        return clu
|
||||
|
||||
def count_emitters(self, model, index):
|
||||
return 1

    def create_cluster(self, model, index):
        """
        calculate a specific set of atom positions given the optimizable parameters.

        @param model (dict) optimizable parameters
        @arg model['dlat'] bulk lattice constant in Angstrom
        @arg model['rmax'] cluster radius
        @arg model['phi'] azimuthal rotation angle in degrees

        @param index (named tuple CalcID) calculation index.
        the domain index selects the domain dictionary from the project:
        @arg dom['term'] surface termination (chemical element number or symbol)
        """
        clu = cluster.Cluster()
        clu.comment = "{0} {1}".format(self.__class__, index)
        clu.set_rmax(model['rmax'])
        a_lat = model['dlat']
        # look up the domain parameters by the domain member of the calculation index
        dom = self.project.domains[index.domain]
        try:
            term = int(dom['term'])
        except ValueError:
            term = pt.elements.symbol(dom['term'].strip()).number

        if self.surface == (0, 0, 1):
            # identity matrix
            m = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
        elif self.surface == (1, 1, 1):
            # this will map the [111] direction onto the z-axis
            m1 = np.array([1, -1, 0]) * math.sqrt(1/2)
            m2 = np.array([0.5, 0.5, -1]) * math.sqrt(2/3)
            m3 = np.array([1, 1, 1]) * math.sqrt(1/3)
            m = np.array([m1, m2, m3])
        else:
            raise ValueError("unsupported surface specification")

        # lattice vectors
        a1 = np.matmul(m, np.array((1.0, 0.0, 0.0)) * a_lat)
        a2 = np.matmul(m, np.array((0.0, 1.0, 0.0)) * a_lat)
        a3 = np.matmul(m, np.array((0.0, 0.0, 1.0)) * a_lat)

        # basis
        b1 = [np.array((0.0, 0.0, 0.0)), (a2 + a3) / 2, (a3 + a1) / 2, (a1 + a2) / 2]
        if term == self.atomtype1:
            d1 = np.array((0, 0, 0))
            d2 = (a1 + a2 + a3) / 4
        else:
            d1 = -(a1 + a2 + a3) / 4
            d2 = np.array((0, 0, 0))
        for b in b1:
            clu.add_bulk(self.atomtype1, b + d1, a1, a2, a3)
            clu.add_bulk(self.atomtype2, b + d2, a1, a2, a3)

        return clu
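The check class method above is meant for quick interactive tests of the generator. A minimal usage sketch, assuming the module is importable under its package path; the default atom types 30/16 correspond to Zn and S, and dlat = 5.41 (the approximate zincblende ZnS lattice constant) is an assumed example value:

    from projects.common.clusters.crystals import ZincblendeCluster

    clu = ZincblendeCluster.check(outfilename="zincblende-check.xyz",
                                  model_dict={'dlat': 5.41, 'rmax': 8.0},
                                  domain_dict={'term': 30})
    print(clu.get_atom_count())

Note that create_cluster reads the termination from the domain parameters, so domain_dict should supply 'term'.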
@ -1,36 +1,29 @@
#!/usr/bin/env python

"""
@package pmsco.projects.fcc
scattering calculation project for the (111) surface of an arbitrary face-centered cubic crystal

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

@copyright (c) 2015 by Paul Scherrer Institut @n
@copyright (c) 2015-19 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import absolute_import
from __future__ import division
import sys
import os
from __future__ import print_function

import math
import numpy as np
import os.path
import periodictable as pt
import argparse
import logging

base_dir = os.path.dirname(__file__) or '.'
package_dir = os.path.join(base_dir, '../..')
package_dir = os.path.abspath(package_dir)
sys.path.append(package_dir)

import pmsco.pmsco
import pmsco.cluster as mc
import pmsco.project as mp
import pmsco.data as md
from pmsco.helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)
@ -82,7 +75,7 @@ class FCC111Project(mp.Project):
        clu.add_layer(self.element, a_l1, a1, a2)
        clu.add_layer(self.element, a_l2, a1, a2)
        clu.add_layer(self.element, a_l3, a1, a2)
        clu.add_bulk(self.element, a_bulk, a1, a2, a3)
        clu.add_bulk(self.element, a_bulk, a1, a2, a3, a_bulk[2] + 0.01)

        clu.set_emitter(a_l1)

@ -98,7 +91,7 @@ class FCC111Project(mp.Project):
        par['V0'] = inner potential
        par['Zsurf'] = position of surface
        """
        params = mp.Params()
        params = mp.CalculatorParams()

        params.title = "fcc(111)"
        params.comment = "{0} {1}".format(self.__class__, index)
@ -110,7 +103,7 @@ class FCC111Project(mp.Project):
        params.scattering_level = 5
        params.fcut = 15.0
        params.cut = 15.0
        params.angular_broadening = 0.0
        params.angular_resolution = 0.0
        params.lattice_constant = 1.0
        params.z_surface = model['Zsurf']
        params.atom_types = 3
@ -140,11 +133,11 @@ class FCC111Project(mp.Project):

        return params

    def create_domain(self):
    def create_model_space(self):
        """
        define the domain of the optimization parameters.
        define the model space of the optimization parameters.
        """
        dom = mp.Domain()
        dom = mp.ModelSpace()

        if self.mode == "single":
            dom.add_param('rmax', 5.00, 5.00, 15.00, 2.50)
@ -176,16 +169,14 @@ class FCC111Project(mp.Project):
            dom.add_param('Zsurf', 1.00, 0.00, 2.00, 0.50)

        return dom

def create_project(element):


def create_project():
    """
    create an FCC111Project calculation project.

    @param element: symbol of the chemical element of the atoms contained in the cluster.
    """

    project = FCC111Project()
    project.element = element

    project_dir = os.path.dirname(os.path.abspath(__file__))
    project.data_dir = project_dir
@ -193,13 +184,13 @@ def create_project(element):
    # scan dictionary
    # to select any number of scans, add their dictionary keys as scans option on the command line
    project.scan_dict['default'] = {'filename': os.path.join(project_dir, "demo_holo_scan.etp"),
                                    'emitter': "Ni", 'initial_state': "3s"}
                                    'emitter': "Ni", 'initial_state': "3s"}
    project.scan_dict['holo'] = {'filename': os.path.join(project_dir, "demo_holo_scan.etp"),
                                 'emitter': "Ni", 'initial_state': "3s"}
                                 'emitter': "Ni", 'initial_state': "3s"}
    project.scan_dict['alpha'] = {'filename': os.path.join(project_dir, "demo_alpha_scan.etp"),
                                  'emitter': "Ni", 'initial_state': "3s"}

    project.add_symmetry({'default': 0.0})
    project.add_domain({'default': 0.0})

    return project

@ -229,6 +220,7 @@ def set_project_args(project, project_args):

    try:
        if project_args.element:
            project.element = project_args.element
            for scan in project.scans:
                scan.emitter = project_args.element
            logger.warning(BMsg("override emitters to {0}", project_args.element))
@ -237,8 +229,9 @@ def set_project_args(project, project_args):

    try:
        if project_args.initial_state:
            project.initial_state = project_args.initial_state
            logger.warning(BMsg("override initial states to {0}", project.initial_state))
            for scan in project.scans:
                scan.initial_state = project_args.initial_state
            logger.warning(f"override initial states of all scans to {project_args.initial_state}")
    except AttributeError:
        pass

@ -263,22 +256,5 @@ def parse_project_args(_args):
    parser.add_argument('--energy', type=float,
                        help="kinetic energy of photoelectron (override scan file)")

    parsed_args = parser.parse_known_args(_args)
    parsed_args = parser.parse_args(_args)
    return parsed_args


def main():
    args, unknown_args = pmsco.pmsco.parse_cli()
    if unknown_args:
        project_args = parse_project_args(unknown_args)
    else:
        project_args = None

    project = create_project(project_args.element)
    pmsco.pmsco.set_common_args(project, args)
    set_project_args(project, project_args)
    pmsco.pmsco.run_project(project)

if __name__ == '__main__':
    main()
    sys.exit(0)
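Two signatures of ModelSpace.add_param appear in these project files; a sketch of both forms, assuming the positional arguments are (name, start[, min, max, step]):

    import pmsco.project as mp

    dom = mp.ModelSpace()
    dom.add_param('Zsurf', 1.00)                       # single calculation: start value only
    dom.add_param('rmax', 5.00, 5.00, 15.00, 2.50)     # optimization: start, min, max, step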
384
projects/demo/molecule.py
Normal file
@ -0,0 +1,384 @@
"""
@package pmsco.projects.demo.molecule
scattering calculation project for single molecules

the atomic positions are read from a molecule file.
cluster file, emitter (by chemical symbol), initial state and kinetic energy are specified on the command line.
there are no structural parameters.

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

@copyright (c) 2015-20 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

import math
import numpy as np
import os.path
from pathlib import Path
import periodictable as pt
import argparse
import logging

# noinspection PyUnresolvedReferences
from pmsco.calculators.calculator import InternalAtomicCalculator
# noinspection PyUnresolvedReferences
from pmsco.calculators.edac import EdacCalculator
# noinspection PyUnresolvedReferences
from pmsco.calculators.phagen.runner import PhagenCalculator
import pmsco.cluster as cluster
from pmsco.data import calc_modfunc_loess
# noinspection PyUnresolvedReferences
import pmsco.elements.bindingenergy
from pmsco.helpers import BraceMessage as BMsg
import pmsco.project as project

logger = logging.getLogger(__name__)


class MoleculeFileCluster(cluster.ClusterGenerator):
    """
    cluster generator based on external file.

    work in progress.
    """
    def __init__(self, project):
        super(MoleculeFileCluster, self).__init__(project)
        self.base_cluster = None

    def load_base_cluster(self):
        """
        load and cache the project-defined coordinate file.

        the file path is set in self.project.cluster_file.
        the file must be in XYZ (.xyz) or PMSCO cluster (.clu) format (cf. pmsco.cluster module).

        @return: Cluster object (also referenced by self.base_cluster)
        """
        if self.base_cluster is None:
            clu = cluster.Cluster()
            clu.set_rmax(120.0)
            p = Path(self.project.cluster_file)
            ext = p.suffix
            if ext == ".xyz":
                fmt = cluster.FMT_XYZ
            elif ext == ".clu":
                fmt = cluster.FMT_PMSCO
            else:
                raise ValueError(f"unknown cluster file extension {ext}")
            clu.load_from_file(self.project.cluster_file, fmt=fmt)
            self.base_cluster = clu

        return self.base_cluster

    def count_emitters(self, model, index):
        """
        count the number of emitter configurations.

        the method creates the full cluster and counts the emitters.

        @param model: model parameters.
        @param index: scan and domain are used by the create_cluster() method,
        emit decides whether the method returns the number of configurations (-1),
        or the number of emitters in the specified configuration (>= 0).
        @return: number of emitter configurations.
        """
        clu = self.create_cluster(model, index)
        return clu.get_emitter_count()

    def create_cluster(self, model, index):
        """
        import a cluster from a coordinate file (XYZ format).

        the method does the following:
        - load the cluster file specified by self.project.cluster_file.
        - trim the cluster according to model['rmax'].
        - mark the atoms of the scan's emitter element as emitters.

        @param model: rmax is the trim radius of the cluster in Angstrom.

        @param index (named tuple CalcID) calculation index.
        this method uses the domain index to look up domain parameters in
        `pmsco.project.Project.domains`.
        `index.emit` selects whether a single-emitter (>= 0) or all-emitter cluster (== -1) is returned.

        @return pmsco.cluster.Cluster object
        """
        self.load_base_cluster()
        clu = cluster.Cluster()
        clu.copy_from(self.base_cluster)
        clu.comment = f"{self.__class__}, {index}"
        dom = self.project.domains[index.domain]

        # trim
        clu.set_rmax(model['rmax'])
        clu.trim_sphere(clu.rmax)

        # emitter selection
        idx_emit = np.where(clu.data['s'] == self.project.scans[index.scan].emitter)
        assert isinstance(idx_emit, tuple)
        idx_emit = idx_emit[0]
        if index.emit >= 0:
            idx_emit = idx_emit[index.emit]
        clu.data['e'][idx_emit] = 1

        # rotation
        # note: assuming pmsco.cluster.Cluster provides rotate_x/rotate_y as it
        # does rotate_z; rotating about z for all three angles looks like a
        # copy-paste slip, the docstring asks for rotations about x, y and z.
        if 'xrot' in model:
            clu.rotate_x(model['xrot'])
        elif 'xrot' in dom:
            clu.rotate_x(dom['xrot'])
        if 'yrot' in model:
            clu.rotate_y(model['yrot'])
        elif 'yrot' in dom:
            clu.rotate_y(dom['yrot'])
        if 'zrot' in model:
            clu.rotate_z(model['zrot'])
        elif 'zrot' in dom:
            clu.rotate_z(dom['zrot'])

        logger.info(f"cluster for calculation {index}: "
                    f"{clu.get_atom_count()} atoms, {clu.get_emitter_count()} emitters")

        return clu


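# note on the emitter selection in MoleculeFileCluster.create_cluster above:
# a sketch of the same numpy pattern, assuming the cluster data array keeps
# chemical symbols in column 's' and the emitter flag in column 'e'
# (both columns are used this way in this generator):
#     idx = np.where(clu.data['s'] == "N")[0]   # indices of all nitrogen atoms
#     clu.data['e'][idx] = 1                    # mark them as emitters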
class MoleculeProject(project.Project):
    """
    general molecule project.

    the following model parameters are used:

    @arg `model['zsurf']` : position of surface above molecule (angstrom)
    @arg `model['Texp']` : experimental temperature (K)
    @arg `model['Tdeb']` : debye temperature (K)
    @arg `model['V0']` : inner potential (eV)
    @arg `model['rmax']` : cluster radius (angstrom)
    @arg `model['ares']` : angular resolution (degrees, FWHM)
    @arg `model['distm']` : dmax for EDAC (angstrom)

    the following domain parameters are used.
    they can also be specified as model parameters.

    @arg `'xrot'` : rotation about x-axis (applied first) (deg)
    @arg `'yrot'` : rotation about y-axis (applied after x) (deg)
    @arg `'zrot'` : rotation about z-axis (applied after x and y) (deg)

    the project parameters are:

    @arg `cluster_file` : name of cluster file of template molecule.
    default: "demo-cluster.xyz"
    """
    def __init__(self):
        """
        initialize a project instance
        """
        super(MoleculeProject, self).__init__()
        self.model_space = project.ModelSpace()
        self.scan_dict = {}
        self.cluster_file = "demo-cluster.xyz"
        self.cluster_generator = MoleculeFileCluster(self)
        self.atomic_scattering_factory = PhagenCalculator
        self.multiple_scattering_factory = EdacCalculator
        self.phase_files = {}
        self.rme_files = {}
        self.modf_smth_ei = 0.5

    def create_params(self, model, index):
        """
        set a specific set of parameters given the optimizable parameters.

        @param model: (dict) optimization parameters
        this method requires zsurf, V0, Texp, Tdeb, ares and distm.

        @param index (named tuple CalcID) calculation index.
        this method formats the index into the comment line.
        """
        params = project.CalculatorParams()

        params.title = "molecule demo"
        params.comment = f"{self.__class__} {index}"
        params.cluster_file = ""
        params.output_file = ""
        initial_state = self.scans[index.scan].initial_state
        params.initial_state = initial_state
        emitter = self.scans[index.scan].emitter
        params.binding_energy = pt.elements.symbol(emitter).binding_energy[initial_state]
        params.polarization = "H"
        params.z_surface = model['zsurf']
        params.inner_potential = model['V0']
        params.work_function = 4.5
        params.polar_incidence_angle = 60.0
        params.azimuthal_incidence_angle = 0.0
        params.angular_resolution = model['ares']
        params.experiment_temperature = model['Texp']
        params.debye_temperature = model['Tdeb']
        params.phase_files = self.phase_files
        params.rme_files = self.rme_files
        # edac_interface only
        params.emitters = []
        params.lmax = 15
        params.dmax = model['distm']
        params.orders = [20]

        return params

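    # note: the binding_energy lookup in create_params relies on the
    # pmsco.elements.bindingenergy import at the top of this module, which is
    # assumed to attach a binding_energy dict to the periodictable elements:
    #     import periodictable as pt
    #     import pmsco.elements.bindingenergy
    #     pt.elements.symbol("N").binding_energy["1s"]   # 409.9 eV (nitrogen 1s)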
    def create_model_space(self):
        """
        define the range of model parameters.

        see the class description for a list of parameters.
        """

        return self.model_space

    # noinspection PyUnusedLocal
    def calc_modulation(self, data, model):
        """
        calculate the modulation function with project-specific smoothing factor

        see @ref pmsco.project.Project.calc_modulation.

        @param data: (numpy.ndarray) experimental data in ETPI, or ETPAI format.

        @param model: (dict) model parameters of the calculation task. not used.

        @return copy of the data array with the modulation function in the 'i' column.
        """
        return calc_modfunc_loess(data, smth=self.modf_smth_ei)


def create_model_space(mode):
    """
    define the model space.
    """
    dom = project.ModelSpace()

    if mode == "single":
        dom.add_param('zsurf', 1.20)
        dom.add_param('Texp', 300.00)
        dom.add_param('Tdeb', 100.00)
        dom.add_param('V0', 10.00)
        dom.add_param('rmax', 50.00)
        dom.add_param('ares', 5.00)
        dom.add_param('distm', 5.00)
        dom.add_param('wdom1', 1.0)
        dom.add_param('wdom2', 1.0)
        dom.add_param('wdom3', 1.0)
        dom.add_param('wdom4', 1.0)
        dom.add_param('wdom5', 1.0)
    else:
        raise ValueError(f"undefined model space for {mode} optimization")

    return dom


def create_project():
    """
    create the project instance.
    """

    proj = MoleculeProject()
    proj_dir = os.path.dirname(os.path.abspath(__file__))
    proj.project_dir = proj_dir

    # scan dictionary
    # to select any number of scans, add their dictionary keys as scans option on the command line
    proj.scan_dict['empty'] = {'filename': os.path.join(proj_dir, "../common/empty-hemiscan.etpi"),
                               'emitter': "N", 'initial_state': "1s"}

    proj.mode = 'single'
    proj.model_space = create_model_space(proj.mode)
    proj.job_name = 'molecule0000'
    proj.description = 'molecule demo'

    return proj


def set_project_args(project, project_args):
    """
    set the project arguments.

    @param project: project instance

    @param project_args: (Namespace object) project arguments.
    """

    scans = []
    try:
        if project_args.scans:
            scans = project_args.scans
        else:
            logger.error("missing scan argument")
            exit(1)
    except AttributeError:
        logger.error("missing scan argument")
        exit(1)

    for scan_key in scans:
        scan_spec = project.scan_dict[scan_key]
        project.add_scan(**scan_spec)

    try:
        project.cluster_file = os.path.abspath(project_args.cluster_file)
        project.cluster_generator = MoleculeFileCluster(project)
    except (AttributeError, TypeError):
        logger.error("missing cluster-file argument")
        exit(1)

    try:
        if project_args.emitter:
            for scan in project.scans:
                scan.emitter = project_args.emitter
            logger.warning(f"override emitters of all scans to {project_args.emitter}")
    except AttributeError:
        pass

    try:
        if project_args.initial_state:
            for scan in project.scans:
                scan.initial_state = project_args.initial_state
            logger.warning(f"override initial states of all scans to {project_args.initial_state}")
    except AttributeError:
        pass

    try:
        if project_args.energy:
            for scan in project.scans:
                scan.energies = np.asarray((project_args.energy, ))
            logger.warning(f"override scan energy of all scans to {project_args.energy}")
    except AttributeError:
        pass

    try:
        if project_args.symmetry:
            for angle in np.linspace(0, 360, num=project_args.symmetry, endpoint=False):
                project.add_domain({'xrot': 0., 'yrot': 0., 'zrot': angle})
            logger.warning(f"override rotation symmetry to {project_args.symmetry}")
    except AttributeError:
        pass


def parse_project_args(_args):
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)

    # main arguments
    parser.add_argument('--scans', nargs="*",
                        help="nick names of scans to use in calculation (see create_project function)")
    parser.add_argument('--cluster-file',
                        help="path name of molecule file (xyz format).")

    # conditional arguments
    parser.add_argument('--emitter',
                        help="emitter: chemical symbol")
    parser.add_argument('--initial-state',
                        help="initial state term: e.g. 2p1/2")
    parser.add_argument('--energy', type=float,
                        help="kinetic energy (eV)")
    parser.add_argument('--symmetry', type=int, default=1,
                        help="n-fold rotational symmetry")

    parsed_args = parser.parse_args(_args)
    return parsed_args
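To exercise this demo without the pmsco command line front end, the functions above can be wired together directly; a minimal sketch, assuming the module is importable as projects.demo.molecule and that a cluster file exists at the given path:

    import projects.demo.molecule as molecule

    proj = molecule.create_project()
    args = molecule.parse_project_args(["--scans", "empty",
                                        "--cluster-file", "demo-cluster.xyz",
                                        "--emitter", "N", "--initial-state", "1s"])
    molecule.set_project_args(proj, args)
    # proj now carries the scans, cluster generator and overrides,
    # ready to be handed to the pmsco run machinery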
@ -1,5 +1,3 @@
#!/usr/bin/env python

"""
@package projects.twoatom
Two-atom demo scattering calculation project
@ -8,30 +6,132 @@ this file is specific to the project and the state of the data analysis,
as it contains particular parameter values.
"""

from __future__ import absolute_import
from __future__ import division
import sys
import os
import math
import numpy as np
import periodictable as pt
from __future__ import print_function

import argparse
import logging
import math
import numpy as np
import os.path
import periodictable as pt

# adjust the system path so that the main PMSCO code is found
base_dir = os.path.dirname(__file__) or '.'
package_dir = os.path.join(base_dir, '../..')
package_dir = os.path.abspath(package_dir)
sys.path.append(package_dir)

import pmsco.pmsco
from pmsco.calculators.calculator import InternalAtomicCalculator
from pmsco.calculators.edac import EdacCalculator
from pmsco.calculators.phagen.runner import PhagenCalculator
import pmsco.cluster as mc
import pmsco.project as mp
import pmsco.data as md
from pmsco.helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)


class TwoatomCluster(mc.ClusterGenerator):
    """
    cluster of two atoms.

    atom A (top) is set at position (0, 0, 0), atom B (bottom) at (-dx, -dy, -dz)
    where dx, dy and dz are calculated from model parameters.
    the type of the atoms is set upon construction.

    the model parameters are:
    @arg @c model['dAB'] : distance between the two atoms in Angstrom.
    @arg @c model['th'] : polar angle of the connection line, 0 = on top geometry.
    @arg @c model['ph'] : azimuthal angle of the connection line, 0 = polar angle affects X coordinate.

    the class is designed to be reusable in various projects.
    object attributes refine the atom types and the mapping of project-specific model parameters.
    """

    ## @var atom_types (dict)
    # chemical element numbers of the cluster atoms.
    #
    # atom 'A' is the top atom, 'B' the bottom one.
    # upon construction both atoms are set to oxygen.
    # to customize, call @ref set_atom_type.

    ## @var model_dict (dict)
    # mapping of model parameters to cluster parameters
    #
    # the default model parameters used by the cluster are 'dAB', 'th' and 'ph'.
    # if the project uses other parameter names, e.g. 'dCO' instead of 'dAB',
    # the project-specific names can be declared here.
    # in the example, set model_dict['dAB'] = 'dCO'.

    def __init__(self, project):
        """
        initialize the cluster generator.

        the atoms and model dictionary are given default values.
        see @ref set_atom_type and @ref model_dict for customization.

        @param project: project instance.
        """
        super(TwoatomCluster, self).__init__(project)

        self.atom_types = {'A': pt.O.number, 'B': pt.O.number}
        self.model_dict = {'dAB': 'dAB', 'th': 'th', 'ph': 'ph'}

    def set_atom_type(self, atom, element):
        """
        set the type (chemical element) of an atom.

        @param atom: atom key, 'A' (top) or 'B' (bottom).
        @param element: chemical element number or symbol.
        """
        try:
            self.atom_types[atom] = int(element)
        except ValueError:
            self.atom_types[atom] = pt.elements.symbol(element.strip()).number

    def count_emitters(self, model, index):
        """
        return the number of emitter configurations.

        this cluster supports only one configuration.

        @param model:
        @param index:
        @return 1
        """
        return 1

    def create_cluster(self, model, index):
        """
        create a cluster given the model parameters and index.

        @param model:
        @param index:
        @return a pmsco.cluster.Cluster object containing the atomic coordinates.
        """
        r = model[self.model_dict['dAB']]
        try:
            th = math.radians(model[self.model_dict['th']])
        except KeyError:
            th = 0.
        try:
            ph = math.radians(model[self.model_dict['ph']])
        except KeyError:
            ph = 0.

        dx = r * math.sin(th) * math.cos(ph)
        dy = r * math.sin(th) * math.sin(ph)
        dz = r * math.cos(th)

        clu = mc.Cluster()
        clu.comment = "{0} {1}".format(self.__class__, index)
        clu.set_rmax(r * 2.0)

        a_top = np.array((0.0, 0.0, 0.0))
        a_bot = np.array((-dx, -dy, -dz))

        clu.add_atom(self.atom_types['A'], a_top, 1)
        clu.add_atom(self.atom_types['B'], a_bot, 0)

        return clu


class TwoatomProject(mp.Project):
    """
    two-atom calculation project class.
@ -49,31 +149,23 @@ class TwoatomProject(mp.Project):
    def __init__(self):
        super(TwoatomProject, self).__init__()
        self.scan_dict = {}

    def create_cluster(self, model, index):
        """
        calculate a specific set of atom positions given the optimizable parameters.

        the cluster contains a nitrogen in the top layer,
        and a nickel atom in the second layer.
        The layer distance and the angle can be adjusted by parameters.

        @param model: (dict) optimizable parameters
        """
        clu = mc.Cluster()
        clu.comment = "{0} {1}".format(self.__class__, index)
        clu.set_rmax(10.0)

        a_N = np.array((0.0, 0.0, 0.0))
        rad_pNNi = math.radians(model['pNNi'])
        a_Ni1 = np.array((0.0,
                          -model['dNNi'] * math.sin(rad_pNNi),
                          -model['dNNi'] * math.cos(rad_pNNi)))

        clu.add_atom(pt.N.number, a_N, 1)
        clu.add_atom(pt.Ni.number, a_Ni1, 0)

        return clu
        self.cluster_generator = TwoatomCluster(self)
        self.cluster_generator.set_atom_type('A', 'N')
        self.cluster_generator.set_atom_type('B', 'Ni')
        self.cluster_generator.model_dict['dAB'] = 'dNNi'
        self.cluster_generator.model_dict['th'] = 'pNNi'
        self.cluster_generator.model_dict['ph'] = 'aNNi'
        self.atomic_scattering_factory = PhagenCalculator
        self.multiple_scattering_factory = EdacCalculator
        self.phase_files = {}
        self.rme_files = {}
        self.bindings = {}
        self.bindings['N'] = {'1s': 409.9}
        self.bindings['B'] = {'1s': 188.0}
        self.bindings['Ni'] = {'2s': 1008.6,
                               '2p': (870.0 + 852.7) / 2, '2p1/2': 870.0, '2p3/2': 852.7,
                               '3s': 110.8,
                               '3p': (68.0 + 66.2) / 2, '3p1/2': 68.0, '3p3/2': 66.2}

    def create_params(self, model, index):
        """
@ -81,40 +173,40 @@ class TwoatomProject(mp.Project):

        @param model: (dict) optimizable parameters
        """
        params = mp.Params()
        params = mp.CalculatorParams()

        params.title = "two-atom demo"
        params.comment = "{0} {1}".format(self.__class__, index)
        params.cluster_file = ""
        params.output_file = ""
        params.initial_state = self.scans[index.scan].initial_state
        params.spherical_order = 2
        initial_state = self.scans[index.scan].initial_state
        params.initial_state = initial_state
        emitter = self.scans[index.scan].emitter
        params.binding_energy = self.bindings[emitter][initial_state]
        params.polarization = "H"
        params.scattering_level = 5
        params.fcut = 15.0
        params.cut = 15.0
        params.angular_broadening = 0.0
        params.lattice_constant = 1.0
        params.z_surface = model['Zsurf']
        params.atom_types = 3
        params.atomic_number = [7, 28]
        params.phase_file = ["hbn_n.pha", "ni.pha"]
        params.msq_displacement = [0.01, 0.01, 0.00]
        params.planewave_attenuation = 1.0
        params.inner_potential = model['V0']
        params.work_function = 3.6
        params.symmetry_range = 360.0
        params.polar_incidence_angle = 60.0
        params.azimuthal_incidence_angle = 0.0
        params.vibration_model = "P"
        params.substrate_atomic_mass = 58.69
        params.experiment_temperature = 300.0
        params.debye_temperature = 356.0
        params.debye_wavevector = 1.7558
        params.rme_minus_value = 0.0

        if self.phase_files:
            state = emitter + initial_state
            try:
                params.phase_files = self.phase_files[state]
            except KeyError:
                params.phase_files = {}
                logger.warning("no phase files found for {} - using default calculator".format(state))

        params.rme_files = {}
        params.rme_minus_value = 0.1
        params.rme_minus_shift = 0.0
        params.rme_plus_value = 1.0
        params.rme_plus_shift = 0.0

        # used by EDAC only
        params.emitters = []
        params.lmax = 15
@ -123,11 +215,11 @@ class TwoatomProject(mp.Project):

        return params

    def create_domain(self):
    def create_model_space(self):
        """
        define the domain of the optimization parameters.
        """
        dom = mp.Domain()
        dom = mp.ModelSpace()

        if self.mode == "single":
            dom.add_param('dNNi', 2.109, 2.000, 2.250, 0.050)
@ -153,6 +245,27 @@ class TwoatomProject(mp.Project):
        return dom


def example_intensity(e, t, p, a):
    """
    arbitrary intensity pattern for example data

    this function can be used to calculate the intensity in example scan files.
    the function implements an arbitrary modulation function

    @param e: energy
    @param t: theta
    @param p: phi
    @param a: alpha
    @return intensity
    """
    i = np.random.random() * 1e6 * \
        np.cos(np.radians(t)) ** 2 * \
        np.cos(np.radians(a)) ** 2 * \
        np.cos(np.radians(p)) ** 2 * \
        np.sin(e / 1000. * np.pi * 0.1 / np.sqrt(e)) ** 2
    return i


def create_project():
    """
    create a new TwoatomProject calculation project.
@ -209,7 +322,7 @@ def set_project_args(project, project_args):
        project.add_scan(**scan_spec)
        logger.info(BMsg("add scan {filename} ({emitter} {initial_state})", **scan_spec))

    project.add_symmetry({'default': 0.0})
    project.add_domain({'default': 0.0})


def parse_project_args(_args):
@ -230,20 +343,3 @@ def parse_project_args(_args):
    parsed_args = parser.parse_args(_args)

    return parsed_args


def main():
    args, unknown_args = pmsco.pmsco.parse_cli()
    if unknown_args:
        project_args = parse_project_args(unknown_args)
    else:
        project_args = None

    project = create_project()
    pmsco.pmsco.set_common_args(project, args)
    set_project_args(project, project_args)
    pmsco.pmsco.run_project(project)

if __name__ == '__main__':
    main()
    sys.exit(0)
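TwoatomCluster.create_cluster converts the spherical model parameters (dAB, th, ph) into cartesian offsets. A short worked example, using dNNi = 2.109 from the model space above and an assumed example angle pNNi = 15 degrees:

    import math

    r, th, ph = 2.109, math.radians(15.0), 0.0
    dx = r * math.sin(th) * math.cos(ph)   # ~0.546 Angstrom
    dy = r * math.sin(th) * math.sin(ph)   # 0.0 for ph = 0
    dz = r * math.cos(th)                  # ~2.037 Angstrom
    # atom B ends up at (-dx, -dy, -dz), below and beside the emitter atom A at the origin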