add files for public distribution
based on internal repository 0a462b6 2017-11-22 14:41:39 +0100
@@ -0,0 +1,23 @@
|
||||
# Set the default behavior, in case people don't have core.autocrlf set.
|
||||
* text=auto
|
||||
|
||||
# Explicitly declare text files you want to always be normalized and converted
|
||||
# to native line endings on checkout.
|
||||
*.c text
|
||||
*.cpp text
|
||||
*.f text
|
||||
*.h text
|
||||
*.i text
|
||||
*.m text
|
||||
*.py text
|
||||
*.pyf text
|
||||
makefile text
|
||||
README text
|
||||
|
||||
# Declare files that will always have CRLF line endings on checkout.
|
||||
*.bat text eol=crlf
|
||||
*.vc text eol=crlf
|
||||
|
||||
# Denote all files that are truly binary and should not be modified.
|
||||
*.png binary
|
||||
*.jpg binary
|
||||
@@ -0,0 +1,15 @@
|
||||
work/*
|
||||
debug/*
|
||||
lib/*
|
||||
*.pyc
|
||||
*.o
|
||||
*.so
|
||||
*.exe
|
||||
*.x
|
||||
*~
|
||||
*.log
|
||||
.idea/*
|
||||
.eric4project/*
|
||||
.eric5project/*
|
||||
.ropeproject/*
|
||||
.fuse*
|
||||
@@ -0,0 +1,14 @@
|
||||
List of Contributors
|
||||
====================
|
||||
|
||||
|
||||
Original Author
|
||||
---------------
|
||||
|
||||
Matthias Muntwiler, <mailto:matthias.muntwiler@psi.ch>
|
||||
|
||||
|
||||
Contributors
|
||||
------------
|
||||
|
||||
|
||||
@@ -0,0 +1,70 @@
|
||||
Introduction
|
||||
============
|
||||
|
||||
PMSCO stands for PEARL multiple-scattering cluster calculations and structural optimization.
|
||||
It is a collection of computer programs to calculate photoelectron diffraction patterns,
|
||||
and to optimize structural models based on measured data.
|
||||
|
||||
The actual scattering calculation is done by code developed by other parties.
|
||||
PMSCO wraps around that code and facilitates parameter handling, cluster building, structural optimization and parallel processing.
|
||||
In the current version, the [EDAC](http://garciadeabajos-group.icfo.es/widgets/edac/) code
|
||||
developed by F. J. García de Abajo, M. A. Van Hove, and C. S. Fadley (1999) is used for scattering calculations.
|
||||
Other code can be integrated as well.
|
||||
|
||||
Highlights
|
||||
----------
|
||||
|
||||
- angle- or energy-scanned XPD.
|
||||
- various scanning modes including energy, polar angle, azimuthal angle, analyser angle.
|
||||
- averaging over multiple symmetries (domains or emitters).
|
||||
- global optimization of multiple scans.
|
||||
- structural optimization algorithms: particle swarm optimization, grid search, gradient search.
|
||||
- calculation of the modulation function.
|
||||
- calculation of the weighted R-factor.
|
||||
- automatic parallel processing using OpenMPI.
|
||||
|
||||
|
||||
Installation
|
||||
============
|
||||
|
||||
PMSCO is written in Python 2.7.
|
||||
The code will run in any recent Linux environment on a workstation or in a virtual machine.
|
||||
Scientific Linux, CentOS7, [Ubuntu](https://www.ubuntu.com/)
|
||||
and [Lubuntu](http://lubuntu.net/) (recommended for virtual machine) have been tested.
|
||||
For optimization jobs, a cluster with 20-50 available processor cores is recommended.
|
||||
The code requires about 2 GB of RAM per process.
|
||||
|
||||
Detailed installation instructions and dependencies can be found in the documentation
|
||||
(docs/src/installation.dox).
|
||||
A [Doxygen](http://www.stack.nl/~dimitri/doxygen/index.html) compiler with Doxypy is required to generate the documentation in HTML or LaTeX format.
|
||||
|
||||
The public distribution of PMSCO does not contain the [EDAC](http://garciadeabajos-group.icfo.es/widgets/edac/) code.
|
||||
Please obtain the EDAC source code from the original author, copy it to the pmsco/edac directory, and apply the edac_all.patch patch.
|
||||
|
||||
|
||||
License
|
||||
=======
|
||||
|
||||
The source code of PMSCO is licensed under the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0).
|
||||
Please read and respect the license agreement.
|
||||
|
||||
Please share your extensions of the code with the original author.
|
||||
The GitLab facility can be used to create forks and to submit pull requests.
|
||||
Attribution notices for your contributions shall be added to the NOTICE.md file.
|
||||
|
||||
|
||||
Author
|
||||
------
|
||||
|
||||
Matthias Muntwiler, <mailto:matthias.muntwiler@psi.ch>
|
||||
|
||||
Copyright
|
||||
---------
|
||||
|
||||
Copyright 2015-2017 by [Paul Scherrer Institut](http://www.psi.ch)
|
||||
|
||||
|
||||
Release Notes
|
||||
=============
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,157 @@
|
||||
#!/bin/bash
|
||||
#
|
||||
# Slurm script template for PMSCO calculations on the Ra cluster
|
||||
# based on run_mpi_HPL_nodes-2.sl by V. Markushin 2016-03-01
|
||||
#
|
||||
# Use:
|
||||
# - enter the appropriate parameters and save as a new file.
|
||||
# - call the sbatch command to pass the job script.
|
||||
# request a specific number of nodes and tasks.
|
||||
# example:
|
||||
# sbatch --nodes=2 --ntasks-per-node=24 --time=02:00:00 run_pmsco.sl
|
||||
#
|
||||
# PMSCO arguments
|
||||
# copy this template to a new file, and set the arguments
|
||||
#
|
||||
# PMSCO_WORK_DIR
|
||||
# path to be used as working directory.
|
||||
# contains the script derived from this template.
|
||||
# receives output and temporary files.
|
||||
#
|
||||
# PMSCO_PROJECT_FILE
|
||||
# python module that declares the project and starts the calculation.
|
||||
# must include the file path relative to $PMSCO_WORK_DIR.
|
||||
#
|
||||
# PMSCO_SOURCE_DIR
|
||||
# path to the pmsco source directory
|
||||
# (the directory which contains the bin, lib, pmsco sub-directories)
|
||||
#
|
||||
# PMSCO_SCAN_FILES
|
||||
# list of scan files.
|
||||
#
|
||||
# PMSCO_OUT
|
||||
# name of output file. should not include a path.
|
||||
#
|
||||
# all paths are relative to $PMSCO_WORK_DIR or (better) absolute.
|
||||
#
|
||||
#
|
||||
# Further arguments
|
||||
#
|
||||
# PMSCO_JOBNAME (required)
|
||||
# the job name is the base name for output files.
|
||||
#
|
||||
# PMSCO_WALLTIME_HR (integer, required)
|
||||
# wall time limit in hours. must be integer, minimum 1.
|
||||
# this value is passed to PMSCO.
|
||||
# it should specify the same amount of wall time as requested from the scheduler.
|
||||
#
|
||||
# PMSCO_MODE (optional)
|
||||
# calculation mode: single, swarm, grid, gradient
|
||||
#
|
||||
# PMSCO_CODE (optional)
|
||||
# calculation code: edac, msc, test
|
||||
#
|
||||
# PMSCO_LOGLEVEL (optional)
|
||||
# request log level: DEBUG, INFO, WARNING, ERROR
|
||||
# create a log file based on the job name.
|
||||
#
|
||||
# PMSCO_PROJECT_ARGS (optional)
|
||||
# extra arguments that are parsed by the project module.
|
||||
#
|
||||
#SBATCH --job-name="_PMSCO_JOBNAME"
|
||||
#SBATCH --output="_PMSCO_JOBNAME.o.%j"
|
||||
#SBATCH --error="_PMSCO_JOBNAME.e.%j"
|
||||
|
||||
PMSCO_WORK_DIR="_PMSCO_WORK_DIR"
|
||||
PMSCO_JOBNAME="_PMSCO_JOBNAME"
|
||||
PMSCO_WALLTIME_HR=_PMSCO_WALLTIME_HR
|
||||
|
||||
PMSCO_PROJECT_FILE="_PMSCO_PROJECT_FILE"
|
||||
PMSCO_MODE="_PMSCO_MODE"
|
||||
PMSCO_CODE="_PMSCO_CODE"
|
||||
PMSCO_SOURCE_DIR="_PMSCO_SOURCE_DIR"
|
||||
PMSCO_SCAN_FILES="_PMSCO_SCAN_FILES"
|
||||
PMSCO_OUT="_PMSCO_JOBNAME"
|
||||
PMSCO_LOGLEVEL="_PMSCO_LOGLEVEL"
|
||||
PMSCO_PROJECT_ARGS="_PMSCO_PROJECT_ARGS"
|
||||
|
||||
module load psi-python27/2.4.1
|
||||
module load gcc/4.8.5
|
||||
module load openmpi/1.10.2
|
||||
source activate pmsco
|
||||
|
||||
echo '================================================================================'
|
||||
echo "=== Running $0 at the following time and place:"
|
||||
date
|
||||
/bin/hostname
|
||||
cd "$PMSCO_WORK_DIR"
|
||||
pwd
|
||||
ls -lA
|
||||
#the intel compiler is currently not compatible with mpi4py. -mm 170131
|
||||
#echo
|
||||
#echo '================================================================================'
|
||||
#echo "=== Setting the environment to use Intel Cluster Studio XE 2016 Update 2 intel/16.2:"
|
||||
#cmd="source /opt/psi/Programming/intel/16.2/bin/compilervars.sh intel64"
|
||||
#echo $cmd
|
||||
#$cmd
|
||||
echo
|
||||
echo '================================================================================'
|
||||
echo "=== The environment is set as follows:"
|
||||
env
|
||||
echo
|
||||
echo '================================================================================'
|
||||
echo "BEGIN test"
|
||||
echo "=== Intel native mpirun will get the number of nodes and the machinefile from Slurm"
|
||||
which mpirun
|
||||
cmd="mpirun /bin/hostname"
|
||||
echo $cmd
|
||||
$cmd
|
||||
echo "END test"
|
||||
echo
|
||||
echo '================================================================================'
|
||||
echo "BEGIN mpirun pmsco"
|
||||
echo "Intel native mpirun will get the number of nodes and the machinefile from Slurm"
|
||||
echo
|
||||
echo "code revision"
|
||||
cd "$PMSCO_SOURCE_DIR"
|
||||
git log --pretty=tformat:'%h %ai %d' -1
|
||||
python -m compileall pmsco
|
||||
python -m compileall projects
|
||||
cd "$PMSCO_WORK_DIR"
|
||||
echo
|
||||
|
||||
PMSCO_CMD="python $PMSCO_PROJECT_FILE"
|
||||
PMSCO_ARGS="$PMSCO_PROJECT_ARGS"
|
||||
if [ -n "$PMSCO_SCAN_FILES" ]; then
|
||||
PMSCO_ARGS="-s $PMSCO_SCAN_FILES $PMSCO_ARGS"
|
||||
fi
|
||||
if [ -n "$PMSCO_CODE" ]; then
|
||||
PMSCO_ARGS="-c $PMSCO_CODE $PMSCO_ARGS"
|
||||
fi
|
||||
if [ -n "$PMSCO_MODE" ]; then
|
||||
PMSCO_ARGS="-m $PMSCO_MODE $PMSCO_ARGS"
|
||||
fi
|
||||
if [ -n "$PMSCO_OUT" ]; then
|
||||
PMSCO_ARGS="-o $PMSCO_OUT $PMSCO_ARGS"
|
||||
fi
|
||||
if [ "$PMSCO_WALLTIME_HR" -ge 1 ]; then
|
||||
PMSCO_ARGS="-t $PMSCO_WALLTIME_HR $PMSCO_ARGS"
|
||||
fi
|
||||
if [ -n "$PMSCO_LOGLEVEL" ]; then
|
||||
PMSCO_ARGS="--log-level $PMSCO_LOGLEVEL --log-file $PMSCO_JOBNAME.log $PMSCO_ARGS"
|
||||
fi
|
||||
|
||||
which mpirun
|
||||
ls -l "$PMSCO_SOURCE_DIR"
|
||||
ls -l "$PMSCO_PROJECT_FILE"
|
||||
# Do not use the OpenMPI specific options, like "-x LD_LIBRARY_PATH", with the Intel mpirun.
|
||||
cmd="mpirun $PMSCO_CMD $PMSCO_ARGS"
|
||||
echo $cmd
|
||||
$cmd
|
||||
echo "END mpirun pmsco"
|
||||
echo '================================================================================'
|
||||
date
|
||||
ls -lAtr
|
||||
echo '================================================================================'
|
||||
|
||||
exit 0
|
||||
@@ -0,0 +1,178 @@
|
||||
#!/bin/bash
|
||||
#
|
||||
# SGE script template for MSC calculations
|
||||
#
|
||||
# This script uses the tight integration of openmpi-1.4.5-gcc-4.6.3 in SGE
|
||||
# using the parallel environment (PE) "orte".
|
||||
# This script must be used only with qsub command - do NOT run it as a stand-alone
|
||||
# shell script because it will start all processes on the local node.
|
||||
#
|
||||
# PhD arguments
|
||||
# copy this template to a new file, and set the arguments
|
||||
#
|
||||
# PHD_WORK_DIR
|
||||
# path to be used as working directory.
|
||||
# contains the SGE script derived from this template.
|
||||
# receives output and temporary files.
|
||||
#
|
||||
# PHD_PROJECT_FILE
|
||||
# python module that declares the project and starts the calculation.
|
||||
# must include the file path relative to $PHD_WORK_DIR.
|
||||
#
|
||||
# PHD_SOURCE_DIR
|
||||
# path to the pmsco source directory
|
||||
# (the directory which contains the bin, lib, pmsco sub-directories)
|
||||
#
|
||||
# PHD_SCAN_FILES
|
||||
# list of scan files.
|
||||
#
|
||||
# PHD_OUT
|
||||
# name of output file. should not include a path.
|
||||
#
|
||||
# all paths are relative to $PHD_WORK_DIR or (better) absolute.
|
||||
#
|
||||
#
|
||||
# Further arguments
|
||||
#
|
||||
# PHD_JOBNAME (required)
|
||||
# the job name is the base name for output files.
|
||||
#
|
||||
# PHD_NODES (required)
|
||||
# number of computing nodes (processes) to allocate for the job.
|
||||
#
|
||||
# PHD_WALLTIME_HR (required)
|
||||
# wall time limit (hours)
|
||||
#
|
||||
# PHD_WALLTIME_MIN (required)
|
||||
# wall time limit (minutes)
|
||||
#
|
||||
# PHD_MODE (optional)
|
||||
# calculation mode: single, swarm, grid, gradient
|
||||
#
|
||||
# PHD_CODE (optional)
|
||||
# calculation code: edac, msc, test
|
||||
#
|
||||
# PHD_LOGLEVEL (optional)
|
||||
# request log level: DEBUG, INFO, WARNING, ERROR
|
||||
# create a log file based on the job name.
|
||||
#
|
||||
# PHD_PROJECT_ARGS (optional)
|
||||
# extra arguments that are parsed by the project module.
|
||||
#
|
||||
|
||||
PHD_WORK_DIR="_PHD_WORK_DIR"
|
||||
PHD_JOBNAME="_PHD_JOBNAME"
|
||||
PHD_NODES=_PHD_NODES
|
||||
PHD_WALLTIME_HR=_PHD_WALLTIME_HR
|
||||
PHD_WALLTIME_MIN=_PHD_WALLTIME_MIN
|
||||
|
||||
PHD_PROJECT_FILE="_PHD_PROJECT_FILE"
|
||||
PHD_MODE="_PHD_MODE"
|
||||
PHD_CODE="_PHD_CODE"
|
||||
PHD_SOURCE_DIR="_PHD_SOURCE_DIR"
|
||||
PHD_SCAN_FILES="_PHD_SCAN_FILES"
|
||||
PHD_OUT="_PHD_JOBNAME"
|
||||
PHD_LOGLEVEL="_PHD_LOGLEVEL"
|
||||
PHD_PROJECT_ARGS="_PHD_PROJECT_ARGS"
|
||||
|
||||
# Define your job name, parallel environment with the number of slots, and run time:
|
||||
#$ -cwd
|
||||
#$ -N _PHD_JOBNAME.job
|
||||
#$ -pe orte _PHD_NODES
|
||||
#$ -l ram=2G
|
||||
#$ -l s_rt=_PHD_WALLTIME_HR:_PHD_WALLTIME_MIN:00
|
||||
#$ -l h_rt=_PHD_WALLTIME_HR:_PHD_WALLTIME_MIN:30
|
||||
#$ -V
|
||||
|
||||
###################################################
|
||||
# Fix the SGE environment-handling bug (bash):
|
||||
source /usr/share/Modules/init/sh
|
||||
export -n -f module
|
||||
|
||||
# Load the environment modules for this job (the order may be important):
|
||||
module load python/python-2.7.5
|
||||
module load gcc/gcc-4.6.3
|
||||
module load mpi/openmpi-1.4.5-gcc-4.6.3
|
||||
module load blas/blas-20110419-gcc-4.6.3
|
||||
module load lapack/lapack-3.4.2-gcc-4.6.3
|
||||
export LD_LIBRARY_PATH=$PHD_SOURCE_DIR/lib/:$LD_LIBRARY_PATH
|
||||
|
||||
###################################################
|
||||
# Set the environment variables:
|
||||
MPIEXEC=$OPENMPI/bin/mpiexec
|
||||
# OPENMPI is set by the mpi/openmpi-* module.
|
||||
|
||||
export OMP_NUM_THREADS=1
|
||||
export OMPI_MCA_btl='openib,sm,self'
|
||||
# export OMPI_MCA_orte_process_binding=core
|
||||
|
||||
##############
|
||||
# BEGIN DEBUG
|
||||
# Print the SGE environment on master host:
|
||||
echo "================================================================"
|
||||
echo "=== SGE job JOB_NAME=$JOB_NAME JOB_ID=$JOB_ID"
|
||||
echo "================================================================"
|
||||
echo DATE=`date`
|
||||
echo HOSTNAME=`hostname`
|
||||
echo PWD=`pwd`
|
||||
echo "NSLOTS=$NSLOTS"
|
||||
echo "PE_HOSTFILE=$PE_HOSTFILE"
|
||||
cat "$PE_HOSTFILE"
|
||||
echo "================================================================"
|
||||
echo "Running environment:"
|
||||
env
|
||||
echo "================================================================"
|
||||
echo "Loaded environment modules:"
|
||||
module list 2>&1
|
||||
echo
|
||||
# END DEBUG
|
||||
##############
|
||||
|
||||
##############
|
||||
# Setup
|
||||
cd "$PHD_SOURCE_DIR"
|
||||
python -m compileall .
|
||||
|
||||
cd "$PHD_WORK_DIR"
|
||||
ulimit -c 0
|
||||
|
||||
###################################################
|
||||
# The command to run with mpiexec:
|
||||
CMD="python $PHD_PROJECT_FILE"
|
||||
ARGS="$PHD_PROJECT_ARGS"
|
||||
|
||||
if [ -n "$PHD_SCAN_FILES" ]; then
|
||||
ARGS="-s $PHD_SCAN_FILES -- $ARGS"
|
||||
fi
|
||||
|
||||
if [ -n "$PHD_CODE" ]; then
|
||||
ARGS="-c $PHD_CODE $ARGS"
|
||||
fi
|
||||
|
||||
if [ -n "$PHD_MODE" ]; then
|
||||
ARGS="-m $PHD_MODE $ARGS"
|
||||
fi
|
||||
|
||||
if [ -n "$PHD_OUT" ]; then
|
||||
ARGS="-o $PHD_OUT $ARGS"
|
||||
fi
|
||||
|
||||
if [ "$PHD_WALLTIME_HR" -ge 1 ]
|
||||
then
|
||||
ARGS="-t $PHD_WALLTIME_HR $ARGS"
|
||||
else
|
||||
ARGS="-t 0.5 $ARGS"
|
||||
fi
|
||||
|
||||
if [ -n "$PHD_LOGLEVEL" ]; then
|
||||
ARGS="--log-level $PHD_LOGLEVEL --log-file $PHD_JOBNAME.log $ARGS"
|
||||
fi
|
||||
|
||||
# The MPI command to run:
|
||||
MPICMD="$MPIEXEC --prefix $OPENMPI -x PATH -x LD_LIBRARY_PATH -x OMP_NUM_THREADS -x OMPI_MCA_btl -np $NSLOTS $CMD $ARGS"
|
||||
echo "Command to run:"
|
||||
echo "$MPICMD"
|
||||
echo
|
||||
exec $MPICMD
|
||||
|
||||
exit 0
|
||||
Executable
@@ -0,0 +1,145 @@
|
||||
#!/bin/sh
|
||||
#
|
||||
# submission script for PMSCO calculations on the Ra cluster
|
||||
|
||||
if [ $# -lt 1 ]; then
|
||||
echo "Usage: $0 [NOSUB] JOBNAME NODES TASKS_PER_NODE WALLTIME:HOURS PROJECT MODE [ARGS [ARGS [...]]]"
|
||||
echo ""
|
||||
echo " NOSUB (optional): do not submit the script to the queue. default: submit."
|
||||
echo " JOBNAME (text): name of job. use only alphanumeric characters, no spaces."
|
||||
echo " NODES (integer): number of computing nodes. (1 node = 24 or 32 processors)."
|
||||
echo " do not specify more than 2."
|
||||
echo " TASKS_PER_NODE (integer): 1...24, or 32."
|
||||
echo " 24 or 32 for full-node allocation."
|
||||
echo " 1...23 for shared node allocation."
|
||||
echo " WALLTIME:HOURS (integer): requested wall time."
|
||||
echo " 1...24 for day partition"
|
||||
echo " 25...192 for week partition"
|
||||
echo " 1...192 for shared partition"
|
||||
echo " PROJECT: python module (file path) that declares the project and starts the calculation."
|
||||
echo " MODE: PMSCO calculation mode (single|swarm|gradient|grid)."
|
||||
echo " ARGS (optional): any number of further PMSCO or project arguments (except mode and time)."
|
||||
echo ""
|
||||
echo "the job script complete with the program code and input/output data is generated in ~/jobs/\$JOBNAME"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# location of the pmsco package is derived from the path of this script
|
||||
SCRIPTDIR="$(dirname "$(readlink -f "$0")")"
|
||||
SOURCEDIR="$SCRIPTDIR/.."
|
||||
PMSCO_SOURCE_DIR="$SOURCEDIR"
|
||||
|
||||
# read arguments
|
||||
if [ "$1" = "NOSUB" ]; then
|
||||
NOSUB="true"
|
||||
shift
|
||||
else
|
||||
NOSUB="false"
|
||||
fi
|
||||
|
||||
PMSCO_JOBNAME=$1
|
||||
shift
|
||||
|
||||
PMSCO_NODES=$1
|
||||
PMSCO_TASKS_PER_NODE=$2
|
||||
PMSCO_TASKS=$(expr $PMSCO_NODES \* $PMSCO_TASKS_PER_NODE)
|
||||
shift 2
|
||||
|
||||
PMSCO_WALLTIME_HR=$1
|
||||
PMSCO_WALLTIME_MIN=$(expr $PMSCO_WALLTIME_HR \* 60)
|
||||
shift
|
||||
|
||||
# select partition
|
||||
if [ $PMSCO_WALLTIME_HR -ge 25 ]; then
|
||||
PMSCO_PARTITION="week"
|
||||
else
|
||||
PMSCO_PARTITION="day"
|
||||
fi
|
||||
if [ $PMSCO_TASKS_PER_NODE -lt 24 ]; then
|
||||
PMSCO_PARTITION="shared"
|
||||
fi
|
||||
|
||||
PMSCO_PROJECT_FILE="$(readlink -f $1)"
|
||||
shift
|
||||
|
||||
PMSCO_MODE="$1"
|
||||
shift
|
||||
|
||||
PMSCO_PROJECT_ARGS="$*"
|
||||
|
||||
# use defaults, override explicitly in PMSCO_PROJECT_ARGS if necessary
|
||||
PMSCO_SCAN_FILES=""
|
||||
PMSCO_LOGLEVEL=""
|
||||
PMSCO_CODE=""
|
||||
|
||||
# set up working directory
|
||||
cd ~
|
||||
if [ ! -d "jobs" ]; then
|
||||
mkdir jobs
|
||||
fi
|
||||
cd jobs
|
||||
if [ ! -d "$PMSCO_JOBNAME" ]; then
|
||||
mkdir "$PMSCO_JOBNAME"
|
||||
fi
|
||||
cd "$PMSCO_JOBNAME"
|
||||
WORKDIR="$(pwd)"
|
||||
PMSCO_WORK_DIR="$WORKDIR"
|
||||
|
||||
# provide revision information, requires git repository
|
||||
cd "$SOURCEDIR"
|
||||
PMSCO_REV=$(git log --pretty=format:"Data revision %h, %ai" -1)
|
||||
if [ $? -ne 0 ]; then
|
||||
PMSCO_REV="Data revision unknown, "$(date +"%F %T %z")
|
||||
fi
|
||||
cd "$WORKDIR"
|
||||
echo "$PMSCO_REV" > revision.txt
|
||||
|
||||
# generate job script from template
|
||||
sed -e "s:_PMSCO_WORK_DIR:$PMSCO_WORK_DIR:g" \
|
||||
-e "s:_PMSCO_JOBNAME:$PMSCO_JOBNAME:g" \
|
||||
-e "s:_PMSCO_NODES:$PMSCO_NODES:g" \
|
||||
-e "s:_PMSCO_WALLTIME_HR:$PMSCO_WALLTIME_HR:g" \
|
||||
-e "s:_PMSCO_PROJECT_FILE:$PMSCO_PROJECT_FILE:g" \
|
||||
-e "s:_PMSCO_PROJECT_ARGS:$PMSCO_PROJECT_ARGS:g" \
|
||||
-e "s:_PMSCO_CODE:$PMSCO_CODE:g" \
|
||||
-e "s:_PMSCO_MODE:$PMSCO_MODE:g" \
|
||||
-e "s:_PMSCO_SOURCE_DIR:$PMSCO_SOURCE_DIR:g" \
|
||||
-e "s:_PMSCO_SCAN_FILES:$PMSCO_SCAN_FILES:g" \
|
||||
-e "s:_PMSCO_LOGLEVEL:$PMSCO_LOGLEVEL:g" \
|
||||
"$SCRIPTDIR/pmsco.ra.template" > $PMSCO_JOBNAME.job
|
||||
|
||||
chmod u+x "$PMSCO_JOBNAME.job"
|
||||
|
||||
# request nodes and tasks
|
||||
#
|
||||
# The option --ntasks-per-node is meant to be used with the --nodes option.
|
||||
# (For the --ntasks option, the default is one task per node, use the --cpus-per-task option to change this default.)
|
||||
#
|
||||
# sbatch options
|
||||
# --cores-per-socket=16
|
||||
# 32 cores per node
|
||||
# --partition=[shared|day|week]
|
||||
# --time=8-00:00:00
|
||||
# override default time limit (2 days in long queue)
|
||||
# time formats: "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes", "days-hours:minutes:seconds"
|
||||
# --mail-type=ALL
|
||||
# --test-only
|
||||
# check script but do not submit
|
||||
#
|
||||
SLURM_ARGS="--nodes=$PMSCO_NODES --ntasks-per-node=$PMSCO_TASKS_PER_NODE"
|
||||
|
||||
if [ $PMSCO_TASKS_PER_NODE -gt 24 ]; then
|
||||
SLURM_ARGS="--cores-per-socket=16 $SLURM_ARGS"
|
||||
fi
|
||||
|
||||
SLURM_ARGS="--partition=$PMSCO_PARTITION $SLURM_ARGS"
|
||||
|
||||
SLURM_ARGS="--time=$PMSCO_WALLTIME_HR:00:00 $SLURM_ARGS"
|
||||
|
||||
CMD="sbatch $SLURM_ARGS $PMSCO_JOBNAME.job"
|
||||
echo $CMD
|
||||
if [ "$NOSUB" != "true" ]; then
|
||||
$CMD
|
||||
fi
|
||||
|
||||
exit 0
|
||||
Executable
@@ -0,0 +1,128 @@
|
||||
#!/bin/sh
|
||||
#
|
||||
# submission script for PMSCO calculations on Merlin cluster
|
||||
#
|
||||
|
||||
if [ $# -lt 1 ]; then
|
||||
echo "Usage: $0 [NOSUB] JOBNAME NODES WALLTIME:HOURS PROJECT MODE [LOG_LEVEL]"
|
||||
echo ""
|
||||
echo " NOSUB (optional): do not submit the script to the queue. default: submit."
|
||||
echo " WALLTIME:HOURS (integer): sets the wall time limits."
|
||||
echo " soft limit = HOURS:00:00"
|
||||
echo " hard limit = HOURS:00:30"
|
||||
echo " for short.q: HOURS = 0 (-> MINUTES=30)"
|
||||
echo " for all.q: HOURS <= 24"
|
||||
echo " for long.q: HOURS <= 96"
|
||||
echo " PROJECT: python module (file path) that declares the project and starts the calculation."
|
||||
echo " MODE: PMSCO calculation mode (single|swarm|gradient|grid)."
|
||||
echo " LOG_LEVEL (optional): one of DEBUG, INFO, WARNING, ERROR if log files should be produced."
|
||||
echo ""
|
||||
echo "the job script complete with the program code and input/output data is generated in ~/jobs/\$JOBNAME"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# location of the pmsco package is derived from the path of this script
|
||||
SCRIPTDIR="$(dirname "$(readlink -f "$0")")"
|
||||
SOURCEDIR="$SCRIPTDIR/.."
|
||||
PHD_SOURCE_DIR="$SOURCEDIR"
|
||||
|
||||
PHD_CODE="edac"
|
||||
|
||||
# read arguments
|
||||
if [ "$1" = "NOSUB" ]; then
|
||||
NOSUB="true"
|
||||
shift
|
||||
else
|
||||
NOSUB="false"
|
||||
fi
|
||||
|
||||
PHD_JOBNAME=$1
|
||||
shift
|
||||
|
||||
PHD_NODES=$1
|
||||
shift
|
||||
|
||||
PHD_WALLTIME_HR=$1
|
||||
PHD_WALLTIME_MIN=0
|
||||
shift
|
||||
|
||||
PHD_PROJECT_FILE="$(readlink -f $1)"
|
||||
PHD_PROJECT_ARGS=""
|
||||
shift
|
||||
|
||||
PHD_MODE="$1"
|
||||
shift
|
||||
|
||||
PHD_LOGLEVEL=""
|
||||
if [ "$1" = "DEBUG" ] || [ "$1" = "INFO" ] || [ "$1" = "WARNING" ] || [ "$1" = "ERROR" ]; then
|
||||
PHD_LOGLEVEL="$1"
|
||||
shift
|
||||
fi
|
||||
|
||||
# ignore remaining arguments
|
||||
PHD_SCAN_FILES=""
|
||||
|
||||
# select allowed queues
|
||||
QUEUE=short.q,all.q,long.q
|
||||
|
||||
# for short queue (limit 30 minutes)
|
||||
if [ "$PHD_WALLTIME_HR" -lt 1 ]; then
|
||||
PHD_WALLTIME_HR=0
|
||||
PHD_WALLTIME_MIN=30
|
||||
fi
|
||||
|
||||
# set up working directory
|
||||
cd ~
|
||||
if [ ! -d "jobs" ]; then
|
||||
mkdir jobs
|
||||
fi
|
||||
cd jobs
|
||||
if [ ! -d "$PHD_JOBNAME" ]; then
|
||||
mkdir "$PHD_JOBNAME"
|
||||
fi
|
||||
cd "$PHD_JOBNAME"
|
||||
WORKDIR="$(pwd)"
|
||||
PHD_WORK_DIR="$WORKDIR"
|
||||
|
||||
# provide revision information, requires git repository
|
||||
cd "$SOURCEDIR"
|
||||
PHD_REV=$(git log --pretty=format:"Data revision %h, %ad" --date=iso -1)
|
||||
if [ $? -ne 0 ]; then
|
||||
PHD_REV="Data revision unknown, "$(date +"%F %T %z")
|
||||
fi
|
||||
cd "$WORKDIR"
|
||||
echo "$PHD_REV" > revision.txt
|
||||
|
||||
# generate job script from template
|
||||
sed -e "s:_PHD_WORK_DIR:$PHD_WORK_DIR:g" \
|
||||
-e "s:_PHD_JOBNAME:$PHD_JOBNAME:g" \
|
||||
-e "s:_PHD_NODES:$PHD_NODES:g" \
|
||||
-e "s:_PHD_WALLTIME_HR:$PHD_WALLTIME_HR:g" \
|
||||
-e "s:_PHD_WALLTIME_MIN:$PHD_WALLTIME_MIN:g" \
|
||||
-e "s:_PHD_PROJECT_FILE:$PHD_PROJECT_FILE:g" \
|
||||
-e "s:_PHD_PROJECT_ARGS:$PHD_PROJECT_ARGS:g" \
|
||||
-e "s:_PHD_CODE:$PHD_CODE:g" \
|
||||
-e "s:_PHD_MODE:$PHD_MODE:g" \
|
||||
-e "s:_PHD_SOURCE_DIR:$PHD_SOURCE_DIR:g" \
|
||||
-e "s:_PHD_SCAN_FILES:$PHD_SCAN_FILES:g" \
|
||||
-e "s:_PHD_LOGLEVEL:$PHD_LOGLEVEL:g" \
|
||||
"$SCRIPTDIR/pmsco.sge.template" > $PHD_JOBNAME.job
|
||||
|
||||
chmod u+x "$PHD_JOBNAME.job"
|
||||
|
||||
if [ "$NOSUB" != "true" ]; then
|
||||
|
||||
# suppress bash error [stackoverflow.com/questions/10496758]
|
||||
unset module
|
||||
|
||||
# submit the job script
|
||||
# EMAIL must be defined in the environment
|
||||
if [ -n "$EMAIL" ]; then
|
||||
qsub -q $QUEUE -m ae -M $EMAIL $PHD_JOBNAME.job
|
||||
else
|
||||
qsub -q $QUEUE $PHD_JOBNAME.job
|
||||
fi
|
||||
|
||||
fi
|
||||
|
||||
exit 0
|
||||
@@ -0,0 +1,3 @@
|
||||
doxygen*.db
|
||||
html/*
|
||||
latex/*
|
||||
+2396
File diff suppressed because it is too large
@@ -0,0 +1,26 @@
|
||||
SHELL=/bin/sh
|
||||
|
||||
# makefile for PMSCO documentation
|
||||
#
|
||||
|
||||
.SUFFIXES:
|
||||
.SUFFIXES: .c .cpp .cxx .exe .f .h .i .o .py .pyf .so .html
|
||||
.PHONY: all docs doxygen pdf clean
|
||||
|
||||
DOX=doxygen
|
||||
DOXOPTS=
|
||||
LATEX_DIR=latex
|
||||
|
||||
all: docs
|
||||
|
||||
docs: doxygen pdf
|
||||
|
||||
doxygen:
|
||||
$(DOX) $(DOXOPTS) config.dox
|
||||
|
||||
pdf: doxygen
|
||||
-$(MAKE) -C $(LATEX_DIR)
|
||||
|
||||
clean:
|
||||
-rm -rf latex/*
|
||||
-rm -rf html/*
|
||||
@@ -0,0 +1,7 @@
|
||||
To compile the source code documentation, you need the following packages (naming according to Debian):
|
||||
|
||||
doxygen
|
||||
doxygen-gui (optional)
|
||||
doxypy
|
||||
graphviz
|
||||
latex (optional)
|
||||
@@ -0,0 +1,144 @@
|
||||
/*! @page pag_command Command Line
|
||||
\section sec_command Command Line
|
||||
|
||||
This section describes the command line arguments for a direct call of PMSCO from the shell.
|
||||
For batch job submission to Slurm see @ref sec_slurm.
|
||||
|
||||
Since PMSCO is started indirectly by a call of the specific project module,
|
||||
the syntax of the command line arguments is defined by the project module.
|
||||
However, to reduce the amount of custom code and documentation and to avoid confusion
|
||||
it is recommended to adhere to the standard syntax described below.
|
||||
|
||||
The basic command line is as follows:
|
||||
@code{.sh}
|
||||
[mpiexec -np NPROCESSES] python path-to-project.py [common args] [project args]
|
||||
@endcode
|
||||
|
||||
Include the first portion between square brackets if you want to run parallel processes.
|
||||
Specify the number of processes with the @c -np option.
|
||||
@c path-to-project.py is the path and file name of your project module.
|
||||
Common args and project args are described below.
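
As a rough illustration, the command line above could be assembled programmatically as in the following sketch.
The helper name @c build_command, the project path and the argument values are assumptions made for this example; they are not part of PMSCO.

```python
def build_command(project_path, common_args=(), project_args=(), nprocesses=None):
    """Assemble the command line as a list of strings (illustrative helper)."""
    cmd = []
    # the mpiexec prefix is optional: include it only for parallel runs
    if nprocesses is not None:
        cmd += ["mpiexec", "-np", str(nprocesses)]
    cmd += ["python", project_path]
    # common arguments first, project arguments last
    cmd += list(common_args)
    cmd += list(project_args)
    return cmd


if __name__ == "__main__":
    # hypothetical project path and argument values
    command = build_command("twoatom.py", ["-m", "swarm", "-t", "2"], ["-s", "ea"], nprocesses=8)
    print(" ".join(command))
```

Omitting @c nprocesses yields the serial form of the command without the @c mpiexec prefix.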
|
||||
|
||||
|
||||
\subsection sec_common_args Common Arguments
|
||||
|
||||
All common arguments are optional and default to reasonable values if omitted.
|
||||
They can be added to the command line in arbitrary order.
|
||||
The following table is ordered by importance.
|
||||
|
||||
|
||||
| Option | Values | Description |
|
||||
| --- | --- | --- |
|
||||
| -h, --help | | Display a command line summary and exit. |
|
||||
| -m, --mode | single (default), grid, swarm | Operation mode. |
|
||||
| -d, --data-dir | file system path | Directory path for experimental data files (if required by project). Default: current working directory. |
|
||||
| -o, --output-file | file system path | Base path and/or name for intermediate and output files. Default: pmsco_data |
|
||||
| -t, --time-limit | decimal number | Wall time limit in hours. The optimizers try to finish before the limit. Default: 24.0. |
|
||||
| -k, --keep-files | list of file categories | Output file categories to keep after the calculation. Multiple values can be specified and must be separated by spaces. By default, cluster and model (simulated data) of a limited number of best models are kept. See @ref sec_file_categories below. |
|
||||
| --log-level | DEBUG, INFO, WARNING (default), ERROR, CRITICAL | Minimum level of messages that should be added to the log. |
|
||||
| --log-file | file system path | Name of the main log file. Under MPI, the rank of the process is inserted before the extension. Default: output-file + log, or pmsco.log. |
|
||||
| --log-disable | | Disable logging. By default, logging is on. |
|
||||
| --pop-size | integer | Population size (number of particles) in swarm optimization mode. The default value is the greater of 4 or two times the number of calculation processes. |
|
||||
| -c, --code | edac (default) | Scattering code. At the moment, only edac is supported. |
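
The options in the table map naturally onto Python's @c argparse module.
The following is a minimal sketch of such a parser, not the actual implementation of @c pmsco.pmsco.parse_cli; the function names @c make_common_parser and @c default_pop_size are hypothetical, and the defaults are copied from the table.

```python
import argparse


def default_pop_size(n_processes):
    """Greater of 4 or two times the number of calculation processes."""
    return max(4, 2 * n_processes)


def make_common_parser():
    """Build a parser for the common arguments (illustrative sketch)."""
    parser = argparse.ArgumentParser()  # -h/--help is added automatically
    parser.add_argument("-m", "--mode", default="single",
                        choices=["single", "grid", "swarm"])
    parser.add_argument("-d", "--data-dir", default=".")
    parser.add_argument("-o", "--output-file", default="pmsco_data")
    parser.add_argument("-t", "--time-limit", type=float, default=24.0)
    # multiple category names, separated by spaces
    parser.add_argument("-k", "--keep-files", nargs="*", default=None)
    parser.add_argument("--log-level", default="WARNING",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"])
    parser.add_argument("--log-file", default=None)
    parser.add_argument("--log-disable", action="store_true")
    # None means: derive the value with default_pop_size() at run time
    parser.add_argument("--pop-size", type=int, default=None)
    parser.add_argument("-c", "--code", default="edac", choices=["edac"])
    return parser


if __name__ == "__main__":
    # parse_known_args leaves unrecognized (project) arguments untouched
    args, unknown = make_common_parser().parse_known_args(
        ["-m", "swarm", "-t", "2.5", "--scans", "ea"])
    print(args.mode, args.time_limit, unknown)
```

Using @c parse_known_args rather than @c parse_args is what allows project arguments to share the command line with the common ones.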
|
||||
|
||||
|
||||
\subsubsection sec_file_categories File Categories
|
||||
|
||||
The following category names can be used with the @c --keep-files option.
|
||||
Multiple names can be specified and must be separated by spaces.
|
||||
|
||||
| Category | Description | Default Action |
|
||||
| --- | --- | --- |
|
||||
| input | raw input files for calculator, including cluster and phase files in custom format | delete |
|
||||
| output | raw output files from calculator | delete |
|
||||
| phase | phase files in portable format for report | delete |
|
||||
| cluster | cluster files in portable XYZ format for report | keep |
|
||||
| debug | debug files | delete |
|
||||
| model | output files in ETPAI format: complete simulation (a_-1_-1_-1_-1) | keep |
|
||||
| scan | output files in ETPAI format: scan (a_b_-1_-1_-1) | delete |
|
||||
| symmetry | output files in ETPAI format: symmetry (a_b_c_-1_-1) | delete |
|
||||
| emitter | output files in ETPAI format: emitter (a_b_c_d_-1) | delete |
|
||||
| region | output files in ETPAI format: region (a_b_c_d_e) | delete |
|
||||
| report | final report of results | keep |
|
||||
| population | final state of particle population | keep |
|
||||
| rfac | files related to models which give bad r-factors | delete |
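
The index patterns in the table suggest that the category of an ETPAI output file follows from how many of its five index values are resolved, with -1 marking an unresolved level.
The sketch below encodes that reading; it is an interpretation of the table, and the names @c LEVELS, @c DEFAULT_KEEP, @c category_of and @c is_kept are hypothetical.

```python
# index levels of ETPAI output files, from outermost to innermost
LEVELS = ("model", "scan", "symmetry", "emitter", "region")

# categories whose default action in the table is "keep"
DEFAULT_KEEP = frozenset(["cluster", "model", "report", "population"])


def category_of(index):
    """Return the category of a five-part index such as (3, 0, -1, -1, -1).

    A value of -1 marks a level that is not resolved in the file.
    """
    if len(index) != len(LEVELS):
        raise ValueError("expected %d index values" % len(LEVELS))
    depth = 0
    for value in index:
        if value < 0:
            break
        depth += 1
    if depth == 0:
        raise ValueError("the model index must not be negative")
    return LEVELS[depth - 1]


def is_kept(index, keep=DEFAULT_KEEP):
    """True if a file with this index belongs to a category that is kept."""
    return category_of(index) in keep


if __name__ == "__main__":
    print(category_of((3, -1, -1, -1, -1)))  # complete simulation of model 3
    print(is_kept((3, 0, -1, -1, -1)))       # scan-level files are deleted by default
```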
|
||||
|
||||
|
||||
\subsection sec_project_args Project Arguments
|
||||
|
||||
The following table lists a few recommended options that are handled by the project code.
|
||||
Project options that are not listed here should use the long form to avoid conflicts in future versions.
|
||||
|
||||
|
||||
| Option | Values | Description |
|
||||
| --- | --- | --- |
|
||||
| -s, --scans | project-dependent | Nick names of scans to use in calculation. The nick name selects the experimental data file and the initial state of the photoelectron. Multiple values can be specified and must be separated by spaces. |
|
||||
|
||||
|
||||
\subsection sec_scanfile Experimental Scan Files
|
||||
|
||||
The recommended way of specifying experimental scan files is using nick names (dictionary keys) and the @c --scans option.
|
||||
A dictionary in the module code defines the corresponding file name, chemical species of the emitter and initial state of the photoelectron.
|
||||
The location of the files is selected using the common @c --data-dir option.
|
||||
This way, the file names and photoelectron parameters are versioned with the code,
|
||||
whereas command line arguments may easily get forgotten in the records.
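
A scan dictionary of this kind might look like the following sketch.
All nick names, file names and photoelectron parameters shown here are invented for illustration, as are the names @c SCAN_DICT and @c select_scans.

```python
import os

# hypothetical nick names, file names and photoelectron parameters
SCAN_DICT = {
    "ea": {"filename": "example_ea.etpi", "emitter": "N", "initial_state": "1s"},
    "eb": {"filename": "example_eb.etpi", "emitter": "Ni", "initial_state": "2p3/2"},
}


def select_scans(nicknames, data_dir="."):
    """Resolve nick names from --scans into file paths and emitter parameters.

    data_dir corresponds to the common --data-dir option.
    An unknown nick name raises KeyError.
    """
    scans = []
    for nick in nicknames:
        entry = SCAN_DICT[nick]
        scans.append({
            "path": os.path.join(data_dir, entry["filename"]),
            "emitter": entry["emitter"],
            "initial_state": entry["initial_state"],
        })
    return scans


if __name__ == "__main__":
    for scan in select_scans(["ea", "eb"], data_dir="data"):
        print(scan["path"], scan["emitter"], scan["initial_state"])
```

Because the dictionary lives in the project module, a change of file name or initial state shows up in the version history of the code.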
|
||||
|
||||
|
||||
\subsection sec_project_example Example Argument Handling
|
||||
|
||||
An example of handling the command line in a project module can be found in the twoatom.py demo project.
|
||||
The following code snippet shows how the common and project arguments are separated and handled.
|
||||
|
||||
@code{.py}
|
||||
def main():
    # have the pmsco module parse the common arguments.
    args, unknown_args = pmsco.pmsco.parse_cli()

    # pass any arguments not handled by pmsco
    # to the project-defined parse_project_args function.
    # unknown_args can be passed to argparse.ArgumentParser.parse_args().
    if unknown_args:
        project_args = parse_project_args(unknown_args)
    else:
        project_args = None

    # create the project object
    project = create_project()

    # apply the common arguments on the project
    pmsco.pmsco.set_common_args(project, args)

    # apply the specific arguments on the project
    set_project_args(project, project_args)

    # run the project
    pmsco.pmsco.run_project(project)
|
||||
@endcode
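The body of `parse_project_args` is not part of the snippet.
A possible sketch, assuming that the project defines only the `-s, --scans` option from the table in \ref sec_project_args, is given here.
The implementation in twoatom.py may differ.

```python
import argparse


def parse_project_args(unknown_args):
    """Parse the arguments that were not recognised by the common parser."""
    parser = argparse.ArgumentParser(description="project-specific arguments")
    # -s/--scans accepts one or more nick names separated by spaces.
    parser.add_argument('-s', '--scans', nargs='+', default=[],
                        help="nick names of scans to use in the calculation")
    return parser.parse_args(unknown_args)
```

The returned namespace carries the nick names in its `scans` attribute, e.g. `['ea']` for `-s ea`.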
|
||||
|
||||
|
||||
\section sec_slurm Slurm Job Submission
|
||||
|
||||
The command line of the Slurm job submission script for the Ra cluster at PSI is as follows.
|
||||
This script is specific to the configuration of the Ra cluster but may be adapted to other Slurm-based queues.
|
||||
|
||||
@code{.sh}
|
||||
qpmsco.sh [NOSUB] JOBNAME NODES TASKS_PER_NODE WALLTIME:HOURS PROJECT MODE [ARGS [ARGS [...]]]
|
||||
@endcode
|
||||
|
||||
Here, the first few arguments are positional and their order must be strictly adhered to.
|
||||
After the positional arguments, optional arguments of the PMSCO project command line can be added in arbitrary order.
|
||||
If you execute the script without arguments, it displays a short summary.
|
||||
The job script is written to @c ~/jobs/\$JOBNAME.
|
||||
|
||||
| Argument | Values | Description |
| --- | --- | --- |
| NOSUB (optional) | NOSUB or omitted | If NOSUB is present as the first argument, create the job script but do not submit it to the queue. Otherwise, submit the job script. |
| JOBNAME | text | Name of job. Use only alphanumeric characters, no spaces. |
| NODES | integer | Number of computing nodes. (1 node = 24 or 32 processors). Do not specify more than 2. |
| TASKS_PER_NODE | 1...24, or 32 | Number of processes per node. 24 or 32 for full-node allocation. 1...23 for shared node allocation. |
| WALLTIME:HOURS | integer | Requested wall time. 1...24 for day partition, 24...192 for week partition, 1...192 for shared partition. This value is also passed on to PMSCO as the @c --time-limit argument. |
| PROJECT | file system path | Python module (file path) that declares the project and starts the calculation. |
| MODE | single, swarm, grid | PMSCO operation mode. This value is passed on to PMSCO as the @c --mode argument. |
| ARGS (optional) | | Any further arguments are passed on verbatim to PMSCO. You don't need to specify the mode and time limit here. |
|
||||
|
||||
*/
|
||||
@@ -0,0 +1,153 @@
|
||||
/*! @page pag_concepts Design Concepts
|
||||
\section sec_tasks Tasks
|
||||
|
||||
In an optimization project, a number of optimizable, high-level parameters generated by the optimization algorithm
|
||||
must be mapped to the input parameters and atomic coordinates before the calculation program is executed.
|
||||
Possibly, the calculation program is executed multiple times for inequivalent domains, emitters or scan geometries.
|
||||
After the calculation, the output is collected, compared to the experimental data, and the model is refined.
|
||||
In PMSCO, the optimization is broken down into a set of _tasks_ and assigned to a stack of task _handlers_ according to the following figure.
|
||||
Each invocation of the scattering program (EDAC) runs a specific task,
|
||||
i.e. a calculation for a set of specific parameters, a fully-qualified cluster of atoms, and a specific angle and/or energy scan.
|
||||
|
||||
\dotfile tasks.dot "PMSCO task stack"
|
||||
|
||||
At the root, the _model handler_ proposes models that need to be calculated according to the operation mode specified at the command line.
|
||||
A _model_ is the minimum set of variable parameters in the context of a custom project.
|
||||
Other parameters that will not vary under optimization are set directly by the project code.
|
||||
The model handler may generate models based on a fixed scheme, e.g. on a grid, or based on R-factors of previous results.
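As an illustration of the fixed-scheme case, the following sketch enumerates models on a regular grid.
The function name `grid_models` and the parameter names `dAB` and `theta` are hypothetical; the grid handler of PMSCO may be organised differently.

```python
import itertools


def grid_models(axes):
    """
    Yield one model (dictionary of parameter values) per grid point.

    axes maps each parameter name to the list of values to visit.
    """
    names = sorted(axes)
    for values in itertools.product(*(axes[name] for name in names)):
        yield dict(zip(names, values))


# two hypothetical model parameters: 3 x 2 = 6 grid points
models = list(grid_models({'dAB': [2.0, 2.1, 2.2], 'theta': [0.0, 15.0]}))
```

Each dictionary corresponds to one model in the sense used above, i.e. one complete set of variable parameters.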
|
||||
|
||||
For each model, one task is passed to the task handling chain, starting with the scan handler.
|
||||
The _scan handler_ generates sub-tasks for each experimental scan dataset.
|
||||
This way, the model can be optimized for multiple experimental scans in the same run (see Sec. \ref sec_scanning).
|
||||
|
||||
The _symmetry handler_ generates sub-tasks based on the number of symmetries contained in the experimental data (see Sec. \ref sec_symmetry).
|
||||
For instance, for a system that includes two inequivalent structural domains, two separate calculations have to be run for each model.
|
||||
The symmetry handler is implemented on the project level and may be customized for a specific system.
|
||||
|
||||
The _emitter handler_ generates a sub-task for each inequivalent emitter atom
|
||||
so that the tasks can be distributed to multiple processes (see Sec. \ref sec_emitters).
|
||||
In a single-process environment, all emitters are calculated in one task.
|
||||
|
||||
The _region handler_ may split a scan region into several smaller chunks
|
||||
so that the tasks can be distributed to multiple processes.
|
||||
With EDAC, only energy scans can benefit from chunking
|
||||
since it always calculates the full angular distribution.
|
||||
This layer has to be enabled specifically in the project module.
|
||||
It is disabled by default.
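The chunking strategy is not specified here.
The following sketch shows one plausible scheme, assuming that an energy scan is simply cut into contiguous pieces of nearly equal size; `split_region` is a hypothetical helper, not a PMSCO function.

```python
def split_region(energies, n_chunks):
    """Split a list of energies into at most n_chunks contiguous chunks."""
    n_chunks = max(1, min(n_chunks, len(energies)))
    base, extra = divmod(len(energies), n_chunks)
    chunks = []
    start = 0
    for i in range(n_chunks):
        # the first `extra` chunks receive one additional point
        stop = start + base + (1 if i < extra else 0)
        chunks.append(energies[start:stop])
        start = stop
    return chunks
```

Concatenating the chunks in order restores the original scan, which is what the gathering step of the region level relies on.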
|
||||
|
||||
At the end of the stack, the tasks are fully specified and are passed to the calculation queue.
|
||||
They are dispatched to the available processes of the MPI environment in which PMSCO was started,
|
||||
which allows calculations to be run in parallel.
|
||||
Only now, after the model has been broken down into multiple tasks, are the cluster and input files generated and the calculation program started.
|
||||
|
||||
At the end of a calculation, the output is passed back through the task handler stack.
|
||||
In this phase, each level combines the datasets from its sub-tasks into the data requested by the parent task
|
||||
and passes the result to the next higher level.
|
||||
|
||||
On the top level, the calculation is compared to the experimental data.
|
||||
Depending on the operation mode, the model parameters are refined, and new tasks issued.
|
||||
If the optimization is finished according to a set of defined criteria, PMSCO exits.
|
||||
|
||||
As an implementation detail, each task is given a unique _identifier_ consisting of five integer numbers
|
||||
which correspond to the five levels model, scan, symmetry, emitter and region.
|
||||
The identifier appears in the file names in the communication with the scattering program.
|
||||
Normally, the data files are deleted after the calculation, and only a few top-level files are kept
|
||||
(can be overridden at the command line or in the project code).
|
||||
At the top level, only the model ID is set, the other ones are undefined (-1).
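The identifier scheme can be sketched with a named tuple.
The names `TaskID`, `format_id` and `task_level` are assumptions for illustration; only the order of the five levels and the use of -1 for undefined levels are taken from the text.

```python
from collections import namedtuple

# Five levels in the order given in the text; -1 marks an undefined level.
TaskID = namedtuple('TaskID', ['model', 'scan', 'symmetry', 'emitter', 'region'])


def format_id(task_id):
    """Join the five indices with underscores, as they appear in file names."""
    return "_".join(str(index) for index in task_id)


def task_level(task_id):
    """Return the name of the deepest level that is defined (not -1)."""
    level = None
    for name, index in zip(TaskID._fields, task_id):
        if index < 0:
            break
        level = name
    return level


# a top-level task of model 3 and its first scan sub-task
top = TaskID(model=3, scan=-1, symmetry=-1, emitter=-1, region=-1)
scan_task = top._replace(scan=0)
```

A sub-task is derived from its parent by filling in the next undefined index, so the parent of any task can be recovered by resetting its deepest defined index to -1.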
|
||||
|
||||
|
||||
\section sec_symmetry Symmetry and Domain Averaging
|
||||
|
||||
A _symmetry_ under PMSCO is a discrete variant of a set of calculation parameters (including the atomic cluster)
|
||||
that is derived from the same set of model parameters
|
||||
and that contributes incoherently to the measured diffraction pattern.
|
||||
A symmetry may be represented by a special symmetry parameter which is not subject to optimization.
|
||||
|
||||
For instance, a real sample may have additional rotational domains that are not present in the cluster,
|
||||
increasing the symmetry from three-fold to six-fold.
|
||||
Or, an adsorbate may be present in a number of different lateral configurations on the substrate.
|
||||
In the first case, it may be sufficient to fold calculated data in the proper way to generate the same symmetry as in the measurement.
|
||||
In the latter case, it may be necessary to execute a scattering calculation for each possible orientation or a representative number of possible orientations.
|
||||
|
||||
PMSCO provides the basic framework to spawn multiple calculations according to the number of symmetries (cf. \ref sec_tasks).
|
||||
The actual data reduction from multiple symmetries to one measurement needs to be implemented on the project level.
|
||||
This section explains the necessary steps.
|
||||
|
||||
1. Your project needs to populate the pmsco.project.Project.symmetries list.
|
||||
For each symmetry, add a dictionary of symmetry parameters, e.g. <code>{'angle_azi': 15.0}</code>.
|
||||
There must be at least one symmetry in a project, otherwise no calculation is executed.
|
||||
|
||||
2. The project may apply the symmetry of a task to the cluster and parameter file if necessary.
|
||||
The pmsco.project.Project.create_cluster and pmsco.project.Project.create_params methods receive the index of the particular symmetry in addition to the model parameters.
|
||||
|
||||
3. The project combines the results of the calculations for the various symmetries into one dataset that can be compared to the measurement.
|
||||
The default method implemented in pmsco.project.Project just adds up all calculations with equal weight.
|
||||
If you need more control, you need to override the pmsco.project.Project.combine_symmetries method and implement your own algorithm.
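The equal-weight combination of step 3 can be sketched in plain Python as follows.
This is an illustration of the arithmetic only; the function name and the list-based data layout are assumptions, and the signature of pmsco.project.Project.combine_symmetries is not reproduced here.

```python
def combine_equal_weight(intensities):
    """
    Add up the calculated intensities of all symmetries point by point.

    intensities is a list of equally long lists, one per symmetry.
    """
    if not intensities:
        raise ValueError("at least one symmetry is required")
    return [sum(points) for points in zip(*intensities)]


# two hypothetical rotational domains, selected by a symmetry parameter
symmetries = [{'angle_azi': 0.0}, {'angle_azi': 60.0}]
calculated = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
combined = combine_equal_weight(calculated)
```

A custom combination rule would typically replace the plain sum by a weighted sum, with weights that reflect the relative abundance of the domains.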
|
||||
|
||||
|
||||
\section sec_scanning Scanning
|
||||
|
||||
PMSCO with EDAC currently supports the following scan axes.
|
||||
|
||||
- kinetic energy E
|
||||
- polar angle theta T
|
||||
- azimuthal angle phi P
|
||||
- analyser angle alpha A
|
||||
|
||||
The following combinations of these scan axes are allowed (see pmsco.data.SCANTYPES).
|
||||
|
||||
- E
|
||||
- E-T
|
||||
- E-A
|
||||
- T-P (hemispherical or hologram scan)
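As an illustration of the list above, the following sketch determines which axes vary in a scan and checks the result against the allowed combinations.
The set `ALLOWED_SCANS` is transcribed from this list and is not read from pmsco.data.SCANTYPES; the function names and the column layout are hypothetical.

```python
# Allowed combinations transcribed from the list above (assumption:
# this mirrors, but is not read from, pmsco.data.SCANTYPES).
ALLOWED_SCANS = {('E',), ('E', 'T'), ('E', 'A'), ('T', 'P')}
AXIS_ORDER = 'ETPA'


def scan_type(columns):
    """
    Return the tuple of axes that vary in a scan.

    columns maps an axis letter (E, T, P, A) to the list of its values.
    """
    varying = [axis for axis in AXIS_ORDER
               if len(set(columns.get(axis, []))) > 1]
    return tuple(varying)


def is_supported(columns):
    """Check whether the varying axes form one of the listed combinations."""
    return scan_type(columns) in ALLOWED_SCANS
```

An axis counts as fixed if its column holds a single value or is missing altogether.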
|
||||
|
||||
@attention The T and A axes cannot be combined.
|
||||
If a scan of one of them is specified, the other is assumed to be fixed at zero!
|
||||
This assumption may change in the future,
|
||||
so it is best to explicitly set the fixed angle to zero in the scan file.
|
||||
|
||||
@remark According to the measurement geometry at PEARL,
|
||||
alpha scans are implemented in EDAC as theta scans at phi = 90 in fixed cluster mode.
|
||||
The switch to fixed cluster mode is made by PMSCO internally,
|
||||
no change of angles or other parameters is necessary in the scan or project files
|
||||
besides filling the alpha instead of the theta column.
|
||||
|
||||
|
||||
\section sec_emitters Emitter Configurations
|
||||
|
||||
Since emitters contribute incoherently to the diffraction pattern,
|
||||
it should make no difference how the emitters are grouped and calculated.
|
||||
EDAC allows multiple emitters to be specified in one calculation.
|
||||
However, running EDAC multiple times for single-emitter configurations and simply summing up the results
|
||||
gives the same final diffraction pattern with no significant difference in CPU time.
|
||||
It is, thus, easy to distribute the emitters over parallel processes in a multi-process environment.
|
||||
PMSCO can handle this transparently with minimal effort.
|
||||
|
||||
Within the same framework, PMSCO also supports clusters that are tailored to a specific emitter configuration.
|
||||
Suppose that the unit cell contains a large number of inequivalent emitters.
|
||||
If all emitters had to be included in a single calculation,
|
||||
the cluster would grow very large and the calculation would take a long time
|
||||
because it would include many long scattering paths
|
||||
that effectively do not contribute intensity to the final result.
|
||||
Using single-emitter configurations, a cluster can be built locally around the emitter and kept to a reasonable size.
|
||||
|
||||
Even when using this feature, PMSCO does not require that each configuration contains only one emitter.
|
||||
The term _emitter_ effectively means _emitter configuration_.
|
||||
A configuration can include multiple emitters which will not be broken up further.
|
||||
It is up to the project what is included in a particular configuration.
|
||||
|
||||
To enable emitter handling,
|
||||
|
||||
1. override the count_emitters method of your cluster generator
|
||||
and return the number of emitter configurations of a given model, scan and symmetry.
|
||||
|
||||
2. handle the emitter index in your create_cluster method.
|
||||
|
||||
3. (optionally) override the pmsco.project.Project.combine_emitters method
|
||||
if the emitters should not be added with equal weights.
|
||||
|
||||
For implementation details see the respective method descriptions.
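The first two steps can be sketched with a stand-alone class.
The class name, the method signatures, the `index` dictionary and the treatment of -1 as "all emitters" are assumptions for illustration; the actual base class and signatures are defined by PMSCO and may differ.

```python
class ChainClusterGenerator(object):
    """
    Hypothetical stand-in for a project cluster generator.

    It does not derive from a PMSCO class so that the sketch runs on its own.
    """

    def __init__(self, emitter_positions):
        # one emitter configuration per inequivalent emitter site
        self.emitter_positions = emitter_positions

    def count_emitters(self, model, index):
        """Return the number of emitter configurations."""
        return len(self.emitter_positions)

    def create_cluster(self, model, index):
        """
        Build a cluster description for one emitter configuration.

        index['emit'] selects the configuration; in this sketch,
        a negative value selects all emitters at once.
        """
        if index['emit'] < 0:
            emitters = list(self.emitter_positions)
        else:
            emitters = [self.emitter_positions[index['emit']]]
        return {'emitters': emitters, 'radius': model['rmax']}
```

With one configuration per site, each sub-task receives a cluster that contains exactly one emitter, which keeps the cluster local and small.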
|
||||
|
||||
*/
|
||||
|
||||
@@ -0,0 +1,84 @@
|
||||
digraph G {
|
||||
compound = true;
|
||||
|
||||
/*
|
||||
subgraph cluster_project {
|
||||
label = "project";
|
||||
mode;
|
||||
domain;
|
||||
create_cluster;
|
||||
create_params;
|
||||
calc_modf;
|
||||
calc_rfac;
|
||||
comb_syms;
|
||||
comb_scans;
|
||||
}
|
||||
*/
|
||||
|
||||
subgraph cluster_model {
|
||||
label = "model handler";
|
||||
rank = same;
|
||||
model_creator [label="create model", group=creators];
|
||||
model_handler [label="evaluate results", group=handlers];
|
||||
|
||||
model_handler -> model_creator [constraint=false, label="optimize"];
|
||||
}
|
||||
|
||||
subgraph cluster_symmetry {
|
||||
label = "symmetry handler";
|
||||
rank = same;
|
||||
sym_creator [label="expand models", group=creators];
|
||||
sym_handler [label="combine symmetries", group=handlers];
|
||||
}
|
||||
|
||||
subgraph cluster_scan {
|
||||
label = "scan handler";
|
||||
rank = same;
|
||||
scan_creator [label="expand models", group=creators];
|
||||
scan_handler [label="combine scans", group=handlers];
|
||||
}
|
||||
|
||||
subgraph cluster_interface {
|
||||
label = "calculator interface"
|
||||
rank = same;
|
||||
calc_creator [label="generate input", group=creators];
|
||||
calc_handler [label="import output", group=handlers];
|
||||
}
|
||||
|
||||
calculator [label="calculator (EDAC)", shape=box];
|
||||
|
||||
model_creator -> sym_creator [label="model", style=bold];
|
||||
sym_creator -> scan_creator [label="models", style=bold];
|
||||
scan_creator -> calc_creator [label="models", style=bold];
|
||||
calc_creator -> calculator [label="clusters,\rparameters", style=bold];
|
||||
|
||||
calculator -> calc_handler [label="output files", style=bold];
|
||||
calc_handler -> scan_handler [label="raw data files", style=bold];
|
||||
scan_handler -> sym_handler [label="combined scans", style=bold];
|
||||
sym_handler -> model_handler [label="combined symmetries", style=bold];
|
||||
|
||||
mode [shape=parallelogram];
|
||||
mode -> model_creator [lhead="cluster_model"];
|
||||
|
||||
domain [shape=parallelogram];
|
||||
domain -> model_creator;
|
||||
//domain -> model_creator [lhead="cluster_model"];
|
||||
|
||||
create_cluster [shape=cds, label="cluster generator"];
|
||||
create_cluster -> calc_creator [style=dashed];
|
||||
|
||||
create_params [shape=cds, label="input file generator"];
|
||||
create_params -> calc_creator [style=dashed];
|
||||
|
||||
calc_modf [shape=cds, label="modulation function"];
|
||||
calc_modf -> model_handler [style=dashed];
|
||||
|
||||
calc_rfac [shape=cds, label="R-factor function"];
|
||||
calc_rfac -> model_handler [style=dashed];
|
||||
|
||||
comb_syms [shape=cds, label="symmetry combination rule"];
|
||||
comb_syms -> sym_handler [style=dashed];
|
||||
|
||||
comb_scans [shape=cds, label="scan combination rule"];
|
||||
comb_scans -> scan_handler [style=dashed];
|
||||
}
|
||||
@@ -0,0 +1,87 @@
|
||||
/*! @page pag_run Running PMSCO
|
||||
\section sec_run Running PMSCO
|
||||
|
||||
To run PMSCO you need the PMSCO code and its dependencies (cf. @ref pag_install),
|
||||
a code module that contains the project-specific code,
|
||||
and one or several files containing the scan parameters and experimental data.
|
||||
Please check the <code>projects</code> folder for examples of project modules.
|
||||
For a detailed description of the command line, see @ref pag_command.
|
||||
|
||||
|
||||
\subsection sec_run_single Single Process
|
||||
|
||||
Run PMSCO from the command prompt:
|
||||
|
||||
@code{.sh}
|
||||
cd work-dir
|
||||
python project-dir/project.py [pmsco-arguments] [project-arguments]
|
||||
@endcode
|
||||
|
||||
where <code>work-dir</code> is the destination directory for output files,
|
||||
<code>project.py</code> is the specific project module,
|
||||
and <code>project-dir</code> is the directory where the project file is located.
|
||||
PMSCO is run in one process which handles all calculations sequentially.
|
||||
|
||||
The command line arguments are usually divided into common arguments interpreted by the main pmsco code (pmsco.py),
|
||||
and project-specific arguments interpreted by the project module.
|
||||
However, it is ultimately up to the project module how the command line is interpreted.
|
||||
|
||||
Example command line for a single EDAC calculation of the two-atom project:
|
||||
@code{.sh}
|
||||
cd work/twoatom
|
||||
python pmsco/projects/twoatom/twoatom.py -s ea -o twoatom-demo -m single
|
||||
@endcode
|
||||
|
||||
The project file <code>twoatom.py</code> takes the lead of the project execution.
|
||||
Usually, it contains only project-specific code and delegates common tasks to the main pmsco code.
|
||||
|
||||
In the command line above, the <code>-o twoatom-demo</code> and <code>-m single</code> arguments
|
||||
are interpreted by the pmsco module.
|
||||
<code>-o</code> sets the base name of output files,
|
||||
and <code>-m</code> sets the operation mode to a single calculation.
|
||||
|
||||
The scan argument is interpreted by the project module.
|
||||
It refers to a dictionary entry that declares the scan file, the emitting atomic species, and the initial state.
|
||||
In this example, the project looks for the <code>twoatom_energy_alpha.etpai</code> scan file in the project directory,
|
||||
and calculates the modulation function for an N 1s initial state.
|
||||
The kinetic energy and emission angles are contained in the scan file.
|
||||
|
||||
|
||||
\subsection sec_run_parallel Parallel Processes
|
||||
|
||||
PMSCO handles parallelization automatically and transparently.
|
||||
To start PMSCO in a parallel environment in the login shell,
|
||||
just prefix the command with <code>mpiexec -np N</code>,
|
||||
where N is the number of processes.
|
||||
One process will assume the role of the master, and the remaining will assume the role of slaves.
|
||||
The slave processes will run the scattering calculations, while the master coordinates the tasks,
|
||||
and optimizes the model parameters (depending on the operation mode).
|
||||
|
||||
For optimum performance, the number of processes should not exceed the number of available processors.
|
||||
To start a two-hour optimization job with multiple processes on a quad-core workstation with hyperthreading:
|
||||
@code{.sh}
|
||||
cd work/my_project
|
||||
mpiexec -np 8 python project-dir/project.py -o my_job_0001 -t 2 -m swarm
|
||||
@endcode
|
||||
|
||||
|
||||
\subsection sec_run_hpc High-Performance Cluster
|
||||
|
||||
The script @c bin/qpmsco.ra.sh takes care of submitting a PMSCO job to the slurm queue of the Ra cluster at PSI.
|
||||
The script can be adapted to other machines running the slurm resource manager.
|
||||
The script generates a job script based on @c pmsco.ra.template,
|
||||
substituting the necessary environment and parameters,
|
||||
and submits it to the queue.
|
||||
|
||||
Execute @c bin/qpmsco.ra.sh without arguments to see a summary of the arguments.
|
||||
|
||||
To submit a job to the PSI clusters (see also the PEARL-Wiki page MscCalcRa),
|
||||
the command analogous to the one in the previous section would be:
|
||||
@code{.sh}
|
||||
bin/qpmsco.ra.sh my_job_0001 1 8 2 projects/my_project/project.py swarm
|
||||
@endcode
|
||||
|
||||
Be sure to consider the resource allocation policy of the cluster
|
||||
before you decide on the number of processes.
|
||||
Requesting fewer resources will prolong the run time but might increase the scheduling priority.
|
||||
*/
|
||||
@@ -0,0 +1,168 @@
|
||||
/*! @page pag_install Installation
|
||||
\section sec_install Installation
|
||||
|
||||
\subsection sec_general General Remarks
|
||||
|
||||
The PMSCO code is maintained under git.
|
||||
The central repository for PSI-internal projects is at https://git.psi.ch/pearl/pmsco,
|
||||
the public repository at https://gitlab.psi.ch/pearl/pmsco.
|
||||
|
||||
For their own developments, users should clone the repository.
|
||||
Changes to common code should be submitted via pull requests.
|
||||
|
||||
|
||||
\subsection sec_requirements Requirements
|
||||
|
||||
The recommended IDE is [PyCharm (community edition)](https://www.jetbrains.com/pycharm).
|
||||
The documentation in [Doxygen](http://www.stack.nl/~dimitri/doxygen/index.html) format is part of the source code.
|
||||
The Doxygen compiler can generate separate documentation in HTML or LaTeX.
|
||||
|
||||
The MSC and EDAC codes compile with the GNU Fortran and C++ compilers on Linux.
|
||||
Other compilers may work but have not been tested.
|
||||
The code will run in any recent Linux environment on a workstation or in a virtual machine.
|
||||
Scientific Linux, CentOS7, [Ubuntu](https://www.ubuntu.com/)
|
||||
and [Lubuntu](http://lubuntu.net/) (recommended for virtual machine) have been tested.
|
||||
For optimization jobs, a high-performance cluster with 20-50 available processor cores is recommended.
|
||||
The code requires about 2 GB of RAM per process.
|
||||
|
||||
Please note that it may be important that the code remains compatible with earlier compiler and library versions.
|
||||
Newer compilers or the latest versions of the libraries may contain features that break compatibility.
|
||||
The code can be used with newer versions as long as they are backward compatible.
|
||||
The code depends on the following libraries:
|
||||
|
||||
- GCC 4.8
|
||||
- OpenMPI 1.10
|
||||
- F2PY
|
||||
- F2C
|
||||
- SWIG
|
||||
- Python 2.7 (incompatible with Python 3.0)
|
||||
- Numpy 1.11 (incompatible with Numpy 1.13 and later)
|
||||
- MPI4PY (from PyPI)
|
||||
- BLAS
|
||||
- LAPACK
|
||||
- periodictable
|
||||
|
||||
Most of these requirements are available from the Linux distribution, or from PyPI (pip install), respectively.
|
||||
If there are any issues with the packages installed by the distribution, try the ones from PyPI
|
||||
(e.g. there is currently a bug in the Debian mpi4py package).
|
||||
The F2C source code is contained in the repository for machines which don't have it installed.
|
||||
On the PSI cluster machines, the environment must be set using the module system and conda (on Ra).
|
||||
Details are explained in the PEARL Wiki.
|
||||
|
||||
\subsubsection sec_install_ubuntu Installation on Ubuntu 16.04
|
||||
|
||||
The following instructions install the necessary dependencies on Ubuntu (or Lubuntu) 16.04:
|
||||
|
||||
@code{.sh}
|
||||
sudo apt-get update
|
||||
|
||||
sudo apt-get install \
|
||||
binutils \
|
||||
build-essential \
|
||||
doxygen \
|
||||
doxypy \
|
||||
f2c \
|
||||
g++ \
|
||||
gcc \
|
||||
gfortran \
|
||||
git \
|
||||
graphviz \
|
||||
ipython \
|
||||
libopenmpi-dev \
|
||||
make \
|
||||
openmpi-bin \
|
||||
openmpi-common \
|
||||
python-all \
|
||||
python-mock \
|
||||
python-nose \
|
||||
python-numpy \
|
||||
python-pip \
|
||||
python-scipy \
|
||||
python2.7-dev \
|
||||
swig
|
||||
|
||||
sudo pip install --system mpi4py periodictable
|
||||
|
||||
cd /usr/lib
|
||||
sudo ln -s /usr/lib/libblas/libblas.so.3 libblas.so
|
||||
@endcode
|
||||
|
||||
The following instructions install the PyCharm IDE and a few other useful utilities:
|
||||
|
||||
@code{.sh}
|
||||
sudo sh -c 'echo "deb http://archive.getdeb.net/ubuntu xenial-getdeb apps" >> /etc/apt/sources.list.d/getdeb.list'
|
||||
wget -q -O - http://archive.getdeb.net/getdeb-archive.key | sudo apt-key add -
|
||||
sudo apt-get update
|
||||
sudo apt-get install \
|
||||
avogadro \
|
||||
gitg \
|
||||
meld \
|
||||
openjdk-9-jdk \
|
||||
pycharm
|
||||
@endcode
|
||||
|
||||
To produce documentation in PDF format (not recommended on virtual machine), install LaTeX:
|
||||
|
||||
@code{.sh}
|
||||
sudo apt-get install texlive-latex-recommended
|
||||
@endcode
|
||||
|
||||
|
||||
\subsection sec_compile Compilation
|
||||
|
||||
Make sure you have access to the PMSCO Git repository and set up your Git environment.
|
||||
Depending on your setup, location and permissions, one of the following addresses may work.
|
||||
Private key authentication is usually recommended except on shared computers.
|
||||
|
||||
| Repository | Access |
| --- | --- |
| `git@git.psi.ch:pearl/pmsco.git` | PSI intranet, SSH private key authentication |
| `https://git.psi.ch/pearl/pmsco.git` | PSI intranet, password prompt |
| `git@gitlab.psi.ch:pearl/pmsco.git` | Public repository, SSH private key authentication |
| `https://gitlab.psi.ch/pearl/pmsco.git` | Public repository, password prompt |
|
||||
|
||||
Clone the code repository using one of these repository addresses and switch to the desired branch:
|
||||
|
||||
@code{.sh}
|
||||
cd ~
|
||||
git clone git@git.psi.ch:pearl/pmsco.git pmsco
|
||||
cd pmsco
|
||||
git checkout master
|
||||
git checkout -b my_branch
|
||||
@endcode
|
||||
|
||||
The compilation of the various modules is started by <code>make all</code>.
|
||||
The compilation step is necessary only once after installation.
|
||||
|
||||
If the compilation of _loess.so fails due to a missing BLAS library,
|
||||
try creating a symbolic link to the BLAS library as follows (the file names may vary depending on the distribution and version):
|
||||
@code{.sh}
|
||||
cd /usr/lib
|
||||
sudo ln -s /usr/lib/libblas/libblas.so.3 libblas.so
|
||||
@endcode
|
||||
|
||||
|
||||
\subsection sec_test Tests
|
||||
|
||||
Run the unit tests.
|
||||
They should pass successfully.
|
||||
Re-check from time to time.
|
||||
|
||||
@code{.sh}
|
||||
cd ~/pmsco
|
||||
nosetests
|
||||
@endcode
|
||||
|
||||
Run the twoatom project to check the compilation of the calculation programs.
|
||||
|
||||
@code{.sh}
|
||||
cd ~/pmsco
|
||||
mkdir work
|
||||
cd work
|
||||
mkdir twoatom
|
||||
cd twoatom/
|
||||
nice python ~/pmsco/projects/twoatom/twoatom.py -s ~/pmsco/projects/twoatom/twoatom_energy_alpha.etpai -o twoatom_energy_alpha -m single
|
||||
@endcode
|
||||
|
||||
To learn more about running PMSCO, see @ref pag_run.
|
||||
*/
|
||||
@@ -0,0 +1,61 @@
|
||||
/*! @mainpage Introduction
|
||||
\section sec_intro Introduction
|
||||
|
||||
PMSCO stands for PEARL multiple-scattering cluster calculations and structural optimization.
|
||||
It is a collection of computer programs to calculate photoelectron diffraction patterns,
|
||||
and to optimize structural models based on measured data.
|
||||
|
||||
The actual scattering calculation is done by code developed by other parties.
|
||||
While the scattering program typically calculates a diffraction pattern based on a set of static parameters and a specific coordinate file in a single process,
|
||||
PMSCO wraps around that program to facilitate parameter handling, cluster building, structural optimization and parallel processing.
|
||||
|
||||
In the current version, the [EDAC](http://garciadeabajos-group.icfo.es/widgets/edac/) code
|
||||
developed by F. J. García de Abajo, M. A. Van Hove, and C. S. Fadley (1999) is used for scattering calculations.
|
||||
Other code can be integrated as well.
|
||||
Initially, support for the MSC program by Kaduwela, Friedman, and Fadley was planned but is currently not maintained.
|
||||
PMSCO is written in Python 2.7.
|
||||
EDAC is written in C++, MSC in Fortran.
|
||||
PMSCO interacts with the calculation programs through Python wrappers for C++ or Fortran.
|
||||
|
||||
The MSC and EDAC source code is contained in the same software repository.
|
||||
The PMSCO, MSC, and EDAC programs may not be used outside the PEARL group without an explicit agreement by the respective original authors.
|
||||
Users of the PMSCO code are requested to coordinate and share the development of the code with the original author.
|
||||
Please read and respect the respective license agreements.
|
||||
|
||||
|
||||
\section sec_intro_highlights Highlights
|
||||
|
||||
- angle or energy scanned XPD.
|
||||
- various scanning modes including energy, polar angle, azimuthal angle, analyser angle.
|
||||
- averaging over multiple symmetries (domains or emitters).
|
||||
- global optimization of multiple scans.
|
||||
- structural optimization algorithms: particle swarm optimization, grid search, gradient search.
|
||||
- calculation of the modulation function.
|
||||
- calculation of the weighted R-factor.
|
||||
- automatic parallel processing using OpenMPI.
|
||||
|
||||
|
||||
\section sec_project Optimization Projects
|
||||
|
||||
To set up a new optimization project, you need to:
|
||||
|
||||
- create a new directory under projects.
|
||||
- create a new Python module in this directory, e.g., my_project.py.
|
||||
- implement a sub-class of project.Project in my_project.py.
|
||||
- override the create_cluster, create_params, and create_domain methods.
|
||||
- optionally, override the combine_symmetries and combine_scans methods.
|
||||
- add a global function create_project to my_project.py.
|
||||
- provide experimental data files (intensity or modulation function).
|
||||
|
||||
For details, see the documentation of the Project class,
|
||||
and the example projects.
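The steps above can be sketched as a module skeleton.
The `Project` class below is only a placeholder for pmsco.project.Project so that the sketch runs on its own; the method signatures, return values, parameter name `dAB` and atom types are assumptions and do not reproduce the PMSCO interface.

```python
class Project(object):
    """Placeholder for pmsco.project.Project (assumption, not the real class)."""

    def __init__(self):
        # at least one symmetry is required for a calculation to run
        self.symmetries = [{}]


class MyProject(Project):
    """Hypothetical project; all signatures are assumptions for illustration."""

    def create_domain(self):
        # limits of the optimizable model parameters
        return {'dAB': {'start': 2.1, 'min': 1.8, 'max': 2.4}}

    def create_cluster(self, model, index):
        # atomic coordinates derived from the model parameters
        return [('N', 0.0, 0.0, 0.0), ('Ni', 0.0, 0.0, -model['dAB'])]

    def create_params(self, model, index):
        # input parameters of the scattering program
        return {'title': 'my_project', 'model': dict(model)}


def create_project():
    """Global factory function of the project module."""
    return MyProject()
```

In a real project, the class derives from the PMSCO base class and the three methods return the objects expected by the respective PMSCO interfaces.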
|
||||
|
||||
|
||||
\section sec_intro_start Getting Started
|
||||
|
||||
- @ref pag_concepts
|
||||
- @ref pag_install
|
||||
- @ref pag_run
|
||||
- @ref pag_command
|
||||
|
||||
*/
|
||||
@@ -0,0 +1,51 @@
|
||||
digraph "modules" {
|
||||
node [fillcolor="transparent"];
|
||||
|
||||
main [label="__main__.py"];
|
||||
pmsco [label="pmsco.py"];
|
||||
project [label="project.py"];
|
||||
dispatch [label="dispatch.py"];
|
||||
handlers [label="handlers.py"];
|
||||
gradient [label="gradient.py"];
|
||||
grid [label="grid.py"];
|
||||
swarm [label="swarm.py"];
|
||||
cluster [label="cluster.py"];
|
||||
data [label="data.py"];
|
||||
|
||||
calc_interface [label="calc_interface.py"];
|
||||
edac_interface [label="edac_interface.py"];
|
||||
edac [label="_edac.so"];
|
||||
loess [label="_loess.so"];
|
||||
|
||||
custom [label="custom.py", fillcolor="red"];
|
||||
|
||||
main -> pmsco;
|
||||
|
||||
pmsco -> project;
|
||||
pmsco -> swarm;
|
||||
pmsco -> grid;
|
||||
pmsco -> gradient;
|
||||
pmsco -> dispatch;
|
||||
|
||||
project -> loess;
|
||||
project -> cluster;
|
||||
project -> data;
|
||||
|
||||
dispatch -> calc_interface;
|
||||
dispatch -> handlers;
|
||||
|
||||
handlers -> project;
|
||||
|
||||
gradient -> handlers;
|
||||
grid -> handlers;
|
||||
swarm -> handlers;
|
||||
|
||||
calc_interface -> edac_interface;
|
||||
edac_interface -> data;
|
||||
edac_interface -> cluster;
|
||||
edac_interface -> edac;
|
||||
|
||||
custom -> project;
|
||||
custom -> cluster;
|
||||
custom -> data;
|
||||
}
|
||||
@@ -0,0 +1,27 @@
|
||||
digraph "processes" {
|
||||
|
||||
optimizer;
|
||||
symmetrizer;
|
||||
parallelizer;
|
||||
comparator;
|
||||
cluster_gen [label="cluster generator"];
|
||||
|
||||
{
|
||||
rank="same";
|
||||
edac1 [label="EDAC 1"];
|
||||
edac2 [label="EDAC 2"];
|
||||
edacN [label="EDAC N"];
|
||||
edac2 -> edacN [style="dotted", dir="none"];
|
||||
}
|
||||
|
||||
optimizer -> symmetrizer;
|
||||
symmetrizer -> scanner [label="N"];
|
||||
scanner -> parallelizer [label="N x M"];
|
||||
parallelizer -> cluster_gen;
|
||||
parallelizer -> edac1;
|
||||
parallelizer -> edac2;
|
||||
parallelizer -> edacN;
|
||||
|
||||
optimizer -> comparator;
|
||||
|
||||
}
|
||||
@@ -0,0 +1,95 @@
|
||||
digraph "tasks" {
|
||||
nodesep=0.3;
|
||||
node [fillcolor="transparent", width=1.0, height=0.7];
|
||||
//node [fillcolor="transparent", height=0.7];
|
||||
newrank=true;
|
||||
compound=true;
|
||||
splines=false;
|
||||
|
||||
//{rank=same;
|
||||
initial [shape=note, label="initial\nparameters"];
|
||||
result [shape=note, label="optimized\nparameters"];
|
||||
data [shape=note, label="experimental\ndata"];
|
||||
//}
|
||||
|
||||
subgraph cluster_model {
|
||||
shape=rect;
|
||||
rank=same;
|
||||
label="model handler";
|
||||
create_model [label="generate\nmodel parameters"];
|
||||
evaluate_model [label="evaluate\nmodel"];
|
||||
}
|
||||
custom_modf [label="modulation\nfunction", shape=cds];
|
||||
{rank=same; create_model; evaluate_model; custom_modf;}
|
||||
custom_modf -> evaluate_model [lhead=cluster_model];
|
||||
initial -> create_model;
|
||||
data -> evaluate_model;
|
||||
result -> evaluate_model [dir=back];
|
||||
create_model -> result [dir=back];
|
||||
|
||||
|
||||
subgraph cluster_scan {
|
||||
label="scan handler";
|
||||
rank=same;
|
||||
create_scan [label="define\nscan\ntasks"];
|
||||
combine_scan [label="gather\nscan\nresults"];
|
||||
}
|
||||
custom_scan [label="scan\nconfiguration", shape=note];
|
||||
{rank=same; custom_scan; create_scan; combine_scan;}
|
||||
custom_scan -> create_scan [lhead=cluster_scan];
|
||||
|
||||
subgraph cluster_symmetry {
|
||||
label="symmetry handler";
|
||||
rank=same;
|
||||
create_symmetry [label="define\nsymmetry\ntasks"];
|
||||
combine_symmetry [label="gather\nsymmetry\nresults"];
|
||||
}
|
||||
custom_symmetry [label="symmetry\ndefinition", shape=cds];
|
||||
{rank=same; create_symmetry; combine_symmetry; custom_symmetry;}
|
||||
custom_symmetry -> combine_symmetry [lhead=cluster_symmetry];
|
||||
|
||||
subgraph cluster_emitter {
|
||||
label="emitter handler";
|
||||
rank=same;
|
||||
create_emitter [label="define\nemitter\ntasks"];
|
||||
combine_emitter [label="gather\nemitter\nresults"];
|
||||
}
|
||||
custom_emitter [label="emitter\nconfiguration", shape=cds];
|
||||
{rank=same; custom_emitter; create_emitter; combine_emitter;}
|
||||
custom_emitter -> combine_emitter [lhead=cluster_emitter];
|
||||
|
||||
subgraph cluster_region {
|
||||
label="region handler";
|
||||
rank=same;
|
||||
create_region [label="define\nregion\ntasks"];
|
||||
combine_region [label="gather\nregion\nresults"];
|
||||
}
|
||||
custom_region [label="scan\nconfiguration", shape=note];
|
||||
{rank=same; custom_region; create_region; combine_region;}
|
||||
custom_region -> create_region [lhead=cluster_region];
|
||||
|
||||
|
||||
subgraph cluster_edac {
|
||||
label="parallel computing";
|
||||
edac [label=EDAC, peripheries=5];
|
||||
}
|
||||
create_cluster [label="cluster\ngenerator", shape=cds];
|
||||
{rank=same; create_cluster; edac;}
|
||||
create_cluster -> edac;
|
||||
|
||||
create_model -> create_scan [label="level 1 tasks"];
|
||||
evaluate_model -> combine_scan [label="level 1 results", dir=back];
|
||||
|
||||
create_scan -> create_symmetry [label="level 2 tasks"];
|
||||
combine_scan -> combine_symmetry [label="level 2 results", dir=back];
|
||||
|
||||
create_symmetry -> create_emitter [label="level 3 tasks"];
|
||||
combine_symmetry -> combine_emitter [label="level 3 results", dir=back];
|
||||
|
||||
create_emitter -> create_region [label="level 4 tasks"];
|
||||
combine_emitter -> combine_region [label="level 4 results", dir=back];
|
||||
|
||||
create_region -> edac [label="level 5 tasks"];
|
||||
combine_region -> edac [label="level 5 results", dir=back];
|
||||
|
||||
}
|
||||
@@ -0,0 +1,10 @@
|
||||
digraph "tasks" {
|
||||
node [fillcolor="transparent", width=1.0, height=0.7];
|
||||
|
||||
data [shape=note, label="input\noutput"];
|
||||
task [label="process\nunit", shape=box];
|
||||
custom [label="user\ncode", shape="cds"];
|
||||
process [label="process"];
|
||||
|
||||
task -> process -> custom -> data [style=invis];
|
||||
}
|
||||
@@ -0,0 +1,36 @@
|
||||
SHELL=/bin/sh
|
||||
|
||||
# makefile for all programs, modules and documentation
|
||||
#
|
||||
# required libraries for LOESS module: libblas, liblapack, libf2c
|
||||
# (you may have to set soft links so that linker finds them)
|
||||
#
|
||||
# on shared computing systems (high-performance clusters)
|
||||
# you may have to switch the environment before running this script.
|
||||
#
|
||||
# note: the public distribution does not include third-party code
|
||||
# (EDAC in particular) because of incompatible license terms.
|
||||
# please obtain such code from the original authors
|
||||
# and copy it to the proper directory before compilation.
|
||||
#
|
||||
# the MSC and MUFPOT programs are currently not used.
|
||||
# they are not built by the top-level targets all and bin.
|
||||
|
||||
.PHONY: all bin docs clean edac loess msc mufpot
|
||||
|
||||
PMSCO_DIR = pmsco
|
||||
DOCS_DIR = docs
|
||||
|
||||
all: edac loess docs
|
||||
|
||||
bin: edac loess
|
||||
|
||||
edac loess msc mufpot:
|
||||
$(MAKE) -C $(PMSCO_DIR)
|
||||
|
||||
docs:
|
||||
$(MAKE) -C $(DOCS_DIR)
|
||||
|
||||
clean:
|
||||
$(MAKE) -C $(PMSCO_DIR) clean
|
||||
$(MAKE) -C $(DOCS_DIR) clean
|
||||
@@ -0,0 +1,17 @@
|
||||
"""
|
||||
@package pmsco.__main__
|
||||
__main__ module
|
||||
|
||||
thanks to this small module you can go to the project directory and run PMSCO like this:
|
||||
@verbatim
|
||||
python pmsco [pmsco-arguments]
|
||||
@endverbatim
|
||||
"""
|
||||
|
||||
import pmsco
|
||||
import sys
|
||||
|
||||
if __name__ == '__main__':
|
||||
args, unknown_args = pmsco.parse_cli()
|
||||
pmsco.main_pmsco(args, unknown_args)
|
||||
sys.exit(0)
|
||||
@@ -0,0 +1,131 @@
|
||||
"""
|
||||
@package pmsco.calculator
|
||||
abstract scattering program interface.
|
||||
|
||||
this module declares the basic interface to scattering programs.
|
||||
for each scattering program (EDAC, MSC, SSC, ...) a specific interface must be derived from Calculator.
|
||||
the derived interface must implement the run() method.
|
||||
the run() method and the scattering code may use only the parameters declared in the interface.
|
||||
|
||||
TestCalculator is provided for testing the PMSCO code quickly without calling an external program.
|
||||
|
||||
@author Matthias Muntwiler
|
||||
|
||||
@copyright (c) 2015 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
import time
|
||||
import numpy as np
|
||||
import data as md
|
||||
import cluster as mc
|
||||
|
||||
__author__ = 'matthias muntwiler'
|
||||
|
||||
|
||||
class Calculator(object):
|
||||
"""
|
||||
Interface class to the calculation program.
|
||||
"""
|
||||
def run(self, params, cluster, scan, output_file):
|
||||
"""
|
||||
run a calculation with the given parameters and cluster.
|
||||
|
||||
the result is returned as the method result and in a file named <code>output_file + '.etpi'</code>,
|
||||
or <code>output_file + '.etpai'</code> depending on scan mode.
|
||||
all other intermediate files are deleted unless keep_temp_files is True.
|
||||
|
||||
@param params: a pmsco.project.Params() object with all necessary values except cluster and output files set.
|
||||
|
||||
@param cluster: a pmsco.cluster.Cluster() object with all atom positions set.
|
||||
|
||||
@param scan: a pmsco.project.Scan() object describing the experimental scanning scheme.
|
||||
|
||||
@param output_file: base name for all intermediate and output files
|
||||
|
||||
@return: result_file, files_cats
|
||||
@arg result_file is the name of the main ETPI or ETPAI result file to be further processed.
|
||||
@arg files_cats is a dictionary that lists the names of all created data files with their category.
|
||||
the dictionary key is the file name,
|
||||
the value is the file category (cluster, phase, etc.).
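
the following is a minimal sketch of how a derived class might honor this return contract.
`Calculator` is redefined locally as a stand-in so that the sketch runs on its own.
`ConstantCalculator`, `FakeScan` and the dummy file content are hypothetical names for illustration only.
the rule that an 'a' in the scan mode selects the .etpai extension is an assumption borrowed from TestCalculator in this module.

```python
import os
import tempfile


class Calculator(object):
    # local stand-in for the base class so that the sketch runs on its own
    def run(self, params, cluster, scan, output_file):
        return None, None


class FakeScan(object):
    # hypothetical scan object: only the mode attribute is used here
    def __init__(self, mode):
        self.mode = mode


class ConstantCalculator(Calculator):
    # hypothetical calculator that writes one dummy value instead of calling a program
    def run(self, params, cluster, scan, output_file):
        # assumption: an 'a' in the scan mode selects the .etpai extension
        extension = ".etpai" if 'a' in scan.mode else ".etpi"
        result_file = output_file + extension
        with open(result_file, "w") as f:
            f.write("1.0")
        # key = file name, value = file category
        files_cats = {result_file: 'energy'}
        return result_file, files_cats


base_name = os.path.join(tempfile.mkdtemp(), "demo")
result_file, files_cats = ConstantCalculator().run(None, None, FakeScan(['e']), base_name)
```

a real interface would write the cluster and parameter files, call the scattering program, and list every created file in the dictionary.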
|
||||
"""
|
||||
return None, None
|
||||
|
||||
def check_cluster(self, cluster, output_file):
|
||||
"""
|
||||
export the cluster in XYZ format for reference.
|
||||
|
||||
along with the complete cluster, the method also saves cuts in the xz (extension .y.xyz) and yz (.x.xyz) plane.
|
||||
|
||||
@param cluster: a pmsco.cluster.Cluster() object with all atom positions set.
|
||||
|
||||
@param output_file: base name for all intermediate and output files
|
||||
|
||||
@return: dictionary listing the names of the created files with their category.
|
||||
the dictionary key is the file name,
|
||||
the value is the file category (cluster).
|
||||
|
||||
@warning experimental: this method may be moved elsewhere in a future version.
|
||||
"""
|
||||
xyz_filename = output_file + ".xyz"
|
||||
cluster.save_to_file(xyz_filename, fmt=mc.FMT_XYZ)
|
||||
files = {xyz_filename: 'cluster'}
|
||||
|
||||
clucut = mc.Cluster()
|
||||
clucut.copy_from(cluster)
|
||||
clucut.trim_slab("x", 0.0, 0.1)
|
||||
xyz_filename = output_file + ".x.xyz"
|
||||
clucut.save_to_file(xyz_filename, fmt=mc.FMT_XYZ)
|
||||
files[xyz_filename] = 'cluster'
|
||||
|
||||
clucut.copy_from(cluster)
|
||||
clucut.trim_slab("y", 0.0, 0.1)
|
||||
xyz_filename = output_file + ".y.xyz"
|
||||
clucut.save_to_file(xyz_filename, fmt=mc.FMT_XYZ)
|
||||
files[xyz_filename] = 'cluster'
|
||||
|
||||
return files
|
||||
|
||||
|
||||
class TestCalculator(Calculator):
|
||||
"""
|
||||
interface class producing random data for testing the PMSCO code without calling an external program.
|
||||
"""
|
||||
def run(self, params, cluster, scan, output_file):
|
||||
"""
|
||||
produce a random test data set.
|
||||
|
||||
the scan scheme is generated from the given parameters.
|
||||
the intensities are random values.
|
||||
|
||||
@return: result_file, files_cats
|
||||
the result file contains an ETPI or ETPAI array with random intensity data.
|
||||
"""
|
||||
|
||||
# set up scan
|
||||
params.fixed_cluster = 'a' in scan.mode
|
||||
|
||||
# generate file names
|
||||
base_filename = output_file
|
||||
clu_filename = base_filename + ".clu"
|
||||
if params.fixed_cluster:
|
||||
etpi_filename = base_filename + ".etpai"
|
||||
else:
|
||||
etpi_filename = base_filename + ".etpi"
|
||||
|
||||
cluster.save_to_file(clu_filename)
|
||||
|
||||
# generate data and save in ETPI or ETPAI format
|
||||
result_etpi = scan.raw_data.copy()
|
||||
result_etpi['i'] = np.random.random_sample(result_etpi.shape)
|
||||
|
||||
# slow down the test for debugging
|
||||
time.sleep(5)
|
||||
|
||||
md.save_data(etpi_filename, result_etpi)
|
||||
|
||||
files = {clu_filename: 'cluster', etpi_filename: 'energy'}
|
||||
return etpi_filename, files
|
||||
@@ -0,0 +1,785 @@
|
||||
"""
|
||||
@package pmsco.cluster
|
||||
cluster tools for MSC and EDAC
|
||||
|
||||
the Cluster class is provided to facilitate the construction and import/export of clusters.
|
||||
a cluster can be built by adding single atoms, layers, or a half-space bulk lattice.
|
||||
the class can import from/export to EDAC, MSC, and XYZ cluster files.
|
||||
XYZ allows for export to 3D visualizers, e.g. Avogadro.
|
||||
|
||||
@pre requires the periodictable package (https://pypi.python.org/pypi/periodictable)
|
||||
@code{.sh}
|
||||
pip install --user periodictable
|
||||
@endcode
|
||||
|
||||
@author Matthias Muntwiler
|
||||
|
||||
@copyright (c) 2015 by Paul Scherrer Institut
|
||||
"""
|
||||
|
||||
import math
|
||||
import numpy as np
|
||||
import periodictable as pt
|
||||
|
||||
## default file format identifier
|
||||
FMT_DEFAULT = 0
|
||||
## MSC file format identifier
|
||||
FMT_MSC = 1
|
||||
## EDAC file format identifier
|
||||
FMT_EDAC = 2
|
||||
## XYZ file format identifier
|
||||
FMT_XYZ = 3
|
||||
|
||||
## numpy.array datatype of Cluster.data array
|
||||
DTYPE_CLUSTER_INTERNAL = [('i','i4'), ('t','i4'), ('s','a2'), ('x','f4'), ('y','f4'), ('z','f4'), ('e','u1')]
|
||||
## file format of internal Cluster.data array
|
||||
FMT_CLUSTER_INTERNAL = ["%5u", "%2u", "%s", "%7.3f", "%7.3f", "%7.3f", "%1u"]
|
||||
## field (column) names of internal Cluster.data array
|
||||
FIELDS_CLUSTER_INTERNAL = ['i','t','s','x','y','z','e']
|
||||
|
||||
## numpy.array datatype of cluster for MSC cluster file input/output
|
||||
DTYPE_CLUSTER_MSC = [('i','i4'), ('x','f4'), ('y','f4'), ('z','f4'), ('t','i4')]
|
||||
## file format of MSC cluster file
|
||||
FMT_CLUSTER_MSC = ["%5u", "%7.3f", "%7.3f", "%7.3f", "%2u"]
|
||||
## field (column) names of MSC cluster file
|
||||
FIELDS_CLUSTER_MSC = ['i','x','y','z','t']
|
||||
|
||||
## numpy.array datatype of cluster for EDAC cluster file input/output
|
||||
DTYPE_CLUSTER_EDAC= [('i','i4'), ('t','i4'), ('x','f4'), ('y','f4'), ('z','f4')]
|
||||
## file format of EDAC cluster file
|
||||
FMT_CLUSTER_EDAC = ["%5u", "%2u", "%7.3f", "%7.3f", "%7.3f"]
|
||||
## field (column) names of EDAC cluster file
|
||||
FIELDS_CLUSTER_EDAC = ['i','t','x','y','z']
|
||||
|
||||
## numpy.array datatype of cluster for XYZ file input/output
|
||||
DTYPE_CLUSTER_XYZ= [('s','a2'), ('x','f4'), ('y','f4'), ('z','f4')]
|
||||
## file format of XYZ cluster file
|
||||
FMT_CLUSTER_XYZ = ["%s", "%10.5f", "%10.5f", "%10.5f"]
|
||||
## field (column) names of XYZ cluster file
|
||||
FIELDS_CLUSTER_XYZ = ['s','x','y','z']
|
||||
|
||||
|
||||
class Cluster(object):
|
||||
"""
|
||||
Represents a cluster of atoms by their coordinates and chemical element.
|
||||
|
||||
the object stores the following information per atom in the @ref data array:
|
||||
|
||||
- sequential atom index (1-based)
|
||||
- atom type (chemical element number)
|
||||
- chemical element symbol
|
||||
- x coordinate of the atom position
|
||||
- y coordinate of the atom position
|
||||
- z coordinate of the atom position
|
||||
- emitter flag
|
||||
|
||||
the class also defines methods that add or manipulate atoms of the cluster.
|
||||
see most importantly the set_rmax, add_atom, add_layer and add_bulk functions.
|
||||
emitters can be flagged by the set_emitter method.
|
||||
|
||||
you may also manipulate the data array directly.
|
||||
in this case, be sure to keep the data array consistent.
|
||||
the update methods can help to recreate the index, atom type or symbol columns.
|
||||
|
||||
the class can also load and save files in some simple formats.
|
||||
"""
|
||||
|
||||
## @var rmax
|
||||
# maximum distance of atoms from the origin.
|
||||
#
|
||||
# float, default = 0
|
||||
#
|
||||
# this parameter restricts the addition of new atoms.
|
||||
# changing the parameter does not affect existing atoms.
|
||||
# the default is 0 (no atom will be added!).
|
||||
# you must set this parameter explicitly!
|
||||
|
||||
## @var dtype
|
||||
# data type of the internal numpy.ndarray.
|
||||
|
||||
## @var file_format
|
||||
# default file format.
|
||||
#
|
||||
# must be one of the FMT_MSC, FMT_EDAC, FMT_XYZ constants.
|
||||
# the initial value is FMT_XYZ.
|
||||
|
||||
## @var data
|
||||
# structured numpy array holding the atom positions.
|
||||
#
|
||||
# the columns of the array are:
|
||||
# @arg @c 'i' (int) atom index (1-based)
|
||||
# @arg @c 't' (int) atom type (chemical element number)
|
||||
# @arg @c 's' (string) chemical element symbol
|
||||
# @arg @c 'x' (float32) x coordinate of the atom position
|
||||
# @arg @c 'y' (float32) y coordinate of the atom position
|
||||
# @arg @c 'z' (float32) z coordinate of the atom position
|
||||
# @arg @c 'e' (uint8) 1 = emitter, 0 = regular atom
|
||||
|
||||
## @var comment (str)
|
||||
# one-line comment that can be included in some cluster files
|
||||
|
||||
def __init__(self):
|
||||
self.data = None
|
||||
self.rmax = 0.0
|
||||
self.dtype = DTYPE_CLUSTER_INTERNAL
|
||||
self.file_format = FMT_XYZ
|
||||
self.comment = ""
|
||||
self.clear()
|
||||
|
||||
def clear(self):
|
||||
"""
|
||||
Remove all atoms from the cluster.
|
||||
"""
|
||||
n_atoms = 0
|
||||
self.data = np.zeros(n_atoms, dtype=self.dtype)
|
||||
|
||||
def copy_from(self, cluster):
|
||||
"""
|
||||
Copy the data from another cluster.
|
||||
|
||||
@param cluster (Cluster): other Cluster object.
|
||||
"""
|
||||
self.data = cluster.data.copy()
|
||||
|
||||
def set_rmax(self, r):
|
||||
"""
|
||||
set rmax, the maximum distance of atoms from the origin.
|
||||
|
||||
atoms with norm greater than rmax will not be added to the cluster
|
||||
by the add_layer() and add_bulk() methods.
|
||||
existing atoms are not affected when changing rmax.
|
||||
|
||||
you must set this parameter explicitly, as the default value is 0
|
||||
(no atom will be added)!
|
||||
"""
|
||||
self.rmax = r
|
||||
|
||||
def build_element(self, index, element_number, x, y, z, emitter):
|
||||
"""
|
||||
build a tuple in the format of the internal data array.
|
||||
|
||||
@param index: (int) index
|
||||
|
||||
@param element_number: (int) chemical element number
|
||||
|
||||
@param x, y, z: (float) atom coordinates in the cluster
|
||||
|
||||
@param emitter: (uint) 1 = emitter, 0 = regular
|
||||
"""
|
||||
symbol = pt.elements[element_number].symbol
|
||||
element = (index, element_number, symbol, x, y, z, emitter)
|
||||
return element
|
||||
|
||||
def add_atom(self, atomtype, v_pos, is_emitter):
|
||||
"""
|
||||
add a single atom to the cluster.
|
||||
|
||||
@param atomtype: (int) chemical element number
|
||||
|
||||
@param v_pos: (numpy.ndarray, shape = (3)) position vector
|
||||
|
||||
@param is_emitter: (uint) 1 = emitter, 0 = regular
|
||||
"""
|
||||
n0 = self.data.shape[0] + 1
|
||||
element = self.build_element(n0, atomtype, v_pos[0], v_pos[1], v_pos[2], is_emitter)
|
||||
self.data = np.append(self.data, np.array(element,
|
||||
dtype=self.data.dtype))
|
||||
|
||||
def add_layer(self, atomtype, v_pos, v_lat1, v_lat2):
|
||||
"""
|
||||
add a layer of atoms to the cluster.
|
||||
|
||||
the layer is expanded up to the limit given by
|
||||
self.rmax (maximum distance from the origin).
|
||||
all atoms are non-emitters.
|
||||
|
||||
@param atomtype: (int) chemical element number
|
||||
|
||||
@param v_pos: (numpy.ndarray, shape = (3))
|
||||
position vector of the first atom (basis vector)
|
||||
|
||||
@param v_lat1, v_lat2: (numpy.ndarray, shape = (3))
|
||||
lattice vectors.
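
the expansion described above can be sketched without numpy as follows.
`layer_positions` is a hypothetical helper, not part of this class; it only returns the positions and does not build data records.
the index range follows the same formula as this method, under the assumption that it is generous enough to cover the sphere of radius rmax.

```python
import math


def layer_positions(v_pos, v_lat1, v_lat2, rmax):
    # hypothetical helper: expand a 2D lattice and keep positions within rmax
    def norm(v):
        return math.sqrt(sum(c * c for c in v))

    # generous index range so that the whole sphere is covered
    r_great = max(rmax, norm(v_pos))
    n1 = max(int(r_great / norm(v_lat1)) + 1, 3) * 2
    n2 = max(int(r_great / norm(v_lat2)) + 1, 3) * 2

    positions = []
    for i1 in range(-n1, n1 + 1):
        for i2 in range(-n2, n2 + 1):
            v = tuple(p + a * i1 + b * i2 for p, a, b in zip(v_pos, v_lat1, v_lat2))
            if norm(v) <= rmax:
                positions.append(v)
    return positions


# square lattice with unit spacing in the z = 0 plane
square = layer_positions((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), rmax=1.5)
# with rmax = 0, an atom away from the origin is never accepted
empty = layer_positions((0.5, 0.5, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), rmax=0.0)
```

the second call illustrates why rmax must be set before adding a layer.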
|
||||
"""
|
||||
r_great = max(self.rmax, np.linalg.norm(v_pos))
|
||||
n0 = self.data.shape[0] + 1
|
||||
n1 = max(int(r_great / np.linalg.norm(v_lat1)) + 1, 3) * 2
|
||||
n2 = max(int(r_great / np.linalg.norm(v_lat2)) + 1, 3) * 2
|
||||
nn = 0
|
||||
buf = np.empty((2 * n1 + 1) * (2 * n2 + 1), dtype=self.dtype)
|
||||
for i1 in range(-n1, n1 + 1):
|
||||
for i2 in range(-n2, n2 + 1):
|
||||
v = v_pos + v_lat1 * i1 + v_lat2 * i2
|
||||
if np.linalg.norm(v) <= self.rmax:
|
||||
buf[nn] = self.build_element(nn + n0, atomtype, v[0], v[1], v[2], 0)
|
||||
nn += 1
|
||||
buf = np.resize(buf, nn)
|
||||
self.data = np.append(self.data, buf)
|
||||
|
||||
def add_bulk(self, atomtype, v_pos, v_lat1, v_lat2, v_lat3, z_surf=0.0):
|
||||
"""
|
||||
add bulk atoms to the cluster.
|
||||
|
||||
the lattice is expanded up to the limits given by
|
||||
self.rmax (maximum distance from the origin)
|
||||
and z_surf (position of the surface).
|
||||
all atoms are non-emitters.
|
||||
|
||||
@param atomtype: (int) chemical element number
|
||||
|
||||
@param v_pos: (numpy.ndarray, shape = (3))
|
||||
position vector of the first atom (basis vector)
|
||||
|
||||
@param v_lat1, v_lat2, v_lat3: (numpy.ndarray, shape = (3))
|
||||
lattice vectors.
|
||||
|
||||
@param z_surf: (float) position of surface.
|
||||
atoms with z > z_surf are not added.
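
the two limits can be illustrated by a small predicate.
this is only a sketch: `in_half_space_sphere` is a hypothetical name, and the candidate grid is a hand-made simple cubic lattice rather than the expansion used by this method.

```python
import math


def in_half_space_sphere(v, rmax, z_surf=0.0):
    # hypothetical predicate: inside the sphere of radius rmax and not above the surface
    x, y, z = v
    return math.sqrt(x * x + y * y + z * z) <= rmax and z <= z_surf


# simple cubic lattice with unit spacing, sampled on a small index grid
candidates = [(float(i), float(j), float(k))
              for i in range(-2, 3) for j in range(-2, 3) for k in range(-2, 3)]
bulk = [v for v in candidates if in_half_space_sphere(v, rmax=1.0)]
```

of the seven lattice points within rmax = 1, the one above the surface at z = 1 is rejected.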
|
||||
"""
|
||||
r_great = max(self.rmax, np.linalg.norm(v_pos))
|
||||
n0 = self.data.shape[0] + 1
|
||||
n1 = max(int(r_great / np.linalg.norm(v_lat1)) + 1, 4) * 3
|
||||
n2 = max(int(r_great / np.linalg.norm(v_lat2)) + 1, 4) * 3
|
||||
n3 = max(int(r_great / np.linalg.norm(v_lat3)) + 1, 4) * 3
|
||||
nn = 0
|
||||
buf = np.empty((2 * n1 + 1) * (2 * n2 + 1) * (2 * n3 + 1), dtype=self.dtype)
|
||||
for i1 in range(-n1, n1 + 1):
|
||||
for i2 in range(-n2, n2 + 1):
|
||||
for i3 in range(-n3, n3 + 1):
|
||||
v = v_pos + v_lat1 * i1 + v_lat2 * i2 + v_lat3 * i3
|
||||
if np.linalg.norm(v) <= self.rmax and v[2] <= z_surf:
|
||||
buf[nn] = self.build_element(nn + n0, atomtype, v[0], v[1], v[2], 0)
|
||||
nn += 1
|
||||
buf = np.resize(buf, nn)
|
||||
self.data = np.append(self.data, buf)
|
||||
|
||||
def add_cluster(self, cluster, check_rmax=False, check_unique=False, tol=0.001):
|
||||
"""
|
||||
add atoms from another cluster object.
|
||||
|
||||
@note the order of atoms in the internal data array may change during this operation.
|
||||
the atom index is updated.
|
||||
|
||||
@param cluster: Cluster object to be added.
|
||||
|
||||
@param check_rmax: if True, atoms outside self.rmax are not added.
|
||||
if False (default), all atoms of the other cluster are added.
|
||||
|
||||
@param check_unique: if True, atoms occupying the same position as an existing atom will not be added.
|
||||
if False (default), all atoms are added even if they occupy the same position.
|
||||
|
||||
@param tol: tolerance for checking uniqueness.
|
||||
positions of two atoms are considered equal if all coordinates lie within the tolerance interval.
|
||||
|
||||
@return: None
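
the uniqueness check can be pictured as sorting positions into cells of size tol.
the sketch below is a plain-Python approximation of that idea; `unique_positions` is a hypothetical helper that works on tuples instead of the structured data array.

```python
def unique_positions(positions, tol=0.001):
    # hypothetical helper: drop positions that fall into an already occupied cell
    seen = set()
    kept = []
    for pos in positions:
        # coordinates are compared after rounding to integer multiples of tol
        cell = tuple(int(round(c / tol)) for c in pos)
        if cell not in seen:
            seen.add(cell)
            kept.append(pos)
    return kept


merged = unique_positions([(0.0, 0.0, 0.0), (0.0001, 0.0, 0.0), (1.0, 0.0, 0.0)])
```

because the comparison is made on rounded values, two atoms that are closer than tol can still end up in neighboring cells (for example at 0.0004 and 0.0006) and both be kept.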
|
||||
"""
|
||||
assert isinstance(cluster, Cluster)
|
||||
data = self.data.copy()
|
||||
source = cluster.data.copy()
|
||||
|
||||
if check_rmax and source.shape[0] > 0:
|
||||
source_xyz = source[['x', 'y', 'z']].copy()
|
||||
source_xyz = source_xyz.view((source_xyz.dtype[0], len(source_xyz.dtype.names)))
|
||||
b_rmax = np.linalg.norm(source_xyz, axis=1) <= self.rmax
|
||||
idx = np.where(b_rmax)
|
||||
source = source[idx]
|
||||
data = np.append(data, source)
|
||||
|
||||
if check_unique and data.shape[0] > 0:
|
||||
data_xyz = data[['x', 'y', 'z']].copy()
|
||||
data_xyz = data_xyz.view((data_xyz.dtype[0], len(data_xyz.dtype.names)))
|
||||
tol_xyz = np.round(data_xyz / tol)
|
||||
uni_xyz = tol_xyz.view(tol_xyz.dtype.descr * 3)
|
||||
_, idx = np.unique(uni_xyz, return_index=True)
|
||||
data = data[np.sort(idx)]
|
||||
|
||||
self.data = data
|
||||
self.update_index()
|
||||
|
||||
def get_z_layers(self, tol=0.001):
|
||||
"""
|
||||
return the z-coordinates of atomic layers.
|
||||
the layers are stacked in the z-direction.
|
||||
|
||||
the function gathers unique z-coordinates.
|
||||
coordinates which are within the given tolerance are assigned to the same layer.
|
||||
|
||||
@param tol: tolerance
|
||||
@return: (numpy.ndarray) z-coordinates of the layers.
|
||||
the coordinates are numerically ordered, the top layer appears last.
|
||||
the returned coordinates may not be identical to any atom coordinate of a layer
|
||||
but deviate up to the given tolerance.
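
a minimal sketch of the layer detection on a plain list of z coordinates.
`z_layers` is a hypothetical stand-in for this method and returns a list instead of a numpy array.

```python
def z_layers(z_values, tol=0.001):
    # hypothetical helper: unique z coordinates on a grid of spacing tol, ascending
    cells = sorted(set(int(round(z / tol)) for z in z_values))
    return [cell * tol for cell in cells]


layers = z_layers([0.0, 0.0004, -1.5, -1.5002])
```

the four coordinates collapse into two layers, and the last entry belongs to the top layer.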
|
||||
"""
|
||||
self_z = self.data['z'].view(np.float32).reshape(self.data.shape)
|
||||
z2 = np.round(self_z.copy() / tol)
|
||||
layers = np.unique(z2) * tol
|
||||
return layers
|
||||
|
||||
def relax(self, z_cut, z_shift, element=0):
|
||||
"""
|
||||
shift atoms below a certain z coordinate by a fixed distance in the z direction.
|
||||
|
||||
@param z_cut: atoms at or below this z coordinate are shifted.
|
||||
@param z_shift: amount of shift in z direction
|
||||
(positive to move towards the surface, negative to move into the bulk).
|
||||
@param element: (int) chemical element number if atoms of a specific element should be affected.
|
||||
by default (element = 0), all atoms are moved.
|
||||
@return: (numpy.ndarray) indices of the atoms that have been shifted.
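
the selection and shift can be sketched on a plain list of z coordinates.
`relax_z` is a hypothetical helper; it leaves out the element filter and returns a new list instead of modifying the cluster.

```python
def relax_z(z_values, z_cut, z_shift):
    # hypothetical helper: shift every z coordinate at or below z_cut by z_shift
    shifted = []
    indices = []
    for index, z in enumerate(z_values):
        if z <= z_cut:
            shifted.append(z + z_shift)
            indices.append(index)
        else:
            shifted.append(z)
    return shifted, indices


new_z, moved = relax_z([0.0, -1.0, -2.0], z_cut=-0.5, z_shift=0.1)
```

here the top atom stays in place while the two lower atoms move 0.1 towards the surface.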
|
||||
"""
|
||||
self_z = self.data['z'].view(np.float32).reshape(self.data.shape)
|
||||
b_z = self_z <= z_cut
|
||||
b_all = b_z
|
||||
|
||||
if element:
|
||||
try:
|
||||
b_el = self.data['t'] == int(element)
|
||||
except ValueError:
|
||||
b_el = self.data['s'] == element
|
||||
b_all = np.all([b_z, b_el], axis=0)
|
||||
|
||||
idx = np.where(b_all)
|
||||
self.data['z'][idx] += z_shift
|
||||
|
||||
return idx
|
||||
|
||||
def matrix_transform(self, matrix):
|
||||
"""
|
||||
apply a transformation matrix to each atom of the cluster.
|
||||
|
||||
each transformed position is calculated as w = R * transpose(v), where R is the transformation matrix and v is the original position as a row vector.
|
||||
|
||||
@param matrix: transformation matrix
|
||||
|
||||
@return: None
|
||||
"""
|
||||
for atom in self.data:
|
||||
v = np.matrix([atom['x'], atom['y'], atom['z']])
|
||||
w = matrix * v.transpose()
|
||||
atom['x'] = float(w[0])
|
||||
atom['y'] = float(w[1])
|
||||
atom['z'] = float(w[2])
|
||||
|
||||
def rotate_x(self, angle):
|
||||
"""
|
||||
rotate cluster about the x-axis
|
||||
|
||||
@param angle (float) in degrees
|
||||
"""
|
||||
angle = math.radians(angle)
|
||||
s = math.sin(angle)
|
||||
c = math.cos(angle)
|
||||
matrix = np.matrix([[1, 0, 0], [0, c, -s], [0, s, c]])
|
||||
self.matrix_transform(matrix)
|
||||
|
||||
def rotate_y(self, angle):
|
||||
"""
|
||||
rotate cluster about the y-axis
|
||||
|
||||
@param angle (float) in degrees
|
||||
"""
|
||||
angle = math.radians(angle)
|
||||
s = math.sin(angle)
|
||||
c = math.cos(angle)
|
||||
matrix = np.matrix([[c, 0, s], [0, 1, 0], [-s, 0, c]])
|
||||
self.matrix_transform(matrix)
|
||||
|
||||
def rotate_z(self, angle):
|
||||
"""
|
||||
rotate cluster about the surface normal axis
|
||||
|
||||
@param angle (float) in degrees
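
to check the sense of rotation, the matrix used by this method can be applied to a single vector.
`rotate_about_z` is a hypothetical helper written with plain lists so that it runs without numpy.

```python
import math


def rotate_about_z(v, angle_deg):
    # hypothetical helper: apply the rotation matrix to one position vector
    angle = math.radians(angle_deg)
    s = math.sin(angle)
    c = math.cos(angle)
    matrix = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    # w = matrix * v, written out row by row
    return tuple(sum(m * x for m, x in zip(row, v)) for row in matrix)


w = rotate_about_z((1.0, 0.0, 0.0), 90.0)
```

a positive angle turns the x-axis towards the y-axis, i.e. counterclockwise when looking down onto the surface from positive z.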
|
||||
"""
|
||||
angle = math.radians(angle)
|
||||
s = math.sin(angle)
|
||||
c = math.cos(angle)
|
||||
matrix = np.matrix([[c, -s, 0], [s, c, 0], [0, 0, 1]])
|
||||
self.matrix_transform(matrix)
|
||||
|
||||
def find_positions(self, pos, tol=0.001):
|
||||
"""
|
||||
find all atoms which occupy a given position.
|
||||
|
||||
@param pos: (numpy.array, shape = (3)) position vector.
|
||||
|
||||
@param tol: (float) matching tolerance per coordinate.
|
||||
|
||||
@return numpy.array of indices which match pos.
|
||||
"""
|
||||
b2 = np.abs(pos - self.get_positions()) < tol
|
||||
b1 = np.all(b2, axis=1)
|
||||
idx = np.where(b1)
|
||||
return idx[0]
|
||||
|
||||
def find_index_cylinder(self, pos, r_xy, r_z, element):
|
||||
"""
|
||||
find atoms of a given element within a cylindrical volume and return their indices.
|
||||
|
||||
@param pos: (numpy.array, shape = (3)) center position of the cylinder.
|
||||
|
||||
@param r_xy: (float) radius of the cylinder.
|
||||
returned atoms must match |atom(x,y) - pos(x,y)| <= r_xy.
|
||||
|
||||
@param r_z: (float) half height of the cylinder.
|
||||
returned atoms must match |atom(z) - pos(z)| <= r_z.
|
||||
|
||||
@param element: (str or int) element symbol or atomic number.
|
||||
if None, the element is not checked.
|
||||
|
||||
@return numpy.array of indices of the atoms inside the cylinder.
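
the two geometric conditions can be sketched in plain Python.
`indices_in_cylinder` is a hypothetical helper; it works on a list of coordinate tuples and leaves out the element check.

```python
import math


def indices_in_cylinder(positions, pos, r_xy, r_z):
    # hypothetical helper: indices of positions inside a cylinder centered at pos
    found = []
    for index, (x, y, z) in enumerate(positions):
        lateral = math.hypot(x - pos[0], y - pos[1])
        if lateral <= r_xy and abs(z - pos[2]) <= r_z:
            found.append(index)
    return found


atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, -2.0), (3.0, 0.0, 0.0)]
hits = indices_in_cylinder(atoms, (0.0, 0.0, 0.0), r_xy=1.5, r_z=1.0)
```

the third atom fails the height condition and the fourth one fails the radius condition.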
|
||||
"""
|
||||
pos_xy = pos[0:2]
|
||||
self_xy = self.data[['x', 'y']].copy()
|
||||
self_xy = self_xy.view((self_xy.dtype[0], len(self_xy.dtype.names)))
|
||||
b_xy = np.linalg.norm(self_xy - pos_xy, axis=1) <= r_xy
|
||||
|
||||
pos_z = pos[2]
|
||||
self_z = self.data['z']
|
||||
b_z = np.abs(self_z - pos_z) <= r_z
|
||||
|
||||
if element is not None:
|
||||
try:
|
||||
b_el = self.data['t'] == int(element)
|
||||
except ValueError:
|
||||
b_el = self.data['s'] == element
|
||||
b_all = np.all([b_xy, b_z, b_el], axis=0)
|
||||
else:
|
||||
b_all = np.all([b_xy, b_z], axis=0)
|
||||
|
||||
idx = np.where(b_all)
|
||||
return idx[0]
|
||||
|
||||
def trim_cylinder(self, r_xy, r_z):
|
||||
"""
|
||||
remove atoms outside a given cylinder.
|
||||
|
||||
the cylinder is centered at the origin.
|
||||
|
||||
@param r_xy: (float) radius of the cylinder.
|
||||
atoms to keep must match |atom(x,y)| <= r_xy.
|
||||
|
||||
@param r_z: (float) half height of the cylinder.
|
||||
atoms to keep must match |atom(z)| <= r_z.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
self_xy = self.data[['x', 'y']].copy()
|
||||
self_xy = self_xy.view((self_xy.dtype[0], len(self_xy.dtype.names)))
|
||||
b_xy = np.linalg.norm(self_xy, axis=1) <= r_xy
|
||||
|
||||
self_z = self.data['z']
|
||||
b_z = np.abs(self_z) <= r_z
|
||||
|
||||
b_all = np.all([b_xy, b_z], axis=0)
|
||||
idx = np.where(b_all)
|
||||
self.data = self.data[idx]
|
||||
self.update_index()
|
||||
|
||||
def trim_sphere(self, radius):
|
||||
"""
|
||||
remove atoms outside a given sphere.
|
||||
|
||||
the sphere is centered at the origin.
|
||||
|
||||
@param radius: (float) radius of the sphere.
|
||||
atoms to keep must match |atom(x,y,z)| <= radius.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
self_xyz = self.data[['x', 'y', 'z']].copy()
|
||||
self_xyz = self_xyz.view((self_xyz.dtype[0], len(self_xyz.dtype.names)))
|
||||
b_xyz = np.linalg.norm(self_xyz, axis=1) <= radius
|
||||
idx = np.where(b_xyz)
|
||||
self.data = self.data[idx]
|
||||
self.update_index()
|
||||
|
||||
def trim_slab(self, axis, center, depth):
|
||||
"""
|
||||
remove atoms outside a slab that is parallel to one of the coordinate planes.
|
||||
|
||||
@param axis: axis to trim: 'x', 'y' or 'z'.
|
||||
@param center: center position of the slab.
|
||||
@param depth: thickness of the slab.
|
||||
|
||||
@return: None
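
a minimal sketch of the slab criterion on one coordinate column.
`slab_indices` is a hypothetical helper; the center and depth values are the ones that check_cluster uses for its cuts.

```python
def slab_indices(coords, center, depth):
    # hypothetical helper: indices of coordinates within a slab of thickness depth
    half = depth / 2.0  # float division, also under Python 2
    return [i for i, c in enumerate(coords) if abs(c - center) <= half]


x_coords = [-0.2, -0.04, 0.0, 0.04, 0.3]
kept = slab_indices(x_coords, center=0.0, depth=0.1)
```

note that depth is the full thickness: atoms up to depth / 2 on either side of the center are kept.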
|
||||
"""
|
||||
coord = self.data[axis].view(np.float32).reshape(self.data.shape)
|
||||
sel = np.abs(coord - center) <= depth / 2.0
|
||||
idx = np.where(sel)
|
||||
self.data = self.data[idx]
|
||||
self.update_index()
|
||||
|
||||
def set_emitter(self, pos=None, idx=-1, tol=0.001):
|
||||
"""
|
||||
select an atom as emitter.
|
||||
|
||||
the emitter atom can be specified by position or index.
|
||||
either one of the pos or idx arguments must be specified.
|
||||
|
||||
@param idx: (int) array index of the atom.
|
||||
|
||||
@param pos: (numpy.array, shape = (3)) position vector.
|
||||
|
||||
@param tol: (float) matching tolerance per component if pos argument is used.
|
||||
|
||||
@raise IndexError if the position cannot be found
|
||||
"""
|
||||
if pos is not None:
|
||||
ares = self.find_positions(pos, tol)
|
||||
idx = ares[0]
|
||||
item = self.data[idx]
|
||||
item['e'] = 1
|
||||
|
||||
def move_to_first(self, pos=None, idx=0, tol=0.001):
|
||||
"""
|
||||
move an atom to the first position.
|
||||
|
||||
the atom can be specified by position or index.
|
||||
either one of the pos or idx arguments must be specified.
|
||||
|
||||
@param idx: (int) array index of the atom.
|
||||
must be greater than 0 to have an effect.
|
||||
|
||||
@param pos: (numpy.array, shape = (3)) position vector.
|
||||
|
||||
@param tol: (float) matching tolerance per component if pos argument is used.
|
||||
|
||||
@raise IndexError if the position cannot be found
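
the reordering and the index update can be sketched on a plain list.
the function below is a hypothetical stand-alone version that operates on [index, symbol] records instead of the structured data array.

```python
def move_to_first(atoms, idx):
    # hypothetical helper on a list of [index, symbol] records (1-based index)
    if idx:
        atom = atoms.pop(idx)
        atoms.insert(0, atom)
        # renumber so that the sequential index stays consistent
        for number, record in enumerate(atoms, start=1):
            record[0] = number
    return atoms


atoms = move_to_first([[1, 'Cu'], [2, 'Cu'], [3, 'N']], idx=2)
```

idx is the 0-based array position, whereas the stored atom index is 1-based and is rewritten after the move.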
|
||||
"""
|
||||
|
||||
if pos is not None:
|
||||
ares = self.find_positions(pos, tol)
|
||||
idx = ares[0]
|
||||
if idx:
|
||||
em = self.data[idx]
|
||||
self.data = np.delete(self.data, idx)
|
||||
self.data = np.insert(self.data, 0, em)
|
||||
self.update_index()
|
||||
|
||||
def get_positions(self):
|
||||
"""
|
||||
get an array of the atom coordinates.
|
||||
|
||||
the returned array is an independent copy of the original data.
|
||||
changes will not affect the original cluster.
|
||||
|
||||
@return numpy.ndarray, shape = (N,3)
|
||||
"""
|
||||
pos = self.data[['x', 'y', 'z']].copy()
|
||||
pos = pos.view((pos.dtype[0], len(pos.dtype.names)))
|
||||
return pos
|
||||
|
||||
def set_positions(self, positions):
|
||||
"""
|
||||
set atom coordinates from an array of shape (N,3).
|
||||
|
||||
this method can be used on a modified array obtained from get_positions.
|
||||
N must be the number of atoms defined in the cluster.
|
||||
|
||||
@param positions: numpy.ndarray of shape (N,3) where N is the number of atoms in this cluster.
|
||||
|
||||
@return: None
|
||||
|
||||
@raise AssertionError if the array sizes do not match.
|
||||
"""
|
||||
assert isinstance(positions, np.ndarray)
|
||||
assert positions.shape == (self.data.shape[0], 3)
|
||||
self.data['x'] = positions[:, 0]
|
||||
self.data['y'] = positions[:, 1]
|
||||
self.data['z'] = positions[:, 2]
|
||||
|
||||
def get_position(self, index):
|
||||
"""
|
||||
get the position of a single atom.
|
||||
|
||||
@param index: (int) index of the atom.
|
||||
|
||||
@return numpy.array, shape = (3): position vector.
|
||||
the array instance is independent from the original array.
|
||||
"""
|
||||
rec = self.data[index]
|
||||
return np.array((rec['x'], rec['y'], rec['z']))
|
||||
|
||||
def get_atom_count(self):
|
||||
"""
|
||||
get the number of atoms (positions) in the cluster.
|
||||
|
||||
@return the number of atoms in the cluster.
|
||||
"""
|
||||
return self.data.shape[0]
|
||||
|
||||
def get_atomtype(self, index):
|
||||
"""
|
||||
get the chemical element number of an atom.
|
||||
|
||||
@param index: (int) index of the atom.
|
||||
|
||||
@return int: chemical element number.
|
||||
"""
|
||||
rec = self.data[index]
|
||||
return rec['t']
|
||||
|
||||
def get_symbol(self, index):
|
||||
"""
|
||||
get the chemical element symbol of an atom.
|
||||
|
||||
@param index: (int) index of the atom.
|
||||
|
||||
@return string: chemical element symbol.
|
||||
"""
|
||||
rec = self.data[index]
|
||||
return rec['s']
|
||||
|
||||
def get_emitters(self):
|
||||
"""
|
||||
get a list of all emitters.
|
||||
|
||||
@return list of tuples (x, y, z, atomtype)
|
||||
"""
|
||||
idx = self.data['e'] != 0
|
||||
ems = self.data[['x', 'y', 'z', 't']][idx]
|
||||
return list(map(tuple, ems))
|
||||
|
||||
def get_emitter_count(self):
|
||||
"""
|
||||
get the number of emitters in the cluster.
|
||||
|
||||
@return the number of atoms marked as emitter.
|
||||
"""
|
||||
idx = self.data['e'] != 0
|
||||
return np.sum(idx)
|
||||
|
||||
def load_from_file(self, f, fmt=FMT_DEFAULT):
|
||||
"""
|
||||
load a cluster from a file created by the scattering program.
|
||||
|
||||
@param f (string/handle): path name or open file handle of the cluster file.
|
||||
|
||||
@param fmt (int): file format.
|
||||
must be one of the FMT_ constants.
|
||||
if FMT_DEFAULT, self.file_format is used.
|
||||
|
||||
@remark if the filename ends in .gz, the file is loaded from compressed gzip format
|
||||
"""
|
||||
if fmt == FMT_DEFAULT:
|
||||
fmt = self.file_format
|
||||
|
||||
if fmt == FMT_MSC:
|
||||
dtype = DTYPE_CLUSTER_MSC
|
||||
fields = FIELDS_CLUSTER_MSC
|
||||
sh = 0
|
||||
elif fmt == FMT_EDAC:
|
||||
dtype = DTYPE_CLUSTER_EDAC
|
||||
fields = FIELDS_CLUSTER_EDAC
|
||||
sh = 1
|
||||
elif fmt == FMT_XYZ:
|
||||
dtype = DTYPE_CLUSTER_XYZ
|
||||
fields = FIELDS_CLUSTER_XYZ
|
||||
sh = 2
|
||||
else:
|
||||
dtype = DTYPE_CLUSTER_XYZ
|
||||
fields = FIELDS_CLUSTER_XYZ
|
||||
sh = 2
|
||||
|
||||
data = np.genfromtxt(f, dtype=dtype, skip_header=sh)
|
||||
self.data = np.empty(data.shape, dtype=self.dtype)
|
||||
self.data['x'] = data['x']
|
||||
self.data['y'] = data['y']
|
||||
self.data['z'] = data['z']
|
||||
if 'i' in fields:
|
||||
self.data['i'] = data['i']
|
||||
else:
|
||||
self.update_index()
|
||||
if 't' in fields:
|
||||
self.data['t'] = data['t']
|
||||
if 's' in fields:
|
||||
self.data['s'] = data['s']
|
||||
else:
|
||||
self.update_symbols()
|
||||
if 't' not in fields:
|
||||
self.update_atomtypes()
|
||||
if 'e' in fields:
|
||||
self.data['e'] = data['e']
|
||||
else:
|
||||
self.data['e'] = 0
|
||||
|
||||
pos = self.get_positions()
|
||||
# note: np.linalg.norm does not accept axis argument in version 1.7
|
||||
# (check np.version.version)
|
||||
norm = np.sqrt(np.sum(pos**2, axis=1))
|
||||
self.rmax = np.max(norm)
|
||||
|
||||
def update_symbols(self):
|
||||
"""
|
||||
update element symbols from element numbers.
|
||||
|
||||
if you have modified the element numbers in the self.data array directly,
|
||||
this method updates the symbol column to make the data consistent.
|
||||
"""
|
||||
for atom in self.data:
|
||||
atom['s'] = pt.elements[atom['t']].symbol
|
||||
|
||||
def update_atomtypes(self):
|
||||
"""
|
||||
update element numbers from element symbols.
|
||||
|
||||
if you have modified the element symbols in the self.data array directly,
|
||||
this method updates the atom type column to make the data consistent.
|
||||
"""
|
||||
for atom in self.data:
|
||||
atom['t'] = pt.elements.symbol(atom['s'].strip()).number
|
||||
|
||||
def update_index(self):
|
||||
"""
|
||||
update the index column.
|
||||
|
||||
if you have modified the order or number of elements in the self.data array directly,
|
||||
you may need to re-index the atoms if your code uses functions that rely on the index.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
self.data['i'] = np.arange(1, self.data.shape[0] + 1)
|
||||
|
||||
def save_to_file(self, f, fmt=FMT_DEFAULT, comment=""):
|
||||
"""
|
||||
save the cluster to a file which can be read by the scattering program.
|
||||
|
||||
the method updates the atom index because some file formats require an index column.
|
||||
|
||||
@param f: (string/handle) path name or open file handle of the cluster file.
|
||||
|
||||
@param fmt: (int) file format.
|
||||
must be one of the FMT_ constants.
|
||||
if FMT_DEFAULT, self.file_format is used.
|
||||
|
||||
@param comment: (str) comment line (second line) in XYZ file.
|
||||
not used in other file formats.
|
||||
by default, self.comment is used.
|
||||
|
||||
@remark if the filename ends in .gz, the file is saved in compressed gzip format
|
||||
"""
|
||||
if fmt == FMT_DEFAULT:
|
||||
fmt = self.file_format
|
||||
|
||||
if not comment:
|
||||
comment = self.comment
|
||||
|
||||
if fmt == FMT_MSC:
|
||||
file_format = FMT_CLUSTER_MSC
|
||||
fields = FIELDS_CLUSTER_MSC
|
||||
header = ""
|
||||
elif fmt == FMT_EDAC:
|
||||
file_format = FMT_CLUSTER_EDAC
|
||||
fields = FIELDS_CLUSTER_EDAC
|
||||
header = "%u l(A)" % (self.data.shape[0])
|
||||
elif fmt == FMT_XYZ:
|
||||
file_format = FMT_CLUSTER_XYZ
|
||||
fields = FIELDS_CLUSTER_XYZ
|
||||
header = "{0}\n{1}".format(self.data.shape[0], comment)
|
||||
else:
|
||||
file_format = FMT_CLUSTER_XYZ
|
||||
fields = FIELDS_CLUSTER_XYZ
|
||||
header = "{0}\n{1}".format(self.data.shape[0], comment)
|
||||
|
||||
self.update_index()
|
||||
data = self.data[fields]
|
||||
np.savetxt(f, data, fmt=file_format, header=header, comments="")
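The XYZ branch above writes the atom count on the first line and the comment on the second. The following sketch shows that layout with the standard library only. The column format of the atom lines (symbol followed by three coordinates with four decimals) is an assumption for illustration; the actual `FMT_CLUSTER_XYZ` format string is defined elsewhere in the module.

```python
import io


def write_xyz(handle, atoms, comment=""):
    """Write (symbol, x, y, z) tuples in XYZ layout to an open text handle."""
    # header: number of atoms, then one comment line
    handle.write("{0}\n{1}\n".format(len(atoms), comment))
    for symbol, x, y, z in atoms:
        # assumed column format, not taken from FMT_CLUSTER_XYZ
        handle.write("{0} {1:.4f} {2:.4f} {3:.4f}\n".format(symbol, x, y, z))


buffer = io.StringIO()
write_xyz(buffer, [("Cu", 0.0, 0.0, 0.0), ("Cu", 1.8, 1.8, 0.0)],
          comment="two-atom test cluster")
xyz_text = buffer.getvalue()
print(xyz_text)
```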
|
||||
+840
@@ -0,0 +1,840 @@
|
||||
"""
|
||||
@package pmsco.data
|
||||
import, export, evaluation of msc data
|
||||
|
||||
@author Matthias Muntwiler
|
||||
|
||||
@copyright (c) 2015 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
import os
|
||||
import logging
|
||||
import numpy as np
|
||||
import scipy.optimize as so
|
||||
import loess.loess as loess
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
## energy, intensity
|
||||
DTYPE_EI = [('e', 'f4'), ('i', 'f4')]
|
||||
## energy, theta, phi, intensity
|
||||
DTYPE_ETPI = [('e', 'f4'), ('t', 'f4'), ('p', 'f4'), ('i', 'f4')]
|
||||
## energy, theta, phi, intensity, sigma (standard deviation)
|
||||
DTYPE_ETPIS = [('e', 'f4'), ('t', 'f4'), ('p', 'f4'), ('i', 'f4'), ('s', 'f4')]
|
||||
## energy, theta, phi, alpha, intensity
|
||||
DTYPE_ETPAI = [('e', 'f4'), ('t', 'f4'), ('p', 'f4'), ('a', 'f4'), ('i', 'f4')]
|
||||
## energy, theta, phi, alpha, intensity, sigma (standard deviation)
|
||||
DTYPE_ETPAIS = [('e', 'f4'), ('t', 'f4'), ('p', 'f4'), ('a', 'f4'), ('i', 'f4'), ('s', 'f4')]
|
||||
## theta, phi
|
||||
DTYPE_TP = [('t', 'f4'), ('p', 'f4')]
|
||||
## theta, phi, intensity
|
||||
DTYPE_TPI = [('t', 'f4'), ('p', 'f4'), ('i', 'f4')]
|
||||
## theta, phi, intensity, sigma (standard deviation)
|
||||
DTYPE_TPIS = [('t', 'f4'), ('p', 'f4'), ('i', 'f4'), ('s', 'f4')]
|
||||
|
||||
DTYPES = {'EI': DTYPE_EI, 'ETPI': DTYPE_ETPI, 'ETPIS': DTYPE_ETPIS, 'ETPAI': DTYPE_ETPAI, 'ETPAIS': DTYPE_ETPAIS,
|
||||
'TP': DTYPE_TP, 'TPI': DTYPE_TPI, 'TPIS': DTYPE_TPIS, }
|
||||
DATATYPES = list(DTYPES.keys())
|
||||
|
||||
## supported scan types
|
||||
# @arg @c 'E' energy
|
||||
# @arg @c 'EA' energy - alpha (analyser)
|
||||
# @arg @c 'ET' energy - theta
|
||||
# @arg @c 'TP' theta - phi (holo scan)
|
||||
SCANTYPES = ['E', 'EA', 'ET', 'TP']
|
||||
|
||||
|
||||
def create_etpi(shape, sigma_column=True):
|
||||
"""
|
||||
create an ETPI array of a given size.
|
||||
|
||||
an ETPI array is a numpy structured array.
|
||||
the array is initialized with zeroes.
|
||||
|
||||
@param shape (tuple) shape of the array
|
||||
"""
|
||||
if sigma_column:
|
||||
data = np.zeros(shape, dtype=DTYPE_ETPIS)
|
||||
else:
|
||||
data = np.zeros(shape, dtype=DTYPE_ETPI)
|
||||
return data
|
||||
|
||||
|
||||
def create_data(shape, datatype='', dtype=None):
|
||||
"""
|
||||
create a data array of a given size and type.
|
||||
|
||||
a data array is a numpy structured array.
|
||||
the array is initialized with zeroes.
|
||||
either datatype or dtype must be specified; dtype takes precedence.
|
||||
|
||||
@param shape (tuple) shape of the array, only scalars (1-tuples) supported currently
|
||||
@param datatype see DATATYPES
|
||||
@param dtype see DTYPES
|
||||
"""
|
||||
if not dtype:
|
||||
dtype = DTYPES[datatype]
|
||||
data = np.zeros(shape, dtype=dtype)
|
||||
return data
|
||||
|
||||
|
||||
def load_plt(filename, int_column=-1):
|
||||
"""
|
||||
loads ETPI data from an MSC output (plt) file
|
||||
|
||||
plt file format:
|
||||
5-9 columns, space or tab delimited
|
||||
column 0: energy
|
||||
column 1: momentum
|
||||
column 2: theta
|
||||
column 3: phi
|
||||
columns 4-8: intensities
|
||||
comment lines must start with # character
|
||||
|
||||
filename: path or name of the file to be read
|
||||
|
||||
int_column: index of the column to be read as intensity
|
||||
typical values: 4, 5, 6, 7, 8
|
||||
or negative: -1 (last), -2 (second last), ...
|
||||
default: -1
|
||||
|
||||
returns a structured one-dimensional numpy.ndarray
|
||||
|
||||
data[i]['e'] = energy
|
||||
data[i]['t'] = theta
|
||||
data[i]['p'] = phi
|
||||
data[i]['i'] = selected intensity column
|
||||
"""
|
||||
data = np.genfromtxt(filename, usecols=(0, 2, 3, int_column), dtype=DTYPE_ETPI)
|
||||
sort_data(data)
|
||||
return data
|
||||
|
||||
|
||||
def load_edac_pd(filename, int_column=-1, energy=0.0, theta=0.0, phi=0.0, fixed_cluster=False):
|
||||
"""
|
||||
load ETPI or ETPAI data from an EDAC PD output file.
|
||||
|
||||
EDAC file format:
|
||||
@arg row 0: "--- scan PD"
|
||||
@arg row 1: column names
|
||||
@arg rows 2 and following: space delimited data
|
||||
|
||||
@arg first columns (up to 3): energy, theta, phi depending on scan
|
||||
@arg last columns (arbitrary number): intensity at the recursion order specified in the header
|
||||
|
||||
@param filename: path or name of the file to be read
|
||||
|
||||
@param int_column: index of the column to be read as intensity.
|
||||
typical values: -1 (last), -2 (second last), ...
|
||||
default: -1
|
||||
|
||||
@param energy: default value if energy column is missing
|
||||
@param theta: default value if theta column is missing
|
||||
@param phi: default value if phi column is missing
|
||||
|
||||
@param fixed_cluster:
|
||||
if True, (theta, phi) are mapped to (alpha, phi). theta is copied from function argument.
|
||||
if False, angles are copied literally.
|
||||
|
||||
@return a structured one-dimensional numpy.ndarray (ETPI or ETPAI)
|
||||
|
||||
@verbatim
|
||||
data[i]['e'] = energy
|
||||
data[i]['t'] = theta
|
||||
data[i]['p'] = phi
|
||||
data[i]['i'] = selected intensity column
|
||||
@endverbatim
|
||||
"""
|
||||
with open(filename, 'r') as f:
|
||||
header1 = f.readline().strip()
|
||||
header2 = f.readline().strip()
|
||||
if header1 != '--- scan PD':
|
||||
logger.warning("unexpected EDAC output file header format")
|
||||
|
||||
col_names = header2.split()
|
||||
dtype = []
|
||||
cols = []
|
||||
ncols = 0
|
||||
for name in col_names:
|
||||
if name == "eV":
|
||||
dtype.append(('e', 'f4'))
|
||||
cols.append(ncols)
|
||||
ncols += 1
|
||||
elif name == "theta":
|
||||
dtype.append(('t', 'f4'))
|
||||
cols.append(ncols)
|
||||
ncols += 1
|
||||
elif name == "phi":
|
||||
dtype.append(('p', 'f4'))
|
||||
cols.append(ncols)
|
||||
ncols += 1
|
||||
elif name == "order":
|
||||
dtype.append(('i', 'f4'))
|
||||
cols.append(int_column)
|
||||
ncols += 1
|
||||
break
|
||||
else:
|
||||
logger.warning("unexpected EDAC output file column name")
|
||||
break
|
||||
cols = tuple(cols)
|
||||
raw = np.genfromtxt(filename, usecols=cols, dtype=dtype, skip_header=2)
|
||||
|
||||
if fixed_cluster:
|
||||
etpi = np.empty(raw.shape, dtype=DTYPE_ETPAI)
|
||||
else:
|
||||
etpi = np.empty(raw.shape, dtype=DTYPE_ETPI)
|
||||
|
||||
if 'eV' in col_names:
|
||||
etpi['e'] = raw['e']
|
||||
else:
|
||||
etpi['e'] = energy
|
||||
if 'theta' in col_names:
|
||||
etpi['t'] = raw['t']
|
||||
else:
|
||||
etpi['t'] = theta
|
||||
if 'phi' in col_names:
|
||||
etpi['p'] = raw['p']
|
||||
else:
|
||||
etpi['p'] = phi
|
||||
etpi['i'] = raw['i']
|
||||
|
||||
if fixed_cluster:
|
||||
etpi['a'] = etpi['t']
|
||||
etpi['t'] = theta
|
||||
|
||||
sort_data(etpi)
|
||||
return etpi
|
||||
|
||||
|
||||
def load_etpi(filename):
|
||||
"""
|
||||
loads ETPI or ETPIS data from a text file
|
||||
|
||||
etpi file format:
|
||||
4 or 5 columns, space or tab delimited
|
||||
column 0: energy
|
||||
column 1: theta
|
||||
column 2: phi
|
||||
column 3: intensity
|
||||
column 4: sigma error (standard deviation). optional, defaults to 0.
|
||||
comment lines must start with # character
|
||||
comment lines may appear anywhere, and are ignored
|
||||
|
||||
filename: path or name of the file to be read
|
||||
load_etpi handles compressed files (ending .gz) transparently.
|
||||
|
||||
returns a structured one-dimensional numpy.ndarray
|
||||
|
||||
data[i]['e'] = energy
|
||||
data[i]['t'] = theta
|
||||
data[i]['p'] = phi
|
||||
data[i]['i'] = intensity
|
||||
data[i]['s'] = sigma
|
||||
|
||||
@deprecated new code should use load_data().
|
||||
"""
|
||||
try:
|
||||
data = np.loadtxt(filename, dtype=DTYPE_ETPIS)
|
||||
except IndexError:
|
||||
data = np.loadtxt(filename, dtype=DTYPE_ETPI)
|
||||
sort_data(data)
|
||||
return data
|
||||
|
||||
|
||||
def load_data(filename, dtype=None):
|
||||
"""
|
||||
load column data (ETPI, and the like) from a text file.
|
||||
|
||||
the extension must specify one of DATATYPES (case insensitive)
|
||||
corresponding to the meaning of the columns in the file.
|
||||
|
||||
@param filename
|
||||
|
||||
@param dtype: override data type recognition if the extension cannot be used.
|
||||
must be one of the data.DTYPE constants
|
||||
DTYPE_EI, DTYPE_ETPI, DTYPE_ETPIS, DTYPE_ETPAI, or DTYPE_ETPAIS.
|
||||
by default, the function uses the extension to determine the data type.
|
||||
the actual type can be read from the dtype attribute of the returned array.
|
||||
|
||||
@return one-dimensional numpy structured ndarray with data
|
||||
"""
|
||||
if not dtype:
|
||||
(root, ext) = os.path.splitext(filename)
|
||||
datatype = ext[1:].upper()
|
||||
dtype = DTYPES[datatype]
|
||||
|
||||
data = np.loadtxt(filename, dtype=dtype)
|
||||
sort_data(data)
|
||||
return data
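The extension-based type recognition described above can be sketched with the standard library alone. The helper name `datatype_from_filename` is hypothetical; the list of type names is copied from the `DTYPES` dictionary.

```python
import os

KNOWN_TYPES = ("EI", "ETPI", "ETPIS", "ETPAI", "ETPAIS", "TP", "TPI", "TPIS")


def datatype_from_filename(filename):
    """Return the upper-case data type name encoded in the file extension."""
    root, ext = os.path.splitext(filename)
    datatype = ext[1:].upper()  # drop the leading dot
    if datatype not in KNOWN_TYPES:
        raise KeyError("unknown data type extension: {0!r}".format(ext))
    return datatype


print(datatype_from_filename("scan.etpi"))
print(datatype_from_filename("results/run1.modf.etpais"))
```

Only the last extension is inspected, so a name such as `scan.etpi.gz` is not recognised this way; with `load_data()` as written, such a file appears to need an explicit `dtype` argument.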
|
||||
|
||||
|
||||
def save_data(filename, data):
|
||||
"""
|
||||
save column data (ETPI, and the like) to a text file.
|
||||
|
||||
the extension must specify one of DATATYPES (case insensitive)
|
||||
corresponding to the meaning of the columns in the file.
|
||||
|
||||
@param filename
|
||||
|
||||
@param data ETPI-like structured numpy.ndarray.
|
||||
|
||||
@remark this function is plain numpy.savetxt, provided for convenience.
|
||||
"""
|
||||
np.savetxt(filename, data, fmt='%g')
|
||||
|
||||
|
||||
def sort_data(data):
|
||||
"""
|
||||
sort scan data (ETPI and the like) in a consistent order.
|
||||
|
||||
the function sorts the data array along the scan dimensions energy, theta, phi and alpha.
|
||||
this function should be used for all sorting of measured and calculated data
|
||||
to ensure a consistent sort order.
|
||||
|
||||
the function determines the sort key based on the scan fields of the data array,
|
||||
ignoring the intensity and sigma fields.
|
||||
|
||||
the function uses the _mergesort_ algorithm which preserves the relative order of indistinct elements.
|
||||
|
||||
@warning sorting on intensity and sigma fields would mix up the scan dimensions and produce invalid results!
|
||||
|
||||
@param data ETPI-like structured numpy.ndarray.
|
||||
|
||||
@return: None. the data array is sorted in place.
|
||||
"""
|
||||
sort_key = [name for name in data.dtype.names if name in {'e', 't', 'p', 'a'}]
|
||||
data.sort(kind='mergesort', order=sort_key)
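The sorting rule above (sort on the scan fields only, with a stable algorithm) can be illustrated without numpy. This sketch assumes the rows are dictionaries; `sort_rows` and `SCAN_FIELDS` are illustrative names.

```python
SCAN_FIELDS = ("e", "t", "p", "a")


def sort_rows(rows):
    """Sort rows in place by the scan fields that are present, ignoring 'i' and 's'."""
    keys = [name for name in SCAN_FIELDS if name in rows[0]]
    # list.sort is stable: rows with equal scan coordinates keep their order
    rows.sort(key=lambda row: tuple(row[name] for name in keys))


rows = [
    {"e": 100.0, "t": 10.0, "p": 0.0, "i": 3.0},
    {"e": 100.0, "t": 0.0, "p": 90.0, "i": 1.0},
    {"e": 100.0, "t": 0.0, "p": 0.0, "i": 2.0},
]
sort_rows(rows)
print([(row["t"], row["p"], row["i"]) for row in rows])
```

The intensity column travels with its row but never takes part in the comparison.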
|
||||
|
||||
|
||||
def restructure_data(data, dtype=DTYPE_ETPAIS, defaults=None):
|
||||
"""
|
||||
restructure the type of a data array by adding or removing columns.
|
||||
|
||||
example: to combine an ETPI and an ETPAI scan, both arrays must have the same data type.
|
||||
this function adds the necessary columns and initializes them with default values.
|
||||
to find out the appropriate data type, use the common_dtype() function.
|
||||
to concatenate arrays, call numpy.hstack on a tuple of arrays.
|
||||
|
||||
@param data: original data array (a structured numpy array having one of the DTYPES data types).
|
||||
|
||||
@param dtype: data type of the new array. must be one out of DTYPES.
|
||||
default is DTYPE_ETPAIS which includes any possible field.
|
||||
|
||||
@param defaults: default values for new fields.
|
||||
this must be a dictionary where the key is the field name and value the default value of the field.
|
||||
the dictionary can contain an arbitrary sub-set of fields.
|
||||
undefined fields are initialized to zero.
|
||||
if the parameter is unspecified, all fields are initialized to zero.
|
||||
|
||||
@return: re-structured numpy array
|
||||
"""
|
||||
new_data = np.zeros(data.shape, dtype=dtype)
|
||||
fields = [dt[0] for dt in dtype if dt[0] in data.dtype.names]
|
||||
|
||||
if defaults is not None:
|
||||
for field, value in defaults.items():
|
||||
if field in new_data.dtype.names:
|
||||
new_data[field] = value
|
||||
|
||||
for field in fields:
|
||||
new_data[field] = data[field]
|
||||
|
||||
return new_data
|
||||
|
||||
|
||||
def common_dtype(scans):
|
||||
"""
|
||||
determine the common data type for a number of scans.
|
||||
|
||||
example: to combine an ETPI and an ETPAI scan, both arrays must have the same data type.
|
||||
this function determines the least common data type.
|
||||
to restructure each array, use the restructure_data() function.
|
||||
to concatenate arrays, call numpy.hstack on a tuple of arrays.
|
||||
|
||||
@param scans: iterable of scan data or types.
|
||||
the elements of the list must be ETPI-like numpy structured arrays,
|
||||
numpy.dtype specifiers of a permitted ETPI-like array,
|
||||
or one of the DTYPE constants listed in DTYPES.
|
||||
|
||||
@return: DTYPE constant which includes all the fields referred to in the input data.
|
||||
"""
|
||||
fields = set()
|
||||
for item in scans:
|
||||
if isinstance(item, np.ndarray):
|
||||
names = item.dtype.names
|
||||
elif isinstance(item, np.dtype):
|
||||
names = item.names
|
||||
else:
|
||||
names = [dt[0] for dt in item]
|
||||
for name in names:
|
||||
fields.add(name)
|
||||
|
||||
dtype = [dt for dt in DTYPE_ETPAIS if dt[0] in fields]
|
||||
return dtype
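A short sketch of the field-union idea behind `common_dtype()`, using tuples of field names in place of numpy dtypes. The name `common_fields` is an assumption for illustration.

```python
ALL_FIELDS = ("e", "t", "p", "a", "i", "s")  # field order of DTYPE_ETPAIS


def common_fields(scans):
    """Return the fields used by any scan, in the canonical ETPAIS order."""
    present = set()
    for names in scans:
        present.update(names)
    return [name for name in ALL_FIELDS if name in present]


etpi = ("e", "t", "p", "i")
etpai = ("e", "t", "p", "a", "i")
print(common_fields([etpi, etpai]))
```

Filtering the full field list, rather than concatenating the inputs, keeps the column order independent of the order of the scans.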
|
||||
|
||||
|
||||
def detect_scan_mode(data):
|
||||
"""
|
||||
detect the scan mode and unique scan positions in a data array.
|
||||
|
||||
the function detects which columns of the data array are scanning.
|
||||
if the values of a column are not constant, the column is considered to be scanning.
|
||||
the function does not require a particular ordering of the scan positions
|
||||
(although other parts of the code may do so).
|
||||
the function returns the names of the scanning columns.
|
||||
|
||||
the function also extracts unique positions for each column, and returns one array per column of input data.
|
||||
in the case of a fixed (non-scanning) column, the resulting array contains one data point.
|
||||
if the input data does not contain a particular column, that column is omitted from the result.
|
||||
|
||||
if both theta and phi columns are non-constant, the function reports a theta-phi scan.
|
||||
in a theta-phi scan, each pair (theta, phi) is considered a scan position,
|
||||
and uniqueness is enforced with respect to the (theta, phi) pairs.
|
||||
the individual theta and phi arrays may contain duplicate values.
|
||||
|
||||
@param data ETPI-like structured numpy.ndarray.
|
||||
only the 'e', 't', 'p', and 'a' columns are considered.
|
||||
|
||||
@return the tuple (scan_mode, scan_positions), where
|
||||
@arg scan_mode is a list of column names that refer to the scanned variables,
|
||||
i.e. non-constant columns in the input data.
|
||||
possible values are 'e', 't', 'p', and 'a'.
|
||||
@arg scan_positions is a dictionary of scan dimensions.
|
||||
the dictionary contains one-dimensional numpy arrays, one for each dimension.
|
||||
the dictionary keys are 'e', 't', 'p', and 'a'.
|
||||
if a dimension is not scanned, the corresponding array contains just one element.
|
||||
if the input data does not contain a column at all,
|
||||
the corresponding output array is not included in the dictionary.
|
||||
|
||||
note the special case of theta-phi scans.
|
||||
theta and phi are always returned as two separate arrays
|
||||
"""
|
||||
scan_mode = []
|
||||
|
||||
try:
|
||||
scan_energy = np.unique(data['e'])
|
||||
except ValueError:
|
||||
scan_energy = np.array([])
|
||||
try:
|
||||
scan_theta = np.unique(data['t'])
|
||||
except ValueError:
|
||||
scan_theta = np.array([])
|
||||
try:
|
||||
scan_phi = np.unique(data['p'])
|
||||
except ValueError:
|
||||
scan_phi = np.array([])
|
||||
try:
|
||||
scan_alpha = np.unique(data['a'])
|
||||
except ValueError:
|
||||
scan_alpha = np.array([])
|
||||
|
||||
# theta-phi scan
|
||||
if scan_theta.shape[0] >= 2 and scan_phi.shape[0] >= 2:
|
||||
try:
|
||||
scan_theta_phi = np.unique(data[['t', 'p']])
|
||||
except ValueError:
|
||||
scan_theta_phi = None
|
||||
if scan_theta_phi is not None and len(scan_theta_phi.dtype.names) == 2:
|
||||
scan_theta = scan_theta_phi['t']
|
||||
scan_phi = scan_theta_phi['p']
|
||||
|
||||
scan_positions = {}
|
||||
if scan_energy.shape[0] >= 1:
|
||||
scan_positions['e'] = scan_energy
|
||||
if scan_energy.shape[0] >= 2:
|
||||
scan_mode.append('e')
|
||||
if scan_theta.shape[0] >= 1:
|
||||
scan_positions['t'] = scan_theta
|
||||
if scan_theta.shape[0] >= 2:
|
||||
scan_mode.append('t')
|
||||
if scan_phi.shape[0] >= 1:
|
||||
scan_positions['p'] = scan_phi
|
||||
if scan_phi.shape[0] >= 2:
|
||||
scan_mode.append('p')
|
||||
if scan_alpha.shape[0] >= 1:
|
||||
scan_positions['a'] = scan_alpha
|
||||
if scan_alpha.shape[0] >= 2:
|
||||
scan_mode.append('a')
|
||||
|
||||
return scan_mode, scan_positions
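The core rule of `detect_scan_mode()` (a column is scanning if it holds at least two distinct values) can be sketched in plain Python. This simplified version works on a list of dictionaries and leaves out the special handling of (theta, phi) pairs; `detect_scan_mode_sketch` is a hypothetical name.

```python
def detect_scan_mode_sketch(rows):
    """Return (scan_mode, scan_positions) for a list of row dictionaries."""
    scan_mode = []
    scan_positions = {}
    for name in ("e", "t", "p", "a"):
        if name not in rows[0]:
            continue  # missing columns do not appear in the result
        values = sorted(set(row[name] for row in rows))
        scan_positions[name] = values
        if len(values) >= 2:
            scan_mode.append(name)
    return scan_mode, scan_positions


rows = [
    {"e": 100.0, "t": 60.0, "p": 0.0, "i": 1.0},
    {"e": 110.0, "t": 60.0, "p": 0.0, "i": 1.2},
    {"e": 120.0, "t": 60.0, "p": 0.0, "i": 0.9},
]
mode, positions = detect_scan_mode_sketch(rows)
print(mode, positions)
```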
|
||||
|
||||
|
||||
def filter_tp(data, filter):
|
||||
"""
|
||||
select data points from an ETPI array that match theta and phi coordinates of another ETPI array.
|
||||
|
||||
coordinates are rounded to one decimal place (0.1) before matching.
|
||||
|
||||
@param data ETPI-like structured numpy.ndarray (ETPI, ETPIS, ETPAI, ETPAIS).
|
||||
|
||||
@param filter ETPI-like structured numpy.ndarray (ETPI, ETPIS, ETPAI, ETPAIS).
|
||||
only 't' and 'p' columns are used.
|
||||
|
||||
@return filtered data (numpy.ndarray)
|
||||
copy of selected data rows from input data.
|
||||
same data type as input data.
|
||||
"""
|
||||
# copy theta,phi into separate structured arrays
|
||||
data_tp = np.zeros_like(data, dtype=[('t', '<i4'), ('p', '<i4')])
|
||||
filter_tp = np.zeros_like(filter, dtype=[('t', '<i4'), ('p', '<i4')])
|
||||
# multiply by 10, round to integer
|
||||
data_tp['t'] = np.around(data['t'] * 10.0)
|
||||
data_tp['p'] = np.around(data['p'] * 10.0)
|
||||
filter_tp['t'] = np.around(filter['t'] * 10.0)
|
||||
filter_tp['p'] = np.around(filter['p'] * 10.0)
|
||||
# calculate intersection
|
||||
idx = np.in1d(data_tp, filter_tp)
|
||||
result = data[idx]
|
||||
return result
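The matching scheme in `filter_tp()` (scale by 10, round to integers, intersect) is sketched below with a set of integer pairs instead of `numpy.in1d`. The function and variable names are illustrative assumptions.

```python
def tp_key(row):
    """Integer (theta, phi) key in units of 0.1."""
    return (round(row["t"] * 10.0), round(row["p"] * 10.0))


def filter_tp_sketch(data, selection):
    """Return the rows of data whose rounded (theta, phi) occurs in selection."""
    wanted = {tp_key(row) for row in selection}
    return [row for row in data if tp_key(row) in wanted]


data = [
    {"t": 0.0, "p": 0.0, "i": 1.0},
    {"t": 1.0, "p": 0.0, "i": 2.0},
    {"t": 1.0, "p": 90.0, "i": 3.0},
    {"t": 2.0, "p": 0.0, "i": 4.0},
]
selection = [{"t": 1.02, "p": 0.0}, {"t": 2.0, "p": 0.04}]
matched = filter_tp_sketch(data, selection)
print([row["i"] for row in matched])
```

Small deviations such as 1.02 versus 1.0 still match because both round to the same integer key.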
|
||||
|
||||
def interpolate_hemi_scan(rect_tpi, hemi_tpi):
|
||||
"""
|
||||
interpolate a hemispherical scan from a rectangular angle scan.
|
||||
|
||||
the function interpolates in phi (azimuth) only.
|
||||
the rectangular array must contain a matching scan line for each theta (polar angle) of the hemi scan.
|
||||
this requires that the hemi scan have a linear theta axis.
|
||||
|
||||
@param rect_tpi TPI structured numpy.ndarray.
|
||||
rectangular theta-phi scan.
|
||||
each azimuthal line has the same number of points and range.
|
||||
the azimuthal coordinate is monotonically increasing.
|
||||
@param hemi_tpi TPI structured numpy.ndarray.
|
||||
hemispherical theta-phi scan.
|
||||
each theta of the hemi scan must have a matching scan line in the rectangular scan.
|
||||
the array may contain additional columns (E, A, S) as long as each (theta,phi) pair is unique.
|
||||
the extra columns are not altered.
|
||||
@return hemi_tpi with the interpolation result in the I column.
|
||||
"""
|
||||
lin_theta = np.unique(hemi_tpi['t'])
|
||||
for theta in lin_theta:
|
||||
sel_theta = np.abs(hemi_tpi['t'] - theta) < 0.1
|
||||
lin_phi = hemi_tpi['p'][sel_theta]
|
||||
|
||||
sel_rect_theta = np.abs(rect_tpi['t'] - theta) < 0.1
|
||||
rect_phi_1d = rect_tpi['p'][sel_rect_theta]
|
||||
rect_int_1d = rect_tpi['i'][sel_rect_theta]
|
||||
|
||||
result = np.interp(lin_phi, rect_phi_1d, rect_int_1d)
|
||||
hemi_tpi['i'][sel_theta] = result
|
||||
return hemi_tpi
|
||||
|
||||
def reshape_2d(flat_data, axis_columns, return_column='i'):
|
||||
"""
|
||||
reshape an ETPI-like array into a two-dimensional array according to the scan axes.
|
||||
|
||||
@param flat_data structured, one-dimensional numpy.ndarray with column labels.
|
||||
the array must contain a rectangular scan grid.
|
||||
the array must be sorted in the order of axis_columns.
|
||||
|
||||
@param axis_columns list of column names that designate the axes
|
||||
|
||||
@return the tuple (result_data, axis0, axis1), where
|
||||
@arg result_data (ndarray) new two-dimensional ndarray of the scan
|
||||
@arg axis0 (ndarray) scan positions along the first dimension
|
||||
@arg axis1 (ndarray) scan positions along the second dimension
|
||||
"""
|
||||
|
||||
axis0 = np.unique(flat_data[axis_columns[0]])
|
||||
n0 = len(axis0)
|
||||
axis1 = np.unique(flat_data[axis_columns[1]])
|
||||
n1 = len(axis1)
|
||||
data = np.reshape(flat_data[return_column], (n0, n1), order='C')
|
||||
return data.copy(), axis0, axis1
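A plain-Python sketch of the row-major reshaping done by `reshape_2d()`, assuming a rectangular grid that is already sorted by the two axis columns. Nested lists stand in for the two-dimensional array, and `reshape_2d_sketch` is a hypothetical name.

```python
def reshape_2d_sketch(rows, axis_columns, return_column="i"):
    """Split a flat, sorted scan into a list of lines along the first axis."""
    axis0 = sorted(set(row[axis_columns[0]] for row in rows))
    axis1 = sorted(set(row[axis_columns[1]] for row in rows))
    n1 = len(axis1)
    values = [row[return_column] for row in rows]
    # row-major (C order): the second axis varies fastest
    grid = [values[k * n1:(k + 1) * n1] for k in range(len(axis0))]
    return grid, axis0, axis1


flat = [
    {"e": e, "t": t, "i": i}
    for (e, t), i in zip(
        [(100.0, 0.0), (100.0, 30.0), (100.0, 60.0),
         (110.0, 0.0), (110.0, 30.0), (110.0, 60.0)],
        [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    )
]
grid, energies, thetas = reshape_2d_sketch(flat, ["e", "t"])
print(grid)
```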
|
||||
|
||||
|
||||
def calc_modfunc_mean(data):
|
||||
"""
|
||||
calculates the modulation function using the mean value of data.
|
||||
this is a simplified calculation method
|
||||
which can be used if the I0 of the data does not have a strong variation.
|
||||
|
||||
@param data: ETPI array containing experimental or calculated intensity.
|
||||
|
||||
@return ETPI array containing the modulation function.
|
||||
"""
|
||||
|
||||
scan_mode, scan_positions = detect_scan_mode(data)
|
||||
modf = data.copy()
|
||||
|
||||
if len(scan_mode) == 1:
|
||||
norm = np.mean(data['i'], dtype=np.float64)
|
||||
modf = data.copy()
|
||||
modf['i'] = (data['i'] - norm) / norm
|
||||
elif len(scan_mode) == 2:
|
||||
axis0 = scan_positions[scan_mode[0]]
|
||||
n0 = len(axis0)
|
||||
axis1 = scan_positions[scan_mode[1]]
|
||||
n1 = len(axis1)
|
||||
nd_data = np.reshape(data['i'], (n0, n1), order='C')
|
||||
|
||||
prof0 = np.mean(nd_data, axis=1, dtype=np.float64)
|
||||
norm0 = np.mean(prof0, dtype=np.float64)
|
||||
nd_modf = (nd_data - norm0) / norm0
|
||||
|
||||
modf['i'] = np.ravel(nd_modf, order='C')
|
||||
else:
|
||||
logger.error('unsupported scan in calc_modfunc_mean: {0}'.format(scan_mode))
|
||||
|
||||
return modf
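For a one-dimensional scan, the mean-based modulation function reduces to (I - mean(I)) / mean(I). A minimal sketch on a plain list, with the hypothetical name `modfunc_mean_1d`:

```python
def modfunc_mean_1d(intensity):
    """Return the modulation function of a list of intensities."""
    norm = sum(intensity) / len(intensity)
    return [(value - norm) / norm for value in intensity]


modulation = modfunc_mean_1d([1.0, 2.0, 3.0])
print(modulation)
```

The result oscillates around zero, which is what the R-factor functions further down expect as input.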
|
||||
|
||||
|
||||
def calc_modfunc_loess(data):
|
||||
"""
|
||||
calculate the modulation function using LOESS (locally weighted regression) smoothing.
|
||||
|
||||
the modulation function of I(x) is (I(x) - S(x)) / S(x)
|
||||
where the array S(x) is a LOESS-smoothed copy of I(x).
|
||||
|
||||
this function uses true multi-dimensional LOESS smoothing,
|
||||
in the same way as Igor's Loess operation.
|
||||
|
||||
this function uses the LOESS algorithm implemented by
|
||||
William S. Cleveland, Eric Grosse, Ming-Jen Shyu, dated 18 August 1992.
|
||||
the code and the python interface are included in the loess package.
|
||||
|
||||
@param data structured numpy.ndarray in EI, ETPI, or ETPAI format.
|
||||
can contain a one- or multi-dimensional scan.
|
||||
the algorithm does not require any specific scan mode or order
|
||||
(no rectangular grid, no particular scan hierarchy, no sorting).
|
||||
|
||||
if data contains a hemispherical scan, the phi dimension is ignored,
|
||||
i.e. the function effectively applies a phi-average.
|
||||
|
||||
the modulation function is calculated for the finite-valued scan points.
|
||||
NaNs are ignored and do not affect the finite values.
|
||||
|
||||
@return copy of the data array with the modulation function in the 'i' column.
|
||||
|
||||
@todo is a fixed smoothing factor of 0.5 okay?
|
||||
"""
|
||||
sel = np.isfinite(data['i'])
|
||||
_data = data[sel]
|
||||
|
||||
modf = data.copy()
|
||||
if _data.shape[0]:
|
||||
scan_mode, __ = detect_scan_mode(_data)
|
||||
if 't' in scan_mode and 'p' in scan_mode:
|
||||
scan_mode.remove('p')
|
||||
|
||||
lo = loess.loess_struct(_data.shape[0], len(scan_mode))
|
||||
factors = [_data[axis] for axis in scan_mode]
|
||||
lo.set_x(np.hstack(tuple(factors)))
|
||||
lo.set_y(_data['i'])
|
||||
lo.model.span = 0.5
|
||||
loess.loess(lo)
|
||||
|
||||
modf['i'][sel] = lo.get_fitted_residuals() / lo.get_fitted_values()
|
||||
else:
|
||||
modf['i'] = np.nan
|
||||
|
||||
return modf
|
||||
|
||||
|
||||
def rfactor(experiment, theory):
|
||||
"""
|
||||
calculate the R-factor of a calculated modulation function.
|
||||
|
||||
if the sigma column is present in experiment and non-zero,
|
||||
the R-factor terms are weighted by 1/sigma**2.
|
||||
|
||||
the input arrays must have the same shape and the coordinate columns must be identical (they are ignored).
|
||||
the array elements are compared element-by-element.
|
||||
terms having NaN intensity are ignored.
|
||||
|
||||
@param experiment: ETPI, ETPIS, ETPAI or ETPAIS array containing the experimental modulation function.
|
||||
|
||||
@param theory: ETPI or ETPAI array containing the calculated modulation functions.
|
||||
|
||||
@return scalar R-factor in the range from 0.0 to 2.0.
|
||||
|
||||
@raise ValueError if the function fails (e.g. division by zero or all elements non-finite).
|
||||
"""
|
||||
sel = np.logical_and(np.isfinite(theory['i']), np.isfinite(experiment['i']))
|
||||
theory = theory[sel]
|
||||
experiment = experiment[sel]
|
||||
if ('s' in experiment.dtype.names) and (experiment['s'].min()) > 0.0:
|
||||
wgts = 1.0 / experiment['s'] ** 2
|
||||
else:
|
||||
wgts = 1.0
|
||||
difs = wgts * (experiment['i'] - theory['i']) ** 2
|
||||
sums = wgts * (experiment['i'] ** 2 + theory['i'] ** 2)
|
||||
sum1 = difs.sum(dtype=np.float64)
|
||||
sum2 = sums.sum(dtype=np.float64)
|
||||
return sum1 / sum2
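The R-factor computed above is R = sum(w * (e - t)**2) / sum(w * (e**2 + t**2)), where e and t are the experimental and calculated modulation functions. The sketch below evaluates the same expression on plain lists. The name `rfactor_sketch` and the optional `sigma` argument are assumptions made to keep the example standalone.

```python
import math


def rfactor_sketch(experiment, theory, sigma=None):
    """Weighted R-factor of two modulation functions given as lists."""
    num = 0.0
    den = 0.0
    for k, (e, t) in enumerate(zip(experiment, theory)):
        # terms with a NaN or infinite value are skipped
        if not (math.isfinite(e) and math.isfinite(t)):
            continue
        w = 1.0 / sigma[k] ** 2 if sigma is not None else 1.0
        num += w * (e - t) ** 2
        den += w * (e ** 2 + t ** 2)
    return num / den


exp_mod = [0.1, -0.2, float("nan"), 0.3]
print(rfactor_sketch(exp_mod, exp_mod))                # identical curves
print(rfactor_sketch(exp_mod, [-x for x in exp_mod]))  # inverted curve
```

Identical curves give the lower bound 0, and an exactly inverted curve gives the upper bound 2 (up to floating-point rounding).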
|
||||
|
||||
|
||||
def scaled_rfactor(scale, experiment, weights, theory):
|
||||
"""
|
||||
calculate the R-factor of a modulation function against the measurement with scaled amplitude.
|
||||
|
||||
this function applies a scaling factor to the experimental function and returns the R-factor.
|
||||
this is useful if the amplitudes of the two functions do not match due to systematic effects
|
||||
of the calculation or the measurement.
|
||||
|
||||
this function is used by optimize_rfactor() as a scipy.optimize.least_squares optimization function,
|
||||
which requires a specific signature.
|
||||
|
||||
NaNs will propagate to the final result.
|
||||
math exceptions are not handled.
|
||||
|
||||
@param scale: scaling factor (> 0).
|
||||
the experimental modulation function is multiplied by this parameter.
|
||||
< 1 (> 1) decreases (increases) the experimental amplitude.
|
||||
the R factor is calculated using the scaled modulation function.
|
||||
|
||||
@param experiment: numpy.ndarray containing the experimental modulation function
|
||||
|
||||
@param weights: numpy.ndarray containing the experimental weights
|
||||
|
||||
@param theory: numpy.ndarray containing the theoretical modulation function
|
||||
|
||||
@return: scalar R-factor in the range from 0.0 to 2.0.
|
||||
nan if any element of the function arguments is nan.
|
||||
|
||||
@raise ValueError if all experiment and theory values or all weights are zero.
|
||||
"""
|
||||
difs = weights * (scale * experiment - theory) ** 2
|
||||
sums = weights * (scale ** 2 * experiment ** 2 + theory ** 2)
|
||||
sum1 = difs.sum(dtype=np.float64)
|
||||
sum2 = sums.sum(dtype=np.float64)
|
||||
return sum1 / sum2
|
||||
|
||||
|
||||
def optimize_rfactor(experiment, theory):
|
||||
"""
|
||||
calculate the R-factor of a calculated modulation function against the measurement, adjusting their amplitude.
|
||||
|
||||
if the sigma column is present in experiment and non-zero,
|
||||
the R-factor terms are weighted by 1/sigma**2.
|
||||
|
||||
this function varies the scale of the experimental function and returns the minimum R-factor.
|
||||
this is useful if the amplitudes of the two functions do not match due to systematic effects
|
||||
of the calculation or the measurement.
|
||||
|
||||
the optimization is done in a scipy.optimize.least_squares optimization of the scaled_rfactor() function.
|
||||
the initial guess of the scaling factor is 0.7, the constraining boundaries are 1/10 and 10.
|
||||
|
||||
the input arrays must have the same shape and the coordinate columns must be identical (they are ignored).
|
||||
the array elements are compared element-by-element.
|
||||
terms having NaN intensity are ignored.
|
||||
|
||||
@param experiment: ETPI, ETPIS, ETPAI or ETPAIS array containing the experimental modulation function.
|
||||
|
||||
@param theory: ETPI or ETPAI array containing the calculated modulation functions.
|
||||
|
||||
@return scalar R-factor in the range from 0.0 to 2.0.
|
||||
|
||||
@raise ValueError if the optimization fails (e.g. division by zero or all elements non-finite).
|
||||
"""
|
||||
sel = np.logical_and(np.isfinite(theory['i']), np.isfinite(experiment['i']))
|
||||
theory = theory[sel]
|
||||
experiment = experiment[sel]
|
||||
if ('s' in experiment.dtype.names) and (experiment['s'].min() > 0.0):
|
||||
wgts = 1.0 / experiment['s'] ** 2
|
||||
else:
|
||||
wgts = np.ones_like(experiment['i'])
|
||||
|
||||
result = so.least_squares(scaled_rfactor, 0.7, bounds=(0.1, 10.0), args=(experiment['i'], wgts, theory['i']))
|
||||
result_r = scaled_rfactor(result.x, experiment['i'], wgts, theory['i'])
|
||||
|
||||
return result_r
|
||||
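For intuition about the value the bounded fit should converge to, the scaled R-factor can be rewritten as R(s) = 1 - 2*s*B / (s**2*A + C), with A = sum(w*e**2), B = sum(w*e*t) and C = sum(w*t**2). For B > 0 this has its minimum at s = sqrt(C/A). The sketch below is a cross-check based on that derivation; it is not what optimize_rfactor() does (that function uses scipy.optimize.least_squares). The names `scaled_rfactor_sketch` and `best_scale_sketch` are hypothetical, and plain lists stand in for numpy arrays.

```python
import math


def scaled_rfactor_sketch(scale, exp_i, wgts, theo_i):
    """Same expression as scaled_rfactor(), written for plain lists."""
    sum1 = sum(w * (scale * e - t) ** 2 for e, w, t in zip(exp_i, wgts, theo_i))
    sum2 = sum(w * (scale ** 2 * e ** 2 + t ** 2) for e, w, t in zip(exp_i, wgts, theo_i))
    return sum1 / sum2


def best_scale_sketch(exp_i, wgts, theo_i, bounds=(0.1, 10.0)):
    """Analytic minimiser of the scaled R-factor, restricted to the bounds."""
    a = sum(w * e * e for e, w in zip(exp_i, wgts))
    b = sum(w * e * t for e, w, t in zip(exp_i, wgts, theo_i))
    c = sum(w * t * t for t, w in zip(theo_i, wgts))
    if b > 0.0:
        # interior optimum, clipped to the nearest bound if it lies outside
        scale = min(max(math.sqrt(c / a), bounds[0]), bounds[1])
    else:
        # anti-correlated or orthogonal data: the minimum sits on one of the bounds
        scale = min(bounds, key=lambda s: scaled_rfactor_sketch(s, exp_i, wgts, theo_i))
    return scale, scaled_rfactor_sketch(scale, exp_i, wgts, theo_i)
```

Comparing this value with the result of the least-squares fit is one way to check that the optimizer did not stop early at the initial guess or at a bound.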
|
||||
|
||||
def alpha_average(data):
|
||||
"""
|
||||
average I(alpha, theta, phi) over alpha.
|
||||
|
||||
@param data structured numpy.ndarray in ETPAI or ETPAIS format with a non-singular alpha dimension.
|
||||
|
||||
@return resulting ETPI or ETPIS data array.
|
||||
"""
|
||||
scan_mode, scan_positions = detect_scan_mode(data)
|
||||
result = data.copy()
|
||||
|
||||
if len(scan_mode) == 2 and scan_mode[1] == 'a':
|
||||
axis0 = scan_positions[scan_mode[0]]
|
||||
n0 = len(axis0)
|
||||
axis1 = scan_positions[scan_mode[1]]
|
||||
n1 = len(axis1)
|
||||
nd_data = np.reshape(data, (n0, n1), order='C')
|
||||
|
||||
nd_result = nd_data[:, 0].copy()  # copy so that the caller's array is not overwritten
|
||||
names = list(nd_data.dtype.names)
|
||||
names.remove('a')
|
||||
for name in names:
|
||||
nd_result[name] = np.mean(nd_data[name], axis=1, dtype=np.float64)
|
||||
result = nd_result[names]
|
||||
else:
|
||||
logger.error('unsupported scan in alpha_average: {0}'.format(scan_mode))
|
||||
|
||||
return result
|
||||
|
||||
|
||||
def phi_average(data):
|
||||
"""
|
||||
average I(theta, phi) over phi.
|
||||
|
||||
@param data TPI-like structured numpy.ndarray containing a hemispherical scan.
|
||||
|
||||
@return resulting TI or TIS data array.
|
||||
"""
|
||||
scan_mode, scan_positions = detect_scan_mode(data)
|
||||
result = data.copy()
|
||||
|
||||
if scan_mode == ['t', 'p']:
|
||||
t_axis = np.unique(scan_positions['t'])
|
||||
nt = len(t_axis)
|
||||
|
||||
names = list(data.dtype.names)
|
||||
names.remove('p')
|
||||
dtype = [(name, data.dtype[name].str) for name in names]
|
||||
result = create_data((nt), dtype=dtype)
|
||||
|
||||
for i, t in enumerate(t_axis):
|
||||
sel = np.abs(scan_positions['t'] - t) < 0.01
|
||||
for name in names:
|
||||
result[name][i] = np.mean(data[name][sel], dtype=np.float64)
|
||||
else:
|
||||
logger.error('unsupported scan in phi_average: {0}'.format(scan_mode))
|
||||
|
||||
return result
|
||||
|
||||
|
||||
def alpha_mirror_average(data):
|
||||
"""
|
||||
calculate the average of I(alpha, theta, phi) and I(-alpha, theta, phi).
|
||||
|
||||
@param data structured numpy.ndarray in ETPAI or ETPAIS format.
|
||||
for each (alpha, theta, phi) the array must contain a corresponding (-alpha, theta, phi)
|
||||
within a tolerance of 0.5 degrees in alpha. otherwise, a warning is issued.
|
||||
|
||||
@return resulting data array, same shape as input.
|
||||
the array is sorted.
|
||||
"""
|
||||
|
||||
result1 = data.copy()
|
||||
sort_data(result1)
|
||||
|
||||
result2 = data.copy()
|
||||
try:
|
||||
result2['a'] = -result2['a']
|
||||
sort_data(result2)
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
if np.allclose(result1['a'], result2['a'], atol=0.5):
|
||||
result1['i'] = (result1['i'] + result2['i']) / 2.0
|
||||
try:
|
||||
result1['s'] = np.sqrt(result1['s'] ** 2 + result2['s'] ** 2) / 2.0
|
||||
except ValueError:
|
||||
pass
|
||||
else:
|
||||
logger.warning('asymmetric alpha scan. skipping alpha mirror average.')
|
||||
|
||||
return result1
|
||||
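The mirror average and its error propagation can be sketched without numpy as follows. This is an illustration under simplifying assumptions: each point is an (alpha, intensity, sigma) tuple with distinct alpha values, and an asymmetric scan raises an exception, whereas the function above logs a warning and returns the sorted input. The name `mirror_average_sketch` is hypothetical.

```python
import math


def mirror_average_sketch(points, tol=0.5):
    """Average I(alpha) with I(-alpha); sigma becomes sqrt(s1**2 + s2**2) / 2."""
    forward = sorted(points)
    mirrored = sorted((-a, i, s) for a, i, s in points)
    result = []
    for (a1, i1, s1), (a2, i2, s2) in zip(forward, mirrored):
        if abs(a1 - a2) > tol:
            raise ValueError("asymmetric alpha scan")
        # mean of two independent measurements: the uncertainties add in quadrature
        result.append((a1, (i1 + i2) / 2.0, math.sqrt(s1 ** 2 + s2 ** 2) / 2.0))
    return result
```

For two points with equal sigma, the combined sigma is smaller by a factor of sqrt(2), as expected for the mean of two independent values.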
@@ -0,0 +1,972 @@
|
||||
"""
|
||||
@package pmsco.dispatch
|
||||
calculation dispatcher.
|
||||
|
||||
@author Matthias Muntwiler
|
||||
|
||||
@copyright (c) 2015 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
from __future__ import division
|
||||
import os
|
||||
import os.path
|
||||
import datetime
|
||||
import signal
|
||||
import collections
|
||||
import copy
|
||||
import logging
|
||||
from mpi4py import MPI
|
||||
from helpers import BraceMessage as BMsg
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# messages sent from master to slaves
|
||||
|
||||
## master sends new assignment
|
||||
## the message is a dictionary of model parameters
|
||||
TAG_NEW_TASK = 1
|
||||
## master calls end of calculation
|
||||
## the message is empty
|
||||
TAG_FINISH = 2
|
||||
|
||||
# messages sent from slaves to master
|
||||
|
||||
## slave reports new result
|
||||
## the message is a dictionary of model parameters and results
|
||||
TAG_NEW_RESULT = 1
|
||||
## slave confirms end of calculation
|
||||
## currently not used
|
||||
TAG_FINISHED = 2
|
||||
## slave has encountered an error, result is invalid
|
||||
## the message contains the original task message
|
||||
TAG_INVALID_RESULT = 3
|
||||
## slave has encountered an error and is aborting
|
||||
## the message is empty
|
||||
TAG_ERROR_ABORTING = 4
|
||||
|
||||
CalcID = collections.namedtuple('CalcID', ['model', 'scan', 'sym', 'emit', 'region'])
|
||||
|
||||
|
||||
class CalculationTask(object):
|
||||
"""
|
||||
identifies a calculation task by index and model parameters.
|
||||
|
||||
given an object of this class, the project must be able to:
|
||||
* produce calculation parameters,
|
||||
* produce a cluster,
|
||||
* gather results.
|
||||
|
||||
a calculation task is identified by:
|
||||
|
||||
@arg @c id.model structure number or iteration (handled by the mode module)
|
||||
@arg @c id.scan scan number (handled by the project)
|
||||
@arg @c id.sym symmetry number (handled by the project)
|
||||
@arg @c id.emit emitter number (handled by the project)
|
||||
@arg @c id.region region number (handled by the region handler)
|
||||
|
||||
specified members must be greater than or equal to zero.
|
||||
-1 is the wildcard which is used in parent tasks,
|
||||
where, e.g., no specific symmetry is chosen.
|
||||
the root task has the ID (-1, -1, -1, -1, -1).
|
||||
"""
|
||||
|
||||
## @var id (CalcID)
|
||||
# named tuple CalcID containing the 5-part calculation task identifier.
|
||||
|
||||
## @var parent_id (CalcID)
|
||||
# named tuple CalcID containing the task identifier of the parent task.
|
||||
|
||||
## @var model (dict)
|
||||
# dictionary containing the model parameters of the task.
|
||||
#
|
||||
# this is typically initialized to the parameters of the parent task,
|
||||
# and varied at the level where the task ID was produced.
|
||||
|
||||
## @var file_root (string)
|
||||
# file name without extension and index.
|
||||
|
||||
## @var file_ext (string)
|
||||
# file name extension including dot.
|
||||
#
|
||||
# the extension is set by the scattering code interface.
|
||||
# it must be passed back up the hierarchy.
|
||||
|
||||
## @var result_filename (string)
|
||||
# name of the ETPI or ETPAI file that contains the result (intensity) data.
|
||||
#
|
||||
# this member is filled at the end of the calculation by MscoProcess.calc().
|
||||
# the filename can be constructed given the base name, task ID, and extension.
|
||||
# since this may be tedious, the filename must be returned here.
|
||||
|
||||
## @var modf_filename (string)
|
||||
# name of the ETPI or ETPAI file that contains the resulting modulation function.
|
||||
|
||||
## @var time (timedelta)
|
||||
# execution time of the task.
|
||||
#
|
||||
# execution time is measured as wall time of a single calculation.
|
||||
# in parent tasks, execution time is the sum of the children's execution time.
|
||||
#
|
||||
# this information may be used to plan the end of the program run or for statistics.
|
||||
|
||||
## @var files (dict)
|
||||
# files generated by the task and their category
|
||||
#
|
||||
# dictionary key is the file name,
|
||||
# value is the file category, e.g. 'cluster', 'phase', etc.
|
||||
#
|
||||
# this information is used to automatically clean up unnecessary data files.
|
||||
|
||||
## @var region (dict)
|
||||
# scan positions to substitute the ones from the original scan.
|
||||
#
|
||||
# this is used to distribute scans over multiple calculator processes,
|
||||
# cf. e.g. @ref EnergyRegionHandler.
|
||||
#
|
||||
# dictionary key must be the scan dimension 'e', 't', 'p', 'a'.
|
||||
# the value is a numpy.ndarray containing the scan positions.
|
||||
#
|
||||
# the dictionary can be empty if the original scan shall be calculated at once.
|
||||
|
||||
def __init__(self):
|
||||
"""
|
||||
create a new calculation task instance with all ID members set to the wildcard -1 (root task).
|
||||
"""
|
||||
self.id = CalcID(-1, -1, -1, -1, -1)
|
||||
self.parent_id = self.id
|
||||
self.model = {}
|
||||
self.file_root = ""
|
||||
self.file_ext = ""
|
||||
self.result_filename = ""
|
||||
self.modf_filename = ""
|
||||
self.result_valid = False
|
||||
self.time = datetime.timedelta()
|
||||
self.files = {}
|
||||
self.region = {}
|
||||
|
||||
def __eq__(self, other):
|
||||
"""
|
||||
consider two tasks equal if they have the same ID.
|
||||
|
||||
EXPERIMENTAL
|
||||
not clear whether this is a good idea.
|
||||
we want this equality because the calculation may modify a task to return results.
|
||||
yet, it should be considered the same task.
|
||||
e.g., we want to find the task in the original task list.
|
||||
"""
|
||||
return isinstance(other, self.__class__) and self.id == other.id
|
||||
|
||||
def __hash__(self):
|
||||
"""
|
||||
the hash depends on the ID only.
|
||||
"""
|
||||
return hash(self.id)
|
||||
|
||||
def get_mpi_message(self):
|
||||
"""
|
||||
convert the task data to a format suitable for an MPI message.
|
||||
|
||||
mpi4py does not properly pickle objects.
|
||||
we need to convert our data to basic types.
|
||||
|
||||
@return: (dict)
|
||||
"""
|
||||
msg = vars(self).copy()  # shallow copy: vars(self) is the instance dictionary itself
|
||||
msg['id'] = self.id._asdict()
|
||||
msg['parent_id'] = self.parent_id._asdict()
|
||||
return msg
|
||||
|
||||
def set_mpi_message(self, msg):
|
||||
"""
|
||||
set object attributes from MPI message.
|
||||
|
||||
@param msg: message created by get_mpi_message()
|
||||
|
||||
@return: None
|
||||
"""
|
||||
if isinstance(msg['id'], dict):
|
||||
msg['id'] = CalcID(**msg['id'])
|
||||
if isinstance(msg['parent_id'], dict):
|
||||
msg['parent_id'] = CalcID(**msg['parent_id'])
|
||||
for k, v in msg.items():
|
||||
self.__setattr__(k, v)
|
||||
|
||||
def format_filename(self, **overrides):
|
||||
"""
|
||||
format input or output file name including calculation index.
|
||||
|
||||
@param overrides optional keyword arguments override object fields.
|
||||
the following keywords are handled: @c root, @c model, @c scan, @c sym, @c emit, @c region, @c ext.
|
||||
|
||||
@return a string consisting of the concatenation of the base name, the ID, and the extension.
|
||||
"""
|
||||
parts = {}
|
||||
parts['root'] = self.file_root
|
||||
parts['model'] = self.id.model
|
||||
parts['scan'] = self.id.scan
|
||||
parts['sym'] = self.id.sym
|
||||
parts['emit'] = self.id.emit
|
||||
parts['region'] = self.id.region
|
||||
parts['ext'] = self.file_ext
|
||||
|
||||
for key in overrides.keys():
|
||||
parts[key] = overrides[key]
|
||||
|
||||
filename = "{root}_{model}_{scan}_{sym}_{emit}_{region}{ext}".format(**parts)
|
||||
return filename
|
||||
|
||||
def copy(self):
|
||||
"""
|
||||
create a copy of the task.
|
||||
|
||||
@return: new independent CalculationTask with the same attributes as the original one.
|
||||
"""
|
||||
return copy.deepcopy(self)
|
||||
|
||||
def change_id(self, **kwargs):
|
||||
"""
|
||||
change the ID of the task.
|
||||
|
||||
@param kwargs: keyword arguments to change specific parts of the ID.
|
||||
|
||||
@note instead of changing all parts of the ID, you may simply assign a new CalcID to the id member.
|
||||
"""
|
||||
self.id = self.id._replace(**kwargs)
|
||||
|
||||
def add_task_file(self, name, category):
|
||||
"""
|
||||
register a file that was generated by the calculation task.
|
||||
|
||||
this information is used to automatically clean up unnecessary data files.
|
||||
|
||||
@param name: file name (optionally including a path).
|
||||
@param category: file category, e.g. 'cluster', 'phase', etc.
|
||||
@return: None
|
||||
"""
|
||||
self.files[name] = category
|
||||
|
||||
def rename_task_file(self, old_filename, new_filename):
|
||||
"""
|
||||
rename a file.
|
||||
|
||||
update the file list after a file was renamed.
|
||||
if old_filename is not listed, the method logs a warning and leaves the list unchanged.
|
||||
|
||||
@param old_filename: old file name
|
||||
@param new_filename: new file name
|
||||
@return: None
|
||||
"""
|
||||
try:
|
||||
self.files[new_filename] = self.files[old_filename]
|
||||
del self.files[old_filename]
|
||||
except KeyError:
|
||||
logger.warning("CalculationTask.rename_task_file: could not rename file {0} to {1}".format(old_filename,
|
||||
new_filename))
|
||||
|
||||
def remove_task_file(self, filename):
|
||||
"""
|
||||
remove a file from the list of generated data files.
|
||||
|
||||
if filename is not listed, the method logs a warning and leaves the list unchanged.
|
||||
the method removes the file from the internal list.
|
||||
it does not delete the file.
|
||||
|
||||
@param filename: file name
|
||||
@return: None
|
||||
"""
|
||||
try:
|
||||
del self.files[filename]
|
||||
except KeyError:
|
||||
logger.warning("CalculationTask.remove_task_file: could not remove file {0}".format(filename))
|
||||
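A short sketch of how the five-part ID behaves, using only the standard library. It mirrors the wildcard convention, the `_replace()` call behind change_id(), the dictionary form used for MPI messages, and the file name pattern of format_filename(). The function name `format_filename_sketch`, the root name "demo" and the extension ".etpi" are illustrative assumptions.

```python
import collections

CalcID = collections.namedtuple('CalcID', ['model', 'scan', 'sym', 'emit', 'region'])

root_id = CalcID(-1, -1, -1, -1, -1)      # every level is a wildcard
model_id = root_id._replace(model=7)      # what change_id(model=7) does to task.id
scan_id = model_id._replace(scan=0)

# the dictionary form survives a round trip, as in get/set_mpi_message()
assert CalcID(**scan_id._asdict()) == scan_id


def format_filename_sketch(root, calc_id, ext):
    """Same pattern as CalculationTask.format_filename(), without overrides."""
    return "{root}_{model}_{scan}_{sym}_{emit}_{region}{ext}".format(
        root=root, ext=ext, **calc_id._asdict())


print(format_filename_sketch("demo", scan_id, ".etpi"))
```

Because a namedtuple is immutable and hashable, the ID can serve directly as a dictionary key, which is how the dispatcher indexes its task queues.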
|
||||
|
||||
class MscoProcess(object):
|
||||
"""
|
||||
code shared by MscoMaster and MscoSlave.
|
||||
|
||||
mainly passing project parameters, handling OS signals,
|
||||
calling an MSC calculation.
|
||||
"""
|
||||
|
||||
## @var _finishing
|
||||
# if True, the task loop should not accept new tasks.
|
||||
#
|
||||
# the loop still waits for the results of running calculations.
|
||||
|
||||
## @var _running
|
||||
# while True, the task loop keeps running.
|
||||
#
|
||||
# if False, the loop will exit just before the next iteration.
|
||||
# pending tasks and running calculations will not be waited for.
|
||||
#
|
||||
# @attention make sure that all calculations are finished before resetting this flag.
|
||||
# higher ranked processes may not exit if they do not receive the finish message.
|
||||
|
||||
## @var datetime_limit (datetime.datetime)
|
||||
# date and time when the calculations should finish (regardless of result)
|
||||
# because the process may get killed by the scheduler after this time.
|
||||
#
|
||||
# the default is 2 days after start.
|
||||
|
||||
def __init__(self, comm):
|
||||
self._comm = comm
|
||||
self._project = None
|
||||
self._calculator = None
|
||||
self._running = False
|
||||
self._finishing = False
|
||||
self.stop_signal = False
|
||||
self.datetime_limit = datetime.datetime.now() + datetime.timedelta(days=2)
|
||||
|
||||
def setup(self, project):
|
||||
self._project = project
|
||||
self._calculator = project.calculator_class()
|
||||
self._running = False
|
||||
self._finishing = False
|
||||
self.stop_signal = False
|
||||
|
||||
try:
|
||||
# signal handlers
|
||||
signal.signal(signal.SIGTERM, self.receive_signal)
|
||||
signal.signal(signal.SIGUSR1, self.receive_signal)
|
||||
signal.signal(signal.SIGUSR2, self.receive_signal)
|
||||
except AttributeError:
|
||||
pass
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
if project.timedelta_limit:
|
||||
self.datetime_limit = datetime.datetime.now() + project.timedelta_limit
|
||||
|
||||
# noinspection PyUnusedLocal
|
||||
def receive_signal(self, signum, stack):
|
||||
"""
|
||||
sets the self.stop_signal flag,
|
||||
which will terminate the optimization process
|
||||
as soon as all slaves have finished their calculation.
|
||||
"""
|
||||
self.stop_signal = True
|
||||
|
||||
def run(self):
|
||||
pass
|
||||
|
||||
def cleanup(self):
|
||||
"""
|
||||
clean up after all calculations.
|
||||
|
||||
this method calls the clean up function of the project.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
self._project.cleanup()
|
||||
|
||||
def calc(self, task):
|
||||
"""
|
||||
execute a single calculation.
|
||||
|
||||
* create the cluster and parameter objects.
|
||||
* export the cluster for reference.
|
||||
* choose the scan file.
|
||||
* specify the output file name.
|
||||
* call the calculation program.
|
||||
* set task.result_filename, task.file_ext, task.time.
|
||||
|
||||
the function checks for some obvious errors, and skips the calculation if an error is detected, such as:
|
||||
|
||||
* missing atoms or emitters in the cluster.
|
||||
|
||||
@param task (CalculationTask) calculation task and identifier.
|
||||
"""
|
||||
|
||||
s_model = str(task.model)
|
||||
s_id = str(task.id)
|
||||
logger.info("calling calculation %s", s_id)
|
||||
logger.info("model %s", s_model)
|
||||
start_time = datetime.datetime.now()
|
||||
|
||||
# create parameter and cluster structures
|
||||
clu = self._project.cluster_generator.create_cluster(task.model, task.id)
|
||||
par = self._project.create_params(task.model, task.id)
|
||||
|
||||
# generate file names
|
||||
output_file = task.format_filename(ext="")
|
||||
|
||||
# determine scan range
|
||||
scan = self._project.scans[task.id.scan]
|
||||
if task.region:
|
||||
scan = scan.copy()
|
||||
try:
|
||||
scan.energies = task.region['e']
|
||||
logger.debug(BMsg("substitute energy region"))
|
||||
except KeyError:
|
||||
pass
|
||||
try:
|
||||
scan.thetas = task.region['t']
|
||||
logger.debug(BMsg("substitute theta region"))
|
||||
except KeyError:
|
||||
pass
|
||||
try:
|
||||
scan.phis = task.region['p']
|
||||
logger.debug(BMsg("substitute phi region"))
|
||||
except KeyError:
|
||||
pass
|
||||
try:
|
||||
scan.alphas = task.region['a']
|
||||
logger.debug(BMsg("substitute alpha region"))
|
||||
except KeyError:
|
||||
pass
|
||||
|
||||
# check parameters and call the msc program
|
||||
if clu.get_atom_count() < 2:
|
||||
logger.error("empty cluster in calculation %s", s_id)
|
||||
task.result_valid = False
|
||||
elif clu.get_emitter_count() < 1:
|
||||
logger.error("no emitters in cluster of calculation %s.", s_id)
|
||||
task.result_valid = False
|
||||
else:
|
||||
files = self._calculator.check_cluster(clu, output_file)
|
||||
task.files.update(files)
|
||||
|
||||
task.result_filename, files = self._calculator.run(par, clu, scan, output_file)
|
||||
(root, ext) = os.path.splitext(task.result_filename)
|
||||
task.file_ext = ext
|
||||
task.result_valid = True
|
||||
task.files.update(files)
|
||||
|
||||
task.time = datetime.datetime.now() - start_time
|
||||
|
||||
return task
|
||||
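The region substitution in calc() repeats the same try/except pattern for each scan dimension. One possible compact formulation is a lookup table from region key to scan attribute, sketched below. This is an alternative under assumptions, not the code of this module: `apply_region_sketch` and `REGION_ATTRS` are hypothetical names, and a `SimpleNamespace` stands in for the project's scan object.

```python
from types import SimpleNamespace

# region key -> name of the scan attribute it replaces (as in MscoProcess.calc)
REGION_ATTRS = {'e': 'energies', 't': 'thetas', 'p': 'phis', 'a': 'alphas'}


def apply_region_sketch(scan, region):
    """Return scan unchanged if region is empty, else a copy with substituted positions."""
    if not region:
        return scan
    new_scan = SimpleNamespace(**vars(scan))   # shallow copy; calc() calls scan.copy()
    for key, attr in REGION_ATTRS.items():
        if key in region:
            setattr(new_scan, attr, region[key])
    return new_scan


scan = SimpleNamespace(energies=[100.0, 101.0, 102.0], thetas=[0.0], phis=[0.0], alphas=[0.0])
part = apply_region_sketch(scan, {'e': [101.0]})
```

The original scan stays untouched, so several region tasks can be derived from the same scan object.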
|
||||
|
||||
class MscoMaster(MscoProcess):
|
||||
"""
|
||||
MscoMaster process for MSC calculations.
|
||||
|
||||
This class implements the main loop of the master (rank 0) process.
|
||||
It sends calculation commands to the slaves, and dispatches the results
|
||||
to the appropriate post-processing modules.
|
||||
|
||||
if there is only one process, the MscoMaster executes the calculations sequentially.
|
||||
"""
|
||||
|
||||
## @var _pending_tasks (OrderedDict)
|
||||
# CalculationTask objects of pending calculations.
|
||||
# the dictionary keys are the task IDs.
|
||||
|
||||
## @var _running_tasks
|
||||
# CalculationTask objects of currently running calculations.
|
||||
# the dictionary keys are the task IDs.
|
||||
|
||||
## @var _complete_tasks
|
||||
# CalculationTask objects of complete calculations.
|
||||
#
|
||||
# calculations are removed from the list when they are passed to the result handlers.
|
||||
# the dictionary keys are the task IDs.
|
||||
|
||||
## @var _slaves
|
||||
# total number of MPI slave ranks = number of calculator slots
|
||||
|
||||
## @var _idle_ranks
|
||||
# list of ranks which are waiting to receive a task.
|
||||
#
|
||||
# list of int, default = []
|
||||
|
||||
## @var max_calculations
|
||||
# maximum number of calculations
|
||||
#
|
||||
# if this limit is exceeded, the optimization will stop.
|
||||
# the limit is meant to catch irregular situations such as run-time calculation errors or infinite loops.
|
||||
|
||||
## @var _calculations
|
||||
# number of dispatched calculations
|
||||
#
|
||||
# if this number exceeds the @ref max_calculations, the optimization will stop.
|
||||
|
||||
## @var _running_slaves
|
||||
# number of running slave ranks
|
||||
#
|
||||
# keeps track of active (idle or busy) slave ranks.
|
||||
# it is used to make sure (if possible) that all slave tasks have finished before the master quits.
|
||||
# the number is decremented when a slave quits due to an error or when the master sends a finish message.
|
||||
|
||||
## @var _min_queue_len
|
||||
# if the queue length drops below this number, the dispatcher asks for the next round of tasks.
|
||||
|
||||
## @var _model_done
|
||||
# (bool) True if the model handler returned an empty list of new tasks.
|
||||
|
||||
## @var _root_task
|
||||
# (CalculationTask) root calculation task
|
||||
#
|
||||
# this is the root of the calculation tasks tree.
|
||||
# it defines the initial model and the output file name.
|
||||
# it is passed to the model handler during the main loop.
|
||||
|
||||
## @var _model_handler
|
||||
# (ModelHandler) model handler instance
|
||||
|
||||
## @var _scan_handler
|
||||
# (ScanHandler) scan handler instance
|
||||
|
||||
## @var _symmetry_handler
|
||||
# (SymmetryHandler) symmetry handler instance
|
||||
|
||||
## @var _emitter_handler
|
||||
# (EmitterHandler) emitter handler instance
|
||||
|
||||
## @var _region_handler
|
||||
# (RegionHandler) region handler instance
|
||||
|
||||
def __init__(self, comm):
|
||||
super(MscoMaster, self).__init__(comm)
|
||||
self._pending_tasks = collections.OrderedDict()
|
||||
self._running_tasks = collections.OrderedDict()
|
||||
self._complete_tasks = collections.OrderedDict()
|
||||
self._slaves = self._comm.Get_size() - 1
|
||||
self._idle_ranks = []
|
||||
self.max_calculations = 1000000
|
||||
self._calculations = 0
|
||||
self._running_slaves = 0
|
||||
self._model_done = False
|
||||
self._min_queue_len = self._slaves + 1
|
||||
|
||||
self._root_task = None
|
||||
self._model_handler = None
|
||||
self._scan_handler = None
|
||||
self._symmetry_handler = None
|
||||
self._emitter_handler = None
|
||||
self._region_handler = None
|
||||
|
||||
def setup(self, project):
|
||||
"""
|
||||
initialize the process, handlers, root task, slave counting.
|
||||
|
||||
this method initializes the run-time attributes of the master process,
|
||||
particularly the attributes that depend on the project.
|
||||
|
||||
it creates the root calculation task with the initial model defined by the project.
|
||||
|
||||
it creates and initializes the task handler objects according to the handler classes defined by the project.
|
||||
|
||||
the method notifies the handlers of the number of available slave processes (slots).
|
||||
some of the tasks handlers adjust their branching according to the number of slots.
|
||||
this mechanism may be used to balance the load between the task levels.
|
||||
however, the current implementation is very coarse in this respect.
|
||||
it advertises all slots to the model handler but a reduced number to the remaining handlers
|
||||
depending on the operation mode.
|
||||
the region handler receives a maximum of 4 slots except in single calculation mode.
|
||||
in single calculation mode, all slots can be used by all handlers.
|
||||
"""
|
||||
super(MscoMaster, self).setup(project)
|
||||
|
||||
logger.debug("master entering setup")
|
||||
self._running_slaves = self._slaves
|
||||
self._idle_ranks = list(range(1, self._running_slaves + 1))
|
||||
|
||||
self._root_task = CalculationTask()
|
||||
self._root_task.file_root = project.output_file
|
||||
self._root_task.model = project.create_domain().start
|
||||
|
||||
self._model_handler = project.handler_classes['model']()
|
||||
self._scan_handler = project.handler_classes['scan']()
|
||||
self._symmetry_handler = project.handler_classes['symmetry']()
|
||||
self._emitter_handler = project.handler_classes['emitter']()
|
||||
self._region_handler = project.handler_classes['region']()
|
||||
|
||||
self._model_handler.datetime_limit = self.datetime_limit
|
||||
|
||||
slaves_adj = max(self._slaves, 1)
|
||||
self._model_handler.setup(project, slaves_adj)
|
||||
if project.mode != "single":
|
||||
slaves_adj = max(slaves_adj // 2, 1)
|
||||
self._scan_handler.setup(project, slaves_adj)
|
||||
self._symmetry_handler.setup(project, slaves_adj)
|
||||
self._emitter_handler.setup(project, slaves_adj)
|
||||
if project.mode != "single":
|
||||
slaves_adj = min(slaves_adj, 4)
|
||||
self._region_handler.setup(project, slaves_adj)
|
||||
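The slot allocation described in the docstring of setup() can be summarised in a small function. This is a sketch for illustration: `slot_plan_sketch` is a hypothetical name, integer division is used here so that the counts stay whole numbers, and "swarm" merely serves as a placeholder for any mode other than "single".

```python
def slot_plan_sketch(n_slaves, mode):
    """Return the number of slots advertised to each handler level."""
    slots = max(n_slaves, 1)                 # the master counts as one slot if it is alone
    plan = {'model': slots}
    if mode != "single":
        slots = max(slots // 2, 1)           # lower levels see half of the slots
    plan['scan'] = plan['symmetry'] = plan['emitter'] = slots
    if mode != "single":
        slots = min(slots, 4)                # the region handler gets at most 4 slots
    plan['region'] = slots
    return plan


print(slot_plan_sketch(16, "swarm"))
print(slot_plan_sketch(16, "single"))
```

With 16 slave ranks, an optimization mode advertises 16, 8 and 4 slots to the model, intermediate and region levels, while single calculation mode advertises 16 slots at every level.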
|
||||
def run(self):
|
||||
"""
|
||||
main loop.
|
||||
|
||||
calls slaves, accepts and dispatches results.
|
||||
|
||||
setup() must be called before, cleanup() after.
|
||||
"""
|
||||
self._running = True
|
||||
self._calculations = 0
|
||||
|
||||
logger.debug("master entering main loop")
|
||||
# main task loop
|
||||
while self._running:
|
||||
logger.debug("new iteration of master main loop")
|
||||
self._create_tasks()
|
||||
self._dispatch_results()
|
||||
if self._finishing:
|
||||
self._dispatch_finish()
|
||||
else:
|
||||
self._dispatch_tasks()
|
||||
self._receive_result()
|
||||
self._check_finish()
|
||||
|
||||
logger.debug("master exiting main loop")
|
||||
self._running = False
|
||||
|
||||
def cleanup(self):
|
||||
logger.debug("master entering cleanup")
|
||||
self._region_handler.cleanup()
|
||||
self._emitter_handler.cleanup()
|
||||
self._symmetry_handler.cleanup()
|
||||
self._scan_handler.cleanup()
|
||||
self._model_handler.cleanup()
|
||||
super(MscoMaster, self).cleanup()
|
||||
|
||||
def _dispatch_results(self):
|
||||
"""
|
||||
pass results through the post-processing modules.
|
||||
"""
|
||||
logger.debug("dispatching results of %u tasks", len(self._complete_tasks))
|
||||
while self._complete_tasks:
|
||||
__, task = self._complete_tasks.popitem(last=False)
|
||||
|
||||
logger.debug("passing task %s to region handler", str(task.id))
|
||||
task = self._region_handler.add_result(task)
|
||||
|
||||
if task:
|
||||
logger.debug("passing task %s to emitter handler", str(task.id))
|
||||
task = self._emitter_handler.add_result(task)
|
||||
|
||||
if task:
|
||||
logger.debug("passing task %s to symmetry handler", str(task.id))
|
||||
task = self._symmetry_handler.add_result(task)
|
||||
|
||||
if task:
|
||||
logger.debug("passing task %s to scan handler", str(task.id))
|
||||
task = self._scan_handler.add_result(task)
|
||||
|
||||
if task:
|
||||
logger.debug("passing task %s to model handler", str(task.id))
|
||||
task = self._model_handler.add_result(task)
|
||||
|
||||
if task:
|
||||
logger.debug("root task %s complete", str(task.id))
|
||||
self._finishing = True
|
||||
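The nested if statements in _dispatch_results() implement a chain: each handler either absorbs the result (returns a false value) or hands a combined parent task to the next level. A compact way to express the same idea is a loop over the handlers, sketched below with a stand-in handler class. `CountingHandler` and `dispatch_result_sketch` are hypothetical; the real handlers combine the data of child tasks rather than merely counting them.

```python
class CountingHandler(object):
    """Stand-in handler: passes a task upward once `needed` results have arrived."""

    def __init__(self, name, needed):
        self.name = name
        self.needed = needed
        self.received = 0

    def add_result(self, task):
        self.received += 1
        if self.received < self.needed:
            return None          # still waiting for sibling results
        self.received = 0
        return task              # a real handler returns the combined parent task


def dispatch_result_sketch(handlers, task):
    """Pass task through the handlers in order; stop as soon as one absorbs it."""
    for handler in handlers:
        task = handler.add_result(task)
        if not task:
            return None
    return task


# order as in _dispatch_results(): region, emitter, symmetry, scan, model
chain = [CountingHandler('region', 2), CountingHandler('emitter', 1)]
first = dispatch_result_sketch(chain, "task-1")
second = dispatch_result_sketch(chain, "task-2")
```

Here the first result is held back by the region level, and the second one completes the pair and travels on to the end of the chain.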
|
||||
def _create_tasks(self):
|
||||
"""
|
||||
have the model handler generate the next round of top-level calculation tasks.
|
||||
|
||||
the method calls the model handler repeatedly
|
||||
until the pending tasks queue is filled up
|
||||
to more than the minimum queue length.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
logger.debug("creating new tasks from root")
|
||||
while len(self._pending_tasks) < self._min_queue_len:
|
||||
tasks = self._model_handler.create_tasks(self._root_task)
|
||||
logger.debug("model handler returned %u new tasks", len(tasks))
|
||||
if not tasks:
|
||||
self._model_done = True
|
||||
break
|
||||
for task in tasks:
|
||||
self.add_model_task(task)
|
||||
|
||||
def _dispatch_tasks(self):
|
||||
"""
|
||||
send pending tasks to available slaves or master.
|
||||
|
||||
if there is only one process, the master executes one task, and returns.
|
||||
"""
|
||||
logger.debug("dispatching tasks to calculators")
|
||||
if self._slaves > 0:
|
||||
while not self._finishing:
|
||||
try:
|
||||
rank = self._idle_ranks.pop(0)
|
||||
except IndexError:
|
||||
break
|
||||
|
||||
try:
|
||||
__, task = self._pending_tasks.popitem(last=False)
|
||||
except KeyError:
|
||||
self._idle_ranks.append(rank)
|
||||
break
|
||||
else:
|
||||
logger.debug("assigning task %s to rank %u", str(task.id), rank)
|
||||
self._running_tasks[task.id] = task
|
||||
self._comm.send(task.get_mpi_message(), dest=rank, tag=TAG_NEW_TASK)
|
||||
self._calculations += 1
|
||||
else:
|
||||
if not self._finishing:
|
||||
try:
|
||||
__, task = self._pending_tasks.popitem(last=False)
|
||||
except KeyError:
|
||||
pass
|
||||
else:
|
||||
logger.debug("executing task %s in master process", str(task.id))
|
||||
self.calc(task)
|
||||
self._calculations += 1
|
||||
self._complete_tasks[task.id] = task
|
||||
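The pairing of pending tasks with idle ranks relies on `OrderedDict.popitem(last=False)`, which removes the oldest entry first. The following sketch isolates that queue logic without MPI. `assign_tasks_sketch` is a hypothetical name; unlike the method above, it checks both containers before popping, which gives the same assignments.

```python
import collections


def assign_tasks_sketch(pending, idle_ranks):
    """Pair the oldest pending tasks with idle ranks; return a list of (rank, task_id)."""
    assignments = []
    while idle_ranks and pending:
        rank = idle_ranks.pop(0)
        task_id, _task = pending.popitem(last=False)   # first in, first out
        assignments.append((rank, task_id))
    return assignments


pending = collections.OrderedDict([('a', 1), ('b', 2), ('c', 3)])
idle = [1, 2]
first_round = assign_tasks_sketch(pending, idle)
```

After the first round, one task remains queued and no rank is idle; it is assigned as soon as a rank reports a result and is appended to the idle list again.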
|
||||
def _dispatch_finish(self):
|
||||
"""
|
||||
send all slave ranks a finish message.
|
||||
"""
|
||||
logger.debug("dispatch finish message to %u slaves", len(self._idle_ranks))
|
||||
while self._idle_ranks:
|
||||
rank = self._idle_ranks.pop()
|
||||
logger.debug("send finish tag to rank %u", rank)
|
||||
self._comm.send(None, dest=rank, tag=TAG_FINISH)
|
||||
self._running_slaves -= 1
|
||||
|
||||
def _receive_result(self):
|
||||
"""
|
||||
wait for a message from another rank and process it.
|
||||
"""
|
||||
if self._running_slaves > 0:
|
||||
logger.debug("waiting for calculation result")
|
||||
s = MPI.Status()
|
||||
data = self._comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=s)
|
||||
|
||||
if s.tag == TAG_NEW_RESULT:
|
||||
task_id = self._accept_task_done(data)
|
||||
self._idle_ranks.append(s.source)
|
||||
logger.debug(BMsg("received result of task {0} from rank {1}", task_id, s.source))
|
||||
elif s.tag == TAG_INVALID_RESULT:
|
||||
task_id = self._accept_task_done(data)
|
||||
self._idle_ranks.append(s.source)
|
||||
logger.error(BMsg("received invalid result of task {0} from rank {1}", task_id, s.source))
|
||||
elif s.tag == TAG_ERROR_ABORTING:
|
||||
self._finishing = True
|
||||
self._running_slaves -= 1
|
||||
task_id = self._accept_task_done(data)
|
||||
logger.error(BMsg("received abort signal from rank {1}", task_id, s.source))
|
||||
|
||||
def _accept_task_done(self, data):
|
||||
"""
|
||||
check the return message from a slave process and mark the task done.
|
||||
|
||||
if the message contains complete data of a running task, the task is moved from the running list to the complete list and its ID is returned.
|
||||
|
||||
@param data: a dictionary that can be imported into a CalculationTask object by the set_mpi_message() method.
|
||||
|
||||
@return: task ID (CalcID type) if the message contains the complete identification of a running task,
|
||||
None if the ID cannot be determined or is not in the list of running tasks.
|
||||
"""
|
||||
try:
|
||||
task = CalculationTask()
|
||||
task.set_mpi_message(data)
|
||||
del self._running_tasks[task.id]
|
||||
self._complete_tasks[task.id] = task
|
||||
task_id = task.id
|
||||
except (TypeError, IndexError, KeyError):
|
||||
task_id = None
|
||||
|
||||
return task_id
|
||||
|
||||
def _check_finish(self):
|
||||
"""
|
||||
check whether the task loop is finished.
|
||||
|
||||
the task loop is finished on any of the following conditions:
|
||||
* there are no pending or running tasks,
|
||||
* a file named "finish_pmsco" exists in the working directory,
|
||||
* a SIGUSR1, SIGUSR2, or SIGTERM signal was received,
|
||||
* self.datetime_limit is exceeded, or
|
||||
* self.max_calculations is exceeded.
|
||||
|
||||
self._finishing is set if any of these conditions is fulfilled.
|
||||
|
||||
self._running is reset if self._finishing is set and no calculation tasks are running.
|
||||
|
||||
@return: self._finishing
|
||||
"""
|
||||
if not self._finishing and (self._model_done and not self._pending_tasks and not self._running_tasks):
|
||||
logger.info("finish: model handler is done")
|
||||
self._finishing = True
|
||||
if not self._finishing and (self._calculations >= self.max_calculations):
|
||||
logger.warning("finish: max. calculations (%u) exceeded", self.max_calculations)
|
||||
self._finishing = True
|
||||
if not self._finishing and self.stop_signal:
|
||||
logger.info("finish: stop signal received")
|
||||
self._finishing = True
|
||||
if not self._finishing and (datetime.datetime.now() > self.datetime_limit):
|
||||
logger.warning("finish: time limit exceeded")
|
||||
self._finishing = True
|
||||
if not self._finishing and os.path.isfile("finish_pmsco"):
|
||||
logger.info("finish: finish_pmsco file detected")
|
||||
self._finishing = True
|
||||
|
||||
if self._finishing and not self._running_slaves and not self._running_tasks:
|
||||
logger.info("finish: all calculations finished")
|
||||
self._running = False
|
||||
|
||||
return self._finishing
|
||||
|
||||
def add_model_task(self, task):
|
||||
"""
|
||||
add a new model task including all of its children to the task queue.
|
||||
|
||||
@param task (CalculationTask) task identifier and model parameters.
|
||||
"""
|
||||
|
||||
scan_tasks = self._scan_handler.create_tasks(task)
|
||||
for scan_task in scan_tasks:
|
||||
sym_tasks = self._symmetry_handler.create_tasks(scan_task)
|
||||
for sym_task in sym_tasks:
|
||||
emitter_tasks = self._emitter_handler.create_tasks(sym_task)
|
||||
for emitter_task in emitter_tasks:
|
||||
region_tasks = self._region_handler.create_tasks(emitter_task)
|
||||
for region_task in region_tasks:
|
||||
self._pending_tasks[region_task.id] = region_task
|
||||
|
||||
|
||||
class MscoSlave(MscoProcess):
|
||||
"""
|
||||
MscoSlave process for MSC calculations.
|
||||
|
||||
This class implements the main loop of a slave (rank > 0) process.
|
||||
It waits for assignments from the master process,
|
||||
and runs one calculation after the other.
|
||||
"""
|
||||
|
||||
## @var _errors
|
||||
# number of errors (exceptions) encountered in calculation tasks.
|
||||
#
|
||||
# typically, a task is aborted when an exception is encountered.
|
||||
|
||||
def __init__(self, comm):
|
||||
super(MscoSlave, self).__init__(comm)
|
||||
self._errors = 0
|
||||
self._max_errors = 5
|
||||
|
||||
def run(self):
|
||||
"""
|
||||
Waits for messages from the master and dispatches tasks.
|
||||
"""
|
||||
logger.debug("slave entering main loop")
|
||||
s = MPI.Status()
|
||||
self._running = True
|
||||
while self._running:
|
||||
logger.debug("waiting for message")
|
||||
data = self._comm.recv(source=0, tag=MPI.ANY_TAG, status=s)
|
||||
if s.tag == TAG_NEW_TASK:
|
||||
logger.debug("received new task")
|
||||
self.accept_task(data)
|
||||
elif s.tag == TAG_FINISH:
|
||||
logger.debug("received finish message")
|
||||
self._running = False
|
||||
|
||||
logger.debug("slave exiting main loop")
|
||||
|
||||
def accept_task(self, data):
|
||||
"""
|
||||
Executes a calculation task and returns the result to the master.
|
||||
|
||||
if a recoverable exception (math, value and key errors) occurs,
|
||||
the method catches the exception but sends a failure message to the master.
|
||||
if exceptions occur repeatedly, the slave aborts and sends an abort message to the master.
|
||||
|
||||
@param data: task message received from MPI.
|
||||
"""
|
||||
task = CalculationTask()
|
||||
task.set_mpi_message(data)
|
||||
logger.debug(BMsg("executing task {0} in slave process", task.id))
|
||||
try:
|
||||
result = self.calc(task)
|
||||
self._errors = 0
|
||||
except (ValueError, ArithmeticError, LookupError):
|
||||
logger.exception(BMsg("unhandled exception in calculation task {0}", task.id))
|
||||
self._errors += 1
|
||||
if self._errors <= self._max_errors:
|
||||
self._comm.send(data, dest=0, tag=TAG_INVALID_RESULT)
|
||||
else:
|
||||
logger.error("too many exceptions, aborting")
|
||||
self._running = False
|
||||
self._comm.send(data, dest=0, tag=TAG_ERROR_ABORTING)
|
||||
else:
|
||||
logger.debug(BMsg("sending result of task {0} to master", result.id))
|
||||
self._comm.send(result.get_mpi_message(), dest=0, tag=TAG_NEW_RESULT)
|
||||
|
||||
|
||||
def run_master(mpi_comm, project):
|
||||
"""
|
||||
initialize and run the master calculation loop.
|
||||
|
||||
a MscoMaster object is created.
|
||||
the MscoMaster executes the calculation loop and dispatches the tasks.
|
||||
|
||||
this function must be called in the MPI rank 0 process only.
|
||||
|
||||
if an unhandled exception occurs, this function aborts the MPI communicator, killing all MPI processes.
|
||||
the caller will not have a chance to handle the exception.
|
||||
|
||||
@param mpi_comm: MPI communicator (mpi4py.MPI.COMM_WORLD).
|
||||
|
||||
@param project: project instance (sub-class of project.Project).
|
||||
"""
|
||||
try:
|
||||
master = MscoMaster(mpi_comm)
|
||||
master.setup(project)
|
||||
master.run()
|
||||
master.cleanup()
|
||||
except (SystemExit, KeyboardInterrupt):
|
||||
mpi_comm.Abort()
|
||||
raise
|
||||
except Exception:
|
||||
logger.exception("unhandled exception in master calculation loop.")
|
||||
mpi_comm.Abort()
|
||||
raise
|
||||
|
||||
|
||||
def run_slave(mpi_comm, project):
|
||||
"""
|
||||
initialize and run the slave calculation loop.
|
||||
|
||||
a MscoSlave object is created.
|
||||
the MscoSlave accepts tasks from rank 0 and runs the calculations.
|
||||
|
||||
this function must be called in MPI rank > 0 processes.
|
||||
|
||||
if an unhandled exception occurs, the slave process terminates.
|
||||
unless it is a SystemExit or KeyboardInterrupt (where we expect that the master also receives the signal),
|
||||
the MPI communicator is aborted, killing all MPI processes.
|
||||
|
||||
@param mpi_comm: MPI communicator (mpi4py.MPI.COMM_WORLD).
|
||||
|
||||
@param project: project instance (sub-class of project.Project).
|
||||
"""
|
||||
try:
|
||||
slave = MscoSlave(mpi_comm)
|
||||
slave.setup(project)
|
||||
slave.run()
|
||||
slave.cleanup()
|
||||
except (SystemExit, KeyboardInterrupt):
|
||||
raise
|
||||
except Exception:
|
||||
logger.exception("unhandled exception in slave calculation loop.")
|
||||
mpi_comm.Abort()
|
||||
raise
|
||||
|
||||
|
||||
def run_calculations(project):
|
||||
"""
|
||||
initialize and run the main calculation loop.
|
||||
|
||||
depending on the MPI rank, the function branches into run_master() (rank 0) or run_slave() (rank > 0).
|
||||
|
||||
@param project: project instance (sub-class of project.Project).
|
||||
"""
|
||||
mpi_comm = MPI.COMM_WORLD
|
||||
mpi_rank = mpi_comm.Get_rank()
|
||||
|
||||
if mpi_rank == 0:
|
||||
logger.debug("MPI rank %u setting up master loop", mpi_rank)
|
||||
run_master(mpi_comm, project)
|
||||
else:
|
||||
logger.debug("MPI rank %u setting up slave loop", mpi_rank)
|
||||
run_slave(mpi_comm, project)
|
||||
@@ -0,0 +1,3 @@
|
||||
edac.py
|
||||
edac_wrap.cxx
|
||||
revision.py
|
||||
@@ -0,0 +1 @@
|
||||
__author__ = 'muntwiler_m'
|
||||
@@ -0,0 +1,7 @@
|
||||
/* EDAC interface for other programs */
|
||||
%module edac
|
||||
%{
|
||||
extern int run_script(char *scriptfile);
|
||||
%}
|
||||
|
||||
extern int run_script(char *scriptfile);
|
||||
@@ -0,0 +1,130 @@
|
||||
*** /home/muntwiler_m/mnt/pearl_data/software/edac/edac_all.cpp 2011-04-14 23:38:44.000000000 +0200
|
||||
--- edac_all.cpp 2016-02-11 12:15:45.322049772 +0100
|
||||
***************
|
||||
*** 10117,10122 ****
|
||||
--- 10117,10123 ----
|
||||
void scan_imfp(char *name);
|
||||
void scan_imfp(FILE *fout);
|
||||
numero iimfp_TPP(numero kr);
|
||||
+ numero iimfp_SD(numero kr);
|
||||
numero TPP_rho, TPP_Nv, TPP_Ep, TPP_Eg;
|
||||
numero screening_length;
|
||||
int scattering_so;
|
||||
***************
|
||||
*** 10230,10235 ****
|
||||
--- 10231,10237 ----
|
||||
|
||||
int n_th;
|
||||
int n_fi;
|
||||
+ int n_ang;
|
||||
numero *th, *fi;
|
||||
|
||||
numero *th_out,
|
||||
***************
|
||||
*** 10239,10244 ****
|
||||
--- 10241,10247 ----
|
||||
void free(void);
|
||||
void init_th(numero thi, numero thf, int nth);
|
||||
void init_phi(numero fii, numero fif, int nfi);
|
||||
+ void read_angles(FILE *fin, char *my_file);
|
||||
void init_refraction(
|
||||
numero refraction);
|
||||
void init_transmission(
|
||||
***************
|
||||
*** 12485,12490 ****
|
||||
--- 12488,12494 ----
|
||||
else {
|
||||
kr=sqrt(sqr(calc.k[ik])+2*V0);
|
||||
if(iimfp_flag==0) ki=iimfp.val(kr)/2;
|
||||
+ else if(iimfp_flag==3) ki=iimfp_SD(kr)/2;
|
||||
else ki=iimfp_TPP(kr)/2;
|
||||
set_k(complex(kr,ki));
|
||||
} } else if(calc.k_flag==2) set_k(calc.kc[ik]);
|
||||
***************
|
||||
*** 12507,12512 ****
|
||||
--- 12511,12522 ----
|
||||
numero imfp=E/(TPP_Ep*TPP_Ep*(beta*log(gamma*E)-C/E+D/(E*E)))/a0_au;
|
||||
return 1/imfp;
|
||||
}
|
||||
+ numero propagation::iimfp_SD(numero kr)
|
||||
+ {
|
||||
+ numero E=sqr(kr)/2*au_eV;
|
||||
+ numero imfp = (1.43e3/sqr(E) + 0.54*sqrt(E))/a0_au;
|
||||
+ return 1/imfp;
|
||||
+ }
|
||||
void propagation::scan_imfp(char *name)
|
||||
{
|
||||
FILE *fout=NULL;
|
||||
***************
|
||||
*** 13202,13208 ****
|
||||
}
|
||||
final_state::final_state(void)
|
||||
{
|
||||
! n_th=n_fi=0;
|
||||
n_1=n_2=0;
|
||||
Ylm0_th_flag=Ylm0_fi_flag=0;
|
||||
mesh_flag=0;
|
||||
--- 13212,13218 ----
|
||||
}
|
||||
final_state::final_state(void)
|
||||
{
|
||||
! n_th=n_fi=n_ang=0;
|
||||
n_1=n_2=0;
|
||||
Ylm0_th_flag=Ylm0_fi_flag=0;
|
||||
mesh_flag=0;
|
||||
***************
|
||||
*** 13233,13238 ****
|
||||
--- 13243,13271 ----
|
||||
if(n_fi==1) fi[0]=fii;
|
||||
else for(j=0; j<n_fi; j++) fi[j]=fii+j*(fif-fii)/(n_fi-1);
|
||||
} }
|
||||
+ void final_state::read_angles(FILE *fin, char *my_file)
|
||||
+ {
|
||||
+ FILE *fang; int i, nang;
|
||||
+ if(!strcmpC(my_file,"inline")) fang=fin;
|
||||
+ else fang=open_file(foutput,my_file,"r");
|
||||
+ nang=read_int(fang);
|
||||
+ free_mesh();
|
||||
+ if(nang>1) {
|
||||
+ delete [] th; delete [] th_out; delete [] transmission; delete [] fi;
|
||||
+ n_th=nang;
|
||||
+ th=new numero [n_th];
|
||||
+ th_out=new numero [n_th];
|
||||
+ transmission=new numero [n_th];
|
||||
+ n_fi=nang;
|
||||
+ fi=new numero [n_fi];
|
||||
+ for(i=0; i<nang; i++) {
|
||||
+ th[i]=th_out[i]=read_numero(fang);
|
||||
+ transmission[i]=1;
|
||||
+ fi[i]=read_numero(fang);
|
||||
+ }
|
||||
+ }
|
||||
+ if(strcmpC(my_file,"inline")) fclose(fang);
|
||||
+ }
|
||||
void final_state::init_refraction(numero refraction)
|
||||
{
|
||||
int i;
|
||||
***************
|
||||
*** 14743,14748 ****
|
||||
--- 14776,14783 ----
|
||||
|| scat.TPP_Ep<=0 || scat.TPP_Eg<0)
|
||||
on_error(foutput,"(input) imfp TPP-2M", "wrong parameters");
|
||||
scat.iimfp_flag=1;
|
||||
+ } else if(!strcmpC(name,"SD-UC")) {
|
||||
+ scat.iimfp_flag=3;
|
||||
} else {
|
||||
scat.read_imfp(fprog,name);
|
||||
scat.iimfp_flag=0;
|
||||
***************
|
||||
*** 15162,15164 ****
|
||||
--- 15197,15206 ----
|
||||
fprintf(foutput,"That's all, folks!\n");
|
||||
return 0;
|
||||
}
|
||||
+ int run_script(char *scriptfile)
|
||||
+ {
|
||||
+ particle_type=electrones;
|
||||
+ init_fact();
|
||||
+ electron.program(scriptfile);
|
||||
+ return 0;
|
||||
+ }
|
||||
@@ -0,0 +1,52 @@
|
||||
SHELL=/bin/sh
|
||||
|
||||
# makefile for EDAC program and module
|
||||
#
|
||||
# the EDAC source code is not included in the public distribution.
|
||||
# please obtain it from the original author,
|
||||
# copy it to this directory,
|
||||
# and apply the edac_all.patch patch before compilation.
|
||||
#
|
||||
# see the top-level makefile for additional information.
|
||||
|
||||
.SUFFIXES:
|
||||
.SUFFIXES: .c .cpp .cxx .exe .f .h .i .o .py .pyf .so
|
||||
.PHONY: all clean edac
|
||||
|
||||
FC=gfortran
|
||||
FCCOPTS=
|
||||
F2PY=f2py
|
||||
F2PYOPTS=
|
||||
CC=g++
|
||||
CCOPTS=-Wno-write-strings
|
||||
SWIG=swig
|
||||
SWIGOPTS=
|
||||
PYTHON=python
|
||||
PYTHONOPTS=
|
||||
|
||||
all: edac
|
||||
|
||||
edac: edac.exe _edac.so edac.py
|
||||
|
||||
edac.exe: edac_all.cpp
|
||||
$(CC) $(CCOPTS) -o edac.exe edac_all.cpp
|
||||
|
||||
edac_wrap.cxx: edac_all.cpp edac.i
|
||||
$(SWIG) $(SWIGOPTS) -c++ -python edac.i
|
||||
|
||||
edac.py _edac.so: edac_wrap.cxx setup.py
|
||||
$(PYTHON) $(PYTHONOPTS) setup.py build_ext --inplace
|
||||
|
||||
revision.py: _edac.so
|
||||
git log --pretty=format:"code_rev = 'Code revision %h, %ad'" --date=iso -1 > $@ || echo "code_rev = 'Code revision unknown, "`date +"%F %T %z"`"'" > $@
|
||||
echo "" >> revision.py
|
||||
|
||||
revision.txt: _edac.so edac.exe
|
||||
git log --pretty=format:"Code revision %h, %ad" --date=iso -1 > $@ || echo "Code revision unknown, "`date +"%F %T %z"` > $@
|
||||
echo "" >> revision.txt
|
||||
|
||||
clean:
|
||||
rm -f *.so *.o *.exe
|
||||
rm -f *_wrap.cxx
|
||||
rm -f revision.py
|
||||
rm -f revision.txt
|
||||
@@ -0,0 +1,20 @@
|
||||
#!/usr/bin/env python
|
||||
|
||||
"""
|
||||
setup.py file for EDAC
|
||||
"""
|
||||
|
||||
from distutils.core import setup, Extension
|
||||
|
||||
|
||||
edac_module = Extension('_edac',
|
||||
sources=['edac_wrap.cxx', 'edac_all.cpp'],
|
||||
)
|
||||
|
||||
setup (name = 'edac',
|
||||
version = '0.1',
|
||||
author = "Matthias Muntwiler",
|
||||
description = """EDAC module in Python""",
|
||||
ext_modules = [edac_module],
|
||||
py_modules = ["edac"], requires=['numpy']
|
||||
)
|
||||
@@ -0,0 +1,223 @@
|
||||
"""
|
||||
@package pmsco.edac_calculator
|
||||
Garcia de Abajo EDAC program interface.
|
||||
|
||||
@author Matthias Muntwiler
|
||||
|
||||
@copyright (c) 2015 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
from __future__ import division
|
||||
import os
|
||||
import logging
|
||||
import math
|
||||
import numpy as np
|
||||
import calculator
|
||||
import data as md
|
||||
import cluster as mc
|
||||
import edac.edac as edac
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class EdacCalculator(calculator.Calculator):
|
||||
def write_input_file(self, params, scan, filepath):
|
||||
"""
|
||||
write parameters to an EDAC input file
|
||||
|
||||
EDAC will calculate results on a rectangular grid.
|
||||
the grid is constructed from the limits of the scan coordinates
|
||||
and the number of positions in each respective dimension.
|
||||
to avoid any confusion, the input scan should be rectangular with equidistant steps.
|
||||
|
||||
the following scans are supported:
|
||||
(energy), (energy, theta), (energy, phi), (energy, alpha), (theta, phi) holo.
|
||||
except for the holo scan, each scan dimension must be linear.
|
||||
the holo scan is translated to a rectangular (theta, phi) scan
|
||||
where theta is copied
|
||||
and phi is replaced by a linear scan from the minimum to the maximum phi at 1 degree steps.
|
||||
the scan type is detected from the scan file.
|
||||
|
||||
if alpha is defined, theta is implicitly set to normal emission! (to be generalized)
|
||||
|
||||
TODO: some parameters are still hard-coded.
|
||||
"""
|
||||
with open(filepath, "w") as f:
|
||||
f.write("verbose off\n")
|
||||
f.write("cluster input %s\n" % (params.cluster_file))
|
||||
f.write("emitters %u l(A)\n" % (len(params.emitters)))
|
||||
for em in params.emitters:
|
||||
f.write("%g %g %g %u\n" % em)
|
||||
#for iat in range(params.atom_types):
|
||||
#pf = params.phase_file[iat]
|
||||
#pf = pf.replace(".pha", ".edac.pha")
|
||||
#f.write("scatterer %u %s\n" % (params.atomic_number[iat], pf))
|
||||
|
||||
en = scan.energies + params.work_function
|
||||
en_min = en.min()
|
||||
en_max = en.max()
|
||||
if en.shape[0] <= 1:
|
||||
en_num = 1
|
||||
else:
|
||||
de = np.diff(en)
|
||||
de = de[de >= 0.01]
|
||||
de = de.min()
|
||||
en_num = int(round((en_max - en_min) / de)) + 1
|
||||
if en_num != en.shape[0]:
|
||||
logger.warning("energy scan length mismatch: EDAC {0}, scan {1}".format(en_num, en.shape[0]))
|
||||
assert en_num < en.shape[0] * 10, \
|
||||
"linearization of energy scan causes excessive oversampling {0}/{1}".format(en_num, en.shape[0])
|
||||
f.write("emission energy E(eV) {en0:f} {en1:f} {nen:d}\n".format(en0=en_min, en1=en_max, nen=en_num))
|
||||
|
||||
if params.fixed_cluster:
|
||||
th = scan.alphas
|
||||
ph = np.remainder(scan.phis + 90.0, 360.0)
|
||||
f.write("fixed cluster\n")
|
||||
if np.abs(scan.thetas).max() > 0.0:
|
||||
logger.warning("theta angle implicitly set to zero due to alpha scan.")
|
||||
else:
|
||||
th = np.unique(scan.thetas)
|
||||
ph = scan.phis
|
||||
f.write("movable cluster\n")
|
||||
|
||||
th_min = th.min()
|
||||
th_max = th.max()
|
||||
if th.shape[0] <= 1:
|
||||
th_num = 1
|
||||
else:
|
||||
dt = np.diff(th)
|
||||
dt = dt[dt >= 0.1]
|
||||
dt = dt.min()
|
||||
if ph.shape[0] > 1:
|
||||
# hemispherical scan
|
||||
if th_min < 0:
|
||||
th_min = max(th_min - dt, -90.0)
|
||||
else:
|
||||
th_min = max(th_min - dt, 0.0)
|
||||
if th_max > 0:
|
||||
th_max = min(th_max + dt, 90.0)
|
||||
else:
|
||||
th_max = min(th_max + dt, 0.0)
|
||||
th_num = int(round((th_max - th_min) / dt)) + 1
|
||||
assert th_num < th.shape[0] * 10, \
|
||||
"linearization of theta scan causes excessive oversampling {0}/{1}".format(th_num, th.shape[0])
|
||||
|
||||
f.write("beta {0}\n".format(params.polar_incidence_angle))
|
||||
f.write("incidence {0} {1}\n".format(params.polar_incidence_angle, params.azimuthal_incidence_angle))
|
||||
f.write("emission angle theta {th0:f} {th1:f} {nth:d}\n".format(th0=th_min, th1=th_max, nth=th_num))
|
||||
|
||||
ph_min = ph.min()
|
||||
ph_max = ph.max()
|
||||
if th.shape[0] <= 1:
|
||||
# azimuthal scan
|
||||
ph_num = ph.shape[0]
|
||||
elif ph.shape[0] <= 1:
|
||||
# polar scan
|
||||
ph_num = 1
|
||||
else:
|
||||
# hemispherical scan
|
||||
dp = np.diff(ph)
|
||||
dp = dp[dp >= 0.1]
|
||||
dp = dp.min()
|
||||
ph_min = max(ph_min - dp, 0.0)
|
||||
ph_max = min(ph_max + dp, 360.0)
|
||||
dt = (th_max - th_min) / (th_num - 1)
|
||||
dp = min(dp, dt)
|
||||
ph_num = int(round((ph_max - ph_min) / dp)) + 1
|
||||
assert ph_num < ph.shape[0] * 10, \
|
||||
"linearization of phi scan causes excessive oversampling {0}/{1}".format(ph_num, ph.shape[0])
|
||||
|
||||
f.write("emission angle phi {ph0:f} {ph1:f} {nph:d}\n".format(ph0=ph_min, ph1=ph_max, nph=ph_num))
|
||||
|
||||
f.write("initial state {0}\n".format(params.initial_state))
|
||||
polarizations = {'H': 'LPx', 'V': 'LPy', 'L': 'LCP', 'R': 'RCP'}
|
||||
f.write("polarization {0}\n".format(polarizations[params.polarization]))
|
||||
f.write("muffin-tin\n")
|
||||
f.write("V0 E(eV) {0}\n".format(params.inner_potential))
|
||||
f.write("cluster surface l(A) {0}\n".format(params.z_surface))
|
||||
f.write("imfp SD-UC\n")
|
||||
f.write("temperature %g %g\n" % (params.experiment_temperature, params.debye_temperature))
|
||||
f.write("iteration recursion\n")
|
||||
f.write("dmax l(A) %g\n" % (params.dmax))
|
||||
f.write("lmax %u\n" % (params.lmax))
|
||||
f.write("orders %u " % (len(params.orders)))
|
||||
for order in params.orders:
|
||||
f.write("%u " % (order))
|
||||
f.write("\n")
|
||||
f.write("emission angle window 1\n")
|
||||
f.write("scan pd %s\n" % (params.output_file))
|
||||
f.write("end\n")
|
||||
|
||||
def run(self, params, cluster, scan, output_file):
|
||||
"""
|
||||
run EDAC with the given parameters and cluster.
|
||||
|
||||
@param params: a msc_param.Params() object with all necessary values except cluster and output files set.
|
||||
|
||||
@param cluster: a msc_cluster.Cluster(format=FMT_EDAC) object with all atom positions set.
|
||||
|
||||
@param scan: a msco_project.Scan() object describing the experimental scanning scheme.
|
||||
|
||||
@param output_file: base name for all intermediate and output files
|
||||
|
||||
@return: result_file, files_cats
|
||||
"""
|
||||
|
||||
# set up scan
|
||||
params.fixed_cluster = 'a' in scan.mode
|
||||
|
||||
# generate file names
|
||||
base_filename = output_file
|
||||
clu_filename = base_filename + ".clu"
|
||||
out_filename = base_filename + ".out"
|
||||
par_filename = base_filename + ".par"
|
||||
dat_filename = out_filename
|
||||
if params.fixed_cluster:
|
||||
etpi_filename = base_filename + ".etpai"
|
||||
else:
|
||||
etpi_filename = base_filename + ".etpi"
|
||||
|
||||
# fix EDAC particularities
|
||||
params.cluster_file = clu_filename
|
||||
params.output_file = out_filename
|
||||
params.data_file = dat_filename
|
||||
params.emitters = cluster.get_emitters()
|
||||
|
||||
# save parameter files
|
||||
logger.debug("writing cluster file %s", clu_filename)
|
||||
cluster.save_to_file(clu_filename, fmt=mc.FMT_EDAC)
|
||||
logger.debug("writing input file %s", par_filename)
|
||||
self.write_input_file(params, scan, par_filename)
|
||||
|
||||
# run EDAC
|
||||
logger.info("calling EDAC with input file %s", par_filename)
|
||||
edac.run_script(par_filename)
|
||||
|
||||
# load results and save in ETPI or ETPAI format
|
||||
logger.debug("importing data from output file %s", dat_filename)
|
||||
result_etpi = md.load_edac_pd(dat_filename, energy=scan.energies[0] + params.work_function,
|
||||
theta=scan.thetas[0], phi=scan.phis[0],
|
||||
fixed_cluster=params.fixed_cluster)
|
||||
result_etpi['e'] -= params.work_function
|
||||
|
||||
if 't' in scan.mode and 'p' in scan.mode:
|
||||
hemi_tpi = scan.raw_data.copy()
|
||||
hemi_tpi['i'] = 0.0
|
||||
try:
|
||||
hemi_tpi['s'] = 0.0
|
||||
except ValueError:
|
||||
pass
|
||||
result_etpi = md.interpolate_hemi_scan(result_etpi, hemi_tpi)
|
||||
|
||||
if result_etpi.shape[0] != scan.raw_data.shape[0]:
|
||||
logger.error("scan length mismatch: EDAC result: %u, scan data: %u", result_etpi.shape[0], scan.raw_data.shape[0])
|
||||
logger.debug("save result to file %s", etpi_filename)
|
||||
md.save_data(etpi_filename, result_etpi)
|
||||
|
||||
files = {clu_filename: 'input', par_filename: 'input', dat_filename: 'output',
|
||||
etpi_filename: 'region'}
|
||||
return etpi_filename, files
|
||||
@@ -0,0 +1,324 @@
|
||||
"""
|
||||
@package pmsco.files
|
||||
manage files produced by pmsco.
|
||||
|
||||
@author Matthias Muntwiler
|
||||
|
||||
@copyright (c) 2016 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
import os
|
||||
import logging
|
||||
import mpi4py
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
## @var FILE_CATEGORIES
|
||||
# categories of generated files.
|
||||
#
|
||||
# these labels are used to decide which output files are kept or deleted after the calculation.
|
||||
#
|
||||
# each string of this set marks a category of files.
|
||||
#
|
||||
# @arg @c 'input' : raw input files for calculator, including cluster and phase files in custom format
|
||||
# @arg @c 'output' : raw output files from calculator
|
||||
# @arg @c 'phase' : phase files in portable format for report
|
||||
# @arg @c 'cluster' : cluster files in portable XYZ format for report
|
||||
# @arg @c 'log' : log files
|
||||
# @arg @c 'debug' : debug files
|
||||
# @arg @c 'model': output files in ETPAI format: complete simulation (a_-1_-1_-1_-1)
|
||||
# @arg @c 'scan' : output files in ETPAI format: scan (a_b_-1_-1_-1)
|
||||
# @arg @c 'symmetry' : output files in ETPAI format: symmetry (a_b_c_-1_-1)
|
||||
# @arg @c 'emitter' : output files in ETPAI format: emitter (a_b_c_d_-1)
|
||||
# @arg @c 'region' : output files in ETPAI format: region (a_b_c_d_e)
|
||||
# @arg @c 'report': final report of results
|
||||
# @arg @c 'population': final state of particle population
|
||||
# @arg @c 'rfac': files related to models which give bad r-factors (dynamic category, see below).
|
||||
#
|
||||
# @note @c 'rfac' is a dynamic category not connected to a particular file or content type.
|
||||
# no file should be marked @c 'rfac'.
|
||||
# the string is used only to specify whether bad models should be deleted or not.
|
||||
# if so, all files related to bad models are deleted, regardless of their static category.
|
||||
#
|
||||
FILE_CATEGORIES = {'cluster', 'phase', 'input', 'output',
|
||||
'report', 'region', 'emitter', 'scan', 'symmetry', 'model',
|
||||
'log', 'debug', 'population', 'rfac'}
|
||||
|
||||
## @var FILE_CATEGORIES_TO_KEEP
|
||||
# categories of files to be kept.
|
||||
#
|
||||
# this constant defines the default set of file categories that are kept after the calculation.
|
||||
#
|
||||
FILE_CATEGORIES_TO_KEEP = {'cluster', 'model', 'report', 'population'}
|
||||
|
||||
## @var FILE_CATEGORIES_TO_DELETE
|
||||
# categories of files to be deleted.
|
||||
#
|
||||
# this constant defines the default set of file categories that are deleted after the calculation.
|
||||
# it contains all values from FILE_CATEGORIES minus FILE_CATEGORIES_TO_KEEP.
|
||||
# it is used to initialize Project.files_to_delete.
|
||||
#
|
||||
FILE_CATEGORIES_TO_DELETE = FILE_CATEGORIES - FILE_CATEGORIES_TO_KEEP
|
||||
|
||||
|
||||
class FileTracker(object):
|
||||
"""
|
||||
organize output files of calculations.
|
||||
|
||||
the file manager stores references to data files generated during calculations
|
||||
and cleans up unused files according to a range of filter criteria.
|
||||
"""
|
||||
|
||||
## @var categories_to_delete (set)
|
||||
# categories of generated files that should be deleted after the calculation.
|
||||
#
|
||||
# each string of this set marks a category of files to be deleted.
|
||||
# the complete set of recognized categories is files.FILE_CATEGORIES.
|
||||
# the default setting after initialization is files.FILE_CATEGORIES_TO_DELETE.
|
||||
#
|
||||
# in optimization modes, an output file is kept only
|
||||
# if its model produced one of the best R-factors and
|
||||
# its category is not listed in this set.
|
||||
# all other (bad R-factor) files are deleted regardless of their category.
|
||||
|
||||
## @var keep_rfac (int)
|
||||
# number of best models to keep.
|
||||
#
|
||||
# if @c 'rfac' is set in categories_to_delete, all files of bad models (regardless of their category) are deleted.
|
||||
# this parameter specifies how many of the best models are kept.
|
||||
#
|
||||
# the default is 10.
|
||||
|
||||
## @var _last_id (int)
|
||||
# last used file identification number (incremental)
|
||||
|
||||
## @var _path_by_id (dict)
|
||||
# key = file id, value = file path
|
||||
|
||||
## @var _model_by_id (dict)
|
||||
# key = file id, value = model number
|
||||
|
||||
## @var _category_by_id (dict)
|
||||
# key = file id, value = category (str)
|
||||
|
||||
## @var _rfac_by_model (dict)
|
||||
# key = model number, value = R factor
|
||||
|
||||
## @var _complete_by_model (dict)
|
||||
# key = model number, value (boolean) = all calculations complete, files can be deleted
|
||||
|
||||
def __init__(self):
|
||||
self._id_by_path = {}
|
||||
self._path_by_id = {}
|
||||
self._model_by_id = {}
|
||||
self._category_by_id = {}
|
||||
self._rfac_by_model = {}
|
||||
self._complete_by_model = {}
|
||||
self._last_id = 0
|
||||
self.categories_to_delete = FILE_CATEGORIES_TO_DELETE
|
||||
self.keep_rfac = 10
|
||||
|
||||
def add_file(self, path, model, category='default'):
|
||||
"""
|
||||
add a new data file to the list.
|
||||
|
||||
@param path: (str) system path of the file relative to the working directory.
|
||||
|
||||
@param model: (int) model number
|
||||
|
||||
@param category: (str) file category, e.g. 'output', etc.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
self._last_id += 1
|
||||
_id = self._last_id
|
||||
self._id_by_path[path] = _id
|
||||
self._path_by_id[_id] = path
|
||||
self._model_by_id[_id] = model
|
||||
self._category_by_id[_id] = category
|
||||
|
||||
def rename_file(self, old_path, new_path):
|
||||
"""
|
||||
rename a data file in the list.
|
||||
|
||||
the method does not rename the file in the file system.
|
||||
|
||||
@param old_path: must match an existing file path identically.
|
||||
if old_path is not in the list, the method does nothing.
|
||||
|
||||
@param new_path: new path.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
try:
|
||||
_id = self._id_by_path[old_path]
|
||||
except KeyError:
|
||||
pass
|
||||
else:
|
||||
del self._id_by_path[old_path]
|
||||
self._id_by_path[new_path] = _id
|
||||
self._path_by_id[_id] = new_path
|
||||
|
||||
def remove_file(self, path):
|
||||
"""
|
||||
remove a file from the list.
|
||||
|
||||
the method does not delete the file from the file system.
|
||||
|
||||
@param path: must match an existing file path identically.
|
||||
if path is not in the list, the method does nothing.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
try:
|
||||
_id = self._id_by_path[path]
|
||||
except KeyError:
|
||||
pass
|
||||
else:
|
||||
del self._id_by_path[path]
|
||||
del self._path_by_id[_id]
|
||||
del self._model_by_id[_id]
|
||||
del self._category_by_id[_id]
|
||||
|
||||
def update_model_rfac(self, model, rfac):
|
||||
"""
|
||||
update the stored R factors of all files that depend on a specified model.
|
||||
the model handler should call this method if files with bad R factors should be deleted.
|
||||
by default (after adding files of a new model), the R factor is unset and
|
||||
delete_bad_rfac() will not act on that model.
|
||||
|
||||
@param model: (int) model number.
|
||||
@param rfac: (float) new R factor
|
||||
@return: None
|
||||
"""
|
||||
self._rfac_by_model[model] = rfac
|
||||
|
||||
def set_model_complete(self, model, complete):
|
||||
"""
|
||||
specify whether the calculations of a model are complete and its files can be deleted.
|
||||
the model handler must set this flag.
|
||||
by default (after adding files of a new model), it is False.
|
||||
|
||||
@param model: (int) model number.
|
||||
@param complete: (bool) True if all calculations of the model are complete (files can be deleted).
|
||||
@return: None
|
||||
"""
|
||||
self._complete_by_model[model] = complete
|
||||
|
||||
def delete_files(self, categories=None, keep_rfac=0):
|
||||
"""
|
||||
delete the files matching the list of categories.
|
||||
|
||||
@param categories: set of file categories to delete.
|
||||
may include 'rfac' if bad r-factors should be deleted additionally (regardless of static category).
|
||||
defaults to self.categories_to_delete.
|
||||
|
||||
@param keep_rfac: number of best models to keep if bad r-factors are to be deleted.
|
||||
the effective keep number is the greater of self.keep_rfac and this argument.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
if categories is None:
|
||||
categories = self.categories_to_delete
|
||||
for cat in categories:
|
||||
self.delete_category(cat)
|
||||
if 'rfac' in categories:
|
||||
self.delete_bad_rfac(keep=keep_rfac)
|
||||
|
||||
def delete_bad_rfac(self, keep=0, force_delete=False):
|
||||
"""
|
||||
delete the files of all models except a specified number of good models.
|
||||
|
||||
the method first determines which models to keep.
|
||||
models with R factor values of 0.0, without a specified R-factor, and
|
||||
the specified number of best ranking non-zero models are kept.
|
||||
the files belonging to the keeper models are kept, all others are deleted,
|
||||
regardless of category.
|
||||
files of incomplete models are also kept.
|
||||
|
||||
the files are deleted from the list and the file system.
|
||||
|
||||
files are deleted only if 'rfac' is specified in self.categories_to_delete
|
||||
or if force_delete is set to True.
|
||||
otherwise the method does nothing.
|
||||
|
||||
@param keep: number of best-ranking models to keep.
|
||||
the effective keep number is the greater of self.keep_rfac and this argument.
|
||||
|
||||
@param force_delete: delete the bad files even if 'rfac' is not selected in categories_to_delete.
|
||||
|
||||
@return: None
|
||||
|
||||
@todo should clean up rfac and model dictionaries from time to time.
|
||||
"""
|
||||
if force_delete or 'rfac' in self.categories_to_delete:
|
||||
keep = max(keep, self.keep_rfac)
|
||||
rfacs = [r for r in sorted(self._rfac_by_model.values()) if r > 0.0]
|
||||
try:
|
||||
rfac_split = rfacs[keep-1]
|
||||
except IndexError:
|
||||
return
|
||||
|
||||
complete_models = {_model for (_model, _complete) in self._complete_by_model.iteritems() if _complete}
|
||||
del_models = {_model for (_model, _rfac) in self._rfac_by_model.iteritems() if _rfac > rfac_split}
|
||||
del_models &= complete_models
|
||||
del_ids = {_id for (_id, _model) in self._model_by_id.iteritems() if _model in del_models}
|
||||
for _id in del_ids:
|
||||
self.delete_file(_id)
|
||||
|
||||
def delete_category(self, category):
|
||||
"""
|
||||
delete all files of a specified category from the list and the file system.
|
||||
|
||||
only files of complete models (cf. set_model_complete()) are deleted, but regardless of R-factor.
|
||||
|
||||
@param category: (str) category.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
complete_models = {_model for (_model, _complete) in self._complete_by_model.iteritems() if _complete}
|
||||
del_ids = {_id for (_id, cat) in self._category_by_id.iteritems() if cat == category}
|
||||
del_ids &= {_id for (_id, _model) in self._model_by_id.iteritems() if _model in complete_models}
|
||||
for _id in del_ids:
|
||||
self.delete_file(_id)
|
||||
|
||||
def delete_file(self, _id):
|
||||
"""
|
||||
delete a specified file from the list and the file system.
|
||||
|
||||
the file is identified by ID number.
|
||||
this method is unconditional. it does not consider category, completeness, or R-factor.
|
||||
|
||||
@param _id: (int) ID number of the file to delete.
|
||||
|
||||
@return: None
|
||||
"""
|
||||
path = self._path_by_id[_id]
|
||||
cat = self._category_by_id[_id]
|
||||
model = self._model_by_id[_id]
|
||||
del self._id_by_path[path]
|
||||
del self._path_by_id[_id]
|
||||
del self._model_by_id[_id]
|
||||
del self._category_by_id[_id]
|
||||
try:
|
||||
self._os_delete_file(path)
|
||||
except OSError:
|
||||
logger.warning("error deleting file {0}".format(path))
|
||||
else:
|
||||
logger.debug("delete file {0} ({1}, model {2})".format(path, cat, model))
|
||||
|
||||
@staticmethod
|
||||
def _os_delete_file(path):
|
||||
"""
|
||||
have the operating system delete a file path.
|
||||
|
||||
this function is separate so that we can mock it in unit tests.
|
||||
|
||||
@param path: OS path
|
||||
@return: None
|
||||
"""
|
||||
os.remove(path)
|
||||
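The selection rule documented for `delete_bad_rfac()` can be sketched in isolation. This is a minimal illustration, assuming plain dictionaries in place of the private attributes of the class; the function name `select_models_to_delete` is hypothetical and not part of the module.

```python
def select_models_to_delete(rfac_by_model, complete_by_model, keep):
    """Return the set of model numbers whose files would be deleted.

    rfac_by_model: dict model number -> R-factor (0.0 is never ranked).
    complete_by_model: dict model number -> bool (True if calculations are done).
    keep: number of best-ranking non-zero models to keep.
    """
    # rank only the non-zero R-factors, best (lowest) first
    rfacs = sorted(r for r in rfac_by_model.values() if r > 0.0)
    if keep < 1 or len(rfacs) < keep:
        # not enough ranked models: nothing is deleted
        return set()
    # R-factor of the last model that is kept
    rfac_split = rfacs[keep - 1]
    worse = {m for m, r in rfac_by_model.items() if r > rfac_split}
    complete = {m for m, c in complete_by_model.items() if c}
    # incomplete models are kept even if their R-factor is bad
    return worse & complete


rfacs = {1: 0.2, 2: 0.5, 3: 0.9, 4: 0.0, 5: 0.7}
complete = {1: True, 2: True, 3: True, 4: True, 5: False}
to_delete = select_models_to_delete(rfacs, complete, keep=2)
```

In this example, models 3 and 5 rank below the two best models, but model 5 is still incomplete, so only model 3 is selected.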
@@ -0,0 +1,280 @@
|
||||
"""
|
||||
gradient optimization module for MSC calculations
|
||||
|
||||
the module starts multiple MSC calculations and optimizes the model parameters
|
||||
with a gradient search.
|
||||
|
||||
the optimization task is distributed over multiple processes using MPI.
|
||||
the optimization must be started with N+1 processes in the MPI environment,
|
||||
where N equals the number of fit parameters.
|
||||
|
||||
IMPLEMENTATION IN PROGRESS - DEBUGGING
|
||||
|
||||
Requires: scipy, numpy
|
||||
|
||||
Author: Matthias Muntwiler
|
||||
|
||||
Copyright (c) 2015 by Paul Scherrer Institut
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License");
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
import numpy as np
|
||||
import scipy.optimize as so
|
||||
import data as md
|
||||
from mpi4py import MPI
|
||||
|
||||
# messages sent from master to slaves
|
||||
|
||||
# master sends new assignment
|
||||
# the message is a dictionary of model parameters
|
||||
TAG_NEW_TASK = 1
|
||||
# master calls end of calculation
|
||||
# the message is empty
|
||||
TAG_FINISH = 2
|
||||
# master sends current population
|
||||
# currently not used
|
||||
TAG_POPULATION = 3
|
||||
|
||||
# messages sent from slaves to master
|
||||
# slave reports new result
|
||||
# the message is a dictionary of model parameters and results
|
||||
TAG_NEW_RESULT = 1
|
||||
# slave confirms end of calculation
|
||||
# currently not used
|
||||
TAG_FINISHED = 2
|
||||
|
||||
class MscProcess(object):
|
||||
"""
|
||||
Code shared by MscMaster and MscSlave
|
||||
"""
|
||||
def __init__(self, comm):
|
||||
self.comm = comm
|
||||
|
||||
def setup(self, project):
|
||||
self.project = project
|
||||
self.running = False
|
||||
self.finishing = False
|
||||
self.iteration = 0
|
||||
|
||||
def run(self):
|
||||
pass
|
||||
|
||||
def cleanup(self):
|
||||
pass
|
||||
|
||||
def calc(self, pars):
|
||||
"""
|
||||
Executes a single MSC calculation.
|
||||
|
||||
pars: A dictionary of parameters expected by the cluster and parameters functions.
|
||||
|
||||
returns: pars with three additional values:
|
||||
rank: rank of the calculation process
|
||||
iter: iteration index of the calculation process
|
||||
rfac: resulting R-factor
|
||||
|
||||
all other calculation results are discarded.
|
||||
"""
|
||||
rev = "rank %u, iteration %u" % (self.comm.rank, self.iteration)
|
||||
|
||||
# create parameter and cluster structures
|
||||
clu = self.project.create_cluster(pars)
|
||||
par = self.project.create_params(pars)
|
||||
|
||||
# generate file names
|
||||
base_filename = "%s_%u_%u" % (self.project.output_file, self.comm.rank, self.iteration)
|
||||
|
||||
# call the msc program
|
||||
result_etpi = self.project.run_calc(par, clu, self.project.scan_file, base_filename, delete_files=True)
|
||||
|
||||
# calculate modulation function and R-factor
|
||||
result_etpi = md.calc_modfunc_lowess(result_etpi)
|
||||
result_r = md.rfactor(self.project.scan_modf, result_etpi)
|
||||
|
||||
pars['rank'] = self.comm.rank
|
||||
pars['iter'] = self.iteration
|
||||
pars['rfac'] = result_r
|
||||
|
||||
return pars
|
||||
|
||||
class MscMaster(MscProcess):
|
||||
def __init__(self, comm):
|
||||
super(MscMaster, self).__init__(comm)
|
||||
self.slaves = self.comm.Get_size() - 1
|
||||
self.running_slaves = 0
|
||||
|
||||
def setup(self, project):
|
||||
super(MscMaster, self).setup(project)
|
||||
self.dom = project.create_domain()
|
||||
self.running_slaves = self.slaves
|
||||
|
||||
self._outfile = open(self.project.output_file + ".dat", "w")
|
||||
self._outfile.write("#")
|
||||
self._outfile_keys = self.dom.start.keys()
|
||||
self._outfile_keys.append('rfac')
|
||||
for name in self._outfile_keys:
|
||||
self._outfile.write(" " + name)
|
||||
self._outfile.write("\n")
|
||||
|
||||
def run(self):
|
||||
"""
|
||||
starts the minimization
|
||||
"""
|
||||
# pack initial guess, bounds, constant parameters
|
||||
nparams = len(self.dom.start)
|
||||
fit_params = np.zeros((nparams))
|
||||
params_index = {}
|
||||
const_params = self.dom.max.copy()
|
||||
bounds = []
|
||||
n_fit_params = 0
|
||||
for key in self.dom.start:
|
||||
if self.dom.max[key] > self.dom.min[key]:
|
||||
fit_params[n_fit_params] = self.dom.start[key]
|
||||
params_index[key] = n_fit_params
|
||||
n_fit_params += 1
|
||||
bounds.append((self.dom.min[key], self.dom.max[key]))
|
||||
fit_params.resize((n_fit_params))
|
||||
|
||||
fit_result = so.minimize(self._minfunc, fit_params,
|
||||
args=(params_index, const_params),
|
||||
method='L-BFGS-B', jac=True, bounds=bounds)
|
||||
|
||||
msc_result = const_params.copy()
|
||||
for key, index in params_index.items():
|
||||
msc_result[key] = fit_result.x[index]
|
||||
msc_result['rfac'] = fit_result.fun
|
||||
|
||||
self._outfile.write("# result of gradient optimization\n")
|
||||
self._outfile.write("# success = {0}, iterations = {1}, calculations = {2}\n".format(fit_result.success, fit_result.nit, fit_result.nfev))
|
||||
self._outfile.write("# message: {0}\n".format(fit_result.message))
|
||||
for name in self._outfile_keys:
|
||||
self._outfile.write(" " + str(msc_result[name]))
|
||||
self._outfile.write("\n")
|
||||
|
||||
def _minfunc(self, fit_params, params_index, const_params):
|
||||
"""
|
||||
function to be minimized
|
||||
|
||||
fit_params (numpy.ndarray): current fit position
|
||||
params_index (dict): dictionary of fit parameters
|
||||
and their index in fit_params.
|
||||
key=MSC parameter name, value=index to fit_params.
|
||||
const_params (dict): dictionary of MSC parameters
|
||||
holding (at least) the constant parameter values.
|
||||
a copy of this instance, updated with the current fit position,
|
||||
is passed to MSC.
|
||||
"""
|
||||
|
||||
# unpack parameters
|
||||
msc_params = const_params.copy()
|
||||
for key, index in params_index.items():
|
||||
msc_params[key] = fit_params[index]
|
||||
|
||||
# run MSC calculations
|
||||
rfac, jac_dict = self.run_msc_calcs(msc_params, params_index)
|
||||
|
||||
# pack jacobian
|
||||
jac_arr = np.zeros_like(fit_params)
|
||||
for key, index in params_index.items():
|
||||
jac_arr[index] = jac_dict[key]
|
||||
|
||||
return rfac, jac_arr
|
||||
|
||||
def run_msc_calcs(self, params, params_index):
|
||||
"""
|
||||
params: dictionary of actual parameters
|
||||
params_index: dictionary of fit parameter indices.
|
||||
only the keys are used here
|
||||
to decide for which parameters the derivative is calculated.
|
||||
|
||||
returns:
|
||||
(float) R-factor at the params location
|
||||
(dict) approximate gradient at the params location
|
||||
"""
|
||||
# distribute tasks for gradient
|
||||
slave_rank = 1
|
||||
for key in params_index:
|
||||
params2 = params.copy()
|
||||
params2[key] += self.dom.step[key]
|
||||
params2['key'] = key
|
||||
self.comm.send(params2, dest=slave_rank, tag=TAG_NEW_TASK)
|
||||
slave_rank += 1
|
||||
|
||||
# run calculation for actual position
|
||||
result0 = self.calc(params)
|
||||
for name in self._outfile_keys:
|
||||
self._outfile.write(" " + str(result0[name]))
|
||||
self._outfile.write("\n")
|
||||
|
||||
# gather results
|
||||
s = MPI.Status()
|
||||
jacobian = params.copy()
|
||||
for slave in range(1, slave_rank):
|
||||
result1 = self.comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=s)
|
||||
if s.tag == TAG_NEW_RESULT:
|
||||
key = result1['key']
|
||||
jacobian[key] = (result1['rfac'] - result0['rfac']) / (result1[key] - result0[key])
|
||||
for name in self._outfile_keys:
|
||||
self._outfile.write(" " + str(result1[name]))
|
||||
self._outfile.write("\n")
|
||||
|
||||
self._outfile.flush()
|
||||
return result0['rfac'], jacobian
|
||||
|
||||
def cleanup(self):
|
||||
"""
|
||||
cleanup: close output file, terminate slave processes
|
||||
"""
|
||||
self._outfile.close()
|
||||
for rank in range(1, self.running_slaves + 1):
|
||||
self.comm.send(None, dest=rank, tag=TAG_FINISH)
|
||||
super(MscMaster, self).cleanup()
|
||||
|
||||
class MscSlave(MscProcess):
|
||||
|
||||
def run(self):
|
||||
"""
|
||||
Waits for messages from the master and dispatches tasks.
|
||||
"""
|
||||
s = MPI.Status()
|
||||
self.running = True
|
||||
while self.running:
|
||||
data = self.comm.recv(source=0, tag=MPI.ANY_TAG, status=s)
|
||||
if s.tag == TAG_NEW_TASK:
|
||||
self.accept_task(data)
|
||||
elif s.tag == TAG_FINISH:
|
||||
self.running = False
|
||||
|
||||
def accept_task(self, pars):
|
||||
"""
|
||||
Executes a calculation task and returns the result to the master.
|
||||
"""
|
||||
result = self.calc(pars)
|
||||
self.comm.send(result, dest=0, tag=TAG_NEW_RESULT)
|
||||
self.iteration += 1
|
||||
|
||||
def optimize(project):
|
||||
"""
|
||||
main entry point for optimization
|
||||
|
||||
rank 0: starts the calculation, distributes tasks
|
||||
ranks 1...N-1: work on assignments from rank 0
|
||||
"""
|
||||
mpi_comm = MPI.COMM_WORLD
|
||||
mpi_rank = mpi_comm.Get_rank()
|
||||
|
||||
if mpi_rank == 0:
|
||||
master = MscMaster(mpi_comm)
|
||||
master.setup(project)
|
||||
master.run()
|
||||
master.cleanup()
|
||||
else:
|
||||
slave = MscSlave(mpi_comm)
|
||||
slave.setup(project)
|
||||
slave.run()
|
||||
slave.cleanup()
|
||||
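The gradient that `run_msc_calcs()` gathers from the slave processes is a forward-difference quotient: one extra calculation per fit parameter, each shifted by the step size of that parameter. The following serial sketch shows the arithmetic without MPI. The name `forward_difference_gradient` and its arguments are illustrative assumptions, not part of the module.

```python
def forward_difference_gradient(func, params, steps):
    """Approximate the gradient of func at params by forward differences.

    func: callable that takes a dict of parameters and returns a float.
    params: dict of parameter values (the current position).
    steps: dict of step sizes; one derivative is computed per key.

    Returns the function value at params and a dict of derivatives.
    """
    f0 = func(params)
    gradient = {}
    for key, step in steps.items():
        shifted = dict(params)
        shifted[key] += step
        # difference of function values divided by difference of the parameter
        gradient[key] = (func(shifted) - f0) / (shifted[key] - params[key])
    return f0, gradient


def demo_func(p):
    # stand-in for an R-factor calculation
    return p['a'] ** 2 + 3.0 * p['b']


value, grad = forward_difference_gradient(
    demo_func, {'a': 1.0, 'b': 2.0}, {'a': 0.5, 'b': 0.5})
```

In the MPI version each shifted evaluation runs on its own slave process, which is why the module asks for one process per fit parameter in addition to the master.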
@@ -0,0 +1,409 @@
|
||||
"""
|
||||
@package pmsco.grid
|
||||
grid search optimization handler.
|
||||
|
||||
the module starts multiple MSC calculations and varies parameters on a fixed coordinate grid.
|
||||
|
||||
@author Matthias Muntwiler, matthias.muntwiler@psi.ch
|
||||
|
||||
@copyright (c) 2015 by Paul Scherrer Institut @n
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); @n
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
"""
|
||||
|
||||
from __future__ import division
|
||||
import copy
|
||||
import os
|
||||
import datetime
|
||||
import numpy as np
|
||||
import logging
|
||||
import handlers
|
||||
from helpers import BraceMessage as BMsg
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class GridPopulation(object):
|
||||
"""
|
||||
grid population.
|
||||
"""
|
||||
|
||||
## @var model_start
|
||||
# (dict) initial model parameters.
|
||||
# read-only. call setup() to change this attribute.
|
||||
|
||||
## @var model_min
|
||||
# (dict) low limits of the model parameters.
|
||||
# read-only. call setup() to change this attribute.
|
||||
|
||||
## @var model_max
|
||||
# (dict) high limits of the model parameters.
|
||||
# if min == max, the parameter is kept constant.
|
||||
# read-only. call setup() to change this attribute.
|
||||
|
||||
## @var model_step
|
||||
# (dict) step size (distance between two grid points) of each model parameter.
|
||||
# read-only. call setup() to change this attribute.
|
||||
|
||||
## @var model_count
|
||||
# number of models (grid points).
|
||||
# initial value = 0.
|
||||
|
||||
## @var positions
|
||||
# (numpy.ndarray) flat list of grid coordinates and results.
|
||||
#
|
||||
# the column names include the names of the model parameters, taken from domain.start,
|
||||
# and the special names @c '_model', @c '_rfac'.
|
||||
# the special fields have the following meanings:
|
||||
#
|
||||
# * @c '_model': model number.
|
||||
# the model number identifies the grid point.
|
||||
# the field is used to associate the result of a calculation with the coordinate vector.
|
||||
# the model handlers use it to derive their model ID.
|
||||
#
|
||||
# * @c '_rfac': calculated R-factor for this position.
|
||||
# it is set by the add_result() method.
|
||||
#
|
||||
# @note if you read a single element, e.g. pos[0], from the array, you will get a numpy.void object.
|
||||
# this object is a <em>view</em> of the original array item
|
||||
|
||||
def __init__(self):
|
||||
"""
|
||||
initialize the population object.
|
||||
|
||||
"""
|
||||
self.model_start = {}
|
||||
self.model_min = {}
|
||||
self.model_max = {}
|
||||
self.model_step = {}
|
||||
|
||||
self.model_count = 0
|
||||
|
||||
self.positions = None
|
||||
|
||||
self.search_keys = []
|
||||
self.fixed_keys = []
|
||||
|
||||
@staticmethod
|
||||
def get_model_dtype(model_params):
|
||||
"""
|
||||
get numpy array data type for model parameters and grid control variables.
|
||||
|
||||
@param model_params: dictionary of model parameters or list of parameter names.
|
||||
|
||||
@return: dtype for use with numpy array constructors.
|
||||
this is a sorted list of (name, type) tuples.
|
||||
"""
|
||||
dt = []
|
||||
for key in model_params:
|
||||
dt.append((key, 'f4'))
|
||||
dt.append(('_model', 'i4'))
|
||||
dt.append(('_rfac', 'f4'))
|
||||
dt.sort(key=lambda t: t[0].lower())
|
||||
return dt
|
||||
|
||||
def setup(self, domain):
|
||||
"""
|
||||
set up the population and result arrays.
|
||||
|
||||
@param domain: definition of initial and limiting model parameters
|
||||
expected by the cluster and parameters functions.
|
||||
|
||||
@param domain.start: values of the fixed parameters.
|
||||
|
||||
@param domain.min: minimum values allowed.
|
||||
|
||||
@param domain.max: maximum values allowed.
|
||||
if abs(max - min) < step/2 , the parameter is kept constant.
|
||||
|
||||
@param domain.step: step size (distance between two grid points).
|
||||
if step <= 0, the parameter is kept constant.
|
||||
"""
|
||||
self.model_start = domain.start
|
||||
self.model_min = domain.min
|
||||
self.model_max = domain.max
|
||||
self.model_step = domain.step
|
||||
|
||||
self.model_count = 1
|
||||
self.search_keys = []
|
||||
self.fixed_keys = []
|
||||
scales = []
|
||||
|
||||
for p in domain.step.keys():
|
||||
if domain.step[p] > 0:
|
||||
n = int(np.round((domain.max[p] - domain.min[p]) / domain.step[p])) + 1
|
||||
else:
|
||||
n = 1
|
||||
if n > 1:
|
||||
self.search_keys.append(p)
|
||||
scales.append(np.linspace(domain.min[p], domain.max[p], n))
|
||||
else:
|
||||
self.fixed_keys.append(p)
|
||||
|
||||
# scales is a list of 1D arrays that hold the coordinates of the individual dimensions
|
||||
# positions_nd is a list of N-D arrays that hold the coordinates in all dimensions
|
||||
# positions_flat is a list of 1D arrays that hold the coordinates in flat sequence
|
||||
if len(scales) > 1:
|
||||
positions_nd = np.meshgrid(*scales, indexing='ij')
|
||||
positions_flat = [arr.flatten() for arr in positions_nd]
|
||||
else:
|
||||
positions_flat = scales
|
||||
self.model_count = positions_flat[0].shape[0]
|
||||
|
||||
# shuffle the calculation order so that we may see the more interesting parts earlier
|
||||
shuffle_index = np.arange(self.model_count)
|
||||
np.random.shuffle(shuffle_index)
|
||||
positions_reordered = [pos[shuffle_index] for pos in positions_flat]
|
||||
|
||||
dt = self.get_model_dtype(self.model_min)
|
||||
|
||||
# positions
|
||||
self.positions = np.zeros(self.model_count, dtype=dt)
|
||||
|
||||
for idx, key in enumerate(self.search_keys):
|
||||
self.positions[key] = positions_reordered[idx]
|
||||
for idx, key in enumerate(self.fixed_keys):
|
||||
self.positions[key] = self.model_start[key]
|
||||
|
||||
self.positions['_model'] = np.arange(self.model_count)
|
||||
self.positions['_rfac'] = 2.1
|
||||
|
||||
def add_result(self, particle, rfac):
|
||||
"""
|
||||
add a calculation result to the positions array.
|
||||
|
||||
@param particle: dictionary of model parameters and particle values.
|
||||
the keys must correspond to the columns of the positions array,
|
||||
i.e. the names of the model parameters plus the _rfac, and _model fields.
|
||||
|
||||
@param rfac: calculated R-factor.
|
||||
the R-factor is written to the '_rfac' field.
|
||||
|
||||
@return None
|
||||
"""
|
||||
model = particle['_model']
|
||||
self.positions['_rfac'][model] = rfac
|
||||
|
||||
def save_array(self, filename, array):
|
||||
"""
|
||||
saves a population array to a text file.
|
||||
|
||||
@param array: population array to save.
|
||||
in this class, the only population array is self.positions.
|
||||
"""
|
||||
header = " ".join(self.positions.dtype.names)
|
||||
np.savetxt(filename, array, fmt='%g', header=header)
|
||||
|
||||
def load_array(self, filename, array):
|
||||
"""
|
||||
load a population array from a text file.
|
||||
|
||||
the array to load must be compatible with the current population
|
||||
(same number of rows, same columns).
|
||||
the first row must contain column names.
|
||||
the ordering of columns may be different.
|
||||
the returned array is ordered according to the array argument.
|
||||
|
||||
@param array: population array to load.
|
||||
in this class, the only population array is self.positions.
|
||||
|
||||
@return array with loaded data.
|
||||
this may be the same instance as on input.
|
||||
|
||||
@raise AssertionError if the shape of the loaded data differs from the shape of the array argument.
|
||||
"""
|
||||
data = np.genfromtxt(filename, names=True)
|
||||
assert data.shape == array.shape
|
||||
for name in data.dtype.names:
|
||||
array[name] = data[name]
|
||||
return array
|
||||
|
||||
def save_population(self, base_filename):
|
||||
"""
|
||||
saves the population array to a set of text files.
|
||||
|
||||
the file name extension is .pos.
|
||||
"""
|
||||
self.save_array(base_filename + ".pos", self.positions)
|
||||
|
||||
def load_population(self, base_filename):
|
||||
"""
|
||||
loads the population array from a set of previously saved text files.
|
||||
this can be used to continue an optimization job.
|
||||
|
||||
the file name extension is .pos.
|
||||
the files must have the same format as produced by save_population.
|
||||
the files must have the same number of rows.
|
||||
"""
|
||||
self.load_array(base_filename + ".pos", self.positions)
|
||||
|
||||
def save_results(self, filename):
|
||||
"""
|
||||
saves the complete list of calculation results.
|
||||
"""
|
||||
self.save_array(filename, self.positions)
|
||||
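The grid layout built by `GridPopulation.setup()` can be illustrated without numpy. The sketch below swaps `np.linspace` and `np.meshgrid` for plain lists and `itertools.product`, and leaves out the shuffling step and the structured array. `grid_points` is a hypothetical helper, and the sorted key order is an assumption made only to keep the output reproducible.

```python
import itertools


def grid_points(minimum, maximum, step):
    """Enumerate grid coordinates for the parameters that are varied.

    minimum, maximum, step: dicts keyed by parameter name.
    A parameter is varied only if its step is positive and its range
    spans at least one step; otherwise it is treated as fixed.

    Returns the list of varied parameter names and the list of
    coordinate tuples (one value per varied parameter).
    """
    search_keys = []
    scales = []
    for key in sorted(step):
        n = 1
        if step[key] > 0:
            n = int(round((maximum[key] - minimum[key]) / step[key])) + 1
        if n > 1:
            # n equally spaced values including both limits
            width = (maximum[key] - minimum[key]) / (n - 1)
            scales.append([minimum[key] + i * width for i in range(n)])
            search_keys.append(key)
    # the last key varies fastest, as in a flattened 'ij' meshgrid
    return search_keys, list(itertools.product(*scales))


keys, points = grid_points({'a': 0.0, 'b': 1.0, 'c': 5.0},
                           {'a': 1.0, 'b': 2.0, 'c': 5.0},
                           {'a': 0.5, 'b': 1.0, 'c': 0.1})
```

Here `a` takes three values, `b` takes two, and `c` is fixed because its minimum equals its maximum, so the grid has six points.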
|
||||
|
||||
class GridSearchHandler(handlers.ModelHandler):
|
||||
"""
|
||||
model handler which implements the grid search algorithm.
|
||||
|
||||
"""
|
||||
|
||||
## @var _pop (Population)
|
||||
# holds the population object.
|
||||
|
||||
## @var _outfile (file)
|
||||
# output file for model parameters and R factor.
|
||||
# the file is open during calculations.
|
||||
# each calculation result adds one line.
|
||||
|
||||
## @var _model_time (timedelta)
|
||||
# estimated CPU time to calculate one model.
|
||||
# this value is the maximum time measured of the completed calculations.
|
||||
# it is used to determine when the optimization should be finished so that the time limit is not exceeded.
|
||||
|
||||
## @var _timeout (bool)
|
||||
# indicates when the handler has run out of time,
|
||||
# i.e. time is up before convergence has been reached.
|
||||
# if _timeout is True, create_tasks() will not create further tasks,
|
||||
# and add_result() will signal completion when the _pending_tasks queue becomes empty.
|
||||
|
||||
def __init__(self):
|
||||
super(GridSearchHandler, self).__init__()
|
||||
self._pop = None
|
||||
self._outfile = None
|
||||
self._model_time = datetime.timedelta()
|
||||
self._timeout = False
|
||||
self._invalid_limit = 10
|
||||
self._next_model = 0
|
||||
|
||||
def setup(self, project, slots):
|
||||
"""
|
||||
initialize the grid population and open an output file.
|
||||
|
||||
@param project:
|
||||
|
||||
@param slots: number of calculation processes available through MPI.
|
||||
for efficiency reasons we set the population size twice the number of available slots.
|
||||
the minimum number of slots is 1, the recommended value is 10 or greater.
|
||||
the population size is set to at least 4.
|
||||
|
||||
@return:
|
||||
"""
|
||||
super(GridSearchHandler, self).setup(project, slots)
|
||||
|
||||
self._pop = GridPopulation()
|
||||
self._pop.setup(self._project.create_domain())
|
||||
self._invalid_limit = max(slots, self._invalid_limit)
|
||||
|
||||
self._outfile = open(self._project.output_file + ".dat", "w")
|
||||
self._outfile.write("# ")
|
||||
self._outfile.write(" ".join(self._pop.positions.dtype.names))
|
||||
self._outfile.write("\n")
|
||||
|
||||
return None
|
||||
|
||||
def cleanup(self):
|
||||
self._outfile.close()
|
||||
super(GridSearchHandler, self).cleanup()
|
||||
|
||||
def create_tasks(self, parent_task):
|
||||
"""
|
||||
create a calculation task for the next grid point.
|
||||
|
||||
each call generates at most one task, for the next grid point that has not been calculated yet.
|
||||
during the first call, the method first sets up a new population.
|
||||
|
||||
the process loop calls this method every time the length of the task queue drops
|
||||
below the number of calculation processes (slots).
|
||||
|
||||
@return list of generated tasks. empty list if all grid points have been calculated.
|
||||
"""
|
||||
|
||||
super(GridSearchHandler, self).create_tasks(parent_task)
|
||||
|
||||
# this is the top-level handler, we expect just one parent: root.
|
||||
parent_id = parent_task.id
|
||||
assert parent_id == (-1, -1, -1, -1, -1)
|
||||
self._parent_tasks[parent_id] = parent_task
|
||||
|
||||
time_pending = self._model_time * len(self._pending_tasks)
|
||||
time_avail = (self.datetime_limit - datetime.datetime.now()) * max(self._slots, 1)
|
||||
|
||||
out_tasks = []
|
||||
time_pending += self._model_time
|
||||
if time_pending > time_avail:
|
||||
self._timeout = True
|
||||
|
||||
model = self._next_model
|
||||
if not self._timeout and model < self._pop.model_count and self._invalid_count < self._invalid_limit:
|
||||
new_task = parent_task.copy()
|
||||
new_task.parent_id = parent_id
|
||||
pos = self._pop.positions[model]
|
||||
new_task.model = {k: pos[k] for k in pos.dtype.names}
|
||||
new_task.change_id(model=model)
|
||||
|
||||
child_id = new_task.id
|
||||
self._pending_tasks[child_id] = new_task
|
||||
out_tasks.append(new_task)
|
||||
self._next_model += 1
|
||||
|
||||
return out_tasks
|
||||
|
||||
def add_result(self, task):
|
||||
"""
|
||||
calculate the R factor of the result and store it in the positions array.
|
||||
|
||||
* append the result to the result output file.
|
||||
* update the execution time statistics.
|
||||
* remove temporary files if requested.
|
||||
* check whether the grid search is complete.
|
||||
|
||||
@return parent task (CalculationTask) if the search is complete, @c None otherwise.
|
||||
"""
|
||||
super(GridSearchHandler, self).add_result(task)
|
||||
|
||||
self._complete_tasks[task.id] = task
|
||||
del self._pending_tasks[task.id]
|
||||
parent_task = self._parent_tasks[task.parent_id]
|
||||
|
||||
rfac = 1.0
|
||||
if task.result_valid:
|
||||
try:
|
||||
rfac = self._project.calc_rfactor(task)
|
||||
except ValueError:
|
||||
task.result_valid = False
|
||||
self._invalid_count += 1
|
||||
logger.warning(BMsg("calculation of model {0} resulted in an undefined R-factor.", task.id.model))
|
||||
|
||||
task.model['_rfac'] = rfac
|
||||
self._pop.add_result(task.model, rfac)
|
||||
|
||||
if self._outfile:
|
||||
s = (str(task.model[name]) for name in self._pop.positions.dtype.names)
|
||||
self._outfile.write(" ".join(s))
|
||||
self._outfile.write("\n")
|
||||
self._outfile.flush()
|
||||
|
||||
self._project.files.update_model_rfac(task.id.model, rfac)
|
||||
self._project.files.set_model_complete(task.id.model, True)
|
||||
|
||||
if task.result_valid:
|
||||
if task.time > self._model_time:
|
||||
self._model_time = task.time
|
||||
|
||||
# grid search complete?
|
||||
if len(self._pending_tasks) == 0:
|
||||
del self._parent_tasks[parent_task.id]
|
||||
else:
|
||||
parent_task = None
|
||||
|
||||
self.cleanup_files()
|
||||
return parent_task
|
||||
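The time budget check in `create_tasks()` compares the estimated time of all pending calculations plus one more against the remaining wall time multiplied by the number of slots. A minimal sketch, assuming the values are passed in explicitly; the function `out_of_time` and its arguments are hypothetical and not part of the handler.

```python
import datetime


def out_of_time(model_time, pending_count, time_limit, slots, now):
    """Return True if one more task would exceed the available time.

    model_time: (timedelta) longest calculation time measured so far.
    pending_count: (int) number of tasks created but not yet finished.
    time_limit: (datetime) time when the job must be finished.
    slots: (int) number of parallel calculation processes.
    now: (datetime) current time.
    """
    # time needed by the pending tasks plus the task about to be created
    time_pending = model_time * (pending_count + 1)
    # remaining wall time, multiplied by the number of parallel processes
    time_avail = (time_limit - now) * max(slots, 1)
    return time_pending > time_avail


now = datetime.datetime(2017, 11, 22, 12, 0, 0)
limit = now + datetime.timedelta(minutes=20)
ten_minutes = datetime.timedelta(minutes=10)
tight = out_of_time(ten_minutes, 5, limit, 2, now)
relaxed = out_of_time(ten_minutes, 5, limit, 4, now)
```

With five pending tasks of up to ten minutes each, two slots cannot finish a sixth task within twenty minutes, whereas four slots can.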
@@ -0,0 +1,948 @@
|
||||
"""
|
||||
@package pmsco.handlers
|
||||
project-independent task handlers for models, scans, symmetries, emitters and energies.
|
||||
|
||||
calculation tasks are organized in a hierarchical tree.
|
||||
at each node, a task handler (feel free to find a better name)
|
||||
creates a set of child tasks according to the optimization mode and requirements of the project.
|
||||
at the end points of the tree, the tasks are ready to be sent to the calculation program.
|
||||
the handlers collect the results, and return one combined dataset per node.
|
||||
the passing of tasks and results between handlers is managed by the processing loop.
|
||||
|
||||
<em>model handlers</em> define the model parameters used in calculations.
|
||||
the parameters can be chosen according to user input, or according to a structural optimization algorithm.
|
||||
a model handler class derives from the ModelHandler class.
|
||||
the simplest one, SingleModelHandler, is implemented in this module.
|
||||
it calculates the diffraction pattern of a single model with the start parameters given in the domain object.
|
||||
the handlers of the structural optimizers are declared in separate modules.
|
||||
|
||||
<em>scan handlers</em> split a task into one child task per scan file.
|
||||
scans are defined by the project.
|
||||
the actual merging step from multiple scans into one result dataset is delegated to the project class.
|
||||
|
||||
<em>symmetry handlers</em> split a task into one child per symmetry.
|
||||
symmetries are defined by the project.
|
||||
the actual merging step from multiple symmetries into one result dataset is delegated to the project class.
|
||||
|
||||
<em>emitter handlers</em> split a task into one child per emitter configuration (inequivalent sets of emitting atoms).
|
||||
emitter configurations are defined by the project.
|
||||
the merging of calculation results of emitter configurations is delegated to the project class.
|
||||
since emitters contribute incoherently to the diffraction pattern,
|
||||
it should make no difference how the emitters are grouped and calculated.
|
||||
code inspection and tests have shown that per-emitter results from EDAC can be simply added.
|
||||
|
||||
<em>energy handlers</em> may split a calculation task into multiple tasks
|
||||
in order to take advantage of parallel processing.

while several classes of model handlers are available,
the default handlers for scans, symmetries, emitters and energies should be sufficient in most situations.
the scan and symmetry handlers call methods of the project class to invoke project-specific functionality.

@author Matthias Muntwiler, matthias.muntwiler@psi.ch

@copyright (c) 2015-17 by Paul Scherrer Institut @n
Licensed under the Apache License, Version 2.0 (the "License"); @n
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
"""

from __future__ import division
import datetime
import os
import logging
import math
import numpy as np
import data as md
from helpers import BraceMessage as BMsg

logger = logging.getLogger(__name__)


class TaskHandler(object):
    """
    common ancestor for task handlers.

    this class defines the common interface of task handlers.
    """

    ## @var _project
    # (Project) project instance.

    ## @var _slots
    # (int) number of calculation slots (processes).
    #
    # for best efficiency the number of tasks generated should be greater than or equal to the number of slots.
    # it should not exceed N times the number of slots, where N is a reasonably small number.

    ## @var _pending_tasks
    # (dict) pending tasks by ID (created but not yet calculated).
    #
    # the dictionary keys are the task identifiers CalculationTask.id,
    # the values are the corresponding CalculationTask objects.

    ## @var _complete_tasks
    # (dict) complete tasks by ID (calculation finished, parent not yet complete).
    #
    # the dictionary keys are the task identifiers CalculationTask.id,
    # the values are the corresponding CalculationTask objects.

    ## @var _parent_tasks
    # (dict) pending parent tasks by ID.
    #
    # the dictionary keys are the task identifiers CalculationTask.id,
    # the values are the corresponding CalculationTask objects.

    ## @var _invalid_count
    # (int) accumulated total number of invalid results received.
    #
    # the number is incremented by add_result if an invalid task is reported.
    # the number can be used by descendants to terminate a hopeless calculation.

    def __init__(self):
        self._project = None
        self._slots = 0
        self._pending_tasks = {}
        self._parent_tasks = {}
        self._complete_tasks = {}
        self._invalid_count = 0

    def setup(self, project, slots):
        """
        initialize the handler with project data and the process environment.

        the method is called once by the dispatcher before the calculation loop starts.
        the handler can initialize internal variables that it has not initialized in the constructor.

        @param project (Project) project instance.

        @param slots (int) number of calculation slots (processes).
        for best efficiency the number of tasks generated should be greater than or equal to the number of slots.
        it should not exceed N times the number of slots, where N is a reasonably small number.

        @return None
        """
        self._project = project
        self._slots = slots

    def cleanup(self):
        """
        clean up whatever is necessary, e.g. close files.

        this method is called once after all calculations have finished.

        @return None
        """
        pass

    def create_tasks(self, parent_task):
        """
        create the next series of child tasks for the given parent task.

        the method is called by the dispatcher when a new series of tasks should be generated.

        when no more tasks are to be calculated, the method must return an empty list.
        processing will finish when all pending and running tasks are complete.

        @param parent_task (CalculationTask) task with initial model parameters.

        @return list of CalculationTask objects holding the parameters for the next calculations.
        the list must be empty if there are no more tasks.
        """

        return []

    def add_result(self, task):
        """
        collect and combine the results of tasks created by the same handler.

        this method collects the results of tasks that were created by self.create_tasks() and
        passes them on to the parent whenever a family (i.e. all tasks that have the same parent) is complete.
        when the family is complete, the method creates the data files that are represented by the parent task and
        signals to the caller that the parent task is complete.

        the method is called by the dispatcher whenever a calculation task belonging to this handler completes.

        in this base class, the method counts invalid results and
        adds the list of data files to the project's file tracker.
        collecting the tasks and combining their data must be implemented in sub-classes.

        @param task: (CalculationTask) calculation task that completed.

        @return parent task (CalculationTask) if the family is complete,
        None if the family is not complete yet.
        in this base class, the method returns None.
        """
        if not task.result_valid:
            self._invalid_count += 1

        self.track_files(task)

        return None

    def track_files(self, task):
        """
        register all task files with the file tracker of the project.

        @param task: CalculationTask object.
        the id, model, and files attributes are required.
        if model contains a '_rfac' value, the r-factor is

        @return: None
        """
        model_id = task.id.model
        # items() works in Python 2 and 3, unlike iteritems()
        for path, cat in task.files.items():
            self._project.files.add_file(path, model_id, category=cat)

    def cleanup_files(self, keep=10):
        """
        delete uninteresting files.

        @param keep: number of best ranking models to keep.

        @return: None
        """
        self._project.files.delete_files(keep_rfac=keep)


class ModelHandler(TaskHandler):
    """
    abstract model handler.

    structural optimizers must be derived from this class and implement a loop on the model.
    """

    ## @var datetime_limit (datetime.datetime)
    # date and time when the model handler should finish (regardless of result)
    # because the process may get killed by the scheduler after this time.
    #
    # the default is 100 days after creation of the handler.

    def __init__(self):
        super(ModelHandler, self).__init__()
        self.datetime_limit = datetime.datetime.now() + datetime.timedelta(days=100)

    def create_tasks(self, parent_task):
        """
        create tasks for the next population of models.

        the method is called repeatedly by the dispatcher when the calculation queue runs empty.
        the model handler should then create the next round of tasks, e.g. the next generation of a population.
        the number of tasks created can be as low as one.

        when no more tasks are to be calculated, the method must return an empty list.
        processing will finish when all pending and running tasks are complete.

        @note it is not possible to hold back calculations, or to wait for results.
        the handler must either return a task, or signal the end of the optimization process.

        @param parent_task (CalculationTask) task with initial model parameters.

        @return list of CalculationTask objects holding the parameters for the next calculations.
        the list must be empty if there are no more tasks.
        """
        super(ModelHandler, self).create_tasks(parent_task)

        return []

    def add_result(self, task):
        """
        collect and combine the results of a model calculation.

        this method is called by the dispatcher when all results for a model are available.
        in this class, the method only calls the inherited method and returns None.
        """
        super(ModelHandler, self).add_result(task)

        return None


class SingleModelHandler(ModelHandler):
    """
    single model calculation handler.

    this class runs a single calculation on the start parameters defined in the domain of the project.
    """

    def create_tasks(self, parent_task):
        """
        start one task with the start parameters.

        subsequent calls will return an empty task list.

        @param parent_task (CalculationTask) task with initial model parameters.
        """
        super(SingleModelHandler, self).create_tasks(parent_task)

        out_tasks = []
        if len(self._complete_tasks) + len(self._pending_tasks) == 0:
            parent_id = parent_task.id
            self._parent_tasks[parent_id] = parent_task
            new_task = parent_task.copy()
            new_task.change_id(model=0)
            new_task.parent_id = parent_id
            child_id = new_task.id
            self._pending_tasks[child_id] = new_task
            out_tasks.append(new_task)

        return out_tasks

    def add_result(self, task):
        """
        collect the end result of a single calculation.

        the SingleModelHandler runs calculations for a single model.
        this method assumes that it will be called just once.
        it returns the parent task to signal the end of the calculations.

        the result file is not deleted regardless of the files_to_delete project option.
        the task ID is removed from the file name.

        @param task: (CalculationTask) calculation task that completed.

        @return (CalculationTask) parent task.
        """
        super(SingleModelHandler, self).add_result(task)

        self._complete_tasks[task.id] = task
        del self._pending_tasks[task.id]

        parent_task = self._parent_tasks[task.parent_id]
        del self._parent_tasks[task.parent_id]

        parent_task.result_valid = task.result_valid
        parent_task.file_ext = task.file_ext
        parent_task.result_filename = parent_task.file_root + parent_task.file_ext
        modf_ext = ".modf" + parent_task.file_ext
        parent_task.modf_filename = parent_task.file_root + modf_ext

        rfac = 1.0
        if task.result_valid:
            try:
                rfac = self._project.calc_rfactor(task)
            except ValueError:
                task.result_valid = False
                logger.warning(BMsg("calculation of model {0} resulted in an undefined R-factor.", task.id.model))

        task.model['_rfac'] = rfac
        self.save_report_file(task.model)

        self._project.files.update_model_rfac(task.id.model, rfac)
        self._project.files.set_model_complete(task.id.model, True)

        parent_task.time = task.time

        return parent_task

    def save_report_file(self, result):
        """
        save model parameters and r-factor to a file.

        the file name is derived from the project's output_file with '.dat' extension.
        the file has a space-separated column format.
        the first line contains the parameter names.
        this is the same format as used by the swarm and grid handlers.
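for illustration, a file in this format can be produced and read back as sketched below.
the parameter names and values are made up, an in-memory buffer stands in for the output file,
and the keys are sorted by their full lower-case name, which is a simplification.

```python
import io

# hypothetical result dictionary: model parameters plus the r-factor
result = {"dAB": 2.1, "_rfac": 0.35, "V0": 10.0}

# header line with the parameter names, then one line of values
keys = sorted(result, key=lambda name: name.lower())
buffer = io.StringIO()
buffer.write(u"# " + u" ".join(keys) + u"\n")
buffer.write(u" ".join(str(result[key]) for key in keys) + u"\n")

# read the two lines back into names and numeric values
lines = buffer.getvalue().splitlines()
names = lines[0].lstrip("# ").split()
values = [float(v) for v in lines[1].split()]
```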

        @param result: dictionary of results and parameters. the values should be scalars and strings.

        @return: None
        """
        keys = [key for key in result]
        keys.sort(key=lambda t: t[0].lower())
        vals = (str(result[key]) for key in keys)
        with open(self._project.output_file + ".dat", "w") as outfile:
            outfile.write("# ")
            outfile.write(" ".join(keys))
            outfile.write("\n")
            outfile.write(" ".join(vals))
            outfile.write("\n")


class ScanHandler(TaskHandler):
    """
    split the parameters into one set per scan and gather the results.

    the scan selection takes effect in MscoProcess.calc().
    """

    ## @var _pending_ids_per_parent
    # (dict) sets of pending child task IDs per parent
    #
    # each dictionary element is a set of IDs referring to pending calculation tasks (children)
    # belonging to a parent task identified by the key.
    #
    # the dictionary keys are the task identifiers CalculationTask.id of the parent tasks,
    # the values are sets of all child CalculationTask.id belonging to the parent.

    ## @var _complete_ids_per_parent
    # (dict) sets of complete child task IDs per parent
    #
    # each dictionary element is a set of IDs referring to complete calculation tasks (children)
    # belonging to a parent task identified by the key.
    #
    # the dictionary keys are the task identifiers CalculationTask.id of the parent tasks,
    # the values are sets of all child CalculationTask.id belonging to the parent.

    def __init__(self):
        super(ScanHandler, self).__init__()
        self._pending_ids_per_parent = {}
        self._complete_ids_per_parent = {}

    def create_tasks(self, parent_task):
        """
        generate a calculation task for each scan of the given parent task.

        all scans share the model parameters.

        @return list of CalculationTask objects, with one element per scan.
        the scan index varies according to project.scans.
        """
        super(ScanHandler, self).create_tasks(parent_task)

        parent_id = parent_task.id
        self._parent_tasks[parent_id] = parent_task
        assert parent_id not in self._pending_ids_per_parent
        self._pending_ids_per_parent[parent_id] = set()
        self._complete_ids_per_parent[parent_id] = set()

        out_tasks = []
        for (i_scan, scan) in enumerate(self._project.scans):
            new_task = parent_task.copy()
            new_task.parent_id = parent_id
            new_task.change_id(scan=i_scan)

            child_id = new_task.id
            self._pending_tasks[child_id] = new_task
            self._pending_ids_per_parent[parent_id].add(child_id)

            out_tasks.append(new_task)

        if not out_tasks:
            logger.error("no scan tasks generated. your project must link to at least one scan file.")

        return out_tasks

    def add_result(self, task):
        """
        collect and combine the calculation results versus scan.

        * mark the task as complete
        * store its result for later
        * check whether this was the last pending task of the family (belonging to the same parent).

        the actual merging of data is delegated to the project's combine_scans() method.

        @param task: (CalculationTask) calculation task that completed.

        @return parent task (CalculationTask) if the family is complete. None if the family is not complete yet.
        """
        super(ScanHandler, self).add_result(task)

        self._complete_tasks[task.id] = task
        del self._pending_tasks[task.id]

        family_pending = self._pending_ids_per_parent[task.parent_id]
        family_complete = self._complete_ids_per_parent[task.parent_id]
        family_pending.remove(task.id)
        family_complete.add(task.id)

        # all scans complete?
        if len(family_pending) == 0:
            parent_task = self._parent_tasks[task.parent_id]

            parent_task.file_ext = task.file_ext
            parent_task.result_filename = parent_task.format_filename()
            modf_ext = ".modf" + parent_task.file_ext
            parent_task.modf_filename = parent_task.format_filename(ext=modf_ext)

            child_tasks = [self._complete_tasks[task_id] for task_id in sorted(family_complete)]

            # the parent result is valid only if all child results are valid
            child_valid = [t.result_valid for t in child_tasks]
            parent_task.result_valid = all(child_valid)
            child_times = [t.time for t in child_tasks]
            parent_task.time = reduce(lambda a, b: a + b, child_times)

            if parent_task.result_valid:
                self._project.combine_scans(parent_task, child_tasks)
                self._project.files.add_file(parent_task.result_filename, parent_task.id.model, 'model')
                self._project.files.add_file(parent_task.modf_filename, parent_task.id.model, 'model')

            del self._pending_ids_per_parent[parent_task.id]
            del self._complete_ids_per_parent[parent_task.id]
            del self._parent_tasks[parent_task.id]

            return parent_task
        else:
            return None


class SymmetryHandler(TaskHandler):
    """
    split the parameters into one set per symmetry and gather the results.
    """

    ## @var _pending_ids_per_parent
    # (dict) sets of pending child task IDs per parent
    #
    # each dictionary element is a set of IDs referring to pending calculation tasks (children)
    # belonging to a parent task identified by the key.
    #
    # the dictionary keys are the task identifiers CalculationTask.id of the parent tasks,
    # the values are sets of all child CalculationTask.id belonging to the parent.

    ## @var _complete_ids_per_parent
    # (dict) sets of complete child task IDs per parent
    #
    # each dictionary element is a set of IDs referring to complete calculation tasks (children)
    # belonging to a parent task identified by the key.
    #
    # the dictionary keys are the task identifiers CalculationTask.id of the parent tasks,
    # the values are sets of all child CalculationTask.id belonging to the parent.

    def __init__(self):
        super(SymmetryHandler, self).__init__()
        self._pending_ids_per_parent = {}
        self._complete_ids_per_parent = {}

    def create_tasks(self, parent_task):
        """
        generate a calculation task for each symmetry of the given parent task.

        all symmetries share the same model parameters.

        @return list of CalculationTask objects, with one element per symmetry.
        the symmetry index varies according to project.symmetries.
        """
        super(SymmetryHandler, self).create_tasks(parent_task)

        parent_id = parent_task.id
        self._parent_tasks[parent_id] = parent_task
        self._pending_ids_per_parent[parent_id] = set()
        self._complete_ids_per_parent[parent_id] = set()

        out_tasks = []
        for (i_sym, sym) in enumerate(self._project.symmetries):
            new_task = parent_task.copy()
            new_task.parent_id = parent_id
            new_task.change_id(sym=i_sym)

            child_id = new_task.id
            self._pending_tasks[child_id] = new_task
            self._pending_ids_per_parent[parent_id].add(child_id)

            out_tasks.append(new_task)

        if not out_tasks:
            logger.error("no symmetry tasks generated. your project must declare at least one symmetry.")

        return out_tasks

    def add_result(self, task):
        """
        collect and combine the calculation results versus symmetry.

        * mark the task as complete
        * store its result for later
        * check whether this was the last pending task of the family (belonging to the same parent).

        the actual merging of data is delegated to the project's combine_symmetries() method.

        @param task: (CalculationTask) calculation task that completed.

        @return parent task (CalculationTask) if the family is complete. None if the family is not complete yet.
        """
        super(SymmetryHandler, self).add_result(task)

        self._complete_tasks[task.id] = task
        del self._pending_tasks[task.id]

        family_pending = self._pending_ids_per_parent[task.parent_id]
        family_complete = self._complete_ids_per_parent[task.parent_id]
        family_pending.remove(task.id)
        family_complete.add(task.id)

        # all symmetries complete?
        if len(family_pending) == 0:
            parent_task = self._parent_tasks[task.parent_id]

            parent_task.file_ext = task.file_ext
            parent_task.result_filename = parent_task.format_filename()
            modf_ext = ".modf" + parent_task.file_ext
            parent_task.modf_filename = parent_task.format_filename(ext=modf_ext)

            child_tasks = [self._complete_tasks[task_id] for task_id in sorted(family_complete)]

            # the parent result is valid only if all child results are valid
            child_valid = [t.result_valid for t in child_tasks]
            parent_task.result_valid = all(child_valid)
            child_times = [t.time for t in child_tasks]
            parent_task.time = reduce(lambda a, b: a + b, child_times)

            if parent_task.result_valid:
                self._project.combine_symmetries(parent_task, child_tasks)
                self._project.files.add_file(parent_task.result_filename, parent_task.id.model, 'scan')
                self._project.files.add_file(parent_task.modf_filename, parent_task.id.model, 'scan')

            del self._pending_ids_per_parent[parent_task.id]
            del self._complete_ids_per_parent[parent_task.id]
            del self._parent_tasks[parent_task.id]

            return parent_task
        else:
            return None


class EmitterHandler(TaskHandler):
    """
    the emitter handler distributes emitter configurations to calculation tasks and collects their results.
    """

    ## @var _pending_ids_per_parent
    # (dict) sets of pending child task IDs per parent
    #
    # each dictionary element is a set of IDs referring to pending calculation tasks (children)
    # belonging to a parent task identified by the key.
    #
    # the dictionary keys are the task identifiers CalculationTask.id of the parent tasks,
    # the values are sets of all child CalculationTask.id belonging to the parent.

    ## @var _complete_ids_per_parent
    # (dict) sets of complete child task IDs per parent
    #
    # each dictionary element is a set of IDs referring to complete calculation tasks (children)
    # belonging to a parent task identified by the key.
    #
    # the dictionary keys are the task identifiers CalculationTask.id of the parent tasks,
    # the values are sets of all child CalculationTask.id belonging to the parent.

    def __init__(self):
        super(EmitterHandler, self).__init__()
        self._pending_ids_per_parent = {}
        self._complete_ids_per_parent = {}

    def create_tasks(self, parent_task):
        """
        generate a calculation task for each emitter configuration of the given parent task.

        all emitters share the same model parameters.

        @return list of @ref CalculationTask objects with one element per emitter configuration
        if parallel processing is enabled.
        otherwise the list contains a single CalculationTask object with emitter index 0.
        the emitter index is used by the project's create_cluster method.
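the choice of emitter indices can be sketched as a small stand-alone function.
the function name is hypothetical, and the reading that index 0 stands for a single task covering all emitters
is an assumption based on the description above.

```python
def emitter_indices(n_emitters, slots):
    # hypothetical helper, not part of this module.
    # index 0: a single task (assumed here to cover all emitters).
    # indices 1..n: one task per emitter configuration.
    if n_emitters > 1 and slots > 1:
        return list(range(1, n_emitters + 1))
    return [0]


serial = emitter_indices(3, slots=1)
parallel = emitter_indices(3, slots=4)
single = emitter_indices(1, slots=4)
```

with only one slot or only one emitter configuration, a single task with index 0 is created.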
        """
        super(EmitterHandler, self).create_tasks(parent_task)

        parent_id = parent_task.id
        self._parent_tasks[parent_id] = parent_task
        self._pending_ids_per_parent[parent_id] = set()
        self._complete_ids_per_parent[parent_id] = set()

        n_emitters = self._project.cluster_generator.count_emitters(parent_task.model, parent_task.id)
        if n_emitters > 1 and self._slots > 1:
            emitters = range(1, n_emitters + 1)
        else:
            emitters = [0]

        out_tasks = []
        for em in emitters:
            new_task = parent_task.copy()
            new_task.parent_id = parent_id
            new_task.change_id(emit=em)

            child_id = new_task.id
            self._pending_tasks[child_id] = new_task
            self._pending_ids_per_parent[parent_id].add(child_id)

            out_tasks.append(new_task)

        if not out_tasks:
            logger.error("no emitter tasks generated. your project must declare at least one emitter configuration.")

        return out_tasks

    def add_result(self, task):
        """
        collect and combine the calculation results of inequivalent emitters.

        * mark the task as complete
        * store its result for later
        * check whether this was the last pending task of the family (belonging to the same parent).

        the actual merging of data is delegated to the project's combine_emitters() method.

        @param task: (CalculationTask) calculation task that completed.

        @return parent task (CalculationTask) if the family is complete. None if the family is not complete yet.
        """
        super(EmitterHandler, self).add_result(task)

        self._complete_tasks[task.id] = task
        del self._pending_tasks[task.id]

        family_pending = self._pending_ids_per_parent[task.parent_id]
        family_complete = self._complete_ids_per_parent[task.parent_id]
        family_pending.remove(task.id)
        family_complete.add(task.id)

        # all emitters complete?
        if len(family_pending) == 0:
            parent_task = self._parent_tasks[task.parent_id]

            parent_task.file_ext = task.file_ext
            parent_task.result_filename = parent_task.format_filename()
            modf_ext = ".modf" + parent_task.file_ext
            parent_task.modf_filename = parent_task.format_filename(ext=modf_ext)

            child_tasks = [self._complete_tasks[task_id] for task_id in sorted(family_complete)]

            # the parent result is valid only if all child results are valid
            child_valid = [t.result_valid for t in child_tasks]
            parent_task.result_valid = all(child_valid)
            child_times = [t.time for t in child_tasks]
            parent_task.time = reduce(lambda a, b: a + b, child_times)

            if parent_task.result_valid:
                self._project.combine_emitters(parent_task, child_tasks)
                self._project.files.add_file(parent_task.result_filename, parent_task.id.model, 'symmetry')
                self._project.files.add_file(parent_task.modf_filename, parent_task.id.model, 'symmetry')

            del self._pending_ids_per_parent[parent_task.id]
            del self._complete_ids_per_parent[parent_task.id]
            del self._parent_tasks[parent_task.id]

            return parent_task
        else:
            return None


class RegionHandler(TaskHandler):
    """
    region handlers split a scan into a number of regions that can be calculated in parallel.

    this class is an abstract base class.
    it implements only common code to combine different regions into one result.
    """

    ## @var _pending_ids_per_parent
    # (dict) sets of pending child task IDs per parent
    #
    # each dictionary element is a set of IDs referring to pending calculation tasks (children)
    # belonging to a parent task identified by the key.
    #
    # the dictionary keys are the task identifiers CalculationTask.id of the parent tasks,
    # the values are sets of all child CalculationTask.id belonging to the parent.

    ## @var _complete_ids_per_parent
    # (dict) sets of complete child task IDs per parent
    #
    # each dictionary element is a set of IDs referring to complete calculation tasks (children)
    # belonging to a parent task identified by the key.
    #
    # the dictionary keys are the task identifiers CalculationTask.id of the parent tasks,
    # the values are sets of all child CalculationTask.id belonging to the parent.

    def __init__(self):
        super(RegionHandler, self).__init__()
        self._pending_ids_per_parent = {}
        self._complete_ids_per_parent = {}

    def add_result(self, task):
        """
        gather results of all regions that belong to the same parent.

        @param task: (CalculationTask) calculation task that completed.

        @return parent task (CalculationTask) if the family is complete. None if the family is not complete yet.
        """
        super(RegionHandler, self).add_result(task)

        self._complete_tasks[task.id] = task
        del self._pending_tasks[task.id]

        family_pending = self._pending_ids_per_parent[task.parent_id]
        family_complete = self._complete_ids_per_parent[task.parent_id]
        family_pending.remove(task.id)
        family_complete.add(task.id)

        # all regions ready?
        if len(family_pending) == 0:
            parent_task = self._parent_tasks[task.parent_id]

            parent_task.file_ext = task.file_ext
            parent_task.result_filename = parent_task.format_filename()
            modf_ext = ".modf" + parent_task.file_ext
            parent_task.modf_filename = parent_task.format_filename(ext=modf_ext)

            child_tasks = [self._complete_tasks[task_id] for task_id in sorted(family_complete)]

            # the parent result is valid only if all child results are valid
            child_valid = [t.result_valid for t in child_tasks]
            parent_task.result_valid = all(child_valid)
            child_times = [t.time for t in child_tasks]
            parent_task.time = reduce(lambda a, b: a + b, child_times)

            if parent_task.result_valid:
                # load the region results, convert them to a common data type, and join them into one sorted array
                stack1 = [md.load_data(t.result_filename) for t in child_tasks]
                dtype = md.common_dtype(stack1)
                stack2 = [md.restructure_data(d, dtype) for d in stack1]
                result_data = np.hstack(tuple(stack2))
                md.sort_data(result_data)
                md.save_data(parent_task.result_filename, result_data)
                self._project.files.add_file(parent_task.result_filename, parent_task.id.model, "emitter")
                for t in child_tasks:
                    self._project.files.remove_file(t.result_filename)

            del self._pending_ids_per_parent[parent_task.id]
            del self._complete_ids_per_parent[parent_task.id]
            del self._parent_tasks[parent_task.id]

            return parent_task
        else:
            return None


class SingleRegionHandler(RegionHandler):
    """
    trivial region handler.

    the whole parent task is identified as one region and calculated at once.
    """

    def create_tasks(self, parent_task):
        """
        generate one calculation task for the parent task.

        @return list containing a single CalculationTask object with region index 0.
        """
        super(SingleRegionHandler, self).create_tasks(parent_task)

        parent_id = parent_task.id
        self._parent_tasks[parent_id] = parent_task
        self._pending_ids_per_parent[parent_id] = set()
        self._complete_ids_per_parent[parent_id] = set()

        new_task = parent_task.copy()
        new_task.parent_id = parent_id
        new_task.change_id(region=0)

        child_id = new_task.id
        self._pending_tasks[child_id] = new_task
        self._pending_ids_per_parent[parent_id].add(child_id)

        out_tasks = [new_task]
        return out_tasks


class EnergyRegionHandler(RegionHandler):
    """
    split a scan into a number of energy regions that can be run in parallel.

    the purpose of this task handler is to save wall clock time on a multi-processor machine
    by splitting energy scans into smaller chunks.

    the handler distributes the processing slots to the scans in proportion to their scan lengths
    so that all child tasks of the same parent finish at approximately the same time.
    pure angle scans are not split.

    to use this feature, the project assigns this class to its @ref handler_classes['region'].
    it is safe to use this handler for calculations that do not involve energy scans.
    the handler is best used for single calculations.
    in optimizations that calculate many models there is no advantage in using it
    (on the contrary, the overhead increases the total run time slightly).
    """

    ## @var _slots_per_scan
    # (list of integers) number of processor slots assigned to each scan,
    # i.e. number of chunks to split a scan region into.
    #
    # the sequence has the same order as self._project.scans.

    def __init__(self):
        super(EnergyRegionHandler, self).__init__()
        self._slots_per_scan = []
|
||||
|
||||
    def setup(self, project, slots):
        """
        initialize the handler with project data and the process environment.

        this function distributes the processing slots to the scans.
        the slots are distributed proportional to the scan lengths of the energy scans
        so that all chunks have approximately the same size.

        the number of slots per scan is stored in @ref _slots_per_scan for later use by @ref create_tasks.

        @param project (Project) project instance.

        @param slots (int) number of calculation slots (processes).

        @return None
        """
        super(EnergyRegionHandler, self).setup(project, slots)

        scan_lengths = [scan.energies.shape[0] for scan in self._project.scans]
        total_length = sum(scan_lengths)
        f = min(1.0, float(self._slots) / total_length)
        self._slots_per_scan = [max(1, int(round(l * f))) for l in scan_lengths]

        for i, scan in enumerate(self._project.scans):
            logger.debug(BMsg("region handler: split scan {file} into {slots} chunks",
                              file=os.path.basename(scan.filename), slots=self._slots_per_scan[i]))
|
||||
|
||||
    def create_tasks(self, parent_task):
        """
        generate a calculation task for each energy region of the given parent task.

        all child tasks share the model parameters.

        @return list of CalculationTask objects, with one element per region.
            the region index enumerates the regions.
        """
        super(EnergyRegionHandler, self).create_tasks(parent_task)

        parent_id = parent_task.id
        self._parent_tasks[parent_id] = parent_task
        self._pending_ids_per_parent[parent_id] = set()
        self._complete_ids_per_parent[parent_id] = set()

        energies = self._project.scans[parent_id.scan].energies
        n_regions = self._slots_per_scan[parent_id.scan]
        regions = np.array_split(energies, n_regions)

        out_tasks = []
        for ireg, reg in enumerate(regions):
            new_task = parent_task.copy()
            new_task.parent_id = parent_id
            new_task.change_id(region=ireg)
            # restrict the energy range only if the scan is actually split
            if n_regions > 1:
                new_task.region['e'] = reg

            child_id = new_task.id
            self._pending_tasks[child_id] = new_task
            self._pending_ids_per_parent[parent_id].add(child_id)

            out_tasks.append(new_task)

        if not out_tasks:
            logger.error("no region tasks generated. this is probably a bug.")

        return out_tasks
|
||||
|
||||
|
||||
def choose_region_handler_class(project):
    """
    choose a suitable region handler for the project.

    the function returns the EnergyRegionHandler class
    if the project includes an energy scan with at least 10 steps.
    Otherwise, it returns the SingleRegionHandler.

    angle scans do not benefit from region splitting in EDAC.

    @param project: Project instance.
    @return: SingleRegionHandler or EnergyRegionHandler class.
    """
    energy_scans = 0
    for scan in project.scans:
        if scan.energies.shape[0] >= 10:
            energy_scans += 1

    if energy_scans >= 1:
        return EnergyRegionHandler
    else:
        return SingleRegionHandler
|
||||
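The slot distribution in EnergyRegionHandler.setup and the chunking in create_tasks can be tried out in isolation. The following is a minimal sketch, not part of the repository: `slots_per_scan` repeats the arithmetic of `setup`, and `split_sizes` is a hypothetical pure-Python helper that reproduces the chunk sizes `numpy.array_split` yields for a one-dimensional array. The scan lengths and the slot count are made-up values.

```python
def slots_per_scan(scan_lengths, slots):
    """Distribute processing slots proportionally to scan lengths (at least one per scan)."""
    total_length = sum(scan_lengths)
    f = min(1.0, float(slots) / total_length)
    return [max(1, int(round(length * f))) for length in scan_lengths]


def split_sizes(length, n_regions):
    """Chunk sizes that numpy.array_split produces for a 1-D array of the given length."""
    base, extra = divmod(length, n_regions)
    # the first `extra` chunks get one additional element
    return [base + 1] * extra + [base] * (n_regions - extra)


# hypothetical project: one angle scan (a single energy) and two energy scans
lengths = [1, 60, 20]
slots = slots_per_scan(lengths, 8)
print(slots)
print([split_sizes(n, s) for n, s in zip(lengths, slots)])
```

Because every scan receives at least one slot and the shares are rounded individually, the sum of the assigned slots can exceed the number of available slots, as it does in this example (9 chunks for 8 slots).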
@@ -0,0 +1,8 @@
|
||||
class BraceMessage:
    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        return self.fmt.format(*self.args, **self.kwargs)
|
||||
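BraceMessage defers the `str.format` call until the message is converted to a string, which the logging module does only for records that pass the level filter. The sketch below illustrates this under a few assumptions: the class is repeated so that the example is self-contained, `ListHandler` and the logger name are hypothetical, and the message text is borrowed from the region handler (where the class is presumably imported as `BMsg`).

```python
import logging


class BraceMessage(object):
    """Copy of the class shown above, repeated to keep the example self-contained."""

    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        return self.fmt.format(*self.args, **self.kwargs)


class ListHandler(logging.Handler):
    """Hypothetical handler that collects the formatted messages in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        # getMessage() calls str() on the message object, i.e. BraceMessage.__str__
        self.messages.append(record.getMessage())


logger = logging.getLogger("bracemessage_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

msg = "region handler: split scan {file} into {slots} chunks"
logger.debug(BraceMessage(msg, file="scan.etpi", slots=4))  # filtered, never formatted
logger.info(BraceMessage(msg, file="scan.etpi", slots=4))   # formatted on emit
print(handler.messages)
```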
@@ -0,0 +1,2 @@
|
||||
loess.py
|
||||
loess_wrap.c
|
||||
@@ -0,0 +1,115 @@
|
||||
Software for Locally-Weighted Regression 18 August 1992
|
||||
|
||||
William S. Cleveland
|
||||
Eric Grosse
|
||||
Ming-Jen Shyu
|
||||
|
||||
Locally-weighted regression, or loess, is a procedure for estimating a
|
||||
regression surface by a multivariate smoothing procedure: fitting a
|
||||
linear or quadratic function of the independent variables in a moving
|
||||
fashion that is analogous to how a moving average is computed for a
|
||||
time series. Compared to classical approaches - fitting global
|
||||
parametric functions - loess substantially increases the domain of
|
||||
surfaces that can be estimated without distortion. Also, a pleasant
|
||||
fact about loess is that analogues of the statistical procedures used
|
||||
in parametric function fitting - for example, ANOVA and t intervals -
|
||||
involve statistics whose distributions are well approximated by
|
||||
familiar distributions.
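
The idea of a local fit "in a moving fashion" can be illustrated in a few lines. The following Python sketch is not the code of this distribution. It is a one-dimensional, local-linear illustration under stated assumptions: the function names, the default span and the slightly widened bandwidth are choices made for this example, and the tricube function is used as the weight function.

```python
def tricube(u):
    """Tricube weight for a scaled distance u >= 0."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_point(xs, ys, x0, span=0.75):
    """Estimate y at x0 from a weighted straight-line fit to the nearest points."""
    n = len(xs)
    q = max(2, int(round(span * n)))           # number of neighbours in the window
    dists = sorted(abs(x - x0) for x in xs)
    # bandwidth: slightly more than the q-th smallest distance,
    # so that the q-th neighbour keeps a small positive weight
    h = dists[q - 1] * 1.0001 or 1.0
    w = [tricube(abs(x - x0) / h) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my + slope * (x0 - mx)


xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]               # made-up, exactly linear data
print(loess_point(xs, ys, 2.5, span=0.5))
```

For exactly linear data the weighted line fit reproduces the line, so the value printed for x0 = 2.5 agrees with 2 * 2.5 + 1 = 6 up to rounding error.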
|
||||
|
||||
The following files are included in this distribution.
    README      the instruction file you are reading now
    S.h         header file
    air.c       C source for air data example
    changes     history of changes to loess
    depend.ps   PostScript figure of how routines are related
    ethanol.c   C source for ethanol data example
    galaxy.c    C source for galaxy data example
    gas.c       C source for gas data example
    loess.c     C source (high-level loess routines)
    loess.h     header file for loess_struct and predict_struct
    loess.m     manual page for user-callable loess routines
    loessc.c    C source (low-level loess routines)
    loessf.f    FORTRAN source (low-level loess & predict routines)
    loessf.m    documentation for FORTRAN source
    madeup.c    C source for madeup data example
    makefile    makefile to compile the example codes
    misc.c      C source (anova, pointwise, and other support routines)
    predict.c   C source (high-level predict routines)
    predict.m   manual page for user-callable predict routines
    struct.m    manual page for loess_struct, pred_struct
    supp.f      supplemental Fortran loess drivers
|
||||
|
||||
After unpacking these files, just type "make" and if all goes well
|
||||
you should see output like:
|
||||
|
||||
loess(&gas):
|
||||
Number of Observations: 22
|
||||
Equivalent Number of Parameters: 5.5
|
||||
Residual Standard Error: 0.3404
|
||||
|
||||
loess(&gas_null):
|
||||
Number of Observations: 22
|
||||
Equivalent Number of Parameters: 3.5
|
||||
Residual Standard Error: 0.5197
|
||||
|
||||
predict(gas_fit_E, m, &gas, &gas_pred):
|
||||
1.19641 5.06875 0.523682
|
||||
|
||||
pointwise(&gas_pred, m, coverage, &gas_ci):
|
||||
1.98562 4.10981 5.48023 5.56651 3.52761 1.71062 1.47205
|
||||
1.19641 3.6795 5.05571 5.13526 3.14366 1.19693 0.523682
|
||||
0.407208 3.24919 4.63119 4.70401 2.7597 0.683247 -0.424684
|
||||
|
||||
anova(&gas_null, &gas, &gas_anova):
|
||||
2.5531 15.663 10.1397 0.000860102
|
||||
|
||||
To run other examples, simply type "make galaxy", or "make ethanol", etc.
|
||||
|
||||
If your loader complains about "-llinpack -lblas" in the makefile, change
|
||||
it to whatever your system prefers for accessing Linpack and the Blas.
|
||||
If necessary, these Fortran subroutines can be obtained by
|
||||
mail netlib@netlib.bell-labs.com
|
||||
send dnrm2 dsvdc dqrdc ddot dqrsl idamax from linpack core.
|
||||
|
||||
A 50 page user guide, in PostScript form, is available by anonymous ftp.
|
||||
ftp netlib.bell-labs.com
|
||||
login: anonymous
|
||||
password: <your email address>
|
||||
binary
|
||||
cd /netlib/a
|
||||
get cloess.ps.Z
|
||||
quit
|
||||
uncompress cloess.ps
|
||||
This guide describes crucial steps in the proper analysis of data using
|
||||
loess. Please read it.
|
||||
|
||||
Bug reports are appreciated. Send electronic mail to
|
||||
ehg@netlib.bell-labs.com
|
||||
including the words "this is not spam" in the Subject line
|
||||
or send paper mail to
|
||||
Eric Grosse
|
||||
Bell Labs 2T-502
|
||||
Murray Hill NJ 07974
|
||||
for problems with the Fortran inner core of the algorithm.
|
||||
The C drivers were written by Ming-Jen Shyu, who left Bell Labs. Eric will
|
||||
fix problems with them when he can.
|
||||
|
||||
Remember that this is experimental software distributed free of charge
|
||||
and comes with no warranty! Exercise professional caution.
|
||||
|
||||
Happy Smoothing!
|
||||
|
||||
/*
|
||||
* The authors of this software are Cleveland, Grosse, and Shyu.
|
||||
* Copyright (c) 1989, 1992 by AT&T.
|
||||
* Permission to use, copy, modify, and distribute this software for any
|
||||
* purpose without fee is hereby granted, provided that this entire notice
|
||||
* is included in all copies of any software which is or includes a copy
|
||||
* or modification of this software and in all copies of the supporting
|
||||
* documentation for such software.
|
||||
* THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT ANY EXPRESS OR IMPLIED
|
||||
* WARRANTY. IN PARTICULAR, NEITHER THE AUTHORS NOR AT&T MAKE ANY
|
||||
* REPRESENTATION OR WARRANTY OF ANY KIND CONCERNING THE MERCHANTABILITY
|
||||
* OF THIS SOFTWARE OR ITS FITNESS FOR ANY PARTICULAR PURPOSE.
|
||||
*/
|
||||
|
||||
@@ -0,0 +1,31 @@
|
||||
#include <stdlib.h>
|
||||
#include <stdio.h>
|
||||
#include <math.h>
|
||||
#include <string.h>
|
||||
|
||||
#define Calloc(n,t) (t *)calloc((unsigned)(n),sizeof(t))
|
||||
#define Free(p) free((char *)(p))
|
||||
|
||||
/* the mapping from f77 to C intermediate code -- may be machine dependent
|
||||
* the first definition satisfies lint's narrowminded preprocessing & should
|
||||
* stay the same for all implementations. The __STDC__ definition is for
|
||||
* ANSI standard conforming C compilers. The #else definition should
|
||||
* generate the version of the fortran subroutine & common block names x
|
||||
* handed to the local loader; e.g., "x_" in system V, Berkeley & 9th edition
|
||||
*/
|
||||
|
||||
#ifdef lint
|
||||
#define F77_SUB(x) x
|
||||
#define F77_COM(x) x
|
||||
#else
|
||||
#ifdef __STDC__
|
||||
#define F77_SUB(x) x##_
|
||||
#define F77_COM(x) x##_
|
||||
#else
|
||||
#define F77_SUB(x) x/**/_
|
||||
#define F77_COM(x) x/**/_
|
||||
#endif
|
||||
#endif
|
||||
|
||||
#define NULL_ENTRY ((int *)NULL)
|
||||
|
||||
@@ -0,0 +1 @@
|
||||
__author__ = 'matthias muntwiler'
|
||||
@@ -0,0 +1,78 @@
|
||||
#include <stdio.h>
|
||||
#include "loess.h"
|
||||
|
||||
struct loess_struct air;
|
||||
double ozone[] = {3.44821724038273, 3.30192724889463, 2.28942848510666,
|
||||
2.6207413942089, 2.84386697985157, 2.66840164872194, 2,
|
||||
2.51984209978975, 2.22398009056931, 2.41014226417523,
|
||||
2.6207413942089, 2.41014226417523, 3.23961180127748,
|
||||
1.81712059283214, 3.10723250595386, 2.22398009056931, 1,
|
||||
2.22398009056931, 1.5874010519682, 3.1748021039364,
|
||||
2.84386697985157, 3.55689330449006, 4.86294413109428,
|
||||
3.33222185164595, 3.07231682568585, 4.14081774942285,
|
||||
3.39121144301417, 2.84386697985157, 2.75892417638112,
|
||||
3.33222185164595, 2.71441761659491, 2.28942848510666,
|
||||
2.35133468772076, 5.12992784003009, 3.65930571002297,
|
||||
3.1748021039364, 4, 3.41995189335339, 4.25432086511501,
|
||||
4.59470089220704, 4.59470089220704, 4.39682967215818,
|
||||
2.15443469003188, 3, 1.91293118277239, 3.63424118566428,
|
||||
3.27106631018859, 3.93649718310217, 4.29084042702621,
|
||||
3.97905720789639, 2.51984209978975, 4.30886938006377,
|
||||
4.7622031559046, 2.71441761659491, 3.73251115681725,
|
||||
4.34448148576861, 3.68403149864039, 4, 3.89299641587326,
|
||||
3.39121144301417, 2.0800838230519, 2.51984209978975,
|
||||
4.9596756638423, 4.46474509558454, 4.79141985706278,
|
||||
3.53034833532606, 3.03658897187566, 4.02072575858906,
|
||||
2.80203933065539, 3.89299641587326, 2.84386697985157,
|
||||
3.14138065239139, 3.53034833532606, 2.75892417638112,
|
||||
2.0800838230519, 3.55689330449006, 5.51784835276224,
|
||||
4.17933919638123, 4.23582358425489, 4.90486813152402,
|
||||
4.37951913988789, 4.39682967215818, 4.57885697021333,
|
||||
4.27265868169792, 4.17933919638123, 4.49794144527541,
|
||||
3.60882608013869, 3.1748021039364, 2.71441761659491,
|
||||
2.84386697985157, 2.75892417638112, 2.88449914061482,
|
||||
3.53034833532606, 2.75892417638112, 3.03658897187566,
|
||||
2.0800838230519, 2.35133468772076, 3.58304787101595,
|
||||
2.6207413942089, 2.35133468772076, 2.88449914061482,
|
||||
2.51984209978975, 2.35133468772076, 2.84386697985157,
|
||||
3.30192724889463, 1.91293118277239, 2.41014226417523,
|
||||
3.10723250595386, 2.41014226417523, 2.6207413942089,
|
||||
2.71441761659491};
|
||||
double rad_temp_wind[] = {190, 118, 149, 313, 299, 99, 19, 256, 290, 274, 65,
|
||||
334, 307, 78, 322, 44, 8, 320, 25, 92, 13, 252, 223, 279, 127,
|
||||
291, 323, 148, 191, 284, 37, 120, 137, 269, 248, 236, 175,
|
||||
314, 276, 267, 272, 175, 264, 175, 48, 260, 274, 285, 187,
|
||||
220, 7, 294, 223, 81, 82, 213, 275, 253, 254, 83, 24, 77, 255,
|
||||
229, 207, 192, 273, 157, 71, 51, 115, 244, 190, 259, 36, 212,
|
||||
238, 215, 203, 225, 237, 188, 167, 197, 183, 189, 95, 92, 252,
|
||||
220, 230, 259, 236, 259, 238, 24, 112, 237, 224, 27, 238, 201,
|
||||
238, 14, 139, 49, 20, 193, 191, 131, 223,
|
||||
67, 72, 74, 62, 65, 59, 61, 69, 66, 68, 58, 64, 66, 57, 68,
|
||||
62, 59, 73, 61, 61, 67, 81, 79, 76, 82, 90, 87, 82, 77, 72,
|
||||
65, 73, 76, 84, 85, 81, 83, 83, 88, 92, 92, 89, 73, 81, 80,
|
||||
81, 82, 84, 87, 85, 74, 86, 85, 82, 86, 88, 86, 83, 81, 81,
|
||||
81, 82, 89, 90, 90, 86, 82, 80, 77, 79, 76, 78, 78, 77, 72,
|
||||
79, 81, 86, 97, 94, 96, 94, 91, 92, 93, 93, 87, 84, 80, 78,
|
||||
75, 73, 81, 76, 77, 71, 71, 78, 67, 76, 68, 82, 64, 71, 81,
|
||||
69, 63, 70, 75, 76, 68,
|
||||
7.4, 8, 12.6, 11.5, 8.6, 13.8, 20.1, 9.7, 9.2, 10.9, 13.2,
|
||||
11.5, 12, 18.4, 11.5, 9.7, 9.7, 16.6, 9.7, 12, 12, 14.9, 5.7,
|
||||
7.4, 9.7, 13.8, 11.5, 8, 14.9, 20.7, 9.2, 11.5, 10.3, 4, 9.2,
|
||||
9.2, 4.6, 10.9, 5.1, 6.3, 5.7, 7.4, 14.3, 14.9, 14.3, 6.9,
|
||||
10.3, 6.3, 5.1, 11.5, 6.9, 8.6, 8, 8.6, 12, 7.4, 7.4, 7.4,
|
||||
9.2, 6.9, 13.8, 7.4, 4, 10.3, 8, 11.5, 11.5, 9.7, 10.3, 6.3,
|
||||
7.4, 10.9, 10.3, 15.5, 14.3, 9.7, 3.4, 8, 9.7, 2.3, 6.3, 6.3,
|
||||
6.9, 5.1, 2.8, 4.6, 7.4, 15.5, 10.9, 10.3, 10.9, 9.7, 14.9,
|
||||
15.5, 6.3, 10.9, 11.5, 6.9, 13.8, 10.3, 10.3, 8, 12.6, 9.2,
|
||||
10.3, 10.3, 16.6, 6.9, 14.3, 8, 11.5};
|
||||
long n = 111, p = 3;
|
||||
|
||||
int main(void) {
    printf("\nloess(&air):\n");
    /* fit ozone against radiation, temperature and wind with span 0.8 */
    loess_setup(rad_temp_wind, ozone, n, p, &air);
    air.model.span = 0.8;
    loess(&air);
    loess_summary(&air);

    loess_free_mem(&air);
    return 0;
}
|
||||
@@ -0,0 +1,168 @@
|
||||
CHANGES PLANNED SOMEDAY
|
||||
1) more vertices in k-d tree for dimension > 2, to get continuity.
|
||||
2) triangulation based method.
|
||||
----------------------
|
||||
|
||||
19 Nov 1987 workspace not big enough for degree=2
|
||||
|
||||
22 Jan 1988 switched from depth first to breadth first tree build
|
||||
|
||||
14 Mar 1988 lostt.3 extra space needed if (method mod 1000 = 0),
|
||||
not the documented (method/1000=0)
|
||||
|
||||
28 Apr 1988 l2tr.g vval2 needed to be initialized to 0
|
||||
|
||||
galaxy smooth needs double precision on vax
|
||||
|
||||
26 May 1988 bbox.g add 10% margin to allow limited extrapolation
|
||||
|
||||
6 June 1988 loess/lostt.f trL wasn't set if method/1000==0
|
||||
|
||||
10 June 1988 losave, loread
|
||||
|
||||
v(RCOND) 1 / max condition number
|
||||
|
||||
12 June 1988 lofort
|
||||
|
||||
21 June 1988 additional workspace for explicit L
|
||||
|
||||
27 June 1988 workspace checking in lowesf was slightly pessimistic
|
||||
|
||||
30 June 1988 Changed default fdiam to 0.
|
||||
Added warning messages for memory limits and pseudoinverse.
|
||||
|
||||
4 Aug 1988 bbox.g changed margin from 10% to 0.5%.
|
||||
|
||||
24 Aug 1988 loser documentation should have specified workspace
|
||||
of size ...+m*n, not ...+m**2.
|
||||
|
||||
Sep 1988
|
||||
loess-based approximations of delta1,2.
|
||||
pseudo-values, so statistics are available with robustness iterations.
|
||||
reorganize error messages to better fit into S.
|
||||
sample driver program.
|
||||
somewhat shorter code generated by ehg170.
|
||||
|
||||
20 Dec 1988
|
||||
workspace in loser
|
||||
|
||||
27 Jan 1989
|
||||
workspace checking in lostt was a bit pessimistic.
|
||||
|
||||
3 Feb 1989
|
||||
l2fit, l2tr: error message should contain sqrt(rho)
|
||||
|
||||
18 Dec 1989
|
||||
ehg141, ehg179-ehg181: new delta approximations
|
||||
|
||||
24 Jan 1990
|
||||
master copy moved from Sun3/180 to SGI 4D/240S
|
||||
(no intentional changes)
|
||||
|
||||
25 Jan 1990
|
||||
(many routines touched; ehg127 added) cleaned up computational
|
||||
kernel, added provision for only first dd<=d variables to enter
|
||||
the distance calculation ("conditionally parametric variables"),
|
||||
added independent bounds on total and componentwise degree for
|
||||
local polynomial model, made extrapolation warning message print
|
||||
a bit more detail.
|
||||
|
||||
14 Mar 1990
|
||||
added setLf argument to lowesd; added lowesr, lowesl for resmoothing.
|
||||
|
||||
-------------------------------------------------------
|
||||
Converting to the new version of loess
|
||||
5 April 1990
|
||||
|
||||
Over the past few months, a number of changes have been made to the
|
||||
loess package, to provide more control over the local model, to allow
|
||||
conditionally parametric variables, and to return exact statistical
|
||||
quantities for the blending method. Unlike earlier internal
|
||||
algorithmic improvements, this round of changes added some extra
|
||||
arguments in the Fortran calling sequences. The purpose of this note
|
||||
is to assist in converting programs that called the old version.
|
||||
|
||||
An explicit argument setLf has been added to lowesd(), since it affects
|
||||
the partitioning of the workspace. To help protect against inadvertent
|
||||
version mismatches, the version number that lowesd() checks has also
|
||||
been changed. The componentwise degree and the specification of
|
||||
conditionally nonparametric variables can be changed from the default
|
||||
by modifying iv(CDEG) and iv(NDIST).
|
||||
|
||||
The influence matrix L for blending is now explicitly available by
|
||||
calling a new subroutine lowesl(), but this loses the speed
|
||||
advantage of blending. A faster, sometimes equivalent method is
|
||||
to use the influence matrix that carries data values to coefficients
|
||||
at the vertices of the k-d tree. This information is saved in iv(iv(Lq))
|
||||
and v(iv(Lf)), for the aficionado.
|
||||
|
||||
The new subroutine lowesr() takes advantage of Lq and Lf to allow rapid
|
||||
resmoothing for applications when only y, not x, is subject to change.
|
||||
-------------------------------------------------------
|
||||
|
||||
7 May 1990
|
||||
new delta approximations.
|
||||
added prior weights to input format for sample driver.
|
||||
|
||||
29 May 1990
|
||||
loess,lostt,loser,pseudo moved from Fortran to S.
|
||||
|
||||
11 Jul 1990
|
||||
column equilibration, so pseudoinverse is needed less often.
|
||||
|
||||
27 May 1991
|
||||
lowesd version 105; increased nvmax,ncmax to max(200,n).
|
||||
l2fit added ihat=1 (diagL only).
|
||||
ehg133,lowese removed unused arguments dist,eta.
|
||||
ehg190,ehg141 changed name to lowesa, slight change to calling sequence.
|
||||
ehg144 changed name to lowesc
|
||||
m9rwt changed name to lowesw
|
||||
pseudo changed name to lowesp
|
||||
|
||||
22 Jul 1991 IMPORTANT BUG FIX!
|
||||
ehg131 vval2 should be dimensioned 0:d, not 0:8
|
||||
|
||||
26 Jul 1991
|
||||
lowesd change calling sequence to provide tighter memory allocation
|
||||
diff old/man/internal new/man/internal
|
||||
< lowesd(105,iv,liv,lv,v,d,n,f,tdeg,setLf) setup workspace
|
||||
> lowesd(106,iv,liv,lv,v,d,n,f,tdeg,nvmax,setLf) setup workspace
|
||||
< liv 50+(2^d+6)*max(200,n)
|
||||
< if setLf, add nf*max(200,n)
|
||||
< lv 50+(3*d+4)*max(200,n)+(tau+2)*nf
|
||||
< if setLf, add (d+1)*nf*max(200,n)
|
||||
> liv 50+(2^d+6)*nvmax
|
||||
> if setLf, add nf*nvmax
|
||||
> lv 50+(3*d+4)*nvmax+(tau+2)*nf
|
||||
> if setLf, add (d+1)*nf*nvmax
|
||||
> nvmax limit on number of vertices for kd-tree; e.g. max(200,n)
|
||||
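The workspace sizes of the version 106 calling sequence can be evaluated with a few lines of Python. This is a sketch of the arithmetic quoted in the entry above, not code from the distribution. The function name is hypothetical, and nf and tau are taken as plain inputs because this note does not define them. The example values are made up, except n = 111, which is the size of the air data set.

```python
def lowesd_workspace(d, nvmax, nf, tau, set_lf=False):
    """Return (liv, lv) according to the version 106 formulas quoted above."""
    liv = 50 + (2 ** d + 6) * nvmax
    lv = 50 + (3 * d + 4) * nvmax + (tau + 2) * nf
    if set_lf:
        # additional space for the influence matrix information
        liv += nf * nvmax
        lv += (d + 1) * nf * nvmax
    return liv, lv


n = 111
nvmax = max(200, n)   # vertex limit suggested in the entry above
print(lowesd_workspace(d=2, nvmax=nvmax, nf=50, tau=6))
print(lowesd_workspace(d=2, nvmax=nvmax, nf=50, tau=6, set_lf=True))
```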
|
||||
20 Sep 1991
|
||||
sample.f brought in sync with recent loess changes.
|
||||
|
||||
24 Dec 1991
|
||||
l2fit.f fixed comment in single precision version
|
||||
|
||||
10 Jan 1992
|
||||
ehg197.f new formula for approximating trL, valid for small f
|
||||
|
||||
15 May 1992
|
||||
netlib/a/dloess now includes C drivers (written by Ming-Jen Shyu,
|
||||
adapted from code used inside the S system)
|
||||
|
||||
22 Jun 1992
|
||||
ehg191.f Loop 11 ran too far, picking up one more value than necessary.
|
||||
The value was not used, so the loess computation itself is unaffected,
|
||||
but on some systems the old code could conceivably cause a reference
|
||||
to an invalid memory address and abort with a segmentation fault
|
||||
message.
|
||||
|
||||
23 Jun 1992
|
||||
S.h #include <math.h>, since loessc.c calls floor() and pow().
|
||||
|
||||
18 Aug 1992
|
||||
netlib/a/dloess A new release with bug fixes in all the C drivers, new
|
||||
example codes, and detailed documentation.
|
||||
|
||||
25 Mar 1996
|
||||
predict.c fix enormous memory leak. update email address
|
||||
+33320
File diff suppressed because it is too large
@@ -0,0 +1,117 @@
|
||||
%!
|
||||
/Courier-Bold findfont 10 scalefont setfont
|
||||
%draw a box
|
||||
%x y width height box
|
||||
/box { newpath
|
||||
/height exch def
|
||||
/width exch def
|
||||
/y exch def
|
||||
/x exch def
|
||||
x width 2 div sub
|
||||
y height 2 div sub moveto
|
||||
width 0 rlineto
|
||||
0 height rlineto
|
||||
width neg 0 rlineto
|
||||
closepath } def
|
||||
|
||||
%draw a circle
|
||||
%x y radius circle
|
||||
/circle { newpath 0 360 arc } def
|
||||
|
||||
%draw an ellipse
|
||||
%x y width height ellipse
|
||||
/ellipse { gsave
|
||||
/height exch def
|
||||
/width exch def
|
||||
1 height width div scale
|
||||
width height div mul
|
||||
width 2 div
|
||||
circle stroke
|
||||
grestore } def
|
||||
|
||||
%draw a centered label
|
||||
%x y str
|
||||
/label {
|
||||
/str exch def
|
||||
/y exch def
|
||||
/x exch def
|
||||
str stringwidth
|
||||
pop /width exch def
|
||||
x width 2 div sub
|
||||
y 10 3 div sub moveto str show
|
||||
} def
|
||||
|
||||
%draw a line
|
||||
%x1 y1 x2 y2 drawline
|
||||
/drawline { 4 -2 roll moveto lineto stroke } def
|
||||
|
||||
277 684 42 14 box stroke
|
||||
277 684 (lowesd) label
|
||||
349 630 42 14 box stroke
|
||||
349 630 (lowesf) label
|
||||
205 630 42 14 box stroke
|
||||
205 630 (lowesb) label
|
||||
155 565 42 14 box stroke
|
||||
155 565 (lowesr) label
|
||||
146 427 42 14 box stroke
|
||||
146 427 (lowese) label
|
||||
277 576 42 14 box stroke
|
||||
277 576 (lowesl) label
|
||||
203 464 42 14 box stroke
|
||||
203 464 (lofort) label
|
||||
81 576 42 14 box stroke
|
||||
81 576 (losave) label
|
||||
81 522 42 14 box stroke
|
||||
81 522 (lohead) label
|
||||
81 468 42 14 box stroke
|
||||
81 468 (loread) label
|
||||
405 540 42 14 box stroke
|
||||
405 540 (lowesa) label
|
||||
342 539 42 14 box stroke
|
||||
342 539 (lowesc) label
|
||||
92 461 134 434 drawline
|
||||
124.266363 435.502104 134.000000 434.000000 drawline
|
||||
134.000000 434.000000 128.592424 442.231532 drawline
|
||||
81 515 81 475 drawline
|
||||
77.000000 484.000000 81.000000 475.000000 drawline
|
||||
81.000000 475.000000 85.000000 484.000000 drawline
|
||||
81 569 81 529 drawline
|
||||
77.000000 538.000000 81.000000 529.000000 drawline
|
||||
81.000000 529.000000 85.000000 538.000000 drawline
|
||||
289 569 329 546 drawline
|
||||
319.203959 547.018615 329.000000 546.000000 drawline
|
||||
329.000000 546.000000 323.191728 553.953865 drawline
|
||||
154 558 146 434 drawline
|
||||
142.587739 443.238857 146.000000 434.000000 drawline
|
||||
146.000000 434.000000 150.571142 442.723799 drawline
|
||||
188 623 97 583 drawline
|
||||
103.629564 590.283466 97.000000 583.000000 drawline
|
||||
97.000000 583.000000 106.848776 582.959760 drawline
|
||||
204 623 203 471 drawline
|
||||
199.059296 480.026120 203.000000 471.000000 drawline
|
||||
203.000000 471.000000 207.059123 479.973490 drawline
|
||||
214 623 267 583 drawline
|
||||
257.406670 585.228906 267.000000 583.000000 drawline
|
||||
267.000000 583.000000 262.225925 591.614419 drawline
|
||||
199 623 160 572 drawline
|
||||
162.289620 581.579021 160.000000 572.000000 drawline
|
||||
160.000000 572.000000 168.644482 576.719420 drawline
|
||||
220 623 389 547 drawline
|
||||
379.151237 547.043173 389.000000 547.000000 drawline
|
||||
389.000000 547.000000 382.432359 554.339352 drawline
|
||||
202 623 148 434 drawline
|
||||
146.626394 443.752600 148.000000 434.000000 drawline
|
||||
148.000000 434.000000 154.318586 441.554831 drawline
|
||||
348 623 342 546 drawline
|
||||
338.711268 555.283547 342.000000 546.000000 drawline
|
||||
342.000000 546.000000 346.687091 554.662054 drawline
|
||||
353 623 400 547 drawline
|
||||
391.864262 552.550655 400.000000 547.000000 drawline
|
||||
400.000000 547.000000 398.668290 556.758409 drawline
|
||||
267 677 214 637 drawline
|
||||
218.774075 645.614419 214.000000 637.000000 drawline
|
||||
214.000000 637.000000 223.593330 639.228906 drawline
|
||||
286 677 339 637 drawline
|
||||
329.406670 639.228906 339.000000 637.000000 drawline
|
||||
339.000000 637.000000 334.225925 645.614419 drawline
|
||||
showpage
|
||||
@@ -0,0 +1,274 @@
|
||||
subroutine dqrsl(x,ldx,n,k,qraux,y,qy,qty,b,rsd,xb,job,info)
|
||||
integer ldx,n,k,job,info
|
||||
double precision x(ldx,1),qraux(1),y(1),qy(1),qty(1),b(1),rsd(1),
|
||||
* xb(1)
|
||||
c
|
||||
c dqrsl applies the output of dqrdc to compute coordinate
|
||||
c transformations, projections, and least squares solutions.
|
||||
c for k .le. min(n,p), let xk be the matrix
|
||||
c
|
||||
c xk = (x(jpvt(1)),x(jpvt(2)), ... ,x(jpvt(k)))
|
||||
c
|
||||
c formed from columns jpvt(1), ... ,jpvt(k) of the original
|
||||
c n x p matrix x that was input to dqrdc (if no pivoting was
|
||||
c done, xk consists of the first k columns of x in their
|
||||
c original order). dqrdc produces a factored orthogonal matrix q
|
||||
c and an upper triangular matrix r such that
|
||||
c
|
||||
c           xk = q * (r)
c                    (0)
|
||||
c
|
||||
c this information is contained in coded form in the arrays
|
||||
c x and qraux.
|
||||
c
|
||||
c on entry
|
||||
c
|
||||
c x double precision(ldx,p).
|
||||
c x contains the output of dqrdc.
|
||||
c
|
||||
c ldx integer.
|
||||
c ldx is the leading dimension of the array x.
|
||||
c
|
||||
c n integer.
|
||||
c n is the number of rows of the matrix xk. it must
|
||||
c have the same value as n in dqrdc.
|
||||
c
|
||||
c k integer.
|
||||
c k is the number of columns of the matrix xk. k
|
||||
c must not be greater than min(n,p), where p is the
|
||||
c same as in the calling sequence to dqrdc.
|
||||
c
|
||||
c qraux double precision(p).
|
||||
c qraux contains the auxiliary output from dqrdc.
|
||||
c
|
||||
c y double precision(n)
|
||||
c y contains an n-vector that is to be manipulated
|
||||
c by dqrsl.
|
||||
c
|
||||
c job integer.
|
||||
c job specifies what is to be computed. job has
|
||||
c the decimal expansion abcde, with the following
|
||||
c meaning.
|
||||
c
|
||||
c if a.ne.0, compute qy.
|
||||
c if b,c,d, or e .ne. 0, compute qty.
|
||||
c if c.ne.0, compute b.
|
||||
c if d.ne.0, compute rsd.
|
||||
c if e.ne.0, compute xb.
|
||||
c
|
||||
c note that a request to compute b, rsd, or xb
|
||||
c automatically triggers the computation of qty, for
|
||||
c which an array must be provided in the calling
|
||||
c sequence.
|
||||
c
|
||||
c on return
|
||||
c
|
||||
c qy double precision(n).
|
||||
c qy contains q*y, if its computation has been
|
||||
c requested.
|
||||
c
|
||||
c qty double precision(n).
|
||||
c qty contains trans(q)*y, if its computation has
|
||||
c been requested. here trans(q) is the
|
||||
c transpose of the matrix q.
|
||||
c
|
||||
c b double precision(k)
|
||||
c b contains the solution of the least squares problem
|
||||
c
|
||||
c minimize norm2(y - xk*b),
|
||||
c
|
||||
c if its computation has been requested. (note that
|
||||
c if pivoting was requested in dqrdc, the j-th
|
||||
c component of b will be associated with column jpvt(j)
|
||||
c of the original matrix x that was input into dqrdc.)
|
||||
c
|
||||
c rsd double precision(n).
|
||||
c rsd contains the least squares residual y - xk*b,
|
||||
c if its computation has been requested. rsd is
|
||||
c also the orthogonal projection of y onto the
|
||||
c orthogonal complement of the column space of xk.
|
||||
c
|
||||
c xb double precision(n).
|
||||
c xb contains the least squares approximation xk*b,
|
||||
c if its computation has been requested. xb is also
|
||||
c the orthogonal projection of y onto the column space
|
||||
c of x.
|
||||
c
|
||||
c info integer.
|
||||
c info is zero unless the computation of b has
|
||||
c been requested and r is exactly singular. in
|
||||
c this case, info is the index of the first zero
|
||||
c diagonal element of r and b is left unaltered.
|
||||
c
|
||||
c the parameters qy, qty, b, rsd, and xb are not referenced
|
||||
c if their computation is not requested and in this case
|
||||
c can be replaced by dummy variables in the calling program.
|
||||
c to save storage, the user may in some cases use the same
|
||||
c array for different parameters in the calling sequence. a
|
||||
c frequently occurring example is when one wishes to compute
|
||||
c any of b, rsd, or xb and does not need y or qty. in this
|
||||
c case one may identify y, qty, and one of b, rsd, or xb, while
|
||||
c providing separate arrays for anything else that is to be
|
||||
c computed. thus the calling sequence
|
||||
c
|
||||
c call dqrsl(x,ldx,n,k,qraux,y,dum,y,b,y,dum,110,info)
|
||||
c
|
||||
c will result in the computation of b and rsd, with rsd
|
||||
c overwriting y. more generally, each item in the following
|
||||
c list contains groups of permissible identifications for
|
||||
c a single calling sequence.
|
||||
c
|
||||
c 1. (y,qty,b) (rsd) (xb) (qy)
|
||||
c
|
||||
c 2. (y,qty,rsd) (b) (xb) (qy)
|
||||
c
|
||||
c 3. (y,qty,xb) (b) (rsd) (qy)
|
||||
c
|
||||
c 4. (y,qy) (qty,b) (rsd) (xb)
|
||||
c
|
||||
c 5. (y,qy) (qty,rsd) (b) (xb)
|
||||
c
|
||||
c 6. (y,qy) (qty,xb) (b) (rsd)
|
||||
c
|
||||
c in any group the value returned in the array allocated to
|
||||
c the group corresponds to the last member of the group.
|
||||
c
|
||||
c linpack. this version dated 08/14/78 .
|
||||
c g.w. stewart, university of maryland, argonne national lab.
|
||||
c
|
||||
c dqrsl uses the following functions and subprograms.
|
||||
c
|
||||
c blas daxpy,dcopy,ddot
|
||||
c fortran dabs,min0,mod
|
||||
c
|
||||
c internal variables
|
||||
c
|
||||
integer i,j,jj,ju,kp1
|
||||
double precision ddot,t,temp
|
||||
logical cb,cqy,cqty,cr,cxb
|
||||
c
|
||||
c
|
||||
c set info flag.
|
||||
c
|
||||
info = 0
|
||||
c
|
||||
c determine what is to be computed.
|
||||
c
|
||||
cqy = job/10000 .ne. 0
|
||||
cqty = mod(job,10000) .ne. 0
|
||||
cb = mod(job,1000)/100 .ne. 0
|
||||
cr = mod(job,100)/10 .ne. 0
|
||||
cxb = mod(job,10) .ne. 0
|
||||
ju = min0(k,n-1)
|
||||
c
|
||||
c special action when n=1.
|
||||
c
|
||||
if (ju .ne. 0) go to 40
|
||||
if (cqy) qy(1) = y(1)
|
||||
if (cqty) qty(1) = y(1)
|
||||
if (cxb) xb(1) = y(1)
|
||||
if (.not.cb) go to 30
|
||||
if (x(1,1) .ne. 0.0d0) go to 10
|
||||
info = 1
|
||||
go to 20
|
||||
10 continue
|
||||
b(1) = y(1)/x(1,1)
|
||||
20 continue
|
||||
30 continue
|
||||
if (cr) rsd(1) = 0.0d0
|
||||
go to 250
|
||||
40 continue
|
||||
c
|
||||
c set up to compute qy or qty.
|
||||
c
|
||||
if (cqy) call dcopy(n,y,1,qy,1)
|
||||
if (cqty) call dcopy(n,y,1,qty,1)
|
||||
if (.not.cqy) go to 70
|
||||
c
|
||||
c compute qy.
|
||||
c
|
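c     illustrative note (not part of the original linpack text):
c     the loop below appears to apply householder reflections of
c     the form
c
c        h(j) = i - u(j)*trans(u(j))/u(j)(1)
c
c     where u(j) is column j of x from the diagonal down, with its
c     first element temporarily replaced by qraux(j). this reading
c     assumes the normalization used by dqrdc. for one vector z,
c
c        t = -trans(u)*z/u(1)   (the ddot call)
c        z = z + t*u            (the daxpy call)
c
c     the index j runs from ju down to 1 here, which gives q*y.
c     the trans(q)*y loop further down runs from 1 up to ju.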
||||
do 60 jj = 1, ju
|
||||
j = ju - jj + 1
|
||||
if (qraux(j) .eq. 0.0d0) go to 50
|
||||
temp = x(j,j)
|
||||
x(j,j) = qraux(j)
|
||||
t = -ddot(n-j+1,x(j,j),1,qy(j),1)/x(j,j)
|
||||
call daxpy(n-j+1,t,x(j,j),1,qy(j),1)
|
||||
x(j,j) = temp
|
||||
50 continue
|
||||
60 continue
|
||||
70 continue
|
||||
if (.not.cqty) go to 100
|
||||
c
|
||||
c compute trans(q)*y.
|
||||
c
|
||||
do 90 j = 1, ju
|
||||
if (qraux(j) .eq. 0.0d0) go to 80
|
||||
temp = x(j,j)
|
||||
x(j,j) = qraux(j)
|
||||
t = -ddot(n-j+1,x(j,j),1,qty(j),1)/x(j,j)
|
||||
call daxpy(n-j+1,t,x(j,j),1,qty(j),1)
|
||||
x(j,j) = temp
|
||||
80 continue
|
||||
90 continue
|
||||
100 continue
|
||||
c
|
||||
c set up to compute b, rsd, or xb.
|
||||
c
|
||||
if (cb) call dcopy(k,qty,1,b,1)
|
||||
kp1 = k + 1
|
||||
if (cxb) call dcopy(k,qty,1,xb,1)
|
||||
if (cr .and. k .lt. n) call dcopy(n-k,qty(kp1),1,rsd(kp1),1)
|
||||
if (.not.cxb .or. kp1 .gt. n) go to 120
|
||||
do 110 i = kp1, n
|
||||
xb(i) = 0.0d0
|
||||
110 continue
|
||||
120 continue
|
||||
if (.not.cr) go to 140
|
||||
do 130 i = 1, k
|
||||
rsd(i) = 0.0d0
|
||||
130 continue
|
||||
140 continue
|
||||
if (.not.cb) go to 190
|
||||
c
|
||||
c compute b.
|
||||
c
|
||||
do 170 jj = 1, k
|
||||
j = k - jj + 1
|
||||
if (x(j,j) .ne. 0.0d0) go to 150
|
||||
info = j
|
||||
c ......exit
|
||||
go to 180
|
||||
150 continue
|
||||
b(j) = b(j)/x(j,j)
|
||||
if (j .eq. 1) go to 160
|
||||
t = -b(j)
|
||||
call daxpy(j-1,t,x(1,j),1,b,1)
|
||||
160 continue
|
||||
170 continue
|
||||
180 continue
|
||||
190 continue
|
||||
if (.not.cr .and. .not.cxb) go to 240
|
||||
c
|
||||
c compute rsd or xb as required.
|
||||
c
|
||||
do 230 jj = 1, ju
|
||||
j = ju - jj + 1
|
||||
if (qraux(j) .eq. 0.0d0) go to 220
|
||||
temp = x(j,j)
|
||||
x(j,j) = qraux(j)
|
||||
if (.not.cr) go to 200
|
||||
t = -ddot(n-j+1,x(j,j),1,rsd(j),1)/x(j,j)
|
||||
call daxpy(n-j+1,t,x(j,j),1,rsd(j),1)
|
||||
200 continue
|
||||
if (.not.cxb) go to 210
|
||||
t = -ddot(n-j+1,x(j,j),1,xb(j),1)/x(j,j)
|
||||
call daxpy(n-j+1,t,x(j,j),1,xb(j),1)
|
||||
210 continue
|
||||
x(j,j) = temp
|
||||
220 continue
|
||||
230 continue
|
||||
240 continue
|
||||
250 continue
|
||||
return
|
||||
end
|
||||
@@ -0,0 +1,481 @@
|
||||
subroutine dsvdc(x,ldx,n,p,s,e,u,ldu,v,ldv,work,job,info)
|
||||
integer ldx,n,p,ldu,ldv,job,info
|
||||
double precision x(ldx,1),s(1),e(1),u(ldu,1),v(ldv,1),work(1)
|
||||
c
|
||||
c
|
||||
c dsvdc is a subroutine to reduce a double precision nxp matrix x
|
||||
c by orthogonal transformations u and v to diagonal form. the
|
||||
c diagonal elements s(i) are the singular values of x. the
|
||||
c columns of u are the corresponding left singular vectors,
|
||||
c and the columns of v the right singular vectors.
|
||||
c
|
||||
c on entry
|
||||
c
|
||||
c x double precision(ldx,p), where ldx.ge.n.
|
||||
c x contains the matrix whose singular value
|
||||
c decomposition is to be computed. x is
|
||||
c destroyed by dsvdc.
|
||||
c
|
||||
c ldx integer.
|
||||
c ldx is the leading dimension of the array x.
|
||||
c
|
||||
c n integer.
|
||||
c n is the number of rows of the matrix x.
|
||||
c
|
||||
c p integer.
|
||||
c p is the number of columns of the matrix x.
|
||||
c
|
||||
c ldu integer.
|
||||
c ldu is the leading dimension of the array u.
|
||||
c (see below).
|
||||
c
|
||||
c ldv integer.
|
||||
c ldv is the leading dimension of the array v.
|
||||
c (see below).
|
||||
c
|
||||
c work double precision(n).
|
||||
c work is a scratch array.
|
||||
c
|
||||
c job integer.
|
||||
c job controls the computation of the singular
|
||||
c vectors. it has the decimal expansion ab
|
||||
c with the following meaning
|
||||
c
|
||||
c a.eq.0 do not compute the left singular
|
||||
c vectors.
|
||||
c a.eq.1 return the n left singular vectors
|
||||
c in u.
|
||||
c a.ge.2 return the first min(n,p) singular
|
||||
c vectors in u.
|
||||
c b.eq.0 do not compute the right singular
|
||||
c vectors.
|
||||
c b.eq.1 return the right singular vectors
|
||||
c in v.
|
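c     illustrative note (not part of the original linpack text):
c     sample values of job, read as the two-digit number ab.
c
c        job =  0   a=0, b=0   singular values only.
c        job =  1   a=0, b=1   right singular vectors in v.
c        job = 10   a=1, b=0   n left singular vectors in u.
c        job = 11   a=1, b=1   n left vectors in u and
c                              right vectors in v.
c        job = 21   a=2, b=1   first min(n,p) left vectors in u
c                              and right vectors in v.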
||||
c
|
||||
c on return
|
||||
c
|
||||
c s double precision(mm), where mm=min(n+1,p).
|
||||
c the first min(n,p) entries of s contain the
|
||||
c singular values of x arranged in descending
|
||||
c order of magnitude.
|
||||
c
|
||||
c e double precision(p),
|
||||
c e ordinarily contains zeros. however see the
|
||||
c discussion of info for exceptions.
|
||||
c
|
||||
c u double precision(ldu,k), where ldu.ge.n. if
|
||||
c joba.eq.1 then k.eq.n, if joba.ge.2
|
||||
c then k.eq.min(n,p).
|
||||
c u contains the matrix of left singular vectors.
|
||||
c u is not referenced if joba.eq.0. if n.le.p
|
||||
c or if joba.eq.2, then u may be identified with x
|
||||
c in the subroutine call.
|
||||
c
|
||||
c v double precision(ldv,p), where ldv.ge.p.
|
||||
c v contains the matrix of right singular vectors.
|
||||
c v is not referenced if jobb.eq.0. if p.le.n,
|
||||
c then v may be identified with x in the
|
||||
c subroutine call.
|
||||
c
|
||||
c info integer.
|
||||
c the singular values (and their corresponding
|
||||
c singular vectors) s(info+1),s(info+2),...,s(m)
|
||||
c are correct (here m=min(n,p)). thus if
|
||||
c info.eq.0, all the singular values and their
|
||||
c vectors are correct. in any event, the matrix
|
||||
c b = trans(u)*x*v is the bidiagonal matrix
|
||||
c with the elements of s on its diagonal and the
|
||||
c elements of e on its super-diagonal (trans(u)
|
||||
c is the transpose of u). thus the singular
|
||||
c values of x and b are the same.
|
||||
c
|
||||
c linpack. this version dated 08/14/78 .
|
||||
c correction made to shift 2/84.
|
||||
c g.w. stewart, university of maryland, argonne national lab.
|
||||
c
|
||||
c dsvdc uses the following functions and subprograms.
|
||||
c
|
||||
c external drot
|
||||
c blas daxpy,ddot,dscal,dswap,dnrm2,drotg
|
||||
c fortran dabs,dmax1,max0,min0,mod,dsqrt
|
||||
c
|
||||
c internal variables
|
||||
c
|
||||
integer i,iter,j,jobu,k,kase,kk,l,ll,lls,lm1,lp1,ls,lu,m,maxit,
|
||||
* mm,mm1,mp1,nct,nctp1,ncu,nrt,nrtp1
|
||||
double precision ddot,t,r
|
||||
double precision b,c,cs,el,emm1,f,g,dnrm2,scale,shift,sl,sm,sn,
|
||||
* smm1,t1,test,< | ||||