# Overview
This project provides a REST interface to execute queries on the databuffer.
# Requirements
This project requires Java 8 or greater.
# Deployment
This application runs in a [docker container](https://github.psi.ch/docker/query_rest). Use the instructions provided by [ch.psi.daq.install](https://git.psi.ch/sf_daq/ch.psi.daq.install#query_rest) to install the application on a server.
## Application Properties
The following files define and describe the application properties:
- [Cassandra](https://github.psi.ch/sf_daq/ch.psi.daq.cassandra/blob/master/src/main/resources/cassandra.properties) specific properties.
- [Query](https://github.psi.ch/sf_daq/ch.psi.daq.query/blob/master/src/main/resources/query.properties) specific properties.
- [Query REST](https://github.psi.ch/sf_daq/ch.psi.daq.queryrest/blob/master/src/main/resources/queryrest.properties) specific properties.
It is possible to override properties by defining new values in `${HOME}/.config/daq/queryrest.properties`.
## Maven
Upload the jar to the Maven repository (from ch.psi.daq.buildall):
```bash
./gradlew ch.psi.daq.queryrest:uploadArchives
```
## DropIt
Upload the jar to DropIt (from ch.psi.daq.buildall):
```bash
./gradlew ch.psi.daq.queryrest:dropIt -x test
```
## Local Instance
[DAQLocal](https://github.psi.ch/sf_daq/ch.psi.daq.daqlocal) provides a local instance of the DAQ system for testing purposes (allowing users/developers to verify their code before they come to PSI to do their research and interact with the DAQ cluster).
# REST Interface
The REST interface is accessible through `http://data-api.psi.ch/sf`.
## Query Channel Names
### Request
```
POST http://<host>:<port>/channels
```
#### Data
```json
{
"regex":"TRFCA|TRFCB",
"backends":[
"sf-databuffer"
],
"ordering":"asc",
"reload":true
}
```
##### Explanation
- **regex**: Regular expression used to filter channel names. If this value is undefined, no filter is applied. Filtering is done using Java's [Pattern](https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html), more precisely [Matcher.find()](https://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#find--), so the expression may match anywhere within a channel name.
- **backends**: Array of backends to access (values: sf-databuffer|sf-archiverappliance). In case this value is undefined, all backends will be queried for their channels.
- **ordering**: The ordering of the channel names (values: **none**|asc|desc).
- **reload**: Forces the server to reload cached channel names (values: **false**|true).
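The `find()` semantics described above mean that a pattern only needs to match somewhere inside a channel name. As a rough client-side illustration (not the server's implementation), Python's `re.search` behaves the same way for simple patterns like the one in the request. The channel names below are hypothetical, and Java and Python regex syntax differ in some details.

```python
import re

def filter_channels(names, regex=None):
    """Keep names in which the pattern matches anywhere (find semantics)."""
    if regex is None:
        return list(names)  # undefined regex: no filter is applied
    pattern = re.compile(regex)
    return [name for name in names if pattern.search(name)]

# Hypothetical channel names, used only for illustration.
names = ["Example_TRFCA_PHASE", "Example_TRFCB_AMPLT", "Example_DBPM_X"]
print(filter_channels(names, "TRFCA|TRFCB"))
```

Anchoring the expression (for example `^Example_TRFCA`) restricts matches to names that start with the given text.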
### Example
#### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"regex": "AMPLT|PHASE"}' http://data-api.psi.ch/sf/channels | python -m json.tool
```
#### Response
```json
[
{
"backend":"sf-databuffer",
"channels":[
"Channel_01",
"Channel_02",
"Channel_03"
]
},
{
"backend":"sf-archiverappliance",
"channels":[
"Channel_01",
"Channel_04",
"Channel_05"
]
}
]
```
## Query Data
### Request
```
POST http://<host>:<port>/query
```
#### Request body
A request is performed by sending a valid JSON object in the HTTP request body. The JSON query defines the channels to be queried, the range, and how the data should be aggregated (this is optional but highly recommended).
#### Data
```json
{
"channels":[
"Channel_01"
],
"range":{
"startPulseId":0,
"endPulseId":3
},
"ordering":"asc",
"fields":[
"pulseId",
"globalDate",
"value"
],
"aggregation":{
"aggregationType":"value",
"aggregations":[
"min",
"mean",
"max"
],
"nrOfBins":2
},
"response":{
"format":"json",
"compression":"none"
}
}
```
##### Explanation
- **channels**: Array of channels to be queried (see [here](Readme.md#query_channel_names) and [here](Readme.md#define_channel_names)).
- **range**: The range of the query (see [here](Readme.md#query_range)).
- **ordering**: The ordering of the data (see [here](Readme.md#data_ordering)).
- **fields**: Array of requested fields (see [here](Readme.md#requested_fields)).
- **aggregation**: Setting this attribute activates data aggregation (see [here](Readme.md#data_aggregation) for its specification).
- **response**: Specifies the format of the response of the requested data (see [here](Readme.md#response_format)). If this value is not set it defaults to JSON.
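For clients that do not use `curl`, the query can be assembled as a plain dictionary and sent as JSON. The following is a minimal sketch using only the Python standard library; the helper name `build_query_request` is hypothetical, and the request is built but not sent. It uses POST, as the `curl` examples in this document do.

```python
import json
import urllib.request

BASE_URL = "http://data-api.psi.ch/sf"  # base URL given in this document

def build_query_request(query, base_url=BASE_URL):
    """Create (but do not send) a POST request carrying the JSON query."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        base_url + "/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

query = {
    "channels": ["Channel_01"],
    "range": {"startPulseId": 0, "endPulseId": 3},
    "fields": ["pulseId", "globalDate", "value"],
}
request = build_query_request(query)
# Sending is left to the caller, for example:
#   with urllib.request.urlopen(request) as response:
#       data = json.load(response)
```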
### Define Channel Names
The simplest way to define channels is to use an array of channel name Strings.
```json
"channels":[
"Channel_02",
"Channel_04"
]
```
The query interface will automatically select the backend which contains the channel (e.g., *sf-databuffer* for *Channel_02* and *sf-archiverappliance* for *Channel_04*). If a channel name exists in more than one backend, the query interface uses the following order of priority: *sf-databuffer* and then *sf-archiverappliance*.
It is also possible to explicitly define the backend to overcome name clashes.
```json
"channels":[
{
"name":"Channel_01",
"backend":"sf-archiverappliance"
},
{
"name":"Channel_01",
"backend":"sf-databuffer"
}
]
```
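The selection rule described above can be pictured with a small client-side sketch. It only illustrates the documented priority order and is not the server's code; `resolve_backend` is a hypothetical helper, and the channel lists are taken from the channel-name example response in this document.

```python
BACKEND_PRIORITY = ["sf-databuffer", "sf-archiverappliance"]

def resolve_backend(channel, channels_by_backend, priority=BACKEND_PRIORITY):
    """Return the backend for a channel given as a name or a name/backend object."""
    if isinstance(channel, dict) and "backend" in channel:
        return channel["backend"]  # an explicit backend always wins
    name = channel["name"] if isinstance(channel, dict) else channel
    for backend in priority:
        if name in channels_by_backend.get(backend, ()):
            return backend
    return None  # channel is unknown to all backends

available = {
    "sf-databuffer": ["Channel_01", "Channel_02", "Channel_03"],
    "sf-archiverappliance": ["Channel_01", "Channel_04", "Channel_05"],
}
print(resolve_backend("Channel_01", available))
print(resolve_backend({"name": "Channel_01", "backend": "sf-archiverappliance"}, available))
```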
### Define Query Range
Queries are applied to a range. The following types of ranges are supported.
#### By Pulse-Id
```json
"range":{
"startPulseId":0,
"endPulseId":100
}
```
- **startPulseId**: The start pulse-id of the range request.
- **endPulseId**: The end pulse-id of the range request.
#### By Date
```json
"range":{
"startDate":"2015-08-06T18:00:00.000",
"endDate":"2015-08-06T18:59:59.999"
}
```
- **startDate**: The start date of the time range in ISO8601 format, such as 1997-07-16T19:20:30.123+02:00 or 1997-07-16T19:20:30.123456789+02:00. If the UTC offset (here +02:00) is omitted, the server's time zone is used.
- **endDate**: The end date of the time range.
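To avoid depending on the server's time zone, a client can always send an explicit UTC offset. A minimal sketch, assuming millisecond precision is sufficient; the helper names are hypothetical:

```python
from datetime import datetime, timedelta, timezone

def to_iso8601(moment):
    """Format a timezone-aware datetime with millisecond precision and UTC offset."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime so the offset is explicit")
    return moment.isoformat(timespec="milliseconds")

def date_range(start, end):
    """Build the "range" object for a query by date."""
    return {"startDate": to_iso8601(start), "endDate": to_iso8601(end)}

cest = timezone(timedelta(hours=2))
print(date_range(datetime(2015, 8, 6, 18, 0, 0, tzinfo=cest),
                 datetime(2015, 8, 6, 18, 59, 59, 999000, tzinfo=cest)))
```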
#### By Time
```json
"range":{
"startSeconds":"0.0",
"endSeconds":"1.000999999"
}
```
- **startSeconds**: The start time of the range in seconds since midnight, January 1, 1970 UTC (the UNIX epoch) as a decimal value including fractional seconds.
- **endSeconds**: The end time of the range in seconds.
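The seconds values are passed as strings, which preserves nanosecond digits that a double-precision float cannot hold for present-day epoch times. The following sketch shows one way to produce and read such strings without float rounding; both helper names are hypothetical.

```python
from decimal import Decimal

def format_seconds(seconds, nanos=0):
    """Render whole seconds plus nanoseconds with 9 fractional digits."""
    if not 0 <= nanos < 1_000_000_000:
        raise ValueError("nanos must be in the range [0, 1e9)")
    return f"{seconds}.{nanos:09d}"

def parse_seconds(text):
    """Split a non-negative decimal seconds string into (seconds, nanoseconds)."""
    value = Decimal(text)
    whole = int(value)
    nanos = int((value - whole) * 1_000_000_000)
    return whole, nanos

time_range = {"startSeconds": format_seconds(0), "endSeconds": format_seconds(1, 999999)}
print(time_range)
```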
### Data Ordering
```json
"ordering":"asc"
```
- **ordering**: Defines the ordering of the requested data (values: **asc**|desc|none). Use *none* if ordering does not matter (this allows for server-side optimizations).
### Requested Fields
```json
"fields":[
"pulseId",
"globalDate",
"value"
]
```
- **fields**: Array of requested fields (see [here](https://github.psi.ch/sf_daq/ch.psi.daq.domain/blob/master/src/main/java/ch/psi/daq/domain/query/operation/QueryField.java) for possible values).
Timestamps can be requested in three representations:
- *globalSeconds* and *iocSeconds*: seconds since midnight, January 1, 1970 UTC (the UNIX epoch) as a decimal value including fractional seconds.
- *globalMillis* and *iocMillis*: milliseconds since midnight, January 1, 1970 UTC.
- *globalDate* and *iocDate*: an ISO8601 formatted String (such as 1997-07-16T19:20:30.123456789+02:00).
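The seconds, milliseconds, and date representations describe the same instant. As a rough illustration of how they relate (helper names are hypothetical, and the date is rendered in UTC rather than in the server's time zone):

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def seconds_to_millis(seconds_text):
    """Convert a decimal seconds string to whole epoch milliseconds."""
    return int(Decimal(seconds_text) * 1000)

def millis_to_utc_date(millis):
    """Convert epoch milliseconds to an ISO8601 string in UTC."""
    return (EPOCH + timedelta(milliseconds=millis)).isoformat(timespec="milliseconds")

millis = seconds_to_millis("0.020000000")
print(millis, millis_to_utc_date(millis))
```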
### Data Aggregation
It is possible (and recommended) to aggregate queried data.
```json
"aggregation":{
"aggregationType":"value",
"aggregations":[
"min",
"mean",
"max"
],
"nrOfBins":2
}
```
- **aggregationType**: Specifies the type of aggregation (see [here](https://github.psi.ch/sf_daq/ch.psi.daq.domain/blob/master/src/main/java/ch/psi/daq/domain/query/operation/AggregationType.java)). The default type is *value* aggregation (e.g., sum([1,2,3])=6). Alternatively, it is possible to define *index* aggregation for multiple arrays in combination with binning (e.g., sum([1,2,3], [3,2,1]) = [4,4,4]).
- **aggregations**: Array of requested aggregations (see [here](https://github.psi.ch/sf_daq/ch.psi.daq.domain/blob/master/src/main/java/ch/psi/daq/domain/query/operation/Aggregation.java) for possible values). These values will be added to the *data* array response.
- **extrema**: Array of requested extrema (see [here](https://github.psi.ch/sf_daq/ch.psi.daq.domain/blob/master/src/main/java/ch/psi/daq/domain/query/operation/Extrema.java) for possible values). These values will be added to the *data* array response.
- **nrOfBins**: Activates data binning. Specifies the number of bins the pulse/time range should be divided into.
- **durationPerBin**: Activates data binning. Specifies the duration per bin for time-range queries (using a duration makes this binning strategy consistent between channels with different update frequencies). The duration is defined as an [ISO-8601](https://en.wikipedia.org/wiki/ISO_8601#Durations) duration (e.g., `PT1H` for 1 hour, `PT2S` for 2 seconds, `PT0.05S` for 50 milliseconds etc.). The resolution is in milliseconds and thus the minimal duration is 1 millisecond.
- **pulsesPerBin**: Activates data binning. Specifies the number of pulses per bin for pulse-range queries (using a number of pulses makes this binning strategy consistent between channels with different update frequencies).
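The difference between *value* and *index* aggregation can be shown with the two `sum` examples from the list above. This is a conceptual sketch only; the function names are hypothetical and do not correspond to server code.

```python
def value_aggregation(arrays, func):
    """Aggregate over every element of every array, yielding one result."""
    return func([x for array in arrays for x in array])

def index_aggregation(arrays, func):
    """Aggregate element-wise across arrays, yielding one result per index."""
    return [func(column) for column in zip(*arrays)]

print(value_aggregation([[1, 2, 3]], sum))             # 6
print(index_aggregation([[1, 2, 3], [3, 2, 1]], sum))  # [4, 4, 4]
```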
### Response Format
It is possible to specify the response format the queried data should have.
```json
"response":{
"format":"json",
"compression":"none"
}
```
- **format**: The format of the response (values: **json**|csv). Please note that `csv` does not support `index` and `extrema` aggregations.
- **compression**: Responses can be compressed when transferred from the server (values: **none**|gzip). If compression is enabled, pass the `--compressed` option to `curl` so that it decompresses the data automatically.
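Clients other than `curl` may need to decompress the body themselves. The sketch below assumes the body arrives as raw gzip bytes when `"compression":"gzip"` is requested (HTTP libraries that handle `Content-Encoding` transparently would not need this step). The payload is simulated locally and `decode_response` is a hypothetical helper.

```python
import gzip
import json

def decode_response(body, compression="none"):
    """Decode a JSON response body, decompressing it first if it is gzip data."""
    if compression == "gzip":
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))

# Simulated payload, for illustration only (no server is contacted).
payload = json.dumps([{"channel": "Channel_01", "data": []}]).encode("utf-8")
print(decode_response(gzip.compress(payload), "gzip"))
```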
### Example Queries
The following examples build on waveform data (see below). They also work for scalars (consider it as a waveform of length = 1) and images (waveform of length = dimX * dimY).

```json
[
{
"channel":"Channel_01",
"data":[
{
"iocSeconds":"0.000000000",
"pulseId":0,
"globalSeconds":"0.000000000",
"shape":[
4
],
"value":[1,2,3,4]
},
{
"iocSeconds":"0.010000000",
"pulseId":1,
"globalSeconds":"0.010000000",
"shape":[
4
],
"value":[2,3,4,5]
},
{
"iocSeconds":"0.020000000",
"pulseId":2,
"globalSeconds":"0.020000000",
"shape":[
4
],
"value":[3,4,5,6]
},
{
"iocSeconds":"0.030000000",
"pulseId":3,
"globalSeconds":"0.030000000",
"shape":[
4
],
"value":[4,5,6,7]
}
]
}
]
```
### Query Examples
#### Query by Pulse-Id Range
```json
{
"range":{
"startPulseId":0,
"endPulseId":3
},
"channels":[
"Channel_01"
]
}
```
##### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"range":{"startPulseId":0,"endPulseId":3},"channels":["Channel_01"]}' http://data-api.psi.ch/sf/query | python -m json.tool
```
##### Response
See JSON representation of the data above.
#### Query by Time Range
```json
{
"range":{
"startSeconds":"0.0",
"endSeconds":"0.030999999"
},
"channels":[
"Channel_01"
]
}
```
##### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"range":{"startSeconds":"0.0","endSeconds":"0.030999999"},"channels":["Channel_01"]}' http://data-api.psi.ch/sf/query | python -m json.tool
```
##### Response
See JSON representation of the data above.
#### Query by Date Range
```json
{
"range":{
"startDate":"1970-01-01T01:00:00.000",
"endDate":"1970-01-01T01:00:00.030"
},
"channels":[
"Channel_01"
]
}
```
The supported date format is ISO8601, such as 1997-07-16T19:20:30.123+02:00 or 1997-07-16T19:20:30.123456789+02:00. If the UTC offset (here +02:00) is omitted, the server's time zone is used.
##### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"range":{"startDate":"1970-01-01T01:00:00.000","endDate":"1970-01-01T01:00:00.030"},"channels":["Channel_01"]}' http://data-api.psi.ch/sf/query | python -m json.tool
```
##### Response
See JSON representation of the data above.
#### Querying Archiver Appliance
```json
{
"range":{
"startSeconds":"0.0",
"endSeconds":"0.030999999"
},
"channels":[
{
"name": "Channel_01",
"backend":"sf-archiverappliance"
},
{
"name": "Channel_02",
"backend":"sf-archiverappliance"
}
]
}
```
Archiver Appliance supports queries by *time range* and *date range* only (as it has no notion of pulse-ids).
##### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"range":{"startSeconds":"0.0","endSeconds":"0.030999999"},"channels":[{"name": "Channel_01","backend":"sf-archiverappliance"},{"name": "Channel_02","backend":"sf-archiverappliance"}]}' http://data-api.psi.ch/sf/query | python -m json.tool
```
##### Response
See JSON representation of the data above.
#### Query Using Compression
```json
{
"response":{
"compression":"gzip"
},
"range":{
"startPulseId":0,
"endPulseId":3
},
"channels":[
"Channel_01"
]
}
```
##### Command (gzip)
The `curl` command has a `--compressed` option to decompress data automatically.
```bash
curl --compressed -H "Content-Type: application/json" -X POST -d '{"response":{"compression":"gzip"},"range":{"startPulseId":0,"endPulseId":3},"channels":["Channel_01"]}' http://data-api.psi.ch/sf/query | python -m json.tool
```
#### Querying for Specific Fields
Requesting only specific fields allows for server-side optimizations, since not all data needs to be retrieved.
```json
{
"fields":["pulseId","value"],
"range":{
"startPulseId":0,
"endPulseId":3
},
"channels":[
"Channel_01"
]
}
```
##### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"fields":["pulseId","value"],"range":{"startPulseId":0,"endPulseId":3},"channels":["Channel_01"]}' http://data-api.psi.ch/sf/query | python -m json.tool
```
##### Response
```json
[
{
"channel":"Channel_01",
"data":[
{
"pulseId":0,
"value":[1,2,3,4]
},
{
"pulseId":1,
"value":[2,3,4,5]
},
{
"pulseId":2,
"value":[3,4,5,6]
},
{
"pulseId":3,
"value":[4,5,6,7]
}
]
}
]
```
#### Query CSV Format
```json
{
"response":{
"format":"csv"
},
"range":{
"startPulseId":0,
"endPulseId":4
},
"channels":[
"channel1",
"channel2"
],
"fields":[
"channel",
"pulseId",
"iocSeconds",
"globalSeconds",
"shape",
"eventCount",
"value"
]
}
```
##### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"response":{"format":"csv"},"range":{"startPulseId":0,"endPulseId":4},"channels":["channel1","channel2"],"fields":["channel","pulseId","iocSeconds","globalSeconds","shape","eventCount","value"]}' http://data-api.psi.ch/sf/query
```
##### Response
The response is in CSV.
```text
channel;pulseId;iocSeconds;globalSeconds;shape;eventCount;value
testChannel1;0;0.000000000;0.000000000;[1];1;0
testChannel1;1;0.010000000;0.010000000;[1];1;1
testChannel1;2;0.020000000;0.020000000;[1];1;2
testChannel1;3;0.030000000;0.030000000;[1];1;3
testChannel1;4;0.040000000;0.040000000;[1];1;4
testChannel2;0;0.000000000;0.000000000;[1];1;0
testChannel2;1;0.010000000;0.010000000;[1];1;1
testChannel2;2;0.020000000;0.020000000;[1];1;2
testChannel2;3;0.030000000;0.030000000;[1];1;3
testChannel2;4;0.040000000;0.040000000;[1];1;4
```
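The CSV response uses a semicolon as the field separator, so it can be read with a standard CSV parser. A minimal sketch using the first rows of the response above; all values are returned as strings and `parse_csv_response` is a hypothetical helper.

```python
import csv
import io

def parse_csv_response(text):
    """Parse the semicolon-separated response into dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))

sample = """channel;pulseId;iocSeconds;globalSeconds;shape;eventCount;value
testChannel1;0;0.000000000;0.000000000;[1];1;0
testChannel1;1;0.010000000;0.010000000;[1];1;1
"""
rows = parse_csv_response(sample)
print(rows[1]["channel"], rows[1]["pulseId"], rows[1]["globalSeconds"])
```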
#### Data Ordering
```json
{
"ordering":"desc",
"fields":["pulseId","value"],
"range":{
"startPulseId":0,
"endPulseId":3
},
"channels":[
"Channel_01"
]
}
```
Use **none** if ordering does not matter (this allows for server-side optimizations).
##### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"ordering":"desc","fields":["pulseId","value"],"range":{"startPulseId":0,"endPulseId":3},"channels":["Channel_01"]}' http://data-api.psi.ch/sf/query | python -m json.tool
```
##### Response
```json
[
{
"channel":"Channel_01",
"data":[
{
"pulseId":3,
"value":[4,5,6,7]
},
{
"pulseId":2,
"value":[3,4,5,6]
},
{
"pulseId":1,
"value":[2,3,4,5]
},
{
"pulseId":0,
"value":[1,2,3,4]
}
]
}
]
```
#### Query Aggregated Values
```json
{
"aggregation":{
"aggregationType":"value",
"aggregations":["min","mean","max"]
},
"fields":["pulseId","value"],
"range":{
"startPulseId":0,
"endPulseId":3
},
"channels":[
"Channel_01"
]
}
```
##### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"aggregation":{"aggregationType":"value","aggregations":["min","mean","max"]},"fields":["pulseId","value"],"range":{"startPulseId":0,"endPulseId":3},"channels":["Channel_01"]}' http://data-api.psi.ch/sf/query | python -m json.tool
```
##### Response
```json
[
{
"channel":"Channel_01",
"data":[
{
"pulseId":0,
"value":{
"min":1.0,
"max":4.0,
"mean":2.5
}
},
{
"pulseId":1,
"value":{
"min":2.0,
"max":5.0,
"mean":3.5
}
},
{
"pulseId":2,
"value":{
"min":3.0,
"max":6.0,
"mean":4.5
}
},
{
"pulseId":3,
"value":{
"min":4.0,
"max":7.0,
"mean":5.5
}
}
]
}
]
```
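The numbers in this response can be reproduced from the sample waveform data: each pulse's array is reduced to its minimum, maximum, and mean. The sketch below is a local recomputation for illustration, not the server's implementation.

```python
from statistics import mean

# Sample waveform data from this document, keyed by pulseId.
waveforms = {0: [1, 2, 3, 4], 1: [2, 3, 4, 5], 2: [3, 4, 5, 6], 3: [4, 5, 6, 7]}

def aggregate_values(values):
    """Reduce one array to the requested statistics."""
    return {
        "min": float(min(values)),
        "max": float(max(values)),
        "mean": float(mean(values)),
    }

per_pulse = [
    {"pulseId": pulse_id, "value": aggregate_values(values)}
    for pulse_id, values in waveforms.items()
]
print(per_pulse[0])
```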
Figure: Illustration of array value aggregation.

#### Value Aggregation with Binning (nrOfBins)
```json
{
"aggregation":{
"nrOfBins":2,
"aggregationType":"value",
"aggregations":["min","mean","max"]
},
"fields":["pulseId","value"],
"range":{
"startPulseId":0,
"endPulseId":3
},
"channels":[
"Channel_01"
]
}
```
##### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"aggregation":{"nrOfBins":2,"aggregationType":"value","aggregations":["min","mean","max"]},"fields":["pulseId","value"],"range":{"startPulseId":0,"endPulseId":3},"channels":["Channel_01"]}' http://data-api.psi.ch/sf/query | python -m json.tool
```
##### Response
```json
[
{
"channel":"Channel_01",
"data":[
{
"pulseId":0,
"value":{
"min":1.0,
"max":5.0,
"mean":3.0
}
},
{
"pulseId":2,
"value":{
"min":3.0,
"max":7.0,
"mean":5.0
}
}
]
}
]
```
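The binned response is consistent with splitting the inclusive pulse range into equally sized bins and aggregating all values that fall into each bin. The sketch below reproduces the numbers under that assumption; how the server assigns events to bins and labels each bin (here: by the first pulse-id in the bin) is inferred from the example, not taken from its code.

```python
from statistics import mean

# Sample waveform data from this document, keyed by pulseId.
waveforms = {0: [1, 2, 3, 4], 1: [2, 3, 4, 5], 2: [3, 4, 5, 6], 3: [4, 5, 6, 7]}

def bin_by_count(data, start, end, nr_of_bins):
    """Split the inclusive pulse range into equal bins and aggregate each bin."""
    width = (end - start + 1) / nr_of_bins  # pulses per bin
    values = [[] for _ in range(nr_of_bins)]
    first_pulse = [None] * nr_of_bins
    for pulse_id in sorted(data):
        if not start <= pulse_id <= end:
            continue
        index = min(int((pulse_id - start) / width), nr_of_bins - 1)
        if first_pulse[index] is None:
            first_pulse[index] = pulse_id  # assumed bin label
        values[index].extend(data[pulse_id])
    return [
        {"pulseId": first_pulse[i],
         "value": {"min": float(min(v)), "max": float(max(v)), "mean": float(mean(v))}}
        for i, v in enumerate(values) if v
    ]

print(bin_by_count(waveforms, 0, 3, 2))
```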
Figure: Illustration of array value aggregation with additional binning.

#### Value Aggregation with Binning (durationPerBin/pulsesPerBin)
**durationPerBin** specifies the duration per bin for time-range queries (using a duration makes this binning strategy consistent between channels with different update frequencies). The duration is defined as an [ISO-8601](https://en.wikipedia.org/wiki/ISO_8601#Durations) duration (e.g., `PT1H` for 1 hour, `PT2S` for 2 seconds, `PT0.05S` for 50 milliseconds etc.). The resolution is in milliseconds and thus the minimal duration is 1 millisecond.
**pulsesPerBin** specifies the number of pulses per bin for pulse-range queries (using a number of pulses makes this binning strategy consistent between channels with different update frequencies).
```json
{
"aggregation":{
"pulsesPerBin":2,
"aggregationType":"value",
"aggregations":["min","mean","max"]
},
"fields":["globalMillis","value"],
"range":{
"startSeconds":"0.0",
"endSeconds":"0.030000000"
},
"channels":[
"Channel_01"
]
}
```
##### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"aggregation":{"pulsesPerBin":2,"aggregationType":"value","aggregations":["min","mean","max"]},"fields":["globalMillis","value"],"range":{"startSeconds":"0.0","endSeconds":"0.030000000"},"channels":["Channel_01"]}' http://data-api.psi.ch/sf/query | python -m json.tool
```
##### Response
```json
[
{
"channel":"Channel_01",
"data":[
{
"globalMillis":0,
"value":{
"min":1.0,
"max":5.0,
"mean":3.0
}
},
{
"globalMillis":20,
"value":{
"min":3.0,
"max":7.0,
"mean":5.0
}
}
]
}
]
```
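For **durationPerBin**, the bin length has to be written as an ISO-8601 duration string. The helper below is a hypothetical convenience that turns a millisecond count into the seconds-only form (such as `PT0.05S`). Whether the server also accepts larger values written purely in seconds (for example `PT3600S` instead of `PT1H`) is an assumption that should be checked.

```python
def millis_to_iso_duration(millis):
    """Format a positive millisecond count as an ISO-8601 duration in seconds."""
    if millis < 1:
        raise ValueError("the minimal duration is 1 millisecond")
    seconds, remainder = divmod(millis, 1000)
    if remainder == 0:
        return f"PT{seconds}S"
    fraction = f"{remainder:03d}".rstrip("0")  # drop trailing zeros, keep leading ones
    return f"PT{seconds}.{fraction}S"

aggregation = {
    "durationPerBin": millis_to_iso_duration(50),
    "aggregationType": "value",
    "aggregations": ["min", "mean", "max"],
}
print(aggregation["durationPerBin"])
```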
Figure: Illustration of array value aggregation with additional binning.

#### Index Aggregation
```json
{
"aggregation":{
"nrOfBins":1,
"aggregationType":"index",
"aggregations":["min","mean","max","sum"]
},
"fields":["pulseId","value"],
"range":{
"startPulseId":0,
"endPulseId":3
},
"channels":[
"Channel_01"
]
}
```
##### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"aggregation":{"nrOfBins":1,"aggregationType":"index","aggregations":["min","max","mean","sum"]},"fields":["pulseId","value"],"range":{"startPulseId":0,"endPulseId":3},"channels":["Channel_01"]}' http://data-api.psi.ch/sf/query | python -m json.tool
```
##### Response
```json
[
{
"channel":"Channel_01",
"data":[
{
"pulseId":0,
"value":[
{
"min":1.0,
"max":4.0,
"mean":2.5,
"sum":10.0
},
{
"min":2.0,
"max":5.0,
"mean":3.5,
"sum":14.0
},
{
"min":3.0,
"max":6.0,
"mean":4.5,
"sum":18.0
},
{
"min":4.0,
"max":7.0,
"mean":5.5,
"sum":22.0
}
]
}
]
}
]
```
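The response above can be recomputed from the sample waveform data: with all four pulses in a single bin, each array index is aggregated across the pulses. This is a local illustration only, not the server's implementation.

```python
from statistics import mean

# Sample waveform values from this document for pulse-ids 0 to 3.
pulses = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7]]

def index_aggregate(arrays):
    """Aggregate element-wise across arrays, one result object per index."""
    return [
        {"min": float(min(column)), "max": float(max(column)),
         "mean": float(mean(column)), "sum": float(sum(column))}
        for column in zip(*arrays)
    ]

result = index_aggregate(pulses)
print(result[0])
```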
Figure: Illustration of array index aggregation with additional binning (an `nrOfBins` greater than 1 is also possible).
## Query Channel Status
It is possible to retrieve channel-specific status information.
### Request
```
POST http://<host>:<port>/status/channels
```
#### Data
```json
{
"channels":[
"Channel_02",
"Channel_04"
]
}
```
##### Explanation
- **channels**: Array of channels to be queried (see [here](Readme.md#query_channel_names) and [here](Readme.md#define_channel_names)).
### Example
#### Command
```bash
curl -H "Content-Type: application/json" -X POST -d '{"channels": ["Channel_02","Channel_04"]}' http://data-api.psi.ch/sf/status/channels | python -m json.tool
```
#### Response
```json
[
{
"channel":{
"name":"Channel_02",
"backend":"sf-databuffer"
},
"recording":true,
"connected":true,
"lastEventDate":"2016-07-06T09:16:19.607242575+02:00"
},
{
"channel":{
"name":"Channel_04",
"backend":"sf-archiverappliance"
},
"recording":false,
"connected":false,
"lastEventDate":"2016-07-06T04:16:14.000000000+02:00"
}
]
```
##### Explanation
- **channel**: The name and backend of the channel.
- **recording**: Indicates whether the channel is still being recorded (please note that for beam synchronous DAQ this means that the source/IOC providing the channel is still being recorded).
- **connected**: Indicates whether the channel is still connected (please note that for beam synchronous DAQ this means that the source/IOC providing the channel is still connected).
- **lastEventDate**: The timestamp of the last received event from the channel in the ISO8601 format.
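The `lastEventDate` values carry nanosecond precision, which Python's `datetime` cannot hold. The following sketch truncates the fraction to microseconds before parsing; it assumes the timestamp layout shown in the example response, and `parse_event_date` is a hypothetical helper.

```python
import re
from datetime import datetime

_TIMESTAMP = re.compile(r"^(.*T\d{2}:\d{2}:\d{2})(?:\.(\d+))?([+-]\d{2}:\d{2}|Z)$")

def parse_event_date(text):
    """Parse an ISO8601 timestamp, truncating nanoseconds to microseconds."""
    match = _TIMESTAMP.match(text)
    if match is None:
        raise ValueError(f"unsupported timestamp: {text!r}")
    base, fraction, offset = match.groups()
    micros = (fraction or "").ljust(6, "0")[:6]
    offset = "+00:00" if offset == "Z" else offset
    return datetime.strptime(f"{base}.{micros}{offset}", "%Y-%m-%dT%H:%M:%S.%f%z")

status = [
    {"channel": {"name": "Channel_02", "backend": "sf-databuffer"},
     "recording": True, "connected": True,
     "lastEventDate": "2016-07-06T09:16:19.607242575+02:00"},
    {"channel": {"name": "Channel_04", "backend": "sf-archiverappliance"},
     "recording": False, "connected": False,
     "lastEventDate": "2016-07-06T04:16:14.000000000+02:00"},
]
not_recording = [entry["channel"]["name"] for entry in status if not entry["recording"]]
print(not_recording, parse_event_date(status[0]["lastEventDate"]))
```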