# **Saving multiple sources at SwissFEL: a very serious guide**

*Scroll down for the boring code*

**Hey, look at my nice data!**

*What a load of bs!*

**Oy, no need to be rude!**

*I'm not! I mean, what a load of **Beam Synchronous** data!*

**Well, thanks... but it's not all saved.**

*Bummer. How were you saving it?*

**I used this cool command-line script `bs...`**

*Ooooh... I'm going to stop you right there.*

**But my bs command works; there must be a problem with the sources!**

*Calm down, detective. Try saving each source individually with the `bs` command...*

**OK, hang on... what the?! They *both* save fine on their own!**

*Yep, the issue appears when you try to save sources from different IOCs/devices with the `bs` command. The data is taken from the **dispatcher**. If the two sources don't arrive at the dispatcher within a small time window, only the first source is sent in the message, and different sources arrive at the dispatcher at **different times**.*

**What type of BS is that?! I can't wait and wait...**

*Good question. Some bs data comes from **pipelines**, where calculations and moving data around **take time**. You're not just saving numbers; you're saving processed results.*

**Pipelines?! I want data, not plumbing problems!**

*Think of pipelines as hardworking elves doing data analysis behind the scenes. No pipelines, more work for you.*

**Alright, I'm sold. But how do I save multiple sources without all this drama?**

*You need to save from the **data buffer**. There, the system can handle sources arriving at slightly different times.*

**The data buffer? How?**

*You've got plenty of tools for accessing it:*

- **[DataHub](https://github.com/paulscherrerinstitute/datahub)** (don't ask about a front-end)
- **[Data API](https://github.com/paulscherrerinstitute/data_api_python)** (if you speak code)
- **[Eco](https://github.com/paulscherrerinstitute/eco), [Slic](https://gitlab.psi.ch/slic), Service Now, Concour, Time** (some might not work).
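What lets these tools cope with sources arriving at different times is that every beam-synchronous shot carries a unique pulse ID, so data can be lined up after the fact. As a minimal sketch (plain pandas, with made-up pulse IDs and values rather than real channels), aligning two gappy sources by pulse ID looks like this:

```python
import pandas as pd

# Hypothetical per-channel series indexed by pulse ID; each source has gaps
intensity = pd.Series([1.2, 1.4, 1.1], index=[1001, 1002, 1004], name="INTENSITY")
energy = pd.Series([0.9, 1.0, 0.8], index=[1001, 1003, 1004], name="PULSE-ENERGY")

# Outer join on pulse ID: every shot appears once, missing values become NaN
aligned = pd.concat([intensity, energy], axis=1)
print(aligned)

# Keep only the shots where both sources actually delivered data
complete = aligned.dropna()
print(complete.index.tolist())
```

No timing windows involved: shots either share a pulse ID or they don't.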
**Steady your sources, Doc Brown. I just checked my data: one of the sources isn't running at 100 Hz and is missing data. I told you the source was the problem!**

*Missing data isn't necessarily a problem; the approaches above can handle missing pulse IDs and return all the data that is in the data buffer.*

**Pulse IDs? Does my data need to prove its age?**

*No, it's a way to index the data: with bs data, every shot has a unique pulse ID. You can use the SwissFEL data analysis packages to match up arrays with missing shots for you.*

**Cool! Anything else I should know?**

*Yes: the data buffer can't clear your desk at PiA.*

**✅ Do Say:** *Pulse IDs rock my world!*

**❌ Don't Say:** *Beam synchronous PV*

## Example scripts using datahub

### Saving historic data for a set time range

```python
from datahub import *
from datetime import datetime, timedelta

# Set time range: from 6 minutes ago to 5 minutes ago
now = datetime.now()
from_time, to_time = [
    (now - timedelta(minutes=m)).strftime('%Y-%m-%d %H:%M:%S.%f')[:-3]
    for m in [6, 5]
]

# Define the channels to retrieve
channels = [
    "SARFE10-PSSS059:SPECTRUM_Y",
    "SAROP21-PBPS133:INTENSITY",
    "SARFE10-PBPG050:FAST-PULSE-ENERGY"
]

# Construct the query with channels and time range
query = {
    "channels": channels,
    "start": from_time,
    "end": to_time
}

# Connect to the data buffer and retrieve the data
with Daqbuf(backend="sf-databuffer", cbor=True) as source:
    table = Table()
    source.add_listener(table)
    source.request(query)
    dataframe = table.as_dataframe(index=Table.PULSE_ID)

# Print the number of pulses recorded for each channel
for channel in channels:
    if channel in dataframe.columns:
        num_shots = dataframe[channel].count()
        print(f"{channel}: {num_shots} pulses")
    else:
        print(f"{channel}: channel not found in the dataframe.")
```

### Example of a stream of live data

```python
from datahub import Bsread, Table

channels = [
    "SARFE10-PSSS059:SPECTRUM_Y",
    "SAROP21-PBPS133:INTENSITY",
    "SARFE10-PBPG050:FAST-PULSE-ENERGY"
]

with Bsread() as source:
    table = Table()
    source.add_listener(table)
    source.req(channels, 0.0, 2.0)  # receive the stream from t = 0.0 s to t = 2.0 s
    dataframe = table.as_dataframe(index=Table.PULSE_ID)

# Print the number of pulses recorded for each channel
for channel in channels:
    if channel in dataframe.columns:
        num_shots = dataframe[channel].count()
        print(f"{channel}: {num_shots} pulses")
    else:
        print(f"{channel}: channel not found in the dataframe.")
```
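If you want to check how badly a source is missing shots (the not-quite-100 Hz situation above), you can compare each channel's pulse count against the full pulse-ID span of the returned dataframe. A sketch using a toy dataframe in place of a real query result (the channel names here are placeholders, and the span-based estimate assumes consecutive pulse IDs at a fixed rate):

```python
import numpy as np
import pandas as pd

# Toy stand-in for a dataframe returned by the examples above:
# indexed by pulse ID, one column per channel, NaN where a shot is missing
df = pd.DataFrame(
    {"CHANNEL_A": [1.0, np.nan, 2.0, 3.0], "CHANNEL_B": [0.5, 0.6, np.nan, np.nan]},
    index=[2000, 2001, 2002, 2003],
)

# Number of shots expected if every pulse ID in the span had been delivered
expected = df.index.max() - df.index.min() + 1

# Fraction of expected shots each channel actually delivered
for channel in df.columns:
    rate = df[channel].count() / expected
    print(f"{channel}: {df[channel].count()}/{expected} shots ({rate:.0%})")
```

A channel well below 100 % here is missing shots, which the pulse-ID matching above tolerates but is worth knowing about before you average anything.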