% FIXME: Change variable names according to Noise Sources in Bulk CMOS
\chapter{Preparation}%
\label{sec:preparation}
\begin{chapquote}{Lewis Carroll, \textit{Alice in Wonderland}}
``Begin at the beginning,'' the King said gravely, ``and go on till you come to the end: then stop.''
\end{chapquote}
%\section{Grounding and Shielding}
%Add parts from "references\\Grounding and Shielding.pdf"
\section{Laser System}%
\label{sec:prep_laser_system}
%\subsection{Requirements Laser System}
The ARTEMIS experiment is currently being commissioned. The relative abundance of argon, the ability to create highly charged ion species from it using an electron beam, and a scientifically relevant transition at $\lambda = \qty{441.25575(17)}{\nm}$ \cite{ar13+_wavelength} with a lifetime of \qty{9.573(6)}{\ms} \cite{ar13+_lifetime} make \ce{Ar^{13+}} an ideal candidate for this purpose. Figure \ref{fig:bohr_ar13} shows the simplified electronic configuration of the boron-like \ce{Ar^{13+}} ion investigated.
\begin{figure}[ht]
\centering
\begin{subfigure}{0.28\linewidth}
\centering
\import{figures/}{fig_bohr_Ar13.tex}
\caption{Shell model of boron-like \ce{Ar^{13+}}.}
\label{fig:bohr_ar13}
\end{subfigure}
\begin{subfigure}{0.6\linewidth}
\centering
\scalebox{0.75}{%
\import{figures/}{fig_level_transitions.tex}
}% scalebox
\caption{Level diagram of the \ce{Ar^{13+}} transitions accessible at \qty{441}{\nm}. Frequencies with numbers denote optical transitions, letters denote microwave transitions.}
\label{fig:level_transitions}
\end{subfigure}
\caption{Electronic configuration and optically accessible transitions of \ce{Ar^{13+}}.}
\label{fig:ar13}
\end{figure}
The optical transitions of highly charged \ce{Ar^{13+}} around \qty{441}{\nm} shown in figure \ref{fig:level_transitions} can be used for laser spectroscopy. The transition of interest for commissioning is the one from the ground state $[(1s)^2 (2s)^2 2p]^2 \, P_{1/2}$ to the first excited state $[(1s)^2 (2s)^2 2p]^2 \, P_{3/2}$ \cite{optical_transitions_ar13}. The Zeeman splitting introduced by the \qty{7}{\tesla} magnet of the ARTEMIS Penning trap splits the $^2P_{1/2}$ ground state by $\nu_d = \qty{65}{\GHz}$ and the excited $^2 P_{3/2}$ state by $\nu_a = \qty{130}{\GHz}$. The natural linewidth of these transitions is $\Gamma \approx 2 \pi \times \qty{16.63(1)}{\Hz}$, which is fairly small, but there is substantial Doppler broadening of
\begin{equation}
\Delta \nu (\lambda = \qty{441}{\nm}, T=\qty{4}{\K}, m=\qty{39.948}{u}) = \frac{2}{\lambda}\sqrt{2 \ln 2 \frac{k_B T}{m}} \approx \qty{150}{\MHz} \, , \label{eqn:doppler_broadening}
\end{equation}
seen in the trap, which is kept at a temperature of \qty{4}{\K}.
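Equation \ref{eqn:doppler_broadening} can be checked numerically. The following is a minimal sketch in Python, the scripting language already used for the plots in this work; constant values are CODATA figures, rounded:

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K
AMU = 1.66053907e-27  # atomic mass unit in kg

def doppler_fwhm(wavelength, temperature, mass_u):
    """FWHM Doppler broadening (in Hz) of a transition at `wavelength` (in m)."""
    mass = mass_u * AMU
    return (2.0 / wavelength) * math.sqrt(2.0 * math.log(2.0) * K_B * temperature / mass)

# Ar13+ (m = 39.948 u) at 4 K, probed at 441 nm
fwhm = doppler_fwhm(441e-9, 4.0, 39.948)
print(f"Doppler FWHM: {fwhm / 1e6:.0f} MHz")  # roughly 150 MHz
```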
The simplified laser setup shown schematically in figure \ref{fig:laser_system_old} was characterised by \citeauthor{thesis_alex} \cite{thesis_alex}, and its wavelength transfer accuracy was calculated to be \qty{1.2}{\MHz}. Including the additional long-term drift of the system leads to an upper limit of the absolute wavelength uncertainty of \qty{2.2}{\MHz}. This is more than adequate given the Doppler broadening of \qty{150}{\MHz} calculated in equation \ref{eqn:doppler_broadening}. While sufficiently accurate, the system has the significant drawback of being fairly complicated to manage if not maintained on a daily basis. Its performance and uncertainty rely on exact knowledge of the tellurium spectrum surrounding both laser wavelengths.
\begin{figure}[ht]
\centering
%\scalebox{1}{%
\import{figures/}{laser_system_old.tex}
%} % scalebox
\caption{Simplified setup of the laser system in use at ARTEMIS prior to this work. Blue lines are laser beams, black lines are electronic signals delivering feedback.}
\label{fig:laser_system_old}
\end{figure}
To operate the system, the master laser must first be locked to a tellurium reference transition, which has to be found manually using the tellurium spectrum charted in \cite{thesis_alex}. While not complicated, this nonetheless requires a trained user. The transfer cavity is then locked to the master laser, a straightforward process. The spectroscopy laser then has to be locked to the correct fringe of the transfer cavity. This is done by again observing the tellurium background to adjust the diode laser mode to a frequency close to the desired cavity fringe. This is by no means an automated process and requires a competent operator. Although this locking procedure requires training, it works reliably; however, there are further issues that are not readily apparent.
One potential problem lies with the master laser. The whole calibration is geared towards a reference transition at \qty{452.756}{\nm} catalogued by \citeauthor{te_reference_lines} \cite{te_reference_lines}. Replacing the master laser in case of a failure is challenging: blue laser diodes are less tunable than their NIR cousins, as discussed in \cite{thesis_baus,thesis_alex}, and the diode used in the current laser was handpicked for this wavelength. This limits the availability of replacement parts and increases their cost. Using another diode and wavelength along with a different tellurium transition would require charting a new tellurium map, which is a laborious process.
Another challenge is created by the locking scheme, which becomes more complicated when introducing a third laser to complete the closed transition for the laser-microwave double-resonance spectroscopy. The setup in its present condition was already prepared for the second spectroscopy laser, which must also be locked to the transfer cavity. Currently both lasers are sent into the transfer cavity with perpendicular polarisations to separate the beams using a polarizing beam splitter. Since the reference laser is running at \qty{453}{\nm} while the spectroscopy laser is using \qty{441}{\nm}, it is also possible to separate those two via a dichroic mirror. While possible, this scheme requires the close overlap of three beams along with their reflections, as required for saturation spectroscopy. Such a setup strongly couples the three beam paths and further complicates future adjustments.
The final issue regarding the transfer cavity concerns the high voltage required by the piezoelectric actuator that adjusts its length. This piezo requires up to \qty{1.25}{\kV} to reach the translation necessary for scanning the tellurium spectrum. Not only does this pose a risk to an untrained operator, but it was also discovered during the commissioning of the current system that high frequency noise from the voltage supply is radiated and can enter the experiment. These issues were minimized by testing several power supplies, choosing the lowest-noise device and keeping the supply well separated from the rest of the experiment, which increases the space required for the whole system.
During the commissioning and testing of the laser system it was also found that the laser drivers had issues with blue laser diodes. Those drivers used a modified in-house current source normally used for NIR diodes, based on the design of \citeauthor{libbrecht_hall} \cite{libbrecht_hall}. With these drivers an increasing instability was observed when adjusting the current up to the operating point. The origin of this problem lies in the larger operating voltage of the blue laser diode of up to \qty{7}{\V} compared to the more moderate \qty{2.5}{\V} of NIR diodes. The details are discussed in section \ref{sec:compliance_voltage}. Other commercial drivers tested either exhibited the same problem or, due to the modifications made by the manufacturers to increase the compliance voltage, were far noisier and therefore harder to work with.
To conclude, the commissioning of the laser system has brought up two issues with the current system that impact its availability and performance. The transfer cavity can be considered the Achilles' heel of the system, and replacing it with a more flexible alternative like a high performance wavemeter can greatly improve the usability and flexibility of the whole laser system, as it breaks up the tight dependency on the tellurium reference laser. Additionally, this opens up alternative wavelengths for the spectroscopy of other highly charged ion species. The second issue was found with the laser driver, and the apparent lack of commercial solutions sparked the development of a novel laser driver for the next generation of laser diodes to gain direct access to more wavelengths. This is discussed in the next section.
\clearpage
\section{Laser Current Driver}%
\label{sec:laser_current_driver}
% Include Emission wavelength dependence of characteristic temperature of InGaN laser diodes
% Check Diode Laser Characteristics
% I-lamda in Wavelength Dependence of InGaN Laser Diode Characteristics
% also Determination of piezoelectric fields in strained GaInN quantum wells using the quantum-confined Stark effect
Laser diodes are current-driven devices, because
\begin{equation*}
P_{out} \propto I
\end{equation*}
and the diode current $I$ approximately follows the Shockley equation \cite{shockley_diode}
\begin{equation}
I = I_0 \left( e^{\frac{qV_d}{k_B T}} - 1\right) \, , \label{eqn:shockley}
\end{equation}
where $k_B$ is the Boltzmann constant, $T$ the temperature, $q$ the electron charge and $V_d$ the diode voltage. The exponential dependence of the current on the supply voltage calls for a current source to drive a laser diode safely without risking thermal damage due to excessive injection currents.
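To illustrate just how steep this exponential dependence is, the following sketch evaluates equation \ref{eqn:shockley}; the saturation current $I_0$ is an arbitrary placeholder value, not a property of any specific diode:

```python
import math

K_B = 1.380649e-23     # Boltzmann constant in J/K
Q_E = 1.602176634e-19  # elementary charge in C

def shockley_current(v_d, i_0=1e-12, temperature=293.0):
    """Ideal Shockley diode current; i_0 is a placeholder saturation current."""
    v_t = K_B * temperature / Q_E  # thermal voltage, ~25 mV at room temperature
    return i_0 * (math.exp(v_d / v_t) - 1.0)

# Raising the junction voltage by only 60 mV increases the current roughly
# tenfold -- hence the need for a current source rather than a voltage source.
ratio = shockley_current(0.76) / shockley_current(0.70)
print(f"current gain for +60 mV: {ratio:.1f}x")
```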
The primary function of a laser driver is therefore to provide a stable, but user adjustable, current. This current can typically be modulated at frequencies up to several \unit{\MHz} to shape the frequency and amplitude of the laser output light. Additional features, like current and voltage limits, aid in protecting the expensive laser diodes, and it is not uncommon to have additional safeguards inside the laser head under control of the laser driver, such as a shorting relay that keeps the laser diode shorted when the driver is disconnected or disabled.
The focus of this work lies on two types of laser diodes, indium gallium nitride (InGaN) and aluminium gallium arsenide (AlGaAs), but is not limited to those two. The former material is, for example, used for blue laser diodes at around \qty{450}{\nm}, discussed in the previous section, and for laser diodes up to green wavelengths; the latter is used for near-infrared (NIR) laser diodes such as \qty{780}{\nm} laser diodes. Both wavelengths are used for experiments in the Atoms-Photons-Quanta group (in the following referred to as \textit{the group}). The former type is used in the ARTEMIS experiment for the spectroscopy of highly charged ions, the latter is extensively used to manipulate and control the rubidium atoms used by the quantum computing experiments. This section deals with the design challenges of such a device used for high precision laser spectroscopy. First the design requirements are established and, from those conditions, technical specifications are developed.
The design requirements are split into three parts. The ambient environment covers effects like temperature, humidity and time; that section focuses mostly on the ambient temperature because its effects are the most pronounced. The electrical requirements of the current source, like drift, noise, output impedance, and modulation bandwidth, are discussed next, as these have a profound impact on the intended application in experiments. Finally, the user interface, including the external communication interfaces, is defined.
\subsection{Design Goals: Ambient Environment}%
\label{sec:design_goal_environment}
The lasers and their accompanying drivers will mostly be used in a clean laboratory environment. In this particular use case the air is filtered using H14 HEPA filters, but less rigorously controlled environments must be considered as well, because not all fields of application are in optical labs. A mostly dust-free industrial environment is considered acceptable as well. Typical lab temperatures are in the range of \qtyrange[range-phrase=\textup{~to~}]{20}{30}{\celsius}. This range was also encountered in the labs discussed here before improvements were implemented as part of this work. The upper end of the range must be considered when operating the devices inside a rack, where temperatures are even higher, and the device should therefore be tested for its upper limit. A temperature of \qty{35}{\celsius} is a typical value measured inside the racks used in the lab. Humidity is only controlled with dehumidifiers, limiting only the upper bound, resulting in a range of \qtyrange[range-phrase=\textup{~to~}]{15}{60}{\percent rH}.
Figure \ref{fig:lab_temperature_start_of_project} shows a typical one day span of the lab temperature as it was found at the start of this project, plotted using Matplotlib \cite{matplotlib}.
\begin{figure}[ht]
\centering
\input{images/temperature_011_2016.pgf}% data/plot_generic.py
\caption{Temperature in lab 011 of the APQ group on 2016-11-26. Recorded by the LabKraken monitor. See section \ref{sec:res_labkraken} for details.}
\label{fig:lab_temperature_start_of_project}
\end{figure}
As can be seen, there are strong oscillations of the temperature around the setpoint of \qty{21}{\celsius} as a result of the on–off air conditioning temperature controller. The commercial controller initially installed used an IMI Heimeier \device{EMO T} thermoelectric actuator \cite{datasheet_heimeier_emo_t}, which is a two-step valve. Although this solution was later replaced by the custom design described in section \ref{sec:lab_temp_control}, these types of controllers are found in many other labs, and temperature swings of \qty{2}{\kelvin} must therefore be expected.
These environmental parameters can now be used to estimate the design requirements for the laser driver. Of the laser systems used in this group, the \qty{450}{\nm} system \cite{thesis_baus}, required for the spectroscopy of highly charged ions \cite{thesis_alex} at GSI, is the more demanding one. It was found to be more susceptible to changes of the drive current, since its wavelength-selective filter element is far broader than that of a \qty{780}{\nm} system \cite{two_filter_paper}. This laser is stable over regions of tens of \unit{\uA} and requires a maximum drive current of \qty{145}{\mA} \cite{datasheet_osram_pl450b}.
From these considerations, the requirements for the driver can be inferred. It should be able to supply at least \qty{150}{\mA} and stay well within \qty{10}{\uA} over the whole environmental range. For a worst-case scenario, a tolerance of $3\sigma$ (\qty{99.7}{\percent}) must be met \cite{worst_case_design}.
The environmental parameters that most affect current sources are temperature and humidity. Air pressure is typically a concern for high voltage systems \cite{IPC-2221B} and of secondary importance for this design, as it is a low voltage system (\qty{<= 48}{\V}). Air pressure effects are also the most expensive to test for, as a pressure chamber is required. Humidity affects electronics both directly through corrosion and indirectly because the epoxy resin used in FR-4 PCBs and component moulding is hygroscopic; the absorbed humidity leads to swelling and mechanical stress. This effect is very slow at ambient temperature and can easily take days to show \cite{epoxy_humidity}. This parameter is therefore handled via the long-term stability and not specified separately.
Given these environmental conditions, the relative coefficients can be calculated. This estimation assumes a minimum setpoint resolution of 2 steps within the mode-hop-free region of the laser and calculates the \qty{99.7}{\percent} confidence interval. The steps are given in table \ref{tab:dgdrive_tempco}:
\begin{table}[hb]
\centering
\begin{tabular}{llr}
\toprule
Property& Value& Result \\
\midrule
Stable range & \qty{10}{\uA}& \qty{10}{\uA}\\
2 steps of resolution & $\div 2$& \qty{5}{\uA} \\
$1 \sigma$ & $\div 2$& \qty{2.5}{\uA} \\
Maximum output& \qty{150}{\mA}& \qty{17}{\uA \per \A}\\
Temperature range& \qty{5}{\K}& \qty{3}{\uA \per \A \per \K}\\
Worst case ($3 \sigma$)& $\div 3$& \qty{1}{\uA \per \A \per \K}\\
\bottomrule
\end{tabular}
\caption{Estimated requirement for the temperature coefficient of the laser driver.}
\label{tab:dgdrive_tempco}
\end{table}
While the requirements look moderate at first sight, the quick estimation shown in table \ref{tab:dgdrive_tempco} leads to a temperature coefficient of \qty[per-mode = symbol]{1}{\uA \per \A \per \K}, or even tighter for a higher-output driver -- a rather formidable specification for a current source.
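The chain of divisions in table \ref{tab:dgdrive_tempco} can be reproduced in a few lines; this is merely a sketch of the estimate, using the values stated above:

```python
stable_range = 10e-6              # A: current range over which the laser stays mode-hop free
setpoint_step = stable_range / 2  # require 2 steps of setpoint resolution in that range
one_sigma = setpoint_step / 2     # interpret the step as a 2-sigma bound
relative = one_sigma / 150e-3     # normalise to the 150 mA maximum output current
per_kelvin = relative / 5.0       # spread over the 5 K ambient temperature range
worst_case = per_kelvin / 3.0     # 3-sigma (99.7 %) worst case
print(f"{worst_case * 1e6:.1f} uA/A/K")  # about 1 uA/A/K
```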
Regarding the long-term stability, a \qty{30}{\day} number can be estimated. One may be inclined to call for a drift smaller than the stable range, but this would be short-sighted, as there are other factors to consider. The laser, including the external resonator, has its own figure of merit regarding the spectral drift rate. \citeauthor{ecdl_stability} \cite{ecdl_stability} reported a drift of \qty{2.9}{\MHz \per \hour}, attributed either to the external resonator itself, the piezo or the collimation lens. Most likely this drift was caused by mechanical changes of the external resonator, as it defines the output mode of the laser. The mechanical drift considerably relaxes the required stability of the current source, as a typical frequency change of the internal resonator with current of \qty[per-mode=symbol]{3}{\MHz \per \micro \A} \cite{diodelaser_modulation} can be assumed. The (linear) ageing drift of the external resonator over \qty{30}{\day} is then equivalent to a \qty{720}{\uA} current drift over the same period. For the electronics, the drift is assumed to follow an Arrhenius-like equation resulting from stress caused during manufacturing; it may eventually change to a slow linear drift after several months of relaxation. The coefficient can be either positive or negative and leads to the estimate in table \ref{tab:dgdrive_stability}:
\begin{table}[hb]
\centering
\begin{tabular}{llr}
\toprule
Property& Value& Result \\
\midrule
Ageing drift limit & \qty{720}{\uA}& \qty{720}{\uA}\\
$1 \sigma$ & $\div 2$& \qty{360}{\uA} \\
Maximum output& \qty{500}{\mA}& \qty{720}{\uA \per \A}\\
Worst case ($3 \sigma$)& $\div 3$& \qty{240}{\uA \per \A}\\
\bottomrule
\end{tabular}
\caption{Estimated requirement for the long-term stability over \qty{30}{\day} of the laser driver.}
\label{tab:dgdrive_stability}
\end{table}
Based on these numbers, it is straightforward to see that the long-term stability of a laser driver is less important than the short-term temperature coefficient, since the limiting factor is the mechanical construction of the laser. This necessitates an atomic reference for long-term stability and to compensate for acoustic resonances of the external resonator. Regarding the choice of suitable devices, the tight specification of the temperature coefficient will most likely lead to components that pass these long-term criteria as well, somewhat alleviating the burden of proof, since long-term drift specifications are hard to come by: they take a lot of time to validate and cannot be extrapolated from high temperature burn-in tests \cite{voltage_reference_drift}.
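The long-term figure in table \ref{tab:dgdrive_stability} follows the same pattern as the temperature-coefficient estimate, starting from the reported mechanical drift of the external resonator; again a sketch with the values taken from the text above:

```python
mechanical_drift = 2.9e6 * 24 * 30  # Hz over 30 days from the 2.9 MHz/h drift rate
tuning_coefficient = 3e6 / 1e-6     # Hz per A: ~3 MHz frequency change per uA of current
equivalent_current = mechanical_drift / tuning_coefficient  # ~700 uA, rounded to 720 uA
ageing_limit = 720e-6               # A: the rounded value used in the table
one_sigma = ageing_limit / 2        # 1-sigma bound
relative = one_sigma / 500e-3       # normalise to the 500 mA maximum output
worst_case = relative / 3.0         # 3-sigma worst case
print(f"{worst_case * 1e6:.0f} uA/A over 30 days")  # 240 uA/A
```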
%While \qty{800}{\uA \per \A} over a \qty{30}{\day} period may seem large at first, it is actually very hard to accurately produce a current. To put this number in perspective, a commercial high-end current source like the Keithley \device{2600B} is specified for the \qty{100}{\mA} range at about \qty{170}{\uA \per \A} for a \qty{30}{\day} period when calculated from the 1-year specification \cite{datasheet_keithley2600}, again assuming an Arrhenius-like equation as the basis.
All of this leads to the following design specifications regarding the stability of the current driver:
\begin{center}
\begin{specifications}[label={lst:dgDrive_specs_environment}]{Current source, environmental}
\begin{itemize}
\item Temperature range \qtyrange[text-series-to-math, reset-text-series = false, reset-math-version = false, range-phrase=\textup{~to~}]{20}{35}{\celsius}
\item \textbf{Temperature coefficient \qty[text-series-to-math, reset-text-series = false, reset-math-version = false]{<= 1}{\uA \per \A \per \K}}
\item Humidity (non-condensing) \qty{<= 75}{\percent rH}
\item Humidity coefficient not specified, but included in the long-term drift
\item Maximum altitude not specified
\item Long-term drift over \qty{30}{\day} \qty{<= 240}{\uA \per \A}
\end{itemize}
\end{specifications}
\end{center}
%A basic laser current driver design that has some of the can be found in the work of \citeauthor{libbrecht_hall} \cite{libbrecht_hall}. While this design contains all the basic features, like a current source, a modulation input and a voltage limit, there are several shortcomings that have emerged over the years with new generations of laser didoes. The laser driver used by legacy applications in this group is based on the aforementioned paper and has been successfully employed in several projects over the years, but several limitations have come up in recent years. In order to derive the design requirements of a new generation of laser drivers the important design elements need to be identified first. The essential design elements are the bulk current source, the modulation current source, the reference element and output programming. The next 4 sections will deal with each element and outline the design goals that were identified while employing the legacy generation of diode drivers in several experiments.
\subsection{Design Goals: Current Source}
% https://www.laserdiodecontrol.com/laser-diode-parameter-overview
% Diode Lasers and Photonic Integrated Circuits (Characteristic temperature)
% Near Threshold Operation of Semiconductor Lasers and Resonant-Type Laser Amplifiers
The change in output current caused by the load impedance should be an order of magnitude less than the drift specification to ensure a negligible effect compared to the drift over time. The load resistance presented by the laser diodes most commonly used in our experiments ranges from \qty{50}{\ohm} \cite{datasheet_osram_pl450b} to \qty{30}{\ohm} \cite{datasheet_adl_785} and \qtyrange{10}{15}{\ohm} for \qty{780}{\nm} laser diodes \cite{datasheet_sharp_780nm,datasheet_thorlabs_780nm}. The output impedance requirement can therefore be estimated as
\begin{align}
\frac{R_{load}}{R_{out}} &= \frac{I_{set}}{I_{out}} - 1 \leq \qty[per-mode = symbol]{6.7}{\uA \per \A} \nonumber\\
R_{out} &\geq \frac{\qty{50}{\ohm}}{\qty[per-mode = symbol]{6.7}{\uA \per \A}} = \qty{7.5}{\mega \ohm} \, .
\end{align}
An output impedance of more than \qty{7.5}{\mega \ohm} for slowly changing loads is a tough requirement, depending on the type of current source, and requires carefully selected components. A high output impedance is, for example, important to suppress radiated noise coming from external sources. Especially low frequency components from the mains supply can magnetically couple into the cables because of their length. This noise can be substantial, and a high output impedance at low frequencies is therefore important. Other applications will be discussed throughout this work. While a subpar output impedance is more of a limiting factor, the compliance voltage discussed next is a key requirement.
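Spelled out numerically for the worst-case load, the \qty{50}{\ohm} blue laser diode, the estimate above reads:

```python
r_load = 50.0           # ohm: worst-case load resistance (blue laser diode)
allowed_error = 6.7e-6  # A/A: allowed relative deviation of the output current
r_out_min = r_load / allowed_error  # minimum output impedance of the source
print(f"R_out >= {r_out_min / 1e6:.1f} MOhm")  # 7.5 MOhm
```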
The compliance voltage is the maximum voltage the current source can apply to the load and is another non-ideal component of a real current source. The required voltage strongly depends on the type of laser diode used. The near-infrared laser diodes discussed above have an operating voltage of \qtyrange{1.5}{3}{\V}, while the Osram \device{PL 450B} blue laser diode is specified for \qtyrange{5.5}{7}{\V}. The \qty{7}{\V} required by the Osram laser diode is fairly high for a Fabry–Pérot laser diode and has proven difficult in the past \cite{thesis_baus} as most laser current drivers available are designed for the much lower forward voltage of the near infrared laser diodes. Even higher voltages of around \qtyrange{12}{15}{\V} are required for quantum cascade lasers, but these are currently neither used nor is their use planned in any experiment in the group.
The maximum output current of the laser driver currently required for laser diodes used in the group is \qty{250}{\mA} for the Thorlabs \device{L785H1} \cite{datasheet_thorlabs_780nm}. Therefore a maximum output current of \qty{300}{\mA} is considered sufficient.
The current noise of the laser driver can be estimated from the laser linewidth sought after as the laser frequency is sensitive to the injection current. At low frequencies, about \qty[per-mode=symbol]{-3}{\MHz \per \micro \A} can be attributed to the thermal expansion of the internal resonator of the diode due to resistive heating \cite{diodelaser_modulation}. Above \qty{1}{\MHz} this effect starts declining and exposes the change of the refractive index due to the presence of charge carriers. This high frequency effect is an order of magnitude weaker. Since the frequency sensitivity to current variations of the laser diode drops with higher frequencies, the most important range is from DC to \qty{100}{\kHz}.
To estimate the linewidth requirement, it is important to look at the experimental setup. While the spectroscopy of \ce{Ar^{13+}} at \qty{4}{\K} is limited to around \qty{150}{\MHz} as shown on page \pageref{eqn:doppler_broadening}, the quantum computing experiments in this group have more stringent needs. It was shown in \cite{ecdl_stability, ecdl_silicone_housing,ecdl_linewidth_scholten} that with reasonable expense a passive linewidth of less than \qty{100}{\kHz} can be achieved. Using the frequency sensitivity of laser diodes to current modulation, \qty{100}{\kHz} translates to a current noise of \qty{30}{\nA_{rms}} from \qty{1}{\Hz} to \qty{100}{\kHz}. The lower \qty{1}{\Hz} limit is chosen fairly arbitrarily, but the presence of $\frac 1 f$-noise inhibits a definition down to DC. There should, however, be negligible amounts of noise below \qty{1}{\Hz} compared to the upper \qty{100}{\kHz}.
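The conversion from the linewidth budget to the current noise specification follows directly from the current tuning coefficient quoted above; as a sketch:

```python
target_linewidth = 100e3         # Hz: passive linewidth budget for the laser
tuning_coefficient = 3e6 / 1e-6  # Hz per A: ~3 MHz per uA at low frequencies
current_noise = target_linewidth / tuning_coefficient  # allowed rms current noise
print(f"allowed current noise: {current_noise * 1e9:.0f} nA rms")  # ~33 nA, rounded to 30 nA
```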
The final aspect of the current source that needs to be specified is the bandwidth of the current steering input. The bandwidth in these terms defines a reasonably flat (\qty{\leq 3}{\dB}) response. As discussed above, beyond a frequency of \qty{1}{\MHz} the frequency sensitivity of the laser diode to current modulation drops by an order of magnitude, altering the transfer function and introducing new challenges for control loops. Therefore a minimum bandwidth of \qty{1}{\MHz} is considered sufficient.
Above \qty{1}{\MHz} it is recommended to use dedicated solutions such as the direct modulation at the laser head presented in \cite{current_mod_paper}, or to switch to acousto-optic modulators (AOMs) or electro-optic modulators (EOMs).
This leads to the following requirements regarding the current source of the laser driver:
\begin{center}
\begin{specifications}[label={lst:dgDrive_specs_electrical}]{Current source, electrical}
\begin{itemize}
\item Maximum output current \qty{300}{\mA}, optionally \qty{500}{\mA}
\item \textbf{Compliance voltage \qty[text-series-to-math, reset-text-series = false, reset-math-version = false]{\geq 8}{\V}}
\item Output impedance \qty{\geq 7.5}{\mega\ohm} at low frequencies (close to DC)
\item \textbf{Current noise \qty[text-series-to-math, reset-text-series = false, reset-math-version = false]{\leq 30}{\nA_{rms}} from DC to \qty[text-series-to-math, reset-text-series = false, reset-math-version = false]{100}{\kHz}}
\item \qty{3}{\dB}-bandwidth of the modulation source \qty{\geq 1}{\MHz}
\end{itemize}
\end{specifications}
\end{center}
\subsection{Design Goals: User Interface and Form Factor}
The user interface must allow repeatability and reproducibility of the outputs, because the laser system is intended to be portable: it will be moved from the university, where it is performance tested, to the GSI facility. Within the labs, systems are usually moved from test stands to the actual experiment as well. Requiring as little setup effort as possible is a big advantage.
The interface must be accessible both locally and remotely to allow simple adjustment of the parameters while on the bench as well as from within the experimental control software. The local controls must be directly accessible to humans without tools to give a better user experience.
The remote user interface is strictly required because the Penning trap and the laser system are spatially separated with the laser system being located in a special laser lab for environmental as well as safety reasons. This separation is about \qty{30}{\meter}. Ideally this remote interface is computer controlled to give full access to all features of the laser system. USB or Ethernet is preferred as this does not require extra hardware in the lab.
Regarding the application programming interface (API), support for both Python and optionally LabVIEW is favoured, as most of the group has switched from LabVIEW to labscript suite \cite{labscript_2013} on Python to run the experiments.
The form factor should allow integration into standard 19-inch racks to allow simple transportation from the experiment location at GSI to the university for testing and calibration.
\begin{center}
\begin{specifications}[label={lst:dgDrive_specs_api}]{Current source, user interface}
\begin{itemize}
\item Local control via the front end without tools
\item Remote access via a digital interface
\item Software API supporting \textbf{Python} and optionally LabVIEW
\end{itemize}
\end{specifications}
\end{center}
\clearpage
\section{Laser Temperature Controller}%
\label{sec:laser_temperatrure_controller}
The external cavity diode laser (ECDL) design employed at GSI and in this group, based on \cite{ecdl_paris}, consists of two parts: the laser diode, mounted in an aluminium frame containing a collimator, which in turn is mounted in an external resonator also made of aluminium. The aluminium used for the external resonator is AlZn4.5Mg1, also called alloy 7020 \cite{datasheet_laser_alu}. It has a moderate coefficient of thermal expansion of \qty{23.1}{\micro\meter \per \m \per \K}, which is one order of magnitude larger than that of Invar, but it is significantly easier to machine.
In order to derive the required stability criteria, both the laser diode and the external resonator must be considered. The influence of external parameters on the laser wavelength was discussed in the work of \citeauthor{thesis_tilman} \cite{thesis_tilman}. The temperature sensitivities of a typical near-infrared laser diode at \qty{780}{\nm} and of the external resonator used in this group were calculated to be
\begin{align*}
K_{T,diode} &\approx \qty{-3}{\GHz \per \K}\\
K_{T,resonator} &\approx \qty{-9}{\GHz \per \K}\,.
\end{align*}
From these numbers it is clear that the resonator dominates the temperature sensitivity. For a blue \qty{441}{\nm} laser this criterion is even more critical: $K_{T,resonator}$ is proportional to the laser frequency, which almost doubles, leading to a sensitivity of the resonator on the order of
\begin{equation}
K_{T,resonator} \approx \qty{-16}{\GHz \per \K}\,.
\end{equation}
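As a quick sanity check, this value can be reproduced by scaling the \qty{780}{\nm} sensitivity with the ratio of the optical frequencies:

```python
# K_T,resonator is proportional to the optical frequency, i.e. inversely
# proportional to the wavelength, so the 780 nm value scales directly.
K_780 = -9e9                # sensitivity at 780 nm in Hz/K (from above)
K_441 = K_780 * 780 / 441   # frequency ratio nu_441 / nu_780 = 780 / 441
print(f"{K_441 / 1e9:.1f} GHz/K")
```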
This implies that in order to match the stability of the laser current driver, the temperature stability should be far better than \qty{1}{\milli \K}. A temperature stability of better than \qty{100}{\micro \K} has been demonstrated before \cite{tempcontroller_10uK,tempcontroller_10uK_jw,tempcontroller_15uK,tempcontroller_30uK,tempcontroller_40uK,tempcontroller_50uK,tempcontroller_65uK}, but all of these solutions have in common that they use either multiple layers of shielding and control or elaborate baths into which the subject is submerged. The controller itself is then typically placed inside the controlled environment to shield it from external effects. Such a setup is not feasible in this situation, as it would require a considerable redesign of the laser resonator. The laser resonators in use in this group \cite{thesis_tilman} have been set up over the course of several years and there are dozens of them in use. This existing design must therefore be taken into consideration as well. The resonator in its current state does not have an airtight seal. The sensitivity of the laser frequency to the barometric pressure can be estimated using the formula developed by \citeauthor{ciddor} \cite{ciddor} to be
\begin{equation}
K_{baro} = \qty{-75}{\MHz \per \hecto \Pa}\,.
\end{equation}
This leads to a frequency drift of several hundred \unit{\MHz} due to a pressure drift of around \qty{\pm 10}{\hecto \Pa} observed in the lab over a typical day. A long-term drift of \qty{55}{\hecto \Pa} over the year 2022 was also recorded by the monitoring software LabKraken. On shorter time scales the air pressure varies on the order of tens of \unit{\Pa}, and the temperature controller only needs to keep temperature-induced drifts below this pressure-induced level. It is therefore sufficient to call for a stability of \qty{<1}{\milli \K} when using an unsealed resonator. To guarantee such a stability, the resolution of the driver should be at least \qty{200}{\micro \K}, preferably \qty{100}{\micro \K}.
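To see how the \qty{<1}{\milli \K} figure follows from the short-term pressure fluctuations, a small estimate using the sensitivities quoted above helps; the \qty{30}{\Pa} short-term variation assumed here is merely a representative value for "tens of \unit{\Pa}":

```python
# Equivalent temperature stability needed so that temperature-induced
# frequency drifts stay below the short-term pressure-induced drifts.
K_baro = -75e6 / 100     # -75 MHz/hPa expressed in Hz/Pa
K_temp = -16e9           # resonator sensitivity at 441 nm in Hz/K

dp = 30                  # assumed short-term pressure variation in Pa
df = K_baro * dp         # pressure-induced frequency shift in Hz
dT = df / K_temp         # temperature offset causing the same shift
print(f"{df / 1e6:.1f} MHz, equivalent to {dT * 1e3:.2f} mK")
```

A shift of tens of \unit{\MHz} corresponds to a temperature offset on the order of \qty{1}{\milli \K}, which motivates the stated requirement.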
The type of temperature transducer used in the laser design is a \qty{10}{\kilo \ohm} thermistor, so the design must work with this type of sensor, while the support of other sensors like a PT100 is optional.
Finally, a problem often encountered with analog proportional–integral–derivative (PID) controllers when temperature controlling large, well-insulated bodies is the long time scales involved. One example found in the lab is the high-finesse cavities mounted in vacuum enclosures. These extremely stable cavities are extensively used to reduce the linewidth of lasers to a few \unit{\Hz}. The time scales involved necessitate very long integration times $T_i$ or a rather small gain $k_i$ of the integral term of the PID controller. See section \ref{sec:pid_controller_basics} for details on PID controllers and the terminology. An illustration of the problem encountered with an analog controller can be seen in figure \ref{fig:stability_cavity}. It shows the temperature of a Stable Laser Systems \device{VH 6020} cavity housing used for a high-finesse Fabry-Pérot cavity, which has a time constant of \qtyrange[range-units = single, range-phrase={~to~}]{4}{7}{\hour} \cite{datasheet_vh6020}.
\begin{figure}[ht]
\centering
\input{images/stability_cavity.pgf}% data/plot_generic.py
\caption{Temperature of a Stable Laser Systems \device{VH 6020} controlled by a Team Wavelength \device{HTC1500} temperature controller.}
\label{fig:stability_cavity}
\end{figure}
The Team Wavelength \device{HTC1500} used in this example is an analog PID controller configured for the maximum specified integration time of \qty{10}{\s} using a \qty{10}{\uF} capacitor. As the oscillatory behaviour shows, this time constant is far too short. Longer time scales can easily be reached by digital controllers, which allow very long integration times limited only by the numerical resolution. A digital system can extend the scope of application from lasers to many other systems, like those cavities, to improve their stability. In addition, a digital system gives more control over the PID tuning parameters and ensures repeatability, which greatly simplifies setting up new laser systems because a common set of PID parameters can be used as a starting point before tuning the controller. Another benefit is the possibility to implement a modified algorithm and additional filters as detailed in section \ref{sec:pid_controller_basics}. This versatility to quickly adapt the programming further increases the number of applications. Integrating autotuning algorithms to help the user find a usable set of parameters reduces setup times of new systems. For all of these reasons, the new controller should be based on a digital design.
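To illustrate why long integration times are trivial in a digital implementation, a minimal discrete-time PI controller can be sketched as follows; the gains, limits and the first-order plant model are purely illustrative and not the parameters of the actual controller:

```python
class DigitalPI:
    """Minimal discrete-time PI controller with integral clamping."""

    def __init__(self, kp, ki, dt, output_limits=(-1.0, 1.0)):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.low, self.high = output_limits
        self.integral = 0.0

    def update(self, setpoint, process_value):
        error = setpoint - process_value
        # a 64-bit float accumulator supports integration times of hours
        # or days, far beyond what an analog capacitor can provide
        self.integral += self.ki * error * self.dt
        # clamp the integral term (anti-windup) as well as the output
        self.integral = max(self.low, min(self.high, self.integral))
        return max(self.low, min(self.high, self.kp * error + self.integral))


# drive a toy first-order plant towards a 25 degC setpoint
controller = DigitalPI(kp=0.5, ki=0.05, dt=1.0)
temperature = 20.0
for _ in range(500):
    power = controller.update(25.0, temperature)
    temperature += 0.1 * (10.0 * power - (temperature - 20.0))
print(round(temperature, 2))
```

The integration time here is set purely by the numbers $k_i$ and $\Delta t$, in contrast to the \qty{10}{\uF} capacitor limiting the analog controller above.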
The final aspect to be considered is the output driver of the controller. The laser design used in this group employs two Peltier elements to cool the resonator and the laser diode independently. The driver must therefore integrate two channels. While the biggest thermoelectric coolers (TECs) currently in use, Laird \device{CP14,127,06,L1,W4.5}, can draw up to \qty{6}{\A} at \qty{15.4}{\V} \cite{datasheet_tec}, their optimal coefficient of performance lies between \qtyrange[range-units = single]{1}{2}{\A} and \qtyrange[range-units = single]{5}{8}{\V}. A driver that can output \qty{4}{\A} at \qty{12}{\V} is considered more than sufficient, even for larger TECs and future projects.
Commercial temperature controllers specifying a stability of better than \qty{1}{\milli \kelvin} are hard to come by, especially with multiple channels. Two units were tested for this laser setup: the Vescent \device{SLICE-QTC} and an ILX Lightwave \device{LDT-5948}. The latter is specified for a stability of \qty{5}{\milli \kelvin}, but their application note claims a better performance \cite{appnote_ilx_millikelvin}.
The requirements for the temperature controller can be summarised as:
\begin{center}
\begin{specifications}[label={lst:dgTemp_requirements}]{Temperature controller}
\begin{itemize}
\item Stability: \qty{<1}{\milli \K}
\item Resolution: \qty{<200}{\micro \K}, \qty{<100}{\micro \K} preferred
\item Temperature sensor: \qty{10}{\kilo \ohm} thermistor
\item Two or more channels
\item Output power: \qty{4}{\A} at \qty{12}{\V}
\item Digital interface to work with long time scales and reproducible PID parameters
\end{itemize}
\end{specifications}
\end{center}
\clearpage
\section{LabKraken}%
\label{sec:prep_labkraken}
\subsection{Design Goals}
LabKraken is designed to be an asynchronous, resilient data acquisition suite that scales to thousands of sensors and across different networks, serving the monitoring and automation needs of large scale experiments spanning multiple sites. It is written in Python and supports many sensors and instruments found in a scientific environment. These include devices supporting Standard Commands for Programmable Instruments (SCPI) accessible via Ethernet or GPIB, as well as sensors using a serial protocol. Many other Ethernet capable devices are also supported via a simple driver interface.
\subsection{Software Architecture}
To meet the increasing demand for high quality data, LabKraken needs to scale to thousands of sensors, which must be served concurrently. This problem is commonly referred to as the C10K problem, as dubbed by \citeauthor{10kProblem} back in 1999 \cite{10kProblem}, and refers to handling \num{10000} concurrent connections via network sockets. While servers today can handle millions of concurrent connections, handling \num{10000} can still be challenging, especially if the data sources are heterogeneous, as is typical for sensor networks composed of diverse sensors from different manufacturers.
In order to meet the design goals, an asynchronous architecture was chosen and several different approaches were implemented over time. All in all, four complete rewrites of the software were made to arrive at the architecture introduced here. The reason for the rewrites is mostly historical and can be explained by the development of the Python programming language, in which the code is written. The first version was written using Python 2.6 and exclusively supported sensors made by Tinkerforge. With the release of Python 3.5, which supported a new syntax for asynchronous coroutines, the software was rewritten from scratch to make use of it, because the new syntax made the code a lot more expressive and easier to follow. When Python 3.7 was released, asynchronous generator expressions were mature enough to be used in production and the program was again rewritten to use the new syntax. In 2021 a new approach was taken and the program was once more rewritten in a functional programming style. Some of those approaches will be discussed in the next sections to highlight the limits of each programming style and the improvements made to overcome them. This development underlines important steps in the progress of asynchronous programming made by the Python language in recent years that can be applied to many other problems, for example process control. This is especially relevant since Python is a very popular language among scientists and is used in many experiments. Each of the following sections discusses the same program, written in different programming styles to show the differences. Especially the last example, presenting a functional programming style, is interesting for experimental control as it gives a clean representation of the data flow from the producer to the consumer \cite{concurrent_programming}.
The example program that will be discussed does the following job: it opens a network connection to a remote (Tinkerforge) sensor platform, then queries the other side for its sensors. When the sensors are returned, it looks for a specific sensor, then starts reading data from that sensor and finally prints it. The example itself is designed around the Tinkerforge sensors to present a working example instead of the typically used pseudocode. It does represent a common program flow in a sensor application though, and the concept is not limited to the Tinkerforge programming API.
\subsubsection{Threaded Design}
The first version of LabKraken used a threaded design approach, because the original libraries of the Tinkerforge sensors are built around threads. Most threaded programs make extensive use of callbacks. These are functions that are passed from the main thread to the worker, typically on creation, and are then called by the worker to inform the main thread of its activity. The downside is that the main thread has no knowledge about the caller; the callback might even have been passed on by the worker to another thread.
\lstinputlisting[firstline=1, firstnumber=1, frame=single, breakindent=.5\textwidth, frame=single, breaklines=true, numbers=left, xleftmargin=2em, numberstyle=\tiny, style=mypython]{source/lab_kraken_threads.py}
The program starts at line \num{24} by making a connection to the host, then the first callback functions are registered with the connection object. These callbacks allow the connection thread to signal the program when the connection has been established (\textit{cb\_connected}) and when new sensors are found (\textit{cb\_enumerate}). The main program is now finished and waits until terminated by the user. All the work is done inside the thread and the program flow unfortunately loses itself in the callbacks, which get called by the connection object in an order that can only be guessed from the documentation, as the main program has no control over it. As the program continues, it first enters the \textit{cb\_connected} callback, where it queries the host for its sensors in line \num{9}. The answer will be returned through the \textit{cb\_enumerate} callback. This function filters the sensor id for a known sensor and then attaches another callback for the sensor to return data. It then configures the sensor. This program flow is typical for a callback driven design and the reader may imagine how more complex tasks are implemented. As the program grows, more and more layers of callbacks will be added and in the end the code will be impossible to read without intimate knowledge. The effort of maintaining the callback driven code resulted in the decision to redesign the program when moving to Python 3.
To untangle this problem, so-called generators can be used; Python has supported an asynchronous variant of them since version 3.6. A generator is a type of expression that produces values from an iterator. An iterator is a (possibly infinite) ordered series of values or events which can be processed by requesting the next value until the series is exhausted, if it is finite. The main advantage is that the logic of the program stays within the main part and only the data gathering is done outside of this scope. A generator based program is shown next.
\subsubsection{Generator Design}
In addition to a different coding style, the code base was moved from a multithreaded to an asynchronous approach using Python asyncio. The difference is that asyncio uses a single thread, as opposed to multithreaded code. Multiple threads can run concurrently on multiple processor cores, so different cores can process data at the same time. Asynchronous programs instead pause the execution of code paths because they run within a single thread on a single core. This type of programming works best when the tasks are not computationally intensive but input/output bound by external peripherals like a network. While the processor is waiting for the slower network, it can work on other tasks. The advantage is that access to shared resources is greatly simplified, as these resources will never be accessed at the same time. The code is shown below.
\lstinputlisting[firstline=4, firstnumber=1, frame=single, breakindent=.5\textwidth, frame=single, breaklines=true, numbers=left, xleftmargin=2em, numberstyle=\tiny, style=mypython]{source/lab_kraken_async.py}
The first impression to be gathered from the new design is that the code has become more concise. To understand it, a few of the Python language keywords used must be introduced. In order to yield control to the next task, the keyword \textit{await} is used, which is put in front of a function call. This pauses the current execution and waits until the function has returned with a result. Another important language feature used is a so-called context. A context is created using the \textit{async with} command and it makes sure that certain commands are executed after leaving the context. This can be used to clean up after the creation and use of certain objects like the Ethernet connection. The connection context makes sure that the Ethernet connection is properly terminated no matter whether the enclosed content was shut down gracefully or not. The iterator uses the \textit{async for} keyword and works asynchronously as well. It pauses the code until a new event can be produced.
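These three keywords can be illustrated with a small, self-contained example; the \textit{Connection} class below is a hypothetical stand-in, not the actual Tinkerforge API:

```python
import asyncio

# Hypothetical illustration of "await", "async with" and "async for".
class Connection:
    async def __aenter__(self):
        await asyncio.sleep(0)          # pretend to open a socket
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0)          # always runs: close the socket

    async def enumerate(self):
        for sensor_id in ("QE9", "ABC"):
            await asyncio.sleep(0)      # yield control while "waiting"
            yield sensor_id             # asynchronous generator


async def main():
    sensors = []
    async with Connection() as connection:          # guaranteed cleanup
        async for sensor_id in connection.enumerate():
            sensors.append(sensor_id)
    return sensors

found = asyncio.run(main())
print(found)  # ['QE9', 'ABC']
```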
The code starts at line \num{26} and runs the \textit{main()} function. This function first connects to the host by creating a context in line \num{15}, within which the \textit{connection} variable is usable. Using this connection, the sensor platform is queried in line \num{16}. In comparison to the previous example it is now far easier to follow the program flow, because the context and generator reveal what happens next. Unfortunately, reading the sensor still requires passing it to a new task, because the generator will keep generating more sensors in the meantime. Although the code is split into multiple tasks, the nesting of callbacks seen in the previous example is resolved and the readability of the code has improved tremendously.
The only remaining problem is the error handling, because these worker tasks do not communicate with the original task that created them. This can be solved using the so-called observer pattern, where an observer task watches the workers and handles such events. Directly implementing this pattern creates a myriad of events and event handler registrations. Missing one such event can break the whole program and leads to bugs that are hard to diagnose and fix. To simplify this pattern, a stream based approach can be applied. The observable is treated as a stream of events which is processed by the observer using a chain of operators and actions executed in a certain order. This is much like an assembly line where different tasks are executed as the product passes each station.
\subsubsection{Stream Design}
The previously mentioned observer pattern is often implemented using data streams representing the subjects observed while the consumers are the observers. Using functional programming style these data streams can be written in a very concise form as shown in the following version of the example program.
\lstinputlisting[firstline=5, firstnumber=1, frame=single, breakindent=.5\textwidth, frame=single, breaklines=true, numbers=left, xleftmargin=2em, numberstyle=\tiny, style=mypython]{source/lab_kraken_stream.py}
The program starts in line \num{14} and enters the \textit{main()} function. Here, a context is used again to open the connection to the sensor platform. The sensors are queried next and a stream is created to read the reply, then filter for the specified sensor which is then read and the result is printed.
Using this programming style, the intent of the program is revealed immediately, even before starting the stream. The syntax used was borrowed from the Python library \textit{aiostreams} \cite{aiostreams}, which is similar to ReactiveX \cite{reactivex}, a library developed by Microsoft to operate on data streams. The interesting aspect of this code is the use of the pipe operator, which injects the result of one function into the next function as its parameters. This way a chain of function calls is created. The \textit{lambda} keyword denotes a small anonymous function, but regular functions can also be used. In combination with an operator like \textit{filter}, \textit{switchmap}, or \textit{print}, they dictate the program flow, hence the name functional programming. These operators need some introduction though. The \textit{filter} operator is simple to understand, as it only passes on the input when the function, to which the input is passed as well, returns true. The \textit{switchmap} operator is more interesting. It is a combination of a \textit{map} operator and a \textit{switch} operator. The former applies a function to the input and then passes on the output of the function, in this case \textit{read\_temperature}, which creates an iterator. The latter operator takes its most recent input and iterates it, producing temperature values. It terminates the iteration when a new input arrives and then iterates the new input. This is handy as it automatically makes sure that there is only one reader per sensor; for example, new sensor configurations can be injected into the data stream above the \textit{switchmap}, which automatically replaces the old sensor reader.
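The pipe operator itself is easy to demystify with a toy sketch; the \textit{Stage} class below only mimics the idea of such chains and is not the actual aiostreams implementation:

```python
# Toy sketch of the pipe operator: "|" is overloaded so that each
# stage feeds its output into the next one, forming a chain.
class Stage:
    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # compose: run this stage first, then pipe the result onwards
        return Stage(lambda value: other.func(self.func(value)))

    def __call__(self, value):
        return self.func(value)


pipeline = (
    Stage(lambda x: x)                          # source (identity)
    | Stage(lambda x: [v for v in x if v > 0])  # filter-like stage
    | Stage(lambda x: [v * 2 for v in x])       # map-like stage
)
print(pipeline([-1, 2, 3]))  # [4, 6]
```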
This style of programming was found to be ideal for real-time data processing, as it allows configurations to be updated continuously and sensors, or even hosts, to be added and removed without having to worry about what happens along the pipeline.
\subsubsection{Device Identifiers}
Every sensor network needs device identifiers. Preferably, those identifiers should be unique. Typically, a device has some kind of internal identifier. Table \ref{tab:common_device_ids} shows a few examples of the sensors used in the author's network.
\begin{table}[ht]
\centering
\begin{tabularx}{0.95\textwidth}{|l|p{3cm}|X|}
\hline
Device Type& Identifiers& Example\\
\hline
GPIB (SCPI)& \textit{*IDN?}& \small{Keysight Technologies,34470A,MYXXXXXXXX,A.03.03-02.40-03.03-00.52-02-01} or\newline\small{Agilent Technologies,34410A,MYXXXXXXXX,A.03.03-02.40-03.03-00.52-02-01}\\
\hline
Tinkerforge& A base58 encoded integer device id& QE9 (163684)\\
\hline
LabNode& UUID & cc2f2159-e2fb-4ed9-8021-7771890b37ad\\
\hline
\end{tabularx}
\caption{Device identifiers used by common devices found in the lab. The serial number of the Keysight \device{34470A} DMM was obscured on purpose.}
\label{tab:common_device_ids}
\end{table}
As can be seen in table \ref{tab:common_device_ids}, most of these identifiers are not guaranteed to uniquely identify a device within a network. The Tinkerforge id is the weakest, as it is a \qty{32}{\bit} integer (\num{4294967296} possible values), which might easily collide with another id from a different manufacturer. For better readability, the id is typically presented as a base58 encoded string. An encoder/decoder example can be found in the TinkerforgeAsync library \cite{TinkerforgeAsync}.
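A minimal encoder/decoder pair illustrating the scheme could look as follows; the alphabet shown reproduces the \textit{QE9} example from table \ref{tab:common_device_ids} and omits the easily confused characters 0, O, I and l:

```python
# Base58 alphabet without the visually ambiguous characters 0, O, I, l.
ALPHABET = "123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"

def base58_encode(value: int) -> str:
    if value == 0:
        return ALPHABET[0]
    digits = []
    while value:
        value, remainder = divmod(value, 58)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))   # most significant digit first

def base58_decode(encoded: str) -> int:
    value = 0
    for char in encoded:
        value = value * 58 + ALPHABET.index(char)
    return value

print(base58_encode(163684))   # QE9
print(base58_decode("QE9"))    # 163684
```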
The id string returned by a SCPI device is slightly more useful, but again does not guarantee uniqueness. As per the SCPI specification, it returns a string containing \textit{\$manufacturer,\$name,\$serial,\$revision}. Even when ignoring the software revision part, which might change on update, the same device might return a different id depending on its settings. The two id strings shown in table \ref{tab:common_device_ids} relate to the same device, but the latter was produced with a compatibility flag set in the settings.
The only reasonably unique id is used by the LabNodes. The universally unique identifier (UUID), or globally unique identifier (GUID) as dubbed by Microsoft, can be used for networks with participant numbers going into the millions. There are several versions defined in RFC 4122 \cite{rfc_uuid} and the LabNodes use version 4, which is a random \qty{128}{\bit} identifier with \qty{122}{\bit} of entropy. Of the remaining \qty{6}{\bit}, \qty{4}{\bit} are reserved for the UUID version and \qty{2}{\bit} for the variant. Its usefulness as a unique id can be shown as follows.
Calculating the chance of a collision between two random UUIDs is called the birthday problem \cite{BirthdayProblem} in probability theory. The probability of at least one collision in $n$ devices out of $M = 2^{122}$ possibilities can be calculated as follows:
\begin{align}
p(n) &= 1 - 1 \cdot \left(1 - \frac{1}{M}\right) \cdot \left(1 - \frac{2}{M}\right) \dots \left(1 - \frac{n-1}{M}\right) \nonumber\\
&= 1 - \prod_{k=1}^{n-1} \left(1 - \frac{k}{M} \right)
\end{align}
Using the Taylor expansion $e^x = 1 + x + \dots$ and assuming $n \ll M$, this can be approximated as:
\begin{align}
p(n) &\approx 1 - \left(e^\frac{-1}{M} \cdot e^\frac{-2}{M} \dots e^\frac{-(n-1)}{M} \right) \nonumber\\
&\approx 1 - \left(e^\frac{-n(n-1)/2}{M} \right) \nonumber\\
&\approx 1 - \left(1 - \frac{n^2}{2 M} \right) = \frac{n^2}{2 M}
\end{align}
For one million devices using random UUIDs, this gives a probability of about \num{1e-25}, which is negligible.
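The approximation can be evaluated directly:

```python
# Collision probability of n random version-4 UUIDs using the
# birthday-problem approximation p(n) ~ n^2 / (2 M) for n << M.
M = 2**122                      # number of distinct version-4 UUIDs

def p_approx(n: int) -> float:
    return n * n / (2 * M)

print(f"{p_approx(1_000_000):.1e}")
```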
In the LabKraken implementation, all devices, except for the LabNodes which already have a UUID, are mapped to UUIDs using the underlying configuration database. It is up to the user to guarantee the uniqueness of the non-UUID ids reported by the devices so that the mapping is unambiguous. These UUIDs can then be used to address and configure each device on the sensor network.
%\subsubsection{Ethernet Bus and Synchronous Buses}
%There are inherent challenges involved with the Ethernet bus for instrumentation. The Ethernet bus is intrinsically asynchronous and multiple controllers can talk to the device at the same time. Not only that, but different processes within the same controller can talk to the same device. This makes deterministic statements about the device state challenging. A device that is not designed to work asynchronously in the first place may have trouble with multiple requests coming in from different clients. This must be kept in mind when using serial adapters like USB or GPIB to Ethernet.
%While it is impossible to rule out the possibility of multiple controllers on a network, care was taken to synchronize the workers within Kraken.
%\subsection{Databases}
%\subsubsection{Cardinality}
%\begin{itemize}
% \item TimescaleDB vs Influx
% \item Example Sensors vs. Experiment
%\end{itemize}
\clearpage
\section{Short Introduction to Control Theory}
This section gives a very brief introduction to some basic concepts of control theory. Many systems require control over one or more process variables: examples are the temperature control of a room or a device, or even the creation of a programmable current from a voltage. All of these require control over a process, which is established through feedback, allowing a controller to be aware of the state of the system.
The focus of this section is narrowed down to the concept of feedback and control with regard to developing and understanding PID controllers for temperature control. Simpler feedback loops, like those typically used around op-amps, are not the primary concern of this section and are discussed in the relevant parts of the documentation. In the following sections, first the general properties of the Laplace transform and useful relationships are introduced, then a model for the system and its controller is developed, and finally, based on this model, the choice of the control parameters with different tuning algorithms is discussed.
\subsection{Introduction to the Transfer Function and the Laplace Domain}%
\label{sec:transfer_function}
There are two types of control systems: open- and closed-loop systems. A system is called open loop, if the output of a system does not feed back to its input as in figure \ref{fig:open_loop}. On the other hand, if the output influences the input of the system via feedback, it is called a closed-loop system, as shown in figure \ref{fig:closed_loop}. Although feedback can be treated in static systems, it is more useful to treat it in dynamic systems, either in the time-domain or the frequency-domain. To discuss these systems, the terminology used in the following section needs to be defined. $G(s)$ is called the transfer function of the system, while $U(s)$ is the input, $Y(s)$ is the output, $s$ is a complex frequency domain variable, $\beta$ is the feedback parameter, also called feedback fraction, as shown in figure \ref{fig:closed_loop}. In this section, upper case letters are used to denote functions in the Laplace domain, while lower case letters are referring to functions in the time domain. Normally, the transfer function is denoted $H(s)$ but to prevent confusion with the Heaviside function $H(t)$, the letter $G$ is used here for the transfer function. In later chapters the common form $H(s)$ is used.
\begin{figure}[ht]
\centering
\begin{subfigure}{0.4\linewidth}
\centering
\import{figures/}{open_loop.tex}
\caption{Open-loop system.}
\label{fig:open_loop}
\end{subfigure}
\begin{subfigure}{0.4\linewidth}
\centering
\import{figures/}{closed_loop.tex}
\caption{Closed-loop system.}
\label{fig:closed_loop}
\end{subfigure}
\caption{Block diagram of a closed- and an open-loop control system.}
\label{fig:feedback_systems}
\end{figure}
It is convenient to express the transfer function via its Laplace transform for several reasons that will be explained below. The systems to be discussed are physical systems and hence are causal. That means the output only depends on past and present inputs, but not on future inputs. For this reason, only the one-sided or unilateral Laplace transform needs to be considered. It is defined as:
\begin{equation}
\mathscr{L}\left( f(t) \right) = F(s) = \int_0^\infty f(t) e^{-st}\,dt,
\end{equation}
where $f: \mathbb{R}^+ \to \mathbb{R}$ is integrable and grows no faster than $c \cdot e^{s_0t}$ for some $s_0, c \in \mathbb{R}$. The latter property is important for deriving the rules of differentiation and integration.
To understand the benefits of using the Laplace representation of the transfer function, a few useful properties should be discussed with regard to the PID controller. First of all, the Laplace transform is linear:
\begin{align}
\mathscr{L}\left(a \cdot f(t) + b \cdot g(t) \right) &= \int_0^\infty (a \cdot f(t) + b \cdot g(t)) e^{-st}\,dt \nonumber\\
&= a \int_0^\infty f(t) e^{-st}\,dt + b \int_0^\infty g(t) e^{-st}\,dt \nonumber\\
&= a \mathscr{L}\left(f(t)\right) + b \mathscr{L}\left(g(t)\right)
\end{align}
Two further useful properties concern the derivative and the integral of a function $f$. The function $f$ must, of course, be differentiable and grow no faster than the exponential function as defined above:
\begin{align}
\mathscr{L}\left(\frac{df}{dt}\right) &= \int_0^\infty \underbracket{f'(t)}_{v'(t)} \underbracket{\vphantom{f'(t)}e^{-st}}_{u(t)}\,dt \nonumber\\
&= \left[e^{-st} f(t) \right]_0^\infty - \int_0^\infty (-s) e^{-st} f(t)\,dt \nonumber\\
&= -f(0) + s \int_0^\infty e^{-st} f(t)\,dt \nonumber\\
&= s F(s) - f(0)
\end{align}
\begin{align}
\mathscr{L} \left( \int_0^t f(\tau)\,d\tau \right) &= \int_0^\infty \left(\int_0^t f(\tau)\,d\tau\right) e^{-st}\,dt \nonumber\\
&= \int_0^\infty \underbracket{e^{-st}\vphantom{\int_0^t}}_{v'(t)} \underbracket{\int_0^t f(\tau)\,d\tau}_{u(t)}\,dt \nonumber\\
&= \left[\frac{-1}{s} e^{-st} \int_0^t f(\tau)\,d\tau \right]_0^\infty - \int_0^\infty \frac{-1}{s} e^{-st} f(t)\,dt \nonumber\\
&= 0 + \frac{1}{s} \int_0^\infty e^{-st} f(t)\,dt \nonumber\\
&= \frac{1}{s} F(s) \label{eqn:lapace_integration}
\end{align}
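Both rules can be checked numerically. The following Python sketch (not part of the thesis tooling; all values are illustrative) approximates the one-sided Laplace integral by composite Simpson quadrature on a truncated interval and verifies the differentiation and integration rules for $f(t) = \cos(t)$, whose transform $F(s) = \frac{s}{s^2+1}$ is known:

```python
import math

def laplace(f, s, T=60.0, n=60_000):
    """Approximate the one-sided Laplace transform of f at s by
    composite Simpson quadrature on [0, T] (n must be even)."""
    h = T / n
    total = f(0.0) + f(T) * math.exp(-s * T)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

s = 0.7
F = s / (s**2 + 1)                        # L(cos t) = s / (s^2 + 1)

# differentiation rule: L(f') = s F(s) - f(0), with f(0) = cos(0) = 1
assert abs(laplace(lambda t: -math.sin(t), s) - (s * F - 1)) < 1e-6

# integration rule: L(int_0^t f dtau) = F(s) / s, since int_0^t cos = sin
assert abs(laplace(math.sin, s) - F / s) < 1e-6
print("differentiation and integration rules verified")
```

Truncating the integral at $T=60$ is harmless here because the integrand is damped by $e^{-st}$.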
If the initial state $f(0)$ can be chosen to be $0$, the differentiation becomes a simple multiplication by $s$, while the integration becomes a division by $s$. These three properties greatly simplify the calculations required for studying a proportional–integral–derivative controller in section \ref{sec:pid_controller_basics}.
Finally, the most important aspect is the possibility of giving a simple relation between the input $u(t)$ and the output $y(t)$ of a system. The relationship between input and output of a system as shown in figure \ref{fig:open_loop} is given by the convolution, see e.g. \cite{pid_basics}. Assuming the system is at rest for $t<0$, hence $u(t<0) = 0$ and $g(t<0) = 0$, one can calculate:
\begin{equation}
y(t) = (u \ast g)(t) = \int_0^\infty u(\tau) g(t-\tau)\,d\tau
\label{eqn:convolution}
\end{equation}
Applying the Laplace transform greatly simplifies this:
\begin{align}
Y(s) &= \int_0^\infty e^{-st} y(t)\,dt \nonumber\\
\overset{\ref{eqn:convolution}}&{=} \int_0^\infty \underbrace{e^{-st}}_{e^{-s(t-\tau)}e^{-s\tau}} \int_0^\infty u(\tau) g(t-\tau)\,d\tau\,dt \nonumber\\
&= \int_0^\infty \int_0^t e^{-s(t-\tau)} e^{-s\tau} g(t-\tau) u(\tau)\,d\tau\,dt \nonumber\\
&= \int_0^\infty e^{-s\tau} u(\tau) \int_\tau^\infty e^{-s(t-\tau)} g(t-\tau)\,dt\,d\tau \nonumber\\
\overset{t' \coloneqq t-\tau}&{=} \int_0^\infty e^{-s\tau} u(\tau)\,d\tau \int_0^\infty e^{-st'} g(t')\,dt' \nonumber\\
&= U(s) \cdot G(s)
\end{align}
This product is much simpler than the convolution of $u(t)$ and $g(t)$, which is why the Laplace transform has become very popular in control theory.
Having derived some of the most useful properties, it is worth looking at a few functions that are heavily used in control theory, such as a function delayed by the time interval $\theta$. To demonstrate its properties, let $g(t)$ be
\begin{equation}
g(t) \coloneqq \begin{cases} f(t-\theta), & t \geq \theta \\ 0, & t < \theta \end{cases} \,. \label{eqn:delayed_f}
\end{equation}
The reason for this definition is that the system must be causal, meaning it is impossible to obtain information from the future ($t<\theta$). As long as this requirement is satisfied, any other constant than \num{0} may be chosen for the value before the delay, as is done later in section \ref{sec:pid_tuning_rules}, when determining tuning parameters and fitting experimental data to a model. An example of such a time-delayed function $g(t)$ is shown in figure \ref{fig:heaviside_delayed}.
\begin{figure}[ht]
\centering
\begin{subfigure}{0.4\linewidth}
\centering
\scalebox{0.75}{%
\import{figures/}{laplace_no_delay.tex}
} % scalebox
\caption{Original signal $f(t)$.}
\label{fig:heaviside}
\end{subfigure}
\begin{subfigure}{0.4\linewidth}
\centering
\scalebox{0.75}{%
\import{figures/}{laplace_time_delay.tex}
} % scalebox
\caption{Delayed signal $f(t-2)$.}
\label{fig:heaviside_delayed}
\end{subfigure}
\caption{A signal $f(t)$ and a copy delayed by $\theta = 2$.}
\end{figure}
The Laplace transform of a delayed signal $g(t)$ can be calculated as follows:
\begin{align}
\mathscr{L}\left( g(t) \right) &= \int_0^\infty g(t) e^{-st}\,dt \nonumber\\
\overset{\ref{eqn:delayed_f}}&{=} \int_\theta^\infty f(t-\theta) e^{-st}\,dt \nonumber\\
\overset{\tau \coloneqq t-\theta}&{=} \int_0^\infty f(\tau) e^{-s(\tau+\theta)}\,d\tau \nonumber\\
&= e^{-s\theta} \int_0^\infty f(\tau) e^{-s\tau} \,d\tau \nonumber\\
&= e^{-s\theta} F(s) \label{eqn:laplace_delayed}
\end{align}
To satisfy the causality requirement in the time domain, the Heaviside function $H(t)$ can be used to give a more concise representation of $g(t)$:
\begin{align}
\mathscr{L}\left( f(t-\theta) H(t-\theta) \right) = e^{-s\theta} F(s) \label{eqn:laplace_causality}
\end{align}
Lastly, the Laplace transform of $e^{at}$ is given, which is commonly used in differential equations:
\begin{align}
\mathscr{L}\left(e^{at} \right) &= \int_0^\infty e^{(a-s)t}\,dt = \frac{1}{a-s} \left[e^{(a-s)t} \right]_0^\infty = \frac{1}{s-a} \quad \text{for } \Re(s) > a \label{eqn:laplace_exponential}
\end{align}
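Both of these transforms can be verified numerically as well. The sketch below (illustrative values $a=-0.5$, $s=1$, $\theta=2$) approximates the Laplace integral by composite Simpson quadrature and checks equations \ref{eqn:laplace_delayed} and \ref{eqn:laplace_exponential}:

```python
import math

def laplace(f, s, T=80.0, n=80_000):
    """One-sided Laplace transform of f at s via composite Simpson
    quadrature on [0, T] (n must be even)."""
    h = T / n
    total = f(0.0) + f(T) * math.exp(-s * T)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

a, s, theta = -0.5, 1.0, 2.0
f = lambda t: math.exp(a * t)

# L(e^{at}) = 1 / (s - a), valid for Re(s) > a
assert abs(laplace(f, s) - 1 / (s - a)) < 1e-3

# delay rule: L( f(t - theta) H(t - theta) ) = e^{-s theta} F(s)
# (the jump at t = theta limits the quadrature accuracy,
# hence the looser tolerance)
g = lambda t: f(t - theta) if t >= theta else 0.0
assert abs(laplace(g, s) - math.exp(-s * theta) / (s - a)) < 1e-3
print("exponential and delay rules verified")
```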
Using these tools, it is possible to calculate the transfer function of a closed-loop temperature controller, which will be done in the next section.
\subsection{A Model for Temperature Control}%
\label{sec:temperature_control_model}
\begin{figure}[ht]
\centering
%\scalebox{1}{%
\import{figures/}{first-order_model.tex}
%} % scalebox
\caption{Simple temperature model of a generic system.}
\label{fig:first-order_model_room}
\end{figure}
In order to describe a closed-loop system using a transfer function $G(s)$, one first has to create a model for the process and the controller involved. This section derives the simple but very useful first order model with dead time. It can be derived from the idea that the system at temperature $T_{system}$ has a thermal capacitance $C_{system}$, an influx of heat $\dot Q_{load}$ from a thermal load, and a controller removing heat from the system through a heat exchanger with a resistance of $R_{force}$. Additionally, there is some leakage through the walls of the system to the ambient environment via $R_{leakage}$. This analogy between thermodynamics and electrodynamics makes it possible to create the model shown in figure \ref{fig:first-order_model_room}. Since this model is to be used for a temperature controller, further simplifications can be made and a so-called small-signal model can be developed, as opposed to the large-signal model shown above. The small-signal model is an approximation around a working point that is valid for small deviations around it, similar to a Taylor approximation. It can be used to calculate the system response to small changes of the controller output in order to estimate the control parameters at that working point.
Using the small-signal approach, the system response can be split into a constant and a dynamic part: the 0\textsuperscript{th} and 1\textsuperscript{st} order of the Taylor approximation. In order to simplify the system shown in figure \ref{fig:first-order_model_room}, the assumption can be made that the system load $\dot Q_{load}$ and the flux through $R_{leakage}$ are \textit{reasonably stable}. \textit{Reasonably stable} means that they can be treated as small deviations and, additionally, that any changes are within the bandwidth of the controller and well suppressed. This allows them to be treated as (almost) constant effects. Constant effects can be neglected in the small-signal model because they only manifest as a constant offset applied to the output of the controller and have no dynamics. This leaves only the room with its heat capacity and the heat exchanger as dominant factors in the small-signal model shown in figure \ref{fig:first-order_model}. Here $T_{force}$ and $T_{system}$ were replaced by $T_{in}$ and $T_{out}$, and $R_{force}$ and $C_{system}$ were replaced by $R$ and $C$ for better readability.
\begin{figure}[htb]
\centering
\scalebox{1}{%
\import{figures/}{first-order_model_kirchhoff.tex}
} % scalebox
\caption{Simplifications of the temperature model of a room lead to this first order model.}
\label{fig:first-order_model}
\end{figure}
This is the classic $RC$ circuit. To calculate the transfer function, a relationship between $T_{in}$ and $T_{out}$ is required. Exploiting the analogy between thermodynamics and electrodynamics once more and applying Kirchhoff's second law, following the arrow in figure \ref{fig:first-order_model}, one finds:
\begin{alignat}{1}
\sum T_i &= 0 \nonumber\\
T_{in}(t) - \dot{Q}(t) R - \frac 1 C \int \dot{Q}(t)\,dt &= 0 \label{eqn:first-order_model_kirchhoff}
\end{alignat}
Taking the Laplace transform, applying equation \ref{eqn:lapace_integration}, solving for $ \dot Q(s)$ and using $T_{out} = \frac{1}{sC} \dot Q(s)$ to replace $\dot Q$, equation \ref{eqn:first-order_model_kirchhoff} can be written as:
\begin{align*}
T_{in}(s) - \dot{Q}(s) R - \frac{1}{sC} \dot{Q}(s) &= 0\\
\dot{Q}(s) = \frac{T_{in}(s)}{R+\frac{1}{sC}} &= \frac{T_{out}(s)}{\frac{1}{sC}}
\end{align*}
This allows to calculate the transfer function of the process $P$ as
\begin{align}
P(s) &= \frac{T_{out}}{T_{in}} = \frac{\frac{1}{sC}}{R+\frac{1}{sC}} \nonumber\\
&= \frac{1}{sRC + 1} \nonumber\\
&= \frac{1}{1 + s\tau} = \frac{K}{1 + s\tau}\,. \label{eqn:first-order_model}
\end{align}
with the system gain $K$ and the time constant $\tau = RC$. In the case of the $RC$ circuit, the gain is $1$, but other systems may have a gain factor of $K \neq 1$. This is generally the case when any type of sensor is used that converts the measurand into the input signal. $K$ is therefore included here for the sake of generality.
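Equation \ref{eqn:first-order_model} corresponds to the differential equation $\tau \dot{T}_{out}(t) = T_{in}(t) - T_{out}(t)$. As a quick numerical sanity check (a sketch with an arbitrary value $\tau = 2$, not lab data), integrating this equation with the forward Euler method for a unit step on $T_{in}$ reproduces the analytic step response $1 - e^{-t/\tau}$:

```python
import math

# Forward Euler integration of tau * dT_out/dt = T_in - T_out,
# the differential equation behind the first order model, for a
# unit step on T_in applied at t = 0.
tau = 2.0          # time constant R*C
dt = 1e-3          # integration step
T_out = 0.0        # system starts in equilibrium
for _ in range(int(10.0 / dt)):
    T_in = 1.0                           # unit step input
    T_out += dt * (T_in - T_out) / tau

analytic = 1.0 - math.exp(-10.0 / tau)   # analytic response at t = 10 s
assert abs(T_out - analytic) < 1e-3
print(f"Euler: {T_out:.6f}  analytic: {analytic:.6f}")
```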
Equation \ref{eqn:first-order_model} is called the transfer function of a first order model, because it originates from a first order differential equation. This model describes homogeneous systems like a room very well, as can be seen in section \ref{sec:pid_controller_tuning}, but in order to obtain the transfer function including the controller and the sensor, the sensor transfer function has to be derived as well.
Expanding on figure \ref{fig:open_loop} and equation \ref{eqn:convolution} the open-loop transfer function of the process and its sensor becomes:
\begin{equation}
G(s) = P(s) \cdot S(s)
\end{equation}
and the block diagram changes to
\begin{figure}[htb]
\centering
%\scalebox{1}{%
\import{figures/}{open_loop_full.tex}
%}% scalebox
\caption{Open-loop system with sensor.}
\end{figure}
The transfer function of the sensor, given an ideal linear transducer, can be modeled as a delay line with delay $\theta$ and $f(t-\theta) = H(t-\theta)$. A sensor gain of $1$ is assumed here, because any system gain is assumed to be included in the parameter $K$ of the process transfer function. Using equation \ref{eqn:laplace_delayed}, $S(s)$ can be written as
\begin{equation}
S(s) = e^{-\theta s} .
\end{equation}
The full system model including the time delay can now be written as:
\begin{equation}
G(s) = \frac{K}{1 + s\tau} e^{-\theta s} \label{eqn:first-order_plus_dead_time_model}
\end{equation}
This is called a first order plus dead-time model (FOPDT) or first order plus time-delay model (FOPTD). While the Laplace representation is useful to explore the model mathematically, it is more convenient to transform the transfer function \ref{eqn:first-order_plus_dead_time_model} into the time domain in order to fit experimental data to the model. To obtain a meaningful result, an input $U(s)$ is required, because $G(s)$ is only a transformation. In principle, any function can serve this purpose, but a step function is typically used, for example by \citeauthor{ziegler_nichols} \cite{ziegler_nichols} and many others \cite{tuning_rules,pessen_integral,simc,simc_paper,pid_controllers_for_time_delay_systems,pi_stabilization_of_fopdt_systems, pid_basics}. The step function is both simple to calculate and simple to apply to a real system in the form of a controller output change. This technique is also explored in more detail in section \ref{sec:pid_controller_tuning}. Using equations \ref{eqn:laplace_delayed} and \ref{eqn:laplace_exponential}, the Heaviside step function $H(t)$ transforms into
\begin{equation}
\mathscr{L} \left(u(t) \right) = U(s) = \mathscr{L} \left( \Delta u H(t) \right) = \frac{\Delta u}{s}
\end{equation}
with the step size $\Delta u$. The output response $Y(s)$ of the system to the step can then be calculated analytically.
\begin{align}
Y(s) &= U(s) \cdot G(s)\nonumber\\
&= \frac{\Delta u}{s} \frac{K}{1 + s\tau} e^{-\theta s} \nonumber\\
&= K \Delta u \frac{1}{s (1 + s\tau)} e^{-\theta s} \nonumber\\
&= K \Delta u \left(\frac{1}{s} - \frac{\tau}{s\tau+1} \right) e^{-\theta s} \nonumber\\
&= K \Delta u \left(\frac{1}{s} - \frac{1}{s+\frac{1}{\tau}} \right) e^{-\theta s}
\end{align}
To derive $y(t)$, the inverse Laplace transform of $Y(s)$ is required. Unfortunately, this is in general not as simple as the forward transform. Fortunately, though, the required relations were already derived in equations \ref{eqn:lapace_integration} and \ref{eqn:laplace_exponential}. Making sure causality is guaranteed as shown in equation \ref{eqn:laplace_causality}, the simple first order model can be transformed back into the time domain.
\begin{align}
y(t) &= \mathscr{L}^{-1} \left(Y(s)\right) \nonumber\\
&= K \Delta u \mathscr{L}^{-1} \left(\frac{1}{s} e^{-\theta s} \right) - K \Delta u \mathscr{L}^{-1} \left( \frac{1}{s+\frac{1}{\tau}} e^{-\theta s} \right) \nonumber\\
\overset{\ref{eqn:laplace_exponential}}&{=} K \Delta u \cdot 1 \cdot H(t-\theta) - K \Delta u \left(e^{-\frac{t-\theta}{\tau}} \right) H(t-\theta) \nonumber\\
&= K \Delta u \left(1-e^{-\frac{t-\theta}{\tau}} \right) H(t-\theta) \label{eqn:first-order_plus_dead_time_model_time-domain}
\end{align}
The time domain solution of the FOPDT model can now be used to extract the parameters $\tau$, $\theta$ and $K$ from a real physical system.
The procedure can be summarised from the above as follows. The controller must be set to a constant output and the room must be given time to reach equilibrium. Once the temperature has settled, an output step of $\Delta u$ is applied. The system will respond after a time delay and then follow an exponential function. A simulation of the step response applied to a first order model with time delay is shown in figure \ref{fig:fopdt}. The gain is $K=1$. The solid black line shows the response of the transfer function, including the system and the sensor. The dashed lines show the individual components: the Heaviside function governing the delay and the exponential term of the system. The controller output step $\Delta u = 1$ is applied at $t=0$ and not shown explicitly. From figure \ref{fig:fopdt} it can clearly be seen that the sensor does not register a change until the time delay $\theta$ has passed and the Heaviside function changes from $0$ to $1$. Then the system responds with an exponential approach towards \num{1}.
\begin{figure}[ht]
\centering
\input{images/FOPDT_theory.pgf}% data/simulations/sim_laplace_fopdt.py
\caption{Time domain plot of a first order plus dead time model showing individual components of the model and the composite function $y(t)$. Model parameters used: $K= \Delta u = 1$, $\tau=2$, $\theta=4$.}
\label{fig:fopdt}
\end{figure}
So far, only open-loop systems were discussed. Using the FOPDT model, the system parameters can now be extracted from an existing system using a fit to the time domain reaction of such a system to a step input. Having extracted the system parameters, the next step is to design a controller around the system and close the loop to realize a controlled system. This is shown in the next section.
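For completeness, the parameters can even be read off without a full least-squares fit using the classic two-point method: according to equation \ref{eqn:first-order_plus_dead_time_model_time-domain}, the response passes \qty{28.3}{\percent} of its final value at $t = \theta + \tau/3$ and \qty{63.2}{\percent} at $t = \theta + \tau$. The Python sketch below demonstrates this on synthetic, noise-free data generated from the lab 011 parameters (a least-squares fit of the full curve, as used in this work, is more robust against noise):

```python
import math

# Two-point method: the FOPDT step response reaches 28.3% of its
# final value at t = theta + tau/3 and 63.2% at t = theta + tau.
# Ground truth below are the lab 011 values; the "measurement" is
# synthetic and noise-free.
K, tau, theta, du = 13.07, 395.0, 187.0, 1.0

def y(t):
    """FOPDT step response in the time domain."""
    return K * du * (1.0 - math.exp(-(t - theta) / tau)) if t >= theta else 0.0

ts = [0.1 * i for i in range(40_000)]            # 0.1 s sampling grid
ys = [y(t) for t in ts]

y_final = ys[-1]
t28 = next(t for t, v in zip(ts, ys) if v >= 0.283 * y_final)
t63 = next(t for t, v in zip(ts, ys) if v >= 0.632 * y_final)

tau_est = 1.5 * (t63 - t28)    # since t63 - t28 = (2/3) * tau
theta_est = t63 - tau_est
K_est = y_final / du
print(f"K = {K_est:.2f}, tau = {tau_est:.0f} s, theta = {theta_est:.0f} s")
```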
\subsection{PID Controller Basics}%
\label{sec:pid_controller_basics}
While there are many different types of controllers, such as the bang–bang controller utilized in many temperature controllers, which turns on at one threshold and off at another, producing the saw-tooth shaped room temperature curve shown in figure \ref{fig:lab_temperature_start_of_project}, a continuous control system is desired here to keep fluctuations to a minimum. The most commonly used controller type for non-integrating systems is the proportional–integral–derivative (PID) controller \cite{pid_in_industry}. A non-integrating system is a system without memory whose steady state does not depend on previous inputs. The advantage of a PID controller is that it does not need any special knowledge about the system model. A universal PID is simple to implement and can be tuned to control a wide range of different systems. While there are many variations of the PID algorithm \cite{pid_controller}, this section only introduces the basic, parallel PID controller commonly used in digital implementations and deals with some of its shortcomings in practical applications.
In order to extend the FOPDT system derived in the previous section \ref{sec:temperature_control_model} with a PID controller, one has to move to a closed-loop system. Adding to figure \ref{fig:closed_loop} and inserting a new control block into the transfer function yields figure \ref{fig:closed_loop_pid}.
\begin{figure}[ht]
\centering
\scalebox{1}{%
\import{figures/}{closed_loop_pid.tex}
}% scalebox
\caption{Closed-loop system with a PID controller.}
\label{fig:closed_loop_pid}
\end{figure}
The error signal $E(s)$ used by the PID controller is the difference between the setpoint and the control parameter, in this case the room temperature. The transfer function of the PID controller can be split into three parts: a proportional part that is proportional to the error, representing the present state; an integral part that is proportional to the accumulated error, representing the past; and a derivative part that is proportional to the change of the error, extrapolating into the future. Analytically, it can be written as
\begin{align}
c(t) &= k_p e(t) + k_i \int_0^t e(\tau) \,d\tau + k_d \frac{\mathrm{d}e(t)}{\mathrm{d}t} \label{eqn:pid_controller}\\
C(s) &= k_p + k_i \frac{1}{s} + k_d s \,. \label{eqn:pid_controller_laplace}
\end{align}
The following discussion will mostly focus on equation \ref{eqn:pid_controller}, because the time-domain equation is the one that can be implemented in software. As hinted above, the classic PID equation has a few shortcomings when used in a real system which requires dynamic changes of the setpoint or of $k_i$.
The first problem occurs when the PID parameter $k_i$ is changed, because equation \ref{eqn:pid_controller} is given for a time-independent $k_i$. Assuming a settled system without external disturbances, the output is fully determined by the integrator value, because the error is zero. Now, when $k_i$ is changed, the output immediately jumps due to the change of the integral term. This is unintended. To fix it, the integral term must be changed to
\begin{equation}
k_i \int_0^t e(\tau) \,d\tau \Rightarrow \int_0^t k_i(\tau) e(\tau) \,d\tau \,.
\end{equation}
This way, when adjusting $k_i$, its new value is applied to future error values only and there is no sudden kick.
The next issue is called \textit{derivative kick}. Looking at the derivative part of equation \ref{eqn:pid_controller}, it can be seen that an instantaneous setpoint change, as in a step function, drives $\frac{\mathrm{d}e(t)}{\mathrm{d}t} \to \infty$. This behaviour is not intended, and to fix it, the derivative part can be modified as follows.
\begin{align}
\frac{\mathrm{d}e(t)}{\mathrm{d}t} &= \frac{\mathrm{d}\left(u(t) - y(t)\right)}{\mathrm{d}t} = \underbrace{\cancel{\frac{\mathrm{d}u(t)}{\mathrm{d}t}}}_{\to \infty} - \frac{\mathrm{d}y(t)}{\mathrm{d}t} \nonumber\\
&=- \frac{\mathrm{d}y(t)}{\mathrm{d}t}
\end{align}
The new derivative term is equal to the unmodified one, except in the case of setpoint changes. Removing the setpoint from the equation, the controller behaves as intended. This solution is sometimes called \textit{derivative on measurement} as opposed to \textit{derivative on error}.
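A minimal discrete-time sketch of a PID update step incorporating both fixes, the per-sample application of $k_i$ and the derivative on measurement, could look as follows (illustrative gains and sampling time, not the production implementation):

```python
class PID:
    """Parallel PID with the two fixes discussed above: k_i is applied
    per sample inside the accumulator (no output jump when retuning)
    and the derivative acts on the measurement, not the error."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_y = None

    def update(self, setpoint, y):
        e = setpoint - y
        self.integral += self.ki * e * self.dt    # integral of k_i(t) e(t)
        d = 0.0 if self.prev_y is None else -(y - self.prev_y) / self.dt
        self.prev_y = y
        return self.kp * e + self.integral + self.kd * d

pid = PID(kp=2.0, ki=0.1, kd=0.5, dt=1.0)
u1 = pid.update(1.0, 0.0)    # 2.0 (P) + 0.1 (I) + 0 (D) = 2.1
pid.ki = 10.0                # retuning k_i affects future errors only,
u2 = pid.update(1.0, 0.0)    # the accumulated 0.1 is left untouched
print(u1, u2)                # 2.1 12.1
```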
\begin{figure}[hb]
\centering
\input{images/sim_pid_controller_bode.pgf}% data/simulations/sim_pid_controller_bode.py
\caption{Magnitude plot over frequency of the PID controller transfer function. Both the ideal PID controller and the PID controller with a filtered derivative are shown.}
\label{fig:sim_pid_controller}
\end{figure}
Another issue is caused by the derivative term with noisy inputs. A very short input spike due to noise will again drive the derivative term to very high values, pushing the output away from the correct value and forcing the controller to slowly rebalance.
To discuss the problem and its solution, it is best to visit the frequency domain and visualize the transfer function of the PID controller as shown in figure \ref{fig:sim_pid_controller}. The ideal PID controller without filtering of the derivative displays a very strong response to low frequency inputs. This is due to the integral action, which removes any (constant) offset: it needs infinite gain at DC to push the offset to zero. In reality this is limited by input noise. Then follows a plateau with a magnitude of $k_p$ for the proportional term, and finally the derivative gain starts to dominate and keeps growing steadily with rising frequency, just as expected.
With some knowledge about the process or the sensor, it is possible to define an upper frequency above which inputs become unrealistic and must therefore be unwanted noise. Filtering the derivative term with a first order filter causes it to roll off, and its gain becomes constant as shown in figure \ref{fig:sim_pid_controller}. With the filter added, the PID controller transfer function changes to
\begin{equation}
C(s) = k_p + k_i \frac{1}{s} + \frac{k_d s}{1 + s \alpha k_d} \,. \label{eqn:pid_controller_filtered}
\end{equation}
Typically $\alpha$ is in the range of \numrange{0.05}{0.2} \citep[p. 129]{pid_controller}.
An alternative is to filter the whole input. Depending on the filter cutoff, there is not much difference to equation \ref{eqn:pid_controller_filtered}, because the filter will not touch the proportional and integral part of the transfer function if both are well within its passband.
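The limiting behaviour of the filtered derivative is easy to check numerically: for $\omega \to \infty$, the term $\frac{k_d s}{1 + s\alpha k_d}$ approaches the constant $\frac{1}{\alpha}$ instead of growing without bound. A short sketch with illustrative gain values:

```python
# Magnitude of C(s) at s = j*omega with and without the first order
# filter on the derivative term (illustrative gains, alpha = 0.1).
kp, ki, kd, alpha = 1.0, 0.1, 2.0, 0.1

def C_ideal(w):
    s = 1j * w
    return abs(kp + ki / s + kd * s)

def C_filtered(w):
    s = 1j * w
    return abs(kp + ki / s + kd * s / (1 + s * alpha * kd))

# at high frequencies the filtered derivative saturates near
# kp + 1/alpha, while the ideal derivative keeps growing as kd*omega
for w in (0.01, 1.0, 100.0, 10_000.0):
    print(f"omega = {w:>8}: ideal {C_ideal(w):12.2f}  filtered {C_filtered(w):8.2f}")
```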
From figure \ref{fig:sim_pid_controller} it can also be seen why, in some publications, the gain $k_p$ is applied to all three terms, with $k_i$ and $k_d$ replaced by $T_i$ and $T_d$ to accommodate this.
\begin{equation}
C(s) = k_p \left(1 + \frac{1}{T_i s} + \frac{T_d s}{1 + s \alpha T_d} \right) \label{eqn:pid_controller_series}
\end{equation}
Using this form makes it possible to shift the overall gain curve up and down while keeping its shape, instead of shifting only the $k_p$ part and thereby changing the corner frequencies. The alternative form is only given here for the sake of completeness. The author uses the ideal form shown in equation \ref{eqn:pid_controller_laplace} with the parameters $k_p$, $k_i$, and $k_d$ wherever possible.
This concludes the discussion of the PID controller and the introduction of the basic terms. This raises the question of how the controller interacts with the system and how to derive optimal PID parameters from a given system or model. Thus, the next section discusses controller tuning rules and their effect on the system performance.
\subsection{PID Tuning Rules}%
\label{sec:pid_tuning_rules}
While many PID tuning rules can be found in the literature, their application depends on the underlying system and the desired system response. This section will discuss several proposed solutions and compare them with the author's use case. The section aims to give a simple method to determine decent PI/PID parameters for the applications found in the lab. Among the methods discussed are the classic set of tuning rules developed by \citeauthor{ziegler_nichols} \cite{ziegler_nichols}, and an improved version called Skogestad Internal Model Control (SIMC) presented by \citeauthor{simc_paper} \cite{simc_paper}, which promises better performance for non-integrating systems. These rules all include simple instructions to extract the necessary parameters using pen and paper. Using a computer and fitting algorithms, the bar for \textit{simple} has been raised considerably, so more complex approaches can be undertaken which extract more parameters from the system. Using these additional parameters, more precise control is promised by \citeauthor{pid_basics} \cite{pid_basics, advanced_pid_control} with a method called AMIGO. Finally, it is possible to shape the control loop to result in a desired transfer function. This technique is mostly used in motor control \cite{pid_controller,advanced_pid_control} and also requires the model parameters.
All of these rules will be compared against a demo model of a room to explain the details. It is the first order model with delay that was derived in equation \ref{eqn:first-order_plus_dead_time_model}. The discussion is limited to the FOPDT model, because the systems treated in this work could be modelled very well using this equation. Higher-order models are discussed in more detail, for example, in \cite{advanced_pid_control,pid_controller,simc_paper}, in case the reader encounters such a system and needs to extract the model parameters.
\begin{equation}
G(s) = \frac{K e^{-\theta s}}{1 + s \tau} \label{eqn:demo_process_model}
\end{equation}
The following parameters were extracted from lab 011 of the APQ group with the techniques shown in section \ref{sec:temperature_control_model}, using equation \ref{eqn:first-order_plus_dead_time_model_time-domain}. The details are discussed in section \ref{sec:pid_controller_tuning}. The system gain $K$ was scaled to the full scale output (\qty{4095}{\bit}) of the controller, hence the somewhat unusual unit \unit[per-mode=power]{\K \bit\per\bit}.
\begin{table}[hb]
\centering
\begin{tabular}{ccc}
\toprule
Gain $K$& Lag $\tau$& Delay $\theta$ \\
\midrule
\qty[per-mode=power]{13.07}{\K \bit\per\bit}& \qty{395}{\s}& \qty{187}{\s}\\
\bottomrule
\end{tabular}
\caption{Example parameters extracted from lab 011 using the techniques shown here and as applied in section \ref{sec:pid_controller_tuning}.}
\label{tab:pid_example_model}
\end{table}
Before detailing the tuning parameters, the loop shaping method will be explained first, because it can not only be used to derive custom rules but was also used to create the SIMC rules proposed by \citeauthor{simc_paper} \cite{simc_paper}. The aim of this method is to derive a controller that shapes the model in such a way that a desired system response to setpoint changes is achieved. A general closed-loop system with a controller $C$ and a system $G$ is shown in figure \ref{fig:closed_loop_controller}. This will be used as a basis to find the required controller for a desired transfer function $\frac{Y(s)}{U(s)}$.
\begin{figure}[ht]
\centering
\scalebox{1}{%
\import{figures/}{closed_loop_controller.tex}
}% scalebox
\caption{Closed-loop system $G$ with a controller $C$.}
\label{fig:closed_loop_controller}
\end{figure}
Starting with the transfer function of the controlled system, made up of the controller and the system, most experimenters would, at least in a feverish dream, prefer a transfer function of the following divine form
\begin{equation*}
\frac{Y(s)}{U(s)} = 1 \,,
\end{equation*}
but unfortunately life is more profane and there is no controller that will always (and with warp speed) force a system to a certain setpoint. One may therefore settle for the second-best choice, a first order low pass with a slow roll-off and a small delay, which must be added to ensure causality. One therefore arrives at
\begin{equation}
\frac{Y(s)}{U(s)} = \frac{e^{-\theta s}}{1 + s \tau_c}\,, \label{eqn:desired_transfer_function}
\end{equation}
where $\tau_c$ is the closed-loop time constant and a measure for the aggressiveness of the controller. A small $\tau_c$ results in a more aggressive controller with faster response.
For the system shown in figure \ref{fig:closed_loop_controller}, the closed-loop transfer function is found to be
\begin{align*}
\frac{Y(s)}{U(s)} &= \frac{C(s) G(s)}{C(s) G(s) + 1} \\
\Rightarrow C(s) &= \frac{1}{G(s)} \frac{1}{\frac{U(s)}{Y(s)} -1}
\end{align*}
This loop now needs to be shaped into the desired transfer function given in equation \ref{eqn:desired_transfer_function}, so substituting $\frac{Y(s)}{U(s)}$ yields
\begin{align}
C(s) &= \frac{1}{G(s)} \frac{e^{-\theta s}}{s \tau_c +1 - \underbrace{e^{-\theta s}}_{\approx 1 - \theta s}}\\
&\approx \frac{1}{G(s)} \frac{e^{-\theta s}}{s (\tau_c + \theta)} \,.
\end{align}
$e^{-\theta s}$ was approximated using a first order Taylor expansion. The desired controller response now only depends on the system (including the sensor) to be controlled. So, substituting the system equation \ref{eqn:demo_process_model} results in
\begin{align}
C(s) &= \frac{1}{K} \frac{s \tau + 1}{(\tau_c + \theta) s} \nonumber\\
&= \underbrace{\frac{1}{K} \frac{\tau}{\tau_c + \theta}}_{k_p} + \underbrace{\frac{1}{K} \frac{1}{\tau_c + \theta}}_{k_i} \frac{1}{s}\,.
\end{align}
This is a PI controller with $k_p = \frac{1}{K} \frac{\tau}{\tau_c + \theta}$ and $k_i = \frac{1}{K} \frac{1}{\tau_c + \theta}$. From these calculations, it can be seen that a first order model can be fully treated using a PI controller. Second order (and higher order) models typically necessitate a PID or more sophisticated controller for optimal control. The problems discussed in this work mainly concern temperature control of (mostly) homogeneous objects, so the focus lies on the PI controller for most of the remaining section, but the ideas and simulations can be applied to the PID controller as well. Any caveats to be expected when treating a PID instead of a PI controller will be discussed.
Using the loop shaping technique, it is fairly easy to derive custom rules if the model parameters can be extracted. As mentioned above, one such loop-shaped tuning rule set is SIMC, and its authors give advice for a wide variety of different models and also investigate the parameter choice regarding stability, load, and setpoint disturbances. Before attempting a custom approach, it is therefore recommended to check \cite{simc_paper} for an appropriate set of rules for more complex models in order to save time and effort.
\begin{table}
\centering
\begin{tabular}{lcccc}
\toprule
Tuning Rule& $k_p$& $T_i$ & $T_d$ & Source \\
\midrule
Z-N PI & $\frac{0.9 \tau}{K \theta}$ & $\frac{\theta}{0.3}$ & -- & \cite{ziegler_nichols}\\
Z-N PID & $\frac{1.2 \tau}{K \theta}$ & $2 \theta$ & $\frac{\theta}{2}$ & \cite{ziegler_nichols}\\
SIMC PI & $\frac{\tau}{K (\tau_c + \theta)}$ & $\min\left(\tau, 4 (\tau_c+\theta)\right)$ & -- & \cite{simc_paper}\\
SIMC PID & $\frac{\tau_1}{K (\tau_c + \theta)}$ & $\min\left(\tau_1, 4 (\tau_c+\theta)\right)$ & $\tau_2$ & \cite{simc_paper}\\
AMIGO PI & $\frac{0.15}{K} + \left(0.35 - \frac{\tau \theta}{\left(\tau + \theta\right)^2}\right) \frac{\tau}{K \theta}$ & $0.35 \theta + \frac{13 \tau^2 \theta}{\tau^2 + 12 \tau \theta + 7 \theta^2}$ & -- & \citep[p. 228]{advanced_pid_control}\\
AMIGO PID & $\frac{1}{K} \left(0.2 + 0.45 \frac{\tau}{\theta}\right)$ & $\frac{0.4 \theta + 0.8 \tau}{\theta + 0.1 \tau} \theta$ & $\frac{0.5 \tau \theta}{0.3 \theta + \tau}$ & \citep[p. 233]{advanced_pid_control}\\
\bottomrule
\end{tabular}
\caption{PI/PID parameters for different tuning rules. The PI controllers assume a first order model, the PID rules are required when dealing with a second order model.}
\label{tab:pid_tuning_parameters}
\end{table}
For reasons of brevity, in table \ref{tab:pid_tuning_parameters}, the PID parameters are given as $k_p$, $T_i$ and $T_d$ as introduced in equation \ref{eqn:pid_controller_series}. $k_i$ and $k_d$ can be calculated from
\begin{align*}
k_i &= \frac{k_p}{T_i}\\
k_d &= k_p T_d\,.
\end{align*}
Regarding the SIMC PI/PID algorithm, \citeauthor{simc_paper} \cite{simc_paper} and \citep[ch. 5]{simc} suggest using $\tau_c = \theta$ for the ``\textit{tightest possible subject to maintaining smooth control}''. Following this recommendation, the minimum can be calculated from the parameters given in table \ref{tab:pid_example_model} on page \pageref{tab:pid_example_model} as $\min\left(\tau, 4 (\tau_c+\theta)\right) = \min\left(\tau, 8 \theta\right) = \tau$.
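As a cross-check, the PI rules of table \ref{tab:pid_tuning_parameters} can be evaluated for the example model of table \ref{tab:pid_example_model} in a few lines of Python ($\tau_c = \theta$ for the SIMC rule, as recommended above):

```python
# PI parameters for the lab 011 model evaluated with the tuning rules
# from the table above; k_i = k_p / T_i converts to the parallel form.
K, tau, theta = 13.07, 395.0, 187.0    # gain, lag, delay
tau_c = theta                          # SIMC recommendation

rules = {
    "Z-N PI":   (0.9 * tau / (K * theta), theta / 0.3),
    "SIMC PI":  (tau / (K * (tau_c + theta)), min(tau, 4 * (tau_c + theta))),
    "AMIGO PI": (0.15 / K + (0.35 - tau * theta / (tau + theta) ** 2) * tau / (K * theta),
                 0.35 * theta + 13 * tau**2 * theta / (tau**2 + 12 * tau * theta + 7 * theta**2)),
}
for name, (kp, Ti) in rules.items():
    print(f"{name:8s}: k_p = {kp:.4f}  T_i = {Ti:6.1f} s  k_i = {kp / Ti:.2e}")
```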
Using the rules above, the full system can now be simulated. This was done using Python. The simulation source code can be found in \external{data/simulations/sim\_pid\_controller.py} as part of the online supplemental material \cite{supplemental_material}. The simulation supports arbitrary PI(D) controllers, and arbitrary models can be used as well. It allows different settings to be compared before they are applied to a real system. It also considerably shortens deployment times: especially for systems with long time scales, it is difficult to test several parameter sets on the fly, so a simulation can reduce deployment time to a few minutes instead of hours.
The simulation emulates the PID controller developed for the lab temperature controller. By default it has a sampling rate of \qty{1}{\Hz}. The simulation applies a setpoint change of \qty{+1}{\K} \qty{10}{\s} into the simulation and afterwards plots the time domain response of the controlled system. The setpoint change in this scenario is very similar to the load disturbances that are expected. Typically a noise source is used here instead, but in contrast to stationary statistical noise, which could be used to test disturbance rejection, the situation in labs is different and cannot be modelled with stationary noise. While there is some noise coming from the sensor and the lab, the major disturbances are usually caused by the experimenters rather than the lab itself. These are events like a device being switched on or off for an extended period of time, longer than the controller needs to settle. In terms of the error term in equation \ref{eqn:pid_controller}, this is equivalent to a setpoint change, since the error term does not distinguish between a change of the setpoint and a change of the process variable. Note that this is not true for the PID controller, whose derivative term acts directly on the measurement (the process variable), as explicitly implemented above. For PID controllers, setpoint-change behaviour and noise rejection therefore differ; this must be kept in mind and tested accordingly.
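The essential loop of such a simulation is short. The sketch below runs a SIMC-tuned PI controller against an FOPTD plant at \qty{1}{\Hz}; the model parameters are arbitrary placeholders, not the measured room-011 values, and the code is a strong simplification of the full simulation in the supplemental material:

```python
from collections import deque

# Placeholder FOPTD parameters (illustrative, not the measured lab values)
K, tau, theta = 2.0, 200.0, 50.0       # gain, time constant, dead time [s]
tau_c = theta                          # SIMC: tightest smooth control
kp = tau / (K * (tau_c + theta))       # SIMC proportional gain
Ti = min(tau, 4 * (tau_c + theta))     # SIMC integral time
ki = kp / Ti

dt, n = 1.0, 3000                      # 1 Hz sampling, 3000 s horizon
y, integral = 0.0, 0.0
n_delay = int(theta / dt)
delay = deque([0.0] * n_delay, maxlen=n_delay)  # dead-time buffer
history = []
for step in range(n):
    setpoint = 1.0 if step * dt >= 10 else 0.0  # +1 K step at t = 10 s
    error = setpoint - y
    integral += error * dt
    u = kp * error + ki * integral     # PI control law
    u_delayed = delay[0]               # actuator signal theta seconds ago
    delay.append(u)
    # Euler step of the plant: dy/dt = (K * u(t - theta) - y) / tau
    y += dt * (K * u_delayed - y) / tau
    history.append(y)
```

After the full horizon, the output has settled on the new setpoint, as expected for the SIMC design.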
Simulating the model above and using the PI parameters derived from table \ref{tab:pid_tuning_parameters}, gives the plot shown in figure \ref{fig:pid_controller_comparison}.
\begin{figure}[ht]
\centering
\input{images/sim_pid_controller_comparison.pgf}% data/simulations/sim_pid_controller.py
\caption{Different PI Controllers tuned with parameter derived using the following methods: Ziegler-Nichols, SIMC and AMIGO. The system model is the FOPTD model for room 011.}
\label{fig:pid_controller_comparison}
\end{figure}
As can be seen in figure \ref{fig:pid_controller_comparison}, the Ziegler-Nichols tuning rule produces a very aggressive PI controller that shows quite a bit of ringing, which is undesirable for this application. The AMIGO rules are rather conservative but do not produce any overshoot. The SIMC rules have proven the most useful for this application so far. This experience is in line with the results of \citeauthor{thesis_liebmann} \cite{thesis_liebmann}, who tested different PID tuning algorithms for their viability for temperature control in the labs discussed here.
To conclude, several PID tuning rules were presented, and with a Python simulation tool it is possible to test a set of PID parameters before implementation. Using an example based on parameters extracted from a real environment, the different tuning rules were applied to a model of a real lab, and the SIMC tuning rules were found to give the best results for this application. The reader should now be able to extract the model parameters from physical systems and has the tools to choose an optimal set of tuning parameters for the PID controller. Recommended further reading is \cite{pid_controller} for a broad overview and \cite{advanced_pid_control} for more details.
\clearpage
\section{Noise and Allan Deviation}%
\label{sec:allan_deviation}
The Allan variance \cite{adev} $\sigma_A^2(\tau)$ is a two-sample variance and is used as a measure of stability; the Allan deviation $\sigma_A(\tau)$ is its square root. Originally, the Allan variance was used to quantify the performance of oscillators, namely their frequency stability, but it can be used to evaluate any quantity. In order to define the Allan variance, a few terms need to be introduced first. A single measurement value of a time series $y(t)$ can be written as
\begin{equation}
\bar y_k(t) = \frac{1}{\tau} \int_{t_{k}}^{t_{k}+\tau} y(t)\,dt . \label{eqn:allan_variance_measurement}
\end{equation}
This is the $k$-th measurement with a measurement time or integration time $\tau$; the latter term is frequently used for digital multimeters (DMMs). $t_k$ is the start of the $k$-th sampling interval, including the dead time $\theta$:
\begin{equation}
t_{k+1} = t_k + T
\end{equation}
with
\begin{equation}
T \coloneqq \tau + \theta .
\end{equation}
\begin{figure}[hb]
\centering
\scalebox{1}{%
\import{figures/}{allan_variance_definitions.tex}
}% scalebox
\caption{Measurement interval according to equation \ref{eqn:allan_variance_measurement}. The shaded region is the signal acquisition period.}
\label{fig:allan_variance_definitions}
\end{figure}
Using this, the deviation over $N$ samples is defined as \cite{adev,psd_to_adev}
\begin{equation}
\sigma_y^2(N,T,\tau) = \left\langle \frac{1}{N-1} \left(\sum _{k=0}^{N-1}\bar y_k^2(t)-\frac{1}{N}\left(\sum _{k=0}^{N-1} \bar y_k(t)\right)^2\right)\right\rangle
\end{equation}
The $\langle \; \rangle$ denotes the (infinite time) average over all measurands $y_k$ or, simply put, the expected value.
The Allan variance is a special case of this definition with zero dead-time ($\theta=0$) and only 2 samples:
\begin{align}
\sigma_A^2(\tau) &= \sigma_y^2(N=2,T=\tau,\tau) \label{eqn:allan_coefficients}\\
&= \left\langle \frac{\left(\bar y_{k+1} - \bar y_k \right)^2}{2} \right\rangle
\end{align}
It can be shown \cite{psd_to_adev} that definition \ref{eqn:allan_coefficients} is indeed more useful than $\sigma_y^2(N\to\infty,T=\tau,\tau)$, because $\sigma_y^2(N=2,T=\tau,\tau)$ converges even for processes for which $\sigma_y^2(N\to\infty,T=\tau,\tau)$ does not.
In practice, no experiment can take an infinite number of samples, so the Allan variance must typically be estimated from a finite number of samples $m$:
\begin{equation}
\sigma_A^2(\tau) \approx \frac1 m \sum_{k=1}^m \frac{\left(\bar y_{k+1} - \bar y_{k} \right)^2}{2} \label{eqn:adev_estimator}
\end{equation}
This estimation can lead to artifacts in the results as discussed later. In order to derive the Allan variance from a set of data points, the different values of $\tau$ are usually obtained by averaging over a number of samples as there is no dead time (by definition of the Allan variance).
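In practice, the estimator in equation \ref{eqn:adev_estimator} is evaluated by averaging $m$ raw samples into each $\bar y_k$ for the integration time $m\tau_0$. A minimal NumPy sketch (the function name and interface are illustrative, not from any particular library):

```python
import numpy as np

def allan_deviation(y, tau0, m_list):
    """Estimate the Allan deviation of equally spaced samples `y`
    (sample spacing tau0) at the integration times m * tau0."""
    y = np.asarray(y, dtype=float)
    taus, adevs = [], []
    for m in m_list:
        k = len(y) // m                 # number of averaged samples ybar_k
        if k < 2:
            continue
        # average m raw samples to form ybar_k at integration time m * tau0
        ybar = y[:k * m].reshape(k, m).mean(axis=1)
        avar = 0.5 * np.mean(np.diff(ybar) ** 2)  # two-sample estimator
        taus.append(m * tau0)
        adevs.append(np.sqrt(avar))
    return np.array(taus), np.array(adevs)
```

For white noise of unit standard deviation, this returns roughly $1$ at $\tau = \tau_0$ and falls off as $\tau^{-\frac 1 2}$.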
Additionally, the Allan variance is mathematically related to the two-sided power spectral density $S_y(f)$ \cite{psd_to_adev}:
\begin{equation}
\sigma_A^2(\tau) = 2 \int_0^\infty S_y(f) \frac{\sin^4\left( \pi f \tau \right)}{(\pi f \tau)^2}\,df \label{eqn:psd_to_adev}
\end{equation}
and therefore all processes that can be observed in the power spectral density plot can also be seen in the Allan deviation. The inverse transform, however, is not always possible, as shown by \citeauthor{inverse_adev} \cite{inverse_adev}.
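For white noise with $S_y(f) = h_0$, the integral in equation \ref{eqn:psd_to_adev} evaluates analytically to $\sigma_A^2(\tau) = \frac{1}{2} h_0 \tau^{-1}$, the value listed in table \ref{tab:adev_alpha}. A quick numerical sanity check, truncating the integral at a finite bandwidth:

```python
import numpy as np

# White noise: S_y(f) = h0, so the PSD-to-Allan-variance integral
# should give sigma_A^2(tau) = h0 / (2 * tau).
h0, tau = 2.0, 1.0
df = 0.001
f = np.arange(df / 2, 2000.0, df)   # midpoint grid, truncated at 2 kHz
integrand = h0 * np.sin(np.pi * f * tau) ** 4 / (np.pi * f * tau) ** 2
avar = 2.0 * np.sum(integrand) * df  # midpoint-rule quadrature
```

With $h_0 = 2$ and $\tau = \qty{1}{\s}$ the result is very close to the analytical value $1$; the small deficit comes from the truncated $\propto f^{-2}$ tail.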
Distinguishing different noise processes using the Allan deviation will be elaborated in the next section.
%TODO: Add Shot noise
\subsection{Identifying Noise in Allan Deviation Plots}
It was already mentioned by \citeauthor{adev} in \cite{adev} that types of noise whose spectral density follows a power law
\begin{equation}
S(f) = h_{\alpha} \cdot f^\alpha \label{eqn:power_law}
\end{equation}
can be easily identified in the Allan deviation plot. The constant $h_\alpha$ is called the power (intensity) coefficient. The most common types of noise encountered in experimental data and their representations can be found in table \ref{tab:adev_alpha}, which serves as a summary of this section. Since these types of noise are present in any measurement or electronic device, their root causes and ways to minimise them warrant further discussion. While not a type of noise itself, linear drift can also be easily identified in the Allan deviation plot and is therefore included in table \ref{tab:adev_alpha} as well.
\begin{table}[ht]
\centering
\begin{tabular}{lcc}
\toprule
Amplitude noise type& Power-law coefficient $\alpha$& Allan variance $\sigma_A^2$\\
\midrule
White noise & $0$& $\frac 1 2 h_0 \tau^{-1}$ \cite{adev_noise_types}\\
Flicker noise& $-1$& $2 \ln 2 \, h_{-1} \tau^0$ \cite{adev_noise_types}\\
Random walk noise& $-2$& $\frac 3 2 \pi^2 h_{-2} \tau^{1}$ \cite{adev_noise_types}\\
Burst noise& $0 \textrm{ and } -\!2$& $y_{rms}^2\frac{\bar \tau^2}{\tau^2} \left(4 e^{-\frac{\tau}{\bar \tau}} - e^{-\frac{2 \tau}{\bar \tau}} + 2 \frac{\tau}{\bar \tau} - 3 \right)$\\
Drift & --& $\frac 1 2 D^2 \tau^2$ \cite{adev_drift}\\
\bottomrule
\end{tabular}
\caption{Power law representations of different noise types using the Allan variance.}
\label{tab:adev_alpha}
\end{table}
In order to arrive at a good understanding of the features seen in an Allan deviation plot, this section provides the reader with examples of each type of noise together with the corresponding time domain, power spectral density and Allan deviation plots. Since a complete overview is not available in the current literature, all required mathematical descriptions and simulation tools are discussed here. The simulations were done using Python and the source code is linked in the discussions; the files are part of the online supplemental material \cite{supplemental_material}. Using these scripts, all the graphs shown can be recreated and explored further.
\subsubsection{White Noise}%
\label{sec:white_noise}
White noise is probably the most common type of noise found in measurement data. Johnson noise in resistors, caused by the random thermal fluctuation of the charge carriers, is one example of mostly white noise up to a bandwidth of \qty{100}{\MHz}, beyond which quantum corrections are required \cite{nist_johnson_noise}. Amplifiers also tend to have a white noise spectrum at higher frequencies.
For the latter reason, white noise typically makes up a considerable fraction of the noise in measurements, unless one works at very low frequencies. White noise is a series of uncorrelated random events and is therefore characterised by a uniform power spectral density: there is the same power in a given bandwidth at all frequencies up to infinity, so ideal white noise has infinite power (variance). In reality a measurement is always limited in bandwidth, and the constant power spectral density only holds within that bandwidth. Band-limited samples of white noise thus have a finite variance.
Since white noise is so common, a few of its properties should be mentioned. One such property is that the variance $\sigma_{x+y}^2$ of two uncorrelated variables $x$ and $y$ adds as:
\begin{equation}
\sigma_{x+y}^2 = \sigma_x^2 + \sigma_y^2 + \underbrace{2\,\mathrm{Cov}(x,y)}_{\text{uncorrelated}\, =\, 0}\ = \sigma_x^2 + \sigma_y^2 \label{eqn:adding_white_noise}
\end{equation}
This results in simple addition rules for variances from different sources. It must be stressed, however, that this property is only valid for uncorrelated sources such as white noise; it is nevertheless often incorrectly applied to correlated measurements, which obscures rather than clarifies the uncertainties involved.
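Equation \ref{eqn:adding_white_noise} is easy to verify numerically. In the sketch below (with arbitrary example values), two independent Gaussian sources with $\sigma_x = 3$ and $\sigma_y = 4$ combine to $\sigma_{x+y} = 5$, whereas two fully correlated sources combine linearly to $3 + 4 = 7$:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(0.0, 3.0, 1_000_000)       # sigma_x = 3
y = rng.normal(0.0, 4.0, 1_000_000)       # sigma_y = 4, independent of x
sigma_sum = np.std(x + y)                 # expect sqrt(9 + 16) = 5
sigma_corr = np.std(x + (4.0 / 3.0) * x)  # fully correlated: 3 + 4 = 7
```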
In order to demonstrate the effect of white noise in Allan deviation plots, it was simulated using the \textit{AllanTools} library written by \citeauthor{allantools} \cite{allantools}. The noise generator is based on the work of \citeauthor{noise_generation} \cite{noise_generation}. The full Python program code is published online \cite{supplemental_material} and found in \external{data/simulations/sim\_allan\_variance.py}. To allow better comparison, all noise densities are normalised to give an Allan deviation of $\sigma_A(\tau_0)=1$, with $\tau_0$ being the smallest time interval.
\begin{figure}[ht]
\centering
\begin{subfigure}{0.32\linewidth}
\centering
\scalebox{0.75}{%
\input{images/white_noise_amplitude.pgf}% data/simulations/sim_allan_variance.py
} % scalebox
\caption{Time domain}
\label{fig:white_noise_time}
\end{subfigure}
\hfill
\begin{subfigure}{0.32\linewidth}
\centering
\scalebox{0.75}{%
\input{images/white_noise_psd.pgf}% data/simulations/sim_allan_variance.py
} % scalebox
\caption{Power spectral density}
\label{fig:white_noise_psd}
\end{subfigure}
\hfill
\begin{subfigure}{0.32\linewidth}
\centering
\scalebox{0.75}{%
\input{images/white_noise_adev.pgf}% data/simulations/sim_allan_variance.py
} % scalebox
\caption{Allan deviation}
\label{fig:white_noise_adev}
\end{subfigure}
\caption{Different representations of white noise.}
\label{fig:white_noise_simulated}
\end{figure}
Figure \ref{fig:white_noise_simulated} shows a sample of white noise in its three different forms. Figure \ref{fig:white_noise_time} is the time series representation from which the power spectral density was calculated and is shown in figure \ref{fig:white_noise_psd}. The dashed line shows the expectation value of the power spectral density and the Allan deviation.
From this simulation, several features can be observed. First of all, the power spectral density is flat and constant with $h_0 = 2$, in accordance with table \ref{tab:adev_alpha} and the normalisation mentioned earlier. Figure \ref{fig:white_noise_adev} shows the typical $\tau^{-\frac 1 2}$ dependence of white noise in the Allan deviation plot. This immediately explains why averaging white noise over $n$ samples reduces it by a factor of $\frac{1}{\sqrt{n}}$.
\subsubsection{Burst Noise}%
\label{sec:theory_burst_noise}
Burst noise, also called popcorn noise or random telegraph signal, is a random bi-stable change in a signal caused by generation-recombination processes. This happens, for example, in semiconductors if there is a site that can trap an electron for a prolonged period of time and then randomly release it. Impurities causing lattice defects are discussed in this context \cite{kay2012operational,burst_noise_psd,popcorn_noise_orgin,technote_ti_popcorn_noise}. Such lattice defects can also be introduced by ion implantation during doping. Fortunately, this type of noise has become less prevalent as the quality of modern semiconductor manufacturing processes has improved. But if a trap site is located very close to an important structure, for example a high-precision Zener diode, its effect may be strong enough to be clearly visible.
The discussion is split into two parts. First the power spectral density is calculated and then the Allan variance is calculated using that result.
The spectral density of burst noise caused by a single trap site was derived by \citeauthor{burst_noise_wiener_khinchin} in \cite{burst_noise_wiener_khinchin}, using the autocorrelation function of the burst noise signal and the Wiener-Khinchin theorem, which connects the autocorrelation function with the power spectral density. A more detailed derivation can be found in \cite{fundamentals_of_noise_processes}, where the preconditions, such as stationarity of the process, are also discussed. The burst noise signal consists of two energy levels, called $0$ and $1$, split by $\Delta y$. Multiple burst noise signals can be superimposed in a real device; this results in multiple levels, but they can be treated separately. The measurement interval $T$ spans an even number of transitions, so that the signal ends in the same state in which the measurement started. The mean lifetimes of the levels are called $\bar \tau_0$ and $\bar \tau_1$:
\begin{equation}
\bar \tau_{0} \approx \frac 1 N \sum_{i}^N \tau_{0,i} \qquad \bar \tau_{1} \approx \frac 1 N \sum_{i}^N \tau_{1,i}
\end{equation}
Figure \ref{fig:burst_noise} shows a burst noise signal along with the definitions above.
\begin{figure}[hb]
\centering
%\scalebox{1}{%
\import{figures/}{burst_noise.tex}
%} % scalebox
\caption{A random burst noise signal.}
\label{fig:burst_noise}
\end{figure}
Using these definitions, one can then derive \cite{burst_noise_wiener_khinchin}:
\begin{align}
R_{xx}(T) &= \Delta y^2 \cdot \frac{\bar \tau_1 \bar \tau_0 e^{-\left(\frac{1}{\bar \tau_1}+\frac{1}{\bar \tau_0}\right)T}}{\left(\bar \tau_1 + \bar \tau_0\right)^2} \quad \text{and} \label{eqn:burst_noise_correlation}\\
S(\omega) &= 4 R_{xx}(0) \frac{\frac{1}{\bar \tau_1} + \frac{1}{\bar \tau_0}}{\left(\frac{1}{\bar \tau_1} + \frac{1}{\bar \tau_0}\right)^2 + \omega^2} \qquad \omega > 0 . \label{eqn:burst_noise_psd}
\end{align}
Note that the power spectral density is the one-sided version; hence an additional factor of $2$ is included. A constant term was omitted here because it only contributes a single peak at $\omega=0$ and can usually be neglected. Using the following definitions of the average time constant and the duty cycle
\begin{align}
\frac{1}{\bar \tau} &= \frac{1}{\bar \tau_1} + \frac{1}{\bar \tau_0} \quad \mathrm{and} \label{eqn:definition_bar_tau}\\
D_i &= \frac{\bar \tau_i}{\bar \tau_1 + \bar \tau_0} \quad i \in \{0 ; 1\}
\end{align}
equations \ref{eqn:burst_noise_correlation} and \ref{eqn:burst_noise_psd} can be rewritten to give a more intuitive form.
\begin{align}
R_{xx}(T) &= \Delta y^2 D_1 D_0 \, e^{-\left(\frac{1}{\bar \tau_1}+\frac{1}{\bar \tau_0}\right)T}\\
S(\omega) &= 4 R_{xx}(0) \frac{\bar \tau}{1 + \omega^2 \bar \tau^2} \label{eqn:burst_noise_lorentzian}
\end{align}
The special case $\bar \tau_0 = \bar \tau_1$ with $D_i=\frac 1 2$ is the previously mentioned case of random telegraph noise.
$R_{xx}(0)$ can be identified as the mean squared value of $y$:
\begin{equation}
y_{rms} = \sqrt{R_{xx}(0)} \,.
\end{equation}
Equation \ref{eqn:burst_noise_lorentzian} is a Lorentzian function, from which it can easily be seen that a single trap site has a power spectral density that is proportional to $\frac{1}{f^2}$ at high frequencies and flat at low frequencies.
With the spectral density in hand, it is now possible to calculate the Allan variance, as was done by \citeauthor{allen_dev_flicker} in \cite{allen_dev_flicker} for the classic example of random telegraph noise where $\bar \tau_1 = \bar \tau_0$. Note that table I given by \citeauthor{allen_dev_flicker} lists the total number of events instead of the instantaneous rate typically given; hence their expressions must be multiplied by $\frac{1}{\tau^2}$ (or $\frac{1}{T^2}$ in their notation). For the generic case with $\bar \tau_1$, $\bar \tau_0$ and the definition of $\bar \tau$ given in equation \ref{eqn:definition_bar_tau}, one finds for the Allan variance of burst noise:
\begin{equation}
\sigma^2_A(\tau) = R_{xx}(0) \frac{\bar \tau^2}{\tau^2} \left(4 e^{-\frac{\tau}{\bar \tau}} - e^{-\frac{2 \tau}{\bar \tau}} + 2 \frac{\tau}{\bar \tau} - 3 \right) \label{eqn:burst_noise_avar}
\end{equation}
Having arrived at equations \ref{eqn:burst_noise_lorentzian} and \ref{eqn:burst_noise_avar} for the power spectral density and the Allan variance, it is now possible to model burst noise. For this purpose, parts of the Python library \textit{qtt} \cite{qtt} were used. This algorithm written by \citeauthor{qtt} implements continuous-time Markov chains to simulate the burst noise signal. The result can be seen in figure \ref{fig:burst_noise_simulated}. For these simulations, the lifetime of the lower state $\bar \tau_0$ was held constant, while the lifetime of the upper state was varied to show the effect of different $\bar \tau$. The time domain plot in figure \ref{fig:burst_noise_time} shows that the average number of state changes is maximal when $\bar \tau_1 = \bar \tau_0$. If $\bar \tau_1 > \bar \tau_0$, the system will favour the upper state, while if $\bar \tau_1 < \bar \tau_0$ it will favour the lower state. This explains why the noise is strongest for random telegraph noise with $\bar \tau_1 = \bar \tau_0$, which can also be seen in the power spectral density plot in figure \ref{fig:burst_noise_psd}. The Allan deviation in figure \ref{fig:burst_noise_adev} confirms this, but also shows another interesting implication: an obvious maximum. If the application allows a choice of the sampling interval $\tau$, the effect of the burst noise can be mitigated by staying well clear of this maximum.
The small deviation from the analytical solution in figure \ref{fig:burst_noise_adev}, suggesting an upwards trend at large $\tau$, is a typical so-called end-of-data error. As discussed above, the Allan deviation can only be estimated from a limited number of samples using equation \ref{eqn:adev_estimator}, and going to longer $\tau$ means there are fewer samples to average over.
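The dwell-time picture above can be turned into a simple generator: each level persists for an exponentially distributed time with mean $\bar\tau_1$ or $\bar\tau_0$, after which the signal switches. The sketch below is generic (it is not the \textit{qtt} implementation used for the figures):

```python
import numpy as np

def telegraph_signal(tau1, tau0, dt, n, rng):
    """Sample a two-level random telegraph signal (levels 0 and 1) on a
    uniform grid of n points with spacing dt. Dwell times are
    exponentially distributed with means tau1 (upper) and tau0 (lower)."""
    t, state = 0.0, 0
    edges, levels = [], []
    while t < n * dt:
        edges.append(t)                 # time at which `state` begins
        levels.append(state)
        t += rng.exponential(tau1 if state == 1 else tau0)
        state = 1 - state               # toggle between the two levels
    # sample the piecewise-constant signal at the grid points
    grid = np.arange(n) * dt
    idx = np.searchsorted(edges, grid, side="right") - 1
    return np.asarray(levels)[idx]
```

For $\bar\tau_1 = \bar\tau_0$ this reproduces random telegraph noise with duty cycles $D_i = \frac 1 2$, so for long records the sample mean approaches $\frac 1 2$ and the variance approaches $R_{xx}(0) = D_1 D_0 \Delta y^2 = \frac 1 4$ (with $\Delta y = 1$).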
\begin{figure}[ht]
\centering
\begin{subfigure}{0.8\linewidth}
\centering
\scalebox{1}{%
\input{images/burst_noise_amplitude.pgf}% data/simulations/sim_burst_noise.py
} % scalebox
\caption{Time domain}
\label{fig:burst_noise_time}
\end{subfigure}
\begin{subfigure}{0.8\linewidth}
\centering
\scalebox{1}{%
\input{images/burst_noise_psd.pgf}% data/simulations/sim_burst_noise.py
} % scalebox
\caption{Power spectral density}
\label{fig:burst_noise_psd}
\end{subfigure}
\begin{subfigure}{0.8\linewidth}
\centering
\scalebox{1}{%
\input{images/burst_noise_adev.pgf}% data/simulations/sim_burst_noise.py
} % scalebox
\caption{Allan deviation}
\label{fig:burst_noise_adev}
\end{subfigure}
\caption{Different representations of burst noise for different $\bar \tau_1$ and fixed $\bar \tau_0 = \qty{1}{\s}$.}
\label{fig:burst_noise_simulated}
\end{figure}
The burst noise equations can be used to gain further insight into other types of noise. The first is shot noise, which is commonly found in photodetectors and lasers. Here, electrons or photons are created at discrete instants, resulting in an instantaneous signal. This means that the lifetime of the upper level is very short in comparison to that of the lower level ($\bar \tau_1 \ll \bar \tau_0$), and equation \ref{eqn:burst_noise_psd} becomes:
\begin{align}
S_{Shot}(\omega) = S_{\bar \tau_1 \ll \bar \tau_0}(\omega) &= 4 \Delta y^2 \frac{\bar \tau_1}{\bar \tau_0} \frac{\frac{1}{\bar \tau_1}}{\left(\frac{1}{\bar \tau_1}\right)^2 + \omega^2}\nonumber\\
&= 4 \Delta y^2 \frac{1}{\bar \tau_0} \frac{1}{\frac{1}{\bar \tau_1^2}+\omega^2}\\
\overset{\omega \ll 1/\bar \tau_1}&{\approx} 4 \Delta y^2 \frac{\bar \tau_1^2}{\bar \tau_0} = \text{const.}
\end{align}
Typically, a very large number of such events occurs. When not counting single events but rather observing a stream, the relation $\omega \ll 1/\bar \tau_0 \ll 1/\bar \tau_1$ holds, and the result is a white spectrum, since $S_{Shot}(\omega)$ is then constant with respect to $\omega$, just as observed in photodetectors and lasers.
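This limit can be checked numerically by evaluating equation \ref{eqn:burst_noise_psd} for strongly asymmetric, arbitrarily chosen lifetimes; well below the corner frequency $1/\bar\tau_1$ the spectrum is flat to within about one percent:

```python
# Arbitrary example lifetimes with tau1 << tau0 (units of seconds)
tau1, tau0, dy = 1e-3, 1.0, 1.0
Rxx0 = dy**2 * tau1 * tau0 / (tau1 + tau0) ** 2  # R_xx(0) = dy^2 * D1 * D0

def S(omega):
    """One-sided burst noise PSD (Lorentzian form)."""
    a = 1.0 / tau1 + 1.0 / tau0
    return 4.0 * Rxx0 * a / (a**2 + omega**2)

# Both frequencies are far below the corner at ~1/tau1 = 1000 rad/s,
# so the ratio should be close to one (a "white" plateau).
flatness = S(1.0) / S(100.0)
```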
The other interesting occurrence is a case where many trap sites with different time constants are contributing to the noise. This can change the shape of the spectrum from $f^{-2}$ to $f^{-1}$ and is discussed in the next section.
\clearpage
\subsubsection{Flicker Noise}%
\label{sec:flicker_noise}
Flicker noise is also called $\frac 1 f$-noise and can be observed in many naturally occurring phenomena. Its origin is not clear, even though many explanations have been proposed; an overview can be found in \cite{flicker_noise_overview, flicker_noise_overview2, origins_1_f_noise}. This section concentrates on flicker noise in electronic devices. In thick-film resistors, for example, it was shown to extend over at least 6 decades without any visible flattening \cite{1_f_noise_thick_film}. In transistors, flicker noise is caused by the generation-recombination or burst noise discussed in the previous section \cite{origins_1_f_noise}. If many uncorrelated trap sites contribute to the total noise, the envelope of the noise spectral density changes from $\frac{1}{f^2}$ to $\frac{1}{f^1}$, as shown in figure \ref{fig:flicker_noise_evelope}.
\begin{figure}[hb]
\centering
\input{images/flicker_noise_envelope.pgf}% data/simulations/sim_flicker_noise_envelope.py
\caption{Multiple overlapping Lorentzian noise sources forming a $\frac 1 f$-like shape.}
\label{fig:flicker_noise_evelope}
\end{figure}
Given that no trap site can store an electron indefinitely, the number of trap sites $N$ with a certain time constant $\frac 1 2 \bar \tau = \bar \tau_0 = \bar \tau_1$ must decline when going to longer time scales. Assuming $N$ is inversely proportional to the time constant $\bar \tau$
\begin{equation}