12 steps - From idea to discovery


The “Standard Model” of particle physics is a set of equations that describes accurately how a small number of fundamental particles interact, and how they form visible matter. But we know that this model must be incomplete, since it does not account for observations such as dark matter or dark energy. Therefore, physicists formulate new, often competing, theories to explain these observations, sometimes predicting new particles with large masses or other phenomena.

Producing new particles requires a sufficient amount of energy to be crammed into a tiny space, where the energy (‘E’) converts into particle mass (‘m’) in accordance with Einstein’s equation E = mc². In a collider this is achieved by smashing two particles together. Since producing particles with larger masses requires higher collision energies, physicists seek to build more and more powerful accelerators.
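As a rough illustration of the mass-energy relation, the sketch below converts a mass in kilograms into its rest energy in GeV, the unit commonly used in particle physics. The function name and the choice of the proton as an example are assumptions made for illustration; they do not come from the text above.

```python
# Physical constants (exact SI values since the 2019 redefinition).
SPEED_OF_LIGHT = 299792458.0        # metres per second
JOULES_PER_EV = 1.602176634e-19     # joules in one electronvolt

# Approximate proton mass, used here only as an illustrative input.
PROTON_MASS_KG = 1.67262e-27


def rest_energy_gev(mass_kg: float) -> float:
    """Return the rest energy E = m * c**2 of a mass, expressed in GeV."""
    energy_joules = mass_kg * SPEED_OF_LIGHT ** 2
    return energy_joules / JOULES_PER_EV / 1e9


proton_rest_energy = rest_energy_gev(PROTON_MASS_KG)
print(f"Proton rest energy: {proton_rest_energy:.3f} GeV")
```

Read in the other direction, the same relation gives the minimum energy that a collision must make available to create a particle of a given mass.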



Producing new particles is only part of the story – they also have to be observed by a detector. To understand precisely how new particles could show up and how the detectors need to perform, scientists simulate a very large number of collisions. These simulations are based both on the known laws of physics (the “Standard Model”) and on the predictions of the new model(s) being explored.

A software programme simulates the particle collisions, the subsequent production of other particles and their possible decay into lighter particles. It also simulates the particle tracks, all interactions with the materials of the detector and the response of the readout electronics. Finally, a reconstruction programme analyses how the signal would have been observed in this (hypothetical) detector, in the same way as for a real event. Many such simulations are needed to optimise the geometry and material composition of the detector before it can start to be constructed.



Planning, designing, testing and constructing accelerators and detectors can take 20 years or more. For example, the planning for the Large Hadron Collider (LHC) started in 1984 and its construction was completed in 2008. The 27-km-long circular accelerator features two counter-rotating proton beams that are each accelerated to an energy of 6.5 TeV and then collide at 4 interaction points.
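The beam energy quoted above can be turned into two derived numbers with simple arithmetic. The sketch below assumes two beams of equal energy colliding head-on (ignoring any small crossing angle) and uses an approximate proton rest energy; the function names are hypothetical.

```python
# Approximate proton rest energy in GeV (an assumed input, not from the text).
PROTON_REST_ENERGY_GEV = 0.938272


def centre_of_mass_energy_tev(beam_energy_tev: float) -> float:
    """Collision energy for two equal-energy beams meeting head-on."""
    return 2.0 * beam_energy_tev


def lorentz_factor(beam_energy_tev: float) -> float:
    """Ratio of a proton's total energy to its rest energy."""
    return beam_energy_tev * 1000.0 / PROTON_REST_ENERGY_GEV


collision_energy = centre_of_mass_energy_tev(6.5)
gamma = lorentz_factor(6.5)
print(f"Collision energy: {collision_energy} TeV, Lorentz factor: {gamma:.0f}")
```

Under these assumptions, each proton carries several thousand times its rest energy, and the two beams together provide twice the single-beam energy to the collision.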

In parallel, thousands of physicists form collaborations to plan and build detectors that measure the particles produced in the collisions in the search for new phenomena. At the LHC, two of these detectors, ATLAS and CMS, are huge general-purpose detectors that are based on different designs and are run by separate teams, each comprising more than 3000 physicists. Two other detectors at the LHC are more specialised: LHCb, which studies the symmetry between matter and antimatter, and ALICE, which investigates the properties of the primordial state of quarks and gluons.



The production of very massive particles is a rare event and happens typically less than once per billion collisions. Therefore, the LHC was designed to produce more than 1 billion collisions per second. The total number of collisions is often expressed in units of “inverse femtobarns” (1 fb⁻¹ represents ~80,000 billion collisions).
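The link between inverse femtobarns and event counts is a multiplication: the expected number of events equals the cross-section of a process times the integrated luminosity. The sketch below assumes a total proton-proton cross-section of about 80 millibarns, which is the value implied by the figure of ~80,000 billion collisions per fb⁻¹; the rare-process cross-section of 50 fb is purely hypothetical.

```python
FEMTOBARNS_PER_MILLIBARN = 1e12  # 1 mb = 10**12 fb


def expected_events(cross_section_fb: float, luminosity_inv_fb: float) -> float:
    """Expected event count: cross-section (fb) times integrated luminosity (fb^-1)."""
    return cross_section_fb * luminosity_inv_fb


# Assumed total cross-section of ~80 mb, expressed in femtobarns.
total_cross_section_fb = 80.0 * FEMTOBARNS_PER_MILLIBARN
collisions_per_inv_fb = expected_events(total_cross_section_fb, 1.0)

# Hypothetical rare process with a 50 fb cross-section in 100 fb^-1 of data.
rare_events = expected_events(50.0, 100.0)

print(f"Collisions in 1 fb^-1: {collisions_per_inv_fb:.1e}")
print(f"Rare-process events in 100 fb^-1: {rare_events:.0f}")
```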



A collision creates thousands of particles that pass through a detector, which surrounds the collision point. Detectors are like huge 3-dimensional digital cameras consisting of about 100 million sensors, which are organised in several layers and contribute information about the position or the energy of a particle. Charged particles traversing a sensor in a “tracker” produce a small electrical signal, which is then amplified and recorded together with the position of the sensor. A strong magnetic field bends the trajectories of charged particles, allowing their ‘momentum’ (the product of mass and velocity) to be measured. The energy of neutral and charged particles is measured in calorimeters, which are arranged in several layers outside the tracker.
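The momentum measurement from track bending can be sketched with the standard relation for a charged particle in a uniform magnetic field: the momentum component perpendicular to the field, in GeV/c, is about 0.3 times the field in tesla times the radius of curvature in metres, for a particle of unit charge. This relation is not stated in the text above, and the field strength and radius used below are hypothetical values chosen for illustration.

```python
# Conversion factor for momentum in GeV/c, field in tesla, radius in metres.
# It is the speed of light in m/s divided by 10**9.
GEV_PER_TESLA_METRE = 0.299792458


def transverse_momentum_gev(field_tesla: float, radius_m: float,
                            charge: float = 1.0) -> float:
    """Momentum perpendicular to the field for a track of given bending radius."""
    return GEV_PER_TESLA_METRE * abs(charge) * field_tesla * radius_m


# Hypothetical track: 10 m bending radius in a 2 T field.
example_pt = transverse_momentum_gev(field_tesla=2.0, radius_m=10.0)
print(f"Transverse momentum: {example_pt:.2f} GeV/c")
```

The stiffer the track (the larger the radius), the higher the momentum, which is why high-momentum particles need strong fields and precise position measurements.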



Protons consist of point-like constituents called quarks and gluons. When two protons collide, the probability of a close encounter of these constituents is very small. On the other hand, only these short-range interactions can produce very massive particles. Such events are characterised by a large number of particles spraying out in all directions and leaving signals in many different layers of the detector. An ultra-rapid filtering procedure selects these ‘interesting’ events from millions of collisions in which two protons traverse each other almost undisturbed.

In the first step, very fast electronic circuits check whether large amounts of energy are deposited in the outer layers of the detector; thousands of online computers then assimilate and synchronise information from different parts of the detector with a view to finding particles such as high-energy muons, electrons or photons.
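The two-stage filtering can be pictured as two successive tests applied to each event. The following is a toy sketch only: the `Event` class, its fields and both thresholds are hypothetical, and a real trigger system is far more elaborate.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds, chosen only to make the example concrete.
LEVEL1_ENERGY_THRESHOLD_GEV = 20.0
HIGH_LEVEL_ENERGY_THRESHOLD_GEV = 25.0


@dataclass
class Event:
    outer_energy_gev: float                          # energy seen in the outer layers
    candidates: list = field(default_factory=list)   # (kind, energy in GeV) pairs


def passes_level1(event: Event) -> bool:
    """First step: fast check for a large energy deposit in the outer layers."""
    return event.outer_energy_gev >= LEVEL1_ENERGY_THRESHOLD_GEV


def passes_high_level(event: Event) -> bool:
    """Second step: look for a high-energy muon, electron or photon."""
    return any(kind in ("muon", "electron", "photon")
               and energy >= HIGH_LEVEL_ENERGY_THRESHOLD_GEV
               for kind, energy in event.candidates)


def select_events(events):
    """Keep only the events that pass both steps, in order."""
    return [e for e in events if passes_level1(e) and passes_high_level(e)]


example_events = [
    Event(5.0, [("muon", 40.0)]),       # fails the first step
    Event(50.0, [("photon", 10.0)]),    # fails the second step
    Event(60.0, [("electron", 30.0)]),  # kept
]
kept = select_events(example_events)
print(f"Kept {len(kept)} of {len(example_events)} events")
```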

The selected data, typically 1000 “raw” events per second (equivalent to about 1 GB), are then transferred to the CERN data centre and written to mass storage, resulting in a data volume of several tens of thousands of terabytes (TB) per year per detector.
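The orders of magnitude quoted for the data flow can be checked with a short calculation. The sketch below assumes decimal units (1 TB = 1000 GB) and a hypothetical 10 million seconds of data taking per year; neither assumption comes from the text, and the result is only meant to show the scale.

```python
def event_size_mb(rate_gb_per_s: float, events_per_s: float) -> float:
    """Average size of one raw event in megabytes."""
    return rate_gb_per_s * 1000.0 / events_per_s


def yearly_volume_tb(rate_gb_per_s: float, live_seconds_per_year: float) -> float:
    """Raw data volume written per year, in terabytes."""
    return rate_gb_per_s * live_seconds_per_year / 1000.0


# 1000 events per second at about 1 GB per second, as quoted above.
size_mb = event_size_mb(rate_gb_per_s=1.0, events_per_s=1000.0)

# Assumed live time: 10 million seconds of data taking per year.
volume_tb = yearly_volume_tb(rate_gb_per_s=1.0, live_seconds_per_year=1e7)

print(f"Event size: {size_mb:.1f} MB, yearly volume: {volume_tb:.0f} TB")
```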



Each “raw” event contains all the information recorded in the different detector layers. In the event reconstruction, these raw data are transformed into physical quantities. Charged particle tracks are identified and their parameters (direction, curvature) are determined; energy deposits in the calorimeters are summed up; the information from the different layers is combined to reconstruct physics objects such as photons, electrons, muons or particle jets. By adding the energy of all measured particles together and comparing it with the collision energy, the ‘missing energy’ of a particle that has escaped detection (e.g. a neutrino) can be inferred.
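The idea of inferring an undetected particle from an imbalance can be sketched as a vector sum. In practice, at a proton collider the balance is usually checked in the plane perpendicular to the beams, because the colliding quarks and gluons carry an unknown fraction of the proton energy; this refinement is an addition to the description above. The particle representation used here, a pair of transverse momentum and azimuthal angle, is a simplifying assumption.

```python
import math


def missing_transverse_momentum(particles):
    """Return (magnitude, azimuthal angle) of the momentum imbalance.

    `particles` is an iterable of (transverse momentum in GeV, angle in radians).
    The missing momentum points opposite to the vector sum of what was measured.
    """
    sum_px = sum(pt * math.cos(phi) for pt, phi in particles)
    sum_py = sum(pt * math.sin(phi) for pt, phi in particles)
    magnitude = math.hypot(sum_px, sum_py)
    angle = math.atan2(-sum_py, -sum_px)
    return magnitude, angle


# Hypothetical event: 50 GeV in one direction, only 30 GeV opposite to it.
measured = [(50.0, 0.0), (30.0, math.pi)]
met, met_phi = missing_transverse_momentum(measured)
print(f"Missing transverse momentum: {met:.1f} GeV")
```

In this example 20 GeV is unaccounted for, pointing along the side with the smaller measured momentum, as would be the case if a neutrino had escaped in that direction.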

The event reconstruction is typically repeated several times as the scientists’ understanding of the detector improves, so that the most precise measurements of the particles’ tracks and energy deposits are available. The event is then classified according to its physics characteristics and recorded as “event summary data” for further stages of analysis. Event reconstruction makes heavy use of computing time and is mainly done using the LHC Computing Grid.



The position and energy measurements provided by the detector layers must be precisely calibrated before their output can be interpreted as physics signals. Initial calibrations are based on test-beam measurements and are then continuously refined using real collision data. The position of the sensors in the tracker layers is calibrated by comparing (and correcting) the predicted and measured positions of a particle travelling through them. The energy scales of the calorimeters are calibrated by using the decay products of well-known particles. The calibration data are used in the reconstruction programmes to transform the measured signals into physical quantities.
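A heavily simplified sketch of an energy-scale calibration follows. It assumes, as an example not named in the text, that the well-known particle is the Z boson (mass about 91.19 GeV) decaying into two particles measured in the calorimeter, and it uses the median of the reconstructed masses as a stand-in for a proper fit to the peak. The sample values are invented for illustration.

```python
from statistics import median

# Approximate Z boson mass in GeV, used as the reference (assumed example).
Z_BOSON_MASS_GEV = 91.19


def energy_scale_correction(measured_masses_gev,
                            reference_mass_gev: float = Z_BOSON_MASS_GEV) -> float:
    """Factor by which measured energies should be multiplied.

    If both decay products are mis-measured by the same factor, the
    reconstructed mass is off by that factor too, so the ratio of the
    reference mass to the observed peak gives the correction.
    """
    return reference_mass_gev / median(measured_masses_gev)


# Invented sample of reconstructed masses from a slightly low energy scale.
measured_peak_sample = [89.0, 89.4, 89.6, 89.2, 89.8]
correction = energy_scale_correction(measured_peak_sample)
print(f"Multiply calorimeter energies by {correction:.4f}")
```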



The physics analysis starts with the event summary data, which contain complete information about the reconstructed event characteristics (e.g. number of electrons, muons, photons, jets). The analysis programmes consist of a set of selection criteria, and their goal is to search for specific patterns among the detected particles by calculating a set of derived physical quantities for each event.

Let us consider the search for an as yet unknown particle decaying into two high-energy photons. For all events containing two (or more) photons above a certain energy threshold, and assuming that the photons come from the decay of a parent particle, the ‘invariant’ mass of the parent particle in its rest frame is calculated from the measured position and energy of the photons (see formula on the left). The calculation is repeated for each combination of two photons. The results for all selected events are entered into a histogram showing how many times a certain value of the calculated mass occurred.
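The two-photon calculation can be sketched as follows. For massless photons with energies E1 and E2 and an opening angle θ between their directions, the invariant mass m satisfies m² = 2·E1·E2·(1 − cos θ). The photon representation (energy, polar angle, azimuthal angle), the function names and the example values are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def direction(theta: float, phi: float):
    """Unit vector for a polar angle theta and azimuthal angle phi (radians)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def diphoton_mass(photon1, photon2) -> float:
    """Invariant mass of two photons given as (energy in GeV, theta, phi)."""
    (e1, theta1, phi1), (e2, theta2, phi2) = photon1, photon2
    cos_opening = sum(a * b for a, b in zip(direction(theta1, phi1),
                                            direction(theta2, phi2)))
    return math.sqrt(max(0.0, 2.0 * e1 * e2 * (1.0 - cos_opening)))


def fill_histogram(events, min_energy_gev: float, bin_width_gev: float) -> Counter:
    """Count photon-pair masses per bin, keyed by the lower bin edge in GeV."""
    histogram = Counter()
    for photons in events:
        selected = [p for p in photons if p[0] >= min_energy_gev]
        for p1, p2 in combinations(selected, 2):
            bin_index = int(diphoton_mass(p1, p2) // bin_width_gev)
            histogram[bin_index * bin_width_gev] += 1
    return histogram


# Hypothetical event: two back-to-back 62.5 GeV photons plus one soft photon.
example_event = [(62.5, math.pi / 2, 0.0), (62.5, math.pi / 2, math.pi),
                 (5.0, 1.0, 1.0)]
mass = diphoton_mass(example_event[0], example_event[1])
hist = fill_histogram([example_event], min_energy_gev=20.0, bin_width_gev=2.0)
print(f"Mass of the hard pair: {mass:.1f} GeV, filled bins: {dict(hist)}")
```

The soft photon falls below the energy threshold, so only one pair enters the histogram for this event.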



A particle decaying into a pair of photons appears as a ‘bump’ in the two-photon mass distribution and may be spread over several mass intervals, depending on the lifetime of the decaying particle and the measurement resolution.

Such a bump usually sits on top of a smooth distribution stemming from background processes that can also produce two (or more) photons. It is important to understand the differences between simulated and experimental data when looking for signs of new particles. For each analysis, the number of simulated events must be comparable with (or even larger than) the number of analysed events, to ensure that statistical fluctuations are not dominated by uncertainties in the background distributions.



Before the discovery of a new particle can be claimed, scientists must make sure that the excess of events on top of the background distribution has a sufficiently high statistical significance. If we take the example of the search for a particle decaying into two photons, the number of events from the background, i.e. all known processes producing two photons, is determined for each interval of the histogram representing the invariant mass distribution. This number (N) has a statistical uncertainty (sigma) equal to the square root of N, provided that N is significantly larger than 1. To put it very simply, if 100 events are expected from the background in a given mass interval, one sigma would be equal to 10. The number of events actually observed is then compared with the expected number of events and the difference is expressed in sigma, giving the statistical significance. For example, if 150 events are observed, the excess number compared to the background is 50 and the statistical significance is 50/10, i.e. 5 sigma. A “discovery” is claimed when the statistical significance of the excess (the “bump”) is at least 5 sigma. This means that the probability of the bump in the data being due to a statistical fluke of the background is only 3 in 10 million.
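The arithmetic of this example can be reproduced in a few lines. The sketch uses the simplified rule described above (uncertainty equal to the square root of the expected background) and converts the significance into a one-sided probability under the assumption of a Gaussian distribution; real analyses use more refined statistical treatments.

```python
import math


def significance_sigma(observed: float, expected_background: float) -> float:
    """Excess over the background in units of sqrt(expected background)."""
    return (observed - expected_background) / math.sqrt(expected_background)


def one_sided_p_value(n_sigma: float) -> float:
    """Probability that a Gaussian fluctuation is at least n_sigma upwards."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


n_sigma = significance_sigma(observed=150, expected_background=100)
p_value = one_sided_p_value(n_sigma)
print(f"Significance: {n_sigma:.1f} sigma, p-value: {p_value:.2e}")
```

With 150 observed events on an expected background of 100, this gives 5 sigma and a probability of roughly 3 in 10 million, matching the figures in the text.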



If a new particle is discovered, many other questions arise. What are the properties of the particle, such as its mass and width (a characteristic linked to its lifetime)? What is its production rate? Can the spin of the particle be inferred from the angular distribution of the decay products? Is the particle produced alone or in association with another particle? Is the same particle observed decaying into other particles (such as the Higgs boson, which was first observed both in the two-photon spectrum and through its four-lepton decay)? Has an independent experiment observed the same particle with parameters that are statistically compatible? Which theoretical model fits the observations best? Are there theoretical models excluded by the observations? Are further experiments or even bigger accelerators needed to answer these questions? All this analysis work will eventually lead to greater knowledge, new models, new predictions – and new experiments.
