Particle physics data taking and analysing - a 'quick' overview by freyablekman

· @freyablekman · (edited)
One of the questions I get most often as a particle physicist is how we collect data and then analyse it. So in this post I will briefly walk through the process from collision to plot. For each step there are many details that I will gloss over, sacrificing detail for the big picture. In the future I will revisit each of these topics in dedicated posts.

### LHC data taking

![IMG_2162.jpg](https://steemitimages.com/DQmc7LJW5fthwj2JZUWhehP9uuwfcGtZ4wxa2aLG3pCCeCF/IMG_2162.jpg)
_Data taking in action. This is my workstation when I am in charge of data taking at the CMS experiment. Source: F. Blekman_

The goal of particle physics experiments at the Large Hadron Collider is to reproduce, in the most controlled way possible, what happens when two protons collide. An individual collision is not particularly special: most proton collisions are extremely well understood, and in general we are only interested in the special cases where something _rare_ happens. The rarer the process, the more collisions are needed to see _consistently_ that something new is happening. So the technological challenge for the LHC accelerator physicists is to produce as many collisions as possible. For example, in last year's run the LHC was colliding clouds of protons (we call those bunches) 40 million times per second, and in each crossing of two bunches about 30 actual proton-proton collisions occurred.
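To get a feeling for these numbers, here is a minimal Python sketch of the arithmetic. The crossing rate and the collisions per crossing are the figures quoted above. The per-collision probability of the "rare" process is a made-up illustrative value, not a measured one.

```python
# Figures quoted in the post.
BUNCH_CROSSINGS_PER_SECOND = 40_000_000
COLLISIONS_PER_CROSSING = 30

# Hypothetical value, chosen only to illustrate the scaling.
ILLUSTRATIVE_RARE_PROBABILITY = 1e-12


def collisions_per_second(crossings_per_second: int, collisions_per_crossing: int) -> int:
    """Total number of proton-proton collisions per second."""
    return crossings_per_second * collisions_per_crossing


def mean_seconds_between_rare_events(collision_rate: float, probability: float) -> float:
    """Average waiting time for one occurrence of a process that happens
    with the given probability in each collision."""
    return 1.0 / (collision_rate * probability)


if __name__ == "__main__":
    rate = collisions_per_second(BUNCH_CROSSINGS_PER_SECOND, COLLISIONS_PER_CROSSING)
    wait = mean_seconds_between_rare_events(rate, ILLUSTRATIVE_RARE_PROBABILITY)
    print(f"{rate:,} proton-proton collisions per second")
    print(f"one rare event roughly every {wait / 60:.0f} minutes on average")
```

Halving the probability doubles the average waiting time, which is why rarer processes need more collisions.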

The rest of the protons in each bunch continue around the ring and get another chance to collide at one of the four LHC collision points. In each proton-proton collision, tens to thousands of new particles are created from the kinetic energy carried by the protons.

The number of collisions per second is quantified by the _luminosity_. Experimental particle physicists typically talk about integrated luminosity, that is, the number of collisions accumulated over a certain time, for example a day, a month or a year.
![luminosity during the 2016 data taking](https://cms-service-lumi.web.cern.ch/cms-service-lumi/publicplots/int_lumi_per_day_cumulative_pp_2016_alt.png)_Total integrated luminosity (y-axis, a measure of the number of proton collisions) collected by the CMS experiment in 2016 as a function of the date (x-axis). Source: [CMS experiment luminosity public results](https://twiki.cern.ch/twiki/bin/view/CMSPublic/LumiPublicResults#2016_proton_proton_13_TeV_collis)_
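Integrated luminosity is useful because it converts directly into an expected number of events: multiply it by the cross section of a process (a measure of how likely that process is). The sketch below shows this standard relation, N = cross section × integrated luminosity. The cross section and luminosity values in it are purely illustrative assumptions, not numbers from this post.

```python
PB_INV_PER_FB_INV = 1000.0  # 1 inverse femtobarn = 1000 inverse picobarn


def expected_events(cross_section_pb: float, integrated_luminosity_fb_inv: float) -> float:
    """Expected number of produced events, N = sigma * L_int.

    cross_section_pb: cross section of the process in picobarn.
    integrated_luminosity_fb_inv: integrated luminosity in inverse femtobarn.
    """
    return cross_section_pb * integrated_luminosity_fb_inv * PB_INV_PER_FB_INV


if __name__ == "__main__":
    # Illustrative values only: a 1 pb process and 40 fb^-1 of data.
    n = expected_events(cross_section_pb=1.0, integrated_luminosity_fb_inv=40.0)
    print(f"expected events produced: {n:,.0f}")
```

This counts events that are produced. The number that ends up in a plot is smaller, because detectors and selection algorithms are never 100% efficient.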

### Selecting interesting collisions

Considering the _HUGE_ number of collisions, the next challenge is two-fold: how do we record these collisions, and how do we store them? For the recording, we use dedicated detectors with around 100 million readout channels that measure the particles produced in the collision. Each of those 100 million mini detectors is sensitive to some particles, and they are placed in a clever way so that, whatever direction the particles are produced in, they will cross some of the detector elements. Having so many detection elements is great, as it gives us very detailed knowledge of all the particle energies and trajectories. It also means that storing each collision takes about 2 MB, even after zero-suppression and other data reduction.

You can imagine that recording 40 million collisions per second becomes a problem very quickly: that is 80 terabytes per second. Storing all of those data is not feasible, as we simply cannot afford the disk space. So the trick we use in particle physics is a selection algorithm called a _trigger_ that picks out interesting collisions on the fly. All other collisions are discarded. In the end we typically save between 500 and 1000 collisions per second. This number is driven by how much storage we can afford from our budgets, which come from [national funding agencies and governments all over the world](https://home.cern/about/member-states).
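The data-rate arithmetic and the idea of a trigger can be sketched in a few lines of Python. The event size and rates are the ones quoted above. The `toy_trigger` function, the event layout (a dict with an `energies_gev` list) and the threshold are hypothetical, and only illustrate the principle of keeping or discarding each collision as it streams past. Real triggers are far more sophisticated.

```python
from typing import Dict, Iterable, Iterator

EVENT_SIZE_BYTES = 2_000_000       # about 2 MB per collision (from the post)
CROSSINGS_PER_SECOND = 40_000_000  # 40 million per second (from the post)


def raw_rate_terabytes_per_second(events_per_second: int, event_size_bytes: int) -> float:
    """Data rate if every collision were written out."""
    return events_per_second * event_size_bytes / 1e12


def rejection_factor(input_rate: float, output_rate: float) -> float:
    """How many collisions are discarded for each one that is kept."""
    return input_rate / output_rate


def toy_trigger(events: Iterable[Dict], threshold_gev: float) -> Iterator[Dict]:
    """Hypothetical trigger: keep an event only if its most energetic
    particle is at or above the threshold. Events are processed one at a
    time, so nothing that fails the selection is ever stored."""
    for event in events:
        if max(event["energies_gev"], default=0.0) >= threshold_gev:
            yield event


if __name__ == "__main__":
    print(raw_rate_terabytes_per_second(CROSSINGS_PER_SECOND, EVENT_SIZE_BYTES), "TB/s")
    print(rejection_factor(CROSSINGS_PER_SECOND, 1000), "discarded per kept event")
    stream = [
        {"id": 1, "energies_gev": [3.0, 5.5]},
        {"id": 2, "energies_gev": [42.0, 1.2]},
        {"id": 3, "energies_gev": []},
    ]
    print([event["id"] for event in toy_trigger(stream, threshold_gev=20.0)])
```

Writing the trigger as a generator mirrors the on-the-fly nature of the real thing: the decision is made per collision, without ever holding the full stream.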

![cables part of the data acquisition system of the CMS experiment](https://steemitimages.com/DQmfBqRNFgst1R3AiCo6i9BtZGtmagtRqYA6GmrqGJHkbDr/FRY7kpATSyiJyoMfXxxO7A.jpg)_Cables that are part of the data acquisition system of the Compact Muon Solenoid (CMS) experiment at CERN. Photo by F. Blekman_

### Offline statistical analysis and an example of discovery

Once the data is on disk, the fun starts! These datasets are so large that one needs dedicated tools to access them. At the LHC we use a distributed computing system called the [LHC Computing Grid](http://wlcg-public.web.cern.ch/), which gives any of the thousands of experimental physicists at the LHC experiments access to hundreds of thousands of CPUs to analyse the *%@^#&% out of this data. And this is necessary because, even after all the data reduction, most of the things we are looking for are extremely rare.

For example, the gif below shows the subset of collisions in the full dataset of 2010 and 2011 in which four high-energy electrons or muons were produced. It is one of the most famous plots from the [discovery of the Higgs boson](http://cms.web.cern.ch/news/observation-new-particle-mass-125-gev). Out of the billions and billions of collisions produced by the accelerator, and after running algorithms on the grid (yes, including machine learning) optimised to find electrons and muons, only a few hundred collisions are left. Those collisions are mostly due to known and well-understood processes that happen in proton-proton collisions. These processes are also simulated for comparison, and are shown as the blue and green distributions. And then, if you take enough data, you see additional structure appear _when you look at the right distribution_. In this case the distribution shown is the combined mass of the four leptons, calculated from their energies and momenta (physicists call this the invariant mass).


![Higgs discovery](https://steemitimages.com/DQmWXYP5AhUjRcsenfbzVNeBcbcN9yQ7PDRt7N6wBjCg76j/CMS_HZZ4l_animated.gif)_Seeing the LHC data come in for the discovery of the Higgs boson using the data collected in 2010 and 2011. [Source: CMS collaboration (incl. F. Blekman)](http://cms.web.cern.ch/news/observation-new-particle-mass-125-gev)_
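The quantity on the x-axis of that plot, the total mass of the four leptons, can be computed from the measured energies and momenta with the standard relativistic formula m² = (ΣE)² − |Σp|² (in units where c = 1). Here is a minimal sketch. The four-vector layout `(E, px, py, pz)` and the example lepton values are assumptions made up for illustration, not real data.

```python
import math
from typing import Iterable, Tuple

# (E, px, py, pz) in GeV, using natural units where c = 1.
FourVector = Tuple[float, float, float, float]


def invariant_mass(particles: Iterable[FourVector]) -> float:
    """Invariant mass of a system of particles:
    m = sqrt((sum E)^2 - |sum p|^2)."""
    total_e = total_px = total_py = total_pz = 0.0
    for energy, px, py, pz in particles:
        total_e += energy
        total_px += px
        total_py += py
        total_pz += pz
    mass_squared = total_e ** 2 - (total_px ** 2 + total_py ** 2 + total_pz ** 2)
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(mass_squared, 0.0))


if __name__ == "__main__":
    # Made-up example: four effectively massless leptons flying apart
    # back-to-back in pairs, each carrying 31.25 GeV.
    leptons = [
        (31.25, 31.25, 0.0, 0.0),
        (31.25, -31.25, 0.0, 0.0),
        (31.25, 0.0, 31.25, 0.0),
        (31.25, 0.0, -31.25, 0.0),
    ]
    print(f"four-lepton mass: {invariant_mass(leptons):.1f} GeV")
```

If the four leptons come from the decay of a single particle, this number clusters around that particle's mass, which is what produces a peak in the histogram.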



This additional structure was then shown via statistical analysis not to be consistent with the background. Specifically, the probability that the blue background prediction alone would produce a peak like the one in the data (black dots) is about 1/3000. On the other hand, the data is consistent with the blue background _plus_ the red peak, which is the prediction of what the boson predicted by [Brout & Englert, Higgs, and others](https://en.wikipedia.org/wiki/Higgs_mechanism) (a.k.a. the Higgs particle) would look like! Very exciting, but first we did about half a year of cross checks. We also checked distributions of other kinds of collisions, for example those where two photons were produced, and there was a peak there at the same total mass! This is when things get very exciting. Of course we cross-checked our own results in every way we could, including with fully independent analyses, each optimised only on simulation before we decided to look at the data. It was an exciting and very nerve-racking time.
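Particle physicists usually quote such probabilities as a number of standard deviations ("sigma") of a normal distribution. A small sketch of that conversion, using only the Python standard library, is shown here. It assumes the one-sided convention, which is a choice made for this example rather than something stated in the post.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def sigma_from_p_value(p_value: float) -> float:
    """One-sided conversion: the number of standard deviations whose
    upper-tail probability equals p_value (requires 0 < p_value < 1)."""
    return _STANDARD_NORMAL.inv_cdf(1.0 - p_value)


def p_value_from_sigma(n_sigma: float) -> float:
    """One-sided tail probability beyond n_sigma standard deviations."""
    return _STANDARD_NORMAL.cdf(-n_sigma)


if __name__ == "__main__":
    print(f"p = 1/3000 corresponds to {sigma_from_p_value(1 / 3000):.1f} sigma")
    print(f"5 sigma corresponds to p = {p_value_from_sigma(5.0):.2e}")
```

With this convention a probability of 1/3000 comes out at a bit over 3 sigma, while the customary threshold for claiming a discovery is 5 sigma, a far smaller probability.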

In the end I was one of the physicists who went to the biennial _International Conference on High Energy Physics (ICHEP)_, where the new result was announced. An amazing memory that I will cherish my whole life. One of the other things I remember most is being extremely jet-lagged, as the conference was in Australia and I only arrived from Europe on the morning of the announcement ;-)

### What next?

Although the Higgs boson discovery was an exciting first step, just finding the Higgs boson was only one of the things the LHC was designed for. The LHC will also allow us to measure the actual properties of those bosons. After all, that little peak contained only about 20 collisions consistent with Higgs bosons, from 2 years of data taking, so for that we need lots and lots of data. As the data is so diverse, we can answer many other questions at the same time: study the subtle differences between matter and antimatter, search for new, as yet undiscovered particles, maybe see if we can produce the elusive _dark matter_ at the LHC, and understand whether the particles we know now are really fundamental. Answering even a part of any of these puzzles would be a paradigm shift in fundamental physics of Nobel-prize-winning proportions. I'm also a fan of making sure that we stay open to questions we have not thought of yet. Theoretical physicists such as @lemouth are always producing new predictions of what we could see in the data in the future. The LHC will [run for a very long time, the current plan is at least up to 2035](https://lhc-commissioning.web.cern.ch/lhc-commissioning/schedule/LHC-long-term.htm). And I expect to have enough data to stay busy until then, definitely!
@alignment · (edited)
Thanks @freyablekman for this very interesting article.
I am wondering: if the particles (protons) have the same energy and speed, why are there some special collisions?
@freyablekman · (edited)
That is the nature of quantum physics. Even if the initial conditions were exactly the same, every collision would have a number of possible outcomes (not just one, many!). The probabilities (chances) of those outcomes can be calculated with the physics theory we call the Standard Model, which predicts how often each kind of collision happens. The interesting collisions are really, really, really rare. But even 'normal' proton-proton collisions are all slightly different.
@alignment · (edited)
Do you mean statistical predictions, or are there known factors by which the nature of the collisions differs?
@itinerantartist · (edited)
wow that is quite a mouthful! loved seeing the process photos tho, and especially the animated gifs - they really help tell the story and give us an insight as to what you physicists are really up to on the daily :) 

i really need to start posting about my hyperloop work - but thinking a separate account is best. can you imagine going from art and poetry to hyperloop posts back and forth ?! haha:)

anyway maybe i need to make a song about higgs boson or quantum physics - stay tuned ;)
@freyablekman ·
I'd love to read about the hyperloop work, definitely! And let me know if you end up writing a song about Higgs bosons, but just to be aware: you have Nick Cave and others to compete with :)
@itinerantartist ·
haha will do! and omg songs exist about higgs boson already?!!? *immediately goes to google* :D
@launglilawangsa · (edited)
there is not from above the occurrence of most of the proton crash is very famous and in general we are only interested in special cases where something rare happens. So, one of the challenges is the less the process, the more collisions it takes to actually see that consistently something new and rare is happening. may be profits as well as others, I got interested to understand the lesson
@midlet ·
Very cool! In your opinion what do these discoveries mean for humanity? I know that's a broad question, but I guess I mean, is this about just having a better understanding of the universe and world around us or are there any applications for this knowledge that could disrupt things in any way?
@freyablekman · (edited)
The research we do is basic (fundamental) science. This means it produces knowledge, in my case the understanding of particles, the forces between particles, etc.

The goal is not to have applications beyond that, but of course they do happen by accident. A very good example is the World Wide Web, which was designed so particle physicists could share their results more efficiently. Of course it is much more difficult to say whether the specific discovery of the Higgs particle will have further consequences for humanity. But looking at history, the work on electromagnetism, quantum mechanics and relativity also took many decades before applications were even thought of. In the long run, most if not all modern physics discoveries eventually paid off. And of course the knowledge and understanding is amazing too. Humans are curious, so it is in our nature to try to answer questions as fundamental as the ones we work on at CERN :)
@midlet ·
Really interesting, thanks for the thoughtful reply :)
@pagoda ·
This was easier to understand than I thought 😆! TIL
@freyablekman ·
Happy to hear that! I think it is important to talk about the concepts and not to obfuscate with technicalities or jargon. I'm happy to read that you noticed :)