A popularity prediction and dynamic data replication study for the ATLAS distributed data management / by Thomas Beermann. Wuppertal, July 5, 2017
Table of contents
Abstract
Acknowledgements
Contents
I Introduction
1 The LHC, ATLAS and the WLCG
1.1 General
1.1.1 CERN and LHC
1.1.2 The ATLAS Experiment
1.1.3 Trigger
1.1.4 WLCG
2 ATLAS Computing Model and Systems
2.1 ATLAS Computing Model
2.1.1 Data Types
2.1.2 Cloud & Tier Structure
2.2 ATLAS Computing Systems
2.2.1 DDM
2.2.2 Automated Data Replication
2.2.3 Space Tokens
2.2.4 Data Retention
2.2.5 Data Replication Policies
2.2.6 WMS
2.2.7 Tracer
2.2.8 Popularity System
2.2.9 Victor
2.2.10 PD2P
3 Motivation
II Prediction
4 Popularity Prediction
4.1 Introduction
4.1.1 Time Series
4.1.2 Time Series Prediction
4.2 System Design
4.2.1 Input Data
4.3 Prediction Methods
4.3.1 Static Prediction
4.3.2 Linear Prediction
4.4 Neural Network Prediction
4.4.1 Basics
4.4.2 Training
4.4.3 Dataset Access Prediction
4.4.4 Neural network separation
4.5 Hybrid Solution
4.5.1 Prefiltering
4.5.2 Hybrid Prediction
4.6 Implementation
4.6.1 Python neural networks initialisation
4.6.2 Scale Down/Up
4.6.3 Training
4.6.4 Prediction
5 Prediction Evaluation
5.1 Prediction Input Data
5.2 Data Access Pattern Examples
5.2.1 Highly popular project / datatypes
5.2.2 Popular project / datatypes
5.2.3 Unpopular project / datatypes
5.3 Evaluation
5.3.1 Evaluation Metric
5.3.2 Static Prediction
5.3.3 Linear Prediction
5.3.4 Neural Networks
5.3.5 Summary
5.3.6 Hybrid Prediction
5.4 Conclusion
III Redistribution
6 Data Redistribution
6.1 Introduction
6.1.1 Example
6.2 Requirements
6.2.1 Input Data
6.2.2 Profits
6.2.3 Constraints
6.2.4 Costs
6.3 Redistribution Algorithm
6.3.1 Replica Creation
6.3.2 Replica Deletion
6.3.3 Combination
6.3.4 Example
6.4 Implementation
6.4.1 Site Module
6.4.2 Data Catalogue
6.4.3 Redistribution
7 Grid Simulator
7.1 Motivation
7.2 Requirements and Design
7.3 Architecture / Components
7.3.1 Sites
7.3.2 DDM
7.3.3 WMS
7.3.4 Inputs
7.3.5 Outputs
7.3.6 Workflow
7.4 Implementation
7.4.1 Simulation Framework
7.4.2 Handling of Jobs and Sites
7.4.3 Data Catalogue
7.4.4 Transfers and Deletions during a simulation
7.4.5 Adding Workload to the System
7.4.6 Site Utilisation
7.4.7 Break and Restart Simulation
7.4.8 Example
7.4.9 Summary
8 Redistribution Evaluation
8.1 Introduction
8.2 Simulation Setup
8.2.1 Sites Setup
8.2.2 Workload Setup
8.2.3 Replica Catalogue
8.2.4 Simulation Mode
8.2.5 Redistribution
8.3 Simulation Parameters
8.3.1 Workload
8.3.2 Sites
8.3.3 Replicas
8.4 Results
8.4.1 First Week
8.4.2 Second Week
8.4.3 Third Week
8.5 Summary
IV Summary
9 Conclusion and Outlook
9.1 Summary
9.2 Outlook
9.2.1 Open Questions
9.2.2 Further developments
V Appendix
List of Figures
List of Tables
10 Code Samples
10.1 Prediction
10.2 Redistribution
10.3 Simulator
Reference