The Parallel Models of Coronal Polarization Brightness Calculation

1
The Parallel Models of Coronal Polarization
Brightness Calculation
  • Jiang Wenqian

2
Outline
  • Introduction
  • pB Calculation Formula
  • Serial pB Calculation Process
  • Parallel pB Calculation Models
  • Conclusion

3
Part I. Introduction
  • Space weather forecasting requires an accurate solar
    wind model of the solar atmosphere and interplanetary
    space. A global model of the corona and heliosphere
    is the basis of numerical space weather forecasting
    and provides the framework for interpreting the
    relevant observations.
  • Meanwhile, three-dimensional numerical
    magnetohydrodynamic (MHD) simulation is one of the
    most common numerical methods for studying the corona
    and the solar wind.

4
Part I. Introduction
  • In addition, converting the simulated coronal electron
    density into coronal polarization brightness (pB) is
    the key method of comparing the simulation with
    observations, and is therefore important for
    validating MHD models.
  • Because of the large data volume and the complexity of
    the pB model, the computation takes far too long to
    visualize the pB data in near real time on a single
    CPU (or core).

5
Part I. Introduction
  • Taking the characteristics of the CPU/GPU computing
    environment into account, we analyze the pB conversion
    algorithm, implement two parallel models of pB
    calculation with MPI and CUDA, and compare the
    efficiency of the two models.

6
Part II. pB Calculation Formula
  • pB arises from photospheric radiation scattered by
    coronal electrons. It can be used in the inversion of
    the coronal electron density and to validate numerical
    models. Taking limb darkening into account, the pB
    formula for a small coronal volume element is as
    follows:

  • (1)

  • (2)

  • (3)
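  • For reference, a commonly used form of these
    expressions, after Billings (1966), is sketched below.
    The differential scattering cross-section \sigma, the
    mean disk radiance I_\odot, the limb-darkening
    coefficient u, the scattering angle \chi, and the
    coefficients A(r), B(r) with \sin\Omega = R_\odot/r are
    the symbols of that standard formulation and may differ
    in notation or normalization from equations (1)-(3):

\[
\mathrm{d}\,pB \;=\; \frac{\pi\sigma}{2}\, I_\odot\, N_e(r)\,
\bigl[(1-u)\,A(r) + u\,B(r)\bigr]\,\sin^2\!\chi\;\mathrm{d}z
\]
\[
A(r) \;=\; \cos\Omega\,\sin^2\Omega
\]
\[
B(r) \;=\; -\frac{1}{8}\left[\,1 - 3\sin^2\Omega
- \frac{\cos^2\Omega}{\sin\Omega}\,\bigl(1 + 3\sin^2\Omega\bigr)
\,\ln\!\frac{1+\sin\Omega}{\cos\Omega}\,\right],
\qquad \sin\Omega = \frac{R_\odot}{r}
\]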

7
Part II. pB Calculation Formula
  • The polarization brightness image, which can be
    compared with coronagraph observations, is generated
    by integrating the electron density along the line of
    sight.
  • Density integration process of pB calculation
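  • Written out with the same assumed symbols as above,
    the value at an image pixel whose line of sight passes
    at projected distance \rho from Sun centre is the
    line-of-sight integral of the volume-element
    expression (z is the coordinate along the line of
    sight):

\[
pB(\rho) \;=\; \frac{\pi\sigma}{2}\, I_\odot
\int_{\mathrm{LOS}} N_e(r)\,\bigl[(1-u)\,A(r) + u\,B(r)\bigr]\,
\frac{\rho^2}{r^2}\,\mathrm{d}z,
\qquad r = \sqrt{\rho^2 + z^2},\quad \sin\chi = \frac{\rho}{r}
\]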

8
Part III. Serial pB Calculation Process
  • The steps of the serial model of pB calculation on the
    CPU with the experimental data are shown below.
  • The serial process of pB calculation
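  • As an illustration of what such a pixel-by-pixel
    serial loop can look like, a minimal sketch follows.
    The grid sizes, interpolate_density, and thomson_weight
    are hypothetical stand-ins, not the original
    G95/Visual Studio code.

```cpp
// Hypothetical sketch of the serial pB loop: for every image pixel,
// step along the line of sight, sample the electron density from the
// spherical (r, theta, phi) grid, apply the Thomson-scattering weight,
// and accumulate the integral.  All names and profiles are placeholders.
#include <cmath>
#include <vector>

constexpr int NX = 321, NY = 321, NZ = 481;   // Cartesian grid of the experiment

// Placeholder density: the real code interpolates the MHD grid den(r, theta, phi).
double interpolate_density(double r, double /*theta*/, double /*phi*/) {
    return 1.0 / (r * r);
}

// Placeholder geometric factor: the real code evaluates the limb-darkened
// Thomson-scattering terms here.
double thomson_weight(double r, double sin_chi) {
    return sin_chi * sin_chi / (r * r);
}

std::vector<double> compute_pB_serial(const std::vector<double>& x,   // size NX
                                      const std::vector<double>& y,   // size NY
                                      const std::vector<double>& z) { // size NZ
    std::vector<double> pB(NX * NY, 0.0);
    for (int j = 0; j < NY; ++j) {
        for (int i = 0; i < NX; ++i) {
            double rho = std::hypot(x[i], y[j]);      // projected distance from Sun centre
            double phi = std::atan2(y[j], x[i]);
            double sum = 0.0;
            for (int k = 0; k + 1 < NZ; ++k) {        // integrate along the line of sight
                double r     = std::sqrt(rho * rho + z[k] * z[k]);
                double theta = std::acos(z[k] / r);
                double ne    = interpolate_density(r, theta, phi);
                sum += ne * thomson_weight(r, rho / r) * (z[k + 1] - z[k]);
            }
            pB[j * NX + i] = sum;                     // one pB value per image pixel
        }
    }
    return pB;
}
```

    Timing this innermost integration is what identifies
    it as the dominant cost discussed on the next slide.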

9
Part III. Serial pB Calculation Process
  • Following the serial process of pB calculation above,
    we implement it under G95 on Linux and under Visual
    Studio 2005 on Windows XP respectively.
  • By measuring the time cost of each step, we find that
    the most time-consuming part of the whole program is
    the calculation of the pB values, accounting for
    98.05% and 99.05% of the total time cost respectively.

10
Part III. Serial pB Calculation Process
  • Therefore, in order to improve performance enough to
    obtain the coronal polarization brightness in near
    real time, we should optimize the calculation of the
    pB values.
  • Since the density integration along the line of sight
    is independent for each point above the solar limb,
    pB calculation is very well suited to parallel
    computation.

11
Part IV. Parallel pB Calculation Models
  • Currently, parallelized MHD numerical calculation is
    mainly based on MPI.
  • With the development of high-performance computing,
    using GPU architectures for computation-intensive
    problems shows clear advantages.
  • In this situation, implementing parallel MHD numerical
    calculation on the GPU is an efficient parallel
    solution.
  • We implement two parallel models, based on MPI and
    CUDA respectively.

12
Part IV. Parallel pB Calculation Models
  • Experiment Environment
    – Experimental Data
      • 42×42×82 (r, θ, φ) density data (den)
      • 321×321×481 (x, y, z) Cartesian coordinate grid
      • 321×321 pB values will be generated
    – Hardware
      • Intel(R) Xeon(R) E5405 CPU @ 2.00 GHz (8 cores)
      • 1 GB memory
      • NVIDIA Quadro FX 4600 GPU with 760 MB of GDDR3
        SDRAM global memory
        (G80 architecture, 12 MPs and 128 SPs)

13
Part IV. Parallel pB Calculation Models
  • Experiment Environment
    – Compiling Environment
      • CUDA-based parallel model: Visual Studio 2005 on
        Windows XP, CUDA 1.1 SDK
      • MPI-based parallel model: G95 on Linux, MPICH2

14
Part IV. Parallel pB Calculation Models
  • MPI-based Parallelized Implementation
  • The way the experiment decomposes the computing domain
    into sub-domains in the MPI environment is shown
    below.
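  • As an illustration of one plausible row-wise
    decomposition, the sketch below (not the authors'
    code) assigns each MPI rank a contiguous block of
    image rows and gathers the blocks on rank 0;
    compute_pB_rows is a hypothetical stand-in for the
    line-of-sight integration.

```cpp
// Hypothetical MPI sketch: split the 321x321 pB image into contiguous
// blocks of rows, one block per rank, and gather the results on rank 0.
#include <mpi.h>
#include <algorithm>
#include <vector>

constexpr int NX = 321, NY = 321;

// Stand-in: would run the line-of-sight integration for rows [row0, row0 + nrows).
void compute_pB_rows(int row0, int nrows, std::vector<double>& out) {
    for (int j = 0; j < nrows; ++j)
        for (int i = 0; i < NX; ++i)
            out[j * NX + i] = static_cast<double>(row0 + j);   // placeholder value
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Distribute the NY image rows as evenly as possible over the ranks.
    int base = NY / size, rem = NY % size;
    int nrows = base + (rank < rem ? 1 : 0);
    int row0  = rank * base + std::min(rank, rem);

    std::vector<double> local(static_cast<size_t>(nrows) * NX);
    compute_pB_rows(row0, nrows, local);

    // Gather the variable-sized row blocks onto rank 0.
    std::vector<int> counts(size), displs(size);
    for (int r = 0; r < size; ++r) {
        counts[r] = (base + (r < rem ? 1 : 0)) * NX;
        displs[r] = (r * base + std::min(r, rem)) * NX;
    }
    std::vector<double> pB(rank == 0 ? NX * NY : 0);
    MPI_Gatherv(local.data(), counts[rank], MPI_DOUBLE,
                pB.data(), counts.data(), displs.data(), MPI_DOUBLE,
                0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```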

15
Part IV. Parallel pB Calculation Models
  • MPI-based Parallelized Implementation

16
Part IV. Parallel pB Calculation Models
  • MPI-based Parallelized Implementation
  • The final result shows that the MPI-based parallel
    model reaches an overall speed-up of 5.8. As the
    experiment runs on a platform with 8 CPU cores, this
    speed-up ratio is close to its theoretical value.
  • It also shows that the MPI-based parallel solution
    balances processor utilization against the
    communication between processors.

17
Part IV. Parallel pB Calculation Models
  • CUDA-based Parallelized Implementation
  • According to the serial pB calculation process and the
    CUDA architecture, the calculation part should be
    placed in a kernel function to implement the parallel
    program.
  • Since the density interpolation and the cumulative sum
    involved in each pB value are independent of the other
    pB values, we can use multiple CUDA threads for the pB
    calculation, with each thread computing one pB value.
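  • As an illustration only (not the authors' kernel), a
    minimal one-thread-per-pB-value mapping might look
    like the following, where integrate_los is a
    hypothetical device function standing in for the
    interpolation and summation:

```cuda
// Hypothetical sketch: one CUDA thread per pB value.  integrate_los()
// stands in for the density interpolation and line-of-sight summation.
#include <cuda_runtime.h>

#define NX 321
#define NY 321

__device__ float integrate_los(int i, int j, const float* den) {
    // Placeholder: the real kernel interpolates den and accumulates along the LOS.
    return 0.0f;
}

__global__ void pb_kernel(const float* den, float* pB) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
    if (idx < NX * NY) {
        int i = idx % NX;                              // pixel column
        int j = idx / NX;                              // pixel row
        pB[idx] = integrate_los(i, j, den);            // one pB value per thread
    }
}
```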

18
Part IV. Parallel pB Calculation Models
  • CUDA-based Parallelized Implementation
  • However, the number of pB values to be calculated is
    much larger than the number of threads the GPU can run
    at once, so each thread must calculate multiple pB
    values. Under the experimental conditions, the thread
    count is set to 256 per block so as to maximize the
    use of the computing resources.
  • The number of blocks depends on the ratio of the
    number of pB values to the number of threads. In
    addition, since global memory access latency is high,
    some frequently accessed data can be placed in shared
    memory to reduce the data access time.
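  • A sketch of a launch configuration along these lines,
    purely illustrative (the array names, the block count,
    and the placeholder integrand are assumptions, not the
    authors' code): 256 threads per block, a grid-stride
    loop so that each thread handles several pB values,
    and the line-of-sight sample coordinates staged into
    shared memory once per block.

```cuda
// Hypothetical sketch: 256 threads per block, a grid-stride loop so each
// thread handles several pB values, and a small read-only array staged
// into shared memory once per block.
#include <cuda_runtime.h>

#define NX 321
#define NY 321
#define NZ 481
#define THREADS_PER_BLOCK 256

__global__ void pb_kernel_multi(const float* den, const float* los_z, float* pB) {
    __shared__ float s_z[NZ];                            // 481 floats, about 1.9 KB
    for (int k = threadIdx.x; k < NZ; k += blockDim.x)   // cooperative copy to shared memory
        s_z[k] = los_z[k];
    __syncthreads();

    int stride = gridDim.x * blockDim.x;
    for (int idx = blockIdx.x * blockDim.x + threadIdx.x;
         idx < NX * NY; idx += stride) {                 // each thread may compute several pB values
        float sum = 0.0f;
        for (int k = 0; k + 1 < NZ; ++k) {
            // Placeholder integrand: the real kernel interpolates den(r, theta, phi)
            // and applies the Thomson-scattering weight here.
            sum += den[k] * (s_z[k + 1] - s_z[k]);
        }
        pB[idx] = sum;
    }
}

void launch_pb(const float* d_den, const float* d_los_z, float* d_pB) {
    // Grid sized to the hardware (e.g. a multiple of the 12 multiprocessors);
    // with 321x321 pB values each thread then loops over several pixels.
    int blocks = 96;
    pb_kernel_multi<<<blocks, THREADS_PER_BLOCK>>>(d_den, d_los_z, d_pB);
}
```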

19
Part IV. Parallel pB Calculation Models
  • CUDA-based Parallelized Implementation
  • The size of the data placed in shared memory is about
    7 KB, less than the 16 KB provided by the GPU, so this
    parallel solution is feasible.
  • Moreover, the data-length array is read-only and is
    accessed very frequently, so as a further optimization
    it is migrated from shared memory to constant memory
    to improve its access efficiency (see the sketch after
    this list).
  • The CUDA-based parallel calculation process is shown
    below.
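  • As an illustration of the constant-memory step just
    described (hypothetical names; the array is assumed
    here to hold per-sample segment lengths along the line
    of sight, and constant memory on this class of GPU is
    limited to 64 KB):

```cuda
// Hypothetical sketch: the read-only data-length array is declared in
// __constant__ memory and uploaded once from the host, so threads read
// it through the cached constant-memory path instead of shared memory.
#include <cuda_runtime.h>

#define NZ 481

__constant__ float c_len[NZ];             // assumed: segment lengths along the LOS

void upload_lengths(const float* h_len) {
    // One-time host-to-device copy into the constant-memory symbol.
    cudaMemcpyToSymbol(c_len, h_len, NZ * sizeof(float));
}

__global__ void pb_kernel_const(const float* den, float* pB) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= 321 * 321) return;
    float sum = 0.0f;
    for (int k = 0; k < NZ; ++k)
        sum += den[k] * c_len[k];         // placeholder accumulation using c_len
    pB[idx] = sum;
}
```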

20
Part IV. Parallel pB Calculation Models
21
Part IV. Parallel pB Calculation Models
  • Experiment Results
  • The pB calculation time of the two models is shown in
    Table 1.
  • Table 1. pB calculation time of the serial and
    parallel models and their speed-up ratios

                                                  MPI (G95)   CUDA (Visual Studio 2005)
  pB calculation time of the serial model (s)        32.403                      48.938
  pB calculation time of the parallel model (s)       5.053                       1.536
  Speed-up ratio                                       6.41                       31.86
22
Part IV. Parallel pB Calculation Models
  • Experiment Results
  • The total performance of the two models is shown in
    Table 2.
  • Table 2. Total running time of the serial and parallel
    models, and the ratio of the MPI running time to the
    CUDA running time

                    MPI (G95) (s)   CUDA (Visual Studio 2005) (s)   Ratio (MPI / CUDA)
  Serial models             33.05                          49.406                 0.67
  Parallel models            5.70                           2.004                 2.84
23
Part IV. Parallel pB Calculation Models
  • Experiment Results
  • Finally, we render the coronal polarization brightness
    image shown below using the calculated data.

24
Conclusion
  • Under the same environment, the pB calculation of the
    MPI-based parallel model takes 5.053 seconds, while
    the serial model takes 32.403 seconds; the speed-up
    is 6.41.
  • The pB calculation of the CUDA-based parallel model
    takes 1.536 seconds, while the serial model takes
    48.938 seconds; the speed-up is 31.86.
  • The total running time of the MPI-based model is 2.84
    times that of the CUDA-based model.

25
Conclusion
  • We find that the CUDA-based parallel model is more
    suitable for pB calculation, and it provides a better
    solution for post-processing and visualizing the MHD
    numerical calculation results.

26
  • Thank you!!!