Deck 12: CUDA Model, Parallelism, Memory System, and Communication Models
1
The BlockPerGrid and ThreadPerBlock parameters are related to the ________ model supported by CUDA.
A)host
B)kernel
C)thread abstraction
D)none of the above
thread abstraction
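As an illustration of how these two launch parameters shape the thread hierarchy, the following is a minimal host-side sketch in Python (not CUDA itself). It enumerates the global index that each thread of a 1-D launch would compute with the usual expression blockIdx.x * blockDim.x + threadIdx.x. The function name `global_thread_ids` and the launch sizes are hypothetical, chosen only for illustration.

```python
def global_thread_ids(blocks_per_grid, threads_per_block):
    """Return the global index of every thread in a simulated 1-D launch.

    Mirrors the common CUDA expression
    blockIdx.x * blockDim.x + threadIdx.x, evaluated here on the host.
    """
    ids = []
    for block_idx in range(blocks_per_grid):          # plays the role of blockIdx.x
        for thread_idx in range(threads_per_block):   # plays the role of threadIdx.x
            ids.append(block_idx * threads_per_block + thread_idx)
    return ids


# Hypothetical launch: 3 blocks of 4 threads each cover indices 0 through 11.
print(global_thread_ids(blocks_per_grid=3, threads_per_block=4))
```

Each thread gets a unique index, so the grid as a whole covers blocks_per_grid * threads_per_block elements.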
2
________ is callable from the device only.
A)__host__
B)__global__
C)__device__
D)none of the above
__device__
3
________ is callable from the host.
A)__host__
B)__global__
C)__device__
D)none of the above
__global__
4
________ is callable from the host.
A)__host__
B)__global__
C)__device__
D)none of the above
5
Cache memory works on the principle of
A)locality of data
B)locality of memory
C)locality of reference
D)locality of reference & memory
6
SIMD represents an organization that ______________.
A)refers to a computer system capable of processing several programs at the same time.
B)represents organization of single computer containing a control unit, processor unit and a memory unit.
C)includes many processing units under the supervision of a common control unit.
D)none of the above.
7
Select different aspects of parallelism
A)server applications utilize high aggregate network bandwidth
B)scientific applications typically utilize high processing and memory system performance
C)all of the above
D)data intensive applications utilize high aggregate throughput
8
Select the correct answer: DRAM access times have only improved at the rate of roughly ________% per year over this interval.
A)20
B)40
C)50
D)10
9
Analyze, if the second instruction has data dependencies with the first, but the third instruction does not, the first
A)out-of-order
B)both of the above
C)none of the above
D)in-order
10
Select the parameters that capture memory system performance.
A)bandwidth
B)both of the above
C)none of the above
D)latency
11
Consider the example of a fire hose. The water comes out of the hose five seconds after the hydrant is turned on. Once the water starts flowing, the hydrant delivers water at the rate of 15 gallons/second. Analyze the bandwidth and latency.
A)bandwidth: 5*15 gallons/second and latency: 15 seconds
B)bandwidth: 15 gallons/second and latency: 5 seconds
C)bandwidth: 3 gallons/second and latency: 5 seconds
D)bandwidth: 5 gallons/second and latency: 15 seconds
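The two quantities in this card can be made concrete with a short sketch. Latency is the delay before the first water arrives, and bandwidth is the steady delivery rate once it flows, so the time to receive a given volume is latency plus volume divided by bandwidth. The function name `delivery_time` and the 150-gallon volume are hypothetical, used only for illustration.

```python
def delivery_time(volume_gallons, latency_s, bandwidth_gal_per_s):
    """Total time to receive a volume: startup delay plus streaming time."""
    return latency_s + volume_gallons / bandwidth_gal_per_s


LATENCY_S = 5.0              # water appears 5 seconds after the hydrant opens
BANDWIDTH_GAL_PER_S = 15.0   # steady rate once the water is flowing

# Hypothetical request for 150 gallons: 5 s of latency plus 150 / 15 = 10 s of flow.
print(delivery_time(150, LATENCY_S, BANDWIDTH_GAL_PER_S))
```

For small volumes the latency term dominates, while for large volumes the bandwidth term does, which is why memory systems are characterized by both.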
12
Select alternate approaches for hiding memory latency.
A)multithreading
B)spatial locality
C)all of the above
D)prefetching
13
Select which clause in OpenMP is similar to the private, except values of variables are initialized to corresponding values before the
A)firstprivate
B)shared
C)all of the above
D)private
14
Which of the following projects of Blue Gene is not in development?
A)Blue Gene/M
B)Blue Gene/P
C)Blue Gene/Q
D)Blue Gene/L
15
A decomposition can be illustrated in the form of a directed graph with nodes corresponding to tasks and edges indicating that the result of one task is required for processing the next. Such a graph is called a
A)task dependency graph
B)task interaction graph
C)process interaction graph
D)process dependency graph
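The directed graph described in this card can be sketched as a small dictionary in which each task lists the tasks whose results it needs. The task names and the helper functions `topological_order` and `critical_path_length` are hypothetical, and each task is assumed to take one unit of time.

```python
from functools import lru_cache

# Hypothetical decomposition: each key depends on the tasks listed as its value.
DEPENDS_ON = {
    "load": [],
    "filter_a": ["load"],
    "filter_b": ["load"],
    "join": ["filter_a", "filter_b"],
}


def topological_order(depends_on):
    """Kahn-style ordering: a task is emitted once all its prerequisites are."""
    remaining = {task: set(deps) for task, deps in depends_on.items()}
    order = []
    while remaining:
        ready = sorted(t for t, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("cycle detected: not a valid dependency graph")
        for task in ready:
            order.append(task)
            del remaining[task]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order


def critical_path_length(depends_on):
    """Longest chain of dependent tasks, counting each task as one unit."""
    @lru_cache(maxsize=None)
    def depth(task):
        return 1 + max((depth(d) for d in depends_on[task]), default=0)

    return max(depth(task) for task in depends_on)


print(topological_order(DEPENDS_ON))
print(critical_path_length(DEPENDS_ON))
```

Tasks that become ready in the same round (here `filter_a` and `filter_b`) have no edge between them and could run concurrently.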
16
In which case does the owner-computes rule imply that the output is computed by the process to which the output data is assigned?
A)output data decomposition
B)both of the above
C)none of the above
D)input data decomposition
17
Select relevant task characteristics from the options given below:
A)task sizes
B)size of data associated with tasks
C)all of the above
D)task generation
18
A classic example from game playing: each 15-puzzle board is an example of
A)dynamic task generation
B)none of the above
C)all of the above
D)static task generation
19
Which model is equally suitable for shared-address-space or message-passing paradigms, since the interaction is naturally two-way?
A)master slave model
B)data parallel model
C)producer consumer or pipeline model
D)work pool model
20
In which type of model are tasks dynamically assigned to processes to balance the load?
A)master slave model
B)data parallel model
C)producer consumer or pipeline model
D)work pool model
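Dynamic assignment of tasks from a shared pool can be sketched with Python threads and a queue. This is a minimal illustration under stated assumptions, not a tuned implementation: the function name `run_work_pool` is hypothetical, and squaring a number stands in for real work.

```python
import queue
import threading


def run_work_pool(tasks, num_workers):
    """Let idle workers pull tasks from a shared pool until it is empty.

    Returns ({task: result}, per-worker task counts).
    """
    pool = queue.Queue()
    for task in tasks:
        pool.put(task)

    results = {}
    counts = [0] * num_workers
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                task = pool.get_nowait()   # take whatever task is next
            except queue.Empty:
                return                     # pool drained: this worker stops
            value = task * task            # stand-in for real work
            with lock:
                results[task] = value
                counts[worker_id] += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, counts


results, counts = run_work_pool(range(20), num_workers=4)
print(sum(counts), len(results))
```

No task is bound to a worker in advance, so a worker that finishes early simply takes more tasks. The per-worker counts can differ from run to run, while the combined results do not.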
21
Select the stage of the GPU pipeline that receives commands from the CPU and also pulls geometry information from system memory.
A)vertex processing
B)memory interface
C)host interface
D)pixel processing
22
In all-to-one reduction, data items must be combined piece-wise and the result made available at a ________ processor.
A)last
B)target
C)n-1
D)first
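The combining pattern of an all-to-one reduction can be simulated on a single machine. The sketch below assumes a power-of-two number of processors and collects the result at processor 0. Both choices are simplifying assumptions, and the function name `all_to_one_reduce` is hypothetical.

```python
import operator


def all_to_one_reduce(values, op):
    """Simulate a tree-style reduction; returns (result at processor 0, steps)."""
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("this sketch assumes a power-of-two processor count")
    buf = list(values)      # buf[i] is the partial result held by processor i
    steps = 0
    dist = 1
    while dist < p:
        # Processor i combines its value with the one held by processor i + dist.
        for i in range(0, p, 2 * dist):
            buf[i] = op(buf[i], buf[i + dist])
        dist *= 2
        steps += 1
    return buf[0], steps


print(all_to_one_reduce([1, 2, 3, 4, 5, 6, 7, 8], operator.add))
```

With p processors the loop runs log2(p) times, since the number of processors still holding an uncombined partial result halves at every step.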
23
Analyze the cost of scatter and gather.
A)t = ts log p + tw m (p - 1)
B)t = ts log p - tw m (p - 1)
C)t = tw log p - ts m (p - 1)
D)t = tw log p + ts m (p - 1)
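One way to reason about the options is to add up the cost step by step. The sketch below assumes a hypercube-style scatter in which the message being forwarded halves at each of the log p steps. The card does not state the topology, so this is an assumption. It also assumes ts is the per-message startup time, tw the per-word transfer time, m the message size per processor, and p a power of two. The function names are hypothetical.

```python
def scatter_cost_stepwise(ts, tw, m, p):
    """Sum the cost of each step of a halving scatter over p processors."""
    if p < 1 or p & (p - 1):
        raise ValueError("this sketch assumes a power-of-two processor count")
    total = 0
    words = m * p           # the source starts with one m-word message per processor
    while words > m:
        words //= 2         # each step forwards half of the data currently held
        total += ts + tw * words
    return total


def scatter_cost_closed_form(ts, tw, m, p):
    """Closed form of the same sum: log p startups plus m * (p - 1) words."""
    log_p = p.bit_length() - 1      # exact log2 for a power of two
    return ts * log_p + tw * m * (p - 1)


# Hypothetical parameters: ts = 10, tw = 2, m = 4 words, p = 8 processors.
print(scatter_cost_stepwise(10, 2, 4, 8), scatter_cost_closed_form(10, 2, 4, 8))
```

Under these assumptions the startup term is paid once per step, while the per-word term covers 16 + 8 + 4 = 28 = m (p - 1) words in the example, so both functions agree.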
24
All-to-all personalized communication is also known as ________.
A)total exchange
B)both of the above
C)none of the above
D)partial exchange
25
All-to-all personalized communication is performed independently in each row with clustered messages of size ________ on a mesh.
A)p
B)m√p
C)p√m
D)m