Description:

Issues:

Work partitioning:
  • Divide the work so every core has something to do, while also keeping the load balanced across cores
  • A for loop can be divided among threads if its iterations have no dependencies; a loop-carried dependence such as a[i][j] = a[i][j-1] prevents this
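  As a sketch of this distinction (the function names `scale` and `prefix` are illustrative, not from the notes):

        #include <stdio.h>

        /* Independent iterations: each b[i] depends only on a[i], so the
           loop can be divided among cores (e.g. with #pragma omp parallel for). */
        static void scale(const int *a, int *b, int n) {
            for (int i = 0; i < n; i++)
                b[i] = 2 * a[i];
        }

        /* Loop-carried dependence: a[j] needs a[j-1], written by the
           previous iteration, so the iterations cannot run in parallel as-is. */
        static void prefix(int *a, int n) {
            for (int j = 1; j < n; j++)
                a[j] = a[j] + a[j - 1];
        }

        int main(void) {
            int a[8] = {1, 1, 1, 1, 1, 1, 1, 1}, b[8];
            scale(a, b, 8);
            prefix(a, 8);
            printf("b[7]=%d a[7]=%d\n", b[7], a[7]);  /* b[7]=2 a[7]=8 */
            return 0;
        }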
Coordination and synchronization:
  • Cache Coherency
  • Synchronizing for parallel programs
    • Problem: Race Condition
    • Atomic read & write memory operations:
      • Between the read and the write, no other thread can write to that address
      • Hardware provides many atomic primitives, e.g. atomic instructions
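      One way to use such a primitive, as a sketch: C11's stdatomic.h exposes atomic_fetch_add, an indivisible read-modify-write (the function `atomic_count` is illustrative, not from the notes):

        #include <stdatomic.h>
        #include <stdio.h>

        /* Increment a shared counter n times; atomic_fetch_add guarantees
           no other write to counter lands between its read and its write,
           so no update is ever lost, however the loop is parallelized. */
        static int atomic_count(int n) {
            atomic_int counter = 0;
            #pragma omp parallel for
            for (int i = 0; i < n; i++)
                atomic_fetch_add(&counter, 1);
            return atomic_load(&counter);
        }

        int main(void) {
            printf("%d\n", atomic_count(100000));  /* always 100000 */
            return 0;
        }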
    • Critical Section:
      • Typically the section of a function or loop that uses the shared variable is the critical section
      • Only one thread can execute the critical section at a time; the other threads must wait
      • ex:

        double area, pi, x;
        int i, n;

        area = 0.0;
        #pragma omp parallel for private(x)
        for (i = 0; i < n; i++) {
            x = (i + 0.5) / n;
            #pragma omp critical
            area += 4.0 / (1.0 + x * x);
        }
        pi = area / n;
        • the time spent in the critical section can be reduced by moving as much of the calculation as possible outside it:
        double area, pi, x, tmp;
        int i, n;

        area = 0.0;
        #pragma omp parallel for private(x, tmp)
        for (i = 0; i < n; i++) {
            x = (i + 0.5) / n;
            tmp = 4.0 / (1.0 + x * x);  /* computed outside the critical section */
            #pragma omp critical
            area += tmp;
        }
        pi = area / n;
      • Mutual exclusion lock (mutex): a lock acquired before entering the critical section and released on leaving it; only the thread holding the lock may enter
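      A minimal sketch of a mutex protecting a critical section, using POSIX pthreads (the two-thread counter is a hypothetical example, not from the notes):

        #include <pthread.h>
        #include <stdio.h>

        static long counter = 0;
        static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

        /* Each thread increments the shared counter; the mutex makes the
           read-modify-write of counter a critical section. */
        static void *worker(void *arg) {
            for (int i = 0; i < 100000; i++) {
                pthread_mutex_lock(&lock);    /* only one thread past this point */
                counter++;                    /* critical section */
                pthread_mutex_unlock(&lock);  /* let the next waiting thread in */
            }
            return NULL;
        }

        int main(void) {
            pthread_t t1, t2;
            pthread_create(&t1, NULL, worker, NULL);
            pthread_create(&t2, NULL, worker, NULL);
            pthread_join(t1, NULL);
            pthread_join(t2, NULL);
            printf("%ld\n", counter);  /* 200000: no lost updates */
            return 0;
        }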
    • How to write parallel programs
      • Threads and processes
      • Critical sections, race conditions, and mutexes
Communication overhead:

Parallel Programming