Homework 09: Write a Review Paper on Shared Memory Process
👤 Author: by bhupeshaawasthi952gmailcom 2020-05-25 10:43:37
What is the Shared Memory Process?

In its simplest form, shared memory is a low-level programming mechanism on a single server that enables clients and servers to exchange data and instructions through main memory. Performance is much faster than using system services such as operating system data buffers.

For example, a client needs to exchange data with the server for modification and return. Without shared memory, both client and server use operating system buffers to accomplish the modification and exchange.

The client writes the data to an output file in the buffer, and the server copies the file into its workspace. When the modification is complete, the process reverses. Each time this occurs, the system performs two reads and two writes between client and server.

With shared memory, the client writes its data directly into main memory and sets a semaphore value to flag the server's attention. The server performs the modifications directly in main memory and alerts the client by changing the semaphore value. There is only one read and one write per communication, and that read/write is considerably faster than using system services.
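As a rough illustration of that handshake, the sketch below places the data buffer and two POSIX semaphores in one shared structure; the structure name, field names, and sizes are illustrative assumptions, not part of any particular implementation.

/* Sketch of the client/server handshake over a shared segment (illustrative only).
   The server is assumed to have created the segment and called
   sem_init(&x->client_ready, 1, 0) and sem_init(&x->server_done, 1, 0). */
#include <semaphore.h>
#include <string.h>

struct exchange {
    sem_t client_ready;          /* client posts after writing its request */
    sem_t server_done;           /* server posts after modifying the data */
    char  data[1024];            /* the data both sides read and write in place */
};

/* Client side: one write, one read, no intermediate OS buffers. */
void client_request(struct exchange *x, const char *msg) {
    strncpy(x->data, msg, sizeof(x->data) - 1);   /* write directly into shared memory */
    x->data[sizeof(x->data) - 1] = '\0';
    sem_post(&x->client_ready);                   /* flag server attention */
    sem_wait(&x->server_done);                    /* wait for the server's modification */
}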

Shared Memory and Single Microprocessor General Flow

  • Server starts.

  • Server uses a system call to request a shared memory key and memorizes the returned ID.

  • Server issues another system call to attach shared memory to the server's address space.

  • Server initializes the shared memory.

  • Client starts.

  • Client requests shared memory.

  • Server issues the unique memory ID to the client.

  • Client attaches shared memory ID to the address space and uses the memory.

  • When complete, client detaches all shared memory segments and exits.

  • Using two more system calls, the server detaches and removes the shared memory.
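A rough sketch of the server side of this flow, using the System V calls described later in this paper, might look as follows; the key 0x1234 and the 4 KB size are arbitrary example values.

/* Sketch of the server's shared memory lifecycle (illustrative only). */
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    /* request a segment for an example key and memorize the returned ID */
    int shmid = shmget((key_t) 0x1234, 4096, IPC_CREAT | 0666);
    if (shmid == -1) { perror("shmget"); return 1; }

    /* attach the segment to this process's address space */
    char *mem = shmat(shmid, NULL, 0);
    if (mem == (void *) -1) { perror("shmat"); return 1; }

    /* initialize the shared memory; clients now attach the same segment and use it */
    strcpy(mem, "initialized by server");

    /* ... serve clients ... */

    shmdt(mem);                        /* detach */
    shmctl(shmid, IPC_RMID, NULL);     /* remove the segment */
    return 0;
}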


Multi-Processor Shared Memory

This simplified scheme works for single microprocessors, but memory sharing among multiple microprocessors is more complex, especially when each microprocessor has its own memory cache. Popular approaches include uniform memory access (UMA) and non-uniform memory access (NUMA). Distributed memory sharing is also possible, although it uses different sharing technology.

UMA: Shared memory in parallel computing environments

In parallel computing, multiprocessors use the same physical memory and access it in parallel, although the processors may have private memory caches as well. Shared memory accelerates parallel execution of large applications where processing time is critical.

NUMA: Shared memory in symmetric multiprocessing systems (SMP)

NUMA configures SMP systems to use shared memory. SMP is a clustered architecture that tightly couples multiple processors in a share-everything, single-server environment running a single OS. Because every processor uses the same bus, memory-intensive operations slow down performance and increase latency.

NUMA replaces the single system bus by grouping CPU and memory resources into configurations it calls NUMA nodes. Multiple high-performing nodes operate efficiently within clusters, allowing each CPU to treat its assigned node as a local shared-memory resource. This relieves the load on the bus by spreading it across flexible, high-performance memory nodes.

Shared memory in distributed systems

Distributed shared memory uses a different technology but has the same result: separate computers share memory for better performance and scalability. Distributed shared memory enables separate computer systems to access each other’s memory by abstracting it from the server level into a logically shared address space.

The architecture can either partition main memory and distribute the parts among the nodes, or distribute all of the memory between the nodes. Distributed memory sharing uses either hardware (network interfaces and cache coherence circuits) or software. Unlike single-processor or multiprocessor shared memory, distributed memory sharing scales efficiently and supports intensive processing tasks such as large, complex databases.

Caution: Shared Memory Challenges

Shared memory programming is straightforward on a single CPU or on clustered CPUs: all processors share the same view of the data, communication between them is very fast, and the programming is a relatively simple affair.

However, most multiprocessor systems assign individual cache memory to each processor in addition to main memory. Cache memory processing is considerably faster than using RAM, but it can cause conflicts and data degradation if the same system is also using shared memory. There are three main issues for shared memory in cached-memory architectures: degraded access time, data incoherence, and false sharing.

Degraded access time

Several processors accessing the same memory location at the same time cause contention and performance slowdowns. For this reason, non-distributed shared memory systems do not scale efficiently beyond roughly ten processors.

Data incoherence

Multiple processors with memory sharing typically have individual memory caches to speed up performance. In such a system, two or more processors may hold cached copies of the same memory location. Each processor can modify its copy without being aware of the other cache's modifications, so data that should be identical (that is, coherent) becomes incoherent, which can corrupt the data when it is written back to main memory.

Cache coherence

Cache coherence protocols manage these conflicts by synchronizing data values across multiple caches. Whenever a cache propagates modified data back to the shared memory location, the data remains coherent. Cache coherence protects high-performance cache memory while supporting memory sharing.

False sharing

This memory usage pattern degrades performance and occurs in multiprocessor systems that combine shared memory with individual processor caches. Caching works by reading the requested memory location plus nearby locations into a cache line (typically 64 bytes). The problem arises when several processors cache a shared line that contains modifiable data, or variables. Whether or not a processor actually uses another processor's variables does not matter: whenever one processor writes to its part of the line, the other caches are forced to reload the entire line, even though they gain no useful new data, and they simply bear the overhead. Every write to such a shared memory location goes back over the main bus, degrading performance and wasting bandwidth.

Programming is the solution

“Cache padding” inserts meaningless bytes between a frequently modified variable and its neighbors, so that a single 64-byte cache line carries only the data that is actually being written. Cache coherency still performs the synchronization, but the other caches are no longer forced to reload blocks whose useful contents have not changed.
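A minimal sketch of this idea in C11 is shown below; the 64-byte figure is an assumption about the target CPU, and the names are illustrative.

/* Keep each thread's hot variable on its own cache line to prevent false sharing. */
#include <stdalign.h>

#define CACHE_LINE 64                       /* assumed cache-line size */

struct padded_counter {
    alignas(CACHE_LINE) long value;         /* starts on its own cache line */
    char pad[CACHE_LINE - sizeof(long)];    /* padding so neighbors never share the line */
};

/* One slot per thread: writes to counters[0] no longer invalidate counters[1]. */
struct padded_counter counters[4];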

Shared Memory Advantages

Multiple applications share memory for more efficient processing.

  • Passes data between programs efficiently, improving communication.

  • Works in single microprocessor systems, multiprocessor parallel or symmetric systems, and distributed servers.

  • Avoids redundant data copies by managing shared data in main memory and in caches.

  • Minimizes input/output (I/O) processes by enabling programs to access a single data copy already in memory.

  • For programmers, the main advantage of shared memory is that there is no need to write explicit code for processor interaction and communication.


Cache coherence protocols protect shared memory against data incoherence and performance slow-downs.

Let us look at a few details of the system calls related to shared memory.

#include <sys/ipc.h>

#include <sys/shm.h>

int shmget(key_t key, size_t size, int shmflg)

The above system call creates or allocates a System V shared memory segment. The arguments that need to be passed are as follows −

The first argument, key, identifies the shared memory segment. The key can be an arbitrary value, or it can be derived from the library function ftok(). The key can also be IPC_PRIVATE, which is used when the processes run as server and client in a parent and child relationship, i.e., inter-related process communication. If the client wants to use the shared memory created with this key, it must be a child process of the server, and the child process needs to be created after the parent has obtained the shared memory.

The second argument, size, is the size of the shared memory segment, rounded up to a multiple of PAGE_SIZE.

The third argument, shmflg, specifies the required shared memory flags, such as IPC_CREAT (create a new segment) or IPC_EXCL (used with IPC_CREAT; the call fails if the segment already exists). The access permissions need to be passed as well.

Note − Refer earlier sections for details on permissions.

On success, this call returns a valid shared memory identifier (used in further shared memory calls); on failure it returns -1. To find the cause of failure, check the errno variable or use the perror() function.
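For example, a call that derives the key with ftok() and creates a 1 KB segment could look like the following sketch; the path passed to ftok() is an arbitrary assumption and must already exist.

#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>

int main(void) {
    key_t key = ftok("/tmp/shmfile", 'A');    /* example path and project id */
    if (key == -1) { perror("ftok"); return 1; }

    /* create a new 1 KB segment with read/write permissions; fail if it already exists */
    int shmid = shmget(key, 1024, IPC_CREAT | IPC_EXCL | 0660);
    if (shmid == -1) { perror("shmget"); return 1; }

    printf("shared memory id: %d\n", shmid);
    return 0;
}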

#include <sys/types.h>

#include <sys/shm.h>

void * shmat(int shmid, const void *shmaddr, int shmflg)

The above system call performs a shared memory operation on a System V shared memory segment, i.e., it attaches the segment to the address space of the calling process. The arguments that need to be passed are as follows −

The first argument, shmid, is the identifier of the shared memory segment. This id is the shared memory identifier, which is the return value of shmget() system call.

The second argument, shmaddr, specifies the attaching address. If shmaddr is NULL, the system chooses a suitable address at which to attach the segment. If shmaddr is not NULL and SHM_RND is specified in shmflg, the attach occurs at the address equal to shmaddr rounded down to the nearest multiple of SHMLBA (lower boundary address). Otherwise, shmaddr must be a page-aligned address at which the shared memory attachment starts.

The third argument, shmflg, specifies the required shared memory flags, such as SHM_RND (round the address down to a SHMLBA boundary), SHM_EXEC (allow the contents of the segment to be executed), SHM_RDONLY (attach the segment for read-only access; by default it is read-write), or SHM_REMAP (replace any existing mapping in the range starting at shmaddr and continuing to the end of the segment).

On success, this call returns the address of the attached shared memory segment; on failure it returns (void *) -1. To find the cause of failure, check the errno variable or use the perror() function.
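A typical attach lets the system pick the address and checks for the (void *) -1 error value; the helper below is a sketch and assumes shmid came from a successful shmget() call.

#include <sys/types.h>
#include <sys/shm.h>
#include <stdio.h>

/* Attach the segment identified by shmid; return NULL on failure. */
void *attach_segment(int shmid) {
    void *addr = shmat(shmid, NULL, 0);   /* NULL: let the kernel choose the address */
    if (addr == (void *) -1) {
        perror("shmat");                  /* errno describes the failure */
        return NULL;
    }
    return addr;
}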

#include <sys/types.h>

#include <sys/shm.h>

int shmdt(const void *shmaddr)

The above system call detaches a System V shared memory segment from the address space of the calling process. The argument that needs to be passed is −

The argument, shmaddr, is the address of the shared memory segment to be detached; it must be the address returned by the shmat() system call.

On success, this call returns 0; on failure it returns -1. To find the cause of failure, check the errno variable or use the perror() function.
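Detaching simply passes back the address obtained from shmat(), as in this small sketch:

#include <sys/types.h>
#include <sys/shm.h>
#include <stdio.h>

/* addr must be the value previously returned by shmat() */
void detach_segment(const void *addr) {
    if (shmdt(addr) == -1)
        perror("shmdt");
}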

#include <sys/ipc.h>

#include <sys/shm.h>

int shmctl(int shmid, int cmd, struct shmid_ds *buf)

The above system call performs a control operation on a System V shared memory segment. The following arguments need to be passed −

The first argument, shmid, is the identifier of the shared memory segment. This id is the shared memory identifier, which is the return value of shmget() system call.

The second argument, cmd, is the command to perform the required control operation on the shared memory segment.

Valid values for cmd are −

  • IPC_STAT − Copies the current values of each member of struct shmid_ds into the structure pointed to by buf. This command requires read permission on the shared memory segment.

  • IPC_SET − Sets the owner's user ID and group ID, permissions, etc., to the values pointed to by the structure buf.

  • IPC_RMID − Marks the segment to be destroyed. The segment is destroyed only after the last process has detached it.

  • IPC_INFO − Returns information about the shared memory limits and parameters in the structure pointed to by buf.

  • SHM_INFO − Returns a shm_info structure containing information about the system resources consumed by shared memory.


The third argument, buf, is a pointer to the shared memory structure named struct shmid_ds. The values of this structure would be used for either set or get as per cmd.

The return value of this call depends on the command passed. On success, IPC_INFO, SHM_INFO, and SHM_STAT return the index or identifier of the shared memory segment, while the other operations return 0; on failure the call returns -1. To find the cause of failure, check the errno variable or use the perror() function.
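A short sketch combining IPC_STAT and IPC_RMID (shmid is again assumed to come from a successful shmget() call):

#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>

/* Query the segment's size, then mark it for removal. */
void inspect_and_remove(int shmid) {
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) == -1)     /* copy kernel data into ds */
        perror("shmctl IPC_STAT");
    else
        printf("segment size: %zu bytes\n", (size_t) ds.shm_segsz);

    if (shmctl(shmid, IPC_RMID, NULL) == -1)    /* destroyed after the last detach */
        perror("shmctl IPC_RMID");
}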

Let us consider the following sample program.

  • Create two processes: one for writing into the shared memory (shm_write.c) and another for reading from the shared memory (shm_read.c).

  • The program writes into the shared memory in the write process (shm_write.c) and reads from the shared memory in the read process (shm_read.c).

  • The write process creates a shared memory segment of size 1K (with the required flags) and attaches it.

  • The write process fills the shared memory five times with the alphabets ‘A’ through ‘E’, 1023 bytes each; the last byte signifies the end of the buffer.

  • The read process reads from the shared memory and writes it to the standard output.

  • The reading and writing actions are performed simultaneously.

  • After it finishes writing, the write process sets the complete variable in struct shmseg to indicate that writing into the shared memory is done.

  • The read process keeps reading from the shared memory and displaying it on the output until it receives the indication that the write process has completed (via the complete variable in struct shmseg).

  • Reading and writing are performed only a few times, for simplicity and to avoid infinite loops and complicating the program.
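Since the full listings of shm_write.c and shm_read.c are not reproduced here, the following condensed sketch shows what the writer side might look like; the key value, struct layout, and pacing are assumptions based on the bullets above.

/* Condensed sketch of the writer (shm_write.c) described above (illustrative only). */
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BUF_SIZE 1023                 /* last byte of the buffer marks its end */

struct shmseg {
    int  complete;                    /* set to 1 when writing is finished */
    char buf[BUF_SIZE + 1];
};

int main(void) {
    int shmid = shmget((key_t) 0x1234, sizeof(struct shmseg), IPC_CREAT | 0666);
    if (shmid == -1) { perror("shmget"); return 1; }

    struct shmseg *shmp = shmat(shmid, NULL, 0);
    if (shmp == (void *) -1) { perror("shmat"); return 1; }

    shmp->complete = 0;
    for (char c = 'A'; c <= 'E'; c++) {          /* five buffers, 'A' through 'E' */
        memset(shmp->buf, c, BUF_SIZE);
        shmp->buf[BUF_SIZE] = '\0';
        sleep(1);                                /* crude pacing so the reader can keep up */
    }
    shmp->complete = 1;                          /* tell the reader we are done */

    if (shmdt(shmp) == -1) perror("shmdt");
    return 0;
}

The matching reader would attach the same segment, print shmp->buf in a loop, and stop once shmp->complete is set.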
