HP-UX Memory Management: White Paper > Chapter 1 MEMORY MANAGEMENT

VIRTUAL MEMORY AND exec()

When the system performs an exec(), the virtual memory system concerns itself with cleaning up old pregions/regions and setting up new ones.

Cleaning up from a vfork()

Cleanup in the vfork() case is simple.

  • The child process is executing but borrowing its resources from the parent process.

  • The routine creates its own uarea and returns the parent's resources.

  • Then the routine adds text, data, and so on.

    • The routine gets a new vas and attaches it to the child process (p_vas).

    • The uarea and stack of the parent process are copied and the pregions and regions are created for the child uarea, just as for a FORK_PROCESS fork type.

    • The uarea is copied into the child's uarea region, the thread is pointed at the now-complete uarea, and the thread switches from using the parent's kernel stack to the new child kernel stack.

Disposing of the old pregions: dispreg()

If exec() is called after a FORK_PROCESS fork, several regions must be disposed of first. Typically, all pregions are disposed of except for the PT_UAREA pregion, which is still needed. If the process is calling exec() on the file it is already running, we save a little processing and keep the PT_TEXT and PT_NULLDREF pregions, too.
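The selection rule in the paragraph above can be pictured with a small model. This is only a sketch: the pregion type names come from the text, while the function pregions_to_dispose and its same_file parameter are hypothetical names introduced for illustration and are not kernel interfaces.

```python
# Pregion type names are taken from the text; everything else is hypothetical.
PT_UAREA = "PT_UAREA"
PT_TEXT = "PT_TEXT"
PT_NULLDREF = "PT_NULLDREF"
PT_DATA = "PT_DATA"
PT_STACK = "PT_STACK"


def pregions_to_dispose(pregion_types, same_file):
    """Return the pregion types that exec() would hand to dispreg().

    pregion_types: types currently attached to the process vas.
    same_file: True when the process execs the file it is already running.
    """
    keep = {PT_UAREA}                  # always still needed
    if same_file:
        keep |= {PT_TEXT, PT_NULLDREF}  # reused to save a little processing
    return [t for t in pregion_types if t not in keep]


attached = [PT_UAREA, PT_TEXT, PT_NULLDREF, PT_DATA, PT_STACK]
print(pregions_to_dispose(attached, same_file=False))
print(pregions_to_dispose(attached, same_file=True))
```

In the model, the PT_UAREA pregion never appears in the result, and the text and null-dereference pregions drop out of the result only in the same-file case.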

  • deactivate_preg() deactivates the pregion by removing it from the active pregion list. If the agehand is pointing to the pregion being deactivated and the stealhand is pointing to the next pregion in the active pregion list, the agehand is moved back one pregion to prevent the agehand from exceeding the stealhand in sequence. Otherwise, if the agehand or the stealhand is pointing to the pregion being deactivated, both hands are moved forward one pregion.

  • If the region is type RT_PRIVATE or the pregion being discarded is the last attached, its resources must be freed up.

    • wait_for_io() awaits completion of any pending I/O to the region (that is, r_poip = 0), so that no I/O request returns to modify a page now assigned a different purpose.

    • The region's B-tree is traversed to delete all the virtual address translations. That is, for each valid vfd, the TLBs are purged, the cache is flushed, and the pde entry is invalidated (space set to -1; address, pfn, valid, and ref set to 0; and the bit cleared from pde_os).

  • If the pde is not the HTBL entry, the pde is moved from the hash list to the free list. If it is the HTBL pde and it is unused, an effort is made to fill it with a translation from further down its linked list, and the pde that was copied is then freed.

  • The physical-to-virtual translation is removed from pfn_to_virt_table. If it was the last virtual translation for this physical page, the HDLPF_TRANS flag is cleared in the pfdat entry.

  • The pregion pointer is removed from the rpregs list and the memory used by the pregion is freed (that is, returned to its kernel memory bucket).

  • The region's r_incore and r_refcnt elements are decremented. If r_refcnt equals zero, the region is freed also.

    • Again, r_poip must decrement to zero before a region can be freed, to prevent any unexpected I/O to its pages.

    • The B-tree is walked again, and for each valid page found, r_nvalid and pf_use are decremented in the pfdat entry. If the physical page is not aliased, its pf_use will now be 0; it can be freed for other uses.

    • Its P_QUEUE flag is set and the page is put on the pfdat free list (phead). The kernel global freemem is incremented. If any other processes are waiting for memory, we wake them all up so that the first one here can have the page (the losers of the race will go to sleep again).

  • If r_bstore is swapdev_vp, the reserved swap pages (r_swalloc) are released, as are the swap pages reserved for the B-tree structure (r_root->b_rpages).

  • The pages themselves are freed by invalidating their pdes, purging the TLBs, flushing the caches, moving the non-HTBL pdes from the hash list to the free list, and linking the pfdat entry into phead.

  • r_root and r_chunk region elements are moved back to the buckets rather than being freed.

  • activeregions is decremented; the region is removed from the r_forw / r_back region chain, and the region memory returned to its memory allocation bucket.
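The hand adjustment that deactivate_preg() performs can be sketched on a toy circular list. This is an illustrative model, not kernel code: the Pregion class and the link_ring and deactivate helpers are hypothetical. The two rules for moving the hands follow the first bullet above. The handling of a single-pregion list and the final guard that keeps a hand off the removed pregion are assumptions added so that the sketch is safe to run.

```python
class Pregion:
    """Hypothetical stand-in for a pregion on the active pregion list."""

    def __init__(self, name):
        self.name = name
        self.next = self
        self.prev = self


def link_ring(names):
    """Link pregions into a circular doubly linked list; return them by name."""
    nodes = [Pregion(n) for n in names]
    for i, node in enumerate(nodes):
        node.next = nodes[(i + 1) % len(nodes)]
        node.prev = nodes[i - 1]
    return {node.name: node for node in nodes}


def deactivate(prp, agehand, stealhand):
    """Unlink prp and return the adjusted (agehand, stealhand) pair."""
    if prp.next is prp:
        # Assumption: removing the only pregion leaves both hands unset.
        return None, None
    if agehand is prp and stealhand is prp.next:
        # Move the agehand back so it does not pass the stealhand.
        agehand = agehand.prev
    elif agehand is prp or stealhand is prp:
        # Otherwise both hands step forward one pregion.
        agehand = agehand.next
        stealhand = stealhand.next
    # Assumption (not in the text): never leave a hand on the removed pregion.
    if agehand is prp:
        agehand = prp.next
    if stealhand is prp:
        stealhand = prp.next
    # Remove prp from the active list.
    prp.prev.next = prp.next
    prp.next.prev = prp.prev
    prp.next = prp.prev = prp
    return agehand, stealhand


ring = link_ring(["a", "b", "c", "d"])
age, steal = deactivate(ring["b"], agehand=ring["b"], stealhand=ring["c"])
print(age.name, steal.name)
```

In this run the agehand sits on the pregion being removed and the stealhand on the one after it, so the agehand steps back to a while the stealhand stays on c.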

Building the new process

If the process for which memory structures are being created is the first to use the a.out as an executable, the a.out vnode's v_vas is NULL, and the pseudo-vas, pseudo-pregion, and region must be created. Otherwise, the pseudo-vas' reference count is updated.
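The create-on-first-use behavior described above resembles a reference-counted lazy initialization. The sketch below illustrates only that pattern. The Vnode and PseudoVas classes, the get_pseudo_vas helper, the refcnt field name, and the choice of 1 as the initial count are all assumptions made for the example; only v_vas and its NULL initial state come from the text.

```python
class PseudoVas:
    """Hypothetical pseudo-vas owning one pseudo-pregion and one region."""

    def __init__(self):
        self.refcnt = 1                    # assumption: first user counts as 1
        self.region = object()             # placeholder for the a.out region
        self.pregion = {"region": self.region}


class Vnode:
    """Hypothetical a.out vnode; v_vas stays None (NULL) until first exec."""

    def __init__(self, path):
        self.path = path
        self.v_vas = None


def get_pseudo_vas(vp):
    """Create the pseudo-vas on first use, otherwise bump its reference count."""
    if vp.v_vas is None:
        vp.v_vas = PseudoVas()
    else:
        vp.v_vas.refcnt += 1
    return vp.v_vas


vp = Vnode("a.out")
first = get_pseudo_vas(vp)     # first process to run the file
second = get_pseudo_vas(vp)    # a later process running the same file
print(first is second, second.refcnt)
```

Both callers end up sharing one pseudo-vas, which is what allows later processes to reuse the region set up by the first.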

  • The region to which a PT_TEXT pregion is attached depends on the type of executable.

    • If the executable is non-EXEC_MAGIC, a PT_TEXT pregion is attached to the pseudo-vas region.

    • If the executable is EXEC_MAGIC, VA_WRTEXT is set in the process vas, the pseudo-vas' region is duplicated as a type RT_PRIVATE region (performing all the steps discussed for an RT_PRIVATE region), RF_SWLAZYWRT is set in the new region so that no swap is reserved before needed, and a PT_TEXT pregion is attached to it.

    • In both cases, a new space is attached to the pregion's virtual address.

    • A PT_NULLDREF pregion is attached to the global region (globalnullrp), using the same space as PT_TEXT.

    • The pseudo-vas' region is duplicated as a type RT_PRIVATE region using r_off to point to the beginning of the data portion of the a.out file. A PT_DATA pregion is attached to it. If this is an EXEC_MAGIC executable, we use the PT_TEXT pregion's space, otherwise a new space is assigned.

    • The PT_DATA pregion is grown by the size of bss (the uninitialized data area), using dbd type DBD_DZERO. This sets b_protoidx to the end of the initialized data area and b_proto2 to DBD_DZERO. More swap is reserved.

    • A private region of three pages (SSIZE + 1) is created for the user stack. The dbd proto value is set to DBD_DZERO, and a PT_STACK pregion is attached at USRSTACK. The PT_DATA pregion's space is used.

    • When a shared library is linked to the process, two PT_MMAP pregions are created: an RT_SHARED pregion containing text mapped into the third quadrant with a space of KERNELSPACE and an RT_PRIVATE pregion containing associated data (such as library global variables) with the PT_DATA pregion's space.

    • If VA_WRTEXT is set, the data pregion takes the first available address above where the text ends (in the first or second quadrant); otherwise it is assigned the first available address above 0x40000000 (the second quadrant).
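The space-sharing and data-placement rules in the list above can be summarized in a small model. This is a sketch under stated assumptions: build_layout, its parameters, the dictionary fields, and the space_ids allocator are hypothetical. The model returns only the lower bound above which the data address is searched (data_floor), not the exact address the kernel would choose. Shared library PT_MMAP pregions and bss growth are left out.

```python
import itertools

SECOND_QUADRANT = 0x40000000  # start of the second quadrant, from the text


def build_layout(exec_magic, text_start, text_size, space_ids=None):
    """Return (vas_flags, pregions, data_floor) for a new process image.

    space_ids is a hypothetical allocator that yields fresh space numbers.
    """
    if space_ids is None:
        space_ids = itertools.count(1)
    vas_flags = set()

    # PT_TEXT always gets a new space; its region depends on the executable.
    text = {"type": "PT_TEXT", "space": next(space_ids)}
    if exec_magic:
        vas_flags.add("VA_WRTEXT")
        text["region"] = "RT_PRIVATE copy of pseudo-vas region"
        text["region_flags"] = {"RF_SWLAZYWRT"}   # no swap reserved up front
    else:
        text["region"] = "pseudo-vas region"
        text["region_flags"] = set()

    # PT_NULLDREF shares the text space and attaches to the global region.
    nulldref = {"type": "PT_NULLDREF", "space": text["space"],
                "region": "globalnullrp"}

    # PT_DATA reuses the text space only for EXEC_MAGIC executables.
    data_space = text["space"] if exec_magic else next(space_ids)
    data = {"type": "PT_DATA", "space": data_space,
            "region": "RT_PRIVATE copy of pseudo-vas region"}

    # PT_STACK uses the PT_DATA pregion's space.
    stack = {"type": "PT_STACK", "space": data_space, "region": "RT_PRIVATE"}

    # Data is placed above the end of text if VA_WRTEXT is set,
    # otherwise above the start of the second quadrant.
    if "VA_WRTEXT" in vas_flags:
        data_floor = text_start + text_size
    else:
        data_floor = SECOND_QUADRANT
    return vas_flags, [text, nulldref, data, stack], data_floor


for magic in (False, True):
    flags, pregions, floor = build_layout(magic, 0x1000, 0x3000)
    print(magic, sorted(flags), [(p["type"], p["space"]) for p in pregions], hex(floor))
```

The text start address and size passed in the loop are arbitrary sample values chosen for the demonstration.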

Virtual memory and exit()

From the virtual memory perspective, an exit() resembles the first part of an exec(). All virtual memory resources associated with the process are discarded, but no new ones are allocated.

Thus, when exiting from a vfork child before the child has performed an exec(), nothing needs to be cleaned up from virtual memory except to return resources to the parent process. If exiting from a non-vfork child, the virtual memory resources are discarded by calling dispreg().