Chris Lord, Sr. Program Manager at Microsoft, presented a Sequencing Deep Dive at Tech-Ed 2008.
The presentation covers the inside of the virtual file system, the sequencing process, volume management, Dynamic Suite Composition, and state management. Last year, when I visited Microsoft in Boston, Chris shared this great information; I think it is useful for every App-V die-hard. The presentation can be downloaded here: https://www.virtuall.nl/knowledge/App-V45SequencingDeepDive.pdf
Below is the complete description of App-V package generation and all the corresponding terminology.
The life of a package file begins with the Sequencer. The process of producing a package consists of three stages: you install the application, you run the application, and you save a package. The files and folders that are created and modified during installation and execution become part of the package. Prepare to dive to about 5,000 feet.
Package Generation: Sequencer Application Installation Stage
The installation stage is driven by the Installation Wizard, shown here at the top. Below this is a data flow diagram (or DFD) that roughly aligns with the stages of the wizard. Circles represent processes or activities, rectangles represent external processes or activities, and the open boxes are data stores (both volatile and persistent). Unlike similar-looking state diagrams, the direction of an arrow shows movement of data. This is typically from a provider process to a data store, from a data store to a consumer process, or directly from a provider to a consumer. The label on an arrow clarifies the type of data that is flowing.
One of the changes in 4.5 Sequencer is that the Installation Wizard prompts you for the installation directory when you start monitoring (and before you have installed the application) rather than when you stop monitoring. SoftGrid 4.5 supports ACL capture and propagation from the Sequencer through to the Client. This change was necessary to ensure the ACLs we capture are accurate and match what the application intends. After you have specified the target installation directory—the last component of which becomes the package root—the security descriptors are initialized to known values.
The sequencer then begins the process of monitoring all file operations for new processes spawned by the shell or select system services. This is when you run your installation program.
SoftGrid 4.5 includes better support for applications that install and use SxS assemblies. One of the improvements is to dynamically privatize public assemblies that are installed by the installation program so they are immediately available to the application. In the past, we have encountered application installers (such as Microsoft Money 2007) that needed these assemblies for later installation stages and would not work with the post-processing privatization that we provided in 4.2. This is especially true if an application installer offers to check and install upgrades at the end of its installation.
When you stop monitoring, the list of all files and directories that were created or modified by the application installation populates three lists: the list of user files, the list of application files, and the list of VFS files.
Any files that are created outside the specified installation directory become part of the VFS. Copies of these files are moved to the VFS directory under the package root and into subdirectories based on the name of the CSIDL. CSIDLs are symbolic machine-independent names for common file locations. For example, CSIDL_PROGRAM_FILES refers to the program files directory, which could be named and located differently on different systems.
Each file and directory outside the installation directory is parsed against all standard CSIDLs and a VFS mapping is created. This process is a vital part of the way SoftGrid makes applications independent of the system on which they are sequenced and run. For example, if an application stores a file in C:\Program Files, but the application is run on a system where C:\Program Files is named something different, perhaps a localized version of the name, the process of parsing the VFS mappings is reversed so the file will appear in the correct location even though the target system is configured differently.
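The mapping and its reversal can be sketched in a few lines of Python. This is only an illustration of the idea: the two CSIDL tables and the function names `to_vfs` and `from_vfs` are hypothetical, and the real Sequencer and Client obtain the folder locations from the operating system rather than from a hard-coded table.

```python
from pathlib import PureWindowsPath

# Hypothetical CSIDL tables for illustration; real values come from the OS.
SEQUENCER_CSIDLS = {
    "CSIDL_PROGRAM_FILES": r"C:\Program Files",
    "CSIDL_WINDOWS": r"C:\Windows",
}
CLIENT_CSIDLS = {
    "CSIDL_PROGRAM_FILES": r"C:\Programme",  # localized name on the client
    "CSIDL_WINDOWS": r"C:\Windows",
}


def to_vfs(path, csidls):
    """Parse an absolute path into (CSIDL name, relative path), or None."""
    p = PureWindowsPath(path)
    # Try the longest roots first so the most specific CSIDL wins.
    for name, root in sorted(csidls.items(), key=lambda kv: -len(kv[1])):
        try:
            rel = p.relative_to(PureWindowsPath(root))
        except ValueError:
            continue  # path is not under this CSIDL root
        return name, rel
    return None


def from_vfs(name, rel, csidls):
    """Reverse the mapping using the CSIDL locations of the target system."""
    return PureWindowsPath(csidls[name]) / rel


mapping = to_vfs(r"C:\Program Files\Contoso\app.ini", SEQUENCER_CSIDLS)
print(mapping[0], from_vfs(*mapping, CLIENT_CSIDLS))
```

The file is stored under its symbolic CSIDL name at sequencing time and resolved against the client's own folder locations at run time, which is what makes the package independent of the sequencing machine.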
Package Generation: Application and Save Stages
The next stage is the application stage, which is managed by the Application Wizard in the Sequencer. You can see from the DFD below that the overall process is very similar to that of the installation stage. There are, however, a few differences. First, the VFS exists in this stage, so file operations that occur to files and directories with VFS mappings are captured and handled by the VFS. Second, the rules for determining what is application data and what is user data change. Third, the monitoring includes file I/O (reads and writes), not just operations like creates, renames and deletes.
For our purposes, the most interesting is the I/O monitoring since this is what drives the final stage, saving the package. Because the Sequencer knows which byte offsets of which files are accessed during the application stage, it is able to place those blocks in feature block 1 and thereby create the minimum set of data necessary to satisfy the application execution scenario. When you launch an uncached application, you only wait for FB1 to be loaded; all other data is brought down on demand or in the background.
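As a rough sketch of how an I/O log could drive the split into feature blocks: the block size, the log format, and the function names below are assumptions made for illustration, not the Sequencer's actual package layout.

```python
BLOCK_SIZE = 32 * 1024  # assumed block size, for illustration only


def blocks_touched(offset, length, block_size=BLOCK_SIZE):
    """Return the block indexes covered by one read or write."""
    if length <= 0:
        return set()
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return set(range(first, last + 1))


def split_feature_blocks(file_sizes, io_log, block_size=BLOCK_SIZE):
    """Split every (file, block) pair into FB1 (accessed) and everything else.

    file_sizes maps a file name to its size in bytes.
    io_log is a list of (file name, byte offset, length) tuples.
    """
    accessed = set()
    for name, offset, length in io_log:
        accessed |= {(name, b) for b in blocks_touched(offset, length, block_size)}
    all_blocks = {
        (name, b)
        for name, size in file_sizes.items()
        for b in range((size + block_size - 1) // block_size)
    }
    return accessed & all_blocks, all_blocks - accessed


sizes = {"app.exe": 100_000, "help.chm": 70_000}
fb1, rest = split_feature_blocks(sizes, [("app.exe", 0, 40_000)])
print(sorted(fb1), len(rest))
```

Only the blocks in the first set need to arrive before the application can start; the second set can follow on demand or in the background.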
All other file data, ACLs, and SystemGuard configuration also are written to the package along with the metadata that identifies the SoftGrid attributes of each file.
Application versus User Files
The SoftGrid package format and file system identify package files in one of four different ways: application data, application configuration, user data, or user configuration. This is historical, since the SoftGrid 4.x file system no longer distinguishes between data and configuration files. In other words, user data and user configuration are treated the same on the client, and application data and application configuration are likewise treated the same.
Regardless of whether a file was created during the install or application stage, if it is an executable or SxS manifest, then it is application data. From there, the logic differs depending on stage. For files created during the installation phase, anything in the user profile is user data, and anything outside the profile with a .dat, .ini, .dot, or .mdw file type is user data. All else is application data.
In contrast, everything else created during the application stage is considered user data. This is one reason why it is important to perform the right operations in each sequencing stage. If you do something like installing the bulk of the application during the Application Wizard, then you risk miscategorizing files.
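The rules above can be written down as a small decision function. This is a sketch of the heuristic as described in this post, not the Sequencer's code. In particular, recognising executables and SxS manifests by file extension is an assumption, and the function and constant names are made up for the example.

```python
from pathlib import PureWindowsPath

USER_DATA_EXTENSIONS = {".dat", ".ini", ".dot", ".mdw"}
# Assumption: executables and SxS manifests are recognised by extension here.
APPLICATION_EXTENSIONS = {".exe", ".dll", ".manifest"}


def classify(path, stage, profile_root):
    """Return "application" or "user" for a file created in the given stage."""
    if stage not in ("install", "application"):
        raise ValueError(f"unknown stage: {stage}")
    p = PureWindowsPath(path)
    ext = p.suffix.lower()
    # Executables and manifests are application data regardless of stage.
    if ext in APPLICATION_EXTENSIONS:
        return "application"
    # Everything else created during the application stage is user data.
    if stage == "application":
        return "user"
    # Installation stage: profile location or a known user file type.
    in_profile = PureWindowsPath(profile_root) in p.parents
    if in_profile or ext in USER_DATA_EXTENSIONS:
        return "user"
    return "application"


print(classify(r"Q:\App\data.bin", "install", r"C:\Users\alice"))
print(classify(r"Q:\App\data.bin", "application", r"C:\Users\alice"))
```

The same file lands in a different category depending on the stage in which it was created, which is why installing during the Application Wizard leads to miscategorized files.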
There is a design change under consideration to simplify this logic so that, roughly, files in the user profile are user files and files outside are application files. The Sequencer would still provide you a means for overriding this default heuristic.
The SoftGrid file system manages package data using a collection of six container volumes. You can think of these volumes as self-contained file systems within a single file. Each volume stores different information based on the type of data, where it is located, and who modifies it. The next two slides describe each of these volumes and illustrate when the volume is first initialized, what it contains, and where it lives.
The first two volumes are global across all users and packages.
File System Data Cache
Contains all loaded package data and metadata from previously streamed content. This volume contains only RO data that is recoverable by restreaming or reloading the application. The file is not preallocated and grows to the maximum configured size as package data is loaded.
File System Global Cache
This volume contains files that are created outside the context of a SoftGrid process or virtual environment and are not associated with any package. We grant select system processes like sftlist, csrss, and winlogon access to the SoftGrid drive. If they create data in the root of the drive or data that is not associated with a running package, then the data gets stored here. This is a rarely encountered catchall.
By default, these volumes are both stored in the shared All Users or Public documents folder. Users have only read access to these volumes and because they are opened by the file system at system start up, access to these volumes is usually through the SoftGrid file system itself.
The next four volumes are unique to each package. The first three, shown in red, are stored in the shared profile, and the last one, shown in blue, is stored in the user profile.
Global Package Volume
The Global Package Volume contains any application-specific data that is modified by a system process. The well-known SID for system is appended to the volume name to make this clear. In SoftGrid 4.0 and 4.1, this volume was used for all modified application data. A side effect of this was that any user was able to modify application-specific files and affect other users. To remedy this, 4.2 and 4.5 separate modifications into those made by allowed system processes such as the Listener, and those made by the user application processes. User modifications go instead to the Application Data Isolation Volume. The global package volume also contains the SystemGuard settings for system processes.
Global User Volume
The Global User Volume contains new or modified user-specific data from a system process like csrss or winlogon that cannot be associated with a specific user context. Like the global cache, this is not very common.
Application Data Isolation Volume
Contains application-specific files that are modified by any user process in the virtual environment. The SID of the user is appended to the volume name to uniquely identify it.
User Package Volume
Contains user-specific files that are modified, or new files that are created, by any user process in the virtual environment. This volume also contains the SystemGuard configuration settings.cp file that defines all VFS and VReg information.
All these volumes are created when a Load operation is performed, or upon shutdown after the first time a package is launched. The user package volume, because it is stored in the user profile, is only created for the user that is performing the load or the launch.
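Taken together, the four per-package volumes form a two-by-two table: what kind of file is being modified, and what kind of process is modifying it. A minimal sketch of that routing, with a hypothetical function name and string labels chosen for the example:

```python
# Routing of a modification to a per-package volume, as described above.
# Keys are (kind of file, kind of process making the change).
VOLUME_FOR = {
    ("application", "system"): "Global Package Volume",
    ("user", "system"): "Global User Volume",
    ("application", "user"): "Application Data Isolation Volume",
    ("user", "user"): "User Package Volume",
}


def target_volume(file_kind, process_kind):
    """Return the name of the volume that receives the modified file."""
    return VOLUME_FOR[(file_kind, process_kind)]


print(target_volume("application", "user"))
```

Of these four, only the User Package Volume lives in the user profile; the other three are stored in the shared profile.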
Data Organization: File and Directory Locations
This slide shows the locations of all these volumes as well as the temporary copies for a package. The SoftGrid file system ensures the integrity and consistency of package data by always making modifications to a temporary copy of the file system volume, which then replaces the package volume when the application is no longer in use by the user (in the case of the per-user volumes) or all users (in the case of per-system volumes). If the system is disrupted before the package is shut down, then it automatically rolls back to the volumes as they were when the package was launched.
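The working-copy pattern described here is a common one and can be sketched with ordinary files. The `.tmp` suffix, the file names, and the function names below are assumptions for the example; the real client manages container volumes, not plain files.

```python
import os
import shutil
import tempfile


def open_working_copy(volume_path):
    """Create the temporary working copy that receives all modifications."""
    temp_path = volume_path + ".tmp"  # hypothetical naming convention
    if os.path.exists(volume_path):
        shutil.copyfile(volume_path, temp_path)
    else:
        open(temp_path, "wb").close()  # first launch: start with an empty volume
    return temp_path


def commit(volume_path, temp_path):
    """Clean shutdown: the working copy replaces the permanent volume."""
    os.replace(temp_path, volume_path)


def rollback(temp_path):
    """Disruption: discard the working copy; the permanent volume is untouched."""
    if os.path.exists(temp_path):
        os.remove(temp_path)


def demo():
    """Show that a rollback keeps the old data and a commit keeps the new data."""
    with tempfile.TemporaryDirectory() as d:
        vol = os.path.join(d, "package.vol")
        with open(vol, "wb") as f:
            f.write(b"v1")
        tmp = open_working_copy(vol)
        with open(tmp, "wb") as f:
            f.write(b"v2")
        rollback(tmp)
        with open(vol, "rb") as f:
            after_rollback = f.read()
        tmp = open_working_copy(vol)
        with open(tmp, "wb") as f:
            f.write(b"v2")
        commit(vol, tmp)
        with open(vol, "rb") as f:
            after_commit = f.read()
    return after_rollback, after_commit


print(demo())
```

Because the permanent volume is never written in place, a disruption at any point before the commit leaves it exactly as it was at launch.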
Data Management: Launch and Shutdown
This slide describes the activities that occur when a package is first launched by a user, first launched on a system, or subsequently launched by a user or on a system. Four combinations can occur. The two obvious ones are the first launch on a system by a user and a subsequent launch on a system by a user. Two less obvious cases are a new user on a system on which the package has already been launched (such as would happen on a terminal server), and a user that has previously used a package on a new system, such as might happen in a terminal server farm.
The first time you launch a package, the package data is streamed down to the client and stored in the file system data cache. Temporary versions of the four package-specific volumes are initialized but don’t yet contain actual data. As the application modifies files, the modified versions get written to the appropriate volumes. When the application exits and the virtual environment closes, the temporary volumes are copied to their permanent locations and names.
The second time the user launches the package, the permanent volumes already exist, so they are copied to the temporary versions. All modifications are made to the temporary versions, and when the package is shut down, they replace the previous permanent versions.
Consider the new user on a Terminal Server. The per-system volumes already exist, so it behaves as a subsequent launch. However, the per-user volumes do not exist, so it behaves as an initial launch.
Conversely, on a new system, the per-system volumes do not exist, so they are initialized as in an initial launch, while a user that previously used the package behaves as a subsequent user launch.
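The four combinations reduce to two independent questions: do the per-system volumes already exist, and do the per-user volumes already exist? A small sketch, with a hypothetical function name and descriptive strings in place of real file operations:

```python
def launch_plan(per_system_exists, per_user_exists):
    """Describe how the temporary volumes are prepared at launch."""

    def action(exists):
        # Existing permanent volumes are copied; missing ones start empty.
        if exists:
            return "copy permanent to temporary"
        return "initialize empty temporary"

    return {
        "per-system": action(per_system_exists),
        "per-user": action(per_user_exists),
    }


# New user on a terminal server where the package has been launched before.
print(launch_plan(per_system_exists=True, per_user_exists=False))
```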
Package In Use
Lastly, if a package is in use, then only the first user incurs the cost of copying and mounting the per-system volumes. Other users only incur the per-user volume setup. Conversely, the per-system volumes are only saved when the last user of the package exits the application. This has some subtle side effects in terms of performance and when changes are committed. The first user to launch a package will have a slightly longer launch time than subsequent users. And since changes to per-system volumes are only committed when the last user of a package exits the application, a system disruption will cause changes to the per-system volumes to be rolled back to when the first user launched the package, while changes to the per-user volumes will be rolled back just to the start of a user’s launch of the package.
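This first-in, last-out behavior is essentially reference counting on the per-system volumes. A sketch under that assumption, with a made-up class name and an event list standing in for the real mount and commit work:

```python
class PackageInUse:
    """Track the users of one package and record when volumes are touched."""

    def __init__(self):
        self.active_users = set()
        self.events = []

    def launch(self, user):
        # Only the first user pays for copying and mounting per-system volumes.
        if not self.active_users:
            self.events.append("mount per-system volumes")
        self.active_users.add(user)
        self.events.append(f"set up per-user volumes for {user}")

    def exit(self, user):
        self.active_users.discard(user)
        self.events.append(f"commit per-user volumes for {user}")
        # Per-system changes are committed only when the last user leaves.
        if not self.active_users:
            self.events.append("commit per-system volumes")


pkg = PackageInUse()
pkg.launch("alice")
pkg.launch("bob")
pkg.exit("alice")
pkg.exit("bob")
print("\n".join(pkg.events))
```

The gap between the first mount and the last commit is the window in which a disruption rolls per-system changes all the way back to the first user's launch.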
Data Management: Administrative Operations
Three primary client management console commands affect the file system volumes. When you load an application, in addition to populating the file system cache with the package data, it pre-initializes the two global package volumes. Unload deletes the two global package volumes. The repair and clear options will delete the user package volume (and also delete the settings.cp file).
There are several management activities that directly affect file system volumes. When you perform a package load, it initializes the per-system volumes and initializes the per-user volumes for the user performing the load. This operation behaves very much like a launch: at the start of the load, temporary versions of each volume are created and when the load completes, these are moved to their final package locations.
Repair and clear are the same from a file system perspective: they delete the user package volume for the user performing the repair. Because the user’s SystemGuard personal settings live in this volume, a repair also has the effect of eliminating any modifications to the VFS or Virtual Registry made by the user. Clear differs from repair because it also removes knowledge of the application, its FTAs and shortcuts from the user.
Unload and Delete are likewise similar: they have the same effect on file system volumes. The per-system volumes and the application data isolation volume are removed, and the package data is evicted from the file system cache.
Delete, like Clear, removes information about the application for all users.
4.0 Data Robustness Improvements
Data consistency - Metadata is maintained with file data in file system container volumes and updated as a single unit.
Data Integrity - The window of vulnerability in which the cache may be flushed is reduced to just the periods when the file system cache is being modified. Modifications are made to working copies of file system volumes and synched on package shutdown.
Efficient - SystemGuard settings.cp stored in the user package volume and always reflects state of file system files. (Shared profiles and profile errors may orphan files due to inconsistent file system and VFS state.) Only data for package files modified by the user are stored in the per-package user volumes. (Commingled package storage and copy-on-access leads to large profiles with many files in AppFS.)
Override and Permanent
With the change in 16476 to always delete the application data volumes on an upgrade, the purpose of the override and permanent bits is lost. It is a loss we accept, but one we should lose consistently across the product.
Historically in 4.0, an application-specific file (‘application data’ or ‘application configuration’ SoftGrid file attributes) is replaced with a newer version in an upgrade package unless the Permanent flag is set and the Override flag is clear. This is actually different than the 3.x behavior, which ignored the Permanent flag. The table below shows this difference.
In both 3.x and 4.0, application-specific files that were modified by the user but were unchanged in the upgraded package were left alone during a package upgrade. With the 16476 change, these are lost.
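The 4.0 replacement rule described above (a file is replaced unless Permanent is set and Override is clear) can be spelled out as a truth table. This sketch covers only the 4.0 rule as stated in this post; the function names are made up for the example, and the 3.x behavior is not modelled.

```python
def replaced_on_upgrade_4_0(permanent, override):
    """4.0 rule: replace unless Permanent is set and Override is clear."""
    return not (permanent and not override)


def truth_table():
    """Map every (permanent, override) combination to the 4.0 outcome."""
    return {
        (p, o): replaced_on_upgrade_4_0(p, o)
        for p in (False, True)
        for o in (False, True)
    }


for (permanent, override), replaced in truth_table().items():
    print(f"permanent={permanent!s:5} override={override!s:5} replaced={replaced}")
```

Only one of the four combinations keeps the old file, which helps explain why the Sequencer's default of setting Override on every application-specific file makes the flags largely invisible in practice.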
The security changes to isolate application-specific changes to a user via the application data isolation volume, and the intended behavior on package upgrade to delete all application-specific data (now covered in 16476), mean that 4.2 and 4.5 no longer care about the Override and Permanent flags.
Since their behavior has never been well understood outside the Sequencing and File System teams, I suggest these options be completely removed from the Sequencer. The Sequencer already marks all application-specific files with the Override flag by default and should continue to do so for backwards compatibility (whether or not that compatibility is officially sanctioned, since it is needed for phase-in and test).