- Understanding What a File System Is
- Understanding File System Taxonomy
- Understanding Local File System Functionality
- Understanding Differences Between Types of Shared File Systems
- Understanding How Applications Interact With Different Types of File Systems
- Conclusions
- About the Author
Understanding How Applications Interact With Different Types of File Systems
Although there is a wide variety of ways to categorize applications, one of the more straightforward methods is to consider the size of the data sets involved. This means both the size of individual data sets and the size of the logical groups of data sets; these correspond to the size of the files of significance and to the file systems that will store them together. As we saw in the descriptions of each file system, one of the key engineering design tradeoffs made in the development of each file system is the way it handles scalability in the dimensions of size and number of clients.
In addition to the size of files, there are several other broad application categories that include the most common applications found in the Solaris environment. Database systems such as Oracle, DB2, MySQL, and Postgres contain a large proportion of all data on these servers. The applications that use the databases can be divided into two qualitatively different categories: online transaction processing (OLTP) and decision support (DSS) systems.
OLTP work is primarily concerned with recording and utilizing the transactions that represent the operation of a business. In particular, the dominant operations focus on individual transactions, which are relatively small. Decision support or business intelligence applications are concerned with identifying trends and patterns in transactions captured by OLTP systems. The key distinction is that DSS systems process groups of transactions, while OLTP systems deal with individual transactions.
For the most part, OLTP systems experience activity that is similar to small-file work, while DSS systems experience activity that is similar to large-file workloads. High performance technical computing (HPTC) is a category of its own, independent of the others. This group typically makes use of the largest files and usually also creates the highest demand on I/O bandwidth. The files that support various Internet service daemons, such as ftp servers, web servers, LDAP directories, and the like, typically fall into the small-file category (even ftp servers). Occasionally, concentrated access will create higher than usual I/O rates to these files, but usually they are just more small files to a file system. One reason that ftp servers fall into this category, even if they are actually providing large files (for example, the 150-650 megabyte files associated with large software distributions such as StarOffice, Linux, or Solaris), is that the ftp server's outbound bandwidth constrains the I/O request rate of each individual stream, causing physical I/O to be strongly interleaved. The result is activity that has the same properties as small-file activity rather than the patterns that are more commonly associated with files of this size.
The activities and I/O characteristics associated with these primary workload groups can be used to assign file systems to applications in a fairly straightforward way.
Selecting a Shared File System
Deciding which shared file system to use is fairly straightforward because there is relatively little overlap between the main products in this space. The three primary considerations are as follows:
- The level of cooperation between the clients and server (for example, are they members of a cluster or not)
- The size of the most important data sets
- The sensitivity of the data
PxFS is the primary solution for sharing data within a cluster. Although NFS works within a cluster, it has somewhat higher overhead than PxFS while not providing features that would be useful within a cluster.
QFS/SW is being evaluated in this application today, since the security model is common between Sun Cluster software and QFS/SW. Furthermore, Sun Cluster software and QFS/SW usually use the same interconnectivity model when the nodes are geographically collocated. Early indications suggest that QFS/SW might be able to serve a valuable role in this application with lower overhead, but such configurations are not currently supported because the evaluation is not complete.
For clients that are not clustered together, access to small files should use NFS unless the server that owns the files happens to run Windows. In this context, "small" means files that are less than a few megabytes or so.
The most common examples of small-file access are typical office automation home directories, software distribution servers (such as /usr/dist within Sun), software development build or archive servers, and support of typical Internet services such as web pages, mail, and even typical ftp sites. This last category usually occurs when multiple systems are used to provide service to a large population using a single set of shared data. The clearest example is the use of 10-20 web servers in a load-balanced configuration, all sharing the same highly available and possibly writable data.
CIFS can also be used in this same space, but the lack of an efficient kernel implementation in Solaris, combined with the lack of any CIFS client package, means that CIFS will be used almost exclusively by Windows and Linux systems. Because server hardware and especially file system implementations scale to thousands of clients (NFS) or at least hundreds of clients (CIFS) per server, they are clearly preferred when processing is diverse and divided among many clients.
The other main application space is that of bulk data: applications in which the most interesting data sets are larger than a few megabytes and often vastly larger. Applications such as satellite download, oil and gas exploration, movie manipulation, and other scientific computations often manipulate files that are tens of gigabytes in size. The leading edge of these applications is approaching 100 terabytes per data set. Decision support or business intelligence systems have data set profiles that would place them in this category, but these systems do not typically use shared data.
Because of the efficiencies of handling bulk data through direct storage access, QFS/SW is the primary choice in this space. The nature of bulk data applications generally keeps data storage physically near the processing systems, and these configurations typically enjoy generous bandwidth. However, NFS is required in the uncommon case where Fibre Channel is not the storage interconnection medium or where disk interconnect bandwidth is quite limited. For example, if the distance between processor and storage is more than a few kilometers, NFS is the most suitable alternative.
A consideration that is receiving attention is security. Highly sensitive data might require substantial security measures that are not available in QFS/SW. Data privacy is typically addressed by encryption, and neither QFS/SW nor Fibre Channel has any effective encryption capability. This is hardly surprising given the performance goals associated with each of these technologies. In contrast, NFS can be configured with Kerberos-v5 for more secure authentication and/or privacy. Security can also be addressed at the transport level: the NFS transport is IP, which can optionally be configured with IPsec link-level encryption. No corresponding capability is available in Fibre Channel.
In addition to these implementation differences, there is a subtle architectural difference between QFS/SW and NFS. QFS/SW clients trust both the storage and the other clients. NFS clients only need to trust the server as long as the server is using a secure authentication mechanism.
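Taken together, the considerations above reduce to a short decision sketch. The C fragment below is purely illustrative: the function name, the few-megabyte threshold, and the inputs are assumptions drawn from the preceding discussion, not an official selection tool.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Illustrative sketch only: encodes the shared file system guidance
 * discussed above. Names and thresholds are assumptions for clarity.
 */
static const char *
pick_shared_fs(bool in_cluster, double typical_file_mb,
               bool fibre_channel_to_storage, bool sensitive_data)
{
    if (in_cluster)
        return "PxFS";                      /* sharing within a cluster */

    if (sensitive_data)
        return "NFS (Kerberos-v5, IPsec)";  /* QFS/SW and FC lack encryption */

    if (typical_file_mb <= 4.0)             /* "a few megabytes or so" */
        return "NFS (or CIFS for Windows clients)";

    if (fibre_channel_to_storage)
        return "QFS/SW";                    /* bulk data over direct storage access */

    return "NFS";                           /* no FC path or limited disk bandwidth */
}

int
main(void)
{
    /* Unclustered clients, half-gigabyte files, FC-attached storage. */
    printf("%s\n", pick_shared_fs(false, 512.0, true, false));
    return 0;
}
```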
Selecting a Local File System
As with shared file systems, deciding which local file system to apply to given applications centers on the size of the interesting files. The local file systems have the additional consideration of how large the file system is, and what proportion of it might be dead or inactive.
When the size of the important files is relatively small, any of the local file systems will handle the job. For example, files in most office automation home directories average around 70 kilobytes today, despite an interesting proportion of image files, MP3 files, massive StarOffice presentations, and archived mail folders.
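A quick way to test whether a data set fits this small-file profile is simply to measure it. The sketch below, assuming a hypothetical directory such as /export/home, walks a tree with nftw() and reports the average regular-file size.

```c
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Rough sketch: walk a directory tree and report the average regular-file
 * size, a quick check of whether a data set fits the small-file profile
 * described above. The default path is only an example.
 */
static long long total_bytes;
static long long file_count;

static int
visit(const char *path, const struct stat *sb, int type, struct FTW *ftw)
{
    (void)path; (void)ftw;
    if (type == FTW_F) {
        total_bytes += sb->st_size;
        file_count++;
    }
    return 0;
}

int
main(int argc, char **argv)
{
    const char *root = (argc > 1) ? argv[1] : "/export/home";  /* example path */

    if (nftw(root, visit, 32, FTW_PHYS) != 0) {
        perror("nftw");
        return 1;
    }
    if (file_count > 0)
        printf("%lld files, average %lld bytes\n",
               file_count, total_bytes / file_count);
    return 0;
}
```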
How those files will be accessed is not a discriminator. The files might be used on the local server (such as on a time-sharing system) or exported through a file-sharing protocol for use on clients elsewhere in the network. Because logging UFS is a part of the Solaris OS, it is usually recommended for these applications, especially on the Solaris 9 OS and later, which add recent improvements such as hashed DNLC handling, better logging performance, and optional suppression of access time updates.
In the small-file arena, Sun recommends SAM functionality (for example, either QFS or SAMFS with the archiving features enabled) over logging UFS when the size of the file systems becomes truly massive or when the proportion of dead or stale files represents an interesting overhead cost. Although it might seem counterintuitive that small files could result in massive file systems, several fairly common scenarios result in such installations. One is sensor data capture, in which a large number of sensors each report a small amount of data. Another is server consolidation as applied to file servers. This particular application has been implemented within Sun's own infrastructure systems and has resulted in many very large file systems full of small, stale files.
One major exception to the rule of small files is the use of database tables in a file system. Because databases carefully manage the data within their tablespaces, the files visible to the file system have some special characteristics that make them particularly easy for relatively simple file systems to manage. The files tend to be constantly "used" in that the database always opens all of them, and the files are essentially never extended. Finally, the databases have their own caching and consistency mechanisms, so access is almost always quite careful and deliberate.
Operating under these assumptions, UFS is easily able to handle database tables used for OLTP systems. At one time (Solaris 2.3, 40 megahertz SuperSPARC®), the performance difference between raw disk devices and UFS tablespaces was quite substantial: performance was about 40 percent lower on file systems. Since that time, constant evolution of the virtual memory system, file system, and microprocessors has narrowed this gap to about 5 percent, an amount that is well worth the gain in ease of administration.
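The "careful and deliberate" access just described can be sketched as block-aligned reads and writes against a pre-allocated, always-open tablespace file. The path and the 8-kilobyte page size below are illustrative assumptions, not any particular database's implementation.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Sketch of the access pattern described above: the database keeps its
 * tablespace open, never extends it, and reads or writes fixed-size,
 * block-aligned pages at deliberate offsets.
 */
#define DB_BLOCK_SIZE 8192   /* assumed database page size */

int
main(void)
{
    const char *tablespace = "/u01/oradata/example01.dbf";  /* example path */
    char page[DB_BLOCK_SIZE];

    int fd = open(tablespace, O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Read page 1000: always a whole block at a block-aligned offset. */
    off_t offset = (off_t)1000 * DB_BLOCK_SIZE;
    if (pread(fd, page, sizeof(page), offset) != (ssize_t)sizeof(page)) {
        perror("pread");
        close(fd);
        return 1;
    }

    /* ... the database modifies the page in its own cache ... */

    /* Write it back to exactly the same block-aligned offset. */
    if (pwrite(fd, page, sizeof(page), offset) != (ssize_t)sizeof(page)) {
        perror("pwrite");
        close(fd);
        return 1;
    }

    close(fd);
    return 0;
}
```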
Because the databases access many parts of the tablespaces, placing database tables on SAM file systems is rarely worthwhile. The access patterns result in files that are fully staged almost all the time; there is little or nothing to be gained for the complexity.
For a variety of reasons, file systems that contain large files not associated with database tables are best handled with QFS. Probably the most important consideration is linear (streaming) performance because QFS's storage organization is specifically designed to accommodate the needs of large data sets. The performance characteristics of the QFS storage organization also mean that databases hosted on file systems that are primarily used for decision-support applications will perform best on QFS rather than on UFS.
As with small files, the intended use of the file is not particularly material. QFS is usually the most appropriate file system for large, data-intensive applications, whether the local system is doing the processing, as in a high performance computing (HPC) environment, or exporting the data to some other client for processing. In particular, data-intensive NFS applications (such as geological exploration systems) should use QFS.
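The streaming behavior that favors QFS can be illustrated with a simple sequential-read loop that issues large requests, so that throughput is bounded by storage bandwidth rather than per-request overhead. The file path and the 4-megabyte buffer size below are assumptions for the sketch.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Sketch of a streaming (linear) workload: large, sequential reads of a
 * bulk data file, the pattern QFS's storage organization is designed for.
 */
#define STREAM_BUF_SIZE (4 * 1024 * 1024)   /* 4 MB per request */

int
main(void)
{
    const char *path = "/qfs1/seismic/shot_0001.dat";  /* example path */
    char *buf = malloc(STREAM_BUF_SIZE);
    if (buf == NULL)
        return 1;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        free(buf);
        return 1;
    }

    ssize_t n;
    long long total = 0;
    while ((n = read(fd, buf, STREAM_BUF_SIZE)) > 0)
        total += n;                         /* process each chunk here */

    printf("read %lld bytes sequentially\n", total);
    close(fd);
    free(buf);
    return 0;
}
```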
As one might expect, file systems that operate on very large data sets are prime candidates for SAM archiving. It doesn't take very many idle 50-gigabyte data sets to require the management of a few more disk arrays. This is also the context in which segmented files are most useful. Because most of the overhead is in the manipulation of metadata rather than user data, segmentation adds very little work for the system while substantially smoothing the flow of data through it. In fact, a reasonable default policy might be to segment all large files at 1-gigabyte boundaries. The 1-gigabyte size permits a reasonable tradeoff between consumption of inodes (and especially their corresponding in-memory data structures) and staging mechanism utilization.
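As a rough illustration of that tradeoff, the arithmetic below shows how a 1-gigabyte segment boundary bounds both the number of per-segment inodes a large file consumes and the amount of data a partial stage must move; the 50-gigabyte example size is arbitrary.

```c
#include <stdio.h>

/*
 * Back-of-the-envelope sketch of the segmentation tradeoff discussed
 * above. Sizes are illustrative only.
 */
int
main(void)
{
    const long long GB = 1024LL * 1024 * 1024;
    long long file_size    = 50 * GB;   /* an idle 50-gigabyte data set */
    long long segment_size = 1 * GB;    /* suggested default boundary   */

    long long segments = (file_size + segment_size - 1) / segment_size;

    printf("segments (and per-segment inodes): %lld\n", segments);
    printf("data staged to touch one segment:  %lld bytes\n", segment_size);
    return 0;
}
```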
Accelerating Operations With tmpfs
The special file system whose application requires some explanation is tmpfs. As the name implies, tmpfs stores its data in virtual memory; it is volatile, and data does not persist across a reboot (whether intentional or not). The intent behind tmpfs is to accelerate operations on data that will not be saved permanently; for example, the intermediate files that are passed between the various phases of a compiler, sort temporary files, and editor scratch files. The running instances of these applications are lost in the event of a reboot, so loss of their temporary data is of no consequence.
Even if data is temporary, there is no advantage to putting it in tmpfs if it is large enough to force the VM system to make special efforts to manage it. As a rule of thumb, tmpfs ceases to be productive when the data sets exceed about a third of the physical memory configured in the system. When the data sets get this big, they tend to compete with the applications for physical memory, which ends up lowering overall system performance. (Obviously, a truly massive number of small files that together exceed about a third of memory is also a problem.)
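The rule of thumb can be turned into a quick check. This sketch uses sysconf() to read the installed physical memory and reports one third of it as a suggested ceiling for tmpfs working sets; the one-third ratio is the guideline from the text, not a value the system enforces.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Sketch: compute roughly one third of physical memory as a tmpfs
 * working-set ceiling, per the rule of thumb discussed above.
 */
int
main(void)
{
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);

    if (pages < 0 || page_size < 0) {
        fprintf(stderr, "sysconf failed\n");
        return 1;
    }

    long long phys_bytes = (long long)pages * page_size;
    long long tmpfs_budget = phys_bytes / 3;

    printf("physical memory: %lld MB\n", phys_bytes >> 20);
    printf("suggested tmpfs working-set ceiling: %lld MB\n", tmpfs_budget >> 20);
    return 0;
}
```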
In a very small number of cases, tmpfs has been used to accelerate some specific database operations. This is a bit complex because it involves having a tablespace created in the temporary file system before the database is brought up. This has been done by creating and initializing the tablespace, then copying it to a reference location on stable storage. When the system comes up, the reference tablespace is copied to a well-known location on a tmpfs, and then the database is started. This is a somewhat extreme solution, and most database systems are now capable of making direct use of very large (64-bit) shared memory pools; the lack of that capability was the original reason the tmpfs technique was developed. However, this might be another useful trick for optimizing the performance of database engines that have not been extended to fully utilize 64-bit address spaces (notably MySQL and Postgres).
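The startup step described above amounts to a plain file copy from stable storage into tmpfs before the database is started. The paths in this sketch are hypothetical, and a real deployment would wrap this step in the system's startup scripts.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Sketch of the startup step described above: copy a pre-initialized
 * reference tablespace from stable storage into tmpfs before the
 * database is started. Both paths are hypothetical examples; because
 * the tablespace is temporary by definition, no reverse copy is needed.
 */
int
main(void)
{
    const char *reference = "/var/db/reference/temp01.dbf";  /* stable copy  */
    const char *scratch   = "/tmp/temp01.dbf";               /* tmpfs target */
    char buf[65536];

    int in = open(reference, O_RDONLY);
    if (in < 0) { perror("open reference"); return 1; }

    int out = open(scratch, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) { perror("open scratch"); close(in); return 1; }

    ssize_t n;
    while ((n = read(in, buf, sizeof(buf))) > 0) {
        if (write(out, buf, (size_t)n) != n) {
            perror("write");
            close(in);
            close(out);
            return 1;
        }
    }

    close(in);
    close(out);
    return 0;   /* now start the database with its temp tablespace on tmpfs */
}
```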