Introduction
A common concern in Linux-based designs is that buffering filesystem writes is very good for performance, but buffered data can be lost, and the filesystem corrupted, when the system goes down unexpectedly.
This document presents some measures you can take to make your Linux-based system more robust.
1. Use a Read-Only Partition for the Root Filesystem
The simplest way to avoid corruption from a power failure during a write is to avoid writes altogether: files that don't need to be modified should be kept on a partition that is mounted read-only. In most cases (NAND flash devices excepted) this means that nothing on the partition is ever modified, so there is no risk of corruption. If local data must be written, a separate read-write partition can be set up for that purpose only. With this setup, even in the worst case, we can still ensure that the system boots correctly.
On the Android platform, there are several ways to use read-only partitions:
- Root filesystem as a ramdisk and the Android system partition as a read-only partition.
- Root filesystem and Android system files in the same read-only partition.
- Root filesystem and Android system files in separate read-only partitions.
Which option to use depends on the image size of the root filesystem. A ramdisk can only hold a small rootfs, and a partition that is too large is inconvenient to upgrade.
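To confirm that a partition really is mounted read-only, the mount table can be inspected. The following is a minimal sketch, assuming the usual /proc/mounts layout (device, mount point, filesystem type, options); the helper name is_read_only is hypothetical and not part of any Android or Linux API.

```python
def is_read_only(mounts_text, mount_point):
    """Return True or False for the state of mount_point, or None if absent.

    mounts_text is expected in the /proc/mounts layout:
    device mount_point fstype options dump pass
    """
    state = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        # A later entry for the same mount point shadows an earlier one.
        # Compare whole options, so "errors=remount-ro" does not count as "ro".
        state = "ro" in fields[3].split(",")
    return state


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        print("/ read-only:", is_read_only(f.read(), "/"))
```

The parsing is kept separate from reading the file, so the same function can be run against a saved copy of the mount table from a target device.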
2. Use the Right Filesystem for Read-Write Partitions
Linux offers many filesystems, such as jffs2, yaffs2 and ubifs for raw flash devices, and ext2, ext3, ext4 and btrfs for disk-like devices.
If the Linux-based system (e.g. an Android platform) boots from an SD/MMC card or managed NAND, ext3 or ext4 is a good choice.
ext3, the third extended filesystem (wiki: http://en.wikipedia.org/wiki/Ext3), is a journaled filesystem that is commonly used with the Linux kernel. Its main advantage over ext2 is journaling, which improves reliability and eliminates the need to check the filesystem after an unclean shutdown. ext4 is its successor; it adds many extensions and performance improvements, but it also increases the chance of data loss, which deserves extra attention.
Starting with Android 2.3 (Gingerbread), many Android devices are moving from YAFFS to the ext4 filesystem, according to a post on the official Android Developers blog (http://android-developers.blogspot.com/2010/12/saving-data-safely.html).
With ext3/ext4, a system crash or power loss is far less likely to corrupt the filesystem, thanks to the journaling design.
- One ext3 feature to note is that, by default, ext3 commits changes to its journal every 5 seconds. So, in general, up to 5 seconds' worth of writes might be lost as the result of a system crash or power loss. The interval can be tuned with the commit= mount option.
- One ext4 feature to note is delayed allocation: the filesystem delays allocating physical disk blocks for written data as long as possible. This policy brings important performance benefits. Many files are short-lived, and delayed allocation can keep the system from writing fleeting temporary files to disk at all. For longer-lived files, delayed allocation lets the kernel accumulate more data and allocate its blocks contiguously, speeding up both the write and any subsequent reads of that data. However, data that stays in memory longer is also more exposed to loss on a crash.
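As an illustration of tuning the journal commit interval, the sketch below builds a remount command that uses the ext3/ext4 commit= mount option. The function name remount_with_commit and the /data mount point in the usage line are hypothetical, and the sketch assumes a mount binary that accepts -o remount; running it for real requires root.

```python
import subprocess


def remount_with_commit(mount_point, seconds, run=False):
    """Build (and optionally run) a remount with a new journal commit interval."""
    if seconds < 1:
        raise ValueError("commit interval must be at least 1 second")
    cmd = ["mount", "-o", "remount,commit=%d" % seconds, mount_point]
    if run:
        # Needs root; raises CalledProcessError if mount reports a failure.
        subprocess.check_call(cmd)
    return cmd


if __name__ == "__main__":
    # Dry run: only show the command that would be executed.
    print(" ".join(remount_with_commit("/data", 1)))
```

A shorter interval narrows the window of writes that can be lost, at the price of more frequent journal traffic.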
Two sysctl variables can be used to shorten the system’s writeback time and so reduce the chance of data loss:
/proc/sys/vm/dirty_expire_centisecs
/proc/sys/vm/dirty_writeback_centisecs
- The first variable controls how long written data can sit in the page cache before it’s considered “expired” and queued to be written to disk; it defaults to 30 seconds.
- The second variable controls how often the kernel flusher threads (the pdflush process on older kernels) wake up to actually flush expired data to disk; it defaults to 5 seconds.
- Lowering these values makes the system flush data to disk more aggressively, at the cost of reduced performance.
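The two writeback tunables can be read and changed through their files under /proc/sys/vm, which the kernel names dirty_expire_centisecs and dirty_writeback_centisecs. The sketch below is one possible way to do that; the helper names are hypothetical, the values are in hundredths of a second, and writing to the real files requires root.

```python
import os

VM_DIR = "/proc/sys/vm"
NAMES = ("dirty_expire_centisecs", "dirty_writeback_centisecs")


def read_writeback_seconds(vm_dir=VM_DIR):
    """Return {name: seconds} for the two writeback tunables."""
    result = {}
    for name in NAMES:
        with open(os.path.join(vm_dir, name)) as f:
            # The kernel stores these values in centiseconds.
            result[name] = int(f.read().strip()) / 100.0
    return result


def write_writeback_seconds(name, seconds, vm_dir=VM_DIR):
    """Set one tunable; needs root when vm_dir is the real /proc/sys/vm."""
    if name not in NAMES:
        raise ValueError("unknown tunable: %s" % name)
    with open(os.path.join(vm_dir, name), "w") as f:
        f.write("%d\n" % round(seconds * 100))


if __name__ == "__main__":
    print(read_writeback_seconds())
```

The vm_dir parameter exists only so the helpers can be tried against a scratch directory before touching the live system.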
3. Use the Right Approach When Developing Applications
The final solution to this problem is to fix applications that expect the filesystem to provide more guarantees than it really does.
- Application developers should keep in mind that data has not necessarily reached the storage media when write() or even close() returns.
- Applications that want to be sure their critical data has been committed to the media can use the fsync() system call; it can be slow, so avoid calling it more often than needed.
- In an Android application, if you only use SharedPreferences or SQLite, you don't need to worry about data loss, because Android already calls fsync() to handle buffering correctly.
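For applications that manage their own files, the fsync() advice above can be sketched as follows. This is a minimal example, not a definitive implementation: the function name write_durably is hypothetical, and the write-to-a-temporary-file-then-rename pattern is an added assumption on top of the plain fsync() call described in the text. It also assumes a Linux system, where a directory can be opened and passed to fsync().

```python
import os


def write_durably(path, data):
    """Write bytes to path, asking the kernel to commit them before returning."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's user-space buffer into the kernel
        os.fsync(f.fileno())  # ask the kernel to commit the file to the media
    # Swap the complete file into place, so readers never see a partial write.
    os.replace(tmp_path, path)
    # fsync the containing directory so the rename itself is committed.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    write_durably("settings.conf", b"volume=7\n")
```

Because each call issues two fsync() operations, it is best reserved for critical data rather than used for every write.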