-What: /debug/pktcdvd/pktcdvd[0-7]
+What: /sys/kernel/debug/pktcdvd/pktcdvd[0-7]
Date: Oct. 2006
KernelVersion: 2.6.20
Contact: Thomas Maier <balagi@justmail.de>
The pktcdvd module (packet writing driver) creates
these files in debugfs:
-/debug/pktcdvd/pktcdvd[0-7]/
+/sys/kernel/debug/pktcdvd/pktcdvd[0-7]/
info (0444) Lots of driver statistics and infos.
Example:
-------
-cat /debug/pktcdvd/pktcdvd0/info
+cat /sys/kernel/debug/pktcdvd/pktcdvd0/info
###
# The targets that may be used.
-PHONY += xmldocs sgmldocs psdocs pdfdocs htmldocs mandocs installmandocs
+PHONY += xmldocs sgmldocs psdocs pdfdocs htmldocs mandocs installmandocs cleandocs
BOOKS := $(addprefix $(obj)/,$(DOCBOOKS))
xmldocs: $(BOOKS)
dochelp:
@echo ' Linux kernel internal documentation in different formats:'
@echo ' htmldocs - HTML'
- @echo ' installmandocs - install man pages generated by mandocs'
- @echo ' mandocs - man pages'
@echo ' pdfdocs - PDF'
@echo ' psdocs - Postscript'
@echo ' xmldocs - XML DocBook'
+ @echo ' mandocs - man pages'
+ @echo ' installmandocs - install man pages generated by mandocs'
+ @echo ' cleandocs - clean all generated DocBook files'
###
# Temporary files left by various tools
clean-dirs := $(patsubst %.xml,%,$(DOCBOOKS)) man
+cleandocs:
+ $(Q)rm -f $(call objectify, $(clean-files))
+ $(Q)rm -rf $(call objectify, $(clean-dirs))
+
# Declare the contents of the .PHONY variable as phony. We keep that
# information in a variable se we can use it in if_changed and friends.
iii. Plugging the queue to batch requests in anticipation of opportunities for
merge/sort optimizations
-This is just the same as in 2.4 so far, though per-device unplugging
-support is anticipated for 2.5. Also with a priority-based i/o scheduler,
-such decisions could be based on request priorities.
-
Plugging is an approach that the current i/o scheduling algorithm resorts to so
that it collects up enough requests in the queue to be able to take
advantage of the sorting/merging logic in the elevator. If the
queue is empty when a request comes in, then it plugs the request queue
-(sort of like plugging the bottom of a vessel to get fluid to build up)
+(sort of like plugging the bath tub of a vessel to get fluid to build up)
till it fills up with a few more requests, before starting to service
the requests. This provides an opportunity to merge/sort the requests before
passing them down to the device. There are various conditions when the queue is
unplugged (to open up the flow again), either through a scheduled task or
could be on demand. For example wait_on_buffer sets the unplugging going
-(by running tq_disk) so the read gets satisfied soon. So in the read case,
-the queue gets explicitly unplugged as part of waiting for completion,
-in fact all queues get unplugged as a side-effect.
+through sync_buffer() running blk_run_address_space(mapping). Or the caller
+can do it explicity through blk_unplug(bdev). So in the read case,
+the queue gets explicitly unplugged as part of waiting for completion on that
+buffer. For page driven IO, the address space ->sync_page() takes care of
+doing the blk_run_address_space().
Aside:
This is kind of controversial territory, as it's not clear if plugging is
multi-page bios being queued in one shot, we may not need to wait to merge
a big request from the broken up pieces coming by.
- Per-queue granularity unplugging (still a Todo) may help reduce some of the
- concerns with just a single tq_disk flush approach. Something like
- blk_kick_queue() to unplug a specific queue (right away ?)
- or optionally, all queues, is in the plan.
-
4.4 I/O contexts
I/O contexts provide a dynamically allocated per process data area. They may
be used in I/O schedulers, and in the block layer (could be used for IO statis,
process (bash) into it. CPU time consumed by this bash and its children
can be obtained from g1/cpuacct.usage and the same is accumulated in
/cgroups/cpuacct.usage also.
+
+cpuacct.stat file lists a few statistics which further divide the
+CPU time obtained by the cgroup into user and system times. Currently
+the following statistics are supported:
+
+user: Time spent by tasks of the cgroup in user mode.
+system: Time spent by tasks of the cgroup in kernel mode.
+
+user and system are in USER_HZ unit.
+
+cpuacct controller uses percpu_counter interface to collect user and
+system times. This has two side effects:
+
+- It is theoretically possible to see wrong values for user and system times.
+ This is because percpu_counter_read() on 32bit systems isn't safe
+ against concurrent writes.
+- It is possible to see slightly outdated values for user and system times
+ due to the batch processing nature of percpu_counter.
Salient features
-a. Enable control of both RSS (mapped) and Page Cache (unmapped) pages
+a. Enable control of Anonymous, Page Cache (mapped and unmapped) and
+ Swap Cache memory pages.
b. The infrastructure allows easy addition of other types of memory to control
c. Provides *zero overhead* for non memory controller users
d. Provides a double LRU: global memory pressure causes reclaim from the
global LRU; a cgroup on hitting a limit, reclaims from the per
cgroup LRU
-NOTE: Swap Cache (unmapped) is not accounted now.
-
Benefits and Purpose of the memory controller
The memory controller isolates the memory behaviour of a group of tasks
moved to the parent. If you want to avoid that, force_empty will be useful.
5.2 stat file
- memory.stat file includes following statistics (now)
- cache - # of pages from page-cache and shmem.
- rss - # of pages from anonymous memory.
- pgpgin - # of event of charging
- pgpgout - # of event of uncharging
- active_anon - # of pages on active lru of anon, shmem.
- inactive_anon - # of pages on active lru of anon, shmem
- active_file - # of pages on active lru of file-cache
- inactive_file - # of pages on inactive lru of file cache
- unevictable - # of pages cannot be reclaimed.(mlocked etc)
-
- Below is depend on CONFIG_DEBUG_VM.
- inactive_ratio - VM internal parameter. (see mm/page_alloc.c)
- recent_rotated_anon - VM internal parameter. (see mm/vmscan.c)
- recent_rotated_file - VM internal parameter. (see mm/vmscan.c)
- recent_scanned_anon - VM internal parameter. (see mm/vmscan.c)
- recent_scanned_file - VM internal parameter. (see mm/vmscan.c)
-
- Memo:
+
+memory.stat file includes following statistics
+
+cache - # of bytes of page cache memory.
+rss - # of bytes of anonymous and swap cache memory.
+pgpgin - # of pages paged in (equivalent to # of charging events).
+pgpgout - # of pages paged out (equivalent to # of uncharging events).
+active_anon - # of bytes of anonymous and swap cache memory on active
+ lru list.
+inactive_anon - # of bytes of anonymous memory and swap cache memory on
+ inactive lru list.
+active_file - # of bytes of file-backed memory on active lru list.
+inactive_file - # of bytes of file-backed memory on inactive lru list.
+unevictable - # of bytes of memory that cannot be reclaimed (mlocked etc).
+
+The following additional stats are dependent on CONFIG_DEBUG_VM.
+
+inactive_ratio - VM internal parameter. (see mm/page_alloc.c)
+recent_rotated_anon - VM internal parameter. (see mm/vmscan.c)
+recent_rotated_file - VM internal parameter. (see mm/vmscan.c)
+recent_scanned_anon - VM internal parameter. (see mm/vmscan.c)
+recent_scanned_file - VM internal parameter. (see mm/vmscan.c)
+
+Memo:
recent_rotated means recent frequency of lru rotation.
recent_scanned means recent # of scans to lru.
showing for better debug please see the code for meanings.
+Note:
+ Only anonymous and swap cache memory is listed as part of 'rss' stat.
+ This should not be confused with the true 'resident set size' or the
+ amount of physical memory used by the cgroup. Per-cgroup rss
+ accounting is not done yet.
5.3 swappiness
Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only.
- Following cgroup's swapiness can't be changed.
+ Following cgroups' swapiness can't be changed.
- root cgroup (uses /proc/sys/vm/swappiness).
- a cgroup which uses hierarchy and it has child cgroup.
- a cgroup which uses hierarchy and not the root of hierarchy.
2. Basic accounting routines
- a. void res_counter_init(struct res_counter *rc)
+ a. void res_counter_init(struct res_counter *rc,
+ struct res_counter *rc_parent)
Initializes the resource counter. As usual, should be the first
routine called for a new counter.
- b. int res_counter_charge[_locked]
- (struct res_counter *rc, unsigned long val)
+ The struct res_counter *parent can be used to define a hierarchical
+ child -> parent relationship directly in the res_counter structure,
+ NULL can be used to define no relationship.
+
+ c. int res_counter_charge(struct res_counter *rc, unsigned long val,
+ struct res_counter **limit_fail_at)
When a resource is about to be allocated it has to be accounted
with the appropriate resource counter (controller should determine
* if the charging is performed first, then it should be uncharged
on error path (if the one is called).
- c. void res_counter_uncharge[_locked]
+ If the charging fails and a hierarchical dependency exists, the
+ limit_fail_at parameter is set to the particular res_counter element
+ where the charging failed.
+
+ d. int res_counter_charge_locked
+ (struct res_counter *rc, unsigned long val)
+
+ The same as res_counter_charge(), but it must not acquire/release the
+ res_counter->lock internally (it must be called with res_counter->lock
+ held).
+
+ e. void res_counter_uncharge[_locked]
(struct res_counter *rc, unsigned long val)
When a resource is released (freed) it should be de-accounted
from the resource counter it was accounted to. This is called
"uncharging".
- The _locked routines imply that the res_counter->lock is taken.
-
+ The _locked routines imply that the res_counter->lock is taken.
2.1 Other accounting routines
After a reasonable transition period, we will remove the legacy
fakephp interface.
Who: Alex Chiang <achiang@hp.com>
+
+---------------------------
+
+What: i2c-voodoo3 driver
+When: October 2009
+Why: Superseded by tdfxfb. I2C/DDC support used to live in a separate
+ driver but this caused driver conflicts.
+Who: Jean Delvare <khali@linux-fr.org>
+ Krzysztof Helt <krzysztof.h1@wp.pl>
The P_Key for any interface is given by the "pkey" file, and the
main interface for a subinterface is in "parent."
+Datagram vs Connected modes
+
+ The IPoIB driver supports two modes of operation: datagram and
+ connected. The mode is set and read through an interface's
+ /sys/class/net/<intf name>/mode file.
+
+ In datagram mode, the IB UD (Unreliable Datagram) transport is used
+ and so the interface MTU has is equal to the IB L2 MTU minus the
+ IPoIB encapsulation header (4 bytes). For example, in a typical IB
+ fabric with a 2K MTU, the IPoIB MTU will be 2048 - 4 = 2044 bytes.
+
+ In connected mode, the IB RC (Reliable Connected) transport is used.
+ Connected mode is to takes advantage of the connected nature of the
+ IB transport and allows an MTU up to the maximal IP packet size of
+ 64K, which reduces the number of IP packets needed for handling
+ large UDP datagrams, TCP segments, etc and increases the performance
+ for large messages.
+
+ In connected mode, the interface's UD QP is still used for multicast
+ and communication with peers that don't support connected mode. In
+ this case, RX emulation of ICMP PMTU packets is used to cause the
+ networking stack to use the smaller UD MTU for these neighbours.
+
+Stateless offloads
+
+ If the IB HW supports IPoIB stateless offloads, IPoIB advertises
+ TCP/IP checksum and/or Large Send (LSO) offloading capability to the
+ network stack.
+
+ Large Receive (LRO) offloading is also implemented and may be turned
+ on/off using ethtool calls. Currently LRO is supported only for
+ checksum offload capable devices.
+
+ Stateless offloads are supported only in datagram mode.
+
+Interrupt moderation
+
+ If the underlying IB device supports CQ event moderation, one can
+ use ethtool to set interrupt mitigation parameters and thus reduce
+ the overhead incurred by handling interrupts. The main code path of
+ IPoIB doesn't use events for TX completion signaling so only RX
+ moderation is supported.
+
Debugging Information
By compiling the IPoIB driver with CONFIG_INFINIBAND_IPOIB_DEBUG set
http://ietf.org/rfc/rfc4391.txt
IP over InfiniBand (IPoIB) Architecture (RFC 4392)
http://ietf.org/rfc/rfc4392.txt
+ IP over InfiniBand: Connected Mode (RFC 4755)
+ http://ietf.org/rfc/rfc4755.txt
--- 6.7 Custom kbuild commands
--- 6.8 Preprocessing linker scripts
- === 7 Kbuild Variables
- === 8 Makefile language
- === 9 Credits
- === 10 TODO
+ === 7 Kbuild syntax for exported headers
+ --- 7.1 header-y
+ --- 7.2 objhdr-y
+ --- 7.3 destination-y
+ --- 7.4 unifdef-y (deprecated)
+
+ === 8 Kbuild Variables
+ === 9 Makefile language
+ === 10 Credits
+ === 11 TODO
=== 1 Overview
The kbuild infrastructure for *lds file are used in several
architecture-specific files.
+=== 7 Kbuild syntax for exported headers
+
+The kernel include a set of headers that is exported to userspace.
+Many headers can be exported as-is but other headers requires a
+minimal pre-processing before they are ready for user-space.
+The pre-processing does:
+- drop kernel specific annotations
+- drop include of compiler.h
+- drop all sections that is kernel internat (guarded by ifdef __KERNEL__)
+
+Each relevant directory contain a file name "Kbuild" which specify the
+headers to be exported.
+See subsequent chapter for the syntax of the Kbuild file.
+
+ --- 7.1 header-y
+
+ header-y specify header files to be exported.
+
+ Example:
+ #include/linux/Kbuild
+ header-y += usb/
+ header-y += aio_abi.h
+
+ The convention is to list one file per line and
+ preferably in alphabetic order.
+
+ header-y also specify which subdirectories to visit.
+ A subdirectory is identified by a trailing '/' which
+ can be seen in the example above for the usb subdirectory.
+
+ Subdirectories are visited before their parent directories.
+
+ --- 7.2 objhdr-y
+
+ objhdr-y specifies generated files to be exported.
+ Generated files are special as they need to be looked
+ up in another directory when doing 'make O=...' builds.
+
+ Example:
+ #include/linux/Kbuild
+ objhdr-y += version.h
+
+ --- 7.3 destination-y
+
+ When an architecture have a set of exported headers that needs to be
+ exported to a different directory destination-y is used.
+ destination-y specify the destination directory for all exported
+ headers in the file where it is present.
+
+ Example:
+ #arch/xtensa/platforms/s6105/include/platform/Kbuild
+ destination-y := include/linux
+
+ In the example above all exported headers in the Kbuild file
+ will be located in the directory "include/linux" when exported.
+
+
+ --- 7.4 unifdef-y (deprecated)
+
+ unifdef-y is deprecated. A direct replacement is header-y.
+
-=== 7 Kbuild Variables
+=== 8 Kbuild Variables
The top Makefile exports the following variables:
INSTALL_MOD_STRIP will used as the option(s) to the strip command.
-=== 8 Makefile language
+=== 9 Makefile language
The kernel Makefiles are designed to be run with GNU Make. The Makefiles
use only the documented features of GNU Make, but they do use many
There are some cases where "=" is appropriate. Usually, though, ":="
is the right choice.
-=== 9 Credits
+=== 10 Credits
Original version made by Michael Elizabeth Chastain, <mailto:mec@shout.net>
Updates by Kai Germaschewski <kai@tp1.ruhr-uni-bochum.de>
Updates by Sam Ravnborg <sam@ravnborg.org>
Language QA by Jan Engelhardt <jengelh@gmx.de>
-=== 10 TODO
+=== 11 TODO
- Describe how kbuild supports shipped files with _shipped.
- Generating offset header files.
./include/asm/setup.h as COMMAND_LINE_SIZE.
- acpi= [HW,ACPI,X86-64,i386]
+ acpi= [HW,ACPI,X86]
Advanced Configuration and Power Interface
Format: { force | off | ht | strict | noirq | rsdt }
force -- enable ACPI if default was off
acpi_osi="!string2" # remove built-in string2
acpi_osi= # disable all strings
- acpi_pm_good [X86-32,X86-64]
+ acpi_pm_good [X86]
Override the pmtimer bug detection: force the kernel
to assume that this machine's pmtimer latches its value
and always returns good values.
Also note the kernel might malfunction if you disable
some critical bits.
- code_bytes [IA32/X86_64] How many bytes of object code to print
+ code_bytes [X86] How many bytes of object code to print
in an oops report.
Range: 0 - 8192
Default: 64
MTRR settings. This parameter disables that behavior,
possibly causing your machine to run very slowly.
- disable_timer_pin_1 [i386,x86-64]
+ disable_timer_pin_1 [X86]
Disable PIN 1 of APIC timer