Power Management

Matthew Garrett's "Observations on Power Management" is a great intro.

Implementations

  • APM (Advanced Power Management): All PM policy/mechanism resides within the BIOS
    • Motivated by, and largely relevant only to, laptops
    • apmd debian package
    • No longer supported in Vista. Off by default in recent Debian kernels.
  • ACPI: Current, often buggy (but also often easily repairable via BIOS flash or by hand)
    • C-States (processor idle states); the running state, C0, is further divided into P-States (performance) and T-States (throttling)
  • P4 Thermal Throttling: Slows down or shuts off the processor based on CPU temperature
    • Adjustment is either via idle cycle insertion or lowering the clock multiplier
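
To get a rough feel for how the two throttling mechanisms differ, the sketch below applies the textbook CMOS dynamic-power approximation (power proportional to V² · f) to each. The function name, the voltage ratio, and the decision to ignore leakage power are assumptions made for illustration; none of the numbers are measurements of a particular processor.

```python
def relative_dynamic_power(freq_ratio=1.0, volt_ratio=1.0, duty=1.0):
    """Dynamic power relative to full speed, using P ~ duty * V^2 * f.

    freq_ratio: throttled clock / nominal clock (lowering the multiplier)
    volt_ratio: throttled voltage / nominal voltage (assumed; 1.0 = unchanged)
    duty:       fraction of time the clock runs (idle cycle insertion)
    Leakage (static) power is ignored, so treat results as rough estimates.
    """
    for name, value in (("freq_ratio", freq_ratio),
                        ("volt_ratio", volt_ratio),
                        ("duty", duty)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return duty * volt_ratio ** 2 * freq_ratio


if __name__ == "__main__":
    # Idle cycle insertion: clock gated half of the time, voltage unchanged.
    print(f"50% duty cycle:            {relative_dynamic_power(duty=0.5):.2f}")
    # Lower multiplier: 2.39 GHz -> 1.60 GHz, voltage unchanged.
    print(f"1.60/2.39 GHz:             {relative_dynamic_power(freq_ratio=1.60 / 2.39):.2f}")
    # Same, with a hypothetical 15% voltage reduction alongside it.
    print(f"1.60/2.39 GHz, 0.85x volt: {relative_dynamic_power(1.60 / 2.39, 0.85):.2f}")
```

Under this model, idle cycle insertion and a lower multiplier save power roughly in proportion to the work given up, and only a lower multiplier paired with a voltage drop saves more than that.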

CPU Frequency

  • On Linux, cpufreq-info provides lots of good information:
[recombinator](0) $ cpufreq-info
cpufrequtils 004: cpufreq-info (C) Dominik Brodowski 2004-2006
Report errors and bugs to cpufreq@lists.linux.org.uk, please.
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which need to switch frequency at the same time: 0
  hardware limits: 1.60 GHz - 2.39 GHz
  available frequency steps: 2.39 GHz, 1.60 GHz
  available cpufreq governors: ondemand, performance
  current policy: frequency should be within 1.60 GHz and 2.39 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 1.60 GHz.
  cpufreq stats: 2.39 GHz:27.25%, 1.60 GHz:72.75%  (78631)
analyzing CPU 1:
  driver: acpi-cpufreq
  CPUs which need to switch frequency at the same time: 1
  hardware limits: 1.60 GHz - 2.39 GHz
  available frequency steps: 2.39 GHz, 1.60 GHz
  available cpufreq governors: ondemand, performance
  current policy: frequency should be within 1.60 GHz and 2.39 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 1.60 GHz.
  cpufreq stats: 2.39 GHz:4.32%, 1.60 GHz:95.68%  (19387)
[recombinator](0) $ 
  • You can get C-state latency information from /proc/acpi/processor/*/power:
[wopr](0) $ cat /proc/acpi/processor/CP10/power 
active state:            C0
max_cstate:              C8
maximum allowed latency: 2000000000 usec
states:
    C1:                  type[C1] promotion[--] demotion[--] latency[032] usage[02035391] duration[00000000000000000000]
    C2:                  type[C2] promotion[--] demotion[--] latency[064] usage[01050475] duration[00000000000624232458]
    C3:                  type[C2] promotion[--] demotion[--] latency[096] usage[150501625] duration[00000020175094863449]
[wopr](0) $ 
  • On FreeBSD, sysctls from the dev.cpu and debug.cpufreq MIB hierarchies are your window into frequency control. See cpufreq(4).
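
The per-state lines in /proc/acpi/processor/*/power are regular enough to parse mechanically. The sketch below is one way to do it, assuming the line layout shown in the transcript above and assuming the latency field is in microseconds (as the "maximum allowed latency" line suggests). The function name and the dictionary keys are illustrative, not part of any existing tool.

```python
import re

# Matches one state line, e.g.
#   C2:   type[C2] promotion[--] demotion[--] latency[064] usage[01050475] duration[000...]
_STATE_RE = re.compile(
    r"^\s*(?P<name>C\d+):\s+type\[(?P<type>C\d+)\].*?"
    r"latency\[(?P<latency>\d+)\]\s+usage\[(?P<usage>\d+)\]\s+"
    r"duration\[(?P<duration>\d+)\]"
)


def parse_cstates(text):
    """Return a list of dicts, one per C-state line found in `text`."""
    states = []
    for line in text.splitlines():
        m = _STATE_RE.match(line)
        if m:
            states.append({
                "name": m.group("name"),
                "type": m.group("type"),
                "latency_us": int(m.group("latency")),  # assumed microseconds
                "usage": int(m.group("usage")),
                "duration": int(m.group("duration")),
            })
    return states


if __name__ == "__main__":
    import glob
    # Prints nothing on systems that lack this (older) procfs interface.
    for path in sorted(glob.glob("/proc/acpi/processor/*/power")):
        with open(path) as f:
            for state in parse_cstates(f.read()):
                print(path, state["name"], state["latency_us"], state["usage"])
```

Header lines such as "active state" and "max_cstate" do not match the pattern and are skipped.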

Disks/Filesystems

  • noatime -- critical for all kinds of things! Don't believe me; trust Ingo Molnár:
i cannot over-emphasise how much of a deal it is in practice. Atime 
updates are by far the biggest IO performance deficiency that Linux has 
today. Getting rid of atime updates would give us more everyday Linux 
performance than all the pagecache speedups of the past 10 years, 
_combined_.

it's also perhaps the most stupid Unix design idea of all times. Unix is 
really nice and well done, but think about this a bit:

   ' For every file that is read from the disk, lets do a ... write to
     the disk! And, for every file that is already cached and which we
     read from the cache ... do a write to the disk! '

tell that concept to any rookie programmer who knows nothing about 
kernels and the answer will be: 'huh, what? That's gross!'. And Linux 
does this unconditionally for everything, and no, it's not only done on 
some high-security servers that need all sorts of auditing enabled that 
logs every file read - no, it's done by 99% of the Linux desktops and 
servers. For the sake of some lazy mailers that could now be using 
inotify, and for the sake of ... nothing much, really - forensics 
software perhaps.
  • SATA link power management (seen in powertop output): lets an idle link between controller and drive drop into a lower-power state. Details needed. FIXME
  • Turning up the writeback time, disk head parking, and other debatable techniques
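
A quick way to see which filesystems still pay the atime cost is to inspect the options column of /proc/mounts. The following is a minimal sketch, assuming the usual six-field layout of that file; the function name and the three labels it returns are made up for this example. It also reports relatime, a mount option not discussed above, because the kernel lists it in the same column.

```python
def atime_policy(mounts_text):
    """Map mount point -> 'noatime', 'relatime', or 'atime' for each entry.

    `mounts_text` is the content of /proc/mounts: whitespace-separated
    fields of device, mount point, fs type, options, dump, pass.
    """
    policy = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        if "noatime" in options:
            policy[mount_point] = "noatime"
        elif "relatime" in options:
            policy[mount_point] = "relatime"
        else:
            policy[mount_point] = "atime"
    return policy


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for mount_point, kind in sorted(atime_policy(f.read()).items()):
            print(f"{kind:9} {mount_point}")
```

Entries reported as "atime" are the candidates for adding noatime in /etc/fstab.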

Workload Distribution

  • The /sys/devices/system/cpu/sched_smt_power_savings tunable causes tasks (under light load) to be preferentially packed onto the processing elements of a physical package (i.e., its cores and their SMT units) rather than spread across packages.
  • The /sys/devices/system/cpu/sched_mc_power_savings tunable does the same, but doesn't apply to SMT.
  • Task migration has overhead and associated architecture warmup (i.e., caches, branch prediction and hardware prefetching). How is this affected? FIXME
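
Both tunables are plain integer files, so they can be inspected and set with a few lines of code. This sketch assumes only the two sysfs paths named above; the helper names and the `base` parameter (there so the code can be pointed at a scratch directory) are hypothetical. The files are not present on every kernel, so a missing file is reported as None rather than treated as an error.

```python
import os

CPU_SYSFS = "/sys/devices/system/cpu"
TUNABLES = ("sched_smt_power_savings", "sched_mc_power_savings")


def read_power_savings(base=CPU_SYSFS):
    """Return {tunable: int or None}; None means absent or unreadable."""
    values = {}
    for name in TUNABLES:
        try:
            with open(os.path.join(base, name)) as f:
                values[name] = int(f.read().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


def set_power_savings(name, value, base=CPU_SYSFS):
    """Write an integer to one tunable (needs root on a real system)."""
    if name not in TUNABLES:
        raise ValueError(f"unknown tunable: {name}")
    with open(os.path.join(base, name), "w") as f:
        f.write(f"{int(value)}\n")


if __name__ == "__main__":
    for name, value in read_power_savings().items():
        print(name, "not available" if value is None else value)
```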

Networking

Wired

  • Disable Wake-on-LAN if it's not being used: ethtool -s DEVICE wol d
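
Before disabling Wake-on-LAN it helps to know whether it is on at all. The sketch below looks for the "Wake-on:" line that `ethtool DEVICE` prints, where `d` means disabled. The helper names are invented for this example, and the code assumes ethtool is installed and that the caller has permission to query the device.

```python
import re
import subprocess


def parse_wol(ethtool_output):
    """Return the current Wake-on flags (e.g. 'g' or 'd'), or None if absent."""
    # Anchoring at line start skips the separate "Supports Wake-on:" line.
    m = re.search(r"^\s*Wake-on:\s*(\S+)", ethtool_output, re.MULTILINE)
    return m.group(1) if m else None


def wol_enabled(device):
    """Run `ethtool DEVICE` and report whether any wake-up flag is set."""
    out = subprocess.run(["ethtool", device], capture_output=True,
                         text=True, check=True).stdout
    flags = parse_wol(out)
    return flags is not None and flags != "d"
```

If `wol_enabled("eth0")` returns True and nothing depends on remote wake-up, the ethtool command above turns it off.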

Wireless

  • PS-Poll (PowerSave Poll) keeps the radio off for longer, in exchange for higher latencies (only while the radio is off, though)
    • Requires support from access point
  • Auto-association and aggressive scanning cost power, especially when wireless is not being used
  • The MAC80211_DEFAULT_PS kernel configuration option, introduced in the 2.6.32 development cycle, enables wireless power saving by default
    • Use latency requirement registration (Documentation/power/pm_qos_interface.txt) for applications which need it
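
The userspace side of latency requirement registration, as described in pm_qos_interface.txt, amounts to opening a device node, writing a signed 32-bit value, and holding the file descriptor open for as long as the requirement applies. A minimal sketch follows, assuming the /dev/network_latency node from that document exists on the running kernel; the context manager's name and its `device` parameter are hypothetical.

```python
import os
import struct
from contextlib import contextmanager


@contextmanager
def latency_request(usec, device="/dev/network_latency"):
    """Hold a PM QoS latency request of `usec` microseconds inside the block.

    The request stays registered while the file descriptor is open; closing
    it on the way out of the `with` block drops the request again.
    """
    fd = os.open(device, os.O_WRONLY)
    try:
        os.write(fd, struct.pack("i", usec))  # native-endian signed 32-bit value
        yield fd
    finally:
        os.close(fd)
```

A latency-sensitive application would wrap its critical section in `with latency_request(2000):`, so that power saving is relaxed only while the requirement actually exists.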

See Also