Libtorque: Difference between revisions
From dankwiki
Revision as of 11:34, 31 October 2009
My project for Professor Rich Vuduc's Fall 2009 CSE 6230, libtorque is a multithreaded event library for UNIX designed to take full advantage of the manycore, heterogeneous, NUMA future. Previous, non-threaded event libraries include libevent, libev and liboop. My project proposal lays out the motivation for libtorque: I believe scheduling and memory-placement decisions must be taken into account to handle events optimally, especially on manycore machines and especially under unexpected traffic patterns (denial of service attacks, oversubscribed pipes, mixed-latency connections, etc).
Resources
- git hosting from GitHub:
  - Available from the dankamongmen/libtorque project page
  - git clone from git://github.com/dankamongmen/libtorque.git
- bugzilla, hosted here on http://dank.qemfd.net/bugzilla/
Milestones
- 2009-11-19: CSE 6230 checkpoint
- 2009-12-10: CSE 6230 due date
Design/Functionality
System discovery
- Full support for CPUID as most recently defined by Intel and AMD (more advanced, as of 2009-10-31, than x86info)
- Full support for Linux's and FreeBSD's native cpuset libraries, and SGI's libcpuset and libNUMA
- Discovers and makes available, for each processor type:
  - ISA, ISA-specific capabilities, and number of concurrent threads supported (degrees of SMT)
  - Line count, associativity, line length, geometry, and type of all caches
  - Entry count, associativity, page size and type of all TLBs
  - Inclusiveness relationships among cache and TLB levels
  - APIC IDs and how caches are shared among them
  - More: properties of hardware prefetching, ability to support non-temporal loads (MOVNTDQA, PREFETCHNTA, etc)
- Discovers and makes available, for each memory node type:
  - Connected processor groups and relative distance information
  - Number of pages and bank geometry
  - More: OS page prefetching policy, error-recovery info
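As a rough illustration of the kind of per-processor data listed above, here is a minimal sketch in C. It is not libtorque's code or API: it swaps in glibc's `sysconf()` extensions on Linux in place of the CPUID-based discovery the list describes, and the names `struct cache_desc`, `probe_l1d` and `online_cpus` are hypothetical. It only shows how line count can be derived from a cache's total size and line length.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical record for illustration; not libtorque's actual type. */
struct cache_desc {
	long total_bytes;   /* total capacity in bytes */
	long line_bytes;    /* line length in bytes */
	long associativity; /* number of ways; may be 0 if not reported */
	long line_count;    /* derived: total_bytes / line_bytes */
};

/* Describe the L1 data cache using glibc's sysconf() extensions
 * (assumption: Linux with glibc; other platforms lack these names).
 * Returns 0 on success, or -1 if no usable values were reported. */
int probe_l1d(struct cache_desc *c){
	c->total_bytes = sysconf(_SC_LEVEL1_DCACHE_SIZE);
	c->line_bytes = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
	c->associativity = sysconf(_SC_LEVEL1_DCACHE_ASSOC);
	if(c->total_bytes <= 0 || c->line_bytes <= 0){
		c->line_count = 0;
		return -1;
	}
	c->line_count = c->total_bytes / c->line_bytes;
	return 0;
}

/* Number of processors currently online, or -1 on error. */
long online_cpus(void){
	return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Print whatever was discovered, noting when nothing was reported. */
void print_discovery(void){
	struct cache_desc c;
	if(probe_l1d(&c) == 0){
		printf("L1d: %ld lines of %ld bytes, %ld-way\n",
		       c.line_count, c.line_bytes, c.associativity);
	}else{
		printf("L1d geometry not reported on this system\n");
	}
	printf("online processors: %ld\n", online_cpus());
}
```

Calling `print_discovery()` from a `main()` prints one line per item. Inside containers or virtual machines the cache values may come back as zero, which is why the sketch treats that case as "not reported" rather than as an error in the caller.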
References/Prior Art
- Philip Mucci's "Linux Multicore Performance Analysis and Optimization in a Nutshell", delivered at NOTUR 2009
- Elmeleegy et al.'s "Lazy Asynchronous I/O", USENIX 2004
- PGAS: Kathy Yelick's "Performance and Productivity Opportunities using Global Address Space Programming Models", 2006
- Emery Berger's Hoard and other manycore-capable allocators (libumem aka magazined slab, Google's tcmalloc, etc).
"...tear the roof off the sucka..."
libtorque: continuation-based unix i/o for manycore numa © nick black 2009